Big Data Solutions in the Cloud

Big Data. Do you have big data? What does that even mean? In this episode, I explore how organizations can manage their data and which questions you might need to ask before you implement the latest and greatest tool. I am joined by James Serra, Microsoft Cloud Architect, to get his thoughts on implementing cloud solutions, where they can contribute, and why you might not be able to go all cloud. I am interested to see whether more traditional DBAs move toward architecture roles and help their organizations manage the various types of data. What issues are giving you trouble as you adopt a more diverse data ecosystem?

Episode Quotes

“People ask what is big data? I say it’s really all data.”

“By having a data lake and also keeping a relational database store, you have the best of both worlds.”

“It’s really important when you’re looking…to build this large data warehouse and a data lake is to add a lot of time in there for data governance.”

“The people who really succeed in their careers were willing to take risks, to jump out of their comfort zone, to do something that they weren’t sure if they can do.”

Listen to Learn

00:41     Intro
01:07     Compañero Shout-Outs
03:03     Conference
03:49     Intro to the guest and topic
04:42     What is big data?
06:07     The difference between SMP & MPP – The cloud allows you to scale up and down
10:21     How fast is the data coming at me, and what do I need to do with it?
13:10     What does a data lake do for you?
18:11     Have a data lake and a relational database store – the best of both worlds
20:41     You need to have many tools in your kit to get the results you need
22:05     Data governance is more important than ever
24:37     How to get your data into SQL Server
30:11     Consider your team’s skills when deciding what tools to use
32:42     What is your tolerance for change? Are you open to open source?
35:05     It’s important to pick the right technology
36:24     SQL Family Questions
42:11     Closing Thoughts

Our Guest

James Serra

James is a big data and data warehousing solution architect at Microsoft.  He is a thought leader in the use and application of Big Data and advanced analytics, including solutions involving hybrid technologies of relational and non-relational data, Hadoop, MPP, IoT, Data Lake, and private and public cloud.  Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer.  He is a prior SQL Server MVP with over 30 years of IT experience.  James is a popular blogger (JamesSerra.com) and speaker, having presented at dozens of PASS events including the PASS Business Analytics conference and the PASS Summit, as well as the Enterprise Data World conference.  He is the author of the book “Reporting with Microsoft SQL Server 2012”.  He received a Bachelor of Science degree in Computer Engineering from the University of Nevada-Las Vegas.

blog URL: www.jamesserra.com

Meet the Hosts

Carlos Chacon

With more than 10 years of working with SQL Server, Carlos helps businesses ensure their SQL Server environments meet their users’ expectations. He can provide insights on performance, migrations, and disaster recovery. He is also active in the SQL Server community and regularly speaks at user group meetings and conferences. He helps support the free database monitoring tool found at databasehealth.com and provides training through SQL Trail events.

Eugene Meidinger

Eugene works as an independent BI consultant and Pluralsight author, specializing in Power BI and the Azure Data Platform. He has been working with data for over 8 years and speaks regularly at user groups and conferences. He also helps run the GroupBy online conference.

Kevin Feasel

Kevin is a Microsoft Data Platform MVP and proprietor of Catallaxy Services, LLC, where he specializes in T-SQL development, machine learning, and pulling rabbits out of hats on demand. He is the lead contributor to Curated SQL, president of the Triangle Area SQL Server Users Group, and author of the books PolyBase Revealed (Apress, 2020) and Finding Ghosts in Your Data: Anomaly Detection Techniques with Examples in Python (Apress, 2022). A resident of Durham, North Carolina, he can be found cycling the trails around the Triangle whenever the weather's nice enough.

Want to Submit Some Feedback?

Did we miss something or not quite get it right? Want to be a guest or suggest a guest/topic for the podcast?
