PolyBase Use Cases
After Episode 180 on PolyBase came out, we received a few questions on use cases for PolyBase. Compañeros, when you ask, we reply! In this episode, we carve out three use cases for PolyBase and ways you might change your workflows based on this SQL Server 2019 feature. If you want to dive much deeper into PolyBase, you can always check out Kevin’s book on Apress!
Listen to Learn
00:38 Intro to the team & topic
02:13 Compañero Shout-Outs
02:58 Doing ETL from a SQL Server and a Postgres server
06:03 How PolyBase fits into ELT
08:46 Predicate push-down in PolyBase V1 vs V2
10:59 When PolyBase can and can’t be thought of as an SSIS replacement
14:23 Our condolences if you have to loop over many servers to get data
15:25 You have the option of scale-out groups with PolyBase
16:49 Nearly static data that needs to be on every server – transactional replication
19:52 Modifications would be a lot easier with PolyBase
23:59 The scenario of cold storage – Kevin loves it
28:20 It all depends on architecture needs, costs, etc.
32:00 Closing Thoughts
It doesn’t have to be just Postgres, and it doesn’t have to be just one data source. This is where PolyBase as data virtualization technology can really play fairly well.
Meet the Hosts
Carlos Chacon
With more than 10 years of working with SQL Server, Carlos helps businesses ensure their SQL Server environments meet their users’ expectations. He can provide insights on performance, migrations, and disaster recovery. He is also active in the SQL Server community and regularly speaks at user group meetings and conferences. He helps support the free database monitoring tool found at databasehealth.com and provides training through SQL Trail events.
Eugene Meidinger
Eugene works as an independent BI consultant and Pluralsight author, specializing in Power BI and the Azure Data Platform. He has been working with data for over 8 years and speaks regularly at user groups and conferences. He also helps run the GroupBy online conference.
Kevin Feasel
Kevin is a Microsoft Data Platform MVP and proprietor of Catallaxy Services, LLC, where he specializes in T-SQL development, machine learning, and pulling rabbits out of hats on demand. He is the lead contributor to Curated SQL, president of the Triangle Area SQL Server Users Group, and author of the books PolyBase Revealed (Apress, 2020) and Finding Ghosts in Your Data: Anomaly Detection Techniques with Examples in Python (Apress, 2022). A resident of Durham, North Carolina, he can be found cycling the trails along the Triangle whenever the weather's nice enough.
Want to Submit Some Feedback?
Did we miss something or not quite get it right? Want to be a guest or suggest a guest/topic for the podcast?