Why is ETL So Hard?
Compañeros! Welcome back to the SQL Trail. If you’ve ever thought about improving your ETL game, this episode is for you. In Episode 43 of the SQL Data Partners Podcast, I talk with Rafael Salas. Rafael is a Data Platform MVP and SQL Server Architect with over 12 years of experience building business intelligence solutions. We talk about the nuts and bolts of your data’s back end and the data warehousing that powers data-driven decision-making in your organization.
Rafael’s Blog
Follow Rafael on Twitter
SSIS: Design principles for robust ETL processes
Implementing a Data Warehouse with SQL Server Jump Start
Designing BI Solutions with Microsoft SQL Server
Power BI
BIDS Helper
Red Gate SQL Prompt
Listen to Learn
- Why it’s hard for IT folks to learn ETL
- Why ETL tools alone aren’t enough to get good data
- How to transition from an ETL mindset to a data architecture mindset
- Why the best data architects spend a lot of time learning their dataset
- Where to start when you want to architect your own data
- The data warehousing method Rafael Salas recommends
- Where to find good ETL learning resources
- How Big Data will affect ETL developers
- The SQL tools Rafael uses daily
This episode is sponsored by COZYROC
Our Guest
Rafael Salas
Rafael Salas is a Business Intelligence architect with more than 14 years of experience providing data architecture solutions. He is a SQL Server MVP, MCTS, and an active member of PASS and local user groups, where he regularly speaks on Power BI, ETL Architecture, and SSIS. He’s also a part-time professor and program advisor for the Computer Technology Institute at Central Piedmont Community College (CPCC), participating in the Business Intelligence continuing education program and the Regional Effort to Advance Charlotte Information Technology (REACH IT) Business Intelligence program. Connect with him on Twitter and LinkedIn.
I always recommend that the first stage of any ETL project is to spend the time on very basic things. Understand your requirements and go and profile the data. Get familiar with the data.
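Profiling, as recommended in the quote above, can start very small. The sketch below is one possible first pass in plain Python, not a method described in the episode: it counts rows, empty values, and distinct values per column and records the minimum and maximum. The function name `profile_rows` and the sample rows are hypothetical, and it assumes each column holds values of a single comparable type.

```python
from collections import defaultdict


def profile_rows(rows):
    """Summarize each column: row count, empty values, distinct values, min, max."""
    values = defaultdict(list)
    for row in rows:
        for column, value in row.items():
            values[column].append(value)

    profile = {}
    for column, column_values in values.items():
        # Treat None and empty strings as missing (an assumption; adjust per source).
        present = [v for v in column_values if v not in (None, "")]
        profile[column] = {
            "rows": len(column_values),
            "nulls": len(column_values) - len(present),
            "distinct": len(set(present)),
            "min": min(present) if present else None,
            "max": max(present) if present else None,
        }
    return profile


# Hypothetical sample rows standing in for a source extract.
sample = [
    {"customer_id": 1, "state": "NC"},
    {"customer_id": 2, "state": ""},
    {"customer_id": 2, "state": "SC"},
]
summary = profile_rows(sample)
for column, stats in summary.items():
    print(column, stats)
```

Even a summary this small tends to surface questions worth asking before any ETL design begins, such as why `customer_id` repeats or why `state` is sometimes empty.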
Meet the Hosts
Carlos Chacon
With more than 10 years of working with SQL Server, Carlos helps businesses ensure their SQL Server environments meet their users’ expectations. He can provide insights on performance, migrations, and disaster recovery. He is also active in the SQL Server community and regularly speaks at user group meetings and conferences. He helps support the free database monitoring tool found at databasehealth.com and provides training through SQL Trail events.
Eugene Meidinger
Eugene works as an independent BI consultant and Pluralsight author, specializing in Power BI and the Azure Data Platform. He has been working with data for over 8 years and speaks regularly at user groups and conferences. He also helps run the GroupBy online conference.
Kevin Feasel
Kevin is a Microsoft Data Platform MVP and proprietor of Catallaxy Services, LLC, where he specializes in T-SQL development, machine learning, and pulling rabbits out of hats on demand. He is the lead contributor to Curated SQL, president of the Triangle Area SQL Server Users Group, and author of the books PolyBase Revealed (Apress, 2020) and Finding Ghosts in Your Data: Anomaly Detection Techniques with Examples in Python (Apress, 2022). A resident of Durham, North Carolina, he can be found cycling the trails around the Triangle whenever the weather's nice enough.
Want to Submit Some Feedback?
Did we miss something or not quite get it right? Want to be a guest or suggest a guest/topic for the podcast?