Do You Have a HEAP of Hurting?
Are you giving yourself a HEAP of trouble by using tables with no clustered indexes? But it is faster you say . . . because you don’t have to worry about the sorting when inserting. Our guest, Rick Lowe, suggests everything is fast until you decide to SELECT, UPDATE, or DELETE. We talk about why you may or may not want to use HEAP tables. I know I was surprised to get his take on the use of HEAP tables and I think you will be, too. Do you agree with our assessment? Let us know in the comments below.
“A heap is a table that doesn’t have a clustered index. What that really means is a table that doesn’t have a primary key. There are times when the primary key and clustered index could be different, but more often than not, they’re the same.”
“The case I try to make when I talk about them is that there might not be as many use cases for heaps as we assume. They’re great for write-only workloads doesn’t mean that they’re great for workloads where you mostly write and sometimes do other things.”
“If it’s a situation where the table is going to just kind of be loaded and then just radically, completely and totally transformed and destroyed and written somewhere else, that could possibly be a use case for a heap.”
Listen to Learn
01:09 Compañero Shout-Outs
02:03 Intro to the guest and topic
05:09 Heaps are great if your workload is write-only
06:15 Deadlocks happen & Definition of “small table”
07:35 Heaps with ETL vs ELT
11:53 The SELECT INTO use case
12:58 Are we just skirting around heaps when we’re using an identity column?
14:19 Scenarios where you might use a heap
15:48 GUID as your primary key?
16:36 What about hot page scenarios?
19:25 It’s an awful idea to create a clustered index and then drop it
22:18 A heap may be better than a really poorly designed clustered index
24:32 It is okay to just stay away from heaps, especially if you’re new to execution plans
28:11 SQL Family Questions
31:22 Closing Thoughts
Rick is a Microsoft Certified Master with more than 15 years of experience working with SQL Server. He currently works as an independent performance DBA/Developer for clients across the USA and Canada. He started his career as a database developer, but over time became more and more interested in how the database engine operated … eventually becoming more focused on operational and performance issues than code.
Rick will work with all things relational, but most enjoys helping smaller companies get better performance from MSSQL, as well as smoothing over relationships between DBA and development teams.
Rick’s blog: https://dataflowe.wordpress.com/
Meet the Hosts
With more than 10 years of working with SQL Server, Carlos helps businesses ensure their SQL Server environments meet their users’ expectations. He can provide insights on performance, migrations, and disaster recovery. He is also active in the SQL Server community and regularly speaks at user group meetings and conferences. He helps support the free database monitoring tool found at databasehealth.com and provides training through SQL Trail events.
Eugene works as an independent BI consultant and Pluralsight author, specializing in Power BI and the Azure Data Platform. He has been working with data for over 8 years and speaks regularly at user groups and conferences. He also helps run the GroupBy online conference.
Kevin is a Microsoft Data Platform MVP and proprietor of Catallaxy Services, LLC, where he specializes in T-SQL development, machine learning, and pulling rabbits out of hats on demand. He is the lead contributor to Curated SQL, president of the Triangle Area SQL Server Users Group, and author of the books PolyBase Revealed (Apress, 2020) and Finding Ghosts in Your Data: Anomaly Detection Techniques with Examples in Python (Apress, 2022). A resident of Durham, North Carolina, he can be found cycling the trails along the triangle whenever the weather's nice enough.