What Happens if I Drop a Clustered Index?

Q: I’ve heard that the clustered index is “the data,” but I don’t fully understand what that means. If I drop a clustered index, will I lose the data?

A: I get asked this question a lot, and I find that index structures tend to confuse people; indexes seem mysterious and, as a result, are unintentionally thought of as very complicated. A table can be stored internally with or without a clustered index. If a table doesn’t have a clustered index, it’s called a heap. If the table has a clustered index, it’s often referred to as a clustered table. When a clustered index is created, SQL Server will temporarily duplicate and sort the data from the heap into the clustered index key order (because the key defines the ordering of the data) and remove the original pages associated with the heap. From this point forward, SQL Server will maintain order logically through a doubly-linked list and a B+ tree that’s used to navigate to specific points within the data.

Related: Use CREATE INDEX's DROP_EXISTING Clause When Recreating a Clustered Index

In addition, a clustered index helps you quickly navigate to the data when queries make use of nonclustered indexes—the other main type of index that SQL Server allows. A nonclustered index provides a way to efficiently look up data in the table using a different key from the clustered index key. For example, if you create a clustered index on EmployeeID in the Employee table, then the EmployeeID will be duplicated in each nonclustered index record and used for navigation from the nonclustered indexes to retrieve columns from the clustered index data row. (This process is often known as a bookmark lookup or a Key Lookup.)

However, all of these things change if you drop the clustered index on a table. The data isn’t removed, just the maintenance of order (i.e., the index/navigational component of the clustered index). However, nonclustered indexes use the clustering key to look up the corresponding row of data, so when a clustered index is dropped, the nonclustered indexes must be modified to use another method to lookup the corresponding data row because the clustering key no longer exists.

The only way to jump directly to a record in the table without a clustered index is to use its physical location in the database (i.e., a particular record number on a particular data page in a particular data file, known as a row identifier—RID), and this physical location must be included in the nonclustered indexes now that the table is no longer clustered. So when a clustered index is dropped, all the nonclustered indexes must be rebuilt to use RIDs to look up the corresponding row within the heap.

Rebuilding all the nonclustered indexes on a table can be very expensive. And, if the clustered index is also enforcing a relational key (primary or unique), it might also have foreign key references. Before you can drop a primary key, you need to first remove all the referencing foreign keys. So although dropping a clustered index doesn’t remove data, you still need to think very carefully before dropping it.

I have a huge queue of index-related questions, so I’m sure indexing best practices will be a topic frequently covered in this column. I’ll tackle another indexing question next to keep you on the right track.

Discuss this Blog Entry 4

on Apr 26, 2010
Hi,

I wish all my teachers were like you.

Thanks again.



Baburaj (not verified)
on Jul 6, 2010
Thank you very much for this article :)
on Feb 29, 2012
Hey Kim Aree so much 'Marcos', had you've been my teacher, I probably would be writing my own DB engine :-) You guys are AWSOME!!!! Varun | DB Architect
Maninder (not verified)
on Apr 29, 2010
Well explained again Kimberly as always... Thanks

Please or Register to post comments.

What's SQL Server Questions Answered?

Practical tips and answers to many of your questions about SQL Server including database management and performance issues.

Contributors

Paul S. Randal

Paul Randal worked on Microsoft's SQL Server team for nine years in development and management roles, writing many of the DBCC commands. Randal was ultimately responsible for SQL Server 2008'...

Kimberly L. Tripp

Kimberly L. Tripp has been working with SQL Server since 1990, and she’s worked as a consultant, trainer, speaker, and writer specializing in core SQL Server performance tuning and availability...
Blog Archive

Sponsored Introduction Continue on to (or wait seconds) ×