Page Splits—A Quick Primer

Page splits occur when inserting—and in some cases updating—data in a database. If there's not enough room to insert the new value or deal with the ramifications of the updated value of a key, then an additional page may be added to the leaf level of the index (and depending on the state of the intermediate leaf levels in the index pages may be added there as well. Then, rows will shuffle to the left (to the left) or to the right (to the right) to compensate. Sorry, I could not help but channel my inner Cupid Shuffle song stylings there.

Voila... page split!

But what I just described is the page split we all know and loathe. What about benign page splits? They still register as page splits but in fact they occur all the time and are not necessarily evil. These page splits occur when you have an index key that is always incrementing in a single direction (such as a date stamp or identity-constrained integer or big integer column). As you fill up an index page based upon this type of key until it reaches its fill factor (hopefully, in this case, set to 0 or 100), you'll then force a page split where a new page is created at the end of the leaf level of the index and records begin to be added there on the new page until it fills and you go through the process all over again.

The important thing then is to not get sidetracked in looking at page splits as a performance concern without first eliminating benign page splits from your tuning results. That's what we're going to look at today—identifying benign page splits.

Identifying Page Splits as a Whole—The Good & The Bad

Love is a strong word and it took me a long time to even tell my wife I loved her when we were dating. Of course, I made a lot of mistakes back then—heck we celebrate our 20th anniversary next month so I must have overcome those shortcomings (though sometime ask me about the garbage disposal I gave her as our 2nd wedding anniversary present).

Since love is such a loaded word, I'm going to use the phrase greatly appreciate when it comes to describing my feelings in regards to the indexing Dynamic Management Objects (DMO) in Microsoft SQL Server. When fellow SQL Server MVP (and long-time friend) Louis Davidson and I were working out the plans for our book on performance tuning using DMOs (almost-correctly titled Performance Tuning with SQL Server Dynamic Management Views), I was sure to stake my claim to the chapter on the indexing DMOs right at the start of the process. It's one of those functions that is quite valuable with identifying page splits in your SQL Server indexes and heaps.

Below you'll see we can utilize the sys.dm_db_index_operational_stats DMF to pass in parameters for database_id, object_id, index_id and partition_id to collect valuable detailed metadata about the indexes in our environment. In this case, by passing in db_id(), NULL, NULL, NULL respectively I've instructed the query engine that I want information on all partitions for all indexes and objects in the current database:

SELECT TOP 10 SO.[object_id]
        , SO.[name] AS TABLE_NAME
        , SI.index_id
        , SI.[name] AS index_name
        , SI.fill_factor
        , SI.type_desc AS index_type
        , ixO.partition_number
        , ixO.leaf_allocation_count
        , ixO.nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats (db_id(), NULL, NULL, NULL) AS ixO
        INNER JOIN sys.indexes SI
                ON ixO.[object_id] = SI.[object_id]
                        AND ixO.[index_id] = SI.[index_id]
        INNER JOIN sys.objects SO ON SI.[object_id] = SO.[object_id]
ORDER BY ixO.leaf_allocation_count DESC;

Page Split Metadata Lifetime

It's important to understand that the leaf_allocation_count and nonleaf_allocation_count values are reset upon an index rebuild as is illustrated in the following test. Here we take a snapshot of the record in sys.dm_db_index_operational_stats for an index before and after it is rebuilt:

SELECT TOP 1 SO.[object_id]
        , SO.[name] AS TABLE_NAME
        , SI.index_id
        , SI.[name] AS index_name
        , SI.fill_factor
        , SI.type_desc AS index_type
        , ixO.partition_number
        , ixO.leaf_allocation_count
        , ixO.nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats (db_id(), 545241543, 1, 1) AS ixO
        INNER JOIN sys.indexes SI
                ON ixO.[object_id] = SI.[object_id]
                        AND ixO.[index_id] = SI.[index_id]
        INNER JOIN sys.objects SO ON SI.[object_id] = SO.[object_id]


ALTER INDEX dbo_AVAILABILITY_CT_clustered_idx ON cdc.dbo_AVAILABILITY_CT REBUILD;


SELECT TOP 1 SO.[object_id]
        , SO.[name] AS TABLE_NAME
        , SI.index_id
        , SI.[name] AS index_name
        , SI.fill_factor
        , SI.type_desc AS index_type
        , ixO.partition_number
        , ixO.leaf_allocation_count
        , ixO.nonleaf_allocation_count
FROM sys.dm_db_index_operational_stats (db_id(), 545241543, 1, 1) AS ixO
        INNER JOIN sys.indexes SI
                ON ixO.[object_id] = SI.[object_id]
                        AND ixO.[index_id] = SI.[index_id]
        INNER JOIN sys.objects SO ON SI.[object_id] = SO.[object_id]

In both cases the first 7 columns remain the same before and after the rebuild (as they simply describe the index's properties.  You'll see that before the rebuild there was a total of 407,521 leaf level page splits that had accumulated over time:

After the index rebuild, those now are reset to 0. The non_leaf_allocation_count is adjusted to account for all the page splits that occurred in the index rebuild process at the non-leaf levels:

At this point now, any page splits that occur until the next index rebuild will begin to increment upward.

Looking at the Physical Structure of the Indexes Identified

It's important to take not just the page splits into consideration, but also the fragmentation percent and page count under advisement when looking to see if a large number of page splits is even an issue. Why wouldn't any page split be bad?  Well, let's look at the first three indexes identified as having the most leaf level page splits from our first query. We'll pass the values for the table id and index id into another Dynamic Management Function: sys.dm_db_index_physical_stats:

SELECT
        SO.[name] AS TABLE_NAME
        , SI.index_id
        , SI.[name] AS index_name
        , SI.fill_factor
        , SI.type_desc AS index_type
        , ixP.avg_fragmentation_in_percent AS frag_pct
        , ixP.page_count
FROM sys.dm_db_index_physical_stats (db_id(), 123147484, 1, 1, 'limited') AS ixP
        INNER JOIN sys.indexes SI
                ON ixP.[object_id] = SI.[object_id]
                        AND ixP.[index_id] = SI.[index_id]
        INNER JOIN sys.objects SO ON SI.[object_id] = SO.[object_id];


SELECT
        SO.[name] AS TABLE_NAME
        , SI.index_id
        , SI.[name] AS index_name
        , SI.fill_factor
        , SI.type_desc AS index_type
        , ixP.avg_fragmentation_in_percent AS frag_pct
        , ixP.page_count
FROM sys.dm_db_index_physical_stats (db_id(), 402816497, 18, 1, 'limited') AS ixP
        INNER JOIN sys.indexes SI
                ON ixP.[object_id] = SI.[object_id]
                        AND ixP.[index_id] = SI.[index_id]
        INNER JOIN sys.objects SO ON SI.[object_id] = SO.[object_id]


SELECT
        SO.[name] AS TABLE_NAME
        , SI.index_id
        , SI.[name] AS index_name
        , SI.fill_factor
        , SI.type_desc AS index_type
        , ixP.avg_fragmentation_in_percent AS frag_pct
        , ixP.page_count
FROM sys.dm_db_index_physical_stats (db_id(), 123147484, 77, 1, 'limited') AS ixP
        INNER JOIN sys.indexes SI
                ON ixP.[object_id] = SI.[object_id]
                        AND ixP.[index_id] = SI.[index_id]
        INNER JOIN sys.objects SO ON SI.[object_id] = SO.[object_id]

What you'll see may be a bit surprising until you look a bit closer. While page splits were considerable, they were not malignant. In each case the fill factor was set to 0, meaning that the pages were packed full. At the same time, the fragmentation percentage was very low, less than 5 percent in all cases, even though the indexes were of a moderate size. What this leads me to believe is that in these cases (or at least the first two), the indexes were built on keys that auto-incremented upward on a fairly consistent basis—perhaps based on dates or integer values that seldom inter-wove into existing values. It happened occasionally (because there was some level of fragmentation) but it was rare.

However, there is one more bit of information I would want to look at to make sure my assumptions are correct: the last time the indexes were rebuilt. Is it possible that we were seeing large amounts of page splits with little or no fragmentation of note in a short period of time? Certainly. Perhaps a large process loaded a great deal of data in nearly-sequentially. I'm betting, however, that what we'll see is no recent index rebuilds and that the page splits were occuring over a great deal of time. There's one way to find out, but it depends upon how you perform your index maintenance.

Identifying the Last Index Maintenance Date

I'm using Ola Hallengren's Maintenance Solution like many of my peers. If you're not familiar with his suite of maintenance tools, please take the time to check out Ola Hallengren's Maintenance tools. One of the features of this product is the CommandLog table that is available for logging all maintenance commands run against your instance if you choose the logging option. Using this table, I'm able to drill in to see the last time the indexes I'm focused on were rebuilt to get an idea of how long of a period the page splits have been incrementing:

SELECT TOP 1 ObjectName
        , IndexName
        , ExtendedInfo
        , Command
        , StartTime
        , EndTime  
FROM iDBA.dbo.CommandLog
WHERE [ObjectName] = 'Processed'
        AND IndexName = 'pkProcessed'
        AND CommandType = 'ALTER_INDEX'
        AND Command LIKE '%REBUILD%'
ORDER BY EndTime DESC;


SELECT TOP 1 ObjectName
        , IndexName
        , ExtendedInfo
        , Command
        , StartTime
        , EndTime  
FROM iDBA.dbo.CommandLog
WHERE [ObjectName] = 'TrendSumOrg'
        AND IndexName = 'ndxOwnerDate'
        AND CommandType = 'ALTER_INDEX'
        AND Command LIKE '%REBUILD%'
ORDER BY EndTime DESC;


SELECT TOP 1 ObjectName
        , IndexName
        , ExtendedInfo
        , Command
        , StartTime
        , EndTime  
FROM iDBA.dbo.CommandLog
WHERE [ObjectName] = 'Processed'
        AND IndexName = 'ndxProcessedCallHandlingFields'
        AND CommandType = 'ALTER_INDEX'
        AND Command LIKE '%REBUILD%'
ORDER BY EndTime DESC;

In running these three statements no results were returned.  Of course observing a value here is dependent on the amount of data you retain in this table before purging it (of which Ola provides a SQL Agent Job for purging records older than a specific date.)  I tend to keep my data around for 60 days.  If there was a rebuild in that date range the most recent one would be returned in these scripts.  In order to do so I also need to violate a best practice in terms of query performance and sargability.  If I don't include the last predicate of

AND Command LIKE '%REBUILD%'

I would end up getting the latest index rebuild or reorganization. Considering this table is not going to be gigantic and that I'm not running the query frequently, I'm willing to take the hit. Your situation may be different.  When I commented out the

                              AND Command LIKE '%REBUILD%'

statement I returned results that corresponded to index reorganizations that occurred within the last 60 days:

What I really like about querying the CommandLog table is that I can also bypass the query to sys.dm_db_index_physical_stats to get the fragmentation and page count information. I had to scale down the columns returned for ExtendedInfo and Command, but if I were to expand those out this is the sort of information you'd see:

ExtendedInfo returns XML for the fragmentation and page count of the index in question:

<ExtendedInfo>
  <PageCount>2904845</PageCount>
  <Fragmentation>0.478235</Fragmentation>
</ExtendedInfo>

Whereas, the Command column returns the actual command that was executed at the time of the rebuild/reorganization:

ALTER INDEX [pkProcessed] ON [IFW].[dbo].[Processed] REORGANIZE
WITH (LOB_COMPACTION = ON)

End of Part One: The Benign Page Splits

Based upon just looking at the first three indexes with the most page splits, I'd not consider fill factor an issue or the page splits of much concern. I'm looking for situations where I'm seeing high page splits and high fragmentation that typically result from mid-page splits, not the sequential splits I'm seeing here. We'll look at the bad side of splits in a future sequel to this article.