Iron out contention-causing page allocations
Last month, in "Mixed Extent Usage" (InstantDoc ID 43418), I discussed the algorithm that SQL Server 2000 and 7.0 use to allocate space to a table or index. I also introduced the undocumented DBCC EXTENTINFO command. This command can show you which extents SQL Server has allocated to an object as uniform extents (eight consecutive pages) and which pages it allocated from mixed extents—that is, one page at a time. Knowing when and how often SQL Server allocates mixed extents can help you determine whether mixed-extent allocation is causing concurrency problems. Let's examine the reasons that too many mixed-extent allocations can be a problem, then look at several possible solutions.
The Ups and Downs of Space Allocation
For effective space management, allocating single pages from mixed extents for the first few pages of a table is generally a good thing because SQL Server has no way to know how big the table will grow to be. However, single-page allocation has a drawback. If you're creating many small tables concurrently, you can encounter a special kind of contention that can be hard to troubleshoot. Brian Moran discussed this problem in his October 23, 2003, SQL Server Perspectives column, "Is Your Tempdb Stressed Out?" (InstantDoc ID 40615). Moran described a situation in which many users were running a stored procedure that created tens of thousands of temporary tables quickly in a client's tempdb database. As I mentioned last month, each object—including temporary tables—needs to allocate two pages from mixed extents when it gets its first row, so in this situation, you have tens of thousands of simultaneous requests for page allocation.
The problem occurs because SQL Server uses a special allocation page called a Secondary Global Allocation Map (SGAM) to find unallocated pages in mixed extents. I discussed SGAMs in my April 1999 column, "The New Space Management" (InstantDoc ID 5110), so I won't go into detail here about their structure. When hundreds or thousands of concurrent processes need to access the same SGAM nine times for each table (once for the table's Index Allocation Map—IAM—and once for each of the first eight data or index pages), the SGAM becomes a bottleneck. The most obvious symptom of contention on the SGAM is excessive latching, which is the locking of the physical pages as SQL Server reads them from or writes them to disk. You can detect this latching by looking at the lastwaittype column in the sysprocesses table for a value of pagelatch_io. The waitresource column in the sysprocesses row for the connection will show 2:1:1 (PFS Page) or 2:1:3 (SGAM Page) as a resource that the connection is waiting on. (For information about how to use sysprocesses' lastwaittype and waitresource columns, see my August 2003 column, "Track Down Troublemakers," InstantDoc ID 39453. For other information about using waits for performance tuning, see Tom Davidson's January 2004 article, "Opening Microsoft's Performance-Tuning Toolbox," InstantDoc ID 40925.)
Moran refers to the Microsoft article "FIX: Concurrency enhancements for the tempdb database" (http://support.microsoft.com/?id=328551), which mentions three possible solutions to the contention on SGAM pages. One solution involves applying a patch to SQL Server to change the algorithm SQL Server uses to find a page on a mixed extent. The original algorithm always looks at the first SGAM in a file, and if it finds no available mixed extents, it tries a second SGAM (if there's more than one in that file). The patch changes the algorithm to a round-robin scheme, which means that SQL Server will start by looking at a different SGAM each time a process needs a page from a mixed extent and the queue will be shorter because you have less contention for each SGAM. Microsoft incorporated this change into SQL Server 2000 Service Pack 3 (SP3), so if you've already installed that service pack, your system is running the new algorithm. Note that this fix works only if you have multiple SGAMs in a single file in tempdb. Because each SGAM services 4GB of file space, a tempdb that uses only one data file would have to be larger than 4GB.
A second solution requires that you redefine the structure of tempdb by placing it on multiple files. Each file always has at least one SGAM, so even for relatively small files—say half a gigabyte each—you'll have more than one SGAM. With a database on multiple files, SQL Server always proportionally allocates space from each file, so it automatically uses multiple SGAMs without needing a special patch. This way, you can get some relief from contention on the SGAMs even for small tempdb files.
The third solution involves using a trace flag that prevents SQL Server from allocating any data or index pages from mixed extents. After you enable this flag, the only mixed-extent allocation will be for the object's IAM page; SQL Server will allocate all other pages from uniform extents. You can see this behavior for yourself by running the code that Listing 1 shows. I used this code last month to illustrate the output of DBCC EXTENTINFO, but this time, it turns on the new trace flag before it starts populating the table. The final step of this code (at callout A) runs the DBCC EXTENTINFO command, which should show that SQL Server has allocated one uniform extent to the table and no pages from mixed extents.
One Problem, Multiple Solutions
If you're already running SQL Server 2000 SP3 and you suspect contention on SGAM pages because lots of processes show a waitresource value of 2:1:1 (PFS Page) or 2:1:3 (SGAM Page) in the sysprocesses table, you can try separating tempdb into multiple files or turning on the 1118 trace flag. The drawback to separating tempdb into multiple files is that if you're creating tens of thousands of tables quickly, you'll still have contention; it will just be spread across the SGAM pages in each of the files. Each SGAM will have less contention than if you had only one SGAM page, but you could still have enough contention to slow down your system. The drawback to using the trace flag is that your databases might end up being much larger because trace flag 1118 applies to all databases, not just tempdb. However, if it solves the problem of excessive pagelatch_io waits on the SGAM pages, this solution might be worth using. Using the undocumented command DBCC EXTENTINFO can show you how SQL Server performs its page allocations. So, you can verify that using the trace flag is producing fewer mixed-extent allocations, then run your own tests to determine how your code affects your SQL Server space management.