Bigger pages in SQL Server 7.0 accompany powerful new features and more efficient I/O

You might have heard the news: Pages in SQL Server 7.0 are no longer 2KB in size; they're 8KB. This change was necessary for Microsoft to implement SQL Server 7.0's powerful new features. This page capacity growth required other features to change in order to efficiently support the new, larger pages.

In this article, I'll describe details about how the system organizes 8KB pages for storing data rows. I'll also show you how you and your applications can benefit from the new, larger page size.

Page Structure


SQL Server 7.0 divides the 8KB (8192 bytes) on a data page into three sections, in much the same way that SQL Server 6.5 organizes a data page. Figure 1 shows the three sections of a data page in SQL Server 7.0. In SQL Server 6.5, the page structure was the same, but the sections were smaller. The first section on the SQL Server 7.0 data page is the header, which uses 96 bytes. For efficient reading, the header has three cache lines of 32 bytes each. (Figure 1 doesn't show this three-part subdivision.) The first 32-byte line includes the most frequently needed information about a page, such as the page type and some status information. The second 32-byte line includes less frequently needed information. SQL Server 7.0 doesn't use the third line but reserves the space for future use.

The data rows follow the page header. The number of data rows on a page depends on the size of the rows, but 8096 bytes remain following the header (the pages are 8192 bytes total with 96 bytes for the header). The 8096 bytes allow for big rows. Rows still can't span pages, but SQL Server 7.0 page size limits you less than SQL Server 6.5's single row limit of 1962 bytes.

The last element on the page is a row offset array, which some documentation calls a row offset table. I prefer to use the term array for this structure to avoid confusion with a relational database table. The array has a 2-byte slot for each row in the table. The slot contains the byte offset on the page in which you can locate each row. The array starts with slot 0 at the end of the page, and slot numbers increase as the array grows inward from the bottom of the page. Each row has a row number on the page, and that number is the index in the array for the start of the row. For example, if you need to retrieve row 2 from a data page, SQL Server looks in the third 2-byte slot from the end of the page and finds a byte offset. That offset is the page position of the data row. Figure 2, page 22, shows a data page and its offset array, which identifies the byte offset for each of the three rows on the page.

Advantages


What do you gain from this new bigger page size, in addition to longer rows? First, I/O is more efficient. When SQL Server 7.0 reads in a page (its basic unit of input), the read now gets four times as much data as in SQL Server 6.5. The greater number of available bytes allows longer character fields. In SQL Server 7.0, columns of type char and varchar can now be up to 8000 bytes long. That length accommodates 8000 characters, unless you are using a Unicode datatype, which has a 4000-character limit.

Disadvantages


Because the page size changes, you can't use your SQL Server 6.5 databases with SQL Server 7.0. You must re-create each database in a new physical location. For a large database, an upgrade can take a lot of time, depending on your hardware. However, Microsoft has provided a wizard to step you through the SQL Server 7.0 upgrade, and the process is non-destructive. You preserve your SQL Server 6.5 database and create a whole new SQL Server 7.0 database. Upgrading is a one-time operation. You will soon forget the time you spent on the upgrade as you start exploring the new features and improved performance of SQL Server 7.0.

Microsoft made other key changes in SQL Server 7.0 to circumvent disadvantages that would arise with the 8KB pages. For example, SQL Server 7.0 has row locking instead of page locking, which SQL Server 6.5 uses. Most people learned to live with page-level locking in SQL Server 6.5, in which the pages were 2KB in size. With 8KB pages, however, page locks mean that you lock four times as much data and potentially block many more users than when the pages were 2KB. Microsoft realized such blocking is unacceptable, and the company has implemented full row-level locking for SQL Server 7.0.

Besides the page-locking problem, another potential disadvantage of larger page sizes in SQL Server 7.0 involves page splitting. If a page in a table with a clustered index is full and SQL Server needs to insert a new row into that page, SQL Server will split the page and put half the rows onto a newly linked page. Moving the data can take four times as long in SQL Server 7.0 as in SQL Server 6.5 because each page can have four times as much data. However, in SQL Server 6.5, the split was part of the insert transaction. With SQL Server 6.5, rolling back the insert involved undoing the split and the row movement. In SQL Server 7.0, Microsoft has changed the behavior of page splits; a transaction rollback does not need to undo the split. Even if you roll back the insert, the split has already occurred, and the next process that needs to insert onto the previously full page gets a lucky break: The page that used to be full has room for a new row.

Changes in Data Rows


Let's look at some changes in the structure of the individual data rows. A few more bytes of overhead occur for rows with all fixed-length columns. If no variable-length fields occur in a row, SQL Server needs room for the data, plus 6 overhead bytes in the row, plus 1 or more bytes for the bitmap to keep track of which columns are null. For SQL Server 7.0 to mark null values as null, SQL Server stores a bitmap in each row, with one bit for each column in the table. The size of the bitmap in bytes is the number of columns divided by eight rounded up to the nearest integer. If a table has eight or fewer columns in the table, SQL Server needs 7 bytes (6 overhead bytes plus the single null bitmap byte) for the row overhead. To compute the number of rows that will fit on a page, you must take into account the 2 bytes in the row offset array for each row. To compute how many rows will fit on a page, you need to divide 8096 by the size of the fixed-length fields plus nine.

SQL Server 7.0's tracking of nulls in a special bitmap in each row gives you several advantages over SQL Server 6.5. One advantage is that type bit columns can now be null-able. Also, columns that allow nulls are no longer considered variable-length fields. Unless the column is of a variable-length datatype, a column that is null will use the full amount of space; this way, a column of type int not null will always use exactly 4 bytes, whether it contains a null or not. Because SQL Server 6.5 treated nulls as zero-length strings, you could not specify a true zero-length field, such as a character string with no characters (an empty string). In SQL Server 7.0, because nulls have their own means of storage, you can have true empty strings. A varchar(20) field, for example, can now have anywhere from 0 characters to 20 characters.

If your rows have any variable-length columns, you must take a few more things into account when determining the row length. In addition to the length of the variable-length columns, SQL Server needs 4 bytes for the first variable-length column and 2 bytes for each additional variable-length column.

Get Ready to Rebuild


The change in page size from 2KB to 8KB requires you to rebuild all your data. Microsoft has made substantial changes to how you access and manipulate data in order to accommodate the new page structures. The change provides many advantages to you as a database administrator (DBA) or developer, including longer rows, longer character columns, empty strings that are truly empty, and null columns that are truly null.