Question: Why does SQL Server tend to use allocation order scans when NOLOCK or TABLOCK is specified?

With the TABLOCK hint it’s obvious—when the table is fully locked, no data changes are allowed so no data movement is possible, so it’s safe to use allocation order scans and still guarantee consistency.

There might be cases where based on cost estimations SQL Server proactively decides to acquire a table or index lock (aka rowset lock) without a hint. In those cases, the implications are the same as when specifying the TABLOCK hint. You can use a trace to figure out what kinds of locks were taken.

With the NOLOCK hint (or setting the isolation level of the session to READ UNCOMMITTED) you tell SQL Server that you don’t expect consistency, so there are no guarantees. Bear in mind though that “inconsistent data” does not only mean that you might see uncommitted changes that were later rolled back, or data changes in an intermediate state of the transaction. It also means that in a simple query that scans all table/index data SQL Server may lose the scan position, or you might end up getting the same row twice.

I’d like to make a short side note before demonstrating inconsistency issues. In SQL Server 2000, my tests showed that allocation order scans were used when NOLOCK or TABLOCK were specified regardless of table size. However, in SQL Server 2005, even with the hints, index order scans were used up to a table size of 64 pages; from this point and beyond allocation order scans were used when one of the hints was specified. The reasoning behind the change in SQL Server 2005 is that the cost of initiating an unordered scan is quite high. Unordered scans are performed on big indexes where the hope is that the benefit outweighs the cost. To see this yourself, you can play with the table size by revising the number of iterations of the loop in the script I provided to create and populate T1. With 300 iterations the table size is in the area of 100 pages. If you change the number of iteration to a small enough value (e.g., 100), the table size will be smaller than 64 pages and then SQL Server 2005 will perform an index order scan even when you specify the NOLOCK or TABLOCK hints.

Next, I’ll demonstrate that you can get the same row twice when using the NOLOCK hint. I’ll recreate the table T1 such that the clustered index on col1 will be defined as a unique index with the option IGNORE_DUP_KEY. This means that duplicate values cannot exist in col1, and also that an attempt to insert a duplicate key will not fail the transaction and generate an error rather just generate a warning. I’ll insert rows with random values in col1 in an infinite loop that breaks as soon as it gets a signal from another session. The signal is in the form of non-existence of a global temporary table called ##DupsNotFound created earlier. From another session, in an infinite loop, I’ll query T1 with the NOLOCK hint, copying the data aside to a temporary table called #T. I’ll check if there’s any col1
value that appears more than once in the temp table (meaning that the same row was read more than once from T1), and if there is, I’ll drop the global temporary table ##DupsNotFound. The non-existence of the global temporary table ##DupsNotFound is a signal to both sessions to break from the infinite loop since we confirmed our suspicions.

From one session run the following code to recreate T1 and insert rows:

SET NOCOUNT ON;

USE testdb;

GO

IF OBJECT_ID('dbo.T1', 'U') IS NOT NULL

  DROP TABLE dbo.T1;

GO

CREATE TABLE dbo.T1

(

  col1   INT NOT NULL,

  filler CHAR(2000) NOT NULL DEFAULT('a')

);

CREATE UNIQUE CLUSTERED INDEX idx_cl_col1 ON dbo.T1(col1)

  WITH IGNORE_DUP_KEY;

GO


IF OBJECT_ID('tempdb..##DupsNotFound', 'U') IS NOT
NULL

  DROP TABLE ##DupsNotFound;

GO

CREATE TABLE ##DupsNotFound(col1 INT);

GO


WHILE OBJECT_ID('tempdb..##DupsNotFound', 'U') IS
NOT NULL

  INSERT INTO dbo.T1(col1)

    SELECT 1 + ABS(CHECKSUM(NEWID()) % 1000000000);

From another session run the following code in text output mode to read the data from T1 and check whether the same row was read more than once:

SET NOCOUNT ON;

USE testdb;

GO

WAITFOR DELAY '00:00:05';


WHILE OBJECT_ID('tempdb..##DupsNotFound', 'U') IS
NOT NULL

BEGIN

  IF OBJECT_ID('tempdb..#T', 'U') IS NOT NULL

    DROP TABLE #T;


  SELECT col1 INTO #T

  FROM dbo.T1 WITH (NOLOCK);


  SELECT col1, COUNT(*) AS cnt

  FROM #T

  GROUP BY col1

  HAVING COUNT(*) > 1;


  IF @@ROWCOUNT > 0

    DROP TABLE ##DupsNotFound;

END

You may have guessed that the reason I entered a delay of 5 seconds is to allow T1 to grow big enough so that SQL Server will consider an allocation order scan. After a few seconds I got the following result from the last iteration of the loop:

col1        cnt

----------- -----------

782866256   2

783744406   2

As you can see, there are two rows that were read twice due to page splits. This is proof that a query with the NOLOCK hint may return the same row more than once, and brings a whole new perspective to the meaning of inconsistent reads.

Is there a difference in performance between allocation order scans and index order scans?

This depends on the level of logical fragmentation of the index. The higher is the fragmentation, the faster is an allocation order scan than an index order scan.

To demonstrate this, first run the following code to recreate and populate T1 with 50,000 rows. It should take a few minutes for the script to run. The script inserts a row at a time (after creating the index) with random col1 values in the range 1 through 1,000,000, introducing a high level of fragmentation:

SET NOCOUNT ON;

USE master;

GO

IF DB_ID('testdb') IS NULL

  CREATE DATABASE testdb;

GO

USE testdb;

GO

IF OBJECT_ID('dbo.T1', 'U') IS NOT NULL

  DROP TABLE dbo.T1;

GO

CREATE TABLE dbo.T1

(

  col1   INT NOT NULL,

  filler CHAR(2000) NOT NULL DEFAULT('a')

);

CREATE CLUSTERED INDEX idx_cl_col1 ON dbo.T1(col1);

GO


DECLARE @i AS INT;

SET @i = 1;

WHILE @i
BEGIN

  INSERT INTO dbo.T1(col1)

    VALUES(1 + ABS(CHECKSUM(NEWID()) % 1000000));

  SET @i = @i + 1;

END

Check the level of fragmentation:

SELECT avg_fragmentation_in_percent FROM
sys.dm_db_index_physical_stats

(

  DB_ID('testdb'),

  OBJECT_ID('dbo.T1'),

  1,

  NULL,

  NULL

);

I got over 99 percents of fragmentation.

Turn on the “Discard results after execution” in SSMS so that the time it takes to generate the output will not be taken into consideration. Run the following code to perform an index order scan after clearing the data cache:

DBCC DROPCLEANBUFFERS;

SELECT * FROM dbo.T1;

This code ran on my system for 33 seconds.

Next, run the following code to perform an allocation order scan using the NOLOCK hint:

DBCC DROPCLEANBUFFERS;

SELECT * FROM dbo.T1 WITH (NOLOCK);

This code ran on my system for 11 seconds; 3 times faster than an index order scan!

Run the following code to perform an allocation order scan using the TABLOCK hint:

DBCC DROPCLEANBUFFERS;

SELECT * FROM dbo.T1 WITH (TABLOCK);

This code ran on my system for 11 seconds—same as with the NOLOCK hint.

Next, run the following code to rebuild the index, and minimize the level of logical fragmentation:

ALTER INDEX idx_cl_col1 ON dbo.T1 REBUILD

  WITH (SORT_IN_TEMPDB = ON, MAXDOP = 1);

Rerun the performance tests. Now I got 7 seconds for the index order scan (no hint specified), and 7 seconds for the allocation order scans (NOLOCK or TABLOCK hints).

Conclusions and Best Practices

The most important advice that I hope that you will carry after reading this blog entry goes back to the standard. ANSI SQL says that a query without an ORDER BY clause is not guaranteed to return the data in any particular order. If you want to guarantee that the data will be returned in a particular order, specify an ORDER BY clause. Regardless of what you know or think you know about the internals of SQL Server this should be your work-premise. There
may be access methods that you’re not aware of that are used in special circumstances, and new access methods might be introduced in future versions of SQL Server or even future service pack levels.

Another lesson I learned from this experience is that I should never trust blindly something without testing it, and this includes things that might seem simple and obvious.

As for more technical conclusions and best practices…

* Creating a clustered index on a table does not guarantee that the data is stored in the file in index key order. Data movement caused by page splits, changing index key values, and expanding dynamic columns generate logical fragmentation. When you create or rebuild an index on an existing table, SQL Server will make effort to create it in a contiguous manner (least amount of fragmentation), but there are no guarantees.

* Allocation order scans of data in the leaf level of an index are considered when you specify NOLOCK or TABLOCK (or when SQL Server decides to take a rowset lock proactively for cost-based reasons); in all other cases SQL Server uses index order scans.

* Be aware of the implications of reading data with the NOLOCK hint (or in a read uncommitted isolation). It’s not just a matter of reading uncommitted changes, or data in an intermediate state of the transaction; rather you might even get the same row twice, or SQL Server might lose the scan position during the scan.

* In a clustered table, the higher is the level of logical fragmentation, the faster is an allocation order scan than an index order scan.

Similarly a table scan against a clustered table can be slower than a table scan against a heap; that’s the case when the index is fragmented and SQL Server used an index order scan.

Hopefully, you WILL NOT be hasty to conclude the following:

  • That you should get rid of all of your clustered indexes and from now on work with heaps; there are many advantages to using clustered indexes but that’s a different discussion
  • hat from now on you should use NOLOCK in all your queries; remember that NOLOCK has serious inconsistency issues
  • That from now on you should use TABLOCK in all your queries; though this guarantees consistent data and a faster scan, it can alsocause serious blocking issues

--
BG