Downloads
19989.zip

Shrinking Log Files Immediately

Editor's Note: Share your SQL Server tips and tricks with other SQL Server Magazine readers. Email your ideas (400 words or fewer) to r2r@sqlmag.com. If we print your submission, you'll get $50.

You can use the DBCC Shrinkfile statement in SQL Server 7.0 to shrink the size of a specified data file or log file for a related database. But SQL Server professionals often ask me why the DBCC Shrinkfile command doesn't shrink the size of their transaction logs immediately and how to make their transaction logs smaller.

SQL Server 7.0 doesn't immediately shrink log files when you issue the DBCC Shrinkfile command. The DBCC Shrinkfile operation occurs only at checkpoints or transaction log backups. SQL Server segments each physical log file internally into a number of virtual log files (VLFs), which make up the transaction log. SQL Server marks the VLFs as truncateable either after SQL Server has backed them up or at checkpoints. At any given time, you might have VLFs with free or reusable space at the beginning, middle, and end of the log. Only when the VLFs that SQL Server marked as truncateable are at the end of the log file can the DBCC Shrinkfile operation remove the VLFs and shrink the log file. Because SQL Server can shrink a log file only to a virtual-log-file boundary, you can't shrink a log file to a size smaller than the size of a virtual log file—even if you aren't using the log file. (For more information about virtual log files, see SQL Server Books Online—BOL.)

Listing 1 shows a script that shrinks the log file immediately after you stop running the script. The script first marks a shrinkpoint, which tells the DBCC Shrinkfile (or the DBCC Shrinkdatabase) command where to shrink the log file to. The script then forces truncateable VLFs to the end of the log file and issues a BACKUP command to truncate the log. In my experience, you need to run the script for 3 to 4 minutes before stopping it manually.


Checking a Disk Controller's Write Cache


As DBAs, we need to know and monitor a disk controller's write cache configuration because the write cache often has a huge impact on database performance—especially when the database is write-intensive. Unfortunately, in many large enterprise environments, you can't always conveniently access this information because the disk systems often fall within the realm of network administrators who are in different internal organizations and because the controller tools are often vendor specific.

To get a quick feel for the controller's write cache configuration, we developed the simple T-SQL script that Listing 2, page 18, shows. This script performs 20,000 inserts into a table with a single varchar(255) column. To test a disk controller's write caching, execute this script in a user database whose log files SQL Server accesses through that controller. Run the script twice: once with the BEGIN TRAN and COMMIT TRAN statements, then with those two statements commented out. Record the elapsed time for each run.

We found that the write cache isn't available when Run 2 (i.e., the one without the user-defined transactions) lasts significantly longer than Run 1. What is a significant performance difference between the two runs? The answer depends on your disk system's configuration. In our tests (which mostly used Compaq 3200 controllers and 10,000rpm drives), when the write cache was set to zero (or disabled), Run 2 would last at least 10 times longer than Run 1. When write cache was enabled, we saw no significant performance difference between the two runs. Note that because of the small data volume that this script generates, a write cache of a few megabytes would be more than enough to hold all the user-defined transaction's log records.

The significant performance difference that appears when write cache is disabled makes sense because a SQL Server commit operation forces all log records for a transaction to the transaction log file. With the inserts lumped into the user-defined transactions, SQL Server needs to commit only 200 times instead of 20,000 times. When a commit operation requires writing the data to actual hard-disk sectors instead of writing to a controller cache, 20,000 commit operations take a lot longer than 200 commit operations, even on a high-quality disk system.

Note that if you ran this script in SQL Server 7.0's tempdb database, you would find practically no performance difference between runs in which the inserts are lumped in the user-defined transactions and runs without the user-defined transaction. However, you see dramatic performance improvements when you run this script in SQL Server 6.5's tempdb and group the inserts into transactions. These findings seem to confirm that significantly less logging occurs in SQL Server 7.0 tempdb than in SQL Server 6.5 tempdb.

This script is no substitute for tools, such as the Compaq Array Configuration utility, that give you precise and comprehensive configuration information. But this script comes in handy in many situations in which you have no convenient access to such vendor tools.


Another Way to Find Missing Records


Will Haney's "Find and Insert Missing Records" (July 2000) provides a script that finds records that are in an update table but not in a production table, then inserts those records into the production table. The script builds production and update tables to hold data for comparison, then creates a temporary table (#worktable) to hold record keys that are missing from the production table. However, you can replace the tip's script with the following statement, which uses the production and update tables from the original tip but doesn't require the additional temporary table #worktable:

INSERT INTO #production
SELECT *
FROM #update u
WHERE NOT EXISTS (SELECT 1 FROM #production WHERE field1 = u.field1)

Note that the #production and #update tables have the same structure—field1 is the primary key in both tables. If you need to only display such records, you can omit the INSERT statement and just issue

SELECT *
FROM #update u
WHERE NOT EXISTS (SELECT 1 FROM #production WHERE field1 = u.field1)