Make this misunderstood recovery model work for you
Editor’s Note: This article is adapted from Chapter 9, “Database Environment Basics for Recovery,” of the new book Microsoft SQL Server 2000 High Availability (Microsoft Press, 2003), by Allan Hirt and with technical contributions by Cathan Cook, Kimberly L. Tripp, and Frank McBath.
In looking at what’s most important for averting disaster, people often get wrapped up in technology. Although you need to understand the available technologies so that you can make the best choices, a single technology rarely provides a complete solution. You need to consider many important factors in building a sound disaster-preparedness strategy, including which database recovery model is best for your environment. Your choice of recovery model can have a tremendous effect on what data you can recover after a failure—and the worst time to discover you’ve selected the wrong recovery-model strategy is in the middle of a disaster.
SQL Server 2000 introduced three database recovery models: Full, Bulk_Logged, and Simple. Microsoft created these recovery models in large part to better tie together the concepts of logging, recovery, and performance. However, many DBAs wrongly assume that these recovery models work just like pre-SQL Server 2000 database options such as SELECT INTO/Bulk Copy and Trunc. Log on Chkpt. Although there are some similarities between the recovery models and previous database options, there’s no direct correlation between them. Table 1 lists SQL Server 2000’s recovery models and debunks the common misunderstandings of how each model compares to earlier database-option settings. For example, the Full recovery model isn’t the same as not setting SELECT INTO/Bulk Copy and Trunc. Log on Chkpt. The Full model logs information in a new and, for some operations, more extensive way. You might have learned this the hard way as you watched your batch operations take more time and use more log space than they did in previous releases. But probably the most misunderstood recovery model is Bulk_Logged, which provides similar performance to setting the SELECT INTO/Bulk Copy option but provides different recovery capabilities. Let’s get a clearer picture of how recovery works with Bulk_Logged, see why you might want to switch between Full and Bulk_Logged recovery models, then explore a Bulk_Logged vulnerability that you shouldn’t overlook.
Understanding the Bulk_Logged Recovery Model
The Bulk_Logged recovery model, which minimally logs certain operations, allows some operations to run more efficiently than the Full recovery model, which logs every operation fully. Bulk_ Logged logs only the extents modified during an operation, keeping the active log small and possibly letting you have a smaller defined transaction-log size than the Full recovery model allows. To be able to recover the operation, you need to back up the transaction log immediately upon the completion of any bulk-logged operation.
When you’re in Bulk_Logged mode, SQL Server backs up the transaction log in two steps. First—and this is the big difference between using the Bulk_Logged recovery model instead of the Full recovery model—SQL Server backs up all the extents that the bulk operations modified (I list the specific commands defined as "bulk" in a moment). Second, SQL Server backs up the transaction log as it would during a log backup in the Full recovery model. This two-step process is similar in concept to how a differential backup works, except that with Bulk_Logged, SQL Server backs up only the extents that the bulk operation changed.
The Bulk_Logged model lets some operations occur quickly and with minimal logging (only a bitmap of the changed extents is maintained through the operation), but your recovery options are limited. First, if you perform a bulk operation, the transaction-log backup doesn’t allow point-in-time recovery during a restore. Second, if the data portion of the database isn’t accessible (because, for example, the disks failed), a transaction-log backup isn’t possible after a bulk operation. Bulk_Logged minimally logs the following bulk operations:
- Index creation or rebuilds
- Bulk loading of data (fast load), including (but not limited to) BULK INSERT, Data Transformation Services (DTS) Bulk Load, and bulk copy program (bcp) operations
- SELECT INTO operations when creating permanent tables
- WRITETEXT and UPDATETEXT operations for binary large object (BLOB) manipulation
Technically, you can still have point-in-time and up-to-the-minute recovery when running in Bulk_Logged mode, but this is possible only when bulk-logged operations haven’t occurred since the last transaction-log backup. And the process can be confusing and error-prone. Instead of running in Bulk_Logged recovery mode all the time, changing between the Full and Bulk_Logged recovery models as part of your batch processes might be your best strategy. By switching between these models, you can force transaction-log backups to occur at the most appropriate times, minimizing the potential for data loss.
You still must perform log backups immediately after a batch operation to ensure that all your data is recoverable. Consider this time-line scenario:
- 12:00 A.M.—Transaction-log backup occurs (transaction-log backups occur hourly).
- 12:10 A.M.—Batch operation begins.
- 12:20 A.M.—Batch operation completes.
- 12:47 A.M.—Database becomes suspect due to drive failure.
- 12:50 A.M.—You become aware of the suspect database. You try to access the tail of the transaction log, but you receive the following errors:
Server: Msg 4216, Level 16, State 1, Line 1 Minimally logged operations cannot be backed up when the database is unavailable. Server: Msg 3013, Level 16, State 1, Line 1 BACKUP LOG is terminating abnormally.
At 12:50 A.M., all you can do is restore the database and the logs up to 12:00 A.M. If you had backed up the log at 12:20 A.M., your database would not have been in a bulk-logged state (regardless of whether or not you were running in the Bulk_Logged recovery model setting). You can back up the tail of the transaction log when you’re running in Bulk_Logged mode only if no bulk operations have occurred. By backing up the transaction log immediately after a bulk operation, you are in effect resetting the bulk-logged state so that transaction-log backups can be performed without requiring access to the data portion of the database. So if the database hadn’t been in a bulk-logged state at 12:50 A.M., you would have been able to back up the tail of the transaction log. If the tail of the log had been accessible, you would have had up-to-the-minute recovery and no data loss. Instead, you lose all activity since 12:00 A.M.
Let’s take this concept further by looking at another scenario. Let’s say the database becomes corrupt at 12:15 A.M., in the middle of the batch operation. You know that the tail of the transaction log isn’t accessible because you’re in the process of a bulk operation in the Bulk_Logged recovery model. As we just saw, your data loss is everything past 12:00 A.M. But you could have prevented some—and possibly all—of this data loss. Performing a transaction-log backup at 12:10 A.M., when the database was accessible (right before the bulk operation began), would have at least brought you up to 12:10 A.M., the moment before the bulk operation. If the bulk operation were the only operation occurring from 12:10 A.M. to 12:15 A.M. (when the database became corrupt), you could use the transaction-log backup to bring the database up to 12:10 A.M. Once recovered to 12:10 A.M., you could execute the bulk operation again to bring the database up to the time of the failure and continue it moving forward.
It’s crucial that you back up your transaction log immediately before performing a batch operation and immediately after performing a batch operation. Doing both minimizes the overall potential for data loss in the event of a failure. Remember that if the database is set to the Bulk_Logged recovery model and you’ve performed a bulk operation, you cannot back up the tail of the log even if the transaction-log file is accessible. If you haven’t performed a bulk operation, you can back up the log. Because log backups are possible in the Bulk_Logged recovery model, some people might consider always running in Bulk_Logged mode. However, always running in Bulk_Logged mode can be dangerous because you’re no longer entirely in control of the recovery. Performing bulk operations isn’t necessarily limited to DBAs or systems administrators. Anyone who owns a table can create or rebuild indexes on their tables, anyone with Create Table permissions can use SELECT INTO to create a permanent table, and anyone who has access to text data can manipulate it with WRITETEXT and UPDATETEXT.
Because you don’t control who performs bulk operations or when, it’s important to know and limit when operations are fully or minimally logged. If you’re responsible for data recovery and your environment can’t afford data loss, the only way to minimize data loss is by running in the Full recovery model and controlling changes through the Bulk_Logged recovery model. And you should switch to Bulk_Logged only if and when it’s appropriate for your environment. In some environments, switching might never be appropriate.
If your databases aren’t processing transactions around the clock or if you’re willing to have work-loss exposure to achieve better performance of your batch operations, you might consider a temporary change to the Bulk_Logged recovery model. The best practice, if you determine that you can periodically change to the Bulk_Logged recovery model, is to change recovery models within batch processes, which usually occur after normal working hours. This practice ensures that the window of potential work loss is limited to the safest times of the day. When you change to Bulk_ Logged during the batch operation, perform a log backup before the switch, then change back to the Full recovery model when the operation is complete, performing a log backup after the switch. To see how to programmatically and optimally change recovery models during a bulk operation, download the script DB Alter for Batch Operation.sql by clicking Download the Code.
In addition, you might consider breaking large or complex batch operations into smaller, simpler batch operations to keep the transaction log from growing excessively large. To minimize the potential for data loss (because you can’t back up the tail of the log if the database becomes suspect), consider performing log backups during the batch process and between some steps of the bulk operations. Breaking down large or complex operations and performing log backups between the larger steps gives you more recovery options.
A Recovery-Model Test
To see how your choice of recovery model can affect the database’s transaction-log size (the amount of space required to log the operation), the transaction-log backup size, and the operation’s speed let’s look at a simple test that uses the SELECT INTO operation to create a new table called TestTable, based on a table called charge from another database. The charge table has 800,000 rows, and the data is roughly 40MB in size.
As Table 2 shows, the Simple and Bulk_Logged recovery models seem to provide the same performance and the same active log size. However, recovery models don’t affect all operations. In a second test, I performed an UPDATE operation against all 800,000 rows in TestTable. As Table 3 shows, compared to the SELECT INTO operation, the UPDATE operation caused the transaction log for all three databases to grow significantly to handle the modification, but the UPDATE operation’s duration and the size of the transaction-log backup (where a transaction-log backup was permitted) were the same for all the recovery models.
In looking at the test results in Tables 2 and 3, you might think that Bulk_Logged is the best recovery model to use because it seems to allow transaction-log backups and because the operations affected by recovery models run faster than with the Full recovery model. However, remember that the transaction log isn’t always available for a transaction-log backup when you’re running the Bulk_Logged recovery model. If the device on which the data resides isn’t available when SQL Server attempts a transaction-log backup, SQL Server can’t perform the transaction-log backup, resulting in data loss. Thus, up-to-the-minute recovery isn’t always possible with Bulk_Logged. Take some time to get familiar with the different recovery models and their trade-offs and determine out how they affect speed, logging, and recovery for your production databases.