Restoring a SQL Server database after a disaster is one of the most important jobs of a systems administrator (sa). But recovery often receives less attention than its companion operation, backup. Most competent administrators know they need to regularly back up the company's mission-critical data. So, to make sure it gets finished—and because making backups is usually a straightforward operation—they assign the job of backing up to a novice member of the administration team. And there's nothing wrong with having novice SQL Server administrators perform backups as long as they use a consistent procedure.
Restore operations, however, are rarely left to novices. Still, because restoring a database isn't a daily operation, a SQL Server administrator might manage databases for years without having to perform an emergency restore. So on the day that you have to perform a restore after a disaster, the many subtleties of the restore process can take you by surprise. Unexpected glitches in the restore can mean that you're searching SQL Server Books Online (BOL) and the Microsoft Knowledge Base for solutions to problems while your entire company is waiting for you to make data available. In addition to being prepared for unexpected problems, you need to test your recovery plan. If you haven't already completely tested your recovery operations in a simulated disaster scenario, start planning such a simulation as soon as you finish reading this article.
In this article, I review various types of backup operations, including full, differential, and transaction log backups. Then, I discuss basic restore operations and describe what SQL Server does when it restores your data. In future SQL Server Magazine articles, I'll discuss details you need to know when you're moving a database to a new location that has different users. I'll also look at problems you might face when restoring an entire SQL Server system instead of just an individual user database.
Backup and Restore
Although backup is usually a straightforward operation, you need to understand what happens during different types of backups so you can plan your restore operations. When you perform a backup, you're copying data, the transaction log, or both to another, presumably safe, location. That location can be a local disk file (which you then copy to tape or other media) or tape. Although you can copy to a remote disk file, writing to a local file and using the OS file copy operations to move the file to another machine is usually more efficient. How fast and how completely you can restore backed-up data depends on the type of backup you've made and how well you've planned your restore operation. (For example, you need to plan ahead for how much space your restore will require, as I explain in the sidebar "Planning the Space for a Restore.") SQL Server 2000 supports three main types of backups: full, differential, and log. For tips about using these backup types, see Michael D. Reilly, "Top 11 Backup Tips," September 2001.
Full backup. A full database backup copies all the pages from a database to a backup device, which can be a local or network disk file, a local tape drive, or even a named pipe. SQL Server also copies the portion of the transaction log that was active while the backup was in process.
Differential backup. A differential backup copies only the extents that have changed since the last full backup. SQL Server 2000 can quickly tell which extents need to be backed up by examining a special page called the Differential Changed Map (DCM) in each file of the database. A file's DCM contains a bit for each extent in the file. Each time you make a full backup, all the bit values revert to 0. When any page in an extent is changed, the page's corresponding bit in the DCM page changes to 1. SQL Server copies the portion of the transaction log that was active during the backup. Typically, you make several differential backups between full backups, and each differential backup contains all the changes since the last full backup.
Log backup. A transaction log backup copies all the log records that SQL Server has written since the last log backup. Even if you've made full database backups, a log backup always contains all the records since the last log backup. Thus, you can restore from any full database backup as long as you have all the subsequent log backups. However, the exact behavior of the BACKUP LOG command depends on your database's recovery-model setting. If the database is using the full recovery model, the BACKUP LOG command copies the entire contents of the transaction log. In the bulk_logged recovery model, a transaction log backup copies the contents of the log and all the extents containing data pages that bulk operations have modified since the last log backup. If the database is using the simple recovery model, you can't perform a log backup because the log is truncated regularly, so no useful information is available. (For more information about recovery models, see Inside SQL Server, "Database Recovery Models," June 2000.) In a typical recovery scenario, an administrator would make a series of log backups between full database backups, with each log backup containing only the log records recorded since the last log backup.
SQL Server supports variations on these basic backup types, including file or filegroup backups, which are useful in environments that use very large databases (VLDBs). For information about file and filegroup backup and recovery, see the sidebar "Backing Up and Restoring Files and Filegroups." The more kinds of backups you make and the more frequently you make them, the more options you have for restoring a database quickly and completely. Restore operations, however, involve more work for SQL Server than backups do. When you perform a complete restore operation, SQL Server must make sure that the data in the database agrees with the transaction records in the transaction log. The process of verifying that the data and the log are in agreement is called database recovery.
Restore vs. Recovery
Database recovery and database restore are similar, but they aren't the same. A complete restore almost always includes a recovery operation, but recovery doesn't have to include a restore. Restore is a manual process of loading backed-up data that an administrator must initiate. Recovery is an automatic process that can occur at the end of a restore operation and occurs every time an administrator restarts SQL Server. Although recovery is usually part of a restore operation, the administrator controls the process, as I discuss later. (For information about a difference between the recovery process of SQL Server 2000 and 7.0, see the sidebar "Recovery in SQL Server 2000 vs. 7.0.")
During database recovery (either at the end of a restore operation or after a restart), SQL Server compares the records of transactions in the transaction log with the data in the database. Recovery performs both redo (roll-forward) and undo (rollback) operations. In a redo operation, SQL Server examines the log and verifies that each change is already in the database. If a committed transaction is recorded in the transaction log but doesn't exist in the data pages, SQL Server redoes the transaction. After a redo, SQL Server guarantees that every change that the transaction made has been applied. If parts of a transaction appear in the database but that transaction was never committed, SQL Server performs an undo operation to remove the changes that the uncommitted transaction made. After the entire recovery process is finished, the database and the transaction log are synchronized. All completed transactions recorded in the log appear in the database, and no part of an uncompleted transaction appears in the database.
Restoring a Damaged Database
If you need to restore a database because of a media failure or because of an error that destroyed or changed data, you can restore the database to the original SQL Server. To restore a database, you start by restoring the most recent full backup. If you're restoring the database because a disk is damaged, you need to restore the database to a new (undamaged) location. By default, a SQL Server restore operation creates the database if it doesn't already exist and uses the same file locations for the data and log files that the original database used. To verify those file locations, you can run the following command against the disk or tape backup file:
To create the restored database in a different location on the same SQL Server, you must use the RESTORE DATABASE command's MOVE option. (BOL contains the complete syntax for the MOVE option.) If your physical media hasn't been damaged and you're restoring only to correct user errors, you might want to restore a backed-up copy of your database to the same location on disk—right on top of your current database.
Note that no users can be in the database when you perform this operation. In SQL Server 2000, you can use the ALTER DATABASE command to change the database status to single-user mode and disconnect users from the database. (BOL contains complete documentation for ALTER DATABASE.) However, SQL Server 7.0 has no automatic way to disconnect users from a database. You can run a script that checks the sysprocesses table for connected users, issue a KILL command for each, then try to change the database to single-user mode—but you can't keep new users from logging in during the process. To try to prevent logins, you can use the OS service manager to put the server in a pause state, but this pause affects more databases than just the one you want to restore. Also remember that using Enterprise Manager to inspect your database or perform the restore uses up a connection to the database, so if you're in single-user mode, you can't simultaneously use Query Analyzer to connect to the database.
Restoring and Renaming a Database
If you want to restore a database and give it a new name, perhaps because you want an identical copy of a database on the same server, you might need to use the REPLACE option with the RESTORE DATABASE command. The REPLACE option is necessary when the database name in the backup files doesn't match the name of the database you're specifying in the RESTORE DATABASE command and the name you're specifying in the RESTORE command already exists on the SQL Server. The REPLACE option helps prevent you from accidentally overwriting one database with the backup of a different database. For example, if you have a backup of the Northwind database in a file and you try to restore the backup into the Pubs database , SQL Server won't let you perform the restore without the REPLACE option. However, if you do specify REPLACE, the backed-up data from Northwind will overwrite the original Pubs data.
You also need the REPLACE option when you use the MOVE option and specify an existing file. The MOVE option lets you use the RESTORE command to recreate a database in a different physical location, perhaps on a new, faster disk. However, if the file specified as the destination of the MOVE option already exists, SQL Server assumes that the existing file belongs to a different database. Usually, RESTORE DATABASE refuses to overwrite existing files, but REPLACE lets RESTORE DATABASE overwrite an existing file. You need to be aware of one caveat when you use the REPLACE option: Usually, the user who is running the RESTORE DATABASE command has to be only the database owner (DBO). But when you're using REPLACE, SQL Server behaves as if you're creating a new database. The user who is running the RESTORE DATABASE command with the REPLACE option must have permission to create a database or be in a group that has such permissions. In my next article, I'll discuss a few other caveats that also apply to a nonadministrator DBO running a restore operation with REPLACE.
Restoring WITH RECOVERY and WITH NORECOVERY
The RESTORE DATABASE command does two things: It copies all the data, log, and index pages from the backup media to the database files, and it applies all the transactions in the backed-up portion of the log. You must determine whether to tell SQL Server to roll back incomplete transactions. If you want rollback, you can use the WITH RECOVERY option of the RESTORE DATABASE command to recover the database. The WITH RECOVERY option rolls back incomplete transactions and opens the database for use. If you plan to restore subsequent transaction log backups, and you don't want to recover the database and have SQL Server perform rollbacks until after the last transaction log is restored, you won't need to use the WITH RECOVERY option. The RESTORE LOG command also lets you specify either WITH RECOVERY or WITH NORECOVERY.
Remember that SQL Server 2000 and 7.0 log backups don't overlap—each log backup starts where the previous one ended. Consider a transaction that makes hundreds of updates to one table. If you back up the log in the middle of the updating and again after the updating is finished, the first log backup will include the beginning of the transaction and some of the updates, and the second log backup will include the remainder of the updates and the commit. Suppose you then need to restore these log backups after restoring the full database. If you choose to restore the first log backup WITH RECOVERY, SQL Server will roll back the incomplete transaction in the first part of the log. If you then try to restore the second log backup, the restore will start in the middle of a transaction and SQL Server won't know what the beginning of the transaction did. You can't recover transactions that occurred after this large update because their operations might depend on part of the update that you lost. So, SQL Server won't let you do any more restoring. The alternative is to run WITH NORECOVERY, which leaves the transaction incomplete. SQL Server will know that the database is inconsistent and won't let any users into the database until you run recovery on it.
So, should you choose WITH RECOVERY or WITH NORECOVERY? If you use the RESTORE command to restore a database or log backup, the default restore option is WITH RECOVERY. But generally, you should use the WITH NORECOVERY option for all but the last log restore. If you make a mistake and forget to specify WITH NORECOVERY, you'll have to restart your restore operation because the database has now been recovered and incomplete transactions have been rolled back. However, if you forget to specify WITH RECOVERY for the last log restore, the fix is simple. Just use the command below to recover the database without specifying any backup device to restore from:
For more information about recovery scenarios that use different combinations of differential and log backups, see Michael D. Reilly, Certifiably SQL, "Backup Strategies."
An incomplete restore is useful if someone accidentally destroys crucial data with an UPDATE or DELETE command and you want to just restore the database to the point before the data was destroyed. If you follow the usual restore procedure and apply the entire last transaction log, you'll reapply the transaction that damaged your database. But SQL Server 2000 and 7.0 let you restore your database up to a specific point in time.
The STOPAT option of the RESTORE command lets you specify a point at which to stop restoring a transaction log. Because each log record includes a datetime value that shows when the transaction began, SQL Server will stop restoring transactions as soon as it encounters a transaction that happened after the specified STOPAT time. If the transaction log backup doesn't contain the requested time (e.g., the time specified is after the time that the transaction log covers), SQL Server generates a warning and the database remains unrecovered, as if you'd run RESTORE WITH NORECOVERY.
In SQL Server 7.0, you can use the STOPAT option only with the RESTORE LOG command. In SQL Server 2000, you can also use STOPAT with the RESTORE DATABASE command, and STOPAT includes log records applied after all the database pages were loaded. However, you can't use STOPAT when you're restoring differential backups because most of the work of a differential restore is replacing data that has changed and no datetime value is associated with the changed pages. In addition, you can't use STOPAT with a file or filegroup restore because all the logs must be applied completely so that the restored file or filegroup is current through the same point in time as the rest of the files in the database.
Restoring WITH STANDBY
As useful as the STOPAT option can be, it's not perfect. What happens if you know that data was destroyed, but you don't know exactly when it happened? For example, you discover at 5:00 p.m. that a crucial table was dropped during the day, but you don't know when. You'd like to restore the database to a time as close as possible to the time the data was damaged.
In a typical restore, as I mentioned earlier, you have the choice of either specifying WITH RECOVERY to roll back incomplete transactions or specifying WITH NORECOVERY. If you use WITH RECOVERY, you can't restore subsequent log backups, but the database is fully usable. If you run WITH NORECOVERY, the database might be inconsistent, in which case SQL Server won't let you use it.
But what if you had a way to combine the two approaches by restoring one log backup, then looking at the data before restoring more log backups? Such a combined approach would be particularly helpful if you're trying to do a point-in-time recovery but you don't know what the right point is.
SQL Server provides an option called STANDBY that lets you recover the database and still restore more log backups. If you restore a log backup and specify
SQL Server rolls back incomplete transactions but keeps a record of the rolled-back work in a specified file called an undo file. The default suffix for this undo file is .ldf because its structure is just like the transaction log's structure, which also uses the .ldf suffix.
The next RESTORE LOG operation reads the contents of the undo file, redoes the operations that were rolled back, then restores the log backup specified in the RESTORE LOG command. If that RESTORE LOG command also specifies WITH STANDBY, the restore again rolls back incomplete transactions but saves a record of those rolled-back transactions. After each RESTORE LOG ... WITH STANDBY operation, the data is consistent because no half-completed transactions are included in the database, so users can access the database and read the data. Thus, you can determine after restoring each log whether a particular change has already taken place. (Keep in mind that you can't modify any data if you've restored WITH STANDBY—SQL Server generates an error message if you try. But you can read the data and continue to restore more logs if you want to.) You must restore the final log WITH RECOVERY (and SQL Server won't keep an undo file) to make the database fully usable.
You can use the RESTORE WITH STANDBY option to try to track down the time that data was damaged, but the process isn't fun. After you restore a log and discover that it contains the undesired operation, you have to try to narrow down the time within that span of log records when the damage occurred. You need to go all the way back to the beginning of the restore process and use the STOPAT option to stop at some point in the middle of the time that the log backup spans. If you examine the data and it's still good, you know the damage happened at a later time; if the damage already shows up, you know the change happened at an earlier time than the middle of the log backup. You repeat this bisecting process, stopping either a little later or a little earlier in the log each time. This tedious process can help you recover to a time right before your data was damaged so that you minimize your data loss.
As an alternative, you can use Lumigent Technologies' Log Explorer 2.5, which can show you all the records in the transaction log and the time each transaction was initiated. You can use the transaction datetime information to determine the point to specify in your STOPAT option, or you can use Log Explorer's facilities to reverse the transaction. But be cautious about using Log Explorer to undo a change unless you have no other option; other changes might depend on the one you're trying to undo. If possible, use Log Explorer for information only and use SQL Server's restore capability to bring the databases back to a desired state.
SQL Server 2000 lets you perform a partial restore of a database in emergencies. Although the description and the syntax of a partial restore are similar to those of file and filegroup restores, the operations are quite different. For restoring file and filegroup backups, you start with a complete database and replace one or more files or filegroups with previously backed-up versions. (For details about file and filegroup backups and restores, see the sidebar "Backing Up and Restoring Files and Filegroups.") For a partial database restore, you don't start with a full database. You restore individual filegroups (including the primary filegroup, which contains all the system tables) to a new location. Any filegroups you don't restore no longer exist, and SQL Server refers to them as offline when you attempt to reference data stored in them.
After a partial restore, you can restore log backups or differential backups to bring the data in the restored filegroups to a later point in time. A partial restore gives you the option of recovering the data from a subset of tables after an accidental deletion or modification of table data. You can use the partially restored database to extract the data from the lost tables, then copy the data back to your original database. (For more information about partial restores, see "The Road to Recovery," September 2001.)
Prepare for the Extraordinary
You can choose from many options when restoring a database, and you have to remember lots of details. When your restore operations are successful, they seem easy and straightforward. However, if you want to do anything unusual with your restore or if the process doesn't work as you expect, knowing what SQL Server is doing during the restore process and what SQL Server expects from you can help prepare you for some of the possible extraordinary circumstances. Start now to develop your step-by-step recovery plan, and test it to make sure it behaves as expected.