Transaction log corruption and backups

Question: I came across a situation where a transaction log backup failed because of corruption in the transaction log. One of my colleagues put the database offline and then online and problem went away. Can you explain why and whether log corruption can lead to a suspect database?

Answer: Transaction log corruption is interesting because it doesn’t usually cause any problems apart from failed backups. However, that doesn’t mean it should be ignored.

As I’ve stated many times, the vast majority of corruptions are caused by the I/O subsystem and a transaction log file is just as likely to be corrupted as a data file. Unless the corruption occurs in the active portion of the log (the portion that is required for some reason by SQL Server – see this article for background information) then its likely that no-one will ever know it occurred.

Related: Do log records ever move in the transaction log?

This may seem disturbing to you, and in a way it is – undiscovered corruption isn’t good – but there’s no way for SQL Server to process the entire log, only the active portion. DBCC CHECKDB doesn’t analyze the log at all any more since I rewrote it in SQL Server 2005, instead it will only use the active portion of the log as a by-product of creating a database snapshot to run the consistency checks on. There’s no consistency checking of the transaction log – only checksums that are checked as log records are read, for whatever reason.

The active portion of the log will also be used if a transaction rolls back, if crash recovery has to run, or when a transaction log backup occurs (plus a bunch of other uses like replication and mirroring – the aforementioned article has more information). If a damaged log record is encountered during one of these uses then an error will be thrown and the operation will fail.

The only time a database will be marked SUSPECT because of a corruption in the transaction log is if the corruption is encountered during crash recovery or during a transaction rollback. In that case, the operation will fail and the database will be transactionally inconsistent – the definition of a suspect database.

If a corrupt log record is encountered during a transaction log backup, the backup will fail – but that’s all. You can work around this by:

  • Switching the database to the Simple recovery model
  • Performing a checkpoint (which should clear the active log as long as nothing else requires the log to be kept active)
  • Switching back to the Full recovery model
  • Reestablishing the log backup chain by performing a full or differential backup

This isn’t guaranteed to work in all situations, but is the simplest method to have the backups essentially skip the damaged portion of the log. Just be aware that by doing this, you’re limiting your disaster recovery options if the data backups become corrupt and you want to be able to restore a long chain of log backups – you’ve broken the chain at the point you switched to Simple. However, you really have no choice as there’s no way to back up that log and be able to use the log backup in any meaningful way.

Of course, as in any corruption situation, make sure to do some root-cause analysis to figure out why the log became corrupt in the first place.

Discuss this Blog Entry 2

pc3500 (not verified)
on Jan 30, 2012
If the transaction log file still exist and we are getting error while backup, is there a way to backup as much log data as we could other than the corrupted log data ? In production we do not simply want to sacrifice the log file. e.g. one of the technent article and your own old articles links below, http://www.sqlskills.com/BLOGS/PAUL/post/CHECKDB-From-Every-Angle-EMERGENCY-mode-repair-the-very-very-last-resort.aspx http://technet.microsoft.com/en-us/magazine/dd450974.aspx
on Mar 11, 2012
I ran into this issue with one database out of 24 all residing on the same volume. Taking the database offline and bringing it back online fixed the backups until it happened again a day or two later. It turned out that this one database had auto close enabled. After disabling auto close, the problem never resurfaced. There were no signs of hardware issues or any other issues with this particular server.

Please or Register to post comments.

What's SQL Server Questions Answered?

Practical tips and answers to many of your questions about SQL Server including database management and performance issues.

Contributors

Paul S. Randal

Paul Randal worked on Microsoft's SQL Server team for nine years in development and management roles, writing many of the DBCC commands. Randal was ultimately responsible for SQL Server 2008'...

Kimberly L. Tripp

Kimberly L. Tripp has been working with SQL Server since 1990, and she’s worked as a consultant, trainer, speaker, and writer specializing in core SQL Server performance tuning and availability...
Blog Archive

Sponsored Introduction Continue on to (or wait seconds) ×