Disappearing Database Corruptions

Q: Every so often our system produces an 824 error and our weekly consistency-checking job fails. However, when I run DBCC CHECKDB during the day there are no errors. What’s going on?

A: This phenomenon is fairly common—I call it “disappearing corruptions,” although that’s a little bit of a misnomer. It can be very frustrating, especially when it happens repeatedly, to have your automated consistency checking job fail or to be alerted to an 823/824 corruption problem, but have nothing show up when you run DBCC CHECKDB manually. How can these corruptions just go away? There are two answers that explain this phenomenon: changes to the database and transient I/O subsystem problems.

When DBCC CHECKDB completes, it’s giving you information about all the data file pages that comprised the database at the time that DBCC CHECKDB started running. This is because it creates a database snapshot when it begins, in order to provide an unchanging, transactionally consistent view of the database, which makes the consistency checks much simpler to engineer. This means that if anything in the database changes between successive executions of DBCC CHECKDB, the set of database pages being checked might be different. If a page was corrupt on one execution of DBCC CHECKDB, it might have been deallocated, and therefore is no longer part of the database on the next execution of DBCC CHECKDB. This means it won’t be processed by DBCC CHECKDB, so the corruption won’t be reported any more—and will seem to have disappeared.

What causes pages to be deallocated? Common operations that can deallocate a large amount of database pages are index rebuilds, which are usually performed by a maintenance job. If DBCC CHECKDB runs before the index maintenance job, it’s entirely possible that some pages that were found to be corrupt are subsequently deallocated by index maintenance.

Is this a problem? Yes because the corruptions can’t be investigated any further, although there’s nothing that can be done about it. It’s not a bug or anything wrong with how SQL Server performs operations.

The second cause of the disappearing corruptions phenomenon is transient I/O subsystem problems. These do occur and are the reason read-retry functionality was added to SQL Server in SQL Server 2005. In SQL Server 2005 and later, if a page read fails, SQL Server will retry the read four times before raising an 823 or 824 error. You need to know that read-retry was needed because that’s an early warning that your I/O subsystem is failing. For more information, see my blog post “ A little-known sign of impending doom: error 825 .”

If a corruption problem is transient in the I/O subsystem, obviously it might not show up again the next time DBCC CHECKDB is executed. The best thing to do in this situation is to enable Agent alerts so that you know about these problems as soon as they occur. (For more information about Agent alerts, see my blog post “ Easy monitoring of high-severity errors: create Agent alerts.” )

Discuss this Blog Entry 2

on May 27, 2011
Yes - if you're going to do it as part of the same maintenance job.
on May 27, 2011
Thank you very much Paul... So hving DBCC Checkdb after Re indexing helps us right???

-Aditya

Please or Register to post comments.

What's SQL Server Questions Answered?

Practical tips and answers to many of your questions about SQL Server including database management and performance issues.

Contributors

Paul S. Randal

Paul Randal worked on Microsoft's SQL Server team for nine years in development and management roles, writing many of the DBCC commands. Randal was ultimately responsible for SQL Server 2008'...

Kimberly L. Tripp

Kimberly L. Tripp has been working with SQL Server since 1990, and she’s worked as a consultant, trainer, speaker, and writer specializing in core SQL Server performance tuning and availability...
Blog Archive

Sponsored Introduction Continue on to (or wait seconds) ×