On April 28, 2011 I presented part of a free presentation on SQL Server Disaster Recovery and Prevention. This online presentation will be available (on-demand – starting the day AFTER the presentation) for 3 months after the initial delivery date, and covers core consideration for addressing database corruption and disasters in SQL Server environments.
Here’s a copy of the abstracts or overview for my session:
Despite increasing trends towards the use of SQL Server High Availability (HA) solutions to pro-actively mitigate down-time, HA solutions simply can't protect against all forms of disaster - especially database corruption. DBAs therefore need to utilize a defense-in-depth approach that properly emphasizes regularly scheduled and tested backups along with the need to regularly check for and know how to properly address problems stemming from data corruption.
In this free, hour-long Web seminar, independent SQL Server expert Michael K. Campbell will talk about best practices for using BACKUP, RESTORE, and DBCC commands to keep your organization's data protected in the event of disaster. You'll also learn how to confidently implement regular backups and address problems with database corruption in emergency scenarios - while gaining practical insight into how to mitigate down-time and decrease data loss.
You will learn the underlying technology and use of how disaster prevention tools can work better for your business. Restoration tools enable users to mount backups as live databases, get access to database backups with read and write options as well as allowing for large space savings and faster backup access.
If you have any questions on corruption or disaster recovery and prevention, I therefore recommend you take an hour and watch this presentation if you haven’t already done so.
During this presentation I blurred through a lot of technical content, and made reference to additional resources. Consequently, I’m providing links to those additional resources in this post.
Additional Background and Tutorials on Backup Basics
As part of my presentation, I covered that there wasn’t enough time to address backup basics – but that I had made 2.5+ hours of Free SQL Server Videos available that cover backups. As such, here’s a link to 2.5+ hours of free SQL Server videos on SQL Server Backups. (And I can’t stress enough, again, that unless you are very familiar with the process of executing backups and recovering databases, then you’re not prepared for disasters.)
Additional Background and information on DBCC CHECKDB (and friends)
My session BLURRED through corruption and correction techniques. Paul Randal (blog | twitter) was part of the team that built DBCC CHECKDB, and provided a phenomenal presentation on DBCC CHECKDB internals – which I recommend that you watch if you’re curious to learn more. (Note, however, that Paul is showing the ins and outs of this feature of SQL Server. Best advice still remains that you should use backups in almost ALL cases to recover from corruption – as explained in my presentation.)
PAGE_VERIFY option for SQL Server Databases
As part of my presentation, I covered how PAGE_VERIFY can be used to pro-actively detect corruption or IO problems. Setting this option is part of ALTER DATABASE and is best set to the CHECKSUM option.
MSDB’s suspect_pages table
During my second demo, I used suspect_pages to determine which pages were corrupt – to speed the restoration process. Managing this table is up to you – as outlined in books online.
SQL Server Page Restores
In my second demo, I showcased how to use the PAGE switch/option in RESTORE to target specific pages. Books Online outlines this technique here – along with guidance/caveats/etc.
Throughout my presentation I kept referring to DBCC CHECKD and ‘friends’ in the sense that DBCC CHECKDB ‘wraps’ or is the super-set of multiple DBCC commands. As with backups, you need to be familiar with these tools and commands if you’re going to be an effective DBA. Consequently, the following links within Books Online are essential reading:
And note, too, that Paul Randal provided additional information/insights/guidance on many DBCC commands back when he was working at Microsoft.
I made an oblique reference to DBCC PAGE – a non-documented feature of SQL Server that you can use to view information in your individual pages. Additional information found here.
Setting up IO Alerts
Finally, as part of my presentation, I recommended that you set up alerts for SQL Server Errors 823,824, and 825. To do this, run the following command:
USE msdb GO EXEC msdb.dbo.sp_add_alert @name = N'823 - Read/Write Failure', @message_id = 823, @severity = 0, @enabled = 1, @delay_between_responses = 0, @include_event_description_in = 1 GO EXEC msdb.dbo.sp_add_alert @name = N'824 - Page Error', @message_id = 824, @severity = 0, @enabled = 1, @delay_between_responses = 0, @include_event_description_in = 1 GO EXEC msdb.dbo.sp_add_alert @name = N'825 - Read-Retry Required', @message_id = 825, @severity = 0, @enabled = 1, @delay_between_responses = 0, @include_event_description_in = 1 GO
Then, you’ll want to zip into the SQL Server Agent, expand the Alerts node, and double-click on each of these alerts, and set up a response from the Response tab (i.e., such as setting up email alerts and so on.)
Update: I received a couple of requests for slides and some questions about the orchestration of my demo. (The demo was pretty simplistic – but I think folks are mostly curious about the order of operations for a RESTORE operation involving the PAGE parameter.)
So, I’ve included a copy of both my slides and a simple demo script in a .zip file (353KB) here.