Question: We’ve moved to using database mirroring for disaster recovery and we’ve hit a problem. We’re used to switching to the Bulk-Logged recovery model when performing index rebuilds but we can’t do that now the database is mirrored. The index rebuild is generating way too much log traffic for the mirror and so we had to stop index maintenance – but now performance is suffering! What can we do?
Answer: This is a common problem that people encounter when moving a production database to be mirrored. Although many people conduct testing of workload performance when prototyping a mirroring environment, in my experience the majority do not include maintenance operations in their testing. Regular maintenance is a part of your business cycle and so should be included in all testing scenarios.
As a quick bit of background, using the bulk_logged recovery model allows an index rebuild operation (using ALTER INDEX … REBUILD) to generate less transaction log, meaning the transaction log does not have to be as large compared to performing the operation in the full recovery model. Database mirroring only allows the full recovery model to be used, however, and so index rebuilds will be fully logged. The amount of 'extra' log that is generated can easily saturate the network link between the principal and mirror, and contribute to a very large SEND queue on the principal. If the SEND queue becomes large, it can equate to more potential for data loss (if the principal crashes) than your data-loss service level agreement can tolerate.
So what can you do?
The answer is to do staggered index maintenance using ALTER INDEX … REORGANIZE. Reorganizing an index only addresses fragmentation that exists, and is interruptible without loss of already-completed work. Rebuild, on the other hand, always builds a new index no matter how much (or how little) fragmentation exists, and if you interrupt it you get nothing – everything is rolled back.
For the larger indexes you have that are not practical to rebuild, perform the following steps:
- Day 1: Start an ALTER INDEX … REORGANIZE during your maintenance window. Let it run for an hour or so. Kill the command. It won’t roll anything back and will have made some progress through the index removing fragmentation.
- Day 2: Start the reorganize again. It doesn’t remember where it got to on day 1, but should quickly traverse the work it did on day 1 and start removing fragmentation from the next portion of the index. Kill it again after an hour or so.
- Repeat until the fragmentation level drops below whatever threshold you’re comfortable with, or just continue the day-by-day process indefinitely.
This method allows you to limit the amount of transaction log that is generated by your regular index maintenance. If you wanted to get more advanced, instead of killing the reorganize process after a certain amount of time, you could monitor how much transaction log is being generated and kill it once a threshold is reached (see this blog post for details of how to do that).