Should I Be Using SAN Snapshots as a Backup Solution?

While at the SQL PASS Summit I was asked a question about using SAN Snapshots as a way to backup the databases on a SQL Server. The person in question was receiving a mandate from above that all SQL Server backups should be using the SAN Snapshot technology for all database backups and they wanted to know if this was a good idea or a bad idea, and what my thoughts on the topic were.

Related: Why do people purchase SAN over DAS?

My response is pretty basic. As the DBA I want to control the database backups and I don't want to be using SAN Snapshots as a backup, because no matter what any vendor says a Snapshot is not a backup. Here are some of my reasons why I wouldn't want to use SAN snapshots instead of database backups.

1. The DBA can’t control the backup and restore process.

If there’s a problem and the database needs to be restored it’s the DBA that’s going to be thrown under the bus not the storage admin. If I'm going to be the one getting the blame, I'm going to be in control of the situation.

2. There’s no option for page level restore.

If one or two database pages get corrupt do I really want to restore the entire database to the last snapshot and loose all the changes since then? What is the page become corrupt a month ago and it wasn't found until now? Now we have to loose a months worth of data to restore the corrupt page? I want the ability to restore just that page using the native page level restore features which require having actual SQL Server backups.

3. There’s no point in time restore. You are limited to the times when the snapshot was taken. If I want to roll the database to a point between two snapshots that isn’t possible.

When it comes to point in time restore I need the ability to control to which exact point in time the restore happens. Telling me that it'll be restored to what ever point in time it was when the snapshot was taken isn't good enough a lot of the time. I need to be able to restore the a specific millisecond.

4. If the LUN which the database on it fails all the backups are lost (they are snapshots not clones). And the excuse of but that won’t happen isn’t a valid excuse. Anything that can fail, will.

When you are taking snapshots you are assuming that the LUN hosting the original data will still be there. If that LUN goes away for some reason (failed disk, human error, etc.) we've just lost the snapshots as well which means we have no backups.

5. The backups are now stored on the same device as the production data. If the device fails you’ve lost access to your backups until the device is restored.

See above.

6. The backups can’t be compressed.

Those snapshots are going to get large, fast. With native SQL Server backups I've got backup compression (assuming you are running the right version and edition of SQL Server) and I've got 3rd party tools which I can use to compress the backups with.

7. As the DBA I have no control as to how many backups are kept.

While it's awesome that the storage array can keep 500 backups, that means that we are responsible for making sure that 500 additional copies of our data aren't being lost, stolen, copied to another company, copied to another server, mounted to the wrong server, etc. One of the reasons that DBAs only want a small number of backups on site at any one time is so that we don't have to keep track of so many backups and who's touching them.

8. Taking a recoverable snapshot requires pausing the IO within the SQL Server every time the snapshot is taken which can lead to inconsistent performance for the end users.

In order to snap the databases to get these database snapshots we have to checkpoint the database and pause all IO while the snapshot is being taken. For any users who are writing to the database while this is happening they will see their sessions hang for up to 10 seconds while this is happening. They then complain to the DBA that the database is slow when in fact it's the snapshot which is causing the problem.

9. If there’s a problem with a backup there’s no way to know without attaching the backup to a SQL Server, usually after there’s been a major problem. With native SQL backups I can easily restore the backups to another server to test them rolling transaction logs forward as I see fit.

When we take backups we say that you don't have a good backup until that backup has been restored. This means that someone needs to take the backup, restore it to the SQL Server and verify that the database can be restored. With the native backups I can do this very easily and roll the logs forward as much as I'd like, all without any risk of performance impact to the production systems. When taking snapshots as backups we now have to attach every snapshot to test it as each backup is totally independent. This requires attaching the snapshot to another server and attaching the databases, which depending on how much data is in the transaction logs as active and needing to be rolled forward or backward could put a lot of stress on the production disks which are being shared with the snapshot (see number 4 above).

10. If I want to encrypt the backups, I don't have that option with SAN snapshots.

Given that the database backups will at some point be leaving the secure data center they need to be encrypted so that if the tapes are lost the database backup is useless to who ever finds the tapes. As the SAN snapshots can't be encrypted this means that we have to rely on the encryption process within the tape backup vendor who may or may not be doing encryption correctly, and they may or may not put the keys in the same place as the backup. While SQL Server doesn't have an encryption option (other than TDE) as a native feature there are several third party backup products which can encrypt the database backups as they are taken which are known to be secure.

In conclusion, I usually recommend that my SQL Server clients do not use database snapshots. Hopefully if you are being pushed into using SAN based backups like this person was, you can use this as some reasons not to.

Denny

Read more blog posts from Denny Cherry at "Troubleshooting SQl Server Storage Problems."

Discuss this Blog Entry 5

on Nov 21, 2012
Hey Denny - Are you referring to crash-consistency snapshots? Or database snapshots....meaning SQL sees the snapshot as a full backup, and the backup is done via VDI and VSS? If you're talking about crash-consistency dumps on the storage subsystem, then I agree with most/all your points. If you're talking about the latter, then I'd disagree with most of your points. Thanks.
on Nov 27, 2012
I would add, clearing the transaction log for Full recovery Model as well
on Dec 6, 2012
Hello Denny, For point 3, this is not what I read on Microsoft web site: http://msdn.microsoft.com/en-us/library/cc966458.aspx PITR on split-mirror backup are apparently possible. Then on the article globally, snapshot backup is the suggest Microsoft solution for "highly business critical systems". Reference is Microsoft training material for SQL Server 2008. Thanks, Yannick.
on Dec 10, 2012
@Dimitrios Thanks for your answer. Sorry but I'm getting lost... Yes article is old but revised in May 2002 so still relevant I suppose... Yes snapshot and split-mirror are different but not that much in my opinion as in first case the disk block is copied while in second case disk block is copied only when modified on source disk block (unless a simple vector). I assume both are working with VSS unless the copy/snapshot would not be consistent so no interest... Then on Microsoft suggested solution of course it is using VSS, and for me same as split-mirror backup unless, as written above, the copy would not be consistent so no interest. Yannick.
on Dec 9, 2012
@amoericke: Denny writes: "In order to snap the databases to get these database snapshots we have to checkpoint the database and pause all IO while the snapshot is being taken. For any users who are writing to the database while this is happening they will see their sessions hang for up to 10 seconds while this is happening." Obviously, Denny is not referring to crash-consistency backup snapshots. He is referring to consistent database snapshots that are made with VSS and that SQL Server sees as full backups. So, Denny is not referring to crash-consistency backup snapshots. Crash-consistency would be unacceptable for SQL Server backups, anyway. Now, if you disagree with Denny, I would, for one, be very interested as to why you disagree and in what you have to say. @Yannick: First of all, the MSDN article you mention is from the November 2001 issue of SQL Server Magazine! Great, huh? Now, this MSDN article is not about VSS snapshots, because the technology had not been implemented yet. This MSDN article is about split-mirror backups, which have absolutely nothing to do with what Denny is referring to in the article above. The MSDN article refers to a hardware mirror volume being added and then removed from the mirror intentionally. The volume is added intentionally in order for the database to be synchronized on it. Then the volume is removed intentionally, in order to separate the database from SQL Server and provide a snapshot for it. But this snapshot has nothing to do with the Volume Shadow-copy Service (VSS) snapshots Denny is referring to. Denny is referring to a completely different technique. You also mention that the MOC for SQL Server 2008 states that snapshot backups are the suggested Microsoft solution. Of course, but Microsoft refers to VSS snapshot backups, not snapshots taken using the split-mirror backup technique.

Please or Register to post comments.

What's Troubleshooting SQL Server Storage Problems?

Practical advice, insight, and help for core SQL Server considerations.

Contributors

Denny Cherry

Denny Cherry is the owner and principal consultant for Denny Cherry & Associates Consulting and has over a decade of experience working with platforms such as Microsoft SQL Server, Hyper-V,...
Blog Archive

Sponsored Introduction Continue on to (or wait seconds) ×