In my previous post, I outlined the importance of SQL Server RTOs and RPOs. I also explained why regularly testing Disaster Recovery (DR) and High Availability (HA) solutions is essential to ensuring that you can actually meet business goals and needs when it comes to overall continuity.
Therefore, to help make regular continuity testing that much easier, I’m happy to officially announce Version 1.0 of SQLContinuitySIM – a totally free tool that DBAs and IT Pros can use to help verify that their HA and DR solutions are meeting RPOs/RTOs by making it easier to test and evaluate gaps in continuity.
SQLContinuitySIM is a very simple application. Point it at a SQL Server and database, have it create a ‘logging’ table, tell it how frequently you want it to log entries to that table, configure the maximum number of errors you will allow it to encounter before terminating, and then turn it on.
Once it’s logging, it will attempt to add a new, very simple, logging message to a special logging table at the interval specified.
You can then test or simulate failures, failovers, and other disaster recovery scenarios while SQLContinuitySIM is still doing its best to log regular, timed, entries into your system and then, once your HA or DR solution is successfully back online, you can then stop SQLContinuitySIM, and then query the logging table it created to get a feeling for what kind of down-time you incurred during your test.
Comparing this downtime, or gap in overall continuity, can then be compared with your RTOs and RPOs to ensure that you’re able to meet stated goals and react accordingly when you’re not.
Once you open SQLContinuitySIM, you’ll need to Specify the SQL Server and database that you wish to connect to for logging purposes.
After you’ve specified connection details, SQLContinuitySIM will turn those into a connection string – which the the connection string that it uses for testing (meaning that you can modify this connection string manually for more advanced testing needs if desired).
A Test button is provided so that you can test your connection string if you’ve modified it or just want to verify that your connection is working correctly.
Once connection details are successfully configured, you will need to create a logging table – or a table that is used to store logging information. To do this, click on the Create Logging Table button.
Then, once the logging table has been created, you’ll then want to set up an error threshold and specify how regularly SQLContinuitySIM should try to log a new entry.
Every time SQLContinuitySIM encounters an error, it will log it to the output screen. And, after SQLContinuitySIM encounters the number of errors specified as the ‘cutoff’ threshold, it will just stop logging (think of this as a bit of a failsafe). Otherwise, if SQLContinuitySIM doesn’t encounter any errors, it will just log new entries as specified, and increment the ‘Generated Logging Entries’ counter for each successful addition.
To start logging, push the Start Logging button. You can similarly stop logging in the same fashion – and you can turn logging on and off as you’d like.
There’s also a textbox that allows you to push ‘manual’ entries into the log file – such that you can log something like “Getting ready to failover” into the logging table to use as a bit of a bookmark. To use this features, just specify the text you want to enter, and push Add Entry.
Otherwise, the application should be largely self explanatory.
And, that said, while this solution IS designed to help you test and validate disaster recovery and high availability solutions IN your production environments, it should go without saying that you’ll want to schedule these tests for times when they won’t cause true disruptions in service.
What makes SQLContinuitySIM powerful is the fact that it’s simple. The connection strings that it generates when you click on the Specify button are rudimentary. But that doesn’t mean you can’t specify your own, more complex, connection strings as needed – such as in cases where you might want to test Database Mirroring by providing a Failover Partner entry as well – and so on.
Otherwise, once you’ve set it up against a given server or database, you can test some very powerful scenarios with minimal effort. Here a some sample ideas or use-cases you can employ with SQLContinuitySIM:
SQL Server’s Recovery Time Interval defaults to roughly 1 minute – which means you should, theoretically, be able to watch your databases come back online in roughly a minute or so after a restart. Are you confident that’s what you’d see in your environment? If not, you can schedule some time to restart SQL Server when you won’t cause any problems and then set up SQLContinuitySIM against one of the databases on your server to test this out. Just set the error threshold high (something like 2000), and use a fairly aggressive logging interval (say 200ms). Once that’s set up, log into your SQL Server and execute the “SHUTDOWN WITH NOWAIT” command. SQL Server should shut down immediately. Then, as soon as it shuts down, restart it. Once you see the errors stop being logged in SQLContinuitySIM you know that your target database is back up, and you can stop logging and then go into the logging table itself and check out how long your continuity down-time was during that restart.
Point SQLContinuitySIM at your cluster, set a fairly high error threshold (say 2000) and a decent logging interval (200ms), then initiate a failover. Once failover is complete (i.e., after you see SQLContinuitySIM stop announcing errors), stop logging and go check out the gap in overall continuity. Or, for extra fun, failback and then check out both of your gaps once failover and failback operations are complete.
Testing Mirroring solutions can be done, effectively, in the same fashion as testing clusters – only you’ll have to provide SQLContinuitySIM with a mirroring-enabled connection string.
Point SQLContinuitySIM at a database and then let it run or log for a little while. Add a ‘manual log entry’ – something like “Developer UPDATEs table without a WHERE clause”. Then let things run a bit longer – to simulate the time it takes for such a problem to come to light. You can then use your backups and transaction log to see if you can recover to the point in time where you made your manual log entry - (or right around that same time frame).
This is a TERRIBLY simple continuity test (and it doesn’t ensure that you could recover from an UPDATE without a WHERE clause because it doesn’t address that solution). But it does provide you with a VERY easy way to see if your backups and log files (and/or log file backups) can let you recover from a very simple disaster. Because if you’re not able to get to this point in time, then you have very real continuity problems that you need to address.
SQLContinuitySIM if Free. It requires .NET Framework 4.0.
It comes with a very simple .msi installer that will help you download and install .NET 4.0 if it’s not already installed.
Or, you can download a stand-alone version of SQLContinuitySIM that doesn’t require any installation (other than .NET 4.0).