Question: SQL Server has a one minute recovery interval for all databases and I’ve read the SQL Server guarantees that the databases can always be brought online after a crash within one minute. Can you confirm this? It seems unlikely to me.
Answer: The quick answer is that the recovery interval is not guaranteed in any way—it’s simply a goal.
I’ve also seen people stating that it’s an absolute value and SQL Server can always achieve it—to me this demonstrates a fundamental misunderstanding of how logging and recovery works.
Crash recovery has two tasks to perform: replaying log records from committed transactions (called REDO) and removing the effect of log records from uncommitted transactions (called UNDO).
The recovery interval is one of the triggers of a database checkpoint, which brings what’s in the data files up-to-date with what’s in the transaction log—reducing the amount of REDO recovery that needs to be done in the event of a crash. The idea is that after a crash there should only be enough REDO and UNDO to perform that it will take one minute to complete. SQL Server estimates this by counting the number of log records that have been generated for the database since the last checkpoint occurred.
One thing to note is that the recovery interval applies to all databases in the SQL Server instance. In, with the new indirect checkpoints, the recovery interval can be overridden per database with a value less than one minute.
But how can SQL Server *guarantee*any recovery interval? it simply cannot.
Consider the case where an update of millions of rows starts and continues for several hours. Checkpoints will automatically occur for the database, but if a crash occurs, the entire update transaction has to be rolled back. There is no way this can be done in under one minute if there are log records for several hours of a transaction to UNDO.
SQL Server simply has the recovery interval as a goal, which helps to drive automatic checkpoints. I’ve seen plenty of systems where the recovery interval is never hit, because of the propensity of long-running transactions on those systems.
You can read more about this in my article Understanding Logging and Recovery in SQL Server.