Why Isn’t the Mirroring Partner Timeout Working?

Q: I’ve read that I must increase the mirroring partner timeout when combining database mirroring and failover clustering. I tried a manual failover after increasing the timeout, but database mirroring immediately failed over. What am I doing wrong?

A: The mirror server and witness server verify whether the principal server is available by essentially pinging it once per second. The number of pings that have to go unanswered before the principal server is declared unavailable is the mirroring partner timeout.

The mirroring partner timeout therefore controls how long the mirror server waits for a response from the principal server. If a witness server is also configured, and the witness server agrees that it also can’t get a response from the principal server, the mirror server will initiate an automatic failover.
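To see what a mirroring session is currently using, the timeout and witness status can be read from the `sys.database_mirroring` catalog view. This is a minimal sketch rather than a definitive script; it assumes you run it on one of the partner instances and that at least one database on that instance is mirrored.

```sql
-- List the partner timeout and witness status for each mirrored database.
SELECT
    DB_NAME(database_id)          AS database_name,
    mirroring_role_desc,          -- PRINCIPAL or MIRROR
    mirroring_state_desc,         -- for example, SYNCHRONIZED
    mirroring_safety_level_desc,  -- automatic failover needs FULL safety plus a witness
    mirroring_connection_timeout, -- the mirroring partner timeout, in seconds
    mirroring_witness_name,
    mirroring_witness_state_desc  -- state of the witness connection
FROM sys.database_mirroring
WHERE mirroring_guid IS NOT NULL; -- skip databases that are not mirrored
```

A `mirroring_connection_timeout` of 10 is the default.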

As an aside, there's a misconception that the witness server initiates the failover. However, that’s not true—the witness server exists solely to agree (or not) with the mirror server about the state of the principal server. When the witness server and mirror server agree, the mirror server is said to have "quorum" and can initiate the failover.

Getting back to the question, the mirroring partner timeout only comes into play if the mirror server doesn't get a response from the principal server at all; in other words, if the Windows server hosting the principal SQL Server instance is offline for some reason.

If the Windows server is still available, but SQL Server is offline, the principal Windows server will respond to the mirror server, saying that the principal SQL Server instance is offline. This lets the mirror server initiate a failover (as long as it gets quorum with the witness server, of course).

The mirroring partner timeout value needs to be increased when combining failover clustering with database mirroring so that a local cluster failover (which means the principal server is unavailable for a time) doesn’t trigger a database mirroring failover. If, after increasing the mirroring partner timeout value in this configuration, you perform a manual cluster failover (as in your case), you're only making SQL Server unavailable; the Windows server itself stays up. The Windows server responds to the mirror server that the principal SQL Server instance is unavailable, and the mirror server performs the failover without waiting for the timeout to expire. The way to avoid this behavior is to temporarily remove the witness server from the mirroring configuration before performing the manual cluster failover, and to add it back afterward.
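The sequence for a controlled cluster failover might look like the following sketch. The database name `SalesDB` and the witness address are hypothetical placeholders, and 90 seconds is only an example value; choose a timeout that comfortably exceeds how long your own cluster failovers take.

```sql
-- Run these statements on the principal server instance.
-- [SalesDB] and the witness address below are hypothetical placeholders.

-- One-time setup: raise the partner timeout (in seconds) so that an
-- unplanned cluster failover does not trigger a mirroring failover.
ALTER DATABASE [SalesDB] SET PARTNER TIMEOUT 90;

-- Before a manual cluster failover: remove the witness so that the
-- mirror server cannot get quorum and fail over automatically.
ALTER DATABASE [SalesDB] SET WITNESS OFF;

-- ...perform the manual cluster failover here (outside T-SQL) and wait
-- for the clustered instance to come back online on the other node...

-- After the cluster failover completes: put the witness back.
ALTER DATABASE [SalesDB] SET WITNESS = 'TCP://witness.example.com:5022';
```

While the witness is removed, automatic mirroring failover is unavailable, so it's worth keeping that window as short as possible.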

Discuss this Blog Entry 7

on May 18, 2011
...the plot thickens...well, curdles.

In the 'High availability with database mirroring' chapter of Rod Colledge's excellent book 'SQL Server 2008 Administration in Action', he says:

"If a mirroring principal is set up as a clustered instance, consider adjusting the mirroring session timeout value to greater than the default 10 seconds. Doing so will prevent mirroring failover during a clustering failover. Typical cluster failovers take up to 90 seconds to complete, so adjusting the mirroring timeout to at least this value is recommended. Use ALTER DATABASE x SET PARTNER TIMEOUT 90 or something similar."

This goes against my testing and, unless I've misunderstood, your article, too.

Is what Rod is saying inaccurate?

on May 18, 2011
No - he's accurate but is missing the part about a manual failover not being covered by the mirroring partner timeout, because the Windows Server is still alive to respond to the ping request - hence my article (and your testing). Thanks.
sap (not verified)
on Jul 21, 2010
Even though the principal server has been completely shut down, it's not waiting for the partner timeout value that has been set. While the resources fail over to the other node of the cluster, the databases fail over to the mirror server. The resources came up on the other node in 30 seconds, and the partner timeout was set to 120 seconds. Please suggest.
on Jul 22, 2010
Was the Windows server shut down, or just SQL Server? If the latter, the mirroring failover will occur as you describe.
on May 13, 2011
I breathed a sigh of relief when I found this article as I have asked myself the same question over the last few days.

Your comment "the mirroring partner timeout only comes into play if the mirror server doesn't get a response from the principal server at all" explains a lot, including why our mirror database was failing over during our manual cluster failovers, despite the PARTNER TIMEOUT value being several minutes.

I don't think this important point is emphasised in many other articles on this topic.

What would be the recommended way "to temporarily remove the witness server from the mirroring configuration"? Perhaps stop the SQL Service on the mirror server, or could we simply "Pause" mirroring during the cluster failover and start it again afterwards?

on May 13, 2011
http://msdn.microsoft.com/en-us/library/ms191519.aspx shows the easiest way - just remove the witness before the controlled failover and put it back again afterwards.
on May 13, 2011
Perfect - Thanks, Paul.


What's SQL Server Questions Answered?

Practical tips and answers to many of your questions about SQL Server including database management and performance issues.

Contributors

Paul S. Randal

Paul Randal worked on Microsoft's SQL Server team for nine years in development and management roles, writing many of the DBCC commands. Randal was ultimately responsible for SQL Server 2008'...

Kimberly L. Tripp

Kimberly L. Tripp has been working with SQL Server since 1990, and she’s worked as a consultant, trainer, speaker, and writer specializing in core SQL Server performance tuning and availability...