The Curious Case of: the failed database mirroring failover

Question: We’ve set up database mirroring for one of our critical databases with a witness server to allow automatic failover. Everything has been running fine until we had a power outage on the data center recently. Database mirroring performed the failover but people reported that the application just hung. When we manually failed back, the application worked again. Why didn’t the application failover too?

Answer: This is a reasonably common problem that people encounter when using database mirroring and a production failure like this often happens because no failover testing is conducted after database mirroring is implemented. We became involved after the failed failover.

To avoid taking production downtime, we duplicated the mirroring environment on the client’s test systems. After confirming that the application and database mirroring were working, we powered down the principal server and the application completely hung.

We checked that the mirror server had successfully initiated a failover and was online as the principal server. We also checked that the mirror database was online, could be used locally on the new principal server, and the principal server was accessible from a remote client, like the application uses.

This led us to check the application. We talked to the developer, who confirmed the application was using ADO.NET to connect to SQL Server and was using explicit client redirection, specifying the mirror server name in the ConnectionString property of the SqlConnection. (As an aside, it is always better to use explicit client redirection rather than relying on the mirror server name being cached automatically on the client once a connection has been made – implicit client redirection).

So why wasn’t the application failing over?

We dug deeper into how the application was handling connection failures and discovered that it was not coded to cope with the existing connection failing at all! Basically the application would open a connection to SQL Server when the application initialized and would never attempt to reconnect.

It turns out that although the DBAs had implemented database mirroring, no-one had discussed high availability with the application developer and so the application code was not changed. After re-working the connection layer in the application to cope with connection failures and implement reconnection logic, it worked perfectly after a database mirroring failover.

The moral of the story is that anything connecting to SQL Server has to be architected to cope with connection failures that occur and reconnect so that failover works correctly. In this case, our client would have discovered the problem during testing if they had tried a database mirroring failover before going into production.

Discuss this Blog Entry 1

on Jan 9, 2012
Thanks Kim and Paul for that insightful tip. Unfortunately this is only applicable when you are the owner of the client-side product development... Let's take the example of a large ERP system that relies on SQL database connectivity and where the client only connects once at the start-up... In such cases, the DB fail-over will be useful only as a quick DR situation, because the client would never be able to reconnect to the new server instance, even if the name remains the same. I think the only way that would work would be in conjunction with VMWare's HA solution where the server would be instantly replaced by a new VM on the fly (that was synchronizing and waiting standby in the background). Have a great year 2012 with the new Denali. @GP_Beat

Please or Register to post comments.

What's SQL Server Questions Answered?

Practical tips and answers to many of your questions about SQL Server including database management and performance issues.

Contributors

Paul S. Randal

Paul Randal worked on Microsoft's SQL Server team for nine years in development and management roles, writing many of the DBCC commands. Randal was ultimately responsible for SQL Server 2008'...

Kimberly L. Tripp

Kimberly L. Tripp has been working with SQL Server since 1990, and she’s worked as a consultant, trainer, speaker, and writer specializing in core SQL Server performance tuning and availability...
Blog Archive

Sponsored Introduction Continue on to (or wait seconds) ×