In the old days, SQL Server's primary users were small departments and workgroups, and administrators used the "when in doubt, reboot" philosophy to keep SQL Server up and running. Times sure have changed! Now many enterprise-class organizations use SQL Server and depend on 24 x 7 server availability. Customers spend a lot of time and money designing highly available, fault-tolerant systems for their businesses.

SQL Server's high-availability capabilities have improved, but no one can claim 100 percent availability. DBAs commonly refer to a server's "number of nines" when talking about availability. For example, a server that has four nines is available 99.99 percent of the time. A year contains about 31,500,000 seconds, so 99.99 percent means your system can be down for only about 3,150 seconds, or roughly 52 minutes, per year. Five nines, 99.999 percent, allows you only about 315 seconds of annual downtime, and a sixth nine would cut that to about 32 seconds. Each additional nine shrinks your downtime budget by a factor of 10, so you can see how "adding nines" to your system's availability becomes increasingly difficult. Thinking about an unscheduled reboot, just for the heck of it? Forget about it.
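
If you want to check the "number of nines" arithmetic for your own availability target, a few lines of Python will do it. This is only a sketch: it assumes a 365-day year, and the name `downtime_budget_seconds` is made up for illustration rather than taken from any SQL Server tool.

```python
# Assumption: a non-leap year of 365 days (31,536,000 seconds).
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def downtime_budget_seconds(nines: int) -> float:
    """Return the allowed downtime per year, in seconds, for a given number of nines.

    Four nines means 99.99 percent availability, so the unavailable
    fraction of the year is 1 / 10**4.
    """
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    return SECONDS_PER_YEAR / 10 ** nines


if __name__ == "__main__":
    for nines in (4, 5, 6):
        seconds = downtime_budget_seconds(nines)
        # Show the budget in both seconds and minutes.
        print(f"{nines} nines: {seconds:,.1f} seconds ({seconds / 60:.1f} minutes) per year")
```

Run for four, five, and six nines, the budgets come out to roughly 3,154, 315, and 32 seconds per year, which matches the figures above. Keep in mind that the budget has to cover planned and unplanned outages alike.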

Microsoft is always looking for ways to improve SQL Server so that customers can achieve higher levels of availability. The company performs extensive availability research and likes to receive feedback from customers that helps SQL Server developers evaluate availability trends. A colleague of mine at Microsoft is putting together a list of the 10 most common causes of unplanned outages, and I told him that you would help. Yes, you. Microsoft needs to know why systems become unavailable so that the development team can fix the problems.

Send me an e-mail message telling me what problems make you reboot your servers and what causes your planned or unplanned downtime. After all, even a planned outage is difficult to accommodate when you're trying to achieve four nines of availability. I'll send your messages to Microsoft and share the summarized reader comments with you in an upcoming commentary.