Manually monitoring SQL Server instances for problems can be tedious, so I developed an automated solution that performs daily health checks on all my SQL Server instances. The daily health checks include checking CPU utilization, database status, available disk space, network packet errors, database backup status, and more.
Here’s how the solution works. You use a scheduled T-SQL job (or Windows scheduled task) to execute a batch file named InvokeHealthChks.bat. As Listing 1 shows below, this batch file uses the SQLCMD utility to execute a T-SQL script named InvokeHealthChks.sql.
Listing 1: InvokeHealthChks.bat
sqlcmd -E -S <server name> -i InvokeHealthChks.sql > result.log
As you can see in Listing 2, InvokeHealthChks.sql, is a simple script that has one purpose: to execute a stored procedure named chk_wrapper_mssql_health.
Listing 2: InvokeHealthChks.sql
This stored procedure controls the execution of the eight health checks described in Table 1.
Each health check is performed by a dedicated stored procedure. These stored procedures get the data they need to perform the checks from a variety of sources, including the sys.dm_os_performance_counters Dynamic Management View (DMV) and the sysprocesses and msdb.dbo.backupset system tables. To determine whether there’s a problem, most of them compare the data with a specified threshold. You specify the desired thresholds in a table named Configuration, which is created with the conf.sql script. You can adjust the threshold values for each SQL Server instance.
The output of each dedicated stored procedure is written to the result.log file. If a SQL Server instance fails a check, it prints an error message.
I’ve tested this solution on SQL Server 2005, and later. If you’d like to try it, you can download the scripts from by clicking the 139521.zip hotlink at the top of the article page. Follow these steps on each SQL Server instance you want to monitor:
1. On the SQL Server instance, create a directory named HealthChk on the C drive.
2. In the master database, execute conf.sql to create the Configuration table. In it, specify the desired thresholds.
3. In the master database, execute chk_wrapper_mssql_health.sql and the eight scripts listed in Table 1. You don’t need to customize any of these scripts.
4. Place InvokeHealthChks.bat in the C:\HealthChk directory. In the batch file, replace <server name> with the name of your server.
5. Place InvokeHealthChks.sql in the C:\HealthChk directory. You don’t need to customize this script.
6. Make sure that the SQL Server Agent service is running on the SQL Server instance if you’re using a T-SQL job to run InvokeHealthChks.bat.
7. Create a job that will execute InvokeHealthChks.bat, with your preferred schedule in the first step. Optionally, you can add a second step that uses the SQL Server database mail component or a utility such as Bmail to email the result.log file to you.
8. Test the solution.
This solution is a simple way to monitor SQL Server instances and get reports on their health. You don’t have to deal with complex maintenance plans or manually check performance counters. You could easily extend the solution so that it performs health checks in other areas of SQL Server.