A "poor-man's" solution to monitoring your SQL Server environment
Many nifty tools are available to help you monitor your SQL Server environment. However, every tool, no matter how effective, seems to include a hefty price tag. Monitoring software for just one SQL Server can cost thousands of dollars. And monitoring tools sometimes come with agents that you must install on your SQL Server and that traditionally consume as much as 3 percent to 5 percent of the server's processing power. In addition, many tools provide much more functionality than you really need. At the children's hospital where I work, our DBA group manages 8 SQL Servers and more than 250 databases. In our situation, where patient information is accessed 24 x 7, 365 days a year, we must have a common method of accessing information about the health of our SQL Server environment from anywhere on the network. Ironically, the method we've found most useful for retrieving current and historical performance data isn't in a pricey monitoring package but comes from tools readily available within SQL Server, Windows .NET Server, and Microsoft Office XP. For example, on my office computer, I added an active channel that links to the hospital's SQL Server Web site. Microsoft Active Channel technology is integrated with Internet Explorer (IE) to "channel" browser data to a client's desktop without user intervention. Active Channel technology integrates with an active Windows desktop so you can view Web pages as desktop backgrounds or screen savers, for example. So if I have an active desktop, I can use our Web site as my screen saver. When I get to my office in the morning, I can view the results of database backups, read messages from last night's Database Consistency Checker (DBCC) operations, see which databases are growing fastest, and view a variety of performance graphs on the Web—before I even log in.
Let's see how you can implement the solution our DBA group uses to view current and historical SQL Server performance data through a graph on a Web page, as Figure 1 shows. First, let's look at how you can monitor your environment by logging System Monitor data from Win.NET Server to SQL Server. Then, I'll show you how we render the data to a dynamic Web page by using Microsoft Access data access pages. To implement the techniques I describe in this article, you need SQL Server 2000, Office XP with Access installed, an IIS 5.0 or 4.0 Web server, Win.NET Server or Windows XP, and IE 4.01 or later. You also need an Active Directory (AD) or Windows NT 4.0 domain so that you can use domain user accounts for trusted connections or Windows authentication. For simplicity, all the client computers you use should also be part of the domain.
Before you can view your SQL Server performance data, you must first create in Win.NET Server the System Monitor counter logs that collect the performance data, then send it to any SQL Server on your network. If you don't have Win.NET Server (I used Release Candidate 1—RC1), you can use XP to walk through the System Monitor steps in this article. \[Editor's Note: See Correction for information about how a Windows XP Service Pack 1a (SP1a) bug affects the CounterDetails table.\] Both Win.NET Server and XP can store System Monitor data in a unique format on SQL Server. However, Windows 2000 and NT can store performance-related data only to a file. Win2K uses System Monitor to log data to a binary or text file, and NT uses Performance Monitor to log data to a binary file. Later, I briefly describe the format we use and show you how to transform the data from the performance tables and optimize data retrieval by using a stored procedure. Then, I show how you can use the data in an Access data access page to graph your data points. After you've created your data access page, you can save the page to your Web server and view your data from the Web.
Note that if you're trying to keep costs low, you can use Microsoft Data Engine (MSDE) to log your performance data to SQL Server. You can install MSDE as many times as you want once you've purchased a SQL Server or Office product that contains MSDE. Although Enterprise Manager doesn't come as part of MSDE, you can use an Enterprise Manager installation to connect to a local or remote installation of MSDE. MSDE can use a maximum of two processors, and databases on MSDE can grow to a maximum of 2GB. MSDE also has scheduling and replication capabilities that you can use on many single-use SQL Servers. Finally, at the time I wrote this article, you could use Win.NET Server RC1 for free for 360 days. So, the software that you use for this monitoring solution is essentially free—but only for a time.
Configuring System Monitor
To begin, you need to create System Monitor logs in either Win.NET Server or XP to record performance information to a SQL Server database. On either OS, you also need to use the Control Panel's Administrative Tools to create a system Data Source Name (DSN) in the Data Sources (ODBC) applet that connects to a database where you'll store performance data. We created a database called DBADMIN on a remote SQL Server called SQL2KDEV1 to store this data.
To follow our example, create the DBADMIN database on the SQL Server you want to use; the SQL Server installation and database can exist locally or remotely. Then, create a system DSN that points to the DBADMIN database and uses either Win.NET Server or XP running System Monitor to log performance information. Create the DSN by using a trusted connection to the DBADMIN database in which the trusted user has database owner (DBO) rights. We created our DSN, SQLPerfData, on our Win.NET Server, which is named WINNET, by using a trusted connection to the DBDMIN database on SQL2KDEV1 through our test domain's Q\administrator account. We used this account for simplicity, but be aware that it's not a best practice. For simplicity and testing purposes, make sure the domain user account you're working with is also the DBO of the DBADMIN database. Within the DSN, select the DBADMIN database as the default database. Test the DSN to verify that it works properly.
Next, you configure System Monitor to connect to the DBADMIN database. To begin, click the Start menu and open the Run command on the Win.NET Server or XP workstation. Run perfmon.exe to open System Monitor. Expand Performance Logs and Alerts, and right-click Counter Logs. Select New Log Settings from the context menu. In the New Log Settings dialog box, type the name of the log as StorePerfSQL and click OK. Then, on the General tab, click Add Counters to add counters from a particular SQL Server 2000 instance (running on Win2K) that you want to monitor. Table 1 shows examples of the kinds of counters you might monitor.
Choose the interval for which you want to log information. For testing purposes, you might want to use the default of 15 seconds or less. However, for production purposes, you should consider logging data at 1-minute intervals or greater to reduce the logging traffic on the network and to reduce the number of aggregations that your queries perform.
Next, you need to configure System Monitor logging to properly authenticate to SQL Server by using a trusted connection. For example, in the Run As text box, we typed the name of our test domain's account, Q\administrator. Because this user is part of the Q domain and not a SQL Server login, SQL Server uses Windows authentication to "trust" that the account is valid. Then, click Set Password, and enter the appropriate password for the domain user.
After you complete the General tab, you need to configure where you want your data to be stored. On the Log Files tab, select the SQL Database Log file type, click Configure, select the system DSN StorePerfSQL from the System DSN drop-down list, and click OK. Next, click the Schedule tab, select Apply to start the performance logging, and click OK.
If you choose to start the log manually, you also have to restart the logging manually when your logging server or workstation reboots. However, if you start the logs by using the At option in the Start log section of the Schedule tab, the OS will automatically start the logs for you when the system restarts.
Verifying the Table Structures and Data
You can verify whether you correctly configured the logging by checking a couple of items. First, verify that the logging icon on the logging machine's System Monitor is green. If the icon is red after you start System Monitor, review the application event log for associated error messages. Second, if logging is configured correctly, you'll be able to retrieve data from the DBADMIN database tables CounterData, CounterDetails, and DisplayToID, which the System Monitor configuration created. If these tables exist, it's a good sign that System Monitor will be subsequently logging data into them. Specifically, you should be able to select data from the CounterData table and review the CounterDateTime field for the latest logging entries. For example, we ran the following SELECT statement to retrieve the latest logging entries:
SELECT TOP 10 * FROM CounterData ORDER BY CounterDateTime DESC
If you find that you still can't successfully log System Monitor data but the CounterData, CounterDetails, and DisplayToID tables exist, you might want to try deleting these tables and letting System Monitor recreate them. Remember to archive any data you want to keep before deleting the tables. Microsoft doesn't support modifying system tables, and doing so might cause System Monitor to stop logging data, so we recommend that you don't alter these tables.
Data Access Pages and Slow Queries
At this point, if you look at the logged performance data contained in the DBADMIN database on SQL Server, you'll notice that it looks much different from a System Monitor Comma Separated Value (CSV) file. A CSV file contains column headings representing counter names, followed by rows with timestamps and counter data.
To make this data suitable for graphing, you need to massage the data somewhat. For example, we found that we can't create effective queries that pull data directly from the CounterData and CounterDetails tables; execution of such queries is slow and inefficient for several reasons, including lack of indexes. (For information about how we address this problem, see the sidebar "Improving Query Performance.") Specifically, the CounterDateTime field is an unindexed char(24) field, which makes pulling data from a date range a costly process.
Additionally, to manipulate CounterDateTime as a datetime field, you must truncate the hidden ASCII character at the end of CounterDateTime before converting to a datetime data type to avoid conversion errors. At first, we created a view to perform the necessary data manipulation. But we found that the view execution time was too long because of the unindexed source tables, the datetime conversion, and the sheer amount of data being returned. To speed data retrieval, we would have liked System Monitor to create the logging tables with the appropriate foreign key indexes, create indexes on columns that are used as possible WHERE criteria, and store the date and time information in datetime format rather than character format. But doing all these things might have significantly slowed inserts to the logging tables. Again, we recommend that you don't modify the logging tables that System Monitor creates, especially if you want Microsoft support for your implementation. To work around these shortcomings, we created a stored procedure called ConvertCounterData—which Listing 1, page 32, shows—in the DBADMIN database to port the data in the streamlined format to another table called MyCounterData. Listing 2, page 32, shows the code that creates the MyCounterData table.
ConvertCounterData creates a derived table to place the performance-counter data into columns. Then, the procedure aggregates the data by using the GROUP BY clause. During aggregation, the procedure converts float data to numeric (18,2) to round the trailing decimal points to two so that they're easier to view. Notice the hour, minute, and second variables, which we use in our data access pages to let us drill down graphically into our data by hour, minute, and second. A typical data access page uses the hour, minute, and second fact tables, but those tables don't work for the three-dimensional data access page design we use. In a typical data access page, hours, minutes, and seconds are duplicated for every day you view. So a graph of 2 days would show 48 hours in a row on the X-axis, and a graph of 3 days would show 72 hours, including minutes and seconds, instead of representing all the days within the graph by using a common 24-hour time bar. You can circumvent this annoying behavior by adding the hours, minutes, and seconds variables that Listing 2 shows. Additionally, the procedure uses the RecordIndex variable to verify that it's importing only new data into MyCounterData from CounterData. Keep in mind that our counters might not be the counters you want to track. If not, you need to modify the ConvertCounterData procedure and the MyCounterData table to suit your needs.
If you're a keen observer, you've probably noticed that we're first truncating, then converting the CounterData table's CounterDateTime value to a datetime field. As I mentioned earlier, in setting up our solution, we found that CounterDateTime is a char(24) field by default, not a datetime field, which was disappointing—especially because Microsoft describes CounterDateTime as being stored in Universal Time Coordinate (UTC) format. Also, the System Monitor logging tacks an extra, hidden ASCII character to the CounterDateTime field, making datetime conversion impossible unless you first truncate the field to 23 characters—thereby removing the extra character. To remove the hidden character, we use the LEFT() function, as Listing 1's code shows. As you implement the ConvertCounterData procedure, verify that the CounterID numbers in your implementation match the proper counter names, as they do in our example. (For information about another possible bug in data access pages, see the sidebar "30-Second Timeout.")
After you've created ConvertCounterData, use a SQL Server Agent job to schedule it to run every 5 minutes. You might want to run the procedure manually at first to verify that it's importing rows from CounterData and CounterDetails. And run it as many times as you like—it won't import redundant data. Run the job manually at first and view the job history to verify that the job is functioning as intended. If you create queries to run against the MyCounterData, CounterData, and CounterDetails tables, you'll notice that you can pull data from MyCounterData faster than from CounterData and CounterDetails because data has been aggregated, consolidated, and indexed.
When you begin storing data in the MyCounterData table, you might realize another benefit besides added speed. If System Monitor corrupts its tables or recreates them, you've got your historical data intact in the MyCounterData table as an alternate source untouched by System Monitor.
Creating the Data Access Page
At this point in the implementation, you've created a database—DBADMIN—to store System Monitor data, created a DSN to connect to the DBADMIN database, and configured System Monitor to log data to DBADMIN. System Monitor has generated system tables in the DBADMIN database and started to capture data to them, but data retrieval from those tables is slow. To speed your data retrieval, you created a stored procedure, ConvertCounterData, to convert System Monitor table data into a table called MyCounterData where data retrieval is faster. You then scheduled the ConvertCounterData procedure to run every 5 minutes by using a SQL Server Agent job. Now the hard work is complete.
Next, you create the data access page, which is a relatively easy process. First, open Access, select New from the File menu, and click Blank Data Access Page in the New File window on the right. Then, click New Source in the resulting Select Data Source dialog box. Verify that Microsoft SQL Server is selected in the Data Connection Wizard's list box, and click Next. In the Data Connection Wizard dialog box, type your SQL Server's name in the Server name text box—we typed SQL2KDEV1. Be sure that the Use Windows 2000 Security option is selected, and click Next. In the Data Connection Wizard-Choose Data dialog box, select the DBADMIN database from the drop-down list, and click Next. Accept the Office DataSource Connection (.odc) filename in the Data Connection Wizard Finish dialog box, and click Finish.
Now, let's graph some collected SQL Server Cache Hit Ratio data to show how you can begin to graph similar System Monitor data. With the .odc file in the Select Data Source dialog box, click Open. A new data access page is now available for design.
Expand the design grid to six squares across by three squares down. Verify that the toolbox is visible (click the toolbox button on the toolbar if it's not), and click Office Chart on the Toolbox. Drag the Office chart from the upper left to the lower right to consume most of the grid. Double-click the Office Chart Object, and select the Data from a database table or query option on the Data Source tab in the Commands and Options dialog box.
Now, click Edit on the Data Details tab, and in the resulting Select Data Source dialog box, choose the .odc file you created, and click Open. In the Use data from section of the Data Details tab, select the Data member, table, view, or cube name option. Select the MyCounterData table to complete the requirements of the Data Details tab.
Next, you'll want to make the chart easier to view on a Web page in three dimensions, so click the Type tab and choose the 3D Area and the specific area chart on the bottom left, as Figure 2 shows. Now, click the 3D View tab. When you're in Projection mode, click Orthographic. Right-click the chart, and select Field List from the context menu to show the Chart Field List, which isn't the same as the Field List on a chartless data access page.
Next, on the Chart Field List, drag a counter such as Cache Hit Ratio to the Drop Data Fields Here section of the chart. Right-click Sum of Cache Hit Ratio, and select Auto Calc and Average. Now, drag Hour to the Drop Category Fields Here section from the Chart Field List dialog box. Do the same for Minute. Next, expand CounterDateTime by month and drag Days to the Drop Series Fields Here section of the chart.
Now you're going to get rid of some clutter caused by too many annotations on the X-axis. Right-click the black X-axis and click Commands and Options. Click the Axis tab and choose None from the Major Tick Marks drop-down list. Finally, to complete your data access page, click the title text box and type the title System Monitor Performance Data. Now, you can save the data access page to a Web site on your development IIS 4.0 (or later) server and open the resulting HTML page from your IE 4.01 or later browser. For example, we saved our page as wSQLPerfData.htm in the C:\inetpubs\wwwroot folder on our Web server. Then, we browsed to the page at http://<our server>/wSQLPerfData.htm. The data access page lets you drill down into your data by hour, by minute, and by day on the Web page, as you saw in Figure 1. The page also lets you calculate additional statistics on the data, including the average, sum, and standard deviation. When you browse to the page, you'll get the alert message
This page accesses data on another domain. Do you want to allow this?
To avoid getting this message every time you render a data access page, you need to add your IIS server as a trusted site to IE.
To use IE to view the chart on a workstation, you need to have Office Web Components (OWC) installed on the workstation. Office XP installs these components by default. If your clients don't have Office XP installed, you can supply the components (available at http://office.microsoft.com/downloads/default.aspx) on a Web page for clients to download. Once OWC is downloaded and installed, clients can use IE and a read-only version of the component to view, but not drill down into, data. This OWC read-only access is currently free from Microsoft.
A Rich "Poor Man's" Solution
Our DBA team has used techniques similar to these to view SQL Server charts over the Web for more than 8 months. Now, we've established historical baselines that we can use for comparison if a performance problem occurs. Because we view the charts regularly, we know a few things about our system: We know that our cache-hit ratios are excellent, processor usage is acceptable, OS backups punish hard disks in the early morning hours, and most user activity occurs between 9:00 AM and 5:00 PM. In the current economy, the word "free" has also perked up our management's ears—especially when we compare our solution to the cost of commercial Web-charting and SQL Server—monitoring solutions. My boss likes to go to the Web site and view SQL Server-related charts—he thinks it's a great way to see what his DBAs do. And the Web charts help my Oracle and DB2 DBA peers to troubleshoot SQL Server problems when I'm not around. I hope you'll realize these and other benefits when you try this "poormon" monitoring solution.Corrections to this Article:
- A bug in Windows XP Service Pack 1a (SP1a) causes the CounterDetails table data to be duplicated whenever a System Monitor log that's logging to SQL Server is stopped and started. The bug isn't present if you run the solution on Windows XP without SP1 applied. We apologize for any inconvenience and have informed Microsoft of this problem.