Profiler, SQL Server's built-in tracing feature, is an invaluable tool for performance tuning and debugging SQL Server applications. With it, you can trace every request a database application makes to SQL Server and use that data to improve performance. For example, if you run a Profiler trace during a long-running process, you might find that SQL Server is using a table scan, encountering table locks, or running into other performance-robbing events. In most cases, I run Profiler at the request of a developer who's trying to debug some tricky database code. Occasionally, if time permits, I go on a preemptive strike and run my favorite trace, the duration trace. This Profiler trace looks for processes accessing SQL Server that take more than 1 second. One second seems to be a good limit, letting you see the processes that are consuming the most resources without trapping everything.
If I run this trace only when I have a little free time, the data collection is sporadic and inconsistent. But what if I could run the trace on a regular schedule, gathering this performance data routinely throughout the day? I could find out whether certain processes take longer at different times of day and which databases most of the slower processes are accessing. All I have to do is schedule a SQL Server job to run my duration trace. By bringing together two great SQL Server tools—the scheduler and Profiler's trace functionality—you can easily set up a job to gather performance metrics, load this data into a table, then analyze it. And by combining Profiler traces with the job scheduler, you can run a trace on a set schedule and for limited periods without having to think about it.
This article assumes that you have some working knowledge of Profiler and that you've built and run traces before. If you've never used the Profiler interface to build traces, I recommend running a few built-in traces against a test SQL Server to familiarize yourself with the technology. For Profiler tips, perform a keyword search on Profiler at http://www.sqlmag.com. The examples I show use Profiler to build a trace that you can run through any SQL Server command-line tool, such as Query Analyzer. Although a trace could affect the performance of the server you're tracing if you choose too many events to watch, the trace functionality just observes the running processes, so you can't harm any data, transactions, or other processes.
Now let's look at the nine steps you can follow to set up a Profiler trace to run at set intervals and gather data about what's happening on your SQL Server. Before you begin, I recommend creating a separate database to hold the tables and stored procedure the examples create. In the scripts that create the scheduled jobs, I call the database TracerDB. If you use another name, be sure to adjust the example scripts. The T-SQL script to create both jobs is included in Web Listing 1.
Step 1: Use Profiler to Create the Trace Definition
To create the Profiler trace definition, I recommend using one of the built-in trace templates as a starting point. From the Profiler File menu, select Open, Trace Template, and select SQLProfilerTSQL_Duration. Before you forget, click Save As, and give your new template file a name.
Next, add a few data columns and filters to the trace. To exclude SQL Server-run background processes from your trace, add the NTUserName filter and set the Not Like value to the account that runs your SQL Server and SQL Agent services. You might need to experiment with this; if your scheduled jobs run under a service account, filtering out that account means you'll miss any processes that run through the job scheduler. I added the data columns StartTime and EndTime because I need to know when each process ran. To get a better picture of what's running, I added the data columns DatabaseID, Reads, and Writes. (I didn't include DatabaseName because DatabaseID consistently has a value, whereas DatabaseName doesn't.)
I set the Duration filter to at least 1000 milliseconds (1 second) so that I'd capture only the most attention-worthy processes. Depending on your environment, you can adjust that value according to how many processes you have time to analyze. SQL Server can do an amazing amount of work in less than 1 second, so setting Duration to 1000 eliminates most of the processes SQL Server runs. Click Save, and you're all set.
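In the script Profiler generates for this trace (which you'll save in Step 3), the Duration filter becomes a call to sp_trace_setfilter. Data column 13 is Duration, and comparison operator 4 means "greater than or equal," so the scripted filter looks roughly like this sketch (@TraceID is the ID returned by sp_trace_create):

```sql
-- Duration is data column 13; comparison operator 4 is ">="
-- In SQL Server 2000, Duration is measured in milliseconds
DECLARE @bigintfilter bigint
SET @bigintfilter = 1000
EXEC sp_trace_setfilter @TraceID, 13, 0, 4, @bigintfilter
```

The filter value must be declared as bigint because sp_trace_setfilter requires the value's data type to match the data column's type.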
Step 2: Run the Trace
Now that you've saved the trace template, you need to run the trace so that you can create the trace output file. You'll use this file later to create the tables in SQL Server to capture your trace results. Select File, New, Trace from the Profiler menu to load the trace template you created. From the resulting Database Connection dialog box, you can connect to your server, then select the template you created from the Template Name drop-down box. Click Run. Let the trace run for a few minutes to collect some results in the trace window, then stop the trace.
Step 3: Save Files for Trace Definition and Trace Output
Now, save the T-SQL script that created the trace you just ran. From the File menu, select Script Trace, For SQL Server 2000. In the Save As dialog box, save the file with a .sql extension. You'll use this script to create a stored procedure to build the trace. Next, you need to create a file that contains the results of the trace you just ran. From the File menu, select Save As, Trace File, and save the file with a .trc extension. You'll use this file to create a table in SQL Server to save future trace results in. For this example, I saved the file to D:\sqlbackup\trace_dump_00.trc.
Step 4: Create a Stored Procedure from the Trace Definition Script
Next, use Query Analyzer to open the .sql script file you saved in Step 3. This file will be the heart of the scheduled trace. I don't like to run big chunks of SQL in a scheduled job, and it's easier for me to see what's going on in my databases when everything is in a stored procedure. Therefore, I recommend converting this script into a stored procedure. Create the procedure with two parameters: a file name and the trace ID. The trace ID will be an output parameter; you don't need this ID to build the scheduled trace, but if you ever want to run the stored procedure separately, you might want to know the trace ID.
When you convert the script to a stored procedure, you need to make two important modifications to the script. First, replace the "InsertFileNameHere" string in the sp_trace_create statement with the parameter you created for the file name. Second, add a RAISERROR statement to return an error message if sp_trace_create throws an error. (For style points, you can also get rid of the goto.) Listing 1 shows the converted script. Note that the sp_trace_create stored procedure expects a file name without an extension—it will fail if you append the extension yourself—whereas the fn_trace_gettable() function requires you to add the .trc extension.
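Listing 1 contains the full converted script; its skeleton, with the two modifications in place, looks something like the following sketch. The RAISERROR message text and the commented placeholders are illustrative, and the sp_trace_setevent and sp_trace_setfilter calls come straight from the script Profiler generated:

```sql
CREATE PROCEDURE _duration_trace
  @path_file nvarchar(200),    -- file name WITHOUT the .trc extension
  @traceid int OUTPUT
AS
DECLARE @rc int
DECLARE @maxfilesize bigint
SET @maxfilesize = 5

-- Modification 1: the parameter replaces the scripted
-- 'InsertFileNameHere' placeholder
EXEC @rc = sp_trace_create @traceid OUTPUT, 0, @path_file,
  @maxfilesize, NULL
-- Modification 2: raise an error instead of using the scripted goto
IF @rc <> 0
BEGIN
  RAISERROR('sp_trace_create failed with return code %d', 16, 1, @rc)
  RETURN
END

-- ...the sp_trace_setevent and sp_trace_setfilter calls that
-- Profiler scripted go here, unchanged...

EXEC sp_trace_setstatus @traceid, 1  -- start the trace
```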
Step 5: Create a Table from the Trace Output File
The next task is to create a table in a SQL Server database to hold the results of your scheduled trace. SQL Server has a built-in function, fn_trace_gettable(), that lets you load the data from a trace output file into a table. Executing the following SQL statement with the appropriate name of your trace output file creates the table without you having to type the entire CREATE TABLE statement:
/* Setting the second parameter
as default tells the function
to get all the files if the
trace created more than one */
SELECT * INTO myTrace
FROM ::fn_trace_gettable
  ('D:\sqlbackup\trace_dump_00.trc', default)
You created this trace file earlier by running the trace and saving the output. No matter what kind of trace you just ran, all the possible trace-result columns (e.g., start time, end time, number of reads, duration) will be in this trace file, and this table will accommodate a variety of future traces.
Step 6: Create a Table to Hold the Trace Path and File Name
This next step will make sense after you create the scheduled jobs. To avoid having to type the trace file path and name in several places, you can create a table to hold the path and file names. This table also lets you easily change where SQL Server writes the traces to and lets you set up and manage multiple traces and jobs. Creating the table is simple; just use the following code:
CREATE TABLE tracer (
tid int primary key
identity(1,1) NOT NULL,
output_path_file nvarchar(200) NOT NULL,
output_path_file_ext AS
(output_path_file + '.trc'))
After creating this table, insert a row with the full path and file name, omitting the .trc extension. My INSERT statement looks like this:
INSERT INTO tracer (output_path_file)
VALUES ('D:\sqlbackup\myTraceOutput')
-- the directory shown is an example; use your own path
The myTraceOutput part is the name of the file that will store the results of the duration trace. Leave off the extension; the computed column you added to the tracer table will add it for you. I highly recommend placing the trace file on a network drive or a drive SQL Server isn't using, especially when running this trace on a production server. You need to evaluate the best arrangement for your environment. Remember that the goal for this particular trace is performance tuning, so whatever your configuration, try not to skew the results. In any trace, keep the number of events you capture and data columns you return to a minimum. Note that the SQL Server Service account you use to run SQL Server must have write permission wherever you direct your trace output. The scheduled job must run under an account that has full permissions for the trace file as well.
Step 7: Schedule a Job to Run Your Trace
Now, create two scheduled jobs to run and stop the trace. Web Listing 1 contains the T-SQL scripts that create these two jobs. The first job has only one step, which starts the trace by running the _duration_trace stored procedure that you created earlier. Before running the stored procedure, the job reads the tracer table to get the full path and file name of the output file the trace will create. The step looks like this:
DECLARE @path_file nvarchar(200)
SELECT @path_file = output_path_file
FROM tracer WHERE tid = 1
EXEC _duration_trace @path_file, null
The WHERE tid = 1 clause returns the tracer table row that contains the name of the trace output file.
Setting the schedule for this job depends on how often you want to run the trace. The first job only starts the trace; the second job ends it. Without the second job, the trace would run forever, so make sure the second scheduled job is working. I recommend using the built-in notification features for scheduled jobs so that you can receive a page or email if the second job fails. Also, the sp_trace_create procedure lets you set a maximum file size, which can keep a runaway job from consuming all your disk space.
The first job starts the trace and completes almost immediately; it doesn't run for the duration of the trace. To see whether the trace is running, use this SQL statement in Query Analyzer:
-- Default or 0 will return all
-- running traces.
SELECT * FROM ::fn_trace_getinfo(default)
This statement returns several rows (one per trace property) for each trace running on that SQL Server. Each trace will have a unique value in the traceid column. This traceid is the value the stored procedure returns from its sp_trace_create call. If no traces are running, the statement returns zero rows.
Step 8: Schedule a Job to Stop the Trace, Load the File, and Clean Up
The second scheduled job, which has three tasks, does most of the work. First, it stops the trace; second, it loads the output file from the trace and writes the results into the myTrace table; and third, it deletes the trace file from the hard disk.
Listing 2 shows the T-SQL script for creating the three steps. The first task calls the sp_trace_setstatus stored procedure twice—once to stop the trace and again to close it. Until these two calls are made, the trace is running and the output trace file is locked on the disk—it can't be moved or deleted while the trace is running. The first part of this task reads the parameter table (tracer) that you set up earlier to get the full path and file name of the trace output file. Getting this information from a table lets you avoid hard-coding the file name into all the job steps.
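One way to do the lookup and stop-and-close sequence is sketched below. I'm assuming the job finds the running trace's ID by matching the output file name against fn_trace_getinfo(), whose property 2 is the trace file name; Listing 2 contains the actual code:

```sql
DECLARE @traceid int, @path_file_ext nvarchar(200)
SELECT @path_file_ext = output_path_file_ext
FROM tracer WHERE tid = 1

-- Property 2 of fn_trace_getinfo() is the trace file name
-- (with the .trc extension that sp_trace_create appended)
SELECT @traceid = traceid
FROM ::fn_trace_getinfo(default)
WHERE property = 2
  AND CONVERT(nvarchar(200), value) = @path_file_ext

EXEC sp_trace_setstatus @traceid, 0  -- stop the trace
EXEC sp_trace_setstatus @traceid, 2  -- close it and release the file
```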
The second task also reads the file name from the tracer table into a variable but uses the output_path_file_ext computed column with the .trc extension. With the file name in the @path_file_ext variable, the following SQL statement writes the results from your trace output file into the table you created earlier:
INSERT INTO myTrace
SELECT * FROM ::fn_trace_gettable
  (@path_file_ext, default)
The third task is crucial. Using master.dbo.xp_cmdshell, you must delete the trace output file. The built-in stored procedure sp_trace_create will fail with an "Error 12" message the next time the scheduled trace runs if the file already exists on disk. The schedule of this second job, which stops the trace, determines how long your trace will run. If you schedule the first job to run at 10:10 am and you want the trace to capture 5 minutes of data, schedule the second job for 10:15 am.
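A sketch of the delete step, again reading the file name from the tracer table (the del command assumes a local Windows path):

```sql
DECLARE @cmd varchar(255), @path_file_ext nvarchar(200)
SELECT @path_file_ext = output_path_file_ext
FROM tracer WHERE tid = 1
SET @cmd = 'del ' + @path_file_ext
-- xp_cmdshell runs under the SQL Server service account,
-- which must have delete permission on the trace file
EXEC master.dbo.xp_cmdshell @cmd, no_output
```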
Step 9: Look at the Data and Learn
Now for the fun part. The scheduled trace writes a steady stream of information about the requests sent to your SQL Server to the myTrace table, where you can easily query and analyze it. With my duration trace, I'm looking for ways to improve performance. After a few queries against the myTrace table, I can quickly see which databases the long-running processes are hitting. But best of all, I can spot trends over time and see which databases are showing performance deterioration and need more attention. Because I'm capturing the starting time of each process, I can see when the slowest processes run. I can find out which queries are doing the most reading or writing and which applications are the busiest and when. From this information, I can optimize queries, add an index, or rewrite a stored procedure to try to get it to run in less than a second.
Things to Consider
You need to consider a few things about scheduled traces. Because these traces run at set intervals, they might miss a long-running process that runs outside your scheduled time. One way to handle this limitation is to vary the schedule from day to day. If you set the trace to run for 5 minutes at 15-minute intervals beginning at 6:00 am on Mondays and Wednesdays, you could start the trace at 6:07 am on Tuesdays and Thursdays. The process still isn't random, but you can at least design in some variety.
In a perfect world, you'd set up these traces to run against your development database during the testing phase to spot problems before moving the database into production. But realistically, you might need to run these traces against production servers because a test server will almost never have the same load as your production server. Minimizing the number of events and data columns in your trace will help reduce the stress on a production server. Another way to mitigate this stress is to set up a linked server and write the SQL Server trace results table to that server. In addition, writing the trace output file to a network drive helps by not putting stress on the SQL Server disk subsystem.
The benefits of automated traces are well worth the effort. With performance data in a table, you can easily learn about the applications accessing your server. Just running a simple query that counts the number of processes that take more than 1 second and groups them by database ID can show which databases have the most long-running processes. The code in Web Listing 2 creates a lookup table, courtesy of Kalen Delaney, which will let you translate those cryptic EventClass numbers you see when running Profiler traces into descriptions. She also included a sample SQL statement that lets you quickly retrieve averages by EventClass.
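For example, a query along these lines counts the long-running processes by database (DatabaseID and Duration are data columns you captured; in SQL Server 2000, Duration is in milliseconds):

```sql
SELECT DatabaseID, COUNT(*) AS long_running
FROM myTrace
WHERE Duration >= 1000
GROUP BY DatabaseID
ORDER BY long_running DESC
```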
Following these steps creates a framework for running any trace you can dream up. Create the trace with Profiler, save the SQL script as a stored procedure, and plug it into the job you've already set up. Profiler can capture so much information about SQL Server—from stored-procedure recompiles to lock acquisitions to database-file growth. No matter what you're looking to improve, setting up these scheduled traces can give you consistent insight into your databases and help you learn more about your SQL Server.
Corrections to this Article:
- Web Listing 1 contained some incorrect code. This has been corrected as of 8/24/04.