Tracing the server side of SQL Server 2000

The SQL Server Trace tool lets you keep track of all the operations that SQL Server performs. For SQL Server 7.0, Microsoft completely rewrote the Trace tool. Microsoft dubbed the new incarnation SQL Server Profiler. However, the SQL Server 7.0 documentation never made it completely clear that Profiler has two separate components: the Profiler and the server-side traces.

Profiler is the GUI that lets you define all the events and data you want to capture from SQL Server. A server-side trace runs a set of extended stored procedures entirely on the server, from which it captures trace data. Itzik Ben-Gan described the relationship between the SQL Server 7.0 version of Profiler and server-side traces in "Problem-Solving with SQL Profiler," April 2000, and "Trace That Event with SQL Server Profiler," April 2000.

As wonderful and powerful as the SQL Server 7.0 tracing capability is, Microsoft knew some changes were in order. Microsoft completely redesigned the trace functionality again, this time for SQL Server 2000. The Profiler GUI for SQL Server 2000 looks like the GUI for SQL Server 7.0. Only when you start to really examine all the tool's possibilities do you begin to see the differences.

The biggest reason to rewrite the tracing tools was to absolutely guarantee that SQL Server 2000 wouldn't lose any trace records under stress. SQL Server 2000 supports C2 auditing (see the sidebar "C2 Certified and Audited," page 40), and losing any trace information is unacceptable if auditing is enabled. In addition, if any records are missing, Microsoft Product Support Services (PSS) will find using Profiler to reproduce problems much more difficult.

Tracing in SQL Server 2000

SQL Server 2000 divides the tracing functionality into two separate but connected components. On the server side is the SQL Trace facility, which manages queues of events. Another thread, called a consumer thread, reads events from the queues and filters the events before it sends them to the requesting process. Events are the main unit of activity for tracing, and an event can be anything that happens inside SQL Server or between SQL Server and a client.

The other component is Profiler. You incur more overhead when you use Profiler to define and capture the events. The server-side consumer thread that captures the requested events returns OLE DB rowsets to a client-side consumer thread, then Profiler must display the rowsets' data appropriately. To reduce the amount of data sent from server to client, each row contains a minimum amount of information—an event code, a data-column code, and a value.

For example, suppose through Profiler a user requests a trace of the text (or name) of each stored procedure when it finishes, the time it finishes, how much time it took to execute, and the requestor's login ID. The server-side trace sends multiple rows to the client every time a stored procedure finishes. Each row contains the event code (43 for Stored Procedure Completed) and one of the requested data-column codes. One row exists for the procedure name, one row for the end time, one row for the execution duration, one row for the user's login name, and so forth. Other rows exist for data columns the user didn't explicitly request because every trace returns a set of default data columns. Profiler must read these columns and pivot them to display one row for each event.

Defining a Server-Side Trace

All the events that are available through Profiler are also available through a server-side trace. For example, creating a table is an event, executing a stored procedure is an event, and sending a T-SQL batch from a client to SQL Server is an event. In "What's New in SQL Server Profiler 2000," page 32, Ben-Gan describes the changes to available events in SQL Server 2000, so I won't elaborate.

You can use a set of stored procedures that Microsoft provides with SQL Server 2000 to define which events you want to trace, what data to collect, and where to save the collected information. You can specify event filters to reduce the amount of information the trace collects and stores, as you can when you use Profiler. However, when you define server-side traces, you must specify numbers instead of Profiler's meaningful names for the events and data columns.

To start a particular trace programmatically, you use the sp_trace_* internal system procedures to define a server-side trace. Although these procedures start with the characters sp_ like most system procedures, these procedures aren't written in T-SQL. If you look at the definition of these procedures through Enterprise Manager or sp_helptext, SQL Server informs you that these procedures are internal. They have the special prefix sp_, so you can invoke them from any database. You can include the calls to these procedures in any SQL Server stored procedure or batch. To use client programs such as Query Analyzer to define a server-side trace, you must know only four (or optionally five) procedures, which Table 1 shows.

The trickiest part of setting up a server-side trace comes when you define the events and data columns: You must use their internal ID numbers. Fortunately, the sp_trace_setevent article in SQL Server 2000 Books Online (BOL) that documents the sp_trace_setevent procedure provides the internal ID number for each event and each data column. The code in Listing 1 creates a trace that writes events to the C:\traceselect file. Note that I supplied a filename without a suffix. SQL Server adds the suffix .trc when it creates this file. (If I specified a filename with a suffix—for example, traceselect.trc—SQL Server would create the traceselect.trc.trc file.) If the file exists, you can't create the trace and the call to sp_trace_setstatus returns an error code of 12. The trace in Listing 1's code captures the start time, duration, name, and database of each Statement Completed event (Event ID 41), but the filter restricts the trace so that it captures only events that have SELECT in the name data column.

The parameters of these trace stored procedures must have the correct data type. For example, the filename parameter for sp_trace_create has the nvarchar data type, so you must supply the filename as a Unicode constant with the n prefix. For the sp_trace_setevent procedure, the last argument, which sets data tracing for the event and the columns specified in the second and third parameters to ON or OFF, must have the bit data type. Because you can't use a CAST or CONVERT statement in the procedure call to force a constant of 1 to have the bit data type, I defined a variable to hold the value, and I passed that variable to every call. To modify this trace without stopping and redefining it, you need to pass a bit value of 0 to stop capturing a particular data column. Finally, for the sp_trace_setfilter procedure, you must explicitly pass the character arguments, such as the pattern string N'%select %', as Unicode constants.

Let's take a closer look at the procedure that creates the trace.

sp_trace_create \[ @traceid = \] trace_id OUTPUT
        , \[ @options = \] option_value
        , \[ @tracefile = \] 'trace_file'
        \[ , \[ @maxfilesize = \] max_file_size \]
        \[ , \[ @stoptime = \] 'stop_time' \]

Note that this trace has two additional parameters that I didn't supply in Listing 1: a maximum file size and a stop time. The stop time is applicable only after the trace has actually started; later in the code in Listing 1, the sp_trace_setstatus procedure starts the trace. The @traceid parameter returns an integer that I must use if I want to modify the trace, stop or restart it, or look at its properties. The code in Listing 1 returns the trace ID's value for informational purposes only, so I'll have the ID if I need to refer to this trace while it runs.

Sp_trace_create's second parameter, @options, lets you specify one or more trace options. Table 2 contains the possible @options values that you can supply to sp_trace_create. A value of 1 tells the trace to produce a rowset and send it to Profiler. You can't use this option value if you capture to a server-side file. Typically, only Profiler-created traces have this option value. Traces that use the sp_trace_create stored procedure should never have an option value of 1, because this value is reserved for Profiler-defined traces.

Because the option_value parameter is a bitmap, you can combine values by adding them together. For example, if you want a trace that enables file rollover and shuts down SQL Server if SQL Server can't write to the trace file, the option value is 6 (4 + 2). Note that you can't combine all option values with all others (e.g., option value 8 by definition doesn't combine with any other option value).

Constantly running some traces can be useful, but be careful about the amount of information you capture. Trace files can grow quite large if they run the entire time that SQL Server runs. To configure a trace to begin automatically every time SQL Server starts, you can put the trace definition into a stored procedure, then use sp_procoption to give that procedure the startup property. Listing 2 illustrates how to create a procedure that defines the trace in Listing 1 and marks that procedure to start automatically.

SQL Server 2000 provides several functions you can use to inspect the status of your traces. The most useful is probably fn_trace_getinfo(), which reports all the parameters that the sp_trace_create procedure specifies, as well as the trace's status (1 = running, 0 = stopped). Here's an example of how to use fn_trace_getinfo():

SELECT * FROM :: fn_trace_getinfo(default)

You could supply a particular trace ID instead of the word default, but if you don't know the trace ID, you can use this form of the statement to see all the traces. Table 3, page 42, contains the output, which shows five rows for five properties, that I received from fn_trace_getinfo() for a trace with a trace ID of 1. The Property 1 row contains the @options parameter value. A trace with a Property 1 value of 1 is most likely a trace started from Profiler. The Property 2 row contains the trace filename, if any. The Property 3 row contains the maximum file size, which in this case defaulted to 5MB, and the Property 4 row contains the stop time, which has no value for my trace. Finally, the Property 5 row shows the trace's status.

To view the trace file's contents and use Profiler to see all the events you've captured, you must remove the trace definition from SQL Server. Stopping the trace isn't enough to let you open the trace file, but you must stop the trace before you remove it. If you try to remove the trace before you stop it, nothing will happen—you won't even get an error message! Listing 3 shows you how to stop and remove a trace with an ID of 1. If you don't know the ID, you can examine the results from the fn_trace_getinfo() function.

Fn_trace_geteventinfo() shows you the events and data columns that a particular trace captures, but the function returns the data with the event and data column IDs instead of a name or explanation, so you must track down their meanings. Fn_trace_getfilterinfo() returns information about a particular trace's filters.

You might have noticed that none of the supplied system procedures let you capture your server-side trace data to a table. This apparent omission is actually a benefit because writing to a table isn't as quick as writing to a file; no matter how busy your server is, having the trace-capture operations not lose any events is essential. Writing to a server-side file uses the same mechanism that SQL Server uses to write to the transaction log, and the trace opens the file only for sequential writing. If you want to capture the data as events occur, table writes would need to happen a row at a time.

If you want to view your trace data in a table, you can copy the data from the trace file to a table after the trace collects the data. Having the trace data in a table lets you use SQL queries to analyze the data. To copy the captured trace data to a table, you can use the table-valued function fn_trace_gettable() in the SELECT INTO statement's FROM clause. This function returns a rowset with one row for each event and a column for every data column that you can trace. You'll probably be interested in only a few columns. Make sure you stop and remove the trace definition before you try to copy the trace data to a table. The example in Listing 4 saves the trace data from the file C:\traceselect.trc to a table called TraceDataTable.

When you store trace data in a table, the EventClass column stores the event type as an integer. The event names that correspond to each integer are in BOL along with the documentation for sp_trace_setevent, but having those event names in a SQL Server table would be useful so that you could do a table lookup. Web Listing 1's table definition and set of INSERT statements let you build a lookup table called EventID_Table, which maps EventClass numbers to event names. (For download instructions, see the More on the Web box.)

Listing 5 shows an example of how to use the EventID_Table and the TraceDataTable to analyze the captured data. The query reports the average duration and number of events for each event type.

Server-side traces are wonderfully useful in all kinds of situations and much more efficient than client-side tracing with Profiler. If you avoid server-side traces because you think writing all the procedure calls and looking up all the necessary event and data column ID information is too hard, take heart. You can use Profiler to define all the events, data columns, and filters that you want, then use the Script Trace option on the File menu to produce the calls to the stored procedures that correspond to the trace elements you select. The only change you need to make is to provide a filename to hold the captured data. Then, you can run the newly created script through Query Analyzer to run your server-side trace.

Performance Considerations

SQL Server tracing incurs no overhead unless it captures an event, and most events need very few resources. Profiler becomes expensive only if you trace all the more than 100 event classes and capture all the data from those events. My preliminary testing shows a maximum of 5 percent to 10 percent overhead if you capture everything. Most of the performance hit is because of a longer code path; the actual resources that the trace needs to capture event data aren't particularly CPU-intensive. In addition, to minimize the performance hit, you can define all your traces as server-side traces and avoid the overhead of producing rowsets to send to the Profiler client.