SQL Server 2000 SP3 enhancements make troubleshooting easier
In my previous two columns, I told you about several system tables that I find most useful and examined some columns in one of my favorite system tables: sysprocesses. I also described how you can use columns from sysprocesses to troubleshoot locking and blocking problems. (See "Inside Sysprocesses," July 2003, InstantDoc ID 38674, and "Track Down Troublemakers," August 2003, InstantDoc ID 39453, for details.) This month, I finish discussing the usefulness of sysprocesses by looking at some columns that Microsoft added to that table in SQL Server 2000 Service Pack 3 (SP3).
Microsoft originally added these columns for a customer who was developing a tool for SQL Server monitoring. But these changes are now part of the released product; if you've installed SP3 or the newer SP3a, the product enhancements are available to you as well.
Microsoft released SP3a in May. Because most of the changes in SP3a involve the installation process, you don't need to apply SP3a to any instances of SQL Server 2000—or Microsoft SQL Server Desktop Engine (MSDE) 2000—that you've already upgraded to SP3. For instances that you haven't upgraded to SP3, Microsoft recommends upgrading them directly to SP3a. Now, let's look at the following SP3 product enhancements:
- The new table-valued system function fn_get_sql()
- Three new sysprocesses columns: sql_handle (binary), stmt_start (int), and stmt_end (int)
- The new trace flag 2861
These three features can assist you when you use the sysprocesses table for troubleshooting.
Fn_get_sql() is a table-valued system function that returns the text of the T-SQL batch that's currently running on a specific connection. This function is useful because knowing the current command text can be an important part of troubleshooting a poorly performing or unresponsive query. To use the fn_get_sql() function, you must first know the server process ID (SPID) of the connection you're troubleshooting; then, you can use that SPID in the WHERE clause of a query that retrieves data from sysprocesses. The function also needs you to pass it a SQLHANDLE (a type of pointer) for the connection you're investigating. You can find the SQLHANDLE in that connection's row in the sysprocesses table.
After you retrieve the data, including the SQLHANDLE, from sysprocesses, you can pass the SQLHANDLE as an argument to the fn_get_sql() function. If you pass it a handle that no longer exists in cache, fn_get_sql() returns an empty result set. Because the function returns a table, it can't return a NULL. If you pass an invalid SQLHANDLE parameter, the batch aborts and you get an error message.
Let's look at the details of how to use this function. The fn_get_sql() function returns four columns of information in addition to a column containing the text of the current batch. The updated SQL Server 2000 SP3 Books Online (BOL) shows the definitions of all five columns, but in this discussion, I look only at the column called text.
Sysprocesses has three new columns containing the following information: the SQLHANDLE to use as the argument to the function, a starting position, and an ending position in the batch for the currently executing statement. Table 1, page 22, shows the BOL descriptions of these columns.
Listing 1's code, page 22, demonstrates how to use the SQLHANDLE and the starting and ending values from sysprocesses as arguments to the fn_get_sql() function. You'll need to change the double pound sign (##) to the SPID for the connection where you're running Listing 1. You can use @@spid to get the current SPID.
In the statement that calls the fn_get_sql() function in Listing 1, the SUBSTRING() function looks quite complex. First, you need to use the stmt_start and stmt_end values from sysprocesses, which you stored in the local variables @start and @end. Although the function returns the batch as a text data type, the @start and @end values are offsets in an ntext field. Each ntext data type character uses 2 bytes, so you need to divide the @start and @end values by 2 to get the correct offsets in the function's text column.
In addition, the SUBSTRING() function requires a position in the string, and positions usually start at 1, whereas offsets start at 0. Another quirk is that the third argument to the SUBSTRING() function isn't the ending position that stmt_end returns but rather the length of the desired substring, so you need to perform a few more calculations, as Listing 1 shows. It took me a while to get this code working because BOL doesn't mention these subtle details. However, I had help in solving these mysteries from Erland Sommarskog, a SQL Server MVP from Stockholm who has also been working with the fn_get_sql() function.
The SUBSTRING() function in this code also includes one special case. An @end value of -1 means that you want all the characters until the end of the text string.
Listing 1 returns results that look similar to Figure 1: a T-SQL statement. You might have a different value for the SPID instead of the 55 that I used. Note that this statement is just one of the statements in the batch comprising Listing 1, but this statement was executing when the fn_get_sql() function examined the sysprocesses table, so it was the one that the function returned.
In some cases, you can't get information by using the fn_get_sql() function. For example, SQL Server can't cache certain statements, such as bulk-operation statements and statements that contain string literals larger than 8KB. Handles to those statements aren't retrievable at all because they don't exist. Also, if a statement is part of an encrypted stored procedure, you can't see the text. In that case, although you can retrieve a SQLHANDLE, the value SQL Server returns for the text column when you pass it this SQLHANDLE is NULL.
Finally, if SQL Server considers the plan for the batch or statement to be trivial or cheap, it frequently doesn't store that plan in cache, so you can't retrieve it with fn_get_sql(). Many factors determine whether SQL Server considers a plan trivial or cheap and whether it caches such a plan. In my testing, I found that SQL Server only sometimes cached the batch in Listing 1. To force SQL Server to always cache even trivial or cheap plans, the documentation states that you can enable trace flag 2861. Because you don't know in advance when you'll have a troublesome query that you'll need to investigate, you might consider turning on this trace flag as part of your SQL Server instance's startup parameters. From Enterprise Manager, right-click your server and choose Properties (not Edit SQL Server Registration Properties). On the General tab, click Startup Parameters near the bottom of the dialog box. Type -T2861 in the Parameter box, click Add, then click OK twice. To finish, you need to restart your SQL Server.
Alternatively, you can execute DBCC TRACEON(2861, -1), which causes every new connection to start with this trace flag on but doesn't affect existing connections. Also, stopping your SQL Server turns off the trace flag if you use this method.
If you've been troubleshooting SQL Server for a while, you might be aware that the information that fn_get_sql() returns is similar to the information that the DBCC INPUTBUFFER command returns (i.e., the last batch a connection executed). However, fn_
get_sql() provides more benefits than DBCC INPUTBUFFER, including extra options you can use with it (e.g., the starting and ending positions of the current statement within the batch). And whereas DBCC INPUTBUFFER can return no more than 255 characters, fn_get_sql() is limited only to the size of a text field: 2GB.
Another big difference between fn_get_sql() and DBCC INPUTBUFFER is that the new function can show you statements even within nested stored procedures, but DBCC INPUTBUFFER can show you only the initial batch that called the outermost procedure. The code in Listing 2 provides an example of this difference. Running the script creates a table, testtable, and populates it with a row of data, then creates the procedure testproc, which updates a row in the table. Finally, the script creates a second procedure, exectest, which calls the first procedure. After running the script, start a transaction that retrieves the row in testtable but doesn't commit and release its lock:
SELECT * FROM testtable (HOLDLOCK)
In a new connection, run the outer procedure:
This call will be blocked because the nested procedure is trying to update a row in a table that's locked. In a third connection, run the code from Listing 1, but change the SPID to the process ID of the connection where you're running the outer procedure, exectest. You should get the following result:
UPDATE testtable SET a = 12;
You can compare this result to the result of running DBCC INPUTBUFFER. This command requires only the SPID as a parameter, so if you substitute the SPID of your blocked process for XX, you can run
This command returns only the outermost batch, as Figure 2 shows.
The documentation for fn_get_sql() in the SP3 version of BOL shows the additional columns of information available from sysprocesses. With the head start I've given you, you'll be able to use that additional information to extract even more data and write a robust stored procedure to capture whatever information you need regarding your problematic queries. The ability to use the information returned from sysprocesses to call fn_get_sql() is another reason that sysprocesses is one of the most useful troubleshooting tools in SQL Server.