Error handling is an essential part of all T-SQL code, but many developers regard it as at best a nuisance or at worst irrelevant. Even when developers do implement error handling, misconceptions about how various error-handling strategies behave under different conditions can lead to poor implementation. Because the use and misuse of error-handling routines is poorly understood, I've introduced a major education drive at my workplace to improve the standard of new and modified code we're releasing into our production environment. To help you improve your organization's error-handling procedures, I want to share what we've learned by introducing rigorous scrutiny into our code-review process. First, I explain the reasons for using error handling in code and the best ways to use error handling. Then, I offer template code snippets that you can paste into your own T-SQL code. Finally, I show examples of common mistakes in T-SQL error handling.

Why Use Error Handling?

T-SQL changes data by using Data Manipulation Language (DML) commands, namely INSERT, UPDATE, DELETE, and SELECT. Although these commands generally work correctly, DML statements can fail for any of several reasons including lock timeout, deadlock, and constraint violations. When such failures happen, the executing code needs to detect the condition, send a response to the code that called it, and take an appropriate corrective action.

Failure to correct an error can result in undesirable situations such as "leaky" transaction handling, which leaves transactions and causes excessive blocking. Non-termination of a transaction that encounters an error can mean that subsequent updates to that transaction continue and complete, causing data inconsistency. If you don't detect the resulting data errors that lead to incorrect results, those errors persist and propagate within the database.

When and How to Use Error Handling

After every DML statement, your code should use the @@ERROR system function to test for system errors and the @@ROWCOUNT system function to test for logic errors. If your code doesn't include this testing, the calling application won't know that anything has gone wrong and therefore can't take appropriate action.

The way error handling works depends on the type of application environment you use. For example, many applications use COM+ architecture to make database requests through Microsoft Transaction Server (MTS), which lets the application layer optionally control error handling. Other applications (e.g., Query Analyzer, osql) don't use MTS, which means that the T-SQL code must control error handling. It's possible to call one piece of code from both types of environment, depending upon the circumstances, so error handling should work with both types of architecture.

Recommended Error-Handling System Functions

Whenever a T-SQL error occurs, you can get the error number by calling the @@ERROR system function. A non-zero value for this function indicates that a DML statement has executed incorrectly and you need to take action. @@ERROR returns the status value of the most recently executed statement. Table 1 lists the various values that @@ERROR will take on.

Note that nearly every statement— including such statements as IF, PRINT, and SET—will modify the value of @@ERROR. The three ex- ceptions to this rule are RETURN statements that have no specified return value (RETURN n resets @@ERROR to zero), BEGIN statements, and DECLARE statements.

Because the value that @@ERROR returns always has the potential to change, testing the value of the function directly is an unreliable error-handling method. You need to save the value in a local variable immediately after a DML statement; then, you can base tests for the statement's success or failure on the more predictable value of the local variable.

The @@ROWCOUNT system function displays the count of the number of rows affected by the most recent DML statement. However, like @@ERROR, nearly all statements affect this function's value. The two exceptions to this rule are the BEGIN statement and the DECLARE statement, which don't affect the value of @@ROWCOUNT. Table 2 lists the value of @@ROWCOUNT when it appears after different categories of command statements.

Because the value of @@ROWCOUNT, like @@ERROR, is constantly changing, you should store the function's current value in a local variable immediately after every DML statement so that subsequent tests on that value are reliable. Most DML statements affect an expected number of rows (i.e., you usually know whether a statement will affect 0, 1, or many rows), and you should always use @@ROWCOUNT to confirm the number of affected rows. For example, if the update of a supposedly unique row actually affects two or more rows, your code should roll back that transaction and display a sensible message indicating the nature of the error so that you can investigate the problem. In some situations, knowing the correct number of records is impossible or irrelevant, so the rowcount check isn't mandatory. However, I would urge you to use it whenever possible.

As I explained, I don't recommend testing the value of system functions directly because these values can change frequently. Thus, you need to save the two functions' values into local variables, but you must do so in one command. If you save the values in separate commands, the action of saving the first value will modify the value that the remaining function returns. The code in Listing 1 shows how to correctly save the values, whereas the code in Listing 2 shows what happens when you try to save the values incorrectly.

The RETURN Statement

All T-SQL code should exit with a RETURN status on both success and failure. You can select the value you want SQL Server to return based on your own standards and how you want the calling code to respond, but I recommend returning 0 on success because this value matches the no-error value of @@ERROR and is the default value if you use a RETURN statement without an explicit value. On failure, I recommend your code returns the error number so that you can determine exactly what caused the failure. If an unexpected rowcount is the culprit, the code can't return an error number because none exists, so I suggest a standard return code (we use -999) that any calling application knows to interpret as "unexpected rowcount."

Untrappable Errors

The previous examples show how to "trap" the error number of a particular error and act on it. Some types of error cause immediate termination of a batch, giving you no opportunity to call @@ERROR, and so are "untrappable." For example, if you reference a non-existent table, the result is a resolution error with the error message Server: Msg 208, Level 16, State 1, Line 1 Invalid object name '%d!

When this error occurs, no further code executes in that batch or stored procedure and the current transaction does not roll back. In this example, @@ERROR is set to 208, but no subsequent tests for @@ERROR in that batch or stored procedure are valid (SQL Server will never execute those lines of code), and no return status is set. The code that calls the routine in which the error occurred needs to test for @@ERROR immediately after calling the routine. I haven't been able to find any definitive documentation listing which particular errors behave this way, but another common example is a data-type conversion error. Such errors are generally of severity 16 or higher, but not all severity 16 errors are untrappable. Microsoft acknowledges this problem and provides a workaround in the article "PRB: Inconsistent Error Handling Behavior Occurs with @@ERROR in SQL Server," default.aspx?scid=kb;en-us;811034.

Recommended Error-Handling Templates

Listing 3 shows a suggested error-handling template that caters to both MTS and non-MTS environments. This template provides a consistent type of check for both errors and rowcount and ensures that a return value is always specified. Listing 4 shows a suggested template for executing stored procedures and testing for their return status.

Listing 4's template is designed to work in partnership with the procedure code in Listing 3. Where I work, we now insist that everyone writes all new code to this format, and any bug fixes that involve changes to existing code must also include modifications to the error handling to bring the old-style code in line with the new as much as possible.

Common Error-Handling Pitfalls

The code in Listings 5 through 10 shows some problematic error-handling implementations that I've encountered over the years. Some of the code is just plain wrong. Some might appear to work at first glance but on closer inspection is unreliable. In all cases, these snippets assume the update is failing with a trappable error.

Not testing @@ERROR immediately. In the example code that Listing 5 shows, even if the update fails with a trappable error, @@ERROR will be zero when the code tests it because it takes the value of the most recent T-SQL statement, which in this case is the SET command.

Storing system function values separately. In the example that Listing 6 shows, the code stores @@ERROR and @@ ROWCOUNT in separate SELECT statements. As a result, the SELECT @row_ count command will reset @@ERROR. Always store @@ERROR and @@ROW COUNT in one command.

Inadvertently resetting @@ERROR to 0. This one is a bit more subtle. In Listing 7 the line IF @@ERROR <> 0 will reset @@ERROR to zero as it executes. This means that the line set @err_status = @@ERROR will always set @err_status to zero. The result is that the RETURN command always returns zero instead of the error number. Thus, the calling code will take a different action than it would if a non-zero result was returned. Listing 8 shows another way this common mistake can occur. Like the code in Listing 7, the line IF @@ERROR <> 0 resets @@ERROR to zero as it executes.

Calling code doesn't check correctly for return status values. A common assumption is that simply testing for @@ERROR outside a stored procedure call is sufficient to trap an error. It isn't. True, this is the only way to detect untrappable errors occurring within the stored procedure, but such errors are comparatively rare when compared with trappable errors, and the template in Listing 4 works for both trappable and untrappable errors far more reliably.

Code that incorrectly tests for return status from stored procedures. In the code that Listing 9 shows, the value of @@ERROR will be set to whatever the last statement in SP_1 happens to be. This means that @@ ERROR isn't guaranteed to be non-zero, even if SP_1 fails and rolls back correctly. The value might be non-zero, but it's the unreliability that makes this method a poor choice. The only time that @@ERROR will definitely be non-zero is if SP_1 fails with an untrappable error.

Listing 10 shows another example of how this kind of problem can occur. Because of the way that SP_1 is coded in Listing 10, the stored procedure always resets @@ERROR to zero before it returns. Even though the stored procedure returns a non-zero value, the calling code doesn't test for a non-zero value; it merely tests for the value of @@ERROR, which by now is zero. Thus, the calling code never enters the error handler. Although the missed rollback isn't important (SP_1 has already rolled back the transaction thus far), the fact that the calling code doesn't return at this point is disastrous because it continues to execute and calls SP_2. If SP_2 completes successfully, you'll have a partially complete transaction instead of one that has completely aborted.

Error-Handling Best Practices

Although incorrect error handling is common, many valid ways of achieving a good, thorough error-handling standard exist; the templates in this article are only one approach. In general, remember that you must save the values of both @@ERROR and @@ ROWCOUNT in local variables immediately after every DML statement—and save them in one statement. Remember that all stored procedures must include both MTS and T-SQL error handling (i.e., Raiserror and Rollback). Always check the expected rowcount. Remember that all stored procedures must return an error status—preferably 0 for success or non-zero for failure, because 0 is the default return status in T-SQL. Also, @@ERROR takes the value 0 to indicate "no error." Finally, stored procedure execution must test for non-zero return status and also test for @@ERROR to catch untrappable errors from within the stored procedure.