Editor's Note: Thanks to the many readers who have submitted T-SQL Black Belt solutions! You'll see the first of the reader-contributed solutions in the October issue. If you'd like to contribute to the Black Belt section, send a description of your problem and the T-SQL code that finally did the trick to email@example.com. If we use your submission, you'll receive $100 and an exclusive T-SQL Black Belt shirt.
Some 3GLs such as the C language have an interesting feature called short circuit, which aborts any further processing of a logical expression as soon as the system can determine the expression's results. To understand how short circuit works, let's first examine how a programming language evaluates expressions. (Learn more from "A/C Circuits and T-SQL" and "Cycling with CTEs").
Operator precedence determines the order in which the system processes parts of an expression. The system processes operators with higher precedence before operators with lower precedence. The system processes operators with the same precedence from left to right. The following list, adapted from SQL Server Books Online (BOL), shows T-SQL's operator precedence:
- () Primary Grouping
- + (Positive), - (Negative), ~ (Bitwise NOT)
- * (Multiply), / (Division), % (Modulo)
- + (Add), + (Concatenate), - (Subtract)
- =, >, <, >=, <=, <>, !=, !>, !< (Comparison operators)
- ^ (Bitwise Exclusive OR), & (Bitwise AND), | (Bitwise OR)
- ALL, ANY, BETWEEN, IN, LIKE, OR, SOME
Now, consider the following logical expression:1 = 0 AND 1 = 1
According to the operator precedence rule, the system will process the two equality comparisons first, and the two results will be joined with an AND operator (e.g., FALSE AND TRUE). However, systems that support short circuit will not evaluate the right part of the expression (1 = 1) at all, because the first part (1 = 0) is FALSE and the whole expression cannot evaluate to TRUE. This feature saves CPU cycles and in certain situations even prevents runtime errors. Consider the following expression:IF (0 <> 0) AND (1/0 > 0)
PRINT 'This works!'
A system that supports short circuits won't evaluate the right part of the expression (1/0 > 0) at all if the left part is FALSE, as it is in this example. So no runtime error, such as divide by zero, occurs. The same behavior applies in expressions that use the OR operator:IF (1 > 0) OR (1/0 > 0)
PRINT 'This works!'
Here, you can tell from the left part of the logical expression (1 > 0) that the whole expression evaluates to TRUE, so the system doesn't need to continue processing it.
Let's use a more practical example. Consider the following table:CREATE TABLE T1(
col1 int NOT NULL,
col2 int NOT NULL)
INSERT INTO T1 VALUES(1, 1)
INSERT INTO T1 VALUES(1, 0)
INSERT INTO T1 VALUES(-1, 1)
INSERT INTO T1 VALUES(2, 1)
Suppose you want to return all the rows from T1, where the result of col1 divided by col2 is greater than zero. For this discussion, suppose you prefer not to return rows that have a zero value in col2, in which the computation of col1 divided by col2 would result in omega (eternity). The following query results in a divide by zero error:SELECT * FROM T1
WHERE (col1 / col2 > 0)
Note that before the error occurs, SQL Server processes and returns one row. As soon as the error occurs, the query breaks and SQL Server processes no further rows. Let's slightly revise the query:SELECT * FROM T1
WHERE (col2 0)
AND (col1 / col2 > 0)
When I ran this query on my machine with SQL Server 7.0 and Service Pack 2 (SP2), the query ran successfully, returning two rows. Let's examine the execution plan output for the query:|--Filter(WHERE:(\[T1\].\[col2\]0 AND
The system scanned the table and performed a filter to limit the result set according to the WHERE clause's filter. Notice that in this case, the filter in the execution plan appears to be the same from the query. Running the same query on my SQL Server 2000 beta 2 installation returned the same results, but the execution plan looks slightly different:|--Table Scan(OBJECT:(\[testdb\].\[dbo\].\[T1\]),
The query plan shows the scan and the filter in one operation, but the logical expression still looks the same. Let's rewrite the query, flipping the left and the right parts of the logical expression, then execute the query:SELECT * FROM T1
WHERE (col1 / col2 > 0)
AND (col2 0)
On my SQL Server 7.0 machine, I get one row and a divide by zero error. Let's look at the execution plan:|--Filter(WHERE:(\[T1\].\[col1\]/\[T1\].\[col2\]>0 AND
Notice that the expression in the filter appears to be the same as the expression in the query. Because the system evaluates the left part of the expression (col1 divided by col2 > 0) first, as soon as the system processes the row with the value 0 in col2, the query aborts. However, I ran this query successfully on SQL Server 2000. Let's look at the execution plan.|--Table Scan(OBJECT:(\[testdb\].\[dbo\].\[T1\]),
Notice that the optimizer rewrote the expression and used the same expression from the first query. In any case, the queries that ran successfully would have failed if SQL Server didn't support short circuit. However, be aware of two aspects of how SQL Server's short circuit feature works. First, SQL Server will short-circuit an expression if it determines that the expression evaluates to either TRUE or FALSE. Second, there's no query hint you can apply to force the optimizer to use the exact expression you wrote so that the expression will appear the same way in the plan that SQL Server builds. Although in the previous example, the optimizer revised the logical expression col1/col2 AND col2>0 to col2>0 AND col1/col2, you have no guarantee that it wouldn't do the opposite when you use the form col2>0 AND col1/col2.
You can also try using one of the variations I present in Figure 1 and Figure 2 to rewrite your query as a derived table. On my SQL Server 7.0 machine, Query 1, which callout A in Figure 1 shows, ran successfully. Query 2, which callout A in Figure 2 shows, failed. But can you guarantee that such queries will always behave this way? Look at the execution plans in each figure. Notice that in both cases the optimizer combined the two simple expressions to form one complex expression. Again, you can't guarantee the order, even though it looks as if you can. Running two different queries in SQL Server 2000 resulted in the same execution plans, as Figure 1 shows for Query 1 and Figure 2 shows for Query 2. Notice that the optimizer revised the expression in Query 2, and the execution plans for both queries produced the same result; they both ran successfully, as callout C in Figure 1 and callout C in Figure 2 show.
If you want to guarantee a certain order of processing in an expression, you need to use a CASE expression. Let's revise our query:SELECT * FROM T1
WHEN col2 = 0 THEN 0
WHEN col1 / col2 > 0 THEN 1
END = 1
CASE is by nature a short circuit, which lets you control the processing order. SQL Server always evaluates CASE expressions from top to bottom, and when any of the expressions evaluates to TRUE, SQL Server doesn't process any further expressions.
I've demonstrated that the optimizer might revise the logical expressions you write. Therefore, you can't count on this feature when you write your queries. But for performance considerations, you benefit from knowing that SQL Server internally supports the short circuit feature, and that this feature saves many CPU cycles. In many cases, you can rewrite logical expressions to use a CASE expression instead, guaranteeing a short circuit.
This article is adapted from Advanced Transact-SQL for SQL Server 2000 (Apress), by Itzik Ben-Gan and Tom Moreau.