Determining the Position of Search Arguments Within a Join

Q: When writing a join, I always wonder if I can improve performance by moving the search argument into the JOIN clause instead of placing it in the WHERE clause. Listing 1 shows the search argument in the WHERE clause, and Listing 2 shows it in a JOIN clause. What effect would this move have, and is it beneficial?

A: With a WHERE clause, search arguments are applied to all the rows in the result set. In an INNER JOIN, the position of the search argument has no effect on the result set because the INNER JOIN requires all rows to have a match in all joined tables, and all rows have to meet the conditions of the WHERE clause argument. That’s an important point to make because if this query were an OUTER JOIN, the result set returned would be different based on where the search argument is placed. I’ll explain more about this later.

The position of the argument shouldn’t change the performance of the query plan. In an INNER JOIN, SQL Server will recognize the search argument and apply it efficiently based on the plan it chooses. It’s likely that SQL Server will filter the data using the argument before the join, because doing so usually reduces the overall cost of the join. However, there are other factors, such as the indexes that exist, that could influence the query plan and result in different behavior, but again, these aren’t affected by the position of the argument.

As I mentioned above, OUTER JOINs are different—the position of the search argument really matters. When executing an OUTER JOIN, a search argument in the FROM clause will define the result set of the INNER JOIN before the OUTER JOIN results are added. If we wanted to see sales order information, including StateProvinceID, for all US customers but list all customers regardless of StateProvinceID, we could ask for an OUTER JOIN to describe customers and sales (which also requires joins to Person and PersonAddress to expose the StateProvinceID). If we use the base queries you mentioned in your question to do this, we’ll end up with the queries shown in Listing 3 and Listing 4. When executed, you’ll find that they have different query plans and drastically different results. Listing 3 finds all matching rows for customers and only outputs a row if the customer is a US customer. Using the AdventureWorks2008 database, there are a total of 12,041 customers where the CountryRegionCode is US.

Listing 4 uses the CountryRegionCode as the predicate for determining which rows should return the data for the columns in the select list, including sp.StateProvinceID. However, it still produces rows for customers that aren’t in the United States. This query returns 32,166 rows.

Your question really isn’t performance related. SQL Server can optimize your query regardless of the position of a search argument. However, I strongly suggest that all search arguments stay in the WHERE clause so that you never get caught with incorrect data if you decide to change a complex query from an INNER JOIN to an OUTER JOIN.

Discuss this Blog Entry 6

on May 4, 2010
In listing 3, the OUTER joins (except for the Person.Person outer join) are actually turned into INNER joins. Listing 3 is in no way the same as listing 4. The code does not match the output in listing 3, in which case I would make the developer rewrite the code if I were doing the code review.
on May 4, 2010
Kimberly,
Agreed. We're on the same page. I simply stated what you implied in your posting. You didn't quite specify exactly that the outer joins turn into inner joins, so I wanted to get that comment out there for other readers.
on May 16, 2010
OK, I was going to post this as a follow-on Q&A article but it got a bit out of control (in terms of length) so I expanded it into a blog post. Here's the link: http://www.sqlskills.com/BLOGS/KIMBERLY/post/Determining-the-position-of-search-arguments-in-a-join.aspx.

Enjoy!
kt


on May 16, 2010
That's a great point Phil. In fact, I'm writing another article that relates to this point right now. I'll link to it when it's all done. I wasn't really focusing on the data of the query as much as the specific placement of the SARGs and so I overlooked that point.

Thanks!!
Kimberly


on May 4, 2010
Hey there Phil-But that's kind of the point. The *only* difference in the actual code of the query is the position of the search argument. And, when moved - the query becomes completely different. However, people ask me this as a "performance" question (should I move it up into the join to make things faster). For inner joins it doesn't really matter but for outer joins it makes a HUGE difference on the results and completely changes the meaning of the query...

So *YES* these are NOT the same. That's the problem!

Cheers,
kt




on May 4, 2010
Kimberly,
Agreed. We're on the same page. I simply stated what you implied in your posting. You didn't quite specify exactly that the outer joins turn into inner joins, so I wanted to get that comment out there for other readers.

Please or Register to post comments.

What's SQL Server Questions Answered?

Practical tips and answers to many of your questions about SQL Server including database management and performance issues.

Contributors

Paul S. Randal

Paul Randal worked on Microsoft's SQL Server team for nine years in development and management roles, writing many of the DBCC commands. Randal was ultimately responsible for SQL Server 2008'...

Kimberly L. Tripp

Kimberly L. Tripp has been working with SQL Server since 1990, and she’s worked as a consultant, trainer, speaker, and writer specializing in core SQL Server performance tuning and availability...
Blog Archive

Sponsored Introduction Continue on to (or wait seconds) ×