SQL Server has supported the concept of query hints since the earliest version. However, in those versions there was such a small number of hints you could count them with your fingers. Today, if you look at SQL Server’s documentation, there’s not just one page that lists all the hints because there are too many. You can go to the Hints (Transact-SQL) page to see a list of the three different types of hints, but then you have to go to the page for each type of hint to see a list.
So what is a hint? In the English language a hint is a gentle suggestion, but in SQL Server a hint is a directive. A hint tells the SQL Server optimizer what should be in the query plan and the optimizer will do it, unless the hint is impossible to implement. In fact, many people talk about using hints as forcing a query plan.
The optimizer is the component of the SQL Server engine responsible for determining how queries will be processed. The optimizer determines what indexes will be used, the order in which the table will be processed, the kind of join algorithm that will be executed, whether the query will run on multiple processors, etc. It is probably one of the most complex parts of the entire engine. In early versions of SQL Server, the engineers writing the optimizer thought that one day they would have an optimizer that would always come up with the absolute best plan for every query and that hints would no longer be needed. In fact, prior to SQL Server 7, any time I was able to use a hint to come up with a better query plan than the optimizer, the optimizer engineers wanted to know about it. They wanted to try to figure out why the optimizer didn’t come up with that plan without a hint.
For a while it looked like hints would be deprecated as the optimizer became more and more sophisticated, but the opposite actually happened. As more features were added to SQL Server and queries became more and more complex on larger data sets, the optimizer is now so complex and has so many possible plans to investigate there’s no way it could always come up with the best plan. The goal now is to come up with a “good enough plan” without taking more time to optimize than the query will need for execution. At PASS Summit 2010, Microsoft’s David DeWitt gave a talk entitled “SQL Query Optimization: Why Is It So Hard to Get Right?” describing why the optimizer is so complex. (You can view David’s keynote at www.sqlpass.org/summit/na2010/LiveKeynotes/Thursday.aspx.) Even though more and more hints are added with each version of SQL Server, supporting the fact that the optimizer can’t always come up with the best plan on its own, the documentation page referenced in my first paragraph still includes the following warning:
Because the SQL Server query optimizer typically selects the best execution plan for a query, we recommend that <join_hint>, <query_hint>, and <table_hint> be used only as a last resort by experienced developers and database administrators.
As I previously mentioned, hints fall into three categories, which, as you can see in the warning message, the documentation calls join hints, query hints, and table hints. I actually call the second type “option hints” because they’re specified in an OPTION clause at the end of your queries (and I think all hints are query hints).
We obviously don’t have room for any kind of technical discussion of the use cases for the different types of hints, but I’ll give you one technical tidbit about them. There’s a join hint to force a LOOP join and an option hint called LOOP JOIN, and you might be wondering what the difference is. The join hint is specified in the JOIN clause (so you must be using the join specification using the JOIN keyword). When you use a join hint, it applies to only the two tables on either side of the JOIN keyword. If you use LOOP JOIN as an option hint, it applies to all the joins in the query. So you might think that it doesn’t matter whether you use a join hint or an option hint if there are only two tables, but think again. If you use a join hint, it has the side effect of forcing the join order between the tables. In the first query below, in addition to using a LOOP JOIN the optimizer will make sure that SalesOrderHeader is the first table accessed during execution, whereas in the second query, the optimizer can decide for itself which table should be accessed first.
SELECT * FROM sales.SalesOrderheader h INNER LOOP JOIN sales.SalesOrderDetail d ON h.SalesOrderID = d.SalesOrderDetailID
- SELECT *
- FROM sales.SalesOrderheader h JOIN sales.SalesOrderDetail d
- ON h.SalesOrderID = d.SalesOrderDetailID OPTION (LOOP JOIN)
Keep in mind that once you have used a hint in your code, you have reduced the value of SQL Server’s optimizer. If the optimizer is upgraded in a service pack, you might never know it. And worse, if your data changes so that the plan forced by your hints is no longer the best plan, you might start getting worse performance than you would have without the hint!
Hints definitely have their place. However, that place should not be at the top of your list of query tuning techniques, it should be much closer to the bottom of the list. Hints can be a solution if you haven’t been able to find any other way to get SQL Server’s optimizer to find an acceptable plan. But take a hint from me and learn all you can about tuning and optimization before you start liberally sprinkling query hints throughout your code.