SQL Server 2005 introduces the APPLY operator, a handy new operator that's used in a query's FROM clause. The APPLY operator basically lets you apply a table expression to every row of an outer table. In terms of syntax, the APPLY operator is somewhat similar to a CROSS JOIN in the sense that it has two inputs that are table expressions:
\[CROSS | OUTER\] APPLY
A table expression can be a table, a view, a derived table, or even a function that returns a table. The APPLY operator applies the right table expression to every row from the left table expression. The really cool part about the APPLY operator is that the right table expression can refer to columns from the row in the left table expression.
This technical definition of APPLY doesn't quite convey exactly what the operator does and the benefits of using it. Let's look at some specific examples to give you an idea of how you might use APPLY.
TOP n for Each Group
Using the TOP n clause to return a specified number of rows from each group of rows is problematic in SQL Server 2000 because it's difficult to devise a set-based solution that performs reasonably with large tables. For example, you want to return the two most recent sales rows (i.e., rows having the latest ord_date values) that have a quantity greater than or equal to 10 for each store from the Sales and Stores tables in the pubs database. Because ord_date isn't unique, you should use the remaining primary key columns (ord_num, title_id) as your tiebreaker. In other words, if you sort the sales rows by store_id, ord_date DESC, ord_num DESC, and title_id DESC in that order, you should return the first two orders for each store. Try to solve the problem by using a set-based solution in SQL Server 2000, and you'll find it extremely hard to find a solution.
The new T-SQL elements in SQL Server 2005 provide several simple techniques to solve the problem. One such technique is to use the new ROW_NUMBER() function, but I won't discuss this technique here since it isn't the focus of the article. Moreover, you can create a faster solution with the new APPLY operator.
Listing 1 shows the solution that uses the new APPLY operator; Table 1 shows its output. The solution applies to each store from the Stores table (left input) the table expression that appears in the query as the right input. The derived table returns the two most recent sales rows where the stor_id is equal to the stor_id in the outer row and the quantity is greater than or equal to 10.
The ORDER BY clause determines which two rows the TOP query will return. It sorts the data by ord_date DESC, ord_num DESC, title_id DESC, meaning that you get the rows with the highest ord_date and use ord_num and title_id as the tiebreaker. The CROSS keyword means that if the right input returns an empty set for a row from the left input, the row from the left input isn't returned at all. Such is the case, for example, with store 6380, which has no sales row with a quantity greater than or equal to 10. Notice in Table 1 that store 6380 isn't returned at all. If you want to return stores with no matching sales rows, use OUTER APPLY instead of CROSS APPLY, as Listing 2 shows. You can see in the query output that Table 2 shows that store 6380 is also returned with NULLs in the right input's attributes.
APPLYing a User-Defined Function
Another interesting use of the APPLY operator is to apply a table-valued function to every row of an outer table. To demonstrate this functionality, run the code in Listing 3 to create the fn_getsubtree function. For our purposes, the actual code implemented in the function isn't really important because the focus is the function usage by the APPLY operator. The function returns from the Employees table in the Northwind database all subordinates in all levels of an input employee ID.
To test the function, invoke it with employee ID 2 as the input:
SELECT * FROM dbo.fn_getsubtree(2);
Table 3 shows the output of this code, which contains all subordinates of employee 2 in all levels.
Now, suppose you want to return the employee ID and full name of employees 2 and 5, along with their subordinates in all levels. You can achieve this by running the query in Listing 4. The query applies the fn_getsubtree function to employees 2 and 5 from the Employees table. So simple! You can see in the output of the query that Table 4 shows that employee 2 has eight subordinates (excluding himself), and employee 5 has three subordinates.
Another interesting related enhancement lets you invoke a table expression that refers to the outer table's columns in subqueries. For example, to return employees that have more than two direct or indirect subordinates, run the query in Listing 5.
- M.EmployeeID AS mgrid,
- M.FirstName + ' ' + M.LastName AS mgrname
- FROM dbo.Employees AS M
- WHERE (SELECT COUNT(*) FROM dbo.fn_getsubtree(M.EmployeeID)) - 1 > 2;
Table 5 shows the query's output.
The query returns all employees from the Employees table where the count of rows returned from the get_subtree function minus 1 (self) is greater than 2. Only Andrew Fuller and Steve Buchanan have more than two direct or indirect subordinates in total.
APPLY the New Operator
You can use the new APPLY operator in many interesting ways. APPLY's main benefit lies in its ability to refer to the outer table's columns from the inner table expression-functionality that earlier versions of SQL Server didn't support. You'll probably find the new APPLY operator handy, especially for optimizing your slow-running code.