DOWNLOAD THE CODE:
Download the Code 97007.zip

Last month, in “Grouping Sets, Part 1” (InstantDoc ID 96805), I described grouping sets and introduced the new SQL Server 2008 GROUPING SETS subclause of the GROUP BY clause. This month, I describe three more new SQL Server 2008 features related to grouping sets: the new CUBE and ROLLUP options and the GROUPING_ID function. In last month’s examples, I used the Orders table, which you create and populate by running the code in Web Listing 1, and this month I use the same Orders table in my examples. As always, be sure to read last month’s article as a prerequisite to this one.

CUBE and ROLLUP
You might be familiar with the CUBE and ROLLUP options in versions of SQL Server prior to SQL Server 2008. You specified these in a separate WITH clause following the GROUP BY clause, they were very rigid, and they weren’t ISO compliant. The old CUBE and ROLLUP options produced specific grouping sets that you couldn’t combine with other grouping sets in the same query. SQL Server 2008 introduces new, flexible CUBE and ROLLUP subclauses; that is, you can combine them with other grouping sets in the same query, as I’ll demonstrate shortly. Also, the new CUBE and ROLLUP subclauses are ISO compliant.

Think of the new CUBE and ROLLUP subclauses as abbreviations of lengthier GROUPING SETS subclauses. CUBE gives you something akin to (in set theory) a power set. Given a set of attributes as input, CUBE produces all possible permutations of grouping sets out of those attributes, including an empty grouping set. Out of n attributes, CUBE produces 2n grouping sets. For example, CUBE(a, b, c) is equivalent to GROUPING SETS ( (a, b, c), (a, b), (a, c), (b, c), (a), (b), (c), () ). So, the query

SELECT custid, empid, SUM(qty) AS qty
FROM dbo.Orders
GROUP BY CUBE(custid, empid);

is equivalent to

SELECT custid, empid, SUM(qty) AS qty
FROM dbo.Orders
GROUP BY GROUPING SETS( (custid, empid),
(custid), (empid), () );

Both generate the output that Web Table 1 (InstantDoc ID 97007) shows.

ROLLUP is handy when the attributes given as input have a hierarchy—for example, the attributes country, region, and city. Suppose you want to find all permutations of grouping sets that have business value that can be produced out of these attributes. The grouping set (city) has no business value because there might be different cities in your data that have the same name but appear in different countries and regions. There’s no business value in grouping data by the city alone because different cities that happen to have the same name will be arranged in the same group. However, the grouping set (country, region, city) does have business value because it generates a group for each unique city(within a specific country and region). So, the subclause ROLLUP (country, region, city) is equivalent to GROUPING SETS ( (country, region, city), (country, region), (country), () ). Out of the eight possible permutations of grouping sets that CUBE would produce in this case, ROLLUP produces four. More generally, out of n attributes, CUBE produces 2n grouping sets, whereas ROLLUP produces n + 1 grouping sets—those that have business value in case of a hierarchy among the attributes. Similarly, ROLLUP(YEAR(orderdate), MONTH(orderdate), DAY(orderdate)) is equivalent to GROUPING SETS( (YEAR(orderdate), MONTH(orderdate), DAY(orderdate)), (YEAR (orderdate), MONTH(orderdate)), (YEAR(orderdate)), () ).

To test the ROLLUP option, run the following query:

SELECT
YEAR(orderdate) AS orderyear,
MONTH(orderdate) AS ordermonth,
DAY(orderdate) AS orderday,
SUM(qty) AS qty
FROM dbo.Orders
GROUP BY ROLLUP(YEAR(orderdate),
MONTH(orderdate), DAY(orderdate));

You’ll get the output that Web Table 2 shows. Run the logical equivalent with the explicit GROUPING SETS subclause, and observe that you get the same output:

SELECT
YEAR(orderdate) AS orderyear,
MONTH(orderdate) AS ordermonth,
DAY(orderdate) AS orderday,
SUM(qty) AS qty
FROM dbo.Orders
GROUP BY GROUPING SETS(
(YEAR(orderdate), MONTH(orderdate),DAY(orderdate)),
(YEAR(orderdate), MONTH(orderdate)),
(YEAR(orderdate)),
() );

As I mentioned earlier, the new CUBE and ROLLUP subclauses are flexible in the sense that you can combine them with other grouping sets in the same query. In fact, you can cross multiple CUBE, ROLLUP, and GROUPING SETS subclauses in the same GROUP BY clause. You can have as many as you like, as long as you don’t exceed the hardcoded limit of 4096 grouping sets per query. The following query is an example:

SELECT custid, empid,
YEAR(orderdate) AS orderyear,
MONTH(orderdate) AS ordermonth,
DAY(orderdate) AS orderday,
SUM(qty) AS qty
FROM dbo.Orders
GROUP BY
CUBE(custid, empid),
ROLLUP(YEAR(orderdate),
MONTH(orderdate), DAY(orderdate));

To determine which grouping sets you get, first expand the CUBE and ROLLUP subclauses to the logically equivalent GROUPING SETS subclauses. The above query’s GROUP BY clause is equivalent to Figure 1.

After applying the Cartesian product between the two subclauses, you get 16 grouping sets, as Figure 2 shows.

   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.