Last month, in “Grouping Sets, Part 1” (InstantDoc
ID 96805), I described grouping sets and introduced
the new SQL Server 2008 GROUPING
SETS subclause of the GROUP BY clause. This month, I
describe three more new SQL Server 2008 features related
to grouping sets: the new CUBE and ROLLUP options and
the GROUPING_ID function. In last month’s examples,
I used the Orders table, which you create and populate by
running the code in Web Listing 1, and this month I use the
same Orders table in my examples. As always, be sure to
read last month’s article as a prerequisite to this one.
CUBE and ROLLUP
You might be familiar with the CUBE and ROLLUP
options in versions of SQL Server prior to SQL Server
2008. You specified these in a separate WITH clause
following the GROUP BY clause, they were very rigid,
and they weren’t ISO compliant. The old CUBE and
ROLLUP options produced specific grouping sets that
you couldn’t combine with other grouping sets in the
same query. SQL Server 2008 introduces new, flexible
CUBE and ROLLUP subclauses; that is, you can combine
them with other grouping sets in the same query, as I’ll
demonstrate shortly. Also, the new CUBE and ROLLUP
subclauses are ISO compliant.
Think of the new CUBE and ROLLUP subclauses as
abbreviations of lengthier GROUPING SETS subclauses.
CUBE gives you something akin to (in set theory) a power
set. Given a set of attributes as input, CUBE produces
all possible permutations of grouping sets out of those
attributes, including an empty grouping set. Out of n
attributes, CUBE produces 2n grouping sets. For example,
CUBE(a, b, c) is equivalent to GROUPING SETS ( (a, b,
c), (a, b), (a, c), (b, c), (a), (b), (c), () ).
So, the query
SELECT custid, empid, SUM(qty) AS qty
FROM dbo.Orders
GROUP BY CUBE(custid, empid);
is equivalent to
SELECT custid, empid, SUM(qty) AS qty
FROM dbo.Orders
GROUP BY GROUPING SETS( (custid, empid),
(custid), (empid), () );
Both generate the output that Web Table 1 (InstantDoc
ID 97007) shows.
ROLLUP is handy when the attributes given as input
have a hierarchy—for example, the attributes country,
region, and city. Suppose you want to find all permutations
of grouping sets that have business value that
can be produced out of these attributes. The grouping
set (city) has no business value because there might be
different cities in your data that have the same name
but appear in different countries and regions. There’s
no business value in grouping data by the city alone
because different cities that happen to have the same
name will be arranged in the same group. However, the
grouping set (country, region, city) does have business
value because it generates a group for each unique
city(within a specific country and region). So, the
subclause ROLLUP (country, region, city) is equivalent
to GROUPING SETS ( (country, region,
city), (country, region), (country), () ). Out of
the eight possible permutations of grouping sets
that CUBE would produce in this case,
ROLLUP produces four. More generally,
out of n attributes, CUBE produces 2n
grouping sets, whereas ROLLUP produces n
+ 1 grouping sets—those that have business
value in case of a hierarchy among the attributes.
Similarly, ROLLUP(YEAR(orderdate),
MONTH(orderdate), DAY(orderdate)) is equivalent
to GROUPING SETS( (YEAR(orderdate),
MONTH(orderdate), DAY(orderdate)), (YEAR (orderdate),
MONTH(orderdate)), (YEAR(orderdate)), () ).
To test the ROLLUP option, run the following
query:
SELECT
YEAR(orderdate) AS orderyear,
MONTH(orderdate) AS ordermonth,
DAY(orderdate) AS orderday,
SUM(qty) AS qty
FROM dbo.Orders
GROUP BY ROLLUP(YEAR(orderdate),
MONTH(orderdate), DAY(orderdate));
You’ll get the output that Web Table 2 shows.
Run the logical equivalent with the explicit
GROUPING SETS subclause, and observe that you get
the same output:
SELECT
YEAR(orderdate) AS orderyear,
MONTH(orderdate) AS ordermonth,
DAY(orderdate) AS orderday,
SUM(qty) AS qty
FROM dbo.Orders
GROUP BY GROUPING SETS(
(YEAR(orderdate), MONTH(orderdate),DAY(orderdate)),
(YEAR(orderdate), MONTH(orderdate)),
(YEAR(orderdate)),
() );
As I mentioned earlier, the new CUBE and ROLLUP subclauses
are flexible in the sense that you can combine them
with other grouping sets in the same query. In fact, you
can cross multiple CUBE, ROLLUP, and GROUPING
SETS subclauses in the same GROUP BY clause. You
can have as many as you like, as long as you don’t exceed
the hardcoded limit of 4096 grouping sets per query. The
following query is an example:
SELECT custid, empid,
YEAR(orderdate) AS orderyear,
MONTH(orderdate) AS ordermonth,
DAY(orderdate) AS orderday,
SUM(qty) AS qty
FROM dbo.Orders
GROUP BY
CUBE(custid, empid),
ROLLUP(YEAR(orderdate),
MONTH(orderdate), DAY(orderdate));
To determine which grouping sets you get, first expand the
CUBE and ROLLUP subclauses to the logically equivalent
GROUPING SETS subclauses. The above query’s
GROUP BY clause is equivalent to Figure 1.
After applying the Cartesian product between the two
subclauses, you get 16 grouping sets, as Figure 2 shows.
Prev. page  
[1]
2
next page