SideBar    Getting Your Feet Wet in SQL Server 2008, The Logical Puzzle
DOWNLOAD THE CODE:
Download the Code 96805.zip

Run the three queries above and see for yourself. Query 1 and Query 2 produce the output that Web Table 2 shows, and Query 3 produces the output that Web Table 3 shows. In other words, within the parentheses of the GROUPING SETS option, commas are used as a separator between grouping sets. Within a particular grouping set, commas are used as a separator between the grouping set's elements or attributes. For clarity, you might want to use parentheses surrounding each grouping set even when the grouping set contains a single element. For example, the following is logically equivalent to Query 3:

SELECT custid, empid, YEAR(orderdate)
AS orderyear, SUM(qty) AS qty
FROM dbo.Orders
GROUP BY GROUPING SETS
(
( custid ),
( empid ),
( YEAR(orderdate) )
);

Mixing Grouping Sets and Grouping Sets Algebra
You can have multiple GROUPING SETS subclauses in the same GROUP BY clause. If you specify multiple ones, something interesting will happen: You'll get a Cartesian product or a cross of the grouping sets in the various GROUPING SETS subclauses. This probably sounds very cryptic, and would be best explained through examples. Given grouping sets (A), (B), (C), (D), and (E), where each capital letter represents a set of elements in the grouping set (e.g., A stands for the set of elements a1, a2, ..., an), the following pseudo GROUP BY clause:

GROUP BY
GROUPING SETS( (A), (B), (C) ),
GROUPING SETS( (D), (E) )
is logically equivalent to
GROUP BY
GROUPING SETS( (A, D), (B, D), (C,
D), (A, E), (B, E), (C, E) )

How about a more tangible example? The code in Listing 5 has two GROUPING SETS subclauses. The first GROUPING SETS subclause has three grouping sets:

GROUPING SETS
(
( custid, empid ),
( custid ),
( empid )
)
The second GROUPING SETS subclause has two
grouping sets:
GROUPING SETS
(
( YEAR(orderdate),
MONTH(orderdate) ),
( YEAR(orderdate)
)
)

You get six grouping sets as a result of the Cartesian product between the three and the two. Listing 6 shows the logical equivalent to the code in Listing 5. All six grouping sets are specified in one GROUPING SETS subclause.

Finally, if all grouping sets have a common element, you can pull them outside all GROUPING SETS subclauses into their own GROUPING SETS subclause, and by doing so, shorten your code. For example, consider the following pseudo GROUP BY clause:

GROUP BY GROUPING SETS( (a, b, c, d),
(a, b, c), (a, b, d), (a, b) )

All grouping sets have the common elements "a, b,"- hence, this can also be expressed as

GROUP BY GROUPING SETS ( (a, b) ),
GROUPING SETS( (c, d), (c), (d), () )

Because GROUP BY GROUPING SETS ( (a, b) ) is logically equivalent to GROUP BY a, b, you can also express this as

GROUP BY a, b, GROUPING SETS( (c, d),
(c), (d), () )

As a more tangible example, the queries in Listings 5 and 6 are logically equivalent to the query in Listing 7; the common element YEAR(orderdate) was pulled out.

What's Next?
If grouping sets seem to be somewhat familiar, it's not by chance. You might have already realized that the GROUPING SETS option is similar to the CUBE and ROLLUP options in previous versions of SQL Server. In fact, you can consider CUBE and ROLLUP as special cases of the GROUPING SETS option. However, the CUBE and ROLLUP options were implemented in a rigid/non-flexible way prior to SQL Server 2008 and are also non-ISO compliant. SQL Server 2008 introduces new, ISO-compliant, more flexible CUBE and ROLLUP options to replace the older ones. In a later article, I'll tell you all about the new CUBE and ROLLUP options, and also about the new GROUPING_ID function that you'll probably find useful in conjunction with grouping sets.

In future articles, I'll cover other T-SQL enhancements in SQL Server 2008, including the new MERGE statement, Table-Valued Parameters, the HEIRARCHYID datatype, and temporal datatype enhancements (DATE, TIME, DATETIME2, DATETIMEOFFSET datatypes).

Download associated code.

End of Article

Prev. page     1 [2]     next page -->



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

Very nice article. I can see Group By Grouping Sets can be very useful in OLTP Systems since it provides REAL DATA data analysis comparing to OLAP.

edmundch

Article Rating 5 out of 5

Interesting improvement over the WITH CUBE option of the group by clause. WITH CUBE generate all possible grouping combinations, including those for which you have no insterest. However with the WITH CUBE option there is a T-SQL scalar function called GROUPING which tell in the result sets if NULL value is a real NULL value in the data data or a NULL generated for the purpose of mixing rows having not the same number of columns. Does the GROUPING function applies on NULL data generated by GROUPING SETS clause also ?

pelsql

Article Rating 4 out of 5

OOPS - The online version of the article is incomplete. The end of the article says "Continued on page 2." -and- "End of Article". I have the print copy and see that half the article is missing from the online version. Whats up?

MarkInChicago

Article Rating 3 out of 5