QUALIFY - Request for a New Filtering Clause

This entry describes a request posted in Microsoft’s connect site for a new query filtering clause called QUALIFY in SQL Server.

As of SQL Server 2008 (including R2) T-SQL supports three standard filtering clauses based on predicates (ON, WHERE, HAVING), and one nonstandard filtering option based on a number/percent of rows and ordering (TOP). Window calculations (e.g., ranking and aggregate calculations that use the OVER clause, and hopefully others in the future) are only allowed in the SELECT and ORDER BY clauses of a query, and therefore you cannot refer to those directly in the filtering clauses that are based on predicates. For example, the following is not a valid statement:

SELECT col1, col2, col3

FROM dbo.T1

WHERE ROW_NUMBER() OVER(PARTITION BY col1 ORDER BY col2) BETWEEN 11 AND 20;

 

There is logic behind this restriction. Consider a table T1 with the following sample data and the query that follows:

col1
-----
100
150
300
350
900

SELECT col1

FROM dbo.T1

WHERE col1 >= 300 AND ROW_NUMBER() OVER(ORDER BY col1) 2;

 

Should the filter based on the ROW_NUMBER function be evaluated after col1 >= 300 or before it? If the former, you should get the output:

col1
-----
300
350

If the latter, you should get an empty set. But in SQL things get processed logically in an all-at-once fashion and order of evaluation of predicates shouldn’t matter. Due to this ambiguity SQL allows window calculations only in the SELECT and ORDER BY clauses of a query, making it clear that they are evaluated after the FROM, WHERE, GROUP BY and HAVING clauses.

Sure, you can achieve the filtering task indirectly by using a table expression:

WITH C AS

(

  SELECT ROW_NUMBER() OVER(PARTITION BY col1 ORDER BY col2) AS rownum,

    col1, col2, col3

  FROM dbo.T1

)

SELECT col1, col2, col3

FROM C

WHERE rownum BETWEEN 11 AND 20;

However, this adds a layer of complexity that if possible would be nice to avoid.

I propose implementing a new filtering clause called QUALIFY similar to the one implemented already by Teradata. In terms of logical query processing, QUALIFY fits after the SELECT clause, allowing it to refer to window calculations:

1. FROM
2. WHERE
3. GROUP BY
4. HAVING
5. SELECT
6. QUALIFY
7. ORDER BY

If implemented, you will be able to refer to window calculations directly in the QUALIFY clause, like so:

SELECT col1, col2, col3

FROM dbo.T1

QUALIFY ROW_NUMBER() OVER(PARTITION BY col1 ORDER BY col2) BETWEEN 11 AND 20;

 

Or indirectly, like so:

SELECT ROW_NUMBER() OVER(PARTITION BY col1 ORDER BY col2) AS rownum,

  col1, col2, col3

FROM dbo.T1

QUALIFY rownum BETWEEN 11 AND 20;

But this time without the need for a table expression.

Support for the QUALIFY clause would also help writing clearer code by avoiding the need to use the TOP option. TOP involves a lot of confusion surrounding the TOP ordering vs. presentation ordering. People use TOP to try to circumvent restrictions regarding the use of ORDER BY in table expressions, not realizing that unless the outermost query has an ORDER BY clause, presentation ordering is not guaranteed. The standard doesn’t support TOP, but does support TOP-like filtering clauses. However, those standard features have the same confusion aspects related to ordering.

In the past I proposed revising the TOP option to support an OVER clause with its own ordering specification instead of relying on the ORDER BY clause that traditionally serves a presentation purpose. And while we’re at it, why not also support a PARTITION BY clause. You can find my proposal here:

I still feel that such a feature would be great since it would avoid the existing ambiguity and confusion, plus support a partitioning element that TOP doesn’t currently support.

I find that supporting the QUALIFY clause is even more important than the proposed revision to TOP since:

·        It gives you all the functionality that a TOP OVER (and the top-like standard features) would give you.

·        There’s no confusion around the ordering aspect.

·        You have a partitioning element in window calculations and naturally it will be reflected in the filter.

·        You can do much more than what TOP and the TOP-like standard features allow you to do, e.g., if SQL Server adds support in the future for other window calculations and enhance existing ones, those will be supported in the QUALIFY clause as well. For example, you will be able to filter by a running total or a sliding total.

If you also feel that this feature could be useful, please support it by casting your vote here.

 

Cheers,

BG

Discuss this Blog Entry 3

on Feb 27, 2010
Very helpful!
on Mar 4, 2010
Hi Itzik,

As always, you got me thinking, and I added a long comment to your Connect item. I'm curious to know what you think!

Steve Kass



on Mar 5, 2010
Steve, likewise. :)

Itzik



Please or Register to post comments.

What's Puzzled By T-SQL Blog?

T-SQL tips and logical puzzles from Itzik Ben-Gan.

Blog Archive

Sponsored Introduction Continue on to (or wait seconds) ×