Last week I presented the following puzzle:
Catch That Bug! Random Row per Group
(http://www.sqlmag.com/Article/ArticleID/93895/93895.html).
As some of you figured, the reason that the query returns varying numbers of
rows in different invocations is because the subquery is reevaluated once per
order (can be observed in the execution plan), while you want the subquery to
reevaluate once per employee. Every time the subquery is invoked a new
random OrderID is produced, and it may or may not be equal to the OrderID
in the outer row. In some cases no order is found for a given employee, and
in some cases more than one order is found. In short, you need to come up
with a solution where either the execution plan evaluates a random order only
once per employee, or one where random sort values are assigned only once
(in one shot) for the set of Orders.
As an aside, the expression CHECKSUM(NEWID()) generates values with
better random distribution than NEWID() does, but that’s a different story.

SQL Server 2005 has language elements that allow elegant and efficient
solutions that overcome the problem including the APPLY table operator and
the ROW_NUMBER function. But I’ll start with solutions that work in SQL
Server 2000 first.

The following query produces a random OrderID per employee:

SELECT (SELECT TOP 1 OrderID                                      FROM dbo.Orders AS O                                      WHERE O.EmployeeID = E.EmployeeID                                      ORDER BY CHECKSUM(NEWID())) AS RandomOrder                              FROM dbo.Employees AS E;

Now filter the orders from the Orders table where the OrderID is IN the set
of OrderIDs returned by the previous query:

SELECT EmployeeID, OrderID, OrderDate, CustomerID                              FROM dbo.Orders                              WHERE OrderID IN                                  (SELECT (SELECT TOP 1 OrderID                                           FROM dbo.Orders AS O                                           WHERE O.EmployeeID = E.EmployeeID                                           ORDER BY CHECKSUM(NEWID())) AS RandomOrder                                   FROM dbo.Employees AS E)                              ORDER BY EmployeeID;

You get a plan where the innermost subquery (returning a random OrderID
for the outer employee) is evaluated only once per employee. You might not
feel comfortable with such a solution, since its correctness relies on the
execution plan that the optimizer chooses. The execution plan might be
different with different data distribution, indexes, etc. To be on the safe side,
you can first calculate a random sort value for each order (in one shot) and
materialize the result set in a temporary table:

SELECT EmployeeID, OrderID, CHECKSUM(NEWID()) AS Rnd                              INTO #OrdersRnd                              FROM dbo.Orders;

Then you can safely use the logic that I presented with the original query:

SELECT EmployeeID, OrderID, OrderDate, CustomerID                              FROM dbo.Orders AS O1                              WHERE OrderID IN                                  (SELECT TOP 1 OrderID                                   FROM #OrdersRnd AS O2                                   WHERE O2.EmployeeID = O1.EmployeeID                                   ORDER BY Rnd)                              ORDER BY EmployeeID;

Not an efficient solution, but a safe one.

As for SQL Server 2005, things are simpler. You can use the APPLY
operator to return a random order per employee like so:

SELECT A.*                              FROM dbo.Employees AS E                                CROSS APPLY                                  (SELECT TOP (1) EmployeeID, OrderID, OrderDate, CustomerID                                   FROM dbo.Orders AS O                                   WHERE O.EmployeeID = E.EmployeeID                                   ORDER BY CHECKSUM(NEWID())) AS A                              ORDER BY EmployeeID;

The APPLY operator pretty much lends itself to being evaluated only once
per outer row.

Another solution is to calculate row numbers partitioned by employee, sorted
by randomly generated values (CHECKSUM(NEWID()))) . Encapsulate the
query that generates random values in a CTE, and have the outer query filter
only rows where the row number is equal to 1:

WITH Orders_RN AS                              (                                SELECT EmployeeID, OrderID, OrderDate, CustomerID,                                  ROW_NUMBER() OVER(PARTITION BY EmployeeID                                                    ORDER BY CHECKSUM(NEWID())) AS RowNum                                FROM dbo.Orders                              )                              SELECT *                              FROM Orders_RN                              WHERE RowNum = 1                              ORDER BY EmployeeID;

Here the random values are generated only once. The last two solutions that
use APPLY and ROW_NUMBER are also more flexible than the original
solution in the sense that you can request more than one random order per
employee:

SELECT A.*                              FROM dbo.Employees AS E                                CROSS APPLY                                  (SELECT TOP (3) EmployeeID, OrderID, OrderDate, CustomerID                                   FROM dbo.Orders AS O                                   WHERE O.EmployeeID = E.EmployeeID                                   ORDER BY CHECKSUM(NEWID())) AS A                              ORDER BY EmployeeID;

WITH Orders_RN AS                              (                                SELECT EmployeeID, OrderID, OrderDate, CustomerID,                                  ROW_NUMBER() OVER(PARTITION BY EmployeeID                                                    ORDER BY CHECKSUM(NEWID())) AS RowNum                                FROM dbo.Orders                              )                              SELECT *                              FROM Orders_RN                              WHERE RowNum                               
Cheers,
--
BG