Corrections


The original version of "Using Fields as Primary Keys" (August 2005, InstantDoc ID 46890) contained an error. You can access an updated version online.

The original version of Listing 1 in the article "Tame Your Tables" (August 2005, InstantDoc ID 46755) contained an error. You can access a corrected version of the listing online.

In September's report about the winners of the 2005 SQL Server Magazine Readers' Choice Awards ("Administrative Tools Keep SQL Server Running Smoothly," InstantDoc ID 47081), we mistakenly listed Embarcadero's winning product in the Best Capacity Planning/Trend Analysis Software category as DBArtisan. The correct product name is DBArtisan Workbench. We apologize for any inconvenience these errors caused.

Puzzling


Itzik Ben-Gan's July puzzle was good ("The Logical Puzzle: Arithmetic Maximum Calculation," InstantDoc ID 46365). The answer came to me while driving back to the office after lunch:

(ABS (m-n) + m + n) / 2

Is that faster than using an IF statement or even a CASE expression? All three of the functions below seem to perform equivalently.

CREATE FUNCTION dbo.Maximum1
  (@m int, @n int)
RETURNS int
AS
BEGIN
      RETURN (ABS (@m - @n) + @m + @n) / 2
END
GO
CREATE FUNCTION dbo.Maximum2 (@m int, @n int)
RETURNS int
AS
BEGIN
      IF @m > @n
            RETURN @m
      RETURN @n
END
GO
CREATE FUNCTION dbo.Maximum3 (@m int, @n int)
RETURNS int
AS
BEGIN
      RETURN CASE WHEN @m > @n THEN @m ELSE @n END
END
GO
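For completeness, the same identity yields the minimum: because min(m, n) = (m + n - |m - n|) / 2, a companion function can be written the same way. This is a sketch, not from the original letter, and the name dbo.Minimum1 is made up. Note that with either formula, the intermediate sum @m + @n can overflow int when both arguments are near the int limits.

```sql
-- Sketch of a hypothetical companion MIN function, using the
-- identity min(m, n) = (m + n - ABS(m - n)) / 2.
CREATE FUNCTION dbo.Minimum1 (@m int, @n int)
RETURNS int
AS
BEGIN
      -- Subtracting the absolute difference instead of adding it
      -- selects the smaller of the two values.
      RETURN (@m + @n - ABS(@m - @n)) / 2
END
GO
```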

You're correct that the functions perform equivalently, but in this case I wouldn't worry about performance. It's natural for many programmers to focus mainly on CPU cycles, but in set-based data-manipulation work, the primary factor affecting query performance is I/O. And as you observed, even in terms of CPU cycles there isn't much difference between the solutions.

I offered this puzzle to give readers a chance to practice logic with a limited set of tools, as is common in SQL. For example, tiebreaker problems can be solved very efficiently by using an aggregate query that concatenates sort columns and tiebreakers. The article "T-SQL Tiebreakers" (April 2005, InstantDoc ID 45235) explains more fully, but to make a long story short, you need to apply an aggregate (minimum or maximum) to concatenated columns. Say you have a table T1 with integer columns c1 and c2, with values in the range 1 to 100:

create table t1
(
  c1 int check(c1 between 1 and 100),
  c2 int check(c2 between 1 and 100),
  primary key(c1, c2)
);
insert into t1 values(10, 1);
insert into t1 values(10, 2);
insert into t1 values(10, 3);
insert into t1 values(10, 4);
insert into t1 values(20, 2);
insert into t1 values(20, 4);
insert into t1 values(20, 6);
insert into t1 values(20, 8);

You need to find the row with the highest c1 value, and if c1 has duplications, use the maximum c2 value as a tiebreaker. Here's a technique to achieve this:

select mx / 1000 as c1, mx % 1000 as c2
from (select max(c1*1000 + c2) as mx from t1) as d

c1          c2
----------- -----------
20          8
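To see the encode/decode arithmetic concretely (a standalone illustration, using the winning row's values from the sample data): the multiplication packs both columns into one integer, and integer division and modulo unpack them. The multiplier must exceed c2's maximum possible value, which is why 1000 works for values capped at 100.

```sql
-- 20 * 1000 + 8 = 20008 packs (c1, c2) into a single value;
-- integer division and modulo by 1000 unpack it again.
select 20008 / 1000 as c1,  -- 20
       20008 % 1000 as c2   -- 8
```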

However, what if you were asked to return the row with the highest c1 value and, if c1 has duplicates, use the minimum c2 value as a tiebreaker?

The constraint is to use just one aggregate function (e.g., MAX) and still logically calculate the maximum of c1 and, within it, the minimum of c2. The trick is to subtract c2 from the highest possible value in the column (i.e., 100) before concatenating it. Apply MAX to the concatenated values, then in the outer query subtract the c2 portion from the maximum possible value again to recover the original value:

select mx / 1000 as c1, 100 - mx % 1000 as c2
from (select max(c1*1000 + (100 - c2)) as mx from t1) as d

c1          c2
----------- -----------
20          2
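As a sanity check (not the single-aggregate technique the puzzle calls for), a TOP query with a two-level sort should return the same row:

```sql
-- Straightforward alternative for verification: sort by c1
-- descending, then c2 ascending, and take the first row.
select top 1 c1, c2
from t1
order by c1 desc, c2 asc
```

For the sample data, this returns c1 = 20, c2 = 2, matching the aggregate solution.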

The Search Path Less Traveled


I just finished Michelle Poolet's article "The Search Path Less Traveled" (December 2004, InstantDoc ID 44273). I've been coding and developing Web sites for some time now, and the concept of data normalization is not foreign to me. Michelle's article drove home a simple point: Unnormalized data signifies bad database design. Although this might be considered a "novice" article, the content is just as valuable to experienced developers.