Congratulations to Charles Johnson, senior programmer/analyst at Parmed Pharmaceutical in Niagara Falls, New York, and Andrew Leighton-Sims, analyst/programmer at igroup in Watford, UK. Charles won first prize of $100 for the best solution to the July Reader Challenge, "Quickening the Query." Andrew won second prize of $50. Here’s a recap of the problem and the solution to the July Reader Challenge.

Problem


Steve is troubleshooting a performance problem in a VBScript application that uses SQL Server 2000 and 7.0 as its database servers. During his review of an SQL query that the application generates, he notices in the query’s WHERE clause an IN logical operator that contains a list of primary key values. Steve determines that the application’s slower processing stems from the large number of IN list values that the query checks and the significant overhead required for parsing the long list of values. The following query contains the IN operator in the WHERE clause:

SELECT t.x
   FROM tbl as t
   WHERE t.y IN ( 1, 2, 3, 4, 5, 6, 7 /* long list of IDs */ )

How can Steve improve this query’s performance? Devise a solution that works with SQL Server 2000 and 7.0.

Solution


Steve can improve the query’s performance by using a temporary table to eliminate the long list of values. First, he can pass the list of values to a stored procedure, then parse the values into the temporary table. Finally, he can perform an inner join between the temporary table and the query instead of using the IN operator. The following statements show a method to parse the comma-separated list of values in the stored procedure:

DECLARE @ids varchar(8000)
DECLARE @commapos int
CREATE TABLE #t( i int PRIMARY KEY )
SET @ids = '1,2,3'
WHILE( 1 = 1 )
BEGIN
   SET @commapos = CHARINDEX( ',', @ids )
   IF @commapos = 0 BREAK
   INSERT INTO #t (i) VALUES( LEFT( @ids, @commapos - 1 ) )
   SET @ids = STUFF( @ids, 1, @commapos, null )
END
INSERT INTO #t (i) VALUES( @ids )

The following modified query uses the temporary table:

SELECT t1.*
   FROM tbl AS t1
   JOIN #t AS t2
      ON t2.i = t1.n

If the list of values is longer than the SQL Server character data-type limit, Steve can use a text parameter instead, then parse the list of values by using logic similar to the CHARINDEX() function above. The following statements show the stored procedure that uses the parsing logic:

CREATE PROCEDURE GetRows ( @ids varchar(8000) )
AS
CREATE TABLE t1 ( n int PRIMARY KEY )

DECLARE @commapos int
CREATE TABLE #t( i int PRIMARY KEY )
WHILE( 1 = 1 )
BEGIN
   SET @commapos = CHARINDEX( ',', @ids )
   IF @commapos = 0 BREAK
   INSERT INTO #t (i) VALUES( LEFT( @ids, @commapos - 1 ) )
   SET @ids = STUFF( @ids, 1, @commapos, null )
END
INSERT INTO #t (i) VALUES( @ids )

SELECT t1.* FROM t1 JOIN #t AS t2 ON t1.n = t2.i
DROP TABLE t1
GO

Steve can achieve a similar performance benefit by using the BULK INSERT statement to populate the temporary table. Because SQL Server implements the BULK INSERT statement internally, the bulk-copy operation in this query performs more efficiently than it would if Steve used the bulk copy program (bcp) utility directly. (Because BULK INSERT is a native T-SQL command and runs within the context of the SQL Server process, it claims less overhead and is efficient. Bcp, on the other hand, is an external utility that makes a new connection to SQL Server and uses the ODBC bulk copy API.)

Using the temporary table eliminates the need for SQL Server to parse the list of values during query compilation and optimization and avoids a known problem with the query optimizer: Parsing a long list of values can take a disproportionately long time. Microsoft has addressed this problem in online articles. For example, see the article "Large Array of IN Clause Parameters Parses/Compiles Slower on SQL Server 2000" at http://support.microsoft.com/default.aspx?scid=kb;EN-US;q296612.

Steve can compare the application’s performance in each of these scenarios by running the following three scripts. The first script shows how the original VBScript application generates a query that includes a string containing hard-coded values. The second script shows how Steve can use a stored procedure to parse the list of values. The third script shows how Steve can reprogram the application to generate a query that uses a temporary table populated by a BULK INSERT statement.

Script That Generates a Query Containing a Long List of Hard-Coded Values

' Script using IDs
'
wscript.echo "Query using list of ids..."
wscript.echo "Start: " & CStr( Now() )

IDs = "( 1"
for i = 2 to 5000
 IDs = IDs & "," & cstr( i )
next
IDs = IDs & " );"

set adoCmd = CreateObject( "ADODB.Connection" )
adoCmd.Provider = "SQLOLEDB"
adoCmd.Properties( "Data Source" )  = ".\shilohdev"
adoCmd.Properties( "Initial Catalog" ) = "tempdb"
adoCmd.Properties( "Integrated Security" ) = "SSPI"
adoCmd.CommandTimeout = 0
adoCmd.Open

wscript.echo "Ids  : " & CStr( Now() )

adoCmd.Execute "create table t1( n int primary key );" & _
               "select t1.* from t1 where n in " & IDs & _
               "drop table t1;"
adoCmd.Close
Set AdoCmd = Nothing

wscript.echo "End  : " & CStr( Now() )

Script That Uses a Stored Procedure to Retrieve the Rows

wscript.echo "Query using list of ids..."
wscript.echo "Start: " & CStr( Now() )

IDs = "1"
for i = 2 to 5000
 IDs = IDs & "," & cstr( i )
next

set adoCmd = CreateObject( "ADODB.Connection" )
adoCmd.Provider = "SQLOLEDB"
adoCmd.Properties( "Data Source" )  = ".\shilohdev"
adoCmd.Properties( "Initial Catalog" ) = "tempdb"
adoCmd.Properties( "Integrated Security" ) = "SSPI"
adoCmd.CommandTimeout = 0
adoCmd.Open

wscript.echo "Ids  : " & CStr( Now() )

adoCmd.Execute "dbo.GetRows '" & IDs & "'"
adoCmd.Close
Set AdoCmd = Nothing

wscript.echo "End  : " & CStr( Now() )

Script That Generates a Query Containing a Temporary Table Populated by a BULK INSERT Statement

' Script that uses BULK INSERT
'
wscript.echo "Query using bulk insert..."
wscript.echo "Start: " & CStr( Now() )

Set f = CreateObject( "Scripting.FileSystemObject" )
Set t = f.CreatetextFile( "C:\temp\ids.txt" )

IDs = "1"
for i = 2 to 5000
 IDs = IDs & chr(13) & chr(10) & cstr( i )
next
t.WriteLine IDs
t.Close

set adoCmd = CreateObject( "ADODB.Connection" )
adoCmd.Provider = "SQLOLEDB"
adoCmd.Properties( "Data Source" )  = ".\shilohdev"
adoCmd.Properties( "Initial Catalog" ) = "tempdb"
adoCmd.Properties( "Integrated Security" ) = "SSPI"
adoCmd.Open

wscript.echo "Ids  : " & CStr( Now() )

adoCmd.Execute "create table t1 ( n int primary key );" & _
       "create table #t2 ( n int primary key );" & _
       "bulk insert #t2 from 'c:\temp\ids.txt';" & _
       "select t1.* from t1 join #t2 on t1.n = #t2.n;" & _
       "drop table t1;"
adoCmd.Close
Set AdoCmd = Nothing

wscript.echo "End  : " & CStr( Now() )

Now, test your SQL Server savvy in the August Reader Challenge, "Synchronizing Logins" (below). Submit your solution in an email message to challenge@sqlmag.com by July 24. SQL Server MVP Umachandar Jayachandran, a SQL Server Magazine technical editor, will evaluate the responses. We’ll announce the winner in an upcoming SQL Server Magazine UPDATE. The first-place winner will receive $100, and the second-place winner will receive $50.

Here’s the challenge: Ray’s company runs SQL Server 7.0 on its production servers and SQL Server 2000 on its staging servers. Ray needs to build a script that can synchronize logins between the production and staging servers (i.e., add missing logins from the production servers to the staging servers). Synchronized logins will let him create an identical environment for testing application upgrades and for testing SQL Server and the OS on a different server. When a staging server is configured identically to a production server and holds the same data, he can also test service-pack upgrades on the staging server. Then, after the upgrade is finished, he can switch the server roles.

The production servers are configured for mixed authentication, which means that users can connect to a SQL Server instance by using either Windows authentication or SQL Server authentication. Help Ray write a script that can synchronize the logins between the servers while preserving all login properties.