MERGE Puzzle and a Request for Warning

This is a result of e-mail exchange I had with Manuel Lopez…

Given the following DDL and sample data:

SET NOCOUNT ON;

USE tempdb;

 


IF OBJECT_ID('dbo.CustomersNew', 'U') IS NOT NULL

  DROP TABLE dbo.CustomersNew;

 

IF OBJECT_ID('dbo.DomesticCustomers', 'U') IS NOT NULL

  DROP TABLE dbo.DomesticCustomers;

 

IF OBJECT_ID('dbo.ForeignCustomers', 'U') IS NOT NULL

  DROP TABLE dbo.ForeignCustomers;

 


CREATE TABLE dbo.DomesticCustomers

(

  custid   INT         NOT NULL PRIMARY KEY,

  custname VARCHAR(20) NOT NULL,

  /* ...other attributes... */

);

 


CREATE TABLE dbo.ForeignCustomers

(

  custid   INT         NOT NULL PRIMARY KEY,

  custname VARCHAR(20) NOT NULL,

  /* ...other attributes... */

);

 


INSERT INTO dbo.DomesticCustomers(custid, custname) VALUES

  (1001, 'Cust domestic aaa'),(1002, 'Cust domestic bbb');

 


INSERT INTO dbo.ForeignCustomers(custid, custname) VALUES

  (2001, 'Cust foreign aaa'),(2002, 'Cust foreign bbb');

 


CREATE TABLE dbo.CustomersNew

(

  custtype CHAR(1) NOT NULL CHECK (custtype IN ('D', 'F')),

  custid   INT         NOT NULL PRIMARY KEY,

  custname VARCHAR(20) NOT NULL,

  /* ...other attributes... */

)

 


INSERT INTO dbo.CustomersNew(custtype, custid, custname) VALUES

  ('D', 1001, 'Cust domestic xyz'),

  ('D', 1003, 'Cust domestic ccc'),

  ('F', 2001, 'Cust foreign xyz'),

  ('F', 2003, 'Cust foreign ccc');

 


SELECT * FROM dbo.DomesticCustomers;

 


custid      custname

----------- --------------------

1001        Cust domestic aaa

1002        Cust domestic bbb

 


SELECT * FROM dbo.ForeignCustomers;

 


custid      custname

----------- --------------------

2001        Cust foreign aaa

2002        Cust foreign bbb

 


SELECT * FROM dbo.CustomersNew;

 


custtype custid      custname

-------- ----------- --------------------

D        1001        Cust domestic xyz

D        1003        Cust domestic ccc

F        2001        Cust foreign xyz

F        2003        Cust foreign ccc

 

You’re supposed to write a MERGE statement that deals only with domestic customers (those in CustomersNew with 'D' in the custtype attribute); updating the nonkey attributes of customers that already exist in the target DomesticCustomers table, and inserting new customers that don’t exist. Consider the following MERGE statement, and before you run it, see if you can guess what the output of the subsequent SELECT will be?

 

MERGE INTO dbo.DomesticCustomers AS TGT

USING dbo.CustomersNew AS SRC

  ON SRC.custtype = 'D'

  AND SRC.custid = TGT.custid

WHEN MATCHED THEN

  UPDATE

    SET TGT.custname = SRC.custname

WHEN NOT MATCHED THEN

  INSERT(custid, custname)

    VALUES(SRC.custid, SRC.custname);

 

SELECT * FROM dbo.DomesticCustomers;

 

Scroll down to see the result…

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

.

 

I bet you didn’t guess the following:

 

custid      custname

----------- --------------------

1001        Cust domestic xyz

1002        Cust domestic bbb

1003        Cust domestic ccc

2001        Cust foreign xyz

2003        Cust foreign ccc

 

But that’s the result I got. It appears that this behavior is documented. From http://msdn.microsoft.com/en-us/library/bb522522.aspx:

 

“Reducing the number of rows in the input stream early in the process by specifying the additional search condition to the ON clause (for example, by specifying ON Source.EmployeeID = Target.EmployeeID AND EmployeeName LIKE 'S%') might seem to improve query performance. However, doing so can cause unexpected and incorrect results. Because the additional search conditions specified in the ON clause are not used for matching the source and target data, they can be misapplied.

The following example demonstrates how incorrect results can occur. The search condition for matching the source and target tables and the additional search condition for filtering rows are both specified in the ON clause. Because the additional search condition is not required to determine source and target matching, the insert and delete actions are applied to all input rows. In effect, the filtering condition EmployeeName LIKE 'S%' is ignored. When the statement is run, the output of the inserted and deleted tables shows that two rows are incorrectly modified: Mary is incorrectly deleted from the target table, and Bob is incorrectly inserted. “

There are simple workarounds like specifying the additional predicate in the WHEN … AND clause, or filtering the data in a table expression, like so:

WITH DomesticCustomersNew AS

(

  SELECT custid, custname

  FROM dbo.CustomersNew

  WHERE custtype = 'D'

)

MERGE INTO dbo.DomesticCustomers AS TGT

USING DomesticCustomersNew AS SRC

  ON SRC.custid = TGT.custid

WHEN MATCHED THEN

  UPDATE

    SET TGT.custname = SRC.custname

WHEN NOT MATCHED THEN

  INSERT(custid, custname)

    VALUES(SRC.custid, SRC.custname);

 

Or like so:

MERGE INTO dbo.DomesticCustomers AS TGT

USING dbo.CustomersNew AS SRC

  ON SRC.custid = TGT.custid

WHEN MATCHED AND SRC.custtype = 'D' THEN

  UPDATE

    SET TGT.custname = SRC.custname

WHEN NOT MATCHED AND SRC.custtype = 'D' THEN

  INSERT(custid, custname)

    VALUES(SRC.custid, SRC.custname);

 

But I bet that many aren’t aware of this behavior.

Manuel Lopez opened a connect item (http://connect.microsoft.com/SQLServer/feedback/details/564676) asking at least for a warning to be issued by SQL Server when you execute a MERGE statement and specify a nonmatching predicate in the ON clause. I think that this is important and added my vote. If you also think it’s important, please add your vote too.

 

Cheers,

BG

 

Discuss this Blog Entry 1

on Jul 15, 2010
Syntax and behavior of MERGE statement are consistent. Warning is o.k. but it is the same level as warning that cast and convert functions will fail if you want to convert string to date but string is not properly formatted.

Example above can be transformed in two statements (UPDATE and INSERT):

Update dbo.DomesticCustomers
SET custname = SRC.custname
From dbo.DomesticCustomers AS TGT
Inner Join dbo.CustomersNew AS SRC
on SRC.custtype = 'D'
AND SRC.custid = TGT.custid

Insert Into dbo.DomesticCustomers(custid, custname)
Select SRC.custid, SRC.custname
From dbo.CustomersNew AS SRC
Left Join dbo.DomesticCustomers AS TGT
ON SRC.custtype = 'D'
AND SRC.custid = TGT.custid
Where TGT.custid is null


While UPDATE is correct, Insert will do exactly the same thing as MERGE statement, and I do believe no one will say that is because of syntax of INSERT statement or because of syntax of JOIN clause.

Even if you use
WHERE NOT EXISTS( Select * from dbo.DomesticCustomers Where SRC.custtype = 'D'
AND SRC.custid = TGT.custid )
instead of LEFT JOIN Where TGT.custid is null
you will end up with the same result and even with the same execution plans.

Where is the catch?

In MERGE statement above, as well as in both insert statements is clearly stated - INSERT into DomesticCustomers table all rows from CustomerNew table if combination of custid AND custtype='D' can not matched in DomesticCustomers table.

The problem is in "AND custtype='D'" part. For all rows with custtype different than D matching (join) condition will fail (EXISTS condition too).

I do not agree also with the term of "WORKAROUNDS" you used at the end of post, because those two solutions are not workarounds, but correctly written statements - in the similar way as separate INSERT statement should be written:

Insert Into dbo.DomesticCustomers(custid, custname)
Select SRC.custid, SRC.custname
From dbo.CustomersNew AS SRC
Left Join dbo.DomesticCustomers AS TGT
ON SRC.custid = TGT.custid
Where TGT.custid is null
and SRC.custtype = 'D'

or

Insert Into dbo.DomesticCustomers(custid, custname)
Select SRC.custid, SRC.custname
From dbo.CustomersNew AS SRC
WHERE NOT EXISTS( Select * from dbo.DomesticCustomers AS TGT Where SRC.custid = TGT.custid )
AND SRC.custtype = 'D'

























































Please or Register to post comments.

What's Puzzled By T-SQL Blog?

T-SQL tips and logical puzzles from Itzik Ben-Gan.

Blog Archive

Sponsored Introduction Continue on to (or wait seconds) ×