MERGE Statement Tips

New tricks for an old tool

What is in this article?:

  • MERGE Statement Tips

When a new T-SQL feature is introduced in SQL Server, it usually takes some time before users realize certain aspects of working with the feature. In time, people develop best practices for using the feature and discover interesting, nontrivial things about it. In this article, I discuss some tips concerning the MERGE statement that I discovered after several years of working with the statement. I cover four main tips: preventing MERGE conflicts, understanding that the MERGE ON clause isn't a filter, realizing that the MERGE USING clause is like the FROM clause in a SELECT statement, and, finally, referring to source table elements in the OUTPUT clause.

For the purposes of this article, I use a sample database called TSQL2012. You can download the source code to create the database and populate it with sample data. Some of my examples use the Sales.Customers table as the source table for the MERGE statement. All of my examples use a table called Sales.MyCustomers as the target table. Use the code in Listing 1 to create the table in the TSQL2012 database.

Preventing MERGE Conflicts

One of the classic tasks that's handled with the MERGE statement is update where exists and insert where not exists. Some people refer to this task as upsert. For example, suppose you have a stored procedure that accepts as input parameters the attributes of a customer: @custid, @companyname, @country, @phone. The procedure is supposed to check whether the customer ID already exists in the target table Sales.MyCustomers; if it does exist, update the target row, overwriting the nonkey columns with the new values, and if it doesn't exist, add a new row. This task can be accomplished with the following MERGE statement:

MERGE INTO Sales.MyCustomers AS TGT

USING ( VALUES( @custid, @companyname, @country, @phone ) )

AS SRC( custid, companyname, country, phone )

ON SRC.custid = TGT.custid

WHEN MATCHED THEN UPDATE

SET TGT.companyname = SRC.companyname,

TGT.country = SRC.country,

TGT.phone = SRC.phone

WHEN NOT MATCHED THEN INSERT

VALUES( SRC.custid, SRC.companyname, SRC.country, SRC.phone );

The problem is that under the default isolation level, read committed, conflicts can occur if two processes execute the procedure at the same time with the same customer ID as input. If the input customer ID doesn't exist in the target table, the MERGE predicate in both cases will return a false, activating in both cases the WHEN NOT MATCHED action: INSERT. In such a case, one of the inserts will result in a primary key violation. To prove this, run the code in Listing 2 from two sessions.

The code uses an infinite loop that invokes the aforementioned MERGE statement. It uses the expression CHECKSUM(SYSDATETIME()) to generate the customer ID. There's a high likelihood that the resulting value doesn't already exist as a customer ID in the target table and that after some iterations both sessions will produce the same value. When I ran this code from two sessions, I got a conflict after a few seconds, and one of the sessions generated the error message in Figure 1.

Note that this isn't considered a bug. The atomicity of the MERGE transaction isn't violated -- it's still an all-or-nothing behavior. According to the read committed isolation level, such conflicts aren't supposed to be prevented. If you want to serialize access to the object throughout the transaction (both the check whether the customer exists and the insertion), you need to use the serializable isolation level. You can achieve this by either setting the session's isolation level to serializable (SET TRANSACTION ISOLATION LEVEL SERIALIZABLE), or by specifying the table hint HOLDLOCK or SERIALIZABLE (both have the same meaning) against the target table.

To test using the serializable isolation level alter the MERGE statement in Listing 2, replace the first line in the statement with the following:

MERGE INTO Sales.MyCustomers WITH (HOLDLOCK) AS TGT

Run the code from two sessions, and this time conflicts shouldn't occur.

The MERGE ON Clause Isn't a Filter

The MERGE statement uses an ON clause to define the MERGE predicate. However, this clause has a different meaning than the ON clause in a join -- which some people don't realize. In a join, the ON clause is used for both matching and filtering purposes. In a MERGE statement, the ON clause is used to identify matches and nonmatches and to accordingly determine which WHEN clause to activate. If you don't realize this, and you think of ON as a filter, you can end up writing MERGE statements with bugs. I'll demonstrate this problem with an example, and I'll provide a solution.

In my example, I'll use the Sales.Customers table as the source for the MERGE statement and the Sales.MyCustomers table as the target. First, run the following code to populate the Sales.MyCustomers table with some rows representing customers from Sweden and Italy:

TRUNCATE TABLE Sales.MyCustomers;

INSERT INTO Sales.MyCustomers(custid, companyname, country, phone)

SELECT custid, companyname, country, phone

FROM Sales.Customers

WHERE country IN (N'Sweden', N'Italy');
 »

Discuss this Article 2

sabdula
on Sep 5, 2012
answer for my guestion: it is possible to use CTE but in this case, changes do not give you any advantages.
sabdula
on Sep 4, 2012
After reading I decided to review my code but in my case there is one limitation. Source data cannot be used in when not matched by source. Is there any way to avoid it? The code was: MERGE INTO dbo.positions AS tgt USING #TmpPositions ON #TmpPositions.PositionNumber = TGT.PositionNumber and #TmpPositions.PositionAsAtDate = TGT.PositionAsAtDate WHEN MATCHED THEN UPDATE SET tgt.[PositionNumber] = #TmpPositions.[PositionNumber] , tgt.[IsUnlisted] = #TmpPositions.[IsUnlisted] , tgt.[IsYieldOverridden] = #TmpPositions.[IsYieldOverridden] , tgt.[PositionAsAtDate] = #TmpPositions.[PositionAsAtDate] WHEN NOT MATCHED THEN INSERT ([PositionNumber],[IsUnlisted],[IsYieldOverridden],[PositionAsAtDate]) values (#TmpPositions.[PositionNumber],#TmpPositions.[IsUnlisted],#TmpPositions.[IsYieldOverridden],#TmpPositions.[PositionAsAtDate]) when not matched by source and trg.PositionAsAtDate in (select distinct PositionAsAtDate from #TmpPositions) then delete ; The code should be s/th like that: MERGE INTO dbo.positions AS tgt USING OPENROWSET(BULK '\\server\positions_cut.txt', FORMATFILE = '\\server\positions.fmt' ,FIRSTROW = 2) AS src ON SRC.PositionNumber = TGT.PositionNumber and SRC.PositionAsAtDate = TGT.PositionAsAtDate WHEN MATCHED THEN UPDATE SET tgt.[PositionNumber] = src.[PositionNumber] , tgt.[IsUnlisted] = src.[IsUnlisted] , tgt.[IsYieldOverridden] = src.[IsYieldOverridden] , tgt.[PositionAsAtDate] = src.[PositionAsAtDate] WHEN NOT MATCHED THEN INSERT ([PositionNumber],[IsUnlisted],[IsYieldOverridden],[PositionAsAtDate]) values (src.[PositionNumber],src.[IsUnlisted],src.[IsYieldOverridden],src.[PositionAsAtDate]) when not matched by source and trg.PositionAsAtDate in (select distinct PositionAsAtDate from src) then delete ;

Please or Register to post comments.

IT/Dev Connections

Las Vegas
September 30th - October 4th

Paul ThurottOur Experts will show you:
• Common SQL Server
Problems
• Best Practices for T-SQL
• SQL Server Integration
Services
• Database Development

Come See Mike Otey & Tim Ford in Person!

Early Registration Now Open

From the Blogs
May 9, 2013
blog

My ISO 8601-Compliant Signature 2

My family recently just "officially" announced that we're in the process of adopting a child from South Africa. We're quite excited, of course, but there's a ton of paperwork to do—along with the need for gobs of signatures....More
May 8, 2013
blog

Use SSIS for ETL from Hadoop

In this blog post, Mark Kromer walks you through using SSIS as a way to use ETL techniques using Microsoft's Hadoop on Windows (HDInsight) as a source using Hive connectors...More
Vision road sign
May 6, 2013
blog

Cheaters Never Win, Even in TPC Benchmarks

In this portion of the series on database benchmarking, I want to tell you about one of my favorite aspects of the TPC benchmarks – CHEATING....More
SQL Server Pro Forums

Get answers to questions, share tips, and engage with the SQL Server community in our Forums.