Avoid Referential Integrity Errors When Deleting Records from Databases

Let this script or stored procedure do the work for you

Downloads
101931.zip

At the company in which I work, we sometimes need to release a development database to production that contains only the schema and no data. One of the most common practices that most of the DBAs follow is to script the schema and release it to production. However, sometimes we receive requests that the build should be as a SQL Server backup because they want to retain the diagrams in production and they want to reduce any problems that might come while executing the scripts.The fastest and easiest way I found to do this is to back up the development database, create a new database from the backup, then use a script to delete all the records from the newly created database, leaving only the schema. We then take the backup and release to production.

When you delete a database's records, you need to be careful about referential integrity errors. These errors will occur if you delete the records in tables with foreign keys before deleting the records in the referenced tables. To avoid this problem, I created a script, DELETE_RECORDS.sql, that disables all the foreign keys in the database, deletes all the records, then reenables all the foreign keys. You can use this generic script to delete all the records from any database. It uses the ALTER TABLE statement to disable and reenable the foreign keys, so ALTER TABLE permission for the database is needed to use the script. Without this permission, you'll get a foreign key violation error.

At our company, some database owners (DBOs) occasionally need to delete all the data from SQL Server Integration Services (SSIS) packages and refresh them with new data. A fellow DBA recently asked me whether there's a way to delete all the records in a database without disabling the foreign keys because giving ALTER TABLE permissions to DBO user accounts isn't advisable. So, I created a stored procedure, PR_HIERARCHIAL_DATA, that deletes all the records in a database without disabling the foreign keys.

To avoid referential integrity errors, PR_HIERARCHIAL_DATA determines the hierarchical order of all the tables in a database, then deletes the data from those tables in reverse hierarchical order. Identifying the order is bit difficult because many types of relationships can exist in a database, including one-to-one, one-to-many, and many-to-many relationships.

To identify the hierarchical order, the stored procedure uses information in the sysseferences systems table and sys.objects system view. In sysseferences, it uses the fkeyid column, which contains the IDs of the referencing tables (i.e., the tables that contain foreign keys), and the rkeyid column, which contains the IDs of the referenced tables. In sys.objects, it uses the object_id column, which contains object IDs.

PR_HIERARCHIAL_DATA uses a recursive common table expression (CTE) query to join the sysreferences table multiple times based on a join between the rkeyid and fkeyid columns. I prefer using CTEs because they offer a more readable form of the derived table that can be declared once and referenced multiple times in a query. This CTE query results in a hierarchical list that contains the tables with foreign keys and the tables those foreign keys are referencing. However, the list doesn't include tables that don’t have any relationships (i.e., independent entities). Another CTE query provides that information. The results from both CTE queries are put into a temporary table. Duplicate table names (which result when a table references to two or more tables) are removed with the MAX function.

The stored procedure then uses the information in the temporary table to delete the data in the tables in an order that avoids referential integrity errors. If PR_HIERARCHIAL_DATA finds conflicted tables (i.e., tables that reference each other), it will print the names of those tables. You will then need to manually delete the data from those tables.

You can use the PR_HIERARCHIAL_DATA stored procedure three ways. To simply display a hierarchical list of the tables with foreign keys and the tables those foreign keys are referencing, you execute the stored procedure without passing in any parameter values, as in

EXEC PR_HIERARCHIAL_DATA

If you want to output a list of the tables in hierarchical sequence so you can design a SSIS package or write an insert script that will load data into the tables, you use the INSERT parameter when you call the stored procedure, as in

EXEC PR_HIERARCHIAL_DATA 'INSERT'

If you want to delete the data from the database tables, you use the DELETE parameter, as in

EXEC PR_HIERARCHIAL_DATA 'DELETE'

You can download the PR_HIERARCHIAL_DATA stored procedure and DELETE_RECORDS.sql script by clicking the 101931.zip hotlink at the top of the page. The stored procedure and script were written for SQL Server 2005. You don't need to customize any code before running them.

Discuss this Article 8

yaniv.etrogi
on Jun 11, 2009
Great article and interesting solution. I downloaded the code and will look at it latter on. In some environments you will have to disable all triggers in addition to FKs. I would like to share with you a freeware I published which automates the process of schema generation. It may match your needs. http://blogs.microsoft.co.il/blogs/yaniv_etrogi/SQLScripter/SQLScripter.zip
manish (not verified)
on Jun 15, 2009
this is solved my problem. earlier we used to generate the script to release into production,but the inbuilt script option in ssms doesnt generates the database diagrams from devolopment to production. we have created a new script based on this script to make an automated job to release from devolopment to UAT for testing purpose.. nice work..
gnanau
on Jun 19, 2009
I love the recursive CTE idea! This is a very ingenious way to get the list of tables with references in a proper order for DELETE or INSERT! However, I have a question and a few comments which I will try to address in a few successive posts – due to limited # of characters allowed by the site in one message. Regarding the SELECT used to create the first temporary table (#sysref): Question: Why using left outer join to link with sys.objects (SOF, SOR)? I think a simple INNER JOIN will have sufficed.
gnanau
on Jun 19, 2009
Possible problem: This condition might create problems if the same table name exists in more than one schema: OBJECT_NAME(T1.FKEYID)+OBJECT_NAME(T1.RKEYID) = OBJECT_NAME(T2.RKEYID)+OBJECT_NAME(T2.FKEYID) Function OBJECT_NAME() returns the database object name, but it gives no information about the schema. In this particular scenario: Schema1.Tab1 references Schema1.Tab2 Schema1.Tab2 references Schema2.Tab1 the above code will exclude it as a false circular reference. However, these are two perfectly valid relationships to be included in temporary table #sysref.
MarcosGalvani
on Jun 15, 2009
I use to create the script of the database and execute it on the new server. This can be done without any code directly from SSMS. Take Care.
lucazav
on May 21, 2009
I was looking for a solution like this!
gnanau
on Jun 19, 2009
I propose the following code which makes consistent use on SQL Server 2005 and above system views (using sys.foreign_keys instead of sysreferences, kept in 2005+ only for backward compatibility reasons): select SCHEMA_NAME(fk.schema_id) AS FSC_NAME, /*OBJECT_NAME(fk.parent_object_ID) FOName,*/ fk.parent_object_id FKEYID, SCHEMA_NAME(o.schema_id) AS RSC_NAME, /*OBJECT_NAME(fk.referenced_object_id) AS ROName,*/ fk.referenced_object_id AS RKEYID into #sysref from sys.foreign_keys FK join sys.objects o on FK.referenced_object_id=o.object_id and not exists (select 1 from sys.foreign_keys FK1 where FK.parent_object_id=FK1.referenced_object_id and FK.referenced_object_id=FK1.parent_object_id)
chandra (not verified)
on Jun 15, 2009
This article is for how to identify the hierarchies to delete and insert the data using SSIS or during any refresh automations. For restoring only schema : Yes,ssms script task is available to create the scripts and its a manual process where the DBA has to select the objects which are required to release. there might be a chance to miss any single object and he has to follow a process to ensure that both devolopment and production are in sync once he released with scripts. But in this process the production will get exact replica of devolopment database with out any mismatch.

Please or Register to post comments.

IT/Dev Connections

Las Vegas
September 30th - October 4th

Paul ThurottOur Experts will show you:
• Common SQL Server
Problems
• Best Practices for T-SQL
• SQL Server Integration
Services
• Database Development

Come See Michael Otey & Tim Ford in Person!

Early Registration Now Open

From the Blogs
May 21, 2013
blog

A Common Misconception about MAXDOP

Out of the box, SQL Server is (and has been) able to take advantage of multiple processors/cores without any effort on behalf of administrators....More
May 9, 2013
blog

My ISO 8601-Compliant Signature 2

My family recently just "officially" announced that we're in the process of adopting a child from South Africa. We're quite excited, of course, but there's a ton of paperwork to do—along with the need for gobs of signatures....More
May 8, 2013
blog

Use SSIS for ETL from Hadoop

In this blog post, Mark Kromer walks you through using SSIS as a way to use ETL techniques using Microsoft's Hadoop on Windows (HDInsight) as a source using Hive connectors...More
SQL Server Pro Forums

Get answers to questions, share tips, and engage with the SQL Server community in our Forums.