Enforce referential integrity when cascading DRI won't work

Editor's Note: Parts of this article are adapted from Chapter 14 of Advanced Transact-SQL for SQL Server 2000 by Itzik Ben-Gan and Thomas Moreau (Apress, 2000).

Cascading declarative referential integrity (DRI) is efficient and easy to implement and should be your first choice for enforcing referential integrity with cascade support in your database. In "Ensure Data Integrity with Cascading DRI," June 2002, InstantDoc ID 25120, I covered the basics of referential integrity and cascading DRI and discussed how cascading DRI in SQL Server 2000 can enhance the functionality of the foreign key constraint. However, in some situations, SQL Server won't let you use the foreign key's cascade options. You can't implement cascading DRI when cyclic relationships exist between tables, when your table relationships include multiple cascade paths, or when an INSTEAD OF trigger exists on the secondary table. So, you might choose not to use the foreign key's cascade options when you know that you're going to create one of these situations. Let's look at the cases in which cascading DRI won't work and examine how you can use stored procedures and triggers as alternatives.

Around in Circles

Cyclic relationships form a closed loop of foreign keys. Figure 1, page 2, illustrates two forms of cyclic relationships: single-table and multitable. You can find the simplest form of a cyclic relationship in the Northwind database's Employees table, which points to itself. A more complex form of a cyclic relationship involves multiple tables, but ultimately, the foreign key relationships form a closed loop: Table1 points to Table2, Table2 points to Table3, ... , Table(N-1) points to TableN, and TableN points to Table1. SQL Server 2000 doesn't let you use cascading DRI when you create a cyclic relationship of foreign keys—you can use only NO ACTION support. Allowing cascade support for cyclic relationships might cause infinite loops. For example, try to add the foreign key that Listing 1, page 2, shows to the Employees table. The addition of the new foreign key would violate existing foreign keys, so you get the error messages that Figure 2 shows. But you might still want to use cascade. For example, when an employee is deleted from the Employees table, you might want all of the deleted employee's subordinates to be deleted as well.

Multiple Cascade Paths

Multiple cascade paths can happen in relationships in which a secondary table points from multiple foreign keys to one column in a primary table, as Figure 3 illustrates. Let's look at an example that uses the Customers and Invoices tables. You can run the script in Listing 2 to create and populate the tables.

Suppose that a certain source customer issues several invoices to a certain target customer. In the Invoices table, both the source_customer and the target_customer columns need to point to the customerid column in the Customers table. Suppose you want to cascade deletes of customers from the Customers table to both source and target customers. You might use the code that Listing 3 shows to try to create two foreign keys that point from Invoices.source_customerid to Customers.customerid and from Invoices.target_customerid to Customers.customerid. However, you'll get the error messages that Figure 2 shows—the same errors you got when you tried to add a cyclic relationship. A relationship that can cause multiple cascade paths isn't supported in SQL Server.

An INSTEAD OF Trigger on the Secondary Table

Another situation in which SQL Server doesn't support cascading DRI in the foreign key is when the secondary table has an INSTEAD OF trigger on the action that you want to cascade. For example, you can't have both an INSTEAD OF DELETE trigger and a foreign key with ON DELETE CASCADE on the Order Details table; nor can you have both an INSTEAD OF UPDATE trigger and a foreign key with ON UPDATE CASCADE. You have to decide which you want—the cascade support in the foreign key or the INSTEAD OF trigger. For example, if you first create a foreign key on Order Details with ON DELETE CASCADE, then try to run the following code:

CREATE TRIGGER trg_iod_OrderDetails_test ON Order Details
PRINT 'Hello from trg_iod_OrderDetails_test'

You'll get the error message that Figure 4, page 4, shows. If you first create the above INSTEAD OF DELETE trigger, then try to add a foreign key with ON DELETE CASCADE, you'll get the error messages that Figure 5, page 4, shows. If you decide to keep the INSTEAD OF trigger on the secondary table and not support cascade actions with the foreign key, you can implement the cascade support by using stored procedures, AFTER triggers, or INSTEAD OF triggers, as you can when you create cyclical relationships or multiple cascade paths.

Implementing Cascades Without DRI

Except for INSTEAD OF triggers, which are new in SQL Server 2000, you can use these mechanisms to implement cascade support in SQL Server 7.0 and 6.5, which don't support cascading DRI. You can use these alternatives for all the problem situations in this article. To demonstrate how the mechanisms work, let's examine in detail how you might use these alternatives to implement cascade support in cyclic relationships. You can then adapt these solutions for use with multiple cascade paths and when you have INSTEAD OF triggers on a secondary table.

One way to implement cascade support in the Employees table's cyclic relationship is to use stored procedures. This option lets you create a foreign key with NO ACTION support—which means that the foreign key will reject direct modifications against the Employees table that would result in employees having a manager who doesn't exist. You can implement the cascade activity in a stored procedure, and because you control the modifications of both parent and child rows from inside the stored procedure, you can modify the data in a way that doesn't violate the foreign key. For example, when a user invokes the stored procedure to delete a particular parent row, the stored procedure can first delete the child rows, then delete the parent row. The foreign key isn't violated at any point because it's allowed to delete children of an existing parent, but not to delete parents of existing children. If the children are deleted before the parent, then any parent you attempt to delete won't have children.

However, one disadvantage of this option is that your users and applications have to use the stored procedures—instead of Data Manipulation Language (DML) operations—to issue all modifications that should trigger cascade actions; otherwise, the users' modifications are rejected because they violate a foreign key. Another disadvantage of the stored-procedures approach is that you'd likely write the code of the stored procedures to handle modifications to a single row. Writing stored procedures that handle multirow modifications to the primary table can be difficult. How would you define the stored procedure's input parameters? Usually, a logical expression in the query's WHERE clause defines a multirow modification. If you want to pass a logical expression to the stored procedure as an argument, you have to use dynamic execution to construct and execute the modification statement based on the user's input. A discussion of dynamic execution is outside the scope of this article, but using dynamic execution brings up a whole set of issues, so it's simpler to avoid it.

Another way you can implement cascade actions in a cyclic relationship is by attaching triggers to the Employees table. A trigger is a special kind of stored procedure that's attached to a table and that SQL Server invokes automatically when you issue a modification against the table. You can attach a trigger to a certain type of modification (e.g., INSERT, UPDATE, DELETE) or to a combination of those modifications. SQL Server 2000 supports two kinds of triggers: AFTER triggers and INSTEAD OF triggers.

AFTER triggers fire after SQL Server has applied the modification to the table. Because SQL Server checks constraints before applying a modification to a table, the AFTER trigger won't fire if the modification would violate a constraint. You can find the old and new images of the data within the AFTER trigger by querying the inserted and deleted tables. The inserted and deleted tables are views of the section of the transaction log that contains the modified data. In the case of a DELETE action, the deleted table holds the data that was deleted; in the case of an INSERT action, the inserted table holds the data that was inserted; and in the case of an UPDATE action, the deleted table holds the old image of the affected rows, and the inserted table holds the new image of those rows.

In the Employees table, a foreign key prevents an AFTER trigger from firing when you attempt to delete an employee or when you attempt to update the empid column if that employee is also a manager. The trigger won't have a chance to perform the cascade action. If you decide not to keep a foreign key in the table to allow the AFTER triggers to always fire, be aware that the AFTER triggers will have to perform not only cascade activity but all referential integrity enforcement.

For example, to enforce referential integrity with ON DELETE CASCADE and ON UPDATE CASCADE, you need to create three triggers on the Employees table. One trigger performs the DELETE CASCADE action, one performs the UPDATE CASCADE action, and another prevents additions or updates that would result in a manager ID that doesn't have a related employee ID.

You can also implement the solution that uses AFTER triggers to solve other problems. Figure 6 shows two tables and the AFTER triggers that you'd create on each to enforce referential integrity. In the case of a self-referencing table such as Employees, you'd create all the triggers on the same table. For an example of how to set up this solution, see the code that Listing 4 shows. This code creates the trigger that performs the CASCADE DELETE activity. The trg_d_Employees_on_delete_cascade trigger first determines whether the DELETE statement that fired the trigger deleted any rows; if no rows were deleted, the trigger aborts. The trigger then issues a DELETE statement that deletes from the Employees table all the employees whose managers were in the deleted table. Note that employees deleted by the trigger might also be managers of other employees. However, by default, triggers don't fire recursively. To turn on recursive triggers in the database, issue the following statement:


Note that SQL Server uses the first statement in the aborted trigger as the recursion termination check. SQL Server limits the number of times that a routine such as a stored procedure or a trigger can nest (i.e., call another routine) to 32. Without the recursion termination check, the trigger would fire recursively even when no rows were affected; the trigger would quickly reach the maximum number of nesting levels and roll back all activity invoked by the modification that fired the first trigger. You'd use similar code to create the other two AFTER triggers and enforce all the referential integrity rules in the Employees table.

An INSTEAD OF trigger fires instead of the modification that was supposed to affect the table. If you want to perform some kind of validation check, then resubmit the original modification that fired the trigger, you must write the code that resubmits the original modification; otherwise, just the original modification will take place without the validation check. Inside an INSTEAD OF trigger, you have access to inserted and deleted tables, but those tables don't contain the old and new images of the modified data; they contain the image of the data that was supposed to be modified.

Let's look at how you can use INSTEAD OF triggers to implement cascade actions in a cyclic relationship. The main advantage of using INSTEAD OF triggers is that you can use a foreign key with NO ACTION support because INSTEAD OF triggers fire before SQL Server checks the foreign key; this order lets the triggers perform only the cascade activity as opposed to enforcing all referential integrity rules. But writing such triggers can be complex because the modifications that the trigger invokes can't break the foreign key.

To grasp the complexity of such a trigger, think about how you'd implement an INSTEAD OF DELETE trigger that, when you delete an employee, also deletes all the employee's subordinates at all levels without breaking the rules that the foreign key enforces. You can think of the Employees table as a tree that's turned upside down. Each employee is a node in the tree. The top node is the big boss, and the leaves of the tree (the bottom nodes) are employees who have no subordinates. When you delete an employee who is a manager, you must delete the branch of the tree that starts with the employee that the user deleted. Remember that the foreign key won't let you delete an employee who has subordinates. So, you must first delete the leaf-level employees in the branch and move your way up the branch. However, to find the leaf-level employees, you have to traverse down the branch you want to delete, and while traversing the branch, you need to keep the rows you've selected in a temporary table. Using an INSTEAD OF trigger in a single-table cyclic relationship such as the Employees table scenario requires you to write complex code to enforce cascade.

But, in other problem relationships such as multiple cascade paths, you can implement INSTEAD OF triggers by using simple, straightforward code. For example, if you run the code that Listing 5, page 5, shows, you create foreign keys with NO ACTION for both deletes and updates and an INSTEAD OF DELETE trigger that implements ON DELETE CASCADE on the Customers table. Remember that INSTEAD OF triggers fire before the foreign key is invoked and instead of the original modification. Using an INSTEAD OF trigger lets you keep foreign keys with NO ACTION support in the Invoices table.

Choosing Wisely

SQL Server 2000 provides an array of tools to enforce referential integrity with or without cascading DRI support in your database. When you have so many available tools, one of your most important jobs is to wisely choose the referential integrity—enforcement mechanism that will give your database the greatest consistency and functionality.