Supporting a recursive relationship—like you find in a typical bill of materials (BOM)—is one of the hardest problems to solve in relational databases. A BOM is a hierarchical structure that describes relationships among components. (See also, "Get in the Loop with CTEs").
For example, a car is made up of components such as a steering wheel, a frame, and tires. The car frame is made up of additional smaller components such as side rails, cross rails, and bolts. Traditional database tables store all the components together and link them in recursive one-to-many (1:M) relationships, as Table 1 shows.
But when a relationship between parts becomes two-way, this table design becomes problematic. In the traditional table model, the auto frame-to-bolt relationship is 1:M, and the car-to-auto frame relationship is 1:M. What happens if the relationship between car and bolt is also defined as 1:M, but you discover that the same bolt attaches the hood assembly to the rest of the car? Now, instead of a recursive 1:M relationship, you have a recursive many-to-many (M:N) relationship, and trying to force that relationship into the traditional BOM table architecture can lead to redundant data and update anomalies, as Table 2 shows.
These kinds of redundancy and update problems can occur in any relational database that supports a BOM. Let's look at a problem that I recently worked on for a client who needs to redesign his database to accommodate a BOM.
Franklin is a DBA for a company that provides telecommunication services. Currently, customers can purchase only simple services such as dial-up Web access or Web hosting. Franklin's company wants to move to a service-package model, in which a customer can purchase a package of services and get a discount. Franklin asked me to help him create an effective design for the new business model. One of his concerns is that because his company will be rolling out new simple services and packages on a continuous basis, maintaining the Service Plan table will be difficult. Franklin's original Service Plan table looked like the one that Table 3 shows.
Franklin wants three things. First, he wants to bundle the dial-up and Web-hosting plans and offer them at a discount, but he's not sure how to make the right table references. Second, he wants to avoid redundant data in the Service Plan table. Third, he wants to minimize data management when his company adds or changes plans and services.
Franklin is facing a BOM problem. He has a table that's related to itself in both directions—a recursive M:N relationship. My approach is to let the model determine the implementation. Franklin's situation might be confusing at first glance, so let's look at it with the help of an illustration.
Figure 1 shows a conceptual data model of the Service entity, which is an entity I created that's similar to Franklin's original Service Plan table. In this model, a service is composed of zero or more other services (if zero, it's a simple service; if many, it's an assembly of services). A simple service can be a component of zero or more other services (assemblies).
The recursive M:N relationship that Figure 1 shows is more complicated than an ordinary M:N relationship because, whereas you're used to seeing two different entities in a M:N relationship, you're now seeing only one—the Service entity is related to itself. But like any other M:N relationship, when you convert the base entity (Service) into a table, the relationship also becomes a table—in this case, a table called ServiceComponent.
To convert the recursive M:N conceptual data model to a physical data model, you create one table for the base entity (Service) and a second table (ServiceComponent) for the relationship. (For more information about the rules for converting models, see "Logical Modeling," July 2000, InstantDoc ID 8787.) In Figure 2, I use physical-model notation to show the two relationships—the arrowheads point to the parent table. ServiceComponent is the associative table that represents the M:N relationship. Listing 1 shows part of the code I used to create this article's examples. (For the complete script I used to populate the tables and test the recursive relationship, see Web Listing 1 at InstantDoc ID 42520.) A service can be composed of zero, one, or many services; FK_IS_COMPOSED_OF shows this relationship, in which AssemblyID is the foreign key in the ServiceComponent table that links back to the Service table. A Service can also be part of zero, one, or many services; the relationship FK_IS_A_COMPONENT_OF shows this construction, in which ComponentID is the foreign key that links ServiceComponent back to the Service table.
You can more easily visualize how this scheme works if you look at the data in table form. Figure 3 shows a list of services that I selected from the Service table. Notice that this result isn't a true hierarchy. The Service table contains "simple services" (Items A through H) that are also components of other services. The next levels of service (SuperPlan A and SuperPlan B) are composed of only simple services, as the ServiceComponent table in Figure 4 shows. The third level of services (SuperDooperPlan A and SuperDooperPlan B) can include multiple SuperPlans or combinations of SuperPlans and simple services.
The selected results in Figure 5 show the plans made up of more than one component; the components are recorded in the ServiceComponent table. Franklin's company can assemble any combination of simple services or composite plans by using this table structure. To explore how this design works, I wrote the query in Listing 2, which returns the list of each composite plan and its components that Figure 5 shows. Whenever Franklin needs to pull a report to show customers the cost savings they'll enjoy when they purchase a composite plan instead of multiple simple services, he can write a more complicated query such as the one that Listing 3 shows, which returns the following result:
Plan B $9.75 $1.00
(1 row(s) affected)
By using this table schema, Franklin can now efficiently and effectively manage service plans and service-plan components. Moreover, he can integrate this schema into the Web-hosting and billing schema that he found in my column "Web-Host Invoicing" (March 2003, InstantDoc ID 37716) and offer his customers a greater variety of service plans while keeping a handle on his data. Eventually, when Franklin migrates his SQL Server installation to the upcoming SQL Server 2005 release, he can rethink the queries I've presented in this article and evaluate the recursive common table expression (CTE). You can read more about T-SQL's new CTEs in Itzik Ben-Gan's articles "Get in the Loop with CTEs," May 2004, InstantDoc ID 42072, and "Cycling with CTEs," InstantDoc ID 42452.