ID attributes in an XML document work like keys and foreign keys
XML views define an XML-centric view of a subset of the data stored in your relational database. You define an XML view by adding annotations to an XML schema to form a mapping schema. In "Filtering Values in XML Views," November 2002, InstantDoc ID 26715, and "Defining XML Views," December 2002, InstantDoc ID 27106, I showed you how to use several annotations to map data from the rows and columns of database tables to a virtual XML document that your mapping schema defines. In this column, I show you how to use XML's built-in support for ID and IDREF attributes, which function like keys and foreign keys in your database, and IDREFS, which lets you specify one-to-many (1:M) relationships. You can use ID and IDREF attributes to avoid producing redundant data in an XML query result, which might happen when your database contains 1:M relationships. For example, many order-detail records might reference a single product, as they do in the Northwind sample database. You can avoid duplicating the product data within each order detail by using a unique ID attribute on each product element and an IDREF attribute on the order detail to reference the product. Using these attributes reduces the size of your XML query result because you include only one product element even though multiple order details might reference that element. I also explain how you can use a mapping schema's prefix annotation in SQL Server 2000—and the SQL Server 2000 Web releases (SQLXML)—to generate XML that contains ID, IDREF, and IDREFS attributes.
You use the prefix annotation when a schema declares attributes with the types ID, IDREF, or IDREFS. (Note that in this column, ID, IDREF, and IDREFS attributes always refer to types, not names. The attributes that your schema declares can use any valid attribute name.) These special attribute types build nonhierarchical relationships between different elements in an XML document. (In a nonhierarchical relationship, two elements in an XML document are related but neither element is an ancestor or descendant of the other.) When you build nonhierarchical relationships, you use an ID attribute to specify a document-wide unique key. Think of this unique key as a primary key in your SQL Server database. As you might guess, an IDREF attribute references an element that contains an ID attribute with the same value as the IDREF attribute—like a foreign key in a SQL Server database. Using ID and IDREF, you can build one-to-one (1:1) relationships between elements in your XML document. IDREFS is similar to IDREF except that an IDREFS attribute contains one or more references to elements that contain ID attributes with the same values specified in the IDREFS attribute. Each reference in the IDREFS attribute is separated by a space. As I noted earlier, you can use IDREFS to specify 1:M relationships.
The values you map to ID, IDREF, and IDREFS attributes must conform to specific formatting requirements. An IDREFS attribute's formatting implies a constraint on the values you use for ID attributes; these values can't contain the space character because this character separates ID values. The constraint on ID attribute values is part of a larger constraint. This larger constraint dictates that the values you use for ID, IDREF, and each value within an IDREFS attribute must meet the NCName production (a set of rules for constructing a constrained string value) defined in the XML for Namespaces specification. (For information about this production, see http://www.w3.org/tr/1999/rec-xml-names-19990114/). The NCName production specifies that a valid NCName—and hence a valid ID value—must be a string that starts with either a letter or the underscore (_) character, followed by printable characters or numbers. Because an ID value must begin with a letter or underscore, you need the prefix annotation to meet the constraints of the NCName production.
ID attribute values must also be unique among all elements in the document. Duplicates aren't allowed—even if the element names are different. Think of an ID attribute as a primary key for a table but with the additional constraint that you can't use the key value in any other table in your database. With these constraints in mind, let's see how you can map ID, IDREF, and IDREFS attributes to your SQL Server database.
Because the value of an ID attribute must be unique within an XML document, the value most readily lends itself to mapping to a primary key in your database. Typically, you can map the value of a primary key or an IDENTITY column to identify individual instances of an entity in a database table. In the example database that Listing 1 shows, you use a unique OrderID to identify individual orders stored in the Orders table. You can use these unique values to produce an ID attribute for the Order element, which will then uniquely identify the order in an XML result obtained from SQL Server. Recall that ID attributes must meet the NCName production defined in the XML specification. But in Listing 1, the OrderID column is of the integer type, which begins with a digit. Additionally, if you want to combine Order and Product elements in one document, you must guarantee that two different values are used for their IDs. However, in Listing 1, the OrderID and ProductID columns have duplicate values. If you use these column values, the XML document will be invalid because one order and one product would share the same unique ID attribute. Using the prefix annotation solves both problems.
You use the prefix annotation on ID, IDREF, and IDREFS attributes in your mapping schema to specify an arbitrary string that will be prepended to values retrieved from your database and mapped to those attributes. A prefix solves the first problem by letting you ensure that numeric values mapped to an ID attribute meet the NCName production. You can use a prefix with nonnumeric values as well. A prefix solves the duplicate-values problem too. By specifying separate prefix annotations for each ID, IDREF, or IDREFS attribute in your schema, you can eliminate duplicate values from different columns (a common problem when you're mapping to IDENTITY columns). Now, let's look at an example of how to use the prefix annotation.
The example uses the database that the SQL script in Listing 1 defines. This script borrows data from the Northwind sample database to build three tables that relate an order, the line items on the order (called Items and stored in the Order Details table), and the product referenced in the line items. You can run the XML Schema Definition (XSD) mapping schema example by using the database that the SQL script in Listing 1 creates.
Listing 2 shows a mapping schema that generates the XML result that Figure 1 displays. At callout C of Listing 2, the mapping schema declares a top-level element called root, which contains a list of Order elements and Product elements. The Order element at callout A has an OrderID attribute that uniquely identifies the Order element. The prefix annotation adds the ORDER- string to the value from the OrderID column so that the value of the OrderID attribute satisfies the NCName production. Each Order element also contains a ShipAddress and ShipCountry element as well as a set of LineItem elements—one for each line item on the order (found in the Order Details table). Each LineItem element contains an IDREF attribute named Product. The mapping schema uses the relationship annotation to map the IDREF attribute to the Products table, and the prefix annotation specifies the PRODUCT- value to make the values from the ProductID column unique. The Product element declared at callout B includes a subelement that contains the name of the Product and an ID attribute named ProductID. The prefix annotation specifies that the PRODUCT- string should be prepended to the values in the mapped column, ensuring that the IDs on the Product element and the IDREF on the LineItem element beneath the Order element have the same value.
To execute the code for the example, create a virtual directory by using the IIS Virtual Directory Management for SQLXML plug-in for Microsoft Management Console (MMC). Name the virtual directory Jan2003 and choose a directory on your system for the Local Path. (Selecting a directory beneath the Inetpub\www directory can avoid access security problems.) Save the mapping schema from Listing 2 to the directory in a file named Orders.xsd. Enter the security parameters to access the database that contains the tables from Listing 1, then select the Allow XPath check box on the Settings tab. On the Virtual Names tab, create a virtual name called Orders, set the type to schema, then set the path to (.), which will use the same directory you specified for the virtual directory. Next, save the virtual directory. You've now installed the XML view that uses ID and IDREF to define nonhierarchical relationships.
Now, execute an XPath query against the XML view by using Internet Explorer (IE) 5.0 or later. The query returns an XML result by using IDREF attributes to relate the LineItems elements to the products they reference. To execute a query, type http://localhost/Jan2003/Orders/Orders.xsd/root in the address bar. Jan2003/Orders/Orders.xsd selects the XML view that Listing 2's mapping schema defines, and /root is the XPath query to retrieve all customers. The query produces the results that Figure 1 shows.
Now that you've learned how to generate ID, IDREF, and IDREFS attributes from data within SQL Server, you can use this technique when your schema defines ID, IDREF, or IDREFS attributes or when you want to reduce the size of the XML result by eliminating duplicate data in lower levels of the hierarchy.