The view through an XMLDataDocument
XML views, which ship in SQL Server 2000 and the SQL Server 2000 Web releases (SQLXML), conveniently treat the data stored in your SQL Server database as if it were part of a virtual XML document. In "Defining XML Views," December 2002, InstantDoc ID 27106, I described how to create an annotated XML Schema Definition (XSD) schema (sometimes called a mapping schema) to build an XML view. These views are useful because they execute hierarchical XPath queries against your relational data. They accomplish this feat by combining the data in the mapping schema with an XPath query to produce a SQL query, which in turn generates the XML result.
Originally, XML views were the only method for querying your data in this way. But since the introduction of the Microsoft .NET Framework, not only can you execute hierarchical queries on relational data, but you can also manipulate the data—functionality that wasn't previously available. Let's look at how you can use ADO.NET's DataSet and XMLDataDocument classes to produce XML query results from relational data.
The DataSet is a client-side (or middle-tier) cache that stores data in a structure similar to a relational database; that is, it provides an in-memory relational representation of data retrieved to your local system. A DataSet lets you model your data by storing data in a set of tables that the DataTable class represents and by using key and foreign key columns to define relationships, called DataRelations, between the tables.
A DataSet also lets you navigate between tables by using those foreign key relationships when you access the data—an operation that's conceptually similar to a join in SQL Server. One of several ways you can populate a DataSet is to use a DataAdapter, which is an intermediary between a data source (such as a SQL Server database or an XML document) and a DataSet. A DataAdapter's Fill method executes a query in SQL Server, retrieves the data, then pours the data into the DataTables. (In a future column, I'll discuss how a DataAdapter also retrieves data directly from an XML document.) As you can see, a DataSet provides a familiar relational structure for in-memory, cached data. The DataSet also provides a relational object model similar to ADO for accessing the data, but that topic is beyond the scope of this article.
The XMLDataDocument class provides a virtual XML document whose contents are defined by data stored in a DataSet. Like SQLXML's XML views, XMLDataDocument requires a mapping between the relational storage of the DataSet and the hierarchical structure of the XML document that the XMLDataDocument provides. The following rules define that mapping:
- Tables in the DataSet, which I call table elements, map to the element of the same name in the XML document.
- Columns map to simple subelements of table elements.
- A DataRelation established between two tables causes the table elements to be nested.
The XMLDataDocument class is a subclass of the XMLDocument class, which provides the Document Object Model (DOM) programming model for accessing and navigating the data in a virtual XML document. The DOM provides a tree-oriented programming model for XML documents that stores data in nodes of the tree. You can access data stored in the nodes by navigating the parent-child relationships that the DOM provides or by executing XPath queries to return a set of nodes that match the query. Additionally, because the DataSet and the XMLDataDocument share the same underlying in-memory storage, changes that you make by using the XMLDataDocument are reflected in the DataSet and vice versa. Now that you understand the basics of a DataSet and an XMLDataDocument, let's look at an example of how you can use them to produce XML query results.
Querying a Virtual XML Document
Before diving into the code, let's quickly look at the scenario I use for this column's example. Suppose your company employs an outside firm that runs special promotions for customers who place large orders. The outside firm requires you to send it an XML document that contains the orders a customer has placed. Each order includes an order identifier, the date the order was placed, and the total order amount along with information that uniquely identifies the customer.
To meet the firm's requirements, you need to develop a program to access your database, extract the data, and build the XML document. The example uses the Northwind sample database that comes with SQL Server. I use a DataSet to store the customer and order data from SQL Server and an XMLDataDocument to build a virtual XML document that contains the data. I also use an XPath query to select only customers who placed at least one order that totals more that $800—which is considered to be a large order. For brevity, I show only a subset of the query results. With this scenario as the goal, let's look at the code.
Listing 1 shows the C# code for this example. A single method, Main, contains the entire program. At callout A in Listing 1, the code creates a new DataSet, named CustOrder, and adds to CustOrder two DataTables, named Customer and Order. The names you use for the DataSet and DataTables correspond directly to element names in the virtual XML document, so you should choose the names carefully. In general, choose the singular form of a name instead of the plural (e.g., Customer instead of Customers) because the XML element represents a single Customer, not a set of Customers like the rows in a DataTable. Note the singular element names in Figure 1's XML view.
At callout B, the code creates and opens a connection to the Northwind database. Next, the code creates two SQLCommand objects containing queries that return a set of Customer rows and a set of Order rows from the database. Then, the code at callout C creates the DataAdapters, which use those queries to retrieve data and populate the two DataTables that the code at callout A creates.
After the DataAdapters have populated the DataTables, the code creates a DataRelation named CustOrder between the Customer and Order DataTables. The DataRelation specifies that the two tables be related through the CustomerID columns. A Nested property set to true causes the XMLData Document that the code at callout D creates to generate a virtual XML document that nests orders inside customers. The next line of code saves the entire contents of the virtual XML document to a file called result.xml, which Figure 1 shows.
Now that the XMLDataDocument is complete, you can use XPath to query the data. At callout E, the code uses the SelectNodes method on the XMLDataDocument to execute an XPath query that returns orders totaling more than $800. SelectNodes returns a list of nodes—in this case, elements—that match the query.
Next, the code creates an XmlTextWriter, which is an object that you use to write XML to a file. You want to make certain that the XML output is well formed; that is, that the XML has a single root element that contains all other elements. To ensure that outcome, first you use the XmlTextWriter to output a LargeOrders element. Then, you write each node returned from the XPath query to the output file by using the node's WriteTo method. WriteTo writes the XML representation of the node and all its descendants. Once you've written all the nodes to the output file, you close the root element and the XmlTextWriter. Figure 2 shows the resulting XML file.
.NET XML Query Results
The DataSet and XMLDataDocument classes are analogous to the SQL Server database and mapping schemas that programmers have used to provide XML views—and just as easy to use. However, unlike XML views, which have no object model, the DataSet and XMLDataDocument classes provide relational and hierarchical object models—the DataSet and the XML DOM, respectively—which you can use to manipulate the data or execute business logic.
Using a DataSet and XMLDataDocument to generate XML query results provides the added benefit of reducing load on your SQL Server by moving the creation of the XML query results to a middle-tier (or client) machine. However, you should carefully weigh the benefits of generating XML on the client or middle tier against the cost of retrieving large amounts of data from SQL Server into the DataSet's local cache. When you need to generate XML query results, execute business logic, or offload processing from your SQL Server machine, using .NET's XMLDataDocument might be your best—and most flexible—option.