The IT community's burgeoning interest in anything XML has spilled over to SQL Server 2000, which ships with XML capability. Developers are using SQL Server 2000's XML features to write queries that use the FOR XML clause to return results in XML format, to query SQL Server through HTTP, and to program in XML's XPath query language. (For an overview of SQL Server 2000's XML features, see Bob Beauchemin, "The XML Files," September 2000.) In this article, I show you how to engage one of these XML features—SQL Server 2000's FOR XML clause—to produce a well-formed XML document that another system can easily read and process. The FOR XML clause, which you use in aSELECT statement, lets you write a SQL query that returns XML directly from the SQL Server query engine. Without the FOR XML clause, you must parse the query results through an XML parser or generate XML in a custom application.
Using T-SQL to Produce XML
Let's walk through an example that shows how to create an order document for Northwind Traders, a catalog company that uses the Internet to market international products. To produce a single, comprehensive order form that Northwind's automated warehouse system can print, I query SQL Server's sample Northwind database for the applicable information. The sample database contains customer and contact information for a stick-on address label; the name, product number, and quantity of each item in the order for a bill of lading; and the shipper name for a shipping label that tells warehouse employees where to stage the package for pickup. I decide to use XML for this project because XML provides a widely adopted and supported document format. I also like the fact that XML describes the information in the document instead of depending on the information's place in a fixed-format document such as a delimited text file.
To get XML results from SQL Server, I simply have to append a FOR XML clause to a standard SELECT statement. Listing 1 shows a T-SQL query that uses the keywords FOR XML AUTO. The FOR XML AUTO mode, which I must specify, tells SQL Server to return the results in the simple nested XML tree that Figure 1 shows. (I produced Figure 1's results by using SQL Server XML Support in Microsoft Internet Information Server—IIS).
Figure 1 reveals even more compelling information. The screen shows that SQL Server's query engine based its XML document nesting on the SELECT statement's columns. The resulting XML document also shows how the SQL Server query engine elected to join the tables. Although Figure 1 displays a valid XML result, I seek a different outcome. I want to rearrange the element hierarchy because I need elements only for the order header, the order number and shipper, and the line-item detail. Notice how Figure 1's result groups the header information. The "c" at the beginning of the header element comes from the "c" alias for the Customers table in the query, the "o" for the alias for the Orders table, and so on. In the Northwind tables, the Order level includes the shipper, and the quantity ordered applies to each product line item. But to make the XML result easier to manipulate, I decide to flatten out some of the hierarchies. In the result I want, the customer name, address, and contact comprise the first element; the order ID and the shipping company name form the second element; and the quantity ordered and product name constitute the third element.
To gain more control over the format of the resulting XML document, I can use the FOR XML clause in EXPLICIT mode. FOR XML EXPLICIT lets you specify exactly how you want information nested in the XML tree. To use FOR XML EXPLICIT to force the hierarchy of the resulting row set, I must format my query by using structured column names and UNION operators. SQL Server 2000 documentation calls this specific row set format a universal table, which is an effective way to visualize the structure of a FOR XML EXPLICIT query. With this table, you can institute the tag to identify the hierarchy levels of the XML result, set up the parent-child relationships between the levels, and establish the structured column names, as Table 1 shows. The first two columns in Table 1, TAG and PARENT, define the parent-child relationships among the elements of my result set. (The tag value is literally the row level in the hierarchy, and the parent value is the tag of that row's parent row.) To conserve space, I have included a limited XML result in Table 1's universal table.
Listing 2 shows the FOR XML EXPLICIT query that forces the XML result into the format I want. I construct the query in three parts that I connect with UNION ALL clauses. The query's first SELECT statement constructs the hierarchy's first element. I establish the relative column order according to my sample universal table model and specify the column names by using the element!tagnumber!attribute!directive format. You can see that Customer is the first hierarchy element. I include the CustomerID attribute to form links among the three hierarchy levels, but I use the HIDE directive to conceal the CustomerID attribute because I don't want to include the customer ID in the XML result. I add other customer information for the mailing label at the first level. Customer has no parent, so I assign the parent column a value of NULL.
Order is the hierarchy's second element, so I use the second SELECT statement to retrieve the order data. Because customer (first level) is the parent of the Order element, and Order is the second element, I assign the tag column a value of 2 and the parent column a value of 1. I include the customer columns again along with OrderID because in the SELECT statement, I join the Customers table with the Orders table. To retrieve all the data I need for the Order element of my XML result, I join the Orders table to the Customers table and the Shippers table to the Orders table. The final SELECT statement, which I write for the third level of the hierarchy, incorporates the order line-item details for my result. I assign the tag a value of 3 and the parent column a value of 2, in keeping with the hierarchy of this data set.
The final part of the query is the ORDER BY clause. In the ORDER BY clause, SQL Server uses the structured column names that I devised when I was developing the universal table to structure my query. When a FOR XML query is in EXPLICIT mode, the ORDER BY clause is very important for structuring the query results because the ORDER BY clause sets up the hierarchy of Customer, Order, and Order Detail. When I include the ORDER BY clause, the query returns the hierarchy of customer-order-order detail for a single order. Without the ORDER BY clause, the query returns a list of all the customers' data first, followed by a list of all order-level data, then finally a list of all the order details. Notice that I must place the HIDE directive again in the Customer!1!CustomerID column because I used that directive for CustomerID in the first SELECT statement. Omitting the HIDE directive produces the error message Invalid column name 'Customer!1!CustomerID'. Figure 2 shows the results of the FOR XML EXPLICIT query in Listing 2.
Using FOR XML EXPLICIT in a Real-World Application
So far, I've shown you how to query SQL Server by using the FOR XML clause and how to control the format of results by using EXPLICIT mode. I produced these sample results in Microsoft Internet Explorer (IE) by running the utility Configure SQL Server XML Support in IIS from the SQL Server program group in the Start menu. SQL Server XML Support in IIS lets me use HTTP to query SQL Server 2000 and format XML into customer-friendly HTML, which applies XML style sheets in Extensible Style Language (XSL). Some development teams I've worked with also use SQL Server's XML support with Visual Basic (VB) components by using the ADO Stream object and employing XSL to format results for browser applications. To make the results I've shown you in this article customer-friendly, let's explore how to extract the XML result from SQL Server by using a stored procedure that a VB application executes.
XML provides the foundation for Internet communications between business systems—for example, between Northwind's corporate database and warehouse system. So let's examine how SQL Server 2000 can help you prepare a document for entry into a transport mechanism that automates a business transaction. You can use a transport mechanism such as Microsoft BizTalk Server 2000 to set up workflows and document transformations that handle the processing of documents similar to the XML orders document I describe in this article. (For more information about BizTalk Server, see http://www.microsoft.com/biztalk.) All I need to do is feed BizTalk Server a well-formed XML stream along with information about the data's origin and destination. Because SQL Server 2000 can produce the XML stream, I can query SQL Server from a component that sends the stream straight to a product like BizTalk Server, which in turn processes the stream according to the workflows and transformations I've predefined. Thanks to the XML support in SQL Server 2000, I don't need to do any XML-specific programming beyond writing the FOR XML EXPLICIT query to produce the XML order document that I send to BizTalk Server.
Let's look at a simple example of preparing an XML document for processing. Because I've already built a query that returns order information as an XML document, my next step is to add a solution for processing the orders in the XML document. When SQL Server has inserted the order into Northwind's database, I can assume that the order is ready to be fulfilled. Then, my objective is to transmit the order to Northwind's central warehouse facility, where a bill of lading and a shipping label print out and where employees pull items off the shelves, package them for shipping, and stage the packages for shipper pickup. To fulfill an order, I set up a BizTalk Server at the corporate sales facility (which also houses Northwind's database server) and a BizTalk Server at the warehouse, because the only connection between the sales office and the warehouse is the Internet. The BizTalk Server at the warehouse interacts with the warehouse system to print the shipping-related documents, adjust inventory, and transmit order-fulfillment information back to the corporate sales office through HTTP.
The FOR XML EXPLICIT query example that Listing 2 shows produces all the information that the warehouse needs to fulfill an order. However, I need to modify the query to produce an XML document for one order. The modification is necessary because I plan to place new orders in BizTalk Server one at a time, the same way the salespeople enter orders in the Northwind database. I prefer to use stored procedures for data access because their execution plans are cached, and I can easily execute stored procedures from a VB application through remote procedure calls (RPCs) by using ADO Command objects. So I write stored procedure p_xmlorder, which produces the XML result that Listing 3 shows. I use the customer ID and order ID as input parameters for the stored procedure, and because the FOR XML EXPLICIT query uses UNIONs to link the three levels of the XML order hierarchy, I need to apply the search criteria in each of the individual SELECT statements. The stored procedure returns the customer name, address, and contact name; the order number and the shipper for the order ID parameter I provided (different orders for one customer might have different shippers); and the list of products in the order. Figure 3, page 68, shows the result of stored procedure p_xmlorder for the customer ID ALFKI and order number 10643.
Now that I've written a stored procedure to produce a well-formed XML document for an order, I need to write an application to send the order on its way to the warehouse for fulfillment. I can complete this step from a VB application that uses ADO to call the stored procedure and that loads the resulting XML in an ADO Stream object. I can then pass the XML in the Stream object to BizTalk Server by using the BizTalk Interchange.Submit method call with the name of the BizTalk Server channel to execute for this document. (The server channel is a named workflow that BizTalk Server executes when you submit the order.) The Interchange.Submit method call identifies the channel and the XML order in the ADO Stream object named adoOutStr. Listing 4, page 68, shows the VB code for retrieving the XML order document and sending it to BizTalk Server.
VB produces the XML results in a stream, so I use an ADO Stream object to store the XML. The ADO Command object functions in the standard way to execute a parameterized stored procedure, with the following exceptions. The Command object uses the .Properties property to direct the XML results to the Stream object, whose adExecuteStream option you set in the .Execute method to specify a stream output. A typical ADO Recordset doesn't require these properties and options. The Command object executes the stored procedure efficiently as an RPC in SQL Server as well. The remaining code calls the BizTalk COM interface to execute a BizTalk channel (called "Order Fulfillment Channel" for this example) that I defined to accept order documents and route them to the warehouse. I designed this standard VB .exe application to run on the same computer that runs BizTalk Server. In a production system, I would write this as a COM+ component to work in a distributed application so that I could take advantage of COM+'s manageability and scalability benefits.
SQL Server 2000's XML support opens up many interesting possibilities for developing solutions that require data in XML format. This article shows how to use the FOR XML clause to retrieve results as XML documents and how to use the FOR XML clause in EXPLICIT mode. With the help of the FOR XML clause, you can produce a well-formed XML document and forward it to another application in two fairly simple steps. Even better, you can accomplish this feat by using familiar programming techniques and avoid having to write a lot of XML-specific code.