Downloads
7643.zip

Is the introduction of XML in SQL Server a good thing?

Microsoft has recently started embracing Extensible Markup Language (XML), but embracing XML is different from putting it natively into a product. SQL Server administrators and database developers haven't missed full integration of XML in SQL Server, but they've now come to realize that they needed XML all along. In the next version of SQL Server, XML output will be a standard option, letting you create a channel of information—including information pulled directly from the SQL Server—that you can pass to other systems. If you combine XML and Data Transformation Services (DTS), bulk copy program (bcp) will continue to slip out of use, to the delight of most database developers and administrators.

XML is available today only in a manner of speaking. You can use the language today with extensions that Microsoft released for use with SQL Server 6.5 and 7.0. These extensions are more workarounds than real solutions. The real strength of XML will come with the next release of SQL Server. But you can use the tools available today to begin working with XML, if you haven't already, and to start understanding what you can and can't do with it.

What is XML?


XML's purpose is to solve the problem of exchanging data between systems. XML solves this problem by providing a self-describing data set. In other words, when you receive data from a query, XML provides the data and descriptive information about the data. The idea behind XML is to separate the data from the work of manipulating and presenting the information.

If you're familiar with HTML, XML will look suspiciously similar to HTML, but infinitely more noisy to read. By noisy, I'm referring to the number of tags that XML operations contain, including grams (commands sent to the server) or data or formatting tags. But the high number of tags is unavoidable if the data and formatting are to succeed as self-describing. For an example of an XML set, consider the snippet in Listing 1, page 22.

Without knowing what this snippet represents, you can see that it's two Web site listings, each with the site URL, owner, description, and other information about the site. The fact that you can discern this information is exactly the point of XML.

If you use XML with an application, the application no longer needs to know about the underlying data source (SQL Server), nor does it need to know about the data structure. Removing this requirement means you can write more generic applications. You'll be able to write an application, for example, that reports on information in the database without needing to know the information structure ahead of time.

In terms of implementation, XML is in its infancy. Microsoft is implementing it more natively in Internet Explorer 5, and the next version of SQL Server reportedly will support XML as a standard protocol. Including XML will be a boon to Internet development, making it easier for developers to build generic Web sites that run on the data behind the scenes.

Are Apples Red or Yellow?


One of XML's great strengths is that you can use it to describe your data in ways that are meaningful and in ways that a third-party application can use. But this capability introduces some interesting problems. For example, if I asked you the name of the fruit that you're eating, and you said apple, I'd know a bit more about what you were eating, but not all that I needed to know. After all, apples can be Red Delicious, Golden Delicious, Granny Smith, or another variety.

What you need is a more standardized way to describe the data. Specifically, you need the equivalent of a dictionary to help decipher and define the data-element descriptions so that others can understand them. Is the code that is your unique identifier on your inventory an iCode, ItemCode, Item_Code, or something else?

Initiatives are under way to address standardized data. The BizTalk Web site at http://www.biztalk.org shows a schema library and provides discussions about industry segments and the data elements that are specific to the segments.

The fact that these initiatives are forming introduces an interesting question—one that the IT industry will need to address before XML is fully accepted. If XML, with its self-describing data sets, is such a strong solution for building applications, why do these initiatives need to launch these standardization efforts? Isn't standardization one of XML's touted benefits?

XML in 6.5 and 7.0?


In October, Microsoft released a technology preview for XML. If you're familiar with the Internet Database Connector (IDC) approach to working with Web sites and databases, you'll quickly be saying, "Here we go again."

In this preview, Microsoft lets you use a browser to either pass a SQL operation as a parameter or specify a template file that contains the operation you want to perform. To call the XML interface, you use a URL that passes the SQL statement to a parser. The parser sends the statement to SQL Server and sends back XML-formatted data.

In addition to this query process ability, Microsoft lets you build and specify an output template file that is used to format the output as needed by the client making the call. This procedure is just like the HTML Extension files (HTX) of yore, the formatting files used in the early days of Microsoft databases on Web sites.

Although I respect Microsoft for implementing XML because XML provides connectivity to SQL Server 6.5, 7.0, and beyond, I'm concerned that this XML implementation will reintroduce many of the problems that still plague users who embraced the concept of IDC and HTX database integration for Web sites before. Too many Web sites still use the IDC approach to access databases, despite the fact that this approach is slow, difficult to manipulate, and not very dynamic.

The Basics of XML


Here are several key features of XML, specifically Microsoft's implementation—the technology preview—of XML for SQL Server. You can use the Microsoft preview to help you understand the following XML elements:

  • Grams are commands that you send to the database engine, such as INSERT, UPDATE, and DELETE. You can send multiple statements in a single gram. You use special tags in the gram, including Before, After, and Sync. Sync is what's now known as a transaction; all items within a sync section are considered a single transaction. You use the Before and After tags to indicate what needs to be done (after) and what the data should look like before the change (before). Think of the Before tag as a WHERE clause.
  • You use a URL to issue standard SELECT statements. In this URL, you can pass in the SELECT statement and indicate the formatting of the data output. By default, the data will come back as a block of text in XML format that your client application can use.
  • You can see the output of the statements, and you can work with available formatting options.

I recommend that you download, install, and use the XML technology preview from Microsoft's SQL Server Web site (http://msdn.microsoft.com/ workshop/xml/articles/xmlsql/), but I can't stress enough that you should use this preview only to learn XML and its use. Don't use this technology preview in a production environment. We all build sample applications that turn out to be so helpful that people use them in production. Don't go there with this preview. Use it as Microsoft presents it, as a learning and testing tool for your environment.

Future of XML


I'm concerned that XML, Extensible Style Language (XSL), and other facets of XML will introduce additional developer overhead. Developers already use OLE DB and ODBC to code SQL statements. At best, introducing XML adds a new layer to the process. At worst, XML adds a layer that won't be fully integrated into the database engine. This lack of integration will shift responsibility for the database language elements back to the developer, which is the situation Microsoft tried to avoid by introducing ODBC and OLE DB. I hope that Microsoft will make the true XML interface (not the technology preview) easier to use without giving up the middleman layers provided by OLE DB and such.

XML is here to stay. It's an easy way to exchange information between sources, and it will enhance the usability of many tools, such as reporting tools, because the tools will know more about the data being passed to them. The SQL Server community is in the foundational stages of using XML. We're learning the language, seeing it in action, putting it through its paces, and trying to understand it.

Here's one last thought about XML. Assuming that XML is truly a success and breaks down the barrier between the client and server software (and the tiers in between), what will XML do to the database server wars? If the data and the interface to that data are identical between engines, you'll need to remove a level of variables from the database engine selection criteria for a given project. Now that will be interesting.