Two ways to process database updates in Web applications
Development efforts in the corporate world are changing to leverage the power and interoperability of the Internet. Focus has shifted from developing traditional client/server applications—which require strict definition of the data structures and protocols used to exchange data—to developing Web applications using XML. Like traditional client/server applications, Web applications exchange data between different logical entities over a LAN or WAN. But instead of using dedicated connections and binary data formats, Web applications use the public Internet and XML. Most Web applications are also similar to client/server applications in their need to access and update data stored in a relational database such as SQL Server. Web applications have to obtain XML query results from SQL Server and update SQL Server with XML-formatted data. Microsoft has satisfied these requirements by implementing many technologies within SQL Server 2000, XML for SQL Server 2000 (SQLXML), and the .NET Framework. (For an overview of these technologies, see "Selecting XML Technologies for Queries and Updates," April 2002, InstantDoc ID 24342.) These technologies provide a solid basis for building a variety of Web applications. However, processing database updates has proven consistently difficult.
Microsoft provides several options for updating a database, including OpenXML, XML Bulk Load, updategrams, and Diffgrams. We've explored OpenXML and XML Bulk Load in previous columns; now it's time to look at updategrams and Diffgrams. Each technology has significant drawbacks for rapidly building real-world Web applications—scenarios in which data is requested from a Web application, modified remotely at the client site, then returned to the Web application so that modifications can be stored in the database. Implicit in these scenarios are three requirements for the Web application. First is the need for optimistic update. A Web application needs optimistic update because of the asynchronous, short-lived, potentially unreliable Internet connections between client and server. Without optimistic update, application and database performance deteriorate swiftly because many locks are held until they time out. The second requirement is that the client doesn't need—and should have no foreknowledge of—the database structure. Third, the client needs a way to transmit changes to the data—including concurrency information—to circumvent competing updates that multiple clients make to the server. Updategrams and Diffgrams are similar technologies that support these three requirements to varying degrees. So, assuming you're building a Web application, which technology should you use?
An updategram is an XML document that describes the changes made to an XML query result. You can use updategrams with any SQLXML Web release to optimistically update your database. Updategrams also support updating one or more tables in the database corresponding to hierarchical relationships in the XML. For example, you can insert order items nested within an XML order element into the database's Order and LineItem tables. Although updategrams attempt to satisfy the three requirements for updating data, they fall short in several areas.
Primarily, updategrams aren't supported by ADO, SQL Server, SQLXML, or object models such as the XML Document Object Model (DOM). This deficiency reduces the usefulness of updategrams in three ways. First, because ADO and DOM don't support generating updategrams, you must implement client code that generates the updategram syntax directly. What's worse, this onerous task doesn't scale across projects; you have to implement the logic every time you want to produce an updategram. Second, Web applications usually require business logic to verify and validate the data before storing it in the database. Again, you need an object model to express the business logic and—because the technology uses optimistic update—to verify that the data wasn't changed between the time you retrieved the data from the database and the time you attempted to update it. Third, an object model ideally can give you access to the updated data and to the original values in a consistent manner. But again, you'd need to implement code that interprets the updategram syntax in addition to performing the business logic and concurrency checks.
Updategrams also miss the mark because they can't call stored procedures to perform a database update. Therefore, business logic, constraints, and referential integrity that you implement by using stored procedures are rendered useless.
The way updategrams express changes made to a data set can pose an additional problem. These changes are expressed in terms of the XML—not the relational database. Although expressing the changes this way is generally beneficial because it hides details of the database from the client, it also can exclude data that you might require to validate changes or execute business logic before you commit the changes to the database.
Finally, updategrams fall short in the way they're usually applied. Because updategrams are best used for batch processing, they would produce the most efficient results if SQL Server processed them the same way it processes T-SQL UPDATE statements. However, the current SQLXML implementation processes updategrams on the middle tier, generating SQL statements that SQL Server executes instead of directly processing the updategrams. So although updategrams appear on the surface to be a good way to update data within your Web application and might work for a narrow set of scenarios, their shortcomings prevent them from being generally useful.
Introduced with the ADO.NET support in the .NET Framework, Diffgrams are similar to updategrams but solve some of the most serious problems inherent in updategrams. Most importantly, Diffgrams provide object-model support through the ADO.NET DataSet object. You can generate Diffgrams directly from a DataSet definition by persisting the DataSet in Diffgram format, and you can create a DataSet from a persisted Diffgram. Therefore, when you build your business logic, you don't have to write the code to generate the Diffgram or to navigate the Diffgram structure. Instead, you just use the DataSet object to write the code. You can also combine a DataSet object with a DataAdapter object to commit changes expressed in the Diffgram to the database. (DataAdapter is an intermediary object in ADO.NET that reads data from—and resolves changes to—a database.) Additionally, DataAdapter supports updates through stored procedures and can raise an event to your program if a concurrent update has been made to the data. Although Diffgrams are significantly more convenient for programmers than updategrams, using Diffgrams carries some caveats.
First, you need the .NET Framework to take full advantage of Diffgrams. The .NET Framework and the Common Language Runtime (CLR) aren't yet fully deployed, so a Web client might not have access to the DataSet's rich functionality, or the .NET Framework might not be available on a middle-tier server. Second, Diffgrams don't support updates on tables that use identity columns. If your database schema uses identity columns to automatically generate unique IDs for entities that are subsequently used for foreign keys in other tables, Diffgrams fail to process the updates. For example, Diffgrams can't successfully add line items to your database when an order and its line items are stored in separate tables and when an identity column (e.g., an order number) is used as a foreign key in the LineItems table to refer to its associated order. The only workaround is to write the update code yourself, which negates the value of Diffgrams. More subtle obstacles include the Diffgram's inability to consume XML when different complex types (defined in an XML Schema Definition—XSD) use the same name but in a different context. For example, you might use the name Vendor to reference generically a ServiceProvider type and a ProductReseller type, which have different structures. You might also encounter difficulties interpreting a Diffgram if the client uses a different schema—or no schema at all—to build the DataSet. Because the Diffgram is a serialization of a DataSet's contents, the Diffgram you receive might not match your database or any schema that you know, and consequently, updating your database becomes difficult or impossible.
If the .NET Framework isn't available, a Web client application typically has to generate a Diffgram by writing the code to construct the Diffgram format. However, on the middle tier, a processing alternative is available. SQLXML 2.0 and 3.0 can process a Diffgram and update your database in two ways: through ADO's XML dialect, which uses a CommandStream object, or through a SQLXML virtual directory, which processes Diffgrams in much the same way it processes XML templates. Both methods use an XSD mapping schema to map the data inside the Diffgram to the database structure. Using this mapping information, SQLXML processes the Diffgram and generates SQL statements to modify the database in the same way it processes an updategram. Although SQLXML will help you understand the kind of support available for Diffgrams before you install the .NET Framework, you still have to grapple with the problems I outlined earlier for updategrams and Diffgrams.
Use Diffgrams Selectively
Diffgrams are more useful than updategrams because ADO.NET generates Diffgrams automatically. But you should use Diffgrams only when your database doesn't refer to identity columns that have foreign keys and when you can control the XML schemas that the client code uses. When those two conditions are met, you can ensure successful interpretation of the Diffgrams that your Web application receives. Perhaps someday, a standard language will express changes to the data as it is represented in the XML document as opposed to expressing the changes based on the way a programming model operates on the data. Then, we might be able to rapidly build Web applications that interchange data—and changes to that data—across platforms, languages, frameworks, and database management systems.