More About .NET XML Readers

In my May 29 column, I introduced an XML reader, the XmlTextReader class. In .NET, you can use the XmlTextReader class as a lightweight but no less effective alternative to the XML Document Object Model (XMLDOM) classes. XML readers let you move along a source file with a cursor-like approach. The atomic elements that you're reading aren't records, as with a database, or single bytes, as with streams. Instead, XML readers let you jump from node to node. Let's examine the methods the XmlTextReader provides that let you make those jumps.

The Read Method


The Read method blindly moves the internal pointer from one node to the next, regardless of the node type. Thus, a Read can move you from a comment node to the root document-element node or from a given element's last attribute node to the next element.
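
As a rough illustration of that node-by-node movement, the sketch below reads a small in-memory document and records every node that Read stops on. The sample XML, the ReadDemo class, and the ListNodes helper are assumptions made up for this example; only XmlTextReader, Read, NodeType, and Name come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class ReadDemo
{
    // Hypothetical helper: returns one "NodeType:Name" entry
    // for each node that Read stops on.
    public static List<string> ListNodes(string xml)
    {
        List<string> nodes = new List<string>();
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            // Read advances one node at a time, whatever the node type.
            while (reader.Read())
                nodes.Add(reader.NodeType + ":" + reader.Name);
        }
        finally
        {
            reader.Close();
        }
        return nodes;
    }

    static void Main()
    {
        // Sample document invented for illustration only.
        string xml = "<!-- sample --><root><item>text</item></root>";
        foreach (string node in ListNodes(xml))
            Console.WriteLine(node);
    }
}
```

Note that the comment, the elements, the text, and the end tags all show up as separate stops; nodes such as comments and text have an empty Name.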

The MoveToContent Method


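The MoveToContent method lets you skip over noncontent nodes, such as the XML declaration, processing instructions, comments, the doctype, and white space, and reach the document element node directly from the beginning of the XML file. For example, the following code opens the specified XML document, reads the first node, and then moves to the first content node: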

XmlTextReader reader = new XmlTextReader(fileName);
reader.Read();
reader.MoveToContent();

When the XML reader loads, the system automatically positions the reader before the physical beginning of the file. Moving the reader to the first node requires a call to the Read method. In an XML file, the first node can be of various types. A document that begins with the (optional) XML declaration has a declaration node as its first node:

<?xml version="1.0"?>

Under other circumstances, the first node can be a processing instruction, a comment, a doctype, or a document element.

If you plan to work only on content nodes and attributes, use the MoveToContent method to skip over the first block of nodes and automatically position the pointer at the first content node, which is normally the root document element.
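
A minimal sketch of this behavior, assuming an in-memory document with a declaration, a comment, and a processing instruction ahead of the root. The MoveToContentDemo class, the FindRootName helper, and the sample XML are hypothetical names used only for this example.

```csharp
using System;
using System.IO;
using System.Xml;

static class MoveToContentDemo
{
    // Hypothetical helper: returns the name of the first content
    // element, or an empty string if the first content node is
    // not an element.
    public static string FindRootName(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            reader.Read();                             // lands on the first node
            XmlNodeType type = reader.MoveToContent(); // skips the prolog nodes
            return type == XmlNodeType.Element ? reader.Name : string.Empty;
        }
        finally
        {
            reader.Close();
        }
    }

    static void Main()
    {
        // Sample document invented for illustration only.
        string xml = "<?xml version=\"1.0\"?><!-- header --><?app hint?>"
                   + "<catalog><book/></catalog>";
        Console.WriteLine(FindRootName(xml));
    }
}
```

MoveToContent returns the type of the node it stops on, so checking the return value is a cheap way to confirm that the reader really reached an element.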

The MoveToElement Method


XML readers also have other interesting features that qualify the whole API as a noncached, read-only (but not necessarily forward-only) way of working with nodes. For example, suppose that you access the attribute list of a given node. You then move from one attribute to the next in a clearly forward-only manner. When you finish the attribute list, you might want to continue to the next content node in line or return to the parent node—the one that the attributes belong to.

The former case is obviously a move forward. The latter case, however, could qualify as a backward move because you're jumping back over already-read attributes. This instance is the only one in which the XML reader class provides any backward movement. The MoveToElement method provides this capability: it moves the pointer back to the element node that owns the current attribute node.
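
The attribute walk and the jump back to the owning element might look like the following sketch. The MoveToElementDemo class, the DescribeFirstElement helper, and the sample XML are assumptions for illustration; MoveToNextAttribute and MoveToElement are the framework methods being demonstrated.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class MoveToElementDemo
{
    // Hypothetical helper: walks the attributes of the first element,
    // then returns to the element and describes what it found.
    public static string DescribeFirstElement(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            reader.MoveToContent();

            List<string> attrs = new List<string>();
            // Forward-only walk over the attribute list.
            while (reader.MoveToNextAttribute())
                attrs.Add(reader.Name + "=" + reader.Value);

            // The pointer now sits on the last attribute;
            // MoveToElement jumps back to the owning element.
            reader.MoveToElement();

            return reader.NodeType + " " + reader.Name
                 + " (" + string.Join(", ", attrs.ToArray()) + ")";
        }
        finally
        {
            reader.Close();
        }
    }

    static void Main()
    {
        // Sample document invented for illustration only.
        Console.WriteLine(DescribeFirstElement("<book id=\"42\" lang=\"en\"/>"));
    }
}
```

After the call to MoveToElement, NodeType and Name describe the element again rather than the last attribute, which is the one backward move the reader allows.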

The Skip Method


Below is a typical loop to scan the content of an XML document:

while (reader.Read())
{
   // Print the type and name of each node on its own line.
   Console.Write("Node Type: ");
   Console.Write(reader.NodeType.ToString());
   Console.Write(", Node Name: ");
   Console.WriteLine(reader.Name);
}

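This loop visits every node it finds on its way. You can use the Skip method, however, to skip the current element together with all its children and jump to the node that follows. Because Skip already advances the reader, the loop must not call Read again right after it, or one node would be lost. For example, in the following code, the reader steps inside the root element and then skips every element whose name is different from MyNode:

reader.MoveToContent();   // position on the root element
reader.Read();            // step inside the root
while (!reader.EOF)
{
   if (reader.NodeType == XmlNodeType.Element && reader.Name != "MyNode")
      reader.Skip();      // skips the subtree and lands on the next node
   else
      reader.Read();
}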
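
To make the effect of Skip more concrete, here is a self-contained sketch that records which elements the reader actually lands on. The SkipDemo class, the VisitedElements helper, and the sample XML are hypothetical; the loop calls either Skip or Read on each pass, never both, because Skip itself moves the reader to the next node.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

static class SkipDemo
{
    // Hypothetical helper: lists the elements the reader lands on
    // below the root when every element not named `keep` is skipped.
    public static List<string> VisitedElements(string xml, string keep)
    {
        List<string> visited = new List<string>();
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            reader.MoveToContent();   // position on the root element
            reader.Read();            // step inside the root
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    visited.Add(reader.Name);
                    if (reader.Name != keep)
                    {
                        // Skip jumps past the element's subtree and
                        // leaves the reader on the following node.
                        reader.Skip();
                        continue;
                    }
                }
                reader.Read();
            }
        }
        finally
        {
            reader.Close();
        }
        return visited;
    }

    static void Main()
    {
        // Sample document invented for illustration only.
        string xml = "<root><Other><Hidden/></Other>"
                   + "<MyNode><Child/></MyNode></root>";
        foreach (string name in VisitedElements(xml, "MyNode"))
            Console.WriteLine(name);
    }
}
```

In this sample, the Hidden element never appears in the list because it sits inside Other, which is skipped as a whole. Child does appear, because the reader descends into MyNode.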

The XmlTextReader class, which inherits from the abstract base class XmlReader, enforces the rules for well-formed XML but doesn't provide XML data validation. It also checks doctype nodes to ensure that they're well-formed and that the syntax of the specified Document Type Definition (DTD) is correct. The XmlTextReader expands entities and checks to ensure that they're well-formed. In no case, however, does it use the DTD to perform validation.
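
One way to see the well-formedness check in action is to read a document to the end and catch the XmlException that the reader throws on malformed input. This is only a sketch: the WellFormedDemo class, the IsWellFormed helper, and the sample strings are assumptions made for the example. The third sample carries an internal DTD that its content does not follow; given that the reader performs no validation, it would be expected to pass the well-formedness check anyway.

```csharp
using System;
using System.IO;
using System.Xml;

static class WellFormedDemo
{
    // Hypothetical helper: true if the reader can walk the whole
    // document without raising a well-formedness error.
    public static bool IsWellFormed(string xml)
    {
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        try
        {
            while (reader.Read()) { }   // just consume every node
            return true;
        }
        catch (XmlException)
        {
            // Thrown for problems such as mismatched tags.
            return false;
        }
        finally
        {
            reader.Close();
        }
    }

    static void Main()
    {
        // Sample documents invented for illustration only.
        Console.WriteLine(IsWellFormed("<a><b></b></a>"));   // properly nested
        Console.WriteLine(IsWellFormed("<a><b></a>"));       // <b> is never closed
        // Well-formed, but the content breaks the DTD's own rule for <a>.
        Console.WriteLine(IsWellFormed(
            "<!DOCTYPE a [<!ELEMENT a EMPTY>]><a><b/></a>"));
    }
}
```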

The XmlTextReader class is a very fast parser because it doesn't perform the extra steps necessary for data validation. To perform data validation, you must use a new derived class—XmlValidatingReader—which I'll review in my next column.
