Contents
Overview
Lesson: Overview of XML Parsing
Lesson: Parsing XML Using XmlTextReader
Lesson: Creating a Custom Reader
Review
Lab 2.1: Parsing XML
Module 2: Parsing XML
Information in this document, including URL and other Internet Web site references, is subject to
change without notice. Unless otherwise noted, the example companies, organizations, products,
domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious,
and no association with any real company, organization, product, domain name, e-mail address,
logo, person, place or event is intended or should be inferred. Complying with all applicable
copyright laws is the responsibility of the user. Without limiting the rights under copyright, no part
of this document may be reproduced, stored in or introduced into a retrieval system, or transmitted
in any form or by any means (electronic, mechanical, photocopying, recording, or otherwise), or
for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Microsoft, the furnishing of this document does not give you any
license to these patents, trademarks, copyrights, or other intellectual property.
©2002 Microsoft Corporation. All rights reserved.
Microsoft, MS-DOS, Windows, Windows NT, Win32, Active Directory, ActiveX, BizTalk,
IntelliSense, JScript, Microsoft Press, MSDN, PowerPoint, SQL Server, Visual Basic, Visual C#,
and Visual Studio are either registered trademarks or trademarks of Microsoft Corporation in the
United States and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks of their
respective owners.
Instructor Notes
After completing this module, students will be able to:
Create a Stream object from an XML file.
Build a mutable string by using the StringBuilder object.
Handle errors caused by XML that is not well-formed.
Parse XML as text by using the XmlTextReader object.
Create a custom XmlReader object.
To teach this module, you need the following materials:
Microsoft® PowerPoint® file 2663A_02.ppt
2663A_02_Code.htm
To prepare to effectively teach this module:
Read the following Microsoft .NET Framework Class Library topics:
• XmlReader Class
• XmlTextReader
• StringBuilder Class
Read all of the materials for this module.
Complete the practices and the lab.
Practice delivering the demonstrations.
In this module, some of the Microsoft PowerPoint® slides provide hyperlinks
that open a code samples page in the Web browser. The code samples page
provides a way to show and discuss code samples when there is not enough
space for the code on the PowerPoint slide. It also allows students to copy code
samples directly from the browser window and paste them into a development
environment. All of the linked code samples for this module are in a single .htm
file.
To open a code sample, click the appropriate hyperlink on the slide. To navigate
between code samples in a particular language, use the table of contents
provided at the top of the code page. Each hyperlink opens a separate instance of
the Web browser, so it is a good practice to click the Back button in Microsoft
Internet Explorer after viewing a code sample. This will close the browser
window and return you to the PowerPoint presentation.
Required materials
Preparation tasks
Hyperlinked Code
Examples
How to Teach This Module
This section contains information that will help you to teach this module.
Lesson: Overview of XML Parsing
This section describes the instructional methods for teaching each topic in this
lesson.
Introduction to XML Parsing
This topic introduces the module by defining the technical problem of parsing
XML. Most students will already understand what parsing is and why they
would do it.

XML Parsing Models
This topic introduces XmlReader by comparing it with the Simple application
programming interface (API) for XML, or SAX, which many students are
already familiar with. Many students should also already be aware of the two
models of XML parsing: the push model and the pull model. This topic
compares SAX, as an example of the push model, to the Microsoft .NET
Framework XmlReader class, as an example of the pull model of XML
processing. As the lesson progresses, if you identify students who have
previous experience writing a SAX application, they might be able to help you
point out the advantages of XmlReader.

Parsing XML with the XmlReader Class
Briefly cover the major features of the XmlReader class. Students might ask
about the technique of using XmlValidatingReader with a
ValidationEventHandler, which is covered in the next module.

How to Read Streams
We cover reading XML from streams early, because it is a basic skill. Be
prepared to provide a definition of a stream.

How to Build Strings from Parsed XML
Another basic skill is building a string from parsed XML by using a
StringBuilder object. StringBuilder is preferred over the String object,
because it uses much less memory when content is appended repeatedly.
StringBuilder also allows you to append content to the string without having
to create a new object for each append.

Lesson: Parsing XML Using XmlTextReader
This section describes the instructional methods for teaching each topic in this
lesson.

Demonstration: Parsing XML
This demonstration shows typical usage of three functions of a
Microsoft Visual Studio® .NET add-in that was custom-built for this course. To
prepare for this demonstration, you should perform the demonstration steps as
they are written and prepare to explain what the add-in does.
Do not walk through the code during the demonstrations. There are separate
code examinations in which you will do just that.
Note
For more information about the add-in, see Appendix A, “The XML
Tools Add-In.”

How to Create an XmlTextReader Object
Show how to instantiate a new XmlTextReader.

How to Navigate Nodes
Discuss the Read() method.
How to Determine the Current Node Type
Discuss the NodeType property.

How to Read the Contents of a Node
Discuss how to use the Name and Value properties, and attribute members such
as HasAttributes and MoveToNextAttribute, to read the contents of a node.

How to Handle White Space
Prepare to define the difference between significant and insignificant white
space.

How to Handle XML Errors While Parsing
Discuss the use of XmlException to handle errors that result from XML that is
not well-formed.

Code Examination: Parsing XML
When you perform code examinations, increase the font size used by the
Visual Studio .NET development environment, especially the font size used by
the Code Editor and the Output window.
To change the display options
1. On the Tools menu, click Options.
2. Click the Text Editor folder, and then click the HTML/XML folder.
3. Select the Word wrap and Line numbers options.
Note
While in the Code window, pressing CTRL+R twice will toggle word
wrap on and off.
4. Click the Environment folder, and then click the Fonts and Colors folder.
5. Change the font used for the Text Editor and the Text Output Tool
Windows to Lucida Console 14 pt.
6. Click OK.
7. Close and restart Visual Studio .NET for the changes to take effect.

Lesson: Creating a Custom Reader
This section describes the instructional methods for teaching each topic in this
lesson.

Why Create a Custom Reader Object?
Be prepared to provide one or two anecdotes that illustrate the need for a
custom reader.

Inheriting from XmlReader
Discuss the types of XmlReader you can inherit from and the mechanics of
overriding the Read() method.

Code Examination: Inheriting from XmlTextReader
Be prepared to explain how the overridden Read() method exposes each
attribute as an element node through the reader's NodeType, Name, and
Value properties.
Overview
Overview of XML Parsing
Parsing XML Using XmlTextReader
Creating a Custom Reader
*****************************
ILLEGAL FOR NON-TRAINER USE******************************
This module discusses how to parse Extensible Markup Language (XML) data
from a file, string, or stream by using the XmlTextReader class. The
XmlNodeReader object is not covered in this module, but works in a similar
way as the XmlTextReader object.
Both the XmlTextReader and XmlNodeReader objects inherit from
XmlReader. If these descendant objects do not provide the needed
functionality, you can create a custom reader object that inherits from
XmlReader.
After completing this module, you will be able to use the Microsoft® .NET
Framework to:
Create a Stream object from an XML file.
Build a mutable string by using the StringBuilder object.
Handle errors caused by XML that is not well-formed.
Parse XML as text by using the XmlTextReader object.
Create a custom XmlReader object.
Introduction
Objectives
Lesson: Overview of XML Parsing
Introduction to XML Parsing
XML Parsing Models
Parsing XML with the XmlReader Class
How to Read Streams
How to Build Strings from Parsed XML
The XmlReader base class and the classes that inherit from it are a powerful
set of tools for parsing XML. This lesson discusses how to use the XmlReader
class and its supporting classes to parse XML in a variety of contexts.
After completing this lesson, you will be able to:
Read XML from a File object.
Read XML from a Stream object.
Store XML in a StringBuilder object.
Introduction
Lesson objectives
Introduction to XML Parsing
Parsing and reading XML mean the same thing
Parse XML to find content and to use node information
Create a list by node type
Sort nodes by namespace identifier
List all of the child elements in an XML source
Find a node by relative position
Find the last node to signal when to stop parsing
What does it mean to parse XML? Parsing refers to the process of reading
XML and then performing some action based on the information read.
When you parse XML, you often filter the data in an attempt to locate a
particular data value or range of values. At other times, you might be more
interested in the node information that the parser finds. The term node, when
used in this context, refers to a node as defined by the World Wide Web
Consortium (W3C) XML Information Set Recommendation available at
Parsing XML allows you to query an XML source to find a particular data
value. For example, suppose that you must build an application that can query a
local store of XML-based human resources data. Parsing the XML should allow
you to find a particular value such as the record that is associated with an
employee number that is equal to “12345.”
Parsing also allows you to filter an XML source to find a set of related
information. For example, you might want to filter a personnel listing to find
those employees whose hire date falls within the current month.
Parsing allows you to use the node information in an XML source, such as the
node type, or node value. The following are useful tasks that you can
accomplish by using node information made available by parsing:
Use node information to create a list by node type
Sort nodes by namespace identifier
List all of the child elements in an XML source
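The first task in the list, creating a list by node type, can be sketched in C# roughly as follows. This is a minimal illustration, not code from the course: the NodeTypeCounter class and CountByNodeType method are hypothetical names, and the sketch assumes a version of the .NET Framework that supports generic collections.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeTypeCounter
{
    // Reads every node that Read() visits and tallies the nodes by type.
    // Attributes are not visited by Read(), so they are not counted here.
    public static Dictionary<XmlNodeType, int> CountByNodeType(string xml)
    {
        Dictionary<XmlNodeType, int> counts = new Dictionary<XmlNodeType, int>();
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                int seen;
                counts.TryGetValue(reader.NodeType, out seen); // seen stays 0 for a new type
                counts[reader.NodeType] = seen + 1;
            }
        }
        return counts;
    }
}
```

Because the loop only inspects reader.NodeType, the same structure could be extended to sort by namespace (reader.NamespaceURI) or to list child elements (reader.Name and reader.Depth).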
Introduction
Find particular content
Make use of node
information
XML Parsing Models
[Figure: XML parsing models. Push model: a SAX XML reader pushes unfiltered XML to the calling application. Pull model: the application generates calls to an XmlReader class (XmlTextReader, XmlNodeReader, or XmlValidatingReader) that pull specific XML. Other labels in the figure: Content Handler, Error Handler, and Node Handler; XmlReader class: pull specified XML and implement error handling; Application: process nodes, handle errors, and monitor the state of the reader.]
XML processors are based on the push model or the pull model of XML
processing. The push model is typified by a processor that uses the Simple
application programming interface (API) for XML, referred to as SAX. The
pull model is typified by how the .NET Framework XML reader classes process
XML.
The push model of XML processing means that the parser “pushes” to the
application an unfiltered, steady stream of parsed XML nodes. SAX is an
example of a parser that does this. SAX pushes unfiltered XML nodes in
response to a request by an application.
You must write applications that consume unfiltered XML nodes so that they
filter out the relevant node information and content. The push model assumes
that the XML is well-formed. If the SAX processor finds an XML error, it
immediately stops processing and then sends an exception to the calling
application. You should write any application that uses the push model of XML
processing to handle a variety of XML errors.
SAX is not supported by the .NET Framework, but you can use existing SAX
tools, such as the Microsoft XML Parser (MSXML), in your .NET-based
programs.
Introduction
What is the push model
of XML processing?
The pull model of XML processing means that the parser pulls from the XML
source only those nodes that it is instructed to pull by a calling application.
XmlReader, a .NET Framework class, is an example of a parser that pulls a
filtered set of XML nodes in response to a request by an application.
XmlReader objects read the XML one node at a time and return a node to the
application only when the application requests it. Similar to the SAX
processor, if an XmlReader object finds an XML error, it throws an
exception to the calling application. A well-formedness error stops
processing. To continue processing after validation errors, use an
XmlValidatingReader with a ValidationEventHandler.
There are two main advantages of using the XmlReader pull model over the
push model as implemented by SAX. First, it is easier to code
applications that use the XmlReader. XmlReader pull-processing is typically
implemented by using looping structures, whereas push models use routines
that handle state. Looping structures are easier to write than routines that handle
state. Although contextual state management is still a challenge with the pull
model, managing the context is easier to code by using consumer-driven
procedural techniques.
Second, applications that use XmlReader can potentially perform better
because they require less processing power and memory than applications that
rely on SAX. Applications that use XmlReader can take advantage of client
hints to make more efficient use of character buffers; for example, by avoiding
needless string copies. Consumers can also selectively process elements; for
example, by skipping elements of no interest and by not expanding entities.
With a push model, everything must be passed through the application, because
the reader has no way of knowing what is important.
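The looping structure and selective processing described above can be sketched as follows. This is an illustrative sketch only: the PullModelSketch class, the ReadTitles method, and the title and notes element names are hypothetical, and the sketch assumes a version of the .NET Framework that supports generic collections.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class PullModelSketch
{
    // Collects the text of each <title> element and skips every <notes>
    // subtree without examining the nodes inside it.
    public static List<string> ReadTitles(string xml)
    {
        List<string> titles = new List<string>();
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            bool more = reader.Read();
            while (more)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "notes")
                {
                    // Skip() leaves the reader on the node that follows </notes>,
                    // so do not call Read() again before testing that node.
                    reader.Skip();
                    more = !reader.EOF;
                    continue;
                }
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "title")
                {
                    // ReadString() returns the element's text and stops on its end tag.
                    titles.Add(reader.ReadString());
                }
                more = reader.Read();
            }
        }
        return titles;
    }
}
```

The application, not the parser, decides which nodes to look at. A push-model handler would instead receive every node inside notes and would have to track state to ignore it.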
If you still prefer to use a push model, you can layer a set of push-style
interfaces on top of the XmlReader pull model, but the reverse is not yet true.
A sample SAX2 implementation layered over an XmlReader may ship with the
.NET Framework software development kit (SDK).
What is the pull model of
XML processing?
The following table summarizes the primary benefits of the pull model.
Benefit                  Description
State management         The push model requires the content handlers to build very
                         complex state machines. The pull model client simplifies
                         state management by means of a natural, top-down
                         procedural refinement.
Multiple input streams   The pull model allows the client to put together multiple
                         input streams. This task is extremely complicated in the
                         push model.
Layering                 You can build the push model on top of the pull model. The
                         reverse is not true.
Efficient data handling  The push model requires data to be written twice. First, the
                         parser writes a string object to its own buffer. Then, the
                         parser pushes the string object to the client buffer.
                         In the pull model, the string is read into the parser buffer
                         one time only.
Selective processing     The push model notifies the client of each item, including
                         attributes, processing instructions, and white space. The
                         pull model client can skip items, processing only those
                         items that are of interest to the application. This allows
                         for extremely efficient applications.
Summary of pull model
benefits
Parsing XML with the XmlReader Class
What is XmlReader?
An abstract base class
Extends to these XML readers: XmlTextReader,
XmlNodeReader, and XmlValidatingReader
Can be used to create customized readers
Non-cached, forward-only, read-only access
Allows you to pull only those nodes that interest you
The XmlReader class is an abstract base class that provides non-cached,
forward-only, read-only access to XML sources, including streams, files, and
Uniform Resource Locators (URLs). It implements the namespace requirements
outlined in the Namespaces in XML Recommendation provided by the W3C,
located at
XmlReader class objects can quickly read data from XML sources without
placing high demands on system resources such as memory and CPU time.
Because XmlReader is an abstract base class, you can use it to create your own
type of reader, or you can use one of the classes that extend XmlReader. The
.NET Framework provides three classes that extend the base class and vary
in their design to support different scenarios.
The following table describes the implementations of the XmlReader class.
Class                    Description
XmlTextReader            Reads character streams. This is a forward-only reader
                         that has methods that return data on content and node
                         types.
XmlNodeReader            Provides a parser over an XML Document Object Model
                         (DOM) API.
XmlValidatingReader      Provides a fully compliant validating or non-validating
                         XML parser with Document Type Definition (DTD),
                         XML Schema Definition language (XSD) schema, or
                         XML-Data Reduced (XDR) schema support. This class
                         takes an XmlTextReader and layers validation services
                         on top.
Any class that inherits  Allows developer-defined derivations of the
from XmlReader           XmlReader.
Introduction
Benefits
Classes that inherit from
XmlReader
By using the XmlReader class members, you can develop a solution that can
respond conditionally to node information in the XML source.
XmlReader class objects read XML by stepping through it one node at a time.
As each node is read, the program can perform actions based on the qualities of
that node. Such qualities include the type of the node, its attributes and data,
and other node information.
As an additional benefit, XmlReader class objects check that the XML is
well-formed. If the XML contains an error, XmlReader objects throw an
exception of the type XmlException, and the processing stops.
To continue processing after an error occurs, you must use an
XmlValidatingReader with a ValidationEventHandler instead.
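The XmlException behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions: the WellFormedCheck class and FindFirstError method are hypothetical names, and the format of the returned message is chosen only for this example.

```csharp
using System;
using System.IO;
using System.Xml;

public static class WellFormedCheck
{
    // Returns null when the XML is well-formed. Otherwise returns a short
    // description of the first error that the reader reports.
    public static string FindFirstError(string xml)
    {
        try
        {
            using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
            {
                while (reader.Read())
                {
                    // No work per node; reading is enough to trigger the check.
                }
            }
            return null;
        }
        catch (XmlException ex)
        {
            // Processing has stopped; the exception carries the error location.
            return "Line " + ex.LineNumber + ", position " + ex.LinePosition + ": " + ex.Message;
        }
    }
}
```

For example, FindFirstError("<a><b/></a>") returns null, whereas FindFirstError("<a><b></a>") returns an error description because the b element is never closed.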
For a complete description of the members of the XmlReader class, see
XmlReader Members in the Additional Reading folder.
To use an XmlReader object or any of its derived classes in your
application, you must provide a reference to the .NET Framework System.Xml
namespace.
Conditional processing
Robust XML error
handling
Note
How to Read Streams
A stream is an abstraction of bytes drawn from any
number of sources
A stream may be created from a file, URL, or another
stream
Use a StreamReader to read a stream
object = new Stream( file | string | stream )
Your application can use the classes contained in the System.IO namespace to
read XML data from a stream or from a file. The terms file and stream convey a
particular meaning within the .NET Framework.
The term file is used here in the ordinary sense: an ordered and named
collection of a particular sequence of bytes having persistent storage. When you
program an application to read XML from a file, you must consider directory
paths, disk storage, and file and directory names.
To simplify the job of programming an application to read files, you can use
.NET Framework file and directory system input and output classes. The
following table describes the file and directory System.IO classes.
System.IO class  Description
File             Provides static methods to create, copy, move, and open files.
                 Aids in the creation of FileStream objects. The FileInfo class
                 provides instance methods.
Directory        Provides static methods to create, move, and enumerate
                 directories and subdirectories. The DirectoryInfo class
                 provides instance methods.
TextReader       Represents a reader that can read a sequential series of
                 characters. TextReader is designed for character input,
                 whereas the Stream class is designed for byte input and output.
StreamReader     Implements a TextReader that reads characters from a byte
                 stream in a particular encoding. StreamReader is designed for
                 character input in a particular encoding, whereas the Stream
                 class is designed for byte input and output.
Introduction
What is a file?
System.IO classes for
reading files
A stream is an abstraction of a sequence of bytes. The bytes themselves can
originate from any number of sources, such as a file, an input/output device, an
interprocess communication pipe, or a Transmission Control Protocol/Internet
Protocol (TCP/IP) socket. Examples of streams include network, memory, and
tape streams.
The Stream class and its derived classes provide a generic view of a sequence
of bytes. Using a stream simplifies the job of programming read operations of
XML that might originate from various operating systems and devices.
Streams involve the following fundamental operations:
Streams can be read from. Reading is the transfer of data from a stream into
a data structure, such as an array of bytes.
Streams can be written to. Writing is the transfer of data from a data
structure into a stream.
Streams can support seeking. Seeking is the querying and modifying of the
current position within a stream.
Depending on the underlying data source or repository, streams might support
only some of these capabilities.
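The three fundamental operations can be sketched with an in-memory stream. This is a minimal illustration, not code from the course: the StreamSketch class and RoundTrip method are hypothetical names, and MemoryStream is used only because it supports reading, writing, and seeking.

```csharp
using System;
using System.IO;
using System.Text;

public static class StreamSketch
{
    // Writes text into a stream, seeks back to the start, and reads it out again.
    public static string RoundTrip(string text)
    {
        byte[] data = Encoding.UTF8.GetBytes(text);
        using (MemoryStream stream = new MemoryStream())
        {
            // Writing: transfer bytes from a data structure into the stream.
            stream.Write(data, 0, data.Length);

            // Seeking: not every stream supports it, so check CanSeek first.
            if (stream.CanSeek)
            {
                stream.Seek(0, SeekOrigin.Begin);
            }

            // Reading: a StreamReader decodes the bytes back into characters.
            using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```

A network stream, by contrast, would typically report false for CanSeek, which is why the sketch checks the capability before it calls Seek.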
What is a stream?
In this example, a StreamReader object is created by using the File class. The
following code example, provided in both the Microsoft Visual Basic® and C#
languages, reads an entire text file line by line.
All code samples assume that any required namespaces are imported at the
top of the code file. For example, to use the classes within the System.IO
namespace, the following statement is required:
' Visual Basic
Imports System.IO
// C#
using System.IO;
' Visual Basic
Dim BooksFilename As String = "c:\books.txt"
If File.Exists(BooksFilename) Then
    Dim BooksReader As StreamReader = _
        File.OpenText(BooksFilename)
    Dim CurrentLine As String = BooksReader.ReadLine()
    While Not CurrentLine Is Nothing
        ' process line
        CurrentLine = BooksReader.ReadLine()
    End While
    BooksReader.Close()
End If

// C#
string BooksFilename = @"c:\books.txt";
if (File.Exists(BooksFilename)) {
    StreamReader BooksReader = File.OpenText(BooksFilename);
    string CurrentLine = BooksReader.ReadLine();
    while (CurrentLine != null) {
        // process line
        CurrentLine = BooksReader.ReadLine();
    }
    BooksReader.Close();
}
For more information, search the .NET Framework Class Library for the
keywords Stream Class.
When using the classes in the System.IO namespace, you must satisfy the
operating system security requirements, such as access control lists (ACLs), for
access to be allowed. This requirement is in addition to any FileIOPermission
requirements.
Example
Note
System.IO and security
How to Build Strings from Parsed XML
The String object is immutable
Do NOT use when concatenating in a loop
The StringBuilder object is mutable
To build a string with the StringBuilder class, use the
Append() method inside a loop
Use the ToString() method to retrieve the string
It is typical for an application that reads XML to build strings to hold filtered
data. A reader object is normally inserted into a looping structure. In such a
case, each time the loop iterates, the reader object reads another node or set of
nodes and then copies the data into a string object. For a large XML file, the
loop might iterate thousands of times and build a result composed of tens of
thousands of XML nodes.
When you want to modify a string without creating a new object, consider using
the System.Text.StringBuilder class instead of the String class. For example,
using the StringBuilder class can boost performance when concatenating many
strings together in a loop.
The System.Text.StringBuilder class represents a mutable string of characters.
This means that you can modify the contents of a StringBuilder object. The
value is said to be mutable because it can be modified after it has been created,
by appending, removing, replacing, or inserting characters.
At first glance, you might decide to use the String class to
concatenate XML fragments that originate from an XML reader. However,
this would be a mistake, because the String class is designed to represent an
immutable series of characters. This means that you cannot simply append new
characters to a String object each time a reader iterates through a looping
structure. Each append creates a new instance of the String object and can
easily result in highly expensive XML source processing. However, in the case
of reading a file into a stream, an appropriate first step is to load the file into a
String object. The stream can then load the XML from the String object.
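The combination of a reader loop and a StringBuilder can be sketched as follows. This is a minimal illustration under stated assumptions: the ElementNameList class and ListElementNames method are hypothetical names, and collecting element names is only one example of filtered data that an application might build up.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class ElementNameList
{
    // Appends the name of every element to a single StringBuilder,
    // one name per line, and returns the finished string.
    public static string ListElementNames(string xml)
    {
        StringBuilder names = new StringBuilder();
        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    // Append modifies the same StringBuilder on every iteration.
                    names.Append(reader.Name).Append('\n');
                }
            }
        }
        return names.ToString();
    }
}
```

However many nodes the loop visits, only one StringBuilder is created, and ToString() is called once at the end.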
Introduction
What is the
StringBuilder class?
Note
The following example initializes a new instance of the StringBuilder class by
using the specified string, and then creates a string containing the 12 Times
Table by using a for loop:
' Visual Basic
Dim sb As New StringBuilder("12 Times Table:")
Dim i As Integer
For i = 1 To 12
    ' Chain Append calls so that no temporary strings are concatenated
    sb.Append(vbCrLf).Append(i).Append(" x 12 = ").Append(i * 12)
Next
MessageBox.Show(sb.ToString())

' Do NOT use the String class, for example
' (separate snippet: each += creates a new String object)
Dim s As String = "12 Times Table:"
Dim i As Integer
For i = 1 To 12
    s += vbCrLf & i & " x 12 = " & i * 12
Next
MessageBox.Show(s)

// C#
StringBuilder sb = new StringBuilder("12 Times Table:");
for (int i = 1; i <= 12; i++) {
    // Chain Append calls so that no temporary strings are concatenated
    sb.Append("\n").Append(i).Append(" x 12 = ").Append(i * 12);
}
MessageBox.Show(sb.ToString());

// Do NOT use the String class, for example
// (each += creates a new String object)
string s = "12 Times Table:";
for (int i = 1; i <= 12; i++) {
    s += "\n" + i + " x 12 = " + i * 12;
}
MessageBox.Show(s);
Example
Lesson: Parsing XML Using XmlTextReader
Demonstration: Parsing XML
How to Create an XmlTextReader Object
How to Navigate Nodes
How to Determine the Current Node Type
How to Read the Contents of a Node
How to Handle White Space
How to Handle XML Errors While Parsing
Code Examination: Parsing XML
Practice: Reading XML Content and Nodes
The node information in an XML source is an important resource that you can
use in applications that process XML. You can use node information not only to
find particular content, but also as a very useful basis for the logic that controls
program flow. In this lesson, you will learn how to find and use XML node
information in your applications.
After completing this lesson, you will be able to:
Navigate through XML nodes by using the Read() method.
Determine the current node type and extract information about the current
node.
Read the attributes of an element node.
Handle white space in an XML document.
Implement XML error handling while parsing.
Introduction
Lesson objectives
Demonstration: Parsing XML
The XML Tools add-in:
Parses
Filters
Converts
In this demonstration, you will see the parsing and filtering functionality of the
XML Tools add-in. Compiled release versions of the add-in written in both
Microsoft Visual C#™ and Microsoft Visual Basic® languages are available in
the following folders:
install_folder\Democode\Addins\XmlToolsAddinCS\XmlToolsAddinCSSetup\Release\
install_folder\Democode\Addins\XmlToolsAddinVB\XmlToolsAddinVBSetup\Release\
To install the add-in
1. Double-click the setup.exe file in one of the folders above.
2. Follow the instructions in the wizard.
For detailed installation instructions see Appendix A.
To parse a sample XML file that is open in the editor
1. In Microsoft Visual Studio® .NET, open the files named books.xml and
employee.xml. These are located in the folder
install_folder\Democode\Mod02\.
2. Switch to the employee.xml file to make it the active window and then
examine the following:
<?xml version="1.0"?>
<employees>
<employee fname="Nancy" lname="Davolio" alias="nancyd" />
<employee fname="Andrew" lname="Fuller" alias="andrewf" />
</employees>
3. On the XML Tools toolbar, click Parse.
Introduction
Demonstration
4. Notice that the Output window opens, showing detailed information about
the employee.xml file. Each node in the XML file appears as a row in the
details table, and a count of the number of each type of node appears in the
summary table.
Parsing: C:\ \Democode\Mod02\employee.xml
DEPTH|PREFIX |NODETYPE       |NAME      |VALUE
     |       |               |          |
    0|       |XmlDeclaration |xml       |version="1.0"
    0|       |Whitespace     |          |{CrLf}
    0|       |Element        |employees |
    1|       |Whitespace     |          |{CrLf}
    1|       |Element        |employee  |
    1|       |Whitespace     |          |{CrLf}
    1|       |Element        |employee  |
    1|       |Whitespace     |          |{CrLf}
    0|       |EndElement     |employees |
STATISTICS
XmlDeclaration: 1
ProcessingInstruction: 0
DocumentType: 0
Comment: 0
Element: 3
Attribute: 6
Text: 0
Whitespace: 4
To parse another sample XML file
1. Click Solution Explorer to make it active.
2. On the XML Tools toolbar, click Parse. Because no XML file is active, a
dialog box appears prompting the user to choose one of the open files.
3. In the Parse dialog box, click the file named books.xml, and then click OK.
The Output window opens showing detailed information about the file.
4. Use the Output window to verify the answers provided to the following
questions:
a. What is the Depth of the Text node with a value of Benjamin?
4
b. How many elements are there in total?
18
c. How many attributes are there in total?
9
To parse a sample XML file in the internal browser
1. On the View menu, click Web Browser, and then click Show Browser (or
press Ctrl+Alt+R).
2. Make sure that Set web links to internally or externally opened is set to
internal.
3. Enter the following URL in the Web toolbar:
http://localhost/2663/Democode/Mod02/books.xml
4. On the XML Tools toolbar, click Parse. This demonstrates that the add-in
can parse any XML-compliant file that is accessible on the Internet.
To filter by specifying a child element value
1. On the XML Tools toolbar, click Filter.
2. If the add-in prompts to select a file, click books.xml, and then click OK.
3. In the Filter dialog box, enter the following options, and then click OK.
Option                        Value
Return elements named         book
Where a child element named   first-name
is equal to                   Herman
4. Notice that the Output window shows the one book that matches the filter.
To filter by specifying an attribute value
1. On the XML Tools toolbar, click Filter.
2. If the add-in prompts you to select a file, click books.xml, and then click
OK.
3. In the Filter dialog box, enter the following options, and then click OK.
Option                        Value
Return elements named         book
Where an attribute named      publicationdate
is greater than               1980
4. Notice that the Output window shows the two books that match the filter.
To convert the active file and save the result to a file
1. On the XML Tools toolbar, click Convert.
2. If the add-in prompts you to select a file, click books.xml, and then click OK.
The Output window shows books.xml with all of its attributes converted to
elements.
3. Click the Output window to make it active, and then on the File menu, click
Save Output As.
4. Save the output as BooksAsElements.xml in the folder
install_folder\Democode\Mod02\.
How to Create an XmlTextReader Object
XmlTextReader BooksReader =
new XmlTextReader(@"c:\books.xml");
Use the XmlTextReader constructor
Possible parameters:
Stream
String
TextReader
URL
The XmlTextReader class is an implementation of XmlReader and provides a
high-performance parser. It enforces the rule that XML must be well-formed. It
is neither a validating nor a non-validating parser, because it does not have
DTD or schema information. It can read text in blocks or read characters from a
stream.
The XmlTextReader can read data from the following inputs:
Stream
String
TextReader
URL identifying a local file location or a Web site
The following code can be used to construct an XmlTextReader.
' Visual Basic
Dim BooksReader As _
New XmlTextReader("c:\books.xml")
// C#
XmlTextReader BooksReader =
new XmlTextReader(@"c:\books.xml");
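The other input types listed above can be tried without a file on disk. The following is a minimal sketch, assuming a small in-memory document; the class name, variable names, and sample XML are hypothetical and are not part of the course files. It constructs one reader from a TextReader and one from a Stream.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

class ReaderInputsSketch
{
    static void Main()
    {
        // Hypothetical in-memory document, used here instead of c:\books.xml.
        string xml = "<books><book genre=\"novel\" /></books>";

        // TextReader input: a StringReader wraps the XML text.
        XmlTextReader fromTextReader = new XmlTextReader(new StringReader(xml));

        // Stream input: a MemoryStream that holds the UTF-8 bytes of the same text.
        Stream stream = new MemoryStream(Encoding.UTF8.GetBytes(xml));
        XmlTextReader fromStream = new XmlTextReader(stream);

        // Each reader starts before the first node; Read() moves it to <books>.
        fromTextReader.Read();
        fromStream.Read();
        Console.WriteLine(fromTextReader.Name);
        Console.WriteLine(fromStream.Name);

        fromTextReader.Close();
        fromStream.Close();
    }
}
```

Both readers expose the same nodes, because the input type only changes where the characters come from, not how they are parsed.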
Introduction
XmlTextReader constructor
How to Navigate Nodes
Read() methods, for example,
Read(), ReadStartElement(), and so on
Determine the end of the file by checking the Boolean
return value of the Read() methods
' Visual Basic
While BooksReader.Read()
' process current node
End While
// C#
while (BooksReader.Read()) {
// process current node
}
You can advance the reader by calling one of the Read() methods. Each call to
Read() moves the reader to the next node, so Read() is typically called inside
a while loop.
The current XML node is the node on which the reader is currently positioned.
All methods called and actions taken are performed with respect to that current
node, and all properties retrieved reflect the value of the current node.
When an XmlReader is first initialized, there is no current node, so the first
call to Read() moves to the first node in the document. When an XmlReader
reaches the end of the document, it does not walk off the end and leave itself
in an indeterminate state. Instead, Read() returns a Boolean false when there
are no more nodes to process.
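The reader's state before the first Read() and after the last one can be observed through the ReadState and EOF properties. The following is a minimal sketch, assuming a tiny in-memory document; the class name and sample XML are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

class ReadStateSketch
{
    static void Main()
    {
        // Hypothetical document with no whitespace between tags.
        XmlTextReader reader = new XmlTextReader(new StringReader("<a><b/></a>"));

        // Before the first Read() there is no current node.
        Console.WriteLine(reader.ReadState);   // Initial
        Console.WriteLine(reader.NodeType);    // None

        // Read() returns false once there are no more nodes.
        int nodes = 0;
        while (reader.Read())
        {
            nodes++;
        }

        // Three nodes were visited: <a>, <b/>, and </a>.
        Console.WriteLine(nodes);              // 3
        Console.WriteLine(reader.EOF);         // True

        reader.Close();
    }
}
```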
XmlReader also offers several other Read() methods that can provide
contextual checks along the way. For example, if you want to make sure that
the current node is an element with a specific name before continuing, you can
use ReadStartElement().
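As an illustration of such a contextual check, the following minimal sketch uses ReadStartElement() to confirm the element name before moving on. The class name, sample XML, and element names are hypothetical and are not taken from the course's books.xml.

```csharp
using System;
using System.IO;
using System.Xml;

class ReadStartElementSketch
{
    static void Main()
    {
        // Hypothetical document used only for this illustration.
        string xml = "<book><title>Sample Title</title></book>";
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));

        // Each call checks the element name, then advances to the next node.
        reader.ReadStartElement("book");
        reader.ReadStartElement("title");

        // The reader is now positioned on the text inside <title>.
        Console.WriteLine(reader.Value);       // Sample Title

        try
        {
            // The current node is text, not a <price> element, so this throws.
            reader.ReadStartElement("price");
        }
        catch (XmlException ex)
        {
            Console.WriteLine("Unexpected node: " + ex.Message);
        }

        reader.Close();
    }
}
```

Because a failed check raises an XmlException, code that expects a fixed document layout can detect unexpected input early instead of silently reading the wrong node.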
In the following example, the Read() method is called in a while loop until it
returns false, indicating that the end of the file has been reached.
' Visual Basic
While BooksReader.Read()
' process current node
End While
// C#
while (BooksReader.Read()) {
// process current node
}
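To make "process current node" concrete, the following sketch prints one row per node and a count per node type, in the spirit of the details and summary tables shown in the demonstration. It is not the add-in's actual implementation; the class name, the sample XML, and the {CrLf} and {Lf} substitutions are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

class NodeDumpSketch
{
    static void Main()
    {
        // Hypothetical stand-in for a small XML file; not the course's employee.xml.
        string xml = "<employees>\r\n<employee id=\"1\" name=\"A\" />\r\n</employees>";
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));

        Dictionary<XmlNodeType, int> counts = new Dictionary<XmlNodeType, int>();
        int attributeCount = 0;

        Console.WriteLine("DEPTH|PREFIX|NODETYPE|NAME|VALUE");
        while (reader.Read())
        {
            // Show line breaks as markers so that each node stays on one row.
            string value = reader.Value.Replace("\r\n", "{CrLf}").Replace("\n", "{Lf}");
            Console.WriteLine("{0}|{1}|{2}|{3}|{4}",
                reader.Depth, reader.Prefix, reader.NodeType, reader.Name, value);

            // Tally node types; TryGetValue leaves seen at 0 for a new type.
            int seen;
            counts.TryGetValue(reader.NodeType, out seen);
            counts[reader.NodeType] = seen + 1;

            // Read() does not stop on attributes, so count them per element.
            if (reader.NodeType == XmlNodeType.Element)
            {
                attributeCount += reader.AttributeCount;
            }
        }
        reader.Close();

        Console.WriteLine("STATISTICS");
        foreach (KeyValuePair<XmlNodeType, int> entry in counts)
        {
            Console.WriteLine("{0}: {1}", entry.Key, entry.Value);
        }
        Console.WriteLine("Attribute: {0}", attributeCount);
    }
}
```

Attributes are counted through AttributeCount because Read() moves from node to node without visiting attributes as separate nodes.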
Introduction
Current node position
Read methods
Example