Professional ASP.NET 2.0 XML, Part 3

}
}
</script>
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
<title>Writing XML File</title>
</head>
<body>
<form id="form1" runat="server">
<div>
<asp:label id="lblResult" runat="server" />
</div>
</form>
</body>
</html>
Listing 4-8 uses "urn:employees-wrox" as the namespace and "emp" as the namespace prefix. If
you navigate to the page in Listing 4-8 in a browser, you will see the output shown in Figure 4-6.
Figure 4-6
In Listing 4-8, you supplied the namespace as an argument to the WriteStartElement() method, as
shown in the following code.
writer.WriteStartElement("employee", "urn:employees-wrox");
You can also accomplish this effect using the following two lines of code.
string prefix = writer.LookupPrefix("urn:employees-wrox");
writer.WriteStartElement(prefix, "employee", null);
Chapter 4
By leveraging the LookupPrefix() method, you can store the namespace prefix in a local variable
and then supply it as an argument to methods such as WriteStartElement(). The advantage of
this approach is that you don’t have to supply the namespace to each of the creation methods; you
simply supply the prefix obtained through the LookupPrefix() method to the creation methods.
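The prefix lookup described above can be tried outside a page. The following is a minimal console sketch rather than the book's listing: the root element name employees and the names LookupPrefixSketch and Build are assumptions made for illustration. It also passes the namespace explicitly when reusing the prefix, so the result does not depend on how a given writer resolves a null namespace.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class; not part of the book's sample code.
public static class LookupPrefixSketch
{
    // Builds a small namespaced document and returns it as a string.
    public static string Build()
    {
        StringWriter output = new StringWriter();
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            // Declare the "emp" prefix on an assumed root element.
            writer.WriteStartElement("emp", "employees", "urn:employees-wrox");

            // Ask the writer which prefix is currently bound to the namespace.
            string prefix = writer.LookupPrefix("urn:employees-wrox");

            // Reuse that prefix for a child element.
            writer.WriteStartElement(prefix, "employee", "urn:employees-wrox");
            writer.WriteEndElement();
            writer.WriteEndElement();
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```

The child element should come out with the same emp prefix as the root, without the prefix string being repeated in the code.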
Writing Images Using XmlWriter
The techniques described in the previous sections can also be used with any sort of binary data that can
be expressed with an array of bytes, including images. This section provides you with an example and
demonstrates how to embed a JPEG image in an XML document. The structure of the sample XML
document is extremely simple. It consists of a single employee node, and inside that node there is an
image node holding the binary image data plus an attribute containing the original file name. Code
required for implementing this is shown in Listing 4-9.
Listing 4-9: Embedding an Image in an XML Document
<%@ Page Language="C#" %>
<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.IO" %>
<script runat="server">
    void Page_Load(object sender, EventArgs e)
    {
        string xmlFilePath = @"C:\Data\Employees.xml";
        string imageFileName = @"C:\Data\Employee.jpg";
        try
        {
            using (XmlWriter writer = XmlWriter.Create(xmlFilePath))
            {
                //Start writing the XML document
                writer.WriteStartDocument(false);
                writer.WriteStartElement("employee");
                writer.WriteAttributeString("id", "1");
                writer.WriteStartElement("image");
                writer.WriteAttributeString("fileName", imageFileName);
                //Get the size of the file
                FileInfo fi = new FileInfo(imageFileName);
                int size = (int)fi.Length;
                //Read the JPEG file; the using blocks close the stream even if an error occurs
                byte[] imgBytes;
                using (FileStream stream = new FileStream(imageFileName, FileMode.Open, FileAccess.Read))
                using (BinaryReader reader = new BinaryReader(stream))
                {
                    imgBytes = reader.ReadBytes(size);
                }
                //Write the JPEG data as BinHex-encoded text
                writer.WriteBinHex(imgBytes, 0, imgBytes.Length);
                writer.WriteEndElement();
                writer.WriteEndElement();
                writer.WriteEndDocument();
                //Flush the writer so that the XML data is written to the file
                writer.Flush();
                lblResult.Text = "File is written successfully";
            }
        }
        catch (Exception ex)
        {
            lblResult.Text = "An Exception occurred: " + ex.Message;
        }
    }
</script>
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>Writing Images using XmlWriter</title>
</head>
<body>
    <form id="form1" runat="server">
        <div>
            <asp:label id="lblResult" runat="server" />
        </div>
    </form>
</body>
</html>
Listing 4-9 uses the FileInfo class to determine the size of the JPEG file. FileInfo is a helper
class in the System.IO namespace that allows you to retrieve information about individual files.
The contents of the Employee.jpg file are read using the ReadBytes method of the BinaryReader
class. The contents are then encoded as BinHex and written to the XML document. Figure 4-7 shows
the output produced by the code.
Figure 4-7
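To see what BinHex encoding does to the bytes, and how a consumer might get them back, the following sketch round-trips a few bytes in memory. It is an illustration under assumptions, not part of the book's code: the class and method names are made up, the four sample bytes merely stand in for real JPEG data, and the reading side uses XmlReader.ReadElementContentAsBinHex, which this section does not cover.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class; names are illustrative only.
public static class BinHexRoundTripSketch
{
    // Writes the bytes into an <image> element as BinHex text.
    public static string Encode(byte[] data)
    {
        StringWriter output = new StringWriter();
        using (XmlWriter writer = XmlWriter.Create(output))
        {
            writer.WriteStartElement("image");
            writer.WriteBinHex(data, 0, data.Length);
            writer.WriteEndElement();
        }
        return output.ToString();
    }

    // Reads the <image> element back into a byte array.
    public static byte[] Decode(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("image");
            MemoryStream result = new MemoryStream();
            byte[] buffer = new byte[1024];
            int read;
            // The method returns 0 once the element content is exhausted.
            while ((read = reader.ReadElementContentAsBinHex(buffer, 0, buffer.Length)) > 0)
            {
                result.Write(buffer, 0, read);
            }
            return result.ToArray();
        }
    }

    public static void Main()
    {
        byte[] sample = new byte[] { 0xFF, 0xD8, 0xFF, 0xE0 }; // stand-in for image bytes
        string xml = Encode(sample);
        Console.WriteLine(xml);
        Console.WriteLine(Decode(xml).Length);
    }
}
```

BinHex writes two hexadecimal characters per byte, so the element content is about twice the size of the original file. WriteBase64 is the more compact alternative when size matters.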
Summary
This chapter introduced you to .NET’s XML-handling capabilities. The .NET architecture provides the
most complete, integrated support platform for XML yet from Microsoft, and it makes many otherwise
daunting tasks much easier to accomplish. This chapter introduced you to the SAX and DOM methods
for processing XML, and showed you how the Microsoft approach attempts to marry these two
approaches using a model that provides the benefits of both.
Specifically, you learned how to read XML using the XmlReader class, and how to use the
XmlReaderSettings object in conjunction with the XmlReader object to customize the behavior of the
XmlReader object. You learned how to use the XmlWriter class to write XML data files, which greatly
reduces the amount of information that an application has to keep track of when writing XML. Finally,
you learned how to use the XmlWriter object to create namespaces and embed images in an XML
document.
As you can see, after you know the basics of reading an XML file with the XmlReader, it’s very easy to
begin using its built-in constructs to extract and manipulate XML data to your precise needs. Hopefully
this chapter gave you the motivation to start writing your own XML applications. XML is clearly going
to play a large role in future Web development, and learning these skills is essential to the success of any
Web application developer. As an exercise to better understand how this works, I recommend taking
your own XML markup and writing a similar script to extract element and attribute values from it. After
all, practice makes perfect!
Reading and Writing XML Data Using XmlReader and XmlWriter
XML Data Validation
In the previous chapters, you saw how to read XML files and how to check whether they are
well-formed and valid. This chapter takes a step into more advanced territory by looking at how
to validate XML data as it is being read. It discusses the different types of XML validation using
the classes in the System.Xml namespace, and it provides an in-depth discussion of the .NET Schema
Object Model with examples of how to programmatically create and read XML schemas. Specifically,
this chapter will cover:
❑ XML validation support provided by the .NET Framework 2.0
❑ How to validate an XML file using the XmlReaderSettings class in conjunction with the
XmlReader class
❑ How to take advantage of the XmlSchemaSet class to cache XML schemas and then use
them to validate XML files
❑ How to perform XML DOM validation through the XmlNodeReader class
❑ How to use inline schemas to validate XML data
❑ How to validate XML data using DTDs
❑ Visual Studio’s support for creating XSD schemas
❑ How to programmatically read XSD schemas using XmlSchema
❑ How to programmatically create XSD schemas
❑ How to programmatically infer XSD schema from an XML file
The next section starts by reviewing the validation support provided by the .NET Framework 2.0.
XML Validation
Validation is the process of enforcing rules on XML content by means of an XSD schema, a DTD, or an
XDR schema. There are two ways to define a structure for an XML document, sometimes called a vocabulary:
DTDs and XML schemas. Using an XML schema is a newer and somewhat more flexible technique than
using a DTD, but both approaches are in common use. A DTD or schema may be embedded within an
XML file, but more often it will be contained in a separate file. An XML processing program, called a parser,
can check an XML document against its DTD or schema to see if it follows the rules; this process is called
validation. An XML file that follows all the rules in its DTD or schema is said to be valid.
The XML schema file is usually an XML-Data Reduced (XDR) or XML Schema Definition language
(XSD) file. XSD schema-based validation is the industry-accepted standard and is the primary method of
XML validation in most applications. Although validation of XML data using DTDs is now found
mostly in legacy applications, this chapter provides you with an example of how to use DTDs for XML
validation.
Validation Types Supported in .NET Framework 2.0
In the .NET Framework, there are a number of ways you can perform validation of XML data. Before
discussing those validation types, it is important to understand the key differences between the
validation mechanisms (DTD, XDR, and XSD) supported by the .NET Framework.
❑ DTD: A text file whose syntax stems directly from the Standard Generalized Markup
Language (SGML), the ancestor of XML as we know it today. A DTD follows a custom, non-XML
syntax to define the set of valid tags, the attributes each tag can support, and the dependencies
between tags. A DTD allows you to specify the children for each tag, their cardinality,
their attributes, and a few other properties for both tags and attributes. Cardinality specifies the
number of occurrences of each child element.
❑ XDR: A schema language based on a proposal submitted by Microsoft to the W3C back in
1998. XDRs are flexible and overcome some of the limitations of DTDs. Unlike DTDs, XDRs describe
the structure of the document using the same syntax as the XML document. Additionally, in a
DTD, all the data content is character data. XDR language schemas allow you to specify the data
type of an element or an attribute. Note that XDR never reached recommendation status.
❑ XSD: Defines the elements and attributes that form an XML document. Each element is
strongly typed. Based on a W3C recommendation, XSD describes the structure of XML documents
using another XML document. XSDs include an all-encompassing type system composed
of primitive and derived types. The XSD type system is also at the foundation of the Simple
Object Access Protocol (SOAP) and XML Web services.
As mentioned, XDR is an early hybrid specification that never reached the status of a W3C recommendation
because it evolved into XSD. The .NET classes support XDR mostly for backward compatibility; however, XDR
is fully supported by the Component Object Model (COM)-based Microsoft XML Core Services (MSXML)
parser.
Chapter 5
The .NET Framework provides a handy utility, named xsd.exe, that among other things can automatically
convert an XDR schema to XSD. If you pass an XDR schema file (typically, a file with an .xdr
extension), xsd.exe converts the XDR schema to an XSD schema, as shown here:
xsd.exe Authors.xdr
The output file has the same name as the XDR schema, but with the .xsd extension.
XML Data Validation Using XSD Schemas

An XML document contains elements, attributes, and values of primitive data types. Throughout this
chapter, I will use an XML document named Authors.xml, which is shown in Listing 5-1.
Listing 5-1: Authors.xml File
<?xml version="1.0"?>
<authors>
<author>
<au_id>172-32-1176</au_id>
<au_lname>White</au_lname>
<au_fname>Johnson</au_fname>
<phone>408 496-7223</phone>
<address>10932 Bigge Rd.</address>
<city>Menlo Park</city>
<state>CA</state>
<zip>94025</zip>
<contract>true</contract>
</author>
<author>
<au_id>213-46-8915</au_id>
<au_lname>Green</au_lname>
<au_fname>Marjorie</au_fname>
<phone>415 986-7020</phone>
<address>309 63rd St. #411</address>
<city>Oakland</city>
<state>CA</state>
<zip>94618</zip>
<contract>true</contract>
</author>
</authors>
An XSD schema defines elements, attributes, and the relationships between them. It conforms to the
W3C XML schema standards and recommendations. The XSD schema for the Authors.xml document is
Authors.xsd, which is shown in Listing 5-2.
DTD was considered the cross-platform standard until a few years ago. The W3C
then formalized a newer standard, XSD, which is, technically speaking, far
superior to DTD. Today, XSD is supported by almost all parsers on all platforms.
Although the support for DTD will not be deprecated anytime soon, you’ll be better
positioned if you start migrating to XSD or building new XML-driven applications
based on XSD instead of DTD or XDR.
Listing 5-2: Authors.xsd File
<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="authors">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" name="author">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="au_id" type="xs:string" />
              <xs:element name="au_lname" type="xs:string" />
              <xs:element name="au_fname" type="xs:string" />
              <xs:element name="phone" type="xs:string" />
              <xs:element name="address" type="xs:string" />
              <xs:element name="city" type="xs:string" />
              <xs:element name="state" type="xs:string" />
              <xs:element name="zip" type="xs:unsignedInt" />
              <xs:element name="contract" type="xs:boolean" />
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
The .NET Framework 2.0 classes support the W3C XML schema recommendation. The classes that are
commonly employed to validate an XML document are XmlReader, XmlReaderSettings,
XmlSchemaSet, and XmlNodeReader. The sequence of steps to validate an XML document using an
XSD schema is as follows.
Steps for Validating an XML Document
❑ A ValidationEventHandler event handler method is defined.
❑ An instance of the XmlReaderSettings class is created. The XmlReaderSettings class allows
you to specify a set of options that will be in effect on the XmlReader object when it parses
XML data. Note that the XmlReaderSettings class renders the XmlValidatingReader class
(used with .NET 1.x) obsolete.
❑ The previously defined ValidationEventHandler method is associated with the
XmlReaderSettings object.
❑ The ValidationType property of the XmlReaderSettings object is set to ValidationType.Schema.
❑ An XSD schema is added to the XmlReaderSettings object through its Schemas property.
❑ The XmlReader object validates the XML document while parsing the XML data using the Read
method.
Validation Event Handler
The ValidationEventHandler event is used to register an event handler that receives notification
of XSD schema validation errors. Validation errors and warnings are reported through the
ValidationEventHandler callback function. Validation errors do not stop parsing; parsing stops
only if the XML document is not well-formed. If you do not provide a validation event handler and
a validation error occurs, however, an exception is thrown. Using the validation event callback
mechanism to trap validation errors enables all of them to be discovered in a single pass.
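The difference between the two behaviors can be sketched with a small console program. This is an illustration under assumptions, not the chapter's sample: the one-element schema, the class name, and the method names are hypothetical, and both the schema and the XML are kept in strings so the code does not depend on any files.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper class; names are illustrative only.
public static class ValidationHandlerSketch
{
    // A tiny assumed schema: <zip> must contain an unsigned integer.
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='zip' type='xs:unsignedInt' />" +
        "</xs:schema>";

    static XmlReaderSettings CreateSettings()
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
        return settings;
    }

    // With a handler: errors are collected and parsing continues.
    public static List<string> ValidateWithHandler(string xml)
    {
        List<string> errors = new List<string>();
        XmlReaderSettings settings = CreateSettings();
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs args) { errors.Add(args.Message); };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }

    // Without a handler: a validation error surfaces as an exception.
    public static bool ThrowsWithoutHandler(string xml)
    {
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml), CreateSettings()))
            {
                while (reader.Read()) { }
            }
            return false;
        }
        catch (XmlSchemaException)
        {
            return true;
        }
    }

    public static void Main()
    {
        string invalid = "<zip>not-a-number</zip>";
        Console.WriteLine(ValidateWithHandler(invalid).Count);
        Console.WriteLine(ThrowsWithoutHandler(invalid));
    }
}
```

Catching the base XmlSchemaException type keeps the sketch independent of which derived exception a particular framework version throws.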
Role of XmlReaderSettings Class in XML Validation
The XmlReaderSettings class, along with the XmlReader class, provides the core foundation for
validating XML data. Table 5-1 provides a brief recap of the validation-related properties of the
XmlReaderSettings class that will be utilized later in this chapter.
Table 5-1. Validation Related Properties and Events of XmlReaderSettings Class
Property                  Description
ProhibitDtd               Indicates whether DTD processing is prohibited. The default value is
                          true, meaning that DTD validation is not supported unless you set
                          the property to false.
ValidationType            Specifies the type of validation performed by the XmlReader object.
                          The permitted validation types are DTD, Schema, and None.
ValidationEventHandler    Specifies an event handler that will receive information about
                          validation events.
ValidationFlags           Specifies additional validation settings such as use of inline
                          schemas, identity constraints, and XML attributes that will be
                          enforced when validating the XML data.
Schemas                   Gets or sets the XmlSchemaSet object that represents the
                          collection of schemas to be used for performing schema
                          validation.
To validate XML data using the XmlReaderSettings class, you need to set the properties of the
XmlReaderSettings class to appropriate values. This class does not operate on its own, but works in
conjunction with an XmlReader or XmlNodeReader instance. You can use this class to validate against
either a DTD or an XML schema.
An XML Validation Example
Now that you have a general understanding of the steps involved in validating XML data, it is time to
look at an example to understand how it actually works. Listing 5-3 utilizes the Authors.xsd schema
file to validate the Authors.xml file.
Listing 5-3: Validating XML Data Using XSD Schemas
<%@ Page Language="C#"%>
<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Xml.Schema" %>
<script runat="server">
    private StringBuilder _builder = new StringBuilder();
    void Page_Load(object sender, EventArgs e)
    {
        string xmlPath = Request.PhysicalApplicationPath +
            @"\App_Data\Authors.xml";
        string xsdPath = Request.PhysicalApplicationPath +
            @"\App_Data\Authors.xsd";
        XmlReader reader = null;
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationEventHandler += new
            ValidationEventHandler(this.ValidationEventHandler);
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(null, XmlReader.Create(xsdPath));
        reader = XmlReader.Create(xmlPath, settings);
        //Reading to the end of the document validates every node
        while (reader.Read())
        {
        }
        //Release the file handle held by the reader
        reader.Close();
        if (_builder.ToString() == String.Empty)
            Response.Write("Validation completed successfully.");
        else
            Response.Write("Validation Failed. <br>" + _builder.ToString());
    }
    void ValidationEventHandler(object sender, ValidationEventArgs args)
    {
        _builder.Append("Validation error: " + args.Message + "<br>");
    }
</script>
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>XSD Validation</title>
</head>
<body>
    <form id="form1" runat="server">
        <div>
        </div>
    </form>
</body>
</html>
Before examining the code, look at Figure 5-1, which shows the output produced by Listing 5-3.
Figure 5-1
To start, Listing 5-3 declares variables that hold the paths of the XML and XSD schema files. It then
creates an instance of the XmlReaderSettings class and associates a validation event handler callback
method with it.
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationEventHandler += new
ValidationEventHandler(this.ValidationEventHandler);
Then, it sets the ValidationType property of the XmlReaderSettings class to
ValidationType.Schema signaling the XmlReader object to validate the XML data using the supplied
XSD schema as it parses the XML data.
settings.ValidationType = ValidationType.Schema;
In addition to the schema, the ValidationType enumeration also supports other values that are shown
in Table 5-2.
Table 5-2. ValidationType Enumeration Values
Value     Description
DTD       Indicates that the validation will be performed using a DTD
None      No validation is performed and as a result no validation
          errors are thrown
Schema    Validates the XML document according to XML schemas,
          including inline XSD schemas

The code then adds the Authors.xsd file to the schemas collection of the XmlReaderSettings object.
After that it invokes the static Create method of the XmlReader class, passing in the path of the
Authors.xml file and the XmlReaderSettings object. The Create method returns an instance of the
XmlReader class, which actually performs the validation using a DTD or an XML schema when
parsing the document.
settings.Schemas.Add(null, XmlReader.Create(xsdPath));
reader = XmlReader.Create(xmlPath, settings);
Because the XmlReader object is created by passing the XmlReaderSettings object to the Create
method, the settings configured on the XmlReaderSettings object apply to the XmlReader object. The
Read method of the XmlReader object is then invoked in a while loop so that the entire XML file can
be read and validated. The ValidationEventHandler method is invoked whenever a validation error
occurs. Inside this method, the validation error message is appended to a StringBuilder object.
If a validation event handler is not provided, an XmlSchemaException is thrown when a validation
error occurs.
Handling Exceptions in XML Validation
In Listing 5-3, whenever a validation error occurs, control is automatically transferred to the
ValidationEventHandler method, which handles the error by appending the validation error message
(obtained through the Message property of the ValidationEventArgs object) to a StringBuilder
object. Finally, this error message is displayed to the user if the StringBuilder object contains any
messages at all. Although this is sufficient for the purposes of this example, there are times when you
may want to differentiate between the types of events, such as warnings and errors, generated during
validation. To accomplish this, you check the Severity property of the ValidationEventArgs object.
This property returns a value of the XmlSeverityType enumeration, which can be used to determine the
type of the generated event. This enumeration contains the values shown in Table 5-3.
Table 5-3. XmlSeverityType Enumeration Values
Value      Description
Error      Indicates that a validation error occurred when validating the
           instance document. This can be the result of validation using DTDs
           and XSD schemas. If there is no validation event handler to handle
           this situation, an exception is thrown.
Warning    Indicates that a validating parser has run into a situation that is not
           an error but may be important enough to warn the user about.
           Warning differs from Error in that it doesn’t result in an exception
           being thrown to the calling application.
For example, if you want to handle only the errors generated during the validation process, you can
accomplish that by using the following code.
private void ValidationEventHandler(object sender, ValidationEventArgs args)
{
if (args.Severity == XmlSeverityType.Error)
{
//Add code to handle the errors
}
}
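A fuller version of this filtering idea might look like the sketch below. It rests on assumptions that go beyond the text: the one-element schema, the class name, and the two result lists are hypothetical, and the ReportValidationWarnings flag is switched on because warnings are not reported by default. A document whose namespace has no schema in the set is used here as one way to provoke a warning.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper class; names are illustrative only.
public static class SeveritySketch
{
    // A tiny assumed schema: <zip> must contain an unsigned integer.
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='zip' type='xs:unsignedInt' />" +
        "</xs:schema>";

    // Validates the XML and sorts each reported event by its severity.
    public static void Validate(string xml, List<string> errors, List<string> warnings)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        // Warnings are only delivered to the handler when this flag is set.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs args)
            {
                if (args.Severity == XmlSeverityType.Error)
                    errors.Add(args.Message);
                else
                    warnings.Add(args.Message);
            };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
    }

    public static void Main()
    {
        List<string> errors = new List<string>();
        List<string> warnings = new List<string>();
        // Wrong data type: expected to be reported as an error.
        Validate("<zip>CA</zip>", errors, warnings);
        // Namespace with no schema in the set: expected to be reported as a warning.
        Validate("<other xmlns='urn:other' />", errors, warnings);
        Console.WriteLine("Errors: " + errors.Count + ", warnings: " + warnings.Count);
    }
}
```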
A Cache for Schemas

In the XmlReaderSettings class, the Schemas property represents a collection, that is, an instance of
the XmlSchemaSet class that allows you to store one or more schemas that you plan to use later for
validation. Using the schema collection improves overall performance because the various schemas are
held in memory and don’t need to be loaded each and every time validation occurs. You can add as many
XSD schemas as you want, but bear in mind that the collection must be completed before the first
Read call is made.
To add a new schema to the cache, you use the Add() method of the XmlSchemaSet object. The method
has a few overloads, as follows:
public void Add(XmlSchemaSet);
public XmlSchema Add(XmlSchema);
public XmlSchema Add(string, string);
public XmlSchema Add(string, XmlReader);
The first overload populates the current collection with all the schemas defined in the given collection.
The remaining three overloads build from different data and return an instance of the XmlSchema
class, the .NET Framework class that contains the definition of an XSD schema.
Populating the Schema Collection
The schema collection actually consists of instances of the XmlSchema class, a kind of compiled version
of the schema. The various overloads of the Add method allow you to create an XmlSchema object from a
variety of input arguments. For example, consider the following method:
public XmlSchema Add(string ns, string url);
This method creates and adds a new schema object to the collection. The compiled schema object is
created using the namespace URI associated with the schema and the URL of the source.
You can check whether a schema is already in the schema collection by using the Contains() method.
The Contains() method can take either an XmlSchema object or a string representing the namespace
URI associated with the schema. Note that the XmlSchemaSet class holds only XSD schemas; XDR
schemas are not supported by this class.
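As a rough illustration of Add() and Contains(), the sketch below loads one schema from a string and then queries the set. The schema content, the urn:authors target namespace, and the class and method names are assumptions chosen for the example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper class; names are illustrative only.
public static class SchemaSetSketch
{
    // An assumed schema with a target namespace of urn:authors.
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema' " +
        "targetNamespace='urn:authors'>" +
        "<xs:element name='authors' type='xs:string' />" +
        "</xs:schema>";

    public static XmlSchemaSet Build()
    {
        XmlSchemaSet schemaSet = new XmlSchemaSet();
        // Passing null as the namespace makes the set use the schema's own targetNamespace.
        schemaSet.Add(null, XmlReader.Create(new StringReader(Xsd)));
        return schemaSet;
    }

    public static void Main()
    {
        XmlSchemaSet schemaSet = Build();
        // Look up by namespace URI.
        Console.WriteLine(schemaSet.Contains("urn:authors"));
        Console.WriteLine(schemaSet.Contains("urn:missing"));
        Console.WriteLine(schemaSet.Count);
    }
}
```

Checking Contains() before calling Add() is a simple way to avoid loading the same schema file more than once when the set is shared between requests.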
Validating XML Data Using XmlSchemaSet Class
The XmlSchemaSet class represents a cache of XML schemas. It allows you to compile multiple schemas
for the same target namespace into a single logical schema.
Before taking a look at an example, I will provide a brief overview of the important properties and
methods of the XmlSchemaSet class. Table 5-4 provides a listing of the important properties of the
XmlSchemaSet class.
The XmlSchemaSet class replaces the XmlSchemaCollection class, which was the class
of choice when caching schemas in .NET Framework 1.x. The new XmlSchemaSet class
not only provides much better standards compliance but also increased performance.
Table 5-4. Important Properties of the XmlSchemaSet Class
Property            Description
Count               Gets the count of logical XSD schemas contained in the XmlSchemaSet
GlobalAttributes    Gets reference to all the global attributes in all the XSD schemas
                    contained in the XmlSchemaSet
GlobalElements      Gets reference to all the global elements in all the XSD schemas
                    contained in the XmlSchemaSet
GlobalTypes         Gets all of the global simple and complex types in all the XSD
                    schemas contained in the XmlSchemaSet
IsCompiled          Indicates if the XSD schemas in the XmlSchemaSet have been
                    already compiled
Table 5-5 discusses the important methods of the XmlSchemaSet class.
Table 5-5. Important Methods of the XmlSchemaSet Class
Method       Description
Add          Adds the given XSD schema to the XmlSchemaSet
Compile      Compiles the XSD schemas added to the XmlSchemaSet class
             into a single logical schema that can then be used for validation
             purposes
Contains     Allows you to check if the supplied XSD schema is in the XmlSchemaSet
Remove       Removes the specified XSD schema from the XmlSchemaSet
Reprocess    Reprocesses an XSD schema that already exists in the XmlSchemaSet
Listing 5-4 shows you an example of how to utilize the XmlSchemaSet class for validating XML data.
Listing 5-4: Validating XML Data Using XmlSchemaSet Class
<%@ Page Language="C#"%>
<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Xml.Schema" %>
<script runat="server">
    private StringBuilder _builder = new StringBuilder();
    void Page_Load(object sender, EventArgs e)
    {
        string xmlPath = Request.PhysicalApplicationPath +
            @"\App_Data\Authors.xml";
        string xsdPath = Request.PhysicalApplicationPath +
            @"\App_Data\Authors.xsd";
        XmlSchemaSet schemaSet = new XmlSchemaSet();
        schemaSet.Add(null, xsdPath);
        XmlReader reader = null;
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationEventHandler += new
            ValidationEventHandler(this.ValidationEventHandler);
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemaSet;
        reader = XmlReader.Create(xmlPath, settings);
        //Reading to the end of the document validates every node
        while (reader.Read())
        {
        }
        //Release the file handle held by the reader
        reader.Close();
        if (_builder.ToString() == String.Empty)
            Response.Write("Validation completed successfully.");
        else
            Response.Write("Validation Failed. <br>" + _builder.ToString());
    }
    void ValidationEventHandler(object sender, ValidationEventArgs args)
    {
        _builder.Append("Validation error: " + args.Message + "<br>");
    }
</script>
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>XSD Validation using XmlSchemaSet</title>
</head>
<body>
    <form id="form1" runat="server">
        <div>
        </div>
    </form>
</body>
</html>
In Listing 5-4, after an instance of the XmlSchemaSet class is created, its Add method is invoked to
add the Authors.xsd schema to the XmlSchemaSet object.
XmlSchemaSet schemaSet = new XmlSchemaSet();
schemaSet.Add(null, xsdPath);
After the schema is added to the XmlSchemaSet, you simply set the Schemas property of the
XmlReaderSettings object to the XmlSchemaSet object.
settings.Schemas = schemaSet;
You then invoke the Read method of the XmlReader object to parse the XML data in a loop. As in
the previous example, the parser stops only if the XML data is not well-formed. By not stopping for
validation errors, you are able to find all the validation errors in one pass without having to repeatedly
parse the XML document. If you navigate to the page using a browser, you will see the same output as
shown in Figure 5-1.
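The caching benefit shows up when one XmlSchemaSet is built once and reused for several documents. The following console sketch illustrates that idea under assumptions: the trimmed-down schema (only au_id and zip), the two sample documents, and all class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper class; names are illustrative only.
public static class SharedSchemaSetSketch
{
    // A shortened, assumed version of the authors schema.
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='authors'><xs:complexType><xs:sequence>" +
        "<xs:element name='author' maxOccurs='unbounded'><xs:complexType><xs:sequence>" +
        "<xs:element name='au_id' type='xs:string' />" +
        "<xs:element name='zip' type='xs:unsignedInt' />" +
        "</xs:sequence></xs:complexType></xs:element>" +
        "</xs:sequence></xs:complexType></xs:element>" +
        "</xs:schema>";

    // Loads and compiles the schema once so it can be shared.
    public static XmlSchemaSet LoadSchemas()
    {
        XmlSchemaSet schemaSet = new XmlSchemaSet();
        schemaSet.Add(null, XmlReader.Create(new StringReader(Xsd)));
        schemaSet.Compile();
        return schemaSet;
    }

    // Validates one document against the shared set and counts the errors.
    public static int CountErrors(XmlSchemaSet schemas, string xml)
    {
        int errorCount = 0;
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = schemas;
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs args) { errorCount++; };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return errorCount;
    }

    public static void Main()
    {
        XmlSchemaSet schemas = LoadSchemas();
        string good = "<authors><author><au_id>172-32-1176</au_id><zip>94025</zip></author></authors>";
        string bad = "<authors><author><au_id>213-46-8915</au_id><zip>CA</zip></author></authors>";
        Console.WriteLine(CountErrors(schemas, good));
        Console.WriteLine(CountErrors(schemas, bad));
    }
}
```

In a Web application, the compiled set could be kept in a static field or the application cache so that each request skips the schema loading step.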
XML DOM Validation
Currently, if you have data stored in an XmlDocument object, the only type of validation you can
perform is load-time validation. You do this by passing a validating reader object such as an
XmlReader object into the Load method. If you make any changes after loading, however, load-time
validation cannot ensure that the data still conforms to the schema. Using the XmlNodeReader class,
which reads data stored in an XmlNode object, you can validate a DOM object by passing the
XmlNodeReader to the Create method. Listing 5-5 shows you an example of how to accomplish this.

Listing 5-5: Performing XML DOM Validation
<%@ Page Language="C#"%>
<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Xml.Schema" %>
<script runat="server">
    private StringBuilder _builder = new StringBuilder();
    void Page_Load(object sender, EventArgs e)
    {
        string xmlPath = Request.PhysicalApplicationPath +
            @"\App_Data\Authors.xml";
        string xsdPath = Request.PhysicalApplicationPath +
            @"\App_Data\Authors.xsd";
        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.Load(xmlPath);
        XmlElement authorElement = (XmlElement)
            xmlDoc.DocumentElement.SelectSingleNode
            ("//authors/author[au_id='172-32-1176']");
        //Add an attribute that the schema does not declare
        authorElement.SetAttribute("test", "test");
        XmlNodeReader nodeReader = new XmlNodeReader(xmlDoc);
        XmlReader reader = null;
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationEventHandler += new
            ValidationEventHandler(this.ValidationEventHandler);
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(null, XmlReader.Create(xsdPath));
        reader = XmlReader.Create(nodeReader, settings);
        //Reading to the end of the in-memory document validates every node
        while (reader.Read())
        {
        }
        reader.Close();
        if (_builder.ToString() == String.Empty)
            Response.Write("Validation completed successfully.");
        else
            Response.Write("Validation Failed. <br>" + _builder.ToString());
    }
    void ValidationEventHandler(object sender, ValidationEventArgs args)
    {
        _builder.Append("Validation error: " + args.Message + "<br>");
    }
</script>
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title>DOM Validation</title>
</head>
<body>
    <form id="form1" runat="server">
        <div>
        </div>
    </form>
</body>
</html>
Listing 5-5 illustrates how an XmlNodeReader object created from the XmlDocument object (which in
turn is loaded from the Authors.xml document) has XML schema validation support layered on top
while reading.
Before the XmlDocument object is read through an XmlNodeReader object, the Authors.xml file is loaded
into an XmlDocument and modified in memory by adding an attribute called "test".
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(xmlPath);
XmlElement authorElement = (XmlElement)
xmlDoc.DocumentElement.SelectSingleNode
("//authors/author[au_id='172-32-1176']");
authorElement.SetAttribute("test", "test");
The XML document is then passed to an XmlNodeReader, which in turn is passed to the
factory-created XmlReader object.
reader = XmlReader.Create(nodeReader, settings);
When the validating reader parses the document, it can validate any changes made to it. Because an
invalid attribute is added to the XmlDocument object, validation against the XSD schema will fail and
you will see an output that is somewhat similar to Figure 5-2.
Figure 5-2
As you can see from Figure 5-2, the XML validation has failed because the modified XML data is not in
compliance with the XSD schema.
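The same technique can be exercised without files or a Web page. The console sketch below is an illustration under assumptions: it uses a shortened schema and document held in strings, and the class and method names are hypothetical. It validates the in-memory document before and after an undeclared attribute is added.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper class; names are illustrative only.
public static class DomValidationSketch
{
    // A shortened, assumed version of the authors schema.
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "<xs:element name='authors'><xs:complexType><xs:sequence>" +
        "<xs:element name='author' maxOccurs='unbounded'><xs:complexType><xs:sequence>" +
        "<xs:element name='au_id' type='xs:string' />" +
        "</xs:sequence></xs:complexType></xs:element>" +
        "</xs:sequence></xs:complexType></xs:element>" +
        "</xs:schema>";

    // Validates the current in-memory state of the document.
    public static List<string> ValidateDom(XmlDocument doc)
    {
        List<string> errors = new List<string>();
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
        settings.ValidationEventHandler +=
            delegate(object sender, ValidationEventArgs args) { errors.Add(args.Message); };
        // Wrap the node reader in a validating reader created by the factory method.
        using (XmlReader reader = XmlReader.Create(new XmlNodeReader(doc), settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }

    public static XmlDocument LoadSample()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<authors><author><au_id>172-32-1176</au_id></author></authors>");
        return doc;
    }

    public static void Main()
    {
        XmlDocument doc = LoadSample();
        Console.WriteLine("Errors before change: " + ValidateDom(doc).Count);

        // Modify the document in memory with an attribute the schema does not declare.
        XmlElement author = (XmlElement)doc.SelectSingleNode("/authors/author");
        author.SetAttribute("test", "test");
        Console.WriteLine("Errors after change: " + ValidateDom(doc).Count);
    }
}
```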
XML Validation Using Inline Schemas
If you want, you can embed an XML schema at the top of an XML data file. This gives you a single XML
file for transport that includes data and validation requirements. This is called an inline schema. An
interesting phenomenon takes place when the XML schema is embedded in the same XML document
being validated, as in the case of inline schemas. In this case, the schema appears as a constituent part of
the source document. In particular, it is a direct child of the document root element.
The schema is an XML subtree that is logically placed at the same level as the document to validate. A well-formed XML document, though, cannot have two roots. Thus an all-encompassing root node must be created with two children: the schema and the document. You will see an example of this in Listing 5-6, which introduces a new XML element at the root called <root>. This element contains the XSD schema as well as the XML data to be validated as its children.
Listing 5-6: XML File That Contains the Inline XSD Schema
<?xml version="1.0"?>
<root xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:x="urn:authors">
  <!-- Start of Schema -->
  <xs:schema targetNamespace="urn:authors">
    <xs:element name="authors">
      <xs:complexType>
        <xs:sequence>
          <xs:element maxOccurs="unbounded" name="author">
            <xs:complexType>
              <xs:sequence>
                <xs:element name="au_id" type="xs:string" />
                <xs:element name="au_lname" type="xs:string" />
                <xs:element name="au_fname" type="xs:string" />
                <xs:element name="phone" type="xs:string" />
                <xs:element name="address" type="xs:string" />
                <xs:element name="city" type="xs:string" />
                <xs:element name="state" type="xs:string" />
                <xs:element name="zip" type="xs:unsignedInt" />
                <xs:element name="contract" type="xs:boolean" />
              </xs:sequence>
            </xs:complexType>
          </xs:element>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:schema>
  <!-- End of Schema -->
  <x:authors>
    <author>
      <au_id>172-32-1176</au_id>
      <au_lname>White</au_lname>
      <au_fname>Johnson</au_fname>
      <phone>408 496-7223</phone>
      <address>10932 Bigge Rd.</address>
      <city>Menlo Park</city>
      <state>CA</state>
      <zip>94025</zip>
      <contract>true</contract>
    </author>
    <author>
      <au_id>213-46-8915</au_id>
      <au_lname>Green</au_lname>
      <au_fname>Marjorie</au_fname>
      <phone>415 986-7020</phone>
      <address>309 63rd St. #411</address>
      <city>Oakland</city>
      <state>CA</state>
      <zip>94618</zip>
      <contract>true</contract>
    </author>
  </x:authors>
</root>
Note that in Listing 5-6, the root element cannot be successfully validated because there is no schema information about it. When the ValidationType property is set to ValidationType.Schema, the XmlReader class raises a warning for the root element if an inline schema is detected. Be aware of this when you set up your validation code: an error filter that is too strict (one that treats warnings as failures) could flag a perfectly legal XML document as invalid when the XSD is embedded. Listing 5-7 shows the code required to validate the XML data against the inline XSD schema contained in the same file.
Listing 5-7: Validating XML Data through Inline XSD Schema
<%@ Page Language="C#"%>
<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Xml.Schema" %>
<script runat="server">
    private StringBuilder _builder = new StringBuilder();
    void Page_Load(object sender, EventArgs e)
    {
        string xmlPath = Request.PhysicalApplicationPath +
            @"\App_Data\Authors_InlineSchema.xml";
        XmlReader reader = null;
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.ValidationEventHandler += new
            ValidationEventHandler(this.ValidationEventHandler);
        // Add each option with bitwise OR so the default flags are preserved
        settings.ValidationFlags |=
            XmlSchemaValidationFlags.ProcessInlineSchema;
        settings.ValidationFlags |=
            XmlSchemaValidationFlags.ReportValidationWarnings;
        reader = XmlReader.Create(xmlPath, settings);
        // Reading to the end of the document drives the validation
        while (reader.Read())
        {
        }
        if (_builder.ToString() == String.Empty)
            Response.Write("Validation completed successfully.");
        else
            Response.Write("Validation Failed. <br>" + _builder.ToString());
    }
    void ValidationEventHandler(object sender, ValidationEventArgs args)
    {
        // Ignore warnings (such as the one raised for <root>); record errors only
        if (args.Severity == XmlSeverityType.Error)
        {
            _builder.Append("Validation error: " + args.Message + "<br>");
        }
    }
</script>
<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
    <title>Inline XSD Schema Validation</title>
</head>
<body>
    <form id="form1" runat="server">
    <div>
    </div>
    </form>
</body>
</html>
The code that enables the ProcessInlineSchema and ReportValidationWarnings options is what differentiates this listing from the previous listings. To specify the schema options used by the XmlReaderSettings class, you set its ValidationFlags property to a combination of values from the XmlSchemaValidationFlags enumeration. Because the enumeration is a set of bit flags, each option is added to the current value with the bitwise OR operator (|=); using &= instead would clear flags rather than set them. The following lines of code accomplish this.
settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessInlineSchema;
settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
In addition to the values used in this example, the XmlSchemaValidationFlags enumeration also
provides values shown in Table 5-6.
Table 5-6. XmlSchemaValidationFlags Enumeration Values
Value                        Description
AllowXmlAttributes           Allows xml:* attributes (such as xml:lang) even if they are not defined in the schema
None                         Applies no schema validation options: identity constraints, inline schemas, and schema location hints are not processed, and validation warnings are not reported
ProcessIdentityConstraints   Processes identity constraints such as xs:ID, xs:IDREF, xs:key, xs:keyref, and xs:unique that are encountered during validation
ProcessInlineSchema          Processes inline schemas that are encountered during validation
ProcessSchemaLocation        Processes schema location hints such as xsi:schemaLocation and xsi:noNamespaceSchemaLocation that are encountered during validation
ReportValidationWarnings     Reports schema validation warnings that are encountered during validation
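Because these values are bit flags, the operator used to combine them matters. The short sketch below is an illustration only (the class name ValidationFlagsSketch and its methods are hypothetical); it assumes the documented defaults for XmlReaderSettings.ValidationFlags, namely ProcessIdentityConstraints and AllowXmlAttributes, and contrasts adding options with |= against masking them with &=.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

public static class ValidationFlagsSketch
{
    // Adds the two options on top of whatever flags are already set.
    public static XmlSchemaValidationFlags WithOr()
    {
        var settings = new XmlReaderSettings();
        settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessInlineSchema;
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        return settings.ValidationFlags;
    }

    // Masks the current flags: only bits present on BOTH sides survive,
    // so an option that was not already set stays unset.
    public static XmlSchemaValidationFlags WithAnd()
    {
        var settings = new XmlReaderSettings();
        settings.ValidationFlags &= XmlSchemaValidationFlags.ProcessInlineSchema;
        return settings.ValidationFlags;
    }

    public static bool Has(XmlSchemaValidationFlags value, XmlSchemaValidationFlags flag)
    {
        return (value & flag) == flag;
    }

    public static void Main()
    {
        Console.WriteLine("With |= : " + WithOr());
        Console.WriteLine("With &= : " + WithAnd());
    }
}
```

If inline schemas appear to be ignored at run time, checking the final value of ValidationFlags in this way is a quick first diagnostic.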
Notice the use of the XmlSeverityType enumeration in the ValidationEventHandler to filter out the warnings generated by the parser. These warnings arise because the root element that contains the inline schema has no schema information of its own and is therefore not validated.
if (args.Severity == XmlSeverityType.Error)
{
_builder.Append("Validation error: " + args.Message + "<br>");
}
The check for XmlSeverityType.Error ensures that only errors are captured inside the validation event handler.
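To see the difference between warnings and errors in isolation, consider the following console sketch. It is an assumption-laden illustration rather than the book's code: the class name InlineSchemaSketch and the much smaller inline schema are invented for the example, and lambda syntax from later C# versions is used for brevity. The handler sorts each validation event into one of two lists by severity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class InlineSchemaSketch
{
    // Hypothetical document: <root> wraps an inline schema and the data it describes.
    const string Xml = @"<root xmlns:xs='http://www.w3.org/2001/XMLSchema' xmlns:x='urn:authors'>
  <xs:schema targetNamespace='urn:authors'>
    <xs:element name='authors'>
      <xs:complexType>
        <xs:sequence>
          <xs:element name='author' type='xs:string' maxOccurs='unbounded' />
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:schema>
  <x:authors>
    <author>Johnson White</author>
  </x:authors>
</root>";

    // Reads the document and sorts validation events by severity.
    public static void Validate(List<string> errors, List<string> warnings)
    {
        var settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessInlineSchema;
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler += (sender, args) =>
        {
            if (args.Severity == XmlSeverityType.Error)
                errors.Add(args.Message);
            else
                warnings.Add(args.Message);
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(Xml), settings))
        {
            while (reader.Read()) { }
        }
    }

    public static void Main()
    {
        var errors = new List<string>();
        var warnings = new List<string>();
        Validate(errors, warnings);
        Console.WriteLine("Errors: " + errors.Count);
        foreach (string warning in warnings)
            Console.WriteLine("Warning: " + warning);
    }
}
```

Keeping the warnings in a separate list, instead of discarding them, makes it easy to log them during development while still treating only errors as validation failures.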
Although XML schema as a format is a widely accepted specification, the same cannot be said for inline schemas. The general guideline is to avoid inline XML schemas whenever possible. Keeping the schema in a separate file saves bandwidth (the schema is transferred at most once) and shields you from unpleasant surprises such as the root-element warnings just described. With the XmlReaderSettings object, you can preload the schemas into a schema cache and use them when parsing the source XML data.
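One way to preload schemas is to build an XmlSchemaSet once and assign it to the Schemas property of each XmlReaderSettings instance. The sketch below shows the idea under stated assumptions: the class name SchemaCacheSketch, the tiny schema, and the CountErrors helper are all hypothetical, and a real application would load its schemas from files rather than from a string constant.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaCacheSketch
{
    // Hypothetical schema kept in a string so the example is self-contained.
    const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='authors'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='author' type='xs:string' maxOccurs='unbounded' />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Built and compiled once, then shared by every validation run.
    static readonly XmlSchemaSet Cache = BuildCache();

    static XmlSchemaSet BuildCache()
    {
        var set = new XmlSchemaSet();
        set.Add(null, XmlReader.Create(new StringReader(Xsd)));
        set.Compile();
        return set;
    }

    // Validates one document against the cached schemas and counts the errors.
    public static int CountErrors(string xml)
    {
        int errors = 0;
        var settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = Cache;
        settings.ValidationEventHandler += (sender, args) =>
        {
            if (args.Severity == XmlSeverityType.Error) errors++;
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }

    public static void Main()
    {
        Console.WriteLine(CountErrors("<authors><author>Marjorie Green</author></authors>"));
        Console.WriteLine(CountErrors("<authors><writer>Marjorie Green</writer></authors>"));
    }
}
```

The schema is parsed and compiled a single time, so repeated validations pay only for reading the XML data itself.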
Using DTDs
DTD validation guarantees that the source document complies with the validity constraints defined in a separate file, the DTD. A DTD file uses a formal grammar to describe both the structure and the syntax of XML documents. XML authors use DTDs to narrow the set of tags and attributes allowed in their documents. Validating against a DTD ensures that processed documents conform to the specified structure. From a language perspective, a DTD defines a new, stricter XML-based markup language tailor-made for a related group of documents.
Historically speaking, the DTD was the first tool capable of defining the structure of a document. The DTD standard was developed a few decades ago to work side by side with SGML, a recognized ISO standard for defining markup languages. SGML is considered the ancestor of today’s XML, which actually sprang to life in the late 1990s as a way to simplify the too-rigid architecture of SGML.
DTDs use their own, non-XML syntax to define markup constructs as well as additional definitions such as numeric and character entities. You can think of DTDs as an early form of an XML schema. Although doomed to obsolescence, DTD is today supported by virtually all XML parsers. An XML document is associated with a DTD by using the special DOCTYPE declaration. The validating parser (for example, the XmlReader class with the appropriate options set in the XmlReaderSettings class) recognizes this declaration and extracts the schema information from it. The DOCTYPE declaration can either contain an inline DTD or reference an external DTD file.
Developing a DTD Grammar
To build a DTD, you normally start by writing the file according to the DTD syntax. In this case, however, you start from an XML file named Authors_DTD.xml that will be validated through a DTD file. The Authors_DTD.xml file is shown in Listing 5-8.
Listing 5-8: Authors_DTD.xml File That Uses DTD Validation
<?xml version="1.0"?>
<!DOCTYPE authors SYSTEM "Authors.dtd">
<authors>
  <author>
    <au_id>172-32-1176</au_id>
    <au_lname>White</au_lname>
    <au_fname>Johnson</au_fname>
    <phone>408 496-7223</phone>
    <address>10932 Bigge Rd.</address>
    <city>Menlo Park</city>
    <state>CA</state>
    <zip>94025</zip>
    <contract>true</contract>
  </author>
  <author>
    <au_id>213-46-8915</au_id>
    <au_lname>Green</au_lname>
    <au_fname>Marjorie</au_fname>
    <phone>415 986-7020</phone>
    <address>309 63rd St. #411</address>
    <city>Oakland</city>
    <state>CA</state>
    <zip>94618</zip>
    <contract>true</contract>
  </author>
</authors>
Any XML document that must be validated against a given DTD file includes a DOCTYPE declaration through which it links to the DTD of choice, as shown here:
<!DOCTYPE authors SYSTEM "Authors.dtd">
The word following DOCTYPE names the document type described by the DTD. This information is extremely important for the validation process: if the document type name does not match the root element of the document, a validation error is raised. The text following the SYSTEM keyword is the URL from which the DTD will actually be loaded.
Listing 5-9 demonstrates a DTD that is tailor-made for the preceding XML document.
Listing 5-9: DTD for Validating the Authors_DTD.xml
<!ELEMENT authors (author+)>
<!ELEMENT author (au_id,au_lname,au_fname,phone,address,city,state,zip,contract)>
<!ELEMENT au_id (#PCDATA)>
<!ELEMENT au_lname (#PCDATA)>
<!ELEMENT au_fname (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
<!ELEMENT address (#PCDATA)>
<!ELEMENT city (#PCDATA)>
<!ELEMENT state (#PCDATA)>
<!ELEMENT zip (#PCDATA)>
<!ELEMENT contract (#PCDATA)>
The ELEMENT declaration identifies an element. An element declaration has the following syntax:
<!ELEMENT element-name (element-content)>
Elements with only character data are declared with #PCDATA inside parentheses. Elements with one or more children are declared with the names of the child elements inside parentheses. For example, an element that contains one child is declared as follows:
<!ELEMENT element-name (child-element-name)>
An element that contains multiple children is declared as follows:
<!ELEMENT element-name (child1, child2, childn)>
A suffix such as + (one or more), as in (author+) from Listing 5-9, specifies how often a child can occur. After the parent element is declared, each child element is declared in turn using the element syntax shown previously. In Listing 5-9, for example, every child of author is declared as #PCDATA.
Validating Against a DTD
The code snippet shown in Listing 5-10 creates an XmlReader object that works on the sample XML file
Authors_DTD.xml discussed in Listing 5-8. The document is bound to a DTD file and is validated using
the DTD validation type.
Listing 5-10: Validating an XML Document Against a DTD
<%@ Page Language="C#"%>
<%@ Import Namespace="System.Xml" %>
<%@ Import Namespace="System.Xml.Schema" %>
<script runat="server">
    private StringBuilder _builder = new StringBuilder();
    void Page_Load(object sender, EventArgs e)
    {
        string xmlPath = Request.PhysicalApplicationPath +
            @"\App_Data\Authors_DTD.xml";
        XmlReader reader = null;
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationEventHandler += new
            ValidationEventHandler(this.ValidationEventHandler);
        // Validate against the DTD referenced by the DOCTYPE declaration
        settings.ValidationType = ValidationType.DTD;
        // DTDs are prohibited by default and must be allowed explicitly
        settings.ProhibitDtd = false;
        reader = XmlReader.Create(xmlPath, settings);
        while (reader.Read())
        {
        }
        if (_builder.ToString() == String.Empty)
            Response.Write("DTD Validation completed successfully.");
        else
            Response.Write("DTD Validation Failed. <br>" + _builder.ToString());
    }
    void ValidationEventHandler(object sender, ValidationEventArgs args)
    {
        _builder.Append("Validation error: " + args.Message + "<br>");
    }
</script>
<html xmlns="http://www.w3.org/1999/xhtml" >
<head runat="server">
    <title>DTD Validation</title>
</head>
<body>
    <form id="form1" runat="server">
    <div>
    </div>
    </form>
</body>
</html>
The following lines of code in Listing 5-10 warrant special attention.
settings.ValidationType = ValidationType.DTD;
settings.ProhibitDtd = false;
First, the ValidationType property of the XmlReaderSettings object is set to ValidationType.DTD to signal to the parser that a DTD is utilized for validation. When the validation mode is set to DTD, the validating parser returns a warning if the file has no link to any DTDs. If a DTD is correctly linked and accessible, the validation is performed, and in the process, entities are expanded. If the linked DTD file is not available, an exception is raised. What you’ll get is not a schema exception but a simpler FileNotFoundException.
Next, you set the ProhibitDtd property to false to ensure that DTDs are not prohibited. Note that this property is set to true by default.
settings.ProhibitDtd = false;
If you navigate to the file using a browser, you see the output shown in Figure 5-3.
Figure 5-3
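The same validation can be tried without a Web page or an external DTD file. The following console sketch is an illustration under several assumptions: the class name DtdValidationSketch, the shortened DTD placed in an internal subset, and the CountErrors helper are hypothetical, and the code targets .NET Framework 4 or later, where the DtdProcessing property takes the place of the ProhibitDtd property used in Listing 5-10.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class DtdValidationSketch
{
    // Hypothetical internal DTD subset: an author needs an au_id followed by an au_lname.
    const string Doctype = @"<!DOCTYPE authors [
  <!ELEMENT authors (author+)>
  <!ELEMENT author (au_id, au_lname)>
  <!ELEMENT au_id (#PCDATA)>
  <!ELEMENT au_lname (#PCDATA)>
]>";

    // Validates the given root element against the DTD above and counts the errors.
    public static int CountErrors(string body)
    {
        int errors = 0;
        var settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;   // allow the DOCTYPE to be processed
        settings.ValidationType = ValidationType.DTD;
        settings.ValidationEventHandler += (sender, args) =>
        {
            if (args.Severity == XmlSeverityType.Error) errors++;
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(Doctype + body), settings))
        {
            while (reader.Read()) { }
        }
        return errors;
    }

    public static void Main()
    {
        string valid =
            "<authors><author><au_id>172-32-1176</au_id><au_lname>White</au_lname></author></authors>";
        string missingChild =
            "<authors><author><au_id>172-32-1176</au_id></author></authors>";

        Console.WriteLine("Valid document errors: " + CountErrors(valid));
        Console.WriteLine("Missing au_lname errors: " + CountErrors(missingChild));
    }
}
```

Placing the DTD in the internal subset keeps the example self-contained; with an external DTD, as in Listing 5-8, the reader must also be able to resolve the file named after the SYSTEM keyword.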