Contents
Overview
Introduction to XML
Validating XML Documents
Using the Document Object Model
Applying XML in N-Tier Applications
Lab 9: Exchanging Data Using XML
Best Practices
Review

Module 9: Using XML to
Exchange Data


Information in this document is subject to change without notice. The names of companies,
products, people, characters, and/or data mentioned herein are fictitious and are in no way intended
to represent any real individual, company, product, or event, unless otherwise noted. Complying
with all applicable copyright laws is the responsibility of the user. No part of this document may
be reproduced or transmitted in any form or by any means, electronic or mechanical, for any
purpose, without the express written permission of Microsoft Corporation. If, however, your only
means of access is electronic, permission to print one copy is hereby granted.

Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Microsoft, the furnishing of this document does not give you any
license to these patents, trademarks, copyrights, or other intellectual property.

© 2000 Microsoft Corporation. All rights reserved.

Microsoft, BackOffice, MS-DOS, Windows, Windows NT, ActiveX, MSDN, PowerPoint, and
Visual Basic are either registered trademarks or trademarks of Microsoft Corporation in the U.S.A.
and/or other countries.

Other product and company names mentioned herein may be the trademarks of their respective
owners.




Instructor Notes
This module introduces the Extensible Markup Language (XML). Students will
learn how data is represented by using XML and how to use Document Type
Definitions (DTDs) and schemas to validate document structure. Students will
also learn how to parse XML by using the Document Object Model (DOM).
After completing this module, students will be able to:
• Describe the purpose and benefits of XML.
• Describe the structure of a well-formed XML document.
• Describe the purpose of XML Schemas and DTDs.
• Manipulate XML by using DOM.

In the lab, students will examine code to see how the XML DOM can be used
to create XML documents. They will then use the DOM to read an XML
purchase order that was generated by the Purchase Order Online application
used in the labs for this course.
Materials and Preparation
This section provides you with the required materials and preparation tasks that
are needed to teach this module.
Required Materials
To teach this module, you need the following materials:
• Microsoft PowerPoint® file 1907A_09.ppt
• Module 9: Using XML to Exchange Data
• Lab 9: Exchanging Data Using XML

Preparation Tasks
To prepare for this module, you should:
• Read all of the materials for this module.
• Complete the lab.
• Read the instructor notes and the margin notes for this module.


Presentation: 75 Minutes
Lab: 60 Minutes



Demonstration
This section provides demonstration procedures that will not fit in the margin
notes or are not appropriate for the student notes.
Using the Document Object Model
The purpose of this demonstration is to show how the DOM can be used to
create, load, browse, and search an XML document. This demonstration uses an
XML document containing a booklist. It has an associated schema at
http://localhost/books/booklistschema.xml.
This demonstration is divided into four parts. You can demonstrate all parts or
selected individual parts. Note, however, that to demonstrate parts C or D, the
demonstration program first requires that an XML document be loaded. This
initial step is achieved by clicking the Load XML Document button (described
in part B). You must also follow the demonstration preparation instructions that
follow prior to performing a demonstration.
• Part A: Create and Save an XML Document
• Part B: Load and Validate an XML Document
• Part C: Walk Through an XML Document
• Part D: Search an XML Document

Prepare for the demonstration
1. Use Windows Explorer to navigate to the
<install folder>\Democode\Mod09\XML folder.
2. Right-click the Books folder and choose Properties.
3. On the Web Sharing tab, click the Share this folder option button.
4. Leave the default alias (Books) and ensure that all Access Permissions
check boxes are checked. Then click OK.
5. Click Yes to accept the warning message and then click OK on the
Properties dialog box.
6. Open the project XMLDemo.vbp located in
<install folder>\Democode\Mod09\XML.
7. Display the cmdCreateXMLDoc_Click procedure and place a breakpoint
on the line Set xmlDoc = New MSXML.DOMDocument.
8. Display the cmdLoadXMLDocument_Click procedure and place a
breakpoint on the line Set xmlDoc = New MSXML.DOMDocument.
9. Display the cmdWalkXMLDocument_Click procedure and place a
breakpoint on the line Debug.Assert Not (xmlDoc Is Nothing).
10. Display the cmdSearchXMLDocument_Click procedure and place a
breakpoint on the line Debug.Assert Not (xmlDoc Is Nothing).


Part A: Create and Save an XML Document
1. Run the project.
2. Click the Create XML Document button. Execution will halt at the
breakpoint in cmdCreateXMLDoc_Click.
3. Explain that the line with the breakpoint instantiates an
MSXML.DOMDocument object, which represents the top node of the
XML DOM tree. Press F8 to step over the line with the breakpoint.
4. Step over the next three lines of code, making the following observations:
• A processing instruction containing an XML declaration is created.
• Processing instructions (like all new elements and attributes) must be
appended to the appropriate place in the DOM tree. This step is
performed by using the appendChild method of the
IXMLDOMDocument interface. In this case, the processing instruction
is appended directly to the DOMDocument, placing it at the top level of
the document.
5. Step over the next three lines of code, making the following observations:
• The DOMDocument interface’s createElement method is used to
create the booklist element.
• An xmlns attribute is created by using setAttribute. This attribute is
used to associate a schema with an XML document.
• The booklist element is appended to the tree at the top level, making
booklist the root element.
6. Explain that the private subroutine AddBook is used to create a book
element, together with its associated attributes and child elements (title,
author, and price).
7. Press F8 to enter the AddBook subroutine.
8. Step over the code in the AddBook subroutine, making the following
observations:
• New elements are created by using the createElement method of the
IXMLDOMDocument interface. This interface is obtained via the
ownerDocument property of IXMLDOMNode. This property returns the
root of the document containing the node.
• Attributes are associated with elements by using the setAttribute
method of the element’s IXMLDOMElement interface.
• Each element is appended to the supplied parent node (which in this case
is the booklist element) by using the appendChild method.
9. Having returned to the cmdCreateXMLDoc_Click procedure, step over the
remaining calls to AddBook.



10. Step into the SaveXMLDocument private subroutine. Step through this
routine, making the following observations:
• ADO Record and Stream objects are used to create the output XML
document. Using these objects allows the document to be output to a
Web folder by using a URL.
• A Stream object is opened from the Record object to represent the
contents of the file.
• The WriteText method of the Stream object is called with a string
representation of the XML document passed as a parameter
(xmlDoc.xml).
11. Press the Continue toolbar button to resume execution of the program.
12. A message box will be displayed confirming that the document has been
successfully created. Press OK to dismiss this message box.
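
The Visual Basic project itself is not reproduced in these notes. As a rough analogue of steps 3 through 8, the following Python sketch uses the standard library's xml.dom.minidom, which exposes the same W3C DOM calls (createElement, setAttribute, appendChild, ownerDocument) rather than MSXML. The add_book helper and the second ISBN are illustrative assumptions. minidom writes the XML declaration itself when it serializes, so no processing instruction is created by hand, and the xmlns schema association and the ADO save step are left out.

```python
from xml.dom import minidom


def add_book(parent, isbn, book_type, title, author, price):
    """Append a <book> element, with attributes and children, to parent."""
    # ownerDocument gives access to the document that owns the parent node,
    # so new elements can be created without passing the document around.
    doc = parent.ownerDocument
    book = doc.createElement("book")
    book.setAttribute("isbn", isbn)
    book.setAttribute("type", book_type)
    for tag, text in (("title", title), ("author", author), ("price", price)):
        child = doc.createElement(tag)
        child.appendChild(doc.createTextNode(text))
        book.appendChild(child)
    parent.appendChild(book)
    return book


# The document object is the top node of the DOM tree.
doc = minidom.Document()

# Appending booklist directly to the document makes it the root element.
booklist = doc.createElement("booklist")
doc.appendChild(booklist)

# The second ISBN is a made-up value for illustration only.
add_book(booklist, "1-444444-11-0", "psychology",
         "Is Anger the Enemy?", "Anne Ringer", "10.95")
add_book(booklist, "1-444444-11-1", "psychology",
         "Life Without Fear", "Albert Ringer", "7.00")

# String form of the whole tree, comparable in role to xmlDoc.xml.
xml_text = doc.toxml()
print(xml_text)
```

Passing the parent node to the helper, instead of the document, mirrors the AddBook design described in step 8: the helper needs only the node it will append to.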

Part B: Load and Validate an XML Document
1. Run the project if it is not already running. Click the Load XML Document
button. Execution will halt at the breakpoint within
cmdLoadXMLDocument_Click.
2. Step over the line of code that instantiates a new DOMDocument.
3. Explain that by default the Load method of DOMDocument will load a
document asynchronously. In this instance, the code sets the async property
to False to perform a synchronous load. Point out the pitfalls of starting to
use the DOM with a partially loaded tree. Also point out that if an
asynchronous load is chosen, the ondataavailable event is fired when the
XML document data is available.
4. Press F8 to step over the next line of code.
5. Point out that the validateOnParse property of DOMDocument indicates
whether validation should be performed while loading the document. In this
case, validation occurs against the schema referenced by the document (by
using the xmlns attribute of the booklist root element).
6. Press F8 to step over the next line of code.
7. Step over the call to the DOMDocument’s Load method and point out that
a URL is used to locate the XML document.
8. Press the Continue toolbar button to resume program execution.
9. A message box will be displayed confirming that the document has been
successfully loaded. Press OK to dismiss this message box.
10. The DOM tree is now fully populated.

Part C: Walk Through An XML Document
1. Click the Walk XML Document button. If this button is disabled, you must
first load an XML document by using the Load XML Document button.
2. Execution will halt at the breakpoint in cmdWalkXMLDocument_Click.
3. Step over the remaining lines of code in the subroutine, making the
following observations:
4. The root IXMLDOMNode variable is set to the document’s root element by
using the DOMDocument’s documentElement property.




5. The hasChildNodes method of IXMLDOMNode is used to test whether
the root element has any child elements.
6. The length property of the childNodes IXMLDOMNodeList interface is
used to ascertain how many direct child elements the booklist element
possesses. This number represents the number of books in the book list.
7. A For loop is established to process each book element.
8. The bookNode IXMLDOMNode variable is set to each successive child
node.
9. The node type is checked by using the nodeTypeString property. In this
case, the node type will always be “element” because only book elements
are direct children of the booklist element.
10. For each element, the ProcessBookElement private subroutine is called,
which outputs the element tag name and text together with the values of the
isbn and type attributes. Make sure the Immediate window is visible as
output is sent to this window. Notice that the text property associated with
the book element is a concatenation of all the text nodes for all child
elements of book.
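
The walk described in steps 4 through 10 can be sketched in Python with xml.dom.minidom as a stand-in for MSXML. The sample document, the second ISBN, and the node_text helper are assumptions made for illustration. minidom has no text property, so node_text concatenates the text nodes by hand to show the effect that step 10 describes.

```python
from xml.dom import minidom

# Hypothetical sample data; the second ISBN is invented for illustration.
SOURCE = (
    '<booklist>'
    '<book isbn="1-444444-11-0" type="psychology">'
    '<title>Is Anger the Enemy?</title>'
    '<author>Anne Ringer</author><price>10.95</price>'
    '</book>'
    '<book isbn="1-444444-11-1" type="psychology">'
    '<title>Life Without Fear</title>'
    '<author>Albert Ringer</author><price>7.00</price>'
    '</book>'
    '</booklist>'
)


def node_text(node):
    """Concatenate every text node found beneath node."""
    if node.nodeType == node.TEXT_NODE:
        return node.data
    return "".join(node_text(child) for child in node.childNodes)


doc = minidom.parseString(SOURCE)
root = doc.documentElement            # the booklist root element

lines = []
if root.hasChildNodes():
    for i in range(root.childNodes.length):
        book = root.childNodes.item(i)
        # Skip anything that is not an element (for example, whitespace text).
        if book.nodeType == book.ELEMENT_NODE:
            lines.append(
                f"{book.tagName} isbn={book.getAttribute('isbn')} "
                f"type={book.getAttribute('type')} text={node_text(book)}"
            )

for line in lines:
    print(line)
```

The node type check matters more in practice than in this sample: an indented document would also contain whitespace text nodes as direct children of booklist.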

Part D: Search an XML Document
A set of XSL patterns has been provided in the XSL Pattern combo box. You
can repeat these steps for each pattern. The patterns are:
//author
Returns all author elements in the document.

//book[@isbn='1-444444-11-0']
Returns the book element with the specified isbn attribute value.

//book[@type='psychology']
Returns all book elements with the specified type attribute value.

Part E: ...
1. Click the Search XML Document button. If this button is disabled, you
must first load an XML document by using the Load XML Document
button.
2. Execution will halt at the breakpoint in cmdSearchXMLDocument_Click.
3. Step over the remaining lines of code in this subroutine, making the
following observations:
4. The XSL pattern is passed as a parameter to the selectNodes method of the
IXMLDOMDocument interface.
5. selectNodes returns an IXMLDOMNodeList collection that contains
matching nodes.
6. A For Each construct is established to process each node in the collection.
7. The element tag name and text values are output to the Immediate window.
Notice that for searches that return book elements, the text property of the
book element is a concatenation of all the text nodes for all child elements
of book.
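
As an approximation of the search in steps 4 through 7, the following Python sketch runs the three patterns with xml.etree.ElementTree, whose findall method supports a limited XPath subset. It is not MSXML's selectNodes: when searching from an element, ElementTree requires a relative path, so each pattern is written with a leading dot. The sample document and the second ISBN are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the second ISBN is invented for illustration.
SOURCE = (
    '<booklist>'
    '<book isbn="1-444444-11-0" type="psychology">'
    '<title>Is Anger the Enemy?</title>'
    '<author>Anne Ringer</author><price>10.95</price>'
    '</book>'
    '<book isbn="1-444444-11-1" type="psychology">'
    '<title>Life Without Fear</title>'
    '<author>Albert Ringer</author><price>7.00</price>'
    '</book>'
    '</booklist>'
)

root = ET.fromstring(SOURCE)

patterns = [
    ".//author",
    ".//book[@isbn='1-444444-11-0']",
    ".//book[@type='psychology']",
]

results = {}
for pattern in patterns:
    matches = root.findall(pattern)
    # Record the tag name and the concatenated text of each matching node.
    results[pattern] = [(m.tag, "".join(m.itertext())) for m in matches]

for pattern, found in results.items():
    print(pattern, found)
```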



Module Strategy
Use the following strategy to present this module:
• Introduction to XML
Provide an overview of XML. XML defines a generic mechanism for
adding tagged information to character data. This extra information can help
to convey additional context, or metadata, or it can define the structure of
the data contained within the tags (for example, by defining the fields of a
purchase order). Although it may initially seem fairly simplistic, the
simplicity and flexibility of XML are two of the key features that have
helped make it the de facto information exchange mechanism for
e-commerce.
Discuss the syntax of XML and how it encompasses other data. Look at
how XML can be applied and the types of data it can help to represent.
Emphasize the suitability of XML for document interchange (e-business)
and explain that document exchange is how XML is used in the lab
scenario. The purchase order system produces XML order documents for
vendor trading partners. Mention other members of the XML family and
which parts of the XML family are implemented in some common
Microsoft products.
This section includes a practice in which students use Microsoft Internet
Explorer 5 to view an XML document. The practice initially uses Internet Explorer 5 to display the
XML data in its raw format and then asks students to associate an XSLT
style sheet with the XML document. Internet Explorer 5 is used to view the
document again. This time, as Internet Explorer 5 processes the style sheet,
the data is displayed in an HTML table.
• Validating XML Documents
Describe the concept of XML validation and explain that it is frequently
useful to know whether an XML document conforms to a specific XML grammar.
The process of checking that an XML document conforms to a specific
XML grammar is called validation. Applications can accept or reject
documents based on their validity. The two common mechanisms for
defining the XML grammars used when validating XML documents are
DTDs and XML Schemas.
Explain why it is useful to validate XML documents. Show students how
DTDs and XML Schemas work, and stress the advantages that schemas
have over DTDs. Point out that the prime advantage of the DTD is that it is
part of the XML 1.0 specification. Discuss schema syntax.
• Using the Document Object Model
Point out that one of the great advantages of using XML rather than a
proprietary data format is that there are ready-made parsers, such as the
Microsoft XML engine MSXML, to perform much of the difficult work
automatically. However, after the parser has processed the XML data, you
need some mechanism of accessing it programmatically. The DOM standard
defines such a mechanism.
Explain the basics of how to access and manipulate XML data by using the
DOM. Discuss searching for name tags and nodes based on particular
criteria. Discuss XPath syntax. Finally, describe how to create, trim, and
persist XML trees.
• Applying XML in N-Tier Applications
Discuss the different mechanisms for creating and manipulating XML.
Refer students to the lab scenario, and discuss some of the ways in which
XML might be used in distributed applications like those found in the lab
for this module.
Examine some of the potential sources of XML, and explain how XML can
be easily sent to a URL for further processing. Look at how XSL
Transformations (XSLT) can be used as a powerful tool for converting
between XML grammars.

• Best Practices
Summarize the best practices that should be observed when using XML to
exchange data.




Overview
• Introduction to XML
• Validating XML Documents
• Using the Document Object Model
• Applying XML in N-Tier Applications
• Lab 9: Exchanging Data Using XML
• Best Practices
• Review



The Extensible Markup Language (XML) defines a flexible data representation
that is ideal for data exchange in loosely coupled systems. Because of this
advantage, XML is rapidly becoming the markup language of choice for
e-commerce and other business-to-business data exchange.
In this module, you will learn some of the uses for XML and its basic syntax.
You will learn about different XML grammars and how to validate an XML
document against a grammar defined as a Document Type Definition (DTD) or
XML Schema. You will also learn how to manipulate XML programmatically
by using the Document Object Model (DOM). Finally, you will learn about
how XML can be used in an n-tier application.
Objectives
After completing this module, you will be able to:
• Describe the purpose and benefits of XML.
• Describe the structure of a well-formed XML document.
• Describe the purpose of XML Schemas and DTDs.
• Manipulate XML by using the Document Object Model.
• Describe how XML can be applied in an n-tier Windows DNA solution.




Introduction to XML
• What Is XML?
• Benefits of XML
• XML Syntax
• XML Family of Standards
• XML Support in Microsoft Products
• Practice: Viewing an XML Document in Internet Explorer 5


XML defines a generic mechanism for adding tagged information to character
data. This extra information can help to convey additional context, or metadata,
or it can define the structure of the data contained within the tags (for example,
to define the fields of a purchase order). Although it may seem fairly simplistic
at first, the simplicity and flexibility of XML are two of the key features that
have helped to make it the de facto information exchange mechanism for
e-commerce.
In this section, you will learn the syntax of XML and how it encompasses other
data. You will look at how XML can be applied and the types of data it can help
to represent. You will learn about the other members of the XML family and
which parts of the XML family are implemented in some common Microsoft
products.
This section includes the following topics:
• What Is XML?
• Benefits of XML
• XML Syntax
• XML Family of Standards
• XML Support in Microsoft Products
• Practice: Viewing an XML Document with Internet Explorer 5



What Is XML?
• Standard for defining data in tagged form
  - Imposes a structure on the underlying raw data
  - Provides extra information, or metadata, about part of
    the underlying raw data
<booklist>
<book>
<title>Is Anger the Enemy?</title>

<author>Anne Ringer</author>
<price>10.95</price>
</book>
<book>
<title>Life Without Fear</title>
<author>Albert Ringer</author>
<price>7.00</price>
</book>
</booklist>


XML is a standard for defining data in a tagged form. It allows you to define
extra information beyond the raw data contained in a file or stream. This extra
information takes two primary forms:
• Imposing a structure on the underlying raw data
• Providing extra information, or metadata, about part of the underlying raw
  data

An XML tag consists of a name enclosed in angle brackets, such as <book>. As
you will see later, tags usually have matching end tags; in this case, </book>.
The pair of tags defines an XML element within which data can be contained. If
the application processing the file or stream containing the tags is XML-aware,
it will identify these tags and interpret them appropriately.
A portion of a simple XML document is shown below:
<booklist>
  <book>
    <title>Is Anger the Enemy?</title>
    <author>Anne Ringer</author>
    <price>10.95</price>
  </book>
  <book>
    <title>Life Without Fear</title>
    <author>Albert Ringer</author>
    <price>7.00</price>
  </book>
</booklist>




Examining the fragment reveals information about two books surrounded by
XML tags. In this case, each book has a title, author, and price. These three
pieces of information about the book are contained within a <book> element.
Two different <book> elements are defined and they, in turn, are contained
within a <booklist> element. This fragment shows how XML can be used to
define structure for the underlying data.
This example reveals two other aspects of XML. The first is that, when
formatted correctly, it can be human readable. The second aspect is that it is
hierarchical in nature. This hierarchy means that it can be formed into tree-like
structures for processing. For information about the structure of the Document
Object Model (DOM), see Using the Document Object Model in this module.
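
To illustrate how the structure of the fragment can be used by software, the following Python sketch reads the two book elements into a list of records and totals their prices. It relies on the standard library's xml.etree.ElementTree, not on the MSXML parser used later in this module, and the variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

FRAGMENT = """
<booklist>
  <book>
    <title>Is Anger the Enemy?</title>
    <author>Anne Ringer</author>
    <price>10.95</price>
  </book>
  <book>
    <title>Life Without Fear</title>
    <author>Albert Ringer</author>
    <price>7.00</price>
  </book>
</booklist>
"""

root = ET.fromstring(FRAGMENT)

# Each <book> becomes a dictionary keyed by the names of its child elements.
books = [{child.tag: child.text for child in book}
         for book in root.findall("book")]

# The tags say which value is the price, so no position counting is needed.
total = sum(float(book["price"]) for book in books)

for book in books:
    print(book["title"], "by", book["author"])
print("Total price:", round(total, 2))
```

Because the tags name each field, the program does not depend on the order of the title, author, and price elements within a book.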
XML Is All About Data
If you have written or examined an HTML document, XML may look familiar.
However, there are two important aspects in which XML differs from HTML.
HTML tags:
• Convey formatting instructions for a Web browser. These formatting
  instructions do not convey any information about the data within them.
• Consist of a fixed set of tags defined by the World Wide Web Consortium
  (W3C). If you include your own tags in an HTML document, the browser will
  silently ignore them.

XML tags:
• Can be used to define the type and structure of the data contained within
  tags. This function preserves the original meaning of the data in the
  document.
• Allow you to define your own tags to create your own grammar or dialect to
  describe the data in your document. In fact, XML has only a few predefined
  tags that pertain to document structure.

XML and HTML look similar because they come from a common origin. They
are both derived from the Standard Generalized Markup Language (SGML).
SGML is used to define the structure and metadata for complex documentation,
such as that required to describe the electrical wiring contained in an airliner.
SGML was used before the advent of the Internet. The problem with SGML is
that it is a complex syntax with many options for providing a high level of
flexibility. Unfortunately, this feature can make it difficult to handle. As a
result, SGML-aware applications have tended to reflect this difficulty in their
complexity and price.
HTML was an attempt to apply SGML principles and provide a small subset of
tags for simple documents. The success of the World Wide Web (WWW) is a
testament to how important simplicity is when creating a common standard.
The simplicity of HTML means that anyone with a copy of Microsoft Notepad
can create a Web document.
But in some ways, the simplicity of HTML has also been its undoing. HTML is
a good mechanism for conveying information about the formatting of a
document in a browser. However, it is not an effective way of representing data.
As the Web becomes the backbone for e-commerce, much of the information
transmitted over the Internet is not for human consumption in a Web browser,
but rather for the use of applications. These applications need a description of
the data being sent, not just instructions on how to display it. This requirement
is best satisfied by XML. XML documents retain the structure of the data and
can be more easily processed in software. Later in this module, you will see
how much easier it is to write a program to process an XML document than it
would be to process an HTML document.



XML itself is only one of a set of related standards. The World Wide Web
Consortium defines the standard for XML and the other technologies in the
XML family. For more information about the XML family of standards, see
XML Family of Standards in this module.


Benefits of XML
• Data exchange
  - A cheap alternative to EDI
• Standardization of documents
  - XML grammars being standardized for vertical markets
  - Microsoft’s BizTalk initiative
• Metadata
  - Tools such as Rational Rose can export OO design model in a
    format known as XML Metadata Interchange (XMI)
• Structure and interoperability in infrastructure
  - For example, the Simple Object Access Protocol (SOAP)

As it has evolved, XML has found a variety of applications:
• Data exchange
The main area in which XML is being applied is data exchange. The ability
to exchange structured data between applications is a key enabler for
e-commerce. For many years, the Electronic Data Interchange (EDI) standard
governed most data interchange between organizations. This standard acted
as a barrier to entry for smaller firms because it was traditionally expensive
to implement. The EDI standardization process also limited the speed at
which EDI-based systems could respond to changing conditions. With XML
representing the data and the Web acting as the transport mechanism, the
barrier to entry has lowered considerably. Also, the flexibility of data
exchange has increased dramatically because two organizations simply need
to agree on a common XML grammar to start exchanging data.

• Standardization of documents
Many professionals in industry, computer science, and the academic world
are working to standardize XML grammars for vertical markets such as
legal practice, scientific work, and finance. This standardization will allow
interchanging common documents and files that define such things as client
records, chemical models, and financial instruments. Although some of
these documents may form part of an e-commerce chain, they can be
equally well exchanged on a floppy disk. Other initiatives, such as
Microsoft’s BizTalk, concentrate on the interchange and interoperability of
documents between organizations rather than absolute standards.
• Metadata
For example, XML is being used as metadata in the world of object-oriented
software development. Tools such as Rational Rose can now export a model
of an object-oriented design in a format known as XML Metadata
Interchange (XMI). This format can then be read by other modeling tools or
processed by a tool that can convert the class and component descriptions
into software.
• Structure and interoperability in infrastructure
For example, the Simple Object Access Protocol (SOAP) defines a
mechanism for delivering remote procedure calls as XML-encoded
messages in HTTP requests. Using XML obviates the need for specialized
parsing code and provides a high degree of extensibility.


This discussion should give you a few ideas about how XML is being applied.
For more information about the advantages and uses of XML, see "Proposed
Applications and Industry Initiatives" on the Extensible Markup Language
page of Robin Cover’s XML/SGML Web pages located at
www.oasis-open.org/cover/xml.html.


XML Syntax
• Comments
• Elements
  - Must nest correctly
  - Unicode, case sensitive
  - Can contain text, other elements, or both
<!-- This is my favorite book -->
<booklist>
<book>
<title>Is Anger the Enemy?</title>
<author>Anne Ringer</author>
<price>10.95</price>
</book>
</booklist>


To read and manipulate XML, you must understand some of the syntax of XML
code. XML does not have many predefined tags, but it does have a strict syntax
that must be adhered to regardless of the tags used.
Comments
XML comments assume the same tagged form as comments in HTML, as
shown in the following example:
<!-- This is my favorite book -->

All characters inside a comment are ignored until the closing --> is
encountered.
Comments can occur anywhere in an XML document outside other markup (for
example, not inside a tag). It is a good practice to
insert comments in XML documents or code if you expect them to be read by
others at any point.
Elements
An element consists of a matched pair of opening and closing XML tags.
Between the opening and closing tags, you can have:
• Simple text.
• Other elements.
• A mixture of text and elements.

The following example shows an XML element containing simple text:
<title>Is Anger the Enemy?</title>




It is important to note that all characters in an XML document are by default
Unicode characters. This rule applies to both XML tags and the data. As a
result, XML does not provide equivalence between uppercase and lowercase
characters (that is, it is case sensitive). The following example is not
correct XML syntax because the start tag and the end tag differ in case:
<!-- This is invalid XML syntax: -->
<title>Is Anger the Enemy?</TITLE>

The following example shows an XML element containing a mixture of other
elements and text:
<chapter title="Inorganic Chemistry">
  In this chapter, we will discuss inorganic chemistry...
  <section>
    Transition Metals
  </section>
  Transition Metals are found in the centre of the periodic
  table...
  <section>
    Group 1 Metals
  </section>
  Group 1 Metals have a single electron...
</chapter>

Note that the indentation is only shown for clarity. There is no need for such
spacing, or indeed new lines, in your XML document.
It is important that XML tags nest correctly. If tag B is contained within tag A,
then there must be an end tag for tag B before the end tag for tag A. The
following example shows invalid nesting:
<!-- This is invalid XML syntax: -->
<book>
  <author>
    Anne Ringer
</book>
  </author>
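
One way to check these rules is to hand each fragment to a parser. The following Python sketch is a minimal illustration that uses the standard library's xml.dom.minidom, not MSXML; the is_well_formed helper is a hypothetical name. It reports whether the parser accepts a correctly matched element, the case-mismatched element, and the incorrectly nested element shown earlier.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(text):
    """Return True if the parser accepts text, False if it rejects it."""
    try:
        minidom.parseString(text)
        return True
    except ExpatError:
        return False


samples = {
    "matched tags": "<title>Is Anger the Enemy?</title>",
    "case mismatch": "<title>Is Anger the Enemy?</TITLE>",
    "bad nesting": "<book><author>Anne Ringer</book></author>",
}

report = {name: is_well_formed(text) for name, text in samples.items()}

for name, accepted in report.items():
    print(name, "->", "accepted" if accepted else "rejected")
```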

Finally, you will sometimes have elements that are not intended to contain data.
Their presence is sufficient to convey meaning, as shown in the following
example:
<book>
...
<inStock></inStock>
</book>




The <inStock> element is enough to signify that the book is currently in stock.
This type of XML element is called an empty element. XML defines a
shorthand notation for empty elements by collapsing the two tags into one. This
single tag looks like a start tag, but with a trailing slash as shown in the
following example:
<!-- This syntax... -->

<inStock/>

<!-- ...is equivalent to this syntax... -->

<inStock></inStock>
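
The equivalence of the two notations can be confirmed with a parser. The following Python sketch, which assumes the standard library's xml.etree.ElementTree and a hypothetical describe helper, parses both forms and compares what the application sees.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring("<book><inStock/></book>")
long_form = ET.fromstring("<book><inStock></inStock></book>")


def describe(book):
    """Summarize the inStock child: present?, its text, its child count."""
    in_stock = book.find("inStock")
    return (in_stock is not None, in_stock.text, len(in_stock))


# Both notations yield an element that is present, has no text,
# and has no child elements.
same = describe(short_form) == describe(long_form)
print(describe(short_form), describe(long_form), same)
```

An application therefore tests for the presence of the element, not for its content.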




XML Syntax (continued)
• Attributes
• Document Structure
<book ISBN="1-444444-11-0">
<title>Is Anger the Enemy?</title>
<author>Anne Ringer</author>
<price>10.95</price>
</book>
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE booklist SYSTEM "booklist.dtd">
<booklist>
<book>
<title>Is Anger the Enemy?</title>
<author>Anne Ringer</author>
<price>10.95</price>
</book>
</booklist>
XML Declaration
Prolog
Root Element


Attributes
XML elements can have one or more attributes applied to them. An attribute
consists of a single name/value pair. The following example shows how
an attribute containing an ISBN could be applied to the book element:

<book ISBN="1-444444-11-0">
<title>Is Anger the Enemy?</title>
<author>Anne Ringer</author>
<price>10.95</price>
</book>

The value of an attribute must always be enclosed in single or double quotes.
Again, the attribute name and value are Unicode strings by default.
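
The following Python sketch shows one way an application might read the attribute. It uses the standard library's xml.dom.minidom as a stand-in for the MSXML DOM covered later in this module; the variable names are assumptions. It also shows that attribute names are case sensitive and that single and double quotes are interchangeable.

```python
from xml.dom import minidom

doc = minidom.parseString(
    '<book ISBN="1-444444-11-0"><title>Is Anger the Enemy?</title></book>'
)
book = doc.documentElement

isbn = book.getAttribute("ISBN")

# Attribute names are case sensitive: "isbn" is a different name from "ISBN",
# and minidom returns an empty string for an attribute that is not present.
lowercase = book.getAttribute("isbn")

# Single quotes delimit an attribute value just as well as double quotes.
single = minidom.parseString(
    "<book ISBN='1-444444-11-0'/>"
).documentElement.getAttribute("ISBN")

print(isbn, repr(lowercase), single)
```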
Document Structure
The structure of an XML document is defined in the XML specification:
• XML declaration
The XML declaration is a special tag that declares that the document
contains XML and indicates the version of the XML specification to which
it conforms. The XML declaration has the same form as a family of tags
called processing instructions.
• Encoding
The XML declaration may also contain an encoding. This attribute defines
the character encoding used for the rest of the file. In the absence of an
encoding declaration or a byte order mark, the encoding is assumed to be the
8-bit Unicode Transformation Format (UTF-8), which is compatible with
7-bit ASCII.
• Prolog
The XML declaration forms part of the document prolog. The prolog may
contain various things, including a DTD specification for the document. The
DTD defines the expected structure of the document. For more information
about DTDs, see the Validating XML Documents section later in this
module.
• Root element
There must be a single XML element that encloses all of the other XML
elements and data in the document. This element is called the root element.

An XML document that conforms to this structure and obeys the rules for
attributes, elements, and comments defined previously is called a well-formed
XML document. The following example shows a well-formed XML document:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE booklist SYSTEM "booklist.dtd">

<booklist>
  <book>
    <title>Is Anger the Enemy?</title>
    <author>Anne Ringer</author>
    <price>10.95</price>
  </book>
  <book>
    <title>Life Without Fear</title>
    <author>Albert Ringer</author>
    <price>7.00</price>
  </book>
</booklist>

For more information about the structure of XML documents, refer to the XML
specification on the W3C Web site at www.w3.org/TR/REC-xml.
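
One way to see the well-formedness rules in action is to hand a document to a
parser and observe whether it is accepted. The sketch below assumes Python's
standard xml.etree.ElementTree module rather than the tools used in this
course; the helper name is_well_formed and the sample documents are
illustrative only. It checks well-formedness, not validity against a DTD.

```python
import xml.etree.ElementTree as ET


def is_well_formed(data: bytes) -> bool:
    """Return True if the data parses as XML, False otherwise."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# A document with a declaration and a single root element.
GOOD = b"""<?xml version="1.0" encoding="UTF-8"?>
<booklist>
  <book>
    <title>Life Without Fear</title>
    <author>Albert Ringer</author>
    <price>7.00</price>
  </book>
</booklist>
"""

# The title element is never closed.
UNCLOSED = b"<booklist><book><title>Life Without Fear</book></booklist>"

# Two top-level elements, so there is no single root element.
TWO_ROOTS = b"<book/><book/>"

# The attribute value is not enclosed in quotes.
UNQUOTED = b"<book ISBN=1-444444-11-0/>"

for name, sample in [("GOOD", GOOD), ("UNCLOSED", UNCLOSED),
                     ("TWO_ROOTS", TWO_ROOTS), ("UNQUOTED", UNQUOTED)]:
    print(name, is_well_formed(sample))
```

Only the first sample is accepted; each of the others breaks one of the rules
described in this section.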

XML Family of Standards
• Schema
  - Defines a document's structure; XML Schema will replace DTDs
• Document Object Model (DOM)
  - Standard object model for manipulating XML documents
• Extensible Stylesheet Language (XSL)
  - Transformation and formatting language
  - Evolved into XSL for Formatting and XSL for Transformations (XSLT)
• Namespaces
  - Supports element name qualification


XML forms the basis for a whole set of related standards defined by the W3C.
When people refer to XML, as in the phrase “XML will play a major role in e-
commerce,” they are referring to the whole family of XML standards, not just
the XML syntax itself.

As is the nature of standards, the XML family continues to evolve as it is
applied to new uses and challenges. Some of the main members of the XML family
of standards are described in the following discussion.
Schema
The term schema is somewhat overloaded in the XML environment. In generic
terms, a schema describes some form of plan or structure. In computer terms,
the term schema is commonly used to describe the structure of the tables and
columns in databases. Used in its generic form, an XML schema would define
what can and cannot be in a particular XML document. It would describe which
elements could contain which other elements, what attributes each element can
have, and so forth.
The XML specification already contains a form of schema called a DTD. The
DTD can be used to define an XML grammar to which a document must
conform. The grammar could define a purchase order, the structure of a book, a
financial transaction, or the format of an RPC packet. An XML-aware tool can
then use the DTD to ensure that a document conforms to the given grammar.
This process is termed validating the document. A document that has been
proven to conform to its associated grammar is called a valid XML document.
Unfortunately, the DTD syntax is somewhat limited because compatibility with
SGML is required. At the time of writing, the W3C is in the process of defining
a replacement for DTDs called XML Schema. The XML Schema standard is
based on work by Microsoft and other W3C members and will replace DTDs
over time.
For more information about DTDs and XML Schema, see Validating XML
Documents in this module.

Document Object Model
To manipulate XML in applications, some form of programmatic interface is
required. The W3C defines a standard for manipulating XML documents called
the Document Object Model (DOM). The DOM treats an XML
document as a tree structure in which each element, attribute, and chunk of text
is a node. The DOM provides a set of interfaces that allow a programmer to
traverse the tree, access data, add new nodes, and remove unwanted nodes.
For more information about the DOM, see Using the Document Object Model
in this module.
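
The four operations just listed (traverse, access data, add nodes, remove
nodes) can be sketched as follows. This is only an illustration that assumes
Python's standard xml.dom.minidom module as a stand-in for the DOM
implementation used later in this module; the third book title is made up for
the example.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<booklist>"
    "<book><title>Is Anger the Enemy?</title><price>10.95</price></book>"
    "<book><title>Life Without Fear</title><price>7.00</price></book>"
    "</booklist>"
)
root = doc.documentElement

# Traverse the tree and access data: the text of each title element is
# held in a child text node.
titles = [t.firstChild.data for t in root.getElementsByTagName("title")]

# Add new nodes: build a book element with a title and append it to the root.
new_book = doc.createElement("book")
new_title = doc.createElement("title")
new_title.appendChild(doc.createTextNode("A Hypothetical Third Book"))
new_book.appendChild(new_title)
root.appendChild(new_book)

# Remove an unwanted node: detach the first book from the tree.
first_book = root.getElementsByTagName("book")[0]
root.removeChild(first_book)

remaining = [t.firstChild.data for t in root.getElementsByTagName("title")]

print(titles)
print(remaining)
```

Note that elements and text are separate nodes: the title text is reached
through the element's child node, not through the element itself.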
Extensible Stylesheet Language
The Extensible Stylesheet Language (XSL) standard was originally intended as
a transformation and formatting language to be used alongside XML in much the
same way that Cascading Style Sheets (CSS) are used alongside HTML. It soon
became clear, however, that transformation and formatting were actually two
distinct concerns. As a result, XSL has now evolved into two standards: XSL
for the formatting aspects (sometimes also known as formatting objects) and
XSLT for XSL transformations.
Most of the interest in the XML community has been in XSLT. By using an
XSLT style sheet, an XML document can be transformed into another XML
document. The ability to transform one XML grammar into another is a
powerful mechanism, whether it is transforming XML into HTML for display
or transforming one company’s purchase order definition into another form that
is compatible with a different company’s software.
XSLT has its own, XML-based syntax. An XSLT style sheet consists of a set of
template rules. Each template rule has a pattern that can be matched to part of
the source XML document and a matching output template. Any part of the
input XML document that matches a pattern in a template rule will have the
associated output template applied to it.
The pattern-matching language used in XSLT is defined in a separate standard
called XPath. The XPath standard can also be used with Microsoft's
implementation of the DOM to help find particular nodes in the DOM tree.
For more information about XPath, XSLT, and XSL, you can go to the W3C
Web site at www.w3.org/Style/XSL.
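
To give a feel for XPath-style patterns, the following sketch selects nodes
from the booklist document used earlier. It assumes Python's standard
xml.etree.ElementTree module, which supports only a limited subset of XPath;
it is not Microsoft's DOM implementation and it does not perform an XSLT
transformation.

```python
import xml.etree.ElementTree as ET

booklist = ET.fromstring(
    "<booklist>"
    "<book><title>Is Anger the Enemy?</title>"
    "<author>Anne Ringer</author><price>10.95</price></book>"
    "<book><title>Life Without Fear</title>"
    "<author>Albert Ringer</author><price>7.00</price></book>"
    "</booklist>"
)

# "book/title" matches the title child of every book child of the root.
all_titles = [t.text for t in booklist.findall("book/title")]

# A predicate in square brackets narrows the match to books whose
# author element contains the given text.
by_albert = [
    t.text for t in booklist.findall("book[author='Albert Ringer']/title")
]

print(all_titles)
print(by_albert)
```

An XSLT template rule uses a pattern of the same kind to decide which parts of
the source document its output template applies to.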
Namespaces
People and organizations are free to define their own XML grammar. A
potential problem, however, is that the names of the elements and attributes in
these grammars may conflict. For example, consider an XML grammar that
defines the structure of a book. Each chapter in the book would have a title.
Imagine that the book described family trees. Each tree may be defined
according to another XML grammar specifically for family tree description. In
this family tree grammar, each member of a family may also have a title. The
title for a family member may have limitations placed on it (for example,
limited to “Mr.,” “Ms.,” “Mrs.,” “Dr.,” and so on), whereas there would be no
such restrictions on the titles of the chapters.
In this example, we need a way to differentiate between the titles used in
different XML grammars. Using namespaces solves this problem. A namespace is
identified by a URI and is bound to a prefix; the prefix then establishes that
a particular element or attribute belongs to a specific XML grammar. An
example of using namespaces is shown below:
<book xmlns:bookns='urn:com:booknamespace'
      xmlns:familyns='urn:com:familyns'>
  <chapter>
    <bookns:title>My Family</bookns:title>
    The original Bloggs family can be traced back
    to <familyns:title>Dr.</familyns:title> Jack Bloggs
    at the turn of the 1900's...
  </chapter>
</book>
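
The effect of the prefixes in the example above can be observed by parsing it.
The sketch below assumes Python's standard xml.etree.ElementTree module, which
is not part of this course; the short prefixes "b" and "f" are chosen
arbitrarily for the lookup.

```python
import xml.etree.ElementTree as ET

CHAPTER_XML = """
<book xmlns:bookns='urn:com:booknamespace'
      xmlns:familyns='urn:com:familyns'>
  <chapter>
    <bookns:title>My Family</bookns:title>
    The original Bloggs family can be traced back
    to <familyns:title>Dr.</familyns:title> Jack Bloggs
  </chapter>
</book>
"""

book = ET.fromstring(CHAPTER_XML)

# The parser replaces each prefix with the namespace URI it is bound to,
# so the two title elements end up with different qualified names.
title_tags = [el.tag for el in book.iter() if el.tag.endswith("title")]

# Prefixes are only local shorthand: any prefix mapped to the same URI
# finds the same element.
ns = {"b": "urn:com:booknamespace", "f": "urn:com:familyns"}
chapter_title = book.find("chapter/b:title", ns).text
family_title = book.find("chapter/f:title", ns).text

print(title_tags)
print(chapter_title, family_title)
```

It is the URI, not the prefix, that identifies the grammar to which each title
element belongs.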

You will encounter namespaces again when you look at XML Schemas in
Validating XML Documents later in this module.
For more information about namespaces, you can go to the W3C Web site at
www.w3.org/TR/1999/REC-xml-names-19990114.
