Contents
Overview
Lesson: The Form of XML
Lesson: Designing an XML Vocabulary
Lesson: Namespaces
Lab 2: Designing an XML Vocabulary
Review
Module 2: Basic XML
Information in this document, including URL and other Internet Web site references, is subject to
change without notice. Unless otherwise noted, the example companies, organizations, products,
domain names, e-mail addresses, logos, people, places, and events depicted herein are fictitious,
and no association with any real company, organization, product, domain name, e-mail address,
logo, person, places or events is intended or should be inferred. Complying with all applicable
copyright laws is the responsibility of the user. Without limiting the rights under copyright, no
part of this document may be reproduced, stored in or introduced into a retrieval system, or
transmitted in any form or by any means (electronic, mechanical, photocopying, recording, or
otherwise), or for any purpose, without the express written permission of Microsoft Corporation.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Microsoft, the furnishing of this document does not give you any
license to these patents, trademarks, copyrights, or other intellectual property.
© 2001 Microsoft Corporation. All rights reserved.
Microsoft, MS-DOS, Windows, Windows NT, ActiveX, BackOffice, bCentral, BizTalk,
FrontPage, MSDN, MSN, Netshow, PowerPoint, SharePoint, Visio, Visual Basic, Visual C++,
Visual C#, Visual InterDev, Visual Studio, Windows Media, and Xbox are either registered
trademarks or trademarks of Microsoft Corporation in the U.S.A. and/or other countries.
The names of actual companies and products mentioned herein may be the trademarks of their
respective owners.
Instructor Notes
This module defines Extensible Markup Language (XML) through a functional
description and review of its formal requirements.
After completing this module, participants should be able to:
• Create a well-formed XML document.
• Find and fix errors of form in an XML document.
• Create an XML vocabulary that meets a set of business requirements.
• Use namespaces to establish a context for an XML vocabulary.
Materials and Preparation
This section provides the materials and preparation tasks that you need to teach
this module.
To teach this module, you need the following materials:
• Microsoft® PowerPoint® file 2500A_02.ppt
To prepare for this module:
• Read all of the materials for this module.
• Complete all practices.
• Complete the demonstrations.
• Complete the lab.
This module contains code examples that are linked to text hyperlinks at the
bottom of PowerPoint slides. These examples enable you to teach from code
examples that are too long to display on a slide. All the code examples for this
module are contained in one .htm file. Participants can copy the code directly
from the browser or from the source, and paste it into a development
environment.
To display a code sample, click a text hyperlink on a slide. Each link opens an
instance of Microsoft Internet Explorer and displays the code associated with
the link. At the end of each example, a link displays a table of contents of all
examples in this module. After you finish teaching the code for an example,
close the instance of the browser to conserve resources.
Presentation: 45 minutes
Lab: 45 minutes
Be ready to explain basic XML terminology. Consult the following references,
and give special attention to key definitions:
• The World Wide Web Consortium (W3C) Recommendation XML 1.0
• The W3C Recommendation Namespaces in XML
In addition to consulting the W3C specifications, you should browse articles
and book chapters for explanations of:
• Well-formed XML.
• Processing instructions.
• XML namespaces.
• XML vocabularies.
• Encoding schemes, such as Unicode.
A good place to start is Microsoft MSDN® online at
Module Strategy
Use the following strategy to present this module:
• The Form of XML
  Explain the rules of form that govern well-formed XML.
• Designing an XML Vocabulary
  This lesson helps you prepare to explain the next lesson, which is about
  namespaces, and the next module, which is about validation.
• Namespaces
  This lesson provides a very basic introduction to the idea of namespaces.
  Learners usually have trouble with two points. The first is understanding the
  situations that require namespaces. The lesson provides a concrete example of
  why you might need to differentiate one element name from another when
  combining data from two separate XML sources. This is probably good enough to
  get you through this lesson. Subsequent modules contain plenty of examples of
  namespaces, especially the module on XSLT. The second is the logic behind why
  a URI is used. Again, the lesson supplies a discussion of URI usage in
  namespaces. Point to other examples of URIs in subsequent modules to make sure
  that students understand this important idea.
Overview
• The Form of XML
• Designing an XML Vocabulary
• Namespaces
***** ILLEGAL FOR NON-TRAINER USE *****
This module covers some of the basic ideas of XML: the rules that govern the
form of XML, the design of your own markup vocabulary, and the use of
namespaces.
After completing this module, you will be able to:
• Create a well-formed Extensible Markup Language (XML) document.
• Find and fix errors of form in an XML document.
• Create an XML vocabulary that meets a set of business requirements.
• Use a default or an explicit namespace.
Lesson: The Form of XML
• Parts of an XML Document
• What Is Well-Formed XML?
• Rules for Elements
• Rules for Attributes
• Processing Instructions
• Comments
• How to Handle Reserved Characters
In this lesson, you will examine the rules that govern the form of XML.
Understanding these rules helps you to create your own XML documents and
provides a foundation for further learning.
By the end of this lesson, you will be able to:
• Identify and fix errors of form in an XML document.
• Create an XML file that is well-formed.
• Mark up comments and special characters so that an XML processor can
  process them.
Parts of an XML Document
<?xml version="1.0"?>
<planets>
  <planet ID="1">
    <name>Mercury</name>
  </planet>
  <planet ID="2">
    <name>Venus</name>
  </planet>
  <!-- There are more planets. -->
</planets>
Slide callouts identify the processing instruction, the root element, the child
elements, the attributes, and the comment.
This topic describes the key parts of an XML document.
Most XML documents begin with an instruction to the XML processor, called the
XML declaration. Here, the declaration tells the XML processor that the document
is formed according to the World Wide Web Consortium (W3C) XML Recommendation
version 1.0.
A set of nested elements usually comes after the processing instruction. An
element usually consists of a start tag and a closing tag. Between the start tag
and the closing tag, an element may contain data content or other elements. An
element that has no content may consist of a single empty-element tag, such as
<name />.
The first element that the XML processor encounters usually consists of a start
tag and a closing tag. This first element contains all other elements and is
called the root element.
All other elements within the root element are called child elements. Any child
element may nest subsequent child elements. Most of the content data in XML
is stored between the start tag and closing tag of child elements.
Any element can contain attributes. Using attributes is an alternative to using
elements to store content.
You create an attribute in the start tag of an element. You declare the name of
the attribute, followed by a value assignment. You can enclose an attribute’s
value in either single or double quotation marks, as long as the opening and
closing marks match.
Comments are optional. If a comment is well-formed, it is ignored by the XML
processor.
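The parts described in this topic can also be seen from an application's point of
view. The following is a minimal sketch, assuming Python and its standard
xml.etree.ElementTree module. Neither is part of this course; they stand in here
for any XML processor and application, and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The sample document from this topic.
document = """<?xml version="1.0"?>
<planets>
  <planet ID="1">
    <name>Mercury</name>
  </planet>
  <planet ID="2">
    <name>Venus</name>
  </planet>
  <!-- There are more planets. -->
</planets>"""

# The XML processor reads the text and exposes its structure.
root = ET.fromstring(document)

# Child elements are nested inside the root element. With its default
# settings, this parser does not report the comment to the application.
planets = root.findall("planet")
planet_ids = [planet.get("ID") for planet in planets]           # attributes
planet_names = [planet.findtext("name") for planet in planets]  # content

print(root.tag)
print(planet_ids)
print(planet_names)
```

Attribute values and element content both arrive as text; the application decides
how to interpret them.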
What Is Well-Formed XML?
• XML is well-formed when it conforms to the specification
• An XML error stops the XML processor
Well-Formed XML: <Temp>22</Temp> passes through the XML processor to the application.
Not Well-Formed XML: <Temp>22</temp> stops the XML processor, which reports an error:
Closing tag 'temp' does not match the start tag 'Temp'. Line 1, Position 11
For an XML document to be processed and shared between applications, it
must be formed according to a rigid set of rules. An XML document that
conforms to these rules is said to be well-formed. The current rules that
govern the form of XML are stated in the W3C Recommendation entitled
XML 1.0.
The XML 1.0 Recommendation differentiates an XML processor from an
application. An XML processor is any software that reads XML so that you can
access the document content and structure. XML processors are also called
parsers.
An example of an XML processor is Microsoft XML Core Services (MSXML)
version 4.0 (MSXML4.dll). Microsoft Internet Explorer is not an XML
processor. It is an application, because it relies on the MSXML processor
to handle XML.
The handling of white space illustrates the difference. White space consists of
space characters, line feeds, and tabs. XML processors preserve white space.
When you use Internet Explorer to open an XML file that contains extra white
space, you notice that much of the white space is gone. What happened to the
white space? Internet Explorer, the application, removed the white space that
was sent to it by MSXML, the XML processor.
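This division of labor can be sketched in a few lines. The example assumes
Python's standard xml.etree.ElementTree module as the XML processor (an
assumption; the course itself uses MSXML), and the surrounding code plays the role
of the application that chooses to collapse the white space.

```python
import xml.etree.ElementTree as ET

# Content with leading spaces, a line feed, and a tab.
element = ET.fromstring("<name>  Mercury \n\t </name>")

# The XML processor hands the content over with its white space intact.
raw = element.text

# The application decides what to do with it; here it collapses the
# white space for display.
collapsed = " ".join(raw.split())

print(repr(raw))
print(repr(collapsed))
```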
When an application calls an XML processor to open a document that is not
well-formed, the processor stops and reports an error to the application. When
Internet Explorer receives an error message from the MSXML processor, it
displays an error message. This feature helps you debug errors in XML
documents.
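The following minimal sketch shows the same behavior outside Internet Explorer.
It assumes Python's standard xml.etree.ElementTree module as the XML processor,
and parse_or_report is a hypothetical helper name. The wording of the error
message differs from the MSXML message shown on the slide.

```python
import xml.etree.ElementTree as ET

def parse_or_report(xml_text):
    """Return (root, None) for well-formed XML, or (None, message) on error."""
    try:
        return ET.fromstring(xml_text), None
    except ET.ParseError as err:
        # The processor stops at the first error and reports it
        # to the application, including the line and column.
        return None, str(err)

good_root, good_error = parse_or_report("<Temp>22</Temp>")
bad_root, bad_error = parse_or_report("<Temp>22</temp>")

print(good_root.text)   # content is available to the application
print(bad_error)        # the application decides how to show the error
```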
Rules for Elements
• Element names cannot contain white space
• Names cannot start with a number or a punctuation mark
• Names cannot start with xml or variants
• No space after the left angle bracket (<)
• The case of start and closing tags must match
• The first element is the root element
• The root element usually has start and closing tags
• All child elements must nest within the root
• Nesting elements cannot overlap
• An empty element can consist of a single tag
XML Example
<ElementName />
<ElementName>element content</ElementName>
<Root>
  <ChildA>
    <ChildB>content</ChildB>
  </ChildA>
</Root>
An element is a container for data and other elements. An element usually
consists of a start tag and a closing tag.
The syntax of elements is as follows:
<ElementName>ElementContent</ElementName>
The following are the rules of form for elements:
• Element names cannot contain white space.
• Names cannot start with a number or a punctuation mark. An underscore (_) is
  allowed as the first character.
• Names cannot start with xml or any case variant, because such names are
  reserved by the XML specification.
• Names must start immediately after the left angle bracket, with no space
  intervening.
• The case of start and closing tags must be the same.
• An XML document must contain at least one element, called the root
  element or document element. The processor treats the first element that it
  finds as the root element. All subsequent elements are contained within the
  root element.
• All elements that follow the start tag of the root element must be nested
  within the root element.
• The nesting of elements cannot overlap.
• An element that lacks content can consist of a single empty-element tag,
  such as <ElementName />.
• The root element usually consists of a start tag and a closing tag, because it
  contains all other elements. A root element that has no content can also be
  written as a single empty-element tag.
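Several of these rules can be tried out with a short script. The following is a
minimal sketch, assuming Python's standard xml.etree.ElementTree module as the XML
processor; is_well_formed and the sample strings are hypothetical and are not
taken from the practice files.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the processor accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

# Each sample breaks (or follows) one rule from the list above.
samples = {
    "white space in a name": "<Element Name>text</Element Name>",
    "name starts with a number": "<1stPlanet>Mercury</1stPlanet>",
    "space after the left angle bracket": "< planet>Mercury</planet>",
    "case of tags does not match": "<Temp>22</temp>",
    "element outside the root": "<planets></planets><planet>Mercury</planet>",
    "overlapping elements": "<Root><ChildA><ChildB></ChildA></ChildB></Root>",
    "empty child element": "<Root><ChildA /></Root>",
    "properly nested elements":
        "<Root><ChildA><ChildB>content</ChildB></ChildA></Root>",
}

results = {label: is_well_formed(text) for label, text in samples.items()}
for label, ok in results.items():
    print(f"{label}: {'well-formed' if ok else 'not well-formed'}")
```

Only the last two samples are reported as well-formed.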
Find violations of form in the elements of the following XML document:
elements.xml
<cloud-types>
<High-level>
<name>cirrus</name>
<name>cirrostratus</name>
</High-Level>
<Mid-Level>
<name>altocumulus</name>
<name>altostratus</name>
</Mid-Level>
<Low-Level>
<name>nimbostratus</name>
<name>stratocumulus</name>
</Low-Level>
<Vertical Development>
<name>fair weather cumulus</name>
<name>cumulonimbus</name>
</Vertical Development>
</cloud-types>
<Other-Types>
<name>contrails</name>
<name>billow clouds</name>
<name>mammatus</name>
<name>orographic</name>
<name>pileus</name>
<reserved>
<!-- 'reserved' is a single-tag element -->
</Other-Types>
If time permits, or for more practice on your own, try the following practice:
1. Using Notepad, open install_folder\Practices\Mod02\elements.xml.
2. Fix any errors of form in the XML, and then save the file.
3. Open the file in Internet Explorer to check your work.
4. Repeat the process of editing and checking until all errors are fixed.
Remember to click Refresh in Internet Explorer each time you save
the file in Notepad.
5. Compare your solution to install_folder\Practices\Mod02\elements1.xml.
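If Internet Explorer is not available, the edit-and-check cycle in steps 3 and 4
can be approximated with a script. The following is a minimal sketch, assuming
Python's standard xml.etree.ElementTree module; check_file, the sample text, and
the file name sample.xml are hypothetical and are not part of the course files.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def check_file(path):
    """Return 'OK', or the first error that the processor reports."""
    try:
        ET.parse(path)
    except ET.ParseError as err:
        # The message includes the line and column of the first error.
        return f"Not well-formed: {err}"
    return "OK"

# Simulate one round of the cycle: check, fix the reported error, check again.
broken = "<weather>\n  <Temp>22</temp>\n</weather>\n"
fixed = broken.replace("</temp>", "</Temp>")

results = []
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.xml")
    for text in (broken, fixed):
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(text)
        results.append(check_file(path))

print(results[0])
print(results[1])
```

Because the processor stops at the first error, a file with several errors needs
several rounds of checking and fixing.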
Rules for Attributes
• Declare them in start tags and processing instructions
• Separate multiple declarations with a space
• An attribute consists of a name and an assignment
  • Each name must be unique within an element
  • You can reuse names throughout a document
  • There are no spaces in names
  • Use either single or double quotes for assignments
XML Example
<tree species="Salix">Willow</tree>
Name: species
Assignment: ="Salix"
The syntax of attributes is as follows:
<ElementName AttributeName='expression'></ElementName>
There are a few rules that govern the form of attributes:
• Attributes may be located in start tags or in processing instructions.
• An attribute consists of a name and a value assignment.
• An element can contain as many attributes as you like.
• Successive attributes are separated by a space.
• You can use a particular attribute name only once in an element.
• You can reuse a particular attribute name between elements.
• An attribute name cannot contain a space.
• To assign a value to an attribute name, use an equal sign followed by an
  expression that is enclosed in matching single or double quotes.
In the following example, the value of the moons attribute for different planets
is assigned:
<planet moons='0'>Mercury</planet>
<planet moons='0'>Venus</planet>
<planet moons='1'>Earth</planet>
<planet moons='not known'>Mergatroid</planet>
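The attribute rules can be tried out in the same way as the element rules. The
following is a minimal sketch, assuming Python's standard xml.etree.ElementTree
module as the XML processor. The <planets> root element is added only so that the
sample is a complete document, and is_well_formed and the sample strings are
hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(xml_text):
    """Return True if the processor accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True

# Single and double quotes are interchangeable, and the same attribute
# name can be reused on different elements.
planets = ET.fromstring("""<planets>
  <planet moons='0'>Mercury</planet>
  <planet moons="0">Venus</planet>
  <planet moons='1'>Earth</planet>
</planets>""")
moons = {planet.text: planet.get("moons") for planet in planets}
print(moons)

# Each sample breaks one rule from the list above.
violations = {
    "name used twice in one element": "<planet moons='0' moons='1'/>",
    "missing equal sign": "<planet moons '0'/>",
    "value not enclosed in quotes": "<planet moons=0/>",
    "quotes do not match": "<planet moons=\"0'/>",
    "space in the attribute name": "<planet mo ons='0'/>",
}
checked = {label: is_well_formed(text) for label, text in violations.items()}
for label, ok in checked.items():
    print(f"{label}: {'well-formed' if ok else 'not well-formed'}")
```

Attribute values always arrive as text, so the application converts '1' to a
number if it needs one.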
Find violations of form in the attributes of the following XML document:
attributes.xml
<cloud-types>
<High-Level al t='+6,000 meters'>
<name latin='curl of hair'>cirrus</name>
<name latin='curl of hair + layer'>cirrostratus</name>
</High-Level>
<Mid-Level ALT='2000 - 6000 meters'>
<name latin='high heap'>altocumulus</name>
<name latin='high layer'>altostratus</name>
</Mid-Level>
<Low-Level alt '-2000 meters'>
<name latin='rain layer'>nimbostratus</name>
<name latin='layer heap'>stratocumulus</name>
</Low-Level>
<Vertical-Development alt='>12000 meters'>
<name latin='heap'>fair weather cumulus</name>
<name latin='heap rain' latin='heap
rain'>cumulonimbus</name>
</Vertical-Development>
<Other-Types alt="variable'>
<name>contrails</name>
<name>billow clouds</name>
<name>mammatus</name>
<name>orographic</name>
<name>pileus</name>
</Other-Types>
</cloud-types>
If time permits, or for more practice on your own, try the following practice:
1. Using Notepad, open install_folder\Practices\Mod02\attributes.xml.
2. Edit the XML so that all errors of form are fixed, and then save the file.
3. Open the file in Internet Explorer to check your work.
4. Repeat the process of editing and checking until all errors are fixed.
Remember to click Refresh in Internet Explorer each time you save
the file in Notepad.
5. Compare your solution to install_folder\Practices\Mod02\attributes1.xml.