The Semantic Web: A Guide to the Future of XML, Web Services, and Knowledge Management (part 5)

Table 5.3 (continued)
RSS IN XML RSS IN RDF
<link> />link>
<dc:description>
This article explores how the
semantic web will change business.
</dc:description>
<dc:publisher>Super News
Network</dc:publisher>
<co:name>XML.com</co:name>
<co:market>NASDAQ</co:market>
<co:symbol>SNN</co:symbol>
</item>
<item rdf:about=” />article2”>
<title>Syndication Controversy</title>
<dc:description> How the RSS
format flip-flops have caused strife and
confusion among developers.</dc:description>
<link> /></link>
</item>
</rdf:RDF>
The explicit expression of associations between entities is not available in XML
documents and is therefore a major benefit of RDF. Two applications of RDF
that stress association between entities are the Publishing Requirements for
Industry Standard Metadata (PRISM), available at smstan-
dard.org, and the Friend Of A Friend (FOAF) vocabulary.
While we will not go into the details of these
formats, it is encouraging that proficiency with RDF is growing to the point
where compelling vocabularies are being developed.


We will close this section on a positive note, because we believe that RDF
adoption will pick up. Like the proverbial Chinese bamboo tree, RDF is a tech-
nology that has a long lead time. The Chinese bamboo tree must be cultivated
and nourished for four years with no visible signs of growth; however, in the
first three months of the fifth year, the Chinese bamboo tree will grow 90 feet.
The authors believe that RDF’s watering and fertilizing has been in the form of
mainstream adoption of XML and namespaces and that we are now entering
Chapter 5
102
that growth phase of RDF. Here are the five primary reasons that RDF’s adop-
tion will grow:
■■ Improved tutorials
■■ Improved tool support
■■ Improved XML Schema integration
■■ Ontologies
■■ Noncontextual modeling
Improved tutorials like this book, the W3C’s RDF Primer, and resources on the
Web address the complexity issue. Improved tool support for RDF editing,
visualizing, translation, and storage (like Jena and IsaViz, which we have seen, and
Protégé, which we will see in the next section) addresses the syntax problem by
abstracting your applications away from the syntax. This not only isolates the
awkward parts of the syntax but also future-proofs your applications, because the
tool mediates any changes to the syntax.
TIP
Most RDF authors write their RDF assertions in N3 format and then convert the N3 to
RDF/XML syntax via a conversion tool (like Jena’s n3 program).
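To illustrate the idea of letting a tool own the syntax, here is a minimal Python sketch that takes plain triples and emits RDF/XML. It is not Jena's converter and does not parse N3; the function name triples_to_rdfxml and the example.org subject URI are assumptions made for illustration, and only literal-valued properties are handled.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def triples_to_rdfxml(triples):
    """Serialize (subject, (namespace, local_name), literal) triples as RDF/XML.

    Statements that share a subject are grouped under one rdf:Description.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    by_subject = {}
    for subject, (ns, local), value in triples:
        node = by_subject.get(subject)
        if node is None:
            node = ET.SubElement(
                root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": subject}
            )
            by_subject[subject] = node
        prop = ET.SubElement(node, f"{{{ns}}}{local}")
        prop.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical subject URI; the property values come from Table 5.3.
triples = [
    ("http://example.org/article1", (DC_NS, "title"), "Syndication Controversy"),
    ("http://example.org/article1", (DC_NS, "publisher"), "Super News Network"),
]
rdfxml = triples_to_rdfxml(triples)
print(rdfxml)
```

The application only ever handles triples, so a change in the serialization would be confined to this one function.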
Improved integration with XML documents is being pushed both inside and
outside of the W3C, and many bridges are being built between these technology
families. The RDF Core Working Group recently added the ability for RDF
literals to be typed via XML Schema data types. Another example of
RDF/XML document integration is an RDF schema available to validate the
RDF in simple Dublin Core documents. This schema is available at
http://www.dublincore.org/documents/dcmes-xml/. Another way to solve the
validation problem is to have the namespace URI point to a document that
describes it, as proposed by the Resource Directory Description Language
(RDDL). There is work under way to
allow RDF assertions in RDDL. So, the momentum and benefits of combining
XML and RDF are increasing, as highlighted in the article “Make Your XML
RDF-Friendly” by Bob DuCharme and John Cowan, available at
http://www.xml.com/pub/a/2002/10/30/rdf-friendly.html. Ontologies and ontology
languages like the Web Ontology Language (OWL), discussed in Chapter 8,
are layered on top of RDF.
Many see ontologies as the killer application for the Semantic Web and thus
believe they will drive the adoption of RDF. In the next section, we examine
RDF Schema, which is a lightweight ontology vocabulary layered on RDF.
Lastly, ontologies are not the only killer application for RDF; noncontextual
modeling makes RDF the perfect glue between systems and fixed data models.
Noncontextual modeling is discussed in detail later in this chapter.
Understanding the Resource Description Framework
103
What Is RDF Schema?
RDF Schema is a language layered on top of RDF. This layered approach to creating
the Semantic Web has been presented by the W3C and Tim Berners-Lee as
the “Semantic Web Stack,” as displayed in Figure 5.8. The base of the stack consists of
the concepts of universal identification (URI) and a universal character set (Unicode).
Above those concepts, we layer the XML syntax (elements, attributes, and
angle brackets) and namespaces to avoid vocabulary conflicts. On top of XML
are the triple-based assertions of the RDF model and syntax we discussed in the
previous section. If we use the triple to denote a class, class property, and value,
we can create class hierarchies for the classification and description of objects.
This is the goal of RDF Schema, as discussed in this section.
Above RDF Schema we have ontologies (a taxonomy is a lightweight ontology,
as described in Chapter 7, and robust ontology languages like OWL, described
in Chapter 8). Above ontologies, we can add logic rules about the things in our
ontologies. A rule language allows us to infer new knowledge and make deci-
sions. Additionally, the rules layer provides a standard way to query and filter
RDF. The rules layer is sort of an “introductory logic” capability, while the
logic framework will be “advanced logic.” The logic framework allows formal
logic proofs to be shared. Lastly, with such robust proofs, a trust layer can be
established for levels of application-to-application trust. This “web of trust”
forms the third and final web in Tim Berners-Lee’s three-part vision (collabo-
rative web, Semantic Web, web of trust). Supporting this web of trust across
the layers are XML Signature and XML Encryption, which are discussed in
Chapter 6.
In this section, we focus on examining the RDF Schema layer in the Semantic
Web stack. RDF Schema is a simple set of standard RDF resources and proper-
ties to enable people to create their own RDF vocabularies. The data model
expressed by RDF Schema is the same data model used by object-oriented pro-
gramming languages like Java. The data model for RDF Schema allows you to
create classes of data. A class is defined as a group of things with common char-
acteristics. In object-oriented programming (OOP), a class is defined as a tem-
plate or blueprint for an object composed of characteristics (also called data
members) and behaviors (also called methods). An object is one instance of a
class. OO languages also allow classes to inherit characteristics and behaviors
from a parent class (also called a super class). The software industry has
recently standardized a single notation called the Unified Modeling Language
(UML) to model class hierarchies. Figure 5.9 displays a UML diagram model-
ing two types of employees and their associations to the artifacts they write
and the topics they know.
Figure 5.8 The Semantic Web Stack. (Layers shown, bottom to top: URI and Unicode; XML and Namespaces; RDF M&S; RDF Schema; Ontology; Rules; Logic Framework; Proof; Trust. Signature and Encryption appear alongside the layers.)
Copyright  [2002] World Wide Web Consortium,
(Massachusetts Institute of Technology, European Research
Consortium for Informatics and Mathematics, Keio University).
All Rights Reserved. />2002/copyright-documents-20021231
Figure 5.9 UML class diagram of employee expertise. (Diagram labels: Employee, with attributes "Topic knows" and "Artifact writes"; Software Engineer; System-Analyst; Artifact; DesignDocument; SourceCode; Topic; Technology. Associations: knows, writes.)
Figure 5.9 uses several UML symbols to denote the concepts of class, inheri-
tance, and association. The rectangle with three sections is the symbol for a
class. The three sections are for the class name, the class attributes (middle sec-
tion), and the class behaviors or methods (bottom section). RDF Schema only
uses the first two parts of a class, since it is for data modeling and not pro-
gramming behaviors. Also, to reduce the size of the diagram, we eliminated
the bottom two sections of the class for Topic, Technology, Artifact, and so on.
Inheritance is when a subclass inherits the characteristics of a superclass. The
arrow from the subclass to the superclass denotes this. The inheritance relation
is often called “isa,” as in “a software engineer is a(n) employee.”
Lastly, a labeled line between two classes denotes an association (like knows or
writes). The key point of Figure 5.9 is that we are modeling two types of
employees: software engineer and system-analyst. The key difference between
the employees that we want to capture is the different types of artifacts that
they create. Whereas both employees may know about a technology, the key
differentiator of developing source code to implement a technology is impor-
tant enough to be formally captured in RDF. This is precisely the type of key
determining factor that is often lost in a jumble of plaintext. So, let’s see how
we would model this in RDF Schema.
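For readers who think in object-oriented terms, the class model of Figure 5.9 can first be sketched as ordinary Python classes before it is expressed in RDF Schema. This is only an illustrative sketch: the class and field names follow the figure, the Technology class is left out, and the "RDF" topic is a made-up sample value (the other instance values are taken from Listing 5.7).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    name: str


@dataclass
class Artifact:
    name: str


@dataclass
class DesignDocument(Artifact):
    pass


@dataclass
class SourceCode(Artifact):
    pass


@dataclass
class Employee:
    name: str
    knows: List[Topic] = field(default_factory=list)      # association "knows"
    writes: List[Artifact] = field(default_factory=list)  # association "writes"


@dataclass
class SoftwareEngineer(Employee):  # "isa" Employee
    pass


@dataclass
class SystemAnalyst(Employee):  # "isa" Employee
    pass


john = SoftwareEngineer("John Doe", knows=[Topic("RDF")], writes=[SourceCode("stuff.java")])
jane = SystemAnalyst("Jane Jones", writes=[DesignDocument("system.sdd")])

print(isinstance(john, Employee))      # subclass instances are also Employees
print(type(jane.writes[0]).__name__)   # the kind of artifact is captured explicitly
```

Note that the classes here carry no methods, which matches the point that RDF Schema models data rather than behavior.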
Figure 5.10 displays the Protégé open source ontology editor, developed by
Stanford University, with the same class hierarchy.
Protégé allows you to easily describe classes
and class hierarchies.
Figure 5.10 Improved expertise modeling via RDFS.
Notice in Figure 5.10 that the right pane is a visualization of the ontology, while the
left pane allows you to choose what class or classes to visualize from the class
list (bottom left pane). The Protégé class structure is identical to the UML
model except for the lack of behaviors. RDFS classes only have a name and
properties. After modeling the classes, Protégé allows you to generate both the
RDF schema and an RDF document if you create instances of the schema (Figure
5.10 has one tab labeled “Instances”). Remember, a class is the blueprint
from which you can create many instances. So, if the class describes the properties
of an address like street, city, state, and zip code, you can create any number
of instances of addresses like “3723 Saint Andrews Drive,” “Sierra Vista,”
“Arizona,” and “85650.” Listing 5.6 is the RDF Schema for the class model in
Figure 5.10. Listing 5.7 is an RDF document with instances of the classes in
Listing 5.6.
<?xml version=’1.0’ encoding=’ISO-8859-1’?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf ‘ /><!ENTITY example_chp5
‘ /><!ENTITY rdfs ‘ />]>
<rdf:RDF xmlns:rdf=”&rdf;”
xmlns:example_chp5=”&example_chp5;”
xmlns:rdfs=”&rdfs;”>
<rdfs:Class rdf:about=”&example_chp5;Artifacts”
rdfs:label=”Artifacts”>
<rdfs:subClassOf rdf:resource=”&rdfs;Resource”/>
</rdfs:Class>
<rdfs:Class rdf:about=”&example_chp5;DesignDocument”
rdfs:label=”DesignDocument”>
<rdfs:subClassOf rdf:resource=”&example_chp5;Artifacts”/>
</rdfs:Class>
<rdfs:Class rdf:about=”&example_chp5;Employee”
rdfs:label=”Employee”>
<rdfs:subClassOf rdf:resource=”&rdfs;Resource”/>
</rdfs:Class>
<rdfs:Class rdf:about=”&example_chp5;Software-Engineer”
rdfs:label=”Software-Engineer”>
<rdfs:subClassOf rdf:resource=”&example_chp5;Employee”/>
</rdfs:Class>
<!-- Classes SourceCode, System-Analyst, Technology, and Topic omitted
for brevity. They are similar to the above Classes. -->
<rdf:Property rdf:about=”&example_chp5;knows”
rdfs:label=”knows”>
<rdfs:domain rdf:resource=”&example_chp5;Employee”/>
<rdfs:range rdf:resource=”&example_chp5;Topic”/>
</rdf:Property>
<rdf:Property rdf:about=”&example_chp5;writes”
rdfs:label=”writes”>
<rdfs:range rdf:resource=”&example_chp5;Artifacts”/>
<rdfs:domain rdf:resource=”&example_chp5;Employee”/>
</rdf:Property>
</rdf:RDF>
Listing 5.6 RDF schema for Figure 5.9.
Listing 5.6 uses the following key components of RDF Schema:
rdfs:Class. An element that defines a group of related things that share a
set of properties. This is synonymous with the concept of type or category.
Works in conjunction with rdf:Property, rdfs:range, and rdfs:domain to
assign properties to the class. Requires a URI as an identifier in the
rdf:about attribute. In Listing 5.6 we see the following classes defined:
“Artifacts,” “DesignDocument,” “Employee,” and “Software-Engineer.”
rdfs:label. An attribute that defines a human-readable label for the class.
This is important for applications that display the class name,
even though the official unique identifier for the class is the URI in the
rdf:about attribute.
rdfs:subClassOf. An element that specifies that a class is a specialization of
an existing class. This follows the same model as biological inheritance,
where a child class can inherit the properties of a parent class. The idea of
specialization is that a subclass adds some unique characteristics to a gen-
eral concept. Therefore, going down the class hierarchy is referred to as
specialization, while going up the class hierarchy is referred to as generaliza-
tion. In Listing 5.6, the class “Software-Engineer” is defined as a subclass of
“Employee.” Therefore, Software-Engineer is a specialization of Employee.
rdf:Property. An element that defines a property of a class and the range of
values it can represent. This is used in conjunction with rdfs:domain and
rdfs:range properties. It is important to understand a key difference
between modeling classes in RDFS versus modeling classes in object-
oriented programming, in that RDFS takes a bottom-up approach to class
modeling, whereas OOP takes a top-down approach. In OOP, you define a
class and everything it contains. In RDFS, you define properties and state
what class they belong to. So, in OOP we are going down from the class to
the properties. In RDFS, we are going up from the properties to the class.
rdfs:domain. This property defines which class a property belongs to (for-
mally, its sphere of activity). The value of the property must be a previ-
ously defined class. In Listing 5.6, we see that the domain of the property
“knows” is the “Employee” class.
rdfs:range. This property defines the legal set of values for a property. The
value of this attribute must be a previously defined class. In Listing 5.6, the
range of the “knows” property is the “Topic” class.
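The bottom-up style described above can be sketched in a few lines of Python: properties are declared on their own and point at a class through rdfs:domain, and the properties that apply to a class are found by walking up rdfs:subClassOf. This is an informal reading of the schema that follows the book's explanation, not a full RDFS reasoner. The short names stand in for full URIs, and the subClassOf entries for System-Analyst and SourceCode are assumptions, since Listing 5.6 omits those classes for brevity.

```python
# Schema statements as (subject, predicate, object) triples.
SCHEMA = [
    ("Software-Engineer", "rdfs:subClassOf", "Employee"),
    ("System-Analyst", "rdfs:subClassOf", "Employee"),   # assumed, omitted in listing
    ("DesignDocument", "rdfs:subClassOf", "Artifacts"),
    ("SourceCode", "rdfs:subClassOf", "Artifacts"),      # assumed, omitted in listing
    ("knows", "rdfs:domain", "Employee"),
    ("knows", "rdfs:range", "Topic"),
    ("writes", "rdfs:domain", "Employee"),
    ("writes", "rdfs:range", "Artifacts"),
]


def superclasses(cls, schema):
    """Return cls plus every class reachable from it via rdfs:subClassOf."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for s, p, o in schema:
            if p == "rdfs:subClassOf" and s == current and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def properties_for(cls, schema):
    """List properties whose rdfs:domain is cls or one of its superclasses."""
    ancestors = superclasses(cls, schema)
    return sorted(s for s, p, o in schema if p == "rdfs:domain" and o in ancestors)


print(properties_for("Software-Engineer", SCHEMA))  # ['knows', 'writes']
print(properties_for("SourceCode", SCHEMA))         # []
```

Adding a new property requires only new statements; no class definition has to be reopened, which is the practical difference from the top-down OOP approach.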
Some other important RDFS definitions not used in Listing 5.6 are as follows:
rdf:type. A standard property to define that an RDF subject is of a type
defined in an RDF schema. For example, you could say that a person with
Staff ID of 865 is a type of employee like this:
<rdf:Description rdf:about= “ /><rdf:type rdf:resource =”&example_chp5;Employee”>
rdfs:subPropertyOf. A property that declares that the property that is the
subject of the statement is a subproperty of another existing property. This
feature actually goes beyond common OOP languages like Java and C#
that only offer class inheritance. An example of this would be to declare a
property called “weekend,” which would be a subPropertyOf “week.”
rdfs:seeAlso. A utility property that allows you to refer to a resource that
can provide additional RDF information about the current resource.
rdfs:isDefinedBy. A property to define the namespace of a subject. This is
a subPropertyOf rdfs:seeAlso. In practice, the namespace can point to the
RDF Schema document.
rdfs:comment. A utility property to add additional descriptive information
to explain the classes and properties to other users of the schema. As in
programming, good comments are essential to fostering understanding
and adoption.
rdfs:Literal. A class that represents constant values such as
character strings. In Listing 5.7, the value of the example_chp5:name
attribute is a literal (like “Jane Jones”). The RDF/XML syntax revision has
recently added typed literals to RDF so that you can specify any of the types
in the XML Schema specification (like integer or float).
rdfs:XMLLiteral. A class that represents constant values that are well-formed
XML. This allows XML to be easily embedded in RDF.
In addition to the classes and properties described in the preceding lists, RDF
Schema describes classes and properties for the RDF concepts of containers
and reification. For containers, RDF Schema defines rdfs:Container, rdf:Bag,
rdf:Seq, rdf:Alt, rdfs:member, and rdfs:ContainerMembershipProperty. The
purpose for defining these is to allow you to subclass these classes or properties.
For reification, RDF Schema defines rdf:Statement, rdf:subject, rdf:predicate,
and rdf:object. These can be used to explicitly model a statement to assert
additional statements about it. Additionally, as with the Container classes and
properties, you can extend these via subclasses or subproperties.
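The reification vocabulary can be illustrated with a short Python sketch that describes one statement as an rdf:Statement resource and then asserts something about it. The ex: identifiers, including the ex:assertedBy property and ex:HRDepartment, are hypothetical names chosen for this example only.

```python
def reify(statement_id, subject, predicate, obj):
    """Describe one triple as an rdf:Statement resource (four triples)."""
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


# The original statement: Jane Jones writes system.sdd.
graph = reify("ex:stmt1", "ex:JaneJones", "ex:writes", "ex:system.sdd")

# A statement about that statement (ex:assertedBy is a made-up property).
graph.append(("ex:stmt1", "ex:assertedBy", "ex:HRDepartment"))

for triple in graph:
    print(triple)
```

Because the reified statement is itself a resource, it can be the subject of further triples just like any employee or document.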
Listing 5.7 displays an RDF instance document generated by Protégé con-
forming to the RDF schema in Listing 5.6.
<?xml version=’1.0’ encoding=’ISO-8859-1’?>
<!DOCTYPE rdf:RDF [
<!ENTITY rdf ‘ /><!ENTITY example_chp5 ‘ /><!ENTITY rdfs ‘ />]>
<rdf:RDF xmlns:rdf=”&rdf;”
xmlns:example_chp5=”&example_chp5;”
xmlns:rdfs=”&rdfs;”>
<example_chp5:SourceCode rdf:about=”&example_chp5;example-chp5_00015”
example_chp5:name=”stuff.java”
rdfs:label=”example-chp5_00015”/>
<example_chp5:System-Analyst rdf:about=”&example_chp5;example-chp5_00016”
example_chp5:name=”Jane Jones”
rdfs:label=”example-chp5_00016”>
<example_chp5:writes rdf:resource=”&example_chp5;example-chp5_00017”/>
</example_chp5:System-Analyst>
<example_chp5:DesignDocument rdf:about=”&example_chp5;example-chp5_00017”
example_chp5:name=”system.sdd”
rdfs:label=”example-chp5_00017”/>
<example_chp5:Software-Engineer rdf:about=”&example_chp5;example-chp5_00018”
example_chp5:name=”John Doe”
rdfs:label=”example-chp5_00018”>
<example_chp5:writes rdf:resource=”&example_chp5;example-chp5_00015”/>
</example_chp5:Software-Engineer>
</rdf:RDF>
Listing 5.7 RDF instance document.
In Listing 5.7, notice that the instances of the classes from the RDF schema in Listing 5.6 are not
declared using rdf:Description and rdf:type; instead, they use an abbreviation called
a “typed node element.” For example, instead of <rdf:Description>, Listing
5.7 has <example_chp5:System-Analyst>, which is an rdfs:Class in Listing
5.6. In terms of knowledge capture, Listing 5.7 captures the fact that the System-Analyst,
Jane Jones, wrote the DesignDocument named “system.sdd,” and that
the Software-Engineer, John Doe, wrote SourceCode called “stuff.java.”
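To make the typed node abbreviation concrete, the following Python sketch expands typed node elements back into explicit rdf:type triples. It is a deliberately small reader, not a conforming RDF/XML parser: it only handles top-level nodes with rdf:about and property elements with rdf:resource. The namespace URI http://example.org/chp5# is an assumption, because the real URI is not recoverable from the listing.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/chp5#"  # hypothetical namespace URI

DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:ex="{EX_NS}">
  <ex:System-Analyst rdf:about="{EX_NS}example-chp5_00016">
    <ex:writes rdf:resource="{EX_NS}example-chp5_00017"/>
  </ex:System-Analyst>
  <ex:DesignDocument rdf:about="{EX_NS}example-chp5_00017"/>
</rdf:RDF>"""


def tag_to_uri(tag):
    """Turn ElementTree's '{namespace}local' tag into a full URI string."""
    ns, local = tag[1:].split("}")
    return ns + local


def typed_nodes_to_triples(xml_text):
    """Expand typed node elements into (subject, predicate, object) triples."""
    triples = []
    for node in ET.fromstring(xml_text):
        subject = node.get(f"{{{RDF_NS}}}about")
        node_uri = tag_to_uri(node.tag)
        # A typed node is shorthand for rdf:Description plus an rdf:type statement.
        if node_uri != RDF_NS + "Description":
            triples.append((subject, RDF_NS + "type", node_uri))
        for prop in node:
            target = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, tag_to_uri(prop.tag), target))
    return triples


triples = typed_nodes_to_triples(DOC)
for t in triples:
    print(t)
```

The output contains one rdf:type triple per typed node, which is exactly the statement the abbreviation leaves implicit.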
In this section, we saw how RDF is the foundation layer for RDF Schema that
enables you to create new RDF classes and properties. Another key benefit of
RDF is that it allows you to do noncontextual modeling, described in the fol-
lowing section.
What Is Noncontextual Modeling?
Over the years, businesses have used standard document types to easily con-
vey the context of a specific business transaction. For example, a purchase
order is a common document shared between companies with little difficulty
even if there is some variation in specific fields or the order of fields. The
shared understanding is facilitated because the context is conveyed or fixed by
the document type. In that same vein, XML documents have a fixed context
provided by their root element and governing schema (formerly called the
Document Type Definition, or DTD). For example, in the XML.org schema registry,
there are many specific document types for each vertical industry. If we
examine the Human Resources-XML Consortium Schema for a Resume,
we could probably guess most of the fields even
without looking at the sample in Listing 5.8.
<?xml version=”1.0” encoding=”UTF-8”?>

<Resume xmlns=”
xmlns:xsi=”
xsi:schemaLocation=”
Resume-2_0.xsd”>
<StructuredXMLResume>
<ContactInfo>
<PersonName>
<FormattedName>John Doe</FormattedName>
</PersonName>
<ContactMethod>
<Telephone>
<FormattedNumber>123-456-7890</FormattedNumber>
</Telephone>
<InternetEmailAddress></InternetEmailAddress>
<PostalAddress>
<CountryCode>US</CountryCode>
<Region>MA</Region>
<Municipality>Brooklyn</Municipality>
<DeliveryAddress>
<AddressLine>27 </AddressLine>
<StreetName>Pine Street</StreetName>
</DeliveryAddress>
</PostalAddress>
</ContactMethod>
</ContactInfo>
<Objective> To obtain a leadership position in the field of Electronic
Commerce</Objective>
<EmploymentHistory>
<EmployerOrg employerOrgType=”soleEmployer”>
<EmployerOrgName>General Electric</EmployerOrgName>
<PositionHistory positionType=”directHire”>
<Title> E-Business Program Manager - Business to Business integration (B2Bi) Program</Title>
<OrgName>
<OrganizationName>Aircraft Engines (GEAE)</OrganizationName>
</OrgName>
<Description>Key Player in the GE growth initiative bringing IT
leadership into our acquisition/ JV strategy.
Ensured fundamental IT capabilities were present in acquisition targets
in order to maintain a competitive advantage and ensure future growth.
Led cross-functional team on due diligence, and negotiations activity
for $100M+ acquisitions.
Led several new market opportunity assessments and Instrumental in
acquisition strategy development including negotiation of partnership
structures and negotiating potential new market opportunities.
</Description>
<!-- remainder omitted for brevity. -->
</Resume>
Listing 5.8 Example of contextual modeling (a resume).
Before continuing our discussion, it is very important to understand that this
section is not making a value judgment on contextual versus noncontextual
modeling, as both are useful. It is not a question of better or worse, but a ques-
tion of whether your specific application is better served by fixing the context
or not fixing the context. In some ways this is the classic trade-off between flex-
ibility in the face of change versus reliable execution via static processes. For
many applications, fixing the context at the document level is the best method.
One example of this would be high-volume static transactions between well-known
trading partners. When the environment is stable and the volume is
high, it is both easier and more efficient to strictly fix the context of documents
and messages to reduce errors and increase throughput. Of course, the oppo-
site situation, where neither the environment is stable nor the volume is high,
is the classic example where flexibility and noncontextual modeling are the
best choice. We will examine more situations where noncontextual modeling is
applicable in the following paragraphs.
Noncontextual modeling is a continuum and not a single point. In fact,
markup languages have been following the trend toward noncontextual
modeling over the last several years via namespaces and modularization.
Namespaces divide a set of terms (used as elements or attributes) into domain-
specific vocabularies with fixed definitions. Modularization allows namespaces
to be mixed and matched to assemble a document (sometimes on the fly) that
conveys the desired meaning. Two examples of such modularization are
XHTML and XBRL. XHTML is described in detail in the next chapter. XHTML
modularization allows you to mix and match vocabularies inside of HTML
documents. The extensible business reporting language (XBRL) uses both
modularization and taxonomies (discussed in Chapter 7) for the description of
financial statements for public and private companies.
RDF takes this trend toward composable context to its logical conclusion.
How does RDF implement noncontextual modeling? RDF creates a collection
of statements and not a document. Therefore, the context of a set of RDF statements
cannot be determined beforehand; instead, it is wholly dependent on
the statements themselves and the relationships between the statements. In a
sense, this disconnect between a list of statements and a hierarchical tree is the
root cause of the difficulty in encoding RDF in RDF/XML syntax, because the syntax
attempts to marry a list of statements with a hierarchical tree structure. Following
are two key aspects of this noncontextual modeling:
Noncontextual modeling uses explicit versus implicit relationships.
XML documents create a hierarchy of name/value pairs. As demonstrated
in Chapter 3, both elements and attributes revolve around a name and a
typed value. However, XML does not state the relationship between the
name and the value. The relationship between them is implicit. In contrast,
RDF uses an explicit relationship between the name and the value
with the triple structure: subject, predicate, and object.
A graph is less brittle than a tree. A collection of RDF statements can be
added to dynamically without regard to order or even previous state-
ments. In fact, a previous statement can be reified and deprecated by
another statement. This allows the RDF graph to be robust in the face of
change and suffer less from the brittle data problem and need for version-
ing and compatibility issues that can plague XML documents. Why is this?
Part of the reason is the basic difference between a document and a collec-
tion of RDF statements. Tim Berners-Lee highlighted several of these dif-
ferences in his document entitled “Why RDF Model is Different from the
XML Model,” available at />XML.html. He stresses several differences between the XML document
model and an RDF graph. First is that there are many possible XML docu-
ments that can express a set of semantic assertions. Therefore, RDF simpli-
fies this via a semantic model also known as the triple model. In other
words, RDF makes you explicitly define the semantics of your data and
thus avoid confusion and alternate syntaxes.
Another obvious difference he highlights is that order is often very impor-
tant in a document but not important to an RDF graph. Many times the
order reflects implicit context not expressed in the name/value pairs. By
forcing explicit relationships between subjects and objects, RDF avoids
this. Of course, if order is important and it changes, you have an incompatible
change to the document structure; hence, this is another example
where an RDF list of statements is less affected by change and therefore
less brittle.
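The order independence of a collection of statements can be sketched by treating an RDF graph as a set of triples in Python. The ex: names are illustrative placeholders loosely based on Listing 5.7, and the two batches stand in for statements arriving from different sources.

```python
# Two batches of statements, possibly from different sources.
batch_a = [
    ("ex:JohnDoe", "ex:writes", "ex:stuff.java"),
    ("ex:JohnDoe", "rdf:type", "ex:Software-Engineer"),
]
batch_b = [
    ("ex:JaneJones", "ex:writes", "ex:system.sdd"),
    ("ex:JohnDoe", "rdf:type", "ex:Software-Engineer"),  # repeated statement
]

# Merging is a set union: neither arrival order nor duplicates matter.
graph_1 = set(batch_a) | set(batch_b)
graph_2 = set(reversed(batch_b)) | set(reversed(batch_a))

print(graph_1 == graph_2)  # True
print(len(graph_1))        # 3
```

A tree-shaped document has no equivalent of this union, because inserting or reordering elements changes the structure that consumers of the document depend on.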
One application (among many) that is bridging the gap between contextual
and noncontextual modeling is called SMORE, developed by Aditya Kalyan-
pur of the University of Maryland, College Park. SMORE stands for Semantic
Markup, Ontology, and RDF Editor. It allows you to embed RDF markup
inside of HTML documents during the HTML authoring process. Figure 5.11
displays embedding an RDF triple in a simple HTML document by highlight-
ing some text in the HTML editor.
Figure 5.11 Semantic Markup, Ontology, and RDF Editor (SMORE).
Figure 5.11 is a simplified view of the SMORE desktop, which starts out with
four windows: an HTML editor (shown), semantic data representation (shown),
Web browser (not shown), and an ontology manager (not shown). SMORE
allows you to select an ontology and easily add triples about the information in
your Web pages to your HTML document. Listing 5.9 displays the generated
document with the RDF embedded in the head of the HTML document.
<html>
<head>
<script type=”application/rdf+xml”>
<?xml version=”1.0”?>
<rdf:RDF
xmlns:rdf=” />xmlns:general1.0=” />eral1.0.daml#”
xmlns:personOnt=” /><general1.0:Organization rdf:ID=”Virtual_Knowledge_Base_”>
<general1.0:subOrganizationOf>JIVA</general1.0:subOrganizationOf>
</general1.0:Organization>
<general1.0:Organization rdf:ID=”JIVA”>
<general1.0:subOrganizationOf>DIA</general1.0:subOrganizationOf>
</general1.0:Organization>
<personOnt:Person rdf:ID=”Ted_Wiatrak”></personOnt:Person>
<personOnt:Person rdf:ID=”Danny_Proko”></personOnt:Person>
</rdf:RDF>
</script>
</head>
<body>
<p>
<b>Virtual Knowledge Base (VKB) </b>
</p>
<!-- omitted for brevity. -->
</body>
</html>
Listing 5.9 RDF embedded in HTML (via SMORE).
Listing 5.9 demonstrates the embedding of RDF in HTML using a script ele-
ment. The script specifies that its contents are an RDF document using the RDF
MIME type “application/rdf+xml”. The RDF captures statements about the
organizations, suborganizations, and people discussed in the HTML page.
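A consumer of such a page needs to pull the RDF back out of the HTML. The sketch below uses Python's standard html.parser module to collect the contents of every script element whose type is application/rdf+xml. The class name RdfScriptExtractor and the trimmed sample page are assumptions for illustration; this is not how SMORE itself works.

```python
from html.parser import HTMLParser


class RdfScriptExtractor(HTMLParser):
    """Collect the text of <script type="application/rdf+xml"> elements."""

    def __init__(self):
        super().__init__()
        self._in_rdf_script = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/rdf+xml":
            self._in_rdf_script = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_rdf_script = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so accumulate it.
        if self._in_rdf_script:
            self.blocks[-1] += data


HTML = """<html><head>
<script type="application/rdf+xml">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
</rdf:RDF>
</script>
</head><body><p>Virtual Knowledge Base (VKB)</p></body></html>"""

parser = RdfScriptExtractor()
parser.feed(HTML)
rdf_text = parser.blocks[0].strip()
print(rdf_text)
```

The extracted text can then be handed to whatever RDF parser the application uses.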
A project from IBM’s Knowledge Management Group and Stanford’s Knowledge
Systems Laboratory that enables the distributed processing of chunks of
RDF knowledge is the TAPache subproject of the TAP project at
http://tap.stanford.edu. TAPache is a module for the Apache HTTP server that
enables you to publish RDF data via a standard Web service called getData().
This allows easy integration of distributed RDF data. This further highlights
the ability to assemble context even from disparate servers across the network.
This section demonstrated several concepts and ideas that leverage RDF’s
strength in noncontextual modeling. The idea that context can be assembled in
a bottom-up fashion is a powerful one. This is especially useful in applications
where corporate offices span countries and continents. Ultimately, it is the end
user who is demanding the power to assemble information as he or she sees fit.
This building-block analogy in information processing is akin to the “do-it-yourself”
trend of retail stores like Home Depot and Lowe’s. The end user gets
the power to construct larger structures from predefined definitions and a simple
connection model among statements. In the end, it is that flexibility and
power that will drive the adoption of RDF and provide a strong foundation
layer for the Semantic Web.
Summary
In this chapter, we learned about the foundation layer of the Semantic Web
called the Resource Description Framework (RDF). The sections built upon
each other, demonstrating numerous applications of RDF, highlighting the
strengths and weaknesses of the language, and offering ideas and concepts for
leveraging it in your organization.
The first section answered the question “What is RDF?” It began by highlight-
ing its most obvious use in describing opaque resources like images, audio,
and video. We then began dissecting the technology into its core model, syntax,
and additional features. The core model revolves around denoting concepts
with Uniform Resource Identifiers (URIs) and structuring knowledge as
a collection of statements. An RDF statement has three parts: a subject, a predicate,
and an object. The RDF/XML syntax uses a striped syntax and a set of
elements like rdf:Description and attributes like rdf:about and rdf:resource.
The other features discussed in the section were RDF containers and reifica-
tion. RDF containers allow an object to contain multiple values or resources.
RDF reification allows you to make statements about statements.
The second section cast a skeptic’s eye on the slow adoption of RDF. We first
noted this phenomenon by comparing RDF’s adoption to XML’s adoption via
simple Web queries. We then listed several possible reasons for the slow adoption:
the difficulties in combining RDF and XML documents, the complexity of
RDF concepts and syntax, and the weakness of current examples like RSS and
Dublin Core that do not highlight the unique characteristics of RDF. However,
we are confident that RDF’s strengths outweigh its weaknesses and forecast
strong adoption in the coming year. Its two main engines of growth will be
ontologies (like RDF Schema) and noncontextual modeling.
The third section covered the layer above RDF called RDF Schema. RDF
Schema provides simple RDF subjects (classes) and predicates (properties) for
defining new RDF vocabularies. This section demonstrated the power of RDF
via the Protégé ontology editor and an example of how a good ontology models
the key determinants of decision making that often get muddled or lost in
free-text descriptions. Thus, RDF strengthens the basic proposition of the Web:
Adding metadata and structure to information improves the effectiveness of
our processing and in turn our processes.
The final section of the chapter explored a powerful new trend called
noncontextual modeling. To define the concept, we began with its antonym,
contextual modeling. We stressed the continuum between these two extremes and
how neither is good or bad, just less or more appropriate to solving the
particular business problem. Whereas document types provide context and
implicit relationships supporting the document divisions and fields,
noncontextual modeling builds its context by connecting its statements. In
other words, either the context is imposed on the information or the context
is derived from the information. We believe that noncontextual modeling and
the merging of contextual and noncontextual modeling will rise exponentially
in the next five years. This loosely coupled, slowly accrued knowledge that is
supported by well-defined concepts and relationships specified in ontologies
and knitted together from within and outside your organization will enable
huge productivity gains through better data mining, knowledge management, and
software agents.
Understanding the Rest of the Alphabet Soup

“In reality, XML just clears away some of the syntactical
distractions so that we can get down to the big problem: how we arrive at
common understandings about knowledge representation.”
—Jon Bosak
CHAPTER 6

The world of XML has brought us great things, but it has brought us so many
new acronyms and terms that it is hard to keep up. Some are more important
to your understanding the big picture than others are. This chapter aims to
provide you with an understanding of some of the key standards that are not
covered in the other chapters. In our discussion of these specifications, we give
you a high-level overview. Although it is not our intention to get into a lot of
the technical details, we show examples of each standard and explain why it is
important. We have included sections on the following XML technologies:
XPath, XSL, XSLT, XSLFO, XQuery, XLink, XPointer, XInclude, XML Base,
XHTML, XForms, and SVG. After reading this chapter, you should be familiar
with the goals and practical uses of each.
XPath
XPath is the XML Path Language, and it plays an important role in the XML
family of standards. It provides an expression language for specifically
addressing parts of an XML document. XPath is important because it provides
key semantics, syntax, and functionality for a variety of standards, such as
XSLT, XPointer, and XQuery. By using XPath expressions with certain software
frameworks and APIs, you can easily reference and find the values of individual
components of an XML document. Before we get into an XPath overview,
Figure 6.1 shows examples of XPath expressions, their meaning, and their result.
A W3C Recommendation written in 1999, XPath 1.0 was the joint work of the
W3C XSL Working Group and XML Linking Working Group, and it is part of
the W3C Style Activity and W3C XML Activity. In addition to the functionality
of addressing areas of an XML document, it provides basic facilities for
manipulation of strings, numbers, and booleans. XPath uses a compact syntax to
facilitate its use within URIs and XML attribute values. XPath gets its name
from its use of a path notation, as in URLs, for navigating through the
hierarchical structure of an XML document. By using XPath, you can
unambiguously define where components of an XML document live. Because XPath
can address those components precisely, it provides an important mechanism
that other XML standards and larger XML frameworks and APIs build on.
Figure 6.1 Examples of XPath expressions.

Source XML document:
<Task>
  <TaskItem id="123" value="Status Report"/>
  <TaskItem id="124" value="Writing Code"/>
  <TaskItem value="Idle Chat"/>
  <Meeting id="125" value="Daily Briefing"/>
</Task>

XPath Expression: //TaskItem[@id]
What the Expression Means: "Give me all TaskItem elements that have ID
attributes"
Return Value:
  <TaskItem id="123" value="Status Report"/>
  <TaskItem id="124" value="Writing Code"/>

XPath Expression: //@id
What the Expression Means: "Give me all ID attributes"
Return Value:
  id="123"
  id="124"
  id="125"

XPath Expression: /Task/Meeting
What the Expression Means: "Select all elements named 'Meeting' that are
children of the root element 'Task'"
Return Value:
  <Meeting id="125" value="Daily Briefing"/>
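The expressions in Figure 6.1 can be tried out with very little code. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, which supports only a limited subset of XPath. Two of the three expressions are therefore approximated rather than evaluated literally: attribute selection is emulated by walking the tree and reading each id attribute, and the absolute path /Task/Meeting is emulated by checking the root element's name and then selecting its Meeting children. A full XPath engine would accept the expressions as written.

```python
import xml.etree.ElementTree as ET

# The source document from Figure 6.1.
TASK_XML = """
<Task>
  <TaskItem id="123" value="Status Report"/>
  <TaskItem id="124" value="Writing Code"/>
  <TaskItem value="Idle Chat"/>
  <Meeting id="125" value="Daily Briefing"/>
</Task>
"""

root = ET.fromstring(TASK_XML)

# "All TaskItem elements that have ID attributes".
# ElementTree needs the leading "." to search relative to the root element.
task_items_with_id = root.findall(".//TaskItem[@id]")
task_item_values = [item.get("value") for item in task_items_with_id]

# "All ID attributes".
# ElementTree cannot return attribute nodes, so we visit every element
# in document order and collect the id attribute where it is present.
all_ids = [elem.get("id") for elem in root.iter() if elem.get("id") is not None]

# "All Meeting elements that are children of the root element Task".
# Absolute paths are not supported on an element, so the root is checked by hand.
meetings = root.findall("Meeting") if root.tag == "Task" else []
meeting_values = [meeting.get("value") for meeting in meetings]
```

The three result lists correspond to the three return values in the figure: two TaskItem elements, three id values, and one Meeting element.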
Although the original XPath specification (1.0) is an addressing language, the
XPath 2.0 specification is a product of the XML Query and XML Style Working
Groups and thus shares common features with the next generation of XQuery,
the XML query language. XQuery Version 1.0 is, in fact, an extension of XPath
Version 2.0, so the two will be very closely related. The XPath 2.0 Working
Draft, released in August 2002, states that “it is intended that XPath should be
embedded in a host language, such as XQuery and XSLT.”
What role does XPath play in other standards? With XSLT, you can define a
template in advance using XPath expressions that allow you to specify how to
style a document. XQuery is a superset of XPath and uses XPath expressions to
query native XML databases and multiple XML files. XPointer uses XPath
expressions to “point” to specific nodes in XML documents. XML Signature,
XML Encryption, and many other standards can use XPath expressions to
reference certain areas of an XML document. In a Semantic Web ontology, groups
of XPath expressions can be used to specify how to find data and the
relationships between data. The DOM Level 3 XPath specification, a Working
Draft from the W3C, also provides interfaces for accessing nodes of a DOM tree
using XPath expressions.
XPath has been incredibly successful. Its adoption has been widespread in
both practical technology applications and within other XML standards.
XPath alone is a strong addressing mechanism, but it is often difficult to
comprehend its power without looking at how other specifications use it. As
we discuss some of the other standards in this chapter, you will see more
references to this specification.
The Style Sheet Family: XSL, XSLT, and XSLFO
Style sheets allow us to specify how an XML document can be transformed into
new documents, and how that XML document could be presented in different
media formats. The languages associated with style sheets are XSL (Extensible
Stylesheet Language), XSLT (Extensible Stylesheet Language: Transforma-
tions), and XSLFO (Extensible Stylesheet Language: Formatting Objects).
A style sheet processor takes an XML document and a style sheet and produces
a result. XSL consists of two parts: It provides a mechanism for transforming
XML documents into new XML documents (XSLT), and it provides a vocabulary
for formatting objects (XSLFO). XSLT is a markup language that uses template
rules to specify how a style sheet processor transforms a document, and it is
a W3C Recommendation. XSLFO is a pagination markup language and is simply the
formatting vocabulary defined in the XSL W3C Recommendation.
Styling matters because style sheets add presentation to XML data. In
separating content (the XML data) from the presentation (the style sheet), you
take advantage of the success of what is called the Model-View-Controller
(MVC) paradigm. The act of separating the data (the model), how the data is
displayed (the view), and the framework used between them (the controller)
provides maximum reuse of your resources. When you use this technology, XML
data can simply be data. Your style sheets can transform your data into
different formats, eliminating the maintenance nightmare of trying to keep
track of multiple presentation formats for the same data. Embracing this model
allows you to separate your concerns about maintaining data and presentation.
Because browsers such as Microsoft Internet Explorer have style sheet
processors embedded in them, presentation can dynamically be added to XML data
at download time.
Figure 6.2 shows a simple example of the transformation and formatting
process. A style sheet engine (sometimes called an XSLT engine) takes an orig-
inal XML document, loads it into a DOM source tree, and transforms that doc-
ument with the instructions given in the style sheet. In specifying those
instructions, style sheets use XPath expressions to reference portions of the
source tree and capture information to place into the result tree. The result tree
is then formatted, and the resulting XML document is returned. Although the
original document must be a well-formed XML document, the resulting docu-
ment may be any format. Many times, the resulting document may be post-
processed. In the case of formatting an XML document with XSLFO styling, a
post-processor is usually used to transform the result document into a differ-
ent format (such as PDF or RTF—just to name a few).
At this point, we will show a brief example of using style sheets to add
presentation to content. Listing 6.1 shows a simple XML file that lists a
project, its description, and its schedule of workdays. Because our example
will show a browser dynamically styling this XML, we put the processing
instruction that names the style sheet on the second line.
Figure 6.2 Styling a document.
[Diagram: inside the stylesheet engine, the XML document is loaded into a
source tree; transformation, driven by the stylesheet, produces a result tree;
formatting then produces the resulting document.]
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="simplestyle.xsl" type="text/xsl"?>
<project name="Trumantruck.com">
  <description>Rebuilding a 1967 Chevy Pickup Truck</description>
  <schedule>
    <workday>
      <date>20000205</date>
      <description>Taking Truck Body Apart</description>
    </workday>
    <workday>
      <date>20000225</date>
      <description>Sandblasting, Dismantling Cab</description>
    </workday>
    <workday>
      <date>20000311</date>
      <description>Sanding, Priming Hood and Fender</description>
    </workday>
  </schedule>
</project>
Listing 6.1 A simple XML file.
To create an HTML page with the information from our XML file, we need to
write a style sheet. Listing 6.2 shows a simple style sheet that creates an HTML
file with the workdays listed in an HTML table. Note that all pattern matching
is done with XPath expressions. The <xsl:value-of> element returns the value
of items selected from an XPath expression, and each template is called by the
XSLT processor if the current node matches the XPath expression in the match
attribute.
<xsl:stylesheet xmlns:xsl=” />
  <xsl:template match="/">
    <html>
      <TITLE>Schedule For
        <xsl:value-of select="/project/@name"/>
        - <xsl:value-of select="/project/description"/>
      </TITLE>
      <CENTER>
        <TABLE border="1">
          <TR>
            <TD><B>Date</B></TD>
            <TD><B>Description</B></TD>
          </TR>
          <xsl:apply-templates/>
        </TABLE>
      </CENTER>
    </html>
  </xsl:template>
  <xsl:template match="project">
    <H1>Project:
      <xsl:value-of select="@name"/>
    </H1>
    <HR/>
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="schedule">
    <H2>Work Schedule</H2>
    <xsl:apply-templates/>
  </xsl:template>
  <xsl:template match="workday">
    <TR>
      <TD>
        <xsl:value-of select="date"/>
      </TD>
      <TD>
        <xsl:value-of select="description"/>
      </TD>
    </TR>
  </xsl:template>
</xsl:stylesheet>
Listing 6.2 A simple style sheet.
The resulting document is rendered dynamically in the Internet Explorer
browser, shown in Figure 6.3.
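For readers without a style sheet processor at hand, the effect of the workday template in Listing 6.2 can be approximated by hand. The sketch below is not an XSLT engine; it is a plain Python emulation, using the standard xml.etree.ElementTree module, of what that one template does: for each workday element in an abbreviated copy of Listing 6.1, it emits a table row containing the date and the description. The function name workday_rows is our own.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Abbreviated copy of the XML file in Listing 6.1.
PROJECT_XML = """<project name="Trumantruck.com">
  <description>Rebuilding a 1967 Chevy Pickup Truck</description>
  <schedule>
    <workday>
      <date>20000205</date>
      <description>Taking Truck Body Apart</description>
    </workday>
    <workday>
      <date>20000225</date>
      <description>Sandblasting, Dismantling Cab</description>
    </workday>
  </schedule>
</project>"""


def workday_rows(project):
    """Emulate the 'workday' template: one table row per workday element."""
    rows = []
    for workday in project.findall("schedule/workday"):
        # Equivalent of <xsl:value-of select="date"/> and select="description".
        date = escape(workday.findtext("date", default=""))
        description = escape(workday.findtext("description", default=""))
        rows.append("<TR><TD>%s</TD><TD>%s</TD></TR>" % (date, description))
    return rows


project = ET.fromstring(PROJECT_XML)
rows = workday_rows(project)
html_table = '<TABLE border="1">\n' + "\n".join(rows) + "\n</TABLE>"
```

The XML data stays untouched; only the function that plays the role of the style sheet knows anything about HTML, which is the separation of content and presentation described earlier.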
Why are style sheets important? In an environment where interoperability is
crucial, and where data is stored in different formats for different
enterprises, styling is used to translate one enterprise format to another
enterprise format. In a scenario where we must support different user
interfaces for many devices, style sheets are used to add presentation to
content, providing us with a rich mechanism for supporting the presentation of
the same data on multiple platforms. A wireless client, a Web client, a Java
application client, a .NET application client, or the application of your
choice can have different style sheets to present a customized view. In a
portal environment, style sheets can be used to provide a personalized view of
the data that has been retrieved from a Web service. Because there is such a
loose coupling between data content and presentation, style sheets allow us to
develop solutions more quickly and easily. Because we can use style sheet
processors to manipulate the format of XML documents, we can support
interoperability between enterprise applications supporting different XML
formats.
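The interoperability case, translating one enterprise format into another, can be illustrated with a small sketch. In practice this job would be given to a style sheet; here the same mapping is written directly in Python with the standard xml.etree.ElementTree module so that it can be run anywhere. The target format (a tasks element holding task elements with a date attribute) is hypothetical and exists only for this example.

```python
import xml.etree.ElementTree as ET

# A cut-down document in the format of Listing 6.1.
SOURCE_XML = """<project name="Trumantruck.com">
  <schedule>
    <workday>
      <date>20000205</date>
      <description>Taking Truck Body Apart</description>
    </workday>
  </schedule>
</project>"""


def to_task_list(project):
    """Map the project format onto a hypothetical <tasks>/<task> format."""
    tasks = ET.Element("tasks", {"owner": project.get("name", "")})
    for workday in project.iter("workday"):
        # The date child element becomes an attribute in the target format.
        task = ET.SubElement(
            tasks, "task", {"date": workday.findtext("date", default="")}
        )
        # The description child element becomes the task's text content.
        task.text = workday.findtext("description", default="")
    return tasks


converted = to_task_list(ET.fromstring(SOURCE_XML))
converted_xml = ET.tostring(converted, encoding="unicode")
```

The point of the design is that neither application changes its own format; the mapping lives in one place between them, which is where a style sheet would sit.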
Figure 6.3 A browser, using style sheets to render our example.
Figure 6.4 shows a diagram of many practical examples of style sheets in use.
In this diagram, different style sheets are applied to an XML document to
achieve different goals of presentation, interoperation, communication, and
execution. At the top of the diagram, you see how different style sheets can be
used to add presentation to the original XML content. In the case of XSLFO,
sometimes a post-processor is used to transform the XSLFO vocabulary into
another format, such as RTF and PDF. In the “interoperation” portion of
Figure 6.4, a style sheet is used to transform the document into another format
read by another application. In the “communication” portion of Figure 6.4, a
style sheet is used to transform the XML document into a SOAP message,
which is sent to a Web service. Finally, in the “execution” section, there are two
examples of how XML documents can be transformed into code that can be
executed at run time. These examples should give you good ideas of the power
of style sheets.
Hopefully, this section has given you the big picture on the importance of style
sheets and their uses. As you can see, the sky is the limit for the uses of style
sheets.
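Figure 6.4 Practical examples of style sheets in use.
[Diagram: one XML document is transformed by different style sheets.
PRESENTATION: 2XHTML.xsl produces an XHTML document viewed with a Web browser;
2fo.xsl produces an XSLFO document, which a PDF post processor turns into a
PDF document (viewed with a PDF viewer) and an RTF post processor turns into
an RTF document (viewed with an RTF viewer such as MS Word or StarOffice);
2WML.xsl produces a WML document sent to a wireless device.
INTEROPERATION: 2AnotherFormat.xsl produces an XML document used with another
XML application.
COMMUNICATION: 2SOAP.xsl produces a SOAP message sent to a Web service.
EXECUTION: 2JSP.xsl produces a Java Server Page sent to a server and compiled
for later execution; 2ANT.xsl produces a Jakarta Ant file for ANT program
execution.]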
XQuery
XQuery is a language designed for processing XML data, and it is intended to
make querying XML-based data sources as easy as querying databases.
Although there have been attempts at XML query languages before—such as
XQL, XML-QL, and Quilt—XQuery, a product of the W3C, is the first to
receive industry-wide attention. The Query Working Group of the W3C has
made it clear that there is a need for a human-readable query syntax, and there
is also a requirement for an XML-based query syntax. XQuery was designed to
meet the human-readable syntax requirement. It is an expression language
that is an extension of XPath. With few exceptions, every XPath expression is
also an XQuery expression. What XQuery provides, however, is a human-
readable language that makes it easy to query XML sources and combine that
with programming language logic.
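The idea of combining path expressions with programming language logic can be previewed without an XQuery processor. The following is a rough Python analogue, not XQuery itself: a path step selects the workday elements from the XML file in Listing 6.1, and ordinary language constructs then filter, sort, and reshape the result. The cutoff date and the variable names are chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# The XML file from Listing 6.1 (without the style sheet instruction).
PROJECT_XML = """<project name="Trumantruck.com">
  <description>Rebuilding a 1967 Chevy Pickup Truck</description>
  <schedule>
    <workday>
      <date>20000311</date>
      <description>Sanding, Priming Hood and Fender</description>
    </workday>
    <workday>
      <date>20000205</date>
      <description>Taking Truck Body Apart</description>
    </workday>
    <workday>
      <date>20000225</date>
      <description>Sandblasting, Dismantling Cab</description>
    </workday>
  </schedule>
</project>"""

project = ET.fromstring(PROJECT_XML)

# Path step: select every workday under schedule, as an XPath expression would.
workdays = project.findall("schedule/workday")

# Language logic layered on the path result: keep workdays on or after a
# cutoff date, sort them, and return (date, description) pairs.
# Dates are YYYYMMDD strings, so string comparison orders them correctly.
CUTOFF = "20000225"
late_workdays = sorted(
    (w.findtext("date"), w.findtext("description"))
    for w in workdays
    if w.findtext("date") >= CUTOFF
)
```

A query language such as XQuery expresses the selection, the filter, the ordering, and the construction of the result in a single expression rather than in separate host-language steps.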
Similar to style sheet transformations, XQuery processing uses XPath
expressions and its own functions to query and manipulate XML data. It is a
strongly typed language, and its syntax is very similar to that of a high-level
programming language, such as Perl. To demonstrate how XQuery works, we will
use the example XML file shown in Listing 6.1. The following is a valid