Module 3: Manipulating XML with Transact-SQL
Contents
Overview
Creating a Rowset from an XML Document
Specifying the Structure of a Rowset
Lab 3: Using OPENXML
Best Practices
Review
Information in this document is subject to change without notice. The names of companies,
products, people, characters, and/or data mentioned herein are fictitious and are in no way intended
to represent any real individual, company, product, or event, unless otherwise noted. Complying
with all applicable copyright laws is the responsibility of the user. No part of this document may
be reproduced or transmitted in any form or by any means, electronic or mechanical, for any
purpose, without the express written permission of Microsoft Corporation. If, however, your only
means of access is electronic, permission to print one copy is hereby granted.
Microsoft may have patents, patent applications, trademarks, copyrights, or other intellectual
property rights covering subject matter in this document. Except as expressly provided in any
written license agreement from Microsoft, the furnishing of this document does not give you any
license to these patents, trademarks, copyrights, or other intellectual property.
© 2001 Microsoft Corporation. All rights reserved.
Microsoft, ActiveX, BackOffice, BizTalk, MSDN, MS-DOS, SQL Server, Visual Basic, Visual
C++, Visual InterDev, Visual J++, Visual Studio, Windows, Windows Media, Windows NT, and
Windows 2000 are either registered trademarks or trademarks of Microsoft Corporation in the
U.S.A. and/or other countries.
Other product and company names mentioned herein may be the trademarks of their respective
owners.
Instructor Notes
This module provides students with an understanding of how to insert XML
data into SQL Server tables by using the OPENXML statement.
After completing this module, students will be able to:
Use the OPENXML statement to create a rowset from a single-level
Extensible Markup Language (XML) document.
Use the OPENXML statement to process rowsets from complex XML
documents.
Retrieve attributes, elements, or both from an XML document by specifying
the appropriate flags parameter with the OPENXML statement.
Use XML Path Language (XPath) expressions in rowpattern and colpattern
parameters to specify rowset structure.
Materials and Preparation
This section provides the materials and preparation tasks that you need to teach
this module.
Required Materials
To teach this module, you need Microsoft® PowerPoint® file 2091A_03.ppt.
Preparation Tasks
To prepare for this module, you should:
Read all of the materials for this module.
Complete the lab and practice.
Practice the demonstration.
Review the multimedia animation.
Presentation: 90 Minutes
Lab: 30 Minutes
Module Strategy
Use the following strategies to present this module:
Creating a Rowset from an XML Document
Emphasize the importance of using the sp_xml_removedocument system
stored procedure to remove from memory the tree generated by the
sp_xml_preparedocument system stored procedure.
Emphasize that the main purpose of using OPENXML is to shred a
document into one or more tables.
Do not go into details about the rowpattern, flags, and Table Name
parameters, and do not discuss colpattern parameters. These will be
introduced in the next section.
Specifying the Structure of a Rowset
Emphasize that combining flags parameters by using a logical OR can
introduce performance issues. A better approach is to specify either an
attribute-centric or element-centric mapping and use column patterns for
non–default-mapped data.
Overview
Creating a Rowset from an XML Document
Specifying the Structure of a Rowset
After completing this module, you will be able to:
Use the OPENXML statement to create a rowset from a single-level
Extensible Markup Language (XML) document.
Use the OPENXML statement to process rowsets from complex XML
documents.
Retrieve attributes, elements, or both from an XML document by specifying
the appropriate flags parameter with the OPENXML statement.
Use XML Path Language (XPath) expressions in rowpattern and colpattern
parameters to specify rowset structure.
Topic Objective: To provide an overview of the module topics and objectives.
Lead-in: In this module, you will learn about the OPENXML statement and how you can use it to manipulate XML data in Transact-SQL.
Creating a Rowset from an XML Document
Process Overview
Multimedia: Parsing and Shredding XML
Creating an Internal Tree
Retrieving a Rowset from an XML Tree
Inserting Data from XML Documents into Tables
Practice: Retrieving Rowsets with OPENXML
A rowset is an OLE DB object that contains the result set for a query. In a
trading partner integration scenario, you might need to generate a rowset from
an XML document. For example, a retailer sends orders to a supplier as XML
documents. The supplier must then generate rowsets from the XML in order to
insert the data into one or more tables in the database.
This section discusses the use of the OPENXML statement to generate rowsets
from data in XML documents.
Topic Objective: To introduce the topics in this section.
Lead-in: In an application integration scenario, you often must create a rowset from XML data in order to process it.
Process Overview
1. Receive an XML document
• Usually a parameter to a stored procedure
2. Generate an internal tree representation
• Use sp_xml_preparedocument to parse the XML
3. Retrieve a rowset from the tree
• Use OPENXML
4. Process the data from the rowset
• Usually “shredding” into permanent tables
5. Destroy the internal tree when it is no longer required
• Use sp_xml_removedocument
Processing XML data as a rowset involves the following five steps:
1. Receive an XML document.
When an application receives an XML document, it can process the
document by using Transact-SQL code. For example, when a supplier
receives an XML order from a retailer, the supplier logs the order in a
Microsoft® SQL Server™ database. Usually, the Transact-SQL code to
process the XML data is implemented in the form of a stored procedure, and
the XML string is passed as a parameter.
2. Generate an internal tree representation.
Before processing the document, use the sp_xml_preparedocument system
stored procedure to parse the XML document and transform it into an
in-memory tree structure. The tree is conceptually similar to a Document
Object Model (DOM) representation of an XML document. You can use
only a valid, well-formed XML document to generate the internal tree.
3. Retrieve a rowset from the tree.
You use the OPENXML statement to generate an in-memory rowset from
the data in the tree. Use XPath query syntax to specify the nodes in the tree
to be returned in the rowset.
Topic Objective: To describe the process used to generate a rowset from XML.
Lead-in: There are five steps involved in processing XML data as a rowset.
4. Process the data from the rowset.
Use the rowset created by OPENXML to process the data in the same way
that you would use any other rowset. You can query the rowset in a SELECT
statement or use it as the data source for INSERT, UPDATE, or DELETE
statements against database tables. The most common use of
OPENXML is to insert rowset data into permanent tables in a database. For
example, an XML order received by a supplier might contain data that must
be inserted into OrderHeader and OrderDetails tables.
5. Destroy the internal tree when it is no longer required.
Because the tree structure is held in memory, use the
sp_xml_removedocument system stored procedure to free the memory
when the tree is no longer required.
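The five steps can be sketched as a single Transact-SQL batch. This is a minimal illustration, not a production procedure: a local variable stands in for the stored procedure parameter, and the small order document is sample data.

```sql
-- 1. Receive an XML document (a local variable stands in for a parameter).
DECLARE @doc nvarchar(1000)
DECLARE @idoc integer
SET @doc = N'<order orderno="1001" orderdate="01/01/2001" custid="1235">
  <lineitem productid="14" quantity="2" price="15.99"/>
  <lineitem productid="17" quantity="1" price="5.49"/>
</order>'

-- 2. Generate the internal tree and obtain a handle to it.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc

-- 3 and 4. Retrieve a rowset from the tree and process it
--          (here the rowset is simply returned to the caller).
SELECT *
FROM OPENXML(@idoc, 'order', 1)
     WITH (orderno integer,
           orderdate datetime)

-- 5. Destroy the internal tree when it is no longer required.
EXEC sp_xml_removedocument @idoc
```

In a real application, step 4 would more typically be an INSERT into one or more permanent tables.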
Multimedia: Parsing and Shredding XML
In this animation, you will observe the process used to receive an XML
document and insert its data into tables in a SQL Server database.
Topic Objective: To introduce the animation, which is about 2½ minutes long.
Lead-in: In this animation, you will observe the process of shredding an XML document into a SQL Server database.
Creating an Internal Tree
Create the tree with sp_xml_preparedocument
Free memory with sp_xml_removedocument
CREATE PROC ProcessOrder @doc NText
AS
DECLARE @idoc integer
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
-- Process Document
EXEC sp_xml_removedocument @idoc
Before you can process an XML document by using Transact-SQL statements,
you must parse the document and transform it into a tree structure.
Parsing XML with sp_xml_preparedocument
The sp_xml_preparedocument system stored procedure parses an XML
document and generates an internal tree representation of the document, and has
the following syntax.
sp_xml_preparedocument hdoc OUTPUT [, xmltext]
[, xpath_namespaces]
The following table describes the parameters for the
sp_xml_preparedocument system stored procedure.
Parameter                     Description
xmltext                       The original XML document to be processed
hdoc                          A handle to the parsed XML tree
xpath_namespaces (optional)   XML namespace declarations that are used in row and column XPath expressions in OPENXML statements
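The optional xpath_namespaces parameter is easiest to understand with a namespace-qualified document. The following sketch assumes a hypothetical namespace URI (urn:example-orders) and prefix (o); neither comes from the course materials. The third argument is a small well-formed XML fragment whose only purpose is to declare the prefix that the row pattern uses.

```sql
DECLARE @doc nvarchar(1000)
DECLARE @idoc integer

-- Sample document; the namespace URI urn:example-orders is hypothetical.
SET @doc = N'<o:order xmlns:o="urn:example-orders"
                      orderno="1001" orderdate="01/01/2001"/>'

-- The third argument declares the o: prefix for use in XPath expressions.
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc,
     '<root xmlns:o="urn:example-orders"/>'

-- The row pattern can now refer to the namespace-qualified element.
SELECT *
FROM OPENXML(@idoc, '/o:order', 1)
     WITH (orderno integer,
           orderdate datetime)

EXEC sp_xml_removedocument @idoc
```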
Topic Objective: To describe the use of the sp_xml_preparedocument stored procedure.
Lead-in: Use the sp_xml_preparedocument stored procedure to parse an XML document and generate the internal tree.
The following example shows how to use the sp_xml_preparedocument
system stored procedure to parse an XML document that has been passed to a
new custom stored procedure.
CREATE PROC ProcessOrder @doc NText
AS
DECLARE @idoc integer
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
Freeing Memory with sp_xml_removedocument
SQL Server stores parsed documents in an internal cache. To avoid running out of
memory, use the sp_xml_removedocument system stored procedure to release
the document handle and destroy the tree structure when it is no longer
required.
You must call sp_xml_removedocument in the same query batch as the
sp_xml_preparedocument system stored procedure that is used to generate the
node tree. This is because the hdoc parameter that is used to reference the tree
is a local variable, and if it goes out of scope, there is no way to remove the tree
from memory.
Note: The sp_xml_preparedocument and sp_xml_removedocument system
stored procedures are implemented as extended stored procedures. This means
that you can call them from within your own custom stored procedures, but
not from within a user-defined function.
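One way to make sure that the tree is always released is to keep the prepare, process, and remove calls together in one custom stored procedure. The following sketch uses a hypothetical procedure name (CountLineItems) and assumes an order document with lineitem child elements. It relies on the documented return code convention for sp_xml_preparedocument: 0 for success and a value greater than 0 for failure.

```sql
-- Hypothetical procedure: counts the lineitem elements in an order document.
CREATE PROC CountLineItems @doc ntext
AS
DECLARE @idoc integer, @rc integer, @items integer

-- Parse the document; a nonzero return code indicates failure,
-- in which case no tree exists and there is nothing to remove.
EXEC @rc = sp_xml_preparedocument @idoc OUTPUT, @doc
IF @rc <> 0
    RETURN @rc

-- Process the document while the handle is in scope.
SELECT @items = COUNT(*)
FROM OPENXML(@idoc, 'order/lineitem', 1)
     WITH (productid integer)

-- Release the tree before @idoc goes out of scope.
EXEC sp_xml_removedocument @idoc

SELECT @items AS lineitems
RETURN 0
```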
Retrieving a Rowset from an XML Tree
OPENXML syntax
The rowpattern parameter determines rows
The WITH clause determines columns
Use the Flags parameter to determine attribute centricity
or element centricity
Using OPENXML
Valid anywhere you can use a rowset
Commonly used in SELECT statements
SELECT * FROM OpenXML (@idoc, 'order', 1)
WITH (orderno integer,
orderdate datetime)
After you have parsed an XML document by using the
sp_xml_preparedocument system stored procedure, and a handle to the
internal tree has been returned, you can generate a rowset from the parsed tree.
Using the OPENXML Statement
You use the OPENXML statement to retrieve a rowset from the tree. You can
then use the rowset in Transact-SQL SELECT, INSERT, or UPDATE statements,
for example to modify data in a database.
The OPENXML statement has the following syntax.
OPENXML(idoc, rowpattern [, flags])
[WITH (SchemaDeclaration) | TableName]
The following table describes parameters of the OPENXML statement.
Parameter           Description
rowpattern          XPath query defining the nodes that should be returned
idoc                Handle to the internal tree representation of the XML document
flags               Optional bit-mask determining attribute centricity or element centricity
SchemaDeclaration   Rowset schema declaration for the columns to be returned
TableName           Name of an existing table, the schema of which should be used to define the columns that are returned
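As a brief illustration of the flags parameter, the following sketch retrieves the same two columns from a document that stores its values in child elements rather than attributes. A flags value of 1 requests attribute-centric mapping and a value of 2 requests element-centric mapping. The element-based order document shown here is sample data created for this illustration.

```sql
DECLARE @doc nvarchar(1000)
DECLARE @idoc integer

-- Sample document that uses child elements instead of attributes.
SET @doc = N'<order>
  <orderno>1001</orderno>
  <orderdate>01/01/2001</orderdate>
</order>'

EXEC sp_xml_preparedocument @idoc OUTPUT, @doc

-- flags = 2: each column is mapped to a child element of the same name.
SELECT *
FROM OPENXML(@idoc, 'order', 2)
     WITH (orderno integer,
           orderdate datetime)

EXEC sp_xml_removedocument @idoc
```

With a flags value of 1, the same query would look for orderno and orderdate attributes on the order element, which this sample document does not contain.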
Topic Objective: To describe the OPENXML statement.
Lead-in: After you have parsed the XML document, you can use the OPENXML statement.
Delivery Tip: Although the rowpattern, flags, and Table Name syntax elements are only briefly mentioned here, they will be discussed in detail in the next section.
You can use an OPENXML statement anywhere you can use a rowset
provider, such as a table, view, or the OPENROWSET function. You use the
OPENXML statement primarily in SELECT statements, as shown in the
following example.
SELECT * FROM OPENXML(@idoc, 'order/lineitem', 1)
WITH (productid integer,
price money)
In the preceding example, “@idoc” is a handle to the internal tree
representation of the following XML order document:
<order orderno="1001" orderdate="01/01/2001" custid="1235">
<lineitem productid="14" quantity="2" price="15.99"/>
<lineitem productid="17" quantity="1" price="5.49"/>
<lineitem productid="21" quantity="2" price="14.99"/>
</order>
The following table shows the rowset that the preceding OPENXML statement
returns.
productid price
14 15.99
17 5.49
21 14.99
Formatting the Rowset
You use the WITH clause of the OPENXML statement to define the structure
of the rowset returned by the query. The rowset must be compatible with the
table into which the data is inserted. If you do not include a WITH clause with
an OPENXML statement, the query returns an edge table, which is discussed
later in this module.
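To see what OPENXML returns when the WITH clause is omitted, you can run a query such as the following sketch against a small sample order document. The edge table contains one row per node in the selected part of the tree. The query lists only some of its columns (id, parentid, nodetype, localname, and text).

```sql
DECLARE @doc nvarchar(1000)
DECLARE @idoc integer

SET @doc = N'<order orderno="1001" orderdate="01/01/2001" custid="1235">
  <lineitem productid="14" quantity="2" price="15.99"/>
</order>'

EXEC sp_xml_preparedocument @idoc OUTPUT, @doc

-- No WITH clause: OPENXML returns the edge table for the matched nodes.
SELECT id, parentid, nodetype, localname, [text]
FROM OPENXML(@idoc, 'order/lineitem')

EXEC sp_xml_removedocument @idoc
```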
A WITH clause must contain a description of the mapping between the XML
nodes and the columns in the rowset. This description is in the form of a
schema declaration, which can be a table name or a table schema. The
following table contains guidelines for when to use a name or a schema.
Use            When
Table Name     You have an existing table with the exact structure that you want, so that the table schema is the same as the rowset schema.
Table Schema   You need to map the attributes and elements in the XML document to columns in a table that have different names from the columns in the rowset.
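The table name form of the WITH clause can be sketched as follows. The lineitems table is hypothetical and is created only for this illustration. Its column names match the attribute names in the sample document, so no explicit schema declaration is needed.

```sql
-- Hypothetical table whose columns match the lineitem attributes.
CREATE TABLE lineitems
    (productid integer,
     quantity  integer,
     price     money)
GO

DECLARE @doc nvarchar(1000)
DECLARE @idoc integer

SET @doc = N'<order orderno="1001" orderdate="01/01/2001" custid="1235">
  <lineitem productid="14" quantity="2" price="15.99"/>
  <lineitem productid="17" quantity="1" price="5.49"/>
</order>'

EXEC sp_xml_preparedocument @idoc OUTPUT, @doc

-- WITH lineitems: the rowset takes its columns from the existing table.
INSERT lineitems
SELECT *
FROM OPENXML(@idoc, 'order/lineitem', 1)
     WITH lineitems

EXEC sp_xml_removedocument @idoc
```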
Inserting Data from XML Documents into Tables
Use an INSERT statement for an existing table
Use a SELECT INTO statement to create a new table
INSERT orders
SELECT * FROM OpenXML (@idoc, 'order', 1)
WITH (orderno integer,
orderdate datetime)
SELECT * INTO neworders
FROM OpenXML (@idoc, 'order', 1)
WITH (orderno integer,
orderdate datetime)
The primary use of the OPENXML statement is to insert XML data into tables,
which is a common task when integrating two applications that use XML to
represent business data. You can achieve this integration by using an
OPENXML query within an INSERT statement, or by using a SELECT
INTO statement.
The following table contrasts the INSERT and SELECT INTO statements.
Statement     Usage                                                                             Permissions
INSERT        Inserts data only into existing tables.                                           Requires INSERT permission on the table.
SELECT INTO   Creates a new table and populates it with the result set of a SELECT statement.   Requires CREATE TABLE permission in the destination database.
Topic Objective: To describe how OPENXML can be used to insert XML data into a table.
Lead-in: The primary use of OPENXML is to insert XML data into tables.
For Your Information: Previous versions of SQL Server required that the SELECT INTO / BULKCOPY option be set before using SELECT INTO to create a permanent table. SQL Server 2000 has no requirement to set this option.
Using the INSERT Statement with Existing Tables
You use the INSERT statement to insert data from an XML document into an
existing SQL Server table. For example, use the following Transact-SQL
statement to insert data from an XML document into an existing orders table.
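INSERT orders
SELECT * FROM OPENXML(@idoc, 'order', 1)
-- Column patterns map the orderdate and custid attributes
-- to the differently named [date] and customer columns.
WITH (orderno integer,
[date] datetime '@orderdate',
customer integer '@custid')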
This Transact-SQL statement inserts the following data into the orders table.
orderno date customer
1001 01/01/2001 1235
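A single order document often has to be shredded into more than one table. The following sketch inserts the order attributes into a header table and the line items into a details table. Table variables stand in for the permanent OrderHeader and OrderDetails tables, and their column lists are assumptions made for this illustration. The '../@orderno' column pattern, which reads an attribute from the parent element, is an XPath expression of the kind covered in the next section.

```sql
-- Table variables stand in for permanent tables; the columns are assumed.
DECLARE @OrderHeader TABLE
    (orderno integer, orderdate datetime, custid integer)
DECLARE @OrderDetails TABLE
    (orderno integer, productid integer, quantity integer, price money)

DECLARE @doc nvarchar(1000)
DECLARE @idoc integer

SET @doc = N'<order orderno="1001" orderdate="01/01/2001" custid="1235">
  <lineitem productid="14" quantity="2" price="15.99"/>
  <lineitem productid="17" quantity="1" price="5.49"/>
  <lineitem productid="21" quantity="2" price="14.99"/>
</order>'

EXEC sp_xml_preparedocument @idoc OUTPUT, @doc

-- One header row per order element.
INSERT @OrderHeader
SELECT *
FROM OPENXML(@idoc, 'order', 1)
     WITH (orderno integer,
           orderdate datetime,
           custid integer)

-- One detail row per lineitem; orderno comes from the parent order element.
INSERT @OrderDetails
SELECT *
FROM OPENXML(@idoc, 'order/lineitem', 1)
     WITH (orderno integer '../@orderno',
           productid integer,
           quantity integer,
           price money)

EXEC sp_xml_removedocument @idoc

SELECT * FROM @OrderHeader
SELECT * FROM @OrderDetails
```

Both INSERT statements use the same document handle, so the document is parsed only once.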
Using the SELECT INTO Statement to Create a Table
You use the SELECT INTO statement to create a new table containing data
from an XML document. For example, use the following Transact-SQL
statement to create a new table named neworders that contains data from an
XML document.
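SELECT * INTO neworders
FROM OPENXML(@idoc, 'order', 1)
-- Column patterns map the orderdate and custid attributes
-- to the differently named [date] and customer columns.
WITH (orderno integer,
[date] datetime '@orderdate',
customer integer '@custid')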
This statement creates the following neworders table.
orderno date customer
1001 01/01/2001 1235