Nielsen c18.tex V4 - 07/21/2009 1:01pm Page 442
Part III Beyond Relational
creating tables with XML columns, and allows declaring XML variables and using them as parameters
and return values.
XQuery is a W3C-recommended language created to query and format XML documents. XQuery can
be used to query XML documents much as a SQL query is used to retrieve information from relational
tables. The XML data type implements a limited subset of the XQuery specification, and a T-SQL query
can use XQuery to retrieve information from XML columns or variables.
XQuery is built into the Relational Engine of SQL Server, and the Query Optimizer can build query
plans that contain relational query operations as well as XQuery operations. Results of XQuery
operations can be joined with relational data, or relational data can be joined with XQuery results.
SQL Server supports creating special types of indexes on XML columns to optimize XQuery operations.
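As a minimal sketch of mixing the two kinds of operations, the following query joins a relational
table with a value extracted by XQuery. The table variables, column names, and sample data are
hypothetical and exist only for illustration:

```sql
-- Hypothetical sample data; names are illustrative only.
DECLARE @Customers TABLE (CustomerNumber INT, CustomerName VARCHAR(20));
INSERT INTO @Customers (CustomerNumber, CustomerName)
VALUES (1001, 'Jacob');

DECLARE @Orders TABLE (OrderID INT, OrderData XML);
INSERT INTO @Orders (OrderID, OrderData)
VALUES (1, '<CustomerNumber>1001</CustomerNumber>');

-- The value() call evaluates an XQuery expression; its scalar result
-- is joined to a relational column like any other expression.
SELECT o.OrderID, c.CustomerName
FROM @Orders AS o
JOIN @Customers AS c
    ON c.CustomerNumber = o.OrderData.value('(/CustomerNumber)[1]', 'INT');
```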
XML Schema Definition (XSD) is another W3C-recommended language created for describing and
validating XML documents. XSD supports creating very powerful and complex validation rules that can be
applied to XML documents to verify that they are fully compliant with the business requirements. The
XML data type supports XSD validation, which is explained later in this chapter.
The XML data type supports a number of methods, listed here:
■ value()
■ exist()
■ query()
■ modify()
■ nodes()
Each of these methods is explained in detail later in this chapter.
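As a quick preview, the following sketch calls each of the five methods once on a small untyped
XML variable. The variable name and sample document are assumptions made for illustration:

```sql
DECLARE @x XML;
SET @x = '<Items>
  <Item ItemNumber="1001" Quantity="1" Price="950" />
  <Item ItemNumber="1002" Quantity="1" Price="650" />
</Items>';

-- value(): extract a single scalar value
SELECT @x.value('(/Items/Item/@Price)[1]', 'MONEY') AS FirstPrice;

-- exist(): returns 1 if the expression finds at least one node
SELECT @x.exist('/Items/Item[@ItemNumber="1002"]') AS HasItem1002;

-- query(): returns the matching nodes as an XML fragment
SELECT @x.query('/Items/Item[@Price > 700]') AS ExpensiveItems;

-- modify(): change the document in place using XML DML
SET @x.modify('replace value of
    (/Items/Item[@ItemNumber="1002"]/@Quantity)[1] with "2"');

-- nodes(): shred the document into one row per matching node
SELECT i.value('@ItemNumber', 'CHAR(4)') AS ItemNumber,
       i.value('@Quantity', 'INT') AS Quantity
FROM @x.nodes('/Items/Item') AS t(i);
```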
Typed and untyped XML
As mentioned earlier, support for XSD schema validation is implemented in SQL Server in the form
of XML schema collections. An XML schema collection can be created from an XML schema definition.
XML columns or variables can be bound to an XML schema collection. An XML column or variable that
is bound to an XML schema collection is known as typed XML.
When a typed XML value is modified, SQL Server validates the new value against the rules defined in
the XML schema collection. The assignment or modification operation will succeed only if the new value
passes the validations defined in the XML schema collection.
Typed XML has a number of advantages over untyped XML columns or variables. The most important
benefit is that the validation constraints are always respected. The content of a typed XML document is
always valid as per the schema with which it is associated.
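The following sketch shows the idea under stated assumptions: the schema collection name
CustomerSchema and the one-element schema are hypothetical, chosen only to keep the example short.
The second assignment is expected to be rejected because the attribute value is not a valid integer:

```sql
-- Hypothetical schema collection; the schema itself is illustrative only.
CREATE XML SCHEMA COLLECTION CustomerSchema AS
N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:element name="Customer">
      <xsd:complexType>
        <xsd:attribute name="CustomerNumber" type="xsd:int" use="required" />
      </xsd:complexType>
    </xsd:element>
  </xsd:schema>';
GO

-- Binding the variable to the schema collection makes it typed XML.
DECLARE @c XML(CustomerSchema);

-- Accepted: the value conforms to the schema.
SET @c = '<Customer CustomerNumber="1001" />';

-- Rejected with a validation error: "abc" is not a valid xsd:int.
SET @c = '<Customer CustomerNumber="abc" />';
```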
With typed XML, SQL Server has better knowledge of the XML document (structure, data types, and
so on) and can generate a more optimized query plan. Because SQL Server has complete knowledge of
the data types of elements and attributes, storage of typed XML can be made significantly more compact
than untyped XML.
Static type checking is possible with typed XML documents, and SQL Server can detect, at compile time,
if an XQuery expression on a typed XML document is mistyped. Stored procedures or functions that
accept typed XML parameters are protected from receiving invalid XML documents, as SQL Server will
perform implicit validation of the XML value against the schema collection before accepting the
parameter value.
Creating and using XML columns
The XML data type can be used like other native SQL Server data types in most cases. (Note that there
are exceptions, however. For example, an XML column cannot be added as a column to a regular index
or used in a comparison operation.) A table can be created with one or more XML columns, or XML
columns can be added to an existing table.
VARCHAR/NVARCHAR/VARBINARY/TEXT/NTEXT columns can be altered to XML data type columns if all the
existing values are well-formed XML values.
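A minimal sketch of both options follows. The table dbo.OrderArchive and its columns are
hypothetical; the ALTER COLUMN statement is expected to succeed only when every existing value
can be converted to XML:

```sql
-- Hypothetical table; names are illustrative only.
CREATE TABLE dbo.OrderArchive (
    OrderID   INT NOT NULL,
    OrderData NVARCHAR(MAX) NULL
);
GO
INSERT INTO dbo.OrderArchive (OrderID, OrderData)
VALUES (1, N'<Items><Item ItemNumber="1001" /></Items>');
GO
-- Change an existing string column to the XML data type.
ALTER TABLE dbo.OrderArchive ALTER COLUMN OrderData XML NULL;
GO
-- Add a new XML column to the existing table.
ALTER TABLE dbo.OrderArchive ADD OrderNotes XML NULL;
```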
Entire XML documents can be retrieved as part of a SELECT query, or specific information can be
extracted from within the XML documents. The following example shows a SELECT query that selects a
column from a table and a value from the XML document stored in each row:
DECLARE @t TABLE (OrderID INT, OrderData XML)
INSERT INTO @t(OrderID, OrderData)
SELECT 1,
    '<CustomerNumber>1001</CustomerNumber>
    <Items>
        <Item ItemNumber="1001" Quantity="1" Price="950"/>
        <Item ItemNumber="1002" Quantity="1" Price="650" />
    </Items>'
SELECT
    OrderID,
    OrderData.value('CustomerNumber[1]','CHAR(4)') AS CustomerNumber
FROM @t
/*
OrderID     CustomerNumber
1           1001
*/
The code might get a little more complex if the query needs to retrieve more than one element from the
XML document stored in each row. Such a query needs to generate more than one row for each row
stored in the base table. The nodes() method of the XML data type can be used to obtain an accessor
to each element within the XML document. The XML element collection returned by the nodes()
method can be joined with the base table using the CROSS APPLY operator, as shown in the following
example:
DECLARE @t TABLE (OrderID INT, OrderData XML)
INSERT INTO @t(OrderID, OrderData)
SELECT 1,
    '<CustomerNumber>1001</CustomerNumber>
    <Items>
        <Item ItemNumber="1001" Quantity="1" Price="950"/>
        <Item ItemNumber="1002" Quantity="1" Price="650" />
    </Items>'
SELECT
    OrderID,
    o.value('@ItemNumber','CHAR(4)') AS ItemNumber,
    o.value('@Quantity','INT') AS Quantity,
    o.value('@Price','MONEY') AS Price
FROM @t
CROSS APPLY OrderData.nodes('/Items/Item') x(o)
/*
OrderID     ItemNumber Quantity    Price
1           1001       1           950.00
1           1002       1           650.00
*/
The preceding examples use the value() method exposed by the XML data type. XML data type
methods are explained in detail later in this section.
Declaring and using XML variables
Just like other SQL Server native data types, XML variables can be created and used in T-SQL batches,
stored procedures, functions, and so on. The following example demonstrates a few different ways an
XML variable can be declared:
-- Declare an XML variable
DECLARE @x XML
-- Declare a TYPED XML Variable
DECLARE @x XML(CustomerSchema)
-- Declare a TYPED XML DOCUMENT Variable
DECLARE @x XML(DOCUMENT CustomerSchema)
-- Declare a TYPED XML CONTENT variable
DECLARE @x XML(CONTENT CustomerSchema)
The first example creates an untyped XML variable, and the second example creates a typed one. The
third example creates a DOCUMENT type variable, and the last one creates a CONTENT type variable.
DOCUMENT and CONTENT types are explained later in this chapter.
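As a brief sketch of the practical difference, the following batch assumes that a schema
collection named CustomerSchema already exists and declares a top-level Customer element with a
CustomerNumber attribute (a hypothetical schema used only for illustration):

```sql
-- Assumes the hypothetical schema collection CustomerSchema exists and
-- declares a top-level <Customer> element.
DECLARE @doc  XML(DOCUMENT CustomerSchema);
DECLARE @frag XML(CONTENT CustomerSchema);

-- CONTENT (the default) accepts a fragment with several top-level elements.
SET @frag = '<Customer CustomerNumber="1001" /><Customer CustomerNumber="1002" />';

-- DOCUMENT requires exactly one top-level element, so this is accepted.
SET @doc = '<Customer CustomerNumber="1001" />';

-- Assigning the two-element fragment to @doc would be rejected:
-- SET @doc = @frag;
```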
There is a slight difference in the way that an XQuery expression needs to be written for an XML
variable versus an XML column. While working with an XML variable, the query will always process
only one document at a time. However, while working with an XML column, more than one XML
document may be processed in a single batch operation. Because of this, the CROSS APPLY
operator is required while running such a query on an XML column (as demonstrated in the previous
example).
What follows is the version of the prior query that operates on an XML variable:
DECLARE @x XML
SELECT @x = '
    <CustomerNumber>1001</CustomerNumber>
    <Items>
        <Item ItemNumber="1001" Quantity="1" Price="950"/>
        <Item ItemNumber="1002" Quantity="1" Price="650" />
    </Items>'
SELECT
    o.value('@ItemNumber','CHAR(4)') AS ItemNumber,
    o.value('@Quantity','INT') AS Quantity,
    o.value('@Price','MONEY') AS Price
FROM @x.nodes('/Items/Item') x(o)
/*
ItemNumber Quantity    Price
1001       1           950.00
1002       1           650.00
*/
An XML variable may be initialized by a static XML string, from another XML or
VARCHAR/NVARCHAR/VARBINARY variable, from the return value of a function, or from the result of a
FOR XML query. The following example shows how to initialize an XML variable from the result of a
FOR XML query:
DECLARE @x XML
SELECT @x = (
SELECT OrderID
FROM OrderHeader
FOR XML AUTO, TYPE)
XML variables can also be initialized from an XML file, as demonstrated later in the section
"Loading/querying XML documents from disk files."
Using XML parameters and return values
Typed and untyped XML parameters can be passed to a stored procedure as INPUT as well as OUTPUT
parameters. XML parameters can be used as arguments as well as the return value of scalar functions or
in result columns of table-valued functions.
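The following is a minimal sketch of a stored procedure that takes one XML input parameter and
returns another XML value through an OUTPUT parameter. The procedure name, parameter names, and
sample document are hypothetical:

```sql
-- Hypothetical procedure; names are illustrative only.
CREATE PROCEDURE dbo.GetItemSummary
    @OrderData XML,          -- input parameter
    @Summary   XML OUTPUT    -- output parameter
AS
BEGIN
    SET NOCOUNT ON;
    -- Shred the input document and return the aggregates as XML.
    SELECT @Summary = (
        SELECT COUNT(*) AS ItemCount,
               SUM(i.value('@Price', 'MONEY')) AS TotalPrice
        FROM @OrderData.nodes('/Items/Item') AS t(i)
        FOR XML PATH('Summary'), TYPE);
END
GO

DECLARE @result XML;
EXEC dbo.GetItemSummary
    @OrderData = '<Items><Item Price="950" /><Item Price="650" /></Items>',
    @Summary   = @result OUTPUT;
SELECT @result AS Summary;
```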
When a function returns an XML data type value, XML data type methods can be directly called on the
return value, as shown in the following example:
-- Create a function that returns an XML value
CREATE FUNCTION GetOrderInfo(
    @OrderID INT
) RETURNS XML
AS
BEGIN
    DECLARE @x XML
    SELECT @x = (
        SELECT OrderID, CustomerID
        FROM OrderHeader
        WHERE OrderID = @OrderID
        FOR XML PATH(''),ROOT('OrderInfo'))
    RETURN @x
END
GO
-- Call the function and invoke the value() method
SELECT dbo.GetOrderInfo(1).value('(OrderInfo/CustomerID)[1]','INT')
    AS CustomerID
/*
CustomerID
1
*/
Loading/querying XML documents from disk files
The capability to load XML documents from disk files is one of the very interesting XML features
available with SQL Server. This is achieved by using the BULK rowset provider for OPENROWSET. The
following example shows how to load the content of an XML file into an XML variable:
/* The sample code below assumes that a file named "items.xml"
exists in folder c:\temp with the following content.
<Items>
    <Item ItemNumber="1001" Quantity="1" Price="950"/>
    <Item ItemNumber="1002" Quantity="1" Price="650" />
</Items>
*/
DECLARE @xml XML
SELECT
    @xml = CAST(bulkcolumn AS XML)
FROM OPENROWSET(BULK 'C:\temp\items.xml', SINGLE_BLOB) AS x
SELECT
    x.value('@ItemNumber','CHAR(4)') AS ItemNumber,
    x.value('@Quantity','INT') AS Quantity,
    x.value('@Price','MONEY') AS Price
FROM @xml.nodes('/Items/Item') i(x)
/*
ItemNumber Quantity    Price
1001       1           950.00
1002       1           650.00
*/
OPENROWSET(BULK [filename, option]) can even query the data in the file directly without
loading it to a table or variable. It can also be used as the source of an INSERT/UPDATE operation.
The following example queries the XML file directly:
SELECT
    x.value('@ItemNumber','CHAR(4)') AS ItemNumber,
    x.value('@Quantity','INT') AS Quantity,
    x.value('@Price','MONEY') AS Price
FROM (
    SELECT CAST(bulkcolumn AS XML) AS data
    FROM OPENROWSET(BULK 'C:\temp\items.xml', SINGLE_BLOB)
    AS x
) a
CROSS APPLY data.nodes('/Items/Item') i(x)
/*
ItemNumber Quantity    Price
1001       1           950.00
1002       1           650.00
*/
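Using the same file as the source of an INSERT operation might look like the following sketch. The
table variable @ItemFiles is hypothetical, and the file path is the one assumed by the preceding
examples:

```sql
-- Hypothetical target table; the file is assumed to exist at this path.
DECLARE @ItemFiles TABLE (FileID INT IDENTITY(1,1), ItemData XML);

-- Load the whole file as a single XML value in one row.
INSERT INTO @ItemFiles (ItemData)
SELECT CAST(BulkColumn AS XML)
FROM OPENROWSET(BULK 'C:\temp\items.xml', SINGLE_BLOB) AS x;

SELECT FileID, ItemData
FROM @ItemFiles;
```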
To use the OPENROWSET(BULK ...) option, the user must have the ADMINISTER BULK OPERATIONS
permission.
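Because this is a server-level permission, it is granted to a login rather than to a database user.
A minimal sketch follows, in which SomeLogin is a placeholder for an existing login:

```sql
-- Run by an administrator in the master database.
-- [SomeLogin] is a placeholder for an existing server login.
USE master;
GO
GRANT ADMINISTER BULK OPERATIONS TO [SomeLogin];
```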
Limitations of the XML data type
Though the XML data type comes with a number of very interesting capabilities, it has a number of
limitations as well. However, the limitations are not really "limiting," considering the extensive set
of functionality provided by the data type.
The stored representation of an XML data type instance cannot exceed 2 GB. The term "stored
representation" is important in the preceding statement, because SQL Server converts XML data type
values to an internal structure and stores that structure. This internal representation takes much less
space than the textual representation of the XML value. The following example demonstrates the
reduction in size when a value is stored as an XML data type value:
DECLARE @EmployeeXML XML, @EmployeeText NVARCHAR(500)
SELECT @EmployeeText = '
<EmployeeInfo>
    <EmployeeName>Jacob</EmployeeName>
    <EmployeeName>Steve</EmployeeName>
    <EmployeeName>Bob</EmployeeName>
</EmployeeInfo>'
SELECT DATALENGTH(@EmployeeText) AS StringSize
/*
StringSize
284
*/
SELECT @EmployeeXML = @EmployeeText
SELECT DATALENGTH(@EmployeeXML) AS XMLSize
/*
XMLSize
109
*/
The stored representation of the XML data type value is different from, and much more optimized than,
the textual representation, and the limit of 2 GB is on the stored representation. This means that an
XML data type column may be able to store XML documents containing more than 2 * 1024 * 1024 * 1024
VARCHAR characters.
Unlike other data types, XML data type values cannot be sorted or used in a GROUP BY expression. They
cannot be used in a comparison operation. However, they can be used with the IS NULL operator to
determine if the value is NULL. XML data type columns cannot be used in the key of an index. They
can only be used as included (non-key) columns of an index.
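The following sketch illustrates these rules with a hypothetical table variable. The IS NULL test
works directly on the XML column, whereas grouping and sorting are done here on a string conversion
of the value, which is one possible workaround rather than the only one:

```sql
-- Hypothetical sample data; names are illustrative only.
DECLARE @t TABLE (OrderID INT, OrderData XML);
INSERT INTO @t (OrderID, OrderData)
VALUES (1, '<Items />'), (2, NULL), (3, '<Items />');

-- IS NULL is allowed directly on an XML column.
SELECT OrderID
FROM @t
WHERE OrderData IS NULL;

-- Grouping and sorting on the XML column itself are not allowed, so the
-- value is converted to a string type first.
SELECT CAST(OrderData AS NVARCHAR(MAX)) AS OrderText,
       COUNT(*) AS Orders
FROM @t
GROUP BY CAST(OrderData AS NVARCHAR(MAX))
ORDER BY OrderText;
```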
To facilitate faster querying and searching over XML columns, SQL Server supports a
special type of index called an XML index. XML indexes are different from regular indexes and
are discussed later in this chapter.
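As a short preview, creating XML indexes might look like the following sketch. The table and index
names are hypothetical; note that a primary XML index requires the table to have a clustered
primary key:

```sql
-- Hypothetical table; a primary XML index needs a clustered primary key.
CREATE TABLE dbo.OrderXml (
    OrderID   INT NOT NULL PRIMARY KEY CLUSTERED,
    OrderData XML NOT NULL
);
GO
-- Primary XML index on the XML column.
CREATE PRIMARY XML INDEX PXML_OrderXml_OrderData
    ON dbo.OrderXml (OrderData);
GO
-- Optional secondary XML index built on top of the primary one.
CREATE XML INDEX SXML_OrderXml_OrderData_Path
    ON dbo.OrderXml (OrderData)
    USING XML INDEX PXML_OrderXml_OrderData FOR PATH;
```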
Understanding XML Data Type Methods
The XML data type supports a number of methods that allow various operations on the XML document.
The most common operations needed on an XML document might be reading values from elements
or attributes, querying for specific information, or modifying the document by inserting, updating, or
deleting XML elements or attributes. The XML data type comes with a number of methods to support all
these operations.
Any operation on an XML document is applied on one or more elements or attributes at a
specific location. To perform an operation, the location of the specific element or attribute has to be
specified.