Microsoft SQL Server 2008 R2 Unleashed

CHAPTER 47 Using XML in SQL Server 2008
<ScrapReason>12
<!-- Comment: Name = Thermoform temperature too high -->
<?ModDatePI 1998-06-01T00:00:00?>
<WorkOrders WorkOrderIds="72370 72273 70875 69474 69173 68573 65970 60472
56975 56875 55275 53771 50370 47670 45773 42071 41975 39372 36673
36671 32872 32775 32770 31073 29370 27771 24174 22673 22670 17674
16073 13073 10274 9071 7771 4972 2573" />
</ScrapReason>
</ScrappedWorkOrders>
Let’s review the selected columns in Listing 47.11. The first is aliased with the asterisk (*) character, which tells SQL Server to generate the data for that column inline, as text. (Using the text() node test would do the same in this case.)
Next, the comment() node test is specified for Name, telling the XML generator to output its value in a comment. For clarity’s sake, we added a little syntactic sugar in this statement by prepending the text 'Comment: Name = ' to the value produced inside the comment.
Next, the processing-instruction() node test is specified to output each value of ModifiedDate to a new processing instruction called ModDatePI.
Finally, the fourth column is produced as a list of WorkOrderId values, using the data() node test in a nested FOR XML PATH statement. data() tells SQL Server to generate a space-delimited list of atomic column values, one value for each row in the result set.
Note that the nested query is used only to generate a list of WorkOrderId values. The empty string is given for the PATH keyword, telling the XML engine not to generate a default element at all, so the nested query emits no XML markup, only the list of values. You can extract and test the statement to see this in action.
The nested query applies the same WHERE clause as its parent to filter WorkOrderId values where the value of ScrapReasonId is 12. This keeps the nested data relevant to the outer query.
The resulting list of values is grafted onto the XML of the outer statement, using the column alias 'WorkOrders/@WorkOrderIds'.
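The column aliases described above can be tried without the AdventureWorks database. The following is a minimal sketch, assuming two hypothetical table variables (@Scrap and @Orders) with made-up rows; it is not the query from Listing 47.11, only an illustration of the same node tests.

```sql
-- Hypothetical sample data; table variables and values are illustrative only.
DECLARE @Scrap TABLE (ScrapReasonId int, Name nvarchar(50), ModifiedDate datetime);
INSERT @Scrap VALUES (12, N'Thermoform temperature too high', '1998-06-01');

DECLARE @Orders TABLE (WorkOrderId int, ScrapReasonId int);
INSERT @Orders VALUES (72370, 12), (72273, 12), (70875, 12);

DECLARE @Result xml;
SET @Result =
(
    SELECT
        S.ScrapReasonId              AS '*',                                 -- inline text
        N'Comment: Name = ' + S.Name AS 'comment()',                         -- XML comment
        S.ModifiedDate               AS 'processing-instruction(ModDatePI)', -- processing instruction
        (
            SELECT O.WorkOrderId AS 'data()'   -- space-delimited atomic values
            FROM @Orders O
            WHERE O.ScrapReasonId = S.ScrapReasonId
            ORDER BY O.WorkOrderId DESC
            FOR XML PATH('')                   -- empty name: no wrapper element
        )                            AS 'WorkOrders/@WorkOrderIds'
    FROM @Scrap S
    WHERE S.ScrapReasonId = 12
    FOR XML PATH('ScrapReason'), ROOT('ScrappedWorkOrders'), TYPE
);

SELECT @Result AS ScrappedWorkOrders;
```

The result should have the same shape as the output shown earlier: the reason ID as inline text, a comment, a processing instruction, and a WorkOrders element carrying the ID list in an attribute.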
FOR XML and the xml Data Type
By default, the results of any FOR XML query (in any of the four modes) are streamed to output as a one-column/one-row dataset with a column named XML_F52E2B61-18A1-11d1-B105-00805F49916B of type nvarchar(max). (In SQL Server 2000, this was a stream of XML split into multiple varchar(8000) rows.)
One of the biggest limitations of SQL Server 2000’s XML production was the inability to save the results of a FOR XML query to a variable or store it in a column directly without using some middleware code to first save the XML as a string and then insert it back into an ntext or nvarchar column and then select it out again.
Today, SQL Server 2008 natively supports column storage of XML, using the xml data type. Be sure to read the section “Using the xml Data Type,” later in this chapter, for a complete overview.
You can easily convert FOR XML results to instances of xml by using the TYPE directive with all four modes (RAW, AUTO, EXPLICIT, and PATH). Listing 47.12 demonstrates the TYPE directive; the query shown uses RAW mode.
LISTING 47.12 Using FOR XML RAW, TYPE to Create an Instance of the xml Data Type
SELECT *
FROM Production.WorkOrder WorkOrder
WHERE ScrapReasonId = 12
AND WorkOrderId = 72370
FOR XML RAW('WorkOrder'), ELEMENTS XSINIL, ROOT('WorkOrders'), TYPE
go
<WorkOrders xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<WorkOrder>
<WorkOrderID>72370</WorkOrderID>
<ProductID>329</ProductID>
<OrderQty>48</OrderQty>
<StockedQty>47</StockedQty>
<ScrappedQty>1</ScrappedQty>
<StartDate>2008-07-01T00:00:00</StartDate>
<EndDate>2008-07-11T00:00:00</EndDate>
<DueDate>2008-07-12T00:00:00</DueDate>
<ScrapReasonID>12</ScrapReasonID>
<ModifiedDate>2008-07-11T00:00:00</ModifiedDate>
</WorkOrder>
</WorkOrders>
Notice that in contrast to the preceding FOR XML examples, in this example, the query window in SQL Server Management Studio (SSMS) no longer displays the lengthy XML column UUID in the results frame, nor on the window tab. The results have been cast to a single instance of the xml data type, ready to be used in variables of type xml or in subsequent queries, inserted into xml columns, or returned to the client.
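To make the point about variables concrete, here is a minimal sketch that stores a FOR XML ... TYPE result in an xml variable and then queries it. The @Orders table variable and its rows are hypothetical stand-ins for Production.WorkOrder.

```sql
-- Hypothetical sample rows standing in for Production.WorkOrder.
DECLARE @Orders TABLE (WorkOrderId int, OrderQty int);
INSERT @Orders VALUES (72370, 48), (72273, 12);

-- TYPE makes the subquery return an xml instance, so it can be assigned directly.
DECLARE @OrdersXml xml;
SET @OrdersXml =
(
    SELECT WorkOrderId, OrderQty
    FROM @Orders
    ORDER BY WorkOrderId
    FOR XML RAW('WorkOrder'), ELEMENTS, ROOT('WorkOrders'), TYPE
);

-- The variable can now be used with the xml data type methods.
SELECT
    @OrdersXml.value('count(/WorkOrders/WorkOrder)', 'int')        AS OrderCount,
    @OrdersXml.value('(/WorkOrders/WorkOrder/OrderQty)[1]', 'int') AS FirstOrderQty;
```

Without TYPE, the same subquery would yield a string, and the assignment would rely on an implicit conversion from that string back to xml.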
The five xml data type methods (value(), exist(), nodes(), query(), and modify(), discussed later in this chapter, in the section “The Built-in xml Data Type Methods”) can be intermixed with relational queries by using all FOR XML modes. This makes it even easier to shape your XML exactly the way you want.
Listing 47.13 demonstrates how you can nest XQuery queries inside regular FOR XML T-SQL to produce XML documents built from both relational and XML sources.
LISTING 47.13 Bridging the Gap Between Relational and XML Data by Using FOR XML PATH and the xml Data Type
SELECT
    FirstName,
    LastName,
    E.JobTitle,
    Resume.query('
        declare namespace ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume";
        //ns:Education
    ') '*'
FROM HumanResources.Employee E
JOIN Person.Person C ON E.BusinessEntityID = C.BusinessEntityID
JOIN HumanResources.JobCandidate J ON J.BusinessEntityID = E.BusinessEntityID
WHERE J.JobCandidateId = 8
FOR XML PATH('AWorthyJobCandidate'), TYPE
go
<AWorthyJobCandidate>
<FirstName>Peng</FirstName>
<LastName>Wu</LastName>
<JobTitle>Quality Assurance Supervisor</JobTitle>
<ns:Education xmlns:ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume">
<ns:Edu.Level> </ns:Edu.Level>
<ns:Edu.StartDate>1986-09-15Z</ns:Edu.StartDate>
<ns:Edu.EndDate>1990-05-15Z</ns:Edu.EndDate>
<ns:Edu.Degree>Bachelor of Science</ns:Edu.Degree>
<ns:Edu.Major> </ns:Edu.Major>
<ns:Edu.Minor />
<ns:Edu.GPA>3.3</ns:Edu.GPA>
<ns:Edu.GPAScale>4</ns:Edu.GPAScale>
<ns:Edu.School>Western University</ns:Edu.School>
<ns:Edu.Location>
<ns:Location>
<ns:Loc.CountryRegion>US </ns:Loc.CountryRegion>
<ns:Loc.State>WA </ns:Loc.State>
<ns:Loc.City>Seattle</ns:Loc.City>
</ns:Location>
</ns:Edu.Location>
</ns:Education>
</AWorthyJobCandidate>
In this example, the asterisk (*) is used as a column alias for the results of the nested query (on HumanResources.JobCandidate.Resume), telling SQL Server to simply output the XML inline with the other nodes.
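The effect of the asterisk alias is easiest to see next to a named alias. The sketch below assumes a hypothetical @Candidates table variable with a simplified, namespace-free resume; it is only meant to contrast the two aliasing choices.

```sql
-- Hypothetical, simplified resume data (no namespaces) for illustration.
DECLARE @Candidates TABLE (Name nvarchar(50), Resume xml);
INSERT @Candidates
VALUES (N'Peng', N'<Resume><Education><Degree>Bachelor of Science</Degree></Education></Resume>');

-- Aliased with '*': the Education element is placed inline under Candidate.
DECLARE @Inline xml;
SET @Inline =
(
    SELECT Name, Resume.query('//Education') AS '*'
    FROM @Candidates
    FOR XML PATH('Candidate'), TYPE
);

-- Aliased with a name: the same XML is wrapped in a Schooling element.
DECLARE @Wrapped xml;
SET @Wrapped =
(
    SELECT Name, Resume.query('//Education') AS 'Schooling'
    FROM @Candidates
    FOR XML PATH('Candidate'), TYPE
);

SELECT @Inline AS InlineResult, @Wrapped AS WrappedResult;
```

Choosing '*' keeps the nested XML at the same level as the relational columns, which is what Listing 47.13 relies on.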
XML As Relational Data: Using OPENXML
This section covers what might be called the inverse of FOR XML: OPENXML. You use OPENXML in T-SQL queries to read XML data and shred (or decompose) it into relational result sets. OPENXML is a rowset provider: it appears in the FROM clause of a SELECT statement, where it generates a table from an XML source.
The first step required in this process is a call to the system stored procedure
sp_xml_preparedocument. sp_xml_preparedocument creates an in-memory representation
of any XML document tree for use in querying. It takes the following parameters:
. An integer output parameter for storing a handle to the document tree
. The XML input data
. An optional XML namespace declaration, used in subsequent OPENXML queries
sp_xml_preparedocument is able to convert the following data types into internal XML objects: text, ntext, varchar, nvarchar, single-quoted literal strings, and untyped XML (data from an xml column having no associated schema collection). This is its syntax:
sp_xml_preparedocument integer_variable OUTPUT[, xmltext ][, xpath_namespaces ]
And here is an example of OPENXML in use:
DECLARE @XmlDoc XML, @iXml int
SET @XmlDoc = '
<ex:ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples">
<ex:foo>hello</ex:foo>
<ex:bar>sql!</ex:bar>
</ex:ExampleDoc>'
EXEC sp_xml_preparedocument
@iXml OUTPUT,
@XmlDoc,
'<ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples"/>'
SELECT id, parentid, nodetype, localname, prefix
FROM OPENXML(@iXml, '/ex:ExampleDoc/ex:foo')
-- WITH (foo varchar(10) '/ex:ExampleDoc/ex:foo')
EXEC sp_xml_removedocument @iXml
go
id   parentid   nodetype   localname   prefix
---  ---------  ---------  ----------  ------
3    0          1          foo         ex
5    3          3          #text       NULL
Notice in the example that the WITH predicate has been commented out. This is to illustrate in the query results what is known as an edge table: the XML document in its relational form. Edge is a term taken from graph theory. It refers to what you might visualize as a depth line between two nodes.
If the edge table looks familiar, the reason is probably that it bears a resemblance to the universal table that must be created for EXPLICIT mode. As with the universal table, the edge table follows the adjacency list model for its hierarchical relationships. The node types of the input XML are marked in the nodetype column (1 = element, 2 = attribute, 3 = text). Namespaces are stored in namespaceuri, and the data of each node is stored in the text column.
If you uncomment the WITH predicate and change the select list to SELECT foo, you get back a one-row/one-column table with a column called foo that has the varchar(10) value hello. This shows that the WITH predicate instructs OPENXML how to decompose the nodes to columns by using XPath syntax.
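As a sketch of the WITH predicate in action, the following variation on the example selects the document element as the row and maps both child elements to columns. The relative column patterns ('ex:foo', 'ex:bar') and the second column are additions made for illustration; they are not part of the original example.

```sql
DECLARE @XmlDoc xml, @iXml int;
SET @XmlDoc = '
<ex:ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples">
  <ex:foo>hello</ex:foo>
  <ex:bar>sql!</ex:bar>
</ex:ExampleDoc>';

-- Build the in-memory document and register the ex prefix for later XPath use.
EXEC sp_xml_preparedocument
    @iXml OUTPUT,
    @XmlDoc,
    '<ExampleDoc xmlns:ex="urn:www-samspublishing-com:examples"/>';

-- One row per ExampleDoc element; each column pattern is relative to that row node.
SELECT foo, bar
FROM OPENXML(@iXml, '/ex:ExampleDoc')
WITH (
    foo varchar(10) 'ex:foo',
    bar varchar(10) 'ex:bar'
);

-- Always release the document handle when finished.
EXEC sp_xml_removedocument @iXml;
```

With the WITH predicate present, OPENXML returns the declared columns instead of the edge table.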
The syntax for OPENXML (including the WITH predicate) is as follows:
OPENXML(integer_document_handle_variable int, rowpattern nvarchar,[flags byte])
[WITH (SchemaDeclaration | TableName)]
Let’s match this syntax with the values in the example:
. The first parameter is the local variable @iXml, which acts as a handle to the internal XML representation.
. The next parameter is a row pattern in XPath syntax that tells OPENXML how to select nodes into rows. OPENXML generates one row in the result set for each node that matches this row pattern. This is similar to the .NET XmlDocument object’s SelectNodes() method, insofar as every node matching rowpattern returns a row in the rowset.
. The result set’s columns are then defined, using matching nodes as the context and the XPath in the column definitions of the WITH predicate to find the values relative to the node.
. The flags parameter is a combinable byte value that controls how the selected XML nodes are to be decomposed. The following values are possible:
    . 0: Uses attribute-centric decomposition. In this case, each attribute in the source XML is decomposed into a column. This is the default.
    . 1: Uses attribute-centric decomposition. May be combined with flag 2 (that is, the value 3 may be specified). Combining flags 1 and 2 tells the rowset generator how to deal with the values in the XML not yet accounted for in the downward parse of the XML document from nodes into rows. In other words, attribute-centric decomposition takes place before element-centric decomposition. This point is important because without the combinability of the flags, only one or the other decomposition will happen, and (lacking a WITH predicate that captures all the nodes) some nodes would not make it into the rowset.
    . 2: Uses element-centric decomposition. Combinable with flag 1 (that is, specify 3).
    . 8: Tells the rowset generator how to deal with text data in the metaproperties (not covered in this chapter). Can be combined with flags 1, 2, or both.
Note that the column generation determined by the flags 0, 1, and 2 can all be overridden by the XPath expressions in the lines of the WITH predicate. For example, if the 1 flag is specified to map a particular attribute to a column, but in the line of the WITH predicate for that same column, the XPath maps the value from an XML element, the WITH predicate takes precedence. In most cases, it’s best to just set the value of flags to 3, unless you need to ignore attributes or elements for some reason.
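The difference between the flag values can be seen by running the same row pattern three times against a document that carries one value in an attribute and another in a child element. This is a minimal sketch; the Orders document and the column names are hypothetical.

```sql
DECLARE @iXml int;
EXEC sp_xml_preparedocument
    @iXml OUTPUT,
    N'<Orders><Order OrderId="1"><Qty>5</Qty></Order></Orders>';

-- flags = 1 (attribute-centric): OrderId is filled from the attribute; Qty should be NULL.
SELECT 'flags 1' AS Mode, OrderId, Qty
FROM OPENXML(@iXml, '/Orders/Order', 1) WITH (OrderId int, Qty int);

-- flags = 2 (element-centric): Qty is filled from the child element; OrderId should be NULL.
SELECT 'flags 2' AS Mode, OrderId, Qty
FROM OPENXML(@iXml, '/Orders/Order', 2) WITH (OrderId int, Qty int);

-- flags = 3 (both): attributes are mapped first, then elements fill the remaining columns.
SELECT 'flags 3' AS Mode, OrderId, Qty
FROM OPENXML(@iXml, '/Orders/Order', 3) WITH (OrderId int, Qty int);

EXEC sp_xml_removedocument @iXml;
```

Because no XPath is given in the WITH lines here, the mapping is driven entirely by the flags and by column names matching attribute or element names.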
The syntax of the WITH predicate tells the rowset generator which column names and data types to use when mapping the XML to rows. If the structure of the input XML matches the schema of a particular table in your database, the name of that table may be specified. An example of this case occurs when the input XML has been produced from an existing table, using FOR XML. The values in the FOR XML-produced document have been updated, and the new values need to make it back into the table. The following code example illustrates this common scenario:
DECLARE @JobCandidateXmlDoc XML, @iXml int
SET @JobCandidateXmlDoc = '
<JobCandidateUpdate>
<ModifiedDate>
10/5/2008 12:34PM
</ModifiedDate>
</JobCandidateUpdate>'
EXEC sp_xml_preparedocument
@iXml OUTPUT,
@JobCandidateXmlDoc,
'<JobCandidateUpdate
xmlns:ns="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/Resume"/>';
UPDATE HumanResources.JobCandidate
SET ModifiedDate = OXML.ModifiedDate
FROM
(
SELECT *
FROM OPENXML(@iXml, '/JobCandidateUpdate', 2)
WITH HumanResources.JobCandidate
) AS OXML
-- Qualify the target table's column; OXML exposes a column of the same name.
WHERE HumanResources.JobCandidate.JobCandidateId = 8
EXEC sp_xml_removedocument @iXml
go
(1 row(s) affected)
If a table name is not specified, you need to specify a comma-separated list of column definitions, using the following syntax:
column_name datatype 'XPath'
The following list explains each part of the preceding syntax:
. column_name: Provides a relational name for the XML-produced column.
. datatype: Provides a T-SQL data type for the XML-produced column.
. 'XPath': Specifies an XPath pattern, evaluated relative to each node matched by the row pattern, that selects the node whose value is mapped to the XML-produced column.
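The three parts of a column definition can be sketched as follows. The Orders document, the column names, and the patterns are hypothetical; the example maps an attribute of the row node, a child element, and an attribute of the parent node.

```sql
DECLARE @iXml int;
EXEC sp_xml_preparedocument
    @iXml OUTPUT,
    N'<Orders>
        <Order OrderId="1">
          <Line Sku="A-1"><Qty>5</Qty></Line>
          <Line Sku="B-2"><Qty>3</Qty></Line>
        </Order>
      </Orders>';

-- Each Line element becomes one row; column patterns are relative to that Line.
SELECT OrderId, Sku, Qty
FROM OPENXML(@iXml, '/Orders/Order/Line')
WITH (
    OrderId int         '../@OrderId',  -- attribute of the parent Order element
    Sku     varchar(10) '@Sku',         -- attribute of the Line element itself
    Qty     int         'Qty'           -- child element of the Line element
);

EXEC sp_xml_removedocument @iXml;
```

Because every column carries its own XPath, the flags parameter can be omitted here.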
When you’re done reading out the XML, it’s important to free the memory used to hold
the internal XML document. You accomplish this by calling the system stored procedure
sp_xml_removedocument, as in the following example:
EXEC sp_xml_removedocument @iXml
Using the xml Data Type
The xml data type is a real problem solver for those who use both XML and SQL Server on a daily basis. Relational columns and XML data can be stored side by side in the same table, in an implementation that plays to the strengths of both. With SQL Server’s powerful XML storage, validation, querying, and indexing capabilities, it’s bound to cause quite a stir in the field of XML content management and beyond.
Some of the benefits of storing XML on the database tier can be realized immediately. Building middleware using the .NET Framework to manage XML stored in columns is a far more robust solution than depending on the filesystem; plus, it’s a lot easier to access the content from anywhere.
SQL Server inherently provides to stored XML the traditional DBMS benefits of backup and restoration, replication and failover, query optimization, granular locking, indexing, and content validation. The xml data type can be used with local variable declarations, as the output of user-defined functions, as input parameters to stored procedures and functions, and much more. XML instances containing up to 128 levels of nesting can be stored in xml columns; deeper instances cannot be inserted, nor may existing instances be made to increase beyond this depth via the modify() data type method.
xml columns can also be used to store code files such as XSLT, XSD, XHTML, and any other well-formed content. These files can then be retrieved by user-defined functions written in managed code hosted by SQL Server. (See Chapter 53, “SQL Server 2008 Reporting Services,” for a full review of SQL Server–managed hosting.)
NOTE
In some cases, it’s still a perfectly valid scenario to store XML on the filesystem or in
[n]varchar(max), [n]text, or [n]varbinary(max) columns. In a few cases this
usage is actually recommended. The following summary details some possible XML
usage scenarios and makes suggestions for each.
XML data is stored in an internal binary format and can be up to 2GB in size.
Before we dig into the many uses of the xml data type, it’s worthwhile to consider some of the different ways you can leverage your institution’s XML with SQL Server:
. XML can be used solely as a temporary output format produced from relational data, using FOR XML. This applies in scenarios in which the relational tables hold the real-time data and XML is produced only for read-only application uses, as in the display of dynamic web pages. In this scenario, the XML really just provides a DBMS-independent, easy-to-transform view of the data.
. XML can be stored in relational (nvarchar and so on) columns, as done previously. This might be the best option when your XML is sometimes not well formed or when the learning curve to XQuery is too high for an application-delivery time frame. This is also a valuable option when the byte-for-byte exactness of the XML must be preserved.
Note that the latter is a necessary option in some institutions because typed XML (that is, xml data type columns associated with a schema collection) storage disregards extra whitespace characters, namespace prefixes, attribute order, and the XML declaration to make way for query optimizations. This scenario also leverages fast data retrieval because, as far as SQL Server is concerned, XML is never brought into the mix (it’s all relational). The data can still be converted to the xml data type, using the methods described earlier, and applications can use OPENXML to read it as well. To read XML into SQL Server from server-side accessible files, you call the T-SQL OPENROWSET function.
. The XML can be stored as untyped XML; that is, XML stored in an xml data type column lacking an associated schema collection. This provides the benefits of querying the XML using the data type methods (discussed later in the section “The Built-in xml Data Type Methods”) and provides server-side checks for well-formed XML. This scenario also allows for the possibility that XML adhering to any (or no) schemas may reside in the column. A schema collection could be added later to provide validation on the existing data (although a few intermediate editing steps may be necessary if any documents fail to validate).
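The trade-off between string storage and xml storage can be sketched with two variables. The example below uses an untyped xml variable rather than a typed column, and the sample document is made up; the point is only that the text form is kept exactly while the xml form is not expected to round-trip byte for byte.

```sql
-- The original text, including an XML declaration and extra whitespace.
DECLARE @Raw nvarchar(max);
SET @Raw = N'<?xml version="1.0"?><doc  b="2"   a="1"><item/>   </doc>';

-- Converting to xml parses the text into SQL Server's internal representation.
DECLARE @AsXml xml;
SET @AsXml = CONVERT(xml, @Raw);

-- Converting back to text does not reproduce the original string exactly;
-- for instance, the XML declaration is not retained.
DECLARE @RoundTripped nvarchar(max);
SET @RoundTripped = CONVERT(nvarchar(max), @AsXml);

SELECT
    @Raw          AS StoredAsText,
    @RoundTripped AS RoundTripped,
    CASE WHEN @Raw = @RoundTripped THEN 'identical' ELSE 'different' END AS Comparison;
```

If exact preservation matters, keeping the nvarchar copy (or storing only that) is the safer choice; the xml form can always be derived from it when querying is needed.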
Safely armed with an understanding of some of the different options and uses, let’s plunge into our discussion of xml.
Defining and Using xml Columns
You can add columns of type xml to any table by using a familiar Data Definition
Language (DDL) syntax, with a few new twists. Much like their relational counterparts,
xml columns, parameters, and variables may contain null or non-null values.
The following snippet shows the DDL used to create the table HumanResources.JobCandidate from AdventureWorks2008. The column you are concerned with is Resume:
CREATE TABLE [HumanResources].[JobCandidate](
[JobCandidateID] [int] IDENTITY(1,1) NOT NULL,
[BusinessEntityID] [int] NULL,
[Resume] [xml](CONTENT [HumanResources].[HRResumeSchemaCollection]) NULL,
[ModifiedDate] [datetime] NOT NULL
CONSTRAINT [DF_JobCandidate_ModifiedDate] DEFAULT (getdate()),
CONSTRAINT [PK_JobCandidate_JobCandidateID] PRIMARY KEY CLUSTERED
(
[JobCandidateID] ASC
) ON [PRIMARY]
) ON [PRIMARY]
When you are defining objects of type xml, either of two facets may be applied:
. CONTENT: This facet specifies that well-formed XML documents as well as fragments may be inserted into the xml column or variable. (CONTENT is the default and may be omitted from the definition.) Fragments may have more than one top-level node (as is produced, by default, using FOR XML), and elements may be mixed with text-only nodes.
. DOCUMENT: This facet specifies that only well-formed, valid XML documents (with a single top-level element) conforming to a specified schema collection may be stored. Updates to the column must also result in schema-valid, well-formed XML.
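The two facets can be compared side by side with a throwaway schema collection. The sketch below is an assumption-laden illustration: dbo.DemoNoteSchema and its single note element are hypothetical, created and dropped only for this test, and the error text returned in the CATCH block is not reproduced here.

```sql
-- Hypothetical schema collection, created only for this illustration.
CREATE XML SCHEMA COLLECTION dbo.DemoNoteSchema AS
N'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="note" type="xs:string"/>
  </xs:schema>';
GO

DECLARE @Text nvarchar(200);
SET @Text = N'<note>first</note><note>second</note>';

-- CONTENT (the default facet): a fragment with two top-level elements is accepted.
DECLARE @Fragment xml(CONTENT dbo.DemoNoteSchema);
SET @Fragment = @Text;
SELECT @Fragment AS AcceptedFragment;

-- DOCUMENT: the same fragment is expected to be rejected, because only a
-- single top-level element is allowed.
DECLARE @Doc xml(DOCUMENT dbo.DemoNoteSchema);
BEGIN TRY
    SET @Doc = @Text;
    SELECT 'accepted' AS DocumentResult;
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() AS DocumentResult;
END CATCH;
GO

-- Clean up the illustrative schema collection.
DROP XML SCHEMA COLLECTION dbo.DemoNoteSchema;
```

The GO separators matter: the schema collection must exist before the batch that declares variables typed against it is compiled.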
XML schema collections can be associated with xml variables, parameters, or columns. The name of the schema collection is specified directly after the chosen facet, as is done in JobCandidate.Resume.
The following code example defines a typed xml local variable that allows only valid
Resume data to be stored in it:
DECLARE @ValidWellFormed xml (DOCUMENT HumanResources.HRResumeSchemaCollection)
Trying to insert the following well-formed but invalid document throws an error that says the first (and only) ThisBlowsUp element in the document is not declared in any of the schemas in HRResumeSchemaCollection:
SELECT @ValidWellFormed = '<ThisBlowsUp/>'
go
XML Validation: Declaration not found for element 'ThisBlowsUp'.
Location: /*:ThisBlowsUp[1]
When you change the facet to CONTENT (the default) and remove the schema association,
the following is possible:
DECLARE @WellFormed xml
SELECT @WellFormed = '<ThisWorks/>'
go
Command(s) completed successfully.

When defining xml columns, you can specify defaults and constraints just as you do with
relational columns. Consider the following example:
CREATE TABLE XmlExample
(
XmlColumn xml NOT NULL DEFAULT CONVERT(xml,'<root/>',0)
)
This example creates an xml column called XmlColumn that starts out having an empty root node. Notice how the string '<root/>' is converted to the xml type. This is actually not necessary because conversions from literal strings and from varchar to xml are implicit.
The next example adds a table-level constraint to XmlColumn to make sure the root node
always exists. It depends on a scalar-valued user-defined function to do its validation work:
CREATE FUNCTION dbo.fn_XmlColumnNotNull
(
@XmlColumnValue xml
)
RETURNS bit
AS
BEGIN
RETURN @XmlColumnValue.exist('/root')
END
GO
-- If XmlExample already exists from the previous example, drop it first.
CREATE TABLE XmlExample
(
XmlColumn xml NOT NULL DEFAULT CONVERT(xml,'<root/>',0)
)
GO
ALTER TABLE XmlExample WITH CHECK
ADD CONSTRAINT CK_XmlExample_HasRoot
CHECK (dbo.fn_XmlColumnNotNull(XmlColumn) = 1)

The following statement thus fails:
INSERT XmlExample SELECT '<foo/>'
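The reason the insert fails can be checked directly, without the table or the function. This minimal sketch calls exist() on two hypothetical xml variables; the constraint in the example passes only when the result is 1.

```sql
DECLARE @Good xml, @Bad xml;
SET @Good = '<root/>';
SET @Bad  = '<foo/>';

-- exist() returns 1 when the XQuery expression selects at least one node, 0 otherwise.
SELECT
    @Good.exist('/root') AS GoodHasRoot,  -- 1: would satisfy the CHECK constraint
    @Bad.exist('/root')  AS BadHasRoot;   -- 0: would violate the CHECK constraint
```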
