Hands-On Microsoft SQL Server 2008 Integration Services part 19 docx

158 Hands-On Microsoft SQL Server 2008 Integration Services
Exercise (Configure File System Task)
In this final part, you will configure the File System task to move the downloaded
zipped files from the C:\SSIS\downloads folder to the C:\SSIS\downloads\Archive
folder. This task will move the files one by one with each iteration of the Foreach Loop
Container. It will use the variable User::fname, populated by Foreach Loop Container,
to determine the source filename.
8. Drag and drop the File System task from the Toolbox within the Enumerating
Files Container.
9. Double-click the File System Task icon to open the File System Task Editor.
Select Move File in the Operation field. In the General area on the right pane, fill
in the following details:
Figure 5-6 Configuring the Foreach Loop Container to enumerate fully qualified filenames
Chapter 5: Integration Services Control Flow Tasks 159
Name: Archive downloaded files
Description: This task copies downloaded files from the ‘downloads’ folder to the ‘Archive’ folder.
10. In the Source Connection section, set IsSourcePathVariable to True.
11. Click in the SourceVariable field and then click the down arrow to see the drop-
down list. Choose User::fname as shown in Figure 5-7.
12. In the Destination Connection section, verify that IsDestinationPathVariable is
set to False.
Figure 5-7 Configuring the File System task for moving files
13. Click in the DestinationConnection field and then click the down arrow to see
the drop-down list. Choose <New Connection…> to open the File Connection
Manager Editor. In the Usage type field, select Existing Folder and type
C:\SSIS\downloads\Archive in the Folder field. Note that a File Connection
Manager named Archive has been created.
14. The OverwriteDestination field allows you to overwrite files with the same
name in the destination folder. Be careful when configuring this option in a
production environment. Leave it set at the default value of False. Click OK to
close the File System Task Editor.
15. Now that your package is ready to be run, press F5 on the keyboard to run the
package and notice how the Enumerating Files Container turns yellow,
followed by the Archive Downloaded Files task changing from yellow to green. This
cycle is repeated twice before both objects stop processing and turn green
to declare success of the operation. Each time the Archive Downloaded Files task
changes color from yellow to green, one file has been moved. Stop debugging the
package by pressing Shift+F5.
16. Run Windows Explorer to check that the files have been moved from the
C:\SSIS\downloads folder to the Archive subfolder in this directory.
17. Press Ctrl+Shift+S to save all the files in this solution and then choose
File | Close Project.
Review
You have configured the Foreach Loop Container to enumerate over files in a folder
and pass the filenames via a variable to the File System task. The variable passed by
the Foreach Loop Container was used to set the source filename in the File System
task, which was configured to move files from a dynamic source to the hard-coded
destination Archive folder. In this exercise, you have seen the functionality provided
by SSIS components to run in synchronization, where one component was reading the
files one by one and passing the information to the other component that was moving
those files to a different folder as it receives the filenames from the parent container.
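The loop-and-move pattern from this exercise can be sketched outside SSIS. The Python below is only an analogy and not how SSIS implements it: the function name archive_files, the *.zip pattern, and the overwrite parameter are assumptions made for illustration. The loop variable fname plays the role of User::fname, and overwrite mirrors the intent of OverwriteDestination.

```python
import shutil
from pathlib import Path


def archive_files(source_dir, archive_dir, pattern="*.zip", overwrite=False):
    """Move each matching file from source_dir to archive_dir, one per iteration."""
    source = Path(source_dir)
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() takes a snapshot of the folder listing before any file is moved.
    for fname in sorted(source.glob(pattern)):
        if not fname.is_file():
            continue
        target = archive / fname.name
        # Refuse to replace an existing file unless explicitly allowed.
        if target.exists() and not overwrite:
            raise FileExistsError(f"{target} already exists")
        shutil.move(str(fname), str(target))
        moved.append(target)
    return moved
```

With the folders used in this exercise, a call might look like archive_files(r"C:\SSIS\downloads", r"C:\SSIS\downloads\Archive"). The pattern is not recursive, so files already in the Archive subfolder are not picked up again.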
Web Service Task
You can read data from a Web Service method and write that data to a variable or a file
using the Web Service task. For example, you can obtain a list of postal codes from the
local postal company, write it to a flat file using the Web Service task, and then do the
lookup against this postal codes file to clean or standardize your data at loading time.
The Web Service task uses the HTTP Connection Manager to connect to the web
service. The HTTP Connection Manager specifies the server URL, user credentials,
optional client certificate details, time-out length, proxy settings, and so on.
The Web Service Description Language (WSDL) is an XML-based language used
for defining web services in a WSDL file, which lists the methods that the web service
offers, the input parameters that the methods require, the responses that the methods
return, and how to communicate with the web service. Thus, a web service requires a
WSDL file to get details of settings to communicate with another web service. The
HTTP Connection Manager can specify in the Server URL field a web site URL or
a WSDL file URL. If you specify the WSDL file URL in the Server URL field, the
computer can download the WSDL file automatically. However, if you are specifying
the web site URL, you must copy the WSDL file to the local computer.
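To make the role of the WSDL file concrete, here is a minimal Python sketch that lists the methods (operations) declared in a WSDL 1.1 document together with their input and output messages. The sample WSDL, the service and operation names, and the helper list_operations are all hypothetical; only the WSDL 1.1 namespace is standard. This is not what the Web Service task does internally.

```python
import xml.etree.ElementTree as ET

# Standard WSDL 1.1 namespace.
WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}

# Hypothetical, heavily trimmed WSDL for an imaginary postal code service.
SAMPLE_WSDL = """<?xml version="1.0"?>
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="http://example.com/postal"
             name="PostalCodeService"
             targetNamespace="http://example.com/postal">
  <portType name="PostalCodePort">
    <operation name="GetPostalCodes">
      <input message="tns:GetPostalCodesRequest"/>
      <output message="tns:GetPostalCodesResponse"/>
    </operation>
    <operation name="LookupCity">
      <input message="tns:LookupCityRequest"/>
      <output message="tns:LookupCityResponse"/>
    </operation>
  </portType>
</definitions>"""


def list_operations(wsdl_text):
    """Map each operation name to its (input message, output message) pair."""
    root = ET.fromstring(wsdl_text)
    operations = {}
    for port_type in root.findall("wsdl:portType", WSDL_NS):
        for op in port_type.findall("wsdl:operation", WSDL_NS):
            inp = op.find("wsdl:input", WSDL_NS)
            out = op.find("wsdl:output", WSDL_NS)
            operations[op.get("name")] = (
                inp.get("message") if inp is not None else None,
                out.get("message") if out is not None else None,
            )
    return operations


if __name__ == "__main__":
    for name, (request, response) in list_operations(SAMPLE_WSDL).items():
        print(name, request, response)
```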
XML Task
Whenever you are working with XML data, you will most likely be using the XML
task to perform operations on the XML documents. This task is designed to work
with XML documents from the workflow point of view. If you instead want to
bring XML data, i.e., the content of an XML document, into the data flow to apply
transformations, you will be using the XML Source adapter while configuring your
Data Flow task. The XML Source adapter is available in the Data Flow Sources section
in the Toolbox when you’re working with the Data Flow task on the Data Flow panel.
Using the XML task, you can perform the following operations on XML
documents:
1. Retrieve XML documents and dynamically modify those documents at run time.
2. Select a segment of the data from the XML document using XPath expressions
similar to how you select data using an SQL query against database tables.
3. Transform an XML document using XSLT (extensible stylesheet language
transformations) style sheets and make it compatible with your application
database.
4. Merge multiple documents to make one comprehensive document at run time
and use it to create reports.
5. Validate an XML document against the specified schema definition.
6. Compare an XML document against another XML document.
The XML task can automatically retrieve a source XML document from a specified
location. To perform this operation, the XML task can use a File Connection Manager,
though you can directly enter XML data in the task or specify a variable to access the XML
file. If the XML task is configured to use a File Connection Manager, the connection
string specified inside the File Connection Manager provides the information of the
path of the XML file; however, if the XML task is configured to use a variable, the
specified variable contains the path to the XML document. At run time, other processes
or tasks in the package can dynamically populate this variable. Like the retrieval process
of XML documents, the XML task can save the result set after applying the defined
operation to a variable or file. By now, you can guess that to write to a file, the XML
task will be using a File Connection Manager.
The XML Task Editor has a dynamic configuration interface that changes depending
upon the type of operation you choose to apply to the XML documents. Following are
the descriptions of these configuration areas.
Input Section
As mentioned, the XML task can retrieve the source document that is specified under the
Input section in the XML Task Editor. You can choose from three available SourceType
options: Direct Input allows you to type in XML data directly in the Source field; File
Connection allows you to specify a File Connection Manager in the Source field; and
Variable allows you to specify a variable name in the Source field.
Second Operand Section
This section defines the second document required for the operation to be performed.
The type of second document depends on the type of operation. For example, the
second document type will be an XML document if you are merging two documents,
while the second document will be an XSD (XML Schema Definition) document if
you are trying to validate an XML document against an XSD schema. Again, as in the
Input section, you can choose among three types (Direct input, File connection,
and Variable) in the SecondOperandType field and, based on your choice, specify the
document details in the SecondOperand field.
Output Section
In this section, you specify whether you want to save the results of the operation
performed by running the XML task. You can save the results to a variable or a file by
using the File Connection Manager to specify the destination file. You can also choose
to overwrite the destination.
Operation Options Section
This section is dynamic and changes with the option selection. For example, for a
Diff operation, this section will change to the Diff Options section (see Figure 5-8),
and for Merge operation, this will become the Merge Options section with its specific
fields relevant to the operation. The two operations XSLT and Patch do not have this
section at all.
The XML task has six predefined operations for you to use. The configuration
layout of the options changes as soon as you select a different operation.
Validate
You can validate the XML document against a Document Type Definition (DTD) or
XML Schema Definition (XSD) schema. The XML document you want to validate is
specified in the Input section in the Editor, and the schema document is specified in the
Second Operand section. The type of schema document depends upon what you specify
for ValidationType—XSD or DTD. With either type of ValidationType, you can
choose to fail the operation on a validation failure in the FailOnValidationFail field.
Figure 5-8 The XML Task Editor
XSLT
You can perform XSL transformations on the XML documents using XSLT style
sheets. The Second Operand should contain the reference to the XSLT document,
which you can type directly into the field or specify by using either the File Connection
Manager or a variable.
XPATH
Using this operation, you can perform XPath queries and evaluations on the XML
document. The Second Operand should contain the XPath query,
which you can type directly into the field or specify by using either the File
Connection Manager or a variable. You can select the type of XPath operation in the
XPathOperation field. The XPathOperation field provides three options.
• Evaluation: Return the results of an XPath function such as sum().
• Node list: Return the selected nodes as an XML fragment.
• Values: Return the results in a concatenated string for text values of all the
selected nodes.
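The difference between the three XPathOperation options can be illustrated with Python's xml.etree.ElementTree. This is a sketch under assumptions: the sample document is invented, ElementTree supports only a subset of XPath and has no XPath functions, so the Evaluation case is emulated by summing in Python rather than by calling sum() inside the query.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document.
ORDERS = """<orders>
  <order id="1"><amount>10.5</amount></order>
  <order id="2"><amount>4.5</amount></order>
</orders>"""

root = ET.fromstring(ORDERS)

# The path expression selects every <amount> element under an <order>.
selected = root.findall("./order/amount")

# Node list: the selected nodes serialized as an XML fragment.
node_list = "".join(ET.tostring(n, encoding="unicode").strip() for n in selected)

# Values: the text of all selected nodes concatenated into one string.
values = "".join(n.text for n in selected)

# Evaluation: emulates sum(/orders/order/amount) outside the query engine.
evaluation = sum(float(n.text) for n in selected)

if __name__ == "__main__":
    print(node_list)
    print(values)
    print(evaluation)
```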
Merge
Using this operation, you can merge two XML documents. This operation adds the
contents of the document specified in the Second Operand section into the source
document. The operation can specify a merge location within the base document.
One thing to note here is that the XML task merges only the documents that have
Unicode encoding. To determine whether your documents use Unicode encoding,
open the XML document in any text editor, such as Notepad, and look at the beginning
of the document for encoding="UTF-8" in the XML declaration. UTF-8
indicates the 8-bit Unicode encoding.
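As a rough illustration of merging at a chosen location, the following Python sketch appends the root element of a second document under a node of the base document selected by a path. The function name merge_documents, the merge_path parameter, and the sample documents are assumptions for illustration; the XML task's own merge behavior may differ in detail.

```python
import xml.etree.ElementTree as ET


def merge_documents(base_xml, second_xml, merge_path="."):
    """Append the second document's root element at merge_path in the base document."""
    base = ET.fromstring(base_xml)
    second = ET.fromstring(second_xml)
    # "." selects the root of the base document itself.
    target = base.find(merge_path)
    if target is None:
        raise ValueError(f"merge location {merge_path!r} not found in base document")
    target.append(second)
    return ET.tostring(base, encoding="unicode")


if __name__ == "__main__":
    print(merge_documents("<catalog><books/></catalog>", "<book>SSIS</book>", "./books"))
```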
Diff
Using this operation you can compare the source XML document to the document
specified in the Second Operand section and write the differences to an XML
document called a Diffgram document. The Diff operation provides a number of
options to customize this comparison:
• DiffAlgorithm: Provides three choices: Auto, Fast, and Precise. Fast and Precise
select a fast or a precise comparison algorithm. The Auto option lets the Diff
operation decide whether to use a fast or precise comparison based on the size
of the documents being compared.
• IgnoreComments: Specifies whether comment nodes are compared.
• IgnoreNamespaces: Indicates whether the namespace URI (uniform resource
identifier) of an element and its attribute names are compared.
• IgnorePrefixes: Specifies whether prefixes of element and attribute names are
compared.
• IgnoreXMLDeclaration: Specifies whether the XML declarations are
compared.
• IgnoreOrderOfChildElements: XML documents have a hierarchical structure,
and this option specifies whether the order of child elements is compared.
• IgnoreWhiteSpaces: Specifies whether white spaces are compared.
• IgnoreProcessingInstructions: Specifies whether the processing instructions are
compared.
• IgnoreDTD: Specifies whether the DTD is ignored.
• FailOnDifference: Specifies whether the task fails if the Diff operation fails,
e.g., an XML document fails to validate according to the validation schema.
• SaveDiffGram: Choose to save the comparison result in a Diffgram document.
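To show how "ignore" options change the outcome of a comparison, here is a small Python sketch that normalizes two documents before comparing them. It only reports whether the documents are equal and does not produce a Diffgram. The function names and the two keyword options are assumptions that loosely mirror IgnoreWhiteSpaces and IgnoreOrderOfChildElements; this is not the algorithm the Diff operation uses.

```python
import xml.etree.ElementTree as ET


def normalize(elem, ignore_whitespace=True, ignore_child_order=False):
    """Reduce an element to a nested tuple that can be compared with ==."""
    text = elem.text or ""
    if ignore_whitespace:
        text = text.strip()
    children = [normalize(c, ignore_whitespace, ignore_child_order) for c in elem]
    if ignore_child_order:
        # Sorting makes sibling order irrelevant to the comparison.
        children.sort(key=repr)
    # Attributes are sorted so their written order never matters.
    # Text that follows a child element (its tail) is not compared in this sketch.
    return (elem.tag, tuple(sorted(elem.attrib.items())), text, tuple(children))


def documents_equal(first_xml, second_xml, **options):
    """Compare two XML strings after applying the chosen ignore options."""
    first = normalize(ET.fromstring(first_xml), **options)
    second = normalize(ET.fromstring(second_xml), **options)
    return first == second


if __name__ == "__main__":
    a = "<r><x>1</x><y>2</y></r>"
    b = "<r>\n  <y>2</y>\n  <x>1</x>\n</r>"
    print(documents_equal(a, b))
    print(documents_equal(a, b, ignore_child_order=True))
```

The default ElementTree parser already discards comments and processing instructions, so this sketch behaves as if those were always ignored.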
Patch
Using this operation, you can apply the Diffgram document you saved earlier in the
package during the Diff operation to an XML document. By doing this, you actually
create a new XML document that includes the contents of the Diffgram document
created earlier by the Diff operation.
Execute SQL Task
The Execute SQL task is the main workhorse for running SQL statements or stored
procedures, and it uses the power of the underlying relational database. If you have used
DTS, you may have used this task. Typically in DTS, once you have loaded data
into a database, you apply transformations using the Execute SQL task. These
transformations vary from generating salutations to lookup transformations, deriving
columns, or applying business rules using the SQL Server relational engine. The design
philosophy used in SQL Server 2008 Integration Services allows you to perform many
of these tasks during the loading phase while data is still in memory, thereby increasing
performance by reducing the repeated and inefficient transformations that require data
to be staged or involve read/write operations on hard disks. The power of the Execute
SQL task is still available in SSIS in a more usable form: it provides the ability to use
variables, to create expressions over the properties of the task, and to return a result set
to the control flow that can be used to populate a variable.
Using the Execute SQL task, you can perform workflow tasks such as creating, altering,
dropping, or truncating tables or views. You can run a stored procedure and store the result
in a variable to be used later in the package. You can use it to run a single
SQL statement, multiple SQL statements, or parameterized SQL statements, and to
save the rowset returned from the query into a variable. You have already used this
task in the “Using System Variables to Create Custom Logs” Hands-On exercise in
Chapter 3, where you used a parameterized SQL statement, and in the “Contacting
Opportunities” Hands-On exercise in Chapter 4, where you saved the resulting rowset
to a variable, which was then enumerated by a Foreach Loop Container.
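The two patterns mentioned here, a parameterized statement and a rowset saved to a variable that a loop then enumerates, can be sketched in Python with the standard sqlite3 module. This is an analogy only: SQLite stands in for the data source, and the table, column, and variable names are hypothetical rather than taken from the earlier exercises.

```python
import sqlite3

# An in-memory SQLite database stands in for the real data source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Opportunity (Name TEXT, Region TEXT)")
conn.executemany(
    "INSERT INTO Opportunity VALUES (?, ?)",
    [("Alice", "East"), ("Bob", "West"), ("Carol", "East")],
)

# Parameterized statement: the ? marker is bound to a variable at run time.
region = "East"
rowset = conn.execute(
    "SELECT Name FROM Opportunity WHERE Region = ? ORDER BY Name",
    (region,),
).fetchall()

# The saved rowset is enumerated afterward, much as a Foreach Loop would do.
names = [name for (name,) in rowset]
conn.close()

if __name__ == "__main__":
    print(names)
```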
If you scroll to the Maintenance Plan Tasks in the Control Flow Toolbox, you will
see a similar task, the Execute T-SQL Statement task. The Execute T-SQL Statement
task has a simpler interface than the Execute SQL task and is focused on
performing maintenance tasks on SQL Server databases using T-SQL. It doesn’t give
you any facility to run parameterized queries or direct the result set to the workflow,
whereas the Execute SQL task has a more complex interface and is designed for use in
a relatively complex workflow where you need to use SQL statements against not only
SQL Server but a variety of sources, deal with variables, run parameterized queries,
or direct the result set to the workflow.
Keep this task open in front of you and try various selections as we go through each
option, as the task contains dynamic fields that change depending upon the choices you
make in certain fields.
The Execute SQL Task Editor includes General, Parameter Mapping, Result Set,
and Expressions pages.
General Page
In this page, you define a Name and Description for the task under the General section.
In the Options section, you can specify a TimeOut value in seconds for the query to run
before timing out. By default, the TimeOut value is 0 (zero), indicating an infinite time.
The CodePage field allows you to specify the code page value.
In the Result Set section, you choose one of four options based upon the result set
returned by the SQL statement you specify in this task. Based on the type of SQL
statement—i.e., whether it is a SELECT statement or INSERT/UPDATE/DELETE
statement—the result set may or may not be returned. Also, the result set may contain
zero rows, one row, or many rows, and the following four options in the Execute SQL
Task Editor ResultSet field allow you to configure them:
• None: Use this value when you run an INSERT, UPDATE, or DELETE
statement that returns a result set containing zero rows.
• Single row: Use this value when the SQL statement or stored procedure returns a single
row in the result set.
• Full result set: Use this value when the query returns more than one row.
• XML: Use this value when the SQL statement returns a result set in XML format.
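The first three ResultSet choices correspond to how much of the query result the caller reads back. The Python sketch below shows that idea with the standard sqlite3 module. The function run_sql and its result_set parameter are hypothetical, the XML option is left out, and raising an error when a Single row query returns nothing is a choice made for this sketch.

```python
import sqlite3


def run_sql(conn, sql, params=(), result_set="None"):
    """Execute sql and shape the return value according to result_set."""
    cursor = conn.execute(sql, params)
    if result_set == "None":
        # Nothing is read back, e.g. for INSERT, UPDATE, or DELETE.
        return None
    if result_set == "Single row":
        row = cursor.fetchone()
        if row is None:
            raise LookupError("query returned no rows")
        return row
    if result_set == "Full result set":
        return cursor.fetchall()
    raise ValueError(f"unsupported result_set: {result_set!r}")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (x INTEGER)")
    run_sql(conn, "INSERT INTO t VALUES (?)", (1,))
    run_sql(conn, "INSERT INTO t VALUES (?)", (2,))
    print(run_sql(conn, "SELECT COUNT(*) FROM t", result_set="Single row"))
    print(run_sql(conn, "SELECT x FROM t ORDER BY x", result_set="Full result set"))
    conn.close()
```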
In the ConnectionType field of the SQL Statement section, the options are the EXCEL,
OLE DB, ODBC, ADO, ADO.NET, and SQLMOBILE connection manager types,
used for connecting to a data source. Depending on the type of connection manager
you’ve chosen, the Connection field provides a drop-down list of already configured
connection managers of the same type or provides a <New Connection…> option to
let you add a connection manager of the appropriate type. The interface provided by
the <New Connection…> option changes to match your selection of the connection
manager specified in the ConnectionType field.
Depending on the data source type, you can write a query using an SQL statement in
the dialect the specified data source can parse. Further, you can specify the source from
where the SQL statement can be read for execution in the SQLSourceType field. The
selection of the source in the SQLSourceType field changes the next field dynamically
(which is coupled to it) to match the SQLSourceType choice. The options available in
the SQLSourceType field and how it affects the coupled field are explained here:
• Direct input: Allows you to type an SQL statement directly in the task. This
changes the coupled field to SQLStatement, which provides an interface in which
to type your query.
• File connection: If you have multiple SQL statements written in a file, you can
choose this option to enable the Execute SQL task to get the SQL statements
from the file. Selecting this option changes the coupled field to FileConnection,
which allows you to specify a File Connection Manager to connect to an existing
file containing SQL statements.
• Variable: Enables the Execute SQL task to read the SQL statement from a
variable. This option changes the coupled field to the SourceVariable field, which
provides a drop-down list of all the system and user variables.
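The three SQLSourceType choices differ only in where the statement text comes from. A minimal Python sketch of that dispatch follows. The function resolve_sql, the dictionary standing in for package variables, and the variable name User::sql are assumptions for illustration, not part of SSIS.

```python
from pathlib import Path


def resolve_sql(source_type, source, variables=None):
    """Return the SQL text according to where the statement is stored."""
    if source_type == "Direct input":
        # source is the statement itself.
        return source
    if source_type == "File connection":
        # source is the path of an existing file containing SQL statements.
        return Path(source).read_text(encoding="utf-8")
    if source_type == "Variable":
        # source is the name of a variable that holds the statement.
        return (variables or {})[source]
    raise ValueError(f"unknown source type: {source_type!r}")


if __name__ == "__main__":
    package_variables = {"User::sql": "SELECT 1"}
    print(resolve_sql("Direct input", "SELECT 1"))
    print(resolve_sql("Variable", "User::sql", package_variables))
```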
