
Hands-On Microsoft SQL Server 2008 Integration Services part 34 pptx


308 Hands-On Microsoft SQL Server 2008 Integration Services
because both the containers were running under one transaction that was started
by the package, and when one of the tasks in a container fails, the transaction rolls
back all the work done by previous tasks. One lesson to learn from this exercise
is that the parent container, which is the package in this case, must have its
TransactionOption property set to Required to start a transaction, and the child
containers need to have at least the Supported attribute for this property.
Exercise (Case III: Transaction Spanning over Multiple Packages)
In the last part of this exercise, you will use a transaction to roll back the inconsistent
data when your loading process uses multiple packages. When you have multiple
packages to process, you use the Execute Package task to embed them inside a single
package to run them. The Execute Package task is basically a wrapper task that enables
a package to be used inside another package. The Execute Package task is covered in
Chapter 5.
26. Right-click the SSIS Packages node in the Solution Explorer window and choose
New SSIS Package from the context menu. You will see that the new package has
been added with the default name of Package1.dtsx and the screen is switched to
the new package. Note that the Designer shows these two packages as tabs.
27. Go to Package.dtsx, right-click the localhost.Campaign Connection Manager,
and choose Copy. Switch back to Package1.dtsx and paste this connection
manager in the Connection Managers area.
28. Again go to Package.dtsx and cut the Sequence Container 1 with Loading
Vehicle Task, return to Package1.dtsx, and paste this container on the Control
Flow. You will see a validation error about the connection manager on the
Loading Vehicle task. This is because the ID for the localhost.Campaign
Connection Manager has been changed.
29. Double-click the Loading Vehicle task icon to open the editor. In the Connection
field, choose localhost.Campaign Connection Manager from the drop-down list
and click OK. You’ve divided the first package into two separate packages. To run
these two packages as a single job, you need to create a new package and call these
two packages using the Execute Package task.


30. Right-click the SSIS Packages node in the Solution Explorer window and choose
New SSIS Package from the context menu. When the new blank package is
loaded, drop two Execute Package tasks on the Control Flow surface.
31. Rename the first Execute Package task Package and the second task Package1.
Join Package to Package1 using an on-success precedence constraint.
32. Double-click the Package icon to open the editor. Go to the Package page and
change the Location field value to File System.
Chapter 8: Advanced Features of Integration Services 309
33. Click in the Connection field and then click the drop-down arrow and choose
<New Connection . . .>. In the File Connection Manager Editor’s File field, type
C:\SSIS\Projects\Maintaining data Integrity with Transactions\Package.dtsx
and click OK. You will see Package.dtsx displayed in the Connection field. Click
OK to close the Execute Package Task Editor.
34. As in the last two steps, open the editor for the Package1 task, change the Location
to File System, and add a file connection manager in the Connection field pointing
to C:\SSIS\Projects\Maintaining data Integrity with Transactions\Package1.dtsx
as the existing file. Close the Execute Package Task Editor after making these
changes.
35. Click anywhere on the blank surface of the Control Flow panel and press F4
to open the Properties window for the package. Scroll down and locate the
Transactions section and set the TransactionOption property to Required. This will
run both Execute Package tasks, and hence the child packages, in the context
of a single transaction. However, before proceeding any further, verify that
TransactionOption is set to the default value, Supported, on the Package and
Package1 tasks and on the Package1.dtsx package. Package.dtsx will have this
property set to Required, which is okay, as this also enables it to join the
transaction started by Package2.dtsx. At this time, your package will look like
the one shown in Figure 8-7.
Figure 8-7 Calling multiple packages using the Execute Package tasks
36. Go to the Solution Explorer window, right-click Package2.dtsx, and then select
Execute Package from the context menu. You will see that Package.dtsx executes
successfully and then Package1.dtsx executes but fails, and its components
turn red.
37. Switch to SQL Server Management Studio and run the command you created
in Step 10 in the first sequence of steps to see the results. You will see that
still no record has been added to the tables, despite the fact that Package.dtsx
executed successfully. This is because both the packages were running under
one transaction. And when the Loading Vehicle task failed in the Package1.dtsx
package, the transaction rolled back not only all the tasks in this package but also
the tasks in the other package, Package.dtsx.
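The all-or-nothing behavior seen in Steps 36 and 37 can be sketched outside SSIS. The following Python snippet is only an illustration under stated assumptions, not a description of how SSIS or the distributed transaction coordinator works internally: an in-memory SQLite database stands in for the Campaign database, the column lists are guessed, and the customer name and e-mail address are made-up sample values. Only the failing Vehicle insert is taken from the exercise.

```python
import sqlite3

# In-memory SQLite stand-in for the Campaign database (column lists are assumed).
# isolation_level=None lets the sketch issue BEGIN/COMMIT/ROLLBACK explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE NewCustomer (CustomerID TEXT, FirstName TEXT, SurName TEXT);
    CREATE TABLE EmailAddress (CustomerID TEXT, Email TEXT, Type TEXT);
    CREATE TABLE Vehicle (CustomerID TEXT, VIN TEXT NOT NULL, Series TEXT, Model TEXT);
""")


def run_as_one_transaction(connection, statements):
    """Run all statements in one transaction; undo every one of them if any fails."""
    connection.execute("BEGIN")
    try:
        for sql in statements:
            connection.execute(sql)
    except sqlite3.Error:
        connection.execute("ROLLBACK")
        return False
    connection.execute("COMMIT")
    return True


statements = [
    # Work done by the first child package (hypothetical sample values).
    "INSERT INTO NewCustomer VALUES ('N501', 'Jane', 'Doe')",
    "INSERT INTO EmailAddress VALUES ('N501', 'jane.doe@example.com', 'Home')",
    # Work done by the second child package: the VIN is missing, so this fails.
    "INSERT INTO Vehicle (CustomerID, Series, Model) "
    "VALUES ('N501', 'X11 Series', 'Saloon')",
]

succeeded = run_as_one_transaction(conn, statements)
customer_rows = conn.execute("SELECT COUNT(*) FROM NewCustomer").fetchone()[0]
print(succeeded, customer_rows)
```

Although the first two inserts succeed on their own, the failure of the third undoes them, so the customer table stays empty, which mirrors what the query in Step 37 shows.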
Review
You’ve seen how you can use a transaction to make various tasks, containers, and
even packages behave as a single atomic unit that commits or rolls back as a
whole. You’ve worked with the Sequence container to combine a set of tasks into
a logical unit and have learned a new trick of copying and pasting tasks
among packages to increase productivity.
While all the preceding is useful when you want to use distributed transactions,
you cannot use distributed transactions in all situations. Sometimes you may need
to use native transaction support, that is, the transactions provided by the RDBMS
you are working with. A simple case is when you create and populate a
temporary table in one task and want to use it later in another task. This kind of
requirement cannot be met using the distributed transaction support. In SSIS, you
specify a connection manager on each task when you configure it. When a task
runs, a connection is opened specifically for that task, and this connection is
closed once the task has performed its operation. Closing the connection defeats
native transactions, which need the same connection to be retained across all the
tasks involved. SSIS provides a Boolean property on the connection manager,
intuitively named RetainSameConnection, that allows you to keep a connection open
across all the involved tasks. To use this property, click the connection manager,
set RetainSameConnection to True, and then use this connection manager in all the
tasks that participate in the native transaction. One of the main benefits of
using a native transaction is that you can build a logic-based commit or rollback
of the transaction, which is not possible with distributed transactions, as they
can commit or roll back only on success or failure of the tasks involved.
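Two points from the paragraph above can be tried in a small sketch: a temporary table is visible only on the connection that created it, and a native transaction can be committed or rolled back by your own logic. The snippet below is an analogy rather than SSIS code. It uses SQLite through Python's sqlite3 module in place of the RDBMS, the retain_same_connection flag merely imitates the effect of the RetainSameConnection property, and the Staging table, the blank-VIN rule, and customer N502 are assumptions introduced for illustration.

```python
import os
import sqlite3
import tempfile

# A throwaway SQLite file stands in for the database server.
db_path = os.path.join(tempfile.mkdtemp(), "native_txn_demo.db")


def staging_visible_to_second_task(retain_same_connection):
    """Task 1 creates a temp table; report whether task 2 can read it."""
    first = sqlite3.connect(db_path)
    first.execute("CREATE TEMP TABLE Staging (CustomerID TEXT)")
    first.execute("INSERT INTO Staging VALUES ('N501')")
    # Either reuse the open connection or open a fresh one for the second task.
    second = first if retain_same_connection else sqlite3.connect(db_path)
    try:
        second.execute("SELECT COUNT(*) FROM Staging").fetchone()
        return True
    except sqlite3.OperationalError:  # "no such table" on a different connection
        return False
    finally:
        if second is not first:
            second.close()
        first.close()


def load_with_rule(rows):
    """Insert rows, then commit only if a business rule holds (no blank VIN)."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute("CREATE TABLE IF NOT EXISTS Vehicle (CustomerID TEXT, VIN TEXT)")
        conn.commit()
        conn.executemany("INSERT INTO Vehicle VALUES (?, ?)", rows)
        blank = conn.execute(
            "SELECT COUNT(*) FROM Vehicle WHERE VIN = ''"
        ).fetchone()[0]
        if blank:
            conn.rollback()  # the decision is driven by data, not by a task failing
            return False
        conn.commit()
        return True
    finally:
        conn.close()


same_connection = staging_visible_to_second_task(retain_same_connection=True)
new_connection = staging_visible_to_second_task(retain_same_connection=False)
good_load = load_with_rule([("N501", "UV123WX456YZ789")])
bad_load = load_with_rule([("N502", "")])  # N502 is a made-up customer
print(same_connection, new_connection, good_load, bad_load)
```

In the second function no statement fails. The rollback happens because the loaded data breaks a rule, which is the kind of logic-based decision the text attributes to native transactions.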
Restarting Packages with Checkpoints
If you’re like most other information analysts and update your data warehouse every
night, this feature will be of much interest to you. After having set up logging for your
packages, every morning you’d be checking the logs for the last night’s update process to
see how the update went. You usually expect that the update process has been successful,
but what if the update process has failed? You will have to rerun your package during the
daytime—and I know you wouldn’t be happy about this, because doing this work during
business hours involves some serious implications. Your users will not get the latest
updates and will experience poor performance of the involved database servers while you
rerun the update process. If you’ve worked with DTS 2000 packages, you know that
DTS 2000 doesn’t support restarting a package from the point of failure. You have to
rerun the package from the start or manually run the tasks individually, which is quite
involved and sometimes impossible to do. This is where Integration Services comes to
the rescue by providing improved functionality for restarting a package.
By using checkpoints with Integration Services packages, you can restart your failed
packages from the point of failure and can save the work that has completed successfully.
Integration Services writes all the information that is required to restart a failed package
in a checkpoint file. This file is created whenever you run a package the first time after
a successful completion, and it is deleted when the package successfully completes.
However, if an Integration Services package fails and is configured to use checkpoints,
the checkpoint file is not deleted; instead, it is updated with information that is required
to rerun the package from that point. When you rerun your package, Integration Services
checks two things before executing the package: whether the package is configured to use
checkpoints and whether the checkpoint file exists, i.e., whether the package failed while
executing last time. If it finds that a package configured to use checkpoints failed
the last time it was run (i.e., the checkpoint file exists), it reads the checkpoint
file associated with the package, gets the required information from the file, and restarts
the package from the point of failure.
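The checkpoint file lifecycle described above (kept and updated on failure, read on the next run, deleted on success) can be mimicked in a few lines. This is a simplified model for illustration only: the real checkpoint file is XML written by Integration Services, whereas this sketch stores a JSON list of completed task names. The run_package function and the first two task names are hypothetical.

```python
import json
import os
import tempfile


def run_package(tasks, checkpoint_path):
    """Run (name, action) pairs in order; return (attempted task names, succeeded)."""
    completed = []
    if os.path.exists(checkpoint_path):
        # A leftover file means the previous run failed: resume after its completed work.
        with open(checkpoint_path) as handle:
            completed = json.load(handle)["completed"]
    attempted = []
    for name, action in tasks:
        if name in completed:
            continue  # finished in an earlier run, so it is skipped on restart
        attempted.append(name)
        try:
            action()
        except Exception:
            # Record what finished so the next run can skip it.
            with open(checkpoint_path, "w") as handle:
                json.dump({"completed": completed}, handle)
            return attempted, False
        completed.append(name)
    if os.path.exists(checkpoint_path):
        os.remove(checkpoint_path)  # a clean finish removes the checkpoint file
    return attempted, True


state = {"vin_supplied": False}


def load_vehicle():
    if not state["vin_supplied"]:
        raise ValueError("VIN is mandatory")


# The first two task names are placeholders; only Loading Vehicle appears in the text.
tasks = [
    ("Load Customer", lambda: None),
    ("Load Email", lambda: None),
    ("Loading Vehicle", load_vehicle),
]
checkpoint_path = os.path.join(tempfile.mkdtemp(), "checkpoints.chk")

first_run = run_package(tasks, checkpoint_path)
file_after_failure = os.path.exists(checkpoint_path)
state["vin_supplied"] = True  # corresponds to correcting the failing task
second_run = run_package(tasks, checkpoint_path)
file_after_success = os.path.exists(checkpoint_path)
print(first_run, file_after_failure, second_run, file_after_success)
```

The first run attempts all three tasks and leaves the file behind. The second run attempts only the task that failed and then removes the file.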
The checkpoint file contains all the necessary information for a package to restart at
the point of failure such as the execution results of all the completed units of work, the
current values of variables involved, and package configuration information.
You decide the key positions in your package that would be good candidates for the
point of restart and can be written as checkpoints in the file. For example, you would
definitely designate a checkpoint immediately after the task that loads a large data set
or downloads multiple large files from an FTP site. In case of failure of the package
after successfully downloading files or completing loading the data set, the package will
be restarted after these tasks, as the checkpoint defines the starting place. As mentioned
earlier, the checkpoint file also contains the package configuration information,
i.e., the information about the configurations under which the package was running.
This avoids reloading the package configurations, as they are read from the checkpoint
file, and hence maintains the original configurations under which the package was
running at the time of failure.
To enable your package to record checkpoint information, you set the following
properties at the package level:
• CheckpointUsage  You can access this property in the Checkpoints section of
the package Properties window. This property can have one of three values: Never,
Always, or IfExists. The default value is Never, which means the checkpoints are
not enabled and no checkpoint file will be created; hence, the package will always
start processing from the beginning whenever it is executed. The second value is
Always, which, if selected, will make the package always use a checkpoint file.
If the package has failed in the previous execution and you’ve somehow deleted or
lost the checkpoint file, the package will fail to execute. The third possible value is
IfExists, which, when selected, makes the package use a checkpoint file if it exists
and start the package from the point of failure in the previous execution. You
can reuse a checkpoint file over and over for the same package. However, if the
checkpoint file doesn’t exist, the package will always start from the beginning. The
checkpoint file is specific to a package. Before executing a package, SSIS checks if
the PackageID in the checkpoint file is the same as that of the package. If there is
a mismatch, SSIS won’t execute the package.
• SaveCheckpoints  After enabling your package to use checkpoints, you can set
this property to True to indicate that checkpoints should be saved.
• CheckpointFileName  Using this property, you can specify the path and the file
into which you would like to save checkpoints.
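The three CheckpointUsage values, combined with whether a checkpoint file exists and whether its PackageID matches, lead to a small decision table. The function below is one reading of the rules in the list above, written as a sketch. The plan_start name, its return strings, and the sample package IDs are assumptions for illustration, and the model treats Always with a missing file as a failure regardless of how the previous run ended.

```python
from enum import Enum


class CheckpointUsage(Enum):
    NEVER = "Never"
    IF_EXISTS = "IfExists"
    ALWAYS = "Always"


def plan_start(usage, file_exists, file_package_id=None, package_id=None):
    """Return 'beginning', 'checkpoint', or 'fail' for the next execution."""
    if usage is CheckpointUsage.NEVER:
        return "beginning"  # checkpoints are disabled
    if not file_exists:
        # Always insists on a checkpoint file; IfExists falls back to a full run.
        return "fail" if usage is CheckpointUsage.ALWAYS else "beginning"
    if file_package_id != package_id:
        return "fail"  # the file belongs to a different package
    return "checkpoint"


# Print the decision table for a matching and a mismatching checkpoint file.
for usage in CheckpointUsage:
    for file_exists in (False, True):
        for file_id in ("pkg-A", "pkg-B"):  # hypothetical package IDs
            outcome = plan_start(usage, file_exists, file_id, "pkg-A")
            print(usage.value, file_exists, file_id, "->", outcome)
```

Laid out this way, IfExists is the only setting that both resumes a failed run and still starts cleanly when no checkpoint file is present, which is the setting the exercise later in this section uses.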
Along with these properties, you also need to set the FailPackageOnFailure property,
available in the Execution section of the Properties window on the package and the
containers, to True to specify that the package will fail when a failure occurs. Setting
this property on a task or container marks it as a point of restart. If you do not set
this property on any task or container in the package, the checkpoint file will not
include any information for the containers on failure, and the package will restart
from the beginning. Note the following points concerning the smallest unit that can
be restarted:
• The smallest unit that can be restarted is a task.
• The Data Flow task, which is a special task in Integration Services enclosing
the data flow engine, can consist of several data flow transformations. This task
is considered similar to any other Control Flow task as far as checkpoints are
concerned and cannot be restarted from the point halfway through where it failed.
If you have massive pipeline operations in your package and you’re concerned about
rerunning packages, it is better to divide the data transformation work among
multiple Data Flow tasks.
• The Foreach Loop Container is also considered an atomic unit of work that will
either commit or restart completely to iterate over all the values provided by the
enumerator used.
• When used with a For Loop Container, the checkpoint file will save the last value
of the variable, and hence the loop will restart from the same point where it left off.
The use of an atomic unit of work actually calls for a discussion on transactions and
checkpoints, as transactions convert the tasks and the packages involved into an atomic
unit of work. Let’s understand the checkpoints and their operation within the scope of
a transaction in the following Hands-On exercise.
Hands-On: Restarting a Failed Package Using Checkpoints
In this exercise, you will simulate a package failure and configure your package with
checkpoints to restart it from the point of failure.
Method
You will use the package you developed earlier in the last exercise and apply checkpoint
configurations to it. In the second step, you will use transactions over the package to see
its behavior.
Exercise (Apply Checkpoint Configurations to Your Package)
In the first part of this Hands-On, you configure the Integration Services package to use
checkpoints and execute the package to see its execution behavior.
1. Open BIDS and create a new Integration Services Project with the following
details:
Name: Restarting failed package
Location: C:\SSIS\Projects
2. When a blank project is created, delete the Package.dtsx package under SSIS
Packages node in the Solution Explorer window. Then, right-click the SSIS
Packages node and choose Add Existing Package from the context menu.
3. In the Add Copy Of Existing Package dialog box, select File System as the
Package Location. In the Package Path field, type C:\SSIS\Projects\Maintaining
data Integrity with Transactions\Package.dtsx and click OK to add this package.
Once the package has been added, open it in the Designer.
4. Drop an Execute SQL task from the Toolbox onto the Designer surface outside
the Sequence container and rename this task Loading Vehicle. Double-click the
task icon to open the editor. In the General page’s Connection field, choose the
localhost.Campaign Connection Manager and type the following SQL statement in
the SQLStatement field:
INSERT INTO Vehicle (CustomerID, Series, Model) VALUES
('N501', 'X11 Series', 'Saloon')
This SQL statement omits the mandatory VIN field, so the Loading Vehicle task
will fail. Join the Sequence Container to the Loading Vehicle task using an
on-success precedence constraint.
5. Click anywhere on the blank surface of the Designer and press F4 to open the
Properties of the package. First, make sure that the package is not configured to
use transactions. Scroll down and locate the TransactionOption property, and
change its value to Supported.
6. Scroll up in the Properties window and locate the Checkpoints section. Specify
the following settings in this section:
SaveCheckpoints: True
CheckpointUsage: IfExists
CheckpointFileName: C:\SSIS\Projects\Restarting failed package\checkpoints.chk
7. Because we want to include the restart information of the Loading Vehicle task in
the checkpoints file, click the Loading Vehicle task on the Designer surface. You
will see that the context of Properties window changes to show the properties
of the Loading Vehicle task. Locate the FailPackageOnFailure property in the
Execution section and change its value to True.
8. Press 5 to execute the package. You already know the result of the execution.
The Sequence Container and the two Execute SQL tasks in it successfully
execute and turn green, but the Loading Vehicle task fails and shows up in red.

Press shift-
5 to switch back to designer mode.
9. Let’s see what has happened in the background while the package was executing.
Open SQL Server Management Studio and run the following query to see the
records imported into the database:
SELECT n.[CustomerID], [FirstName], [SurName], [Email],
       [Type], [VIN], [Series], [Model]
FROM [Campaign].[dbo].[NewCustomer] n
LEFT OUTER JOIN [Campaign].[dbo].[EmailAddress] e
    ON n.CustomerID = e.CustomerID
LEFT OUTER JOIN [Campaign].[dbo].[Vehicle] v
    ON n.CustomerID = v.CustomerID
You will see that the customer information and its e-mail information have been
loaded while the vehicle information fields have null values.
Using Windows Explorer, navigate to the C:\SSIS\Projects\Restarting failed
package folder and note that the checkpoints.chk file has been created. Open this
XML-formatted file and note that it contains information about the failure of the
package and the container involved in the failure.
10. Change the SQL statement of the Loading Vehicle task to include the VIN
information with the following query:
INSERT INTO Vehicle (CustomerID, VIN, Series, Model) VALUES
('N501', 'UV123WX456YZ789', 'X11 Series', 'Saloon')
11. Execute the package again. This time you will see that only the Loading Vehicle
task is executed and the earlier two tasks and the Sequence container do not
run at all (see Figure 8-8). This is because the package reads the checkpoint file
before executing and finds the information about where to start executing. Press
Shift-F5 to switch back to design mode.
Figure 8-8 Restarting package with checkpoints

12. Explore to the C:\SSIS\Projects\Restarting failed package folder and note that
the checkpoints.chk file does not exist.
13. Switch to SQL Server Management Studio and run the script specified in Step 9
to see the result set. You will see one record containing customer, e-mail, and
vehicle information. Run the following queries to clear the tables:
DELETE [Campaign].[dbo].[NewCustomer]
DELETE [Campaign].[dbo].[EmailAddress]
DELETE [Campaign].[dbo].[Vehicle]
Exercise (Effect of Transaction on Checkpoints)
To set transactions on this package we need to set the TransactionOption value to
Required. So, let’s do it.
14. Click anywhere on the blank surface of the Control Flow panel and press F4 to
open the Properties window. Scroll down and locate the TransactionOption
property in the Transactions section. Try to set it to Required so that the package
starts a transaction. SSIS doesn’t allow you to do this and throws an error, as shown
in Figure 8-9.
Figure 8-9 Error thrown while trying to use transactions alongside checkpoints
This behavior differs from Integration Services 2005, in which you could use
transactions and checkpoints in the same package and Integration Services left proper
usage and management of both of them to you. In that case, the transaction rolls back
the information in the checkpoint file and causes the package to execute all over again.
This applies to containers in simple packages as well. But there is a potential
for error or misbehavior when you are using Integration Services 2005 with checkpoints
and transactions in a complex package; that is, if your package consists of a complex
container hierarchy and a subcontainer commits before the parent container fails, the
subcontainers do not get rolled back and also do not get recorded in the checkpoint
file. This causes those subcontainers to be executed again when the parent container
is restarted. Similarly, the Foreach Loop container does not record any information in
the checkpoint file about the iterations it may have already done before failing and gets
executed all over again when restarted. So, when you’re planning to use checkpoints
alongside transactions, use caution and test thoroughly. Integration Services 2008
R2, by contrast, stops you from doing that altogether because of the complexity and risk
involved, and you can’t use transactions and checkpoints in your packages at the same time.
Review
You’ve seen in this exercise that the checkpoints can help you restart a package precisely
from the task where the package failed. You also understand that you need to be careful
while using transactions and checkpoints on packages with complex container hierarchies
in Integration Services 2005. On the other hand, Integration Services 2008 R2 doesn’t
allow you to implement checkpoints and transactions at the same time.
Expressions and Variables
You learned about variables and property expressions in Chapter 3 and have used
them in various Hands-On exercises in subsequent chapters. With DTS 2000, use of
variables was considered an advanced feature that allowed you to add some dynamic
behavior to your packages. However, use of variables in Integration Services is made
easier and has been tied into SSIS package design so much that the packages developed
without using variables are reduced to ad hoc data operations, most of which can be
done using the SQL Server Import and Export Wizard. On the other hand, use of
property expressions is a new feature in Integration Services that provides an ability to
set values for component properties dynamically using variables that are updated at run
time by other tasks. Property Expressions allow you to evaluate values generated at run
time by other tasks and use the evaluated values to update properties exposed by the
concerned task at run time. This is quite a powerful feature, as it allows you to read and
evaluate the values that exist only at run time and modify the property or behavior of
other tasks in the package.
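The idea behind property expressions, a property whose value is recomputed from variables just before the task runs, can be shown with a toy model. This is not the SSIS expression language or object model. SendMailTaskModel, its to_line and subject attributes, the variable names User::FirstName and User::Email, and the sample recipients are all hypothetical, and Python lambdas stand in for the expressions.

```python
from dataclasses import dataclass, field


@dataclass
class SendMailTaskModel:
    """Hypothetical stand-in for a task with expression-driven properties."""
    to_line: str = ""
    subject: str = ""
    # Maps a property name to a function of the current variable values.
    expressions: dict = field(default_factory=dict)

    def apply_expressions(self, variables):
        # Evaluate each expression against the variables and overwrite the property.
        for prop, expression in self.expressions.items():
            setattr(self, prop, expression(variables))


task = SendMailTaskModel(expressions={
    "to_line": lambda v: v["User::Email"],
    "subject": lambda v: "A special offer for " + v["User::FirstName"],
})

# Variables as another task might set them at run time (made-up sample values).
recipients = [
    {"User::FirstName": "Jane", "User::Email": "jane@example.com"},
    {"User::FirstName": "Raj", "User::Email": "raj@example.com"},
]

sent = []
for variables in recipients:
    task.apply_expressions(variables)  # properties change on every iteration
    sent.append((task.to_line, task.subject))
print(sent)
```

The same task definition produces a different recipient and subject on each pass because its properties are derived from variables rather than fixed at design time.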
Though you’ve used variables and expressions in the Hands-On exercises earlier, here
you will do another exercise that uses variables, and particularly property expressions,
extensively to update properties of the Send Mail task to generate personalized mails.
