Microsoft SQL Server 2012
Integration Services:
An Expert Cookbook
Over 80 expert recipes to design, create, and deploy
SSIS packages
Reza Rad
Pedro Perfeito
BIRMINGHAM - MUMBAI
Microsoft SQL Server 2012 Integration
Services: An Expert Cookbook
Copyright © 2012 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or
transmitted in any form or by any means, without the prior written permission of the publisher,
except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the
information presented. However, the information contained in this book is sold without
warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers
and distributors will be held liable for any damages caused or alleged to be caused directly or
indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies
and products mentioned in this book by the appropriate use of capitals. However, Packt
Publishing cannot guarantee the accuracy of this information.
First published: May 2012
Production Reference: 1140512
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-84968-524-5
www.packtpub.com


Cover Image by Artie Ng
Credits
Authors
Reza Rad
Pedro Perfeito
Reviewers
Phil Brammer
Brenner Grudka Lira
April L. Rains
Rafael Salas
Milla Smirnova
Acquisition Editor
Rukshana Khambatta
Lead Technical Editors
Kedar Bhat
Meeta Rajani
Technical Editors
Joyslita D'Souza
Manasi Poonthottam
Aaron Rosario
Project Coordinator
Leena Purkait
Proofreaders
Mario Cecere
Chris Smith
Indexer
Monica Ajmera Mehta
Graphics
Valentina D'silva
Manu Joseph

Production Coordinator
Aparna Bhagat
Cover Work
Aparna Bhagat

Foreword
Data Transformation Services (DTS) was Microsoft's first entrance into the world of advanced
data transformation and task-oriented tools, allowing users to rapidly move data from one
point to another, or to perform common tasks such as FTPing files from one server to another.
Introduced in SQL Server 7.0 and widely adopted with SQL Server 2000, this tool was the foundation for many developers' toolkits. The UI
was easy to use and understand, precedence constraints could be applied between tasks
ensuring business rules were maintained, and custom code could be added to perform
advanced tasks not found in the boxed feature set. DTS is still the bar against which many
measure SQL Server Integration Services.
SQL Server Integration Services (SSIS), introduced in SQL Server 2005 and largely unchanged
through SQL Server 2008 R2, was a rewrite of both the toolset and the paradigm that
developers were used to with the relatively easy-to-use DTS. SSIS has its
strengths in separating the work surface of a DTS package into distinct parts, the Control Flow
and the Data Flow. The Control Flow is designed to direct the "flow" of the package, ensure
dependencies are met before executing a downstream task, perform looping operations over
a varied list of sources, execute SQL statements, and so on. A Data Flow Task is designed
to move data from a source to a destination, transforming data along the way. The separation
allows for greater flexibility in developing a package by limiting the scope of what a developer
can edit at once, and by allowing specific tasks to be copied and subsequently reused.
SSIS is not without its list of negatives, however. Through SQL Server 2008 R2, an SSIS
package was a single entity, which could be executed in any number of places: from within
Business Intelligence Development Studio, from the filesystem, or on a SQL Server instance. In
a shop that had a large number of packages deployed, it was extremely difficult to manage
all of the packages and track all of the activities that the packages were doing. This meant
that developers were forced to write their own logging solutions to capture data such as row
counts, start and end times, audit information, and any other pertinent information necessary
to support the package. SSIS also has a steep learning curve, which many developers find
very hard to overcome.
SQL Server 2012 introduces some very welcome additions to the existing SSIS product.
The most welcome addition, and the one I am most excited about, is the inclusion of a
true server-side component to SSIS. Choosing to deploy packages to the server will allow
developers and administrators to finally get ease of deployment, and capture the most often
requested information about the execution of packages. This server component, called the
SSIS Catalog, and its new project deployment model allow administrators to override logging
levels, set input parameters, and view built-in reports in an easy-to-use presentation format.
In the new project deployment model, the project build process creates a .ispac file, which
can be shared with any person doing the physical deployment of the project. The file includes
all of the packages in the project, any shared project-level connections, and other metadata
pertaining to the project. Double-clicking on the file will start the deployment wizard.
Very easy.
Some other changes found in SQL Server 2012 SSIS are a revamped design surface helping
to meet accessibility requirements, full undo/redo capability, a removed limit of 4,000
characters on expressions, ability to change variable scopes, and so on.
This book will walk you through, step-by-step, each major feature of SSIS in SQL Server 2012,
and how to use them. Pedro and Reza have given contextual examples where possible, and
you will be able to download and implement them yourself to help you follow along with each
recipe. Whether you are an experienced SSIS developer or new to the product, this book
will be an often-referenced resource on your bookshelf. Pedro and Reza have put together
a great reference book that I know you'll enjoy.
Phil Brammer
Microsoft MVP – SQL Server
About the Authors
Reza Rad is an author, trainer, speaker, and consultant. He has a BSc in Computer
Engineering and more than 10 years' experience in programming and development,
mostly on Microsoft technologies. He received the Microsoft Most Valuable Professional
(MVP) award in SQL Server in 2011 and 2012 for his dedication to Microsoft BI,
especially SSIS. He has been working on the Microsoft BI suite for more than six years. He is
an SSIS/MSBI/.NET trainer and also a software and BI consultant for several companies and
institutes. His articles on different aspects of technology, especially SSIS, can be found on
his blog.
He was a co-author of SQL Server MVP Deep Dives Volume 2. He is an active
member of online technical forums such as MSDN and Experts-Exchange. He is a
Microsoft Certified Professional (MCP), Microsoft Certified Technology Specialist (MCTS), and
Microsoft Certified IT Professional (MCITP) in Business Intelligence (BI).

I would like to thank my wife, who has been a wonderful supporter in writing
this book; she encouraged me a lot to complete it and was a light
during my difficult moments.

I would also like to thank my parents and sister, who were my teachers for
many years of my life.

I would like to thank Pedro, my good friend, who helped a lot in writing this
book. He did a good job completing it despite a busy schedule with a
full-time job and teaching.
Pedro Perfeito was born in 1977 in Portugal and currently works as a BI Senior Consultant
and Developer at Novabase. He is also an invited teacher in master and short-master BI
degrees at IUL-ISCTE (Lisbon) and at Universidade Portucalense (UPT-Porto) respectively. He
received the Microsoft Most Valuable Professional (MVP) award in 2010, 2011, and
2012 for his dedication and contribution in helping with theoretical and practical issues in the
various BI communities. He is also a co-author of SQL Server MVP Deep Dives Volume 2.
He has several Microsoft certifications including MCP, MCSD, MCTS-Web, MCTS-BI, and
MCITP-BI. He also has worldwide certifications in the area of BI provided by TDWI/CBIP (The
Data Warehousing Institute). He is currently preparing for his PhD
degree on BI. For further details you can visit his personal blog at rocgd.blogspot.com.
I would like to express my gratitude to all teams at Packt who trusted me—a
Portuguese author—and helped me complete this book. I would like to thank
my friend and co-author of this book Reza Rad because without him this
book would not have been possible.

I would furthermore like to thank Barbara Chambel for all the support she gave
me since the first moment at Novabase, Luis Ferreira (Project Manager
at Banco de Portugal) and Simão Fernandes (ex-student and colleague
at Novabase) for all their hints and complaints about the previous SSIS version
(you both know which ones I mean!), and all my Master BI students from
Universidade Portucalense (Oporto) and from ISCTE-IUL (Lisbon) who have
directly and indirectly motivated me in this challenge.

I am deeply indebted to Dr. Maria José Trigueiros for all her encouragement
to enter this amazing world of Business Intelligence and make my dream
come true. She is no longer physically with us, but she will be remembered forever.

I would especially like to thank my family and my
girlfriend Joana, whose patient love helped me to complete this work!

Thanks to all who I haven't mentioned here and who believed in me, even
more than myself.
About the Reviewers
Phil Brammer, a fifth-year Microsoft MVP in SQL Server, has over 12 years' data
warehousing experience in various technologies from reporting through ETL to database
administration. He has worked with SSIS since 2007 and he continues to play an active role
in the SSIS community via online resources as well as his technical blog site, SSISTalk.com.
He has contributed to SQL Saturdays, SQL PASS Summits, and the first volume of the SQL
Server MVP Deep Dives book.
Most recently he has taken on the role of a full-time operational DBA managing over 120
database instances in the health-care insurance industry. He is an avid golfer and loves
spending time with his wife and two children.
Brenner Grudka Lira joined NeuroTech as a Data Analyst in 2012. He has a Bachelor's
degree in Computer Science from the Catholic University of Pernambuco in Recife, Brazil.
He also has experience in building and modeling Data Warehouses and has knowledge of
Oracle Warehouse Builder, SQL Server Integration Services, SAP Business Objects, and Oracle
Business Intelligence Standard Edition One. Today, he is dedicated to the study of Business
Intelligence with focus on the ETL process and Risk Management in Financial Operations.
April L. Rains has 13 years of experience building Business Intelligence, Web, and Windows
applications using Microsoft tools and platforms. Working in the transportation and logistics
industry for many years provided numerous opportunities for ETL, EAI, and trading partner EDI
using both SSIS and BizTalk. She has a wide range of hands-on experience in multiple roles
across the application lifecycle. You can contact
her through her website at www.aprilrains.com.
I would like to thank my son Kieran who provides amazing and never-ending
inspiration to me.
Rafael Salas is a Data Warehousing and Business Intelligence professional with more than
a decade of experience in many industries and Fortune 500 companies. He provides technical
leadership and helps organizations to improve performance through Business Intelligence
strategies and solutions. His credentials include a Bachelor's degree in Computer Sciences,
a Master's degree in Business and Technology, and a number of industry certifications. He
has been recognized as Microsoft Most Valuable Professional (MVP) since 2007 and is a
published author, blogger, and frequent speaker at conferences and technology community
events. His specialties include architecture, Data Warehouse appliances, data integration,
data quality, OLAP databases, and Dimensional Modeling. You can find more about him on his
blog at www.rafael-salas.com.

Milla Smirnova is a Data Architect, DBA, and BI specialist. She possesses over 10 years
of experience in Information Technology; most of those years of experience are in SQL Server
Administration and Development. As her involvement with Business Intelligence technologies
increased drastically within the last few years so has her passion for ETL design, development,
and optimization utilizing SSIS.
I would like to thank my wonderful husband Larry for all his help and
support. I would like to thank Maria and Nikolay as well.

I would also like to thank everyone at Packt Publishing for their
encouragement and guidance.
www.PacktPub.com
Support files, eBooks, discount offers and more
You might want to visit www.PacktPub.com for support files and downloads related to your book.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub
files available? You can upgrade to the eBook version at www.PacktPub.com and as a print
book customer, you are entitled to a discount on the eBook copy. Get in touch with us
for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a
range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library.
Here, you can access, read and search across Packt's entire library of books.
Why Subscribe?
• Fully searchable across every book published by Packt
• Copy and paste, print and bookmark content
• On demand and accessible via web browser
Free Access for Packt account holders
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib
today and view nine entirely free books. Simply use your login credentials for immediate access.
Instant Updates on New Packt Books

Get notified! Find out when new books are published by following @PacktEnterprise on Twitter,
or the Packt Enterprise Facebook page.

Table of Contents
Preface
Chapter 1: Getting Started with SQL Server Integration Services
Introduction
Import and Export Wizard: First experience with SSIS
Getting started with SSDT
Creating the first SSIS Package
Getting familiar with Data Flow Task
SSIS 2012 versus previous versions in Developer Experience
Chapter 2: Control Flow Tasks
Introduction
Executing T-SQL commands: Execute SQL Task
Handling file and folder operations: File System Task
Sending and receiving files through FTP: FTP Task
Executing other packages: Execute Package Task
Running external applications: Execute Process Task
Reading data from web methods: Web Service Task
Transforming, validating, and querying XML: XML Task
Profiling table statistics: Data Profiling Task
Batch insertion of data into a database: Bulk Insert Task
Querying system information: WMI Data Reader Task
Querying system events: WMI Event Watcher Task
Transferring SQL server objects: DBMS Tasks
Chapter 3: Data Flow Task Part 1—Extract and Load
Introduction
Working with database connections in Data Flow
Working with flat files in Data Flow
Passing data between packages—Raw Source and Destination
Importing XML data with XML Source
Loading data into memory—Recordset Destination
Extracting and loading Excel data
Change Data Capture
Chapter 4: Data Flow Task Part 2—Transformations
Introduction
Derived Column: adding calculated columns
Audit Transformation: logging in Data Flow
Aggregate Transform: aggregating the data stream
Conditional Split: dividing the data stream based on conditions
Lookup Transform: performing the Upsert scenario
OLE DB Command: executing SQL statements on each row in the data stream
Merge and Union All transformations: combining input data rows
Merge Join Transform: performing different types of joins in data flow
Multicast: creating copies of the data stream
Working with BLOB fields: Export Column and Import Column transformations
Slowly Changing Dimensions (SCDs) in SSIS
Chapter 5: Data Flow Task Part 3—Advanced Transformation
Introduction
Pivot and Unpivot Transformations
Text Analysis with Term Lookup and Term Extraction transformations
DQS Cleansing Transformation—Cleansing Data
Fuzzy Transformations—how SSIS understands fuzzy similarities
Chapter 6: Variables, Expressions, and Dynamism in SSIS
Introduction
Variables and data types
Using expressions in Control Flow
Using expressions in Data Flow
The Expression Task
Dynamic connection managers
Dynamic data transfer with different data structures
Chapter 7: Containers and Precedence Constraints
Introduction
Sequence Container: putting all tasks in an executable object
For Loop Container: looping through static enumerator till a condition is met
Foreach Loop Container: looping through result set of a database query
Foreach Loop Container: looping through files using File Enumerator
Foreach Loop Container: looping through data table
Precedence Constraints: how to control the flow of task execution
Chapter 8: Scripting
Introduction
The Script Task: Scripting through Control Flow
The Script Component as a Transformation
The Script Component as a Source
The Script Component as a Destination
The Asynchronous Script Component
Chapter 9: Deployment
Introduction
Project Deployment Model: Project Deployment from SSDT
Using Integration Services Deployment Wizard and command-line utility for deployment
The Package Deployment Model, Using SSDT to deploy package
Creating and running Deployment Utility
DTUTIL—the command-line utility for deployment
Protection level: Securing sensitive data
Chapter 10: Debugging, Troubleshooting, and Migrating Packages to 2012
Introduction
Troubleshooting with Progress and Execution Results tab
Breakpoints, Debugging the Control Flow
Handling errors in Data Flow
Migrating packages to 2012
Data Tap
Chapter 11: Event Handling and Logging
Introduction
Logging over Legacy Deployment Model
Logging over Project Deployment Model
Using event handlers and system variables for custom logging
Chapter 12: Execution
Introduction
Execution from SSMS
Execution from a command-line utility
Execution from a scheduled SQL Server Agent job
Chapter 13: Restartability and Robustness
Introduction
Parameters: Passing values to packages from outside
Package configuration: Legacy method to inter-relation
Transactions: Doing multiple operations atomic
Checkpoints: The power of restartability
SSIS reports and catalog views
Chapter 14: Programming SSIS
Introduction
Creating and configuring Control Flow Tasks programmatically
Working with Data Flow components programmatically
Executing and managing packages programmatically
Creating and using Custom Tasks
Chapter 15: Performance Boost in SSIS
Introduction
Control Flow Task and variables considerations for boosting performance
Data Flow best practices in Extract and Load
Data Flow best practices in Transformations
Working with buffer size
Working with performance counters
Index
Preface
Microsoft SQL Server 2012 Integration Services: An Expert Cookbook is a complete guide for
everyone, from a novice to a professional in Integration Services 2012. SQL Server Integration
Services is an ETL (Extract, Transform, and Load) tool. There is a need for a
data transfer system in all operational systems these days, and SSIS is one of the best data
transfer tools. In this book, all aspects of SSIS 2012 are discussed with lots of real-world
scenarios to help readers understand the usage of SSIS in every environment.
What this book covers
Chapter 1, Getting Started with SQL Server Integration Services, provides an overview of the
ETL concepts and ETL terminologies, why ETL is needed in the technology world, and what
problems ETL will solve. Then an overview of SSIS as an ETL tool is provided to help readers
to get an overall view of the other parts of the book.
Chapter 2, Control Flow Tasks, explores all Control Flow Tasks with real-world samples of each
Task. The reader will learn what each Task is for, how it is used in real-world scenarios,
and which tasks are new in SSIS 2012.
Chapter 3, Data Flow Task Part 1—Extract and Load, explains the data sources and data
destinations under the Data Flow Task. The Data Flow Task is the most functional part of SSIS,
and the one to which an SSIS developer probably dedicates the most time.
Chapter 4, Data Flow Task Part 2—Transformations, explores the transformations used
to apply data quality and business rules that are essential to prepare data loaded into
destinations. Data Flow Task provides an easy way to transform source data into the form
needed by its destination in several different ways.
Chapter 5, Data Flow Task Part 3—Advanced Transformation, briefly discusses Advanced
Transformations. In real-world scenarios, different data sources don't provide the
same structure, so there is a need to unify them in a single structure. There are some
transformations in the SSIS Data Flow Task that use complex ways to apply such changes to
the data stream. We call them Advanced Transformations.
Chapter 6, Variables, Expressions, and Dynamism in SSIS, describes how SSIS achieves
dynamism with the aid of expressions, what the limitations of some tasks are in this respect,
and what the alternative solutions are. SSIS as an executable unit needs a structure
for declaring in-memory variables and storing data in memory to pass between Tasks
during execution. Besides variables, there is a built-in expression language in
SSIS components and Tasks for operations such as data conversion, data splitting
based on a condition, or creating text filenames based on the date. In this chapter, readers will
learn how to work with variables and expressions in many scenarios. Dynamism is the most
powerful aspect of an ETL tool in data transfer operations.
Chapter 7, Containers and Precedence Constraints, explains the three types of containers and
the precedence constraints in the SSIS Control Flow, which help developers to control the flow of
task execution. All of these containers and the precedence constraints are covered in this
chapter with real-world samples.
Chapter 8, Scripting, explains a powerful aspect of SSIS: scripting. Developers can use
scripting whenever other tasks or transformations can't help them to fulfill their requirements.
There are two places for scripting in SSIS: the Script Task in Control Flow and the Script Component
in Data Flow. Scripting in both of these components will be covered in this chapter with samples.

Chapter 9, Deployment, describes how to deploy the developed packages and projects to
a production environment, discussing different methods of deployment with the pros and
cons of each way in real-world scenarios.
Chapter 10, Debugging, Troubleshooting, and Migrating Packages to 2012, explains the ability
of SSIS to debug and troubleshoot like all robust systems. Developers need to know how
to face problems in Control Flow or Data Flow, how to handle errors in Data Flow Task, and
troubleshoot them. Debugging and troubleshooting have two sides in SSIS—Control Flow and
Data Flow. This chapter describes both sides with appropriate examples. Also, this chapter
has two recipes on migrating packages from the previous versions to 2012.
Chapter 11, Event Handling and Logging, explores all aspects of event handlers in SSIS,
as well as logging in custom or built-in modes. SSIS provides a set of handlers for events on
executable objects of Control Flow, which helps developers to handle these events and design
appropriate operations on them. These event handlers also help developers to do some
custom logging in their packages. There is a built-in logging feature in SSIS which can be
used in general logging scenarios.
Chapter 12, Execution, covers different methods of package execution, and the properties
and settings that can be configured at the time of execution.
Chapter 13, Restartability and Robustness, covers several related aspects of SSIS. SSIS has the
structure to accept input parameters from other applications. Packages can also
operate in a restartable mode: they can store their state at the time of failure and continue
execution from that state next time. They are also capable of running Tasks in packages
as a transaction.
Chapter 14, Programming SSIS, explains the library classes for creating packages and tasks,
configuring them, deploying a package, and running it. Integration Services
provides a set of .NET library classes and methods for performing all parts of the SSIS lifecycle
from .NET code.
Chapter 15, Performance Boost in SSIS, covers recommendations and best practices for
raising the performance of packages and Data Flow. As with any tool,
there are advanced tips for improving performance, and they are described in this chapter.
What you need for this book
You need to have Microsoft SQL Server 2012 Business Intelligence Edition for running all
recipes of this book.
Visual Studio 2010 is also needed for Chapter 14, Programming SSIS, which is about creating
SQL Server Integration Services packages programmatically; so if you want to read and
practice all the recipes in this book it is necessary to have Microsoft Visual Studio 2010.
Who this book is for
If you are a SQL database administrator or developer looking to explore all the aspects of SSIS
and need to use SSIS in the data transfer parts of systems, then this is the best guide for you.
Basic understanding of working with SQL Server Integration Services is required.
Conventions
In this book, you will find a number of styles of text that distinguish between different kinds of
information. Here are some examples of these styles, and an explanation of their meaning.
Code words in text are shown as follows: "This Data Flow reads some customer data (first
name and last name) from an Excel file, applies some common transformations and inserts
the data into an SQL table named SalesLT.Customer."
A block of code is set as follows:
<title>The First Book</title>
<title>Becoming Somebody</title>
<title>The Poet's First Poem</title>
When we wish to draw your attention to a particular part of a code block, the relevant lines or
items are set in bold:
<xsd:element name="genre" type="xsd:string"/>
<xsd:element name="price"
  type="xsd:float" minOccurs="0" maxOccurs="unbounded" />
<xsd:element name="pub_date"
  type="xsd:date" minOccurs="0" maxOccurs="unbounded" />
<xsd:element name="review" type="xsd:string"/>
Any command-line input or output is written as follows:
x "C:\SSIS\Ch02_ControlFlowTasks\R03_FTP Task\LocalFolder\files.7z"
New terms and important words are shown in bold. Words that you see on the screen,
in menus or dialog boxes for example, appear in the text like this: "If any error occurs
while executing the process, the error can be stored into a variable with the
StandardErrorVariable option".
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Reader feedback
Feedback from our readers is always welcome. Let us know what you think about this
book—what you liked or may have disliked. Reader feedback is important for us to develop
titles that you really get the most out of.
To send us general feedback, simply send an e-mail and
mention the book title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or
contributing to a book, see our author guide on www.packtpub.com/authors.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to help you
to get the most from your purchase.
Downloading the example code
You can download the example code files for all Packt books you have purchased from your
account. If you purchased this book elsewhere, you can
register to have the files e-mailed directly
to you.
Errata
Although we have taken every care to ensure the accuracy of our content, mistakes
do happen. If you find a mistake in one of our books (maybe a mistake in the text or the
code), we would be grateful if you would report this to us. By doing so, you can save other
readers from frustration and help us improve subsequent versions of this book. If you find
any errata, please report them by
selecting your book, clicking on the errata submission form link, and entering the details
of your errata. Once your errata are verified, your submission will be accepted and the
errata will be uploaded on our website, or added to any list of existing errata, under the
Errata section of that title. Any existing errata can be viewed by selecting your title.
Piracy
Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt,
we take the protection of our copyright and licenses very seriously. If you come across any
illegal copies of our works, in any form, on the Internet, please provide us with the location
address or website name immediately so that we can pursue a remedy.
Please contact us with a link to the suspected
pirated material.
We appreciate your help in protecting our authors, and our ability to bring you valuable content.
Questions
You can contact us if you are having a problem with any
aspect of the book, and we will do our best to address it.

1
Getting Started with SQL Server Integration Services
by Reza Rad and Pedro Perfeito
In this chapter, we will cover the following topics:
• Import and Export Wizard: First experience with SQL Server Integration Services (SSIS)
• Getting started with SSDT
• Creating the first SSIS package
• Getting familiar with Data Flow Task
• SSIS 2012 versus previous versions in Developer Experience
Introduction
As technology evolves, it is always necessary to integrate data between different systems.
The integration component is increasingly gaining importance, especially the component
responsible for data quality as well as the cleaning rules applied between source and
destination databases. Different vendors have their own integration tools and components,
and Microsoft with its SSIS tool is recognized as one of the leaders in this field.
SSIS can be used to perform a broad range of data integration tasks, and the most
common scenarios are applied to Data Warehousing. The term most closely associated with Data
Warehousing is Extract, Transform, and Load (ETL): the process responsible for extracting
data from several sources, cleansing and customizing it, and loading it into a central repository
(for example, a Data Warehouse, Data Mart, Hub, and so on). SSIS is also used in other
scenarios, for example data migration and data consolidation. Data Migration is the one-time
movement of data between databases and computer systems, and is needed when changes
occur or when we upgrade our systems. Data Consolidation combines and integrates data
from disparate systems and assumes high importance in a business environment with
increasing acquisitions and mergers. The following diagram adapted from TDWI
(www.tdwi.org) helps clarify the different scenarios where SSIS could be used:
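[Diagram: three scenarios where SSIS can be used. Data Warehousing: source data is loaded into a data warehouse. Data Migration: data moves from legacy systems to ERP/integrated systems. Data Consolidation: company #1 data and company #2 data are merged into consolidated data.]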
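As a rough illustration of the data consolidation scenario described above, the following Python sketch merges customer records from two hypothetical company systems into one consolidated list. The record layout, the field names, and the choice of e-mail address as the matching key are assumptions made for this example only; in SSIS the same idea would be built from Data Flow components rather than hand-written code.

```python
# Hypothetical customer extracts from two merged companies.
company1 = [
    {"email": "ann@example.com", "name": "Ann Lee", "phone": None},
    {"email": "bob@example.com", "name": "Bob Roy", "phone": "555-0100"},
]
company2 = [
    {"email": "ANN@example.com ", "name": "Ann Lee", "phone": "555-0199"},
    {"email": "cy@example.com", "name": "Cy Poe", "phone": None},
]


def consolidate(*sources):
    """Merge records from several sources, matching on a normalized e-mail."""
    merged = {}
    for source in sources:
        for record in source:
            # Normalize the key so the same customer matches across systems.
            key = record["email"].strip().lower()
            target = merged.setdefault(key, {"email": key})
            for field, value in record.items():
                # Keep the first non-empty value seen for each field.
                if field != "email" and value and not target.get(field):
                    target[field] = value
    return list(merged.values())


consolidated = consolidate(company1, company2)
for row in consolidated:
    print(row)
```

The "first non-empty value wins" rule is only one possible survivorship policy; a real consolidation project would decide per field which source system is authoritative.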
New business challenges are driving organizations to adopt data integration projects. Some
of these challenges are:
• Increasing demand for real-time information reporting and analysis
• Large volumes of data spread across the entire organization
• The need to comply with regulations, which often require continuous tracking of all
changes to data and not just the net result of those changes
Although SSIS is an amazing tool for data integration, the same work can be done manually
in almost all cases. As you can imagine, performing data integration tasks manually could
be hard to maintain in terms of code, hard to scale properly, and would require more time
to implement. From our perspective, since we have SSIS, there is no real reason to do it
manually. The cost of ownership is not a problem either, because SSIS is included with SQL
Server licenses that most organizations have already acquired.
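To make the comparison concrete, here is a minimal sketch of what a hand-coded extract, transform, and load step might look like in Python, using only the standard library. The CSV content, the `customer` table, and the cleansing rules are hypothetical and chosen for illustration; this is not how SSIS works internally, only an example of the kind of code that would have to be written and maintained without it.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would be a file or a query.
SOURCE_CSV = """first_name,last_name
 alice ,SMITH
Bob,jones
,Nameless
"""


def extract(text):
    """Extract: read the source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim and title-case names, reject rows with no first name."""
    clean = []
    for row in rows:
        first = row["first_name"].strip().title()
        last = row["last_name"].strip().title()
        if first:
            clean.append((first, last))
    return clean


def load(conn, rows):
    """Load: insert the cleansed rows into the destination table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customer (first_name TEXT, last_name TEXT)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for the central repository
load(conn, transform(extract(SOURCE_CSV)))
loaded = conn.execute(
    "SELECT first_name, last_name FROM customer ORDER BY first_name"
).fetchall()
print(loaded)
```

Even this small script has to handle connections, schema creation, and data rules by hand, and it has no logging, error routing, or restart logic, which is the maintenance and scaling burden the paragraph above refers to.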
