
SSIS Materials


MS-BI MATERIAL WITH SCENARIOS

MSBI (Microsoft Business Intelligence)

What is Business Intelligence?


BI is a process (a set of activities) for running a business in a smarter way: collecting data from various operational sources into a staging area, then combining and storing it in a database or data warehouse.
This consolidated data is then reported, analyzed and distributed to the right people in the required format, such as PDF or Excel.
Staging Area or Transformation Area:
A temporary storage location (for example, an in-memory buffer in SQL Server) where the transformation activities take place to validate the source (origination) business data.
The staging area is a layer between the source system and the target/destination system.
Transformation Activities:
Data Merging: Merge/integrate the business data coming from various operational data sources into a single database/data warehouse or file system.

Practically, the data-merging operation can be achieved using the following predefined transformations in SSIS (an ETL or data-integration tool):
Merge Transformation
Merge Join Transformation
Union All Transformation
Note: Let's discuss these transformations in further sessions.
Data Cleansing: Cleaning dirty data. The data-cleansing process helps ensure data integrity and quality by profiling, matching, cleansing and correcting invalid and inaccurate data.

Practically, the data-cleansing operation can be achieved using the following predefined transformations in SSIS:
Fuzzy Lookup
Fuzzy Grouping
Aggregate (to group related, similar records into a single record using a group-by operation)
Sort (to sort the source data and also remove rows with duplicate sort values)

Data Aggregation: Data-aggregation operations allow us to aggregate the source (input) data, working the same way as a SQL GROUP BY clause combined with aggregate functions such as COUNT(*), COUNT(DISTINCT), SUM(), MIN(), MAX() and AVG(), as in the sketch below.
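For instance, the following T-SQL (a hypothetical example against the AdventureWorks HumanResources.Employee table used later in this material) produces the kind of result the Aggregate transformation computes inside the pipeline:

-- Count employees and find hire-date extremes per marital status
SELECT  MaritalStatus,
        COUNT(*)      AS EmployeeCount,
        MIN(HireDate) AS EarliestHire,
        MAX(HireDate) AS LatestHire
FROM    HumanResources.Employee
GROUP BY MaritalStatus;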

Data Scrubbing: Defining a new structure (metadata) based on existing upstream columns/variables and the various functions available.
MSBI stands for Microsoft Business Intelligence. This suite is composed of tools which help in providing the best solutions for business-intelligence queries. These tools use Visual Studio along with SQL Server. It empowers users to gain access to accurate, up-to-date information


for better decision making in an organization. It offers different tools for different processes
which are required in Business Intelligence (BI) solutions.

MSBI is divided into three categories:
SSIS - SQL Server Integration Services
SSRS - SQL Server Reporting Services
SSAS - SQL Server Analysis Services
A visual always helps to understand a concept better. The diagram below broadly defines Microsoft Business Intelligence (MSBI).

SSIS:
SSIS stands for SQL Server Integration Services. It is a platform for data-integration and workflow applications. It can perform operations like data migration and ETL (Extract, Transform and Load).
E - Merging of data from heterogeneous data stores (text files, spreadsheets, mainframes, Oracle, etc.). This process is known as EXTRACTION.
T - Refreshing data in data warehouses and data marts; also cleansing data before loading to remove errors. This process is known as TRANSFORMATION.
L - High-speed loading of data into Online Transaction Processing (OLTP) and Online Analytical Processing (OLAP) databases. This process is known as LOADING.
Tools used for the development of SSIS projects are:
SSDT (SQL Server Data Tools) (in SQL Server 2005/2008, BIDS - Business Intelligence Development Studio).
SSMS (SQL Server Management Studio).
Note: Prior to SSIS, the same tasks were performed with DTS (Data Transformation Services) in SQL Server 2000, but with fewer features. The differences between DTS and SSIS are as follows:
DTS:
Limited error handling.
Message boxes in ActiveX scripts.


No deployment wizard and no BI functionality.

SSIS (SQL Server Integration Services):

Complex and powerful error handling.
Message boxes in .NET scripting.
Interactive deployment wizard and complete BI functionality.

To develop an SSIS package, you need to install SQL Server Data Tools (SSDT), which is available as a client tool after installing SQL Server Management Studio (SSMS).
SSDT (SQL Server Data Tools): The tool used to develop SSIS packages. It ships with SQL Server as an interface that lets developers work on the control flow of a package step by step.
SSMS: Provides different options for working with SSIS packages, such as the Import/Export wizard. With this wizard we can create a structure defining how the data flow should happen, and the created package can then be deployed as per the requirement.
Now you must be scratching your head about Data Flow and Control Flow. Data flow means extracting data into the server's memory, transforming it and writing it out to an alternative destination, whereas control flow is the set of instructions that tells the program executor how to execute tasks and containers within an SSIS package. All these concepts are explained below.

SSIS Architecture:
Packages - A package (an executable in SSIS) is a collection of tasks framed together with precedence constraints to manage and execute the tasks in order. It is saved as an XML-structured file with the .dtsx extension.
Control Flow - Control flow defines the actions (tasks/containers) to be executed during execution of the package. It consists of one or more tasks and containers that execute when the package runs. The control flow orchestrates the order of execution for all its components.
Tasks - A task is an individual unit of work.
Precedence Constraints - These are the arrows in the control flow of a package that connect the tasks together and manage the order in which the tasks execute. In the data flow, the arrows are known as data paths.
Containers - Core units in the SSIS architecture for grouping tasks together logically into units of work.
Connection Managers - Connection managers are used to centralize connection strings to data sources and to abstract them from the SSIS packages. Multiple tasks can share the same connection manager.
Data Flow - The core strength of SSIS is its capability to extract data into the server's memory (extraction), transform it (transformation) and write it out to an alternative destination (loading).


Sources - A source is a component that you add to the Data Flow design surface to specify the location of the source data.
Transformations - Transformations are key components within the data flow that allow changes to the data within the data pipeline.
Destinations - Inside the data flow, destinations consume the data after the data pipeline leaves the last transformation component.
Variables - Variables can be set to evaluate to an expression at runtime.
Parameters - Parameters behave much like variables, but with a few main exceptions.
Event Handlers - Event handlers run in response to the run-time events that packages, tasks, and containers raise.
Log Providers - Log providers handle logging of package run-time information, such as the start time and stop time of the package and its tasks and containers.
Package Configurations - After developing your package and before deploying it from UAT to the production environment, you need to perform certain package configurations as per the production server.
This completes the basics of SSIS and its architecture.

SSIS Architecture
Microsoft SQL Server Integration Services (SSIS) consists of four key parts:
SSIS Service
SSIS Object Model
SSIS runtime engine and the runtime executables
SSIS dataflow engine and the dataflow components (the SSIS Data Flow Pipeline Engine)


Integration Services Service
Monitors running Integration Services packages and manages the storage of packages.
Integration Services Object Model
Includes native and managed application programming interfaces (APIs) for accessing Integration Services tools, command-line utilities, and custom applications.
SSIS Run-time Engine and Executables
Runs packages.
Supports logging, debugging, configuration, connections, and transactions.
SSIS run-time executables: package, containers, tasks and event handlers.
SSIS Data-flow Engine and Components
Provides in-memory buffers to move data.
Calls source adapters to read from files and databases.
Provides transformations to modify data.
Provides destination adapters to load data into data stores.



SQL Server Data Tools
SQL Server Data Tools (SSDT) allows users to create and edit SSIS packages using a drag-and-drop user interface. SSDT is very user friendly, and a variety of elements define the workflow in a single package. Upon package execution, the tool provides color-coded, real-time monitoring.
Components of an SSIS package include:
1. Control Flow
2. Data Flow

Control Flow
Control flow defines the actions to be executed during execution of the package; in other words, it controls the sequence of execution of the executables (packages, containers and/or tasks) in a package.
Control flow deals with the orderly processing of tasks: individual, isolated units of work that perform a specific action ending with a finite outcome (one that can be evaluated as Success, Failure, or Completion). While their sequence can be customized by linking them into arbitrary arrangements with precedence constraints, grouping them together, or repeating their execution in a loop with the help of containers, a subsequent task does not initiate unless its predecessor has completed.


Elements of the control flow include:

Containers
Containers provide structure in packages and services to tasks in the control flow. Integration Services includes the following container types for grouping tasks and implementing repeating control flows:
Foreach Loop Container: Enumerates a collection and repeats its control flow for each member of the collection. The Foreach Loop Container is for situations where you have a collection of items and wish to use each item within it as some kind of input into the downstream flow.

For Loop Container: A basic container that provides looping functionality. A For Loop contains a counter that usually increments (though it can decrement), after which a comparison is made with a constant value. If the condition evaluates to True, the loop execution continues.
Sequence Container: A special kind of container that, both conceptually and physically, can hold any other type of container or control-flow component. It is also called a "container of containers", or a super container.
Note: If the developer does not use any container in the package, the SSIS runtime engine wraps each task in a default container called the Task Host Container.

Tasks
A task is an individual unit of work in a workflow. Tasks do the work in packages. Integration Services includes tasks for performing a variety of functions:
The Data Flow task: Defines and runs data flows that extract data, apply transformations, and load data.
Data-preparation tasks: Copy files and directories, download files and data, save data returned by web methods, or work with XML documents.
Workflow tasks: Communicate with other processes to run packages or programs, send and receive messages between packages, send e-mail messages, read Windows Management Instrumentation (WMI) data, or watch for WMI events.
SQL Server tasks: Access, copy, insert, delete, or modify SQL Server objects and data.
Analysis Services tasks: Create, modify, delete, or process Analysis Services objects.
Scripting tasks: Extend package functionality through custom scripts.
Maintenance tasks: Perform administrative functions, such as backing up and shrinking SQL Server databases, rebuilding and reorganizing indexes, and running SQL Server Agent jobs.
Data Flow
The data flow carries out its processing responsibilities by employing the pipeline paradigm, carrying data record by record from its source to a destination and modifying it in transit by applying transformations. (There are exceptions to this rule, since some transformations, such as Sort or Aggregate, require the ability to view the entire data set before handing it over to their downstream counterparts.)

Elements of the data flow are categorized into three parts:
Data Flow Sources: These elements are used to read data from different types of sources (SQL Server, Excel sheets, etc.).
Data Flow Transformations: These elements are used to process the data (cleaning, adding new columns, etc.).
Data Flow Destinations: These elements are used to save the processed data into the desired destination (SQL Server, Excel sheets, etc.).


Other Sources
The different items which can read the various types of source data are listed below:
DataReader Source: Uses an ADO.NET connection manager to read data from a DataReader and channel it into the data flow.
Excel Source: Connects to an Excel file and, selecting content based on a number of configurable settings, supplies the data flow with data. The Excel Source uses the Excel connection manager to connect to the Excel file.
Flat File Source: Flat-file formats, which include CSV and fixed-width columns, are still popular. For many reasons, individual circumstances can dictate the use of CSV files over other formats, which is why the Flat File Source remains a popular data-flow source.
OLE DB Source: Used when data access is performed via an OLE DB provider. It's a fairly simple source type, and everyone is familiar with OLE DB connections.
Raw File Source: Used to import data that is stored in the SSIS raw file format. It is a rapid way to import data that has perhaps been output by a previous package in the raw format.
XML Source: Requires an XML Schema Definition (XSD) file, which is really the most important part of the component because it describes how SSIS should handle the XML document.
Common and Other Transformations
Items in this category are used to perform different operations to get data into the required format.
Aggregate: The Aggregate transformation essentially encapsulates a number of aggregate functions as part of the data flow: Count, Count Distinct, Sum, Average, Minimum, Maximum, and Group By with respect to one or more columns.


Audit: The Audit transformation exposes system variables to the data flow so that they can be used in the stream. This is accomplished by adding columns to the data-flow output. When you map the required system variable or variables to the output columns, the system variables are introduced into the flow and can be used.
Character Map: Performs string manipulations on input columns, like lowercase, uppercase, etc.
Conditional Split: Splits the data flow based on a condition. Depending upon the results of an evaluated expression, data is routed as specified by the developer.
Copy Column: Makes a copy of a column contained in the input-columns collection and appends it to the output-columns collection.
Data Conversion: Converts data from one type to another, just like type casting.
Data Mining Query: The data-mining implementation in SQL Server is all about the discovery of factually correct forecasted trends in data. This is configured within SSAS against one of the provided data-mining algorithms. The DMX query requests a predictive set of results from one or more such models built on the same mining structure. It can be a requirement to retrieve predictive information about the same data calculated using the different available algorithms.
Derived Column: One or more new columns are appended to the output-columns collection based upon the work performed by the task, or the result of the derived function replaces an existing column value.
Export Column: Used to extract data from within the input stream and write it to a file. There's one caveat: the data type of the column or columns for export must be DT_TEXT, DT_NTEXT, or DT_IMAGE.
Fuzzy Grouping: For use in cleansing data. By setting and tweaking task properties, you can achieve great results because the task interprets input data and makes intelligent decisions about its uniqueness.
Fuzzy Lookup: Uses a reference (or lookup) table to find suitable matches. The reference table needs to be available and selectable as a SQL Server table. It uses a configurable fuzzy-matching algorithm to make intelligent matches.
Import Column: Used to import data from a file or source into the data flow.
Lookup: Leverages reference data and joins between input columns and columns in the reference data to provide a row-by-row lookup of source values. The reference data can be a table, view, or dataset.
Merge: Combines two separate sorted datasets into a single dataset that is expressed as a single output.
Merge Join: Uses joins to generate output. Rather than requiring you to enter a query containing the join (for example, SELECT x.columna, y.columnb FROM tablea x INNER JOIN tableb y ON x.joincolumna = y.joincolumnb), the task editor lets you set it up graphically.
Multicast: Takes an input and makes any number of copies directed as distinct outputs.
OLE DB Command: Executes a SQL statement for each row in the input stream. It's kind of like a high-performance cursor in many ways.
Percentage Sampling: Generates and outputs a dataset into the data flow based on a sample of data. The sample is entirely random, to represent a valid cross-section of available data.


Pivot: Essentially encapsulates the functionality of a pivot query in SQL. A pivot query denormalizes a normalized data set by rotating the data around a central point (a value), as sketched below.
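For reference, a minimal T-SQL pivot of the kind this transformation reproduces (a hypothetical dbo.Sales table with SalesYear and Amount columns is assumed):

-- Rotate yearly sales rows into one column per year
SELECT *
FROM   (SELECT SalesYear, Amount FROM dbo.Sales) AS src
PIVOT  (SUM(Amount) FOR SalesYear IN ([2021], [2022], [2023])) AS p;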
Row Count: Counts the number of rows as they flow through the component. It uses a specified variable to store the final count. It is a very lightweight component, in that no processing is involved, because the count is just a property of the input-rows collection.
Row Sampling: In a similar manner to the Percentage Sampling transform discussed earlier, used to create a (pseudo) random selection of data from the data flow. This transform is very useful for performing operations that would normally be executed against a full set of data held in a table. In very high-volume OLTP databases, however, this just isn't possible at times. The ability to execute tasks against a representative subset of the data is a suitable and valuable alternative.
Sort: A step beyond the equivalent ORDER BY clause in the average SQL statement, in that it can also strip out duplicate values.
Script Component: Used for scripting custom code in a transformation. It can be used not only as a transform but also as a source or a destination component.
Slowly Changing Dimension: Used to maintain dimension tables held in data warehouses. It is a highly specific task that acts as the conduit between an OLTP database and a related OLAP database.
Term Extraction: Extracts terms from within an input column and then passes them into the data flow as an output column. The source column data type must be either DT_STR or DT_WSTR.
Term Lookup: Wraps the functionality of the Term Extraction transform and uses the values extracted to compare to a reference table, just like the Lookup transform.
Union All: Just like a UNION ALL statement in SQL, combines any number of inputs into one output. Unlike the Merge task, no sorting takes place in this transformation. The columns and data types for the output are created when the first input is connected to the task.
Unpivot: Essentially encapsulates the functionality of an unpivot query in SQL. An unpivot query increases the normalization of a less-normalized or denormalized data set by rotating the data back around a central point (a value).
Other Destinations
Finally, the processed data is saved/loaded at the destination with the help of these items:
Data Mining Model Training: Trains data-mining models using sorted data contained in the upstream data flow. The received data is piped through the SSAS data-mining algorithms for the relevant model.
DataReader Destination: The results of an SSIS package executed from a .NET assembly can be consumed by connecting to the DataReader Destination.
Dimension Processing: Another SSAS-related destination component, used to load and process an SSAS dimension.
Excel Destination: Has a number of options for how the destination Excel file should be accessed (Table or View, Table Name or View Name variable, or SQL Command).
Flat File Destination: Writes data out to a text file in one of the standard flat-file formats: delimited, fixed width, or fixed width with row delimiter.


OLE DB Destination: Inserts data into any OLE DB-compliant data source.
Partition Processing: Loads and processes an SSAS partition. In many ways it is almost exactly the same as the Dimension Processing destination, at least in terms of configuration: you select or create an SSAS connection manager, choose the partition to process, and then map input columns to the columns in the selected partition.
Raw File Destination: All about raw speed. It is an entirely native format and can be exported and imported more rapidly than any other connection type, in part because the data doesn't need to pass through a connection manager.
Recordset Destination: Creates an instance of an ActiveX Data Objects (ADO) Recordset and populates it with data from specified input columns.
SQL Server Destination: Provides a connection to a SQL Server database. Selected columns from the input data are bulk-inserted into a specified table or view. In other words, this destination is used to populate a table held in a SQL Server database.
SQL Server Mobile Destination: Used to connect and write data to a SQL Server Mobile (or SQL Server Compact Edition) database.

How to create an SSIS Project:

1. Open SSDT (SQL Server Data Tools).
Make sure you have SQL Server (2005 or higher) installed on your machine with SSDT. Go to Start → Programs → Microsoft SQL Server (with the version you have installed) and open SQL Server Data Tools.

2. Create a new project: in SSDT select File → New → Project.

3. You will get the New Project dialog box, where you should:
I. Select Business Intelligence Projects in Project Types
II. Select Integration Services Project in Templates
III. Give it a name (try to avoid spaces, for compatibility reasons)
IV. Remember or change the location
V. Click OK to create the SSIS project


Click OK.

Below is an example of an empty package. I have highlighted the elements we will use and briefly discuss them below (you can ignore the rest):


Solution Explorer - On the right you see Solution Explorer with your SSIS project (first icon from the top). If you don't have it, go to View → Solution Explorer. In the majority of cases you will use SSIS Packages only.
Package tab - In the middle we have Package.dtsx opened, which contains the Control Flow and Data Flow tabs that we will use.
Toolbox - Shows the tools (items/tasks) that we can use to build our ETL package. The toolbox is different for the Control Flow and Data Flow tabs of the package.
Control Flow - Here you control your execution steps. For example, you can log certain information before you start the data transfer, you can check if a file exists, and you can send an e-mail when a package fails or finishes. Here you also add the task that moves data from source to destination; however, you use the Data Flow tab to configure it.
Data Flow - Used to extract source data and define the destination. During the data flow you can perform all sorts of transformations, for instance creating new calculated fields, performing aggregations, and many more.

EXECUTE SQL TASK:

The Execute SQL Task is used to execute any valid SQL statement (DDL, DML, DCL, functions, stored procedures, etc.).
In the Control Flow tab, drag and drop an Execute SQL Task from the Control Flow Items (Toolbox).
In SSMS (SQL Server Management Studio), run the following query to create a new table (HumanResources.EmployeeSource) from the existing table (HumanResources.Employee):
SELECT * INTO HumanResources.EmployeeSource
FROM HumanResources.Employee
Scenario: Truncate (clean up) the HumanResources.EmployeeSource table using the Execute SQL Task and a TRUNCATE statement, as shown below.
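The statement the task will run is simply:

TRUNCATE TABLE HumanResources.EmployeeSource;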
Steps to Configure the Execute SQL Task:
In the General page, set the following properties:
1. Under the General section:
Name: Cleanup HumanResources_EmployeeSource
Description: Cleanup HumanResources_EmployeeSource

2. Under the Options section:

Timeout: 0 (in seconds; 0, the default, means no timeout)
Code Page: 1252 (the default ANSI code page for US English)
3. Under the Result Set section:
ResultSet: None (the default; the other options are Single row, Full result set and XML, which we describe in later sessions)
4. Under the SQL Statement section:
Connection Type: OLE DB (the default option)
Connection: Select New connection → Click New → In the Connection Manager editor:
Server Name: Sys or Localhost or . (period)


Database Name: Select the AdventureWorks database from the drop-down list → Click OK twice.
SQL Source Type: Direct Input (this option allows us to enter the SQL query to be executed). The other options are:
File Connection: Attach a .sql file from your file system whose query should be executed.
Variable: Create an SSIS variable and set
Name: uvSQL, Data type: String, Scope: Package, and Value: Truncate Table Table-Name
5. Finally, click OK to save the changes.

6. Execute the package (F5), or in Solution Explorer select the package → right-click → Execute Package.
7. Go to SSMS and run the query SELECT * FROM HumanResources.EmployeeSource; 0 rows are returned, as the table has been cleaned up.

Execute Package Task:

The Execute Package Task is used to execute a package from within another package (the way we call and execute a function within another function in a general programming language like C/C++).
In the project created above, create a new SSIS package (Project menu → Select New SSIS Package) and rename it Master.dtsx.


Steps to configure the Execute Package Task:

Reference Type:
I. Project Reference
II. External Reference
If Reference Type = Project Reference, then:
PackageNameFromProjectReference: Select a package from the current project.
If Reference Type = External Reference, then:
Location:
1. SQL Server
2. File System
If Location = File System:
Connection: New Connection → Select any package to be executed within the Master.dtsx package → Click OK.
Or, if Location = SQL Server:
Connection: Select New Connection → Provide the server name and select the database (AdventureWorks) → Click OK.
Package Name: Select a package that has already been deployed/published to the server.


Execute Package.
----------------------------------------------------------------------------------------------------------------------------

Data Conversion Transformation: The Data Conversion transformation is used to convert columns to a different data type (type cast); it also adds a new column (output alias) for every converted input column.
Note: The Data Conversion transformation is used to make sure the source table structure and the destination structure are in sync in terms of the data types and lengths of columns.
In the project created above, create a new SSIS package (Project menu → Select New SSIS Package) and rename it DataConversion.dtsx.
1. In the Control Flow tab: drag and drop a Data Flow Task and rename it DFT Data Conversion.
2. Double-click the Data Flow Task (or right-click it and select Edit) to navigate to the Data Flow tab.
3. In the Data Flow tab, from the Data Flow Sources section (in the Toolbox; shortcut key Alt+Ctrl+X), drag and drop the OLEDB Source adapter/component and apply the following settings to configure it.
Note: The OLEDB Source component is used to extract data from any relational database using an OLE DB provider.
OLEDB Connection Manager: click New (to create a new connection, as this package is new).


Click New → In the Connection Manager editor:

Server Name: Localhost / ServerName / . (period)
Database: Select the AdventureWorks database from the drop-down.
Click OK twice.

Data access mode: Select the SQL Command option.

SQL Command: Provide the following query to extract/read the data from the specified table:
SELECT * FROM HumanResources.Employee WITH (NOLOCK)
Click Build Query to list all the columns explicitly instead of using * (SELECT * kills the performance of the data-extract process).


Click OK.

4. Drag and drop an OLEDB Destination and set the following properties to configure it.
Select the OLEDB Source we configured above; we can see two arrows, a green one and a red one.
Select the green arrow (the data-flow pipeline) and drag and drop it onto the OLEDB Destination.
OLEDB Connection Manager → Select the connection manager.
Data access mode → Table or view - fast load (the default option).
Name of the table or view → Select a destination table if it exists; otherwise click New to create a new table.
In the Create Table editor, rename OLEDB Destination (the default table name) to [DataConversion].
In the CREATE TABLE structure, change the data types of the following columns to replicate/reproduce the data-conversion issue:
[NationalIDNumber] varchar(15)
[LoginID] varchar(256)
[MaritalStatus] varchar(256)
Note: Now the data types of the above columns no longer match the source columns, which causes an error. The resulting DDL is sketched below.
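For reference, the destination DDL that reproduces the mismatch would look roughly like this (a sketch; the remaining columns keep the source table's definitions):

CREATE TABLE [DataConversion] (
    [EmployeeID] int NOT NULL,
    [NationalIDNumber] varchar(15) NOT NULL,   -- source column is nvarchar(15)
    [LoginID] varchar(256) NOT NULL,           -- source column is nvarchar(256)
    [MaritalStatus] varchar(256) NOT NULL      -- source column is nchar(1)
    -- ... other columns as in HumanResources.Employee
);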

Click OK (you can now observe that the [DataConversion] table has been created in the specified database).
Select the Mappings page.
Click OK.


You can observe the following error raised by the OLEDB Destination, because the data type of NationalIDNumber does not match the source system/table.

Select the green arrow (data-flow pipeline) between the source and destination and remove/delete it.

Drag and drop the Data Conversion transformation (from Data Flow Transformations) between the OLEDB Source and the OLEDB Destination.
Select the OLEDB Source and drag and drop the green data-flow pipeline onto the Data Conversion transformation.
Double-click the Data Conversion transformation and set the following properties:
Select NationalIDNumber from Available Input Columns.
The input column (NationalIDNumber) is fixed and never changes.
For every input column, SSIS creates a new output alias which is carried forward to the next level for further processing.
Change the data type from DT_WSTR (nvarchar) to DT_STR (varchar).
Input Column: NationalIDNumber
Output Alias: NationalIDNumberDC (renamed), and click OK.


n. Drag and drop the green arrow onto the OLEDB Destination.
o. Double-click the OLEDB Destination.
p. Select the Mappings page.
q. Change the mapping to:
Input Column: NationalIDNumberDC
Destination Column: NationalIDNumber

r. Click OK to save the changes.

Note: Please follow the steps mentioned above to resolve the corresponding issues with the LoginID and MaritalStatus columns.
s. Execute the package.

Derived Column Transformation: The Derived Column transformation is used to derive new columns (computed columns) from existing variables, columns and/or functions.
Scenario: Create a new column, LastExecutedDate, to record when the package was last executed, as part of auditing, and append it to the destination table.
Steps to Configure the Derived Column Transformation:

In the project created above, create a new SSIS package (Project menu → Select New SSIS Package) and rename it DerivedColumn.dtsx.
1. In the Control Flow tab: drag and drop a Data Flow Task and rename it DFT Derived Column.
2. Double-click the Data Flow Task (or right-click it and select Edit) to navigate to the Data Flow tab.
3. In the Data Flow tab, from the Data Flow Sources section (in the Toolbox; shortcut key Alt+Ctrl+X), drag and drop the OLEDB Source adapter/component and apply the following settings to configure it.
Note: The OLEDB Source component is used to extract data from any relational database using an OLE DB provider.
OLEDB Connection Manager: click New (to create a new connection, as this package is new).
Click New → In the Connection Manager editor:
Server Name: Localhost / ServerName / . (period)
Database: Select the AdventureWorks database from the drop-down.
Click OK twice.


Data access mode: Select the SQL Command option.

SQL Command: Provide the following query to extract/read the data from the specified table:
SELECT * FROM HumanResources.Department WITH (NOLOCK)
Click Build Query to select the required columns (or all columns) explicitly instead of using * (SELECT * kills the performance of the data-extract process). Uncheck the ModifiedDate field.


Click OK.
4. Drag and drop an OLEDB Destination and set the following properties to configure it.
Select the OLEDB Source we configured above; we can see two arrows, a green one and a red one.
Select the green arrow (the data-flow pipeline) and drag and drop it onto the OLEDB Destination.
OLEDB Connection Manager → Select the connection manager.
Data access mode → Table or view - fast load (the default option).
Name of the table or view → Select a destination table if it exists; otherwise click New to create a new table.
In the Create Table editor, rename OLEDB Destination (the default table name) to [DerivedColumn].
In the CREATE TABLE structure, add/append a new column, LastExecutedDate, with the DateTime data type, as mentioned below:
, LastExecutedDate DateTime

Click OK (you can now observe that the [DerivedColumn] table has been created in the specified database).
Select the Mappings page (the destination column LastExecutedDate is not mapped to any input column, i.e. it is Ignored).
Click OK.

Select the green arrow (data-flow pipeline) between the source and destination and remove/delete it.
Drag and drop the Derived Column transformation (from Data Flow Transformations) between the OLEDB Source and the OLEDB Destination.


Select the OLEDB Source and drag and drop the green data-flow pipeline onto the Derived Column transformation.
Double-click the Derived Column transformation and set the following properties:
Expression: GETDATE()
Derived Column Name: LastExecutedDate
Click OK.
Drag and drop the green arrow onto the OLEDB Destination.
Double-click the OLEDB Destination.
Select the Mappings page.
Change the mapping to:
Input Column: Select LastExecutedDate
Destination Column: LastExecutedDate
Click OK.
Execute the package.
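The loaded result is roughly what the following T-SQL would return (a sketch for comparison only; the package computes the extra column in the pipeline rather than in the query):

SELECT d.*, GETDATE() AS LastExecutedDate
FROM HumanResources.Department AS d WITH (NOLOCK);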

Conditional Split Transformation: The Conditional Split transformation splits the data flow based on conditions. Depending upon the results of an evaluated expression, data is routed as specified by the developer.
Note: The implementation of the Conditional Split transformation is similar to a switch-case decision structure in a general programming language (C/C++).
Steps to configure the Conditional Split Transformation:
Scenario: Split the source Employee data to multiple destinations based on a few conditions (Gender and Marital Status).
In the project created above, create a new SSIS package (Project menu → Select New SSIS Package) and rename it Conditional_Split.dtsx.
1. In the Control Flow tab: drag and drop a Data Flow Task and rename it DFT Conditional Split.
2. Double-click the Data Flow Task (or right-click it and select Edit) to navigate to the Data Flow tab.
3. In the Data Flow tab, from the Data Flow Sources section (in the Toolbox; shortcut key Alt+Ctrl+X), drag and drop the OLEDB Source adapter/component and apply the following settings to configure it.
Note: The OLEDB Source component is used to extract data from any relational database using an OLE DB provider.
OLEDB Connection Manager: click New (to create a new connection, as this package is new).
Click New → In the Connection Manager editor:
Server Name: Localhost / ServerName / . (period)
Database: Select the AdventureWorks database from the drop-down.
Click OK twice.


Data access mode: Select the SQL Command option.

SQL Command: Provide the following query to extract/read the data from the specified table:
SELECT * FROM HumanResources.Employee WITH (NOLOCK)
Click Build Query to list all the columns explicitly instead of using * (SELECT * kills the performance of the data-extract process).


Click OK.
4. Drag and drop the Conditional Split transformation (from the Data Flow Transformations section) after the OLEDB Source and set the following properties.
Select the OLEDB Source and drag and drop the green data-flow pipeline onto the Conditional Split transformation.
Double-click the Conditional Split transformation and set the following properties:
Output Name: Single Male (rename Case 1)
Condition: [Gender] == "M" && [MaritalStatus] == "S"

Output Name: Single Female (rename Case 2)

Condition: [Gender] == "F" && [MaritalStatus] == "S"

Default Output Name: Other than Single Male and Single Female (married employees). The routing is equivalent to the T-SQL sketched below.
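For comparison only (a sketch; the package routes rows in the pipeline instead of querying three times):

-- Single Male output
SELECT * FROM HumanResources.Employee WHERE Gender = 'M' AND MaritalStatus = 'S';
-- Single Female output
SELECT * FROM HumanResources.Employee WHERE Gender = 'F' AND MaritalStatus = 'S';
-- Default output (everything else)
SELECT * FROM HumanResources.Employee WHERE NOT (MaritalStatus = 'S' AND Gender IN ('M', 'F'));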


Click OK.
5. Drag and drop an OLEDB Destination (OLEDB Destination1) and set the following properties to configure it.
Select the Conditional Split transformation we configured above; drag and drop a green arrow (data-flow pipeline) onto OLEDB Destination1, select the Single Male output option and click OK.
OLEDB Connection Manager → Select the connection manager.
Data access mode → Table or view - fast load (the default option).
Name of the table or view → Select a destination table if it exists; otherwise click New to create a new table.
In the Create Table editor, rename OLEDB Destination (the default table name) to [SingleMaleData].

Click OK (you can now observe that the [SingleMaleData] table has been created in the specified database).
Select the Mappings page.
Click OK.
6. Drag and drop a second OLEDB Destination (OLEDB Destination2) and set the following properties to configure it.
Select the Conditional Split transformation again; drag and drop a green arrow (data-flow pipeline) onto OLEDB Destination2, select the Single Female output option and click OK.
OLEDB Connection Manager: Select the connection manager.
Data access mode: Table or view - fast load (the default option).
Name of the table or view → Select a destination table if it exists; otherwise click New to create a new table.
In the Create Table editor, rename OLEDB Destination (the default table name) to [SingleFemaleData].

Click OK (you can now observe that the [SingleFemaleData] table has been created in the specified database).
Select the Mappings page.
Click OK.
Note: Follow the steps mentioned above to capture the Conditional Split default output as well.

Merge Transformation: Merges/integrates data from two sorted data sources into a single destination.
Steps to configure the Merge Transformation:
Scenario: Merge/integrate data from two sorted data sources into a single destination.
In the project created above, create a new SSIS package (Project menu → Select New SSIS Package) and rename it MergeEmployee_EmpAddress.dtsx.
1. In the Control Flow tab: drag and drop a Data Flow Task and rename it DFT Merge.
2. Double-click the Data Flow Task (or right-click it and select Edit) to navigate to the Data Flow tab.
3. In the Data Flow tab, from the Data Flow Sources section (in the Toolbox; shortcut key Alt+Ctrl+X), drag and drop an OLEDB Source adapter/component and rename it OLEDBSrc1, then apply the following settings to configure it.
Note: The OLEDB Source component is used to extract data from any relational database using an OLE DB provider.
OLEDB Connection Manager: click New (to create a new connection, as this package is new).
Click New → In the Connection Manager editor:
Server Name: Localhost / ServerName / . (period)
Database: Select the AdventureWorks database from the drop-down.
Click OK twice.


Data access mode: Select the SQL Command option.

SQL Command: Provide the following query to extract/read the data from the specified table:
SELECT * FROM HumanResources.Employee WITH (NOLOCK)
Click Build Query to list all the columns explicitly instead of using * (SELECT * kills the performance of the data-extract process).


Click OK.

4. In the Data Flow tab, from the Data Flow Sources section, drag and drop a second OLEDB Source adapter/component and rename it OLEDBSrc2, then apply the following settings to configure it.
OLEDB Connection Manager: click New (or reuse the connection created above).
Click New → In the Connection Manager editor:
Server Name: Localhost / ServerName / . (period)
Database: Select the AdventureWorks database from the drop-down.
Click OK twice.


Data access mode: Select the SQL Command option.

SQL Command: Provide the following query to extract/read the data from the specified table:
SELECT * FROM HumanResources.EmployeeAddress WITH (NOLOCK)
Click Build Query to list all the columns explicitly instead of using * (SELECT * kills the performance of the data-extract process).


5. Click OK.
At this point, if you try to edit the Merge transformation you will get the error below. The reason is that the data needs to be sorted for the Merge transformation to work. We will look at two options for handling this sorting need.

6. Data is presorted prior to loading.

Let's assume that our data is sorted prior to loading. We therefore need to tell SSIS that this is the case, as well as show which column the data is sorted on.
Right-click OLEDBSrc1.
Select Show Advanced Editor.
On the Input and Output Properties tab (the last tab),
Select OLE DB Source Output and set IsSorted to True.
Expand OLE DB Source Output.
Expand Output Columns.
Select EmployeeID and set SortKeyPosition = 1.

Right-click OLEDBSrc2 and set the following properties to sort the source output:


Select the "Show Advanced Editor".


O the Input and Output Properties tab (Last Tab)
Select OLEDB Source Output and set "IsSorted" to True
Expand OLEDB Source Output
Expand Output Columns
Select EmployeeID and set SortKeyPosition - 1

Note: Of course, we can use Sort Transformations to sort source data, but not suggestible as
it is blocked transformation and which hampers the performance data load process.
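Also note that IsSorted only declares that the data is already ordered; it does not sort anything. If the source rows are not physically ordered, the extract query itself should do the ordering, e.g. (a sketch):

SELECT EmployeeID, NationalIDNumber, LoginID   -- and the remaining columns
FROM HumanResources.Employee WITH (NOLOCK)
ORDER BY EmployeeID;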

7. Drag and drop the Merge transformation and make a connection from OLEDBSrc1 to the Merge transformation; you will see the Input Output Selection window shown below, where you select the input as shown.


8. Click OK.
9. Make a connection from OLEDBSrc2 to the Merge transformation in the same way.
10. Drag and drop an OLEDB Destination and set the following properties to configure it.
Select the Merge transformation; we can see the green and red arrows.
Select the green arrow (data-flow pipeline), drag and drop it onto the OLEDB Destination, and set the following properties:
OLEDB Connection Manager: Select the connection manager.
Data access mode: Table or view - fast load (the default option).
Name of the table or view → Select a destination table if it exists; otherwise click New to create a new table.
In the Create Table editor, rename OLEDB Destination (the default table name) to [MergedData].

Click OK (you can now observe that the [MergedData] table has been created in the specified database).
Select the Mappings page.
Click OK.

11. Execute the package.

Merge Join Transformation: The Merge Join transformation merges data from two sorted datasets into a single destination using joins (Inner, Left Outer and Full Outer; the SSIS Merge Join transformation does not support Right Outer Join).


In the project created above, create a new SSIS package (Project menu → Select New SSIS Package) and rename it MergeJoinEmployee_EmpAddress.dtsx.
1. In the Control Flow tab: drag and drop a Data Flow Task and rename it DFT Merge Join.
2. Double-click the Data Flow Task (or right-click it and select Edit) to navigate to the Data Flow tab.
3. In the Data Flow tab, from the Data Flow Sources section (in the Toolbox; shortcut key Alt+Ctrl+X), drag and drop an OLEDB Source adapter/component and rename it OLEDBSrc1, then apply the following settings to configure it.
Note: The OLEDB Source component is used to extract data from any relational database using an OLE DB provider.
OLEDB Connection Manager: click New (to create a new connection, as this package is new).
Click New → In the Connection Manager editor:
Server Name: Localhost / ServerName / . (period)
Database: Select the AdventureWorks database from the drop-down.
Click OK twice.

Data access mode: Select the SQL Command option.

SQL Command: Provide the following query to extract/read the data from the specified table:
SELECT * FROM HumanResources.Employee WITH (NOLOCK)
Click Build Query to list all the columns explicitly instead of using * (SELECT * kills the performance of the data-extract process).


Click OK.

4. In the Data Flow tab, from the Data Flow Sources section, drag and drop a second OLEDB Source adapter/component and rename it OLEDBSrc2, then apply the following settings to configure it.
OLEDB Connection Manager: click New (or reuse the connection created above).
Click New → In the Connection Manager editor:
Server Name: Localhost / ServerName / . (period)
Database: Select the AdventureWorks database from the drop-down.
Click OK twice.


Data access mode: Select the SQL Command option.

SQL Command: Provide the following query to extract/read the data from the specified table:
SELECT * FROM HumanResources.EmployeeAddress WITH (NOLOCK)
Click Build Query to list all the columns explicitly instead of using * (SELECT * degrades the performance of the data-extract process).


5. Click OK.
At this point, if you try to edit the Merge Join transformation you will get the error below. The reason is that the data needs to be sorted for the Merge Join transformation to work. We will look at two options for handling this sorting need.

6. Data is presorted prior to loading.

Let's assume that our data is sorted prior to loading. We therefore need to tell SSIS that this is the case, as well as show which column the data is sorted on.
Right-click OLEDBSrc1.
Select Show Advanced Editor.
On the Input and Output Properties tab (the last tab),
Select OLE DB Source Output and set IsSorted to True.
Expand OLE DB Source Output.
Expand Output Columns.
Select EmployeeID and set SortKeyPosition = 1.

Right-click OLEDBSrc2 and set the following properties to sort the source output:


Select the "Show Advanced Editor".


O the Input and Output Properties tab (Last Tab)
Select OLEDB Source Output and set "IsSorted" to True
Expand OLEDB Source Output
Expand Output Columns
Select EmployeeID and set SortKeyPosition - 1

Note: Of course, we can use Sort Transformations to sort source data, but not suggestible as
it is blocked transformation and which hampers the performance data load process.

7. Drag and drop the Merge Join transformation and make a connection from OLEDBSrc1 (the left source) to the Merge Join transformation; you will see the Input Output Selection window shown below, where you select the input as shown.


8. Click OK.
9. Make a connection from OLEDBSrc2 (the right source) to the Merge Join transformation.
10. Double-click the Merge Join transformation and set the following properties:
Join Type: Inner Join (the default; select the appropriate join type, as discussed in the classroom training). The joined output is equivalent to the T-SQL sketched below.
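For comparison only (a sketch; the join key and column list depend on what you map in the editor):

SELECT e.EmployeeID, e.LoginID, a.AddressID
FROM HumanResources.Employee AS e
INNER JOIN HumanResources.EmployeeAddress AS a
    ON e.EmployeeID = a.EmployeeID;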
11. Drag and drop an OLEDB Destination and set the following properties to configure it.
Select the Merge Join transformation; select the green arrow (data-flow pipeline), drag and drop it onto the OLEDB Destination, and set the following properties:
OLEDB Connection Manager: Select the connection manager.
Data access mode: Table or view - fast load (the default option).
Name of the table or view → Select a destination table if it exists; otherwise click New to create a new table.
In the Create Table editor, rename OLEDB Destination (the default table name) to [MergeJoinData].

Click OK (you can now observe that the [MergeJoinData] table has been created in the specified database).
Select the Mappings page.
Click OK.
12. Execute the package.


Variables: SSIS supports variables to store values based on data types, just as in a programming language.
Types of Variables:
System-Defined Variables
User-Defined Variables
How to use variables in SSIS:
System-Defined Variables: Can be used in any expression/task/container/data-flow component as follows:
@[System::VariableName] or @VariableName
User-Defined Variables: Can be used in any expression/task/container/data-flow component as follows:
@[User::VariableName] or @VariableName
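For example, a user variable can drive a dynamic SQL statement. A sketch (assuming a String variable uvTableName holding a table name, used as the expression on an Execute SQL Task's SqlStatementSource property):

"TRUNCATE TABLE " + @[User::uvTableName]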

Example Package on Variables and Precedence Constraints:

In the project created above, create a new SSIS package (Project menu → Select New SSIS Package) and rename it RowCountwithVariables_PrecedenceConst.dtsx.
1. In the Control Flow tab: drag and drop a Data Flow Task and rename it DFT Row Count.
2. Double-click the Data Flow Task (or right-click it and select Edit) to navigate to the Data Flow tab.
3. In the Data Flow tab, from the Data Flow Sources section (in the Toolbox; shortcut key Alt+Ctrl+X), drag and drop an OLEDB Source adapter/component and rename it OLEDBSrc1, then apply the following settings to configure it.
Note: The OLEDB Source component is used to extract data from any relational database using an OLE DB provider.
OLEDB Connection Manager: click New (to create a new connection, as this package is new).
Click New → In the Connection Manager editor:
Server Name: Localhost / ServerName / . (period)
Database: Select the AdventureWorks database from the drop-down.
Click OK twice.


Data access mode: Select the SQL Command option.

SQL Command: Provide the following query to extract/read the data from the specified table:
SELECT * FROM HumanResources.Employee WITH (NOLOCK)
and click Build Query.


4. Click OK.
5. Go to the Control Flow tab and select the package.
6. Select SSIS menu → Variables.
7. In the Variables editor, click Add Variable and set the following properties:

Variable Name    Scope      Data Type    Value
uvSrcCount       Package    Int32        0

8. In the Data Flow, use the Row Count transformation to capture the number of rows coming from the source table/system and hold the row count in the variable (uvSrcCount) of type Int32.

9. Click OK in the Row Count transformation.

10. Go to the Control Flow tab.
11. Drag and drop an Execute SQL Task, rename it Execute SQL Task2, and set the following mandatory properties to configure it:

Connection: Specify the OLEDB connection.
SQL Command: SELECT GETDATE() (or any valid SQL statement).
12. Make a connection (using the green arrow) from the Data Flow Task to Execute SQL Task2. That means that after successful execution of the Data Flow Task, control goes to Execute SQL Task2 and executes it.
13. But suppose I want to execute Execute SQL Task2 based on the output of the Data Flow Task:
14. Double-click the green arrow; it opens the Precedence Constraint editor. Set the following properties:
Evaluation operation: Expression and Constraint
Value: Success
Expression: @uvSrcCount > 0 (the variable used in the Row Count transformation)
15. Click OK.
16. Drag and drop another Execute SQL Task, rename it Execute SQL Task3, and set the following mandatory properties to configure it:
Connection: Specify the OLEDB connection.
SQL Command: SELECT GETDATE() (or any valid SQL statement).
17. Make a connection from the Data Flow Task to Execute SQL Task3. This time I want to execute Execute SQL Task3 only when the Data Flow Task fails:
18. Double-click the arrow; it opens the Precedence Constraint editor. Set the following properties:
Evaluation operation: Constraint
Value: Failure (the arrow turns red)
19. Click OK.
20. Drag and drop another Execute SQL Task, rename it Execute SQL Task4, and set the following mandatory properties to configure it:
Connection: Specify the OLEDB connection.
SQL Command: SELECT GETDATE() (or any valid SQL statement).
21. Make a connection from the Data Flow Task to Execute SQL Task4. Execute SQL Task4 should be executed irrespective of the status of the Data Flow Task:
22. Double-click the arrow; it opens the Precedence Constraint editor. Set the following properties:
Evaluation operation: Constraint
Value: Completion (the arrow turns blue)
23. Click OK.
24. Execute the package and see how the precedence-constraint concepts are implemented.


So far, we have gone through multiple transformations with various scenarios individually. Now let us use a few of these transformations (Row Count, Derived Column, Conditional Split and Union All) in a single package, by creating a different scenario.

Steps to replicate the above mentioned scenario:


In above created Project, Create a New SSIS Package (Project Menu Select New SSIS
Package) and rename it as TestCondition.dtsx
1. In Control Flow tab: Drag and drop a Data Flow Task and rename it as DFT Actual Business
Logic.
2. Double Click on Data Flow Task or Right Click on Data Flow Task and Select Edit to
navigate to Control Flow tab.
3. In Data Flow Tab, from Data Flow Source section (in Toolbox, shortcut key is Alt+Ctrl+x),
drag and drop OLEDB Source adapter/component and rename it OLEDBSrc. And set the
following setting to configure OLEDB Source.
Note: OLEDB Source component is used to extract the data from any relational database
using OLEDB provider.
For OLEDB Connection Manager, click New (to create a new connection, as this is a new package).
In the Connection Manager editor,
Server Name: Localhost/ServerName/.
Database: Select the AdventureWorks database from the drop down.
Click Ok twice.

Data access mode: Select SQL Command option


SQL Command: Provide the following query to extract/read the data from the specified table,
SELECT * FROM HumanResources.Employee WITH (NOLOCK)
and click Build Query to verify it.

4. Click Ok.
5. Go to Control Flow tab and select package.
6. Select SSIS Menu -> Variables
7. In Variables Editor, Click Add Variable and set the following properties,
   Name             Scope      Data Type    Value
   uvSrcCount       Package    Int32        0
   uvDst1Count      Package    Int32        0
   uvDst2Count      Package    Int32        0
   uvSolutionName   Package    String       SSIS_Morning730AMIST
   uvTableNames     Package    String       HumanResources.Employees, dbo.DestinationTbl1, dbo.DestinationTbl2

8. In Data Flow, drag and drop RowCount transformation and make connection between
OLEDBSrc and RowCount, and also set the following properties to configure RowCount
transformation.
Variable name User::uvSrcCount (Select a variable from drop down list to hold row
count)
9. Click Ok


10. Drag and drop a Derived Column Transformation to define/derive LastExecutedDate (to
record when the package was last executed), and define the following expression in it,
i. Expression -- @[System::StartTime] (a system-defined variable which gives the package
start date and time)
ii. Derived Column Name -- rename the column as LastExecutedDate
11. Click Ok.
12. Drag and drop Conditional Split Transformation to split the source data to multiple
destinations based on the conditions,
13. Please follow the below steps to configure the conditional Split Transformation,
Condition: MaritalStatus == "S" && Gender == "M"
Output Name: Case1 and click Ok
14. Drag and drop RowCount transformation and configure it by following the below
mentioned steps,
15. Select Case1 from Input and output Selection Wizard.
16. Set, Variable: uvDst1Count
17. Drag and drop OLEDBDestination and configure it,
Connection: Provide Destination Connection Manager
Table: If table exists, then select a table from drop down list or click New to create a
new table.

18. In the Create Table editor, rename the table as DestinationTbl1, select the Mappings page and
click Ok.
19. Drag and drop RowCount transformation and configure it by following the below
mentioned steps,
20. Select Conditional Split Default Output from Input and output Selection Wizard.
21. Set, Variable: uvDst2Count
22. Drag and drop OLEDBDestination and configure it,
Connection: Provide Destination Connection Manager
Table: If table exists, then select a table from drop down list or click New to create a
new table.
23. In Create table editor, rename the table as DestinationTbl2, select Mappings page and
click Ok.
24. Finally, the package Data flow definition looks like the below shown diagram,


As of now, the actual ETL business logic has been implemented. But I would like to add a few
more enhancements to the same package, to capture the below log details into a dbo.SSIS_Log
table,
a. Solution Name
b. Package Name
c. Table Name
d. Source Count
e. Destination Count
f. Status
g. LastExecutedDate
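
The dbo.SSIS_Log table must exist before the package runs; its definition is not shown in the
source material. A minimal sketch, assuming column names that match the derived columns
defined in step 27 below (lengths follow the DT_STR casts used there):

USE [AdventureWorks]
GO
-- Hypothetical definition; adjust types and lengths to your own standards.
CREATE TABLE [dbo].[SSIS_Log](
    [Solution Name]     VARCHAR(50)  NULL,
    [Package Name]      VARCHAR(50)  NULL,
    [Table Names]       VARCHAR(50)  NULL,
    [Source Count]      INT          NULL,
    [Destination Count] INT          NULL,
    [Status]            VARCHAR(10)  NULL,
    [LastExecutedDate]  DATETIME     NULL
)
GO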

25. In the same package's Control Flow, drag and drop a Data Flow Task and rename it DFT
Test Condition. Make a Precedence Constraint (connection) between both the Data Flow
Tasks.
26. Double click on DFT Test Condition and in Data Flow Tab, drag and drop OLEDBSrc and
provide the following settings to configure it,
i. OLEDB Connection: provide the connection manager (Sys.AdventureWorks)
ii. Data Access Mode: SQL Command
iii. SQL Command: SELECT MAX(LastExecutedDate) AS LastExecutedDate FROM DestinationTbl2
27. Drag and drop Derived Column Transformation and define the following expressions and
new columns for log table, and click Ok.


   Derived Column Name    Expression
   Solution Name          (DT_STR,50,1252)@[User::uvSolutionName]
   Package Name           (DT_STR,50,1252)@[System::PackageName]
   Table Names            (DT_STR,50,1252)@[User::uvTableNames]
   Source Count           @[User::uvSrcCount]
   Destination Count      @[User::uvDst1Count] + @[User::uvDst2Count]

28. Drag and drop Conditional Split, to check below condition, and click Ok
i. Condition: [Source Count] == [Destination Count]
ii. Output Name: Source Is Destination
iii. Default Output Name: Source Is Not Destination
29. Drag and drop a Derived Column transformation (on the Source Is Destination output), to
derive the Status field as a constant value, since it is not available in the form of a variable
or a column,
i. Expression: (DT_STR, 10, 1252)"Success"
ii. Derived Column Name: Status
30. Drag and drop another Derived Column transformation (on the Source Is Not Destination
output), again deriving the Status field as a constant,
i. Expression: (DT_STR, 10, 1252)"Failure"
ii. Derived Column Name: Status
31. Drag and drop Union All transformation, to merge the data from multiple sources into
single destination, which has similar structure. Click Ok

32. Drag and drop OLEDB Destination, and configure it with the following settings,
i. Connection: Sys.AdventureWorks
ii. Destination Table: dbo.SSIS_Log


33. Execute Package


34. Open SSMS -> connect to the Database Engine -> run the following SQL query,
SELECT * FROM dbo.SSIS_Log WITH (NOLOCK)


Excel Source: The Excel Source adapter/component is used to extract data from an Excel
workbook.
Note: Prepare an Excel 2003 version file with the following information and save it at any
shared location (C:\Files\Students Details.xls)
   Sno    Name        Class
   1      Balaji      MCA
   2      Balu        MBA
   3      Nikhila     MS
   4      Pujitha     MD
   5      Jnanesh     MD
          Balaji      Mba
          Balaji      Mbbs
   8      Lekhasree   MSc
   9      Balaji      MS
   10     Balaji      MCA

(The two rows without an Sno value are the invalid rows referred to in the scenario below.)

Steps to Configure Excel Source Component:


Scenario: We have data in an Excel workbook (Sheet1) containing both valid and invalid rows. I
would like to load only the valid data (rows where Sno is not NULL) into the actual destination
table, and capture the invalid rows into an Excel destination so that they can be sent to the
business user/end user for further analysis.
Note: This functionality can be achieved by using Conditional Split Transformation:
In the above created project, create a new SSIS package (Project Menu -> New SSIS
Package) and rename it as ExcelSource_RemoveInvalidData.dtsx
1. In Control Flow tab: Drag and drop a Data Flow Task and rename it as DFT Data
Conversion.
2. Double click on the Data Flow Task, or right click on the Data Flow Task and select Edit, to
navigate to the Data Flow tab.
3. In Data Flow Tab, from Data Flow Source section (in Toolbox, shortcut key is Alt+Ctrl+x),
drag and drop Excel Source adapter/component.
Note: The Excel Source component is used to extract data from an Excel workbook, using the
Microsoft.Jet.OLEDB.4.0 provider for Excel 2003 files (Excel 2007 files use the
Microsoft.ACE.OLEDB.12.0 provider).
4. For Excel Connection Manager, click New (to create a new connection, as this is a new package).
5. In the Excel Connection Manager editor, click Browse and navigate to the path where the
file is located (C:\Files\Students Details.xls).
6. Make sure the First row has column names check box is selected.
7. Click Ok


8. Drag and drop a Conditional Split Transformation to filter the invalid rows away from the
destination, and set the following condition,

a. Output Name: Case1 (renamed as Sno Is Null)
b. Condition: ISNULL(Sno)
c. Conditional Split Default Output: Sno Is Not Null
9. Drag and drop OLEDB Destination, and set the following properties to configure it,
Select the Conditional Split transformation which we configured above; we can see two
arrow marks, Green and Red.
Drag the green arrow (the data flow pipeline) to the OLEDB Destination, select the
Sno Is Not Null option from the Output drop down list and click Ok
10. Double click on OLEDB Destination and set the following properties,
a. OLEDB Connection Manager: Select Connection Manager
b. Data access mode: Table or View fast load (default option)
c. Name of the table or view: Select a destination table if it exists, else Click
New to create a new table
d. In Create Table editor, Rename OLEDB Destination (default table name) as
[SnoIsNotNull]

11. Click Ok (Now, you can observe that, [SnoIsNotNull] table is created at specified
database).
12. Select Mappings Page and Click Ok.

Steps to Capture Invalid Data to Excel Destination:


13. Drag and drop an Excel Destination to load the Sno Is Null rows.
14. Double click on the Excel Destination and configure it using the following properties,
a. Excel Connection Manager: select the Excel connection manager (the same one used
for the source)
b. Data access mode: Table or view (default option)
c. Name of the Excel sheet: select a destination sheet if it exists; otherwise click New to
create a new sheet.
In the Create Table editor, rename Excel Destination (the default name) as [SnoIsNull].
15. Click Ok
16. Select Mappings Page and Click Ok.
17. Execute Package.


Flat File Source:

Flat files, in formats such as CSV and fixed-width columns, are still popular. For many reasons,
individual circumstances can dictate the use of CSV files over other formats, which is why
the Flat File Source remains a popular Data Flow data source.

Note: Create a new notepad file, provide the following data and save the file as
StudentDetails.txt,
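
The file contents appear only as an image in the source material; a plausible sample, reusing
the student data from the Excel example above and repeating a couple of rows so that the
duplicate-removal scenarios below have something to remove, would be:

Sno,Name,Class
1,Balaji,MCA
2,Balu,MBA
3,Nikhila,MS
1,Balaji,MCA
4,Pujitha,MD
2,Balu,MBA
5,Jnanesh,MD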

Scenario - 1: How to remove duplicate rows or records in a flat file source using the Aggregate
Transformation.

In the above created project, create a new SSIS package (Project Menu -> New SSIS Package)
and rename it as FFile_RemoveDuplicateRows_Aggregate.dtsx
1. In Control Flow tab: Drag and drop a Data Flow Task and rename it as DFT Data
Conversion.
2. Double click on the Data Flow Task, or right click on the Data Flow Task and select Edit, to
navigate to the Data Flow tab.
3. In Data Flow Tab, from Data Flow Source section (in Toolbox, shortcut key is Alt+Ctrl+x),
drag and drop Flat File Source adapter/component.
4. Right click on Flat File Source and set the following setting to configure Flat File Source,
a. Connection Manager: Flat File Connection for Student Details
b. Description: Flat File Connection for Student Details
c. Click Browse, navigate to the path where the flat file is located and select the
existing flat file (StudentDetails.txt)
d. Check the Column names in the first data row check box,


5. Select Columns page


6. Select Preview Page, to preview the source data, and click Ok.
7. Drag and drop Aggregate Transformation and connect it from Flat File Source.
8. In the Aggregate Transformation, select all the input columns and make sure the Group by
operation is selected for each, to group the similar rows into one, as mentioned below,


9. Click Ok.
10. Drag and drop OLEDB Destination and set the following properties to configure OLEDB
Destination as mentioned below,
i. Connection Manager: Select Destination Connection Manager
ii. Name of Table or view: Select FFAggregateData table from drop down list.
iii. Select Mapping Page and Click Ok.
11. Execute Package.


Scenario - 2: How to remove duplicate rows or records in a flat file source using Sort
Transformation.
Note: Basically, the Sort transformation is used to sort downstream data. The Sort
Transformation can also remove duplicate rows from the source data. In our earlier scenario
(Aggregate Transformation), the output was not sorted; so here I want to both sort the flat
file data and remove the duplicate rows.

In the above created project, create a new SSIS package (Project Menu -> New SSIS
Package) and rename it as FFile_RemoveDuplicateRows_Sort.dtsx
1. In Control Flow tab: Drag and drop a Data Flow Task and rename it as DFT Data
Conversion.
2. Double click on the Data Flow Task, or right click on the Data Flow Task and select Edit, to
navigate to the Data Flow tab.
3. In Data Flow Tab, from Data Flow Source section (in Toolbox, shortcut key is Alt+Ctrl+x),
drag and drop Flat File Source adapter/component.
4. Right click on Flat File Source and set the following setting to configure Flat File Source,
a. Connection Manager: Flat File Connection for Student Details
b. Description: Flat File Connection for Student Details
c. Click Browse, navigate to the path where the flat file is located and select the
existing flat file (StudentDetails.txt)
d. Check the Column names in the first data row check box,


5. Select Columns page


6. Select Preview Page, to preview the source data, and click Ok.
7. Drag and drop Sort Transformation and connect it from Flat File Source.
8. In the Sort Transformation, select the available input columns to sort the data, and check
the Remove rows with duplicate sort values option (this is what removes the duplicates), as
mentioned below,

9. Click Ok.
10. Drag and drop OLEDB Destination and set the following properties to configure OLEDB
Destination as mentioned below,
i. Connection Manager: Select Destination Connection Manager


ii. Name of Table or view: Select FFSortData table from drop down list.
iii. Select Mapping Page and Click Ok.

11. Execute Package.

Bulk Insert Task:

The Bulk Insert task is used to copy large amounts of data into SQL Server tables from
text files. For example, imagine a data analyst in your organization provides a feed from a
mainframe system to you in the form of a text file and you need to import this into a SQL
Server table. The easiest way to accomplish this in an SSIS package is through the Bulk
Insert task.

Steps to Configuring Bulk Insert Task

1. Drag the bulk insert task from the toolbox into the control flow window.


Double click on the bulk insert task to open the task editor. Click on connections in left tab.

2. In the connections tab, specify the OLE DB connection manager to connect to the
destination SQL Server database and the table into which data is inserted. Also,
specify Flat File connection manager to access the source file.

3. Select,

i. Column Delimiters used in the flat file : Select Comma {,}


ii. Row Delimiters used in the flat file : Select {CR}{LF}

4. Click on the Options in the left tab of the editor, and select the Code page of the file and
the starting row number (First row). Also specify the actions to perform on the destination
table or view when the task inserts the data.

5. The options are to check constraints, enable identity inserts, keep nulls, fire triggers,
or lock the table.


6. On running the package, the data will be copied from the source to the
destination. Bulk Insert doesn't have an option to truncate and load; hence you must
use an Execute SQL Task to delete the data already present in the table before
loading the flat file data.
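
For example, an Execute SQL Task placed before the Bulk Insert task could run a statement
like the following (the table name is illustrative):

TRUNCATE TABLE dbo.StudentDetails;
-- or, if only some rows should be removed:
-- DELETE FROM dbo.StudentDetails WHERE Class = 'MCA';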

It is an easy task to use and configure, but it comes with a few cons.

1. It only allows appending data into the table; you cannot perform a truncate and load.

2. Only a flat file can be used as the source, not any other type of database.

3. Only SQL Server databases can be used as the destination. It doesn't support any other
files/RDBMS systems.

4. A failure in the Bulk Insert task does not automatically roll back successfully loaded
batches.

Note: Only members of the SYSADMIN fixed server role can run a package that contains a
Bulk Insert task.


Adding Data Viewers to the Data Path:

When troubleshooting data flow, it can be useful to view the actual data as it
passes through a data path. You can do this by adding one or more data viewers to your
data flow. SSIS supports several types of data viewers. The one most commonly used is the
grid data viewer, which displays the data in tabular format.

However, you can also create data viewers that display,

1. Grid
2. Histograms
3. Scatter plot charts
4. Column charts

These types of data viewers tend to be useful for more analytical types of data review, but
for basic troubleshooting, the grid data viewer is often the best place to start.

To create a grid data viewer, open the editor for the data path on which you want to view the
data, then go to the Data Viewers page, as shown below.

The Data Flow Path editor is where you add your data viewers, regardless of the type. To add
a data viewer, click the Add button to launch the Configure Data Viewer dialog box. Here you
select the type of viewer you want to create and provide a name for that viewer.


After you select the Grid option from the Type list and provide a name, go to the Grid tab.
This is where you determine what columns you want to include in the grid. At this point,
we're interested in only the BusinessEntityID and FullName columns, because those are the
columns in our target table.


After you specify the columns to include in the grid, click OK. You'll be returned to the Data
Flow Path Editor. The new grid data viewer should now be displayed in the Data Viewers
list. In addition, a small icon is added next to the data path.

When you debug a package in which a data viewer has been defined, the package will stop
running at the viewers data path and a window will appear and display the data in that part
of the data flow.


Notice that the data viewer displays the BusinessEntityID and FullName values for each row.
You can scroll down the list, detach the viewer from the data flow, resume the data flow, or
copy the data to the clipboard. The data itself and the ultimate outcome of the package are
unaffected.


Lookup Transformation:
The Lookup transformation performs lookups by joining data in input columns with columns
in a reference dataset. We use the lookup to access additional information in a related table
that is based on values in common join columns.
Lookup transformation dataset can be a cache file, an existing table or view, a new table, or
the result of an SQL query.
Note: In this scenario, the Lookup transformation is used to keep the source and
target/destination tables synchronized. That means, if a row/record exists in both the source
and destination tables and matches the join condition, the row will be updated; all
unmatched rows will be inserted into the destination system.

Steps to configure Lookup transformation:


Open SQL Server Management Studio (SSMS) and run/execute the below SQL query to
create a destination table before start configuring Lookup transformation,

USE [AdventureWorks]
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[Lookup_Destination](


[ProductCategoryID] [int] NOT NULL,
[Name] [nvarchar](50) NULL,
[rowguid] [uniqueidentifier] NULL,
[ModifiedDate] [datetime] NULL
) ON [PRIMARY]

GO
In the above created project, create a new SSIS package (Project Menu -> New SSIS
Package) and rename it as Lookup_To_Make_both_Source_and_Destination_Sync.dtsx
1. In Control Flow tab: Drag and drop a Data Flow Task and rename it as DFT Data
Conversion.
2. Double click on the Data Flow Task, or right click on the Data Flow Task and select Edit, to
navigate to the Data Flow tab.
3. In Data Flow Tab, from Data Flow Source section (in Toolbox, shortcut key is Alt+Ctrl+x),
drag and drop OLEDB Source adapter/component and rename it OLEDBSrc1. And set the
following setting to configure OLEDB Source.
Note: OLEDB Source component is used to extract the data from any relational database
using OLEDB provider.
For OLEDB Connection Manager, click New (to create a new connection, as this is a new package).
In the Connection Manager editor,
Server Name: Localhost/ServerName/.


Database: Select Adventure Works database from drop down.


Click Ok twice.

Data access mode: Select SQL Command option


SQL Command: Provide the following query to extract/read the data from the specified table,
SELECT * FROM Production.ProductCategory WITH (NOLOCK)
then click Build Query and click Ok.
4. Drag and drop Lookup transformation and make pipeline between OLEDBSrc and Lookup
transformation.
5. In Lookup, Select Connection Page and specify connection.
6. Check the Use results of an SQL query radio button and provide the following reference
SQL statement to join with the source table, then click Ok to save the changes.
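
The reference query itself appears only as a screenshot in the source material; a plausible
query, matching the destination table created at the start of this section, would be:

SELECT ProductCategoryID, Name
FROM dbo.Lookup_Destination WITH (NOLOCK)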


7. Drag and drop OLEDB Command Transformation and make a pipeline connection between
Lookup and OLEDB Command transformation to update matched rows to destination.
Configure the OLEDB Command Transformation by providing the following steps,
8. In, Input and Output Selection wizard, select Output: Lookup Match Output and
click Ok
9. Now, let us see how to configure OLEDB Command Transformation,
In Connection Managers Page: Select Destination Connection Manager
In Component Properties Page,
SQL Command: UPDATE dbo.Lookup_Destination SET Name = ? WHERE ProductCategoryID = ?
(Here, each ? is a parameter placeholder in SSIS)


In the Column Mappings page: make the mapping between the input and destination columns as
mentioned below,


10. Click Ok.


11. Drag and drop OLEDB Destination to insert unmatched rows to destination table.
12. Select the Lookup transformation and drag its RED arrow mark (which is used to
handle error outputs) to the OLEDB Destination. The Configure Error Output editor
opens automatically; set the following property,
Error : Select Redirect row (to redirect the error-causing, i.e. unmatched, rows to the
other destination) and click Ok

13. Configure OLEDB Destination as mentioned below,


Connection Manager: Select Destination Connection Manager
Name of Table or view: Select Lookup_Destination table from drop down list.
Select Mapping Page and Click Ok.
Note: The Lookup transformation uses an equi-join; each row of the input dataset must match
at least one row in the referenced dataset. Rows are considered matching if the values in the
joined columns are equal. By default, if an input row cannot be joined to a referenced row,
the Lookup transformation treats the row as an error.

However, you can override the default behavior by configuring the transformation to
instead redirect any rows without a match to a specific output. If an input row matches
multiple rows in the referenced dataset, the transformation uses only the first row. The
way in which the other rows are treated depends on how the transformation is
configured.

The Lookup transformation lets you access a referenced dataset either through an OLE
DB connection manager or through a Cache connection manager.
The Cache connection manager accesses the dataset held in an in-memory cache store
throughout the duration of the package execution. You can also persist the cache to a
cache file (.caw) so it can be available to multiple packages or be deployed to several
computers.

Lookup Output

The Lookup transformation has the following outputs:

Match output- It handles the rows in the transformation input that matches at least
one entry in the reference dataset.
No Match output- It handles rows in the input that do not match any entry in the
reference dataset.
As mentioned earlier, if Lookup transformation is configured to treat the rows without
matching entries as errors, the rows are redirected to the error output else they are
redirected to the no match output.
Error output- It handles the error records.

ALTERNATE SOLUTION FOR LOOKUP TRANSFORMATION:


The alternate solution for the Lookup transformation is the MERGE statement, which was newly
introduced in SQL Server 2008,
MERGE Lookup_Destination d                   --Destination table
USING Production.ProductCategory s           --Source table
   ON s.ProductCategoryID = d.ProductCategoryID
WHEN MATCHED AND s.Name <> d.Name THEN
    UPDATE SET d.Name = s.Name
WHEN NOT MATCHED THEN
    INSERT (ProductCategoryID, Name, RowGuid, ModifiedDate)
    VALUES (s.ProductCategoryID, s.Name, s.RowGuid, s.ModifiedDate)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE; /* Delete rows which do not exist in the source but exist in
               the destination (orphan rows) */

Slowly Changing Dimension Transformation:


SCD is Slowly Changing Dimension. As the name suggests, a dimension which changes
slowly. For Example, say there is a table Employee, which stores information regarding
employee as below:

BusinessEntityID, NationalIDNumber, First_Name, last_Name LoginID, OrganizationNode


OrganizationLevel, JobTitle, BirthDate, MaritalStatus, Gender, HireDate, SalariedFlag
CurrentFlag, ModifiedDate

In this Employee table, the data for an employee doesn't change very often, but we can't say
that changes won't happen. The changes which may happen are:
The spelling of First_Name was stored incorrectly by mistake.
The employee gets married and the marital status changes.
Last_Name changes.
The employee gets a promotion, so the job designation and organization level change.
The columns which don't change (assuming no data-entry mistakes) are HireDate, Gender
and NationalIDNumber.
The changes discussed don't happen frequently, but may happen after a certain time.

SCD supports four types of changes:

1. Changing attribute
2. Historical attribute
3. Fixed attribute
4. Inferred member

Type 1 (changing attribute): When a change in any attribute or column overwrites the
existing record.

For example, as discussed, the first name of the employee is misspelled and the wrong spelling
is stored. To correct the first name, we don't need to add one more record for the same
employee, so we can simply overwrite the first name. An SCD which makes this kind of change
falls into the Type 1 category. The SCD transformation directs these rows to an output named
Changing Attributes Updates Output.

Emp ID    First Name    Last Name
1         Rajan         Gupta
1         Ranjan        Gupta

Type 2 (historical attribute): When we need to maintain the history of records whenever some
particular column value changes.

For example, the employee gets a promotion: the designation changes and the organization level
changes. In such a case we need to maintain the history of the employee: with which
designation he joined, and when his designation and organizational level changed.

For these kinds of changes, there will be multiple records for the same employee with
different designations. To identify the current record, we can either add a column such as
Status, which will be Current for the latest record, or we can add two columns as start date
and end date (expiry date), through which we can maintain the history of the employee's
records. This SCD directs these rows to two outputs: Historical Attribute Inserts Output and
New Output.

EmpID    FirstName    Designation           StartDate     EndDate       Status
1        Ranjan       Graduate Engineer     20-01-2010    25-01-2011    Expired
1        Ranjan       Analyst Programmer    25-01-2011    25-01-2012    Expired
1        Ranjan       Business Analyst      25-01-2012    01-01-2099    Current

Fixed attribute: When the attribute must not change.

For example, HireDate, Gender and NationalIDNumber should never change. So whenever a
change occurs in these column values, either the package should throw an error or the changes
can be saved to some other destination; but the changes should not be applied to the columns.

The SCD transformation detects such changes and directs the rows with changes to an output
named Fixed Attribute Output.

SCD Transformation with an example:


Note: As part of an example, we are going to create a new source and destination table;
hence we need to update/modify the existing sample database (AdventureWorks).
Please use the below mentioned SQL Script to Create Source and destination table from
existing HumanResources.Employee table from AdventureWorks sample database.
Open SQL Server Management Studio (SSMS) and run/execute the below SQL query to
create the source and destination tables before configuring the SCD transformation,

USE [AdventureWorks]
GO

SELECT * INTO [HumanResources].[EmployeeSource]--Create a New table


FROM [HumanResources].[Employee]--From Existing table
GO
SELECT * INTO [HumanResources].[EmployeeDestination]--Create a New table
FROM [HumanResources].[Employee]--From Existing table
WHERE 1=2 --Use this condition to insert 0 rows to Destination table
Note: In destination table, rename a column from NationalIDNumber to Status
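
One way to perform that rename is with the built-in sp_rename procedure (a sketch; run it
against the destination table created above):

EXEC sp_rename 'HumanResources.EmployeeDestination.NationalIDNumber', 'Status', 'COLUMN';
GO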

Steps to Configure SCD Transformation with above example tables:


In the above created project, create a new SSIS package (Project Menu -> New SSIS
Package) and rename it as SCD_ToMaintain_History_And_Current_Data.dtsx
1. In Control Flow tab: Drag and drop a Data Flow Task and rename it as DFT Data
Conversion.
2. Double click on the Data Flow Task, or right click on the Data Flow Task and select Edit, to
navigate to the Data Flow tab.
3. In Data Flow Tab, from Data Flow Source section (in Toolbox, shortcut key is Alt+Ctrl+x),
drag and drop OLEDB Source adapter/component and rename it OLEDBSrc SCD. And set
the following setting to configure OLEDB Source.
Note: OLEDB Source component is used to extract the data from any relational database
using OLEDB provider.
For OLEDB Connection Manager, click New (to create a new connection, as this is a new package).
In the Connection Manager editor,
Server Name: Localhost/ServerName/.
Database: Select the AdventureWorks database from the drop down.
Click Ok twice.


Data access mode: Select SQL Command option


SQL Command: Provide the following query to extract/read the data from the specified table,
SELECT * FROM HumanResources.EmployeeSource WITH (NOLOCK)
then click Build Query and click Ok.
4. Drag and drop SCD transformation and make green pipeline between OLEDBSrc and SCD
transformation.
5. In SCD Transformation Editor, Click Next in first Page,
6. Select a Dimension table (Select HumanResources.EmployeeDestination table from drop
down list) and set the following business key property,
In Key Type, set LoginID as the Business key


7. Click Next
8. Set the following properties to manage the changes to column data in your slowly
changing dimension, by setting the change type for the dimension columns,

   Dimension Column     Change Type
   BirthDate            Fixed Attribute
   EmployeeID           Fixed Attribute
   Gender               Fixed Attribute
   MaritalStatus        Changing Attribute
   SalariedFlag         Changing Attribute
   SickLeaveHours       Historical Attribute
   Title                Changing Attribute
   VacationHours        Historical Attribute


9. Click Next twice


10. In the Historical Attribute Options editor, select/check Use a single column to show
current and expired records,
Column to indicate current records: select Status
Value when current: select Current
Expiration value: select Expired, and click Next


11. Click Next


12. Click Finish.
13. We can observe that SSIS creates a new and big package for us, as shown below,


Steps to Execute a Package in Multiple Environments:

Package(s) can be executed in the following multiple ways in various environments,
1. SSDT (SQL Server Data Tools) -> In Solution Explorer -> select any package -> right
click -> Execute Package.
2. DTEXEC.EXE
3. DTEXECUI.EXE
4. SSMS (SQL Server Management Studio) -> Connect to Integration Services -> expand
Stored Packages -> expand MSDB -> select any package -> right click -> Run
Package -> this opens the Execute Package Utility -> click Execute.
5. SSMS (SQL Server Management Studio) -> Connect to Database Engine -> expand SQL
Server Agent (service) -> expand Jobs -> select any job -> select Start Job at Step.

In SQL Server 2005 and higher versions there are different ways in which one can execute
an SSIS package. Let us go through each option one by one.

Execute SSIS Package Using SQL Server Data Tool (SSDT)

During the development phase of the project developers can test the SSIS package
execution by running the package from SQL Server Data Tool (SSDT).

1. In Solution Explorer, right click the SSIS project folder that contains the package which
you want to run and then click properties as shown in the snippet below.


2. In the SSIS Property Pages dialog box, select Build option under the Configuration
Properties node and in the right side panel, provide the folder location where you want the
SSIS package to be deployed within the OutputPath. Click OK to save the changes in the
property page.

3. In Solution Explorer, right click the SSIS Package and then click Set as Startup Object
option as shown in the snippet below.


4. Finally to execute the SSIS package, right click the package within Solution Explorer and
select Execute Package option from the drop down menu as shown in the snippet below.


Execute SSIS Package using DTEXEC.EXE Command Line Utility

Using the DTEXEC.EXE command line utility one can execute an SSIS package that is stored
in a File System, SQL Server or an SSIS Package Store. The syntax to execute a SSIS package
which is stored in a File System is shown below.

DTEXEC.EXE /F "C:\BulkInsert\BulkInsertTask.dtsx"
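
For packages stored in SQL Server (msdb) or the SSIS package store, the /SQL and /DTS
switches are used instead of /F; the package names below are illustrative:

DTEXEC.EXE /SQL "BulkInsertTask" /SERVER "localhost"

DTEXEC.EXE /DTS "\File System\BulkInsertTask"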

Execute SSIS Package using DTEXECUI.EXE Utility

Using the Execute Package Utility (DTEXECUI.EXE) graphical interface one can execute an
SSIS package that is stored in a File System, SQL Server or an SSIS Package Store.

1. In command line, type DTEXECUI.EXE which will open up Execute Package Utility as
shown in the snippet below. Within the Execute Package Utility, click on the General tab and
then choose the Package source as "File System"; next, provide the path of the SSIS package
under the Package option, and finally click the Execute button to execute the SSIS package.

The Execute Package Utility is also used when you execute the SSIS package from the
Integration Services node in SQL Server Management Studio.


Execute SSIS Package using SQL Server Agent Job

Using a SQL Server Agent Job one can execute an SSIS package that is stored in a File
System, SQL Server or an SSIS Package Store. This can be done by creating a new SQL
Server Agent Job and then by adding a new step with details as mentioned in the snippet
below.

1. In New Job Step dialog box provide an appropriate Step name, then choose "SQL Server
Integration Services Package" option as Type from the drop down list, and then choose
"SQL Server Agent Service Account" as Run as value.

2. In the General tab choose the File System as Package Source and provide the location of
the SSIS package under Package option.


3. Click OK to save the job step and click OK once again to save the SQL Server Agent Job

4. That's it now you can execute the SQL Server Agent Job which will internally execute the
SSIS package.


How to deploy/publish SSIS Packages to Server/File System (with in server):

Once we are done with designing/development and unit testing of SSIS package and when
we are ready to deploy our packages we have the following options available:

Deploy to the file system

Deploy to the package store

Deploy to SQL Server

The simplest approach to deployment is probably to deploy to the file system. As SSIS
package is actually just an XML file and it can simply be copied from its project location to a
folder on the deployment target. You can use the DOS COPY command, Windows Explorer,
etc. to perform the copy operation. The package store is a particular folder on the file
system; the default for SQL Server 2005 is C:\Program Files\Microsoft SQL
Server\90\DTS\Packages.

Note: SSIS packages deployed to SQL Server are stored in the msdb database.

There are three ways to deploy our packages:

Create a deployment utility from our project

Use SQL Server Management Studio (SSMS) (Using Import Package option)

Use the DTUTIL command line tool

SQL Server Integration Services (SSIS) Deployment Utility:

The deployment utility can be used to create an SSIS package installer. The deployment
utility is a built-in feature in an Integration Services project. Let us discuss how to enable
the deployment utility and create a deployment.

a. Open SQL Server Data Tool (SSDT) from the

Microsoft SQL Server program group.

b. Click File -> Open Project / Solution on the top level menu to display the Open Project
dialog. Navigate to the location of the solution as shown below, then click Open:


Navigate to the Tutorial-Sample-1 project in Solution Explorer as shown below:

Right click the project and select Properties from the popup menu. Click Deployment Utility
in the Configuration Properties list and fill in the dialog as follows:

Set the CreateDeploymentUtility property to True; the default is False. The
DeploymentOutputPath specifies the location where the deployment files will be written;
the default is shown above and is relative to the project folder. Click OK to save the
settings.

Right click on the Tutorial-Sample-1 project in the Solution Explorer and select Build from the
popup menu. This will build the project and invoke the deployment utility. If all of the SSIS
packages are in a valid state, you will see the message Build Succeeded in the bottom left of
the window. Navigate to the bin\Deployment folder underneath the project folder to view
the deployment files. You will see the following files:

The above files represent the deployment. You can copy them to the deployment target
then double click on the Tutorial-Sample-1.SSISDeploymentManifest file to perform the
deployment.

Deploying SSIS Packages with SSMS

SQL Server Management Studio (SSMS) can be used to deploy SSIS packages to SQL Server
or to the Package Store.

To begin launch SSMS and connect to Integration Services. Note that the SQL Server
Integration Services service must be running in order to do this. You will see the following in
the Object Explorer:

As you can see there are two nodes under Stored Packages: File System and MSDB.

a. File System is actually the package store with a default location in SQL Server 2005 of
C:\Program Files\Microsoft SQL Server\90\DTS\Packages.

b. MSDB is of course the MSDB database.

In the examples that follow we will deploy the CreateSalesForecastInput.dtsx package from
its location in the project folder to the package store and the MSDB database.

To deploy to the package store, right click on the File System node and select Import
package from the popup menu. Fill in the Import Package dialog as shown below:


Click OK to import the package.

To deploy to the MSDB database, right click on the MSDB node and select Import
package from the popup menu. Fill in the Import Package dialog as shown below:


Command Line Deployment Tool for SSIS Packages:

SQL Server includes the command line tool DTUTIL.EXE which can be used to deploy SSIS
packages. DTUTIL is a good choice when you want to script out the deployment of SSIS
packages. DTUTIL can be executed from a Command Prompt or from a batch (.BAT) file.

To begin open a Command Prompt and navigate to the Tutorial-Sample-1 project folder as
shown below:

In the examples that follow, I will show how to deploy the CreateSalesForecastInput.dtsx
package to the file system, package store, and SQL Server.

To deploy to the file system, you could use the DOS COPY command, Windows Explorer, etc.
or the following DTUTIL command (all on one line):

DTUTIL /FILE CreateSalesForecastInput.dtsx /COPY FILE;C:\temp\CreateSalesForecastInput.dtsx

Replace the path C:\temp as appropriate.

To deploy to the package store, type the following command (all on one line):

DTUTIL /FILE CreateSalesForecastInput.dtsx /COPY DTS;CreateSalesForecastInput

To deploy to SQL Server, type the following command (all on one line):

DTUTIL /FILE CreateSalesForecastInput.dtsx /COPY SQL;CreateSalesForecastInput
The above command deploys to the default SQL Server instance on the local machine. To
deploy to a different SQL Server add the command line parameter /DESTSERVER
"SERVERNAME\INSTANCENAME".


Containers:

Containers provide structure in packages and services to tasks in the control flow.
Integration Services includes the following container types, for grouping tasks and
implementing repeating control flows.

In SSIS, we have three types of containers, plus a default one:

1. For Loop Container
2. For Each Loop Container
3. Sequence Container
4. Task Host Container (the default container, which encapsulates a single task)

For Loop Container: Its a basic container that provides looping functionality. A For
loop contains a counter that usually increments (though it sometimes decrements), at which
point a comparison is made with a constant value. If the condition evaluates to True, then
the loop execution continues.

The Foreach Loop container: It enumerates a collection and repeats its control
flow for each member of the collection. The Foreach Loop Container is for situations where
you have a collection of items and wish to use each item within it as some kind of input into
the downstream flow.

Sequence Container: One special kind of container that, both conceptually and physically,
can hold any other type of container or Control Flow component. It is also called a
"container of containers", or a super container.

For Loop Container:

The For Loop is one of two loop containers available in SSIS. In my opinion it is easier to set
up and use than the For Each Loop, but it is just as useful. The basic function of the For Loop
is to loop over whatever tasks you put inside the container a predetermined number of
times, or until a condition is met. The For Loop Container, as is true of all the containers in
SSIS, supports transactions by setting the TransactionOption property in the properties pane
of the container to "Required", or "Supported" if a parent container, or the package itself, is
set to "Required".

There are three expressions that control the number of times the loop executes in the For
Loop container.

The InitExpression is the first expression to be evaluated on the For Loop and is only
evaluated once at the beginning. This expression is optional in the For Loop Container. It is
evaluated before any work is done inside the loop. Typically you use it to set the initial value
for the variable that will be used in the other expressions in the For Loop Container. You can
also use it to initialize a variable that might be used in the workflow of the loop.

The EvalExpression is the second expression evaluated when the loop first starts. This
expression is not optional. It is also evaluated before any work is performed inside the
container, and then evaluated at the beginning of each loop. This is the expression that
determines if the loop continues or terminates. If the expression entered evaluates to TRUE,
the loop executes again. If it evaluates to FALSE, the loop ends. Make sure to pay particular
attention to this expression. I will admit that I have accidentally written an expression in the
EvalExpression that evaluates to False right away and terminated the loop before any work
was done, and it took me longer than it probably should have to figure out that the
EvalExpression was the reason why it was wrong.

The AssignExpression is the last expression used in the For Loop. It is used to
change the value of the variable used in the EvalExpression. This expression is evaluated for
each pass through the loop as well, but at the end of the workflow. This expression is
optional.

Example Scenario: We will create a table (FEContainer_UsingESQLTask) with the fields SNo,
Name and Class. Then, with the help of the For Loop Container, we will increment the value of
SNo from 1 to 10 (an iterative process) and insert a row for each value into the table.

Solution:

1. Open SQL Server Management Studio (SSMS) and Connection to Database Engine.

2. Run the below mentioned SQL query to create a new table.

USE [AdventureWorks]
GO

CREATE TABLE [dbo].[FEContainer_UsingESQLTask](


[SNo] [int] NOT NULL,
[Name] [varchar](150) NULL,
[Class] [varchar](50) NULL
) ON [PRIMARY]
GO

3. In the above created project, create a new SSIS package (Project Menu -> New SSIS
Package) and rename it as FLoopContainer_With_ExecuteSQLTask.dtsx

4. In Control Flow tab: Create a variable (uvCounter) of type Int32, as mentioned below

5. Drag and drop For Loop Container and set the following properties,
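
The property values are shown only as a screenshot in the source material; given the scenario
(counting SNo from 1 to 10 using the uvCounter variable), the three loop expressions would
plausibly be:

   InitExpression   : @uvCounter = 1
   EvalExpression   : @uvCounter <= 10
   AssignExpression : @uvCounter = @uvCounter + 1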

6. In Control Flow tab: drag and drop an Execute SQL Task inside the For Loop Container, and
configure it by providing the following settings,


Connection : Provide the connection manager

SQL Statement :

INSERT INTO [AdventureWorks].[dbo].[FEContainer_UsingESQLTask]
    ([SNo], [Name], [Class])
VALUES
    (?                                           -- Parameter 1
    ,'Kelly Technology_' + CAST(? AS VARCHAR)    -- Parameter 2
    ,'MSBI')
7. Select the Parameter Mapping page, click Add twice to add the two parameters (mapping
the User::uvCounter variable to both), and click Ok.

Here, Parameter Name 0 points/maps to the first ? in the INSERT statement, and
Parameter Name 1 points/maps to the second ?.

8. Execute Package.
9. For results, Connect to SSMS and run the following query,
SELECT * FROM FEContainer_UsingESQLTask WITH (NOLOCK)

For Each Loop Container:

1. Scenario: Execute multiple files/SSIS packages (.dtsx) which are located in a file system.


1. In the above created project, create a new SSIS package (Project Menu -> New SSIS
Package) and rename it as FELC_To_Loopthrough_Multiple.dtsx

2. Copy and paste a few SSIS packages into a local/shared location (H:\SSIS Packages\Packages)

3. In Control Flow tab; Drag and drop Execute Package Task, and configure it by
using the following setting,

a. Location : File System

b. Connection : New Connection -> specify the path of the file/package to be executed,

4. Click Ok.

5. Now, Execute Package.

6. We can observe that only the one specified package is executed. But if you want to
execute all the packages/files at the specified location, we need to use the For Each
Loop Container. Now, let us see how to configure the For Each Loop Container.

7. In Control Flow tab; drag and drop For Each Loop Container and set the following
properties,

In General Page:

Name : FELC Loop Through Multiple Packages

Description : FELC Loop Through Multiple Packages

In Collection Page:

Enumerator : Select Foreach File Enumerator

Folder : Specify the location where the files/packages to be executed are located
(H:\SSIS Packages\Packages)


Files : *.dtsx

In Variable Mappings Page:

Variable: New Variable -> Name: uvPackagesToRun
Index: 0 (by default)

8. In Control Flow tab; drag and drop, already configured Execute Package Task to
For Each Loop Container to loop through Execute Package task for every package
at the specified location.

9. In the Connection Managers section, select the package connection and press F4 for
properties; set the following to create the package connection dynamically (at run time),

Expressions -> click the ellipsis (...) button
Property -> select ConnectionString
Expression -> click the ellipsis (...) button

In Expression Builder, drag and drop @[User::uvPackagesToRun] and click Ok twice.

10. Save the Package


11. Execute the Package.

12. The output of the package and its explanation are covered in the classroom sessions.

2. Scenario: Extract data from multiple excel files and load the data in to single
destination table. That means, loop through multiple excel workbooks (*.xls) which are
located in a file system.

1. In the above created project, create a new SSIS package (Project Menu -> New SSIS
Package) and rename it as FELoopContainer_To Loop through Multiple Excel Files.dtsx
2. Create multiple Excel files (with proper data) and save them in a location
(H:\SSIS Packages\Files)
3. In Control Flow tab; Drag and drop Data Flow Task
4. In Data Flow tab; drag and drop an Excel Source component and configure it as mentioned
below,

5. Use an OLEDB Destination component and configure it as mentioned below,

OLEDB Connection Manager -> provide the server name and DB name (destination
connection manager)
Name of the table or the view -> click New to create a new destination table, and
rename the table as DataFromExcelFiles or any other suitable name.


6. Select Mappings Page and Click Ok.


7. In Control Flow tab; create the new variables mentioned below,

   Variable Name    Scope      Data Type    Value
   uvExcelFiles     Package    String       H:\SSIS Packages\Files\Students On2004.xls
   uvFullPath       Package    String
   uvSrcPath        Package    String       H:\SSIS Packages\Archive

8. Now, let's see how to configure the For Each Loop Container to loop through multiple
Excel workbooks,

9. Drag and drop For Each Loop Container, and set the following properties,

a. In Collection Page

i. Enumerator -- Foreach File Enumerator

ii. Folder -- H:\SSIS Packages\Files

iii. Files -- *.xls

b. In Variable Mappings Page

i. Variable Select User::uvExcelFiles

ii. Index 0 (By default)

10. Click Ok.


11. In the Connection Managers section, select the Excel Connection Manager and press F4
for properties, to create a dynamic connection manager.

12. In the Excel Connection Manager properties editor,

i. Expressions -> click the ellipsis (...) button
ii. Property -> select ConnectionString
iii. Expression -> click the ellipsis (...) button

13. In Expression Builder, build the following expression to create the Excel connection string,

"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + @[User::uvExcelFiles] + ";Extended Properties=\"Excel 8.0;HDR=YES\";"


14. Click Ok Twice.


15. Save the Package

Once each Excel file has been processed, we need to archive the processed files into a newly
created archive directory, which is created when we execute the package.
16. In the Variables editor, select the uvFullPath variable and press F4 for the properties editor.

17. In the properties editor,

i. EvaluateAsExpression -> set to True
ii. Expression -> click the ellipsis (...) button

18. In Expression Builder, build the following expression to set the value of the variable at
runtime,


@[User::uvSrcPath] + "\\" + (DT_WSTR, 10)(DT_DBDATE)@[System::StartTime]

19. In control flow; drag and drop File System Task to perform Create Directory operation,
and set the following properties as mentioned in the below screen shot,

i. Operation : Create directory


ii. IsSourcePathVariable: True
iii. Source Variable: Select User::uvFullPath
iv. UseDirectoryIfExists: True
v. Name: FST Create new Directory
vi. Description: FST Create new Directory
20. Click Ok and make a connection from the File System Task to the For Each Loop Container
21. Drag and drop File System Task inside For Each Loop Container and after Data Flow Task
and configure it by providing the following setting,

i. Operation: Copy File


ii. IsSourcePathVariable: True
iii. Source Variable: Select User::uvExcelFiles
iv. IsDestinationPathVariable: True


v. DestinationVariable: User::uvFullPath
vi. Overwrite Destination: True
vii. Name: FST Copy Executed File to Archive
viii. Description: FST Copy Executed File to Archive

22. Save Package and execute Package.


3. Scenario: Extract data from multiple sheets in a single Excel workbook and load the
data into a destination table. That means, loop through the multiple sheets of an Excel
workbook (*.xls) which is located in a file system.

1. In the above created project, create a new SSIS package (Project Menu -> New SSIS
Package) and rename it as FELoopContainer_To Loop through Multiple Sheets In Excel File.dtsx
2. Create an Excel file with multiple data sheets (with proper data) and save it in a location
(H:\SSIS Packages\Files)
3. In Control Flow tab; drag and drop a Data Flow Task
4. In Data Flow tab; drag and drop an Excel Source component and configure it as mentioned
below,

5. Use an OLEDB Destination component and configure it as mentioned below,

OLEDB Connection Manager -> provide the server name and DB name (destination
connection manager)
Name of the table or the view -> click New to create a new destination table, and
rename the table with a suitable name.


6. Select Mappings Page and Click Ok.

7. In Control Flow tab; create the below variable,

   Variable Name    Scope      Data Type    Value
   uvExcelSheets    Package    String       Sheet1$

8. In Control Flow tab; drag and drop a For Each Loop Container, and set the following
properties,
a. In General Page:
i. Name : FELC Loop Through Multiple Excel Sheets
ii. Description : FELC Loop Through Multiple Excel Sheets

b. In Collection Page:
i. Enumerator : Foreach ADO.NET Schema Rowset Enumerator
ii. Connection : Select New Connection -> click New -> Provider : select
.Net Providers for OLEDB\Microsoft Jet 4.0 OLE DB Provider
iii. Database File Name : click Browse -> File Name (select All Files (*.*))
and select the Excel file.


c. In the Connection Manager editor, click All and set the following property,

   Extended Properties: Excel 8.0

   and click Ok twice.


9. Schema : Select Tables


10. In Variable Mapping Page, map the User::uvExcelSheets variable as mentioned in the
screen shot, and click Ok.

11. In Data Flow tab; open the Excel Source and set the following properties,
Data access mode: Select Table name or view name variable
Variable Name : Select User::uvExcelSheets

12. Finally, we can find 3 connection managers, as mentioned in the below screen shot.


13. Save the package and execute it.

Checkpoints to restart package from point of failure:

In SSIS, checkpoints are used to restart package execution from the point of failure, rather
than re-running the tasks that already succeeded in the earlier run.

Steps to Configure Checkpoints,


To implement checkpoints in your package, you must configure several properties at the
package level:

a. Create a new SSIS Package make sure we use multiple tasks or containers as
mentioned below,

b. CheckpointFileName: Specifies the full path and filename of your checkpoint file.
c. CheckpointUsage: Specifies when to use checkpoints. The property supports the
following three options:

i. Never: A checkpoint file is not used.


ii. IfExists: A checkpoint file is used if one exists. This option is the one most
commonly used when enabling checkpoints in a package.
iii. Always: A checkpoint file must always be used. If a file doesn't exist, the
package fails.
d. SaveCheckpoints: Specifies whether the package saves checkpoints. Set to True to
enable checkpoints on the package.
e. In addition, we need to set two more properties on each container or task in
the package,
f. Select Execute SQL Task1 (the first task in our example), press F4 for properties and
set the following two properties,
i. FailPackageOnFailure : True
ii. FailParentOnFailure : True
g. Execute Package.
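
When you execute the package with DTEXEC, checkpoint behavior can also be turned on
from the command line; a minimal sketch, with illustrative package and checkpoint file paths:

dtexec /F "C:\SSIS\CheckpointDemo.dtsx" /CheckPointing on /CheckFile "C:\SSIS\CheckpointDemo.chk"

Here /CheckPointing on is roughly equivalent to SaveCheckpoints = True with
CheckpointUsage = Always, and /CheckFile overrides the CheckpointFileName property set
in the designer.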

Logging in Packages:
In the above package, let's see how to configure logging.

We can maintain a log of package and system information; various events of the package
and of each container or task can also be logged.

We can enable logging in two ways, as shown in the below two images,

Right-click on the Control Flow tab and select Logging,

We can capture logged information in multiple destinations, as mentioned below,

a. SSIS Log Provider for Text Files


b. SSIS Log Provider for Windows Event Log
c. SSIS Log Provider for XML File
d. SSIS Log Provider for SQL Server
e. SSIS Log Provider for SQL Server Profiler

a. SSIS Log Provider for Text Files:

i. Select Provider type: SSIS log provider for Text files -> click Add.
ii. Under Configuration, give the path of an existing flat file or create a new
flat file.
iii. For logging the information into the flat file, select the options as shown in
the below image.

iv. Click the Details tab and set the following properties,

v. Click Advanced to see the log information.


vi. Execute the package and open the flat-file to see the logged information.

b. SSIS log provider for XML files:

i. Select Provider type: SSIS log provider for XML files -> click Add.
ii. Under Configuration, give the path of an existing XML file or create a new
XML file.
iii. For logging the information into the XML file, select the options as shown in
the below image.

iv. Execute the package and open the XML file to see the logged information.

c. SSIS log provider for Windows Event Log:

i. Select Provider type: SSIS log provider for Windows Event Log and
ii. Click Add

iii. Execute the package


iv. After the package is executed, go to Control Panel -> Administrative Tools -
> Event Viewer -> Windows Logs -> click on Application; the entries written
by the package appear here, with a Source value such as SQLISPackage or
SQLISService, depending on what was logged and the SSIS version.

v. Right-click the Control Flow tab and select Log Events; you will see the
logged events, as shown below,

d. SSIS log provider for SQL Server:

i. Select Provider type: SSIS log provider for SQL Server and
ii. Click Add
iii. Under Configuration, create a connection to the SQL Server database.
iv. For logging the information into a table in the SQL Server database, select
the options as shown in the below image,

v. Execute the package


vi. Execute the below query in the database you selected while configuring the
logging options (if no database was specified, the table is created in msdb):
vii. SELECT * FROM msdb.dbo.sysssislog
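
To narrow the output to failures, a typical query against the same log table is:

SELECT event, source, starttime, endtime, message
FROM msdb.dbo.sysssislog
WHERE event IN ('OnError', 'OnTaskFailed')
ORDER BY starttime DESC;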

Returning Single Row using Execute SQL Task in SSIS:


In Execute SQL Task, the General section contains the Name property and
the Description property. The Name property refers to the task name. You should name the
task something suitable. On my system, I named the task Get ResultSet. I then added a
description to the Description property to explain what the task does.

In the Options section, let's go ahead with the default property values.

The next section on the General page is Result Set. Notice that this section includes only
the ResultSet property. The property lets you select one of the following four options:

None: The query returns no result set.


Single row: The query returns a single-row result set.
Full result set: The query returns a result set that can contain multiple rows.
XML: The query returns a result set in an XML format.

The option you select depends on the results of the query you pass into the Execute
SQL task. For this exercise, our query will return only a single value. Consequently, we
will choose the Single row option.

Next, we need to configure the properties in the SQL Statement section. The following
values show how to configure these properties.

Connection: AdventureWorks (or whatever you named the connection manager you
created earlier).

SQLSourceType: Direct input. This means we'll type the code straight in and not use a
stored procedure.

SQLStatement: Because we've selected the Direct input option, we need to enter a T-SQL
statement for this option. I've used the following statement, which returns a single value:

SELECT MAX(EmployeeID) AS [MaxEmployeeId]
FROM HumanResources.Employee

IsQueryStoredProcedure: This option is greyed out because we selected Direct input for
the SQLSourceType property. Had we selected Stored Procedure, this property would be
available and the SQLStatement property would be greyed out.

BypassPrepare: The property defaults to False. If you change the value to True, you can
click the Parse Query button to verify that your T-SQL statement is valid.

Our next step is to associate our result set value with a variable that will store the value
we retrieve from the database. To do this, go to the Result Set page of the Execute SQL
Task Editor.

The main grid of the Result Set page contains two columns: Result Name and Variable
Name. Click the Add button to add a row to the grid. In the Result Name column, enter
the column name returned by your query (MaxEmployeeId). In the Variable
Name column, select the User::MaxEmployeeId variable. Your Result Set page should
now look similar to the one shown in Figure 6.

If our single-row result set contained multiple columns, we would have had to map a
variable to each column. However, because we returned only one value, we needed only
one mapping.

Once you've associated your result set value with a variable, click OK to close
the Execute SQL Task Editor. Your task should now be set up to return a single-row
result set. Now we need to do something with that result set!

Now, let's work with a Single-Row Result Set:

Our next step is to drag a new Execute SQL task onto our design surface so we can use
the result set returned by the first Execute SQL task. So add the task, and then connect
the precedence constraint (the green arrow) from the first task to the new one. Next,
right-click the second task and click Edit to open the Execute SQL Task Editor.

Kelly Technologies, Hyderabad. Page 117


MS-BI MATERIAL WITH SCENARIOS

In the General section, provide a name and description for the task. (I named the
task Using Result Set.) For the ResultSet property, stick with the default value, None.
In this case, the task won't be returning a result set. Instead, we'll be using the results
returned by the previous task.

Now let's look at the SQL Statement section shown in Figure 8. Notice that, for
the SQLStatement property, I entered the following T-SQL code:
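
A statement equivalent to the one used here, calling the stored procedure with a single
parameter placeholder, is:

EXEC dbo.UpdateSSISLog ?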

As you can see, we're executing the UpdateSSISLog stored procedure. Notice, however,
that we follow the name of the stored procedure with a question mark (?). The question
mark serves as a placeholder for the parameter value that the stored procedure requires.
You cannot name parameters within the actual query, so we have to take another step to
provide our value.

Go to the Parameter Mapping page of the Execute SQL Task Editor. On this page,
you map the parameters referenced in your queries to variables. You create your
mappings in the main grid, which contains the following five columns (a sketch of the
supporting table and stored procedure follows the list):

Kelly Technologies, Hyderabad. Page 118


MS-BI MATERIAL WITH SCENARIOS

Variable Name: The variable that contains the value to be used for the
parameter. In this case, we'll use the User::EmpNum variable, which contains the
result set value returned by the first Execute SQL task.
Direction: Determines whether to pass a value into a parameter (input) or return
a value through the parameter (output).
Data Type: Determines the type of data provided from the variable. This will
default to the type used when setting up the variable.
Parameter Name: The name of the parameter. The way in which parameters are
named depends on your connection type. When running a T-SQL statement
against a SQL Server database through an OLE DB connection, as we're doing
here, we use numerical values to represent the statement's parameters, in the
order they appear in the statement, starting with 0. In this case, because there's
only one parameter, we use 0.
Parameter Size: The size of the parameter if it can be a variable length. The
default is -1, which lets SQL Server determine the correct size.
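
For reference, here is a minimal, hypothetical sketch of what the SSISLog table and the
UpdateSSISLog stored procedure used in this walkthrough might look like; their actual
definitions are not shown in the original material:

-- Hypothetical definitions; adjust names and columns to your environment.
CREATE TABLE dbo.SSISLog
(
    LogID      INT IDENTITY(1,1) PRIMARY KEY,
    EmployeeID INT NOT NULL,
    LoggedAt   DATETIME NOT NULL DEFAULT (GETDATE())
);
GO

CREATE PROCEDURE dbo.UpdateSSISLog
    @EmpNum INT  -- receives the value mapped from the User::EmpNum variable
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.SSISLog (EmployeeID) VALUES (@EmpNum);
END;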

Once you've mapped your variable to your parameter, the Parameter Mapping page
should look similar to the one shown in Figure 8.

When you're finished configuring the Execute SQL task, click OK.

Your package should now be ready to run. Click the green Execute button. When the
package has completed running, query the SSISLog table and verify that a row has been
added that contains the expected results.

Returning XML Result using Execute SQL Task in SSIS:


Scenario: Create a new XML file from the result set of a SQL Server query (in other
words, how can you save the result of a query from SQL Server to an XML file?), i.e. SSIS:
SQL Server to XML, save to file.

Solution:

Note: As you know, there is no XML destination in SSIS.

First of all, you can use FOR XML to get the result of a query as XML; look at our sample query:

SELECT EmployeeID, NationalIDNumber, ContactID
FROM HumanResources.Employee
FOR XML RAW('Facility'), ROOT('Extract'), ELEMENTS

This creates a row element named 'Facility' for each row; because of the ELEMENTS
directive, the 'EmployeeID', 'NationalIDNumber' and 'ContactID' columns appear as child
elements rather than attributes, and the root node is 'Extract'. For more information about
FOR XML, see the SQL Server documentation.
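
The output therefore has roughly this shape (the values shown are illustrative):

<Extract>
  <Facility>
    <EmployeeID>1</EmployeeID>
    <NationalIDNumber>14417807</NationalIDNumber>
    <ContactID>1209</ContactID>
  </Facility>
  <!-- one Facility element per employee row -->
</Extract>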

So, start with SSIS:

1- Create a variable of String type at package scope, and name it XMLData.

2- Add an Execute SQL Task, set connection as OLEDB to the AdventureWorks Database,
write this query in SQL Statement:

SELECT EmployeeID, NationalIDNumber, ContactID
FROM HumanResources.Employee
FOR XML RAW('Facility'), ROOT('Extract'), ELEMENTS

Set the ResultSet property to XML. Then go to the Result Set tab, and create this mapping:

Result Name: 0
Variable Name: User::XMLData

3- Add a Script Task after the Execute SQL Task, set the language to C#, and set
ReadOnlyVariables to User::XMLData.

Then edit the script and write this code in the Main() method:

public void Main()
{
    // Load the XML string produced by the Execute SQL Task
    // (stored in the XMLData package variable) into an XmlDocument.
    System.Xml.XmlDocument xdoc = new System.Xml.XmlDocument();
    xdoc.InnerXml = Dts.Variables["XMLData"].Value.ToString();

    // Write the XML out to a file on disk.
    xdoc.Save(@"E:\Output.xml");

    Dts.TaskResult = (int)ScriptResults.Success;
}
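
In a real package you would typically read the output path from a package variable rather
than hard-coding it; for example, add a second read-only variable (say User::uvOutputPath,
a name assumed here) and pass its value to xdoc.Save() in the same way the XMLData
variable is read.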

Provide Security to Package in SSIS

ProtectionLevel is an SSIS package level property that is used to specify how sensitive
information is saved within the package and also whether to encrypt the package or the
sensitive portions of the package. The classic example of sensitive information would be a
password. Each SSIS component designates that an attribute is sensitive by including
Sensitive="1" in the package XML; e.g. an OLE DB Connection Manager specifies that the
database password is a sensitive attribute as follows:

<DTS:Password DTS:Name="Password" Sensitive="1">

When the package is saved, any property that is tagged with Sensitive="1" gets handled per
the ProtectionLevel property setting in the SSIS package. The ProtectionLevel property can
be selected from the following list of available options (click anywhere in the design area of
the Control Flow tab in the SSIS designer to show the package properties):

1. DontSaveSensitive

2. EncryptSensitiveWithUserKey

3. EncryptSensitiveWithPassword

4. EncryptAllWithPassword

5. EncryptAllWithUserKey

6. ServerStorage

To show the effect of the ProtectionLevel property, add an OLE DB Connection Manager to
an SSIS package:

The above connection manager is for a SQL Server database that uses SQL Server
authentication; the password gives the SSIS package some sensitive information that must
be handled per the ProtectionLevel package property.

Now let's discuss each ProtectionLevel setting using an SSIS package with the above OLE
DB Connection Manager added to it.

1. DontSaveSensitive

When you specify DontSaveSensitive as the ProtectionLevel, any sensitive information is
simply not written out to the package XML file when you save the package. This could be
useful when you want to make sure that anything sensitive is excluded from the package
before sending it to someone. After saving the package using this setting, when you open it
up and edit the OLE DB Connection Manager, the password is blank even though the "Save
my password" checkbox is checked:

2. EncryptSensitiveWithUserKey

EncryptSensitiveWithUserKey encrypts sensitive information based on the credentials of the
user who created the package; e.g. the password in the package XML would look like the
following (actual text below is abbreviated to fit the width of the article):

<DTS:PASSWORD Sensitive="1" DTS:Name="Password"
Encrypted="1">AQAAANCMnd8BFdERjHoAwE/Cl+...</DTS:PASSWORD>

Note that the package XML for the password has the attribute Encrypted="1"; when the
user who created the SSIS package opens it the above text is decrypted automatically in
order to connect to the database. This allows the sensitive information to be stored in the
SSIS package but anyone looking at the package XML will not be able to decrypt the text
and see the password.

There is a limitation with this setting; if another user (i.e. a different user than the one who
created the package and saved it) opens the package the following error will be displayed:

If the user edits the OLE DB Connection Manager, the password will be blank. It is important
to note that EncryptSensitiveWithUserKey is the default value for the ProtectionLevel
property. During development this setting may work okay. However, you do not want to
deploy an SSIS package with this setting, as only the user who created it will be able to
execute it.

3. EncryptSensitiveWithPassword

The EncryptSensitiveWithPassword setting for the ProtectionLevel property requires that you
specify a password in the package, and that password will be used to encrypt and decrypt
the sensitive information in the package. To fill in the package password, click on the button
in the PackagePassword field of the package properties as shown below:

You will be prompted to enter the password and confirm it. When opening a package with a
ProtectionLevel of EncryptSensitiveWithPassword, you will be prompted to enter the
password as shown below:

The EncryptSensitiveWithPassword setting for the ProtectionLevel property overcomes the
limitation of the EncryptSensitiveWithUserKey setting, allowing any user to open the
package as long as they have the password.

When you execute a package with this setting using DTEXEC, you can specify the password
on the command line using the /Decrypt password command line argument.
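
A sketch of such a command line (the package path and password here are illustrative):

dtexec /F "C:\SSIS\MyPackage.dtsx" /Decrypt "MyPackagePassword"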

4. EncryptAllWithPassword

The EncryptAllWithPassword setting for the ProtectionLevel property allows you to encrypt
the entire contents of the SSIS package with your specified password. You specify the
package password in the PackagePassword property, same as with the
EncryptSensitiveWithPassword setting. After saving the package you can view the package
XML as shown below:

Note that the entire contents of the package are encrypted and the encrypted text is shown in
the CipherValue element. This setting completely hides the contents of the package. When
you open the package you will be prompted for the password. If you lose the password
there is no way to retrieve the package contents. Keep that in mind.

When you execute a package with this setting using DTEXEC, you can specify the password
on the command line using the /Decrypt password command line argument.

5. EncryptAllWithUserKey

The EncryptAllWithUserKey setting for the ProtectionLevel property allows you to encrypt
the entire contents of the SSIS package by using the user key. This means that only the
user who created the package will be able open it, view and/or modify it, and run it. After
saving a package with this setting the package XML will look similar to this:

Note that the entire contents of the package are encrypted and contained in the Encrypted
element.

6. ServerStorage

The ServerStorage setting for the ProtectionLevel property allows the package to retain all
sensitive information when you are saving the package to SQL Server. SSIS packages saved
to SQL Server use the MSDB database. This setting assumes that you can adequately secure
the MSDB database and therefore it's okay to keep sensitive information in a package in an
unencrypted form.
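
For example, packages saved to the MSDB database can be listed with a query like the
following (SQL Server 2008 and later; in SQL Server 2005 the table is
msdb.dbo.sysdtspackages90):

SELECT name, description, createdate
FROM msdb.dbo.sysssispackages;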

Steps to Create a Package to be Used as a Template in SSIS:


To create a SSIS package to be used as template you have to follow the same
approach as creating a new package. You need to use Business Intelligence
Development Studio (BIDS) to create a new project of type "SQL Server Data Tool
(SSDT) Integration Services Project". Create a new package, specify an appropriate
name for this package and add the work flow and components you want to be part
of the template.

For example, I have a sample package below in which the first task logs the start of
the ETL batch. Next I have a container which will eventually contain components for
loading data into staging. After that I have another container which will contain
components for data loading into dimensions and facts and for cube processing. At
the end, it will log success or failure for the package.

Once you are done with creating the basic structure of the package and have added
the common components, you need to save a copy of this package at the following
locations based on the version of SQL Server you are using:

For SQL Server 2008


<<Installation drive>>:\Program Files (x86)\Microsoft Visual Studio
9.0\Common7\IDE\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransform
ationItems
OR
<<Installation drive>>:\Program Files\Microsoft Visual Studio
9.0\Common7\IDE\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransform
ationItems

For SQL Server 2012


<<Installation drive>>:\Program Files (x86)\Microsoft Visual Studio
10.0\Common7\IDE\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransfor
mationItems
OR
<<Installation drive>>:\Program Files\Microsoft Visual Studio
10.0\Common7\IDE\PrivateAssemblies\ProjectItems\DataTransformationProject\DataTransfor
mationItems

You need to specify the drive location where Business Intelligence Development
Studio (BIDS) or SQL Server Data Tools (SSDT) has been installed. Please note, as
BIDS or SSDT runs locally on the client machine, you need to copy the template
package to the above location on every development machine on which you want
to use it. For this example we are naming the template package
"SamplePackageTemplate.dtsx".

You are not restricted to deploying only one template. You can deploy as many
templates as you want to the folders listed above and reuse them as needed.

Using the SSIS Template in Other Projects


In a new or existing project where you want to add this SSIS package template,
you just need to right click on the project name in the Solution Explorer, click on
Add > New Item as shown below:

In the Add New Item dialog box, you will notice the deployed package template as
shown below. You can select it and specify a name for the package for which the
template will be used and click on the Add button to add the new package to your
project based on the selected template. That's all you have to do. You now have a
package that is pre-configured and you can now customize it for your specific need.
Please note, the modifications that are done in the new package do not impact the
deployed template, as we are working with a copy of the template which is now part
of the current project and not the template itself.

If you are using SQL Server 2012, when you add a new item you will see the
template appearing in the Add New Item dialog box as shown below. Select the
template and specify the name for the new package which will be based on this
template.

Generate Unique ID
If you are using SQL Server 2005 or 2008, you should generate a unique ID for
each package created from the template. This is recommended because it helps in
analyzing log data, with better differentiation for each package. To generate a
unique ID value for the package, click the ID property in the Properties pane, and
then click Generate New ID.

In SQL Server 2012, when you add a package based on a template, SSDT generates
a unique ID for each package, and hence you don't need to do it separately.
