Lab 1 - Simple ETL Process Using SSIS: Problem Statement
Problem statement
You have two Excel files, as follows:
This opens the SSIS Designer, which you will use to create and maintain Integration Services packages. It looks as follows:
In the Solution Explorer, under the SSIS Packages folder, you will see one default package named Package.dtsx. If you
want, you can rename it, or remove it and add a new one (right-click the folder and select New SSIS Package).
Note: A package is simply a collection of connections, control flow elements, data flow elements, event handlers, parameters, etc. We
will discuss each of these as we move further.
Step 2. Create Connection Manager for Excel File
2.1 Right-click Connection Managers and select New Connection.
2.3 Click the Browse button, select the Excel file, and click OK.
5.2 Select Data Flow Task from the toolbox and drag it onto the designer.
5.3 Rename the Data Flow Task to "Source excel to Destination excel transfer task".
Control Flow
Control flow is used to define the workflow. As the name implies, it controls the flow of execution.
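As a rough analogy only (this is not SSIS code), control flow can be pictured as an ordered list of tasks that run one after another. The sketch below assumes the tasks are chained in sequence; the function names are hypothetical stand-ins, and this lab's package contains just one task.

```python
from typing import Callable, List

def run_control_flow(tasks: List[Callable[[], None]]) -> None:
    """Run each task in order; an exception stops the remaining tasks."""
    for task in tasks:
        task()

log: List[str] = []

def data_flow_task() -> None:
    # Hypothetical stand-in for the Data Flow Task created in step 5.
    log.append("data flow task ran")

# The workflow for this lab contains a single task.
run_control_flow([data_flow_task])
print(log)
```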
8.2 Set the data source to "SourceExcelManager", the data access mode to "Table or View", and the sheet name to "DataSheet1"
(the name of the sheet in the Excel file).
Note: This Excel Source performs the Extraction task (E) in the ETL process.
Step 9. Create Derived Column
9.1 From the Transformation group in the SSIS toolbox, drag Derived Column onto the SSIS designer.
Step 10. Connect Source to Derived Column
10.1 Click the Excel Source added in the prior step.
You will see a small blue arrow attached to the source. This is called the Data Flow Path.
Data Flow Path: It defines how data flows from one component to the next.
Click the blue arrow and connect it to the Derived Column.
Note: We will discuss the red arrow in a future article in this series.
Step 11. Configure the derived column
11.1 Double-click the Derived Column. The popup looks as follows.
11.2 Set Derived Column Name to Name, set Derived Column to <add as new column>, and set Expression to
Title + " " + FirstName + " " + LastName
11.3 Click OK.
Note: This Derived Column performs the Transformation task (T) in the ETL process.
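To make the transformation concrete, here is a minimal Python sketch of what the derived column computes for each row. It is an illustration, not SSIS code: it assumes the three input columns hold non-empty text and that the separator between them is a single space. The sample row is made up.

```python
from typing import Dict, List

Row = Dict[str, str]

def add_name_column(rows: List[Row]) -> List[Row]:
    """Return copies of the rows with a new "Name" column added."""
    result = []
    for row in rows:
        new_row = dict(row)  # copy, so the input rows stay unchanged
        # Assumed separator: a single space between the three parts.
        new_row["Name"] = row["Title"] + " " + row["FirstName"] + " " + row["LastName"]
        result.append(new_row)
    return result

# Made-up sample row for illustration.
sample = [{"Title": "Mr", "FirstName": "John", "LastName": "Smith"}]
derived = add_name_column(sample)
print(derived[0]["Name"])
```

The existing columns are carried through unchanged; only the new Name column is added.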
Step 12. Create Excel Destination
12.1 Add Excel Destination from the Destination group in SSIS toolbox.
Note: This Excel Destination performs the Load task (L) in the ETL process.
Step 13. Connect Derived Column to Excel Destination
13.1 As in Step 10, connect the Derived Column to the Excel Destination.
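With both data flow paths connected, rows move from the source through the transformation into the destination. The sketch below mimics that chain in Python using generators, as an analogy only. In-memory lists stand in for the Excel sheets, and all function names and sample rows are hypothetical.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, str]

def extract(sheet: List[Row]) -> Iterator[Row]:
    """E: yield rows one at a time from the source sheet."""
    for row in sheet:
        yield row

def transform(rows: Iterable[Row]) -> Iterator[Row]:
    """T: add a Name column built from Title, FirstName and LastName."""
    for row in rows:
        out = dict(row)
        out["Name"] = " ".join([row["Title"], row["FirstName"], row["LastName"]])
        yield out

def load(rows: Iterable[Row], destination: List[Row]) -> int:
    """L: append every row to the destination and return the row count."""
    count = 0
    for row in rows:
        destination.append(row)
        count += 1
    return count

# Made-up rows standing in for the source sheet.
source_sheet = [
    {"Title": "Mr", "FirstName": "John", "LastName": "Smith"},
    {"Title": "Ms", "FirstName": "Jane", "LastName": "Doe"},
]
destination_sheet: List[Row] = []

# Each call's output feeds the next, like the blue data flow paths.
rows_written = load(transform(extract(source_sheet)), destination_sheet)
print(rows_written, destination_sheet[1]["Name"])
```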
Step 14. Configure Excel Destination
14.1 Double-click the Excel Destination. The popup looks as follows.
14.2 Set the connection manager to ExcelConnectionManager, the data access mode to Table or View, and the name of the Excel sheet to
Datasheet1.
14.3 Click Mappings and verify that the column mappings are correct. If they are not, fix them before proceeding.
Note: In our case, mapping will be already done by the IDE itself (because names of columns are matching).
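The idea of mapping by matching names can be sketched as follows. This is a simplified illustration, not the designer's actual logic: it assumes exact, case-sensitive name equality, and the destination column list is hypothetical.

```python
from typing import Dict, List

def auto_map_columns(input_columns: List[str],
                     destination_columns: List[str]) -> Dict[str, str]:
    """Map each input column to the destination column with the same name."""
    destination_set = set(destination_columns)
    return {name: name for name in input_columns if name in destination_set}

def unmapped_destination_columns(mapping: Dict[str, str],
                                 destination_columns: List[str]) -> List[str]:
    """List destination columns that received no input column."""
    mapped = set(mapping.values())
    return [col for col in destination_columns if col not in mapped]

# Columns available after the Derived Column step.
input_cols = ["Title", "FirstName", "LastName", "Name"]
# Hypothetical destination sheet with a single Name column.
dest_cols = ["Name"]

mapping = auto_map_columns(input_cols, dest_cols)
missing = unmapped_destination_columns(mapping, dest_cols)
print(mapping, missing)
```

An empty list of unmapped destination columns corresponds to the case where nothing needs to be fixed by hand in step 14.3.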
14.4 Click OK.
Step 15. Execute package
15.1 Press F5.
On successful execution, you will see a screen similar to this.
Time to celebrate