
Informatica Basics


Informatica PowerCenter 7

Basics Training

Kanbay Incorporated - All Rights Reserved

Course Objectives
Understand how to use PowerCenter 7 components for development
Be able to build basic ETL mappings
Be able to create, run and monitor workflows
Understand available options for loading target data
Be able to troubleshoot most problems

Introduction and Product Overview


Chapter 1

Kanbay Incorporated - All Rights Reserved

PowerCenter 7 Architecture

[Architecture diagram: Sources and Targets connect to the PowerCenter Server through native connections. The PowerCenter Server and the client tools (Designer, Repository Manager, Workflow Manager, Workflow Monitor, Repository Server Administrative Console) communicate with the Repository Server over TCP/IP; the Repository Server manages the Repository Agent, which connects natively to the Repository database.]

Not shown: client ODBC connections for source and target metadata

PowerCenter 7 Architecture
You can register multiple PowerCenter Servers to a repository. The PowerCenter Server moves data from sources to targets based on workflow and mapping metadata stored in a repository.

The PowerCenter Server runs workflow tasks according to the conditional links connecting the tasks.
When you have multiple PowerCenter Servers, you can assign a server to start a workflow or a session.

This allows you to distribute the workload.


Server Grid: You can increase performance by using a server grid to balance the workload. A server grid is a server object that allows you to automate the distribution of sessions across multiple servers.

PowerCenter 7 Architecture
The PowerCenter Server can combine data from different platforms and source types. For example, you can join data from a flat file and an Oracle source.

The PowerCenter Server can also load data to different platforms and target types.
For example, you can load transformed data to both a flat file target and a Microsoft SQL Server database in the same session.

PowerCenter 7 Server Connectivity


The PowerCenter Server connects to the following Informatica platform components:
PowerCenter Client
Other PowerCenter Servers
Repository Server
Repository Agent
Source and target databases
The PowerCenter Server is a repository client application. It connects to the Repository Server and Repository Agent to retrieve workflow and mapping metadata from the repository database. When the PowerCenter Server requests a repository connection from the Repository Server, the Repository Server starts and manages the Repository Agent. The Repository Server then re-directs the PowerCenter Server to connect directly to the Repository Agent. The Workflow Manager communicates directly with the PowerCenter Server over a TCP/IP connection.
8

PowerCenter 7 Server Connectivity


You create the connection by defining the port number in the Workflow Manager and the PowerCenter Server configuration. Use the Workflow Manager to register the PowerCenter Server in the repository.
In a server grid, the Workflow Manager communicates directly with multiple PowerCenter Servers over TCP/IP connections. Each PowerCenter Server retrieves a server grid object from the repository, which it uses to connect to the other PowerCenter Servers in the grid.
The PowerCenter Server maintains a database connection pool for stored procedure or lookup databases in a workflow. It allows an unlimited number of connections to lookup or stored procedure databases. If a database user does not have permission for the number of connections a session requires, the session fails. You can optionally set a parameter to limit the database connections.

PowerCenter 7 Server Connectivity


For a session, the PowerCenter Server holds the connection as long as it needs to read data from source tables or write data to target tables.

10

Designer Overview
Chapter 2

Kanbay Incorporated - All Rights Reserved

Designer Interface
Designer Windows:
Navigator
Workspace
Status bar
Output
Overview
Instance Data
Target Data

12

Designer Interface
Designer Tools: The Designer provides the following tools:
Source Analyzer: Use to import or create source definitions for flat file, XML, COBOL, Application, and relational sources.
Warehouse Designer: Use to import or create target definitions.
Transformation Developer: Use to create reusable transformations.
Mapplet Designer: Use to create mapplets.
Mapping Designer: Use to create mappings.
Navigator: Use to connect to and work in multiple repositories and folders. You can also copy and delete objects and create shortcuts using the Navigator.
Workspace: Use to view or edit sources, targets, mapplets, transformations, and mappings. You can work with a single tool at a time in the workspace. You can use the workspace in default or workbook format.

13

Designer Interface
Status bar: Displays the status of the operation you perform.
Output: Provides details when you perform certain tasks, such as saving your work or validating a mapping. Right-click the Output window to access window options, such as printing output text, saving text to a file, and changing the font size.
Overview: An optional window to simplify viewing workbooks containing large mappings or a large number of objects. It outlines the visible area in the workspace and highlights selected objects in color. To open the Overview window, choose View-Overview Window.
Instance Data: View transformation data while you run the Debugger to debug a mapping.
Target Data: View target data while you run the Debugger to debug a mapping.
You can view a list of open windows and switch from one window to another in the Designer.

14

Designer Tasks
The common tasks performed in each of the Designer tools:
Add a repository
Print the workspace
Open and close a folder
Create shortcuts
Check in and out repository objects
Search for repository objects
Enter descriptions for repository objects
Copy objects
Export and import repository objects
Work with multiple objects, ports, or columns
Rename ports
Use shortcut keys

15

Naming Conventions
Chapter 3

Kanbay Incorporated - All Rights Reserved

Naming Conventions
Good practice to follow naming conventions; they can be project specific:
Workflow: wfl_ followed by workflow functionality
Session: s_ followed by mapping name
Mapping: m_ followed by mapping functionality
Source: table/file name
Target: table/file name
Ports: Input & Output: column names; Variable: v_ followed by functionality

17

Naming Conventions - Transformations:


Source Qualifier: sql_ (followed by source name)
Stored Procedure: sp_ (followed by purpose of transformation)
Sequence Generator: seq_
Expression: exp_
Joiner: jnr_
Lookup: lkp_
Filter: fil_
Rank: rnk_
Router: rtr_
Update Strategy: upd_
Aggregator: agg_
Normalizer: nrm_

18

Working With Sources and Targets


Chapter 4

Kanbay Incorporated - All Rights Reserved

Design Process Overview


1. Create Source definition(s)
2. Create Target definition(s)
3. Create a Mapping
4. Create a Session Task
5. Create a Workflow with Task components
6. Run the Workflow and verify the results

20

Methods of Analyzing Sources


To extract data from a source, you must first define sources in the repository. You can import or create the following types of source definitions in the Source Analyzer:
Relational database
Flat file
COBOL file
XML object

21

Working with Relational Sources


You can add and maintain relational source definitions for tables, views, and synonyms:
Import source definitions: Import source definitions into the Source Analyzer.
Update source definitions: Update source definitions either manually, or by re-importing the definition.

22

Importing Relational Source Definitions


You can import relational source definitions from database tables, views, and synonyms. When you import a source definition, you import the following source metadata:
Source name
Database location
Column names
Datatypes
Key constraints
Note: When you import a source definition from a synonym, you might need to manually define the constraints in the definition.
To import a source definition, you must be able to connect to the source database from the client machine using a properly configured ODBC data source or gateway. You may also require read permission on the database object.
You can also manually define key relationships, which can be logical relationships created in the repository that do not exist in the database.
23

Importing Relational Source Definitions


To import a source definition: 1. In Source Analyzer, choose Sources-Import from Database.

24

Importing Relational Source Definitions


If no table names appear or if the table you want to import does not appear, click All.

25

Importing Relational Source Definitions

6. Click OK.

26

Importing Relational Source Definitions

7. Choose Repository-Save
27

Creating Target Definitions


You can create the following types of target definitions in the Warehouse Designer:
Relational: You can create a relational target for a particular database platform. Create a relational target definition when you want to use an external loader to load the target database.
Flat file: You can create fixed-width and delimited flat file target definitions.
XML file: You can create an XML target definition to output data to an XML file.

28

Importing a Relational Target Definition


When you import a target definition from a relational table, the Designer imports the following target details:
Target name
Database location
Column names
Datatypes
Key constraints
Key relationships

29

Automatic Target Creation


Drag-and-drop a Source Definition into the Warehouse Designer Workspace

30

Target Definition properties

31

Target Definition properties

32

Metadata Extensions
Allow developers and partners to extend the metadata stored in the repository.
Accommodate the following metadata types:
User-defined: PowerCenter users can define and create their own metadata.
Vendor-defined: Third-party application vendors create vendor-defined metadata lists; e.g., applications like PowerConnect for Siebel can add information such as contacts, version, etc.
Can be reusable or non-reusable. You can promote a non-reusable metadata extension to reusable; this is not reversible.
Reusable extensions are associated with all repository objects of that object type. A non-reusable extension is associated with a single repository object.
Administrator or Super User privileges are required for managing reusable metadata extensions.

33

Data Previewer
Preview data in:
Relational Sources
Flat File Sources
Relational Targets
Flat File Targets
The Data Preview option is available in:
Source Analyzer
Warehouse Designer
Mapping Designer
Mapplet Designer

34

Data Previewer Source Analyzer


From the Source Analyzer, select the Source drop-down menu, then Preview Data

35

Data Previewer Source Analyzer

A right mouse click can also be used to preview data


36

Mappings Overview
Chapter 5

Kanbay Incorporated - All Rights Reserved

Overview
A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation. Mappings represent the data flow between sources and targets. When the PowerCenter Server runs a session, it uses the instructions configured in the mapping to read, transform, and write data.
Every mapping must contain the following components:
Source instance: Describes the characteristics of a source table or file.
Transformation: Modifies data before writing it to targets. Use different transformation objects to perform different functions.
Target instance: Defines the target table or file.
Links: Connect sources, targets, and transformations so the PowerCenter Server can move the data as it transforms it.
Note: A mapping can also contain one or more mapplets. A mapplet is a set of transformations that you build in the Mapplet Designer and can use in multiple mappings.
38

Sample Mapping

39

Developing a Mapping [1/2]


When you develop a mapping, use the following procedure as a guideline:
1. Verify that all source, target, and reusable objects are created. Create source and target definitions. If you want to use mapplets, you must create them also. You can create reusable transformations in the Transformation Developer, or you can create them while you develop a mapping.
2. Create the mapping. You can create a mapping by dragging a source, target, mapplet, or reusable transformation into the Mapping Designer workspace, or you can choose Mapping-Create from the menu.
3. Add sources and targets. Add sources and targets to the mapping.

40

Developing a Mapping [2/2]


4. Add transformations and transformation logic. Add transformations to the mapping and build transformation logic into the transformation properties.
5. Connect the mapping. Connect the mapping objects to create a flow of data from sources to targets, through mapplets and transformations that add, remove, or modify data along this flow.
6. Validate the mapping. Validate the mapping to identify connection or transformation errors.
7. Save the mapping. When you save the mapping, the Designer validates it, identifying any errors. The Designer displays validation messages in the Output window. A mapping with errors is invalid, and you cannot run a session against it until you validate it.

41

Transformation Concepts
A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions.
Transformations can be active or passive.
Transformations can be connected to the data flow, or they can be unconnected. An unconnected transformation is not connected to other transformations in the mapping; it is called within another transformation, and returns a value to that transformation.
Transformations in a mapping represent the operations the PowerCenter Server performs on the data. Data passes into and out of transformations through ports that you link in a mapping or mapplet.

42

Transformation Concepts
Perform the following tasks to incorporate a transformation into a mapping:
1. Create the transformation.
2. Configure the transformation.
3. Link the transformation to other transformations and target definitions.
You can create transformations using the following Designer tools:
Mapping Designer: Create transformations that connect sources to targets. Transformations in a mapping cannot be used in other mappings unless you configure them to be reusable.
Transformation Developer: Create individual transformations, called reusable transformations, that you can use in multiple mappings.
Mapplet Designer: Create and configure a set of transformations, called mapplets, that you can use in multiple mappings.
43

Getting Help
Chapter 6

Kanbay Incorporated - All Rights Reserved

Startup Pages 1/3

45

Startup Pages 2/3

46

Startup Pages 3/3

47

Navigating The Online Documentation


Informatica provides a comprehensive help manual for designers. The entire manual can be accessed from the Help menu on the main menu bar. It provides standard contents, index, and search-based help, with an option to save certain pages as favorites. Informatica also provides context-sensitive help: the Help button on any window takes you directly to the help page related to that window.

48

Source Qualifier Transformation


Chapter 7

Kanbay Incorporated - All Rights Reserved

Source Qualifier Transformation


Active Transformation
Connected
Ports: All Input/Output
Usage:
Modify SQL statements
User-defined join
Source filter
Sorted ports
Select distinct
Pre/Post SQL
Convert datatypes
Relational sources ONLY

50

Source Qualifier Transformation


Represents the source record set queried by the server. Mandatory in Mappings using relational or flat file sources

51

Default Query
For relational sources, the PowerCenter Server generates a query for each Source Qualifier transformation when it runs a session. The default query is a SELECT statement for each source column used in the mapping. Thus, the PowerCenter Server reads only the columns that are connected to another transformation.

Although there are many columns in the source definition, only three columns are connected to another transformation. In this case, the PowerCenter Server generates a default query that selects only those three columns:
SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, CUSTOMERS.FIRST_NAME FROM CUSTOMERS
52

Joining Multiple sources


You can use one Source Qualifier transformation to join data from multiple relational tables. These tables must be accessible from the same instance or database server. When a mapping uses related relational sources, you can join both sources in one Source Qualifier transformation.
The default join is an inner equi-join (WHERE Src1.col_nm = Src2.col_nm) if the relationship between the tables is defined in the Source Analyzer.
This can increase performance when source tables are indexed.
Tip: Use the Joiner transformation for heterogeneous sources and to join flat files.
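As a sketch of what the generated join might look like, assuming two related sources CUSTOMERS and ORDERS joined on CUSTOMER_ID (table and column names are illustrative):

    SELECT CUSTOMERS.CUSTOMER_ID, CUSTOMERS.COMPANY, ORDERS.ORDER_ID, ORDERS.ORDER_DATE
    FROM CUSTOMERS, ORDERS
    WHERE CUSTOMERS.CUSTOMER_ID = ORDERS.CUSTOMER_ID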

53

Joining Multiple sources

54

Pre-SQL and Post-SQL Rules


Pre- and post-SQL statements are run against the source database.
You can use any command that is valid for the database type; no nested comments.
You can use mapping parameters and variables.
Use a semi-colon to separate multiple statements.
The Informatica server ignores semi-colons within single quotes, double quotes, or within /* */ comments.
To use a semi-colon outside of quotes or comments, escape it with a backslash (\).

55

Hands on Exercises - I
Chapter 8

Kanbay Incorporated - All Rights Reserved

Lab 1 - Setting Connections


This is a demonstration to be followed by the participants. This lab covers connections to the Informatica clients and other necessary configurations.

57

Lab 2 - Creating Source Definitions


Connect to tdbu01 Database using the SOURCE_INFA_TRN connection Import the Source Table Employee

58

Lab 3 - Creating Target Definitions


Connect to tdbu02 Database using the TARGET_INFA_TRN connection Import the Target Table Employee

59

Lab 4 - Simple Mapping


Create a mapping using Employees as the Source and Employees as the Target instance. No other transformations are required.
During execution of the mapping, select a file target instead of relational; the delimiter is pipe (|).
Ensure the target file name is user-specific (e.g., Student01 should use file_name01).

60

Transformation Objects (1)


Chapter 9

Kanbay Incorporated - All Rights Reserved

Active Vs Passive Transformation


Active:
Number of rows input may not equal number of rows output
Can operate on groups of data rows
May not be re-linked into another data stream (except into a sorted join where both flows arise from the same Source Qualifier)
e.g. Aggregator, Filter, Joiner, Rank, Normalizer, Source Qualifier, Update Strategy, Custom

Passive:
Number of rows input always equals number of rows output
Operates on one row at a time
May be re-linked into another data stream
e.g. Expression, Lookup, External Procedure, Sequence Generator, Stored Procedure

62

Transformation Types Explained [1/5]


Aggregator: Active/Connected. Performs aggregate calculations.
Application Source Qualifier: Active/Connected. Reads ERP object sources.
Custom: [Active or Passive]/Connected. Calls a procedure in a shared library or DLL.
External Procedure: Passive/[Connected or Unconnected]. Calls a procedure in a shared library or the COM layer of Windows.
Expression: Passive/Connected. Performs low-level calculations.


63

Transformation Types Explained [2/5]


Filter: Active/Connected. Drops rows conditionally.
Input: Passive/Connected. Defines mapplet input rows. Available in the Mapplet Designer.
Joiner: Active/Connected. Joins heterogeneous sources.
Lookup: Passive/[Connected or Unconnected]. Looks up values and passes them to other objects.
Normalizer: Active/Connected. Reorganizes records from VSAM, relational and flat file sources.
64

Transformation types Explained [3/5]


Output: Passive/Connected. Defines mapplet output rows. Available in the Mapplet Designer.
Rank: Active/Connected. Limits records to the top or bottom of a range.
Router: Active/Connected. Splits rows conditionally.
Sequence Generator: Passive/Connected. Generates unique ID values.
Sorter: Active/Connected. Sorts data.
65

Transformation types Explained [4/5]


Source Qualifier: Active/Connected. Reads data from flat file and relational sources.
Stored Procedure: Passive/[Connected or Unconnected]. Calls a database stored procedure.
Transaction Control: Active/Connected. Defines commit and rollback transactions.
Union: Active/Connected. Merges data from different databases or flat file systems.
Update Strategy: Active/Connected. Determines whether to insert, delete, update, or reject rows.
66

Transformation types Explained [5/5]


XML Generator: Active/Connected. Reads data from one or more input ports and outputs XML through a single output port.
XML Parser: Active/Connected. Reads XML from one input port and outputs data to one or more output ports.
XML Source Qualifier: Active/Connected. Represents the rows that the PowerCenter Server reads from an XML source when it runs a session.

67

Transformation Views
A transformation has three views: Iconized, Normal, and Edit.
Iconized: shows the transformation in relation to the rest of the mapping

68

Transformation Views
Normal: shows the flow of data through the transformation

Edit: shows the transformation ports and the properties; allows editing

69

Data Flow Rules


Each Source Qualifier starts a single data stream (a data flow).
Transformations can send rows to more than one transformation (split one data flow into multiple pipelines).
Two or more data flows can converge only if they originate from a common active transformation.

70

Ports & Expressions


Ports are present in each transformation and are used to propagate field values from the source to the target via the transformations. Ports are basically of three types:
Input
Output
Variable
Port evaluation follows a top-down approach.
An expression is a calculation or conditional statement added to a transformation. An expression can be composed of ports, functions, operators, variables, literals, return values, and constants.

71

Ports - Evaluation
Best practice recommends the following approach for port evaluation:
Input ports: Should be evaluated first. There is no evaluation ordering among input ports (as they do not depend on any other ports).
Variable ports: Should be evaluated after all input ports are evaluated (as variable ports can reference any input port). Variable ports can also reference other variable ports, but not any output ports. The ordering of variables is very important, as they can reference each other's values.

72

Ports - Evaluation
Output ports: Should be evaluated last. They can reference any input port or any variable port. There is no ordered evaluation of output ports (as they cannot reference each other).

73

Using Variable Ports


Also known as local variables; used for temporary storage.
Used to simplify complex expressions, e.g. create and store a depreciation formula to be referenced more than once.
Used in another variable port or output port expression.
A variable port cannot also be an input or output port.
Available in the Expression, Aggregator and Rank transformations.
Variable ports are NOT visible in Normal view, only in Edit view.

74

Using Variable Ports


The scope of variable ports is limited to a single transformation.
Variable ports are initialized to either zero (for numeric values) or empty string (for character and date variables) when the mapping logic is processed. They are not initialized to NULL.
Variable ports can remember values across rows (useful for comparing values); they retain their values until the next evaluation of the variable expression. Thus the order of variable ports can be used effectively to do procedural computation.
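As a sketch of this row-to-row memory, assume an Expression transformation with an input port CUSTOMER_ID (port names are illustrative). Because ports evaluate top-down and variables retain values across rows, v_is_new reads the previous row's ID before v_prev_id is overwritten:

    v_is_new  (variable) = IIF(CUSTOMER_ID != v_prev_id, 1, 0)
    v_prev_id (variable) = CUSTOMER_ID
    o_is_new  (output)   = v_is_new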

75

Default Values Two Usages


For input and input/output ports: used to replace null values.
For output ports: used to handle transformation calculation errors (not-null handling).

76

Expressions
Expressions can be entered at the port (row) level or at the transformation level.
Expressions can be used in the following transformations:
Expression: output port level
Aggregator: output port level
Rank: output port level
Filter: transformation level
Router: transformation level
Update Strategy: transformation level
Transaction Control: transformation level

77

Expression Transformation

Kanbay Incorporated - All Rights Reserved

Expression Transformation
Passive Transformation
Connected
Ports: Mixed; variables allowed
Create expressions in output or variable ports
Used to perform the majority of data manipulation

79

Expression Transformation
Perform calculations using non-aggregate functions (row level)

80

Expression Editor
An expression formula is a calculation or conditional statement for a specific port in a transformation. It performs calculations based on ports, functions, operators, variables, constants, and return values from other transformations.

81

Expression Editor

82

Expression Validation
The Validate or OK button in the Expression Editor will:
Parse the current expression
Perform remote port searching (resolves references to ports in other transformations)
Parse default values
Check spelling, correct number of arguments in functions, and other syntactical errors

83

Informatica Functions
Character Functions
Conversion Functions
Date Functions
Numerical Functions
Scientific Functions
Test Functions
Special Functions

84

Informatica Functions
Character Functions: Used to manipulate character data. INITCAP returns the string value with the first letter of each word in uppercase and the remaining letters in lowercase.
Conversion Functions: Used to convert datatypes.
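A minimal sketch (the literal is illustrative):

    INITCAP('informatica BASICS') returns 'Informatica Basics'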

85

Informatica Functions
Date Functions: Used to round, truncate, or compare dates; extract one part of a date; or perform arithmetic on a date. To pass a string to a date function, first use TO_DATE() to convert it to a date/time datatype.
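A minimal sketch of the TO_DATE conversion and a date calculation using DATE_DIFF, another standard date function (the literal, format strings, and port names are illustrative):

    TO_DATE('2004-12-31', 'YYYY-MM-DD')
    DATE_DIFF(SHIPPEDDATE, ORDERDATE, 'DD')   returns the difference in days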

Numerical Functions
Used to perform mathematical operations on numeric data

86

Informatica Functions
Scientific Functions: Used to calculate geometric values of numeric data.
Test Functions: Used to test if a lookup result is null and to validate data:
ISNULL()
IS_DATE()
IS_NUMBER()
IS_SPACES()

87

Informatica Functions
Special Functions: Used to handle specific conditions within a session; search for certain values; test conditional statements:
IIF(condition, true, false)
DECODE()
ERROR()
ABORT()
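As a sketch of the two most common special functions, assuming illustrative ports SALES and REGION_CODE:

    IIF(SALES > 100000, 'High', 'Low')
    DECODE(REGION_CODE, 1, 'East', 2, 'West', 'Unknown')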

88

Informatica New Functions Explained


METAPHONE: Encodes string values. You can specify the length of the string that you want to encode. METAPHONE encodes characters of the English language alphabet (A-Z), encoding both uppercase and lowercase letters in uppercase. METAPHONE encodes characters according to a list of rules; for example, it skips vowels (A, E, I, O, and U) unless one of them is the first character of the input string: METAPHONE('CAR') returns 'KR', METAPHONE('Lamb') returns 'LM', and METAPHONE('AAR') returns 'AR'.
Syntax: METAPHONE( string [, length] )
Return value: String; NULL if the value passed is NULL, an empty string, or does not contain any letter of the English alphabet.

89

Informatica New Functions Explained


SOUNDEX: Encodes a string value into a four-character string. SOUNDEX works for characters in the English alphabet (A-Z). It uses the first character of the input string as the first character in the return value and encodes the remaining three unique consonants as numbers. SOUNDEX encodes characters according to the following rules:
Uses the first character in the string as the first character in the return value, and encodes it in uppercase. For example, both SOUNDEX('John') and SOUNDEX('john') return 'J500'.
Encodes the first three unique consonants following the first character in the string and ignores the rest. For example, both SOUNDEX('JohnRB') and SOUNDEX('JohnRBCD') return 'J561'.
Assigns a single code to consonants that sound alike.
Syntax: SOUNDEX( string )
Return value: String/NULL

90

Informatica Data Types


Native datatypes:
Specific to the source and target database types
Displayed in source and target tables within the Mapping Designer
Transformation datatypes:
PowerCenter internal datatypes based on ANSI SQL-92
Displayed in transformations within the Mapping Designer
Note:
a) Transformation datatypes allow mix-and-match of source and target database types.
b) When connecting ports, native and transformation datatypes must be either compatible or explicitly converted.
91

Data type Conversions


Implicit type conversions:
All numeric data can be converted to all other numeric datatypes (e.g. integer, double and decimal)
All numeric datatypes can be converted to string, and vice-versa
Date can be converted only to date and string, and vice-versa
Raw (binary) can only be linked to raw
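Where ports are not implicitly compatible, the conversion functions make the conversion explicit. A minimal sketch (port names and format strings are illustrative):

    TO_DATE(DATE_STR, 'MM/DD/YYYY')     string to date
    TO_CHAR(ORDER_DATE, 'YYYY-MM-DD')   date to string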

92

Connect Validation
Examples of invalid connections in a mapping:
Connecting ports with incompatible datatypes
Connecting output ports to a Source
Connecting a Source to anything but a Source Qualifier or Normalizer transformation
Connecting an output port to an output port, or an input port to another input port

93

Mapping Validation
Mappings must:
Be valid for a session to run
Be end-to-end complete and contain valid expressions
Pass all data flow rules
Mappings are always validated when saved; they can be validated without saving.
The Output window will always display the reason for invalidity.

94

Filter Transformation

Kanbay Incorporated - All Rights Reserved

Filter Transformation
Active Transformation
Connected
Ports: All Input/Output
Usage: Filter rows from a mapping/mapplet pipeline

96

Filter Transformation
Drops rows conditionally

Use of logical operators makes the filter very effective (e.g. SALARY > 30000 AND SALARY < 100000)
97

Filter Transformation in a Mapping

98

Router Transformation

Kanbay Incorporated - All Rights Reserved

Router Transformation
Rows can be sent to multiple filter conditions.
Active Transformation
Connected
Ports: All Input/Output
Specify filter conditions for each group
Used to link source data in one pass to multiple filter conditions

100

Router Groups
Input group (always one)
User-defined groups: Each group has one condition. All group conditions are evaluated for each row; one row can pass multiple conditions. Unlinked group outputs are ignored.
Default group (always one): can capture rows that fail all group conditions.
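As a sketch, a Router splitting customers by country might use groups like these (port and group names are illustrative, following Lab 7):

    USA group:     COUNTRY = 'USA'
    GERMANY group: COUNTRY = 'Germany'
    DEFAULT group: receives rows that fail all of the above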

101

Router Group Filter Conditions

102

Using Router in a mapping

103

Workflows- I
Chapter 10

Kanbay Incorporated - All Rights Reserved

Workflow Manager Tools


Workflow Designer: Maps the execution order and dependencies of Sessions, Tasks and Worklets for the Informatica Server.
Task Developer: Creates Session, Shell Command and Email tasks. Tasks created in the Task Developer are reusable.
Worklet Designer: Creates objects that represent a set of tasks. Worklet objects are reusable.

105

Workflow Manager Interface

e.g. The simplest Workflow


106

Workflow - Overview
A workflow is a set of instructions that describes how and when to run tasks related to extracting, transforming, and loading data. The PowerCenter Server runs workflow tasks according to the conditional links connecting the tasks. You can run a task by placing it in a workflow.
The Workflow Manager is used to develop and manage workflows. The Workflow Monitor is used to monitor workflows and stop the PowerCenter Server.
When a workflow starts, the PowerCenter Server retrieves mapping, workflow, and session metadata from the repository to extract data from the source, transform it, and load it into the target. It also runs the tasks in the workflow.
You can run as many sessions in a workflow as you need, sequentially or concurrently, depending on your needs.
107

Session Overview
A session is a set of instructions that tells the PowerCenter Server how and when to move data from sources to targets. A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation. A session is a type of task, similar to other tasks available in the Workflow Manager. In the Workflow Manager, you configure a session by creating a Session task. To run a session, you must first create a workflow to contain the Session task.

108

Designing & Developing Workflows


Create a new Workflow in the Workflow Designer:
Specify the workflow name and select a server
Customize workflow properties; the workflow log displays
Set and customize a workflow-specific schedule
Metadata Extensions provide for additional user data

Building Workflow Components:
Add sessions and other tasks to the workflow
Connect all workflow components with links
Save the workflow
Start the workflow

109

Workflow Designer - Links


Required to connect workflow tasks
Can be used to create branches in a workflow
All links are executed, unless a link condition is used and evaluates to false
Links connecting the tasks in a workflow are not allowed to form a closed loop
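A sketch of a link condition in the Workflow Manager expression syntax, assuming a session named s_m_load_employees (the name is illustrative); the downstream task runs only if the session succeeded:

    $s_m_load_employees.Status = SUCCEEDED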

110

Session Task

Kanbay Incorporated - All Rights Reserved

Session Task
Server instructions to run the logic of ONE specific mapping, e.g. source and target data location specifications, memory allocation, optional mapping overrides, and scheduling, processing and load instructions.
Becomes a component of a Workflow or Worklet.
If configured in the Task Developer, the Session Task is reusable.
When a session is created, valid mappings are displayed in the dialog box.

112

Session Task
Session Task tabs:
General
Properties
Config Object
Mapping
Components
Metadata Extensions

113

Session Task

114

Session Task

115

Validating a Session
The Workflow Manager validates a Session task when you save it. You can also manually validate Session tasks and session instances. Validate reusable Session tasks in the Task Developer. Validate non-reusable sessions and reusable session instances in the Workflow Designer.

116

Workflows Monitor Overview


Chapter 11

Kanbay Incorporated - All Rights Reserved

Monitor Workflows
The Workflow Monitor is the tool for monitoring workflows and tasks.
Review details about a workflow or tasks in two views:
Gantt Chart view
Task view
The Workflow Monitor displays workflows that have been run at least once.

118

Gantt Chart View

119

Task View

120

Monitoring Workflows
Perform operations in the Workflow Monitor:
Restart: restart a Task, Workflow or Worklet
Stop: stop a Task, Workflow or Worklet
Abort: abort a Task, Workflow or Worklet
Resume: resume a suspended Workflow after a failed Task is corrected
View Session and Workflow logs
Abort has a 60-second timeout: if the Server has not completed processing and committing data during the timeout period, the threads and processes associated with the Session are killed.

121

Monitoring Workflows -Task view


Task view: Start, Stop, Abort, and Resume Tasks, Workflows and Worklets.
Workflow monitoring filtering: Task view provides filtering; monitoring filters can be set using drop-down menus.
Truncating monitor logs: The Repository Manager Truncate Log option clears the Workflow Monitor logs.

122

Hands on Exercises - II
Chapter 12

Kanbay Incorporated - All Rights Reserved

Lab 5 - Expression Transformation


Create a mapping using the Employee flat file as the source and DIM_EMPLOYEE as the target.
Concatenate First Name and Last Name to get Employee Name.
Ensure all leading and trailing spaces are removed for character columns.
Use NEXTVAL of a Sequence Generator transformation to connect to Employee_wk.
Target load will be truncate/load.
Do not connect direct_report_wk, geography_wk, region_nk, region_name and ...

124

Lab 6 - Filter Transformation


Create a copy of the Lab 5 mapping for Lab 6.
Add a Filter to the mapping to filter out all records having Region as NULL; set audit_id = 0.
Target load will be truncate/load.

125

Lab 7 - Using Router Transformation


Create a mapping using Customer as the source.
Add a Router to create groups by customer, based on Country = USA, Country = Germany, and all others.
Load to 3 instances of the DIM_CUSTOMER table.

126

Using the Debugger


Chapter 13

Kanbay Incorporated - All Rights Reserved

Debugger Features
The Debugger is a wizard-driven tool:
View source/target data
View transformation data
Set breakpoints and evaluate expressions
Initialize variables
Manually change variable values
The Debugger is session driven; data can be loaded or discarded. The debug environment can be saved for later use.

128

Debugger Features
You can debug a valid mapping to gain troubleshooting information about data and error conditions. To debug a mapping, you configure and run the Debugger from within the Mapping Designer. The Debugger uses a session to run the mapping on the PowerCenter Server. When you run the Debugger, it pauses at breakpoints and allows you to view and edit transformation output data.
You might want to run the Debugger in the following situations:
Before you run a session
After you run a session

129

Debugger Session Types


You can select three different debugger session types when you configure the Debugger. The Debugger runs a workflow for each session type.

You can choose from the following Debugger session types when you configure the Debugger:
Use an existing non-reusable session for the mapping
Use an existing reusable session for the mapping
Create a debug session instance for the mapping

130

Debugger Session Types


Use an existing non-reusable session for the mapping: The Debugger uses existing source, target, and session configuration properties. When you run the Debugger, the PowerCenter Server runs the non-reusable session and the existing workflow. The Debugger does not suspend on error.
Use an existing reusable session for the mapping: The Debugger uses existing source, target, and session configuration properties. When you run the Debugger, the PowerCenter Server runs a debug instance of the reusable session and creates and runs a debug workflow for the session.
Create a debug session instance for the mapping: The Debugger Wizard allows you to configure source, target, and session configuration properties. When you run the Debugger, the PowerCenter Server runs a debug instance of the debug session and creates and runs a debug workflow for the session.
131

Debug Process
1. Create breakpoints. You create breakpoints in a mapping where you want the PowerCenter Server to evaluate data and error conditions.
2. Configure the Debugger. Use the Debugger Wizard to configure the Debugger for the mapping. Select the session type the PowerCenter Server uses when it runs the Debugger. When you create a debug session, you configure a subset of session properties within the Debugger Wizard, such as source and target location. You can also choose to load or discard target data.

132

Debug Process
3. Run the Debugger. Run the Debugger from within the Mapping Designer. When you run the Debugger, the Designer connects to the PowerCenter Server. The PowerCenter Server initializes the Debugger and runs the debugging session and workflow. The PowerCenter Server reads the breakpoints and pauses the Debugger when the breakpoints evaluate to true.
4. Monitor the Debugger. While you run the Debugger, you can monitor the target data, transformation and mapplet output data, the debug log, and the session log. When you run the Debugger, the Designer displays the following windows:
Debug log: View messages from the Debugger.
Target window: View target data.
Instance window: View transformation data.
133

Debug Process
5. Modify data and breakpoints. When the Debugger pauses, you can modify data and see the effect on transformations, mapplets, and targets as the data moves through the pipeline. You can also modify breakpoint information.
The Designer saves mapping breakpoint and Debugger information in the workspace files. You can copy breakpoint information and the Debugger configuration to another mapping. If you want to run the Debugger from another PowerCenter Client machine, you can copy the breakpoint information and the Debugger configuration to the other PowerCenter Client machine.

134

Debugger Interface

135

Creating Breakpoints
Use the Breakpoint Editor in the Mapping Designer to create breakpoint conditions in a mapping. You can create data or error breakpoints. When you run the Debugger, the PowerCenter Server pauses the Debugger when a breakpoint evaluates to true.
A breakpoint can consist of an instance name, a breakpoint type, and a condition. When you enter breakpoints, set breakpoint parameters in the following order:
1. Select the instance name.
2. Select the breakpoint type.
3. Enter the condition.
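A sketch of a data breakpoint following that order, assuming the Router instance from Labs 7/8 and an illustrative COUNTRY port; the Debugger pauses on each row where the condition is true:

    Instance:  rtr_Customers (illustrative name)
    Type:      Data
    Condition: COUNTRY = 'USA'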

136

Breakpoints Editor

137

Debugger Tips
The Server must be running before starting a debug session.
When the Debugger is started, a spinning icon displays; spinning stops when the Debugger Server is ready.
A flashing yellow/green arrow points to the current active Source Qualifier. A solid yellow arrow points to the current transformation instance.
Next Instance: a single step at a time; one row moves from transformation to transformation.
Step to Instance: examines one transformation at a time, one row after another through the same transformation.

138

Transformations in Depth - II
Chapter 14

Kanbay Incorporated - All Rights Reserved

Target Instances

Kanbay Incorporated - All Rights Reserved

Target Instances
A single mapping can have more than one instance of the same target. The data is loaded into the instances in bulk mode, like a pipeline.
Usage of multiple instances of the same target for loading is dependent on the RDBMS in use: multiple instances may not be usable if the underlying database locks the entire table while inserting records.

141

Target Instances - example

142

Joiner Transformation

Kanbay Incorporated - All Rights Reserved

Joiner Transformation
Active/Connected
Ports:
Input
Output
Master

144

Joins Types
Homogeneous joins: Joins that can be performed with a SQL SELECT statement:
The Source Qualifier contains a SQL join
Tables are on the same database server (or are synonyms)
The database server does the join work
Multiple homogeneous joins can be combined
Heterogeneous joins: Examples of joins that cannot be done with an SQL statement:
An Oracle table and a DB2 table
Two flat files
A flat file and a database table

145

Heterogeneous Joins

146

Joiner Properties
Join types:
Normal (inner)
Master Outer
Detail Outer
Full Outer
The Joiner can accept sorted data (configure the join condition to use the sort origin ports).
Joiner conditions and nested joins:
Multiple join conditions are supported
Nested joins are used to join three or more heterogeneous sources


147

Mid-Mapping Join
The Joiner does not accept input in the following situations:
Both input pipelines begin with the same Source Qualifier
Both input pipelines begin with the same Normalizer
Both input pipelines begin with the same Joiner
Either input pipeline contains an Update Strategy

148

Aggregator Transformation

Kanbay Incorporated - All Rights Reserved

Aggregator Transformation
Active Transformation
Connected
Ports: Mixed
Variables allowed
Group by allowed
Used for standard aggregations; can also be used to get distinct records

150

Aggregator Transformation
Performs aggregate calculations

151

Aggregate Expressions
Aggregate functions are supported only in the Aggregator transformation.
Conditional aggregate expressions are supported, e.g. the conditional SUM format: SUM(value, condition)
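A minimal sketch of a conditional aggregate, assuming illustrative ports QUANTITY, UNITPRICE, and DISCOUNT; only rows meeting the condition contribute to the sum:

    SUM(QUANTITY * UNITPRICE, DISCOUNT > 0)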


152

Aggregator Transformation
Aggregate functions:
Return summary values for non-null data in selected ports
Used only in Aggregator transformations
Used only in output ports
Calculate a single value (and row) for all records in a group
Nested aggregate functions are allowed
Examples: AVG(), COUNT(), MAX(), SUM()
Conditional statements can be used with these functions

153

Aggregate Properties
Sorted data (can be aggregated more efficiently):
The Aggregator can handle sorted or unsorted data
The Server will cache data from each group and release the cached data upon reaching the first record of the next group
Data must be sorted according to the order of the Aggregator's Group By ports
Performance gain will depend upon varying factors
Sorted Input property: Instructs the Aggregator to expect the data to be sorted.
Difference between sorted and unsorted data:
Unsorted data: No rows are released from the Aggregator until all rows are aggregated
Sorted data: Each separate group (one row) is released as soon as the last row in the group is aggregated
154

Lookup Transformation

Kanbay Incorporated - All Rights Reserved

Lookup Transformation
Passive Transformation
Connected/Unconnected
Ports: Mixed; 'L' indicates a Lookup port, 'R' indicates the port used as a return value
Usage:
Get related values
Verify if records exist or if data has changed
Multiple conditions are supported
Lookup SQL override is allowed
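A sketch of a lookup SQL override, assuming a CUSTOMERS lookup table with an ACTIVE_FLAG column (both illustrative); the override replaces the default lookup query, for example to restrict the rows that get cached:

    SELECT CUSTOMER_ID, COMPANY FROM CUSTOMERS WHERE ACTIVE_FLAG = 'Y'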

156

Lookup Transformation

157

Lookup Transformation
Looks up values in a database table and provides data to other components in a mapping

158

Lookup Properties
Lookup properties:
Lookup table name
Lookup condition
Native database connection object name

159

How a Lookup Transformation works


For each mapping row, one or more port values are looked up in a database table. If a match is found, one or more table values are returned to the mapping. If no match is found, the default value is returned.

160

Lookup Caching
Caching can significantly impact performance.
Cached:
Lookup table data is cached locally on the server
Mapping rows are looked up against the cache
Only one SQL SELECT is needed
The cache is indexed based on the ORDER BY clause
Uncached:
Each mapping row needs one SQL SELECT
If the data does not fit in the memory cache, the PowerCenter Server stores the overflow values in cache files. When the session completes, the PowerCenter Server releases cache memory and deletes the cache files unless you configure the Lookup transformation to use a persistent cache.

161

Lookup Caches
When configuring a lookup cache, you can specify any of the following options:
Persistent cache: You can save the lookup cache files and reuse them the next time the PowerCenter Server processes a Lookup transformation configured to use the cache. When the session completes, the persistent cache is stored on the server hard disk. The next time the session runs, cached data is loaded fully or partially into RAM and reused. A named persistent cache may be shared by different sessions.
Recache from source: If the persistent cache is not synchronized with the lookup table, you can configure the Lookup transformation to rebuild the lookup cache.

162

Lookup Caches
Static cache: You can configure a static, or read-only, cache for any lookup source. By default, the PowerCenter Server creates a static cache. It caches the lookup file or table and looks up values in the cache for each row that comes into the transformation.
Dynamic cache: If you want to cache the target table and insert new rows or update existing rows in the cache and the target, you can create a Lookup transformation to use a dynamic cache. The PowerCenter Server dynamically inserts or updates data in the lookup cache and passes data to the target table.
Shared cache: You can share the lookup cache between multiple transformations. You can share an unnamed cache between transformations in the same mapping. You can share a named cache between transformations in the same or different mappings.
163

Lookup Policy on Multiple Match


Options are:
Use first value
Use last value
Report error
Note: When the dynamic cache is enabled, multiple match will report an error.

164

Unconnected Lookups

Kanbay Incorporated - All Rights Reserved

Unconnected Lookup
Physically unconnected from other transformations: there can be NO data flow arrows leading to or from an unconnected Lookup.
The lookup function can be called within any transformation that supports expressions, e.g. a function in an Aggregator calls the unconnected Lookup.

166

Conditional Lookup Technique


Two requirements:
1. Must be an unconnected (or function mode) Lookup
2. The lookup function is used within a conditional statement, e.g. IIF(ISNULL(cust_id), :lkp.MYLOOKUP(order_no))
The conditional statement is evaluated for each row; the lookup function is called only under the pre-defined condition.

167

Conditional Lookup Advantage


Data lookup is performed only for those rows which require it, so substantial performance can be gained. E.g., a mapping will process 500,000 rows. For two percent of those rows (10,000) the item_id value is NULL. item_id can be derived from the SKU_NUMB:
IIF(ISNULL(item_id), :lkp.MYLOOKUP(sku_numb))
Net savings = 490,000 lookups

168

Unconnected Lookup Functionality


One Lookup port value (Return Port) may be returned for each Lookup WARNING: If the Return port is not defined, you may get unexpected results.

169

Connected Vs Unconnected Lookups


Connected Lookup:
Part of the mapping data flow
Returns multiple values (by linking output ports to another transformation)
Executed for every record passing through the transformation
More visible; shows where the lookup values are used
Default values are used
Unconnected Lookup:
Separate from the mapping data flow
Returns one value (by checking the Return port option for the output port that provides the return value)
Only executed when the lookup function is called
Less visible, as the lookup is called from an expression within another transformation
Default values are ignored

170

Update Strategy Transformation

Kanbay Incorporated - All Rights Reserved

Update Strategy Transformation


Active Transformation
Connected
Ports: All Input/Output
Usage: To mark a record for insert/update/delete or rejection; IIF or DECODE logic determines how to handle the record

172

Update Strategy Transformation


Specifies how each individual row will be used to update target tables (insert, update, delete, reject).

173

Update Strategy expressions


Operation    Constant     Numeric value
INSERT       DD_INSERT    0
UPDATE       DD_UPDATE    1
DELETE       DD_DELETE    2
REJECT       DD_REJECT    3

Example: IIF(score > 69, DD_INSERT, DD_DELETE)
The expression is evaluated for each row. Rows are tagged according to the logic of the expression, and the appropriate SQL (DML) is submitted to the target database: insert, delete or update.
DD_REJECT means the row will not have SQL written for it; the target will not see the row. Rejected rows may be forwarded through the mapping to a reject file.
174

Sequence Generator Transformation

Kanbay Incorporated - All Rights Reserved

Sequence Generator Transformation


Generates unique keys for any port on a row.
Passive Transformation / Connected
Ports: Two predefined output ports, NEXTVAL and CURRVAL; no input ports allowed
Usage: Generate sequence numbers; shareable across mappings

176

Sequence Generator Transformation

Connecting CURRVAL and NEXTVAL Ports to a Target

177

Sequence Generator Properties


Properties:
Start Value
End Value
Increment By
Number of Cached Values
Reset
Cycle
Design tip: Set the Reset property and Increment By 1. Use in conjunction with a Lookup: look up the MAX(value) from the target, then add NEXTVAL to it to get the new ID.
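A sketch of that tip in an Expression transformation, assuming a hypothetical unconnected lookup LKP_MAX_WK that returns MAX(employee_wk) from the target (the argument 1 is a dummy value to satisfy the lookup condition) and the Sequence Generator's NEXTVAL port linked into the expression:

    o_employee_wk = NEXTVAL + :lkp.LKP_MAX_WK(1)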

178

Hands on Exercises - III
Chapter 15

Kanbay Incorporated - All Rights Reserved

Lab 8 - Using Debugger


Execute a Debugger session for the Customer mapping created in Lab 7 with Load to Target disabled.
Check the usage of Step to Instance, Next Instance and Continue.
Add a breakpoint for Country = 'USA' on the Source Qualifier and Router transformations.
Stop at rows where the condition is satisfied to observe the data.

180

Lab 9 - Sequence Generator


Create a copy of the mapping created in Lab 5.
Use a Sequence Generator to populate Employee_wk.
Set properties for Reset and Cycle; the range should be 1 to 100.

181

Lab 10 - Joiner Transformation


Use the Employee file created in Lab 4 as a source. Add another source that is a combination of the EmployeeTerritories, Territories and Region tables.
Join to the flat file source with an inner join on EmployeeID to get RegionID, RegionDescription, TerritoryID and TerritoryDescription from the database tables, and the other details from the flat file.
Omit PhotoPath and Notes from the flat file in the join.
The target is a flat file (a new target definition is required).

182

Lab 11 - Aggregator Transformation


Create a mapping with Orders and OrderDetails as sources; the target is Fact_Orders. Aggregate at the Order_ID level.
Formulae:
lead_time_days = requireddate - orderdate
internal_response_time_days = shippeddate - orderdate
external_response_time_days = requireddate - shippeddate
total_order_item_count = SUM(Quantity)
total_order_discount_dollars = SUM((Quantity * UnitPrice) * Discount)
total_order_dollars = SUM((Quantity * UnitPrice) - ((Quantity * UnitPrice) * Discount))
Default to -1 for customer_wk, employee_wk, order_date_wk, required_date_wk, shipped_date_wk, ship_to_geography_wk, shipper_wk
183

Lab 12 - Using Connected Lookup


Use the mapping from Lab 11. Add 2 Lookup transformations, for DIM_EMPLOYEE and DIM_SHIPPER.
Populate using lookups with natural keys, default = -1:
employee_wk: Orders.EmployeeID = DIM_EMPLOYEE.employee_nk
shipper_wk: Orders.ShipVia = Dim_Shipper.Shipper_nk
Populate the other keys with default = -1.

184

Lab 13 - Using Unconnected Lookup


Use the mapping from Lab 12. Add two unconnected Lookups, for DIM_CUSTOMER and DIM_CALENDER (an empty table). Add an Expression transformation between the Aggregator and the target where the unconnected lookups can be called.

185

Lab 14 - Update Strategy


Add Employee as the source. Add 2 instances of DIM_EMPLOYEE as targets.
Add a Lookup transformation (LKP_Target) to get employee_wk from DIM_EMPLOYEE.
Add an Expression transformation for trimming string columns and getting values from LKP_Target.
Add a Router transformation to separate the data flow for new and existing records.
Add 2 Update Strategy transformations to flag rows for insert and update.
Add a Sequence Generator for populating employee_wk for insert rows.
Add an unconnected Lookup to retrieve the max value of employee_wk from the target.
Add an Expression transformation (EXP_MAX_SEQ, between the Update Strategy for insert and the target instance for insert) to call the unconnected lookup.
Note: Run Lab 6 first (some rows are filtered), then run this workflow.
186

Designer Features
Chapter 16

Kanbay Incorporated - All Rights Reserved

Arranging Workspace

188

Propagating Changed Attributes

189

Link Paths

190

Exporting Objects to XML

191

Importing Objects from XML

192

Comparing Objects

193

Documentation
Informatica also provides a very descriptive collection of documentation and guides. The complete set of documentation for PowerCenter includes:
Data Profiling Guide
Designer Guide
Getting Started
Installation and Configuration Guide
PowerCenter Connect for JMS User and Administrator Guide
Repository Guide
Transformation Language Reference
Transformation Guide
Troubleshooting Guide
Web Services Provider Guide
Workflow Administration Guide
XML User Guide
194

Versioning
If you have the team-based development license, you can configure the repository to store multiple versions of objects.
During development, you can use the following change management features to create and manage multiple versions of objects in the repository:
Check out and check in versioned objects
Compare objects
Track changes to an object
Delete or purge a version
You can also apply labels to versioned objects, run queries to search for objects in the repository, and include versioned objects in deployment groups.

195

Versioning
A repository enabled for versioning can store multiple versions of the following objects:
Sources
Targets
Transformations
Mappings and Mapplets
Sessions and Tasks
Workflows and Worklets
Session configurations
Schedulers
Cubes
Dimensions

196

Transformations in Depth - III


Chapter 17

Kanbay Incorporated - All Rights Reserved

Normalizer Transformation

Kanbay Incorporated - All Rights Reserved

Normalizer Transformation
Normalization is the process of organizing data. The Normalizer normalizes records from relational or VSAM sources.
Active Transformation
Connected
Ports: Input/Output or Output
Usage:
Required for VSAM source definitions
Normalize flat file or relational source definitions
Generate multiple records from one record

199

Overview
You primarily use the Normalizer transformation with COBOL sources, which are often stored in a de-normalized format. The Normalizer transformation normalizes records from COBOL and relational sources, allowing you to organize the data according to your own needs.
A Normalizer transformation can appear anywhere in a pipeline when you normalize a relational source: you break out repeated data within a record into separate records. You can also use the Normalizer transformation with relational sources to create multiple rows from a single row of data.

200

Overview
Use a Normalizer transformation instead of the Source Qualifier transformation when you normalize a COBOL source.

201

Different Normalizer Transformations


There are a number of differences between a VSAM Normalizer using COBOL sources and a pipeline Normalizer.
VSAM Normalizer Transformation
Connection: COBOL source
Port creation: automatic, based on the COBOL source
Transformations allowed before the Normalizer: No
Reusable: No
Ports: Input/Output

Pipeline Normalizer Transformation
Connection: any transformation
Port creation: manual
Transformations allowed before the Normalizer: Yes
Reusable: Yes
Ports: Input/Output
202

Pipeline Normalizer Transformation

203

Sorter Transformation

Kanbay Incorporated - All Rights Reserved

Sorter Transformation
Active Transformation
Connected
Ports: Input/Output
Define one or more sort keys
Define sort order for each key
Usage
Sort data in a mapping/mapplet pipeline
Before an Aggregator

205

Sorter Transformation
Can sort data from relational tables or flat files
The sort takes place on the Informatica Server machine
Multiple sort keys are supported

206

Sorter Transformation
Sorter Properties
Cache size can be adjusted [default is 8 MB]
The server uses twice the listed cache size
If the cache size is unavailable, the Session task will fail

207

Rank Transformation

Kanbay Incorporated - All Rights Reserved

Rank Transformation
Filters the top or bottom range of records for selection
Active Transformation
Connected
Ports
Mixed
One pre-defined output port: RANKINDEX
Variables allowed
Group By allowed
Usage
Select a top/bottom number of records

209

Overview
You can use a Rank transformation to:
Return the largest/smallest numeric value in a port or group
Return the strings at the top/bottom of a session sort order

210

Overview
The Rank transformation allows you to group information (like the Aggregator), create local variables, and write non-aggregate expressions. The Rank transformation differs from the MAX and MIN transformation functions in that it allows you to select a group of top or bottom values, not just one value. You can connect ports from only one transformation to the Rank transformation.
The Rank transformation includes input or input/output ports connected to another transformation in the mapping. It also includes variable ports and one rank port. Use the rank port to specify the column you want to rank.

211

Rank Index
The Designer automatically creates a RANKINDEX port for each Rank transformation. The PowerCenter Server uses the rank index port to store the ranking position for each row in a group. For example, if you create a Rank transformation that ranks the top three salespersons for each quarter, the rank index numbers the salespeople from 1 to 3:

RANKINDEX   SALES_PERSON   SALES
1           Sam            10,000
2           Mary            9,000
3           Alice           8,000

The RANKINDEX is an output port only. You can pass the rank index to another transformation in the mapping or directly to a target.

212

Rank Index
If two rank values match, they receive the same value in the rank index and the transformation skips the next value. For example, if you want to see the top five retail stores in the country and two stores have the same sales, the return data might look similar to the following:

RANKINDEX   SALES     STORE
1           100,000   Orange
1           100,000   Brea
3            90,000   Los Angeles
4            80,000   Ventura

213

Transaction Control Transformation

Kanbay Incorporated - All Rights Reserved

Overview
PowerCenter allows you to control commit and rollback transactions based on a set of rows that pass through a Transaction Control transformation. A transaction is the set of rows bound by commit or rollback rows. You can define a transaction based on a varying number of input rows. You might want to define transactions based on a group of rows ordered on a common key, such as employee ID or order entry date.

215

Overview
In PowerCenter, you define transaction control at two levels:
Within a mapping: you use the Transaction Control transformation to define a transaction, using an expression. Based on the return value of the expression, you can choose to commit, roll back, or continue without any transaction changes.
Within a session: when you configure a session, you configure it for user-defined commit. You can choose to commit or roll back a transaction if the PowerCenter Server fails to transform or write any row to the target.
When you run the session, the PowerCenter Server evaluates the expression for each row that enters the transformation. When it evaluates a commit row, it commits all rows in the transaction to the target or targets. When it evaluates a rollback row, it rolls back all rows in the transaction from the target or targets.

216

Transaction Control Properties


Enter the transaction control expression in the Transaction Control Condition field. The transaction control expression uses the IIF function to test each row against the condition. Use the following syntax for the expression:
IIF (condition, value1, value2)
The expression contains values that represent actions the PowerCenter Server performs based on the return value of the condition. The PowerCenter Server evaluates the condition on a row-by-row basis. The return value determines whether the PowerCenter Server commits, rolls back, or makes no transaction changes for the row. When the PowerCenter Server issues a commit or rollback based on the return value of the expression, it begins a new transaction.
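For example, to commit a transaction each time the order key changes, a sketch of a condition (the ports ORDER_ID and PREV_ORDER_ID are hypothetical; PREV_ORDER_ID would be carried in a variable port upstream):

    IIF(ORDER_ID != PREV_ORDER_ID, TC_COMMIT_BEFORE, TC_CONTINUE_TRANSACTION)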

217

Transaction Control Properties


Use the following built-in variables in the Expression Editor when you create a transaction control expression:
TC_CONTINUE_TRANSACTION (default value)
TC_COMMIT_BEFORE
TC_COMMIT_AFTER
TC_ROLLBACK_BEFORE
TC_ROLLBACK_AFTER

218

Stored Procedure Transformation

Kanbay Incorporated - All Rights Reserved

Overview
A Stored Procedure transformation is an important tool for populating and maintaining databases. Database administrators create stored procedures to automate tasks that are too complicated for standard SQL statements. A stored procedure is a precompiled collection of Transact-SQL, PL/SQL, or other database procedural statements and optional flow control statements, similar to an executable script. Stored procedures are stored and run within the database. You can run a stored procedure with the EXECUTE SQL statement in a database client tool. Not all databases support stored procedures, and stored procedure syntax varies depending on the database.
You might use stored procedures to do the following tasks:
Check the status of a target database before loading data into it
Determine if enough space exists in a database
Perform a specialized calculation
Drop and recreate indexes
220

Overview
Stored procedures also provide error handling and logging necessary for critical tasks. The stored procedure must exist in the database before creating a Stored Procedure transformation, and the stored procedure can exist in a source, target, or any database with a valid connection to the PowerCenter Server. You might use a stored procedure to perform a query or calculation that you would otherwise make part of a mapping. If you already have a well-tested stored procedure for calculating sales tax, you can perform that calculation through the stored procedure instead of recreating the same calculation in an Expression transformation.

221

Input and Output Data


There are three types of data that pass between the PowerCenter Server and the stored procedure:
Input/output parameters
Return values
Status codes
Input/Output Parameters: For many stored procedures, you provide a value and receive a value in return. These values are known as input and output parameters. The Stored Procedure transformation sends and receives input and output parameters using ports or variables, or by entering a value in an expression.
Return Values: Most databases provide a return value after running a stored procedure. The Stored Procedure transformation captures return values in a similar manner as input/output parameters, depending on the method by which the input/output parameters are captured.
222

Input and Output Data


Status Codes: Status codes provide error handling for the PowerCenter Server during a workflow. The stored procedure issues a status code that indicates whether or not the stored procedure completed successfully. You cannot see this value; the PowerCenter Server uses it to determine whether to continue running the session or stop. You configure options in the Workflow Manager to continue or stop the session in the event of a stored procedure error.

223

Connected and Unconnected


Stored procedures run in either connected or unconnected mode. You can configure connected and unconnected Stored Procedure transformations in a mapping.

The mode you use depends on what the stored procedure does and how you plan to use it in your session.

224

Connected and Unconnected


Connected: The flow of data through a mapping in connected mode also passes through the Stored Procedure transformation. All data entering the transformation through the input ports affects the stored procedure. You should use a connected Stored Procedure transformation when you need data from an input port sent as an input parameter to the stored procedure, or the results of a stored procedure sent as an output parameter to another transformation.
Unconnected: The unconnected Stored Procedure transformation is not connected directly to the flow of the mapping. It either runs before or after the session, or is called by an expression in another transformation in the mapping.

225

Comparison
If you want to do the following, use this mode:
Run a stored procedure before or after your session: Unconnected
Run a stored procedure once during your mapping, such as pre- or post-session: Unconnected
Run a stored procedure every time a row passes through the Stored Procedure transformation: Connected or Unconnected
Run a stored procedure based on data that passes through the mapping, such as when a specific port does not contain a null value: Unconnected
Pass parameters to the stored procedure and receive a single output parameter: Connected or Unconnected
Pass parameters to the stored procedure and receive multiple output parameters: Connected or Unconnected
Run nested stored procedures: Unconnected
Call multiple times within a mapping: Unconnected


226

Specifying when the Stored Procedure Runs


In the case of the unconnected stored procedure, the Expression transformation references the stored procedure, which means the stored procedure runs every time a row passes through the Expression transformation. If no transformation references the Stored Procedure transformation, you have the option to run the stored procedure once, before or after the session.
The following list describes the options for running a Stored Procedure transformation:
Normal
Pre-load of the Source
Post-load of the Source
Pre-load of the Target
Post-load of the Target
227

Specifying when the Stored Procedure Runs


Normal: The stored procedure runs where the transformation exists in the mapping, on a row-by-row basis. This is useful for calling the stored procedure for each row of data that passes through the mapping, such as running a calculation against an input port. Connected stored procedures run only in normal mode.
Pre-load of the Source: The stored procedure runs before the session retrieves data from the source. This is useful for verifying the existence of tables or performing joins of data in a temporary table.

228

Specifying when the Stored Procedure Runs


Post-load of the Source: The stored procedure runs after the session retrieves data from the source. This is useful for removing temporary tables.
Pre-load of the Target: The stored procedure runs before the session sends data to the target. This is useful for verifying target tables or disk space on the target system.
Post-load of the Target: The stored procedure runs after the session sends data to the target. This is useful for re-creating indexes on the database.

229

Specifying when the Stored Procedure Runs


You can run several Stored Procedure transformations in different modes in the same mapping. A pre-load source stored procedure can check table integrity, a normal stored procedure can populate the table, and a post-load stored procedure can rebuild indexes in the database. However, you cannot run the same instance of a Stored Procedure transformation in both connected and unconnected mode in a mapping.

You must create different instances of the transformation.

230

Stored Procedure Transformation 1

Connected
231

Stored Procedure Transformation 2

Unconnected
232

Calling Stored Procedure From Expression
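An unconnected Stored Procedure transformation is called from an expression with the :SP prefix; the PROC_RESULT keyword captures the procedure's return value. A minimal sketch, assuming a hypothetical transformation named SP_GET_SALES_TAX and a hypothetical input port:

    -- output port TAX in an Expression transformation
    :SP.SP_GET_SALES_TAX(ITEM_PRICE, PROC_RESULT)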

233

Workflows - II
Chapter 18

Kanbay Incorporated - All Rights Reserved

Additional Workflow Tasks


Eight additional tasks are available in the Workflow Designer:
Command
Email
Decision
Assignment
Timer
Control
Event Wait
Event Raise

235

Reusable Tasks
Three types of reusable tasks:
Session: a set of instructions to execute a specific mapping
Command: specific shell commands to run during any workflow
Email: sends email during the workflow
Use the Task Developer to create reusable tasks
These tasks will then appear in the Navigator and can be dragged and dropped into any workflow

236

Command Task
Specify one or more UNIX shell or DOS (NT, Windows 2000) commands to run at a specific point in the Workflow
Runs in the Informatica Server (UNIX or Windows) environment
Becomes a component of a Workflow (or Worklet)
Each Command task shell command can execute before the Session begins or after the Informatica Server executes a Session
Shell command status (successful completion or failure) is held in the pre-defined variable $command_task_name.STATUS, as sketched below
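For example, a condition on the link leaving the Command task can test the status variable. A sketch, assuming a Command task with the hypothetical name cmd_copy_file:

    $cmd_copy_file.STATUS = SUCCEEDED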

237

Command Task
If configured in the Task Developer, the Command task is reusable (optional)
You can use a Command task in the following ways:
Standalone Command task
Pre- and post-session shell command

238

Email Task
Configure to have the Informatica Server send email at any point in the Workflow
Becomes a component in a Workflow (or Worklet)
If configured in the Task Developer, the Email task is reusable (optional)

239

Non-reusable Tasks
Six additional tasks are available in the Workflow Designer:
Decision
Assignment
Timer
Control
Event Wait
Event Raise

240

Decision Task
Specifies a condition to be evaluated in the Workflow
Use the Decision task in branches of a Workflow
Provides additional functionality over a Link, as sketched below
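A Decision task evaluates its condition and exposes the result through the pre-defined $Decision_task_name.Condition variable, which downstream links can test. A sketch, with hypothetical session and task names; the first line is the condition entered in the Decision task, the second a condition on the link that follows it:

    $s_load_customers.Status = SUCCEEDED AND $s_load_orders.Status = SUCCEEDED
    $dec_loads_ok.Condition = TRUE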

241

Decision Task
Example Workflow without a Decision Task

242

Assignment Task
Assigns a value to a Workflow variable
Variables are defined in the Workflow object

243

Timer Task
Waits for a specified period of time before executing the next task
Absolute Time
Datetime variable
Relative Time

244

Control Task
Used to stop, abort, or fail the top-level workflow or the parent workflow based on an input link condition. A parent workflow or worklet is the workflow or worklet that contains the Control task.

245

Event Wait Task


Waits for a User-defined or a Pre-defined event to occur
Once the event occurs, the Informatica Server completes the rest of the Workflow
Used with the Event Raise task
Events can be a file watch (indicator file) or User-defined
User-defined events are defined in the Workflow itself

246

Event Raise Task


Represents the location of a User-defined event
The Event Raise task triggers the User-defined event when the Informatica Server executes the Event Raise task

247

Hands-On - III
Chapter 19

Kanbay Incorporated - All Rights Reserved

Lab 15 Using Command Task


Copy the workflow of Lab 4 for this lab. Add a Command task that copies the output file of the Session task to another directory.

249

Lab 16 Using Email Task


Copy the workflow of Lab 4 for this lab. Configure an Email task after the session to report successful completion.

250

Lab 17 Using Timer Task


Copy the workflow of Lab 15 for this lab. Include a Timer task after the session and configure it so that the Command task runs after 1 minute.

251

Conclusion

Thank You!

252

Kanbay
WORLDWIDE HEADQUARTERS: 6400 SHAFER COURT I ROSEMONT, ILLINOIS USA 60018 TEL. 847.384.6100 I FAX 847.384.0500 I WWW.KANBAY.COM

Kanbay Incorporated - All Rights Reserved
