Lookup Transformation
idwbitraining@gmail.com 1
Lookup Basics
Purpose of Lookup Transformation:
Getting a related value: retrieve a value from the lookup table
based on a value in the source. The returned value can
also be used in a calculation like any other port.
Updating slowly changing dimension tables: determine whether
a row already exists in the target, then create a new
record or update the existing one.
A Lookup can be used as Connected or Unconnected. It is
Passive when it returns at most one row per input row and
Active when it is configured to return all matching rows.
The lookup can be performed on flat files, relational tables, views,
or synonyms.
How a Lookup Transformation Works
For each Mapping row, one or more port values are looked up
in a database table
If a match is found, one or more table values are returned to
the Mapping. If no match is found, NULL is returned
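The row-by-row behaviour described above can be sketched in Python. This is only an illustration of the semantics, not Informatica code: the `ORDERS` sample rows and the `lookup` helper are hypothetical, with column names borrowed from the LKP_OrderID example.

```python
# Hypothetical lookup table (sample data for illustration only).
ORDERS = [
    {"ORDER_ID": 1, "CUSTOMER_ID": 501, "STORE_ID": 7},
    {"ORDER_ID": 2, "CUSTOMER_ID": 502, "STORE_ID": 9},
]


def lookup(in_order_id, return_ports=("CUSTOMER_ID", "STORE_ID")):
    """Return the requested lookup-table values for one mapping row.

    If no lookup row matches, every return port is None (NULL).
    """
    for row in ORDERS:
        if row["ORDER_ID"] == in_order_id:
            return {port: row[port] for port in return_ports}
    return {port: None for port in return_ports}


matched = lookup(2)     # a matching row exists
unmatched = lookup(99)  # no matching row, so NULLs come back
print(matched)
print(unmatched)
```

The linear scan is deliberate: it mirrors the idea of checking each mapping row against the lookup source, without any caching.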
[Figure: Look Up Transformation. The Source Qualifier (SQ_TARGET_ITEMS_OR...) passes look-up values to the Lookup Procedure (LKP_OrderID), and return values flow on to the Target Definition (TARGET_ORDERS_COS...). In LKP_OrderID, IN_ORDER_ID is the input port; DATE_ENTERED, DATE_PROMISED, DATE_SHIPPED, EMPLOYEE_ID, CUSTOMER_ID, SALES_TAX_RATE and STORE_ID are lookup ports.]
Lookup Transformation
Looks up values in a database table or flat file and provides data to
downstream transformations in a Mapping
Passive Transformation
Connected / Unconnected
Ports
Mixed
L denotes a Lookup port
R denotes a port used as a return value (unconnected Lookup only)
Specify the Lookup Condition
Usage
Get related values
Verify whether records exist or whether data has changed
Lookup Properties
Lookup SQL Override option
Toggle caching
Native Database Connection Object name
Additional Lookup Properties
Set cache directory
Make cache persistent
Set Lookup cache sizes
Lookup Conditions
Multiple conditions are supported
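As a rough illustration, a lookup with several conditions can be thought of as a list of comparisons that must all hold for a lookup row to match. The sketch below is plain Python, not Informatica syntax, and assumes the conditions are combined with AND; the table contents, port names, and the `matches` and `lookup_all` helpers are invented for the example.

```python
import operator

# Each condition compares a lookup-table column with an input port.
# All conditions must hold for a row to match (assumed AND semantics).
CONDITIONS = [
    ("ITEM_ID", operator.eq, "IN_ITEM_ID"),
    ("PRICE", operator.le, "IN_MAX_PRICE"),
]

# Hypothetical lookup table rows.
ITEMS = [
    {"ITEM_ID": 10, "PRICE": 25.0, "ITEM_NAME": "Tent"},
    {"ITEM_ID": 10, "PRICE": 40.0, "ITEM_NAME": "Tent deluxe"},
    {"ITEM_ID": 11, "PRICE": 5.0, "ITEM_NAME": "Rope"},
]


def matches(lookup_row, input_row):
    """True when every condition holds for this lookup row and input row."""
    return all(op(lookup_row[col], input_row[port]) for col, op, port in CONDITIONS)


def lookup_all(input_row):
    """Return the ITEM_NAME of every lookup row that satisfies all conditions."""
    return [row["ITEM_NAME"] for row in ITEMS if matches(row, input_row)]


cheap_tents = lookup_all({"IN_ITEM_ID": 10, "IN_MAX_PRICE": 30.0})
no_match = lookup_all({"IN_ITEM_ID": 12, "IN_MAX_PRICE": 30.0})
print(cheap_tents, no_match)
```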
Connected Lookup
[Figure: Connected Lookup. The Lookup Procedure LKP_OrderID is linked in the pipeline between the Source Qualifier SQ_TARGET_ITEMS_OR... and the Target Definition TARGET_ORDERS_COS...]
Part of the data flow pipeline
Unconnected Lookup
Will be physically unconnected from other transformations
There can be NO data flow arrows leading to or from an unconnected Lookup
Lookup data is called from the point in the Mapping that needs it
Unconnected Lookup - Return Port
The port designated as R is the return port for the unconnected lookup
There can be only one return port
A Lookup (L) or Output (O) port can be assigned as the Return (R) port
The Unconnected Lookup can be called from the expression editor of any other
transformation using the expression
:LKP.Lookup_Transformation(argument1, argument2,..)
Connected vs. Unconnected Lookups
Connected: Part of the mapping data flow
Unconnected: Separate from the mapping data flow
Connected: Returns multiple values (by linking output ports to another transformation)
Unconnected: Returns one value (by checking the Return (R) port option for the output port that provides the return value)
Connected: Executed for every record passing through the transformation
Unconnected: Only executed when the lookup function is called
Connected: More visible, shows where the lookup values are used
Unconnected: Less visible, as the lookup is called from an expression within another transformation
Connected: Default values are used
Unconnected: Default values are ignored
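Two of the differences above (several output values versus one return value, and default values used versus ignored) can be sketched in Python. This is an analogy only, not how the Integration Service is implemented; `CUSTOMERS`, `DEFAULTS` and both functions are hypothetical names.

```python
# Hypothetical lookup source and user-defined default values.
CUSTOMERS = {101: {"NAME": "Acme", "REGION": "EU"}}
DEFAULTS = {"NAME": "UNKNOWN", "REGION": "N/A"}


def connected_lookup(customer_id):
    """Connected style: several output ports; defaults are used on no match."""
    row = CUSTOMERS.get(customer_id)
    return dict(row) if row is not None else dict(DEFAULTS)


def unconnected_lookup(customer_id, return_port="NAME"):
    """Unconnected style: only the single return (R) port comes back.

    Default values are ignored, so a missing row yields None (NULL).
    """
    row = CUSTOMERS.get(customer_id)
    return row[return_port] if row is not None else None


connected_hit = connected_lookup(101)
connected_miss = connected_lookup(999)
unconnected_hit = unconnected_lookup(101)
unconnected_miss = unconnected_lookup(999)
print(connected_hit, connected_miss, unconnected_hit, unconnected_miss)
```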
Conditional Lookup Technique
Two requirements:
Must be an Unconnected (or function mode) Lookup
The Lookup function is used within a conditional statement
IIF ( ISNULL(customer_id), :lkp.MYLOOKUP(order_no), customer_id )
Condition: ISNULL(customer_id)
Lookup function: :lkp.MYLOOKUP
Row keys (passed to Lookup): order_no
EXAMPLE: A Mapping will process 500,000 rows. For two percent of those rows
(10,000) the item_id value is NULL. Item_ID can be derived from the
SKU_NUMB.
The condition is true for 2 percent of all rows, and the Lookup is called only
when the condition is true, so it runs 10,000 times instead of 500,000.
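The arithmetic of the example can be checked with a small Python simulation. It assumes the conditional expression has the form IIF(ISNULL(item_id), :lkp.MYLOOKUP(sku_numb), item_id); the `my_lookup` stand-in, the generated rows and the SKU format are all made up for the illustration.

```python
TOTAL_ROWS = 500_000


def make_rows():
    """Yield rows in which every 50th row (2 percent) has a NULL item_id."""
    for i in range(TOTAL_ROWS):
        item_id = None if i % 50 == 0 else i
        yield {"item_id": item_id, "sku_numb": f"SKU{i}"}


lookup_calls = 0


def my_lookup(sku_numb):
    """Stand-in for the unconnected lookup: derive the item id from the SKU."""
    global lookup_calls
    lookup_calls += 1
    return int(sku_numb[3:])


def resolve_item_id(row):
    # Same shape as IIF(ISNULL(item_id), :lkp.MYLOOKUP(sku_numb), item_id):
    # the lookup is called only when item_id is NULL.
    if row["item_id"] is None:
        return my_lookup(row["sku_numb"])
    return row["item_id"]


resolved = [resolve_item_id(row) for row in make_rows()]
print(lookup_calls)
```

Counting the calls makes the saving visible: the lookup runs once per NULL row rather than once per mapping row.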
To Cache or not to Cache?
Caching can significantly impact performance
Cached
Lookup table data is cached locally on the machine
Mapping rows are looked up against the cache
Only one SQL SELECT is needed
Uncached
Each Mapping row needs one SQL SELECT
Rule Of Thumb: Cache if the number (and size) of records in
the Lookup table is small relative to the number of mapping
rows requiring lookup, or if a large cache memory is available to
the Integration Service
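The difference in SELECT counts can be demonstrated with Python's built-in `sqlite3` module. This is a sketch under stated assumptions, not a model of the Integration Service: the `customers` table, the sample rows and the `run_select` counter are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob"), (3, "Cy")]
)

mapping_rows = [1, 3, 2, 3, 1, 4]  # customer ids arriving from the source
select_count = 0


def run_select(sql, params=()):
    """Run a SELECT and count it."""
    global select_count
    select_count += 1
    return conn.execute(sql, params).fetchall()


def uncached(rows):
    """One SELECT per mapping row."""
    out = []
    for cid in rows:
        found = run_select("SELECT name FROM customers WHERE customer_id = ?", (cid,))
        out.append(found[0][0] if found else None)
    return out


def cached(rows):
    """One SELECT to build the cache, then in-memory lookups."""
    cache = dict(run_select("SELECT customer_id, name FROM customers"))
    return [cache.get(cid) for cid in rows]


select_count = 0
uncached_result = uncached(mapping_rows)
uncached_selects = select_count

select_count = 0
cached_result = cached(mapping_rows)
cached_selects = select_count
print(uncached_selects, cached_selects)
```

Both variants return the same values; only the number of queries differs, which is why the size of the lookup table relative to the number of mapping rows drives the decision.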
Lookup cache - overview
The Integration Service builds the cache in memory when the first row is
processed. If the memory is inadequate, the data is paged into a cache file.
If you use a flat file lookup, the Integration Service always caches the lookup rows.
Cache if the number (and size) of records in the Lookup table is small relative to
the number of mapping rows requiring the lookup.
Lookup cache - Types
There are two types of lookup cache: Static and Dynamic. They are compared below with an un-cached lookup.
Un-cached:
The lookup table is queried each time.
Cannot use a flat file as the lookup source.
When the condition matches, the lookup returns a row.
If the condition is false, the default value is returned for connected lookups and NULL is returned for unconnected lookups.
Static cache:
Cannot insert/update the cache once it is created.
Can use relational and flat file lookups.
When the condition matches, the lookup returns a row.
If the condition is false, the default value is returned for connected lookups and NULL is returned for unconnected lookups.
Dynamic cache:
Can insert/update rows in the cache for each row passed in from the source (the previous transformation).
Can use relational and flat file lookups.
When the condition matches, rows are updated in the cache or left unchanged, depending on the row type.
When the condition is false, rows are inserted into the cache or left unchanged, depending on the row type.
Lookup cache for connected
The Integration Service can build the cache for connected lookups in two ways:
Sequential cache: The Integration Service builds the cache in memory when it processes the
first row of the data in a cached lookup transformation. It waits for upstream transformations
to complete before building a cache.
Concurrent cache: The Integration Service does not wait for upstream active transformations
to complete. It starts building the cache as soon as the session starts. This may improve
performance if you are sure that the cache is needed each time the mapping is run.
For example: if the transformation logic in a mapping is configured to route data to different
pipelines, the downstream lookup might not be hit each time. In this case, it is advisable to
go for sequential cache.
Unconnected lookup caches cannot be processed concurrently.
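The two build strategies resemble lazy and eager initialization. The Python sketch below is an analogy only, with a background thread standing in for the concurrent build; the `LookupCache` class and its methods are hypothetical and do not correspond to any Informatica API.

```python
import threading


class LookupCache:
    """Toy cache that is built either on first use or at start-up."""

    def __init__(self, load_rows, concurrent=False):
        self._load_rows = load_rows
        self._data = None
        self._thread = None
        if concurrent:
            # Concurrent style: start building immediately, before any row arrives.
            self._thread = threading.Thread(target=self._build)
            self._thread.start()

    def _build(self):
        self._data = dict(self._load_rows())

    @property
    def built(self):
        return self._data is not None

    def wait_until_built(self):
        """Block until a background build (if any) has finished."""
        if self._thread is not None:
            self._thread.join()

    def get(self, key):
        if self._thread is not None:
            self._thread.join()  # make sure the background build is complete
        elif self._data is None:
            self._build()        # sequential style: build when the first row arrives
        return self._data.get(key)


def load_rows():
    return [(1, "Ann"), (2, "Bob")]


sequential = LookupCache(load_rows)
built_before_first_row = sequential.built  # nothing has been built yet
first_value = sequential.get(1)            # the first row triggers the build

concurrent = LookupCache(load_rows, concurrent=True)
concurrent.wait_until_built()
built_without_any_row = concurrent.built   # built although no row was looked up
```

If the lookup is never reached, the sequential variant never pays the build cost, which matches the advice above for mappings that route data to different pipelines.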
Lookup cache: Static
For each row that passes through the transformation, the cache is queried using the specified
condition.
If no match is found, either the default value (connected lookups only) or
NULL is returned.
If multiple matches are found, rows are returned based on the option specified in
"Lookup policy on multiple match" in the lookup properties.
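A static cache with a multiple-match policy can be sketched as a read-only dictionary of row lists. The policy names used here ("first", "last", "error") are this sketch's own labels and are not claimed to be the exact option names in the lookup properties; the item rows are sample data.

```python
from collections import defaultdict


def build_static_cache(lookup_rows, key_col):
    """Group lookup rows by key; the cache is not modified after this."""
    cache = defaultdict(list)
    for row in lookup_rows:
        cache[row[key_col]].append(row)
    return cache


def static_lookup(cache, key, return_col, policy="first", default=None):
    """Query the cache for one key and apply the multiple-match policy."""
    rows = cache.get(key)  # .get() does not add missing keys to the cache
    if not rows:
        return default     # default for connected use, None (NULL) otherwise
    if policy == "first":
        return rows[0][return_col]
    if policy == "last":
        return rows[-1][return_col]
    if policy == "error":
        if len(rows) > 1:
            raise ValueError(f"multiple matches for key {key!r}")
        return rows[0][return_col]
    raise ValueError(f"unknown policy: {policy}")


ITEM_ROWS = [
    {"ITEM_ID": 10, "ITEM_NAME": "Tent"},
    {"ITEM_ID": 10, "ITEM_NAME": "Tent v2"},
    {"ITEM_ID": 11, "ITEM_NAME": "Rope"},
]
cache = build_static_cache(ITEM_ROWS, "ITEM_ID")

first_match = static_lookup(cache, 10, "ITEM_NAME", policy="first")
last_match = static_lookup(cache, 10, "ITEM_NAME", policy="last")
with_default = static_lookup(cache, 99, "ITEM_NAME", default="UNKNOWN")
with_null = static_lookup(cache, 99, "ITEM_NAME")
```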
Lookup cache: Dynamic
Insert: inserts the row into the cache if it is not present and you specified to insert
rows. You can configure the lookup to insert rows into the cache based on input ports or
generated sequence IDs.
Update: updates the row in the cache if the row is already present and an update is
specified in the properties.
No change:
Row exists in cache, but you have specified to insert new rows only
Row does not exist in cache, but you have specified to update existing rows only
Row exists in the cache, but based on the lookup condition nothing changes
Lookup cache dynamic when to use
Updating a master customer table with new and updated customer information.
Use a Lookup transformation to perform a lookup on the customer table to determine if
a customer exists in the target. Use a dynamic lookup cache that inserts and updates
rows in the cache as it passes rows to the target.
Loading data into a slowly changing dimension table and a fact table.
Create two pipelines
and configure a Lookup transformation that performs a lookup on the dimension table.
Use a dynamic lookup cache to load data to the dimension table. Use a static lookup
cache to load data to the fact table, and specify the name of the dynamic cache from the
first pipeline.
Lookup cache dynamic properties
A dynamic lookup cache has the following properties:
NewLookupRow: This port is added when the lookup is configured as dynamic. 0 = no change, 1 = insert, 2 = update.
Associated port: The data in the associated port is used to determine whether to insert/update rows in the cache. A sequence ID can also be used as the associated port, in which case Informatica generates and uses a primary key.
Ignore Null Inputs for Updates: Select this for a port when you do not want to update the data in the cache when this column is NULL.
Ignore in Comparison: The Integration Service compares the values in all lookup ports with the values in their associated input ports by default. Select this property if you want the Integration Service to ignore the port when it compares values before updating a row.
Insert else Update: This affects only rows that enter the Lookup transformation flagged as insert. Inserts a row into the cache if it is new. If the row exists in the index cache but the data cache is different, it updates the cache. If this option is not selected, Informatica inserts new rows and leaves existing rows unchanged.
Update else Insert: This affects only rows that enter the Lookup transformation flagged as update. If the row exists in the cache, Informatica updates the data cache. If the row does not exist in the cache, it inserts a new row. If this option is not selected, Informatica updates rows in the cache and ignores new rows.
Lookup cache dynamic - behavior
Dynamic lookup cache behavior for insert row type
Insert else update option   Row found in cache   Data cache is different   Lookup cache result   NewLookupRow value
Not selected                Yes                  n/a                       No change             0
Not selected                No                   n/a                       Insert                1
Selected                    Yes                  Yes                       Update                2 (0)
Selected                    Yes                  No                        No change             0
Selected                    No                   n/a                       Insert                1
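The table above can be expressed as a small Python function. This is a minimal sketch of the decision logic for rows flagged as insert, using a plain dictionary as the cache; it does not model the "(0)" variant of the update case, and the function and constant names are hypothetical.

```python
NO_CHANGE, INSERT, UPDATE = 0, 1, 2  # NewLookupRow values


def process_insert_row(cache, key, data, insert_else_update):
    """Apply one insert-flagged row to the cache and return NewLookupRow."""
    if key not in cache:
        cache[key] = data          # new row: always inserted
        return INSERT
    if insert_else_update and cache[key] != data:
        cache[key] = data          # row exists and the data differs: update
        return UPDATE
    return NO_CHANGE               # row exists; nothing to do


# Option not selected.
plain_cache = {1: "Ann"}
not_selected_found = process_insert_row(plain_cache, 1, "Anna", insert_else_update=False)
not_selected_new = process_insert_row(plain_cache, 2, "Bob", insert_else_update=False)

# Option selected.
upsert_cache = {1: "Ann"}
selected_different = process_insert_row(upsert_cache, 1, "Anna", insert_else_update=True)
selected_same = process_insert_row(upsert_cache, 1, "Anna", insert_else_update=True)
selected_new = process_insert_row(upsert_cache, 2, "Bob", insert_else_update=True)
```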
Lookup cache dynamic - guidelines
The Lookup transformation must be a connected transformation.
You can only create an equality lookup condition. You cannot look up a range of data in
dynamic cache.
Associate each lookup port that is not in the lookup condition with an input port or a
sequence ID.
When you use a lookup SQL override, make sure you map the correct columns to the
appropriate targets for lookup.
When you add a WHERE clause to the lookup SQL override, use a Filter transformation before
the Lookup transformation so that only rows matching the WHERE clause reach the dynamic cache.
Use Update Strategy transformations after the Lookup transformation to flag the rows for
insert or update for the target.
Use an Update Strategy transformation before the Lookup transformation to define some or
all rows as update if you want to use the Update Else Insert property in the Lookup
transformation.
Set the row type to Data Driven in the session properties.
Select Insert and Update as Update in the target table options of the session properties.
Lookup cache sharing unnamed cache
For example, if you have two instances of the same reusable Lookup
transformation in one mapping and you use the same output ports for both
instances, the Lookup transformations share the lookup cache by default
Shared transformations must use the same ports in the lookup condition. The
conditions can use different operators, but the ports must be the same.
Lookup cache sharing named cache
You can also share the cache between multiple Lookup transformations by using a
persistent lookup cache and naming the cache files.
If the Integration Service finds the cache files and you do not specify to recache
from source, the Integration Service uses the saved cache files.
If the Integration Service does not find the cache files or if you specify to recache
from source, the Integration Service builds the lookup cache from the lookup source.
The Integration Service saves the cache files to disk after it processes each target
load order group.
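The reuse-or-rebuild decision for a named persistent cache can be sketched in Python with a cache file on disk. This is an assumption-laden illustration: the file name, the `load_lookup_cache` helper and the use of `pickle` are choices made for this sketch, and the file is written immediately rather than after a target load order group.

```python
import os
import pickle
import tempfile


def load_lookup_cache(cache_file, read_source, recache_from_source=False):
    """Return (cache, built_from_source) for a named cache file."""
    if os.path.exists(cache_file) and not recache_from_source:
        with open(cache_file, "rb") as fh:
            return pickle.load(fh), False   # reuse the saved cache file
    cache = dict(read_source())             # file missing or recache requested
    with open(cache_file, "wb") as fh:
        pickle.dump(cache, fh)
    return cache, True


source_reads = 0


def read_source():
    """Stand-in for reading the lookup source; counts how often it is read."""
    global source_reads
    source_reads += 1
    return [(1, "Ann"), (2, "Bob")]


with tempfile.TemporaryDirectory() as cache_dir:
    cache_file = os.path.join(cache_dir, "customer_lkp.cache")
    first, first_built = load_lookup_cache(cache_file, read_source)
    second, second_built = load_lookup_cache(cache_file, read_source)
    third, third_built = load_lookup_cache(
        cache_file, read_source, recache_from_source=True
    )
print(first_built, second_built, third_built, source_reads)
```

Only the first run and the explicit recache read the source; the second run is served entirely from the saved file.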
Lookup cache sharing named cache
The Integration Service fails the session if you configure subsequent Lookup transformations
to recache from source, but not the first one in the same target load order group.
If the cache structures do not match, the Integration Service fails the session.
The Integration Service processes multiple sessions simultaneously when the Lookup
transformations only need to read the cache files.
The Integration Service fails the session if one session updates a cache file while another
session attempts to read or update the cache file.
For example, Lookup transformations update the cache file if they are configured to use a dynamic
cache or recache from source.
Lookup cache - Tips
Cache small lookup tables.
Improve session performance by caching small lookup tables. The result of the
lookup query and processing is the same, whether or not you cache the lookup
table.
Use a persistent lookup cache for static lookup tables.
If the lookup table does not change between sessions, configure the Lookup
transformation to use a persistent lookup cache.
The Integration Service then saves and reuses cache files from session to session,
eliminating the time required to read the lookup table.
Take care that the data in a persistent cache does not become stale.
For example: in a daily load, always rebuild a persistent lookup cache first (using the re-cache from
source option) before it is used in other mappings. Re-caching a
persistent lookup picks up any changes in the lookup table.
Lookup cache
Enable caching
Cache directory
Dynamic lookup