Optimization of Lookup Transformation
DECLARATION
Balaji Subramanian
MHRSINFA Project
Date: 07th December, 2004
ACKNOWLEDGEMENTS
I take this opportunity to thank the teammates who provided many of the inputs
used to create this BOK.
Thank You,
Balaji Subramanian
mailto: balaji_subramanian@infosys.com
INDEX
DECLARATION
ACKNOWLEDGEMENTS
1. LOOKUP TRANSFORMATION OVERVIEW
2. LOOKUP PROPERTIES
3. LOOKUP CACHE
4. LOOKUP TRANSFORMATION TIPS
1. LOOKUP TRANSFORMATION OVERVIEW
The Informatica Server queries the lookup table based on the lookup ports in the
transformation. It compares Lookup transformation port values to lookup table
column values based on the lookup condition. We can configure the Lookup
transformation to perform different types of lookups. We can configure the
transformation to be connected or unconnected, cached or uncached.
CACHED OR UNCACHED
We can configure a Lookup transformation to cache the lookup table. The Informatica
Server builds a cache in memory when it processes the first row of data in a cached
Lookup transformation. It allocates memory for the cache based on the amount we
configure in the transformation or session properties. The Informatica Server stores
condition values in the index cache and output values in the data cache. The
Informatica Server queries the cache for each row that enters the transformation.
2. LOOKUP PROPERTIES
Properties for the Lookup transformation identify the database source, how the
Informatica Server processes the transformation, and how it handles caching and
multiple matches.
LOOKUP CACHING ENABLED
Indicates whether the Informatica Server caches lookup values during the session.
When we enable lookup caching, the Informatica Server queries the lookup table
once, caches the values, and looks up values in the cache during the session. This
can improve session performance.
When we disable caching, each time a row passes into the transformation, the
Informatica Server issues a select statement to the lookup table for lookup values.
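The difference in query volume can be illustrated outside Informatica. The following is a minimal Python sketch, not Informatica code: it uses an in-memory SQLite table as a stand-in lookup table, and the table name dept, its columns, and the sample rows are all hypothetical. It counts the SELECT statements issued with and without a cache.

```python
import sqlite3


def build_lookup_table(conn):
    """Create a small hypothetical lookup table for the demonstration."""
    conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
    conn.executemany(
        "INSERT INTO dept VALUES (?, ?)",
        [(10, "Finance"), (20, "Sales"), (30, "Support")],
    )


def uncached_lookup(conn, source_rows):
    """Issue one SELECT per source row, as happens with caching disabled."""
    queries = 0
    results = []
    for dept_id in source_rows:
        row = conn.execute(
            "SELECT dept_name FROM dept WHERE dept_id = ?", (dept_id,)
        ).fetchone()
        queries += 1
        results.append(row[0] if row else None)
    return results, queries


def cached_lookup(conn, source_rows):
    """Issue one SELECT in total, then resolve every row from memory."""
    cache = dict(conn.execute("SELECT dept_id, dept_name FROM dept"))
    queries = 1
    results = [cache.get(dept_id) for dept_id in source_rows]
    return results, queries


conn = sqlite3.connect(":memory:")
build_lookup_table(conn)
source_rows = [10, 20, 10, 30, 99, 20]  # 99 has no match in the lookup table

uncached_results, uncached_queries = uncached_lookup(conn, source_rows)
cached_results, cached_queries = cached_lookup(conn, source_rows)
print(uncached_queries, cached_queries)
```

Both functions return the same lookup values; only the number of round trips to the lookup table differs, which is where the performance gain of caching comes from.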
LOOKUP DATA CACHE SIZE
Indicates the maximum size the Informatica Server allocates to the data cache in
memory. If the Informatica Server cannot allocate the configured amount of memory
when initializing the session, it fails the session. When the Informatica Server cannot
store all the data cache data in memory, it pages to disk as necessary.
The Lookup Data Cache Size is 2,000,000 bytes by default. The minimum
size is 1,024 bytes. Use only with the lookup cache enabled.
For optimized performance, the data cache size should be the total size of all
fields multiplied by the number of rows in the lookup table.
LOOKUP INDEX CACHE SIZE
Indicates the maximum size the Informatica Server allocates to the index cache in
memory. If the Informatica Server cannot allocate the configured amount of memory
when initializing the session, it fails the session. When the Informatica Server cannot
store all the index cache data in memory, it pages to disk as necessary.
The Lookup Index Cache Size is 1,000,000 bytes by default. The
minimum size is 1,024 bytes. Use only with the lookup cache enabled.
For optimized performance, the index cache size should be the total size of the
condition fields multiplied by the number of rows in the lookup table.
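The sizing rules for the data cache and the index cache can be worked through as simple arithmetic. The sketch below is illustrative only: the function name, the field widths, and the row count are assumptions, and the result is a rough estimate that ignores any per-row overhead the Informatica Server may add.

```python
# Defaults quoted in this document, in bytes.
DEFAULT_DATA_CACHE_BYTES = 2_000_000
DEFAULT_INDEX_CACHE_BYTES = 1_000_000


def estimate_cache_sizes(all_field_bytes, condition_field_bytes, row_count):
    """Apply the two rules of thumb: field widths (in bytes) times row count."""
    data_bytes = sum(all_field_bytes) * row_count
    index_bytes = sum(condition_field_bytes) * row_count
    return data_bytes, index_bytes


# Hypothetical lookup table: 100,000 rows, three fields of 8, 40 and 16 bytes,
# with the 8-byte field used in the lookup condition.
data_bytes, index_bytes = estimate_cache_sizes(
    all_field_bytes=[8, 40, 16],
    condition_field_bytes=[8],
    row_count=100_000,
)

data_needs_increase = data_bytes > DEFAULT_DATA_CACHE_BYTES
index_needs_increase = index_bytes > DEFAULT_INDEX_CACHE_BYTES
print(data_bytes, index_bytes, data_needs_increase, index_needs_increase)
```

In this hypothetical case the estimated data cache exceeds the default size while the index cache fits within it, so only the data cache setting would need to be raised to avoid paging to disk.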
3. LOOKUP CACHE
The Informatica Server creates the cache files by default in the $PMCacheDir. If the
data does not fit in the memory cache, the Informatica Server stores the overflow
values in the cache files. When the session completes, the Informatica Server
releases cache memory and deletes the cache files unless we configure the Lookup
transformation to use a persistent cache.
When configuring a lookup cache, we can specify any of the following options:
PERSISTENT CACHE
If we want to save and reuse the cache files, we can configure the transformation to
use a persistent cache. Use a persistent cache when the lookup table does not
change between session runs. The first time the Informatica Server runs a session
using a persistent lookup cache, it saves the cache files to disk instead of deleting
them. The next time the Informatica Server runs the session, it builds the memory
cache from the cache files.
If the persistent cache is not synchronized with the lookup table, we can configure
the Lookup transformation to rebuild the lookup cache.
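The save, reuse, and rebuild behaviour of a persistent cache can be sketched in plain Python. This is an analogy, not the Informatica cache file format: the function load_or_build_cache, the rebuild flag, the file name, and the sample rows are all hypothetical.

```python
import os
import pickle
import tempfile


def load_or_build_cache(cache_path, read_lookup_table, rebuild=False):
    """Reuse a saved cache file if one exists, otherwise read the table and save it."""
    if not rebuild and os.path.exists(cache_path):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)
    cache = dict(read_lookup_table())
    with open(cache_path, "wb") as fh:
        pickle.dump(cache, fh)
    return cache


stats = {"table_reads": 0}  # counts how often the lookup table is actually read


def read_lookup_table():
    """Stand-in for the query against the lookup table."""
    stats["table_reads"] += 1
    return [(10, "Finance"), (20, "Sales")]


cache_path = os.path.join(tempfile.mkdtemp(), "dept_lookup.cache")

first_run = load_or_build_cache(cache_path, read_lookup_table)   # reads the table
second_run = load_or_build_cache(cache_path, read_lookup_table)  # reads the file
reads_after_two_runs = stats["table_reads"]

# Force a rebuild, as we would when the cache is out of sync with the table.
rebuilt = load_or_build_cache(cache_path, read_lookup_table, rebuild=True)
reads_after_rebuild = stats["table_reads"]
print(reads_after_two_runs, reads_after_rebuild)
```

The second run never touches the lookup table, which is the saving a persistent cache offers; the explicit rebuild restores correctness when the table has changed.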
STATIC CACHE
We can configure a static, or read-only, cache for any lookup table. By default, the
Informatica Server creates a static cache. It caches the lookup table and looks up
values in the cache for each row that comes into the transformation. When the
lookup condition is true, the Informatica Server returns a value from the lookup
cache. The Informatica Server does not update the cache while it processes the
Lookup transformation.
DYNAMIC CACHE
We can configure the Lookup transformation to use a dynamic cache when the target
table is also the lookup table. When we use a dynamic cache, the Informatica Server
updates the lookup cache as it passes rows to the target.
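The idea of keeping the cache in step with the target can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the Informatica implementation: the function name, the split into inserts and updates, and the sample rows are hypothetical.

```python
def load_with_dynamic_cache(source_rows, target_rows):
    """Pass rows to the target while keeping the lookup cache in sync with it."""
    cache = dict(target_rows)  # the cache starts as a copy of the target table
    inserts, updates = [], []
    for key, value in source_rows:
        if key not in cache:
            cache[key] = value            # new key: insert and remember it
            inserts.append((key, value))
        elif cache[key] != value:
            cache[key] = value            # changed value: update and remember it
            updates.append((key, value))
        # unchanged rows need no action
    return cache, inserts, updates


target_rows = [(1, "Ann")]
source_rows = [(1, "Ann"), (2, "Bob"), (2, "Bob"), (1, "Anne")]

cache, inserts, updates = load_with_dynamic_cache(source_rows, target_rows)
print(inserts, updates)
```

Because the cache is updated as each row passes through, the second (2, "Bob") row is recognised as already present. With a static cache it would look like a new row again and be inserted twice.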
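SHARED CACHE
We can share the lookup cache between multiple Lookup transformations.
4. LOOKUP TRANSFORMATION TIPS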
If we include more than one lookup condition, place the conditions with
an equal sign first to optimize lookup performance.
For optimal performance, the default buffer block size should be the total size
of all fields multiplied by a factor of 20 to 100.
Use a persistent lookup cache for static lookup tables. If the lookup table does
not change between sessions, configure the Lookup transformation to use a
persistent lookup cache. The Informatica Server then saves and reuses cache files
from session to session, eliminating the time required to read the lookup table.
When the source is large, cache lookup table columns for lookup tables of
500,000 rows or fewer.
Cache a lookup table only if the number of lookup calls is more than 10-20% of
the lookup table rows.
For small lookup tables (fewer than 5,000 rows), cache if there are more than
5-10 lookup calls.
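The caching thresholds in these tips can be combined into a rough decision helper. The sketch below is one possible reading of them, not an Informatica feature: the function name, the order in which the rules are applied, and the choice of the upper end of each quoted range (20% and 10 calls) as the default are all assumptions.

```python
def should_cache(
    lookup_rows,
    lookup_calls,
    small_table_rows=5_000,
    small_table_min_calls=10,
    max_cached_rows=500_000,
    call_ratio=0.20,
):
    """Return True if the rules of thumb above suggest caching the lookup table."""
    if lookup_rows < small_table_rows:
        # Small table: caching pays off after only a handful of calls.
        return lookup_calls > small_table_min_calls
    if lookup_rows > max_cached_rows:
        # Very large table: building the cache is assumed to cost too much.
        return False
    # Otherwise cache only when calls exceed the chosen share of table rows.
    return lookup_calls > call_ratio * lookup_rows


decisions = [
    should_cache(lookup_rows=1_000, lookup_calls=50),
    should_cache(lookup_rows=1_000, lookup_calls=3),
    should_cache(lookup_rows=100_000, lookup_calls=5_000),
    should_cache(lookup_rows=100_000, lookup_calls=50_000),
    should_cache(lookup_rows=2_000_000, lookup_calls=1_000_000),
]
print(decisions)
```

The thresholds are exposed as parameters so that they can be tuned against measured session run times; they should not be treated as fixed limits.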