
Informatica CDC

At a glance
The article describes a Change Data Capture (CDC) implementation in Informatica using a mapping to track data changes in the source system and load them to the target table. A full outer join is used to identify matching and non-matching records between the source and target tables based on primary keys. Expressions are defined to flag records as insert, update, or delete based on null primary key values. Delete records are flagged with a 'D' in the target table.

The mapping uses a full outer join between the source and target CDC tables. Expressions define flags for insert (null old primary keys), update (different non-key columns but matching primary keys), and delete (null new primary keys). A router groups the records by change type flag.

The change type flag expressions are Flag_DELETE (all primary keys null), Flag_INSERT (all old primary keys null), Flag_UPDATE (matching primary keys but non-key columns different). The overall flag is set using IIF based on the individual flags.

Change Data Capture: Informatica Mapping Logic for CDC Implementation
So, finally, here is the article on CDC (Change Data Capture) implementation in Informatica that I have long been
meaning to post.

Requirement: Track how the data in the source system has changed relative to the data already pulled into the
warehouse, capturing updates, new inserts, and deletes (records that are no longer in the source). I will follow up
this CDC implementation with another article on loading the changes forward into a history table, so that data
users can keep track of the change history.

Mapping Design:

The mapping I came up with for the CDC logic is described below.

The mapping reads from the staging/source table, which is truncated and reloaded daily from the source system and
so reflects the current state of the source transactions. The data is then sorted on the primary key columns. The
sample tables used are listed below; the sample assumes that the target CDC table has the same composite primary
key columns.

Source Table:
Target Table:

The mapping uses simple logic: a Joiner (Full Outer Join) between the source table and the target CDC table, which
checks for matching and non-matching records on the primary key columns, as illustrated below.
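As a rough illustration of what the full outer join produces, here is a minimal Python sketch (not Informatica code). The four key column names come from the expressions in this article; the sample rows, the `Quantity` values, and the helper names `key_of` and `full_outer_join` are made up for the example.

```python
# Composite primary key columns, as used in the article's expressions.
KEY_COLS = ("TransactionID", "ProductID", "ReferenceOrderID", "ReferenceOrderLineID")


def key_of(row):
    """Return the composite primary key of a row as a tuple."""
    return tuple(row[col] for col in KEY_COLS)


def full_outer_join(source_rows, target_rows):
    """Pair rows on the composite key; a missing side is represented by None."""
    source_by_key = {key_of(r): r for r in source_rows}
    target_by_key = {key_of(r): r for r in target_rows}
    all_keys = sorted(set(source_by_key) | set(target_by_key))
    return [(source_by_key.get(k), target_by_key.get(k)) for k in all_keys]


# Hypothetical sample data: key 1 is on both sides, key 2 only in the target,
# key 3 only in the source.
source = [
    {"TransactionID": 1, "ProductID": 10, "ReferenceOrderID": 100,
     "ReferenceOrderLineID": 1, "Quantity": 5},
    {"TransactionID": 3, "ProductID": 30, "ReferenceOrderID": 300,
     "ReferenceOrderLineID": 1, "Quantity": 2},
]
target = [
    {"TransactionID": 1, "ProductID": 10, "ReferenceOrderID": 100,
     "ReferenceOrderLineID": 1, "Quantity": 4},
    {"TransactionID": 2, "ProductID": 20, "ReferenceOrderID": 200,
     "ReferenceOrderLineID": 1, "Quantity": 9},
]

joined = full_outer_join(source, target)
for new_row, old_row in joined:
    print(new_row, old_row)
```

Rows present on only one side come out with `None` on the other, which is what the null checks on the primary key ports rely on.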

Because this is a Full Outer Join, all matching and non-matching transactions flow through. The following three
expressions are then defined to determine the change type:

Flag_DELETE:

ISNULL(TransactionID) AND ISNULL(ProductID) AND ISNULL(ReferenceOrderID) AND
ISNULL(ReferenceOrderLineID)

Flag_INSERT:

ISNULL(TransactionID_old) AND ISNULL(ProductID_old) AND ISNULL(ReferenceOrderID_old) AND
ISNULL(ReferenceOrderLineID_old)

Flag_UPDATE:

TransactionID = TransactionID_old AND
ProductID = ProductID_old AND
ReferenceOrderID = ReferenceOrderID_old AND
ReferenceOrderLineID = ReferenceOrderLineID_old AND
(TransactionDate != TransactionDate_old OR
TransactionType != TransactionType_old OR
Quantity != Quantity_old OR
ABS(ActualCost - ActualCost_old) > 1)
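To make the three checks concrete, here is a hedged Python sketch of the same logic applied to one joiner output row. The column names follow the expressions above; the `joined_row` helper and the sample values are hypothetical, and the sketch assumes the non-key columns are never null on a matched row.

```python
KEY_COLS = ("TransactionID", "ProductID", "ReferenceOrderID", "ReferenceOrderLineID")
ALL_COLS = KEY_COLS + ("TransactionDate", "TransactionType", "Quantity", "ActualCost")


def joined_row(new=None, old=None):
    """Build one joiner output row; a missing side contributes None for every port."""
    row = {}
    for i, col in enumerate(ALL_COLS):
        row[col] = new[i] if new else None
        row[col + "_old"] = old[i] if old else None
    return row


def flag_delete(row):
    """All source-side key ports are null: the record is gone from the source."""
    return all(row[col] is None for col in KEY_COLS)


def flag_insert(row):
    """All target-side (_old) key ports are null: the record is new."""
    return all(row[col + "_old"] is None for col in KEY_COLS)


def flag_update(row):
    """Keys match on both sides and at least one non-key column differs."""
    keys_match = all(
        row[col] is not None and row[col] == row[col + "_old"] for col in KEY_COLS
    )
    if not keys_match:
        return False
    return (
        row["TransactionDate"] != row["TransactionDate_old"]
        or row["TransactionType"] != row["TransactionType_old"]
        or row["Quantity"] != row["Quantity_old"]
        or abs(row["ActualCost"] - row["ActualCost_old"]) > 1
    )


# Hypothetical sample values.
base = (1, 10, 100, 1, "2024-01-01", "S", 5, 20.0)
qty_changed = (1, 10, 100, 1, "2024-01-01", "S", 7, 20.0)
small_cost_drift = (1, 10, 100, 1, "2024-01-01", "S", 5, 20.5)

inserted = joined_row(new=base)
deleted = joined_row(old=base)
updated = joined_row(new=qty_changed, old=base)
unchanged = joined_row(new=base, old=base)
within_tolerance = joined_row(new=small_cost_drift, old=base)
```

Note that the ActualCost check only fires when the cost moves by more than 1, so `within_tolerance` is not treated as an update.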

Flag => IIF(v_Flag_Update=1,-1,IIF(v_Flag_Insert=1,-2,IIF(v_Flag_Delete=1,-3,0)))

where 0 marks the no-change records. A Filter transformation then filters out all the no-change transactions
(Flag=0).
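The nested IIF and the filter step can be sketched in Python as follows. This is only an illustration: the flag codes (-1, -2, -3, 0) come from the expression above, while the `change_flag` function and the sample rows are hypothetical.

```python
def change_flag(is_update, is_insert, is_delete):
    """Mirror the nested IIF: -1 update, -2 insert, -3 delete, 0 no change."""
    if is_update:
        return -1
    if is_insert:
        return -2
    if is_delete:
        return -3
    return 0


# Hypothetical rows whose three individual flags have already been evaluated.
rows = [
    {"TransactionID": 1, "update": True, "insert": False, "delete": False},
    {"TransactionID": 2, "update": False, "insert": True, "delete": False},
    {"TransactionID": 3, "update": False, "insert": False, "delete": True},
    {"TransactionID": 4, "update": False, "insert": False, "delete": False},
]

flagged = [
    {**r, "Flag": change_flag(r["update"], r["insert"], r["delete"])} for r in rows
]

# Equivalent of the Filter transformation: keep only rows with a change.
changed = [r for r in flagged if r["Flag"] != 0]
```

Only the three changed rows continue downstream; the unchanged transaction is dropped before the Router.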

Next, a Router groups the records according to the result of the above expression, depending on whether the
transaction is an INSERT, UPDATE or DELETE.
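Conceptually, the Router splits the filtered rows into one group per flag value. A minimal Python sketch, assuming the flag codes from the expression above; the group names, the `route` function, and the sample rows are made up for illustration:

```python
from collections import defaultdict

# Flag codes from the article mapped to hypothetical group names.
GROUP_BY_FLAG = {-1: "UPDATE", -2: "INSERT", -3: "DELETE"}


def route(rows):
    """Group rows by change type; anything unrecognised goes to DEFAULT."""
    groups = defaultdict(list)
    for row in rows:
        groups[GROUP_BY_FLAG.get(row["Flag"], "DEFAULT")].append(row)
    return dict(groups)


# Hypothetical rows that have already passed the Flag != 0 filter.
rows = [
    {"TransactionID": 1, "Flag": -1},
    {"TransactionID": 2, "Flag": -2},
    {"TransactionID": 3, "Flag": -3},
    {"TransactionID": 5, "Flag": -2},
]

groups = route(rows)
```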


An Update Strategy transformation then follows each group to perform the Update or Insert. Note that even for
DELETE flags I use DD_UPDATE, which updates the CDC flag to D in the target table for transactions that are no
longer available in the source system.

Also note that the Delete flow pulls the ports of the CDC source table from the Router to the target and updates
the records matching on the key columns with CDC_Flag=D.
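The delete handling described above is effectively a soft delete: the row stays in the target and only its CDC flag changes. A small Python sketch of the idea, assuming the target is held as a dictionary keyed by the composite primary key; the `mark_deleted` function and the sample rows are hypothetical:

```python
def mark_deleted(target, deleted_keys):
    """Set CDC_Flag to 'D' for target rows whose key is no longer in the source."""
    for key in deleted_keys:
        if key in target:
            target[key]["CDC_Flag"] = "D"
    return target


# Hypothetical target CDC table keyed by
# (TransactionID, ProductID, ReferenceOrderID, ReferenceOrderLineID).
target = {
    (1, 10, 100, 1): {"Quantity": 5, "CDC_Flag": None},
    (2, 20, 200, 1): {"Quantity": 9, "CDC_Flag": None},
}

# Key (2, 20, 200, 1) came out of the DELETE group of the Router.
mark_deleted(target, [(2, 20, 200, 1)])
```

The row is updated in place rather than removed, which is why a DD_UPDATE is enough for the DELETE group.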

Finally, make the session-level changes for each target, setting Update as Update or Update as Insert as
appropriate. When the mapping runs for the first time, all transactions are tagged as INSERT; subsequent runs
then depend on the source system changes captured by the CDC logic of the mapping.
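The first-run behaviour can be checked with a simplified Python model of the whole comparison. This is a sketch under assumptions: rows are held in dictionaries keyed by a single hypothetical key, any difference in the row counts as an update (the ActualCost tolerance is ignored), and the `classify` function is made up for the example.

```python
def classify(source, target):
    """Return {key: 'INSERT' | 'UPDATE' | 'DELETE'} for changed keys only."""
    changes = {}
    for key in source.keys() | target.keys():
        if key not in target:
            changes[key] = "INSERT"
        elif key not in source:
            changes[key] = "DELETE"
        elif source[key] != target[key]:
            changes[key] = "UPDATE"
    return changes


# Hypothetical source snapshot.
source = {1: {"Quantity": 5}, 2: {"Quantity": 9}}

# First run: the target CDC table is empty, so every row is an INSERT.
first_run = classify(source, {})

# Second run with an unchanged source: nothing is flagged.
target = {key: dict(row) for key, row in source.items()}
second_run = classify(source, target)

# Third run: key 1 changed, key 2 disappeared, key 3 is new.
new_source = {1: {"Quantity": 6}, 3: {"Quantity": 1}}
third_run = classify(new_source, target)
```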

Note: The DELETE transactions are those that are no longer in the source table. Once they have been tagged as D in
CDC_Flag and the change has been applied to the history tracking table, they can be deleted from the target CDC
table as well. This will be covered in more detail in the next article.

I will shortly be posting the next phase of this topic, on CDC and on tracking the history of transaction changes
in your warehouse. I hope you enjoyed reading this and that it helps you. Please post your comments and feedback
so that I can improve the work and clarify the ideas. Thanks!
