Informatica CDC
CDC Implementation
So, finally, here is an article on implementing CDC (Change Data Capture) through Informatica, which I have been
meaning to post for a long time.
Requirement: Track changes in the source system against the data previously pulled into the warehouse,
capturing Updates, new Inserts, and Deletes (records that are no longer in the source). I will follow this CDC
implementation with another article on loading the data forward into a table that stores the change history, so
that data users can keep track of it.
Mapping Design:
The mapping I have come up with for the CDC logic is described below.
The mapping involves a staging/source table that is truncated and reloaded daily from the source system as is,
holding the current status of the source transactions. The data is then sorted on the primary key columns. The
sample tables used are shown below; the target CDC table has the same composite primary key columns as the
source.
Source Table:
Target Table:
The mapping uses simple logic: a Joiner (Full Outer Join) between the source table and the target CDC table,
checking for matching and non-matching records on the primary key columns, as illustrated below.
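To make the join step concrete outside of Informatica, here is a minimal Python sketch of a full outer join on the composite key. It is only an illustration of the idea, not the Joiner itself: the helper names (key_of, full_outer_join) and the sample values are made up, and only the key column names come from this article.

```python
# Composite primary key columns, as named in the article.
KEY_COLS = ("TransactionID", "ProductID", "ReferenceOrderID", "ReferenceOrderLineID")


def key_of(row):
    """Return the composite primary key of a row as a tuple."""
    return tuple(row[col] for col in KEY_COLS)


def full_outer_join(source_rows, target_rows):
    """Pair source and target rows on the composite key.

    Returns a list of (source_row, target_row) tuples. The missing side
    is None, which stands in for the NULL ports a full outer join yields.
    """
    src = {key_of(r): r for r in source_rows}
    tgt = {key_of(r): r for r in target_rows}
    return [(src.get(k), tgt.get(k)) for k in sorted(src.keys() | tgt.keys())]


# Hypothetical sample data (values are invented for illustration).
source = [
    {"TransactionID": 1, "ProductID": 10, "ReferenceOrderID": 100,
     "ReferenceOrderLineID": 1, "Quantity": 5},
    {"TransactionID": 2, "ProductID": 11, "ReferenceOrderID": 101,
     "ReferenceOrderLineID": 1, "Quantity": 3},
]
target = [
    {"TransactionID": 2, "ProductID": 11, "ReferenceOrderID": 101,
     "ReferenceOrderLineID": 1, "Quantity": 4},
    {"TransactionID": 3, "ProductID": 12, "ReferenceOrderID": 102,
     "ReferenceOrderLineID": 1, "Quantity": 7},
]

pairs = full_outer_join(source, target)
for s, t in pairs:
    print("source only" if t is None else "target only" if s is None else "both sides")
```

A pair with no target side is a candidate insert, a pair with no source side is a candidate delete, and a pair with both sides still has to be compared column by column.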
Being a Full Outer Join, it lets all matching and non-matching transactions flow through. The following three
expressions should then be defined to determine the change type,
Flag_DELETE:
Flag_INSERT:
Flag_UPDATE:
TransactionID=TransactionID_old AND
ProductID=ProductID_old AND
ReferenceOrderID=ReferenceOrderID_old AND
ReferenceOrderLineID=ReferenceOrderLineID_old AND
(TransactionDate != TransactionDate_old OR
TransactionType != TransactionType_old OR
Quantity != Quantity_old OR
ABS(ActualCost - ActualCost_old)>1)
where 0 marks the no-change records. A Filter transformation is then used to filter out all the no-change
transactions with Flag=0.
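The flag logic and the filter can be sketched in Python as follows. This is a rough illustration under a few assumptions of mine: it collapses the three flag ports into one value, it treats a missing target side as an INSERT and a missing source side as a DELETE (in line with the requirement stated at the top), and the letters "I" and "U" are my own labels. The update test follows the Flag_UPDATE expression above, including the ActualCost tolerance of 1.

```python
# Non-key columns compared for an update, taken from Flag_UPDATE above.
COMPARE_COLS = ("TransactionDate", "TransactionType", "Quantity")


def change_flag(src, tgt):
    """Return 'I', 'U', 'D', or 0 (no change) for one joined pair.

    src / tgt are dicts, or None when that side of the full outer join
    had no matching row. Assumption: a single combined flag is used here
    instead of three separate flag ports.
    """
    if tgt is None:
        return "I"  # key exists only in the source: new record
    if src is None:
        return "D"  # key exists only in the target: gone from the source
    attrs_differ = any(src[col] != tgt[col] for col in COMPARE_COLS)
    cost_differs = abs(src["ActualCost"] - tgt["ActualCost"]) > 1
    return "U" if attrs_differ or cost_differs else 0


# Hypothetical joined pairs (values invented for illustration).
old = {"TransactionDate": "2012-01-01", "TransactionType": "S",
       "Quantity": 5, "ActualCost": 100.0}
pairs = [
    ({**old}, None),                          # only in source
    (None, {**old}),                          # only in target
    ({**old, "Quantity": 6}, {**old}),        # quantity changed
    ({**old, "ActualCost": 100.5}, {**old}),  # cost moved by less than 1
]

flags = [change_flag(s, t) for s, t in pairs]

# Filter step: drop the no-change pairs (flag 0).
changed = [(s, t) for (s, t), f in zip(pairs, flags) if f != 0]
```

Note that the last pair is treated as unchanged, because a cost difference of 0.5 is inside the tolerance.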
Then a Router defines the groups based on the result of the above expressions, checking whether the transaction
is an INSERT, UPDATE or DELETE.
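As a small sketch of what the Router does, the Python below sends each row to every group whose condition it satisfies. The group names, the single "Flag" field and its "I"/"U"/"D" values are assumptions made for the illustration; in the actual mapping the group conditions would be written against the flag ports defined above.

```python
# Hypothetical group conditions, one per Router output group.
GROUPS = {
    "INSERT": lambda row: row["Flag"] == "I",
    "UPDATE": lambda row: row["Flag"] == "U",
    "DELETE": lambda row: row["Flag"] == "D",
}


def route(rows):
    """Place each row in every group whose condition evaluates to true."""
    routed = {name: [] for name in GROUPS}
    for row in rows:
        for name, condition in GROUPS.items():
            if condition(row):
                routed[name].append(row)
    return routed


# Hypothetical flagged rows that have already passed the no-change filter.
rows = [
    {"TransactionID": 1, "Flag": "I"},
    {"TransactionID": 2, "Flag": "U"},
    {"TransactionID": 3, "Flag": "D"},
    {"TransactionID": 4, "Flag": "U"},
]

routed = route(rows)
```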
An Update Strategy transformation for each group then performs an Update or an Insert. Note that even for the
DELETE flags I go for a DD_UPDATE, which updates the CDC flag to D in the target table for the transactions no
longer available in the source system.
Also note that the Delete flow involves pulling the ports of the CDC source table from the Router to the target
and updating the records that match on the key columns with CDC_Flag=D.
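The effect of the three flows on the target can be sketched in Python like this. It is a simplified model, not the Update Strategy itself: the target is a plain dict keyed by the composite key, the function name apply_changes is made up, and the flag values I and U are my assumption (the article only fixes D). The point it shows is that a delete is a soft delete: the old row stays and only CDC_Flag changes.

```python
KEY_COLS = ("TransactionID", "ProductID", "ReferenceOrderID", "ReferenceOrderLineID")


def key_of(row):
    """Return the composite primary key of a row as a tuple."""
    return tuple(row[col] for col in KEY_COLS)


def apply_changes(target, inserts, updates, deletes):
    """Apply the routed groups to the target (a dict: key -> row), in place.

    inserts / updates carry the current source values; deletes carry the
    previously loaded values, since those rows no longer exist in the source.
    """
    for row in inserts:
        target[key_of(row)] = {**row, "CDC_Flag": "I"}   # like DD_INSERT
    for row in updates:
        target[key_of(row)] = {**row, "CDC_Flag": "U"}   # like DD_UPDATE
    for old_row in deletes:
        # Also an update, not a physical delete: only the flag changes.
        target[key_of(old_row)]["CDC_Flag"] = "D"
    return target


def make_row(txn_id, qty):
    """Build a hypothetical row; only TransactionID and Quantity vary."""
    return {"TransactionID": txn_id, "ProductID": 10, "ReferenceOrderID": 100,
            "ReferenceOrderLineID": 1, "Quantity": qty}


# Target as left by an earlier run (values invented for illustration).
target = {
    key_of(make_row(2, 4)): {**make_row(2, 4), "CDC_Flag": "I"},
    key_of(make_row(3, 7)): {**make_row(3, 7), "CDC_Flag": "I"},
}

apply_changes(
    target,
    inserts=[make_row(1, 5)],
    updates=[make_row(2, 3)],
    deletes=[make_row(3, 7)],
)
```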
Finally, make the session-level changes for each target to define Update as Update or Update as Insert, as
appropriate. On the first run of the mapping all transactions are tagged as INSERT; subsequent runs then depend
on the source system changes captured through the CDC logic of our mapping.
Note: The DELETE transactions are the ones that are no longer in the source table. Once they have been tagged
as D in CDC_Flag, and once that change has been used for the History tracking table, they can be deleted from
the target CDC table as well. This will be covered in more detail in the next article.
I will shortly be posting the next phase of this topic, on CDC and on tracking the history of transaction changes
in your warehouse. I hope you enjoyed reading this and that it helps you. Please post your comments and
feedback so I can improve the work and make the ideas clearer. Thanks!