
Data Profile Demo


Introduction

• Data profiling is used to understand the data patterns and exceptions of a source system. It can be used to learn more about the source data and to validate documented business rules about that data. By using data profiling we can identify the different patterns of data arriving in the source system.
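As an illustrative sketch (not part of the original deck), column-level profiling of this kind can be expressed in a few lines of Python; the sample values below are hypothetical:

```python
from collections import Counter

def profile_column(values):
    """Compute simple profile statistics for one source column:
    null count, distinct count, and the most common value patterns."""
    non_null = [v for v in values if v is not None]
    # Reduce each value to a pattern: digits -> '9', letters -> 'A'
    patterns = Counter(
        "".join("9" if c.isdigit() else "A" if c.isalpha() else c for c in str(v))
        for v in non_null
    )
    return {
        "null_count": len(values) - len(non_null),
        "distinct_count": len(set(non_null)),
        "top_patterns": patterns.most_common(3),
    }

# Profiling a hypothetical account-number column
stats = profile_column(["A100", "A101", "B20x", None])
```

The pattern statistic is what lets a profile flag values that deviate from the documented business rules.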

Profile warehouse

• Data profiling has its own warehouse structure.

• All profiling information is stored in the profiling warehouse.

• The data profiling database has 33 tables and 24 views, through which all reports are retrieved.
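A minimal sketch of the idea of retrieving reports through warehouse views, using SQLite as a stand-in; the table, view, and column names here are hypothetical, not Informatica's actual profiling schema:

```python
import sqlite3

# Stand-in for the profiling warehouse. The table/view names and
# columns below are hypothetical, not the real Informatica schema.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE profile_run (
        profile_name TEXT,
        column_name  TEXT,
        null_count   INTEGER,
        row_count    INTEGER
    )
""")
conn.execute("INSERT INTO profile_run VALUES ('SRC_PROFILE', 'ACCOUNT', 0, 1000)")
conn.execute("INSERT INTO profile_run VALUES ('SRC_PROFILE', 'AMOUNT',  5, 1000)")

# Reports read through a view over the base tables, as the deck
# describes for the profiling warehouse.
conn.execute("""
    CREATE VIEW v_null_report AS
    SELECT profile_name, column_name, null_count
    FROM profile_run WHERE null_count > 0
""")
rows = conn.execute("SELECT * FROM v_null_report").fetchall()
```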

Data Profiling Framework

• Framework designed for auditing using data profiling

[Diagram: the data load process being audited moves data from SOURCE to TARGET via an Informatica process; a pre-session command fetches PROFILE_START_TIME and PROFILE_END_TIME. The auditing framework runs the data profiling process (Informatica data profiler generated mappings) over a source mapplet profile and a target mapplet profile, backed by a master table (keyed by process name), a detail table (keyed by function), the repository, and an audit table. Results are viewed through Informatica MSTR reports and Power Analyzer reports.]
Data Profiling

Problem
• Auditing the daily ETL load manually is cumbersome.

Design Considerations
• A generic solution
• An audit warehouse that stores historical audit information
• Reduced development effort in building the auditing process for new projects

Solution Design
• Develop a generic auditing framework using the data profiling components of Informatica

Data Profiling Steps
Steps Done

1. As the Special Ledger process is an incremental load, the source and target cannot be profiled directly. A mapplet is needed to hold the incremental logic.

2. Implement the business logic in the source mapplet as well:

   Credit = IIF (DRCRK='H', TSL+HSL+KSL, 0)
   Debit  = IIF (DRCRK='S', TSL+HSL+KSL, 0)
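The two IIF expressions translate directly; as a sketch (in Python rather than Informatica expression language, column names as in the deck):

```python
def credit_debit(drcrk, tsl, hsl, ksl):
    """Mirror of the mapplet expressions:
    Credit = IIF(DRCRK='H', TSL+HSL+KSL, 0)
    Debit  = IIF(DRCRK='S', TSL+HSL+KSL, 0)"""
    total = tsl + hsl + ksl
    credit = total if drcrk == "H" else 0
    debit = total if drcrk == "S" else 0
    return credit, debit

# 'H' rows contribute to credit, 'S' rows to debit
credit, debit = credit_debit("H", 10, 20, 30)
```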

3. Create profiles on the output of the source mapplet and the target mapplet, which in turn creates the profile mappings.

4. Master information about the profile is entered manually in the master table.

5. Data profiler mappings extract (incremental) data from the chosen source and target and load it into the data profiler warehouse, which stores the audit-processed data.
6. A generic ETL process is built to extract data from the data profiler warehouse into a generically designed DETAIL table (for both the source profile and the target profile).

7. A generic ETL process is built to compare the source and target profiles and load the result into the AUDIT result table, which stores the overall audit outcome at a higher level: 'Y' if the ETL process succeeded (source and target data match), 'N' if it failed or only partially succeeded.

8. For a chosen audit area and date of processing, the audit result ('Y' or 'N') can be viewed at a high level in reports. The report can be drilled down to a lower level to view detailed information on which account data does not match.
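Steps 7 and 8 can be sketched as follows: compare per-account profile measures from source and target, flag 'Y' only when everything matches, and keep the mismatches for drill-down. The measure and account values are hypothetical:

```python
def audit(source_profile, target_profile):
    """Compare per-account profile measures; return the overall flag
    ('Y' if everything matches, else 'N') plus per-account mismatches
    for drill-down reporting."""
    mismatches = {
        key: (source_profile.get(key), target_profile.get(key))
        for key in set(source_profile) | set(target_profile)
        if source_profile.get(key) != target_profile.get(key)
    }
    return ("Y" if not mismatches else "N"), mismatches

# Hypothetical per-account credit totals from the two profiles
src = {"1001": 60, "1002": 45}
tgt = {"1001": 60, "1002": 40}
flag, detail = audit(src, tgt)
```

The overall flag drives the high-level report; the mismatch dictionary is what a drill-down report would show per account.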

THANK YOU

- Franklin.D
