ETL Testing or Data Warehouse Testing Tutorial
What is BI?
Business Intelligence is the process of collecting raw business data and turning it
into information that is useful and meaningful. The raw data consists of the records of an
organization's daily transactions, such as interactions with customers, administration of
finance, and management of employees. This data is then used for reporting, analysis,
data mining, data quality and interpretation, and predictive analysis.
https://www.guru99.com/utlimate-guide-etl-datawarehouse-testing.html 1/14
1/10/2019 ETL Testing or Data Warehouse Testing Tutorial
What is ETL?
ETL stands for Extract-Transform-Load, the process by which data is loaded from the
source system into the data warehouse. Data is extracted from an OLTP database,
transformed to match the data warehouse schema, and loaded into the data warehouse
database. Many data warehouses also incorporate data from non-OLTP systems such as text
files, legacy systems and spreadsheets.
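The three steps can be sketched in a few lines. This is only a minimal illustration, not a production pipeline: the `orders` and `fact_orders` tables, their columns, and the `run_etl` helper are hypothetical, and in-memory SQLite databases stand in for the OLTP source and the warehouse.

```python
import sqlite3

def run_etl(source: sqlite3.Connection, warehouse: sqlite3.Connection) -> int:
    """Extract rows from an OLTP table, transform them, and load them into the warehouse."""
    # Extract: read raw rows from the (hypothetical) OLTP orders table.
    rows = source.execute(
        "SELECT order_id, customer, amount_cents FROM orders"
    ).fetchall()

    # Transform: match the warehouse schema (trimmed upper-case names, amounts as decimals).
    transformed = [(oid, cust.strip().upper(), cents / 100.0) for oid, cust, cents in rows]

    # Load: write the transformed rows into the warehouse table.
    warehouse.executemany(
        "INSERT INTO fact_orders (order_id, customer_name, amount) VALUES (?, ?, ?)",
        transformed,
    )
    warehouse.commit()
    return len(transformed)

source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER, customer TEXT, amount_cents INTEGER)")
source.executemany("INSERT INTO orders VALUES (?, ?, ?)", [(1, " alice ", 1250), (2, "Bob", 300)])

warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE fact_orders (order_id INTEGER, customer_name TEXT, amount REAL)")

loaded = run_etl(source, warehouse)
```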
For example, consider a retail store with different departments such as sales, marketing
and logistics. Each department handles customer information independently, and the
way each stores that data is quite different. The sales department stores it by
customer name, while the marketing department stores it by customer id.
Now, if they want to check a customer's history and find out which products he/she
bought in response to different marketing campaigns, it would be very tedious.
ETL addresses this by bringing the data from these sources into the data warehouse
in a uniform structure.
The following diagram gives you the ROAD MAP of the ETL process
(/images/ETL_Testing/ETLTesting_2.png)
1. Extract
2. Transform
Transform data to DW (Data Warehouse) format
Build keys - A key is one or more data attributes that uniquely identify an
entity. The various types of keys are primary key, alternate key, foreign key,
composite key and surrogate key. The data warehouse owns these keys and
never allows any other entity to assign them.
Cleansing of data - After the data is extracted, it moves into the next
phase: cleaning and conforming. Cleaning handles omissions in the data
as well as identifying and fixing errors. Conforming means resolving
conflicts between data that are incompatible, so that the data can be
used in an enterprise data warehouse. In addition, this system creates
metadata that is used to diagnose source system problems and improve
data quality.
3. Load
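The Transform step described above (building warehouse-owned keys, cleansing and conforming) might look roughly like the sketch below. The field names, the country lookup and the `transform` function are assumptions made for illustration; real cleansing rules come from the mapping document.

```python
from itertools import count

# Hypothetical conforming rule: map spelling variants to one country code.
COUNTRY_CODES = {"usa": "US", "u.s.": "US", "us": "US"}

def transform(records):
    """Cleanse and conform source records, then assign warehouse-owned surrogate keys."""
    next_key = count(start=1)   # the warehouse, not the source, generates these keys
    key_by_customer = {}        # natural key -> surrogate key
    rejected = []               # kept as metadata for diagnosing source problems
    output = []
    for rec in records:
        name = (rec.get("name") or "").strip()
        if not name:
            # Cleaning: a record without a name cannot be loaded.
            rejected.append(rec)
            continue
        # Conforming: bring differently formatted values to one representation.
        natural_key = name.lower()
        raw_country = (rec.get("country") or "").strip().lower()
        country = COUNTRY_CODES.get(raw_country, "UNKNOWN")
        if natural_key not in key_by_customer:
            key_by_customer[natural_key] = next(next_key)
        output.append({
            "customer_key": key_by_customer[natural_key],
            "name": name.title(),
            "country": country,
        })
    return output, rejected

rows, rejects = transform([
    {"name": " alice ", "country": "USA"},
    {"name": "ALICE", "country": "u.s."},
    {"name": None, "country": "US"},
])
```

Both spellings of the same customer receive the same surrogate key, and the record without a name ends up in `rejects` rather than in the output.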
Types of ETL Testing
Source to Target Testing (Validation Testing) - This type of testing is carried out to
validate whether the transformed data values are the expected data values.
Data Completeness Testing - Data completeness testing is done to verify that all the
expected data is loaded into the target from the source. Some of the tests that can be
run are comparing and validating counts, aggregates and actual data between the source
and target for columns with simple or no transformation.
Data Accuracy Testing - This testing is done to ensure that the data is accurately
loaded and transformed as expected.
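A data completeness test of the kind described above can be sketched by comparing a row count and an aggregate between source and target. The tables `src_sales` and `tgt_sales` and the helper `completeness_report` are hypothetical, and SQLite is used only so the example is self-contained.

```python
import sqlite3

def completeness_report(conn, source_table, target_table, amount_column):
    """Compare row counts and a column sum between a source and a target table."""
    def stats(table):
        # Table and column names are trusted test inputs here, not user input.
        return conn.execute(
            f"SELECT COUNT(*), COALESCE(SUM({amount_column}), 0) FROM {table}"
        ).fetchone()

    src_count, src_sum = stats(source_table)
    tgt_count, tgt_sum = stats(target_table)
    return {
        "count_match": src_count == tgt_count,
        "sum_match": src_sum == tgt_sum,
        "missing_rows": src_count - tgt_count,
    }

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_sales (id INTEGER, amount INTEGER);
    CREATE TABLE tgt_sales (id INTEGER, amount INTEGER);
    INSERT INTO src_sales VALUES (1, 100), (2, 250), (3, 75);
    INSERT INTO tgt_sales VALUES (1, 100), (2, 250);
""")
report = completeness_report(conn, "src_sales", "tgt_sales", "amount")
```

Here one source row never reached the target, so both the count and the sum comparison fail.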
While performing ETL testing, two documents that an ETL tester will always use are
1. ETL mapping sheets: An ETL mapping sheet contains all the information about the
source and destination tables, including every column and its look-up in reference
tables. ETL testers need to be comfortable with SQL queries, as ETL testing may
involve writing large queries with multiple joins to validate data at any stage of
ETL. ETL mapping sheets are a significant help when writing queries for data
verification.
2. DB schema of source and target: It should be kept handy to verify any detail in
the mapping sheets.
Validation
1. Validate the source and target table structure against the corresponding mapping doc.
2. Source data type and target data type should be the same.
3. Length of data types in both source and target should be equal.
4. Verify that data field types and formats are specified.
5. Source data type length should not be less than the target data type length.
6. Validate the names of the columns in the table against the mapping doc.
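As an illustration of the structure checks above, the sketch below compares a table's columns and declared types with a mapping document. The `MAPPING` dictionary, the `dim_customer` table and `validate_structure` are made up for this example, and `PRAGMA table_info` is specific to SQLite; other databases expose the same details through their own catalog views.

```python
import sqlite3

# Hypothetical mapping document: target column -> expected declared type.
MAPPING = {"customer_id": "INTEGER", "customer_name": "VARCHAR(50)"}

def validate_structure(conn, table, mapping):
    """Return a list of mismatches between a table's columns and the mapping document."""
    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    actual = {row[1]: row[2].upper() for row in conn.execute(f"PRAGMA table_info({table})")}
    problems = []
    for column, expected_type in mapping.items():
        if column not in actual:
            problems.append(f"missing column: {column}")
        elif actual[column] != expected_type.upper():
            problems.append(f"{column}: expected {expected_type}, found {actual[column]}")
    for column in sorted(actual.keys() - mapping.keys()):
        problems.append(f"unexpected column: {column}")
    return problems

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dim_customer (customer_id INTEGER, customer_name VARCHAR(30))")
issues = validate_structure(conn, "dim_customer", MAPPING)
```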
Completeness Issues
1. Ensure that all expected data is loaded into the target table.
2. Compare record counts between source and target.
3. Check for any rejected records.
4. Check that data is not truncated in the columns of the target tables.
5. Perform boundary value analysis.
6. Compare unique values of key fields between the data loaded into the warehouse and the source data.
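Two of the completeness checks above, comparing key values and detecting truncation, can be sketched as SQL queries run from Python. The `src_customer` and `tgt_customer` tables are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (customer_id INTEGER, name TEXT);
    CREATE TABLE tgt_customer (customer_id INTEGER, name TEXT);
    INSERT INTO src_customer VALUES (1, 'Alexandria'), (2, 'Bob'), (3, 'Carla');
    INSERT INTO tgt_customer VALUES (1, 'Alexand'), (2, 'Bob');
""")

# Keys present in the source but absent from the target (missing or rejected records).
missing_keys = [row[0] for row in conn.execute("""
    SELECT customer_id FROM src_customer
    EXCEPT
    SELECT customer_id FROM tgt_customer
    ORDER BY customer_id
""")]

# Rows whose target value is shorter than the source value (possible truncation).
truncated = conn.execute("""
    SELECT s.customer_id, s.name, t.name
    FROM src_customer AS s
    JOIN tgt_customer AS t ON t.customer_id = s.customer_id
    WHERE LENGTH(t.name) < LENGTH(s.name)
""").fetchall()
```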
Correctness Issues
1. Data that is misspelled or inaccurately recorded
2. Null, non-unique or out-of-range data
Transformation
Data Quality
1. Number check: Validate that numeric fields contain valid numbers.
2. Date check: Dates have to follow the date format, and the format should be the same across all records.
3. Precision check
4. Data check
5. Null check
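The number, date, precision and null checks listed above could be expressed per record as in the following sketch. The field names `order_date` and `amount`, the date format and the two-decimal limit are assumptions; the actual rules would come from the business requirements.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

def quality_issues(row, date_format="%Y-%m-%d", max_decimals=2):
    """Return the data quality problems found in one record (a dict of strings)."""
    issues = []
    # Null check: required fields must be present and non-empty.
    for field in ("order_date", "amount"):
        if row.get(field) in (None, ""):
            issues.append(f"{field}: null")
    # Number check and precision check.
    if row.get("amount"):
        try:
            amount = Decimal(row["amount"])
        except InvalidOperation:
            issues.append("amount: not a number")
        else:
            if not amount.is_finite():
                issues.append("amount: not a number")
            elif -amount.as_tuple().exponent > max_decimals:
                issues.append("amount: too many decimal places")
    # Date check: every record must use the same date format.
    if row.get("order_date"):
        try:
            datetime.strptime(row["order_date"], date_format)
        except ValueError:
            issues.append("order_date: wrong date format")
    return issues

clean = quality_issues({"order_date": "2019-01-10", "amount": "19.99"})
dirty = quality_issues({"order_date": "10/01/2019", "amount": "12.345"})
```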
Duplicate Check
1. Validate that the unique key, primary key and any other column that should be unique as per the business requirements do not have any duplicate rows.
2. Check whether any duplicate values exist in any column that is extracted from multiple columns in the source and combined into one column.
3. As per the client requirements, ensure that there are no duplicates in a combination of multiple columns within the target only.
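A common way to run the duplicate checks above is a GROUP BY query with a HAVING clause, applied to a single key column or to a combination of columns. The table `tgt_orders` and the helper `find_duplicates` below are illustrative only.

```python
import sqlite3

def find_duplicates(conn, table, key_columns):
    """Return key combinations that occur more than once in the table."""
    cols = ", ".join(key_columns)
    query = (
        f"SELECT {cols}, COUNT(*) AS occurrences FROM {table} "
        f"GROUP BY {cols} HAVING COUNT(*) > 1 ORDER BY {cols}"
    )
    return conn.execute(query).fetchall()

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tgt_orders (order_id INTEGER, customer_id INTEGER, product TEXT);
    INSERT INTO tgt_orders VALUES
        (1, 10, 'pen'), (2, 10, 'pen'), (3, 11, 'ink'), (3, 12, 'cap');
""")

# Duplicate values in a column that should be unique.
dup_ids = find_duplicates(conn, "tgt_orders", ["order_id"])
# Duplicates in a combination of multiple columns.
dup_pairs = find_duplicates(conn, "tgt_orders", ["customer_id", "product"])
```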
Complete Data Validation
1. To validate the complete data set in the source and target tables, a minus query is the best solution.
2. We need to run both source minus target and target minus source.
3. If a minus query returns any rows, those should be considered mismatching rows.
4. Match rows between source and target using an intersect statement.
5. The count returned by intersect should match the individual counts of the source and target tables.
6. If the minus query returns no rows but the intersect count is less than the source count or the target count, duplicate rows exist.
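The minus and intersect comparison above can be sketched as follows. Note that SQLite (like PostgreSQL and SQL Server) spells the minus operator `EXCEPT`, while Oracle uses `MINUS`; the `src` and `tgt` tables are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src (id INTEGER, city TEXT);
    CREATE TABLE tgt (id INTEGER, city TEXT);
    INSERT INTO src VALUES (1, 'Pune'), (2, 'Oslo'), (3, 'Lima');
    INSERT INTO tgt VALUES (1, 'Pune'), (2, 'Oslo'), (3, 'Rome');
""")

def rows(sql):
    """Run a query and return all result rows."""
    return conn.execute(sql).fetchall()

# Source minus target, and target minus source: any row returned is a mismatch.
source_minus_target = rows("SELECT * FROM src EXCEPT SELECT * FROM tgt")
target_minus_source = rows("SELECT * FROM tgt EXCEPT SELECT * FROM src")

# Rows that match exactly in both tables.
matching = rows("SELECT * FROM src INTERSECT SELECT * FROM tgt ORDER BY id")
source_count = rows("SELECT COUNT(*) FROM src")[0][0]
```

Because `EXCEPT` and `INTERSECT` return distinct rows, an intersect count lower than the table counts while both minus queries are empty points to duplicate rows.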
Type of bug - Description
Input/Output bugs - Valid values not accepted; invalid values accepted
Calculation bugs - Mathematical errors; final output is wrong
H/W bugs - Device is not responding to the application
Help Source bugs - Mistakes in help documents
ETL checks:
Verifies whether data is moved as expected
Verifies whether counts in the source and target are matching
Verifies whether the data transformed is as per expectation
Verifies that the foreign-primary key relations are preserved during the ETL
Verifies for duplication in loaded data
Data model checks:
The primary goal is to check whether the data follows the rules/standards defined in the Data Model
Verifies that there are no orphan records and that foreign-primary key relations are maintained
Verifies that there are no redundant tables and that the database is optimally normalized
Verifies whether data is missing in columns where it is required
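The orphan record and foreign-primary key checks mentioned above are often written as a left join that looks for child rows without a parent. The `dim_customer` and `fact_sales` tables below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY, customer_id INTEGER, amount INTEGER);
    INSERT INTO dim_customer VALUES (1, 'Alice'), (2, 'Bob');
    INSERT INTO fact_sales VALUES (100, 1, 50), (101, 2, 75), (102, 9, 20);
""")

# Orphans: fact rows whose foreign key has no matching primary key in the dimension.
orphans = conn.execute("""
    SELECT f.sale_id, f.customer_id
    FROM fact_sales AS f
    LEFT JOIN dim_customer AS d ON d.customer_id = f.customer_id
    WHERE d.customer_id IS NULL
""").fetchall()
```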