DWH-BI Engineer - Assessment Questionnaire

Interview questions

1. Please describe Facts vs Dimensions with a simple example.

2. What are the different types of dimensions? Don't list them all from Wikipedia; the top
three to five types are enough, with a short description of each.

3. Please describe Star vs Snowflake schema.

4. What does the acronym ETL stand for in the context of DWH? Please explain.

5. What strategy would you choose for late arriving dimensions? Could you explain your
strategy on a simple yet descriptive example?

Answers:
1. Please describe Facts vs Dimensions with a simple example.

The fact table is the primary table in a dimensional model. A fact table contains:

• Measurements (facts)
• Foreign keys to the dimension tables

A dimension table holds the descriptive context of a fact. It is joined to the fact table
via a foreign key. Dimension tables are typically denormalized.
The dimension attributes are the columns of a dimension table; they provide the
descriptive characteristics of the facts. There is no set limit on the number of
dimensions. A dimension can also contain one or more hierarchical relationships.

eiSoftLab d.o.o. | Židovska ulica 8 | 1000 Ljubljana | Slovenia | E: info@eisoftlab.com| W: www.eisoftlab.com


VAT: 40577902 | Registration N.: 8611505000 | Stock Capital: 7.500,00 EUR
Example:
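A minimal sketch of the distinction, using Python's built-in sqlite3 module. The table and column names (fact_sales, dim_product, category and so on) and the sample rows are hypothetical, chosen only for illustration. The fact table holds the measures and a foreign key; the dimension table holds the descriptive attributes used to group them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension: descriptive attributes (hypothetical names)
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT,
        category     TEXT
    );
    -- Fact: measures plus a foreign key to the dimension
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        units_sold INTEGER,
        revenue    REAL
    );
    INSERT INTO dim_product VALUES (1, 'Espresso', 'Coffee'), (2, 'Green tea', 'Tea');
    INSERT INTO fact_sales VALUES (1, 3, 7.5), (1, 2, 5.0), (2, 4, 6.0);
""")

# Measures come from the fact table; the grouping attribute comes from the dimension.
revenue_by_category = dict(conn.execute("""
    SELECT d.category, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
"""))
print(revenue_by_category)
```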

2. What are the different types of dimensions? Don't list them all from Wikipedia; the top
three to five types are enough, with a short description of each.

Conformed Dimensions:
A conformed dimension means exactly the same thing for every fact to which it relates.
Such a dimension is shared by more than one star schema or data mart.

Outrigger Dimensions:
A dimension may have a reference to another dimension table. These secondary dimensions
are called outrigger dimensions. This kind of dimension should be used sparingly.

Shrunken Rollup Dimensions:
A shrunken rollup dimension is a subset of the rows and columns of a base dimension.
Dimensions of this kind are useful for building aggregated fact tables.
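As a rough illustration, a day-level date dimension can be rolled up into a month-level one. The sketch below uses Python's sqlite3 module; dim_date, dim_month and the month_id encoding are assumptions made for this example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Base dimension at day grain (hypothetical names and rows)
    CREATE TABLE dim_date (
        date_id INTEGER PRIMARY KEY,
        day     INTEGER,
        month   INTEGER,
        year    INTEGER
    );
    INSERT INTO dim_date VALUES
        (20240130, 30, 1, 2024),
        (20240131, 31, 1, 2024),
        (20240201,  1, 2, 2024);

    -- Shrunken rollup: only the month-level columns, one row per month
    CREATE TABLE dim_month AS
        SELECT DISTINCT year * 100 + month AS month_id, month, year
        FROM dim_date;
""")

months = conn.execute(
    "SELECT month_id, month, year FROM dim_month ORDER BY month_id"
).fetchall()
print(months)
```

A fact table aggregated to monthly grain could then reference dim_month instead of dim_date.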

3. Please describe Star vs Snowflake schema.


Star Schema:
A star schema is a data warehouse schema in which a central fact table is surrounded by a
number of associated dimension tables. It is called a star schema because its structure
resembles a star. The star schema is the simplest type of data warehouse schema. It is
also known as a star join schema and is optimized for querying large data sets.

In the following star schema example, the fact table sits at the center and contains keys
to every dimension table, such as Dealer_ID, Model ID, Date_ID, Product_ID and Branch_ID,
along with other attributes such as units sold and revenue.

Snowflake Schema:
A snowflake schema is a logical arrangement of tables in a multidimensional database such
that the ER diagram resembles a snowflake. It is an extension of the star schema that adds
further dimension tables: the dimension tables are normalized, which splits their data
into additional tables.

In the following snowflake schema example, Country is further normalized into a separate
table.
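The difference can be sketched with Python's sqlite3 module. This is an illustrative assumption, not the schema from the original figures: all table names, the dealer rows and the sales figures are hypothetical. In the star variant the country name lives inside the dealer dimension; in the snowflake variant it is moved to its own table, which costs one extra join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Star: the country attribute is stored directly in the dealer dimension.
    CREATE TABLE star_dim_dealer (
        dealer_id INTEGER PRIMARY KEY, dealer_name TEXT, country_name TEXT
    );
    INSERT INTO star_dim_dealer VALUES
        (1, 'North Motors', 'Slovenia'), (2, 'Coast Cars', 'Slovenia');

    -- Snowflake: country is normalized into its own table.
    CREATE TABLE dim_country (country_id INTEGER PRIMARY KEY, country_name TEXT);
    CREATE TABLE snow_dim_dealer (
        dealer_id INTEGER PRIMARY KEY, dealer_name TEXT,
        country_id INTEGER REFERENCES dim_country(country_id)
    );
    INSERT INTO dim_country VALUES (10, 'Slovenia');
    INSERT INTO snow_dim_dealer VALUES (1, 'North Motors', 10), (2, 'Coast Cars', 10);

    -- The same fact table serves both variants.
    CREATE TABLE fact_sales (dealer_id INTEGER, units_sold INTEGER, revenue REAL);
    INSERT INTO fact_sales VALUES (1, 2, 40000.0), (2, 1, 25000.0);
""")

# Star: one join from fact to dimension.
star_result = conn.execute("""
    SELECT d.country_name, SUM(f.units_sold)
    FROM fact_sales f
    JOIN star_dim_dealer d ON d.dealer_id = f.dealer_id
    GROUP BY d.country_name
""").fetchall()

# Snowflake: an extra join through the normalized country table.
snowflake_result = conn.execute("""
    SELECT c.country_name, SUM(f.units_sold)
    FROM fact_sales f
    JOIN snow_dim_dealer d ON d.dealer_id = f.dealer_id
    JOIN dim_country c ON c.country_id = d.country_id
    GROUP BY c.country_name
""").fetchall()

print(star_result, snowflake_result)
```

Both queries return the same totals; the snowflake form stores each country name once at the price of a longer join path.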

4. What does the acronym ETL stand for in the context of DWH? Please explain.
ETL stands for Extract, Transform and Load. It is a data warehousing process in which an
ETL tool extracts data from various source systems, transforms it in the staging area,
and finally loads it into the data warehouse.
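The three steps can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a real ETL tool: the CSV source, the function names extract/transform/load and the fact_orders table are all hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for an operational system export.
SOURCE_CSV = "order_id,customer,amount\n1, alice ,10.50\n2,BOB,20\n"

def extract(text):
    """Extract: read raw rows from the source as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim and normalize names, cast ids and amounts to numbers."""
    return [
        (int(r["order_id"]), r["customer"].strip().title(), float(r["amount"]))
        for r in rows
    ]

def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(SOURCE_CSV)))
loaded = warehouse.execute("SELECT * FROM fact_orders ORDER BY order_id").fetchall()
print(loaded)
```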

5. What strategy would you choose for late arriving dimensions? Could you explain your
strategy on a simple yet descriptive example?

That really depends on the nature of the data, but I would use the technique called
'complete the dimension later'.
This means I would still create the dimension row using its natural key and fill the
descriptive columns with N/A or Unknown.
Example:

A new employee may be eligible for healthcare insurance coverage beginning with their first
day on the job and be issued a valid insurance card with a valid patient ID. However, the
employer may not provide detailed enrollment information to their healthcare insurance
provider for several weeks; it may take several more weeks before the new employee is
entered into the insurer’s operational systems. Of course, the new employee may require
health care during this time and submit claims using their patient ID. In this case, the
insurer’s data warehouse ETL system will receive claim fact row input with a valid patient ID
that doesn’t have an associated row in the patient dimension – yet.
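A minimal sketch of the 'complete the dimension later' approach for this scenario, using Python's sqlite3 module. The table names, column names, helper functions and sample values (dim_patient, fact_claim, get_or_create_patient_key, 'P-1001' and so on) are assumptions for illustration only. When a claim arrives for an unknown patient ID, a placeholder dimension row is created from the natural key; the descriptive columns are filled in once the enrollment data arrives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_patient (
        patient_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        patient_id  TEXT UNIQUE,                        -- natural key
        full_name   TEXT,
        employer    TEXT
    );
    CREATE TABLE fact_claim (
        patient_key INTEGER REFERENCES dim_patient(patient_key),
        amount      REAL
    );
""")

def get_or_create_patient_key(conn, patient_id):
    """Return the surrogate key, inserting an 'Unknown' placeholder row if needed."""
    row = conn.execute(
        "SELECT patient_key FROM dim_patient WHERE patient_id = ?", (patient_id,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute(
        "INSERT INTO dim_patient (patient_id, full_name, employer) "
        "VALUES (?, 'Unknown', 'Unknown')",
        (patient_id,),
    )
    return cur.lastrowid

def load_claim(conn, patient_id, amount):
    """Load a claim fact even if the patient dimension row has not arrived yet."""
    key = get_or_create_patient_key(conn, patient_id)
    conn.execute("INSERT INTO fact_claim VALUES (?, ?)", (key, amount))

def complete_patient(conn, patient_id, full_name, employer):
    """Fill in the placeholder once the enrollment details arrive."""
    conn.execute(
        "UPDATE dim_patient SET full_name = ?, employer = ? WHERE patient_id = ?",
        (full_name, employer, patient_id),
    )

load_claim(conn, "P-1001", 120.0)  # claim arrives before the dimension data
before = conn.execute(
    "SELECT full_name FROM dim_patient WHERE patient_id = 'P-1001'"
).fetchone()[0]

complete_patient(conn, "P-1001", "Ana Novak", "Acme d.o.o.")  # late-arriving details
after = conn.execute("""
    SELECT d.full_name, f.amount
    FROM fact_claim f JOIN dim_patient d ON d.patient_key = f.patient_key
""").fetchall()
print(before, after)
```

Because the fact row points at the surrogate key, it does not need to be touched when the placeholder is completed.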
