DWH-BI Engineer - Assessment Questionnaire
Interview questions
2. What are the different types of dimensions? Don't list them all from Wikipedia; the top
three to five types are enough, with a short description of each.
5. What strategy would you choose for late arriving dimensions? Could you explain your
strategy on a simple yet descriptive example?
Answers:
1. Please describe Facts vs Dimensions with a simple example.
A fact table contains measurements (facts) and foreign keys to the dimension tables.
A dimension table contains the dimensions of a fact. It is joined to the fact table via a
foreign key. Dimension tables are de-normalized tables.
Dimension attributes are the columns of a dimension table. Dimensions offer descriptive
characteristics of the facts through their attributes. There is no set limit on the number
of dimensions. A dimension can also contain one or more hierarchical relationships.
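As a simple example, the sketch below builds one fact table and one dimension table in an
in-memory SQLite database and joins them. The table and column names (fact_sales,
dim_product, category and so on) and the sample rows are assumptions made up for
illustration only.

```python
import sqlite3

# Hypothetical tables and sample data, used only to illustrate facts vs dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (
        product_id   INTEGER PRIMARY KEY,  -- key referenced by the fact table
        product_name TEXT,                 -- descriptive attribute
        category     TEXT                  -- descriptive attribute
    );
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),  -- foreign key
        units_sold INTEGER,                -- measurement
        revenue    REAL                    -- measurement
    );
    INSERT INTO dim_product VALUES
        (1, 'Laptop', 'Computers'),
        (2, 'Mouse',  'Accessories');
    INSERT INTO fact_sales VALUES
        (1, 2, 2000.0),
        (2, 5, 100.0),
        (1, 1, 1000.0);
""")

# Facts are aggregated; the dimension attribute supplies the grouping context.
revenue_by_category = conn.execute("""
    SELECT d.category, SUM(f.units_sold), SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS d ON d.product_id = f.product_id
    GROUP BY d.category
    ORDER BY d.category
""").fetchall()

print(revenue_by_category)
# [('Accessories', 5, 100.0), ('Computers', 3, 3000.0)]
```

The numeric columns are the facts being measured, while the dimension answers "by what?"
(here, by product category).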
2. What are the different types of dimensions? Don't list them all from Wikipedia; the top
three to five types are enough, with a short description of each.
Conformed Dimensions:
A conformed dimension has the same meaning for every fact to which it relates. It is
shared by more than one star schema or data mart.
Outrigger Dimensions:
A dimension may have a reference to another dimension table. These secondary dimensions
are called outrigger dimensions. This kind of dimension should be used carefully.
In the following Star Schema example, the fact table is at the center and contains keys to
every dimension table, such as Dealer_ID, Model_ID, Date_ID, Product_ID and Branch_ID, as
well as other attributes such as units sold and revenue.
Snowflake Schema:
A Snowflake Schema in a data warehouse is a logical arrangement of tables in a
multidimensional database such that the ER diagram resembles a snowflake shape. A
Snowflake Schema is an extension of a Star Schema, and it adds additional dimensions. The
dimension tables are normalized, which splits the data into additional tables.
In the following Snowflake Schema example, Country is further normalized into an individual
table.
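The normalization described above can be sketched as follows, assuming a hypothetical
dealer dimension whose country attribute has been moved into its own dim_country table.
All table names, column names and rows are illustrative assumptions, not taken from the
original diagram.

```python
import sqlite3

# Hypothetical snowflake layout: fact_sales -> dim_dealer -> dim_country.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_country (
        country_id   INTEGER PRIMARY KEY,
        country_name TEXT
    );
    CREATE TABLE dim_dealer (
        dealer_id   INTEGER PRIMARY KEY,
        dealer_name TEXT,
        country_id  INTEGER REFERENCES dim_country(country_id)  -- normalized out
    );
    CREATE TABLE fact_sales (
        dealer_id  INTEGER REFERENCES dim_dealer(dealer_id),
        units_sold INTEGER,
        revenue    REAL
    );
    INSERT INTO dim_country VALUES (1, 'Germany'), (2, 'France');
    INSERT INTO dim_dealer VALUES
        (10, 'Dealer A', 1),
        (11, 'Dealer B', 1),
        (12, 'Dealer C', 2);
    INSERT INTO fact_sales VALUES (10, 3, 90.0), (11, 1, 30.0), (12, 2, 60.0);
""")

# Reaching the country name now takes two joins instead of one.
revenue_by_country = conn.execute("""
    SELECT c.country_name, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_dealer  AS d ON d.dealer_id  = f.dealer_id
    JOIN dim_country AS c ON c.country_id = d.country_id
    GROUP BY c.country_name
    ORDER BY c.country_name
""").fetchall()

print(revenue_by_country)
# [('France', 60.0), ('Germany', 120.0)]
```

In a star schema the country name would sit directly in the dealer dimension, so the same
report would need only the first join. The snowflake form trades that extra join for less
repeated data.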
5. What strategy would you choose for late arriving dimensions? Could you explain your
strategy on a simple yet descriptive example?
That really depends on the nature of the data, but I would use the technique called
'complete the dimension later'.
This means I would still create the dimension row using its natural key and fill the
description columns with N/A or Unknown until the real attribute values arrive.
Example:
A new employee may be eligible for healthcare insurance coverage beginning with their first
day on the job and be issued a valid insurance card with a valid patient ID. However, the
employer may not provide detailed enrollment information to their healthcare insurance
provider for several weeks; it may take several more weeks before the new employee is
entered into the insurer’s operational systems. Of course, the new employee may require
health care during this time and submit claims using their patient ID. In this case, the
insurer’s data warehouse ETL system will receive claim fact row input with a valid patient ID
that doesn’t have an associated row in the patient dimension – yet.
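A minimal sketch of this strategy for the claims example, using an in-memory SQLite
database. The table names (dim_patient, fact_claim), the helper functions, the 'Unknown'
placeholder and the sample values are all assumptions for illustration. A real ETL tool
would do the same thing in its own way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_patient (
        patient_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        patient_id  TEXT UNIQUE NOT NULL,               -- natural key
        full_name   TEXT NOT NULL DEFAULT 'Unknown',    -- placeholder until known
        plan_name   TEXT NOT NULL DEFAULT 'Unknown'     -- placeholder until known
    );
    CREATE TABLE fact_claim (
        patient_key  INTEGER NOT NULL REFERENCES dim_patient(patient_key),
        claim_amount REAL NOT NULL
    );
""")

def get_or_create_patient_key(conn, patient_id):
    """Return the surrogate key, inserting a placeholder row if none exists."""
    row = conn.execute(
        "SELECT patient_key FROM dim_patient WHERE patient_id = ?", (patient_id,)
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute(
        "INSERT INTO dim_patient (patient_id) VALUES (?)", (patient_id,)
    )
    return cur.lastrowid

def load_claim(conn, patient_id, amount):
    """Load a claim fact even when the patient dimension row has not arrived."""
    key = get_or_create_patient_key(conn, patient_id)
    conn.execute("INSERT INTO fact_claim VALUES (?, ?)", (key, amount))

def complete_patient(conn, patient_id, full_name, plan_name):
    """Fill in the placeholder row once the enrollment details arrive."""
    conn.execute(
        "UPDATE dim_patient SET full_name = ?, plan_name = ? WHERE patient_id = ?",
        (full_name, plan_name, patient_id),
    )

report_sql = """
    SELECT d.patient_id, d.full_name, f.claim_amount
    FROM fact_claim AS f
    JOIN dim_patient AS d ON d.patient_key = f.patient_key
"""

load_claim(conn, "P-1001", 250.0)            # the claim arrives first
before = conn.execute(report_sql).fetchall()
complete_patient(conn, "P-1001", "Jane Doe", "Basic")  # enrollment arrives later
after = conn.execute(report_sql).fetchall()

print(before)  # [('P-1001', 'Unknown', 250.0)]
print(after)   # [('P-1001', 'Jane Doe', 250.0)]
```

Because the fact row already points at the placeholder's surrogate key, no fact rows need
to be reloaded when the dimension is completed. Only the dimension row is updated.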