
Assignment 2


Modern Data Management & Business Intelligence

Assignment #2 – Due Date: Sunday, December 11th 11:59pm (All)


Presentations (in class): December 12th (PT) / December 13th (FT)

You will use SQL Server (the database engine), SQL Server Analysis Services, and Power BI or Tableau for this project.
You will design and develop a data warehouse, build one or more data cubes on top of it, develop some
OLAP reports, and visualize your results. Each group will present its project in Teams (10-15 minutes per group). The
presentation should take the form of a business case and include:

- business goals and a description of the problem/domain
- description of the data sources and where you found the datasets
- design of the data warehouse, cubes, etc.
- import/cleaning/transformation challenges and how you handled them
- examples of OLAP queries, reports, etc.
- visualization examples

Try to tell it as a story: you are the storyteller!

1. Find a dataset on the web that seems attractive and interesting to you. Possible links:

www.kaggle.com
https://github.com/caesar0301/awesome-public-datasets
http://www.kdnuggets.com/datasets/index.html
https://catalog.data.gov/dataset?tags=data-warehouse

Or search Google for "datasets for data warehousing / data mining / OLAP / etc."

2. Understand the facts and the dimensions of the application. Define a star/snowflake schema in your SQL Server
database. Populate the fact and dimension tables from the dataset you found, for example by using the
import task in your database server. You may have to clean and transform the dataset, define dimension
tables manually, or insert values by hand.
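
As a rough illustration of step 2, the sketch below builds a tiny star schema and loads it from a few raw rows. It is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for SQL Server, and the table names (dim_date, dim_product, dim_store, fact_sales), the sample rows, and the cleaning rules are all hypothetical. Your own dataset will need its own dimensions and rules.

```python
import sqlite3

# Hypothetical raw rows, as they might come out of a downloaded CSV file:
# (date, product, category, city, quantity, unit_price)
RAW_ROWS = [
    ("2022-01-05", "Laptop ", "Electronics", "Athens", "2", "900.0"),
    ("2022-01-05", "Mouse", "Electronics", "athens", "5", "20.0"),
    ("2022-02-10", "Desk", "Furniture", "Patras", "1", "150.0"),
    ("2022-02-11", "Desk", "Furniture", "Patras", "", "150.0"),  # missing quantity
]


def build_star_schema(conn):
    """Create three dimension tables and one fact table (names are illustrative)."""
    conn.executescript("""
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, full_date TEXT UNIQUE,
            year INTEGER, month INTEGER);
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT UNIQUE, category TEXT);
        CREATE TABLE dim_store (
            store_key INTEGER PRIMARY KEY, city TEXT UNIQUE);
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            store_key INTEGER REFERENCES dim_store(store_key),
            quantity INTEGER, amount REAL);
    """)


def get_key(conn, table, key_col, match_col, value, extra=None):
    """Return the surrogate key of a dimension member, inserting it if it is new."""
    row = conn.execute(
        f"SELECT {key_col} FROM {table} WHERE {match_col} = ?", (value,)
    ).fetchone()
    if row:
        return row[0]
    extra = extra or {}
    cols = [match_col] + list(extra)
    vals = [value] + list(extra.values())
    marks = ", ".join("?" for _ in cols)
    cur = conn.execute(
        f"INSERT INTO {table} ({', '.join(cols)}) VALUES ({marks})", vals
    )
    return cur.lastrowid


def load(conn, rows):
    """Clean each raw row, look up dimension keys, and insert one fact row."""
    loaded = 0
    for date, product, category, city, qty, price in rows:
        if not qty.strip():
            continue  # assumed cleaning rule: skip rows with no quantity
        product = product.strip()        # trim stray whitespace
        city = city.strip().title()      # normalize "athens" -> "Athens"
        d = get_key(conn, "dim_date", "date_key", "full_date", date,
                    {"year": int(date[:4]), "month": int(date[5:7])})
        p = get_key(conn, "dim_product", "product_key", "name", product,
                    {"category": category})
        s = get_key(conn, "dim_store", "store_key", "city", city)
        quantity = int(qty)
        conn.execute("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
                     (d, p, s, quantity, quantity * float(price)))
        loaded += 1
    return loaded


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
loaded = load(conn, RAW_ROWS)
print(f"Loaded {loaded} fact rows")
```

The fact table stores only surrogate keys and measures, while descriptive attributes (category, city, year, month) live in the dimension tables. The same layout carries over to SQL Server, where the import task or your own scripts would take the place of the load function.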

3. Use SQL Server Analysis Services to define a multidimensional model (a cube) over your schema. Explore the
reporting capabilities of your tool and show some OLAP reports (drill down/roll up, pivoting, ranking, etc.).
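
To make the OLAP operations named in step 3 concrete, here is a minimal sketch of roll-up, drill-down, pivoting, and ranking written as plain SQL queries over a toy star schema. It is an illustration only, not how Analysis Services works internally: it uses Python's built-in sqlite3 module in place of SQL Server, and the tables, columns, and figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE fact_sales (date_key INTEGER, product_key INTEGER, amount REAL);

    INSERT INTO dim_date VALUES (1, 2021, 4), (2, 2022, 1), (3, 2022, 2);
    INSERT INTO dim_product VALUES
        (1, 'Laptop', 'Electronics'), (2, 'Mouse', 'Electronics'), (3, 'Desk', 'Furniture');
    INSERT INTO fact_sales VALUES
        (1, 1, 900.0), (2, 1, 1800.0), (2, 2, 100.0), (3, 3, 150.0), (3, 2, 40.0);
""")

# Roll-up: aggregate the measure at the coarse "year" level.
rollup = conn.execute("""
    SELECT d.year, SUM(f.amount)
    FROM fact_sales f JOIN dim_date d ON f.date_key = d.date_key
    GROUP BY d.year ORDER BY d.year
""").fetchall()

# Drill-down: move one level deeper in the date hierarchy (year -> quarter).
drilldown = conn.execute("""
    SELECT d.year, d.quarter, SUM(f.amount)
    FROM fact_sales f JOIN dim_date d ON f.date_key = d.date_key
    GROUP BY d.year, d.quarter ORDER BY d.year, d.quarter
""").fetchall()

# Pivot: categories as rows, years as columns (conditional aggregation).
pivot = conn.execute("""
    SELECT p.category,
           SUM(CASE WHEN d.year = 2021 THEN f.amount ELSE 0.0 END) AS sales_2021,
           SUM(CASE WHEN d.year = 2022 THEN f.amount ELSE 0.0 END) AS sales_2022
    FROM fact_sales f
    JOIN dim_date d ON f.date_key = d.date_key
    JOIN dim_product p ON f.product_key = p.product_key
    GROUP BY p.category ORDER BY p.category
""").fetchall()

# Ranking: order products by total sales and number them in Python.
# (In SQL Server the RANK() window function could do the numbering in SQL.)
totals = conn.execute("""
    SELECT p.name, SUM(f.amount) AS total
    FROM fact_sales f JOIN dim_product p ON f.product_key = p.product_key
    GROUP BY p.name ORDER BY total DESC
""").fetchall()
ranking = [(pos, name, total) for pos, (name, total) in enumerate(totals, start=1)]

for label, rows in [("roll-up", rollup), ("drill-down", drilldown),
                    ("pivot", pivot), ("ranking", ranking)]:
    print(label, rows)
```

In a cube browser, the same operations amount to dragging a hierarchy level or dimension onto rows and columns. Writing them as queries first can help you decide which reports are worth showing.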

4. Install Power BI and, using your database schema, show OLAP examples and visualize them, or anything else
you consider interesting. Better (more interesting, more interactive, etc.) visualizations mean a better grade.

The deliverable (aside from the presentation) is a document (.doc or .pdf) describing each of the
above steps in detail, with plenty of screenshots:

(a) the kind of application you are targeting, a description of the dataset you used, where you found it, what problems you are trying to solve, and what analysis you want to do;
(b) a description of the relational design of your fact and dimension tables, the import methods, and the cleaning/transformation procedures in detail;
(c) the cube you built on top of your schema: its dimensions, measures, and calculated measures (if any), plus a description (in English) of the OLAP reports, with screenshots;
(d) visualizations of these reports, with a description of each visualization, how it was produced, etc.
