Assignment 2 DM

Instructions for solving the assignment

1. Each assignment must be solved separately on white sheets.

2. Each assignment must be properly tagged.

3. The last date for submitting the assignment is 25/11/2019, on or before 5:00 P.M.

4. All students are requested to submit only their own assignment, individually; your presence will be
counted.
Motilal Nehru National Institute of Technology, Allahabad.
Department of Computer Science & Engineering,
B. Tech. (CSE/IT) VII Semester
Subject:- Data Warehouse and Mining

Assignment-IV

1. A very large tele-communications company called “Cell9”, providing cellular phone services to a
number of states in various regions of the country, plans to build a data warehouse for decision
support. They have millions of subscribers in the country. They want to track the duration (in
minutes) as well as the prevailing rate (per minute) of each phone call made by its subscribers.
They also want to analyze if there is any link between the total amount of time spent in talking on
cellphones by a subscriber and the number of graduates in the state or the number of married
persons in the state or the male-female ratio of the state to which the subscriber belongs. Further,
they want to analyse the relation between the age, salary and marital status of the customers to
their total bill amount per day/month/year. One other important requirement is to make queries
like determining the current total number of customers in the various age groups for each state
having certain ranges of male-female ratio.

a) Design a suitable relational database schema for such a data warehouse, clearly identifying
the fact table(s), the facts in the fact table(s), the dimension table(s), their primary key(s) and
foreign key(s). Your schema should at least be able to satisfy the above mentioned analysis
requirements. You may consider other suitable attributes for the dimension table(s).
b) Classify the facts in your fact table(s) as additive, non-additive and semi-additive.
c) Draw possible concept hierarchies for each dimension that you have designed, identifying
whether these are schema hierarchies or set-grouping hierarchies.
d) Write an SQL query that runs on your schema and returns the region-wise yearly average bill
amounts of married and unmarried customers.
e) Draw a cuboid to represent the result of your query.
f) From this cuboid, which sequence of OLAP operations would you perform to get the average
monthly bill amounts of all the customers for the states of Bihar and West Bengal?
g) Write an SQL query to return the current total number of customers in the various age groups
for each state with male-female ratio between 0.9 and 1.1.
h) For any one fact table (You may have only one, depending on your design), and any one
attribute of any one dimension table, draw the bitmap index table(s) and join index table(s).
Before drawing the index tables, first mention the representative rows in the tables.
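
As a starting point for parts (a) and (d), one possible star schema can be sketched and tried out with Python's built-in sqlite3 module. This is only an illustrative sketch, not a model answer: the table names (call_fact, dim_customer, dim_state, dim_date), the column names and the sample rows are all assumptions made here, and it reads "average bill amount" as the average over customers of each customer's total bill for the year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical star schema: one fact table and three dimension tables.
cur.executescript("""
CREATE TABLE dim_state (
    state_key  INTEGER PRIMARY KEY,
    state_name TEXT,
    region     TEXT
);
CREATE TABLE dim_customer (
    customer_key   INTEGER PRIMARY KEY,
    marital_status TEXT
);
CREATE TABLE dim_date (
    date_key INTEGER PRIMARY KEY,
    year     INTEGER
);
CREATE TABLE call_fact (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    state_key    INTEGER REFERENCES dim_state(state_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    duration_min REAL,   -- additive fact
    rate_per_min REAL,   -- non-additive fact
    amount       REAL    -- additive fact: duration_min * rate_per_min
);
""")

# Invented sample rows, for illustration only.
cur.executemany("INSERT INTO dim_state VALUES (?, ?, ?)",
                [(1, "Bihar", "East"), (2, "West Bengal", "East"), (3, "Kerala", "South")])
cur.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                [(1, "married"), (2, "unmarried"), (3, "married")])
cur.executemany("INSERT INTO dim_date VALUES (?, ?)",
                [(20190105, 2019), (20190210, 2019)])
cur.executemany("INSERT INTO call_fact VALUES (?, ?, ?, ?, ?, ?)",
                [(1, 1, 20190105, 10, 1.0, 10.0),
                 (1, 1, 20190210, 20, 1.0, 20.0),
                 (3, 2, 20190105, 5, 2.0, 10.0),
                 (2, 3, 20190210, 4, 0.5, 2.0)])

# Inner query: total yearly bill per customer.
# Outer query: average of those totals per region, year and marital status.
query = """
SELECT region, year, marital_status, AVG(yearly_bill)
FROM (
    SELECT s.region AS region, d.year AS year,
           c.marital_status AS marital_status,
           f.customer_key AS customer_key,
           SUM(f.amount) AS yearly_bill
    FROM call_fact f
    JOIN dim_customer c ON c.customer_key = f.customer_key
    JOIN dim_state s    ON s.state_key = f.state_key
    JOIN dim_date d     ON d.date_key = f.date_key
    GROUP BY s.region, d.year, c.marital_status, f.customer_key
) AS per_customer
GROUP BY region, year, marital_status
ORDER BY region, year, marital_status
"""
rows = cur.execute(query).fetchall()
for row in rows:
    print(row)
```

The two-level aggregation matters: averaging the fact rows directly would give the average amount per call, not per customer. A fuller design would also need the state-level demographics (graduates, married persons, male-female ratio) and the customer's age and salary, which are left out here to keep the sketch short.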

2. A hospital cum medical research institute is carrying out a study on the nature of different types of
fevers. In order to track every patient as he/she keeps coming back to the hospital, a unique id is
maintained. For each patient, they keep track of the body temperature at every hour of the day as
long as the patient is admitted in the hospital. They also maintain data about the different types of
medicine being given to the patient. Patients may be given more than one medicine in a day. Every
medicine is administered as many times in a day as the doctor has prescribed. Since there is a
history of different types of fevers occurring in various districts, states and regions in the country,
the hospital research team wants to maintain such residence details of each patient. One of the
goals of the research is to determine if there is any relation between the age and gender of the
patients with their body temperature when various medicines are administered. Another goal is to
determine if there is a relation between the % of population who are farmers, office goers or
teachers in the patient’s state with the body temperature of the patients when various medicines are
administered.

a) Design a suitable schema for the hospital cum medical research institute, clearly
identifying the Fact table(s), Dimension Table(s), the Facts, the Dimensions,
Primary Keys and Foreign Keys of all the tables. Your schema should at least be able
to satisfy the above mentioned research requirements. You may consider other
suitable attributes for the dimension table(s).
b) Classify the fact(s) in your fact table(s) as additive, non-additive and semi-additive.
c) Write an SQL query that runs on your schema and returns today’s average, maximum
and minimum body temperature for each married male patient.
d) Draw a cuboid to represent the result of your query.

3. A chain of departmental stores called “India-Mart” having operations only in India, plans to
develop a data warehouse for effective decision-making about their sales and different promotion
schemes. India-Mart puts some of their products on promotional sales from time to time. There
may be a large number of different types of promotions like coupon sales, end-of-the-aisle
display, buy-two-get-one-free, etc.
India-Mart would like to analyze how item sale is affected by the promotions at each store, in
each state and across the entire country.
With respect to the above business scenario, answer the following questions.

a. Design a star schema for the data warehouse clearly identifying the fact table(s), dimension
table(s), their attributes and measures along with the primary key and foreign key relationships.

b. Write an SQL query by which you can display year-wise, promotion-wise, product-wise total sales
in the entire country from your schema.

c. Draw a cuboid that would display the result of the query specified in Q. b above.

d. From the cuboid of Q. c above, if we want to find the total amount of promotional sales made
during the years 2002 and 2003 for the states of Karnataka and Maharashtra, which sequence of
OLAP operations would you need to perform?

e. Draw possible schema hierarchies for each dimension that you have designed.

f. Based on the schema hierarchies drawn in Q. e above, determine the total number of cuboids,
considering all the aggregation levels.
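
For part (f), the usual counting rule is that an n-dimensional cube whose i-th dimension has L_i hierarchy levels (not counting the virtual top level "all") yields (L_1 + 1) × (L_2 + 1) × ... × (L_n + 1) cuboids. A small sketch of that rule follows; the dimension names and level counts are hypothetical placeholders and should be replaced with the hierarchies from your own answer to part (e).

```python
from math import prod

def total_cuboids(levels_per_dimension):
    """Return the product of (L_i + 1) over all dimensions.

    L_i is the number of hierarchy levels of dimension i, excluding
    the virtual top level 'all'; the +1 accounts for that level.
    """
    return prod(levels + 1 for levels in levels_per_dimension)

# Hypothetical hierarchies, e.g. time: day < month < quarter < year (4 levels).
hierarchy_levels = {"time": 4, "store": 3, "product": 3, "promotion": 2}

n_cuboids = total_cuboids(hierarchy_levels.values())
print(n_cuboids)  # 5 * 4 * 4 * 3 = 240
```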

4. A consortium of banks wants to develop a data warehouse for effective decision-making
about their loan schemes. The banks provide loans to customers for various purposes,
like, House Building Loan, Car Loan, Educational Loan, Personal Loan, etc. The whole
country is categorized into a number of regions, namely, North, South, East and West.
Each region consists of a set of states. Loan is disbursed to customers at interest rates that
change from time to time. Also, at any given point of time, the different types of loans
have different rates. The data warehouse should record an entry for each disbursement of
loan to customer.
With respect to the above business scenario, answer the following questions. Clearly state
any reasonable assumptions you make.

a) Design a star schema for the data warehouse clearly identifying the fact table(s),
dimensional table(s), their attributes and measures along with the primary key and
foreign key relationships.
b) Write an SQL query by which you can display region-wise, bank-wise, year-wise total
amount of loans disbursed from your schema.
c) Starting with the base cuboid, if we want to see the amount of loan disbursed during
the year 2000 for the state of Maharashtra, which sequence of OLAP operations
would you need to perform?

5. Use the similarity matrix in Table (shown below) to perform single and complete
link hierarchical clustering. Show your results by drawing a dendrogram.
The dendrogram should clearly show the order in which the points are merged.
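
The merge order can be checked with a short agglomerative clustering sketch. Because the input is a similarity matrix, the closest pair of clusters is the one with the highest similarity: single link scores two clusters by their most similar pair of points (max), complete link by their least similar pair (min). The 4 × 4 matrix below is invented for illustration and is not the table from the assignment.

```python
from itertools import combinations

def agglomerate(sim, linkage):
    """Agglomerative clustering on a similarity matrix.

    linkage=max gives single link, linkage=min gives complete link.
    Returns a list of (merged_cluster, similarity) in merge order.
    """
    def cluster_sim(a, b):
        return linkage(sim[i][j] for i in a for j in b)

    clusters = [frozenset([i]) for i in range(len(sim))]
    merges = []
    while len(clusters) > 1:
        # Pick the pair of clusters with the highest similarity.
        a, b = max(combinations(clusters, 2),
                   key=lambda pair: cluster_sim(*pair))
        merges.append((sorted(a | b), cluster_sim(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return merges

# Hypothetical similarity matrix for points 0..3 (symmetric, 1.0 on the diagonal).
sim = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.6, 0.3],
    [0.1, 0.6, 1.0, 0.5],
    [0.2, 0.3, 0.5, 1.0],
]

single = agglomerate(sim, max)
complete = agglomerate(sim, min)
print("single link:  ", single)
print("complete link:", complete)
```

On this made-up matrix the two methods agree on the first merge and then diverge: single link attaches point 2 to {0, 1}, while complete link pairs points 2 and 3 first. The recorded similarities are the heights at which the merges would be drawn in a dendrogram.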

6. Compute the hierarchical F-measure for the eight objects {p1, p2, p3, p4, p5, p6, p7,
p8} and hierarchical clustering shown in Figure (shown below). Class A contains
points p1, p2, and p3, while p4, p5, p6, p7, and p8 belong to class B.

7. Following is a data set that contains two attributes, X and Y, and two class labels,
“+” and “−”. Each attribute can take three different values: 0, 1, or 2. The concept for
the “+” class is Y = 1 and the concept for the “−” class is X = 0 ∨ X = 2.

(a) Build a decision tree on the data set. Does the tree capture the “+” and “−”
concepts?

(b) What are the accuracy, precision, recall, and F1-measure of the decision tree?
(Note that precision, recall, and F1-measure are defined with respect to the “+”
class.)
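
The four measures in part (b) follow directly from the confusion-matrix counts of the tree, with “+” treated as the positive class. The sketch below shows the standard formulas; the counts passed in at the end are hypothetical and are not derived from the assignment's data set.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) for the positive class.

    tp/fp/fn/tn are true positives, false positives, false negatives
    and true negatives. Ratios with a zero denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return accuracy, precision, recall, f1

# Hypothetical counts, for illustration only.
accuracy, precision, recall, f1 = classification_metrics(tp=5, fp=5, fn=0, tn=10)
print(accuracy, precision, recall, round(f1, 4))
```

Note that a tree predicting “−” for every record has no true positives, so its precision, recall and F1-measure with respect to “+” are all reported as 0 by this function even when its accuracy is high.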

8. The original association rule mining formulation uses the support and confidence
measures to prune uninteresting rules.
(a) Draw a contingency table for each of the following rules using the transactions
shown in the table below.
(b) Use the contingency tables in part (a) to compute and rank the rules in decreasing
order according to the following measures.
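
A 2 × 2 contingency table for a rule A → B counts the transactions that contain both sides (f11), only the antecedent (f10), only the consequent (f01), or neither (f00). The sketch below builds that table and derives support and confidence from it; the five transactions and the rule {a} → {b} are made up for illustration, since the assignment's own table and rule list are not reproduced here.

```python
def contingency_table(transactions, antecedent, consequent):
    """Count transactions by presence of the antecedent and the consequent."""
    a, b = set(antecedent), set(consequent)
    table = {"f11": 0, "f10": 0, "f01": 0, "f00": 0}
    for transaction in transactions:
        items = set(transaction)
        has_a = a <= items  # every antecedent item is present
        has_b = b <= items  # every consequent item is present
        table[f"f{int(has_a)}{int(has_b)}"] += 1
    return table

def support_confidence(table):
    """Support = f11 / N; confidence = f11 / (f11 + f10)."""
    n = sum(table.values())
    covered = table["f11"] + table["f10"]
    support = table["f11"] / n if n else 0.0
    confidence = table["f11"] / covered if covered else 0.0
    return support, confidence

# Hypothetical transactions, for illustration only.
transactions = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}, {"c"}]

table = contingency_table(transactions, {"a"}, {"b"})
support, confidence = support_confidence(table)
print(table)
print(support, round(confidence, 4))
```

Other interestingness measures can be computed from the same four counts, so ranking the rules under a different measure only requires swapping out the second function.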
