IT - Sem VI - DMBI - Sample Questions
Information Technology
Choose the correct option for the following questions. All questions
carry equal marks.
1. Which of the following can be considered as the correct process of Data
Mining?
Option A: Infrastructure, Exploration, Analysis, Interpretation, Exploitation
Option B: Exploration, Infrastructure, Analysis, Interpretation, Exploitation
Option C: Exploration, Infrastructure, Interpretation, Analysis, Exploitation
Option D: Exploration, Infrastructure, Analysis, Exploitation, Interpretation
6. Which one of the following can be defined as the data object which does not
comply with the general behavior (or the model of available data)?
Option A: Evaluation Analysis
Option B: Outlier Analysis
Option C: Classification
Option D: Prediction
7. Which one of the following correctly refers to the task of the classification?
Option A: A measure of the accuracy of the classification of a concept that is given by a
certain theory
Option B: The task of assigning a classification to a set of examples
Option C: A subdivision of a set of examples into a number of classes
Option D: None of the above
10. “Efficiency and scalability of data mining algorithms” issues come under which category?
Option A: Mining Methodology and User Interaction Issues
Option B: Performance Issues
Option C: Diverse Data Types Issues
Option D: None of the above
11. ________ is the clustering technique which needs the merging approach.
Option A: Naïve Bayes
Option B: Hierarchical
Option C: Partitioned
Option D: All of the above
21. When you ____ the data, you are aggregating the data to a higher level.
Option A: Slice
Option B: Roll Up
Option C: Roll Down
Option D: Drill Down
23. ______ supports basic OLAP operations, including slice and dice, drill-down,
roll-up, and pivoting.
Option A: Information processing
Option B: Analytical processing
Option C: Data processing
Option D: Transaction processing
10 marks each
1. Explain the role of business intelligence in any one of the following domains:
fraud detection, market segmentation, the retail industry, or the telecommunications
industry. Explain how data mining can be helpful in the chosen domain.
2. Explain the Star, Snowflake, and Fact Constellation schemas for a
multidimensional database.
3. Explain data warehouse architecture.
4. What is clustering? Explain the K-means clustering algorithm. Suppose the data for
clustering is {2, 4, 10, 12, 3, 20, 11, 25}. Taking k = 2, cluster the given data using the
above algorithm.
5. Explain multilevel association rules and multidimensional association rules with examples.
6. Define support and confidence. Also generate association rules. A database has four
transactions. Let the minimum support and minimum confidence both be 50%.
7. Define support and confidence. Also generate association rules. A database has four
transactions. Let the minimum support be 2 and the minimum confidence be 80%.
14. Consider a data warehouse for a hospital where there are three dimensions:
16. Explain the different methods that can be used to evaluate and compare the accuracy of
different classification algorithms.
2 No Business Average No
3 No Employed Low No
6 No Business Low No
9 No Business Low No
10 No Employed Average Yes
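For question 4 above (K-means on {2, 4, 10, 12, 3, 20, 11, 25} with k = 2), a hand-worked answer can be checked with a short sketch such as the one below. The question does not state the initial centroids, so the choice of 2 and 4 (the first two values) is an assumption, as are the function name and the tie-breaking rule; a different starting point may need a different number of iterations.

```python
def kmeans_1d(points, centroids, max_iter=100):
    """Basic K-means on 1-D data: assign each point to its nearest
    centroid, recompute each centroid as the cluster mean, and repeat
    until the centroids stop changing."""
    centroids = list(centroids)
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in centroids]
        for p in points:
            # Nearest centroid; on a tie the lower-indexed centroid wins
            # (an arbitrary choice made for this sketch).
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        new_centroids = [sum(c) / len(c) if c else centroids[i]
                         for i, c in enumerate(clusters)]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


data = [2, 4, 10, 12, 3, 20, 11, 25]
# Assumed initial centroids: the first two data values.
final_centroids, final_clusters = kmeans_1d(data, [2, 4])
print(final_clusters)   # [[2, 4, 3], [10, 12, 20, 11, 25]]
print(final_centroids)  # approximately [3.0, 15.6]
```

With these assumed starting centroids the means move from (2, 4) to (2.5, 13.67) and then to (3, 15.6), after which the assignments no longer change.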
5 marks each
1) Explain why data warehouses are needed for developing business solutions from today’s
perspective. Discuss the role of data marts.
2) Explain the various features of a data warehouse.
3) Discuss the application of data warehousing and data mining
4) A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection
of data – Justify.
5) Give differences between OLAP and OLTP.
6) Explain various OLAP operations
7) Differentiate Fact table vs. Dimension table
8) Define the term “data mining”. Discuss the major issues in data mining
9) In real-world data, tuples with missing values for some attributes are a common
occurrence. Describe various methods for handling this problem
10) Explain the following data normalization techniques: (i) min-max normalization and (ii)
decimal scaling.
11) Describe various methods for handling missing data values
12) What are the limitations of the Apriori approach for mining? Briefly describe the techniques
to improve the efficiency of Apriori algorithm
13) What is market basket analysis? Explain the two measures of rule interestingness: support
and confidence with suitable example.
14) Explain measures for finding rule interestingness (support, confidence) with
example.
15) Compare association and classification. Briefly explain associative classification with
suitable example.
16) What is an attribute selection measure? Explain different attribute selection measures with
example.
17) Give a feature-wise comparison of classification and prediction.
18) Explain Linear regression with example.
19) Explain data mining application for fraud detection.
20) Discuss applications of data mining in Banking and Finance.
21) How does the K-means clustering method differ from the K-medoids clustering method?
22) How is the FP-tree approach better than the Apriori algorithm? Justify.
23) Define information gain, entropy, and Gini index.
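
For question 23, the three measures can be written down directly from their usual textbook definitions: entropy is -sum(p_i * log2(p_i)), the Gini index is 1 - sum(p_i^2), and information gain is the parent entropy minus the size-weighted entropy of the partitions. The sketch below is an illustration only; the function names and the four-record label list are hypothetical and are not taken from this question bank.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini index (impurity) of a list of class labels."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(labels, partitions):
    """Parent entropy minus the size-weighted entropy of the partitions
    produced by splitting on some attribute."""
    n = len(labels)
    weighted = sum(len(part) / n * entropy(part) for part in partitions)
    return entropy(labels) - weighted


# Hypothetical data: two "Yes" and two "No" records.
labels = ["Yes", "Yes", "No", "No"]
print(entropy(labels))  # 1.0
print(gini(labels))     # 0.5
# A split that separates the classes perfectly removes all uncertainty.
print(information_gain(labels, [["Yes", "Yes"], ["No", "No"]]))  # 1.0
```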