2019 May IT404-A - Ktu Qbank
PART A
Answer any two full questions; each carries 15 marks.
1 a) Compare and contrast analysis and reporting in data analytics with a
suitable example. (7)
b) Consider the population of size N = 6, displayed in the table given below.
'X' is the population variable. A random sample of size n = 2 is drawn with
replacement from this population. Show the distributions of the sample and the
sampling distribution of the sample statistics X̄ and S². (8)
TID    Items_bought
T100   {M,O,N,K,E,Y}
T200   {D,O,N,K,E,Y}
T300   {M,A,K,E}
T400   {M,U,C,K,Y}
T500   {C,O,O,K,I,E}
(i) Find all frequent itemsets using the Apriori algorithm.
(ii) List all the strong association rules (with support s and confidence c).
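A hand-worked answer to parts (i) and (ii) can be cross-checked with a short script. The following Python sketch is one possible way to do so, not a model answer. The minimum support and minimum confidence are not stated in the question as printed here, so `MIN_SUPPORT_COUNT` and `MIN_CONFIDENCE` are assumed values; the function names are likewise illustrative.

```python
from itertools import combinations

# Transactions copied from the table in the question. Sets are used, so the
# repeated O in T500 counts once.
TRANSACTIONS = {
    "T100": set("MONKEY"),
    "T200": set("DONKEY"),
    "T300": set("MAKE"),
    "T400": set("MUCKY"),
    "T500": set("COOKIE"),
}

# Assumed thresholds: the question as printed here does not state them.
MIN_SUPPORT_COUNT = 3   # hypothetical: 60% of the 5 transactions
MIN_CONFIDENCE = 0.8    # hypothetical


def support_count(itemset):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for items in TRANSACTIONS.values() if itemset <= items)


def apriori(min_count):
    """Return {frequent itemset: support count} by level-wise search."""
    all_items = sorted(set().union(*TRANSACTIONS.values()))
    current = [frozenset([i]) for i in all_items
               if support_count(frozenset([i])) >= min_count]
    frequent = {}
    k = 1
    while current:
        for itemset in current:
            frequent[itemset] = support_count(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are
        # frequent, then count its support against the transactions.
        current = [c for c in candidates
                   if all(frozenset(s) in frequent for s in combinations(c, k - 1))
                   and support_count(c) >= min_count]
    return frequent


def strong_rules(frequent, min_conf):
    """Return (antecedent, consequent, support count, confidence) tuples."""
    rules = []
    for itemset, count in frequent.items():
        for size in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), size):
                antecedent = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent,
                # so its support count is already in the dictionary.
                confidence = count / frequent[antecedent]
                if confidence >= min_conf:
                    rules.append((antecedent, itemset - antecedent, count, confidence))
    return rules


if __name__ == "__main__":
    frequent = apriori(MIN_SUPPORT_COUNT)
    for itemset, count in sorted(frequent.items(),
                                 key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), count)
    for lhs, rhs, count, conf in strong_rules(frequent, MIN_CONFIDENCE):
        print(sorted(lhs), "=>", sorted(rhs),
              f"support={count}/5", f"confidence={conf:.2f}")
```

Under these assumed thresholds the sketch reports {E, K, O} as the only frequent 3-itemset; other thresholds will give different itemsets and rules.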
b) List the advantages and disadvantages of K-means clustering. (5)
5 a) What are the various stages in the big data analytics life cycle?
Illustrate with a figure, explaining each of them. (9)
b) What are the characteristics of Big data? (6)
6 a) Define sequential rule mining. Give an example. (5)
b) Discuss the trends in big data generation and acquisition. (10)
B H1115 Pages: 2
PART C
Answer any two full questions; each carries 20 marks.
7 a) List six R functions which are used in descriptive statistics. (12)
b) What is the significance of scatter plot matrix? (8)
8 a) Illustrate and explain the concept of the MapReduce framework. (10)
b) With an example, explain the term social media analytics. (10)
9 a) Write an R function to check whether a given number is prime. (10)
b) What is HDFS? How does it handle Big Data? (10)
****