Past Smart Data
Duration: 2 Hours
Question 1 (25 Marks)
(c) Identify 3 different data mining tasks and define them briefly. (6 Marks)
(d) What are the 3 types of data structures? Give examples for each. (6 Marks)
Question 2 (25 Marks)
(a) Write an SQL statement to retrieve all columns and rows of the Performance table. (5 Marks)
(b) Write an SQL statement to retrieve the first name and last name of all performers who
are tap dancers. (5 Marks)
(c) Write an SQL statement to list all performances on 01/01/2017 at Kings Theatre.
(5 Marks)
(d) Write an SQL statement to list the distinct ids of theatre groups. (5 Marks)
(e) Write an SQL statement to retrieve the first name and the number of groups performers
are enrolled in, by Group Id and First Name. (5 Marks)
Question 3 (25 Marks)
(a) Briefly describe dimensional data modelling and its role in data warehousing.
(8 Marks)
(b) Explain what subject-oriented, time-variant, non-volatile and integrated mean.
(7 Marks)
(c) A typical big data architecture has four components. Identify and describe these
four components and show in a diagram how they are related to each
other. (10 Marks)
(a) Considering the decision tree model shown in Figure 1, identify the outcome of the
following new case: Peter wants to go outside to play basketball with his friends and
the weather seems to be sunny with humidity of 65.500. Explain your reasoning.
(10 Marks)
(b) Define the root node, internal nodes, leaf nodes and splitting rules, and
give examples of each from Figure 1. (15 Marks)
(c) Text mining is the application of data mining techniques to process and analyse
unstructured texts such as tweets, emails, reviews, Facebook posts, blogs, online
forums, and news articles. Before any text mining techniques are applied, the raw text
should be pre-processed using techniques such as bag-of-words, tokenisation,
and stemming. Explain these three pre-processing techniques (bag-of-words,
tokenisation and stemming) in detail. (10 Marks)