
Past Smart Data


Examination question paper: May 2023

Module code: CC5067

Component number: 002

Module title: Smart Data Discovery

Module leader: Svetla Hubenova

Date: 11 May 2023

Start time: 10:00

Duration: 2 Hours

Exam type: Unseen, Closed

Materials supplied: Exam papers

Materials permitted: None

Warning: Candidates are warned that possession of unauthorised materials in an examination is a serious assessment offence.

Instructions to candidates: Answer all FOUR questions.

Do not turn page over until instructed


© London Metropolitan University

Question 1 (25 Marks)

(a) Define data analytics. (3 Marks)

(b) What are the six stages of a data analytics lifecycle? Describe them briefly. (10 Marks)

(c) Identify 3 different data mining tasks and define them briefly. (6 Marks)

(d) What are the 3 types of data structures? Give examples for each. (6 Marks)

Question 2 (25 Marks)

Consider a relational data model for a theatre company depicted below.

(a) Write an SQL statement to retrieve all columns and rows of the Performance table. (5 Marks)

(b) Write an SQL statement to retrieve the first name and last name of all performers who
are tap dancers. (5 Marks)

(c) Write an SQL statement to list all performances on 01/01/2017 at Kings Theatre.
(5 Marks)

(d) Write an SQL statement to list the distinct ids of theatre groups. (5 Marks)

(e) Write an SQL statement to retrieve the first name of each performer and the number of
groups in which performers are enrolled, grouped by Group Id and First Name. (5 Marks)

Question 3 (25 Marks)

(a) Describe briefly dimensional data modelling and its role in data warehousing.
(8 Marks)
(b) Explain what the terms subject-oriented, time-variant, non-volatile and integrated mean.
(7 Marks)
(c) A typical big data architecture has four components. Identify and describe these
four components and show in a diagram how these components are related to each
other. (10 Marks)

Question 4 (25 Marks)

(a) Considering the decision tree model shown in Figure 1, identify the outcome of the
following new case: Peter wants to go outside to play basketball with his friends and
the weather seems to be sunny with humidity of 65.500. Explain your reasoning.
(10 Marks)

Figure 1 – Decision Tree Model

(b) Define the root node, internal nodes, leaf nodes and splitting rules, and
give examples of each from Figure 1. (15 Marks)

(c) Text mining is the application of data mining techniques to process and analyse
unstructured texts such as tweets, emails, reviews, Facebook posts, blogs, online
forums, and news articles. Before we apply any text mining techniques, the raw text
should be pre-processed using various techniques such as bag-of-words, tokenisation,
and stemming. Explain in detail these three pre-processing techniques (bag-of-words,
tokenisation and stemming). (10 Marks)
