BigData ML
The procedure of organizing items of a given collection into groups based on some similar
features is called ______
(1 Point)
Decision Trees
Regression
Association
Clustering
6.
The terms used in Machine Learning with Big Data are: (i) Pattern Recognition, (ii) Data
mining, (iii) Data slang, (iv) Predictive Analytics
(1 Point)
(iii) is wrong.
(ii) is correct.
Veracity
Volume
Integrity
Variety
8.
Which of the following is false for Apache Spark?
(1 Point)
Enables powerful interactive and data analytics application across live streaming data
Clustering Problem
Classification Problem
Regression Problem
10.
The process of constructing a mathematical model that can be used to predict one
variable from another variable is called:
(1 Point)
Correlation
Outlier
Regression
Cluster Analysis
11.
How is KNN model used for classification?
(1 Point)
All the neighbours that are ‘K’ distance apart from the new sample point determine the label
for the new sample
The class labels of ‘K’ neighbouring samples determine the label for the new sample.
All the training samples within a circle of ‘K’ radius determine the label for the new sample.
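For reference on the KNN question above: in the usual formulation, K is a count of neighbours, and the new sample takes the majority class label among its K nearest training samples. A minimal sketch, assuming Euclidean distance and a simple majority vote; the function name `knn_predict` and the toy data are made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training samples."""
    # Euclidean distance from the query to every training sample
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest samples (K is a number of neighbours, not a radius)
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two well-separated groups
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9)]
labels = ["a", "a", "a", "b", "b"]
prediction = knn_predict(points, labels, (1.1, 1.0), k=3)
```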
12.
Choose the false statement.
(1 Point)
One can uncover unexpected and useful relationships with association analysis
Association rules are not used to determine when items or events occur together
The goal is to come up with a set of rules to capture associations between items or events
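To make the association analysis statements above concrete: a rule such as "bread -> milk" is commonly scored by its support and confidence. A minimal sketch, assuming transactions are plain Python sets; `rule_metrics` and the basket data are hypothetical names and values used only for illustration.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    # Transactions that contain every item of the antecedent
    n_antecedent = sum(antecedent <= t for t in transactions)
    # Transactions that contain both sides of the rule
    n_both = sum((antecedent | consequent) <= t for t in transactions)
    support = n_both / len(transactions)
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence

# Hypothetical shopping baskets
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
support, confidence = rule_metrics(baskets, {"bread"}, {"milk"})
```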
13.
What are the two parts in data understanding phase of CRISP-DM?
(1 Point)
Calculate the centroids, then determine the appropriate stopping criterion depending on the
number of centroids.
Assign each sample to the closest centroid, then calculate the new centroid.
Calculate the distances between the cluster centroids, then find the two closest centroids.
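The option "assign each sample to the closest centroid, then calculate the new centroid" reads like one iteration of k-means. A minimal sketch of that single step, assuming Euclidean distance and tuple-valued samples; `kmeans_step` and the sample data are hypothetical.

```python
import math

def kmeans_step(samples, centroids):
    """One k-means iteration: assign samples, then recompute centroids."""
    clusters = [[] for _ in centroids]
    for sample in samples:
        # Assign the sample to the closest centroid
        idx = min(
            range(len(centroids)),
            key=lambda i: math.dist(sample, centroids[i]),
        )
        clusters[idx].append(sample)
    new_centroids = []
    for old, members in zip(centroids, clusters):
        if members:
            # The new centroid is the coordinate-wise mean of its members
            new_centroids.append(
                tuple(sum(coord) / len(members) for coord in zip(*members))
            )
        else:
            # Assumption: an empty cluster keeps its previous centroid
            new_centroids.append(old)
    return new_centroids

samples = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
centroids = kmeans_step(samples, [(1.0, 1.0), (9.0, 9.0)])
```

In practice the step is repeated until the centroids stop moving or an iteration limit is reached.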
Regression
Classification
Prediction
Analysis
16.
Misclassification rate is another name for:
(1 Point)
Classification Rate
Training Rate
Error Rate
Testing Rate
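For reference, the misclassification rate is the fraction of samples whose predicted label differs from the true label, i.e. 1 minus accuracy. A minimal sketch; `misclassification_rate` and the label lists are made up for illustration.

```python
def misclassification_rate(true_labels, predicted_labels):
    """Fraction of predictions that do not match the true label."""
    wrong = sum(t != p for t, p in zip(true_labels, predicted_labels))
    return wrong / len(true_labels)

# Hypothetical labels: one of four predictions is wrong
rate = misclassification_rate(["a", "b", "a", "b"], ["a", "b", "b", "b"])
accuracy = 1 - rate
```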
17.
What is the purpose of exploring data?
(1 Point)
Gini Index
Association
Probability
Regression
20.
What is involved in data wrangling?
(1 Point)
KNN
Correlation
Decision Tree
Naïve Bayes
23.
Choose the correct statement
(1 Point)
Histograms and bar plots are used for categorical and numeric data respectively.
Regularise
Generalize
Justify
Optimize
25.
In linear regression, the least squares method is used to
(1 Point)
Determine how to partition the data into training and test sets.
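For context on the least squares question: in simple linear regression, least squares chooses the line that minimizes the sum of squared differences between observed and predicted values. A minimal sketch for one input variable, using the closed-form solution; `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points lying exactly on y = 2x + 1
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```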
Regression Only
All
data
none of these
feature
domain
29.
What is Dimensionality reduction?
(1 Point)
Dimensionality reduction is finding a smaller subset of features that can effectively capture
the characteristics of the input data.
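One simple way to pick a smaller subset of features, shown here only as an illustration, is to drop features whose values barely vary across samples. This is a sketch of variance-based feature selection, which is one technique among many and not necessarily the one the quiz has in mind; `select_features`, the threshold, and the rows are hypothetical.

```python
from statistics import pvariance

def select_features(rows, min_variance):
    """Keep only the feature columns whose variance exceeds min_variance."""
    columns = list(zip(*rows))
    keep = [i for i, col in enumerate(columns) if pvariance(col) > min_variance]
    reduced = [tuple(row[i] for i in keep) for row in rows]
    return keep, reduced

# Hypothetical data: column 1 is constant, column 2 is nearly constant
rows = [
    (1.0, 5.0, 0.1),
    (2.0, 5.0, 0.1),
    (3.0, 5.0, 0.2),
    (4.0, 5.0, 0.1),
]
keep, reduced = select_features(rows, min_variance=0.01)
```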
30.
Cluster results can be used to
(1 Point)
Segment the data into groups so that each group can be analyzed further
Centroid clustering
K-Mean clustering
Density clustering
Simple clustering
33.
Which is not a way to accomplish pre-pruning in decision trees?
(1 Point)
None
Non-linear
Linear
Doubts - 8, 23, 26, 32, 34. Please write the question and answer no.
PLEASE CONFIRM REMAINING 4
-> Choose the correct statement - Bar plots never use aggregation - I picked this since
bar plots are for categorical data and histograms are for numeric data
-> Which of the following is not a type of clustering algorithm? - Simple Clustering - SURE?
-> Which of the following is false for Apache Spark? - Enables powerful interactive and
data analytics application across live streaming data
submit??
Submitted - shameek ++
Thanks everyone
Submitted - Devesh