Int 354 ML-1

Registration No.: 222321NT10137
Paper Code: A
Course Code: INT354
Course Title: MACHINE LEARNING-I
Max Marks: 70
Time Allowed: 03:00 hrs.
Read the following instructions carefully before attempting the question paper.
1. Match the Paper Code shaded on the OMR Sheet with the Paper Code mentioned on the question paper and ensure that both are the same.
2. This question paper is divided into two parts, A and B.
3. Part A contains 30 questions of 1 mark each. 0.25 marks will be deducted for each wrong answer.
4. Part B contains 5 questions of 10 marks each. Attempt any 4 questions out of these 5 questions. In case all the questions are attempted, then only the first four attempted questions will be evaluated.
5. Attempt all the questions in serial order.
6. Do not write or mark anything on the question paper except your registration no. on the designated space.
7. After completion of first 90 minutes, the OMR will be taken by the invigilator.
8. Submit the question paper and the rough sheet(s) along with the answer sheet to the invigilator before leaving the examination hall.
Part-A
Q1)
1) Which of the following is true about regularized linear regression model?
(a) Increase in regularization parameter (lambda) will make the model underfit the data and the validation error will go up.
(b) Decrease in regularization parameter (lambda) will make the model overfit the data and the training error will go up.
(c) Increase in regularization parameter (lambda) will make the model underfit the data and the training error will go down.
(d) All of the above are true
CO4,L6
2) Which one of the statements is true regarding residuals in regression analysis?
(a) Mean of residuals is always zero (b) Mean of residuals is always less than zero
(c) Mean of residuals is always greater than zero (d) There is no such rule for residuals.
CO4,L6
3) Choose the correct statement/statements:
S1: RANSAC model can estimate the parameters with a high degree of accuracy even when a significant number of outliers are present in the data set.
S2: In RANSAC model, the number of iterations increases logarithmically with outlier percentage.
(a) S1 is true and S2 is true (b) S1 is true and S2 is false
(c) S1 is false and S2 is true (d) S1 is false and S2 is false
CO5,L2
4) The performance of classification algorithms does not depend on
(a) Size of data set (b) Total number of features
(c) Classes are linearly separable or not (d) Evaluation parameter
CO4,L6
5) To represent perfect positive correlation, the Pearson coefficient in correlation analysis should be
(a) 1 (b) <=1 (c) -1 < coefficient < 1 (d) -1
CO4,L6
6) Choose the correct statement in terms of handling overfitting
I. Increase the dimensionality of data
II. Decrease the dimensionality of data
III. Use regularization method
IV. Use kernel
(a) I and III (b) I and II (c) II and III (d) II and IV
CO1,L4
7) Which one is true?
(A) Ridge regression decreases the complexity of a model but does not reduce the number of variables, since it never leads to a coefficient being zero; rather, it only minimizes it.
(B) Lasso regression is not good for feature reduction.
(C) As the regularization parameter increases, the value of the coefficient tends towards zero. This leads to both low variance (as some coefficient leads to negligible effect on prediction) and low bias (minimization of coefficient reduces the dependency of prediction on a particular variable).
(a) Only A and B (b) Only A and C (c) All A, B and C (d) None of them
CO1,L4
8) Which of the following are used for handling missing data?
(a) Deleting rows (b) Pairwise deletion
(c) Imputations (d) All of the above
CO1,L4
9) In the linear regression model of the following equation, what is θj?
(a) θj is jth model parameter (b) θj is jth feature value
(c) θj is jth predicted value (d) None of the above
CO4,L6
10) A simple tabulation of frequency of each category is best for ______ categorical data.
(a) Univariate non-graphical (b) Multivariate non-graphical
(c) Univariate graphical (d) Multivariate categorical
CO4,L6
11) Which of the following options is true regarding "Regression" and "Correlation"? Note: y is dependent variable and x is independent variable.
(a) The relationship is symmetric between x and y in both.
(b) The relationship is not symmetric between x and y in both.
(c) The relationship is not symmetric between x and y in case of correlation but in case of regression it is symmetric.
(d) The relationship is symmetric between x and y in case of correlation but in case of regression it is not symmetric.
CO5,L2
12) Consider the following confusion matrix. What is the precision of the model?
             Predicted
             Class_pos   Class_neg
Class_pos    114         86
Class_neg    7           93
(a) 0.57 (b) 0.75 (c) 0.94 (d) 0.40
CO5,L2
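As a hedged illustration of how precision and recall are read off a confusion matrix such as the one in question 12, the short sketch below assumes that rows hold the actual classes and columns hold the predicted classes (the paper labels only the columns as predicted). The helper name `precision_recall` is hypothetical.

```python
def precision_recall(tp, fn, fp, tn):
    """Return (precision, recall) for the positive class."""
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    return precision, recall

# Assumed layout: rows = actual class, columns = predicted class.
tp, fn = 114, 86  # actual positive row
fp, tn = 7, 93    # actual negative row

precision, recall = precision_recall(tp, fn, fp, tn)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

If the matrix were transposed (rows as predictions), the two values would swap, so the layout assumption matters.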
13) Consider the following dataset. Let us suppose we apply Perceptron Model on the following dataset.
0.1 02
0.18
0.1
1
02
3025 2
Which loss function will be more suitable?
(a) Generalized Loss (b) Least Square Error Loss (c) 0-1 Loss (d) All of the mentioned
CO5,L2
14) Which of the following performance metric is used to evaluate regression models?
a) Accuracy score
c) Mean squared error
(a) Both a and b (b) Both b and c
(c) All a, b and c (d) None of the given options
CO5,L2
15) The strength (degree) of the correlation between a set of independent variables X and a dependent variable Y is measured by
(a) Coefficient of Correlation (b) Coefficient of Determination
(c) Standard error of estimate (d) Probability
CO5,L2
16) Which of the following model uses Sigmoid activation function?
(a) Linear Regression (b) Polynomial regression (c) Multiple regression (d) Logistic regression
CO5,L2
17) Which of the following performance metric is used to evaluate regression models?
a) Accuracy score
b) R2 score
c) Mean squared error
(a) a and b (b) b and c (c) a and c (d) a, b and c
CO5,L2
18) Which of the following performance metric is used to evaluate regression models?
a) Accuracy score
b) R2 score
c) Mean Squared Error
(a) Both a and b (b) Both b and c (c) All a, b and c (d) Only a
CO5,L2
19) The quality of a positive prediction made by the model is known as
(a) Precision (b) R2 Score (c) Recall (d) Accuracy
CO5,L2
20) The proportion of the variance in the dependent variable that is predictable from the independent variable(s) is known as
(a) Accuracy (b) Precision (c) Recall (d) R2 Score
CO5,L2
21) According to no free lunch theorem:
(a) One classifier can be preferred over another without prior knowledge
(b) All classifiers perform equally if performance is averaged over all objective functions
(c) One feature can be preferred over another without prior knowledge
(d) All classifiers do not perform equally if performance is averaged over all objective functions
CO5,L2
22) Natarajan dimension is the generalization of
(a) Rademacher complexity (b) VC Dimension
(c) Non-uniform learnability (d) Consistency Learnability
CO5,L2
23) Regarding bias and variance, which of the following statements are true?
(a) Models which overfit have a high bias. (b) Models which overfit have a low bias.
(c) Models which underfit have a high variance. (d) None of the mentioned
CO5,L2
24) Let us suppose a dataset consists of 5 classes in total. The classes are circle, rectangle, pentagon, square and triangle. The majority of the samples are predicted as pentagon. Then which type of voting is this?
(a) Unanimity (b) Majority (c) Plurality (d) Data insufficient
CO5,L2
25) Boosting algorithms execute in ______ manner.
(a) Parallel (b) Sequential (c) Ordered (d) None of the mentioned
CO5,L2
26) VC dimension is used for
(a) Finite hypothesis and multiclass classification problem
(b) Infinite hypothesis and multiclass classification problem
(c) Finite hypothesis and binary classification problem
(d) Infinite hypothesis and binary classification problem
CO5,L2
27) Axis aligned rectangles have the VC dimension
(a) 1 (b) 2 (c) 3 (d) 4
CO5,L2
28) Which of the following is used to create different decision trees in Random Forest Algorithm?
(a) Stacking (b) Boosting (c) Bagging (d) All of the mentioned
CO5,L2
29) VC dimension is used for
(a) Finite hypothesis and multiclass classification problem
(b) Infinite hypothesis and multiclass classification problem
(c) Finite hypothesis and binary classification problem
(d) Infinite hypothesis and binary classification problem
CO5,L2
30) One-versus-all can be performed in which algorithm
(a) Random forest (b) SGD Classifier
(c) Naive bayes classifier (d) All of the above
CO5,L2
Part-B
Q2) Consider the following dataset with the target attribute as 'is there a party'. Find the root node of the decision tree using ID3 algorithm.
Deadline  Is there a party?  Lazy  Activity
Urgent Party
Urgent Yes Study
Near Yes Yes Party
None Yes No Party
None Yes Pub
None Yes No Party
Near No No Study
Near No Yes TV
Near Yes Yes Party
Urgent No No Study
CO2,L4, [10 marks]
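A minimal sketch of the root-node selection step in ID3, assuming the usual entropy-based information gain. It uses a small hypothetical dataset (attributes `weather` and `windy`, target `play`), not the table from the question, and the helper names `entropy` and `information_gain` are illustrative.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attr, target):
    """Entropy of the target minus its weighted entropy after splitting on attr."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

# Hypothetical toy data, NOT the table from the question.
rows = [
    {"weather": "sunny", "windy": "no", "play": "yes"},
    {"weather": "sunny", "windy": "yes", "play": "yes"},
    {"weather": "rainy", "windy": "no", "play": "no"},
    {"weather": "rainy", "windy": "yes", "play": "no"},
]

gains = {a: information_gain(rows, a, "play") for a in ("weather", "windy")}
root = max(gains, key=gains.get)  # ID3 picks the attribute with the highest gain
print(gains, "root:", root)
```

In this toy data `weather` separates the target perfectly, so it is chosen as the root; the same two functions can be applied to any table of categorical attributes.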

Q3). Write a program to classify text1.csv data into three classes using naive bayes classifier and evaluate the performance using K-fold cross validation.
CO3,L3, [10 marks]
Q4). Propose the different solutions to overcome the problems of overfitting in regression methods.
CO4,L6, [10 marks]
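One of the solutions usually discussed for overfitting in regression is L2 (ridge) regularization. The sketch below is an illustration under simplifying assumptions: a single feature, no intercept, and made-up data. In that case the ridge estimate has the closed form w = sum(x*y) / (sum(x*x) + lambda). The function name `ridge_slope` is hypothetical.

```python
def ridge_slope(xs, ys, lam):
    """Closed-form ridge estimate for y ~ w*x (no intercept).

    Minimizes sum((y - w*x)^2) + lam * w^2.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Made-up data for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

lambdas = (0.0, 1.0, 10.0, 100.0)
slopes = [ridge_slope(xs, ys, lam) for lam in lambdas]
for lam, w in zip(lambdas, slopes):
    print(f"lambda={lam:6.1f}  w={w:.4f}")
```

The coefficient shrinks towards zero as lambda grows, which reduces variance at the cost of added bias; a very large lambda therefore underfits.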
Q5). Explain different evaluation parameters of regression model.
CO5,L2, [10 marks]
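As a hedged sketch of the evaluation parameters commonly used for regression models (mean squared error, root mean squared error, mean absolute error and the R2 score), the following computes them from first principles on made-up values. The function name `regression_metrics` is hypothetical.

```python
from math import sqrt

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R2 for paired true/predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)               # residual sum of squares
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)   # total sum of squares
    mse = ss_res / n
    return {
        "mse": mse,
        "rmse": sqrt(mse),
        "mae": sum(abs(e) for e in errors) / n,
        "r2": 1 - ss_res / ss_tot,
    }

# Made-up values for illustration only.
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.5, 5.0, 7.5, 9.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Accuracy is not in this list because it counts exact class matches, which is meaningful for classification rather than for continuous targets.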
Q6). Discuss the different methods of ensemble learning with suitable examples.
CO5,L2, [10 marks]
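A minimal sketch of one ensemble idea, hard (plurality) voting, where several models each predict a label and the most frequent label wins. The three prediction lists are hypothetical stand-ins for the outputs of already trained classifiers, and the function name `plurality_vote` is illustrative.

```python
from collections import Counter

def plurality_vote(predictions):
    """Combine per-model label lists (all the same length) by plurality vote.

    Ties are broken in favour of the label seen first among the votes.
    """
    combined = []
    for sample_votes in zip(*predictions):
        label, _count = Counter(sample_votes).most_common(1)[0]
        combined.append(label)
    return combined

# Hypothetical outputs of three classifiers on the same three samples.
model_a = ["circle", "square", "pentagon"]
model_b = ["circle", "pentagon", "pentagon"]
model_c = ["square", "pentagon", "triangle"]

ensemble = plurality_vote([model_a, model_b, model_c])
print(ensemble)  # ['circle', 'pentagon', 'pentagon']
```

Bagging methods such as random forests typically combine independently trained trees in this way, whereas boosting trains its models sequentially and weights their votes.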
--End of Question paper--
