Assignment 7 (Sol.) : Introduction To Machine Learning Prof. B. Ravindran
Introduction to Machine Learning
Prof. B. Ravindran
1. Which of the following constitute Type I errors?
(a) the null hypothesis is rejected when it is true.
(b) the null hypothesis is accepted when it is false.
(c) the null hypothesis is accepted when it is true.
(d) the alternate hypothesis is accepted when it is true.
Solution: A
By definition of Type I errors.
2. Suppose you are an online advertiser (like Google Ads) that accepts advertisements
(consisting of short text and a link) from your customers (companies such as Samsung or
Hindustan Unilever). You need to build a system which, when an ad page is submitted to it,
classifies it as spam or not spam, and immediately adds it to your corpus of ads if it is not
spam. Your development team has come up with two systems, system A and system B, to
perform this task. You need to evaluate which system is better for the task using
hypothesis-testing-based methods. Which of these variables are likely to be extraneous to the
task? (Note that multiple answers may be correct.)
Solution: C, D
A and B are the variables whose outcomes we want to monitor, and they influence the
choice of the system. C and D also influence the quality, but they are not being modelled;
hence they are extraneous.
3. Suppose that a psychologist wants to evaluate the effectiveness of a new learning strategy.
She randomly assigns students to two groups and assigns each student the same passage on
a particular topic to study for half an hour. Subsequently each student participates in an
individual assessment on the topic, where students of one group use the new learning
strategy, and students of the other group use any strategy they prefer. Which among the
following is an extraneous variable in the above experiment?
Solution: C
The number of groups is part of the setup and does not affect the effectiveness of the learning
strategy. The amount of time is constant for both groups, hence it is not a factor. The
amount of prior knowledge the students have can influence the results.
4. In the previous question, what step has the experimenter taken to reduce the effect of extra-
neous variables?
Solution: B
Assigning students randomly attempts to prevent any unfair advantage to either group arising
due to existing knowledge among the students.
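Random assignment of this kind can be sketched in a few lines. The snippet below is only an illustration: the function name `assign_groups`, the student labels, and the group size of 20 are assumptions made for the example and do not come from the question.

```python
import random


def assign_groups(students, seed=None):
    """Randomly split students into two groups of (near) equal size.

    Hypothetical helper for illustration; not part of the assignment.
    """
    rng = random.Random(seed)
    shuffled = list(students)   # copy so the caller's list is untouched
    rng.shuffle(shuffled)       # random order removes any systematic bias
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Made-up student labels, purely for demonstration.
students = [f"student_{i}" for i in range(1, 21)]
new_strategy_group, free_choice_group = assign_groups(students, seed=0)
print(len(new_strategy_group), len(free_choice_group))
```

Because every student is equally likely to land in either group, differences in prior knowledge tend to balance out across the groups rather than favour one of them.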
5. I have sampled 20 points from an unknown probability distribution. The sample mean is 5.0
and the standard deviation of the sample is 2.3. Estimate a 95% confidence interval for the
mean of the distribution. (You might need to round your answer a little bit to agree with the
right option. You can use the t-table available here)
Solution: C
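$\mu = 5$
$\sigma = 2.3$
Compute the standard error,
$$SE = \frac{\sigma}{\sqrt{n}} = \frac{2.3}{\sqrt{20}} \approx 0.514$$
Now look up the critical value in the two-tailed t-table for a cumulative probability of 0.975 and 19 degrees of freedom.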
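The remaining arithmetic can be sketched as follows. The critical value 2.093 is the standard t-table entry for a cumulative probability of 0.975 with 19 degrees of freedom; it is hard-coded here as an assumption rather than computed, and the variable names are illustrative.

```python
import math

sample_mean = 5.0
sample_std = 2.3
n = 20
t_crit = 2.093  # assumed t-table value: cumulative probability 0.975, 19 dof

standard_error = sample_std / math.sqrt(n)
margin = t_crit * standard_error
lower, upper = sample_mean - margin, sample_mean + margin
print(f"SE = {standard_error:.4f}, margin = {margin:.4f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

The interval is the sample mean plus or minus the margin of error, so it should be compared against the options after rounding.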
(a) 0.93
(b) 0.959
(c) 0.98
(d) 0.97
Solution: B
$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$
7. What is the sample standard deviation for the accuracies?
(a) 0.0243
(b) 0.0256
(c) 6.5444e-04
(d) 5.8900e-04
Solution: B
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N - 1}}$$
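A minimal sketch of the sample mean and the sample standard deviation (with the N − 1 denominator) is given below. The accuracy values in the list are hypothetical, made up for illustration; they are not the data from the assignment, so the printed numbers will not match the options above.

```python
import math
import statistics

# Hypothetical accuracies for illustration only; NOT the assignment's data.
accuracies = [0.90, 0.92, 0.95, 0.93, 0.91]

n = len(accuracies)
mean = sum(accuracies) / n

# Sample variance divides by N - 1 (Bessel's correction), not N.
sample_var = sum((x - mean) ** 2 for x in accuracies) / (n - 1)
sample_std = math.sqrt(sample_var)

print(f"mean = {mean:.4f}, sample std = {sample_std:.4f}")
```

The standard library's `statistics.stdev` uses the same N − 1 denominator, so it can serve as a cross-check on the hand-written formula.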
8. Estimate a 95% confidence interval for the true accuracy of the classifier.
(a) (0.9407, 0.9773)
(b) (0.9397, 0.9783)
(c) (0.9442, 0.9738)
(d) None of the above.
Solution: A
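$$SE = \frac{0.0256}{\sqrt{10}} = 0.0081$$
From the t-table, the critical value for a cumulative probability of 0.975 with 9 degrees of freedom is 2.262,
making the margin of error 2.262 × 0.0081 ≈ 0.0183. The interval is therefore 0.959 ± 0.0183 = (0.9407, 0.9773).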
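The same calculation can be reproduced in a few lines, using the mean (0.959) and sample standard deviation (0.0256) from the two preceding answers. The critical value 2.262 is the standard t-table entry for 9 degrees of freedom and is hard-coded as an assumption; the variable names are illustrative.

```python
import math

mean_accuracy = 0.959   # answer to question 6
sample_std = 0.0256     # answer to question 7
n = 10
t_crit = 2.262          # assumed t-table value: cumulative probability 0.975, 9 dof

standard_error = sample_std / math.sqrt(n)
margin = t_crit * standard_error
interval = (round(mean_accuracy - margin, 4), round(mean_accuracy + margin, 4))
print(interval)
```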
9. Which of the following statements is/are true?
(a) T-test is used when the number of samples is small.
(b) Z-test is used when the number of samples is small.
(c) T-test assumes the underlying distribution is a normal distribution.
(d) T-test assumes the underlying distribution is a beta distribution.
Solution: A, C
10. If a test of hypothesis has a Type I error probability (α) of 0.01, we mean
(a) If the null hypothesis is true, we don’t reject it 1% of the time.
(b) If the null hypothesis is true, we reject it 1% of the time.
(c) If the null hypothesis is false, we don’t reject it 1% of the time.
(d) If the null hypothesis is false, we reject it 1% of the time.
Solution: B
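The meaning of α can be illustrated with a small simulation: when the null hypothesis is true, a test at α = 0.01 should reject it in roughly 1% of repeated experiments. The sketch below assumes a two-sided z-test on normally distributed data with known standard deviation 1; the sample size, number of trials, and seed are arbitrary choices made for the example.

```python
import random
from statistics import NormalDist

ALPHA = 0.01
N_SAMPLES = 30      # observations per simulated experiment (arbitrary)
N_TRIALS = 20_000   # number of simulated experiments (arbitrary)

rng = random.Random(0)
# Two-sided critical value: reject when |z| exceeds this threshold.
z_crit = NormalDist().inv_cdf(1 - ALPHA / 2)

rejections = 0
for _ in range(N_TRIALS):
    # The null hypothesis (mean = 0) is TRUE for every simulated sample.
    sample = [rng.gauss(0.0, 1.0) for _ in range(N_SAMPLES)]
    z = (sum(sample) / N_SAMPLES) * (N_SAMPLES ** 0.5)  # known sigma = 1
    if abs(z) > z_crit:
        rejections += 1  # a Type I error: rejecting a true null

rejection_rate = rejections / N_TRIALS
print(f"Fraction of true nulls rejected: {rejection_rate:.4f}")
```

The printed fraction should sit close to 0.01, which is the Type I error probability described in option (b).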