
Final Exam: Attempt review 8/7/22, 11:29 PM

My Courses / 2022 Second Summer CSC 7333 for Jianhua Chen / Final Exam / Final Exam

Started on Sunday, August 7, 2022, 10:42 PM


State Finished
Completed on Sunday, August 7, 2022, 11:23 PM
Time taken 40 mins 34 secs
Grade 95.00 out of 100.00

Question 1
Correct 2.00 points out of 2.00

Which ONE of the following components is necessary for a well-defined learning
problem?

Select one:
A. Experience (in various forms, typically training data) the learner is exposed to in some task
environment ✓

B. Human teacher to provide helpful feedback

C. Logic formulas describing the background knowledge

D. Natural language interface between learner and teacher

Question 2
Incorrect 0.00 points out of 2.00

Consider the following statements (1)-(5) regarding the random forest learning
method:

(1) Random forest is exactly the same as bagging with decision trees

https://lsuonline.moodle.lsu.edu/mod/quiz/review.php?attempt=441110&cmid=264585 Page 1 of 16

(2) The random forest method improves on bagging with decision trees by adding a
randomized restriction on the attributes used at each splitting node

(3) We need to build multiple training datasets by sampling from the single
original training dataset
(4) The sampling from the original dataset is sampling with replacement

(5) When making a prediction on a new instance, only the best tree is used for
prediction

You need to select ONE option below that identifies ALL TRUE statements among (1)
- (5).

Select one:
A. (1), (3) and (5)

B. (2), (3) and (5)

C. (1), (3) and (4)

D. (2), (4) and (5) ✗

E. (2), (3) and (4)
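Statements (2)-(4) describe standard random-forest practice: bootstrap sampling with replacement builds one training set per tree, and a random attribute subset is drawn at each splitting node. A minimal pure-Python sketch of those ingredients (the toy data, attribute names, and votes below are illustrative, not from the course):

```python
import random

random.seed(0)  # illustrative seed for reproducibility

# Toy stand-ins: 8 example IDs and 4 attribute names.
dataset = list(range(8))
attributes = ["shape", "color", "size", "weight"]

def bootstrap_sample(data):
    # Statement (4): sampling WITH replacement -- same size as the original,
    # but individual examples may repeat or be left out.
    return [random.choice(data) for _ in data]

def random_attribute_subset(attrs, k=2):
    # Statement (2): random forest's twist over plain bagging -- only a random
    # subset of attributes is considered at each splitting node.
    return random.sample(attrs, k)

# Statement (3): one bootstrapped training set per tree.
samples = [bootstrap_sample(dataset) for _ in range(3)]

# Statement (5) is FALSE: all trees vote and the majority wins,
# not just "the best tree".
votes = ["+", "-", "+"]  # one hypothetical vote per tree
prediction = max(set(votes), key=votes.count)
```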


Question 3
Correct 3.00 points out of 3.00

Select ALL scenarios below such that regularization would help avoid overfitting
when doing gradient descent for linear (or logistic) regression:

Select one or more:
A. There are many input variables, and possibly not all of them are relevant to the target variable. ✓
B. There are many training data points.
C. We are using polynomial features with a polynomial of high degree. ✓
D. There are few input variables.

Question 4
Correct 3.00 points out of 3.00

When applying regularization for logistic regression, regularization is applied to ALL
parameters θ_j, including the parameter θ_0.

Select one:
A. True
B. False ✓
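The "False" reflects the usual convention: the L2 penalty sums over θ_j for j ≥ 1 only, leaving the intercept θ_0 unpenalized. A minimal sketch (the parameter values and loss below are made up for illustration):

```python
def regularized_cost(base_loss, theta, lam):
    # The penalty runs over theta[1:] only, so the intercept theta[0]
    # is NOT regularized.
    penalty = lam * sum(t * t for t in theta[1:])
    return base_loss + penalty

theta = [5.0, 2.0, -1.0]  # theta_0 = 5.0 contributes nothing to the penalty
cost = regularized_cost(base_loss=0.3, theta=theta, lam=0.1)
# 0.3 + 0.1 * (2^2 + (-1)^2) = 0.3 + 0.5 = 0.8
```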


Question 5
Incorrect 0.00 points out of 3.00

One advantage of deep learning is that features automatically extracted by deep
networks can be adapted easily to other domains.

Select one:
True
False ✗

Question 6
Correct 3.00 points out of 3.00

Learning in autoencoders is unsupervised learning.

Select one:
True ✓
False


Question 7
Correct 3.00 points out of 3.00

Consider the target function f , a Boolean function f(x1, x2) = x1 ∨ x2.


The distribution D over the Boolean vectors of (x1, x2) values is given by the table below.
For a learned hypothesis h(x1, x2) = x1 ∧ x2, what is the true error error_D(h) for this
hypothesis h? Calculate the EXACT value of the true error.

x1  x2  D(x1, x2)  f(x1, x2)  h(x1, x2)
0   0   1/4        0          0
0   1   1/8        1          0
1   0   1/8        1          0
1   1   1/2        1          1

Select one:
a. 0.20
b. 0.25 or 0.250 ✓
c. 0.625
d. 0.26
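The true error is the probability mass D assigns to inputs where h disagrees with f, which here is the two middle rows of the table. A quick check:

```python
# Table rows as (D(x1, x2), f(x1, x2), h(x1, x2))
table = [
    (1/4, 0, 0),
    (1/8, 1, 0),  # h disagrees with f
    (1/8, 1, 0),  # h disagrees with f
    (1/2, 1, 1),
]

# error_D(h) = sum of D(x) over all x with h(x) != f(x)
error_D = sum(d for d, f, h in table if f != h)
# 1/8 + 1/8 = 0.25
```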


Question 8
Correct 3.00 points out of 3.00

When training an SVM classifier, we need to select a suitable value for the
hyperparameter C. We understand that if we choose a C value that is too small, then:

Select one:
a. It does NOT impact the SVM learning result.
b. The slack variables would be forced to take small values, and the learned decision hyperplane would
have a small margin. So it may overfit the data.
c. The slack variables could be relatively large, and the learned decision hyperplane may have a
larger margin but misclassify many data points. So it may underfit the data. ✓
d. This will give the best SVM classifier.

Question 9
Correct 3.00 points out of 3.00

Select ALL statements below that are true for logistic regression:

Select one or more:
A. One can solve for the optimal hypothesis h analytically without using gradient descent.
B. Logistic regression handles classification problems instead of regression problems. ✓
C. The sigmoid function used in logistic regression is a non-linear function. ✓
D. If we use mean squared error as the loss function for training a logistic regression model, the loss
function is convex.
E. The output of a logistic regression model can be interpreted as a probability. ✓


Question 10
Correct 3.00 points out of 3.00

Identify ALL true statements below that suggest possible ways to handle local
minima and avoid overfitting in Backpropagation (BP) training:
1. Consider doing multiple BP runs each with randomly selected initial weights to
enhance the chance of finding the global minima
2. The learning rate should always be very small
3. Consider adding a regularization term to the error function to tackle overfitting
4. Consider adding momentum in BP training to help avoid local minima
5. When designing the network topology, we should use many hidden layer nodes
to fit the data well

Which choice below is correct?

Select one:
A. 1, 2, 4
B. 1, 3, 4 ✓
C. 3, 4, 5
D. 1, 4, 5
E. 2, 3


Question 11
Correct 3.00 points out of 3.00

Select ALL statements from below that are true about using the “kernel trick” in SVM
learning.

Select one or more:
a. Using kernels, SVM can handle learning problems in which the decision boundaries are inherently
non-linear. ✓
b. When using kernels in SVM learning, the training algorithm needs to explicitly compute the dot product in
the transformed feature space ϕ(x).
c. In SVM learning, the training algorithm avoids explicitly computing the dot product in the
transformed feature space by using kernels: k(x_i, x_j) = ϕ(x_i)·ϕ(x_j) ✓
d. There is NO need to tune any hyperparameters in applying the “kernel trick”.
e. Intuitively, one can think of k(x_i, x_j) as computing some kind of “similarity” between data points
x_i and x_j. ✓
f. When using the Gaussian kernel (“rbf” in sklearn), the smaller the value of the hyperparameter γ, the
more flat (not so narrowly sharp) the function surface k(x, y) is around data point x. ✓
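Option f can be checked numerically with the Gaussian kernel k(x, y) = exp(-γ·||x - y||²): a smaller γ keeps k close to 1 even far from x, giving a flatter surface. The sample points and γ values below are arbitrary:

```python
import math

def rbf(x, y, gamma):
    # Gaussian (RBF) kernel on scalars: exp(-gamma * (x - y)^2)
    return math.exp(-gamma * (x - y) ** 2)

x, y_far = 0.0, 3.0
flat  = rbf(x, y_far, gamma=0.01)  # small gamma: similarity stays high far from x
sharp = rbf(x, y_far, gamma=10.0)  # large gamma: similarity collapses quickly
# flat is about 0.914 while sharp is essentially 0
```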


Question 12
Correct 5.00 points out of 5.00

Which one of the following best describes the main advantage of using Bayes Naive
Assumption for developing a practical classification method such as the Naive Bayes
Classifier?

Select one:
a. Naive Bayes Assumption makes it easier for ordinary people to understand probability.
b. Naive Bayes Assumption is always true in real-world applications.
c. Naive Bayes Assumption significantly reduces the number of probabilities to be estimated, thus
making it more efficient to build the classifier. ✓
d. Naive Bayes Assumption enhances the modeling power of the related Naive Bayes Classifier.


Question 13
Correct 8.00 points out of 8.00

Look at the following Bayes Belief Network:

List all variables in the network that are conditionally independent of D, given B.

Select one:
a. {A, C, E, G}
b. {A, C}
c. {A, E}
d. {A, C, E} ✓
e. {A, C, F}


Question 14
Correct 10.00 points out of 10.00

Consider the following Bayes Net with the graph and the CPD tables:

What is the value of Pr(C) according to the network? Calculate the EXACT value.

Select one:
a. 0.26
b. 0.48
c. 0.55
d. 0.52 ✓


Question 15
Correct 10.00 points out of 10.00

The Bayes Net is the same as the previous question.

What is the value for Pr(B) according to the network? Calculate the EXACT value.

Select one:
a. 0.44

b. 0.75

c. 0.34

d. 0.5 ✓

e. 0.50


Question 16
Correct 10.00 points out of 10.00

Assume a diagnostic test is performed on a patient to screen for cancer and returns
a positive result. The test is not perfect: it returns a positive result in only 95% of
cases when the cancer is present, and it returns a negative result in only 97% of cases
when the patient does NOT have cancer. Moreover, the cancer is quite rare: it occurs
in 0.009 of the population.
We represent the event of having cancer by proposition C and the event of the
positive test result by +. What is the value of the numerator in calculating Pr(C|+),
the posterior probability of cancer C after observing the positive test result +?
Remember, by the Bayes theorem, Pr(C|+) = Pr(+|C) Pr(C) / Pr(+).

Here the question asks you to calculate the value of the numerator formula. Round
your calculated numerator value to 4 digits after the decimal point.

Answer: 0.0086 ✓
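As a check, the numerator is Pr(+|C)·Pr(C); using exact fractions avoids any floating-point wobble right at the rounding boundary:

```python
from fractions import Fraction

sensitivity = Fraction(95, 100)  # Pr(+ | C): positive result given cancer
prior       = Fraction(9, 1000)  # Pr(C): prevalence of the cancer

numerator = sensitivity * prior  # Pr(+ | C) * Pr(C)
# 171/20000 = 0.00855, which rounds to 0.0086 at 4 decimal places
```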


Question 17
Correct 10.00 points out of 10.00

Here we are trying to apply the Bayes Optimal Classifier to predict the class label (+
or -) of a new data instance x. Here, after seeing a training dataset D used to train the
hypotheses in hypothesis space H, only 4 hypotheses h1, h2, h3, and h4 have non-
zero posterior probabilities. We have
Pr(h1 | D) = 0.2, Pr(h1(x) = +) = 0.5, Pr(h1(x) = -) = 0.5,
Pr(h2 | D) = 0.3, Pr(h2(x) = +) = 0.2, Pr(h2(x) = -) = 0.8,
Pr(h3 | D) = 0.3, Pr(h3(x) = +) = 0.4, Pr(h3(x) = -) = 0.6,
Pr(h4 | D) = 0.2, Pr(h4(x) = +) = 0.8, Pr(h4(x) = -) = 0.2.

Your task: apply the Bayes Optimal Classifier to compute the probability Pr(c(x) = +|D).
Here c(x) is the prediction of the class label by the target concept c for instance x.
Round your probability calculation to 2 digits after the decimal point - in fact, no
need to round, as the products (and their sum) always have just 2 digits after the
decimal point.

Answer: 0.44 ✓
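The Bayes Optimal Classifier weights each hypothesis's vote by its posterior: Pr(c(x) = + | D) = Σ_i Pr(h_i | D)·Pr(h_i(x) = +). Reproducing the answer from the numbers in the question:

```python
posteriors = [0.2, 0.3, 0.3, 0.2]  # Pr(h_i | D) for h1..h4
prob_plus  = [0.5, 0.2, 0.4, 0.8]  # Pr(h_i(x) = +)

p_plus = sum(p * q for p, q in zip(posteriors, prob_plus))
# 0.10 + 0.06 + 0.12 + 0.16 = 0.44
```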


Question 18
Correct 11.00 points out of 11.00

Consider the following training dataset D shown in the table:

Training Dataset D
Exp_ID Shape Color Size Class
e1 circle blue large +
e2 circle red medium -
e3 circle red large -
e4 square blue small -
e5 square red small +
e6 square red medium +
e7 square blue medium +
e8 square blue large -
e9 triangle red small +
e10 triangle red large +
e11 triangle blue medium +

Assume that we trained a Naïve Bayes Classifier using the above dataset D. Now we
have a new instance
e12 = <circle, red, small>. What is the probability for the positive class "+" using the
Naïve Bayes Classifier formula for this instance e12?
Round your computed result to 3 digits after the decimal point.

Answer: 0.015 ✓
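The 0.015 is the (unnormalized) Naive Bayes score Pr(+)·Pr(circle|+)·Pr(red|+)·Pr(small|+), with every probability estimated by counting in D:

```python
# Dataset D from the table: (shape, color, size, class)
D = [
    ("circle",   "blue", "large",  "+"),
    ("circle",   "red",  "medium", "-"),
    ("circle",   "red",  "large",  "-"),
    ("square",   "blue", "small",  "-"),
    ("square",   "red",  "small",  "+"),
    ("square",   "red",  "medium", "+"),
    ("square",   "blue", "medium", "+"),
    ("square",   "blue", "large",  "-"),
    ("triangle", "red",  "small",  "+"),
    ("triangle", "red",  "large",  "+"),
    ("triangle", "blue", "medium", "+"),
]

pos = [row for row in D if row[3] == "+"]  # 7 positive examples

def p_given_pos(index, value):
    # Maximum-likelihood estimate of Pr(attribute = value | class = +)
    return sum(1 for row in pos if row[index] == value) / len(pos)

# Score for e12 = <circle, red, small> and class "+":
score = (len(pos) / len(D)
         * p_given_pos(0, "circle")   # 1/7
         * p_given_pos(1, "red")      # 4/7
         * p_given_pos(2, "small"))   # 2/7
# (7/11) * (1/7) * (4/7) * (2/7) = 8/539 ≈ 0.0148, which rounds to 0.015
```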


Question 19
Correct 5.00 points out of 5.00

Consider the following statement: “If random variables X and Y are conditionally
independent given random variable Z, then X and Y must also be unconditionally
independent”. Is this statement true or false?

Select one:
True
False ✓
