
Final Exam: Attempt review 8/7/22, 11:29 PM

My Courses / 2022 Second Summer CSC 7333 for Jianhua Chen / Final Exam / Final Exam

Started on Sunday, August 7, 2022, 10:42 PM


State Finished
Completed on Sunday, August 7, 2022, 11:23 PM
Time taken 40 mins 34 secs
Grade 95.00 out of 100.00

Question 1
Correct 2.00 points out of 2.00

Which ONE of the following components is necessary for a well-defined learning
problem?

Select one:
A. Experience (in various forms, typically training data) the learner is exposed to in some task
environment ✓

B. Human teacher to provide helpful feedback

C. Logic formulas describing the background knowledge

D. Natural language interface between learner and teacher

Question 2
Incorrect 0.00 points out of 2.00

Consider the following statements (1)-(5) regarding the random forest learning
method:

(1) Random forest is exactly the same as bagging with decision trees

https://lsuonline.moodle.lsu.edu/mod/quiz/review.php?attempt=441110&cmid=264585 Page 1 of 16

(2) The random forest method improves on bagging with decision trees by adding a
randomized restriction on the attributes used at each splitting node

(3) We need to build multiple training datasets by sampling from the single
original training dataset
(4) The sampling from the original dataset is sampling with replacement

(5) When making a prediction on a new instance, only the best tree is used for
prediction

You need to select ONE option below that identifies ALL TRUE statements among (1)
- (5).

Select one:
A. (1), (3) and (5)

B. (2), (3) and (5)

C. (1), (3) and (4)

D. (2), (4) and (5) ✗

E. (2), (3) and (4)
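Statements (2)-(4) describe standard random-forest practice: bootstrap sampling with replacement builds one training set per tree, and a random attribute subset is drawn at each splitting node. A minimal pure-Python sketch of those ingredients (the toy data, attribute names, and votes below are illustrative, not from the course):

```python
import random

random.seed(0)  # illustrative seed for reproducibility

# Toy stand-ins: 8 example IDs and 4 attribute names.
dataset = list(range(8))
attributes = ["shape", "color", "size", "weight"]

def bootstrap_sample(data):
    # Statement (4): sampling WITH replacement -- same size as the original,
    # but individual examples may repeat or be left out.
    return [random.choice(data) for _ in data]

def random_attribute_subset(attrs, k=2):
    # Statement (2): random forest's twist over plain bagging -- only a random
    # subset of attributes is considered at each splitting node.
    return random.sample(attrs, k)

# Statement (3): one bootstrapped training set per tree.
samples = [bootstrap_sample(dataset) for _ in range(3)]

# Statement (5) is FALSE: all trees vote and the majority wins,
# not just "the best tree".
votes = ["+", "-", "+"]  # one hypothetical vote per tree
prediction = max(set(votes), key=votes.count)
```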


Question 3
Correct 3.00 points out of 3.00

Select ALL scenarios below such that regularization would help avoid overfitting
when doing gradient descent for linear (or logistic) regression:

Select one or more:
A. There are many input variables, and possibly not all of them are relevant to the target variable. ✓
B. There are many training data points.
C. We are using polynomial features with a polynomial of high degree. ✓
D. There are few input variables.

Question 4
Correct 3.00 points out of 3.00

When applying regularization for logistic regression, regularization is applied to ALL
parameters θ_j, including the parameter θ_0.

Select one:
A. True
B. False ✓
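The "False" reflects the usual convention: the L2 penalty sums over θ_j for j ≥ 1 only, leaving the intercept θ_0 unpenalized. A minimal sketch (the parameter values and loss below are made up for illustration):

```python
def regularized_cost(base_loss, theta, lam):
    # The penalty runs over theta[1:] only, so the intercept theta[0]
    # is NOT regularized.
    penalty = lam * sum(t * t for t in theta[1:])
    return base_loss + penalty

theta = [5.0, 2.0, -1.0]  # theta_0 = 5.0 contributes nothing to the penalty
cost = regularized_cost(base_loss=0.3, theta=theta, lam=0.1)
# 0.3 + 0.1 * (2^2 + (-1)^2) = 0.3 + 0.5 = 0.8
```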


Question 5
Incorrect 0.00 points out of 3.00

One advantage of deep learning is that features automatically extracted by deep
networks can be adapted easily to other domains.

Select one:
True
False ✗

Question 6
Correct 3.00 points out of 3.00

Learning in autoencoders is unsupervised learning.

Select one:
True ✓
False


Question 7
Correct 3.00 points out of 3.00

Consider the target function f , a Boolean function f(x1, x2) = x1 ∨ x2.


The distribution D over the Boolean vectors of (x1, x2) values is given by the table below.
For a learned hypothesis h(x1, x2) = x1 ∧ x2, what is the true error error_D(h) for this
hypothesis h? Calculate the EXACT value of the true error.

x1  x2  D(x1, x2)  f(x1, x2)  h(x1, x2)
0   0   1/4        0          0
0   1   1/8        1          0
1   0   1/8        1          0
1   1   1/2        1          1

Select one:
a. 0.20
b. 0.25 or 0.250 ✓
c. 0.625
d. 0.26
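The true error is the probability mass D assigns to inputs where h disagrees with f, which here is the two middle rows of the table. A quick check:

```python
# Table rows as (D(x1, x2), f(x1, x2), h(x1, x2))
table = [
    (1/4, 0, 0),
    (1/8, 1, 0),  # h disagrees with f
    (1/8, 1, 0),  # h disagrees with f
    (1/2, 1, 1),
]

# error_D(h) = sum of D(x) over all x with h(x) != f(x)
error_D = sum(d for d, f, h in table if f != h)
# 1/8 + 1/8 = 0.25
```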


Question 8
Correct 3.00 points out of 3.00

When training an SVM classifier, we need to select a suitable value for the
hyperparameter C. We understand that if we choose a C value that is too small, then:

Select one:
a. It does NOT impact the SVM learning result.
b. The slack variables would be forced to take small values, and the learned decision hyperplane would
have a small margin. So it may overfit the data.
c. The slack variables could be relatively large, and the learned decision hyperplane may have a
larger margin but misclassify many data points. So it may underfit the data. ✓
d. This will give the best SVM classifier.

Question 9
Correct 3.00 points out of 3.00

Select ALL statements below that are true for logistic regression:

Select one or more:
A. One can solve for the optimal hypothesis h analytically without using gradient descent.
B. Logistic regression handles classification problems instead of regression problems. ✓
C. The sigmoid function used in logistic regression is a non-linear function. ✓
D. If we use mean squared error as the loss function for training a logistic regression model, the loss
function is convex.
E. The output of a logistic regression model can be interpreted as a probability. ✓


Question 10
Correct 3.00 points out of 3.00

Identify ALL true statements below that suggest possible ways to handle local
minima and avoid overfitting in Backpropagation (BP) training:
1. Consider doing multiple BP runs each with randomly selected initial weights to
enhance the chance of finding the global minima
2. The learning rate should always be very small
3. Consider adding a regularization term to the error function to tackle overfitting
4. Consider adding momentum in BP training to help avoid local minima
5. When designing the network topology, we should use many hidden layer nodes
to fit the data well

Which choice below is correct?

Select one:
A. 1, 2, 4
B. 1, 3, 4 ✓
C. 3, 4, 5
D. 1, 4, 5
E. 2, 3


Question 11
Correct 3.00 points out of 3.00

Select ALL statements from below that are true about using the “kernel trick” in SVM
learning.

Select one or more:
a. Using kernels, SVM can handle learning problems in which the decision boundaries are inherently
non-linear. ✓
b. When using kernels in SVM learning, the training algorithm needs to explicitly compute the dot product in
the transformed feature space ϕ(x).
c. In SVM learning, the training algorithm avoids explicitly computing the dot product in the
transformed feature space by using kernels: k(x_i, x_j) = ϕ(x_i)·ϕ(x_j) ✓
d. There is NO need to tune any hyperparameters in applying the “kernel trick”.
e. Intuitively, one can think of k(x_i, x_j) as computing some kind of “similarity” between data points
x_i and x_j. ✓
f. When using the Gaussian kernel (“rbf” in sklearn), the smaller the value of the hyperparameter γ, the
more flat (not so narrowly sharp) the function surface k(x, y) is around data point x. ✓
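Option f can be checked numerically with the Gaussian kernel k(x, y) = exp(-γ·||x - y||²): a smaller γ keeps k close to 1 even far from x, giving a flatter surface. The sample points and γ values below are arbitrary:

```python
import math

def rbf(x, y, gamma):
    # Gaussian (RBF) kernel on scalars: exp(-gamma * (x - y)^2)
    return math.exp(-gamma * (x - y) ** 2)

x, y_far = 0.0, 3.0
flat  = rbf(x, y_far, gamma=0.01)  # small gamma: similarity stays high far from x
sharp = rbf(x, y_far, gamma=10.0)  # large gamma: similarity collapses quickly
# flat is about 0.914 while sharp is essentially 0
```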


Question 12
Correct 5.00 points out of 5.00

Which one of the following best describes the main advantage of using Bayes Naive
Assumption for developing a practical classification method such as the Naive Bayes
Classifier?

Select one:
a. Naive Bayes Assumption makes it easier for ordinary people to understand probability.
b. Naive Bayes Assumption is always true in real-world applications.
c. Naive Bayes Assumption significantly reduces the number of probabilities to be estimated, thus
making it more efficient to build the classifier. ✓
d. Naive Bayes Assumption enhances the modeling power of the related Naive Bayes Classifier.


Question 13
Correct 8.00 points out of 8.00

Look at the following Bayes Belief Network:

List all variables in the network that are conditionally independent of D, given B.

Select one:
a. {A, C, E, G}
b. {A, C}
c. {A, E}
d. {A, C, E} ✓
e. {A, C, F}


Question 14
Correct 10.00 points out of 10.00

Consider the following Bayes Net with the graph and the CPD tables:

What is the value of Pr(C) according to the network? Calculate the EXACT value.

Select one:
a. 0.26
b. 0.48
c. 0.55
d. 0.52 ✓


Question 15
Correct 10.00 points out of 10.00

The Bayes Net is the same as the previous question.

What is the value for Pr(B) according to the network? Calculate the EXACT value.

Select one:
a. 0.44

b. 0.75

c. 0.34

d. 0.5 ✓

e. 0.50


Question 16
Correct 10.00 points out of 10.00

Assume a diagnostic test is performed on a patient to screen for cancer and returns
a positive result. The test is not perfect: it returns a positive result in only 95% of
cases when the cancer is present, and it returns a negative result in only 97% of cases
when the patient does NOT have cancer. Moreover, the cancer is quite rare: it occurs
in 0.009 of the population.
We represent the event of having cancer by proposition C and the event of the
positive test result by +. What is the value of the numerator in calculating Pr(C|+),
the posterior probability of cancer C after observing the positive test result +?
Remember, by the Bayes theorem, Pr(C|+) = Pr(+|C) Pr(C) / Pr(+).

Here the question asks you to calculate the value of the numerator formula. Round
your calculated numerator value to 4 digits after the decimal point.

Answer: 0.0086 ✓
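As a check, the numerator is Pr(+|C)·Pr(C); using exact fractions avoids any floating-point wobble right at the rounding boundary:

```python
from fractions import Fraction

sensitivity = Fraction(95, 100)  # Pr(+ | C): positive result given cancer
prior       = Fraction(9, 1000)  # Pr(C): prevalence of the cancer

numerator = sensitivity * prior  # Pr(+ | C) * Pr(C)
# 171/20000 = 0.00855, which rounds to 0.0086 at 4 decimal places
```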


Question 17
Correct 10.00 points out of 10.00

Here we are trying to apply the Bayes Optimal Classifier to predict the class label (+
or -) of a new data instance x. Here, after seeing a training dataset D used to train the
hypotheses in hypothesis space H, only 4 hypotheses h1, h2, h3, and h4 have non-
zero posterior probabilities. We have
Pr(h1 | D) = 0.2, Pr(h1(x) = +) = 0.5, Pr(h1(x) = -) = 0.5,
Pr(h2 | D) = 0.3, Pr(h2(x) = +) = 0.2, Pr(h2(x) = -) = 0.8,
Pr(h3 | D) = 0.3, Pr(h3(x) = +) = 0.4, Pr(h3(x) = -) = 0.6,
Pr(h4 | D) = 0.2, Pr(h4(x) = +) = 0.8, Pr(h4(x) = -) = 0.2.

Your task: apply the Bayes Optimal Classifier to compute the probability Pr(c(x) = +|D).
Here c(x) is the prediction of the class label by the target concept c for instance x.
Round your probability calculation to 2 digits after the decimal point - in fact, no
need to round, as the products (and their sum) always have just 2 digits after the
decimal point.

Answer: 0.44 ✓
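The Bayes Optimal Classifier weights each hypothesis's vote by its posterior: Pr(c(x) = + | D) = Σ_i Pr(h_i | D)·Pr(h_i(x) = +). Reproducing the answer from the numbers in the question:

```python
posteriors = [0.2, 0.3, 0.3, 0.2]  # Pr(h_i | D) for h1..h4
prob_plus  = [0.5, 0.2, 0.4, 0.8]  # Pr(h_i(x) = +)

p_plus = sum(p * q for p, q in zip(posteriors, prob_plus))
# 0.10 + 0.06 + 0.12 + 0.16 = 0.44
```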


Question 18
Correct 11.00 points out of 11.00

Consider the following training dataset D shown in the table:

Training Dataset D
Exp_ID Shape Color Size Class
e1 circle blue large +
e2 circle red medium -
e3 circle red large -
e4 square blue small -
e5 square red small +
e6 square red medium +
e7 square blue medium +
e8 square blue large -
e9 triangle red small +
e10 triangle red large +
e11 triangle blue medium +

Assume that we trained a Naïve Bayes Classifier using the above dataset D. Now we
have a new instance
e12 = <circle, red, small>. What is the probability for the positive class "+" using the
Naïve Bayes Classifier formula for this instance e12?
Round your computed result to 3 digits after the decimal point.

Answer: 0.015 ✓
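The 0.015 is the (unnormalized) Naive Bayes score Pr(+)·Pr(circle|+)·Pr(red|+)·Pr(small|+), with every probability estimated by counting in D:

```python
# Dataset D from the table: (shape, color, size, class)
D = [
    ("circle",   "blue", "large",  "+"),
    ("circle",   "red",  "medium", "-"),
    ("circle",   "red",  "large",  "-"),
    ("square",   "blue", "small",  "-"),
    ("square",   "red",  "small",  "+"),
    ("square",   "red",  "medium", "+"),
    ("square",   "blue", "medium", "+"),
    ("square",   "blue", "large",  "-"),
    ("triangle", "red",  "small",  "+"),
    ("triangle", "red",  "large",  "+"),
    ("triangle", "blue", "medium", "+"),
]

pos = [row for row in D if row[3] == "+"]  # 7 positive examples

def p_given_pos(index, value):
    # Maximum-likelihood estimate of Pr(attribute = value | class = +)
    return sum(1 for row in pos if row[index] == value) / len(pos)

# Score for e12 = <circle, red, small> and class "+":
score = (len(pos) / len(D)
         * p_given_pos(0, "circle")   # 1/7
         * p_given_pos(1, "red")      # 4/7
         * p_given_pos(2, "small"))   # 2/7
# (7/11) * (1/7) * (4/7) * (2/7) = 8/539 ≈ 0.0148, which rounds to 0.015
```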


Question 19
Correct 5.00 points out of 5.00

Consider the following statement: “If random variables X and Y are conditionally
independent given random variable Z, then X and Y must also be unconditionally
independent”. Is this statement true or false?

Select one:
True
False ✓
