University
Question 1
Learning XOR function - 10pts
If we fit the dataset with a logistic regression model, what is the result of the gradient descent
training: Convergence or Non-Convergence?
1. The result of the gradient descent training is Non-Convergence. Because the XOR data
is not linearly separable, there is no line that divides the 0 and 1 classes, so the problem
has no separating solution.
It is easy to see that we cannot find a line dividing the 0 region from the 1 region.
Trying with 2D data:
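The 2-D check can also be run numerically. Below is a minimal sketch (an illustration, not the original experiment: it assumes a plain full-batch gradient-descent loop, and the learning rate and iteration count are arbitrary choices):

```python
import numpy as np

# The four XOR points: no single line separates the 0-labelled from the 1-labelled ones
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0., 1., 1., 0.])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
w = rng.normal(size=2)   # random initial weights
w0 = rng.normal()
lr = 0.5

# Plain (full-batch) gradient descent on the logistic-regression loss
for _ in range(20000):
    p = sigmoid(X @ w + w0)
    w -= lr * (X.T @ (p - y))
    w0 -= lr * np.sum(p - y)

pred = (sigmoid(X @ w + w0) >= 0.5).astype(float)
acc = np.mean(pred == y)
print(acc)  # never reaches 1.0: a linear boundary cannot fit XOR
```

However long the loop runs, at most 3 of the 4 XOR points can be classified correctly by any linear boundary.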
Question 2
1. Apply Newton’s method to train the Logistic Regression Model (write out the concrete
update formulas for the parameter set w, w0 ).
2. Plot the learning path of Newton’s method and of the gradient descent method (as in
the slides) on the AND dataset and the XOR dataset. You should test several initial
values of the parameter set w, w0 .
Note: hθ (x) = 1/(1 + e^(−z) ) and z = θ1 x + θ2
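The learning path asked for in part 2 can be recorded by storing (θ1 , θ2 ) after every update and plotting the stored points. The sketch below does this for both methods on a toy non-separable 1-D dataset (the dataset, learning rate, and iteration counts are illustrative choices, not the assignment's AND/XOR experiments):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(x, y, t1, t2):
    # Gradient of the log-likelihood for the model z = t1*x + t2
    h = sigmoid(t1 * x + t2)
    return np.array([np.sum((y - h) * x), np.sum(y - h)])

def hess(x, y, t1, t2):
    # Hessian of the log-likelihood (negative definite)
    h = sigmoid(t1 * x + t2)
    s = h * (1 - h)
    return -np.array([[np.sum(s * x * x), np.sum(s * x)],
                      [np.sum(s * x),     np.sum(s)]])

# Toy non-separable 1-D data (a stand-in for the AND/XOR experiments)
x = np.array([0., 1., 1., 2., 2., 3.])
y = np.array([0., 0., 1., 0., 1., 1.])

# Gradient ascent path on the log-likelihood
theta = np.zeros(2)
gd_path = [theta.copy()]
for _ in range(50):
    theta = theta + 0.1 * grad(x, y, theta[0], theta[1])
    gd_path.append(theta.copy())

# Newton path: theta <- theta - H^(-1) g
theta = np.zeros(2)
newton_path = [theta.copy()]
for _ in range(10):
    step = np.linalg.solve(hess(x, y, theta[0], theta[1]),
                           grad(x, y, theta[0], theta[1]))
    theta = theta - step
    newton_path.append(theta.copy())

# Each path is a list of (theta_1, theta_2) points; plotting them, e.g. with
# plt.plot(*np.array(gd_path).T), gives the learning-path figure
print(len(gd_path), len(newton_path))  # 51 11
```

Newton's method reaches the optimum in far fewer steps than gradient descent here, which is the contrast the learning-path plots are meant to show.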
We have:
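For this model, the gradient and Hessian of the log-likelihood, and the Newton update built from them, are (a standard derivation, sketched here in the notation of the note above):

```latex
\ell(\theta) = \sum_{i=1}^{n} \Big[ y_i \log h_\theta(x_i)
             + (1 - y_i) \log\big(1 - h_\theta(x_i)\big) \Big]

\nabla \ell(\theta) =
\begin{pmatrix}
\sum_i \big(y_i - h_\theta(x_i)\big)\, x_i \\[4pt]
\sum_i \big(y_i - h_\theta(x_i)\big)
\end{pmatrix},
\qquad
H = -\sum_i h_\theta(x_i)\big(1 - h_\theta(x_i)\big)
\begin{pmatrix}
x_i^2 & x_i \\
x_i & 1
\end{pmatrix}

\theta^{(t+1)} = \theta^{(t)} - H^{-1}\, \nabla \ell\big(\theta^{(t)}\big)
```

Because H is negative definite, the step −H⁻¹∇ℓ moves uphill on the log-likelihood, which is why the update below subtracts H⁻¹g.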
Python code:
import numpy as np

def newtons_method(x, y):
    # Random initial parameters (test several initial values, as the question asks)
    theta_1 = np.random.randn()
    theta_2 = np.random.randn()
    delta_l = np.inf
    l = log_likelihood(x, y, theta_1, theta_2)
    max_iterations = 15
    i = 0
    while abs(delta_l) > 1e-7 and i < max_iterations:
        i += 1
        g = gradient(x, y, theta_1, theta_2)    # gradient of the log-likelihood
        hess = hessian(x, y, theta_1, theta_2)  # 2x2 Hessian of the log-likelihood
        H_inv = np.linalg.inv(hess)
        delta = np.dot(H_inv, g)
        delta_theta_1 = delta[0][0]
        delta_theta_2 = delta[1][0]
        # Newton update step: theta <- theta - H^(-1) g
        theta_1 -= delta_theta_1
        theta_2 -= delta_theta_2
        # Recompute the log-likelihood so the stopping test sees its change
        l_new = log_likelihood(x, y, theta_1, theta_2)
        delta_l = l_new - l
        l = l_new
    return theta_1, theta_2
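The helpers log_likelihood, gradient, and hessian are only named in the listing; one possible implementation for the model z = θ1 x + θ2 from the note (a sketch, not necessarily the original code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(x, y, theta_1, theta_2):
    # Bernoulli log-likelihood of the logistic model h(x) = sigmoid(theta_1*x + theta_2)
    h = sigmoid(theta_1 * x + theta_2)
    return np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(x, y, theta_1, theta_2):
    # Gradient of the log-likelihood, returned as a 2x1 column vector
    h = sigmoid(theta_1 * x + theta_2)
    return np.array([[np.sum((y - h) * x)],
                     [np.sum(y - h)]])

def hessian(x, y, theta_1, theta_2):
    # Hessian of the log-likelihood; negative definite, so the Newton step
    # theta <- theta - H^(-1) g climbs toward the maximum
    h = sigmoid(theta_1 * x + theta_2)
    s = h * (1 - h)
    return -np.array([[np.sum(s * x * x), np.sum(s * x)],
                      [np.sum(s * x),     np.sum(s)]])

# Quick sanity check at theta = (0, 0), where h = 0.5 everywhere
x = np.array([0., 1., 2., 3.])
y = np.array([0., 0., 1., 1.])
print(gradient(x, y, 0.0, 0.0).ravel())  # [2. 0.]
```

With these definitions the update in newtons_method above (subtracting H⁻¹g) is the standard Newton step for maximizing the log-likelihood.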
AND dataset: