
Indian Institute of Technology Kharagpur

Centre of Excellence in Artificial Intelligence


AI61003 Linear Algebra for AI and ML
Assignment 2, Due on: October 20, 2021

ANSWER ALL THE QUESTIONS

1. Let A, B ∈ Rn×n . Prove that ∥AB∥2 ⩽ ∥A∥2 ∥B∥2 . This property of the 2-norm
is called the sub-multiplicativity property. Does this property also hold for the
Frobenius norm?
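Before attempting the proof, the inequality can be sanity-checked numerically. The following NumPy sketch (random matrices, one sample, illustrative only and not a proof) tests sub-multiplicativity for both the 2-norm and the Frobenius norm:

```python
import numpy as np

# Numerical check (not a proof): ||AB|| <= ||A|| ||B|| for the spectral
# (2-) norm and for the Frobenius norm, on one random sample.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
B = rng.standard_normal((5, 5))

lhs_2 = np.linalg.norm(A @ B, 2)
rhs_2 = np.linalg.norm(A, 2) * np.linalg.norm(B, 2)

lhs_f = np.linalg.norm(A @ B, 'fro')
rhs_f = np.linalg.norm(A, 'fro') * np.linalg.norm(B, 'fro')

print(lhs_2 <= rhs_2, lhs_f <= rhs_f)
```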

2. Let A ∈ Rn×n be an invertible matrix. Define max mag(A), min mag(A) and
cond(A). Show that

   (a) max mag(A) = 1 / min mag(A−1 )

   (b) cond(A) = max mag(A) / min mag(A)
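For the 2-norm, max mag(A) and min mag(A) are the largest and smallest singular values of A, so both identities can be sanity-checked numerically before proving them (a sketch, assuming NumPy; the SVD interpretation is the standard one but is stated here as background, not as part of the problem):

```python
import numpy as np

# Numerical sanity check of (a) and (b) via singular values:
# max mag(A) = sigma_max(A), min mag(A) = sigma_min(A) for the 2-norm.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))

s = np.linalg.svd(A, compute_uv=False)                  # singular values of A
s_inv = np.linalg.svd(np.linalg.inv(A), compute_uv=False)

max_mag, min_mag = s[0], s[-1]
check_a = np.isclose(max_mag, 1.0 / s_inv[-1])          # (a)
check_b = np.isclose(np.linalg.cond(A, 2), max_mag / min_mag)  # (b)
print(check_a, check_b)
```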
3. In each of the following cases, consider the matrix A ∈ Rm×n as a linear
function from Rn to Rm . Plot the unit sphere in Rn . Plot the ellipsoid obtained
in Rm as the image of the unit sphere in Rn . Compute the condition number of
A (using an inbuilt command). Further, if m = n, check whether the matrix is
invertible. Compute the determinant of A as well. Is there any relationship
between determinant and condition number?
 1 
−√ 0
 2 
(a) A =  0
 1 
− √ 
2
 
−1 1
 
−2 1 2
(b) A =
0 2 0
 
1 0.9
(c) A =
0.9 0.8
 
1 0
(d) A =
0 −10
 
1 1
(e) A = , where ε = 10, 5, 1, 10−1 , 10−2 , 10−4 , 0.
1 ε
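The numerical parts above can be sketched as follows (plotting of the sphere and ellipsoid is omitted; this assumes NumPy, and `np.linalg.cond` with `p=2` also accepts non-square matrices):

```python
import numpy as np

# Condition number (2-norm) for each matrix; determinant when square.
mats = {
    'a': np.array([[-1/np.sqrt(2), 0], [0, -1/np.sqrt(2)], [-1, 1]]),
    'b': np.array([[-2, 1, 2], [0, 2, 0]]),
    'c': np.array([[1, 0.9], [0.9, 0.8]]),
    'd': np.array([[1, 0], [0, -10]]),
}
conds = {}
for name, A in mats.items():
    conds[name] = np.linalg.cond(A, 2)
    if A.shape[0] == A.shape[1]:                 # square: also check invertibility
        print(name, conds[name], np.linalg.det(A))
    else:
        print(name, conds[name])

# Part (e): one condition number per value of epsilon.
for eps in [10, 5, 1, 1e-1, 1e-2, 1e-4, 0]:
    print('e', eps, np.linalg.cond(np.array([[1.0, 1.0], [1.0, eps]]), 2))
```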
4. For a matrix A with the property that the columns of A are linearly indepen-
dent, give the geometrical interpretation of the least squares solution to the
problem Ax = b and justify the name normal equations. In case the matrix
A does not have linearly independent columns, comment on the nature of the
least squares solution.
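The geometric picture can be checked numerically: the LS residual Ax̂ − b is orthogonal ("normal") to the column space of A, i.e. A⊤ (Ax̂ − b) = 0. A small sketch, assuming NumPy and a generic random A with independent columns:

```python
import numpy as np

# Verify the orthogonality ("normality") of the LS residual to range(A).
rng = np.random.default_rng(5)
A = rng.standard_normal((8, 3))   # generic case: linearly independent columns
b = rng.standard_normal(8)

x_hat = np.linalg.lstsq(A, b, rcond=None)[0]
normal_resid = A.T @ (A @ x_hat - b)   # should be ~0 in each component
print(normal_resid)
```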

5. Consider the system of linear equations Ax = b where A ∈ Rn×n is an invertible
matrix and b ∈ Rn is a given vector. Discuss the advantages in the case when
A is orthogonal.
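One advantage can be illustrated directly: when A is orthogonal, A⁻¹ = A⊤, so Ax = b is solved by a single matrix-vector product with no factorization, and cond(A) = 1. A sketch assuming NumPy (the orthogonal matrix is generated via QR for illustration):

```python
import numpy as np

# When Q is orthogonal, x = Q^T b solves Qx = b, and cond_2(Q) = 1.
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))   # a random orthogonal matrix
b = rng.standard_normal(5)

x = Q.T @ b                                        # no factorization needed
resid = np.linalg.norm(Q @ x - b)
cond_Q = np.linalg.cond(Q, 2)
print(resid, cond_Q)
```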

6. Bi-linear interpolation: We are given a scalar value at each of the MN grid
points of a grid in R2 , with a typical grid point represented as Pij = (xi , yj )
where i = 1, 2, . . . , M and j = 1, 2, . . . , N , and x1 < x2 < · · · < xM and
y1 < y2 < · · · < yN . Let the scalar value at the grid point Pij be referred to
as Fij for i = 1, 2, . . . , M and j = 1, 2, . . . , N . A bi-linear interpolation is a
function of the form

f (u, v) = θ1 + θ2 u + θ3 v + θ4 uv

where θ1 , θ2 , θ3 , θ4 are the coefficients. This function further satisfies f (Pij ) =
Fij for i = 1, 2, . . . , M and j = 1, 2, . . . , N .

(a) Express these interpolation conditions as a system of linear equations of the
form Aθ = b, where b is an MN-vector consisting of the Fij values. Write
clearly all the entries of A, θ and b and their sizes.
(b) What are the minimum values of M and N so that you may expect a
unique solution to the system of equations Aθ = b?
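The system of part (a) can be sketched on a small hypothetical grid (M = N = 2, with made-up grid coordinates and values): each condition f(xi , yj ) = Fij contributes one row [1, xi , yj , xi yj ] of A, so A is MN × 4 and θ ∈ R4.

```python
import numpy as np

# Build A theta = b for a 2x2 grid (assumed example data, not from the problem).
x = np.array([0.0, 1.0])          # x_1 < x_2
y = np.array([0.0, 2.0])          # y_1 < y_2
F = np.array([[1.0, 2.0],         # F_ij = value at (x_i, y_j)
              [3.0, 5.0]])

rows, b = [], []
for i in range(len(x)):
    for j in range(len(y)):
        rows.append([1.0, x[i], y[j], x[i] * y[j]])   # one row per condition
        b.append(F[i, j])
A = np.array(rows)
b = np.array(b)

theta = np.linalg.solve(A, b)     # 4x4 system: unique when A is invertible
print(theta)
```

With M = N = 2 the system is square (4 equations, 4 unknowns), which is the smallest case where a unique solution can be expected.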

7. Iterative LS : Let A ∈ Rm×n have linearly independent columns and let b ∈ Rm
be a given vector. Further, let x̂ denote the LS solution to the problem Ax = b.
Define x(1) = 0 and, for k = 1, 2, . . . ,

       x(k+1) = x(k) − (1/∥A∥²) A⊤ (Ax(k) − b)

(a) Show that the sequence {x(k) } converges to x̂ as k → ∞.
(b) Discuss the computational complexity of computing {x(k) } for any k ⩾ 1.
(c) Generate a 30 × 10 random matrix A and a 30 × 1 random vector b.
Check that the matrix has full column rank! Run the algorithm for 100
steps. Verify numerically that the algorithm converges to x̂.
(d) Do you think this iterative method may be computationally beneficial
over the direct methods of computing the LS solution?
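Part (c) can be sketched as follows, assuming NumPy and taking ∥A∥ to be the spectral norm (the choice of norm is an assumption; any norm with ∥A∥ ⩾ ∥A∥2 gives a convergent step size):

```python
import numpy as np

# Run the stated iteration and compare with the direct LS solution.
rng = np.random.default_rng(42)
A = rng.standard_normal((30, 10))
b = rng.standard_normal(30)
assert np.linalg.matrix_rank(A) == 10          # full column rank check

x_hat = np.linalg.lstsq(A, b, rcond=None)[0]   # direct LS solution

mu = 1.0 / np.linalg.norm(A, 2) ** 2           # step size 1/||A||^2
x = np.zeros(10)                               # x^(1) = 0
for _ in range(100):
    x = x - mu * (A.T @ (A @ x - b))           # gradient step on ||Ax - b||^2 / 2

err = np.linalg.norm(x - x_hat)
print(err)                                     # shrinks as the steps accumulate
```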

8. Suppose that z1 , z2 , . . . , z100 is observed time series data. An autoregressive
model for this data has the following form:

       ẑt+1 = θ1 zt + · · · + θM zt−M +1 ,    t = M, M + 1, . . . , 100

where M is the memory or the lag of the model. This model can be used to
predict the next observation in the time series.

(a) Set up a least squares problem to estimate the parameters in the model.
(b) Clearly write down the matrices A and b in the least squares formulation.
(c) What is the special structure that one can observe in A?

(d) Is there any relation of rank of A with M ?
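The construction of parts (a)-(c) can be sketched on synthetic data (a random stand-in for the observed series, with an assumed lag M = 5): each row of A holds M consecutive past values, which produces the Toeplitz-like structure asked about in (c).

```python
import numpy as np

# Least squares setup for the AR model on a synthetic series z_1..z_100.
rng = np.random.default_rng(7)
z = rng.standard_normal(100)      # stand-in data (the real z's come from the problem)
M = 5                             # assumed lag

# Row for time t uses z_t, z_{t-1}, ..., z_{t-M+1}; the target is z_{t+1}.
# With 0-based indexing, usable rows correspond to t = M-1, ..., 98.
A = np.array([[z[t - k] for k in range(M)] for t in range(M - 1, 99)])
b = z[M:100]

theta = np.linalg.lstsq(A, b, rcond=None)[0]
print(A.shape, theta.shape)
```

Shifting one row down and one column right reproduces the same entry, which is the Toeplitz structure of A.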

9. Polynomial Classifier: Generate 500 random vectors x(i) ∈ R2 for i = 1, 2, . . . , 500
from a standard normal distribution. Define, for i = 1, 2, . . . , 500,

       y(i) = +1 if x1(i) x2(i) ⩾ 0, and y(i) = −1 otherwise.

Fit a polynomial least squares classifier of degree 2 to the data set using the
polynomial

f̃(x) = θ1 + θ2 x1 + θ3 x2 + θ4 x1 x2 + θ5 x1² + θ6 x2²

(a) Give the error rate of the classifier using the confusion matrix.
(b) Show the regions in the R2 plane where the classifier model gives f̂(x) = 1
and where it gives f̂(x) = −1.
(c) Does the second degree polynomial g = x1 x2 classify the generated points
with zero error? Compare the parameters of the polynomial model estimated
from the data with those of g.
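The fitting and evaluation steps (without the plot of part (b)) can be sketched as follows, assuming NumPy and a fixed seed for reproducibility:

```python
import numpy as np

# Degree-2 polynomial LS classifier on the +/-1 labels; the prediction is
# the sign of the fitted polynomial, tabulated in a 2x2 confusion matrix.
rng = np.random.default_rng(3)
X = rng.standard_normal((500, 2))
y = np.where(X[:, 0] * X[:, 1] >= 0, 1.0, -1.0)

# Feature map [1, x1, x2, x1*x2, x1^2, x2^2], matching the given polynomial.
x1, x2 = X[:, 0], X[:, 1]
Phi = np.column_stack([np.ones(500), x1, x2, x1 * x2, x1**2, x2**2])

theta = np.linalg.lstsq(Phi, y, rcond=None)[0]
pred = np.where(Phi @ theta >= 0, 1.0, -1.0)

confusion = np.array([[np.sum((y == s) & (pred == p)) for p in (1.0, -1.0)]
                      for s in (1.0, -1.0)])
error_rate = 1.0 - np.trace(confusion) / 500
print(confusion, error_rate)
```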

10. MNIST dataset: For each of the digits 0, 1, . . . , 9, randomly select 1000 images
to generate a training data set of 10000 images. Similarly, generate a test
data set of 1000 images. Fit a linear least squares classifier to classify the
data set into 10 classes and test the prediction accuracy of the model using
the 10 × 10 confusion matrix. Do not use any inbuilt functions for fitting
the model.
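The fitting step can be sketched on synthetic stand-in data (loading MNIST itself is left to the reader; the sizes below are hypothetical, smaller than the real 10000 × 784 problem): a one-vs-all least squares classifier with ±1 indicator targets, solved via the normal equations rather than an inbuilt fitter.

```python
import numpy as np

# One-vs-all linear LS classifier: one theta column per class, prediction
# by argmax, accuracy read off a 10x10 confusion matrix.
rng = np.random.default_rng(10)
n_train, n_feat = 1000, 50          # stand-ins for 10000 images x 784 pixels
X = rng.standard_normal((n_train, n_feat))
labels = rng.integers(0, 10, n_train)

A = np.column_stack([np.ones(n_train), X])   # prepend a bias column
Y = -np.ones((n_train, 10))
Y[np.arange(n_train), labels] = 1.0          # +1 for the true class, -1 otherwise

Theta = np.linalg.solve(A.T @ A, A.T @ Y)    # normal equations, no inbuilt fitter
pred = np.argmax(A @ Theta, axis=1)

confusion = np.zeros((10, 10), dtype=int)
for true, p in zip(labels, pred):
    confusion[true, p] += 1
print(confusion.trace() / n_train)
```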

****************** THE END ******************

