EE263s Homework 4
1. Orthogonal matrices.
(a) Show that if U and V are orthogonal, then so is U V .
(b) Show that if U is orthogonal, then so is U^{-1}.
(c) Suppose that U ∈ R^{2×2} is orthogonal. Show that U is either a rotation or a reflection. Make
clear how you decide whether a given orthogonal U is a rotation or a reflection.
Solution.
(a) To prove that U V is orthogonal we have to show that (U V )T (U V ) = I given U T U = I and
V T V = I. We have
(UV)^T(UV) = V^T U^T U V
           = V^T V   (since U^T U = I)
           = I   (since V^T V = I).
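As a quick numerical sanity check (a Python/NumPy sketch; the rest of this solution set uses Matlab), we can build two orthogonal matrices via QR and confirm that their product is again orthogonal:

```python
import numpy as np

# Hypothetical 3x3 example: two orthogonal matrices from QR factorizations
# of random matrices; check that (UV)^T (UV) = I.
rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((3, 3)))
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))
UV = U @ V
assert np.allclose(UV.T @ UV, np.eye(3))  # UV is orthogonal
```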
(b) Since U is orthogonal, U^{-1} = U^T, and (U^{-1})^T U^{-1} = U U^T = U U^{-1} = I, so U^{-1} is orthogonal.
(c) The columns of U are unit vectors, so the first column can be written as (cos α, sin α)^T for some α. The second column is a unit vector orthogonal to the first, which leaves two cases.
• If the second column is (−sin α, cos α)^T, then
U = [cos α  −sin α
     sin α   cos α].
Clearly, from the lecture notes, this represents a rotation. Note that in this case det U =
cos²α + sin²α = 1.
• If the second column is (sin α, −cos α)^T, then
U = [cos α   sin α
     sin α  −cos α].
From the lecture notes, this represents a reflection. The determinant in this case is det U =
−cos²α − sin²α = −1.
Therefore we have shown that any orthogonal matrix in R^{2×2} is either a rotation or a reflection,
according to whether its determinant is +1 or −1, respectively.
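This classification is easy to check numerically. The sketch below (Python/NumPy, with an arbitrary illustrative angle) verifies that both forms are orthogonal and that the determinant distinguishes them:

```python
import numpy as np

# Classify 2x2 orthogonal matrices by determinant: +1 rotation, -1 reflection.
a = 0.7  # illustrative angle
R = np.array([[np.cos(a), -np.sin(a)],
              [np.sin(a),  np.cos(a)]])   # rotation form
F = np.array([[np.cos(a),  np.sin(a)],
              [np.sin(a), -np.cos(a)]])   # reflection form
for M in (R, F):
    assert np.allclose(M.T @ M, np.eye(2))  # both are orthogonal
kind_R = "rotation" if np.isclose(np.linalg.det(R), 1.0) else "reflection"
kind_F = "rotation" if np.isclose(np.linalg.det(F), 1.0) else "reflection"
print(kind_R, kind_F)  # -> rotation reflection
```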
2. Projection matrices. A matrix P ∈ R^{n×n} is called a projection matrix if P = P^T and P² = P.
(a) Show that if P is a projection matrix, then so is I − P.
(b) Show that if U^TU = I, then UU^T is a projection matrix.
(c) Show that if A is full rank, then A(A^TA)^{−1}A^T is a projection matrix.
(d) Show that if P is a projection matrix, then Px is the projection of x on R(P).
Solution.
(a) To show that I − P is a projection matrix we need to check two properties:
i. I − P = (I − P )T
ii. (I − P )2 = I − P .
The first one is easy: (I − P)^T = I − P^T = I − P, because P = P^T (P is a projection matrix).
To show the second property we have
(I − P)² = I − 2P + P²
         = I − 2P + P   (since P² = P)
         = I − P.
(b) Symmetry is immediate: (UU^T)^T = UU^T. For idempotence,
(UU^T)² = (UU^T)(UU^T)
        = U(U^TU)U^T
        = UU^T   (since U^TU = I).
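Both projection properties can be confirmed numerically; here is a small Python/NumPy sketch, using a reduced QR factorization as one convenient way to obtain a U with orthonormal columns:

```python
import numpy as np

# U with orthonormal columns (U^T U = I); check P = U U^T is a projection.
rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((5, 2)))  # reduced QR: U is 5x2
assert np.allclose(U.T @ U, np.eye(2))
P = U @ U.T
assert np.allclose(P, P.T)      # symmetric
assert np.allclose(P @ P, P)    # idempotent
```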
(c) First note that (A(A^TA)^{−1}A^T)^T = A(A^TA)^{−1}A^T because
(A(A^TA)^{−1}A^T)^T = (A^T)^T ((A^TA)^{−1})^T A^T
                    = A ((A^TA)^T)^{−1} A^T
                    = A(A^TA)^{−1}A^T.
Also (A(A^TA)^{−1}A^T)² = A(A^TA)^{−1}A^T because
(A(A^TA)^{−1}A^T)² = A(A^TA)^{−1}A^T A(A^TA)^{−1}A^T
                   = A(A^TA)^{−1}(A^TA)(A^TA)^{−1}A^T
                   = A(A^TA)^{−1}A^T   (since (A^TA)^{−1}A^TA = I).
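A quick numerical check of this part, in the same Python/NumPy style (a random tall matrix is full column rank with probability one, so it serves as an illustrative A):

```python
import numpy as np

# Full-column-rank A; check P = A (A^T A)^{-1} A^T is a projection matrix.
rng = np.random.default_rng(2)
A = rng.standard_normal((6, 3))
P = A @ np.linalg.inv(A.T @ A) @ A.T
assert np.allclose(P, P.T)      # P = P^T
assert np.allclose(P @ P, P)    # P^2 = P
```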
(d) To show that P x is the projection of x on R(P ) we verify that the “error” x − P x is orthogonal
to any vector in R(P ). Since R(P ) is nothing but the span of the columns of P we only need to
show that x − P x is orthogonal to the columns of P , or in other words, P T (x − P x) = 0. But
P^T(x − Px) = P(x − Px)   (since P = P^T)
            = Px − P²x
            = 0   (since P² = P).
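The orthogonality of the error x − Px to R(P) is easy to check numerically; this sketch reuses the projection P = UU^T from part (b) as one concrete projection matrix:

```python
import numpy as np

# The error x - Px is orthogonal to R(P); here P = U U^T as in part (b).
rng = np.random.default_rng(3)
U, _ = np.linalg.qr(rng.standard_normal((4, 2)))
P = U @ U.T
x = rng.standard_normal(4)
err = x - P @ x
assert np.allclose(P.T @ err, 0)   # error orthogonal to the columns of P
```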
3. Let u ∈ R^n be a unit vector (u^Tu = 1), and consider the matrix
Q = I − 2uu^T.
Solution.
(a)
Q^TQ = (I − 2uu^T)^T(I − 2uu^T)
     = (I − 2uu^T)(I − 2uu^T)
     = I − 2uu^T − 2uu^T + 4uu^Tuu^T
     = I − 2uu^T − 2uu^T + 4uu^T   (using u^Tu = 1)
     = I,
so Q is orthogonal.
(b)
Qu = u − 2uu^Tu = u − 2u = −u   (using u^Tu = 1),
Qv = v − 2uu^Tv = v   (using u^Tv = 0).
(c) We know det(Q) = ∏_{i=1}^{n} λ_i. Since Q is symmetric, all eigenvalues are real and we can construct
an orthonormal eigenvector basis. From parts (a) and (b), u is an eigenvector with associated
eigenvalue −1, and any vector v orthogonal to u is an eigenvector with associated eigenvalue 1.
The nullspace of uT has dimension n − 1, so we can construct an orthogonal eigenbasis with all
eigenvalues 1 except for the −1 eigenvalue with eigenvector u. Thus the product of the eigenvalues
is −1 = det(Q).
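The eigenvalue argument above checks out numerically; this Python/NumPy sketch uses one illustrative unit vector u:

```python
import numpy as np

# Q = I - 2uu^T for a unit u: one eigenvalue -1 (on u), the rest are 1.
n = 4
u = np.ones(n) / np.sqrt(n)        # any unit vector works; this is illustrative
Q = np.eye(n) - 2 * np.outer(u, u)
assert np.allclose(Q @ u, -u)                      # Qu = -u
evals = np.sort(np.linalg.eigvalsh(Q))             # Q symmetric: real spectrum
assert np.allclose(evals, [-1.0, 1.0, 1.0, 1.0])
assert np.isclose(np.linalg.det(Q), -1.0)          # product of the eigenvalues
```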
Alternate solution: We proved the matrix inversion lemma in class, and showed that if A and D
are invertible, the following factorizations hold:
[A B; C D] = [I 0; CA^{−1} I] [A 0; 0 D − CA^{−1}B] [I A^{−1}B; 0 I]
           = [I BD^{−1}; 0 I] [A − BD^{−1}C 0; 0 D] [I 0; D^{−1}C I].
Taking determinants of both factorizations with A = I, B = u, C = 2u^T, and D = 1 gives
det(I − 2uu^T) · 1 = 1 · (1 − 2u^Tu),
so det(I − 2uu^T) = 1 − 2u^Tu = −1.
(d) Since Q is orthogonal, Q^TQ = I has all eigenvalues 1, hence all singular values of Q are 1, so
κ(Q) = σmax/σmin = 1 (i.e., Q is as well-conditioned as can be).
(e) We follow the hint and choose u = (x + αe1)/‖x + αe1‖. Then
Qx = x − 2(u^Tx)u = x − (2(x^Tx + αx1)/‖x + αe1‖²)(x + αe1).
For Qx to be a multiple of e1, the coefficient multiplying x + αe1 must equal one, i.e.,
2(x^Tx + αx1) = ‖x + αe1‖² = x^Tx + 2αx1 + α²,
which holds exactly when α² = x^Tx. We can achieve this by choosing α = ±‖x‖. This leads to
Qx = x − (x + αe1) = −αe1 = ∓‖x‖e1 (which makes sense . . . Q
should always preserve norm). Some people used a geometric argument here as well, and this can
make the solution a lot neater if it’s well presented. The idea is to find a reflection plane that
reflects the given vector onto the e1 axis (there are two possibilities, for negative and positive
parts of the e1 axis), and u is then a unit vector orthogonal to this plane.
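The construction can be carried out numerically. In this Python/NumPy sketch (with an illustrative 3-vector x, and the sign choice α = +‖x‖, which assumes x + αe1 ≠ 0), the reflection lands x exactly on the e1 axis:

```python
import numpy as np

# Householder-style step: u = (x + alpha*e1)/||x + alpha*e1||, alpha = +||x||,
# gives Qx = -||x|| e1.
x = np.array([3.0, 4.0, 0.0])      # illustrative input
alpha = np.linalg.norm(x)          # alpha = ||x||
w = x + alpha * np.eye(3)[0]       # x + alpha*e1
u = w / np.linalg.norm(w)
Q = np.eye(3) - 2 * np.outer(u, u)
assert np.allclose(Q @ x, [-alpha, 0.0, 0.0])   # Qx = -||x|| e1
```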
4. Interpolation with rational functions. In this problem we consider a function f : R → R of the form
a0 + a1 x + · · · + am xm
f (x) = ,
1 + b1 x + · · · + bm xm
where a0, . . . , am, and b1, . . . , bm are parameters, with either am ≠ 0 or bm ≠ 0. Such a function is
called a rational function of degree m. We are given data points x1 , . . . , xN ∈ R and y1 , . . . , yN ∈ R,
where yi = f (xi ). The problem is to find a rational function of smallest degree that is consistent with
this data. In other words, you are to find m, which should be as small as possible, and a0 , . . . , am ,
b1 , . . . , bm , which satisfy f (xi ) = yi . Explain how you will solve this problem, and then carry out
your method on the problem data given in ri_data.m. (This contains two vectors, x and y, that give
the values x1 , . . . , xN , and y1 , . . . , yN , respectively.) Give the value of m you find, and the coefficients
a0 , . . . , am , b1 , . . . , bm . Please show us your verification that yi = f (xi ) holds (possibly with some small
numerical errors).
Solution. The equations f(xi) = yi are a set of complicated nonlinear functions of the coefficient vectors
a and b. If we multiply out by the denominator, we get
yi(1 + b1xi + · · · + bmxi^m) − (a0 + a1xi + · · · + amxi^m) = 0,   i = 1, . . . , N.
These equations are linear in a and b. We can write them in matrix form as

G [a; b] = y,   (1)

where a = (a0, a1, . . . , am), b = (b1, b2, . . . , bm), and

G = [ 1  x1  · · ·  x1^m   −y1x1  −y1x1²  · · ·  −y1x1^m
      1  x2  · · ·  x2^m   −y2x2  −y2x2²  · · ·  −y2x2^m
      ⋮   ⋮           ⋮        ⋮       ⋮              ⋮
      1  xN  · · ·  xN^m   −yNxN  −yNxN²  · · ·  −yNxN^m ].
Thus, we can interpolate the data if and only if equation (1) has a solution. Our problem is to find
the smallest m for which these linear equations can be solved or, equivalently, y ∈ R(G). We can do
this by finding the smallest value of m for which rank(G) = rank([G y]).
Then we can find a set of coefficients by solving the equation (1) for a and b. The following Matlab
code carries out this method.
clear all
close all
rat_int_data
for m=1:20 %we sweep over different values of m
G=ones(N,1);
for i=1:m;
G=[G x.^i];
end
for i=1:m
G=[G -x.^i.*y];
end
if rank(G)== rank([G y])
break;
end
end
ab=G\y;
a=ab(1:m+1);
b=ab(m+2:2*m+1);
m
a
b
m =
5
a =
0.2742
1.0291
1.2906
-5.8763
-2.6738
6.6845
b =
-1.2513
-6.5107
3.2754
17.3797
6.6845
Thus, we find that m = 5 gives the lowest order rational function that interpolates the data. We verify
that yi = f(xi) holds, up to small numerical errors:
num=zeros(N,1);
for i=1:m+1
num=a(i)*x.^(i-1)+num;
end
den=ones(N,1);
for i=1:m
den=b(i)*x.^i+den;
end
f=num./den;
err=norm(f-y)
err =
7.7649e-14.
5. Suppose A and B are matrices with the same number of rows.
(a) Show that R(A) ∩ R(B) = (N(A^T) + N(B^T))^⊥.
(b) Find a matrix C for which R(C) = R(A) ∩ R(B), for the matrices A and B given in intersect_range_data.m.
Solution.
(a) We know that
R(A) = N (AT )⊥ .
This means that any y in R(A) is perpendicular to all vectors in the N (AT ); and any vector
which is perpendicular to all vectors in N (AT ), must be in R(A). We will show that
R(A) ∩ R(B) = (N(A^T) + N(B^T))^⊥.
Let y be a vector in R(A) ∩ R(B). Then y = Axa for some xa, and y = Bxb for some xb. Let
v be any vector in N(A^T) + N(B^T). Then v = va + vb for some va ∈ N(A^T), vb ∈ N(B^T).
Then we have
y^Tv = y^Tva + y^Tvb = xa^T(A^Tva) + xb^T(B^Tvb) = 0.
Thus y ⊥ (N(A^T) + N(B^T)). Since any vector in R(A) ∩ R(B) is perpendicular to every vector
in N(A^T) + N(B^T),
R(A) ∩ R(B) ⊆ (N(A^T) + N(B^T))^⊥.
Let y be a vector in (N(A^T) + N(B^T))^⊥. Then y is perpendicular to all vectors in N(A^T), which
means y ∈ R(A). Similarly y is perpendicular to all vectors in N(B^T), which means y ∈ R(B).
Thus y ∈ R(A) ∩ R(B) and we have
R(A) ∩ R(B) = (N(A^T) + N(B^T))^⊥.
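This identity suggests a direct computation. The Python/NumPy sketch below mirrors the Matlab approach used later; since NumPy has no null(), a small SVD-based nullspace helper is defined here, and the test matrices (sharing one column direction) are illustrative:

```python
import numpy as np

def nullspace(M, tol=1e-10):
    # Orthonormal basis for N(M): rows of Vt beyond rank(M), transposed.
    _, s, Vt = np.linalg.svd(M)
    r = int((s > tol).sum())
    return Vt[r:].T

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 3))
B = np.hstack([A[:, :1], rng.standard_normal((6, 2))])   # shares one direction
D = np.hstack([nullspace(A.T), nullspace(B.T)])          # R(D) = N(A^T)+N(B^T)
C = nullspace(D.T)          # R(C) = R(D)^perp = R(A) ∩ R(B)
# Verify: appending C to A or to B does not increase the rank.
assert np.linalg.matrix_rank(np.hstack([A, C])) == np.linalg.matrix_rank(A)
assert np.linalg.matrix_rank(np.hstack([B, C])) == np.linalg.matrix_rank(B)
```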
The full QR factorization of a matrix A is
A = [Q1A Q2A] [R1A; 0],
where the columns of Q2A form an orthonormal basis for N(A^T), so N(A^T) = R(Q2A), and
similarly N(B^T) = R(Q2B). Let D = [Q2A Q2B], so that N(A^T) + N(B^T) = R(D) and, by
part (a), R(A) ∩ R(B) = R(D)^⊥ = N(D^T). Taking the full QR factorization of D,
D = [Q1D Q2D] [R1D; 0],
we can then take C = Q2D, as N(D^T) = R(Q2D) = R(C). Thus we have the matrix C such that
R(C) = R(A) ∩ R(B).
(b) The following Matlab code gives the required matrix C and the dimension of R(C).
clear ;
intersect_range_data;
Q_2A = null(A’);
Q_2B = null(B’);
D = [Q_2A Q_2B];
C = null(D’)
rC = rank(C)
>>
C =
-0.3365 -0.2349 0.3581
0.2927 -0.4471 -0.0277
-0.6691 0.0460 0.0131
0.1963 0.3655 0.2581
0.3599 -0.1406 -0.1416
-0.0929 0.1880 -0.5108
0.1967 0.4497 0.3712
0.2019 -0.5007 0.0800
0.2901 0.2292 0.2283
-0.1140 -0.2208 0.5718
rC = 3
To show R(C) ⊆ R(A) and R(C) ⊆ R(B), we check that appending the columns of C to A or to B does not increase the rank:
rA = rank(A)
rAC = rank([A C])
rB = rank(B)
rBC = rank([B C])
>>
rA = 6
rAC = 6
rB = 5
rBC = 5
6. Signal estimation using least-squares. This problem concerns discrete-time signals defined for t =
1, . . . , 500. We’ll represent these signals by vectors in R500 , with the index corresponding to the time.
We are given a noisy measurement ymeas (1), . . . , ymeas (500), of a signal y(1), . . . , y(500) that is thought
to be, at least approximately, a linear combination of the 22 signals
fk(t) = e^{−(t−50k)²/25²},   gk(t) = ((t − 50k)/10) e^{−(t−50k)²/25²},
where t = 1, . . . , 500 and k = 0, . . . , 10. Plots of f4 and g7 (as examples) are shown below.
[Figure: plots of f4(t) (top) and g7(t) (bottom) versus t, for t = 0, . . . , 500.]
As our estimate of the original signal, we will use the signal ŷ = (ŷ(1), . . . , ŷ(500)) in the span of
f0 , . . . , f10 , g0 , . . . , g10 , that is closest to ymeas = (ymeas (1), . . . , ymeas (500)) in the RMS (root-mean-
square) sense. Explain how to find ŷ, and carry out your method on the signal ymeas given in
sig_est_data.m on the course web site. Plot ymeas and ŷ on the same graph. Plot the residual
(the difference between these two signals) on a different graph, and give its RMS value.
Solution. We’ll form the estimated signal as a linear combination of f0 , . . . , f10 , g0 , . . . , g10 ,
ŷ = x1 f0 + x2 f1 + · · · + x11 f10 + x12 g0 + · · · + x22 g10 .
(Here we are representing the signals as vectors in R500 .) We can write this in matrix form as ŷ = Ax,
where
A = [f0 f1 · · · f10 g0 · · · g10 ] ∈ R500×22 .
The coefficients x are chosen to minimize the RMS deviation between ŷ and ymeas , which is the same
as minimizing the norm of the difference. The matrix A is full rank (i.e., its rank is 22), so the best
coefficients are given by
xls = (A^TA)^{−1}A^T ymeas.
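As a sanity check of this formula, here is a Python/NumPy sketch on a small synthetic stand-in (the problem's actual A is 500×22, built from the fk and gk signals), comparing the normal-equations solution to the library least-squares routine:

```python
import numpy as np

# x_ls = (A^T A)^{-1} A^T y for a full-rank A; small illustrative example.
rng = np.random.default_rng(5)
A = rng.standard_normal((50, 4))
y = rng.standard_normal(50)
x_ls = np.linalg.inv(A.T @ A) @ A.T @ y
x_qr, *_ = np.linalg.lstsq(A, y, rcond=None)  # numerically preferred route
assert np.allclose(x_ls, x_qr)
```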
Our estimate of the original signal is ŷ = Axls. The following Matlab code carries out this method.
sig_est_data;
nfcts = 22; ydim = 500;
t = 1:ydim;
A = zeros(ydim,nfcts);
for k = 1:nfcts/2
fk = exp(-(t-50*(k-1)).^2/25^2);
gk = (t-50*(k-1))/10.*exp(-(t-50*(k-1)).^2/25^2);
A(:,k) = fk’; A(:,k+11) = gk’;
end
yhat = A*(A\ymeas);
residual = ymeas-yhat;
RMS = 1/sqrt(500)*norm(residual);
figure(1); plot(t,ymeas,’g--’,t,yhat,’k’);
xlabel(’time’); ylabel(’fit’);
figure(2); plot(t,ymeas-yhat);
xlabel(’time’); ylabel(’residual’);
>> rank(A) = 22
>> RMS = 0.4596
The figure below shows the measured signal ymeas and the estimated signal ŷ. The RMS value of the
residual is 0.4596. The next figure shows the residual.
[Figure: ymeas and ŷ versus t, for t = 0, . . . , 500.]
[Figure: the residual ymeas − ŷ versus t.]