EE263s Homework 4
1. Orthogonal matrices.
(a) Show that if $U$ and $V$ are orthogonal, then so is $UV$.
(b) Show that if $U$ is orthogonal, then so is $U^{-1}$.
(c) Suppose that $U \in \mathbb{R}^{2\times 2}$ is orthogonal. Show that $U$ is either a rotation or a reflection. Make clear how you decide whether a given orthogonal $U$ is a rotation or reflection.
(a) To prove that $UV$ is orthogonal we have to show that $(UV)^T(UV) = I$, given $U^TU = I$ and $V^TV = I$. We have
$$
\begin{aligned}
(UV)^T(UV) &= V^TU^TUV \\
&= V^TV && (\text{since } U^TU = I) \\
&= I && (\text{since } V^TV = I).
\end{aligned}
$$
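As a quick numerical sanity check of part (a), the following Matlab sketch multiplies two orthogonal matrices and measures how far the product is from satisfying $W^TW = I$. The matrices `U` and `V` and the angle `theta` are illustrative choices, not part of the problem data.

```matlab
% Illustrative orthogonal matrices (assumed values, not from the homework data)
theta = 0.7;
U = [cos(theta) -sin(theta); sin(theta) cos(theta)];   % a rotation
V = [1 0; 0 -1];                                       % a reflection
W = U*V;
err_orth = norm(W'*W - eye(2));   % should be at rounding-error level
disp(err_orth)
```

A value of `err_orth` near machine precision is consistent with the derivation above; it does not replace the proof.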
Clearly, from the lecture notes, this represents a rotation. Note that in this case $\det U = \cos^2\alpha + \sin^2\alpha = 1$.
• $k$ is odd, so $\sin(k\pi - \alpha) = \sin\alpha$ and $\cos(k\pi - \alpha) = -\cos\alpha$, and therefore
$$
U = \begin{bmatrix} \cos\alpha & \sin\alpha \\ \sin\alpha & -\cos\alpha \end{bmatrix}.
$$
From the lecture notes, this represents a reflection. The determinant in this case is $\det U = -\cos^2\alpha - \sin^2\alpha = -1$.
Therefore we have shown that any orthogonal matrix in $\mathbb{R}^{2\times 2}$ is either a rotation or a reflection, according to whether its determinant is $+1$ or $-1$, respectively.
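The determinant test can be sketched in Matlab as follows. The rotation matrix `Urot` uses the usual counterclockwise form, which is an assumption here since the rotation case is not reproduced in the text; `Urefl` is the reflection matrix from the odd-$k$ case. The angle `alpha` is an arbitrary illustrative value.

```matlab
alpha = pi/5;                                            % illustrative angle (assumption)
Urot  = [cos(alpha) -sin(alpha); sin(alpha)  cos(alpha)];  % assumed standard rotation form
Urefl = [cos(alpha)  sin(alpha); sin(alpha) -cos(alpha)];  % reflection form from the text
mats  = {Urot, Urefl};
kinds = cell(1,2);
for j = 1:2
    U = mats{j};
    if det(U) > 0          % det U = +1 for a rotation
        kinds{j} = 'rotation';
    else                   % det U = -1 for a reflection
        kinds{j} = 'reflection';
    end
end
disp(kinds)
```

Testing the sign of the determinant, rather than comparing it to exactly $\pm 1$, keeps the check robust to rounding error.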
2. Projection matrices. A matrix $P \in \mathbb{R}^{n\times n}$ is called a projection matrix if $P = P^T$ and $P^2 = P$.
(a) To show that $I - P$ is a projection matrix we need to check two properties:
i. $I - P = (I - P)^T$
ii. $(I - P)^2 = I - P$.
The first one is easy: $(I - P)^T = I - P^T = I - P$ because $P = P^T$ ($P$ is a projection matrix). To show the second property we have
$$
\begin{aligned}
(I - P)^2 &= I - 2P + P^2 \\
&= I - 2P + P && (\text{since } P = P^2) \\
&= I - P.
\end{aligned}
$$
$$
\begin{aligned}
(UU^T)^2 &= (UU^T)(UU^T) \\
&= U(U^TU)U^T \\
&= UU^T && (\text{since } U^TU = I).
\end{aligned}
$$
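(c) First note that $\left(A(A^TA)^{-1}A^T\right)^T = A(A^TA)^{-1}A^T$ because
$$
\begin{aligned}
\left(A(A^TA)^{-1}A^T\right)^T &= (A^T)^T\left((A^TA)^{-1}\right)^TA^T \\
&= A\left((A^TA)^T\right)^{-1}A^T \\
&= A(A^TA)^{-1}A^T.
\end{aligned}
$$
Also $\left(A(A^TA)^{-1}A^T\right)^2 = A(A^TA)^{-1}A^T$ because
$$
\begin{aligned}
\left(A(A^TA)^{-1}A^T\right)^2 &= A(A^TA)^{-1}A^TA(A^TA)^{-1}A^T \\
&= A\left((A^TA)^{-1}A^TA\right)(A^TA)^{-1}A^T \\
&= A(A^TA)^{-1}A^T && (\text{since } (A^TA)^{-1}A^TA = I).
\end{aligned}
$$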
(d) To show that $Px$ is the projection of $x$ on $\mathcal{R}(P)$ we verify that the "error" $x - Px$ is orthogonal to any vector in $\mathcal{R}(P)$. Since $\mathcal{R}(P)$ is nothing but the span of the columns of $P$, we only need to show that $x - Px$ is orthogonal to the columns of $P$, or in other words, $P^T(x - Px) = 0$. But
$$
\begin{aligned}
P^T(x - Px) &= P(x - Px) && (\text{since } P = P^T) \\
&= Px - P^2x \\
&= 0 && (\text{since } P^2 = P).
\end{aligned}
$$
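The properties in parts (c) and (d) can be checked numerically. The sketch below builds $P = A(A^TA)^{-1}A^T$ for a small full-rank matrix and measures the symmetry, idempotence, and orthogonality errors. The matrix `A` and the vector `x` are made-up illustrative values, not problem data.

```matlab
A = [1 0; 1 1; 1 2; 1 3];        % illustrative skinny full-rank matrix (assumption)
x = [2; -1; 0; 5];               % illustrative vector (assumption)
P = A*((A'*A)\A');               % P = A (A^T A)^{-1} A^T, without forming an explicit inverse
sym_err  = norm(P - P');         % P = P^T
idem_err = norm(P*P - P);        % P^2 = P
orth_err = norm(P'*(x - P*x));   % error x - Px is orthogonal to the columns of P
disp([sym_err idem_err orth_err])
```

All three quantities should come out at rounding-error level. Using the backslash solve instead of `inv` is a numerical design choice, not something the derivation requires.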
$$
Q = I - 2uu^T,
$$
$$
\begin{aligned}
Q^TQ &= (I - 2uu^T)^T(I - 2uu^T) \\
&= (I - 2uu^T)(I - 2uu^T) \\
&= I - 2uu^T - 2uu^T + 4uu^Tuu^T \\
&= I - 2uu^T - 2uu^T + 4uu^T && (\text{using } u^Tu = 1) \\
&= I,
\end{aligned}
$$
so $Q$ is orthogonal.
$$
Qu = u - 2uu^Tu = u - 2u = -u \quad (\text{using } u^Tu = 1),
$$
$$
Qv = v - 2uu^Tv = v \quad (\text{using } u^Tv = 0).
$$
(c) We know $\det(Q) = \prod_{i=1}^n \lambda_i$. Since $Q$ is symmetric, all eigenvalues are real and we can construct an orthonormal eigenvector basis. From parts (a) and (b), $u$ is an eigenvector with associated eigenvalue $-1$, and any vector $v$ orthogonal to $u$ is an eigenvector with associated eigenvalue $1$. The nullspace of $u^T$ has dimension $n - 1$, so we can construct an orthogonal eigenbasis with all eigenvalues $1$ except for the $-1$ eigenvalue with eigenvector $u$. Thus the product of the eigenvalues is $-1 = \det(Q)$.
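The eigenvalue argument can be illustrated numerically. The sketch below forms $Q = I - 2uu^T$ for one unit vector and checks orthogonality, the two eigenvector relations, the eigenvalues, and the determinant. The vectors `u` and `v` are illustrative choices (with $u^Tu = 1$ and $u^Tv = 0$), not problem data.

```matlab
u = [1; 2; 2]/3;                  % unit vector: sqrt(1+4+4)/3 = 1 (illustrative)
v = [2; -1; 0];                   % orthogonal to u: (2 - 2 + 0)/3 = 0 (illustrative)
Q = eye(3) - 2*(u*u');
orth_err = norm(Q'*Q - eye(3));   % Q^T Q = I
Qu_err   = norm(Q*u + u);         % Qu = -u
Qv_err   = norm(Q*v - v);         % Qv = v
lambda   = sort(eig(Q));          % expected eigenvalues: -1, 1, 1
d        = det(Q);                % expected determinant: -1
disp(lambda'); disp(d)
```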
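Alternate solution: We proved the matrix inversion lemma in class, and showed that if $A$ and $D$ are invertible, the following factorizations hold:
$$
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
= \begin{bmatrix} I & 0 \\ CA^{-1} & I \end{bmatrix}
\begin{bmatrix} A & 0 \\ 0 & D - CA^{-1}B \end{bmatrix}
\begin{bmatrix} I & A^{-1}B \\ 0 & I \end{bmatrix}
= \begin{bmatrix} I & BD^{-1} \\ 0 & I \end{bmatrix}
\begin{bmatrix} A - BD^{-1}C & 0 \\ 0 & D \end{bmatrix}
\begin{bmatrix} I & 0 \\ D^{-1}C & I \end{bmatrix}.
$$
Taking determinants of both factorizations gives $\det(A)\det(D - CA^{-1}B) = \det(D)\det(A - BD^{-1}C)$. One choice that yields the result is $A = I$, $B = 2u$, $C = u^T$, $D = 1$, for which the left-hand side is $1 - 2u^Tu = -1$ and the right-hand side is $\det(I - 2uu^T)$. Hence
$$
\det(I - 2uu^T) = -1.
$$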
(d) Since $Q$ is orthogonal, $Q^TQ = I$ has all eigenvalues $1$, hence all singular values of $Q$ are $1$, so $\kappa(Q) = 1$ (i.e., $Q$ is as well-conditioned as can be).
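(e) We follow the hint and choose $u = (x + \alpha e_1)/\|x + \alpha e_1\|$. Then
$$
Qx = x - 2uu^Tx = \left(1 - \frac{2(\|x\|^2 + \alpha x_1)}{\|x + \alpha e_1\|^2}\right)x - \frac{2\alpha(\|x\|^2 + \alpha x_1)}{\|x + \alpha e_1\|^2}\,e_1,
$$
so $Qx$ is a multiple of $e_1$ when the coefficient of $x$ vanishes, i.e., when $\|x + \alpha e_1\|^2 = \|x\|^2 + 2\alpha x_1 + \alpha^2$ equals $2(\|x\|^2 + \alpha x_1)$, which reduces to $\alpha^2 = \|x\|^2$.
We can achieve this by choosing $\alpha = \pm\|x\|$. This leads to $Qx = \mp\|x\|e_1$ (which makes sense: $Q$ should always preserve norm). Some people used a geometric argument here as well, and this can make the solution a lot neater if it's well presented. The idea is to find a reflection plane that reflects the given vector onto the $e_1$ axis (there are two possibilities, for the negative and positive parts of the $e_1$ axis), and $u$ is then a unit vector orthogonal to this plane.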
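A small numerical illustration of parts (d) and (e): the sketch below builds $u$ from a given $x$ with $\alpha = \|x\|$, forms $Q = I - 2uu^T$, and checks that $Qx = -\|x\|e_1$ and $\kappa(Q) = 1$. The vector `x` is an illustrative choice with $\|x\| = 13$, not problem data.

```matlab
x = [3; 4; 12];                   % illustrative vector, norm(x) = 13 (assumption)
n = length(x);
e1 = [1; zeros(n-1,1)];
alpha = norm(x);                  % the other valid choice is -norm(x)
w = x + alpha*e1;                 % here w = [16; 4; 12], which is nonzero
u = w/norm(w);
Q = eye(n) - 2*(u*u');
Qx = Q*x;                         % expected: -alpha*e1 = [-13; 0; 0]
kappa = cond(Q);                  % expected: 1, since Q is orthogonal
disp(Qx'); disp(kappa)
```

The construction breaks down only when $x + \alpha e_1 = 0$, in which case the opposite sign of $\alpha$ can be used.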
4. Interpolation with rational functions. In this problem we consider a function $f : \mathbb{R} \to \mathbb{R}$ of the form
$$
f(x) = \frac{a_0 + a_1x + \cdots + a_mx^m}{1 + b_1x + \cdots + b_mx^m},
$$
where $a_0, \ldots, a_m$ and $b_1, \ldots, b_m$ are parameters, with either $a_m \neq 0$ or $b_m \neq 0$. Such a function is called a rational function of degree $m$. We are given data points $x_1, \ldots, x_N \in \mathbb{R}$ and $y_1, \ldots, y_N \in \mathbb{R}$, where $y_i = f(x_i)$. The problem is to find a rational function of smallest degree that is consistent with this data. In other words, you are to find $m$, which should be as small as possible, and $a_0, \ldots, a_m$, $b_1, \ldots, b_m$, which satisfy $f(x_i) = y_i$. Explain how you will solve this problem, and then carry out your method on the problem data given in ri_data.m. (This contains two vectors, x and y, that give the values $x_1, \ldots, x_N$ and $y_1, \ldots, y_N$, respectively.) Give the value of $m$ you find, and the coefficients $a_0, \ldots, a_m$, $b_1, \ldots, b_m$. Please show us your verification that $y_i = f(x_i)$ holds (possibly with some small numerical errors).
This is a set of complicated nonlinear functions of the coefficient vectors $a$ and $b$. If we multiply out by the denominator, we get
$$
y_i(1 + b_1x_i + \cdots + b_mx_i^m) - (a_0 + a_1x_i + \cdots + a_mx_i^m) = 0, \quad i = 1, \ldots, N.
$$
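These equations are linear in $a$ and $b$. We can write these equations in matrix form as
$$
G\begin{bmatrix} a \\ b \end{bmatrix} = y, \qquad (1)
$$
where $y = (y_1, \ldots, y_N)$,
$$
a = \begin{bmatrix} a_0 \\ \vdots \\ a_m \end{bmatrix}, \qquad
b = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix},
$$
$$
G = \begin{bmatrix}
1 & x_1 & \cdots & x_1^m & -y_1x_1 & -y_1x_1^2 & \cdots & -y_1x_1^m \\
1 & x_2 & \cdots & x_2^m & -y_2x_2 & -y_2x_2^2 & \cdots & -y_2x_2^m \\
1 & x_3 & \cdots & x_3^m & -y_3x_3 & -y_3x_3^2 & \cdots & -y_3x_3^m \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
1 & x_N & \cdots & x_N^m & -y_Nx_N & -y_Nx_N^2 & \cdots & -y_Nx_N^m
\end{bmatrix}.
$$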
Thus, we can interpolate the data if and only if equation (1) has a solution. Our problem is to find the smallest $m$ for which these linear equations can be solved, or, equivalently, $y \in \mathcal{R}(G)$. We can do this by finding the smallest value of $m$ for which
$$
\mathbf{rank}(G) = \mathbf{rank}([G \;\; y]).
$$
Then we can find a set of coefficients by solving equation (1) for $a$ and $b$. The following Matlab code carries out this method.
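clear all
close all
ri_data                       % data file from the problem statement; assumed to define column vectors x and y
for m=1:20                    % we sweep over different values of m
    G=ones(length(x),1);      % column multiplying a0 (reconstructed line)
    for i=1:m
        G=[G x.^i];           % columns multiplying a1, ..., am
    end
    for i=1:m
        G=[G -x.^i.*y];       % columns multiplying b1, ..., bm
    end
    if rank(G)== rank([G y])
        break                 % smallest m for which (1) is solvable (reconstructed line)
    end
end
ab=G\y; m, a=ab(1:m+1), b=ab(m+2:end)   % reconstructed: solve (1) and display m, a, b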
m =
a =
b =
Thus, we find that m = 5 is the lowest order rational function that interpolates the data, and a rational
function that interpolates the data is given by
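num=zeros(size(x)); den=ones(size(x));    % reconstructed initialization
for i=1:m+1, num=num+a(i)*x.^(i-1); end   % numerator a0 + a1 x + ... + am x^m
for i=1:m, den=den+b(i)*x.^i; end         % denominator 1 + b1 x + ... + bm x^m
err=norm(num./den-y)                      % reconstructed check that f(xi) = yi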
err =
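Since the contents of ri_data.m are not reproduced here, the following self-contained sketch runs the same rank test on synthetic data generated from a known degree-2 rational function. The sample points, the coefficients, the explicit rank tolerance `tol`, and the variable name `m_found` are all assumptions made for illustration; the listing above uses Matlab's default rank tolerance.

```matlab
% Synthetic data from a known degree-2 rational function (assumption, not ri_data.m)
x = linspace(-1, 1, 12)';
y = (1 + 2*x + 3*x.^2) ./ (1 + 0.5*x + 0.25*x.^2);
N = length(x);
tol = 1e-8;                        % explicit rank tolerance (assumption)
m_found = 0;
for m = 1:5
    G = ones(N,1);                 % column for a0
    for i = 1:m
        G = [G x.^i];              % columns for a1, ..., am
    end
    for i = 1:m
        G = [G -(x.^i).*y];        % columns for b1, ..., bm
    end
    if rank(G, tol) == rank([G y], tol)
        m_found = m;               % smallest m with y in R(G)
        break
    end
end
ab = G\y;
a = ab(1:m_found+1);
b = ab(m_found+2:end);
% Verify the interpolation conditions f(x_i) = y_i
num = zeros(N,1); den = ones(N,1);
for i = 1:m_found+1
    num = num + a(i)*x.^(i-1);
end
for i = 1:m_found
    den = den + b(i)*x.^i;
end
err = norm(num./den - y);
disp(m_found); disp(a'); disp(b'); disp(err)
```

On this synthetic data the numerator and denominator share no common factor, so the sweep should stop at the true degree and recover the generating coefficients up to rounding error.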
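(a) We know that
$$
\mathcal{R}(A) = \mathcal{N}(A^T)^\perp.
$$
This means that any $y$ in $\mathcal{R}(A)$ is perpendicular to all vectors in $\mathcal{N}(A^T)$, and any vector which is perpendicular to all vectors in $\mathcal{N}(A^T)$ must be in $\mathcal{R}(A)$. We will show that
$$
\mathcal{R}(A) \cap \mathcal{R}(B) = \left(\mathcal{N}(A^T) + \mathcal{N}(B^T)\right)^\perp.
$$
Let $y$ be a vector in $\mathcal{R}(A) \cap \mathcal{R}(B)$. Then $y = Ax_a$ for some $x_a$, and $y = Bx_b$ for some $x_b$. Let $v$ be any vector in $\mathcal{N}(A^T) + \mathcal{N}(B^T)$. Then $v = v_a + v_b$ for some $v_a \in \mathcal{N}(A^T)$, $v_b \in \mathcal{N}(B^T)$. Then we have
$$
y^Tv = y^Tv_a + y^Tv_b = x_a^TA^Tv_a + x_b^TB^Tv_b = x_a^T(A^Tv_a) + x_b^T(B^Tv_b) = 0.
$$
Thus $y \perp \mathcal{N}(A^T) + \mathcal{N}(B^T)$. Since any vector in $\mathcal{R}(A) \cap \mathcal{R}(B)$ is perpendicular to every vector in $\mathcal{N}(A^T) + \mathcal{N}(B^T)$,
$$
\mathcal{R}(A) \cap \mathcal{R}(B) \subseteq \left(\mathcal{N}(A^T) + \mathcal{N}(B^T)\right)^\perp.
$$
Let $y$ be a vector in $\left(\mathcal{N}(A^T) + \mathcal{N}(B^T)\right)^\perp$. Since $\mathcal{N}(A^T) \subseteq \mathcal{N}(A^T) + \mathcal{N}(B^T)$, $y$ is perpendicular to all vectors in $\mathcal{N}(A^T)$, which means $y \in \mathcal{R}(A)$. Similarly, $y$ is perpendicular to all vectors in $\mathcal{N}(B^T)$, which means $y \in \mathcal{R}(B)$. Thus $y \in \mathcal{R}(A) \cap \mathcal{R}(B)$ and we have
$$
\mathcal{R}(A) \cap \mathcal{R}(B) = \left(\mathcal{N}(A^T) + \mathcal{N}(B^T)\right)^\perp.
$$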
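The full QR factorization of a matrix $A$ is
$$
A = \begin{bmatrix} Q_{1A} & Q_{2A} \end{bmatrix}\begin{bmatrix} R_{1A} \\ 0 \end{bmatrix},
$$
where the columns of $Q_{2A}$ form an orthonormal basis for $\mathcal{N}(A^T)$. Defining $Q_{2B}$ in the same way for $B$ and setting $D = [Q_{2A} \;\; Q_{2B}]$, we have $\mathcal{R}(D) = \mathcal{N}(A^T) + \mathcal{N}(B^T)$, so $\mathcal{R}(A) \cap \mathcal{R}(B) = \mathcal{R}(D)^\perp = \mathcal{N}(D^T)$. We then take the full QR factorization of $D$, and then $C = Q_{2D}$, as $\mathcal{N}(D^T) = \mathcal{R}(Q_{2D}) = \mathcal{R}(C)$. Thus we have the matrix $C$ such that $\mathcal{R}(C) = \mathcal{R}(A) \cap \mathcal{R}(B)$.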
(b) The following Matlab code gives the required matrix C and the dimension of R(C).
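clear;
% (the problem data defining A and B must be loaded here)
Q_2A = null(A');     % orthonormal basis for N(A^T)
Q_2B = null(B');     % orthonormal basis for N(B^T)
D = [Q_2A Q_2B];     % R(D) = N(A^T) + N(B^T)
C = null(D')         % orthonormal basis for N(D^T) = R(A) intersect R(B)
rC = rank(C)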
C =
-0.3365 -0.2349 0.3581
0.2927 -0.4471 -0.0277
-0.6691 0.0460 0.0131
0.1963 0.3655 0.2581
0.3599 -0.1406 -0.1416
-0.0929 0.1880 -0.5108
0.1967 0.4497 0.3712
0.2019 -0.5007 0.0800
0.2901 0.2292 0.2283
-0.1140 -0.2208 0.5718
rC = 3
Show R(C) ⊆ R(A) and R(C) ⊆ R(B).
rA = rank(A)
rAC = rank([A C])
rB = rank(B)
rBC = rank([B C])
rA = 6
rAC = 6
rB = 5
rBC = 5
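The same construction can be tried on a toy example where the answer is known by inspection. In the sketch below, `A` and `B` are made-up matrices whose ranges are two coordinate planes in $\mathbb{R}^3$, so the intersection is the line spanned by $e_2$; the names `in_A` and `in_B` are illustrative.

```matlab
A = [1 0; 0 1; 0 0];              % R(A) = span{e1, e2} (illustrative)
B = [0 0; 1 0; 0 1];              % R(B) = span{e2, e3} (illustrative)
Q_2A = null(A');                  % basis for N(A^T) = span{e3}
Q_2B = null(B');                  % basis for N(B^T) = span{e1}
D = [Q_2A Q_2B];
C = null(D');                     % basis for R(A) intersect R(B) = span{e2}
rC = rank(C);                     % expected: 1
in_A = (rank([A C]) == rank(A));  % R(C) is contained in R(A)
in_B = (rank([B C]) == rank(B));  % R(C) is contained in R(B)
disp(C'); disp([rC in_A in_B])
```

The sign of `C` is not determined, since any unit vector spanning the intersection is a valid basis.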
6. Signal estimation using least-squares. This problem concerns discrete-time signals defined for $t = 1, \ldots, 500$. We'll represent these signals by vectors in $\mathbb{R}^{500}$, with the index corresponding to the time. We are given a noisy measurement $y_{\rm meas}(1), \ldots, y_{\rm meas}(500)$ of a signal $y(1), \ldots, y(500)$ that is thought to be, at least approximately, a linear combination of the 22 signals
$$
f_k(t) = e^{-(t-50k)^2/25^2}, \qquad g_k(t) = \left(\frac{t - 50k}{10}\right)e^{-(t-50k)^2/25^2},
$$
where $t = 1, \ldots, 500$ and $k = 0, \ldots, 10$. Plots of $f_4$ and $g_7$ (as examples) are shown below.
[Figure: plots of $f_4(t)$ and $g_7(t)$ versus $t$, with $t$ ranging from 0 to 500.]
As our estimate of the original signal, we will use the signal ŷ = (ŷ(1), . . . , ŷ(500)) in the span of
f0 , . . . , f10 , g0 , . . . , g10 , that is closest to ymeas = (ymeas (1), . . . , ymeas (500)) in the RMS (root-mean-
square) sense. Explain how to find ŷ, and carry out your method on the signal ymeas given in
sig_est_data.m on the course web site. Plot ymeas and ŷ on the same graph. Plot the residual
(the difference between these two signals) on a different graph, and give its RMS value.
Solution. We'll form the estimated signal as a linear combination of $f_0, \ldots, f_{10}, g_0, \ldots, g_{10}$,
$$
\hat y = x_1f_0 + x_2f_1 + \cdots + x_{11}f_{10} + x_{12}g_0 + \cdots + x_{22}g_{10}.
$$
(Here we are representing the signals as vectors in $\mathbb{R}^{500}$.) We can write this in matrix form as $\hat y = Ax$, where
$$
A = [f_0 \;\; f_1 \;\cdots\; f_{10} \;\; g_0 \;\cdots\; g_{10}] \in \mathbb{R}^{500\times 22}.
$$
The coefficients $x$ are chosen to minimize the RMS deviation between $\hat y$ and $y_{\rm meas}$, which is the same as minimizing the norm of the difference. The matrix $A$ is full rank (i.e., 22), so the best coefficients are given by
$$
x_{\rm ls} = (A^TA)^{-1}A^Ty_{\rm meas}.
$$
Our estimate of the original signal is
$$
\hat y = Ax_{\rm ls} = A(A^TA)^{-1}A^Ty_{\rm meas}.
$$
sig_est_data;                 % data file from the problem statement; assumed to define ymeas as a column vector
nfcts = 22; ydim = 500;
t = 1:ydim;
A = zeros(ydim,nfcts);
for k = 1:nfcts/2
    fk = exp(-(t-50*(k-1)).^2/25^2);                     % f_{k-1}
    gk = (t-50*(k-1))/10.*exp(-(t-50*(k-1)).^2/25^2);    % g_{k-1}
    A(:,k) = fk'; A(:,k+11) = gk';
end
yhat = A*(A\ymeas);           % least-squares estimate A*xls
residual = ymeas-yhat;
RMS = 1/sqrt(500)*norm(residual);
figure(1); plot(t,ymeas,'g--',t,yhat,'k');
xlabel('time'); ylabel('fit');
figure(2); plot(t,ymeas-yhat);
xlabel('time'); ylabel('residual');
>> rank(A) = 22
>> RMS = 0.4596
The figure below shows the measured signal ymeas and the estimated signal ŷ. The RMS value of the
residual is 0.4596. The next figure shows the residual.
[Figure: $y_{\rm meas}$ and $\hat y$ versus time, with time ranging from 0 to 500.]
[Figure: residual versus time, with time ranging from 0 to 500.]
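Because sig_est_data.m is not reproduced here, the following self-contained sketch applies the same least-squares method to a synthetic measurement. The coefficients `xtrue` and the deterministic perturbation `0.1*sin(t)` are made up for illustration, so the resulting RMS value is unrelated to the 0.4596 reported above.

```matlab
ydim = 500; nfcts = 22;
t = (1:ydim)';
A = zeros(ydim, nfcts);
for k = 1:nfcts/2
    c = 50*(k-1);                                   % center of f_{k-1} and g_{k-1}
    A(:,k)    = exp(-(t-c).^2/25^2);                % f_{k-1}
    A(:,k+11) = (t-c)/10 .* exp(-(t-c).^2/25^2);    % g_{k-1}
end
xtrue = (1:nfcts)'/nfcts;            % made-up coefficients (assumption)
ymeas = A*xtrue + 0.1*sin(t);        % made-up measurement with a deterministic perturbation (assumption)
xls = A\ymeas;                       % least-squares coefficients
yhat = A*xls;                        % estimate of the signal
residual = ymeas - yhat;
RMS = norm(residual)/sqrt(ydim);     % RMS value of the residual
orth_err = norm(A'*residual);        % residual is orthogonal to the columns of A
disp(RMS); disp(orth_err)
```

The orthogonality check `A'*residual` being near zero is the defining property of the least-squares solution, and the RMS residual cannot exceed the RMS value of the perturbation that was added.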