
EE263s Summer 2009-10 Laurent Lessard

EE263s homework 4
1. Orthogonal matrices.
(a) Show that if U and V are orthogonal, then so is UV.
(b) Show that if U is orthogonal, then so is U^{-1}.
(c) Suppose that U ∈ R^{2×2} is orthogonal. Show that U is either a rotation or a reflection. Make
clear how you decide whether a given orthogonal U is a rotation or a reflection.

Solution.
(a) To prove that UV is orthogonal we have to show that (UV)^T(UV) = I, given U^T U = I and
V^T V = I. We have

(UV)^T(UV) = V^T U^T U V
           = V^T V   (since U^T U = I)
           = I   (since V^T V = I),

and we are done.


(b) Since U is square and orthogonal we have U^{-1} = U^T, and therefore, by taking inverses of both
sides, U = (U^T)^{-1}, or equivalently U = (U^{-1})^T (the inverse and transpose operations commute).
But U^T U = I, and by substitution U^{-1}(U^{-1})^T = I. Since U^{-1} is square this also implies that
(U^{-1})^T U^{-1} = I, so U^{-1} is orthogonal.
 
(c) Suppose that U = \begin{bmatrix} a & b \\ c & d \end{bmatrix} ∈ R^{2×2} is orthogonal. This is true if and only if
• the columns of U are of unit length, i.e., a^2 + c^2 = 1 and b^2 + d^2 = 1,
• the columns of U are orthogonal, i.e., ab + cd = 0.
Since a^2 + c^2 = 1 we can take a and c as the cosine and sine of an angle α, respectively, i.e.,
a = cos α and c = sin α. For a similar reason, we can take b = sin β and d = cos β. Now
ab + cd = 0 becomes

cos α sin β + sin α cos β = 0,

or

sin(α + β) = 0.

The sine of an angle is zero if and only if the angle is an integer multiple of π. So α + β = kπ, or
β = kπ − α with k ∈ Z. Therefore

U = \begin{bmatrix} cos α & sin(kπ − α) \\ sin α & cos(kπ − α) \end{bmatrix}.

Now two things can happen:


• k is even, so sin(kπ − α) = −sin α and cos(kπ − α) = cos α, and therefore

U = \begin{bmatrix} cos α & −sin α \\ sin α & cos α \end{bmatrix}.

Clearly, from the lecture notes, this represents a rotation. Note that in this case det U =
cos^2 α + sin^2 α = 1.

• k is odd, so sin(kπ − α) = sin α and cos(kπ − α) = −cos α, and therefore

U = \begin{bmatrix} cos α & sin α \\ sin α & −cos α \end{bmatrix}.

From the lecture notes, this represents a reflection. The determinant in this case is det U =
−cos^2 α − sin^2 α = −1.
Therefore we have shown that any orthogonal matrix in R^{2×2} is either a rotation or a reflection,
according to whether its determinant is +1 or −1, respectively.
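This determinant test is easy to run numerically. A minimal Matlab sketch (not part of the original solution; the angle alpha is an arbitrary choice for illustration):

% Sketch: classify a 2x2 orthogonal matrix as a rotation or a reflection.
alpha = 0.7;                                          % arbitrary angle
U = [cos(alpha) -sin(alpha); sin(alpha) cos(alpha)];  % candidate matrix
norm(U'*U - eye(2))   % ~0, so U is orthogonal
det(U)                % +1 here, so U is a rotation; -1 would mean reflection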
2. Projection matrices. A matrix P ∈ R^{n×n} is called a projection matrix if P = P^T and P^2 = P.

(a) Show that if P is a projection matrix then so is I − P .


(b) Suppose that the columns of U ∈ R^{n×k} are orthonormal. Show that UU^T is a projection matrix.
(Later we will show that the converse is true: every projection matrix can be expressed as UU^T
for some U with orthonormal columns.)
(c) Suppose A ∈ R^{n×k} is full rank, with k ≤ n. Show that A(A^T A)^{-1} A^T is a projection matrix.
(d) If S ⊆ R^n and x ∈ R^n, the point y in S closest to x is called the projection of x on S. Show
that if P is a projection matrix, then y = Px is the projection of x on R(P). (Which is why such
matrices are called projection matrices . . . )

Solution.
(a) To show that I − P is a projection matrix we need to check two properties:
i. I − P = (I − P)^T,
ii. (I − P)^2 = I − P.
The first one is easy: (I − P)^T = I − P^T = I − P, because P = P^T (P is a projection matrix).
To show the second property, we have

(I − P)^2 = I − 2P + P^2
          = I − 2P + P   (since P^2 = P)
          = I − P,

and we are done.


(b) Since the columns of U are orthonormal we have U^T U = I. Using this fact it is easy to prove
that UU^T is a projection matrix, i.e., (UU^T)^T = UU^T and (UU^T)^2 = UU^T. Clearly, (UU^T)^T =
(U^T)^T U^T = UU^T, and

(UU^T)^2 = (UU^T)(UU^T)
         = U(U^T U)U^T
         = UU^T   (since U^T U = I).
(c) First note that (A(A^T A)^{-1} A^T)^T = A(A^T A)^{-1} A^T because

(A(A^T A)^{-1} A^T)^T = (A^T)^T ((A^T A)^{-1})^T A^T
                      = A ((A^T A)^T)^{-1} A^T
                      = A(A^T A)^{-1} A^T.

Also (A(A^T A)^{-1} A^T)^2 = A(A^T A)^{-1} A^T because

(A(A^T A)^{-1} A^T)^2 = (A(A^T A)^{-1} A^T)(A(A^T A)^{-1} A^T)
                      = A((A^T A)^{-1} A^T A)(A^T A)^{-1} A^T
                      = A(A^T A)^{-1} A^T   (since (A^T A)^{-1} A^T A = I).

Hence A(A^T A)^{-1} A^T is symmetric and idempotent, so it is a projection matrix.
(d) To show that Px is the projection of x on R(P), we verify that the “error” x − Px is orthogonal
to any vector in R(P). Since R(P) is nothing but the span of the columns of P, we only need to
show that x − Px is orthogonal to the columns of P, or in other words, P^T(x − Px) = 0. But

P^T(x − Px) = P(x − Px)   (since P = P^T)
            = Px − P^2 x
            = 0   (since P^2 = P),

and we are done.
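These facts are easy to sanity-check numerically. A minimal Matlab sketch (not part of the original solution; the sizes n = 6, k = 3 are arbitrary):

% Sketch: verify parts (a)-(c) for randomly generated matrices.
n = 6; k = 3;                    % arbitrary sizes for illustration
[U,R] = qr(randn(n,k), 0);       % economy QR: U has orthonormal columns
P = U*U';                        % part (b)
norm(P - P')                     % ~0: symmetric
norm(P^2 - P)                    % ~0: idempotent
norm((eye(n)-P)^2 - (eye(n)-P))  % ~0: part (a), I - P is also a projection
A = randn(n,k);                  % full rank with probability one
Q = A/(A'*A)*A';                 % part (c): A*inv(A'*A)*A'
norm(Q^2 - Q)                    % ~0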


3. Householder reflections. A Householder matrix is defined as

Q = I − 2uu^T,

where u ∈ R^n is normalized, that is, u^T u = 1.


(a) Show that Q is orthogonal.
(b) Show that Qu = −u. Show that Qv = v for any v such that u^T v = 0. Thus, multiplication by Q
gives reflection through the plane with normal vector u.
(c) Show that det Q = −1.
(d) Given a vector x ∈ R^n, find a unit-length vector u for which Qx lies on the line through e_1. Hint:
try a u of the form u = v/‖v‖, with v = x + αe_1 (find the appropriate α and show that such
a u works . . . ). Compute such a u for x = (3, 2, 4, 1, 5). Apply the corresponding Householder
reflection to x to find Qx.
Note: Multiplication by an orthogonal matrix has very good numerical properties, in the sense that it
does not accumulate much roundoff error. For this reason, Householder reflections are used as building
blocks for fast, numerically sound algorithms.

Solution.
(a) We have

Q^T Q = (I − 2uu^T)^T (I − 2uu^T)
      = (I − 2uu^T)(I − 2uu^T)
      = I − 2uu^T − 2uu^T + 4uu^T uu^T
      = I − 2uu^T − 2uu^T + 4uu^T   (using u^T u = 1)
      = I,

so Q is orthogonal.

(b) Using u^T u = 1 and u^T v = 0,

Qu = u − 2uu^T u = u − 2u = −u,
Qv = v − 2uu^T v = v.

(c) We know det(Q) = ∏_{i=1}^{n} λ_i. Since Q is symmetric, all eigenvalues are real and we can construct
an orthonormal eigenvector basis. From parts (a) and (b), u is an eigenvector with associated
eigenvalue −1, and any vector v orthogonal to u is an eigenvector with associated eigenvalue 1.
The nullspace of u^T has dimension n − 1, so we can construct an orthogonal eigenbasis with all
eigenvalues 1 except for the −1 eigenvalue with eigenvector u. Thus the product of the eigenvalues
is det(Q) = −1.
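This eigenvalue structure is easy to confirm numerically. A minimal Matlab sketch (not part of the original solution; n = 5 is an arbitrary choice):

% Sketch: a Householder matrix has one eigenvalue -1 and n-1 eigenvalues +1.
n = 5;
u = randn(n,1); u = u/norm(u);   % random unit-norm vector
Q = eye(n) - 2*(u*u');
sort(eig(Q))'                    % -1, 1, 1, 1, 1 (up to roundoff)
det(Q)                           % -1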
Alternate solution: We proved the matrix inversion lemma in class, and showed that if A and D
are invertible, the following factorizations hold:
     
\begin{bmatrix} A & B \\ C & D \end{bmatrix}
  = \begin{bmatrix} I & 0 \\ CA^{-1} & I \end{bmatrix}
    \begin{bmatrix} A & 0 \\ 0 & D − CA^{-1}B \end{bmatrix}
    \begin{bmatrix} I & A^{-1}B \\ 0 & I \end{bmatrix}
  = \begin{bmatrix} I & BD^{-1} \\ 0 & I \end{bmatrix}
    \begin{bmatrix} A − BD^{-1}C & 0 \\ 0 & D \end{bmatrix}
    \begin{bmatrix} I & 0 \\ D^{-1}C & I \end{bmatrix}

Taking determinants, we obtain the formula:

det \begin{bmatrix} A & B \\ C & D \end{bmatrix} = det(A) det(D − CA^{-1}B) = det(D) det(A − BD^{-1}C)

Apply this formula to the matrix:

\begin{bmatrix} A & B \\ C & D \end{bmatrix} = \begin{bmatrix} I & u \\ u^T & 1/2 \end{bmatrix}

and we obtain the relation det(I) det(1/2 − u^T u) = det(1/2) det(I − 2uu^T). Since u is normalized,
u^T u = 1, so we can evaluate all the pieces of this equation, and deduce that:

det(I − 2uu^T) = −1

Remark: since Q is orthogonal, Q^T Q = I has all eigenvalues 1, hence all singular values of Q are 1, so
κ(Q) = 1 (i.e., Q is as well-conditioned as can be).
(d) We follow the hint and choose u = (x + αe_1)/‖x + αe_1‖. Then

Q = I − 2 \frac{(x + αe_1)(x + αe_1)^T}{(x + αe_1)^T (x + αe_1)},

and, using (x + αe_1)^T (x + αe_1) = ‖x‖^2 + 2αx_1 + α^2 and (x + αe_1)^T x = ‖x‖^2 + αx_1,

Qx = x − 2 \frac{(x + αe_1)(‖x‖^2 + αx_1)}{‖x‖^2 + 2αx_1 + α^2}
   = \left(1 − \frac{2‖x‖^2 + 2αx_1}{‖x‖^2 + 2αx_1 + α^2}\right) x − \frac{2α(‖x‖^2 + αx_1)}{‖x‖^2 + 2αx_1 + α^2} e_1.

For Qx to lie on the line through e_1, the coefficient of x must be zero.

We can achieve this by choosing α = ±‖x‖. This leads to Qx = ∓‖x‖e_1 (which makes sense . . . Q
should always preserve norm). Some people used a geometric argument here as well, and this can
make the solution a lot neater if it is well presented. The idea is to find a reflection plane that
reflects the given vector onto the e_1 axis (there are two possibilities, for the negative and positive
parts of the e_1 axis), and u is then a unit vector orthogonal to this plane.
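For the specific x = (3, 2, 4, 1, 5) requested in the problem, taking α = ‖x‖ = √55 gives Qx = −‖x‖e_1. A minimal Matlab sketch of this computation (along the lines above; not a verbatim part of the original solution):

% Sketch: Householder reflection mapping x onto the line through e1.
x = [3; 2; 4; 1; 5];
alpha = norm(x);                 % = sqrt(55); alpha = -norm(x) also works
e1 = [1; 0; 0; 0; 0];
v = x + alpha*e1;
u = v/norm(v)                    % the required unit-length vector
Q = eye(5) - 2*(u*u');
Q*x                              % = -norm(x)*e1, approx (-7.4162, 0, 0, 0, 0)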

4. Interpolation with rational functions. In this problem we consider a function f : R → R of the form
f(x) = \frac{a_0 + a_1 x + \cdots + a_m x^m}{1 + b_1 x + \cdots + b_m x^m},

where a_0, . . . , a_m, and b_1, . . . , b_m are parameters, with either a_m ≠ 0 or b_m ≠ 0. Such a function is
called a rational function of degree m. We are given data points x1 , . . . , xN ∈ R and y1 , . . . , yN ∈ R,
where yi = f (xi ). The problem is to find a rational function of smallest degree that is consistent with
this data. In other words, you are to find m, which should be as small as possible, and a0 , . . . , am ,
b1 , . . . , bm , which satisfy f (xi ) = yi . Explain how you will solve this problem, and then carry out
your method on the problem data given in ri_data.m. (This contains two vectors, x and y, that give
the values x1 , . . . , xN , and y1 , . . . , yN , respectively.) Give the value of m you find, and the coefficients
a0 , . . . , am , b1 , . . . , bm . Please show us your verification that yi = f (xi ) holds (possibly with some small
numerical errors).

Solution. The interpolation condition f(x_i) = y_i is

f(x_i) = \frac{a_0 + a_1 x_i + \cdots + a_m x_i^m}{1 + b_1 x_i + \cdots + b_m x_i^m} = y_i,   i = 1, . . . , N.

This is a set of complicated nonlinear functions of the coefficient vectors a and b. If we multiply out
by the denominator, we get

y_i(1 + b_1 x_i + \cdots + b_m x_i^m) − (a_0 + a_1 x_i + \cdots + a_m x_i^m) = 0,   i = 1, . . . , N.

These equations are linear in a and b. We can write these equations in matrix form as

G \begin{bmatrix} a \\ b \end{bmatrix} = y,   (1)

where

a = \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{bmatrix},   b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix},

and

G = \begin{bmatrix}
1 & x_1 & \cdots & x_1^m & -y_1 x_1 & -y_1 x_1^2 & \cdots & -y_1 x_1^m \\
1 & x_2 & \cdots & x_2^m & -y_2 x_2 & -y_2 x_2^2 & \cdots & -y_2 x_2^m \\
1 & x_3 & \cdots & x_3^m & -y_3 x_3 & -y_3 x_3^2 & \cdots & -y_3 x_3^m \\
\vdots & \vdots & & \vdots & \vdots & \vdots & & \vdots \\
1 & x_N & \cdots & x_N^m & -y_N x_N & -y_N x_N^2 & \cdots & -y_N x_N^m
\end{bmatrix}.

Thus, we can interpolate the data if and only if the equation (1) has a solution. Our problem is to find
the smallest m for which these linear equations can be solved, or, equivalently, y ∈ R(G). We can do
this by finding the smallest value of m for which

Rank(G) = Rank([G y]).

Then we can find a set of coefficients by solving the equation (1) for a and b. The following Matlab
code carries out this method.

clear all
close all

rat_int_data
for m=1:20              % we sweep over different values of m
    G=ones(N,1);
    for i=1:m
        G=[G x.^i];
    end
    for i=1:m
        G=[G -x.^i.*y];
    end
    if rank(G)==rank([G y])
        break;
    end
end
ab=G\y;
a=ab(1:m+1);
b=ab(m+2:2*m+1);
m
a
b

Matlab produces the following output:

m =
5
a =
0.2742
1.0291
1.2906
-5.8763
-2.6738
6.6845
b =
-1.2513
-6.5107
3.2754
17.3797
6.6845

Thus, we find that m = 5 is the lowest order rational function that interpolates the data, and a rational
function that interpolates the data is given by

f(x) = \frac{0.2742 + 1.0291x + 1.2906x^2 − 5.8763x^3 − 2.6738x^4 + 6.6845x^5}{1 − 1.2513x − 6.5107x^2 + 3.2754x^3 + 17.3797x^4 + 6.6845x^5}
(we have truncated the coefficients to shorten the formula). We now verify that this expression
interpolates the given points.

num=zeros(N,1);
for i=1:m+1
    num=a(i)*x.^(i-1)+num;
end
den=ones(N,1);
for i=1:m
    den=b(i)*x.^i+den;
end
f=num./den;
err=norm(f-y)

Matlab produces the following output

err =
7.7649e-14.

This shows that the data is interpolated up to numerical precision.
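Equivalently, the numerator and denominator can be evaluated with polyval instead of explicit loops. A small alternative sketch (not part of the original solution; polyval expects coefficients in descending order):

% Sketch: the same verification using polyval.
num = polyval(a(end:-1:1), x);       % a0 + a1*x + ... + am*x^m
den = polyval([b(end:-1:1); 1], x);  % 1 + b1*x + ... + bm*x^m
norm(num./den - y)                   % same small residual as before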


5. Finding a basis for the intersection of ranges.
(a) Suppose you are given two matrices, A ∈ R^{n×p} and B ∈ R^{n×q}. Explain how you can find a
matrix C ∈ R^{n×r}, with independent columns, for which

R(C) = R(A) ∩ R(B).

This means that the columns of C are a basis for R(A) ∩ R(B).

Hint: begin by showing that if S_1 and S_2 are subspaces of R^n, then (S_1 ∩ S_2)^⊥ = S_1^⊥ + S_2^⊥, where
the notation “+” is overloaded for subspaces to mean: S_1 + S_2 = {x_1 + x_2 | x_1 ∈ S_1, x_2 ∈ S_2}.
Note that S_1 + S_2 is again a subspace.
(b) Carry out the method described in part (a) for the particular matrices A and B defined in
intersect_range_data.m. Be sure to give us your matrix C, as well as the Matlab (or other)
code that generated it. Verify that R(C) ⊆ R(A) and R(C) ⊆ R(B), by showing that each
column of C is in the range of A, and also in the range of B.
Please carefully separate your answers to part (a) (the general case) and part (b) (the specific case).

Solution.
(a) We know that

R(A) = N(A^T)^⊥.

This means that any y in R(A) is perpendicular to all vectors in N(A^T); and any vector
which is perpendicular to all vectors in N(A^T) must be in R(A). We will show that

R(A) ∩ R(B) = (N(A^T) + N(B^T))^⊥.

Let y be a vector in R(A) ∩ R(B). Then y = Ax_a for some x_a, and y = Bx_b for some x_b. Let
v be any vector in N(A^T) + N(B^T). Then v = v_a + v_b for some v_a ∈ N(A^T), v_b ∈ N(B^T).
Then we have

y^T v = y^T v_a + y^T v_b = x_a^T A^T v_a + x_b^T B^T v_b = x_a^T (A^T v_a) + x_b^T (B^T v_b) = 0.

Thus y ⊥ (N(A^T) + N(B^T)). Since any vector in R(A) ∩ R(B) is perpendicular to every vector
in N(A^T) + N(B^T),

R(A) ∩ R(B) ⊆ (N(A^T) + N(B^T))^⊥.

Let y be a vector in (N(A^T) + N(B^T))^⊥. Then y is perpendicular to all vectors in N(A^T), which
means y ∈ R(A). Similarly, y is perpendicular to all vectors in N(B^T), which means y ∈ R(B).
Thus y ∈ R(A) ∩ R(B), and we have

R(A) ∩ R(B) = (N(A^T) + N(B^T))^⊥.

The full QR factorization of a matrix A is

A = [Q_{1A} Q_{2A}] \begin{bmatrix} R_{1A} \\ 0 \end{bmatrix},

and N(A^T) = R(Q_{2A}). Similarly, let the full QR factorization of a matrix B be

B = [Q_{1B} Q_{2B}] \begin{bmatrix} R_{1B} \\ 0 \end{bmatrix},

and hence N(B^T) = R(Q_{2B}). Then

N(A^T) + N(B^T) = R(Q_{2A}) + R(Q_{2B}) = R(D),

where D = [Q_{2A} Q_{2B}]. Now,

R(A) ∩ R(B) = (N(A^T) + N(B^T))^⊥ = R(D)^⊥ = N(D^T).

So we find the QR factorization of D. Let the QR factorization be

D = [Q_{1D} Q_{2D}] \begin{bmatrix} R_{1D} \\ 0 \end{bmatrix},

and then C = Q_{2D}, as N(D^T) = R(Q_{2D}) = R(C). Thus we have the matrix C such that
R(C) = R(A) ∩ R(B).
(b) The following Matlab code gives the required matrix C and the dimension of R(C).
clear ;
intersect_range_data;
Q_2A = null(A’);
Q_2B = null(B’);
D = [Q_2A Q_2B];
C = null(D’)
rC = rank(C)
>>
C =
-0.3365 -0.2349 0.3581
0.2927 -0.4471 -0.0277
-0.6691 0.0460 0.0131
0.1963 0.3655 0.2581
0.3599 -0.1406 -0.1416
-0.0929 0.1880 -0.5108
0.1967 0.4497 0.3712
0.2019 -0.5007 0.0800
0.2901 0.2292 0.2283
-0.1140 -0.2208 0.5718
rC = 3
To show R(C) ⊆ R(A) and R(C) ⊆ R(B), we check that appending C to A (or to B) does not increase the rank.
rA = rank(A)
rAC = rank([A C])
rB = rank(B)
rBC = rank([B C])
>>

rA = 6
rAC = 6
rB = 5
rBC = 5
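A more direct per-column check (a sketch, not part of the original solution): a column c of C lies in R(A) exactly when the least-squares residual A*pinv(A)*c − c is zero, since A*pinv(A) projects onto R(A).

% Sketch: verify column-by-column that C's columns lie in R(A) and R(B).
for i = 1:size(C,2)
    c = C(:,i);
    fprintf('col %d: resA = %.2e, resB = %.2e\n', i, ...
        norm(A*pinv(A)*c - c), norm(B*pinv(B)*c - c));
end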
6. Signal estimation using least-squares. This problem concerns discrete-time signals defined for t =
1, . . . , 500. We’ll represent these signals by vectors in R^{500}, with the index corresponding to the time.
We are given a noisy measurement ymeas (1), . . . , ymeas (500), of a signal y(1), . . . , y(500) that is thought
to be, at least approximately, a linear combination of the 22 signals
 
f_k(t) = e^{−(t−50k)^2/25^2},   g_k(t) = \frac{t − 50k}{10} e^{−(t−50k)^2/25^2},
where t = 1, . . . , 500 and k = 0, . . . , 10. Plots of f4 and g7 (as examples) are shown below.

1.5

1
f4 (t)

0.5

−0.5
0 50 100 150 200 250 300 350 400 450 500

1.5

0.5
g7 (t)

−0.5

−1

−1.5
0 50 100 150 200 250 300 350 400 450 500

t
As our estimate of the original signal, we will use the signal ŷ = (ŷ(1), . . . , ŷ(500)) in the span of
f0 , . . . , f10 , g0 , . . . , g10 , that is closest to ymeas = (ymeas (1), . . . , ymeas (500)) in the RMS (root-mean-
square) sense. Explain how to find ŷ, and carry out your method on the signal ymeas given in
sig_est_data.m on the course web site. Plot ymeas and ŷ on the same graph. Plot the residual
(the difference between these two signals) on a different graph, and give its RMS value.

Solution. We’ll form the estimated signal as a linear combination of f_0, . . . , f_{10}, g_0, . . . , g_{10},

ŷ = x_1 f_0 + x_2 f_1 + \cdots + x_{11} f_{10} + x_{12} g_0 + \cdots + x_{22} g_{10}.

(Here we are representing the signals as vectors in R^{500}.) We can write this in matrix form as ŷ = Ax,
where

A = [f_0 f_1 \cdots f_{10} g_0 \cdots g_{10}] ∈ R^{500×22}.

The coefficients x are chosen to minimize the RMS deviation between ŷ and y_meas, which is the same
as minimizing the norm of the difference. The matrix A is full rank (Rank(A) = 22), so the best
coefficients are given by

x_ls = (A^T A)^{-1} A^T y_meas.

Our estimate of the original signal is

ŷ = Ax_ls = A(A^T A)^{-1} A^T y_meas.

The following Matlab code implements this estimation method.

sig_est_data;
nfcts = 22; ydim = 500;
t = 1:ydim;
A = zeros(ydim,nfcts);
for k = 1:nfcts/2
    fk = exp(-(t-50*(k-1)).^2/25^2);
    gk = (t-50*(k-1))/10.*exp(-(t-50*(k-1)).^2/25^2);
    A(:,k) = fk'; A(:,k+11) = gk';
end
yhat = A*(A\ymeas);
residual = ymeas-yhat;
RMS = 1/sqrt(500)*norm(residual);
figure(1); plot(t,ymeas,’g--’,t,yhat,’k’);
xlabel(’time’); ylabel(’fit’);
figure(2); plot(t,ymeas-yhat);
xlabel(’time’); ylabel(’residual’);
>> rank(A) = 22
>> RMS = 0.4596

The figure below shows the measured signal ymeas and the estimated signal ŷ. The RMS value of the
residual is 0.4596. The next figure shows the residual.

[Figure: the measured signal y_meas and the estimate ŷ versus t, for t = 1, . . . , 500.]

[Figure: the residual y_meas − ŷ versus t, for t = 1, . . . , 500.]
