AAB Week 21
\[ F(x, y, z) = \alpha x^2 + \beta y^2 + \gamma z^2, \]
where $0 < \alpha < \beta < \gamma$. Find the greatest value of $F$ on the sphere
\[ S = \{(x, y, z) \in \mathbb{R}^3 : x^2 + y^2 + z^2 = 1\}. \]
Geometrically, the level sets of $F$ are ellipsoids, so we look for points where these ellipsoids are tangent to the unit sphere. We will use the method of Lagrange multipliers. To this end we define the function
\[ g(x, y, z) = x^2 + y^2 + z^2 - 1. \]
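Then the unit sphere $S$ is the zero level set of $g$. The condition $\nabla F = \lambda \nabla g$, together with the constraint $g = 0$, gives
\[ 2\alpha x = 2\lambda x, \quad 2\beta y = 2\lambda y, \quad 2\gamma z = 2\lambda z, \quad x^2 + y^2 + z^2 = 1. \]
We can see that the first three equations can be written in the form
\[ Q\mathbf{x} = \lambda \mathbf{x}, \]
where
\[ Q = \begin{pmatrix} \alpha & 0 & 0 \\ 0 & \beta & 0 \\ 0 & 0 & \gamma \end{pmatrix} \quad \text{and} \quad \mathbf{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}. \]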
So this is a problem of finding eigenvalues and eigenvectors. Since the matrix $Q$ is diagonal, we solve this system quickly and combine it with the condition $x^2 + y^2 + z^2 = 1$. We obtain three cases:
Case 1. λ = α and (x, y, z) = (±1, 0, 0)
Case 2. λ = β and (x, y, z) = (0, ±1, 0)
Case 3. λ = γ and (x, y, z) = (0, 0, ±1).
We now need to check the value of the function $F$ at the six points "under suspicion":
\[ F(\pm 1, 0, 0) = \alpha, \quad F(0, \pm 1, 0) = \beta, \quad F(0, 0, \pm 1) = \gamma. \]
Hence the smallest value of $F$ on $S$ is equal to $\alpha$, and the greatest value is equal to $\gamma$. The six points that we needed to consider can be easily identified on the sphere: they are the points where the coordinate axes meet the sphere.
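The conclusion can be sanity-checked numerically. The sketch below is only an illustration: the coefficient values `ALPHA`, `BETA`, `GAMMA` and the helper `random_unit_vector` are assumptions made for the example, not part of the notes. It evaluates $F$ at the six candidate points and at randomly sampled points of the sphere.

```python
import math
import random

# Illustrative coefficients (an assumption; any 0 < alpha < beta < gamma works).
ALPHA, BETA, GAMMA = 1.0, 2.0, 5.0

def F(x, y, z):
    """The quadratic form alpha*x^2 + beta*y^2 + gamma*z^2."""
    return ALPHA * x * x + BETA * y * y + GAMMA * z * z

# The six candidate points produced by the Lagrange multiplier conditions.
candidates = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
candidate_values = [F(*p) for p in candidates]

def random_unit_vector(rng):
    """Draw a point on the unit sphere by normalising a Gaussian sample."""
    while True:
        x, y, z = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        r = math.sqrt(x * x + y * y + z * z)
        if r > 1e-12:
            return (x / r, y / r, z / r)

rng = random.Random(0)  # fixed seed so the check is reproducible
sampled_values = [F(*random_unit_vector(rng)) for _ in range(10_000)]

print("values at candidates:", candidate_values)
print("sampled range:", min(sampled_values), max(sampled_values))
```

Every sampled value should lie between the smallest and the greatest candidate value, which is what the Lagrange multiplier argument predicts.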
In other words, if $X = \{(x_i, y_i) : i = 1, 2, \dots, n\}$ and $PX = \{(\tilde{x}_i, \tilde{y}_i) : i = 1, 2, \dots, n\}$, where $(\tilde{x}_i, \tilde{y}_i)$, $i = 1, 2, \dots, n$, are the orthogonal projections of the elements of $X$ onto the line $l$, then we want to maximise the quantity
\[ \frac{1}{n} \sum_{i=1}^{n} |(\tilde{x}_i, \tilde{y}_i)|^2. \]
Let $u = (u_1, u_2)$ be a unit vector in the direction of the line $l$. From elementary results in linear algebra we have
\[ (\tilde{x}_i, \tilde{y}_i) = ((x_i, y_i) \cdot (u_1, u_2))\, u. \]
Hence, we want to maximise the function
\[ F(u_1, u_2) = \frac{1}{n} \sum_{i=1}^{n} |((x_i, y_i) \cdot (u_1, u_2))\, u|^2 = \frac{1}{n} \sum_{i=1}^{n} ((x_i, y_i) \cdot (u_1, u_2))^2. \]
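The two expressions for $F$ can be compared numerically. The following is a minimal sketch; the data points, the angle `theta` and the function names are hypothetical choices made for illustration. It projects each point onto the line spanned by a unit vector $u$ and checks that the mean squared length of the projections equals the mean squared dot product.

```python
import math

def project_onto_line(point, u):
    """Orthogonal projection of `point` onto the line through 0 spanned by the unit vector u."""
    dot = point[0] * u[0] + point[1] * u[1]
    return (dot * u[0], dot * u[1])

def mean_squared_projection(points, u):
    """(1/n) * sum of squared lengths of the projected points."""
    total = 0.0
    for p in points:
        px, py = project_onto_line(p, u)
        total += px * px + py * py
    return total / len(points)

def mean_squared_dot(points, u):
    """(1/n) * sum of squared dot products with u."""
    return sum((p[0] * u[0] + p[1] * u[1]) ** 2 for p in points) / len(points)

# Hypothetical data, chosen only to exercise the formulas.
points = [(1.0, 2.0), (-2.0, 0.5), (0.5, -1.5), (0.5, -1.0)]
theta = 0.7  # arbitrary direction of the line
u = (math.cos(theta), math.sin(theta))

print(mean_squared_projection(points, u), mean_squared_dot(points, u))
```

The two printed numbers agree up to rounding because $|u| = 1$, which is the step used in the last equality above.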
Example 46.1. Consider the following four points:
\[ (x_1, y_1) = (-\sqrt{3}, 0), \quad (x_2, y_2) = (\sqrt{3}, 0), \quad (x_3, y_3) = (-1, -2), \quad (x_4, y_4) = (1, 2). \]
\[ \nabla F(u) = 2Su. \]
The gradient of $g$, which we have calculated many times before, can then be expressed as follows:
\[ \nabla g(u) = 2u. \]
The condition that the gradient of $F$ is parallel to the gradient of $g$ is expressed as
\[ 2Su = \lambda\, 2u. \]
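We have
\[ \det(S - \lambda I) = 0 \iff (2 - \lambda)(2 - \lambda) - 1 = 0. \]
We solve this quadratic equation and get $\lambda = 1$ or $\lambda = 3$. If $\lambda = 1$ then the eigenvectors are given by
\[ \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix}, \]
so we get $(u_1, u_2) = \alpha(1, -1)$. A unit vector in this direction is thus given by
\[ w = \frac{(-1, 1)}{\sqrt{2}}. \]
We note that
\[ F(w) = 1. \]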
When $\lambda = 3$, using a similar technique as above, we find that the corresponding eigenvector is given by $(u_1, u_2) = \alpha(1, 1)$. The unit vector in this direction is given by
\[ v = \frac{(1, 1)}{\sqrt{2}}. \]
We note that
\[ F(v) = 3. \]
Hence the greatest value of $F$ on the unit circle is 3, and the smallest is equal to 1.
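These two values can be checked directly from the four points of Example 46.1, without going through the matrix $S$. The sketch below is an illustration only; the scan over 3600 angles is an assumption made for the check, not a method from the notes.

```python
import math

# The four data points of Example 46.1.
points = [(-math.sqrt(3), 0.0), (math.sqrt(3), 0.0), (-1.0, -2.0), (1.0, 2.0)]

def F(u):
    """Mean squared dot product of the data points with the vector u."""
    return sum((x * u[0] + y * u[1]) ** 2 for x, y in points) / len(points)

w = (-1 / math.sqrt(2), 1 / math.sqrt(2))  # unit eigenvector for lambda = 1
v = (1 / math.sqrt(2), 1 / math.sqrt(2))   # unit eigenvector for lambda = 3

# Brute-force scan of the unit circle as an independent check of the extrema.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
scan = [F((math.cos(t), math.sin(t))) for t in angles]

print("F(w) =", F(w), " F(v) =", F(v))
print("scan range:", min(scan), max(scan))
```

Up to rounding, `F(w)` and `F(v)` come out as 1 and 3, and no direction in the scan falls outside that range.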
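\[ F(u) = u^T S u. \]
\[ w \cdot v = 0 \]
(according to the Theorem below). Hence if we write a vector $u$ as a linear combination of $v$ and $w$, $u = \alpha w + \beta v$, we obtain
\[ F(u) = \lambda_1 \alpha^2 + \lambda_2 \beta^2, \]
where $Sw = \lambda_1 w$ and $Sv = \lambda_2 v$. Since the vectors $v$ and $w$ are orthonormal, $|u|^2 = \alpha^2 + \beta^2$, and it follows that the whole problem can now be written as follows: maximise $\lambda_1 \alpha^2 + \lambda_2 \beta^2$ subject to $\alpha^2 + \beta^2 = 1$.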
Following the ideas from Example 45.1:
• the greatest value of $F$ on the unit circle is attained when $\alpha = 0$, $\beta = \pm 1$. We then have $F(v) = \lambda_2$.
48 Dimension reduction algorithm II
The results of the previous section can be summarised in the following algorithm that can be
easily generalised.
Consider the set $X$ of $n$ data points in $\mathbb{R}^2$:
\[ X = \{(x_i, y_i) : i = 1, 2, \dots, n\}. \]
We assume that the mean value of the $x_i$ and the mean value of the $y_i$ are zero and that the variance of each coordinate is 1.
For a given line $l$: $y = \alpha x$, we consider the orthogonal projection of $X$ onto the line $l$:
\[ PX = \{((x_i, y_i) \cdot u)\, u : i = 1, 2, \dots, n\}, \]
where $u$ is a unit vector parallel to $l$. We want to find a line $y = \alpha x$ that, roughly speaking, maximises "the variance" of $PX$ on the line $l$.
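STEP 1. We write the problem in the following form: maximise the function
\[ F(u) = \frac{1}{n} \sum_{i=1}^{n} ((x_i, y_i) \cdot (u_1, u_2))^2 = u \cdot (Su), \]
where
\[ S = \begin{pmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{pmatrix} \quad \text{and} \quad s_{11} = \frac{1}{n} \sum_{i=1}^{n} x_i^2, \quad s_{12} = s_{21} = \frac{1}{n} \sum_{i=1}^{n} x_i y_i, \quad s_{22} = \frac{1}{n} \sum_{i=1}^{n} y_i^2. \]
Note that in the previous section we considered
\[ S = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}. \]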
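As an illustration of STEP 1, the matrix $S$ can be assembled directly from the data. The sketch below uses hypothetical function names (`second_moment_matrix`, `quadratic_form`) and the four points of Example 46.1, for which the entries come out as the matrix noted above.

```python
import math

def second_moment_matrix(points):
    """Return S = [[s11, s12], [s21, s22]] built from the averaged sums in STEP 1."""
    n = len(points)
    s11 = sum(x * x for x, _ in points) / n
    s12 = sum(x * y for x, y in points) / n
    s22 = sum(y * y for _, y in points) / n
    return [[s11, s12], [s12, s22]]

def quadratic_form(S, u):
    """Evaluate u . (S u) for a 2x2 matrix S and a vector u."""
    su = (S[0][0] * u[0] + S[0][1] * u[1], S[1][0] * u[0] + S[1][1] * u[1])
    return u[0] * su[0] + u[1] * su[1]

# The four data points of Example 46.1.
points = [(-math.sqrt(3), 0.0), (math.sqrt(3), 0.0), (-1.0, -2.0), (1.0, 2.0)]
S = second_moment_matrix(points)
print(S)  # approximately [[2, 1], [1, 2]], up to floating-point rounding
```

With `S` in hand, `quadratic_form(S, u)` gives the same number as averaging the squared dot products over the data, which is the identity $F(u) = u \cdot (Su)$.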
STEP 2. We find two orthonormal eigenvectors and two corresponding eigenvalues of $S$:
\[ Sw = \lambda_1 w, \quad Sv = \lambda_2 v, \quad v \perp w, \quad |v| = |w| = 1, \quad 0 \le \lambda_1 \le \lambda_2. \]
Note that in the previous example we found the following eigenvectors and eigenvalues:
• $w = \frac{(-1, 1)}{\sqrt{2}}$, corresponding to the eigenvalue $\lambda_1 = 1$, and
• $v = \frac{(1, 1)}{\sqrt{2}}$, corresponding to the eigenvalue $\lambda_2 = 3$.
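For a symmetric 2x2 matrix, STEP 2 can be carried out in closed form. The sketch below does not follow the characteristic-polynomial route used in the example; it uses the equivalent principal-axis angle $\theta$ with $\tan 2\theta = 2 s_{12} / (s_{11} - s_{22})$. The function name `symmetric_2x2_eigen` is hypothetical.

```python
import math

def symmetric_2x2_eigen(S):
    """Return (lam1, w, lam2, v) with lam1 <= lam2 and orthonormal eigenvectors w, v.

    Assumes S = [[a, b], [b, d]] is symmetric.
    """
    a, b, d = S[0][0], S[0][1], S[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    lam1, lam2 = mean - radius, mean + radius
    if radius == 0.0:
        # S is a multiple of the identity: every orthonormal pair works.
        return lam1, (1.0, 0.0), lam2, (0.0, 1.0)
    # Direction maximising u . (S u) on the unit circle.
    theta = 0.5 * math.atan2(2 * b, a - d)
    v = (math.cos(theta), math.sin(theta))
    w = (-math.sin(theta), math.cos(theta))  # v rotated by 90 degrees
    return lam1, w, lam2, v

lam1, w, lam2, v = symmetric_2x2_eigen([[2.0, 1.0], [1.0, 2.0]])
print(lam1, w)
print(lam2, v)
```

For the matrix of the previous example this reproduces, up to rounding, the eigenvalues 1 and 3 with eigenvectors $(-1, 1)/\sqrt{2}$ and $(1, 1)/\sqrt{2}$. Eigenvectors are only determined up to sign, so for other matrices the returned vectors may differ from a hand computation by a factor of $-1$.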
STEP 3. In the basis $w$, $v$ the function $F$ can be expressed in a very simple form:
\[ F(\alpha w + \beta v) = \lambda_1 \alpha^2 + \lambda_2 \beta^2. \]
It follows that the smallest value of F on the circle |u| = 1 is equal to λ1 and is attained in the
direction ±w, while the greatest value of F is equal to λ2 and is attained in the direction ±v.
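For completeness, the simple form in STEP 3 and the resulting bounds can be derived in a few lines. This is a sketch that uses only the relations listed in STEP 2 ($Sw = \lambda_1 w$, $Sv = \lambda_2 v$, $w \cdot v = 0$, $|v| = |w| = 1$) and the constraint $\alpha^2 + \beta^2 = 1$ on the unit circle.

```latex
\begin{align*}
F(\alpha w + \beta v)
  &= (\alpha w + \beta v) \cdot S(\alpha w + \beta v) \\
  &= (\alpha w + \beta v) \cdot (\lambda_1 \alpha w + \lambda_2 \beta v) \\
  &= \lambda_1 \alpha^2 (w \cdot w)
     + (\lambda_1 + \lambda_2)\,\alpha\beta\,(w \cdot v)
     + \lambda_2 \beta^2 (v \cdot v) \\
  &= \lambda_1 \alpha^2 + \lambda_2 \beta^2 .
\end{align*}
% On the unit circle, \alpha^2 + \beta^2 = 1 and \lambda_1 \le \lambda_2, so
\[
\lambda_1 = \lambda_1(\alpha^2 + \beta^2)
  \le \lambda_1 \alpha^2 + \lambda_2 \beta^2
  \le \lambda_2(\alpha^2 + \beta^2) = \lambda_2 .
\]
```

The lower bound is attained at $(\alpha, \beta) = (\pm 1, 0)$, that is $u = \pm w$, and the upper bound at $(\alpha, \beta) = (0, \pm 1)$, that is $u = \pm v$.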
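Remark. The algorithm can be easily generalised. For example, consider a set
\[ X = \{(x_i, y_i, z_i) : i = 1, 2, \dots, n\} \]
of $n$ data points in $\mathbb{R}^3$. The 1-dimensional space which maximises the "variance" of projected elements of $X$ is obtained as follows: First we define the symmetric matrix $S$ with elements:
\[ s_{11} = \frac{1}{n} \sum_{i=1}^{n} x_i^2, \quad s_{22} = \frac{1}{n} \sum_{i=1}^{n} y_i^2, \quad s_{33} = \frac{1}{n} \sum_{i=1}^{n} z_i^2, \]
\[ s_{12} = s_{21} = \frac{1}{n} \sum_{i=1}^{n} x_i y_i, \quad s_{13} = s_{31} = \frac{1}{n} \sum_{i=1}^{n} x_i z_i, \quad s_{23} = s_{32} = \frac{1}{n} \sum_{i=1}^{n} y_i z_i. \]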
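A possible numerical version of this generalisation is sketched below. It assumes, by analogy with the two-dimensional algorithm, that the direction sought is a unit eigenvector of $S$ for its largest eigenvalue. The eigenvector is approximated by power iteration, a technique swapped in here for illustration and not taken from the notes. The data points and function names are hypothetical.

```python
import math

def second_moment_matrix_3d(points):
    """Build the symmetric 3x3 matrix S with s_jk = (1/n) * sum of p_j * p_k."""
    n = len(points)
    S = [[0.0] * 3 for _ in range(3)]
    for p in points:
        for j in range(3):
            for k in range(3):
                S[j][k] += p[j] * p[k] / n
    return S

def mat_vec(S, u):
    """Matrix-vector product S u."""
    return [sum(S[j][k] * u[k] for k in range(3)) for j in range(3)]

def normalise(u):
    """Scale u to unit length."""
    r = math.sqrt(sum(c * c for c in u))
    return [c / r for c in u]

def leading_direction(S, iterations=500):
    """Approximate the largest eigenvalue and a unit eigenvector by power iteration.

    Assumes the start vector is not orthogonal to the leading eigenvector
    and that the largest eigenvalue is strictly larger than the others.
    """
    u = normalise([1.0, 1.0, 1.0])  # arbitrary start vector
    for _ in range(iterations):
        su = mat_vec(S, u)
        if all(abs(c) < 1e-15 for c in su):
            break  # S u is numerically zero; stop instead of dividing by zero
        u = normalise(su)
    lam = sum(a * b for a, b in zip(u, mat_vec(S, u)))  # Rayleigh quotient u . (S u)
    return lam, u

# Hypothetical centred data, spread mostly along one direction.
points = [(1, 2, 2), (-1, -2, -2), (2, 4, 4.5), (-2, -4, -4.5),
          (0.5, -0.5, 0), (-0.5, 0.5, 0)]
S = second_moment_matrix_3d(points)
lam, u = leading_direction(S)
print("largest eigenvalue (approx.):", lam)
print("direction (approx.):", u)
```

Power iteration converges slowly when the two largest eigenvalues are close, so a library eigen-solver would be the more robust choice in practice.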