Assignment 2
Abhinav Pradeep
Tutorial group 12
September 3, 2023
1 Question 1.
Note, all code was written in one file:
% Objective function z(x,y), with x(1) = x and x(2) = y
z = @(x) (x(1)^2+x(2)-11).^2+(x(1)+x(2)^2-7).^2;
% Part a: search for a minimum from four different starting points
Min1 = fminsearch(z,[1,1]);
disp("Minimum at: (x,y)");
disp(Min1);
Min2 = fminsearch(z,[-1,1]);
disp("Minimum at: (x,y)");
disp(Min2);
Min3 = fminsearch(z,[-1,-1]);
disp("Minimum at: (x,y)");
disp(Min3);
Min4 = fminsearch(z,[2,-1]);
disp("Minimum at: (x,y)");
disp(Min4);
% Part b: surface plot of z over [-5,5] x [-5,5]
[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
hSurf = surf(X,Y,Z);
xlabel('X')
ylabel('Y')
zlabel('Z')
% Part c: mark the four minima on the surface, plotted at height 0
hold on
plot3(-3.779310,-3.283186,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(-2.805118,3.131312,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(3.584428,-1.848126,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(3.000000, 2.000000, 0,'ro','MarkerSize',10,'MarkerFaceColor','r')
1.1 a
z = @(x) (x(1)^2+x(2)-11).^2+(x(1)+x(2)^2-7).^2;
Min1 = fminsearch(z,[1,1]);
disp("Minimum at: (x,y)");
disp(Min1);
Min2 = fminsearch(z,[-1,1]);
disp("Minimum at: (x,y)");
disp(Min2);
Min3 = fminsearch(z,[-1,-1]);
disp("Minimum at: (x,y)");
disp(Min3);
Min4 = fminsearch(z,[2,-1]);
disp("Minimum at: (x,y)");
disp(Min4);
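As a quick sanity check on part a, the four searches can be run in a loop and the objective value printed at each result; a value close to zero suggests the search has reached a genuine minimum of z rather than stalling. This is only a minimal sketch: the variable names `starts`, `mins` and `vals` are illustrative and not part of the original script.

```matlab
% Objective function from part a
z = @(x) (x(1)^2+x(2)-11).^2+(x(1)+x(2)^2-7).^2;

% Starting points used in part a (one per row)
starts = [1 1; -1 1; -1 -1; 2 -1];

mins = zeros(size(starts));        % located minimisers, one per row
vals = zeros(size(starts,1),1);    % objective value at each minimiser
for i = 1:size(starts,1)
    [mins(i,:), vals(i)] = fminsearch(z, starts(i,:));
    fprintf('start (%g,%g) -> min (%.6f,%.6f), z = %.2e\n', ...
        starts(i,:), mins(i,:), vals(i));
end
```

Requesting the second output of `fminsearch` avoids re-evaluating z by hand at each returned point.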
1.2 b
[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
hSurf = surf(X,Y,Z);
xlabel('X')
ylabel('Y')
zlabel('Z')
1.3 c
[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
hSurf = surf(X,Y,Z);
xlabel('X')
ylabel('Y')
zlabel('Z')
hold on
plot3(-3.779310,-3.283186,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(-2.805118,3.131312,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(3.584428,-1.848126,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(3.000000, 2.000000, 0,'ro','MarkerSize',10,'MarkerFaceColor','r')
Console outputs:
2 Question 2.
\[ z(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2 \]
\[ x + y = 0 \]
Let g(x, y) = x + y
Solutions of the below system of equations are the extrema along restriction x + y = 0:
∇z = λ∇g
x+y =0
Find ∇z:
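\[ \nabla z = \left( \frac{\partial}{\partial x}\left[ (x^2 + y - 11)^2 + (x + y^2 - 7)^2 \right],\ \frac{\partial}{\partial y}\left[ (x^2 + y - 11)^2 + (x + y^2 - 7)^2 \right] \right) \]
\[ \nabla z = \left( \frac{\partial}{\partial x}(x^2 + y - 11)^2 + \frac{\partial}{\partial x}(x + y^2 - 7)^2,\ \frac{\partial}{\partial y}(x^2 + y - 11)^2 + \frac{\partial}{\partial y}(x + y^2 - 7)^2 \right) \]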
Find ∇g:
\[ \nabla g = \left( \frac{\partial}{\partial x}(x + y),\ \frac{\partial}{\partial y}(x + y) \right) \]
\[ \nabla g = (1, 1) \]
Therefore, the system of equations becomes
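\[ \begin{pmatrix} 4x(x^2 + y - 11) + 2(x + y^2 - 7) \\ 2(x^2 + y - 11) + 4y(x + y^2 - 7) \end{pmatrix} = \lambda \cdot \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad x + y = 0 \]
\[ \begin{pmatrix} 4x(x^2 + y - 11) + 2(x + y^2 - 7) \\ 2(x^2 + y - 11) + 4y(x + y^2 - 7) \end{pmatrix} = \begin{pmatrix} \lambda \\ \lambda \end{pmatrix}, \qquad x + y = 0 \]
Hence, since both components equal $\lambda$, the system of equations turns to:
\[ 4x(x^2 + y - 11) + 2(x + y^2 - 7) = 2(x^2 + y - 11) + 4y(x + y^2 - 7) \]
\[ x + y = 0 \]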
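Because both gradient components equal λ, they equal each other; substituting y = −x from the constraint then leaves a single cubic in x, namely 2x³ − 17x + 2 = 0. The sketch below is an illustrative check, not part of the original submission: it finds the real roots of that cubic and evaluates z at the corresponding points on the line. The names `r`, `pts` and `vals` are assumptions introduced here.

```matlab
z = @(x) (x(1)^2+x(2)-11).^2+(x(1)+x(2)^2-7).^2;

% Coefficients of 2x^3 + 0x^2 - 17x + 2, obtained by setting
% dz/dx = dz/dy with y = -x substituted.
r = roots([2 0 -17 2]);
r = sort(real(r(abs(imag(r)) < 1e-9)));   % keep the real roots only

pts  = [r, -r];                            % points (x, -x) on the constraint
vals = arrayfun(@(i) z(pts(i,:)), (1:numel(r))');
disp([pts, vals]);                         % columns: x, y, z(x,y)
```

Each real root is a critical point of z restricted to the line, so comparing the values in the last column shows which candidates are minima along the constraint.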
Plotting these relations on Desmos,
2.1 a, b, c done in one file:
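z = @(x) (x(1)^2+x(2)-11).^2+(x(1)+x(2)^2-7).^2;
[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
% Contour plot of z with levels every 5 units
contour(X,Y,Z, 'LevelStep', 5)
hold on
% Overlay the constraint line x + y = 0
fimplicit(@(x,y) x+y, [-5 5 -5 5])
% Minimise z subject to x + y = 0. The constraint is an equality, so it is
% passed as Aeq, beq; the inequality arguments A, b are left empty.
Aeq = [1 1];
beq = 0;
x0 = [1,1];
Min = fmincon(z,x0,[],[],Aeq,beq);
disp(Min);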
Console output:
This agrees with my understanding of Lagrange multipliers, as the minimum calculated by MATLAB
is a solution of the Lagrange multiplier system of equations derived earlier. This is shown below:
It can be seen that the system of equations does yield a solution at x = 2.8548 and y = −2.8548.
Intuitively, the minimum is at a point where the constraint curve is tangent to a contour line.
Contour lines are level curves, and the constraint curve is also a level curve. For any point $P$ on
a level curve $f = 0$, where $f : \mathbb{R}^n \to \mathbb{R}$, it can be shown that $\nabla f(P) \perp (f = 0)@P$
(note $@P$ is shorthand for: at point $P$). Below is a proof of this:
\[ f(r(t)) = 0 \]
That is, the level curve defined by $f = 0$ is parameterised by $r(t)$. Differentiating with respect to $t$,
\[ \frac{d}{dt} f(r(t)) = 0 \]
\[ \nabla f(r(t)) \cdot \frac{d}{dt} r(t) = 0 \]
The above equation says that for any point $P$ on $r(t)$, as $\frac{d}{dt} r(t)@P$ will yield a vector tangent to
$r(t)@P$, the dot product of $\nabla f(P)$ and $\frac{d}{dt} r(t)@P$ will be 0. Hence, $\nabla f(P) \perp r(t)@P$.
By the previous fact, when two level curves in $\mathbb{R}^2$ are tangent to one another, their gradients are
parallel to one another. That is, for the level curve $z(x, y) = A$ and the constraint curve $g(x, y) = 0$
(where $g(x, y) = x + y$), if they are tangent at some point $P$, then at $P$:
\[ \nabla z(P) = \lambda \nabla g(P) \]
This is precisely the Lagrange multiplier condition. Hence this does agree with my conceptual
understanding of Lagrange multipliers.
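The parallel-gradient condition can also be checked numerically at the reported point. The sketch below is illustrative only: `gradz` is typed in by hand from the partial derivatives of z in this question, and `P`, `crossTerm` and `lambda` are names assumed here. For two vectors in the plane, a vanishing 2-D cross product indicates that they are parallel.

```matlab
% Analytic gradient of z, from the partial derivatives in Question 2
gradz = @(x) [4*x(1)*(x(1)^2+x(2)-11) + 2*(x(1)+x(2)^2-7), ...
              2*(x(1)^2+x(2)-11) + 4*x(2)*(x(1)+x(2)^2-7)];
gradg = [1 1];                 % gradient of g(x,y) = x + y

P  = [2.8548, -2.8548];        % constrained minimiser reported above (rounded)
gz = gradz(P);

% 2-D cross product of grad z and grad g; near zero when they are parallel
crossTerm = gz(1)*gradg(2) - gz(2)*gradg(1);
lambda    = gz(1);             % as grad g = (1,1), lambda equals either component
fprintf('grad z = (%.4f, %.4f), cross term = %.2e, lambda ~ %.4f\n', ...
    gz, crossTerm, lambda);
```

Since `P` is rounded to four decimal places, the cross term is small rather than exactly zero.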
3 Question 3.
3.1 a
$f : \mathbb{R}^n \to \mathbb{R}$ is differentiable at $a$. That is:
\[ \lim_{k \to 0} \frac{f(a + k) - f(a) - \nabla f(a) \cdot k}{\|k\|} = 0 \]
where $k \in \mathbb{R}^n$.
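Prove:
\[ \exists\, E(k) = (E_1(k), E_2(k), \dots, E_n(k)) \]
where $E_i : \mathbb{R}^n \to \mathbb{R}$, such that
\[ f(a + k) - f(a) = \nabla f(a) \cdot k + E(k) \cdot k, \qquad \lim_{k \to 0} E_i(k) = 0 \]
For $k \neq 0$, define $s(k)$ by
\[ \frac{f(a + k) - f(a)}{\|k\|} = \frac{\nabla f(a) \cdot k}{\|k\|} + s(k) \]
\[ \frac{f(a + k) - f(a)}{\|k\|} - \frac{\nabla f(a) \cdot k}{\|k\|} = s(k) \]
\[ \frac{f(a + k) - f(a) - \nabla f(a) \cdot k}{\|k\|} = s(k) \]
Taking the limit as $k \to 0$,
\[ \lim_{k \to 0} \frac{f(a + k) - f(a) - \nabla f(a) \cdot k}{\|k\|} = \lim_{k \to 0} s(k) \]
By differentiability,
\[ \lim_{k \to 0} \frac{f(a + k) - f(a) - \nabla f(a) \cdot k}{\|k\|} = 0 \]
Hence
\[ \lim_{k \to 0} s(k) = 0 \]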
Multiplying the definition of $s(k)$ by $\|k\|$ shows that the required $E(k)$ must satisfy
\[ E(k) \cdot k = s(k)\|k\| \]
To satisfy this condition, set, for $k \neq 0$ (and $E(0) = 0$):
\[ E_i(k) = \frac{s(k)\, k_i}{\|k\|} \]
\[ E(k) = \left( \frac{s(k)\, k_1}{\|k\|}, \frac{s(k)\, k_2}{\|k\|}, \frac{s(k)\, k_3}{\|k\|}, \dots, \frac{s(k)\, k_n}{\|k\|} \right) \]
This would ensure that
\[ E(k) \cdot k = \sum_{i=1}^{n} \frac{s(k)\, k_i}{\|k\|} \cdot k_i \]
\[ E(k) \cdot k = \frac{s(k)}{\|k\|} \sum_{i=1}^{n} k_i^2 \]
\[ E(k) \cdot k = \frac{s(k)}{\|k\|} \|k\|^2 \]
\[ E(k) \cdot k = s(k)\|k\| \]
Now consider that $\forall\, i \in [1, n]$, as $k_i$ is just one component of $k$,
\[ |k_i| \le \|k\| \]
This implies that
\[ -1 \le \frac{k_i}{\|k\|} \le 1 \]
\[ -|s(k)| \le \frac{s(k)\, k_i}{\|k\|} \le |s(k)| \]
As $E_i(k) = \frac{s(k)\, k_i}{\|k\|}$,
\[ -|s(k)| \le E_i(k) \le |s(k)| \]
Taking the limit as $k \to 0$,
\[ \lim_{k \to 0} -|s(k)| \le \lim_{k \to 0} E_i(k) \le \lim_{k \to 0} |s(k)| \]
As $\lim_{k \to 0} s(k) = 0$, the squeeze theorem gives
\[ 0 \le \lim_{k \to 0} E_i(k) \le 0 \]
\[ \lim_{k \to 0} E_i(k) = 0 \]
3.2 b
Prove that:
Where $(x(t_0 + h) - x(t_0)) \in \mathbb{R}^n$ (as $x(t) \in \mathbb{R}^n$) and $0 \in \mathbb{R}^n$ is the n-dimensional zero vector.
Consider:
\[ \lim_{k \to 0} E_i(k) = 0 \]
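4 Question 4.
4.1 a
\[ \nabla_x \|x - y\|^2 = \left( \frac{\partial}{\partial x_1} \sum_{i=1}^{n} (x_i - y_i)^2,\ \frac{\partial}{\partial x_2} \sum_{i=1}^{n} (x_i - y_i)^2,\ \frac{\partial}{\partial x_3} \sum_{i=1}^{n} (x_i - y_i)^2,\ \dots,\ \frac{\partial}{\partial x_n} \sum_{i=1}^{n} (x_i - y_i)^2 \right) \]
For $j \in [1, n]$
\[ \frac{\partial}{\partial x_j} \sum_{i=1}^{n} (x_i - y_i)^2 = 2(x_j - y_j) \]
As,
\[ \frac{\partial}{\partial x_j} (x_i - y_i)^2 = \begin{cases} 2(x_i - y_i) & \text{when } i = j \\ 0 & \text{when } i \neq j \end{cases} \]
Hence,
\[ \nabla_x \|x - y\|^2 = 2(x - y) \]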
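The component-wise derivative above gives the gradient 2(x − y), which is the form used later in 4.c. A finite-difference comparison offers a quick numerical check. This is a minimal sketch under assumed values: the dimension `n`, the random test vectors, the step `h` and all variable names are illustrative choices rather than anything from the assignment.

```matlab
rng(0);                         % fixed seed so the check is repeatable
n = 4;                          % illustrative dimension
x = randn(n,1);
y = randn(n,1);

f = @(v) norm(v - y)^2;         % f(x) = ||x - y||^2
h = 1e-6;                       % finite-difference step

numGrad = zeros(n,1);
for j = 1:n
    e = zeros(n,1); e(j) = 1;   % j-th unit vector
    numGrad(j) = (f(x + h*e) - f(x - h*e)) / (2*h);   % central difference
end

analyticGrad = 2*(x - y);
err = norm(numGrad - analyticGrad);
fprintf('norm of difference = %.2e\n', err);
```

Because f is quadratic, the central difference has no truncation error, so any remaining discrepancy is floating-point rounding.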
4.2 b
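\[ A : k \times n \]
\[ x : n \times 1 \]
\[ u = Ax \]
Hence, the dimensions of $u$:
\[ u : k \times 1 \]
\[ u = \begin{pmatrix} \sum_{i=1}^{n} A_{1,i} x_i \\ \sum_{i=1}^{n} A_{2,i} x_i \\ \sum_{i=1}^{n} A_{3,i} x_i \\ \vdots \\ \sum_{i=1}^{n} A_{k,i} x_i \end{pmatrix} \]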
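\[ g : \mathbb{R}^k \to \mathbb{R} \]
Show that
\[ \nabla_x g(u) = A^T \nabla_u g(u) \]
\[ \nabla_x : n \times 1 \]
\[ A^T = \begin{pmatrix} A_{1,1} & \dots & A_{k,1} \\ \vdots & \ddots & \vdots \\ A_{1,n} & \dots & A_{k,n} \end{pmatrix} \]
\[ A^T : n \times k \]
\[ \nabla_u : k \times 1 \]
\[ g(u) \in \mathbb{R} \]
\[ g(u) = g(u_1, u_2, u_3, \dots, u_k) \]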
As $u = Ax$, each component of $u$ is
\[ u_j = \sum_{i=1}^{n} A_{j,i} x_i \]
Hence $u_j = u_j(x)$. That is, $u_j : \mathbb{R}^n \to \mathbb{R}$.
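\[ \nabla_x g(u) = \begin{pmatrix} \frac{\partial}{\partial x_1} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_2} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_3} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \vdots \\ \frac{\partial}{\partial x_n} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \end{pmatrix} \]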
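Consider
\[ A^T \nabla_u g(u) \]
\[ \nabla_u g(u) = \begin{pmatrix} \frac{\partial}{\partial u_1} g(u_1, u_2, u_3, \dots, u_k) \\ \frac{\partial}{\partial u_2} g(u_1, u_2, u_3, \dots, u_k) \\ \frac{\partial}{\partial u_3} g(u_1, u_2, u_3, \dots, u_k) \\ \vdots \\ \frac{\partial}{\partial u_k} g(u_1, u_2, u_3, \dots, u_k) \end{pmatrix} \]
\[ A^T \nabla_u g(u) = \begin{pmatrix} A_{1,1} & \dots & A_{k,1} \\ A_{1,2} & \dots & A_{k,2} \\ A_{1,3} & \dots & A_{k,3} \\ \vdots & & \vdots \\ A_{1,n} & \dots & A_{k,n} \end{pmatrix} \cdot \begin{pmatrix} \frac{\partial}{\partial u_1} g(u_1, u_2, u_3, \dots, u_k) \\ \frac{\partial}{\partial u_2} g(u_1, u_2, u_3, \dots, u_k) \\ \frac{\partial}{\partial u_3} g(u_1, u_2, u_3, \dots, u_k) \\ \vdots \\ \frac{\partial}{\partial u_k} g(u_1, u_2, u_3, \dots, u_k) \end{pmatrix} \]
\[ A^T \nabla_u g(u) = \begin{pmatrix} \sum_{i=1}^{k} A_{i,1} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,2} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,3} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \vdots \\ \sum_{i=1}^{k} A_{i,n} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \end{pmatrix} \]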
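Hence, to show:
\[ \nabla_x g(u) = A^T \nabla_u g(u) \]
it must be shown that
\[ \begin{pmatrix} \frac{\partial}{\partial x_1} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_2} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_3} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \vdots \\ \frac{\partial}{\partial x_n} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{k} A_{i,1} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,2} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,3} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \vdots \\ \sum_{i=1}^{k} A_{i,n} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \end{pmatrix} \]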
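Consider
\[ \frac{\partial}{\partial x_j} u_i(x) \]
where, from $u = Ax$,
\[ u_i = \sum_{l=1}^{n} A_{i,l} x_l \]
\[ \frac{\partial}{\partial x_j} \sum_{l=1}^{n} A_{i,l} x_l = A_{i,j} \]
As
\[ \frac{\partial}{\partial x_j} A_{i,l} x_l = \begin{cases} A_{i,l} & \text{when } l = j \\ 0 & \text{when } l \neq j \end{cases} \]
Therefore,
\[ \frac{\partial}{\partial x_j} u_i(x) = A_{i,j} \]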
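Hence
\[ \sum_{i=1}^{k} \frac{\partial}{\partial x_j} u_i(x) \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) = \sum_{i=1}^{k} A_{i,j} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \]
By the multivariable chain rule, the left-hand side is the partial derivative of the composite function with respect to $x_j$, so
\[ \frac{\partial}{\partial x_j} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) = \sum_{i=1}^{k} A_{i,j} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \]
This holds for $j \in [1, n]$.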
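Hence,
\[ \begin{pmatrix} \frac{\partial}{\partial x_1} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_2} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_3} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \vdots \\ \frac{\partial}{\partial x_n} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{k} A_{i,1} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,2} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,3} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \vdots \\ \sum_{i=1}^{k} A_{i,n} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \end{pmatrix} \]
\[ \nabla_x g(u) = A^T \nabla_u g(u) \]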
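The identity can be spot-checked numerically by comparing a finite-difference gradient of x ↦ g(Ax) with Aᵀ∇ᵤg(u). The sketch below is illustrative only: it assumes the example g(u) = ‖u − y‖² (whose gradient 2(u − y) is the result quoted from 4.a), random test data, and hypothetical names such as `F`, `gradU` and `chainGrad`.

```matlab
rng(1);                           % fixed seed so the check is repeatable
k = 3; n = 5;                     % illustrative sizes; A is k-by-n
A = randn(k,n);
x = randn(n,1);
y = randn(k,1);

g     = @(u) norm(u - y)^2;       % example g : R^k -> R
gradU = @(u) 2*(u - y);           % its gradient with respect to u
F     = @(v) g(A*v);              % composite map x -> g(Ax)

h = 1e-6;                         % finite-difference step
numGrad = zeros(n,1);
for j = 1:n
    e = zeros(n,1); e(j) = 1;     % j-th unit vector
    numGrad(j) = (F(x + h*e) - F(x - h*e)) / (2*h);
end

chainGrad = A.' * gradU(A*x);     % right-hand side A^T * grad_u g(u)
err = norm(numGrad - chainGrad);
fprintf('difference between the two gradients: %.2e\n', err);
```

A non-square A is used deliberately, so the check also exercises the n × k shape of Aᵀ.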
4.3 c
$A^T A$ is invertible exactly when the columns of $A$ are linearly independent, which requires $k \ge n$.
The working below takes the square case $k = n$, where this is equivalent to $A$ (and hence $A^T$) being invertible.
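\[ Ax \in \mathbb{R}^n, \qquad y \in \mathbb{R}^n \]
Set $u = Ax$
\[ f(x) = \|u - y\|^2 \]
\[ \nabla_x f(x) = \nabla_x \|u - y\|^2 \]
Set
\[ g(u) = \|u - y\|^2 \]
\[ \nabla_x f(x) = \nabla_x g(u) \]
where $g : \mathbb{R}^n \to \mathbb{R}$.
By 4.b,
\[ \nabla_x g(u) = A^T \nabla_u g(u) \]
\[ A^T \nabla_u g(u) = A^T \nabla_u \|u - y\|^2 \]
By 4.a,
\[ \nabla_u \|u - y\|^2 = 2(u - y) \]
Therefore,
\[ A^T \nabla_u g(u) = 2A^T (u - y) \]
As,
\[ \nabla_x g(u) = A^T \nabla_u g(u) \]
\[ \nabla_x g(u) = 2A^T (u - y) \]
Substituting back $u = Ax$ and $\nabla_x f(x) = \nabla_x g(u)$,
\[ \nabla_x f(x) = 2A^T (Ax - y) \]
Critical points occur when $\nabla_x f(x) = 0$. Hence, critical points occur when:
\[ 2A^T (Ax - y) = 0 \]
Dividing by the scalar 2,
\[ A^T (Ax - y) = 0 \]
\[ A^T Ax - A^T y = 0 \]
\[ A^T Ax = A^T y \]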
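As $A^T A$ is invertible,
\[ x = (A^T A)^{-1} A^T y \]
In the square case, $A^T$ and $A$ are themselves invertible, so multiplying $A^T Ax = A^T y$ by $A^{-T}$ gives
\[ A^{-T} A^T Ax = A^{-T} A^T y \]
\[ Ax = y \]
\[ x = A^{-1} y \]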
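For a square invertible A, the solution of the normal equations AᵀAx = Aᵀy should coincide with A⁻¹y, and the gradient 2Aᵀ(Ax − y) should vanish there. The sketch below checks this on random data. It is an illustration under stated assumptions: the matrix is drawn at random and assumed invertible, and the names `xDirect`, `xNormal` and `gradAtX` are hypothetical.

```matlab
rng(2);                       % fixed seed so the check is repeatable
n = 4;
A = randn(n);                 % random square matrix, assumed invertible
y = randn(n,1);

xDirect = A \ y;              % solves A x = y, i.e. x = A^{-1} y
xNormal = (A.'*A) \ (A.'*y);  % solves the normal equations A^T A x = A^T y

gradAtX = 2*A.'*(A*xDirect - y);   % gradient of ||Ax - y||^2 at the critical point
fprintf('||xDirect - xNormal|| = %.2e, ||gradient|| = %.2e\n', ...
    norm(xDirect - xNormal), norm(gradAtX));
```

The backslash operator is used instead of forming inverses explicitly, which is the usual MATLAB idiom for solving linear systems.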
4.4 d
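To show that $\|Aw - y\|^2 \ge \|A\hat{x} - y\|^2$ for all $w \in \mathbb{R}^n$, where $\hat{x}$ is the critical point found in 4.c, write
\[ Aw - y = Aw - A\hat{x} + A\hat{x} - y \]
Squaring each component and summing,
\[ \sum_{i=1}^{n} ((Aw)_i - y_i)^2 = \sum_{i=1}^{n} ((Aw)_i - (A\hat{x})_i)^2 + \sum_{i=1}^{n} ((A\hat{x})_i - y_i)^2 + \sum_{i=1}^{n} 2((Aw)_i - (A\hat{x})_i)((A\hat{x})_i - y_i) \]
From 4.c, $A\hat{x} = y$, so $(A\hat{x})_i = y_i$ in the last sum:
\[ \sum_{i=1}^{n} ((Aw)_i - y_i)^2 = \sum_{i=1}^{n} ((Aw)_i - (A\hat{x})_i)^2 + \sum_{i=1}^{n} ((A\hat{x})_i - y_i)^2 + \sum_{i=1}^{n} 2((Aw)_i - (A\hat{x})_i)(y_i - y_i) \]
\[ \sum_{i=1}^{n} ((Aw)_i - y_i)^2 = \sum_{i=1}^{n} ((Aw)_i - (A\hat{x})_i)^2 + \sum_{i=1}^{n} ((A\hat{x})_i - y_i)^2 \]
As $\forall\, w \in \mathbb{R}^n$
\[ \sum_{i=1}^{n} ((Aw)_i - (A\hat{x})_i)^2 \ge 0 \]
it follows that
\[ \sum_{i=1}^{n} ((Aw)_i - y_i)^2 \ge \sum_{i=1}^{n} ((A\hat{x})_i - y_i)^2 \]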
5 Question 5.
5.1 a
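Given that $\exists\, g(x, y)$ such that:
\[ \frac{\partial}{\partial y} g(x, y) = f(x, y) \]
and the mixed second partial derivatives are continuous, so that
\[ \frac{\partial^2}{\partial y \partial x} g(x, y) = \frac{\partial^2}{\partial x \partial y} g(x, y) \]
As $\frac{\partial}{\partial y} g(x, y) = f(x, y)$,
\[ \frac{\partial^2}{\partial y \partial x} g(x, y) = \frac{\partial}{\partial x} f(x, y) \]
Integrate both sides over $[a, b]$ with respect to $y$
\[ \int_a^b \frac{\partial^2}{\partial y \partial x} g(x, y)\,dy = \int_a^b \frac{\partial}{\partial x} f(x, y)\,dy \]
\[ \int_a^b \frac{\partial}{\partial y} \left( \frac{\partial}{\partial x} g(x, y) \right) dy = \int_a^b \frac{\partial}{\partial x} f(x, y)\,dy \]
By the FTC,
\[ \int_a^b \frac{\partial}{\partial y} \left( \frac{\partial}{\partial x} g(x, y) \right) dy = \frac{\partial}{\partial x} g(x, b) - \frac{\partial}{\partial x} g(x, a) \]
Hence,
\[ \int_a^b \frac{\partial}{\partial x} f(x, y)\,dy = \frac{\partial}{\partial x} g(x, b) - \frac{\partial}{\partial x} g(x, a) \]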
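Now consider the integral:
\[ \int_a^b f(x, y)\,dy \]
As $f(x, y) = \frac{\partial}{\partial y} g(x, y)$,
\[ \int_a^b f(x, y)\,dy = \int_a^b \frac{\partial}{\partial y} g(x, y)\,dy \]
By the FTC,
\[ \int_a^b \frac{\partial}{\partial y} g(x, y)\,dy = g(x, b) - g(x, a) \]
Hence,
\[ \int_a^b f(x, y)\,dy = g(x, b) - g(x, a) \]
Take $\frac{\partial}{\partial x}$ of both sides:
\[ \frac{\partial}{\partial x} \int_a^b f(x, y)\,dy = \frac{\partial}{\partial x} \left( g(x, b) - g(x, a) \right) \]
By linearity of the derivative,
\[ \frac{\partial}{\partial x} \int_a^b f(x, y)\,dy = \frac{\partial}{\partial x} g(x, b) - \frac{\partial}{\partial x} g(x, a) \]
As $\int_a^b f(x, y)\,dy$ is a function of $x$ only, the partial derivative can be replaced by an ordinary derivative:
\[ \frac{d}{dx} \int_a^b f(x, y)\,dy = \frac{\partial}{\partial x} g(x, b) - \frac{\partial}{\partial x} g(x, a) \]
It was earlier derived that:
\[ \int_a^b \frac{\partial}{\partial x} f(x, y)\,dy = \frac{\partial}{\partial x} g(x, b) - \frac{\partial}{\partial x} g(x, a) \]
Therefore,
\[ \frac{d}{dx} \int_a^b f(x, y)\,dy = \int_a^b \frac{\partial}{\partial x} f(x, y)\,dy \]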
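As an illustrative check of this identity, take $f(x, y) = x^2 y$ on $[a, b] = [0, 1]$. This particular $f$ is an assumption chosen for simplicity and is not taken from the assignment. Here $g(x, y) = \tfrac{1}{2} x^2 y^2$ satisfies $\frac{\partial}{\partial y} g = f$ and has continuous mixed partial derivatives, and the two sides can be computed separately:

```latex
\[ \frac{d}{dx} \int_0^1 x^2 y \, dy = \frac{d}{dx} \left( \frac{x^2}{2} \right) = x \]
\[ \int_0^1 \frac{\partial}{\partial x} \left( x^2 y \right) dy = \int_0^1 2xy \, dy = x \]
```

Both computations give $x$, as the result predicts.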
5.2 b
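\[ \int_0^{\pi/2} \ln(\cos^2 x + \beta^2 \sin^2 x)\,dx = \pi \ln \frac{1 + \beta}{2} \]
Consider
\[ f(\beta) = \int_0^{\pi/2} \ln(\cos^2 x + \beta^2 \sin^2 x)\,dx \]
By the result of part a, the derivative can be taken inside the integral:
\[ \frac{d}{d\beta} f(\beta) = \int_0^{\pi/2} \frac{\partial}{\partial \beta} \ln(\cos^2 x + \beta^2 \sin^2 x)\,dx \]
\[ \frac{d}{d\beta} f(\beta) = \int_0^{\pi/2} \frac{2\beta \sin^2(x)}{\sin^2(x)\,\beta^2 + \cos^2(x)}\,dx \]
\[ \frac{d}{d\beta} f(\beta) = 2\beta \int_0^{\pi/2} \frac{\sin^2(x)}{\sin^2(x)\,\beta^2 + \cos^2(x)}\,dx \]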
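As
\[ \sin(x) = \frac{\tan(x)}{\sec(x)} \]
\[ \cos(x) = \frac{1}{\sec(x)} \]
This can be rewritten as,
\[ \frac{d}{d\beta} f(\beta) = 2\beta \int_0^{\pi/2} \frac{\frac{\tan^2(x)}{\sec^2(x)}}{\frac{\tan^2(x)}{\sec^2(x)} \beta^2 + \frac{1}{\sec^2(x)}}\,dx \]
\[ \frac{d}{d\beta} f(\beta) = 2\beta \int_0^{\pi/2} \frac{\frac{1}{\sec^2(x)} \tan^2(x)}{\frac{1}{\sec^2(x)} \left( \beta^2 \tan^2(x) + 1 \right)}\,dx \]
\[ \frac{d}{d\beta} f(\beta) = 2\beta \int_0^{\pi/2} \frac{\sec^2(x) \tan^2(x)}{\sec^2(x) \left( \beta^2 \tan^2(x) + 1 \right)}\,dx \]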
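Let
\[ u = \tan(x) \]
Hence,
\[ du = \sec^2(x)\,dx \]
\[ \frac{du}{\sec^2(x)} = dx \]
Therefore, using $\sec^2(x) = \tan^2(x) + 1 = u^2 + 1$, the integral becomes
\[ \int \frac{u^2}{(u^2 + 1)(\beta^2 u^2 + 1)}\,du \]
By partial fractions (for $\beta^2 \neq 1$),
\[ = \int \frac{1}{(\beta^2 - 1)(u^2 + 1)} - \frac{1}{(\beta^2 - 1)(\beta^2 u^2 + 1)}\,du \]
\[ = \frac{1}{\beta^2 - 1} \int \frac{1}{u^2 + 1}\,du - \frac{1}{\beta^2 - 1} \int \frac{1}{\beta^2 u^2 + 1}\,du \]
\[ = \frac{1}{\beta^2 - 1} \int \frac{1}{u^2 + 1}\,du - \frac{1}{(\beta^2 - 1)\beta^2} \int \frac{1}{u^2 + \frac{1}{\beta^2}}\,du \]
\[ \int \frac{1}{u^2 + 1}\,du = \arctan(u) \]
\[ \int \frac{1}{u^2 + \frac{1}{\beta^2}}\,du = \beta \arctan(\beta u) \]
\[ = \frac{1}{\beta^2 - 1} \arctan(u) - \frac{1}{(\beta^2 - 1)\beta^2}\,\beta \arctan(\beta u) \]
\[ = \frac{\arctan(u)}{\beta^2 - 1} - \frac{\arctan(\beta u)}{(\beta^2 - 1)\beta} \]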
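As $u = \tan(x)$,
\[ \lim_{x \to \frac{\pi}{2}^-} \tan(x) = \infty \]
For $\beta > 0$,
\[ \lim_{x \to \frac{\pi}{2}^-} \beta \tan(x) = \infty \]
As,
\[ \lim_{x \to \infty} \arctan(x) = \frac{\pi}{2} \]
Hence,
\[ \lim_{x \to \frac{\pi}{2}^-} \arctan(\beta \tan(x)) = \frac{\pi}{2} \]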
\[ 2\beta \int_0^{\pi/2} \frac{\sec^2(x) \tan^2(x)}{(\tan^2(x) + 1)(\beta^2 \tan^2(x) + 1)}\,dx = \frac{\beta\pi - \pi}{\beta^2 - 1} - \frac{2(\beta - 1)\arctan(0)}{\beta^2 - 1} \]
\[ 2\beta \int_0^{\pi/2} \frac{\sec^2(x) \tan^2(x)}{(\tan^2(x) + 1)(\beta^2 \tan^2(x) + 1)}\,dx = \frac{\beta\pi - \pi}{\beta^2 - 1} - \frac{2(\beta - 1) \cdot 0}{\beta^2 - 1} \]
\[ 2\beta \int_0^{\pi/2} \frac{\sec^2(x) \tan^2(x)}{(\tan^2(x) + 1)(\beta^2 \tan^2(x) + 1)}\,dx = \frac{\beta\pi - \pi}{\beta^2 - 1} = \frac{\pi(\beta - 1)}{\beta^2 - 1} \]
Using difference of squares,
\[ 2\beta \int_0^{\pi/2} \frac{\sec^2(x) \tan^2(x)}{(\tan^2(x) + 1)(\beta^2 \tan^2(x) + 1)}\,dx = \frac{\pi(\beta - 1)}{(\beta - 1)(\beta + 1)} = \frac{\pi}{\beta + 1} \]
Hence,
\[ \frac{d}{d\beta} f(\beta) = \frac{\pi}{\beta + 1} \]
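\[ f(\beta) = \int \frac{\pi}{\beta + 1}\,d\beta \]
\[ f(\beta) = \pi \ln(\beta + 1) + C \]
Consider $f(1)$,
\[ f(\beta) = \int_0^{\pi/2} \ln(\cos^2 x + \beta^2 \sin^2 x)\,dx \]
\[ f(1) = \int_0^{\pi/2} \ln(\cos^2 x + \sin^2 x)\,dx \]
By the Pythagorean identity,
\[ f(1) = \int_0^{\pi/2} \ln(1)\,dx = \int_0^{\pi/2} 0\,dx = 0 \]
Hence,
\[ f(1) = \pi \ln(1 + 1) + C \]
\[ 0 = \pi \ln(2) + C \]
\[ C = -\pi \ln(2) \]
Therefore,
\[ f(\beta) = \pi \ln(\beta + 1) - \pi \ln(2) = \pi \ln \frac{1 + \beta}{2} \]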
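The closed form π ln((1 + β)/2) stated at the start of part b can be compared against direct numerical quadrature. The sketch below is a rough check rather than part of the derivation: the sample values in `betas` and the variable names are illustrative assumptions, and only positive β are tried.

```matlab
betas = [0.5 1 2 5];                            % illustrative positive values of beta
for b = betas
    integrand = @(x) log(cos(x).^2 + b^2*sin(x).^2);
    numeric   = integral(integrand, 0, pi/2);   % adaptive numerical quadrature
    closed    = pi*log((1 + b)/2);              % closed form from part b
    fprintf('beta = %4.1f: numeric = %.8f, closed form = %.8f\n', ...
        b, numeric, closed);
end
```

Including β = 1 in the list gives a direct check of the value f(1) = 0 used to fix the constant C.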