Assignment 2

Abhinav Pradeep
Tutorial group 12
September 3, 2023

1 Question 1.
Note, all code was written in one file:

% Objective: z(x,y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2, with x = [x, y]
z = @(x) (x(1)^2+x(2)-11).^2+(x(1)+x(2)^2-7).^2;

% Run fminsearch from four different starting points
Min1 = fminsearch(z,[1,1]);
disp("Minimum at: (x,y)");
disp(Min1);
Min2 = fminsearch(z,[-1,1]);
disp("Minimum at: (x,y)");
disp(Min2);
Min3 = fminsearch(z,[-1,-1]);
disp("Minimum at: (x,y)");
disp(Min3);
Min4 = fminsearch(z,[2,-1]);
disp("Minimum at: (x,y)");
disp(Min4);

% Surface plot of z over [-5,5] x [-5,5]
[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
hSurf = surf(X,Y,Z);
xlabel('X')
ylabel('Y')
zlabel('Z')

% Mark the four minima found above on the surface (z = 0 at each)
hold on
plot3(-3.779310,-3.283186,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(-2.805118,3.131312,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(3.584428,-1.848126,0,'ro','MarkerSize',10,'MarkerFaceColor','r')
plot3(3.000000, 2.000000, 0,'ro','MarkerSize',10,'MarkerFaceColor','r')

Provided answers below are snippets of this file.

1.1 a
z = @(x) (x(1)^2+x(2)-11).^2+(x(1)+x(2)^2-7).^2;
Min1 = fminsearch(z,[1,1]);
disp("Minimum at: (x,y)");
disp(Min1);
Min2 = fminsearch(z,[-1,1]);
disp("Minimum at: (x,y)");
disp(Min2);
Min3 = fminsearch(z,[-1,-1]);
disp("Minimum at: (x,y)");
disp(Min3);
Min4 = fminsearch(z,[2,-1]);
disp("Minimum at: (x,y)");
disp(Min4);

1.2 b
[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
hSurf = surf(X,Y,Z);
xlabel('X')
ylabel('Y')
zlabel('Z')

1.3 c
[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
hSurf = surf(X,Y,Z);
xlabel('X')
ylabel('Y')
zlabel('Z')

Console output:

Minimum at: (x,y)
    3.0000    2.0000
Minimum at: (x,y)
   -2.8051    3.1313
Minimum at: (x,y)
   -3.7793   -3.2832
Minimum at: (x,y)
    3.5844   -1.8481
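The four reported points can be sanity-checked outside MATLAB. The following is a minimal Python sketch (not part of the assignment's MATLAB file) that evaluates z and a hand-derived gradient at the six-decimal coordinates used in the plot3 calls. The names `grad_z` and `minima` are introduced here for illustration only.

```python
# Sketch: check that the reported points are (numerically) stationary points
# of z with z close to 0. Coordinates are the six-decimal values from the
# plot3 calls in the assignment.

def z(x, y):
    """Objective from Question 1."""
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2

def grad_z(x, y):
    """Hand-derived gradient of z (chain rule on each squared term)."""
    p = x**2 + y - 11
    q = x + y**2 - 7
    return (4*x*p + 2*q, 2*p + 4*y*q)

minima = [
    (3.000000, 2.000000),
    (-2.805118, 3.131312),
    (-3.779310, -3.283186),
    (3.584428, -1.848126),
]

for x, y in minima:
    gx, gy = grad_z(x, y)
    print(f"({x: .6f}, {y: .6f}): z = {z(x, y):.2e}, grad = ({gx:.2e}, {gy:.2e})")
```

Because the coordinates are rounded, the printed values are small rather than exactly zero.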

2 Question 2.
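$$z(x, y) = (x^2 + y - 11)^2 + (x + y^2 - 7)^2$$

$$x + y = 0$$

Let $g(x, y) = x + y$.

Solutions of the below system of equations are the extrema along the restriction $x + y = 0$:

$$\nabla z = \lambda \nabla g$$

$$x + y = 0$$

Find $\nabla z$:

$$\nabla z = \left( \frac{\partial}{\partial x}\left[(x^2 + y - 11)^2 + (x + y^2 - 7)^2\right],\ \frac{\partial}{\partial y}\left[(x^2 + y - 11)^2 + (x + y^2 - 7)^2\right] \right)$$

$$\nabla z = \left( \frac{\partial}{\partial x}(x^2 + y - 11)^2 + \frac{\partial}{\partial x}(x + y^2 - 7)^2,\ \frac{\partial}{\partial y}(x^2 + y - 11)^2 + \frac{\partial}{\partial y}(x + y^2 - 7)^2 \right)$$

$$\nabla z = \left( 2(x^2 + y - 11) \cdot 2x + 2(x + y^2 - 7) \cdot 1,\ 2(x^2 + y - 11) \cdot 1 + 2(x + y^2 - 7) \cdot 2y \right)$$

$$\nabla z = \left( 4x(x^2 + y - 11) + 2(x + y^2 - 7),\ 2(x^2 + y - 11) + 4y(x + y^2 - 7) \right)$$

Find $\nabla g$:

$$\nabla g = \left( \frac{\partial}{\partial x}(x + y),\ \frac{\partial}{\partial y}(x + y) \right)$$

$$\nabla g = (1, 1)$$

Therefore, the system of equations becomes

$$\begin{pmatrix} 4x(x^2 + y - 11) + 2(x + y^2 - 7) \\ 2(x^2 + y - 11) + 4y(x + y^2 - 7) \end{pmatrix} = \lambda \cdot \begin{pmatrix} 1 \\ 1 \end{pmatrix}$$

$$x + y = 0$$

$$\begin{pmatrix} 4x(x^2 + y - 11) + 2(x + y^2 - 7) \\ 2(x^2 + y - 11) + 4y(x + y^2 - 7) \end{pmatrix} = \begin{pmatrix} \lambda \\ \lambda \end{pmatrix}$$

$$x + y = 0$$

Hence, the system of equations turns to:

$$4x(x^2 + y - 11) + 2(x + y^2 - 7) = 2(x^2 + y - 11) + 4y(x + y^2 - 7)$$

$$x + y = 0$$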
Plotting these relations on Desmos:

[Figure: Desmos plot of the two relations above]
2.1 a, b, c done in one file:
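% Same objective as in Question 1
z = @(x) (x(1)^2+x(2)-11).^2+(x(1)+x(2)^2-7).^2;

% Contour plot of z over [-5,5] x [-5,5]
[X,Y] = meshgrid(-5:0.1:5,-5:0.1:5);
Z = arrayfun(@(x,y) z([x,y]), X, Y);
contour(X,Y,Z, 'LevelStep', 5)

% Overlay the constraint line x + y = 0
hold on
fimplicit(@(x,y) x+y, [-5 5 -5 5])

% Constrained minimisation starting from x0 = [1,1].
% Note: passed in this position, A and b define the inequality A*x <= b;
% the equality x + y = 0 would be passed as fmincon(z,x0,[],[],A,b).
% The point reported below lies on the line x + y = 0.
A = [1 1];
b = 0;
x0 = [1,1];
Min = fmincon(z,x0,A,b);
disp(Min);

% Mark the constrained minimum on the contour plot
plot(Min(1), Min(2), 'ro', 'MarkerSize', 10, 'MarkerFaceColor', 'r')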

Console output:

Local minimum found that satisfies the constraints.

Optimization completed because the objective function is non-decreasing in
feasible directions, to within the value of the optimality tolerance,
and constraints are satisfied to within the value of the constraint tolerance.

<stopping criteria details>

2.8548 -2.8548

This agrees with my understanding of Lagrange multipliers, as the minimum calculated by MATLAB
is a solution of the Lagrange multiplier system of equations derived earlier. This is shown below:
It can be seen that the system of equations does yield a solution at $x = 2.8548$ and $y = -2.8548$.
(The system has other solutions as well; fmincon returns the local minimum it reaches from the
starting point x0 = [1,1].) From an intuition standpoint, it can be seen that the minimum is at a
point where the constraint curve is tangent to the contour line. Contour lines are level curves. The
constraint curve is a level curve. For any point $P$ on a level curve $f = 0$, where
$f : \mathbb{R}^n \to \mathbb{R}$, it can be shown that $\nabla f(P) \perp (f = 0)\,@P$
(note $@P$ is shorthand for: at point $P$). Below is a proof of this:

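Consider $r(t) : \mathbb{R} \to \mathbb{R}^n$ which satisfies

$$f(r(t)) = 0$$

That is, the level curve defined by $f = 0$ is parameterised by $r(t)$. Differentiating both sides with respect to $t$,

$$\frac{d}{dt} f(r(t)) = 0$$

and by the chain rule,

$$\nabla f(r(t)) \cdot \frac{d}{dt} r(t) = 0$$

The above equation says that for any point $P$ on $r(t)$, as $\frac{d}{dt} r(t)\,@P$ will yield a vector tangent to $r(t)\,@P$, the dot product of $\nabla f(P)$ and $\frac{d}{dt} r(t)\,@P$ will be 0. Hence, $\nabla f(P) \perp r(t)\,@P$.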
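As a quick illustration of this fact, consider the unit circle. This example is chosen here for illustration and is not part of the assignment: take $f(x, y) = x^2 + y^2 - 1$ with the parameterisation $r(t) = (\cos t, \sin t)$.

```latex
\begin{align*}
f(r(t)) &= \cos^2 t + \sin^2 t - 1 = 0 \\
\nabla f(r(t)) &= (2\cos t,\ 2\sin t) \\
\frac{d}{dt} r(t) &= (-\sin t,\ \cos t) \\
\nabla f(r(t)) \cdot \frac{d}{dt} r(t) &= -2\cos t \sin t + 2\sin t \cos t = 0
\end{align*}
```

The gradient points radially outward while the tangent runs along the circle, so the two are perpendicular at every point of the level curve.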

By the previous fact, when two level curves in $\mathbb{R}^2$ are tangent to one another at a point, their gradients at that point are both perpendicular to the common tangent line and are therefore parallel to one another. That is, for the level curve $z(x, y) = A$ and the constraint curve $g(x, y) = 0$ (where $g(x, y) = x + y$), if they are tangent at some point $P$, then at $P$:

$$\nabla z(P) = \lambda \nabla g(P)$$

This is precisely the Lagrange multiplier condition. Hence this does agree with my conceptual understanding of Lagrange multipliers.
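The agreement between the fmincon output and the Lagrange system can also be checked numerically. The sketch below is in Python rather than the assignment's MATLAB. It restricts the condition $\partial z/\partial x = \partial z/\partial y$ to the line $y = -x$ and finds its roots by bisection. The bracketing intervals and the helper names `residual` and `bisect` are assumptions made for this illustration.

```python
# Sketch: solve the Lagrange system from Question 2 along the line y = -x.
# The bracketing intervals are assumptions, chosen by checking sign changes.

def z(x, y):
    """Objective from the assignment."""
    return (x**2 + y - 11)**2 + (x + y**2 - 7)**2

def grad_z(x, y):
    """Gradient of z, as derived in Question 2."""
    p = x**2 + y - 11
    q = x + y**2 - 7
    return (4*x*p + 2*q, 2*p + 4*y*q)

def residual(x):
    """dz/dx - dz/dy on y = -x; zero exactly where grad z = lambda * (1, 1)."""
    gx, gy = grad_z(x, -x)
    return gx - gy

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if flo * fmid > 0:
            lo, flo = mid, fmid   # root lies in [mid, hi]
        else:
            hi = mid              # root lies in [lo, mid]
    return 0.5 * (lo + hi)

brackets = [(-4.0, -2.0), (0.0, 1.0), (2.0, 4.0)]
roots = [bisect(residual, lo, hi) for lo, hi in brackets]

for x in roots:
    lam = grad_z(x, -x)[0]  # both gradient components equal lambda at a root
    print(f"x = {x: .4f}, y = {-x: .4f}, lambda = {lam: .4f}, z = {z(x, -x):.4f}")
```

Under these assumptions the sketch finds three solutions on the line. The one near $x \approx 2.8548$ matches the fmincon output. The other two are further stationary points of z along the constraint, and the one with negative x gives a smaller value of z, so the fmincon result is a local minimum for the chosen starting point.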
3 Question 3.
3.1 a
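$f : \mathbb{R}^n \to \mathbb{R}$ is differentiable. That is:

$$\lim_{k \to 0} \frac{f(a + k) - f(a) - \nabla f(a) \cdot k}{\|k\|} = 0$$

where $k \in \mathbb{R}^n$.

Prove:

$$\exists\, E(k) = (E_1(k), E_2(k), \dots, E_n(k))$$

where $E_i : \mathbb{R}^n \to \mathbb{R}$, such that

$$f(a + k) - f(a) = (\nabla f(a) + E(k)) \cdot k$$

and

$$\lim_{k \to 0} E_i(k) = 0 \quad \forall i$$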

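$f$ is differentiable $\Leftrightarrow$ there exists a multivariate function $s : \mathbb{R}^n \to \mathbb{R}$ such that

$$f(a + k) - f(a) = \nabla f(a) \cdot k + s(k)\|k\|$$

and

$$\lim_{k \to 0} s(k) = 0$$

Consider $s : \mathbb{R}^n \to \mathbb{R}$ such that:

$$f(a + k) - f(a) = \nabla f(a) \cdot k + s(k)\|k\|$$

Such $s$ can be defined as $f$ is differentiable and therefore $\nabla f(a)$ exists. Moreover, $f$ is defined on all of $\mathbb{R}^n$, so $f(a)$ and $f(a + k)$ exist for every $k$. For $k \neq 0$ the equation determines $s(k)$, and $s(0)$ can be set to 0.

As all outputs are in $\mathbb{R}$ and $\|k\| \neq 0$, the below algebra is valid:

$$\frac{f(a + k) - f(a)}{\|k\|} = \frac{\nabla f(a) \cdot k}{\|k\|} + s(k)$$

$$\frac{f(a + k) - f(a)}{\|k\|} - \frac{\nabla f(a) \cdot k}{\|k\|} = s(k)$$

$$\frac{f(a + k) - f(a) - \nabla f(a) \cdot k}{\|k\|} = s(k)$$

By taking the limit as $k \to 0$:

$$\lim_{k \to 0} \frac{f(a + k) - f(a) - \nabla f(a) \cdot k}{\|k\|} = \lim_{k \to 0} s(k)$$

The definition of differentiability requires that:

$$\lim_{k \to 0} \frac{f(a + k) - f(a) - \nabla f(a) \cdot k}{\|k\|} = 0$$

Hence $\lim_{k \to 0} s(k) = 0$. The converse follows by reading the same algebra backwards.

Therefore $f$ is differentiable $\Leftrightarrow$ $\exists\, s : \mathbb{R}^n \to \mathbb{R}$ such that:

$$f(a + k) - f(a) = \nabla f(a) \cdot k + s(k)\|k\|$$

And:

$$\lim_{k \to 0} s(k) = 0$$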

Consider the initial expression:

$$f(a + k) - f(a) = \nabla f(a) \cdot k + s(k)\|k\|$$

Define $E : \mathbb{R}^n \to \mathbb{R}^n$,

$$E(x) = (E_1(x), E_2(x), E_3(x), \dots, E_n(x)), \quad E_i : \mathbb{R}^n \to \mathbb{R}$$

such that:

$$E(k) \cdot k = s(k)\|k\|$$

Provided such an $E$ exists (one is constructed below), substituting gives

$$f(a + k) - f(a) = \nabla f(a) \cdot k + E(k) \cdot k = (\nabla f(a) + E(k)) \cdot k$$

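Consider,

$$E(k) \cdot k = s(k)\|k\|$$

To satisfy this condition, set, for $k \neq 0$:

$$E_i(k) = \frac{s(k)\, k_i}{\|k\|}$$

and $E(0) = 0$ (at $k = 0$ both sides of the condition vanish, so it holds for any value of $E(0)$). That is,

$$E(k) = \left( \frac{s(k)\, k_1}{\|k\|}, \frac{s(k)\, k_2}{\|k\|}, \frac{s(k)\, k_3}{\|k\|}, \dots, \frac{s(k)\, k_n}{\|k\|} \right)$$

This would ensure that

$$E(k) \cdot k = \sum_{i=1}^{n} \frac{s(k)\, k_i}{\|k\|} \cdot k_i$$

$$E(k) \cdot k = \frac{s(k)}{\|k\|} \sum_{i=1}^{n} k_i^2$$

$$E(k) \cdot k = \frac{s(k)}{\|k\|} \|k\|^2$$

$$E(k) \cdot k = s(k)\|k\|$$

Now consider that $\forall i \in [1, n]$, as $k_i$ is just one component of $k$,

$$|k_i| \leq \|k\|$$

This implies that, for $k \neq 0$,

$$-1 \leq \frac{k_i}{\|k\|} \leq 1$$

Multiplying through by $|s(k)| \geq 0$,

$$-|s(k)| \leq \frac{s(k)\, k_i}{\|k\|} \leq |s(k)|$$

As $E_i(k) = \frac{s(k)\, k_i}{\|k\|}$,

$$-|s(k)| \leq E_i(k) \leq |s(k)|$$

As $\lim_{k \to 0} s(k) = 0$, both bounds tend to 0 as $k \to 0$:

$$\lim_{k \to 0} \left(-|s(k)|\right) = 0 = \lim_{k \to 0} |s(k)|$$

Hence by the squeeze theorem,

$$\lim_{k \to 0} E_i(k) = 0$$

This applies $\forall i \in [1, n]$.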
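The statement can be illustrated numerically. The Python sketch below uses a hypothetical test function $f(x, y) = x^2 + 3xy$ at $a = (1, 2)$, neither of which comes from the assignment. It takes one admissible choice of the error vector, $E(k) = s(k)\,k/\|k\|$, and checks both that $f(a+k) - f(a) = (\nabla f(a) + E(k)) \cdot k$ and that the components of $E(k)$ shrink as $k \to 0$.

```python
import math

# Hypothetical test function and base point (assumptions for this sketch).
def f(x, y):
    return x**2 + 3*x*y

def grad_f(x, y):
    """Hand-computed gradient of the test function."""
    return (2*x + 3*y, 3*x)

a = (1.0, 2.0)

def s(k):
    """Scalar remainder s(k) = (f(a+k) - f(a) - grad f(a).k) / ||k||, k != 0."""
    g = grad_f(*a)
    remainder = f(a[0] + k[0], a[1] + k[1]) - f(*a) - (g[0]*k[0] + g[1]*k[1])
    return remainder / math.hypot(*k)

def E(k):
    """One admissible error vector: E(k) = s(k) * k / ||k||, for k != 0."""
    norm = math.hypot(*k)
    return tuple(s(k) * ki / norm for ki in k)

def check(k):
    """Return (lhs, rhs, E(k)) for f(a+k) - f(a) = (grad f(a) + E(k)) . k."""
    g = grad_f(*a)
    e = E(k)
    lhs = f(a[0] + k[0], a[1] + k[1]) - f(*a)
    rhs = (g[0] + e[0]) * k[0] + (g[1] + e[1]) * k[1]
    return lhs, rhs, e

for t in (1e-1, 1e-2, 1e-3):
    lhs, rhs, e = check((t, -2*t))
    print(f"t = {t:g}: lhs - rhs = {lhs - rhs:.2e}, E(k) = ({e[0]:.3e}, {e[1]:.3e})")
```

Along the direction $k = (t, -2t)$ used here, the printed components of $E(k)$ decrease in proportion to $t$, consistent with $\lim_{k \to 0} E_i(k) = 0$.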

3.2 b
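Prove that:

$$\lim_{h \to 0} E_i(x(t_0 + h) - x(t_0)) = 0$$

As $x$ is differentiable and differentiability implies continuity,

$$\lim_{h \to 0} \left( x(t_0 + h) - x(t_0) \right) = 0$$

where $(x(t_0 + h) - x(t_0)) \in \mathbb{R}^n$ (as $x(t) \in \mathbb{R}^n$) and $0 \in \mathbb{R}^n$ is the $n$-dimensional zero vector.

The statement

$$\lim_{k \to 0} E_i(k) = 0$$

is equivalent to saying that: $\forall \epsilon_1 > 0\ \exists \delta_1 > 0$ such that if

$$0 < \|k\| < \delta_1$$

then

$$|E_i(k)| < \epsilon_1$$

Likewise, the statement

$$\lim_{h \to 0} \left( x(t_0 + h) - x(t_0) \right) = 0$$

is equivalent to saying that: $\forall \epsilon_2 > 0\ \exists \delta_2 > 0$ such that if

$$0 < |h| < \delta_2$$

then

$$\|x(t_0 + h) - x(t_0)\| < \epsilon_2$$

Fix $\epsilon_1 > 0$ and take the corresponding $\delta_1$. Applying the second statement with $\epsilon_2 = \delta_1$ gives a $\delta_2 > 0$ such that if

$$0 < |h| < \delta_2$$

then

$$\|x(t_0 + h) - x(t_0)\| < \delta_1$$

As $(x(t_0 + h) - x(t_0)) \in \mathbb{R}^n$, it is a valid input to $E_i$. If it is non-zero, then $0 < \|x(t_0 + h) - x(t_0)\| < \delta_1$ implies that:

$$|E_i(x(t_0 + h) - x(t_0))| < \epsilon_1$$

If it is zero, the same bound holds provided $E_i(0) = 0$, which may be assumed because the identity $f(a + k) - f(a) = (\nabla f(a) + E(k)) \cdot k$ places no restriction on $E(0)$.

As this applies for any $\epsilon_1 > 0$, taking $\epsilon = \epsilon_1$ and $\delta = \delta_2$ the above can be rewritten as: $\forall \epsilon > 0\ \exists \delta > 0$ such that if

$$0 < |h| < \delta$$

then

$$|E_i(x(t_0 + h) - x(t_0))| < \epsilon$$

Hence,

$$\lim_{h \to 0} E_i(x(t_0 + h) - x(t_0)) = 0$$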
4 Question 4.
4.1 a
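$$\nabla_x \|x - y\|^2 = 2(x - y)$$

$$\nabla_x = \left( \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3}, \dots, \frac{\partial}{\partial x_n} \right)$$

$$\|x - y\|^2 = \sum_{i=1}^{n} (x_i - y_i)^2$$

Therefore, $\|x - y\|^2$ is $\mathbb{R}^n \to \mathbb{R}$:

$$\nabla_x \|x - y\|^2 = \left( \frac{\partial}{\partial x_1} \sum_{i=1}^{n} (x_i - y_i)^2,\ \frac{\partial}{\partial x_2} \sum_{i=1}^{n} (x_i - y_i)^2,\ \frac{\partial}{\partial x_3} \sum_{i=1}^{n} (x_i - y_i)^2,\ \dots,\ \frac{\partial}{\partial x_n} \sum_{i=1}^{n} (x_i - y_i)^2 \right)$$

For $j \in [1, n]$

$$\frac{\partial}{\partial x_j} \sum_{i=1}^{n} (x_i - y_i)^2 = 2(x_j - y_j)$$

As,

$$\frac{\partial}{\partial x_j} (x_i - y_i)^2 = \begin{cases} 2(x_i - y_i) & \text{when } i = j \\ 0 & \text{when } i \neq j \end{cases}$$

Hence,

$$\nabla_x \|x - y\|^2 = \left( 2(x_1 - y_1), 2(x_2 - y_2), 2(x_3 - y_3), \dots, 2(x_n - y_n) \right)$$

Which is equivalently

$$\nabla_x \|x - y\|^2 = 2(x - y)$$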
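The identity $\nabla_x \|x - y\|^2 = 2(x - y)$ can be checked against a central finite-difference gradient. This is a minimal Python sketch. The sample vectors and the helper `numeric_grad` are assumptions made for illustration.

```python
# Sketch: compare the analytic gradient 2(x - y) with central differences.

def sq_dist(x, y):
    """Squared Euclidean distance ||x - y||^2."""
    return sum((xi - yi)**2 for xi, yi in zip(x, y))

def numeric_grad(func, x, h=1e-6):
    """Central-difference approximation of the gradient of func at x."""
    grad = []
    for j in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((func(xp) - func(xm)) / (2*h))
    return grad

# Sample points (arbitrary values chosen for this example).
x = [1.0, -2.0, 0.5]
y = [0.0, 3.0, 2.0]

numeric = numeric_grad(lambda v: sq_dist(v, y), x)
analytic = [2*(xi - yi) for xi, yi in zip(x, y)]

print("numeric :", numeric)
print("analytic:", analytic)
```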

4.2 b
$A : k \times n$

$x : n \times 1$

$$u = Ax$$

Hence, the dimensions of $u$:

$u : k \times 1$

$$u = \begin{pmatrix} \sum_{i=1}^{n} A_{1,i} x_i \\ \sum_{i=1}^{n} A_{2,i} x_i \\ \sum_{i=1}^{n} A_{3,i} x_i \\ \vdots \\ \sum_{i=1}^{n} A_{k,i} x_i \end{pmatrix}$$

$g : \mathbb{R}^k \to \mathbb{R}$

Show that

$$\nabla_x g(u) = A^T \nabla_u g(u)$$

$\nabla_x : n \times 1$

$$A^T = \begin{pmatrix} A_{1,1} & \dots & A_{k,1} \\ \vdots & \ddots & \vdots \\ A_{1,n} & \dots & A_{k,n} \end{pmatrix}$$

$A^T : n \times k$

$\nabla_u : k \times 1$

$g(u) \in \mathbb{R}$

$$g(u) = g(u_1, u_2, u_3, \dots, u_k)$$

As the $j$-th component of the vector $u$ above is

$$u_j = \sum_{i=1}^{n} A_{j,i} x_i$$

each component is a function $u_j(x)$. That is, $u_j : \mathbb{R}^n \to \mathbb{R}$, and

$$g(u) = g(u_1(x), u_2(x), u_3(x), \dots, u_k(x))$$

Consider $\nabla_x g(u)$:

$$\nabla_x g(u) = \begin{pmatrix} \frac{\partial}{\partial x_1} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_2} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_3} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \vdots \\ \frac{\partial}{\partial x_n} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \end{pmatrix}$$

Consider $A^T \nabla_u g(u)$:

$$\nabla_u g(u) = \begin{pmatrix} \frac{\partial}{\partial u_1} g(u_1, u_2, u_3, \dots, u_k) \\ \frac{\partial}{\partial u_2} g(u_1, u_2, u_3, \dots, u_k) \\ \frac{\partial}{\partial u_3} g(u_1, u_2, u_3, \dots, u_k) \\ \vdots \\ \frac{\partial}{\partial u_k} g(u_1, u_2, u_3, \dots, u_k) \end{pmatrix}$$

$$A^T \nabla_u g(u) = \begin{pmatrix} A_{1,1} & \dots & A_{k,1} \\ A_{1,2} & \dots & A_{k,2} \\ A_{1,3} & \dots & A_{k,3} \\ \vdots & & \vdots \\ A_{1,n} & \dots & A_{k,n} \end{pmatrix} \cdot \begin{pmatrix} \frac{\partial}{\partial u_1} g(u_1, u_2, u_3, \dots, u_k) \\ \frac{\partial}{\partial u_2} g(u_1, u_2, u_3, \dots, u_k) \\ \frac{\partial}{\partial u_3} g(u_1, u_2, u_3, \dots, u_k) \\ \vdots \\ \frac{\partial}{\partial u_k} g(u_1, u_2, u_3, \dots, u_k) \end{pmatrix}$$

$$A^T \nabla_u g(u) = \begin{pmatrix} \sum_{i=1}^{k} A_{i,1} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,2} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,3} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \vdots \\ \sum_{i=1}^{k} A_{i,n} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \end{pmatrix}$$

Hence, to show:

$$\nabla_x g(u) = A^T \nabla_u g(u)$$

$$\begin{pmatrix} \frac{\partial}{\partial x_1} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_2} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_3} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \vdots \\ \frac{\partial}{\partial x_n} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{k} A_{i,1} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,2} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,3} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \vdots \\ \sum_{i=1}^{k} A_{i,n} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \end{pmatrix}$$

That is, show that for $j \in [1, n]$

$$\frac{\partial}{\partial x_j} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) = \sum_{i=1}^{k} A_{i,j} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k)$$
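Consider

$$\frac{\partial}{\partial x_j} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x))$$

By the chain rule,

$$\frac{\partial}{\partial x_j} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) = \sum_{i=1}^{k} \frac{\partial}{\partial x_j} u_i(x) \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k)$$

Consider

$$\frac{\partial}{\partial x_j} u_i(x)$$

Where $u = Ax$, so that

$$u_i = \sum_{l=1}^{n} A_{i,l} x_l$$

$$\frac{\partial}{\partial x_j} \sum_{l=1}^{n} A_{i,l} x_l = A_{i,j}$$

As

$$\frac{\partial}{\partial x_j} A_{i,l} x_l = \begin{cases} A_{i,l} & \text{when } l = j \\ 0 & \text{when } l \neq j \end{cases}$$

Therefore,

$$\frac{\partial}{\partial x_j} u_i(x) = A_{i,j}$$

Hence

$$\sum_{i=1}^{k} \frac{\partial}{\partial x_j} u_i(x) \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) = \sum_{i=1}^{k} A_{i,j} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k)$$

$$\frac{\partial}{\partial x_j} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) = \sum_{i=1}^{k} A_{i,j} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k)$$

This holds for $j \in [1, n]$.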

Hence,

$$\begin{pmatrix} \frac{\partial}{\partial x_1} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_2} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \frac{\partial}{\partial x_3} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \\ \vdots \\ \frac{\partial}{\partial x_n} g(u_1(x), u_2(x), u_3(x), \dots, u_k(x)) \end{pmatrix} = \begin{pmatrix} \sum_{i=1}^{k} A_{i,1} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,2} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \sum_{i=1}^{k} A_{i,3} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \\ \vdots \\ \sum_{i=1}^{k} A_{i,n} \cdot \frac{\partial}{\partial u_i} g(u_1, u_2, u_3, \dots, u_k) \end{pmatrix}$$

$$\nabla_x g(u) = A^T \nabla_u g(u)$$
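The identity $\nabla_x g(u) = A^T \nabla_u g(u)$ can be checked numerically for a small case. In the Python sketch below, the matrix `A`, the point `x` and the smooth test function `g` are all hypothetical choices made for illustration, with $k = 3$ and $n = 2$. The left-hand side is approximated by central differences of $x \mapsto g(Ax)$ and the right-hand side is computed from the hand-derived $\nabla_u g$.

```python
# Sketch: finite-difference check of grad_x g(Ax) = A^T grad_u g(u).

def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def g(u):
    """Hypothetical smooth test function R^3 -> R."""
    return u[0]**2 + u[0]*u[1] + u[2]**3

def grad_g(u):
    """Hand-computed gradient of g with respect to u."""
    return [2*u[0] + u[1], u[0], 3*u[2]**2]

def numeric_grad(func, x, h=1e-6):
    """Central-difference approximation of the gradient of func at x."""
    grad = []
    for j in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[j] += h
        xm[j] -= h
        grad.append((func(xp) - func(xm)) / (2*h))
    return grad

A = [[1.0, 2.0],
     [0.0, -1.0],
     [3.0, 0.5]]          # k = 3 rows, n = 2 columns
x = [0.7, -1.2]
u = matvec(A, x)

lhs = numeric_grad(lambda v: g(matvec(A, v)), x)   # grad_x g(Ax), numerically
rhs = matvec(transpose(A), grad_g(u))              # A^T grad_u g(u)

print("grad_x g(u)     ~", lhs)
print("A^T grad_u g(u) =", rhs)
```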

4.3 c
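$A^T A$ is an $n \times n$ matrix. It is invertible $\Leftrightarrow$ the columns of $A$ are linearly independent, which requires $k \geq n$; $A$ itself need not be square. In the special case $k = n$, $A^T A$ is invertible $\Leftrightarrow$ $A$ and $A^T$ are invertible (moreover $A$ is invertible $\Leftrightarrow$ $A^T$ is invertible).

$$f(x) = \|Ax - y\|^2$$

$Ax \in \mathbb{R}^k$

$y \in \mathbb{R}^k$

Set $u = Ax$

$$f(x) = \|u - y\|^2$$

$$\nabla_x f(x) = \nabla_x \|u - y\|^2$$

Set

$$g(u) = \|u - y\|^2$$

$$\nabla_x f(x) = \nabla_x g(u)$$

where $g : \mathbb{R}^k \to \mathbb{R}$.

By 4.b,

$$\nabla_x g(u) = A^T \nabla_u g(u)$$

$$A^T \nabla_u g(u) = A^T \nabla_u \|u - y\|^2$$

By 4.a,

$$\nabla_u \|u - y\|^2 = 2(u - y)$$

Therefore,

$$A^T \nabla_u g(u) = 2 A^T (u - y)$$

As

$$\nabla_x g(u) = A^T \nabla_u g(u)$$

$$\nabla_x g(u) = 2 A^T (u - y)$$

Substituting back $u = Ax$ and $\nabla_x f(x) = \nabla_x g(u)$,

$$\nabla_x f(x) = 2 A^T (Ax - y)$$

Critical points occur when $\nabla_x f(x) = 0$. Hence, critical points occur when:

$$2 A^T (Ax - y) = 0$$

Dividing by the scalar 2,

$$A^T (Ax - y) = 0$$

$$A^T A x - A^T y = 0$$

$$A^T A x = A^T y$$

As $A^T A$ is invertible, multiplying on the left by $(A^T A)^{-1}$ gives

$$x = (A^T A)^{-1} A^T y$$

When $k = n$ and $A$ is itself invertible, $(A^T A)^{-1} A^T = A^{-1} A^{-T} A^T = A^{-1}$, so this reduces to

$$x = A^{-1} y$$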

4.4 d
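To show

$$\|Aw - y\|^2 \geq \|A\hat{x} - y\|^2$$

where $\hat{x}$ is the critical point found in 4.c.

$$Aw - y = (Aw - A\hat{x}) + (A\hat{x} - y)$$

$$\|Aw - y\|^2 = \|(Aw - A\hat{x}) + (A\hat{x} - y)\|^2$$

Writing the norms as sums over the $k$ components of these vectors,

$$\sum_{i=1}^{k} ((Aw)_i - y_i)^2 = \sum_{i=1}^{k} \left( ((Aw)_i - (A\hat{x})_i) + ((A\hat{x})_i - y_i) \right)^2$$

$$\sum_{i=1}^{k} ((Aw)_i - y_i)^2 = \sum_{i=1}^{k} ((Aw)_i - (A\hat{x})_i)^2 + \sum_{i=1}^{k} ((A\hat{x})_i - y_i)^2 + \sum_{i=1}^{k} 2((Aw)_i - (A\hat{x})_i)((A\hat{x})_i - y_i)$$

The last sum is twice a dot product:

$$\sum_{i=1}^{k} ((Aw)_i - (A\hat{x})_i)((A\hat{x})_i - y_i) = (A(w - \hat{x})) \cdot (A\hat{x} - y) = (w - \hat{x})^T A^T (A\hat{x} - y)$$

As $\hat{x}$ is a critical point, it satisfies $A^T A \hat{x} = A^T y$ from 4.c, that is $A^T (A\hat{x} - y) = 0$, so the cross term vanishes. (In the square case $\hat{x} = A^{-1} y$ this is immediate, as $A\hat{x} = y \Rightarrow (A\hat{x})_i = y_i$.)

$$\sum_{i=1}^{k} ((Aw)_i - y_i)^2 = \sum_{i=1}^{k} ((Aw)_i - (A\hat{x})_i)^2 + \sum_{i=1}^{k} ((A\hat{x})_i - y_i)^2$$

As $\forall w \in \mathbb{R}^n$

$$\sum_{i=1}^{k} ((Aw)_i - (A\hat{x})_i)^2 \geq 0$$

$$\sum_{i=1}^{k} ((Aw)_i - y_i)^2 \geq \sum_{i=1}^{k} ((A\hat{x})_i - y_i)^2$$

$$\|Aw - y\|^2 \geq \|A\hat{x} - y\|^2$$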
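Parts c and d can be illustrated with a small numerical example. The Python sketch below solves the critical-point condition $A^T A x = A^T y$ from 4.c for a made-up $3 \times 2$ matrix `A` and vector `y` (both are assumptions of this example; $A^T A$ is invertible for them even though `A` is not square). It then checks that the gradient $2A^T(A\hat{x} - y)$ vanishes and that randomly chosen vectors `w` never give a smaller squared residual than $\hat{x}$.

```python
import random

def matvec(A, x):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Matrix-matrix product for list-of-rows matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def solve2(M, b):
    """Solve a 2x2 linear system M x = b by Cramer's rule."""
    det = M[0][0]*M[1][1] - M[0][1]*M[1][0]
    return [(b[0]*M[1][1] - M[0][1]*b[1]) / det,
            (M[0][0]*b[1] - b[0]*M[1][0]) / det]

def sq_residual(A, x, y):
    """Squared residual ||Ax - y||^2."""
    return sum((r - yi)**2 for r, yi in zip(matvec(A, x), y))

# Example data (assumed for this sketch): k = 3 rows, n = 2 columns.
A = [[1.0, 0.0],
     [1.0, 1.0],
     [1.0, 2.0]]
y = [1.0, 2.0, 2.0]

At = transpose(A)
x_hat = solve2(matmul(At, A), matvec(At, y))   # solves A^T A x = A^T y

# Gradient 2 A^T (A x_hat - y) should vanish at the critical point.
residual = [r - yi for r, yi in zip(matvec(A, x_hat), y)]
grad = [2*c for c in matvec(At, residual)]

# Compare against random trial vectors w (fixed seed for repeatability).
random.seed(0)
trials = [[random.uniform(-5, 5), random.uniform(-5, 5)] for _ in range(1000)]
worst_gap = min(sq_residual(A, w, y) - sq_residual(A, x_hat, y) for w in trials)

print("x_hat =", x_hat)
print("gradient at x_hat =", grad)
print("smallest ||Aw - y||^2 - ||A x_hat - y||^2 over trials =", worst_gap)
```

In this example $A\hat{x} \neq y$, so the minimum residual is non-zero, yet the inequality still holds because the cross term vanishes through $A^T(A\hat{x} - y) = 0$.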

5 Question 5.
5.1 a
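Given that $\exists\, g(x, y)$ such that:

$$\frac{\partial}{\partial y} g(x, y) = f(x, y)$$

And the mixed second partial derivatives are continuous.

By this continuity, the conditions for Clairaut's theorem are satisfied. Hence,

$$\frac{\partial^2}{\partial y \partial x} g(x, y) = \frac{\partial^2}{\partial x \partial y} g(x, y)$$

As $\frac{\partial}{\partial y} g(x, y) = f(x, y)$,

$$\frac{\partial^2}{\partial y \partial x} g(x, y) = \frac{\partial}{\partial x} f(x, y)$$

Integrate both sides over $[a, b]$ with respect to $y$:

$$\int_a^b \frac{\partial^2}{\partial y \partial x} g(x, y)\, dy = \int_a^b \frac{\partial}{\partial x} f(x, y)\, dy$$

$$\int_a^b \frac{\partial}{\partial y} \left( \frac{\partial}{\partial x} g(x, y) \right) dy = \int_a^b \frac{\partial}{\partial x} f(x, y)\, dy$$

By the FTC,

$$\int_a^b \frac{\partial}{\partial y} \left( \frac{\partial}{\partial x} g(x, y) \right) dy = \frac{\partial}{\partial x} g(x, b) - \frac{\partial}{\partial x} g(x, a)$$

Hence,

$$\int_a^b \frac{\partial}{\partial x} f(x, y)\, dy = \frac{\partial}{\partial x} g(x, b) - \frac{\partial}{\partial x} g(x, a)$$

Now consider the integral:

$$\int_a^b f(x, y)\, dy$$

As $f(x, y) = \frac{\partial}{\partial y} g(x, y)$,

$$\int_a^b f(x, y)\, dy = \int_a^b \frac{\partial}{\partial y} g(x, y)\, dy$$

By the FTC,

$$\int_a^b \frac{\partial}{\partial y} g(x, y)\, dy = g(x, b) - g(x, a)$$

Hence,

$$\int_a^b f(x, y)\, dy = g(x, b) - g(x, a)$$

Take $\frac{\partial}{\partial x}$:

$$\frac{\partial}{\partial x} \int_a^b f(x, y)\, dy = \frac{\partial}{\partial x} \left( g(x, b) - g(x, a) \right)$$

By linearity of the derivative,

$$\frac{\partial}{\partial x} \int_a^b f(x, y)\, dy = \frac{\partial}{\partial x} g(x, b) - \frac{\partial}{\partial x} g(x, a)$$

As $\int_a^b f(x, y)\, dy$ will return a function of purely $x$, the partial derivative can be replaced by an ordinary derivative:

$$\frac{d}{dx} \int_a^b f(x, y)\, dy = \frac{\partial}{\partial x} g(x, b) - \frac{\partial}{\partial x} g(x, a)$$

It was earlier derived that:

$$\int_a^b \frac{\partial}{\partial x} f(x, y)\, dy = \frac{\partial}{\partial x} g(x, b) - \frac{\partial}{\partial x} g(x, a)$$

Therefore,

$$\frac{d}{dx} \int_a^b f(x, y)\, dy = \int_a^b \frac{\partial}{\partial x} f(x, y)\, dy$$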
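The result can be illustrated numerically for a concrete smooth integrand. In the Python sketch below, the function $f(x, y) = x^2 y + \sin(xy)$, the interval $[0, 1]$ and the point $x_0 = 0.8$ are all hypothetical choices made for this example. The left-hand side is a central difference of the integral, the right-hand side is the integral of the hand-computed partial derivative, and both integrals use a composite Simpson rule.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i*h)
    return total * h / 3

def f(x, y):
    """Hypothetical smooth integrand for this sketch."""
    return x**2 * y + math.sin(x*y)

def df_dx(x, y):
    """Hand-computed partial derivative of f with respect to x."""
    return 2*x*y + y*math.cos(x*y)

a, b = 0.0, 1.0
x0 = 0.8
h = 1e-5

def F(x):
    """F(x) = integral of f(x, y) dy over [a, b]."""
    return simpson(lambda y: f(x, y), a, b)

lhs = (F(x0 + h) - F(x0 - h)) / (2*h)          # d/dx of the integral
rhs = simpson(lambda y: df_dx(x0, y), a, b)    # integral of df/dx

print("d/dx of integral :", lhs)
print("integral of df/dx:", rhs)
```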

5.2 b
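$$\int_0^{\pi/2} \ln(\cos^2 x + \beta^2 \sin^2 x)\, dx = \pi \ln \frac{1 + \beta}{2}$$

Consider

$$f(\beta) = \int_0^{\pi/2} \ln(\cos^2 x + \beta^2 \sin^2 x)\, dx$$

$$\frac{d}{d\beta} f(\beta) = \int_0^{\pi/2} \frac{\partial}{\partial \beta} \ln(\cos^2 x + \beta^2 \sin^2 x)\, dx$$

$$\frac{d}{d\beta} f(\beta) = \int_0^{\pi/2} \frac{2\beta \sin^2(x)}{\sin^2(x)\, \beta^2 + \cos^2(x)}\, dx$$

$$\frac{d}{d\beta} f(\beta) = 2\beta \int_0^{\pi/2} \frac{\sin^2(x)}{\sin^2(x)\, \beta^2 + \cos^2(x)}\, dx$$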
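As

$$\sin(x) = \frac{\tan(x)}{\sec(x)}$$

$$\cos(x) = \frac{1}{\sec(x)}$$

This can be rewritten as,

$$\frac{d}{d\beta} f(\beta) = 2\beta \int_0^{\pi/2} \frac{\dfrac{\tan^2(x)}{\sec^2(x)}}{\dfrac{\tan^2(x)}{\sec^2(x)} \beta^2 + \dfrac{1}{\sec^2(x)}}\, dx$$

$$\frac{d}{d\beta} f(\beta) = 2\beta \int_0^{\pi/2} \frac{\dfrac{1}{\sec^2(x)} \tan^2(x)}{\dfrac{1}{\sec^2(x)} \left( \beta^2 \tan^2(x) + 1 \right)}\, dx$$

$$\frac{d}{d\beta} f(\beta) = 2\beta \int_0^{\pi/2} \frac{\sec^2(x) \tan^2(x)}{\sec^2(x) \left( \beta^2 \tan^2(x) + 1 \right)}\, dx$$

As

$$\sec^2(x) = \tan^2(x) + 1$$

This can be rewritten as,

$$\frac{d}{d\beta} f(\beta) = 2\beta \int_0^{\pi/2} \frac{\sec^2(x) \tan^2(x)}{\left( \tan^2(x) + 1 \right) \left( \beta^2 \tan^2(x) + 1 \right)}\, dx$$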
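Now consider

$$\int \frac{\sec^2(x) \tan^2(x)}{\left( \tan^2(x) + 1 \right) \left( \beta^2 \tan^2(x) + 1 \right)}\, dx$$

Set

$$u = \tan(x)$$

Hence,

$$du = \sec^2(x)\, dx$$

$$\frac{du}{\sec^2(x)} = dx$$

Therefore,

$$= \int \frac{u^2}{(u^2 + 1)(\beta^2 u^2 + 1)}\, du$$

By partial fractions (for $\beta \neq 1$),

$$= \int \left( \frac{1}{(\beta^2 - 1)(u^2 + 1)} - \frac{1}{(\beta^2 - 1)(\beta^2 u^2 + 1)} \right) du$$

$$= \frac{1}{\beta^2 - 1} \int \frac{1}{u^2 + 1}\, du - \frac{1}{\beta^2 - 1} \int \frac{1}{\beta^2 u^2 + 1}\, du$$

$$= \frac{1}{\beta^2 - 1} \int \frac{1}{u^2 + 1}\, du - \frac{1}{(\beta^2 - 1)\beta^2} \int \frac{1}{u^2 + \frac{1}{\beta^2}}\, du$$

Using

$$\int \frac{1}{u^2 + 1}\, du = \arctan(u)$$

$$\int \frac{1}{u^2 + \frac{1}{\beta^2}}\, du = \beta \arctan(\beta u)$$

this becomes

$$= \frac{1}{\beta^2 - 1} \arctan(u) - \frac{1}{(\beta^2 - 1)\beta^2}\, \beta \arctan(\beta u)$$

$$= \frac{\arctan(u)}{\beta^2 - 1} - \frac{\arctan(\beta u)}{(\beta^2 - 1)\beta}$$

As $u = \tan(x)$

$$= \frac{\arctan(\tan(x))}{\beta^2 - 1} - \frac{\arctan(\beta \tan(x))}{(\beta^2 - 1)\beta}$$

$$= \frac{x}{\beta^2 - 1} - \frac{\arctan(\beta \tan(x))}{(\beta^2 - 1)\beta}$$

Hence,

$$\int \frac{\sec^2(x) \tan^2(x)}{\left( \tan^2(x) + 1 \right) \left( \beta^2 \tan^2(x) + 1 \right)}\, dx = \frac{x}{\beta^2 - 1} - \frac{\arctan(\beta \tan(x))}{(\beta^2 - 1)\beta}$$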
Writing $I(x) = \dfrac{\sec^2(x) \tan^2(x)}{\left( \tan^2(x) + 1 \right) \left( \beta^2 \tan^2(x) + 1 \right)}$ for the integrand,

$$2\beta \int_0^{\pi/2} I(x)\, dx = \left[ \frac{2\beta x}{\beta^2 - 1} - \frac{2\beta \arctan(\beta \tan(x))}{(\beta^2 - 1)\beta} \right]_0^{\pi/2}$$

$$2\beta \int_0^{\pi/2} I(x)\, dx = \left[ \frac{2\beta x}{\beta^2 - 1} - \frac{2 \arctan(\beta \tan(x))}{\beta^2 - 1} \right]_0^{\pi/2}$$

$$2\beta \int_0^{\pi/2} I(x)\, dx = \left[ \frac{2\beta x - 2 \arctan(\beta \tan(x))}{\beta^2 - 1} \right]_0^{\pi/2}$$

$$2\beta \int_0^{\pi/2} I(x)\, dx = \frac{\beta\pi - 2 \arctan(\beta \tan(\frac{\pi}{2}))}{\beta^2 - 1} - \frac{-2 \arctan(\beta \tan(0))}{\beta^2 - 1}$$

$$2\beta \int_0^{\pi/2} I(x)\, dx = \frac{\beta\pi - 2 \arctan(\beta \tan(\frac{\pi}{2}))}{\beta^2 - 1} + \frac{2 \arctan(\beta \tan(0))}{\beta^2 - 1}$$

Here $\tan(\frac{\pi}{2})$ is understood as a limit from the left:

$$\lim_{x \to \frac{\pi}{2}^-} \tan(x) = \infty$$

For $\beta > 0$,

$$\lim_{x \to \frac{\pi}{2}^-} \beta \tan(x) = \infty$$

As,

$$\lim_{x \to \infty} \arctan(x) = \frac{\pi}{2}$$

Hence,

$$\lim_{x \to \frac{\pi}{2}^-} \arctan(\beta \tan(x)) = \frac{\pi}{2}$$

$$2\beta \int_0^{\pi/2} I(x)\, dx = \frac{\beta\pi - \pi}{\beta^2 - 1} + \frac{2 \arctan(0)}{\beta^2 - 1}$$

$$2\beta \int_0^{\pi/2} I(x)\, dx = \frac{\beta\pi - \pi}{\beta^2 - 1} + \frac{2 \cdot 0}{\beta^2 - 1}$$

$$2\beta \int_0^{\pi/2} I(x)\, dx = \frac{\beta\pi - \pi}{\beta^2 - 1}$$

$$2\beta \int_0^{\pi/2} I(x)\, dx = \frac{\pi(\beta - 1)}{\beta^2 - 1}$$

Using difference of squares,

$$2\beta \int_0^{\pi/2} I(x)\, dx = \frac{\pi(\beta - 1)}{(\beta - 1)(\beta + 1)}$$

$$2\beta \int_0^{\pi/2} I(x)\, dx = \frac{\pi}{\beta + 1}$$
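Hence,

$$\frac{d}{d\beta} f(\beta) = \frac{\pi}{\beta + 1}$$

$$f(\beta) = \int \frac{\pi}{\beta + 1}\, d\beta$$

$$f(\beta) = \pi \ln(|\beta + 1|) + C$$

As $\beta > 0$,

$$f(\beta) = \pi \ln(\beta + 1) + C$$

Consider $f(1)$,

$$f(\beta) = \int_0^{\pi/2} \ln(\cos^2 x + \beta^2 \sin^2 x)\, dx$$

$$f(1) = \int_0^{\pi/2} \ln(\cos^2 x + \sin^2 x)\, dx$$

By the Pythagorean identity,

$$f(1) = \int_0^{\pi/2} \ln(1)\, dx$$

$$f(1) = \int_0^{\pi/2} 0\, dx$$

$$f(1) = 0$$

Hence,

$$f(1) = \pi \ln(1 + 1) + C$$

$$0 = \pi \ln(2) + C$$

$$C = -\pi \ln(2)$$

Therefore,

$$f(\beta) = \pi \ln(\beta + 1) + C$$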
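Both the derivative $\frac{d}{d\beta} f(\beta) = \frac{\pi}{\beta + 1}$ and the target identity stated at the start of 5.b can be checked by numerical integration. The Python sketch below uses a composite Simpson rule. The sample values of $\beta$ and the number of subintervals are assumptions chosen for this illustration.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i*h)
    return total * h / 3

def f(beta):
    """f(beta) = integral over [0, pi/2] of ln(cos^2 x + beta^2 sin^2 x)."""
    return simpson(
        lambda x: math.log(math.cos(x)**2 + beta**2 * math.sin(x)**2),
        0.0, math.pi / 2)

def df_dbeta(beta):
    """Integral of the beta-derivative of the integrand over [0, pi/2]."""
    return simpson(
        lambda x: 2*beta*math.sin(x)**2
                  / (beta**2 * math.sin(x)**2 + math.cos(x)**2),
        0.0, math.pi / 2)

for beta in (0.5, 2.0, 3.0):
    print(f"beta = {beta}:")
    print(f"  f(beta)   = {f(beta):.8f},  pi*ln((1+beta)/2) = "
          f"{math.pi * math.log((1 + beta) / 2):.8f}")
    print(f"  f'(beta)  = {df_dbeta(beta):.8f},  pi/(beta+1)       = "
          f"{math.pi / (beta + 1):.8f}")
```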