Practice BFGS Algorithm
We minimize the quadratic
$$f(x) = \frac{1}{2} x^T Q x - b^T x, \qquad Q = \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix}, \quad b = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
This is a convex function with a unique minimum at $x^* = Q^{-1} b$.
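As a quick numerical sanity check (this snippet and its variable names are mine, not part of the original exercise), the minimizer can be computed directly:

```python
import numpy as np

# Problem data: f(x) = 0.5 x^T Q x - b^T x
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

x_star = np.linalg.solve(Q, b)   # unique minimizer x* = Q^{-1} b
print(x_star)                    # [0.0909... 0.6363...] = (1/11, 7/11)
```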
Step 1: Initialization
We choose an initial guess:
$$x_0 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
The initial inverse Hessian approximation is the identity matrix:
$$H_0 = I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
Now, your turn! Compute the initial gradient $g_0 = \nabla f(x_0)$. What do you get?
$$g_0 = Q x_0 - b = \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
Perform the matrix-vector multiplication:
$$\begin{bmatrix} 4(2) + 1(1) \\ 1(2) + 3(1) \end{bmatrix} = \begin{bmatrix} 8 + 1 \\ 2 + 3 \end{bmatrix} = \begin{bmatrix} 9 \\ 5 \end{bmatrix}$$
Subtracting $b$:
$$g_0 = \begin{bmatrix} 9 \\ 5 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 8 \\ 3 \end{bmatrix}$$
Thus, the initial gradient is:
$$g_0 = \begin{bmatrix} 8 \\ 3 \end{bmatrix}$$
Now, compute the search direction using the initial inverse Hessian approximation:
$$p_0 = -H_0 g_0 = -I g_0 = -\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 8 \\ 3 \end{bmatrix}$$
Multiplying by $I$ (which leaves the vector unchanged):
$$p_0 = -\begin{bmatrix} 8 \\ 3 \end{bmatrix} = \begin{bmatrix} -8 \\ -3 \end{bmatrix}$$
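To verify these first steps numerically, here is a minimal NumPy sketch; the snippet and variable names are mine, not part of the original transcript:

```python
import numpy as np

# Problem data from the exercise
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x0 = np.array([2.0, 1.0])
H0 = np.eye(2)

g0 = Q @ x0 - b   # gradient of f(x) = 0.5 x^T Q x - b^T x
p0 = -H0 @ g0     # search direction with H0 = I
print(g0, p0)     # [8. 3.] [-8. -3.]
```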
Now, perform a line search to determine the optimal step size $\alpha_0$. We seek $\alpha_0 > 0$ that minimizes:
$$\phi(\alpha) = f(x_0 + \alpha p_0)$$
Substituting $f(x) = \frac{1}{2} x^T Q x - b^T x$, derive the optimal $\alpha_0$.
(RESPONSE):
(I ASKED IF THAT WAS ALL AND IT SAID,)
No, there’s much more! Let’s go step by step through the line search process
in detail.
We aim to minimize:
$$\phi(\alpha) = f(x_0 + \alpha p_0)$$
Step 1: Express $x(\alpha)$ in Terms of $\alpha$
$$x(\alpha) = x_0 + \alpha p_0 = \begin{bmatrix} 2 \\ 1 \end{bmatrix} + \alpha \begin{bmatrix} -8 \\ -3 \end{bmatrix} = \begin{bmatrix} 2 - 8\alpha \\ 1 - 3\alpha \end{bmatrix}$$
Expanding $x(\alpha)^T Q x(\alpha)$, first compute $Q x(\alpha)$:
$$Q x(\alpha) = \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 2 - 8\alpha \\ 1 - 3\alpha \end{bmatrix} = \begin{bmatrix} 4(2 - 8\alpha) + 1(1 - 3\alpha) \\ 1(2 - 8\alpha) + 3(1 - 3\alpha) \end{bmatrix} = \begin{bmatrix} 8 - 32\alpha + 1 - 3\alpha \\ 2 - 8\alpha + 3 - 9\alpha \end{bmatrix} = \begin{bmatrix} 9 - 35\alpha \\ 5 - 17\alpha \end{bmatrix}$$
Now computing $x(\alpha)^T \left( Q x(\alpha) \right)$:
$$(2 - 8\alpha, \; 1 - 3\alpha) \cdot \begin{bmatrix} 9 - 35\alpha \\ 5 - 17\alpha \end{bmatrix} = (2 - 8\alpha)(9 - 35\alpha) + (1 - 3\alpha)(5 - 17\alpha)$$
Expanding:
$$= 18 - 70\alpha - 72\alpha + 280\alpha^2 + 5 - 17\alpha - 15\alpha + 51\alpha^2 = 23 - 174\alpha + 331\alpha^2$$
Thus:
$$\frac{1}{2} x(\alpha)^T Q x(\alpha) = \frac{1}{2}\left(23 - 174\alpha + 331\alpha^2\right) = \frac{23}{2} - 87\alpha + \frac{331}{2}\alpha^2$$
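The hand expansion above is easy to fumble, so here is a short SymPy check (my own snippet, not part of the original exercise) that confirms the coefficients of $x(\alpha)^T Q x(\alpha)$:

```python
import sympy as sp

alpha = sp.symbols('alpha')
Q = sp.Matrix([[4, 1], [1, 3]])
x = sp.Matrix([2 - 8*alpha, 1 - 3*alpha])   # x(alpha) = x0 + alpha*p0

quad = sp.expand((x.T * Q * x)[0])
print(quad)   # 331*alpha**2 - 174*alpha + 23
```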
Second Term: Linear Term Expansion
$$b^T x(\alpha) = \begin{bmatrix} 1 & 2 \end{bmatrix} \begin{bmatrix} 2 - 8\alpha \\ 1 - 3\alpha \end{bmatrix} = (2 - 8\alpha) + 2(1 - 3\alpha) = 4 - 14\alpha$$
Combining both terms:
$$\phi(\alpha) = \frac{23}{2} - 87\alpha + \frac{331}{2}\alpha^2 - (4 - 14\alpha) = \frac{15}{2} - 73\alpha + \frac{331}{2}\alpha^2$$
Setting $\phi'(\alpha) = -73 + 331\alpha = 0$ gives the optimal step size:
$$\alpha_0 = \frac{73}{331} \approx 0.2205$$
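As a numerical cross-check, here is a minimal NumPy sketch (the snippet and variable names are mine, not part of the original transcript). It uses the general exact-step formula for a quadratic, $\alpha = -(g^T p)/(p^T Q p)$, which reduces to $73/331$ here:

```python
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x0 = np.array([2.0, 1.0])

g0 = Q @ x0 - b
p0 = -g0

# Exact minimizer of phi(alpha) = f(x0 + alpha*p0) for a quadratic f:
# phi'(alpha) = g0^T p0 + alpha * p0^T Q p0 = 0
alpha0 = -(g0 @ p0) / (p0 @ Q @ p0)
print(alpha0)              # 0.22054... = 73/331

x1 = x0 + alpha0 * p0
print(x1)                  # ~[0.2356 0.3384] (the text rounds alpha0 to 0.2205)
```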
Your Turn
Now update the iterate:
$$x_1 = x_0 + \alpha_0 p_0$$
(RESPONSE):
We update the iterate:
$$x_1 = x_0 + \alpha_0 p_0 = \begin{bmatrix} 2 \\ 1 \end{bmatrix} + 0.2205 \begin{bmatrix} -8 \\ -3 \end{bmatrix} = \begin{bmatrix} 2 - 1.764 \\ 1 - 0.6615 \end{bmatrix} = \begin{bmatrix} 0.236 \\ 0.3385 \end{bmatrix}$$
Thus, the updated iterate is:
$$x_1 = \begin{bmatrix} 0.236 \\ 0.3385 \end{bmatrix}$$
Next Step: Compute Differences for the BFGS Update
Now, compute:
$$s_0 = x_1 - x_0, \qquad y_0 = \nabla f(x_1) - \nabla f(x_0)$$
Since we already know $x_0$ and $x_1$, compute $s_0$ first. Then, find $\nabla f(x_1)$ using the gradient formula:
$$\nabla f(x) = Qx - b$$
Step 1: Compute $s_0$
$$s_0 = x_1 - x_0 = \begin{bmatrix} 0.236 \\ 0.3385 \end{bmatrix} - \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 0.236 - 2 \\ 0.3385 - 1 \end{bmatrix} = \begin{bmatrix} -1.764 \\ -0.6615 \end{bmatrix}$$
Step 2: Compute $y_0 = \nabla f(x_1) - \nabla f(x_0)$
We use the gradient formula:
$$\nabla f(x) = Qx - b$$
$$\nabla f(x_1) = Q x_1 - b = \begin{bmatrix} 4 & 1 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} 0.236 \\ 0.3385 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
First, perform the matrix-vector multiplication:
$$Q x_1 = \begin{bmatrix} 4(0.236) + 1(0.3385) \\ 1(0.236) + 3(0.3385) \end{bmatrix} = \begin{bmatrix} 0.944 + 0.3385 \\ 0.236 + 1.0155 \end{bmatrix} = \begin{bmatrix} 1.2825 \\ 1.2515 \end{bmatrix}$$
Now subtract $b$:
$$\nabla f(x_1) = \begin{bmatrix} 1.2825 \\ 1.2515 \end{bmatrix} - \begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 1.2825 - 1 \\ 1.2515 - 2 \end{bmatrix} = \begin{bmatrix} 0.2825 \\ -0.7485 \end{bmatrix}$$
Now, compute $y_0$:
$$y_0 = \nabla f(x_1) - \nabla f(x_0) = \begin{bmatrix} 0.2825 \\ -0.7485 \end{bmatrix} - \begin{bmatrix} 8 \\ 3 \end{bmatrix} = \begin{bmatrix} 0.2825 - 8 \\ -0.7485 - 3 \end{bmatrix} = \begin{bmatrix} -7.7175 \\ -3.7485 \end{bmatrix}$$
Final Results
$$s_0 = \begin{bmatrix} -1.764 \\ -0.6615 \end{bmatrix}, \qquad y_0 = \begin{bmatrix} -7.7175 \\ -3.7485 \end{bmatrix}$$
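These differences are easy to confirm in NumPy; this sketch (mine, using the rounded $x_1$ from the text) reproduces both vectors:

```python
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

def grad(x):
    return Q @ x - b              # gradient formula used above

x0 = np.array([2.0, 1.0])
x1 = np.array([0.236, 0.3385])    # iterate from the line search (rounded)

s0 = x1 - x0
y0 = grad(x1) - grad(x0)
print(s0)   # [-1.764  -0.6615]
print(y0)   # [-7.7175 -3.7485]
```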
(RESPONSE):
Recall the BFGS update for the inverse Hessian approximation:
$$H_1 = \left(I - \rho_0 s_0 y_0^T\right) H_0 \left(I - \rho_0 y_0 s_0^T\right) + \rho_0 s_0 s_0^T, \qquad \rho_0 = \frac{1}{y_0^T s_0}$$
First, compute $y_0^T s_0$:
$$y_0^T s_0 = (-7.7175)(-1.764) + (-3.7485)(-0.6615) = 13.614 + 2.480 = 16.093$$
so $\rho_0 = 1/16.093 \approx 0.0621$.
1. $\rho_0 s_0 y_0^T$ and $\rho_0 y_0 s_0^T$:
$$\rho_0 s_0 y_0^T = 0.0621 \begin{bmatrix} -1.764 \\ -0.6615 \end{bmatrix} \begin{bmatrix} -7.7175 & -3.7485 \end{bmatrix} = 0.0621 \begin{bmatrix} 13.614 & 6.612 \\ 5.105 & 2.480 \end{bmatrix} = \begin{bmatrix} 0.846 & 0.411 \\ 0.317 & 0.154 \end{bmatrix}$$
Note that $\rho_0 y_0 s_0^T = \left(\rho_0 s_0 y_0^T\right)^T$ is the transpose of this matrix, not the matrix itself, so the two bracketed factors of the update differ:
$$I - \rho_0 s_0 y_0^T = \begin{bmatrix} 0.154 & -0.411 \\ -0.317 & 0.846 \end{bmatrix}, \qquad I - \rho_0 y_0 s_0^T = \begin{bmatrix} 0.154 & -0.317 \\ -0.411 & 0.846 \end{bmatrix}$$
Since $H_0 = I$, multiplying the two factors gives
$$\left(I - \rho_0 s_0 y_0^T\right)\left(I - \rho_0 y_0 s_0^T\right) = \begin{bmatrix} 0.193 & -0.397 \\ -0.397 & 0.816 \end{bmatrix}$$
2. $\rho_0 s_0 s_0^T$:
$$\rho_0 s_0 s_0^T = 0.0621 \begin{bmatrix} -1.764 \\ -0.6615 \end{bmatrix} \begin{bmatrix} -1.764 & -0.6615 \end{bmatrix} = 0.0621 \begin{bmatrix} 3.112 & 1.167 \\ 1.167 & 0.438 \end{bmatrix} = \begin{bmatrix} 0.193 & 0.073 \\ 0.073 & 0.027 \end{bmatrix}$$
Adding the two pieces:
$$H_1 = \begin{bmatrix} 0.193 & -0.397 \\ -0.397 & 0.816 \end{bmatrix} + \begin{bmatrix} 0.193 & 0.073 \\ 0.073 & 0.027 \end{bmatrix} = \begin{bmatrix} 0.386 & -0.324 \\ -0.324 & 0.843 \end{bmatrix}$$
Final Result:
The updated inverse Hessian approximation:
$$H_1 = \begin{bmatrix} 0.386 & -0.324 \\ -0.324 & 0.843 \end{bmatrix}$$
As a sanity check, $H_1$ is symmetric and satisfies the secant condition $H_1 y_0 = s_0$ (up to rounding), as every BFGS update must.
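Because the rank-one algebra is easy to get wrong by hand, here is a NumPy check (my own snippet, not part of the original transcript) of the $H_1$ computed above; it also verifies the secant condition:

```python
import numpy as np

s0 = np.array([-1.764, -0.6615])
y0 = np.array([-7.7175, -3.7485])

rho0 = 1.0 / (y0 @ s0)            # ~0.0621
I2 = np.eye(2)

# H1 = (I - rho s y^T) H0 (I - rho y s^T) + rho s s^T, with H0 = I
H1 = ((I2 - rho0 * np.outer(s0, y0)) @ (I2 - rho0 * np.outer(y0, s0))
      + rho0 * np.outer(s0, s0))
print(np.round(H1, 3))   # [[ 0.386 -0.324]
                         #  [-0.324  0.843]]
print(H1 @ y0)           # ~[-1.764 -0.6615] = s0 (secant condition)
```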
Final Thoughts
This completes the first iteration of the BFGS algorithm. In subsequent
iterations, we would:
Check for convergence.
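To see how all of these steps fit together, here is a minimal sketch of the full loop; the function name, tolerance, and iteration cap are my own choices. For this 2-D quadratic with exact line searches it reaches $x^* = Q^{-1} b$ in two iterations:

```python
import numpy as np

def bfgs_quadratic(Q, b, x0, tol=1e-10, max_iter=50):
    """BFGS on f(x) = 0.5 x^T Q x - b^T x with exact line search."""
    x = x0.astype(float)
    n = len(x)
    H = np.eye(n)
    I = np.eye(n)
    g = Q @ x - b
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:              # convergence check
            break
        p = -H @ g                               # search direction
        alpha = -(g @ p) / (p @ Q @ p)           # exact step for a quadratic
        x_new = x + alpha * p
        g_new = Q @ x_new - b
        s, y = x_new - x, g_new - g
        rho = 1.0 / (y @ s)
        H = ((I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s))
             + rho * np.outer(s, s))             # BFGS inverse-Hessian update
        x, g = x_new, g_new
    return x

Q = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(bfgs_quadratic(Q, b, np.array([2.0, 1.0])))   # ~[0.0909 0.6364]
```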