A Nonmonotone Semismooth Inexact Newton Method
Abstract
In this work we propose a variant of the inexact Newton method
for the solution of semismooth nonlinear systems of equations. We
introduce a nonmonotone scheme, which couples the inexact features
with nonmonotone strategies, and we present convergence theorems
for this scheme. Finally, we show how these strategies can be applied
in the variational inequality context and we present some numerical
examples.
Keywords: Semismooth Systems, Inexact Newton Methods, Non-
monotone Convergence, Variational Inequality Problems, Nonlinear
Programming Problems.
1 Introduction
An efficient method for the solution of the nonlinear system of equations
\[ F(x) = 0, \qquad (1) \]
to the outer residual, given by the quantity $\|F(x_k)\|$, while (3) implies that
the ratio of the norms of $F$ computed at two successive iterates
is less than $(1 - \beta(1 - \eta_k))$, a quantity less than one.
It is worth stressing that the two conditions are related to each other by
means of the forcing term $\eta_k$.
Algorithms with inexact features offer many advantages, from both the
theoretical and the practical point of view. Indeed, global convergence
theorems can be proved under standard assumptions. Furthermore,
condition (2) tells us that an adaptive tolerance can be introduced in the
solution of the iteration function equation (4), saving unnecessary computations
when we are far from the solution.
A further relaxation of the requirements can be obtained by allowing
nonmonotone choices. Nonmonotone strategies (see for example [10]) are
well known in the literature for their effectiveness in the choice of the
steplength in many line-search algorithms.
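As a minimal illustration of the idea, the reference value in a nonmonotone line search is typically the worst merit value over the last few iterates rather than the last one (the max rule of Grippo, Lampariello and Lucidi is one common realization; the function name below is ours, and the merit values are hypothetical):

```python
from collections import deque

def nonmonotone_reference(f_history, N):
    """Nonmonotone reference value: the largest merit value among the
    last N+1 iterates (Grippo-Lampariello-Lucidi style max rule)."""
    return max(list(f_history)[-(N + 1):])

# Accepting a step against the max of recent values, not just the last one,
# lets the merit function increase temporarily without rejecting the step.
history = deque(maxlen=8)
for val in [10.0, 4.0, 6.0, 3.0]:   # hypothetical merit values ||F(x_k)||
    history.append(val)
print(nonmonotone_reference(history, N=2))  # max over the last 3 values -> 6.0
```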
In [1] nonmonotone convergence was proved in the smooth case
for an inexact Newton line-search algorithm, and the numerical experience
shows that nonmonotone strategies can help this kind of algorithm
avoid stagnation of the iterates in a neighborhood of some
"critical" points. In this paper we modify the general inexact Newton
framework (2) and (3) in a nonmonotone way, by substituting (2) and (3) with
Our approach generalizes the global method for smooth equations presented
in [1], providing convergence theorems under analogous assumptions.
In the following section we recall some basic definitions and results
about semismooth functions; in section 3 we describe the general scheme
of our nonmonotone semismooth inexact method and we prove the
convergence theorems; in section 4 we apply the method to a particular
semismooth system arising from variational inequalities and nonlinear
programming problems and, in section 5, we report the numerical results.
The next results play an important role in establishing the global conver-
gence of the semismooth Newton methods.
\[ \|H^{-1}\| \le K, \qquad x_k \in N_{\delta/2}(x^*) \]
and
\[ x_{\ell(k)} \in S_\varepsilon \equiv \left\{\, y : \|F(y)\| < \frac{\varepsilon}{K(1+\eta)} \,\right\}. \]
\[ \|x_{k+1} - x^*\| \le 2K\|F(x_{k+1})\| < 2K\,\frac{\varepsilon}{K(1+\eta)} < \frac{\delta}{2} \]
Algorithm 3.1
Set $x_0 \in \mathbb{R}^n$, $\beta \in (0,1)$, $0 < \theta_{\min} < \theta_{\max} < 1$, $\eta_{\max} \in (0,1)$, $k = 0$.
For $k = 0, 1, 2, \dots$
    Choose $H_k \in \partial_B F(x_k)$ and determine $\bar{\eta}_k \in [0, \eta_{\max}]$ and $\bar{s}_k$ that satisfy
    \[ \|H_k \bar{s}_k + F(x_k)\| \le \bar{\eta}_k \|F(x_{\ell(k)})\|. \]
    Set $\alpha_k = 1$.
    While $\|F(x_k + \alpha_k \bar{s}_k)\| > (1 - \alpha_k \beta(1 - \bar{\eta}_k))\,\|F(x_{\ell(k)})\|$
        Choose $\theta \in [\theta_{\min}, \theta_{\max}]$.
        Set $\alpha_k = \theta \alpha_k$.
    Set $x_{k+1} = x_k + \alpha_k \bar{s}_k$.
is satisfied. Condition (13) is more general than the Armijo condition
employed, for example, in [7], since it does not require the differentiability
of the merit function $\Psi(x) = \frac{1}{2}\|F(x)\|^2$.
The final inexact Newton step is given by $s_k = \alpha_k \bar{s}_k$, and it satisfies
conditions (8) and (9) with forcing term $\eta_k = 1 - \alpha_k(1 - \bar{\eta}_k)$.
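The scheme above can be sketched in a few lines of code. The following is an illustrative sketch under simplifying assumptions, not the paper's implementation: $F$ is taken smooth so the Jacobian plays the role of $H_k \in \partial_B F(x_k)$, the inner system is solved exactly (so $\bar{\eta}_k = 0$), $\theta$ is kept fixed, and the test problem and all names are ours:

```python
import numpy as np

def nonmonotone_inexact_newton(F, jac, x0, N=5, beta=1e-4, theta=0.5,
                               tol=1e-8, max_iter=200):
    """Sketch of the nonmonotone scheme of Algorithm 3.1.

    jac(x) stands in for an element H_k of the B-subdifferential (the
    Jacobian, when F is smooth). The inner system is solved exactly, so
    the residual condition holds with eta_bar = 0."""
    x = np.asarray(x0, dtype=float)
    norms = [np.linalg.norm(F(x))]          # history of ||F(x_k)||
    for _ in range(max_iter):
        if norms[-1] <= tol:
            break
        ref = max(norms[-(N + 1):])         # nonmonotone reference ||F(x_l(k))||
        eta_bar = 0.0                       # exact inner solve in this sketch
        s = np.linalg.solve(jac(x), -F(x))  # ||H_k s + F(x_k)|| = 0 <= eta_bar*ref
        alpha = 1.0
        # Backtrack until the nonmonotone acceptance condition holds.
        while (np.linalg.norm(F(x + alpha * s))
               > (1.0 - alpha * beta * (1.0 - eta_bar)) * ref) and alpha > 1e-12:
            alpha *= theta
        x = x + alpha * s
        norms.append(np.linalg.norm(F(x)))
    return x

# Hypothetical smooth test problem: intersection of a line and a circle.
F = lambda x: np.array([x[0] + x[1] - 3.0, x[0] ** 2 + x[1] ** 2 - 9.0])
J = lambda x: np.array([[1.0, 1.0], [2.0 * x[0], 2.0 * x[1]]])
sol = nonmonotone_inexact_newton(F, J, [1.0, 5.0])
```

With larger $N$ the acceptance test compares against the worst of more past residual norms, which is what allows the nonmonotone behavior described in the text.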
We will simply assume that at each iteration $k$ it is possible to compute a
vector $\bar{s}_k$ which is an inexact Newton step at level $\bar{\eta}_k$ (see for example
assumption A1 in [12] for a sufficient condition). The next lemma shows
that, under this assumption, the sequence generated by Algorithm
3.1 satisfies conditions (8) and (9).
Lemma 3.1 Let $\beta \in (0, 1)$; suppose that there exist $\bar{\eta} \in [0, 1)$ and $\bar{s}$ satisfying
\[ \varepsilon = \frac{(1 - \beta)(1 - \bar{\eta})}{\|\bar{s}\|}\,\|F(x_{\ell(k)})\|, \qquad (16) \]
Theorem 3.2 Suppose that $\{x_k\}$ is the sequence generated by the
nonmonotone semismooth Algorithm 3.1, with $2\beta < 1 - \eta_{\max}$. Assume that the
following conditions hold:
A1 There exists a limit point $x^*$ of the sequence $\{x_k\}$ such that $F$ is
BD-regular at $x^*$;
A2 At each iteration $k$ it is possible to find a forcing term $\bar{\eta}_k$ and a vector
$\bar{s}_k$ such that the inexact residual condition (8) is satisfied;
A3 For every sequence $\{x_k\}$ converging to $x^*$, every convergent sequence
$\{s_k\}$ and every sequence $\{\lambda_k\}$ of positive scalars converging to zero,
\[ \limsup_{k \to +\infty} \frac{\Psi(x_k + \lambda_k s_k) - \Psi(x_{\ell(k)})}{\lambda_k} \le \lim_{k \to +\infty} F(x_k)^T H_k s_k, \]
where $\Psi(x) = \frac{1}{2}\|F(x)\|^2$, whenever the limit on the left-hand side
exists;
A4 For every subsequence $\{x_{k_j}\}$ such that $\alpha_{k_j}$ converges to zero, $\|\bar{s}_{k_j}\|$
is bounded.
Then $F(x^*) = 0$ and the sequence $\{x_k\}$ converges to $x^*$.
Proof. Assumption A1 implies that the norm $\|\bar{s}_k\|$ is
bounded in a neighborhood of the point $x^*$. Indeed, from Proposition 2.1,
there exist a positive number $\delta$ and a constant $K$ such that $\|H_k^{-1}\| \le K$
for any $H_k \in \partial_B F(x_k)$ and any $x_k \in N_\delta(x^*)$.
Thus, the following conditions hold:
Furthermore, condition A2 ensures that Algorithm 3.1 does not
break down, and thus it generates an infinite sequence.
Now we consider separately the two following cases:
a) There exists a set of indices $K$ such that $\{x_k\}_{k \in K}$ converges to $x^*$ and
\[ \liminf_{k \to +\infty,\, k \in K} \alpha_k = 0; \]
where $I$ is a set of indices such that the limit on the right-hand side exists.
Since $\alpha_k$ is the final value after the backtracking loop, we must have
\[ \left\|F\!\left(x_k + \frac{\alpha_k}{\theta}\,\bar{s}_k\right)\right\| > \left(1 - \frac{\alpha_k}{\theta}\,\beta(1 - \bar{\eta}_k)\right) \|F(x_{\ell(k)})\|, \qquad (20) \]
which yields
\[ \lim_{k \to +\infty,\, k \in K} \left\|F\!\left(x_k + \frac{\alpha_k}{\theta}\,\bar{s}_k\right)\right\| \ge \lim_{k \to +\infty,\, k \in K} \left(1 - \frac{\alpha_k}{\theta}\,\beta(1 - \bar{\eta}_k)\right) \|F(x_{\ell(k)})\|. \qquad (21) \]
If we choose $K$ as the set of indices with property a), exploiting the
continuity of $F$, recalling that $\bar{\eta}_k$ and $\|\bar{s}_k\|$ are bounded, and taking a
subsequence to ensure the existence of the limit of $\alpha_k$, we obtain $\|F(x^*)\| \ge L$.
On the other hand, from (19) we also have $L \ge \|F(x^*)\|$; thus it follows
that
\[ L = \|F(x^*)\|. \qquad (22) \]
Furthermore, by squaring both sides of (20), we obtain the following
inequalities:
\[ \left\|F\!\left(x_k + \frac{\alpha_k}{\theta}\,\bar{s}_k\right)\right\|^2 > \left(1 - \frac{\alpha_k}{\theta}\,\beta(1 - \bar{\eta}_k)\right)^2 \|F(x_{\ell(k)})\|^2 \ge \left(1 - 2\,\frac{\alpha_k}{\theta}\,\beta(1 - \bar{\eta}_k)\right) \|F(x_{\ell(k)})\|^2. \]
This yields
\[ \left\|F\!\left(x_k + \frac{\alpha_k}{\theta}\,\bar{s}_k\right)\right\|^2 - \|F(x_{\ell(k)})\|^2 > -2\,\frac{\alpha_k}{\theta}\,\beta(1 - \bar{\eta}_k)\,\|F(x_{\ell(k)})\|^2. \qquad (23) \]
Dividing both sides by $\alpha_k/\theta$, passing to the limit and exploiting assumption
A4, we obtain
\[ \lim_{k \to +\infty,\, k \in K} F(x_k)^T H_k s_k \ge \lim_{k \to +\infty,\, k \in K} \frac{\|F(x_k + \frac{\alpha_k}{\theta}\,\bar{s}_k)\|^2 - \|F(x_{\ell(k)})\|^2}{\alpha_k/\theta} \ge \lim_{k \to +\infty,\, k \in K} -2\beta(1 - \bar{\eta}_k)\,\|F(x_{\ell(k)})\|^2. \qquad (24) \]
Since (22) holds and taking into account that $\bar{\eta}_k \ge 0$, we have
\[ \lim_{k \to +\infty} F(x_k)^T H_k \bar{s}_k \le \lim_{k \to +\infty} \left( -\|F(x_k)\|^2 + \eta_{\max}\,\|F(x_{\ell(k)})\|^2 \right). \]
\[ L \cdot \lim_{k \to \infty} \alpha_{\ell(k)-1} \le 0, \]
which implies
\[ L = 0 \]
or
\[ \lim_{k \to \infty} \alpha_{\ell(k)-1} = 0. \qquad (30) \]
Suppose that $L \ne 0$, so that (30) holds. Defining $\hat{\ell}(k) = \ell(k + N + 1)$, so
that $\hat{\ell}(k) > k$, we show by induction that for any $j \ge 1$ we have
\[ \lim_{k \to \infty} \alpha_{\hat{\ell}(k)-j} = 0 \qquad (31) \]
and
\[ \lim_{k \to \infty} \|F(x_{\hat{\ell}(k)-j})\| = L. \qquad (32) \]
\[ \lim_{k \to \infty} \|x_{\hat{\ell}(k)} - x_{\hat{\ell}(k)-1}\| = 0. \qquad (33) \]
\[ \lim_{k \to \infty} \|F(x_{\hat{\ell}(k)-1})\| = L. \qquad (34) \]
Assume now that (31) and (32) hold for a given $j$. We have
\[ \lim_{k \to \infty} \alpha_{\hat{\ell}(k)-(j+1)} = 0 \]
and so
\[ \lim_{k \to \infty} \|x_{\hat{\ell}(k)-j} - x_{\hat{\ell}(k)-(j+1)}\| = 0, \qquad \lim_{k \to \infty} \|F(x_{\hat{\ell}(k)-(j+1)})\| = L. \]
Thus, we conclude that (31) and (32) hold for any $j \ge 1$. Now, for any $k$,
we can write
\[ \|x_{k+1} - x_{\hat{\ell}(k)}\| \le \sum_{j=1}^{\hat{\ell}(k)-k-1} \alpha_{\hat{\ell}(k)-j}\,\|\bar{s}_{\hat{\ell}(k)-j}\|, \]
so that, since we have $\hat{\ell}(k) - k - 1 \le N$, we have
Furthermore, we have
\[ \|x_{\hat{\ell}(k)} - x^*\| \le \|x_{\hat{\ell}(k)} - x_{k+1}\| + \|x_{k+1} - x^*\|. \qquad (36) \]
Since $x^*$ is a limit point of $\{x_k\}$ and (35) holds, (36) implies that $x^*$ is
a limit point of the sequence $\{x_{\hat{\ell}(k)}\}$. From (33) we conclude that $x^*$ is a
\[ \lim_{k \to \infty} \|F(x_k)\| = 0. \]
where $C$ is a nonempty closed convex subset of $\mathbb{R}^n$, $\langle \cdot, \cdot \rangle$ is the usual inner
product in $\mathbb{R}^n$ and $V : \mathbb{R}^n \to \mathbb{R}^n$ is a continuous function.
When $V$ is the gradient mapping of a real-valued function $f : \mathbb{R}^n \to \mathbb{R}$, the
problem VIP(C,V) becomes the stationary point problem of the following
optimization problem:
\[ \min f(x) \quad \text{s.t. } x \in C. \qquad (38) \]
\[
\begin{aligned}
& L(x, \lambda, \mu, \kappa_l, \kappa_u) = 0 \\
& h(x) = 0 \\
& \mu^T g(x) = 0, \quad g(x) \ge 0, \quad \mu \ge 0 \qquad (40) \\
& \kappa_l^T(\Pi_l x - l) = 0, \quad \Pi_l x - l \ge 0, \quad \kappa_l \ge 0 \\
& \kappa_u^T(u - \Pi_u x) = 0, \quad u - \Pi_u x \ge 0, \quad \kappa_u \ge 0
\end{aligned}
\]
The system (41) can be solved by the semismooth inexact Newton method
[7], given by
\[ w_{k+1} = w_k + \alpha_k \Delta w_k, \]
where $w_0$ is a convenient starting point, $\alpha_k$ is a damping parameter and
$\Delta w_k$ is the solution of the following linear system:
\[ H_k \Delta w = -\Phi(w_k) + r_k, \qquad (42) \]
where $H_k \in \partial_B \Phi(w_k)$ and $r_k$ is the residual vector, which satisfies the
condition
\[ \|r_k\| \le \eta_k \|\Phi(w_k)\|. \]
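This residual condition means that any iterative linear solver can be stopped early once the inner residual is small enough relative to $\|\Phi(w_k)\|$. The following sketch is our own stand-in for the Krylov solver one would use in practice: it runs plain gradient iterations on the least-squares residual until the inexact Newton condition is met (all names and the test data are ours):

```python
import numpy as np

def inexact_step(H, Phi, eta, max_inner=500):
    """Return s with ||H s + Phi|| <= eta * ||Phi||, using steepest-descent
    iterations on 0.5 * ||H s + Phi||^2 (a simple stand-in for a Krylov
    method such as GMRES or LSQR)."""
    s = np.zeros_like(Phi)
    target = eta * np.linalg.norm(Phi)
    for _ in range(max_inner):
        r = H @ s + Phi                  # current inner residual
        if np.linalg.norm(r) <= target:  # inexact Newton condition met
            break
        g = H.T @ r                      # gradient of 0.5 * ||H s + Phi||^2
        Hg = H @ g
        step = (g @ g) / (Hg @ Hg)       # exact line search along -g
        s = s - step * g
    return s

# Hypothetical data standing in for H_k and Phi(w_k).
H = np.array([[4.0, 1.0], [1.0, 3.0]])
Phi = np.array([1.0, 2.0])
s = inexact_step(H, Phi, eta=0.1)
```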
As shown in [18], by permuting the rows of the matrix $H_k$ and of the
right-hand side and changing the sign of the fourth row, the system (42)
can be written as follows:
\[
\begin{pmatrix}
R_{\kappa_l} & 0 & 0 & R_l \Pi_l & 0 \\
0 & R_{\kappa_u} & 0 & R_u \Pi_u & 0 \\
0 & 0 & R_\mu & R_g (\nabla g(x))^T & 0 \\
\Pi_l^T & -\Pi_u^T & \nabla g(x) & -\nabla V(x) + \nabla^2 g(x)\mu + \nabla^2 h(x)\lambda & \nabla h(x) \\
0 & 0 & 0 & (\nabla h(x))^T & 0
\end{pmatrix}
\begin{pmatrix}
\Delta\kappa_l \\ \Delta\kappa_u \\ \Delta\mu \\ \Delta x \\ \Delta\lambda
\end{pmatrix}
= -
\begin{pmatrix}
\varphi_l(\kappa_l, \Pi_l x - l) \\
\varphi_u(\kappa_u, u - \Pi_u x) \\
\varphi_I(\mu, g(x)) \\
\alpha \\
h(x)
\end{pmatrix}
+ P_r
\]
where $P_r$ is the permuted residual vector and $-\alpha = V(x) - \nabla h(x)\lambda - \nabla g(x)\mu - \Pi_l^T \kappa_l + \Pi_u^T \kappa_u$;
\[ R_g = \mathrm{diag}(r_{g_1}, \dots, r_{g_m}), \qquad
(r_g)_i = \begin{cases} \dfrac{g_i(x)}{\sqrt{\mu_i^2 + g_i(x)^2}} - 1 & \text{if } (g_i(x), \mu_i) \ne 0 \\[1ex] -1 & \text{if } (g_i(x), \mu_i) = 0 \end{cases} \]
\[ R_\mu = \mathrm{diag}(r_{\mu_1}, \dots, r_{\mu_m}), \qquad
(r_\mu)_i = \begin{cases} \dfrac{\mu_i}{\sqrt{\mu_i^2 + g_i(x)^2}} - 1 & \text{if } (g_i(x), \mu_i) \ne 0 \\[1ex] -1 & \text{if } (g_i(x), \mu_i) = 0 \end{cases} \]
\[ R_l = \mathrm{diag}(r_{l_1}, \dots, r_{l_{n_l}}), \qquad
(r_l)_i = \begin{cases} \dfrac{(\Pi_l x)_i - l_i}{\sqrt{(\kappa_l)_i^2 + ((\Pi_l x)_i - l_i)^2}} - 1 & \text{if } ((\Pi_l x)_i - l_i, (\kappa_l)_i) \ne 0 \\[1ex] -1 & \text{if } ((\Pi_l x)_i - l_i, (\kappa_l)_i) = 0 \end{cases} \]
\[ R_u = \mathrm{diag}(r_{u_1}, \dots, r_{u_{n_u}}), \qquad
(r_u)_i = \begin{cases} -\dfrac{u_i - (\Pi_u x)_i}{\sqrt{(\kappa_u)_i^2 + (u_i - (\Pi_u x)_i)^2}} - 1 & \text{if } (u_i - (\Pi_u x)_i, (\kappa_u)_i) \ne 0 \\[1ex] -1 & \text{if } (u_i - (\Pi_u x)_i, (\kappa_u)_i) = 0 \end{cases} \]
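In computing these diagonal entries, the only delicate point is the degenerate pair $(g_i(x), \mu_i) = (0, 0)$, where the fraction is undefined and the entry is set to $-1$. A small numerical sketch for $R_g$ and $R_\mu$ (the function name and the safeguard mechanics are ours):

```python
import numpy as np

def fischer_diagonals(g, mu):
    """Diagonal entries (r_g)_i and (r_mu)_i of the reformulated system.
    Where (g_i, mu_i) = (0, 0) the formula is undefined and the entry is
    set to -1, matching the convention in the text."""
    r = np.sqrt(mu ** 2 + g ** 2)
    degenerate = (r == 0.0)
    safe = np.where(degenerate, 1.0, r)           # avoid division by zero
    r_g = np.where(degenerate, -1.0, g / safe - 1.0)
    r_mu = np.where(degenerate, -1.0, mu / safe - 1.0)
    return np.diag(r_g), np.diag(r_mu)

# Hypothetical data: one regular pair (3, 4) and one degenerate pair (0, 0).
Rg, Rmu = fischer_diagonals(np.array([3.0, 0.0]), np.array([4.0, 0.0]))
# first entry of Rg: 3/5 - 1 = -0.4; second entry: -1 (degenerate pair)
```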
Algorithm 4.1
Step 4 (Linesearch)
Compute the minimum integer $h$ such that, if $\alpha_k = \theta^h$, the following
condition holds:
5 Numerical results
In this section we report some numerical experiments, obtained by coding
Algorithm 4.1 in FORTRAN 90, in double precision, on an HP zx6000
workstation with an Itanium2 processor (1.3 GHz) and 2 GB of RAM,
running the HP-UX operating system.
In particular, the input parameters are $\beta = 10^{-4}$, $\theta = 0.5$, $tol = 10^{-8}$; we
declare a failure of the algorithm when the tolerance $tol$ is not reached after
500 iterations or when, in order to satisfy the backtracking condition (44),
more than 30 reductions of the damping parameter have been performed.
The forcing term $\eta_k$ has been adaptively chosen as
\[ \eta_k = \max\left( \frac{1}{1+k},\, 10^{-8} \right). \]
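As a one-line sketch (the function name is ours), this adaptive choice keeps the inner tolerance loose in the early iterations and tightens it as $k$ grows, with a fixed floor:

```python
def forcing_term(k, eta_min=1e-8):
    """Adaptive forcing term from the experiments: 1/(1+k), never below
    the floor eta_min."""
    return max(1.0 / (1 + k), eta_min)

# forcing_term(9) -> 0.1; for very large k the floor 1e-8 is returned.
```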
The solution of the linear system (43) is computed by the LSQR method
of Paige and Saunders [15], with a suitable preconditioner proposed in [18].
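As a hedged sketch of such an inexact inner solve with SciPy's `lsqr` (the data are our own, and no preconditioner is applied): with `atol=0`, LSQR's stopping rule for a consistent system reduces to $\|Hs + \Phi\| \le \texttt{btol}\cdot\|\Phi\|$, so the forcing term can be imposed through `btol`:

```python
import numpy as np
from scipy.sparse.linalg import lsqr

H = np.array([[5.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 3.0]])     # hypothetical stand-in for H_k
Phi = np.array([1.0, 2.0, 3.0])     # hypothetical stand-in for Phi(w_k)
eta = 1e-2                          # forcing term for this iteration

# Solve H s = -Phi inexactly: btol = eta enforces the residual condition
# ||H s + Phi|| <= eta * ||Phi|| at termination.
s = lsqr(H, -Phi, atol=0.0, btol=eta)[0]
```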
*http://scicomp.ewha.ac.kr/netlib/ampl/models/nlmodels/
NLP Problem          N=1    N=3    N=5    N=7
osborne1   ext.      121     16     16     16
           inn.      121     16     16     16
           back      443      0      0      0
rosenbr    ext.      180     14      9      9
           inn.      180     14      9      9
           back     1121      4      2      2
hs6        ext.        8      7      7      7
           inn.        8      7      7      7
           back       14     11     11     11
mitt105    ext.       11      9      8      8
           inn.       19     13      9      9
           back       13      4      0      0
mitt305    ext.       31     24     24     20
           inn.       71     47     46     37
           back      195    105    102     68
mitt405    ext.       32     26     19     19
           inn.       77     54     39     39
           back      227    144     81     81

−  the algorithm does not converge
References
[1] S. Bonettini (2005). A Nonmonotone Inexact Newton Method, Optim.
Meth. Software, 20(4–5).
[2] I. Bongartz, A. R. Conn, N. Gould and Ph. L. Toint (1995). CUTE:
Constrained and Unconstrained Testing Environment, ACM Transactions
on Mathematical Software, 21, 123–160.
[11] L. Lukšan and J. Vlček (1999). Sparse and partially separable test problems
for unconstrained and equality constrained optimization, Technical
Report 767, Institute of Computer Science, Academy of Sciences of the
Czech Republic.