Asia-Pacific Journal of Operational Research, Vol. 28, No. 5 (2011) 585–600
© World Scientific Publishing Co. & Operational Research Society of Singapore
DOI: 10.1142/S0217595911003454

COMBINATION ADAPTIVE TRUST REGION METHOD BY NON-MONOTONE STRATEGY FOR UNCONSTRAINED NONLINEAR PROGRAMMING

KEYVAN AMINI* and MASOUD AHOOKHOSH†
Department of Sciences, Razi University, Kermanshah, Iran
*kamini@razi.ac.ir
†ahoo.math@gmail.com

In this paper, we present a new trust region method for unconstrained nonlinear programming that blends an adaptive trust region algorithm with a non-monotone strategy, yielding a new non-monotone trust region algorithm with automatically adjusted radius. Together, the non-monotone strategy and the adaptive technique reduce the number of iterations and function evaluations. The new algorithm preserves global convergence and has local superlinear and quadratic convergence under suitable conditions. Numerical experiments show that the new trust region algorithm is efficient and robust for unconstrained optimization problems.

Keywords: Unconstrained optimization; trust region method; non-monotone technique; global convergence; superlinear convergence; quadratic convergence; second-order necessary condition.

1. Introduction

Consider the unconstrained optimization problem

$$\min_{x \in \mathbb{R}^n} f(x) \qquad (1.1)$$

where $f : \mathbb{R}^n \to \mathbb{R}$ is a twice continuously differentiable function. Many methods have been proposed to solve (1.1), most of them iterative. Trust region methods form one of the best-known families of methods for nonlinear programming, whose underlying concept has matured over more than 30 years. In this family of methods, one considers a model $m_k$ of the objective function which is assumed to be adequate in a neighborhood of the current iterate $x_k$, namely the trust region.
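As a concrete illustration of this framework, the following Python sketch implements a generic textbook trust region loop with a Cauchy-point step. The function names, update constants, and step computation are our own illustrative choices, not the algorithm proposed in this paper:

```python
import numpy as np

def basic_trust_region(f, grad, hess, x0, delta0=1.0, max_iter=200, tol=1e-8):
    """Generic trust region loop: build a quadratic model around x_k,
    take a step inside the ball of radius delta_k, and grow or shrink
    the radius according to how well the model predicted the objective."""
    x, delta = np.asarray(x0, float), delta0
    for _ in range(max_iter):
        g, B = grad(x), hess(x)
        if np.linalg.norm(g) <= tol:
            break
        # Cheap inexact subproblem solution: the Cauchy point, i.e. the
        # minimizer of the model along -g inside the trust region.
        gBg = g @ B @ g
        t = delta / np.linalg.norm(g)
        if gBg > 0:
            t = min(t, (g @ g) / gBg)
        d = -t * g
        pred = -(g @ d + 0.5 * d @ B @ d)   # m_k(0) - m_k(d)
        ratio = (f(x) - f(x + d)) / pred    # agreement of model and function
        if ratio < 0.25:
            delta *= 0.5                    # poor agreement: shrink radius
        elif ratio > 0.75:
            delta *= 2.0                    # good agreement: grow radius
        if ratio > 0:                       # accept step if f decreased
            x = x + d
    return x

# A convex quadratic test problem with minimizer (1, 0):
x_min = basic_trust_region(lambda x: (x[0] - 1)**2 + 4 * x[1]**2,
                           lambda x: np.array([2 * (x[0] - 1), 8 * x[1]]),
                           lambda x: np.diag([2.0, 8.0]),
                           np.array([5.0, 5.0]))
# x_min approaches the minimizer (1, 0)
```

Because the model here is the exact quadratic, the agreement ratio stays close to 1 and the radius quickly stops being active; on nonconvex problems the shrink/grow logic is what keeps the steps trustworthy.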
This neighborhood is often a ball whose radius $\delta_k$ is updated from iteration $k$ to $k+1$ according to how well $m_k$ predicts the objective function at the iterate $x_{k+1}$. Trust region methods try to find a region around the current iterate $x_k$ in which the quadratic model agrees well with the objective function. In the standard trust region method, this agreement is measured by the ratio

$$r_k = \frac{f(x_k) - f(x_k + d_k)}{m_k(0) - m_k(d_k)} \qquad (1.2)$$

The numerator of (1.2) is called the actual reduction and the denominator the predicted reduction; $d_k$ is a trial step determined by solving the subproblem

$$\min_{d \in \mathbb{R}^n} \; m_k(d) = g_k^T d + \frac{1}{2} d^T B_k d, \qquad \|d\| \le \delta_k \qquad (1.3)$$

where $\|\cdot\|$ is the Euclidean norm, $g_k = \nabla f(x_k)$, $B_k$ is a symmetric approximation of $H_k = \nabla^2 f(x_k)$, and $\delta_k$ is the trust region radius at the $k$th iteration. If $r_k$ is close to 1, the model agrees well with the objective function over this step, and the step is successful; otherwise the step is unsuccessful. Two factors are prominent in trust region methods: the initial trust region radius and the ratio measuring the agreement between the function and the quadratic model. It is well known that the standard trust region method is very sensitive to the initial radius, yet in the standard method $\delta_k$ is independent of $g_k$ and $B_k$, so we cannot tell whether the radius $\delta_k$ is suitable throughout the iterations of the algorithm. This can increase the number of subproblems solved in the inner iteration and so decrease the efficiency of the algorithm. A natural way to avoid this drawback is to use information from the current iterate to reduce the number of ineffective iterations. To this end, Sartenaer (1997) proposed a strategy to determine the initial trust region radius. Motivated by a neural network problem, Zhang et al.
(2002) proposed another approach to determine the trust region radius. They set $\delta_k = \rho^p \|g_k\| \cdot \|\hat B_k^{-1}\|$, where $\rho \in (0,1)$, $p$ is a non-negative integer, and $\hat B_k = B_k + iI$ is a positive definite matrix for some $i \in \mathbb{N}$. Zhang's approach proved efficient for solving unconstrained optimization problems. Shi and Guo (2008), motivated by Zhang's strategy, introduced a new adaptive radius for trust region methods, described in the next section. They proved that their adaptive trust region algorithm has strong properties such as global, superlinear and quadratic convergence, and their numerical experiments showed that the new trust region algorithm with automatically adjusted radius is very efficient. As stated above, the other important factor in trust region methods is the ratio comparing the agreement between the function and the quadratic model. The pioneering non-monotone strategy is due to Chamberlain et al. (1982), who proposed the watchdog technique for constrained optimization. Based on this idea, Grippo et al. (1986, 1989) introduced a non-monotone line search technique for Newton's method. They relaxed the Armijo rule so that the stepsize $\theta_k$ satisfies

$$f(x_k + \theta_k d_k) \le f_{l(k)} + \beta \theta_k \nabla f(x_k)^T d_k \qquad (1.4)$$

in which $\beta \in (0,1)$ and

$$f_{l(k)} = \max_{0 \le j \le n(k)} \{f_{k-j}\}, \qquad k = 0, 1, 2, \ldots \qquad (1.5)$$

with $0 \le n(k) \le \min\{n(k-1)+1, N\}$, where $N \ge 0$ is an integer constant. Motivated by this idea, many authors combined trust region methods with non-monotone techniques (see Deng et al., 1993; Toint, 1996, 1997; Zhang et al., 2003). The basic idea of non-monotone trust region techniques is to change the ratio (1.2). The best-known non-monotone ratio is

$$\tilde r_k = \frac{f_{l(k)} - f(x_k + d_k)}{m_k(0) - m_k(d_k)} \qquad (1.6)$$

in which, compared with (1.2), the actual reduction is changed.
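For illustration, the reference value $f_{l(k)}$ of (1.5) with $n(k) = \min\{N, k\}$ can be computed from the stored objective values as follows. This is a sketch in Python; the function name and the list-based history are our own conventions:

```python
def f_l(f_history, N):
    """f_{l(k)} = max_{0 <= j <= n(k)} f_{k-j} with n(k) = min(N, k):
    the largest of the most recent min(N, k) + 1 objective values."""
    k = len(f_history) - 1          # current iteration index
    n_k = min(N, k)                 # memory actually available so far
    return max(f_history[-(n_k + 1):])

# With memory N = 3, values older than f_{k-3} are ignored:
values = [10.0, 4.0, 7.0, 3.0, 2.0]       # f_0, ..., f_4
print(f_l(values, 3))                      # max(f_1, f_2, f_3, f_4) -> 7.0
```

Setting $N = 0$ gives $f_{l(k)} = f_k$, i.e., the usual monotone acceptance test.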
In 2005, Fu and Sun proposed a new non-monotone ratio:

$$\hat r_k = \frac{f_{l(k)} - f(x_k + d_k)}{f_{l(k)} - f_k - m_k(d_k)} \qquad (1.7)$$

In this definition, compared with (1.2), both the numerator and the denominator are changed. The actual reduction and the predicted reduction are defined as

$$ared_k = f_{l(k)} - f(x_k + d_k), \qquad pred_k = f_{l(k)} - f_k - m_k(d_k) \qquad (1.8)$$

Numerical experiments showed that the non-monotone trust region algorithm was more efficient than the usual monotone versions, especially in the presence of a narrow curved valley. Building on these attempts to reform trust region algorithms, Zhang et al. (2003) combined the non-monotone strategy (1.6) with an adaptive trust region method; their non-monotone adaptive trust region algorithm is very efficient. Subsequently, Fu et al. (2005) proposed another non-monotone adaptive trust region algorithm in which the non-monotone technique (1.7) was combined with Zhang's adaptive trust region radius. On the other hand, from Shi and Guo (2008) we know that Shi's adaptive trust region algorithm is more efficient than Zhang's; we therefore combine Shi's adaptive radius with the non-monotone technique (1.7) to propose a new non-monotone trust region algorithm with automatically adjusted radius. The rest of this paper is organized as follows. In Sec. 2, we describe the new non-monotone adaptive trust region algorithm and give some of its properties. In Sec. 3, we prove the global convergence of the new algorithm. In Sec. 4, local superlinear and quadratic convergence and a second-order necessary condition are proved. Numerical results in Sec. 5 indicate that the new algorithm is very efficient and robust. Finally, conclusions and remarks are given in Sec. 6.

2. New Algorithm

In this section, we describe the new non-monotone trust region method with adaptive radius and give some of its properties.
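In code, the quantities in (1.7) and (1.8) can be sketched as follows (Python, with argument names of our own choosing; `m_d` denotes the model value $m_k(d_k) = g_k^T d_k + \frac{1}{2} d_k^T B_k d_k$):

```python
def fu_sun_ratio(f_lk, f_k, f_new, m_d):
    """Non-monotone ratio (1.7) of Fu and Sun: both the actual and the
    predicted reduction are measured from f_{l(k)} rather than f_k."""
    ared = f_lk - f_new          # actual reduction, (1.8)
    pred = f_lk - f_k - m_d      # predicted reduction, (1.8)
    return ared / pred

# f_{l(k)} = 5, f_k = 4, trial value f(x_k + d_k) = 3, m_k(d_k) = -1.5:
print(fu_sun_ratio(5.0, 4.0, 3.0, -1.5))   # (5 - 3) / (5 - 4 + 1.5) = 0.8
```

Note that when $f_{l(k)} = f_k$ (the monotone case, $N = 0$), the ratio reduces to the standard $r_k$ of (1.2).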
In the new algorithm, we combine the non-monotone strategy (1.7) with Shi's adaptive trust region radius, which we describe first. Following Shi's algorithm, set $\mu \in (0,1)$, $\rho \in (0,1)$, and suppose that $q_k$ satisfies

$$-\frac{g_k^T q_k}{\|g_k\| \cdot \|q_k\|} \ge \tau \qquad (2.1)$$

with $\tau \in (0,1]$. Now define

$$s_k = -\frac{g_k^T q_k}{q_k^T \hat B_k q_k} \qquad (2.2)$$

in which $\hat B_k = B_k + iI$, and $i$ is the smallest non-negative integer such that

$$q_k^T \hat B_k q_k > 0 \qquad (2.3)$$

With these definitions, Shi and Guo proposed the trust region radius $\delta_k = \alpha_k \|q_k\|$, in which $\alpha_k$ is the largest $\alpha$ in $\{s_k, \rho s_k, \rho^2 s_k, \ldots\}$ such that the trial step $d_k$ satisfies

$$\hat r_k \ge \mu \qquad (2.4)$$

So there is a non-negative integer $p$ such that $\delta_k = \rho^p s_k \|q_k\|$. The new algorithm uses this radius, with the single difference that $q_k$ is chosen to satisfy

$$q_k^T \hat B_k q_k > \lambda_0 q_k^T q_k \qquad (2.5)$$

Remark 2.1. Choosing $q_k$ to satisfy (2.5) is possible because

$$\lambda_{\min}(\hat B_k) \le \frac{q_k^T \hat B_k q_k}{q_k^T q_k} \le \lambda_{\max}(\hat B_k)$$

so it is enough to choose $i$ such that $\lambda_{\min}(\hat B_k) > 0$.

Now we can state the new non-monotone trust region algorithm.

Algorithm 2.1: New non-monotone adaptive trust region algorithm.
(1) Choose $0 < \rho < 1$, $0 < \mu < 1$, $\epsilon > 0$, $x_0 \in \mathbb{R}^n$, $k = 0$, a symmetric matrix $B_0 \in \mathbb{R}^{n \times n}$, and $N \ge 0$.
(2) Compute $g_k$. If $\|g_k\| \le \epsilon$, stop.
(3) Choose $q_k$ to satisfy (2.1) and set $\alpha = s_k$, where $s_k$ is defined in (2.2).
(4) Solve (1.3) to determine $d_k$ and set $\bar x_{k+1} = x_k + d_k$.
(5) Calculate $n(k)$ and $f_{l(k)}$, and compute $\hat r_k$ by (1.7). If $\hat r_k < \mu$, let $\alpha = \rho\alpha$ and go to Step 4.
(6) Let $x_{k+1} = \bar x_{k+1}$, $p = 0$, update $B_{k+1}$, set $k = k+1$ and go to Step 2.

Note that, similar to Chamberlain et al.
(1982), we solve the quadratic subproblem (1.3) inexactly by a Newton method, such that the following relation holds:

$$f_{l(k)} - f_k - m_k(d_k) \ge \beta \|g_k\| \min\left\{\alpha_k, \frac{\|g_k\|}{\|B_k\|}\right\} \qquad (2.6)$$

and it is well known that if the sequence $\{x_k\}$ is generated by a trust region algorithm, then

$$|f(x_k) - f(x_k + d_k) + m_k(d_k)| \le O(\|d_k\|^2)$$

Throughout this paper, we make the following assumptions in order to analyze the new algorithm:

Assumption 2.1.
(1) The objective function $f(x)$ is bounded below on $\mathbb{R}^n$ and $g(x) = \nabla f(x)$ is uniformly continuous on an open convex set $\Omega$ that contains the level set $L(x_0) = \{x \in \mathbb{R}^n \mid f(x) \le f(x_0)\}$, where $x_0 \in \mathbb{R}^n$ is given.
(2) $B_k$ is uniformly bounded, i.e., there exists a constant $M$ such that $\|B_k\| \le M$ for all $k$.

Remark 2.2. If we set $N = 0$ in the algorithm, the new algorithm reduces to Shi's adaptive trust region algorithm. Also, in the definition of $f_{l(k)}$ we set $n(k) = \min\{N, k\}$.

Remark 2.3. If $f(x)$ is twice continuously differentiable and the level set $L(x_0)$ is bounded, then Assumption 2.1(1) implies that $\nabla^2 f(x)$ is uniformly continuous and bounded on the open bounded convex set $\Omega$ containing $L(x_0)$. Hence there exists a constant $M_1 > 0$ such that $\|\nabla^2 f(x)\| \le M_1$, and by the Mean Value Theorem,

$$\|g(x) - g(y)\| \le M_1 \|x - y\|, \qquad \forall x, y \in \Omega$$

Now we can give some properties of the new algorithm.

Lemma 2.1. If $d_k$ is an optimal solution of subproblem (1.3) with respect to $\alpha_k \le s_k$, then for all $k$ we have

$$f_{l(k)} - f_k - m_k(d_k) \ge -m_k(\alpha_k q_k) \ge -\frac{1}{2}\alpha_k g_k^T q_k$$

Proof. Clearly $d = \alpha_k q_k$ is a feasible solution to (1.3), and $f_{l(k)} - f_k \ge 0$. So

$$f_{l(k)} - f_k - m_k(d_k) \ge -m_k(d_k) \ge -m_k(\alpha_k q_k) = -\left(\alpha_k g_k^T q_k + \frac{1}{2}\alpha_k^2 q_k^T B_k q_k\right) \ge -\left(\alpha_k g_k^T q_k + \frac{1}{2}\alpha_k^2 q_k^T \hat B_k q_k\right) \ge -\left(\alpha_k g_k^T q_k + \frac{1}{2}\alpha_k s_k q_k^T \hat B_k q_k\right) = -\frac{1}{2}\alpha_k g_k^T q_k$$

This completes the proof.

Lemma 2.2. Suppose that Assumption 2.1 holds.
Then Steps 5 and 6 of the new algorithm are well defined, i.e., the new algorithm does not cycle infinitely in the inner loop.

Proof. We first prove that (2.4) holds when $p$ is sufficiently large. Let $d_k^i$ be the solution of subproblem (1.3) corresponding to $p = i$ at $x_k$. It follows from (2.1), Remark 2.1 and Lemma 2.1 that

$$\left|\frac{f_{l(k)} - f(x_k + d_k^i)}{f_{l(k)} - f_k - m_k(d_k^i)} - 1\right| = \left|\frac{f_k - f(x_k + d_k^i) + m_k(d_k^i)}{f_{l(k)} - f_k - m_k(d_k^i)}\right| \le \frac{O(\|d_k^i\|^2)}{-\frac{1}{2}\alpha_k^{(i)} g_k^T q_k} \le \frac{O\big((\delta_k^{(i)})^2\big)}{\frac{1}{2}\tau \delta_k^{(i)} \|g_k\|} = \frac{O(\delta_k^{(i)})}{\frac{1}{2}\tau \|g_k\|} \to 0$$

where $\delta_k^{(i)} \to 0$ as $i \to \infty$. This implies that (2.4) eventually holds, so Steps 5 and 6 of the new algorithm are well defined and the inner loop terminates finitely.

Lemma 2.3. Suppose that Assumption 2.1 holds. Then the sequence $\{f_{l(k)}\}$ is monotonically non-increasing, and hence convergent.

Proof. If iteration $k + 1$ is a successful iteration, then we have

$$f_{l(k)} - f(x_k + d_k) \ge \mu(f_{l(k)} - f_k - m_k(\hat d_k)) \ge 0, \qquad \forall k$$

Therefore

$$f_{l(k)} \ge f_{k+1}, \qquad \forall k \qquad (2.7)$$

The rest of the proof is similar to Lemma 3.5 in Zhang et al. (2003).

3. Convergence Analysis

Trust region algorithms are among the most prominent methods for solving optimization problems and have strong convergence properties (see Conn et al., 2000; Fletcher, 1987; Moré, 1983; Nocedal and Wright, 2006; Powell, 1975, 1984; Schultz et al., 1985). In this section, we prove the global convergence of the new trust region algorithm.

Lemma 3.1. Suppose that the sequence $\{x_k\}$ is generated by the new algorithm. Then there exists a positive scalar $\bar c$ such that

$$\|d_k\| \le \bar c \|g_k\| \qquad (3.1)$$

Proof. From the definition of $\delta_k$, we have $\|d_k\| \le \rho^p s_k \|q_k\|$. By (2.5),

$$0 < \lambda_0 \|q_k\|^2 \le q_k^T \hat B_k q_k \qquad (3.2)$$

By (3.2) and the Cauchy–Schwarz inequality,

$$\|d_k\| \le \rho^p s_k \|q_k\| = -\rho^p \frac{g_k^T q_k}{q_k^T \hat B_k q_k}\|q_k\| \le -\rho^p \frac{g_k^T q_k}{\lambda_0 \|q_k\|} \le \frac{\rho^p}{\lambda_0}\|g_k\| = \bar c \|g_k\|$$

where $\bar c = \rho^p/\lambda_0$, so the proof is complete.

Lemma 3.2.
Suppose that the sequence $\{x_k\}$ is generated by the new algorithm. Then

$$\lim_{k\to\infty} \|d_k\| = 0 \qquad (3.3)$$

Proof. By Assumption 2.1(2) and Lemma 3.1, all conditions of Lemma 4.6 in Fu and Sun (2005) are satisfied, so the lemma can be proved in the same way.

Theorem 3.3. Suppose that Assumption 2.1 holds and $\epsilon = 0$. Then the new algorithm either stops at a stationary point of (1.1) or generates an infinite sequence $\{x_k\}$ such that

$$\lim_{k\to\infty} -\frac{g_k^T q_k}{\|q_k\|} = 0 \qquad (3.4)$$

Proof. Suppose, to the contrary, that the new algorithm generates a sequence $\{x_k\}$ for which (3.4) fails. Then there exist $\epsilon_0 > 0$ and an infinite subset $K \subseteq \{0, 1, 2, \ldots\}$ such that

$$-\frac{g_k^T q_k}{\|q_k\|} \ge \epsilon_0, \qquad \forall k \in K \qquad (3.5)$$

From Assumption 2.1(2), we know that there exists $M_0 > 0$ such that

$$\|\hat B_k\| \le M_0, \qquad \forall k \qquad (3.6)$$

thus

$$q_k^T \hat B_k q_k \le M_0 \|q_k\|^2, \qquad \forall k \qquad (3.7)$$

Now let $K_1 = \{k \in K \mid \alpha_k = s_k\}$ and $K_2 = \{k \in K \mid \alpha_k < s_k\}$, so that $K = K_1 \cup K_2$ is an infinite subset of $\{0, 1, 2, \ldots\}$. We show that neither $K_1$ nor $K_2$ can be infinite, which contradicts (3.5). Assume first that $K_1$ is an infinite subset of $K$. By Lemma 2.1 and (3.7), we have

$$f_{l(k)} - f(x_{k+1}) \ge \mu(f_{l(k)} - f_k - m_k(\hat d_k)) \ge -\frac{1}{2}\mu\alpha_k g_k^T q_k = -\frac{1}{2}\mu s_k g_k^T q_k = \frac{1}{2}\mu \frac{(g_k^T q_k)^2}{q_k^T \hat B_k q_k} \ge \frac{\mu}{2M_0}\left(\frac{g_k^T q_k}{\|q_k\|}\right)^2 \ge \frac{\mu}{2M_0}\epsilon_0^2, \qquad k \in K_1$$

So we have

$$f_{l(k)} - f(x_{k+1}) \ge \frac{\mu}{2M_0}\epsilon_0^2, \qquad k \in K_1 \qquad (3.8)$$

Now set $R_k = k + N + 2$. By the definition of $f_{l(k)}$ and a recurrence relation, we can prove

$$x_{k+1} = x_{l(R_k)} - \sum_{j=1}^{l(R_k)-(k+1)} d_{l(R_k)-j}$$

Because $l(R_k) - (k+1) \le N + 1$ and $f(x)$ is uniformly continuous, Lemma 3.2 gives

$$\lim_{k\to\infty} f(x_{k+1}) = \lim_{k\to\infty} f\left(x_{l(R_k)} - \sum_{j=1}^{l(R_k)-(k+1)} d_{l(R_k)-j}\right) = \lim_{k\to\infty} f_{l(R_k)} \qquad (3.9)$$

Together with Lemma 2.3, letting $k \to \infty$ on the left-hand side of (3.8) yields

$$0 \ge \frac{\mu}{2M_0}\epsilon_0^2 \qquad (3.10)$$

This contradiction shows that $K_1$ cannot be an infinite subset of $K$.
Now let $k \in K_2$. Lemma 2.1 implies that

$$f_{l(k)} - f(x_{k+1}) \ge \mu(f_{l(k)} - f_k - m_k(\hat d_k)) \ge -\frac{1}{2}\mu\alpha_k g_k^T q_k = -\frac{1}{2}\mu\alpha_k \|q_k\| \frac{g_k^T q_k}{\|q_k\|} \ge \frac{1}{2}\mu\alpha_k \|q_k\| \epsilon_0$$

Using the same argument that gave (3.10) on the preceding inequality, we deduce

$$\lim_{k\to\infty} \delta_k = \lim_{k\to\infty} \alpha_k \|q_k\| = 0, \qquad k \in K_2 \qquad (3.11)$$

Now suppose that $\tilde d_k$ is an optimal solution of the subproblem

$$\min_{d\in\mathbb{R}^n} g_k^T d + \frac{1}{2} d^T B_k d, \qquad \|d\| \le \tilde\delta_k, \quad \tilde\delta_k = \delta_k/\rho$$

Then, by the definition of the algorithm, the following inequality holds:

$$\frac{f_{l(k)} - f(x_k + \tilde d_k)}{f_{l(k)} - f_k - m_k(\tilde d_k)} < \mu, \qquad k \in K_2 \qquad (3.12)$$

Hence (3.11) implies that

$$\lim_{k\to\infty} \tilde\delta_k = 0, \qquad k \in K_2 \qquad (3.13)$$

So, by Lemma 2.1, (3.5) and (3.13), we have

$$\left|\frac{f_{l(k)} - f(x_k + \tilde d_k)}{f_{l(k)} - f_k - m_k(\tilde d_k)} - 1\right| = \left|\frac{f_k - f(x_k + \tilde d_k) + m_k(\tilde d_k)}{f_{l(k)} - f_k - m_k(\tilde d_k)}\right| \le \frac{O(\|\tilde d_k\|^2)}{-\frac{1}{2}\tilde\alpha_k g_k^T q_k} \le \frac{O(\tilde\delta_k^2)}{-\frac{1}{2}\tilde\delta_k\, g_k^T q_k/\|q_k\|} \le \frac{O(\tilde\delta_k)}{\frac{1}{2}\epsilon_0} \to 0 \quad \text{as } k \to \infty$$

where $k \in K_2$. Therefore

$$\frac{f_{l(k)} - f(x_k + \tilde d_k)}{f_{l(k)} - f_k - m_k(\tilde d_k)} \ge \mu \qquad (3.14)$$

But for sufficiently large $k \in K_2$, (3.14) contradicts (3.12). Therefore there exists no infinite subset of $K$ on which (3.5) holds, and the proof is complete.

Theorem 3.4. Suppose that the conditions of Theorem 3.3 hold and $q_k$ satisfies (2.1). Then the new algorithm either stops finitely or generates an infinite sequence $\{x_k\}$ such that

$$\lim_{k\to\infty}\|g_k\| = 0$$

Proof. If the algorithm stops finitely, the theorem is proved. Otherwise, by Theorem 3.3 the algorithm generates an infinite sequence $\{x_k\}$ satisfying (3.4), and since $q_k$ satisfies (2.1) we have

$$0 \le \tau\|g_k\| \le -\frac{g_k^T q_k}{\|g_k\|\cdot\|q_k\|}\|g_k\| = -\frac{g_k^T q_k}{\|q_k\|} \to 0, \qquad k\to\infty \qquad (3.15)$$

Therefore $\lim_{k\to\infty}\|g_k\| = 0$, and the proof is complete.

4. Convergence Rate Analysis

In this section, we first prove that, under suitable conditions, the new algorithm has both superlinear and quadratic convergence rates; then a second-order necessary condition is investigated.
In this section, we need $q_k = -B_k^{-1} g_k$ to satisfy (2.1). For this purpose, we make the following additional assumption:

Assumption 4.1. The condition number of $B_k$ is uniformly bounded.

Remark 4.1. Nocedal and Wright (2006) showed that if $B_k$ is a positive definite matrix with uniformly bounded condition number, then $q_k = -B_k^{-1} g_k$ satisfies (2.1).

Theorem 4.1. Suppose that Assumptions 2.1 and 4.1 hold, $q_k$ satisfies (2.1), and the new algorithm generates an infinite sequence $\{x_k\}$ with $x_k \to x^*$ as $k \to \infty$; $H(x) = \nabla^2 f(x)$ is continuous in a neighborhood $N(x^*, \epsilon)$ of $x^*$; and $H(x)$ and $B_k$ are uniformly positive definite matrices such that

$$\lim_{k\to\infty}\frac{\|[B_k - H(x^*)]q_k\|}{\|q_k\|} = 0 \qquad (4.1)$$

where $q_k = -B_k^{-1} g_k$. Then the sequence $\{x_k\}$ converges to $x^*$ superlinearly.

Proof. For sufficiently large $k$, we have $\hat B_k = B_k$, and clearly $s_k = 1$ and $\delta_k = \rho^p \|q_k\|$, so $\hat d_k = q_k$ is a feasible solution of the subproblem for $p = 0$. By (4.1), we have

$$\lim_{k\to\infty}\frac{\|[B_k - H(x^*)]\hat d_k\|}{\|\hat d_k\|} = \lim_{k\to\infty}\frac{\|g_k + H(x^*)\hat d_k\|}{\|\hat d_k\|} = 0$$

Hence $[B_k - H(x^*)]\hat d_k = o(\|\hat d_k\|)$ and $[H_k - H(x^*)]\hat d_k = o(\|\hat d_k\|)$, so we can write $[B_k - H_k]\hat d_k = o(\|\hat d_k\|)$ and

$$\hat d_k = -H(x^*)^{-1} g_k + o(\|\hat d_k\|)$$

Thus

$$\|\hat d_k\| \le \|H(x^*)^{-1}\| \cdot \|g_k\| + o(\|\hat d_k\|)$$

Therefore, we have

$$\frac{1}{\|H(x^*)^{-1}\|} \le \frac{\|g_k\|}{\|\hat d_k\|} + \frac{o(\|\hat d_k\|)}{\|\hat d_k\|} \qquad (4.2)$$

Theorem 3.4 implies $\|g_k\| \to 0$ as $k \to \infty$. Since the left-hand side of the preceding inequality is a strictly positive constant, it follows that $\|\hat d_k\| \to 0$ as $k \to \infty$.
By Lemma 2.1 and $-g_k^T q_k = q_k^T B_k q_k$, we have

$$f_{l(k)} - f_k - m_k(\hat d_k) \ge -\frac{1}{2}\alpha_k g_k^T q_k \ge \frac{q_k^T B_k q_k}{2} \qquad (4.3)$$

Using (4.1), (4.2) and a Taylor expansion, we have

$$|f_k - f(x_k + \hat d_k) + m_k(\hat d_k)| \le o(\|\hat d_k\|^2)$$

So from (4.3) and the fact that $q_k = \hat d_k = -B_k^{-1} g_k$, we obtain

$$\left|\frac{f_{l(k)} - f(x_k + \hat d_k)}{f_{l(k)} - f_k - m_k(\hat d_k)} - 1\right| = \left|\frac{f_k - f(x_k + \hat d_k) + m_k(\hat d_k)}{f_{l(k)} - f_k - m_k(\hat d_k)}\right| \le \frac{o(\|\hat d_k\|^2)}{\frac{1}{2}q_k^T B_k q_k}$$

which, by the uniform positive definiteness of $B_k$ and $\|q_k\| = \|\hat d_k\|$, is of order $o(\|\hat d_k\|^2)/\|\hat d_k\|^2 \to 0$ as $k \to \infty$. This implies that (2.4) holds, so $x_{k+1} = x_k + \hat d_k$ for sufficiently large $k$, and the new trust region algorithm reduces to the standard quasi-Newton algorithm. Since under (4.1) quasi-Newton methods converge superlinearly, the new method converges superlinearly.

Theorem 4.2. Suppose that Assumptions 2.1 and 4.1 hold, $q_k$ satisfies (2.1), the new algorithm generates an infinite sequence $\{x_k\}$ with $x_k \to x^*$ as $k \to \infty$, $H(x)$ is Lipschitz continuous and uniformly positive definite on a neighborhood $N(x^*, \epsilon)$ of $x^*$, $B_k = H(x_k)$ and $q_k = -B_k^{-1} g_k$. Then the sequence $\{x_k\}$ converges to $x^*$ quadratically.

Proof. $B_k = H(x_k)$ implies that all conditions of Theorem 4.1 hold, so as in Theorem 4.1 we can prove that $q_k = \hat d_k \to 0$ as $k \to \infty$. From Lemma 2.1 we have

$$f_{l(k)} - f_k - m_k(\hat d_k) \ge -\frac{1}{2}\alpha_k g_k^T q_k \ge \frac{q_k^T H_k q_k}{2}$$

Now, as in Theorem 4.1, we can prove that (2.4) holds. So the new algorithm reduces to the standard Newton algorithm for sufficiently large $k$; hence the new method converges quadratically.

Theorem 4.3. Suppose that Assumption 2.1 holds, $B_k = H_k$, and the infinite sequence $\{x_k\}$ is generated by the new algorithm. If the sequence $\{x_k\}$ converges to $x^*$, then $H(x^*)$ is a positive semi-definite matrix, i.e., $x^*$ satisfies the second-order necessary condition.

Proof. Let $\lambda_k^1$ and $\lambda^*$ be the smallest eigenvalues of $H_k$ and $H(x^*)$, respectively.
Let $z_k$ be a normalized eigenvector of $H_k$ corresponding to the eigenvalue $\lambda_k^1$ with $z_k^T g_k \le 0$; then $H_k z_k = \lambda_k^1 z_k$. Suppose that $H(x^*)$ is not positive semi-definite; then $\lambda^* < 0$ and thus $\lambda_k^1 < 0$ for sufficiently large $k$. Because $\alpha_k \|q_k\| \cdot \|z_k\| = \delta_k$, it follows that $\alpha_k \|q_k\| z_k$ is a feasible solution to (1.3). Therefore

$$-m_k(\alpha_k \|q_k\| z_k) \ge -\frac{1}{2}\alpha_k^2 \|q_k\|^2 z_k^T B_k z_k = -\frac{1}{2}\alpha_k^2 \|q_k\|^2 z_k^T H_k z_k = -\frac{1}{2}\alpha_k^2 \|q_k\|^2 \lambda_k^1 \qquad (4.4)$$

By (4.4) and $\hat r_k \ge \mu$, we have

$$f_{l(k)} - f_{k+1} \ge -\mu m_k(\alpha_k \|q_k\| z_k) \ge -\frac{1}{2}\mu\alpha_k^2 \|q_k\|^2 \lambda_k^1$$

Now, using the same argument as for deriving (3.11), and since $\lambda_k^1 \to \lambda^*$ as $k \to \infty$, we have

$$\lim_{k\to\infty}\delta_k = \lim_{k\to\infty}\alpha_k\|q_k\| = 0 \qquad (4.5)$$

Similar to the procedure for deriving (3.12) and (3.14), we obtain a contradiction. So $\lambda^* \ge 0$ and thus $H(x^*)$ is a positive semi-definite matrix.

5. Numerical Experiments

In this section, we present computational results to illustrate the performance of the new trust region method in comparison with other versions of trust region methods. All test problems are selected from Moré, Garbow and Hillstrom (1981); the list of test problems is presented in Table 1. Programs were run in double precision MATLAB 7.4 on a 3.0 GHz Intel Pentium IV Windows XP PC with 1 GB RAM. In all algorithms, we update $B_k$ by the BFGS formula, and the stopping criterion is $\|g_k\| \le 10^{-8}$. We choose $\mu = 0.1$ and $N = 2n$, where $n$ is the dimension of the test problem. The quadratic subproblems are solved inexactly by the nearly exact solution algorithm (Algorithm 7.3.5 in Toint, 1997).

Table 1. List of test functions.
Helical valley function; Biggs EXP6 function; Gaussian function; Powell badly scaled function; Box three-dimensional function; Variably dimensioned function; Watson function; Penalty function I; Penalty function II; Brown badly scaled function; Brown and Dennis function; Gulf research and development function; Trigonometric function; Extended Rosenbrock function; Extended Powell singular function; Beale function; Wood function; Chebyquad function.

For a proper comparison, we compare the new algorithm with Zhang's non-monotone adaptive trust region algorithm (2003), NMATR-Z, and Fu et al.'s non-monotone adaptive trust region algorithm (2005), NMATR-S. We use the performance profile proposed by Dolan and Moré (2002). Let $S$ be the set of all algorithms. If $f_{p,s}$ is the performance index (any of the established measures) of algorithm $s$ on problem $p$, then the performance ratio is defined by

$$r_{p,s} = \frac{f_{p,s}}{\min\{f_{p,s} : s \in S\}} \qquad (5.1)$$

if algorithm $s$ converges for problem $p$, and $r_{p,s} = r_{fail}$ otherwise, in which $r_{fail}$ must be strictly larger than any performance ratio (5.1). For any factor $\tau$, the overall performance of algorithm $s$ is given by

$$\rho_s(\tau) = \frac{1}{n_p}\varphi_s(\tau) \qquad (5.2)$$

where $n_p$ is the number of test problems and $\varphi_s(\tau)$ is the number of problems for which $r_{p,s} \le \tau$. In fact, $\rho_s(\tau)$ is the probability for algorithm $s \in S$ that the performance ratio $r_{p,s}$ is within a factor $\tau \in \mathbb{R}$ of the best possible ratio, and $\rho_s(\tau)$ is the distribution function of the performance ratio. In particular, $\rho_s(1)$ gives the probability that algorithm $s$ wins over all other algorithms, and $\lim_{\tau \to r_{fail}} \rho_s(\tau)$ gives the probability that algorithm $s$ solves a problem. So the performance profile can serve as a measure of the robustness of algorithms. A performance profile uses one index, but we have three measures: $n_i$, $n_f$ and $n_g$.
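The profile (5.1)–(5.2) can be computed directly from a cost table. Below is a Python sketch under our own conventions: rows are problems, columns are solvers, and `inf` marks a failure, playing the role of $r_{fail}$:

```python
import numpy as np

def performance_profile(costs, taus):
    """rho_s(tau) of Dolan and More (2002): the fraction of problems on
    which solver s is within a factor tau of the best solver."""
    costs = np.asarray(costs, float)
    best = costs.min(axis=1, keepdims=True)   # best cost on each problem
    ratios = costs / best                     # r_{p,s} of (5.1)
    return np.array([[np.mean(ratios[:, s] <= t) for t in taus]
                     for s in range(costs.shape[1])])

# Two solvers on three problems; solver 0 fails on the third problem.
table = [[2.0, 1.0],
         [3.0, 6.0],
         [np.inf, 5.0]]
profile = performance_profile(table, taus=[1.0, 2.0])
# profile[s] = [rho_s(1), rho_s(2)]; rho_s(1) is the win rate of solver s.
```

Since a failed run has an infinite ratio, it is never counted for any finite $\tau$, which is exactly the robustness reading of the profile's right-hand tail.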
Given this, we could use a performance profile for each of these indexes separately or combine the measures. Since a gradient evaluation is more expensive than a function evaluation, we use the index

$$n_c = n_f + 5 n_g$$

In Fig. 1, we can see all test problems solved by each solver. For nearly 50% of the problems the new algorithm has the better result, and for roughly $\tau \ge 1.5$ all problems are solved by the new algorithm. Hence the new algorithm is more robust, because its performance profile grows faster than the other profiles: in the cases where the new method is not better than the other algorithms, its performance index is close to the minimum performance index. One of the most important factors in benchmarking iterative methods is the CPU time of the algorithms. In Table 1, we present the total CPU time for each algorithm, which shows that the new algorithm has the best result; for a better comparison we use the performance profile, with the results illustrated in Fig. 2. There we observe that in nearly 60% of the cases the new method is faster than the other non-monotone adaptive trust region algorithms considered.

Fig. 1. Performance profile for $n_f + 5 n_g$.

Fig. 2. Performance profile for CPU time (NMATR-S, NMATR-Z, NMATR-B).

In the cases where the new algorithm does not have the best time, it is very close to the minimum time. We conclude that all of the considered algorithms are efficient and robust; however, among these methods, the new non-monotone adaptive trust region algorithm is the best in terms of efficiency and robustness.

6.
Conclusion

In this paper, we combined the non-monotone strategy with Shi and Guo's (2008) adaptive trust region method to propose a new non-monotone trust region method with adaptive radius. In this method, the trust region radius is adjusted automatically according to the current iterative information and is computed by a simple formula. Different choices of $q_k$ give different trust region methods: a natural choice is $q_k = -g_k$, and another is $q_k = -B_k^{-1} g_k$. Theoretical analysis shows that the new method is globally convergent and, if $q_k = -B_k^{-1} g_k$, has superlinear and quadratic convergence rates. Finally, we provided numerical results indicating that the new method is robust and efficient for solving unconstrained optimization problems.

References

Chamberlain, RM, MJD Powell, C Lemarechal and HC Pedersen (1982). The watchdog technique for forcing convergence in algorithms for constrained optimization. Mathematical Programming Study, 16, 1–17.
Conn, AR, NIM Gould and PhL Toint (2000). Trust Region Methods. Philadelphia: SIAM.
Deng, NY, Y Xiao and FJ Zhou (1993). Non-monotonic trust region algorithm. Journal of Optimization Theory and Applications, 76, 259–285.
Dolan, E and JJ Moré (2002). Benchmarking optimization software with performance profiles. Mathematical Programming, 91, 201–213.
Fletcher, R (1987). Practical Methods of Optimization. New York: John Wiley and Sons.
Fu, JH and WY Sun (2005). Nonmonotone adaptive trust-region method for unconstrained optimization problems. Applied Mathematics and Computation, 163, 489–504.
Grippo, L, F Lampariello and S Lucidi (1986). A nonmonotone line search technique for Newton's method. SIAM Journal on Numerical Analysis, 23, 707–716.
Grippo, L, F Lampariello and S Lucidi (1989). A truncated Newton method with nonmonotone line search for unconstrained optimization. Journal of Optimization Theory and Applications, 60, 401–419.
Moré, JJ (1983).
Recent developments in algorithms and software for trust region methods. In Bachem, A, M Grötschel and B Korte (eds.), Mathematical Programming: The State of the Art, pp. 258–287. Berlin: Springer.
Moré, JJ, BS Garbow and KE Hillstrom (1981). Testing unconstrained optimization software. ACM Transactions on Mathematical Software, 7, 17–41.
Nocedal, J and SJ Wright (2006). Numerical Optimization. New York: Springer.
Powell, MJD (1975). Convergence properties of a class of minimization algorithms. In Mangasarian, OL, RR Meyer and SM Robinson (eds.), Nonlinear Programming, Vol. 2, pp. 1–27. New York: Academic Press.
Powell, MJD (1984). On the global convergence of trust region algorithms for unconstrained optimization. Mathematical Programming, 29, 297–303.
Sartenaer, A (1997). Automatic determination of an initial trust region in nonlinear programming. SIAM Journal on Scientific Computing, 18(6), 1788–1803.
Schultz, GA, RB Schnabel and RH Byrd (1985). A family of trust-region-based algorithms for unconstrained minimization with strong global convergence. SIAM Journal on Numerical Analysis, 22, 47–67.
Shi, ZJ and JH Guo (2008). A new trust region method for unconstrained optimization. Journal of Computational and Applied Mathematics, 213, 509–520.
Toint, PhL (1996). An assessment of nonmonotone line search techniques for unconstrained optimization. SIAM Journal on Scientific Computing, 17, 725–739.
Toint, PhL (1997). Non-monotone trust-region algorithms for nonlinear optimization subject to convex constraints. Mathematical Programming, 77, 69–94.
Zhang, XS, JL Zhang and LZ Liao (2002). An adaptive trust region method and its convergence. Science in China, 45, 620–631.
Zhang, XS, JL Zhang and LZ Liao (2003). A nonmonotone adaptive trust region method and its convergence. Computers and Mathematics with Applications, 45, 1469–1477.
Keyvan Amini is an Associate Professor at the Department of Mathematics, Razi University (Kermanshah, Iran). He received his BS degree from Razi University in 1996, and his MS and PhD degrees from Sharif University of Technology (Tehran, Iran). His research focuses on nonlinear programming techniques, especially trust region methods. He also works on interior point methods for linear programming. He has published papers in Computers and Mathematics with Applications, ANZIAM Journal, Bulletin of the Australian Mathematical Society, Journal of Computational and Applied Mathematics, Applied Mathematics and Computation, Acta Mathematica Sinica, and Southeast Asian Bulletin of Mathematics.

Masoud Ahookhosh received his BS and MS degrees from Razi University, Kermanshah, Iran. His research focuses on nonlinear programming techniques, especially trust region methods. He has published papers in Computers and Mathematics with Applications, Numerical Algorithms, and Applied Mathematical Modelling.