Non-Linear Programming - A Basic Introduction
The aim of this new book series is to publish research studies and articles that bring up the latest developments and research in applied mathematics and its applications in the manufacturing and management sciences. Mathematical tools and techniques are the strength of the engineering sciences. They form the common foundation of all novel disciplines as engineering evolves and develops. The series will include a comprehensive range of applied mathematics and its applications in engineering areas such as optimization techniques, mathematical modelling and simulation, stochastic processes and systems engineering, safety-critical system performance, system safety, system security, high assurance software architecture and design, mathematical modelling in the environmental safety sciences, finite element methods, differential equations, reliability engineering, etc.
Linear Transformation
Examples and Solutions
Nita H. Shah and Urmila B. Chaudhari
Non-Linear Programming
A Basic Introduction
Nita H. Shah and Poonam Prakash Mishra
MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks
does not warrant the accuracy of the text or exercises in this book. This book’s use or discussion of
MATLAB® software or related products does not constitute endorsement or sponsorship by The
MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.
First edition published 2021
by CRC Press
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
© 2021 Nita H. Shah and Poonam Prakash Mishra
CRC Press is an imprint of Taylor & Francis Group, LLC
The right of Nita H. Shah and Poonam Prakash Mishra to be identified as authors of this work has been
asserted by them in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
Reasonable efforts have been made to publish reliable data and information, but the author and publisher
cannot assume responsibility for the validity of all materials or the consequences of their use. The
authors and publishers have attempted to trace the copyright holders of all material reproduced in
this publication and apologize to copyright holders if permission to publish in this form has not been
obtained. If any copyright material has not been acknowledged please write and let us know so we may
rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information storage
or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com or
contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used
only for identification and explanation without intent to infringe.
Library of Congress Cataloging‑in‑Publication Data
Names: Shah, Nita H., author. | Mishra, Poonam Prakash, author.
Title: Non-linear programming: a basic introduction /
Nita H. Shah and Poonam Prakash Mishra. Description: First edition. |
Boca Raton, FL: CRC Press, an imprint of Taylor & Francis Group, LLC, 2021. |
Series: Mathematical engineering, manufacturing, and management sciences |
Includes bibliographical references and index.
Identifiers: LCCN 2020040934 (print) | LCCN 2020040935 (ebook) |
ISBN 9780367613280 (hardback) | ISBN 9781003105213 (ebook)
Subjects: LCSH: Nonlinear programming.
Classification: LCC T57.8 .S53 2021 (print) |
LCC T57.8 (ebook) | DDC 519.7/6–dc23
LC record available at https://lccn.loc.gov/2020040934
LC ebook record available at https://lccn.loc.gov/2020040935
ISBN: 978-0-367-61328-0 (hbk)
ISBN: 978-1-003-10521-3 (ebk)
Contents
Preface vii
Acknowledgement ix
Author/Editor Biographies xi
Index 69
Preface
Optimization is the act of utilizing given resources in the best possible way. We use this concept, knowingly or unknowingly, in all aspects of life. Therefore, the term "optimization" has a wide range of applications in almost all fields, such as the basic sciences, engineering and technology, business management, medical science, defence, etc. An optimization algorithm is a procedure that is executed iteratively, comparing various solutions until an optimum or a satisfactory solution is found. With the advent of computers, optimization has become a part of computer-aided activity. Two types of algorithms are widely used today: deterministic and stochastic algorithms.
In order to understand and explore this concept, we need to look at the mathematical formulation behind it. Mathematically, an optimization problem consists of a function, better known as the objective function, that needs to be optimized (maximized or minimized). There are decision variables (design variables) on which the value of the objective function depends. A problem can be posed with or without constraints. If the objective function as well as all the constraints are linear, then the problem comes under the category of linear programming problems (LPP); if the objective function or any of the constraints happens to be non-linear, the problem is called a non-linear programming problem (NLP). The focus of this book is on NLP only. It is obvious that solving non-linear problems is more difficult compared with LPP. The choice of method for solving an NLP depends on many parameters, such as the number of decision variables, the concavity of the function, the presence of constraints, equality or inequality constraints, and lastly the overall complexity of the objective function in terms of continuity, smoothness, and differentiability. This book proposes a well-synchronized and self-guided approach for beginners to understand and solve different types of NLPs. It presents algorithms with their basic ideas and appropriate illustrations for the better understanding of readers. The language and approach are simple, to cater to the needs of undergraduates, postgraduates, and even research scholars in the formulation and solution of their research problems. We have also mentioned MATLAB® syntax for using the inbuilt functions of MATLAB to solve different NLPs.
In this book we discuss only non-linear programming (NLP). Many conventional methods are available in the literature for optimization, but they are still unable to handle all kinds of problems. So, researchers are continuously involved in developing new methods, better known as optimization algorithms. The chapters of the book are as follows:
Chapter 1 discusses NLP for unimodal functions of a single variable without constraints. Here we discuss both conventional gradient-based methods and search algorithms for unimodal functions. These approaches act as the foundation for multivariable problems. We have also compared the various approaches available to obtain a solution and illustrate them briefly. Chapter 2 takes the reader to the next level of multivariable NLP problems, still without constraints. Here also we demonstrate different approaches to solve this set of problems. This chapter also covers limitations of the different methods, which the reader needs to take care of while applying them.
Chapter 3 allows the reader to understand the most complex problems, combining non-linearity and multiple variables in the presence of constraints. Here different conventional methods for equality and inequality constraints are explained. As this is the most complex form, much of the time real-world problems cannot be addressed with conventional approaches. Therefore, some of the widely accepted modern stochastic search algorithms are also mentioned in this chapter.
Chapter 4 helps the reader understand the applicability of the above methods in different areas of the pure sciences, engineering and technology, management, finance, etc. This part includes the formulation of real-world problems into mathematical form that can be solved by any of the appropriate methods, allowing readers to use these concepts widely in their research work. It is possible to write programs for all the algorithms mentioned in Chapters 1, 2, and 3 using the C language or MATLAB. But MATLAB also comes with built-in functions that can simply be called with the appropriate syntax to approach most NLPs. These inbuilt functions are discussed with their syntax for the convenience of readers.
Acknowledgement
First and foremost, I would like to thank the Almighty for giving me all the strength and knowledge to undertake and complete the writing of this book successfully. I would like to thank Prof. Nita H. Shah from the bottom of my heart for being my mentor and guide since my Ph.D. tenure. She has always been a lighthouse for my career and research activities. She has played the role of a friend, philosopher, and guide in my life. During the writing of this book she has helped me as a mentor as well as a co-author.
I am very grateful to the management of PDPU, SoT, Director Prof. S. Khanna, and Academic Director Prof. T. P. Singh, for giving me full support and the mental space to write this book successfully. I would also like to extend my gratitude to my departmental colleagues for supporting me consistently.
I sincerely express my gratitude to my parents, Mrs. Kanchan L. Pandey and the late G. N. Pandey, for their generous blessings. Lastly, I would like to extend my special thanks to my true strength, my husband Mr. Prakash Mishra and my beloved daughter Aarushi Mishra, for giving me all the love and affection to cherish my goals in life.
Author/Editor Biographies
Prof. Nita H. Shah received her Ph.D. in Statistics from Gujarat University in 1994. Since February 1990 she has headed the Department of Mathematics at Gujarat University, India. She has been a post-doctoral visiting research fellow at the University of New Brunswick, Canada. Prof. Shah's research interests include inventory modeling in supply chains, robotic modeling, mathematical modeling of infectious diseases, image processing, dynamical systems and their applications, etc. She has published 13 monographs, 5 textbooks, and 475+ peer-reviewed research papers, and has prepared four edited books for IGI Global and Springer with Dr. Mandeep Mittal as co-editor. Her papers are published in high-impact Elsevier, Inderscience, and Taylor and Francis journals. She is the author of 14 books. According to Google Scholar, her total number of citations is over 3070 and the maximum number of citations for a single paper is over 174; her H-index is 24 (as of March 2020) and her i-10 index is 74. She has guided 28 Ph.D. students and 15 M.Phil. students, and seven students are currently pursuing research for their Ph.D. degrees. She has travelled to the USA, Singapore, Canada, South Africa, Malaysia, and Indonesia to give talks. She is Vice-President of the Operational Research Society of India and a council member of the Indian Mathematical Society.
Dr. Poonam Prakash Mishra completed her Ph.D. in mathematics in 2010. She also holds a master's degree in business administration with a specialization in operations management. Her core research area is the modelling and formulation of inventory and supply chain management. She is also interested in the mathematical modelling of real-world problems with stochastic optimization. Beyond supply chain problems, she has applied the concepts of modelling and optimization in various fields, such as crude oil exploration, sea-ice route optimization, and the impact of wind power forecasts on the revenue insufficiency issue of electricity markets. She has successfully guided three students to their Ph.D. degrees. She has more than 40 journal publications and 8 book chapters in reputed international journals. She has successfully completed a funded project from SAC-ISRO and is working on other proposals.
Presently, she is working on Remote Sensing Investigation of Parameters that Affect Glacial Lake Outburst Floods (GLOF). She is a faculty member in Mathematics at the School of Technology, Pandit Deendayal Petroleum University.
1 One-Dimensional Optimization Problem
1.1 INTRODUCTION
In this section, we discuss the different methods available to optimize (min-
imize/maximize) the given function with only one variable. Methods used for
one- dimensional optimization are highly useful for multivariable optimization.
Firstly, we see the methods that can be used for unimodal functions. Unimodal
functions are those functions that has only one peak or valley (in the given
domain). Mathematically, function f ( x ) is unimodal if (i ) x1 < x2 < x* implies that
f ( x2 ) < f ( x1 ) and (ii ) x2 > x1 > x* implies that f ( x1 ) < f ( x2 ) , where x* is the min-
imum point. Figure 1.1 is a mind map that can help to explore available methods for
these set of problems.
Necessary condition: For a point x0 to be a local extremum (local maximum or minimum) of a function y = f(x) defined in the interval a ≤ x ≤ b, the first derivative of f(x) must exist as a finite number at x = x0, and f′(x0) = 0.
FIGURE 1.1 Tree diagram of available methods for one-variable optimization.

Algorithm:
1. Compute f′(x) and solve f′(x0) = 0 to locate a stationary point x0.
2. Compute f″(x).
3. Compute f″(x0).
4. Declare x0 a local minimum if f″(x0) > 0, or a local maximum if f″(x0) < 0.
Example:
If the total revenue (R) and total cost (C) functions of a firm are given by R = 30x − x² and C = 20 + 4x, where x is the output, what is the maximum profit?
Solution
Let the profit function be P = R − C:
P(x) = (30x − x²) − (20 + 4x) = −x² + 26x − 20
P′(x) = −2x + 26 = 0 ⇒ x = 13, using the necessary condition.
P″(x) = −2 < 0, so by the sufficient condition P(x) attains its maximum value at x = 13.
Hence, the maximum profit is P(13) = Rs. 149.
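The derivative test above can be sketched in plain Python; the derivative P′(x) = −2x + 26 is worked out by hand, as in the solution:

```python
def profit(x):
    # P(x) = R(x) - C(x) = (30x - x^2) - (20 + 4x)
    return (30 * x - x**2) - (20 + 4 * x)

def dprofit(x):
    # P'(x) = -2x + 26; the necessary condition sets this to zero
    return -2 * x + 26

x_star = 13.0          # root of P'(x) = 0
# P''(x) = -2 < 0, so the sufficient condition gives a maximum.
print(profit(x_star))  # maximum profit: 149.0
```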
Algorithm:
1. Start with an initial guess x0 and a step size s.
2. Evaluate f(x0) and f(x0 + s).
3. If f(x0 + s) < f(x0), the function is still decreasing; set x0 = x0 + s and repeat step 2.
4. If f(x0 + s) > f(x0), the minimum has been passed and lies between the last two points. Stop, or restart with a smaller step size inside this interval.
Example:
Find the minimum of f(x) = x(x − 2.5), taking the initial point as 1.01 and the step size as 0.1, using the unrestricted search algorithm.
Solution:
Let us calculate functional values starting at x = 1.01 and taking the step length as 0.1, as per the algorithm.
This process shows that the minimum lies between 1.21 and 1.31. Either of these points can be chosen as the minimum, or the algorithm can be repeated with a smaller step size in the interval (1.21, 1.31).
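The stepping procedure can be sketched as follows (a minimal Python version; the function, start point, and step size are those of the example):

```python
def unrestricted_search(f, x0, step):
    # Walk forward while f keeps decreasing; f finally increases
    # between the two returned points, which bracket the minimum.
    x_prev, x = x0, x0 + step
    while f(x) < f(x_prev):
        x_prev, x = x, x + step
    return x_prev, x

f = lambda x: x * (x - 2.5)
lo, hi = unrestricted_search(f, 1.01, 0.1)   # roughly (1.21, 1.31)
```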
Algorithm:
1. Divide the given interval (xF, xL) into (n + 1) equal subintervals, giving "n" interior points.
2. Find the functional values at all (n + 2) points (the n interior points plus the two endpoints).
3. Find a point xj such that f(xj−1) > f(xj) < f(xj+1).
4. Declare xj as the approximate minimum.
Example:
Find the minimum of f = x(x − 2.5) in the interval (1, 1.4) using exhaustive search
technique.
i      1        2        3        4        5        6        7        8        9
xi     1.00     1.05     1.10     1.15     1.20     1.25     1.30     1.35     1.40
f(xi)  −1.5000  −1.5225  −1.5400  −1.5525  −1.5600  −1.5625  −1.5600  −1.5525  −1.5400

Since f(x5) = f(x7), the minimum lies between x5 = 1.20 and x7 = 1.30. The middle of these values, x6 = 1.25, can be considered an appropriate approximation.
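A sketch of the exhaustive search in Python (n = 7 interior points reproduces the nine-point grid of the table above):

```python
def exhaustive_search(f, xF, xL, n):
    # n interior points plus the two endpoints: n + 2 evaluations
    h = (xL - xF) / (n + 1)
    xs = [xF + i * h for i in range(n + 2)]
    fs = [f(x) for x in xs]
    j = fs.index(min(fs))            # f(x_{j-1}) > f(x_j) < f(x_{j+1})
    return xs[max(j - 1, 0)], xs[min(j + 1, n + 1)]

f = lambda x: x * (x - 2.5)
lo, hi = exhaustive_search(f, 1.0, 1.4, 7)   # bracket around x = 1.25
```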
Algorithm:
1. Choose a small δ > 0. Place two points symmetrically about the middle of the current interval (a, b): x1 = (a + b)/2 − δ/2 and x2 = (a + b)/2 + δ/2.
2. If f(x1) < f(x2), discard (x2, b); otherwise discard (a, x1).
3. Repeat on the reduced interval of uncertainty until it is sufficiently small.
Example:
Find the minimum for f(x) = x(x − 2.5) in the interval (1, 1.4) using dichotomous
search.
Here, f(x) = x(x − 2.5). Let us find x1 and x2 using the above-mentioned formula, taking δ = 0.001.
Iteration-1
x1 = (xF + xL)/2 − δ/2 = 2.4/2 − 0.0005 = 1.1995
x2 = (xF + xL)/2 + δ/2 = 2.4/2 + 0.0005 = 1.2005
Now, f1 = −1.55995, f2 = −1.56005. Since f2 < f1, the interval (xF, x1) = (1, 1.1995) can be discarded. So, the next interval of uncertainty is (x1, xL) = (1.1995, 1.4).
Iteration-2
x3 = (x1 + xL)/2 − δ/2 = 2.5995/2 − 0.0005 = 1.29925
x4 = (x1 + xL)/2 + δ/2 = 2.5995/2 + 0.0005 = 1.30025
Now, f3 = −1.5600, f4 = −1.5599. Since f3 < f4, the interval (x4, xL) = (1.30025, 1.4) can be discarded. So, the next interval of uncertainty is (x1, x4) = (1.1995, 1.30025).
Let us find x5 and x6.
Iteration-3
x5 = (x1 + x4)/2 − δ/2 = 2.49975/2 − 0.0005 = 1.249375
x6 = (x1 + x4)/2 + δ/2 = 2.49975/2 + 0.0005 = 1.250375
Now, f5 = −1.5624996, f6 = −1.5624999. Since f6 < f5, the interval (x1, x5) = (1.1995, 1.249375) can be discarded. So, the next interval of uncertainty is (x5, x4) = (1.249375, 1.30025).
So, the minimum lies in the interval (1.249375, 1.30025) at the end of three iterations. The midpoint 1.2748 can be considered as the required approximation.
The interval of uncertainty reduces in this way after every iteration:
L1 = (1/2)(L0 + δ), L2 = (1/2²)(L0 + δ) + δ/2, L3 = (1/2³)(L0 + δ) + δ/2² + δ/2, ...
In general, after n function evaluations (n even),
Ln = L0/2^(n/2) + δ(1 − 1/2^(n/2))
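The iteration can be sketched in Python; the bracket and δ follow the worked example, and three iterations correspond to six function evaluations:

```python
def dichotomous_search(f, a, b, delta, n_iter):
    # Each iteration: two points delta apart about the midpoint,
    # then discard the half beyond the worse of the two.
    for _ in range(n_iter):
        mid = (a + b) / 2
        x1, x2 = mid - delta / 2, mid + delta / 2
        if f(x1) < f(x2):
            b = x2        # minimum lies to the left of x2
        else:
            a = x1        # minimum lies to the right of x1
    return a, b

f = lambda x: x * (x - 2.5)
a, b = dichotomous_search(f, 1.0, 1.4, 0.001, 3)
```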
Algorithm:
1. Choose the total number of function evaluations n and compute the Fibonacci numbers F0 = F1 = 1, Fk = Fk−1 + Fk−2.
2. Place the first two points at x1 = a + L* and x2 = b − L*, where L* = (Fn−2/Fn)L0 and L0 = b − a.
3. Compare the functional values and discard the subinterval beyond the worse point; the surviving interior point is reused in the next iteration, so only one new evaluation is needed per step.
4. Continue until all n evaluations are used; the final interval of uncertainty has length L0/Fn.
Example:
Find the minimum for f(x) = x(x − 5.3) in the interval [1, 4] using Fibonacci search
algorithm.
Solution
Here the function is f(x) = x(x − 5.3), L0 = [1, 4]. Let us assume n = 6.
Let us calculate x1 and x2 using x1 = a + L* and x2 = b − L*, where L* = (Fn−2/Fn)L0 = (F4/F6)L0 = (5/13)(3) = 1.153846.
x1 = a + L* = 1 + 1.153846 = 2.153846
x2 = b − L* = 4 − 1.153846 = 2.846154
The points now are: a = 1, x1 = 2.153846, x2 = 2.846154, b = 4.
f1 = −6.7763314, f2 = −6.9840236. Since f1 > f2, we will discard (a, x1) and the next interval of uncertainty is L1 = [x1, b] = [2.153846, 4].
Let us calculate x3 using x3 = x1 + (b − x2) = 2.153846 + (4 − 2.846154) = 3.307692.
The points now are: x1 = 2.153846, x2 = 2.846154, x3 = 3.307692, b = 4.
Here, f2 = −6.9840236, f3 = −6.5899412. Since f2 < f3, we will discard (x3, b) and the next interval of uncertainty is L2 = [x1, x3].
Let us calculate x4 using x4 = x1 + (x3 − x2) = 2.153846 + (3.307692 − 2.846154) = 2.615384.
The points now are: x1 = 2.153846, x4 = 2.615384, x2 = 2.846154, x3 = 3.307692.
Here, f4 = −7.0213017. Since f2 > f4, we will discard (x2, x3) and the new interval of uncertainty is L3 = [x1, x2].
Let us calculate x5 using x5 = x1 + (x2 − x4) = 2.153846 + (2.846154 − 2.615384) = 2.384616.
The points now are: x1 = 2.153846, x5 = 2.384616, x4 = 2.615384, x2 = 2.846154.
Here, f5 = −6.9520713. Since f5 > f4, we will discard (x1, x5) and the new interval of uncertainty is L4 = [x5, x2].
Let us calculate x6 using x6 = x5 + (x2 − x4) = 2.384616 + (2.846154 − 2.615384) = 2.615386.
The points now are: x5 = 2.384616, x4 = 2.615384, x6 = 2.615386, x2 = 2.846154.
Here, f6 = −7.0213019. Since f6 < f4, we will discard (x5, x4) and the final interval of uncertainty is [x4, x2], of length L6 = 0.23077.
Here, L6/L0 = 0.23077/3 = 0.0769233 (reduction ratio for n = 6).
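A compact Python sketch of the Fibonacci search. The n-th points coincide in exact arithmetic (as x6 nearly coincides with x4 above), so a final comparison keeps the half around the smaller of the last two values:

```python
def fibonacci_search(f, a, b, n):
    # Fibonacci search with n function evaluations; the final
    # interval of uncertainty has length (b - a) / F_n.
    F = [1, 1]
    while len(F) < n + 1:
        F.append(F[-1] + F[-2])
    L = (F[n - 2] / F[n]) * (b - a)   # offset of the first two points
    x1, x2 = a + L, b - L
    f1, f2 = f(x1), f(x2)
    for _ in range(n - 2):            # the remaining n - 2 evaluations
        if f1 < f2:                   # minimum lies left of x2
            b, x2, f2 = x2, x1, f1
            x1 = a + b - x2           # symmetric partner of the kept point
            f1 = f(x1)
        else:                         # minimum lies right of x1
            a, x1, f1 = x1, x2, f2
            x2 = a + b - x1
            f2 = f(x2)
    if f2 <= f1:                      # final tie-breaking comparison
        a = x1
    else:
        b = x2
    return a, b

f = lambda x: x * (x - 5.3)
a, b = fibonacci_search(f, 1.0, 4.0, 6)   # final length 3/13 ≈ 0.23077
```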
8
1 + b/a = a/b = γ ⇒ 1 + 1/γ = γ ⇒ γ² − γ − 1 = 0
The positive root of this equation is γ = 1.61803. This is the golden ratio, and we will use it in the golden section search method.
Algorithm:
1. Place two interior points at x1 = xF + L₂* and x2 = xL − L₂*, where L₂* = (1/γ²)L0 ≈ 0.382 L0.
2. If f1 > f2, discard [xF, x1]; if f1 < f2, discard [x2, xL].
3. In the reduced interval, one of the two interior points can be reused; place one new point so that the two interior points again divide the interval in the golden ratio.
4. Repeat until the interval of uncertainty is sufficiently small.
Example:
Find the minimum for f(x) = x(x − 5.3) in the interval [1, 4] using golden section
search algorithm.
Solution:
The initial interval of uncertainty is L0 = [1, 4]. Let us find L₂* = (1/γ²)L0 = 0.382(3) = 1.146.
Now, x1 = xF + L₂* = 1 + 1.146 = 2.146 and x2 = xL − L₂* = 4 − 1.146 = 2.854.
f1 = −6.7684, f2 = −6.9808. Since f1 > f2, we can discard [xF, x1] = [1, 2.146] and the next interval of uncertainty will be [x1, xL] = [2.146, 4].
The points now are: xF = 1, x1 = 2.146, x2 = 2.854, xL = 4.
TABLE 1.1
Comparison of various search techniques

Method                  Formula for Ln                           Ln for n = 4                Ln for n = 10
Exhaustive search       Ln = 2L0/(n + 1)                         Ln = (0.4)L0                Ln = (0.18182)L0
Dichotomous search      Ln = L0/2^(n/2) + δ(1 − 1/2^(n/2)),      Ln = (0.25)L0 + 0.0075      Ln = (0.03125)L0 + 0.0096875
                        δ = 0.01
Fibonacci search        Ln = L0/Fn                               Ln = (0.2)L0                Ln = (0.01124)L0
Golden section search   Ln = (0.618)^(n−1) L0                    Ln = (0.236)L0              Ln = (0.01315)L0
Let us calculate x3 using x3 = x1 + (xL − x2) = 2.146 + (4 − 2.854) = 3.292. Here, f3 = −6.610336 and f2 < f3. Hence, we will discard [x3, xL] = [3.292, 4] and the new interval of uncertainty is [x1, x3] = [2.146, 3.292]. The process can be continued to attain a better approximation.
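The golden section search can be sketched in Python; each iteration reuses one interior point, and 1/γ ≈ 0.618:

```python
import math

def golden_section_search(f, a, b, n_evals):
    # Interior points sit a fraction 1/gamma^2 ≈ 0.382 from each end.
    r = (math.sqrt(5) - 1) / 2            # 1/gamma ≈ 0.618
    x1, x2 = b - r * (b - a), a + r * (b - a)
    f1, f2 = f(x1), f(x2)
    for _ in range(n_evals - 2):
        if f1 < f2:                       # minimum lies left of x2
            b, x2, f2 = x2, x1, f1
            x1 = b - r * (b - a)
            f1 = f(x1)
        else:                             # minimum lies right of x1
            a, x1, f1 = x1, x2, f2
            x2 = a + r * (b - a)
            f2 = f(x2)
    return a, b

f = lambda x: x * (x - 5.3)
a, b = golden_section_search(f, 1.0, 4.0, 10)
```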
Table 1.1 shows a tabular comparison of the various search techniques in order to understand their relative efficiency. The efficiency of a search technique depends on the ratio Ln/L0, where L0 represents the original interval of uncertainty and Ln represents the reduced interval of uncertainty after n function evaluations. The comparison clearly shows that the Fibonacci and golden section searches reduce this ratio faster than the exhaustive and dichotomous search techniques. As the number of evaluations grows, the Fibonacci search proves to be slightly better than the golden section search.
Algorithm:
1. Initialize with xF, x1, xL such that xF < x1 < xL, where x1 can be the midpoint of the given interval [xF, xL].
2. Approximate the given function f(x) by a quadratic polynomial p(x) = a0 + a1x + a2x² using the following set of equations:
f(xF) = a0 + a1xF + a2xF²
f(x1) = a0 + a1x1 + a2x1²
f(xL) = a0 + a1xL + a2xL²
Solving these gives
a0 = [f(xF)x1xL(xL − x1) + f(x1)xLxF(xF − xL) + f(xL)x1xF(x1 − xF)] / [(xF − x1)(x1 − xL)(xL − xF)]
a1 = [f(xF)(x1² − xL²) + f(x1)(xL² − xF²) + f(xL)(xF² − x1²)] / [(xF − x1)(x1 − xL)(xL − xF)]
a2 = −[f(xF)(x1 − xL) + f(x1)(xL − xF) + f(xL)(xF − x1)] / [(xF − x1)(x1 − xL)(xL − xF)]
Obtain the optimal x using x* = −a1/(2a2).
3. Check whether |(f(x*) − p(x*))/f(x*)| < ε.
4. There will be four cases, based on the values x* and x1 and their functional values. Obtain the next interval of uncertainty as per the cases:
CASE 1: x* < x1 and f(x*) < f(x1); the new interval is [xF, x1]. Set xF = xF, x1 = x*, xL = x1.
CASE 2: x* < x1 and f(x*) > f(x1); the new interval is [x*, xL]. Set xF = x*, x1 = x1, xL = xL.
CASE 3: x* > x1 and f(x*) < f(x1); the new interval is [x1, xL]. Set xF = x1, x1 = x*, xL = xL.
CASE 4: x* > x1 and f(x*) > f(x1); the new interval is [xF, x*]. Set xF = xF, x1 = x1, xL = x*.
5. Continue to refine the interval of uncertainty till the desired approximation is achieved.
Example:
Find the minimum value of the function f(x) = 0.5 − x tan⁻¹(1/x) − 0.9/(1 + x²) using the quadratic interpolation method in the interval [0.45, 0.65].
Solution
Here xF = 0.45, x1 = 0.55, xL = 0.65. Let ε = 0.001.
Iteration-1:
Solve for p(x) = a0 + a1x + a2x² using the above set of equations; then
x* = −a1/(2a2) = 0.245938
|(f(x*) − p(x*))/f(x*)| = |(−0.675677 + 0.698499)/(−0.675677)| = 0.03377 > ε
Iteration 2:
x* = −a1/(2a2) = 0.405530
|(f(x*) − p(x*))/f(x*)| = |(−0.75366275 + 0.748452)/(−0.75366275)| = 0.00691 > ε. Since f1 < f*, the new interval will be [x*, xL] = [0.405530, 0.65].
Iteration 3:
x* = −a1/(2a2) = 1.31452/(2 × 1.08983) = 0.603084
|(f(x*) − p(x*))/f(x*)| = |(−0.78 + 0.780758)/(−0.78)| = 0.0009717 < ε
We have achieved the desired accuracy and can terminate the process. The minimizer obtained for the given function is x* = 0.603084.
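One refinement step can be sketched in Python; the coefficient formulas are those given in the algorithm, and math.atan stands in for tan⁻¹. Applied to the three starting points, the fitted vertex already lands close to the final answer:

```python
import math

def quadratic_fit_min(f, xF, x1, xL):
    # Fit p(x) = a0 + a1*x + a2*x^2 through three points and
    # return the vertex x* = -a1 / (2*a2).
    D = (xF - x1) * (x1 - xL) * (xL - xF)
    fF, f1, fL = f(xF), f(x1), f(xL)
    a1 = (fF * (x1**2 - xL**2) + f1 * (xL**2 - xF**2)
          + fL * (xF**2 - x1**2)) / D
    a2 = -(fF * (x1 - xL) + f1 * (xL - xF) + fL * (xF - x1)) / D
    return -a1 / (2 * a2)

f = lambda x: 0.5 - x * math.atan(1 / x) - 0.9 / (1 + x**2)
x_star = quadratic_fit_min(f, 0.45, 0.55, 0.65)
```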
1.4.1 Newton Method
Newton's method, also known as the Newton-Raphson method, is a root-finding method. Using this idea we can find a root of f′(x), and this point will be a local minimum of f(x) (provided f″ > 0 there). If the initial approximation is not close to x*, the method may diverge.
Let the quadratic approximation of the function f(x) about the current point xk, using Taylor's expansion, be
f(x) ≈ f(xk) + (x − xk)f′(xk) + ½(x − xk)²f″(xk)
Setting the derivative of this approximation to zero,
f′(x) ≈ f′(xk) + (x − xk)f″(xk) = 0 ⇒ xk+1 = xk − f′(xk)/f″(xk)
Algorithm:
Step 0: Choose an initial approximation x0 and a tolerance ε > 0; set k = 0.
Step 1: Compute xk+1 = xk − f′(xk)/f″(xk).
Step 2: If |f′(xk+1)| < ε, declare xk+1 as the optimal point; otherwise set k = k + 1 and repeat Step 1.
Example:
Find the minimum of the function f(x) = x⁴ − x³ + 5 using Newton's method. Take x0 = 1.
Solution:
We have f(x) = x⁴ − x³ + 5, so f′(x) = 4x³ − 3x² and f″(x) = 12x² − 6x.
First iteration: x1 = x0 − f′(x0)/f″(x0) = 1 − 1/6 = 0.833
Second iteration: x2 = x1 − f′(x1)/f″(x1) = 0.833 − 0.23037/3.32866 = 0.7638
Continuing in this way, the iterates converge to the minimizer of the given function, x* = 0.75.
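The iteration can be sketched in Python, with the derivatives worked out by hand as in the solution:

```python
def newton_min(df, d2f, x0, tol=1e-8, max_iter=50):
    # x_{k+1} = x_k - f'(x_k) / f''(x_k); stop when |f'| is small
    x = x0
    for _ in range(max_iter):
        x = x - df(x) / d2f(x)
        if abs(df(x)) < tol:
            break
    return x

# f(x) = x^4 - x^3 + 5:  f'(x) = 4x^3 - 3x^2,  f''(x) = 12x^2 - 6x
x_min = newton_min(lambda x: 4*x**3 - 3*x**2,
                   lambda x: 12*x**2 - 6*x, 1.0)
print(round(x_min, 4))   # 0.75
```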
1.4.2 Secant Method
This method is again used to approximate a root of f′(x), which is then a minimum of f(x). Here we use a secant line to approximate the root instead of a tangent.
Let us have two points a and b such that f′(a)·f′(b) < 0. Then we can evaluate the next approximation using the secant through (a, f′(a)) and (b, f′(b)) as
xk = b − f′(b)(b − a)/(f′(b) − f′(a))
For the next iteration, if f′(a)·f′(xk) < 0, then (a, f′(a)) and (xk, f′(xk)) form the new secant; otherwise f′(b)·f′(xk) < 0, and (b, f′(b)) and (xk, f′(xk)) form the new secant.
Continue the process till the desired accuracy is achieved.
Algorithm:
Step 0: Set [ a, b] (initial approximation) such that f ′(a ). f ′(b) < 0 , ε > 0, k = 0,
1, 2 … .
b−a
Step 1: Get xk = b − f ′ (b )
f ′ (b ) − f ′ ( a )
Step 2: If f ′(a ). f ′( xk ) < 0 then (a, f ′(a )) , ( xk , f ′( xk )) is new secant, otherwise
if f ′(b). f ′( xk ) < 0 then ( (b, f ′(b)) , ( xk , f ′( xk )) is new secant.
Step 3: If f ′( xk ) < ε declare xk as optimal point or repeat Step 1.
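A Python sketch of the procedure, applied to the Newton-method example above; the bracket [0.5, 1], chosen for illustration, satisfies f′(a)·f′(b) < 0:

```python
def secant_min(df, a, b, tol=1e-8, max_iter=100):
    # Root finding on f'(x) = 0 with a sign-preserving bracket
    for _ in range(max_iter):
        x = b - df(b) * (b - a) / (df(b) - df(a))
        if abs(df(x)) < tol:
            return x
        if df(a) * df(x) < 0:
            b = x    # root lies in (a, x)
        else:
            a = x    # root lies in (x, b)
    return x

# f(x) = x^4 - x^3 + 5:  f'(x) = 4x^3 - 3x^2 changes sign at x = 0.75
x_min = secant_min(lambda x: 4*x**3 - 3*x**2, 0.5, 1.0)
```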
TRY YOURSELF
Q1. Find the minimum of the function f ( x ) = x 5 − 5 x 3 − x + 25 using the following
methods:
(a) Unrestricted search technique using initial interval of uncertainty as
(0, 3)
(b) Dichotomous search technique using initial interval of uncertainty as
(0, 3). Take δ = 0.001
Q4. Find the maximum of the function f(x) = 6x/(x² − 3x + 5) using the following methods, taking the initial guess as x0 = 2.5. Comment on which method converges faster to the optimum.
(a) Quadratic interpolation method
(b) Cubic interpolation method
(c) Newton method
(d) Secant method
Answer: 2.236
2 Unconstrained Multivariable Optimization
2.1 INTRODUCTION
In this section, we will discuss several methods with their algorithm to solve uncon-
strained multivariable problems for optimization. We will also demonstrate working
of these algorithms with suitable examples. There are two approaches to find optimal
solution for multivariable problems. If function is smooth and can be derived gradient-
based methods are followed, which are also known as indirect methods. However,
if this is not the case then optimal point of the given function can be obtained by
different search algorithms popularly known as direct search methods. Tree dia-
gram in Figure 2.1 illustrates the different methods that are available for solving
multivariable optimization problems.
Minimize f ( x1 , x2 ...... xn )
x1 , x2 ...... xn
DIRECT SEARCH METHODS:
• Random Search Method
• Grid Search Method
• Univariate Search Method
• Pattern Search Algorithms: (i) Hooke-Jeeves Method, (ii) Powell's Method
• Simplex Method

INDIRECT METHODS (gradient-based methods):
• Using the Hessian Matrix
• Steepest Descent Method
• Newton's Method
• Quasi-Newton Method

FIGURE 2.1 Tree diagram of available methods for multivariable optimization problems.
example, a space with two design variables and four grid points per variable has 4² = 16 nodes, but a design space with, say, five design variables has 4⁵ = 1024 nodes. It is obvious that the computational cost is too high for problems with a large number of decision variables, so this method too is not an efficient way to find the optimal solution.
Algorithm:
Step 0: Choose a starting point x0 and a small probe length ε; set k = 0.
Step 1: Take the search direction dk as a coordinate direction, cycling through x1, x2, ..., xn.
Step 2: Evaluate f+ = f(xk + εdk) and f− = f(xk − εdk) to decide whether to move along +dk or −dk.
Step 3: Minimize f(xk ± Δk dk) with respect to the step length Δk to obtain Δk*.
Step 4: Set xk+1 = xk ± Δk* dk.
Step 5: Set the value of k = k + 1 and go to Step 1. Continue the process till the desired accuracy is achieved.
Example:
Find Min f(x1, x2) = x1 − x2 + 2x1² + 2x1x2 + x2² using the univariate method.
Solution:
Let us set x0 = (0, 0)ᵀ, ε = 0.01.
Step 1: let d0 = (1, 0)ᵀ.
Step 2: f0 = 0,
f+ = f(x0 + εd0) = f(0.01, 0) = 0.0102
f− = f(x0 − εd0) = f(−0.01, 0) = −0.0098
Since f− < f0, we move along −d0. Minimizing f(x0 − Δ0d0) = f(−Δ0, 0) = 2Δ0² − Δ0 gives Δ0* = 0.25, so x1 = (−0.25, 0)ᵀ with f1 = −0.125.
Step 5: now take d1 = (0, 1)ᵀ:
f(x1 + Δ1d1) = f(−0.25, Δ1) = (Δ1)² − 1.5Δ1 − 0.125
df/dΔ1 = 0 ⇒ Δ1* = 0.75
Thus x2 = x1 + Δ1*d1 = (−0.25, 0)ᵀ + 0.75(0, 1)ᵀ = (−0.25, 0.75)ᵀ
f2 = −0.6875
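The whole cycle can be sketched in Python; the inner one-dimensional minimization here is done numerically by golden section, which is an implementation choice, not part of the derivation above:

```python
def line_min(g, lo=-5.0, hi=5.0, iters=60):
    # golden-section minimization of a one-variable function on [lo, hi]
    r = 0.618033988749895
    a, b = lo, hi
    t1, t2 = b - r * (b - a), a + r * (b - a)
    g1, g2 = g(t1), g(t2)
    for _ in range(iters):
        if g1 < g2:
            b, t2, g2 = t2, t1, g1
            t1 = b - r * (b - a); g1 = g(t1)
        else:
            a, t1, g1 = t1, t2, g2
            t2 = a + r * (b - a); g2 = g(t2)
    return (a + b) / 2

def univariate_search(f, x, n_cycles=30):
    # minimize f by repeated line searches along the coordinate axes
    x = list(x)
    for _ in range(n_cycles):
        for i in range(len(x)):
            g = lambda t, i=i: f(x[:i] + [x[i] + t] + x[i+1:])
            x[i] += line_min(g)
    return x

f = lambda v: v[0] - v[1] + 2*v[0]**2 + 2*v[0]*v[1] + v[1]**2
x = univariate_search(f, [0.0, 0.0])   # tends to the minimizer (-1, 1.5)
```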
Algorithm:
Step 0: Choose a starting base point X(0) and perturbation lengths Δi for each variable; set k = 0.
Step 1 (exploratory move): perturb each variable in turn by ±Δi, keeping any change that reduces f; the result is the new base point.
Step 2 (pattern move): if the exploratory move succeeded, move further along the improving direction using Xp(k+1) = X(k) + (X(k) − X(k−1)) = 2X(k) − X(k−1), and explore around the pattern point.
Step 3: if no perturbation reduces f, reduce the Δi and repeat; stop when the Δi are sufficiently small.
Example:
Find the minimum of f(x, y) = (x² + y − 11)² + (x + y² − 7)². Let the initial approximation be X(0) = (x(0), y(0)) = (0, 0).
Solution:
1st Iteration
Let X(0) = (x(0), y(0)) = (0, 0), Δ1 = 0.5.
Let us go for the first exploratory move, in the x-direction.
Since f(0.5, 0) = 157.8 is less than f(0, 0) = 170, we accept the new point (x(0), y(0)) = (0.5, 0) and use it to move in the y-direction.
The pattern move uses the last two base points:
Xp(k+1) = X(k) + (X(k) − X(k−1)) = 2X(k) − X(k−1)
Xp(2) = 2(0.5, 0.5) − (0, 0) = (1, 1)
Now, XE(2) = (1.5, 1.5), and the exploratory and pattern moves continue in the same way.
Two search directions d(1) and d(2) are said to be conjugate with respect to a positive definite matrix Q if (d(1))ᵀQ d(2) = 0.
Let Q be an (n × n) square, positive definite, symmetric matrix. A set of n linearly independent search directions d(i) is called conjugate with respect to Q if
(d(i))ᵀQ d(j) = 0 for all i ≠ j, i = 1, 2, 3, ..., n, j = 1, 2, 3, ..., n.
A quadratic function can be written as q(x) = a + bᵀx + ½xᵀQx, where a is a scalar, b is a vector, and Q is an n × n matrix.
Algorithm:
Step 0: Choose a starting point x(0) and take the n coordinate directions as the initial set of search directions d(1), ..., d(n).
Step 1: Starting from the current point, minimize f along each direction in turn, each time moving to the minimizing point.
Step 2: Form the pattern direction joining the start of the cycle to its end point, replace one of the old directions by it, and minimize along it.
Step 3: Repeat the cycle with the updated direction set until the desired accuracy is achieved; for a quadratic function the directions become mutually conjugate and the minimum is reached in a finite number of cycles.
Example:
Find the minimum of f ( x1 , x2 ) = 2 x13 + 4 x1 x23 − 10 x1 x2 + x22 , x (0 ) = [5, 2]T
Solution
Step 0: Set d(1) = (1, 0)ᵀ, d(2) = (0, 1)ᵀ, x(0) = (5, 2)ᵀ.
Step 1: f(x(0)) = 314. We need to find the minimum of f(x(0) + λd(2)):
f((5, 2)ᵀ + λ(0, 1)ᵀ) = f(5, 2 + λ) = 20λ³ + 121λ² + 194λ + 314
df/dλ = 60λ² + 242λ + 194 = 0 ⇒ λ* = −1.1036, with d²f/dλ² = 120λ + 242 = 109.6 > 0 at λ = λ*.
x(1) = x(0) + λ*d(2) = (5, 2)ᵀ − 1.1036(0, 1)ᵀ = (5, 0.8964)ᵀ
Step 2: f(x(1)) = 220.39. We need to find the minimum of f(x(1) + λd(1)):
f((5, 0.8964)ᵀ + λ(1, 0)ᵀ) = f(5 + λ, 0.8964) = 2(5 + λ)³ − 6.0828(5 + λ) + 0.8035
df/dλ = 6(5 + λ)² − 6.0828 = 0 ⇒ λ* = −3.9931, with d²f/dλ² = 12(5 + λ) = 12.08 > 0 at λ = λ*.
x(2) = x(1) + λ*d(1) = (5, 0.8964)ᵀ − 3.9931(1, 0)ᵀ = (1.0069, 0.8964)ᵀ, f(x(2)) = −3.2796.
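The line minimizations used in this example can be sketched numerically. The inner golden-section search and its bracket [−2, 2] are assumptions chosen for illustration; the bracket keeps the cubic along the line unimodal:

```python
import math

def line_search(f, x, d, lo=-2.0, hi=2.0, iters=80):
    # minimize g(t) = f(x + t*d) over [lo, hi] by golden section
    g = lambda t: f([xi + t * di for xi, di in zip(x, d)])
    r = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    t1, t2 = b - r * (b - a), a + r * (b - a)
    g1, g2 = g(t1), g(t2)
    for _ in range(iters):
        if g1 < g2:
            b, t2, g2 = t2, t1, g1
            t1 = b - r * (b - a); g1 = g(t1)
        else:
            a, t1, g1 = t1, t2, g2
            t2 = a + r * (b - a); g2 = g(t2)
    t = (a + b) / 2
    return [xi + t * di for xi, di in zip(x, d)]

f = lambda v: 2*v[0]**3 + 4*v[0]*v[1]**3 - 10*v[0]*v[1] + v[1]**2
x1 = line_search(f, [5.0, 2.0], [0.0, 1.0])   # step along d(2)
```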
Algorithm:
Step 0: Set (n + 1) points to define the initial simplex. Set α ≥ 1, γ > 1, β ∈ (0, 1), and ε > 0 (for termination).
Step 1: From these (n + 1) points, identify the worst and best points; let xw and xb be the worst and best points.
Step 2: Calculate the centroid xc of the remaining n points:
xc = (1/n) Σ xi, the sum taken over i ≠ w.
Step 3: Obtain a new point through reflection of xw about xc: xr = (1 + α)xc − αxw.
Step 4: If f(xr) < f(xb), try an expansion xe = γxr + (1 − γ)xc and accept the better of xr and xe; if f(xr) ≥ f(xw), perform a contraction xnew = βxw + (1 − β)xc instead.
Step 5: Replace xw by the accepted point.
Step 6: Compute the convergence measure Q = [Σ(f(xi) − f(xc))²/(n + 1)]^(1/2); if Q < ε, terminate; otherwise go to Step 1.
Example:
Minimize f(x1, x2) = x1 − x2 + 2x1² + 2x1x2 + x2².
For the initial simplex consider X1 = (4, 4)ᵀ, X2 = (5, 4)ᵀ, X3 = (4, 5)ᵀ, and take α = 1.0, β = 0.5, γ = 2.0; set ε = 0.2.
Solution:
Iteration 1:
Step 1: Since there are two unknowns, we initially need three simplex points X1, X2, X3:
f(X1) = 80, f(X2) = 107, f(X3) = 96.
Then X1 = (4, 4)ᵀ is the best point whereas X2 = (5, 4)ᵀ is the worst point; therefore Xb = (4, 4)ᵀ and Xw = (5, 4)ᵀ.
Step 2: The centroid of the remaining points is Xc = ½(X1 + X3) = (4, 4.5)ᵀ, where f(Xc) = 87.75.
Step 3: The reflection point is
Xr = 2Xc − Xw = 2(4, 4.5)ᵀ − (5, 4)ᵀ = (3, 5)ᵀ, where f(Xr) = 71.
Step 4: As f(Xr) < f(Xb), we go for expansion:
Xe = 2Xr − Xc = 2(3, 5)ᵀ − (4, 4.5)ᵀ = (2, 5.5)ᵀ, where f(Xe) = 56.75.
Step 5: Since f(Xe) < f(Xb), we replace Xw by Xe.
Step 6: Calculate Q for convergence, which is 19.06; since Q > ε, we continue with the next iteration.
Iteration 2: We have X1 = (4, 4)ᵀ, X2 = (2, 5.5)ᵀ, X3 = (4, 5)ᵀ.
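One reflection/expansion step of the simplex method can be sketched in Python, with α = 1, γ = 2, β = 0.5 as in the example:

```python
def simplex_step(f, simplex, alpha=1.0, gamma=2.0, beta=0.5):
    # one iteration: reflect the worst point about the centroid,
    # then try expansion (or fall back to contraction)
    pts = sorted(simplex, key=f)            # best ... worst
    best, worst, rest = pts[0], pts[-1], pts[:-1]
    n = len(best)
    xc = [sum(p[i] for p in rest) / len(rest) for i in range(n)]
    xr = [(1 + alpha) * c - alpha * w for c, w in zip(xc, worst)]
    if f(xr) < f(best):                     # expansion
        xe = [gamma * r + (1 - gamma) * c for r, c in zip(xr, xc)]
        new = xe if f(xe) < f(best) else xr
    elif f(xr) < f(worst):
        new = xr                            # accept the reflection
    else:                                   # contraction toward centroid
        new = [beta * w + (1 - beta) * c for w, c in zip(worst, xc)]
    return rest + [new]

f = lambda v: v[0] - v[1] + 2*v[0]**2 + 2*v[0]*v[1] + v[1]**2
s = simplex_step(f, [[4.0, 4.0], [5.0, 4.0], [4.0, 5.0]])
```

Running it on the example's initial simplex replaces the worst point (5, 4) by the expansion point (2, 5.5), as obtained above.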
The gradient of f(x) is the vector of first partial derivatives,
∇f(x) = [∂f(x)/∂x1, ∂f(x)/∂x2, ..., ∂f(x)/∂xn]ᵀ
and the Hessian matrix H(x) is the n × n matrix of second partial derivatives whose (i, j) entry is ∂²f(x)/∂xi∂xj.
Example:
Determine the maximum of the function f(x1, x2) = x1 + 2x2 + x1x2 − x1² − x2².
Solution
The necessary condition for a local optimum is that the gradient vanishes:
∇f(x) = [∂f/∂x1, ∂f/∂x2]ᵀ = [1 + x2 − 2x1, 2 + x1 − 2x2]ᵀ = [0, 0]ᵀ
Solving these two equations gives the stationary point x0 = (4/3, 5/3).
The sufficient condition uses the Hessian matrix:
H(x) = [[−2, 1], [1, −2]]
Since H(x) is negative definite (its leading principal minors are −2 < 0 and det H = 3 > 0), the stationary point x0 = (4/3, 5/3) is a local maximum of f(x).
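A quick numerical check of the stationary point, in plain Python; the gradient and Hessian are entered by hand from the solution:

```python
def grad(x1, x2):
    # gradient of f(x1, x2) = x1 + 2*x2 + x1*x2 - x1**2 - x2**2
    return (1 + x2 - 2 * x1, 2 + x1 - 2 * x2)

x0 = (4/3, 5/3)          # stationary point from the 2x2 linear system
g = grad(*x0)            # both components vanish here

# Hessian H = [[-2, 1], [1, -2]]: leading minors -2 < 0 and det = 3 > 0,
# so H is negative definite and x0 is a local maximum.
det_H = (-2) * (-2) - 1 * 1
```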
Algorithm:
Step 0: Set x(0), k = 0.
Step 1: d(k) = −∇f(x(k)). If d(k) = 0, then stop.
Step 2: Find the step length αk that minimizes f(x(k) + αd(k)).
Step 3: Set x(k+1) = x(k) + αk d(k), k = k + 1, and go to Step 1.
Example:
Minimize f(x1, x2) = x1 − x2 + 2x1² + 2x1x2 + x2² by the steepest descent method, starting from x0 = (0, 0)ᵀ.
Solution:
For f(x1, x2) = x1 − x2 + 2x1² + 2x1x2 + x2², ∇f = [1 + 4x1 + 2x2, −1 + 2x1 + 2x2]ᵀ.
First iteration:
We have x(0) = (0, 0)^T, ∇f(x(0)) = (1, −1)^T ⇒ d(0) = (−1, 1)^T,
min_α f(x(0) + α d(0)) = min_α f(−α, α) = α^2 − 2α,
df/dα = 0 ⇒ α = 1. Now generate x(1) by x(k+1) = x(k) + α d(k), taking k = 0:
x(1) = x(0) + α d(0) = (0, 0)^T + 1·(−1, 1)^T = (−1, 1)^T. Since ∇f(x(1)) = (−1, −1)^T ≠ 0, we go for the
next iteration.
Second iteration: x(1) = (−1, 1)^T and ∇f(x(1)) = (−1, −1)^T ⇒ d(1) = (1, 1)^T.
The optimal step length is α = 0.2, so x(2) = x(1) + α d(1) = (−1, 1)^T + 0.2(1, 1)^T = (−0.8, 1.2)^T.
Since ∇f(x(2)) = (0.2, −0.2)^T ≠ 0, we continue the process till the desired accuracy is
achieved.
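Iterations like the two above can be automated. The sketch below (an illustration, not the book's MATLAB code) uses the closed-form exact step length alpha = (g^T g)/(d^T H d), which is valid here because f is quadratic with constant Hessian H = [[4, 2], [2, 2]]:

```python
# Steepest descent with an exact line search for
# f(x1, x2) = x1 - x2 + 2*x1^2 + 2*x1*x2 + x2^2 (stopping tolerance is an
# illustrative choice).

def grad(x):
    x1, x2 = x
    return [1 + 4*x1 + 2*x2, -1 + 2*x1 + 2*x2]

def steepest_descent(x, eps=1e-6, max_iter=100):
    H = [[4, 2], [2, 2]]                 # constant Hessian of this quadratic
    for _ in range(max_iter):
        g = grad(x)
        if (g[0]**2 + g[1]**2) ** 0.5 < eps:
            break
        d = [-g[0], -g[1]]
        # exact step length for a quadratic: alpha = (g^T g) / (d^T H d)
        dHd = sum(d[i] * H[i][j] * d[j] for i in range(2) for j in range(2))
        alpha = (g[0]**2 + g[1]**2) / dHd
        x = [x[0] + alpha*d[0], x[1] + alpha*d[1]]
    return x

x_star = steepest_descent([0.0, 0.0])
print(x_star)   # converges towards the minimizer (-1, 1.5)
```

The first two iterations reproduce α = 1 and α = 0.2 exactly as computed in the text.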
2.3.3 Newton’s Method
This method approximates the given function by a second-order Taylor approximation. That approximation is then optimized using the necessary and sufficient conditions of calculus for an optimal value.
f(x) = f(x0) + ∇f(x0)^T h + (1/2!) h^T H(x0) h, where x = x0 + h.
It can be represented in standard form as q(x) = (1/2) x^T H x + b^T x + c.
Necessary condition: ∇q(x) = 0 ⇒ Hx + b = 0 ⇒ x = −H^(−1) b. In terms of the original function this gives the Newton step x = x0 − H^(−1) ∇f(x0).
Sufficient condition: ∇²q(x) = H, so f(x) has a minimum at the stationary point if H is
positive definite.
Algorithm:
Step 0: Set x0 ∈ R^n, k = 0, and ε (a very small quantity).
Step 1: Compute x(k+1) = x(k) − H_k^(−1) ∇f(x(k)).
Step 2: If ||∇f(x(k+1))|| < ε, terminate the process; otherwise set k = k + 1 and go to Step 1.
Example:
Determine the minimum of the given function f(x1, x2) = x1 − x2 + 2x1^2 + 2x1x2 + x2^2 using Newton's method with initial guess x0 = (0, 0)^T.
Solution:
For the given f(x1, x2) = x1 − x2 + 2x1^2 + 2x1x2 + x2^2,
∂f/∂x1 = 1 + 4x1 + 2x2, ∂²f/∂x1² = 4, ∂f/∂x2 = −1 + 2x1 + 2x2, ∂²f/∂x2² = 2, and ∂²f/∂x1∂x2 = 2.
First iteration: x0 = (0, 0)^T.
H0 = [4 2; 2 2], H0^(−1) = (1/4)[2 −2; −2 4] = [1/2 −1/2; −1/2 1], and ∇f0 = (1, −1)^T.
x1 = x0 − H0^(−1) ∇f(x0) = (0, 0)^T − (1, −3/2)^T = (−1, 3/2)^T.
Now ∇f(x1) = (0, 0)^T.
We can terminate the process here since the gradient is 0. Thus the minimum is at (−1, 3/2)^T.
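The single Newton step can be checked in a few lines of Python (a sketch; the inverse Hessian is hard-coded from the calculation above):

```python
# One Newton step for f(x1, x2) = x1 - x2 + 2*x1^2 + 2*x1*x2 + x2^2; because f
# is quadratic, x1 = x0 - H^{-1} grad f(x0) lands exactly on the minimizer.

def newton_step(x):
    g = [1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]]
    # H = [[4, 2], [2, 2]], det H = 4, so H^{-1} = [[0.5, -0.5], [-0.5, 1.0]]
    Hinv = [[0.5, -0.5], [-0.5, 1.0]]
    return [x[0] - (Hinv[0][0]*g[0] + Hinv[0][1]*g[1]),
            x[1] - (Hinv[1][0]*g[0] + Hinv[1][1]*g[1])]

print(newton_step([0.0, 0.0]))   # [-1.0, 1.5]
```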
2.3.4 Quasi-Newton Method
This method is a good alternative to Newton's method. It computes the search direction using only first derivatives, whereas Newton's method uses the Hessian matrix for the same purpose. This method basically uses an approximation of the Hessian matrix and thus avoids computing second derivatives at every iteration.
Algorithm:
Step 1: Compute the Newton direction d_k = −H_k g_k, where H_k and g_k are the approximate inverse Hessian matrix and the gradient, respectively.
Step 2: Compute the new approximation x_{k+1} = x_k + α_k d_k.
Step 3: Compute g_{k+1} = ∇f(x_{k+1}).
Step 4: Update the approximate inverse Hessian matrix using
H_{k+1} = update(H_k, x_{k+1} − x_k, g_{k+1} − g_k)
Writing H_{k+1} = H_k + u v^T and requiring H_{k+1} p_k = q_k, where p_k = x_{k+1} − x_k and q_k = g_{k+1} − g_k, gives
u (v^T p_k) = q_k − H_k p_k ⇒ u = (1/(v^T p_k)) (q_k − H_k p_k).
Choosing v = q_k − H_k p_k yields the symmetric rank-one update H_{k+1} = H_k + (1/(v^T p_k)) v v^T.
Sherman–Morrison Formula: (A + u v^T)^(−1) = A^(−1) − (1/(1 + v^T A^(−1) u)) A^(−1) u v^T A^(−1).
To update the Hessian approximation, H_{k+1} is chosen to minimize ||H_{k+1} − H_k|| subject to H_{k+1} being positive definite:
W ≈ H ⇐ Broyden–Fletcher–Goldfarb–Shanno (BFGS) method
W ≈ H^(−1) ⇐ Davidon–Fletcher–Powell (DFP) method
These two approaches are widely used to update the Hessian matrix approximation.
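A compact sketch of the quasi-Newton idea (BFGS, inverse-Hessian form) on the same quadratic is given below. It is an illustration, not the book's code: the exact line search uses the true Hessian as a shortcut, which is only possible here because f is quadratic; p_k and q_k follow the notation of the algorithm above.

```python
def grad(x):
    return [1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]]

def bfgs(x, eps=1e-8, max_iter=50):
    H = [[4, 2], [2, 2]]              # true Hessian, used only for the exact
                                      # line search (possible: f is quadratic)
    W = [[1.0, 0.0], [0.0, 1.0]]      # initial approximation of H^{-1}
    g = grad(x)
    for _ in range(max_iter):
        if (g[0]**2 + g[1]**2) ** 0.5 < eps:
            break
        # Step 1: quasi-Newton direction d = -W g
        d = [-(W[0][0]*g[0] + W[0][1]*g[1]), -(W[1][0]*g[0] + W[1][1]*g[1])]
        # Step 2: exact step length for a quadratic
        dHd = sum(d[i]*H[i][j]*d[j] for i in range(2) for j in range(2))
        a = -(g[0]*d[0] + g[1]*d[1]) / dHd
        x_new = [x[0] + a*d[0], x[1] + a*d[1]]
        # Step 3: new gradient; Step 4: BFGS update of W from p and q
        g_new = grad(x_new)
        p = [x_new[0] - x[0], x_new[1] - x[1]]
        q = [g_new[0] - g[0], g_new[1] - g[1]]
        rho = 1.0 / (q[0]*p[0] + q[1]*p[1])
        A = [[(1.0 if i == j else 0.0) - rho*p[i]*q[j] for j in range(2)]
             for i in range(2)]
        AW = [[sum(A[i][k]*W[k][j] for k in range(2)) for j in range(2)]
              for i in range(2)]
        # W <- (I - rho p q^T) W (I - rho q p^T) + rho p p^T
        W = [[sum(AW[i][k]*A[j][k] for k in range(2)) + rho*p[i]*p[j]
              for j in range(2)] for i in range(2)]
        x, g = x_new, g_new
    return x

x_star = bfgs([0.0, 0.0])
print([round(v, 6) for v in x_star])   # [-1.0, 1.5]
```

With an exact line search on a quadratic, BFGS terminates in at most n iterations; here it reaches the minimizer in two steps.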
TRY YOURSELF
Q1. Find the minimum of f(x1, x2) = 3x1^2 + x2^2 − 10 using (a) the Univariate Method and (b) the Hooke–Jeeves Method, with initial approximation [6, 2].
Answer: [0, 0]
Q2. Minimize the objective function f(x, y) = x − y + 2x^2 + 2xy + y^2 using 5 iterations of (a) Newton's method and (b) Steepest Descent, with starting value x0 = [0, 0]^T. Plot the values of the iterates for each method on the same graph. Which method seems to be more efficient?
Answer: [−1, 1.5]
3 Constrained
Multivariable
Optimization
3.1 INTRODUCTION
We have already discussed methods for optimization of single variable and
multivariable without constraint in Chapter 1 and Chapter 2, respectively. Actually,
most of the real world problems that are required to be optimized have constraints.
Now, it is high time to discuss optimization of multivariable functions with constraints.
The general structure of a multivariable optimization problem with constraints is given below:
Optimize Z = f(x1, x2, ..., xn)
subject to h_i(x1, x2, ..., xn) {≤ or = or ≥} b_i; i = 1, 2, ..., m
This set of problems can be further divided into problems with equality constraints and problems with inequality constraints. There are some conventional methods available for both sets of problems, but not all problems can be solved using these methods, owing to their complexity. Sometimes, conventional methods get stuck at a local optimum instead of the global optimum. So, we have another set of methods known as stochastic search techniques. These methods are search algorithms inspired by natural phenomena such as evolution, natural selection, animal behaviour, or natural laws. In this chapter, we will discuss methods like the genetic algorithm, particle swarm optimization, simulated annealing, and Tabu search. For the sake of convenience, we will first discuss the conventional methods, followed by the stochastic search techniques.
hi ( x1 , x2 ,..., xn ) = bi ; i = 1, 2,..., m
Z = f ( x)
and gi ( x ) = hi ( x ) − bi ; bi is constant
Here it is assumed that m < n to get the solution.
There are various methods for solving the above defined problem. But in this
section, we shall discuss only two methods: (i) Direct Substitution Method and
(ii) Lagrange Multiplier Method.
Example:
Find the optimum solution of the following constrained multivariable problem:
Minimize Z = x1^2 + (x2 + 1)^2 + (x3 − 1)^2
subject to x1 + 5x2 − 3x3 = 6
and x1, x2, x3 ≥ 0
Solution
Since the given problem has three variables and one equality constraint, any one of
the variables can be removed from Z with the help of the equality constraint. Let
us choose variable x3 to be eliminated from Z. Then, from the equality constraint,
we have:
x3 = (x1 + 5x2 − 6)/3
Substituting this into Z gives
Z or f(x) = x1^2 + (x2 + 1)^2 + (1/9)(x1 + 5x2 − 9)^2
Setting ∇f(x) = (∂f/∂x1, ∂f/∂x2)^T = 0, that is,
∂Z/∂x1 = 2x1 + (2/9)(x1 + 5x2 − 9) = 0
∂Z/∂x2 = 2(x2 + 1) + (10/9)(x1 + 5x2 − 9) = 0
Solving these simultaneous equations gives x1 = 0.4 and x2 = 1, and then x3 = (0.4 + 5 − 6)/3 = −0.2 from the constraint.
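The two stationarity equations above form a 2 × 2 linear system; the sketch below (not from the text) clears the fractions and solves it by Cramer's rule.

```python
# Solving the two stationarity equations; multiplying the first by 9/2 and the
# second by 9/2 and collecting terms gives:
#   20*x1 + 10*x2 = 18
#   10*x1 + 68*x2 = 72
a11, a12, b1 = 20, 10, 18
a21, a22, b2 = 10, 68, 72
det = a11*a22 - a12*a21          # 20*68 - 10*10 = 1260
x1 = (b1*a22 - a12*b2) / det     # (18*68 - 10*72)/1260 = 0.4
x2 = (a11*b2 - b1*a21) / det     # (20*72 - 18*10)/1260 = 1.0
x3 = (x1 + 5*x2 - 6) / 3         # back-substitute into the constraint; x3 is
                                 # approximately -0.2
print(x1, x2, x3)
```

Note that the computed x3 is negative, so the non-negativity restriction on x3 is not satisfied at this stationary point of the reduced problem.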
Optimize Z = f(x)
subject to g_i(x) = h_i(x) − b_i = 0, i = 1, 2, ..., m and m ≤ n; x ∈ E^n.
The necessary conditions for a function to have a local optimum at the given points
can be extended to the case of a general problem with n variables and m equality
constraints.
Multiply each constraint with an unknown variable λ i (i = 1, 2,..., m) and subtract
each from the objective function f ( x ) to be optimized. The new objective function
now becomes:
L(x, λ) = f(x) − Σ_{i=1}^{m} λ_i g_i(x); x = (x1, x2, ..., xn)^T
The stationary points of L(x, λ) satisfy the necessary conditions for the given constrained optimum of f(x), provided the matrix of partial derivatives [∂g_i/∂x_j] has rank m at the point of optimum.
The necessary conditions for an optimum (max or min) of L(x, λ), and hence of f(x), are the following m + n equations in the m + n unknowns (x1, x2, ..., xn; λ1, λ2, ..., λm):
∂L/∂x_j = ∂f/∂x_j − Σ_{i=1}^{m} λ_i ∂g_i/∂x_j = 0; j = 1, 2, ..., n
∂L/∂λ_i = −g_i(x) = 0; i = 1, 2, ..., m
That is, for L(x, λ) = f(x) − Σ_{i=1}^{m} λ_i g_i(x), we require ∂L/∂x_j = 0 and ∂L/∂λ_i = 0 for all i and j.
In vector form,
∇L(x, λ) = ∇f(x) − Σ_{i=1}^{m} λ_i ∇g_i(x) = 0
and g_i(x) = 0, i = 1, 2, ..., m.
Then the sufficient condition for an extreme point x to be a local minimum (or
local maximum) of f ( x ) subject to the constraints gi ( x ) = 0, (i = 1, 2,..., m) is that the
determinant of the matrix (also called the Bordered Hessian matrix)
D = | Q    H^T |
    | H    0   |
of order (m + n) × (m + n), where
Q = [∂²L(x, λ)/∂x_i∂x_j]_{n×n} and H = [∂g_i(x)/∂x_j]_{m×n}.
1. The extreme point gives the maximum value of the objective function if, starting with the principal minor of order (2m + 1), the signs of the last (n − m) principal minors alternate, starting with the (−1)^(m+1) sign.
2. The extreme point gives the minimum value of the objective function if, starting with the principal minor of order (2m + 1), the last (n − m) principal minors all have the same sign, of the (−1)^m type.
Example:
Solve the following problem by using the method of Lagrangian multipliers:
Minimize Z = x1^2 + x2^2 + x3^2
subject to x1 + x2 + 3x3 = 2, 5x1 + 2x2 + x3 = 5,
and x1, x2, x3 ≥ 0.
Solution:
The Lagrangian function is
L(x, λ) = x1^2 + x2^2 + x3^2 − λ1(x1 + x2 + 3x3 − 2) − λ2(5x1 + 2x2 + x3 − 5)
Setting ∂L/∂x_j = 0 gives 2x1 = λ1 + 5λ2, 2x2 = λ1 + 2λ2 and 2x3 = 3λ1 + λ2, which together with the two constraints yield λ1 = 2/23, λ2 = 7/23 and the stationary point (x1, x2, x3) = (37/46, 8/23, 13/46).
To see that this solution corresponds to the minimum of Z, apply the sufficient condition with the help of the bordered matrix:
D = | 2 0 0 1 5 |
    | 0 2 0 1 2 |
    | 0 0 2 3 1 |
    | 1 1 3 0 0 |
    | 5 2 1 0 0 |
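The first-order conditions of this example form a 5 × 5 linear system in (x1, x2, x3, λ1, λ2). The sketch below (an illustration, not the book's code) solves it exactly with Gauss–Jordan elimination over rational numbers:

```python
# Rows 1-3: 2*xj - l1*dg1/dxj - l2*dg2/dxj = 0; rows 4-5: the two constraints.
from fractions import Fraction as F

A = [[2, 0, 0, -1, -5],
     [0, 2, 0, -1, -2],
     [0, 0, 2, -3, -1],
     [1, 1, 3,  0,  0],
     [5, 2, 1,  0,  0]]
b = [0, 0, 0, 2, 5]

# Gauss-Jordan elimination with exact arithmetic
n = 5
M = [[F(A[i][j]) for j in range(n)] + [F(b[i])] for i in range(n)]
for c in range(n):
    piv = next(r for r in range(c, n) if M[r][c] != 0)
    M[c], M[piv] = M[piv], M[c]
    M[c] = [v / M[c][c] for v in M[c]]
    for r in range(n):
        if r != c and M[r][c] != 0:
            M[r] = [M[r][j] - M[r][c]*M[c][j] for j in range(n + 1)]

x1, x2, x3, l1, l2 = (M[i][n] for i in range(n))
print(x1, x2, x3)   # 37/46 8/23 13/46
```

The exact fractions confirm the stationary point quoted above, with λ1 = 2/23 and λ2 = 7/23.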
Optimize Z = f(x)
subject to g(x) = h(x) − b = 0; x = (x1, x2, ..., xn)^T ≥ 0.
Multiply each constraint by Lagrange multiplier λ and subtract it from the objective
function. The new unconstrained objective function (Lagrange function) becomes:
L ( x, λ ) = f ( x ) − λ g ( x )
∂L/∂x_j = ∂f/∂x_j − λ ∂g/∂x_j = 0; j = 1, 2, ..., n
∂L/∂λ = −g(x) = 0
so that
λ = (∂f/∂x_j) / (∂g/∂x_j); j = 1, 2, ..., n
Δ_{n+1} = | 0        ∂g/∂x1       ∂g/∂x2       ...  ∂g/∂xn      |
          | ∂g/∂x1   ∂²L/∂x1²     ∂²L/∂x1∂x2   ...  ∂²L/∂x1∂xn  |
          | ∂g/∂x2   ∂²L/∂x2∂x1   ∂²L/∂x2²     ...  ∂²L/∂x2∂xn  |
          | :        :            :                 :           |
          | ∂g/∂xn   ∂²L/∂xn∂x1   ∂²L/∂xn∂x2   ...  ∂²L/∂xn²    |
where ∂²L/∂x_i∂x_j = ∂²f/∂x_i∂x_j − λ ∂²g/∂x_i∂x_j.
If the signs of the minors Δ3, Δ4, Δ5, ... are alternately positive and negative, the extreme point is a local maximum. If the signs of all the minors Δ3, Δ4, Δ5, ... are negative, the extreme point is a local minimum.
Example:
Use the method of Lagrangian multipliers to solve the following NLP problem. Does the solution maximize or minimize the objective function?
Optimize Z = 2x1^2 + x2^2 + 3x3^2 + 10x1 + 8x2 + 6x3 − 100
subject to x1 + x2 + x3 = 20
and x1, x2, x3 ≥ 0
Solution
The Lagrangian function can be formulated as:
L(x, λ) = 2x1^2 + x2^2 + 3x3^2 + 10x1 + 8x2 + 6x3 − 100 − λ(x1 + x2 + x3 − 20)
The necessary conditions ∂L/∂x1 = 4x1 + 10 − λ = 0, ∂L/∂x2 = 2x2 + 8 − λ = 0 and ∂L/∂x3 = 6x3 + 6 − λ = 0 give x1 = (λ − 10)/4, x2 = (λ − 8)/2 and x3 = (λ − 6)/6.
Putting the values of x1 , x2 and x3 in the last equation ∂L / ∂λ = 0 and solving for
λ, we get λ = 30. Substituting the value of λ in the other three equations, we get an
extreme point: ( x1 , x2 , x3 ) = (5,11, 4).
To check the sufficient condition, i.e., whether the extreme point gives a maximum or a minimum value of the objective function, we evaluate the last (n − m) principal minors as follows:
Δ3 = | 0        ∂g/∂x1        ∂g/∂x2     |     | 0 1 1 |
     | ∂g/∂x1   ∂²L/∂x1²      ∂²L/∂x1∂x2 |  =  | 1 4 0 |  =  −6
     | ∂g/∂x2   ∂²L/∂x2∂x1    ∂²L/∂x2²   |     | 1 0 2 |
where ∂²L/∂x_i∂x_j = ∂²f/∂x_i∂x_j − λ ∂²g/∂x_i∂x_j.
     | 0 1 1 1 |
Δ4 = | 1 4 0 0 |  =  −44
     | 1 0 2 0 |
     | 1 0 0 6 |
Since Δ3 and Δ4 are both negative, i.e., of the (−1)^m type with m = 1, the extreme point (x1, x2, x3) = (5, 11, 4) is a local minimum. At this point the value of the objective function is Z = 281.
For the problem
Maximize Z = f(x)
subject to g_i(x) ≤ 0, i = 1, 2, ..., m,
the inequality constraints are first converted into equalities using slack variables, g_i(x) + s_i^2 = 0, i = 1, 2, ..., m, and the Kuhn–Tucker necessary conditions are:
∂f/∂x_j − Σ_{i=1}^{m} λ_i ∂g_i/∂x_j = 0, j = 1, 2, ..., n
λ_i g_i(x) = 0,
g_i(x) ≤ 0,
λ_i ≥ 0, i = 1, 2, ..., m.
Example:
Maximize Z = 12x1 + 21x2 + 2x1x2 − 2x1^2 − 2x2^2 subject to the constraints
(i) x2 ≤ 8, (ii) x1 + x2 ≤ 10,
and x1, x2 ≥ 0.
Solution
Here f(x1, x2) = 12x1 + 21x2 + 2x1x2 − 2x1^2 − 2x2^2,
g1(x1, x2) = x2 − 8 ≤ 0,
g2(x1, x2) = x1 + x2 − 10 ≤ 0.
The Kuhn–Tucker conditions are:
(i) ∂f/∂x_j − Σ_{i=1}^{2} λ_i ∂g_i/∂x_j = 0, j = 1, 2;  (ii) λ_i g_i(x) = 0, i = 1, 2
That is,
12 + 2x2 − 4x1 − λ2 = 0,  λ1(x2 − 8) = 0
21 + 2x1 − 4x2 − λ1 − λ2 = 0,  λ2(x1 + x2 − 10) = 0
(iii) g_i(x) ≤ 0  (iv) λ_i ≥ 0, i = 1, 2
That is,
x2 − 8 ≤ 0
x1 + x2 − 10 ≤ 0
Case 1: λ1 = 0, λ2 = 0. Then condition (i) gives 12 + 2x2 − 4x1 = 0 and 21 + 2x1 − 4x2 = 0, i.e., x1 = 7.5, x2 = 9. This violates condition (iii), since x1 + x2 − 10 = 6.5 > 0, so it is discarded.
Case 2: λ1 ≠ 0, λ2 ≠ 0. Then condition (ii) gives
x2 − 8 = 0 or x2 = 8
x1 + x2 − 10 = 0 or x1 = 2
Substituting these values in condition (i), we get λ1 = −27 and λ 2 = 20. However, this
solution violates condition (iv) and therefore may be discarded.
Case 3: λ1 = 0, λ2 ≠ 0. Then from conditions (i) and (ii) we have:
x1 + x2 = 10
2x2 − 4x1 = −12 + λ2
2x1 − 4x2 = −21 + λ2
Solving these gives x1 = 4.25, x2 = 5.75 and λ2 = 6.5. All four Kuhn–Tucker conditions are now satisfied (x2 − 8 = −2.25 ≤ 0 and λ2 = 6.5 ≥ 0), so the optimal solution is x1 = 4.25, x2 = 5.75 with Z = 118.375.
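The case analysis above can be reproduced programmatically. The sketch below (an illustration, not the book's code) solves each case's linear system and filters the candidates through conditions (iii) and (iv):

```python
def f(x1, x2):
    return 12*x1 + 21*x2 + 2*x1*x2 - 2*x1**2 - 2*x2**2

def solve2(a11, a12, b1, a21, a22, b2):
    det = a11*a22 - a12*a21
    return ((b1*a22 - a12*b2)/det, (a11*b2 - b1*a21)/det)

def feasible(x1, x2, l1, l2):
    return (x2 - 8 <= 1e-9 and x1 + x2 - 10 <= 1e-9
            and l1 >= -1e-9 and l2 >= -1e-9)

cases = []
# Case 1: l1 = l2 = 0  ->  4*x1 - 2*x2 = 12, -2*x1 + 4*x2 = 21
x1, x2 = solve2(4, -2, 12, -2, 4, 21)
cases.append((x1, x2, 0.0, 0.0))
# Case 2: both constraints active  ->  x2 = 8, x1 = 2; read off l1, l2
x1, x2 = 2.0, 8.0
l2 = 12 + 2*x2 - 4*x1
l1 = 21 + 2*x1 - 4*x2 - l2
cases.append((x1, x2, l1, l2))
# Case 3: l1 = 0; x1 + x2 = 10, and subtracting the two stationarity
# equations gives 6*x2 - 6*x1 = 9
x1, x2 = solve2(1, 1, 10, -6, 6, 9)
l2 = 12 + 2*x2 - 4*x1
cases.append((x1, x2, 0.0, l2))

best = max((c for c in cases if feasible(*c)), key=lambda c: f(c[0], c[1]))
print(best, f(best[0], best[1]))   # (4.25, 5.75, 0.0, 6.5) 118.375
```

Only Case 3 survives the filter, matching the optimal solution derived above.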
Initialization
Selection
Reproduction
• A mating pool is created from the appropriate individuals for the reproduction process. Members of the mating pool crossbreed to generate a new population. This approach is used to generate a next-generation population of solutions from those selected, through the genetic operators: crossover (also called recombination) and/or mutation.
• For each new solution to be generated, a pair of “parent” solutions is selected
for breeding to get “child” solutions.
• By generating a “child” solution from either crossover or mutation, a new solution is created which typically shares many of the characteristics of its “parents”. New parents are then selected for each child, and this process continues till a solution set of appropriate size is generated.
3.3.1.1 Crossover
Parent 1 1 1 0 0 1 0 1 0 0 1
Parent 2 0 1 0 1 0 1 0 1 0 1
Child 1 1 1 0 0 1 1 0 1 0 1
Child 2 0 1 0 1 0 0 1 0 0 1
Mutation
Before mutation 1 1 0 0 1 1 0 1 0 1
After mutation 1 1 0 0 0 1 0 1 0 1
Before mutation 1 1 0 0 1 1 0 1 0 1
After mutation 0 0 1 1 0 0 1 0 1 0
Termination
Limitations
• The representation of the problem may be difficult. You need to identify which variables are suitable to be treated as genes and which variables must be left outside the GA process.
• The determination of convenient parameters (population size, mutation rate) may be time consuming.
• As in any optimization process, if you do not take enough precautions, the algorithm may converge to a local minimum (or maximum).
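A toy genetic algorithm illustrating selection, one-point crossover, and bit-flip mutation is sketched below. The encoding, population size, mutation rate, and fitness function are illustrative assumptions, not values prescribed by the text:

```python
import random, math

random.seed(1)
BITS, POP, GENS, PMUT = 10, 20, 40, 0.05

def decode(bits):               # map a 10-bit string to [0, 10]
    return int("".join(map(str, bits)), 2) * 10 / (2**BITS - 1)

def fitness(bits):              # maximize f(x) = x*sin(x) on [0, 10]
    x = decode(bits)
    return x * math.sin(x)

pop = [[random.randint(0, 1) for _ in range(BITS)] for _ in range(POP)]
for _ in range(GENS):
    new = []
    for _ in range(POP):
        # tournament selection of two parents
        p1 = max(random.sample(pop, 3), key=fitness)
        p2 = max(random.sample(pop, 3), key=fitness)
        cut = random.randrange(1, BITS)           # one-point crossover
        child = p1[:cut] + p2[cut:]
        child = [b ^ (random.random() < PMUT) for b in child]  # mutation
        new.append(child)
    pop = new

best = max(pop, key=fitness)
print(round(decode(best), 2))   # typically close to the maximizer x ~ 7.98
```

With tournament selection and a modest mutation rate, the population converges towards the global maximum of x·sin(x), though (as the limitations above warn) convergence to a local maximum is possible.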
v_i^(k+1) = w v_i^(k) + c1 r1 (p_i^(k) − x_i^(k)) + c2 r2 (p_g^(k) − x_i^(k)) and x_i^(k+1) = x_i^(k) + v_i^(k+1)
where w is the inertia weight; k (= 1, 2, ..., max_gen) indicates the iterations (generations); and r1, r2 are uniform random numbers in [0, 1]. The constants c1 (> 0) and c2 (> 0) are the cognitive learning and social learning rates, respectively, which are the acceleration constants responsible for varying the particle velocity towards p_i^(k) and p_g^(k), respectively.
Updated velocity of ith particle is calculated by considering three components:
(i) previous velocity of the particle, (ii) the distance between the particles best
previous and current positions, and (iii) the distance between swarms best experi-
ence (the position of the best particle in the swarm) and the current position of the
particle.
The velocity is also limited to the range [−vmax, vmax], where vmax is called the
maximum velocity of the particle. The choice of a too small value for vmax can cause
very small updating of velocities and positions of particles at each iteration. Hence,
the algorithm may take a long time to converge and faces the problem of getting
stuck to local minima. To overcome this issue, Clerc (1999), Clerc and Kennedy
(2002) proposed an improved velocity update rule employing a constriction factor of
χ. According to them, the updated velocity is given by
v_i^(k+1) = χ [ v_i^(k) + c1 r1 (p_i^(k) − x_i^(k)) + c2 r2 (p_g^(k) − x_i^(k)) ]
with the constriction factor
χ = 2 / | 2 − φ − √(φ^2 − 4φ) |, where φ = c1 + c2, φ > 4.
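The constriction-factor update can be sketched as follows (an illustration, not the book's code; c1 = c2 = 2.05 is a common choice giving φ = 4.1 and χ ≈ 0.73, and the sphere function is an assumed test objective):

```python
import random, math

random.seed(0)
c1 = c2 = 2.05
phi = c1 + c2
chi = 2 / abs(2 - phi - math.sqrt(phi**2 - 4*phi))   # ~0.729

def f(x):                       # sphere function, minimum at the origin
    return sum(v*v for v in x)

DIM, N, ITERS = 2, 15, 100
X = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(N)]
V = [[0.0]*DIM for _ in range(N)]
P = [x[:] for x in X]                          # personal bests
g = min(P, key=f)[:]                           # global best
for _ in range(ITERS):
    for i in range(N):
        for d in range(DIM):
            r1, r2 = random.random(), random.random()
            V[i][d] = chi * (V[i][d] + c1*r1*(P[i][d] - X[i][d])
                                     + c2*r2*(g[d] - X[i][d]))
            X[i][d] += V[i][d]
        if f(X[i]) < f(P[i]):
            P[i] = X[i][:]
            if f(P[i]) < f(g):
                g = P[i][:]
print([round(v, 3) for v in g])   # near the minimizer [0, 0]
```

The constriction factor damps the velocities, so no explicit vmax clamp is needed in this sketch.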
Algorithm:
FIGURE 3.1 Graph of an arbitrary function with local and global maxima.
FIGURE 3.2 Plateau.
FIGURE 3.3 Ridges.
Algorithm
Step 3: If ||x(t+1) − x(t)|| < ε and T is small enough, then terminate the process; else, if (t mod n) = 0, then accordingly go to Step 1.
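A simulated-annealing sketch in the same spirit is given below (the cooling schedule, neighbourhood width, and test function are illustrative assumptions, not prescriptions from the text):

```python
import random, math

random.seed(3)

def f(x):                       # maximize: global maximum near x ~ 7.98
    return x * math.sin(x)

x = random.uniform(0, 10)
T, alpha = 5.0, 0.95            # initial temperature and cooling factor
for _ in range(2000):
    x_new = min(10, max(0, x + random.gauss(0, 0.5)))   # neighbour
    delta = f(x_new) - f(x)
    # accept uphill moves always; downhill moves with probability exp(delta/T)
    if delta > 0 or random.random() < math.exp(delta / T):
        x = x_new
    T = max(1e-3, T * alpha)    # geometric cooling with a small floor
print(round(x, 2))              # settles near a maximum of f
```

Early on, high T lets the search cross valleys between local maxima; as T falls, the acceptance rule reduces to hill climbing.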
Algorithm
Step 1: The next node s is chosen according to the state transition rule
s = arg max_{u ∉ M_k} { [τ(h, u)] · [η(h, u)]^β }   if q ≤ q0
s = S                                               otherwise
Here S is a random variable which favours shorter edges with a higher level of pheromone trail, selected through the probability distribution mentioned below.
Note that τ(h, u) is the amount of pheromone trail on edge (h, u), whereas η(h, u) is a heuristic function on edge (h, u).
Step 2: The pheromone amounts on the edges are updated locally as well as globally.
Step 3: Global updating rewards edges belonging to the shortest tours. Once the ants have completed their routes, the ant that has travelled the shortest path deposits additional pheromone on each visited edge. The update takes the form
τ(h, s) ← (1 − α) · τ(h, s) + α · τ0.
Advantages:
Disadvantages:
1. For a small set of nodes, many other search techniques are available with less computational cost.
2. Although convergence is certain, the time to convergence is uncertain.
3. Sometimes the computational cost is high.
Tabu search maintains coordination between the forbidding and freeing strategies to select trial solutions, better known as the short-term strategy. To identify neighbouring or adjacent solutions, a “neighbourhood” is constructed to move from the current solution to another. Choosing a particular solution from the neighbourhood depends on the search history and on the frequency of solution attributes that have already produced past solutions. As mentioned earlier, this algorithm has a flexible memory and therefore records forbidden moves, known as tabu moves, for the future. There is a provision for exceptions too, called the aspiration criterion: when a tabu move gives a better result than all the solutions received so far, it can be overridden.
Stopping Criterion
Algorithm
Advantages:
Disadvantages:
TRY YOURSELF
Q1. Using the KT conditions, solve the problem subject to:
g1 = −x1^2 + x2 − 4 ≤ 0
g2 = −(x1 − 2)^2 + x2 − 3 ≤ 0
Q2. Write code to solve constrained problem using Genetic Algorithm in C
program.
Q3. Compare advantages and disadvantages of Hill Climbing and Simulated
Annealing. Do they have any relation with each other?
Q4. For which kind of problems is the ant colony algorithm most efficient? Explain with a suitable illustration.
Q5. Find the minimum of f = x5 − 5x3 − 20x + 5 in the range (0, 3) using the ant
colony optimization method. Show detailed calculations for two iterations
with four ants.
4 Applications of
Non-Linear
Programming
Solution:
The relevant variables of this problem are s1 and s2. If the company has unlimited resources, the only constraints are s1, s2 ≥ 0.
Unconstrained Optimization. We first solve the unconstrained optimization
problem. If P has a maximum in the first quadrant this yields the optimal solution. The
condition for an extreme point of P leads to a linear system of equations for (s1 , s2 ),
∂P/∂s1 = 144 − 0.02s1 − 0.007s2 = 0
∂P/∂s2 = 174 − 0.007s1 − 0.02s2 = 0
The solution of these equations is s1* = 4735, s2* = 7043 with profit value P* =
P(s1* , s2* ) = 553,641. Since s1* , s2* are positive, the inequality constraints are satisfied.
To determine the type of the extreme point, we inspect the Hessian matrix,
HP(s1*, s2*) = | −0.02   −0.007 |
               | −0.007  −0.02  |
A sufficient condition for a maximum is that ( HP )11 < 0 and det( HP ) > 0. Both of
these conditions are satisfied and so our solution point is indeed a maximum, in fact
a global maximum.
The first two constraints are satisfied by (s1*, s2*); however, s1* + s2* = 11,778. The
global maximum point of P is now no longer in the feasible region, thus the optimal
solution must be on the boundary. We therefore solve the constrained optimization
problem
Maximize P(s1, s2) subject to c(s1, s2) = s1 + s2 = 10,000.
We can either substitute s1 or s2 from the constraint equation into P and solve
an unconstrained one-variable optimization problem, or use Lagrangian multipliers.
Choosing the second approach, the equation ∇P = λ∇c becomes
144 − 0.02s1 − 0.007s2 = λ and 174 − 0.007s1 − 0.02s2 = λ,
which reduces to a single equation for s1, s2. Together with the constraint equation we
then have again a system of two linear equations,
−0.013s1 + 0.013s2 = 30
s1 + s2 = 10, 000.
The solution is s1* = 3846, s2* = 6154 with profit value P* = 532, 308.
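Both stationary points of this example can be cross-checked by solving the two linear systems (a sketch, not part of the original solution):

```python
def solve2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x + a12*y = b1, a21*x + a22*y = b2 by Cramer's rule."""
    det = a11*a22 - a12*a21
    return ((b1*a22 - a12*b2)/det, (a11*b2 - b1*a21)/det)

# grad P = 0:  0.02 s1 + 0.007 s2 = 144,  0.007 s1 + 0.02 s2 = 174
s1, s2 = solve2(0.02, 0.007, 144, 0.007, 0.02, 174)
print(round(s1), round(s2))        # 4735 7043

# constrained: -0.013 s1 + 0.013 s2 = 30,  s1 + s2 = 10000
s1c, s2c = solve2(-0.013, 0.013, 30, 1, 1, 10000)
print(round(s1c), round(s2c))      # 3846 6154
```

The rounded values match the unconstrained optimum (4735, 7043) and the constrained optimum (3846, 6154) quoted in the text.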
Solution:
Here, our objective is to minimize the total cost of manufacturing the CAN. The cost of fabrication depends on the total surface area of the CAN (cylinder).
∴ Total surface area of cylinder = 2πrh + 2πr^2 = 2πr(h + r), from Figure 4.1.
FIGURE 4.1 CAN with radius r and height h.
It is mentioned in the problem that the capacity of the CAN must be 200 ml (cc):
πr^2 h = 200
and the design restriction h ≥ 3.5r gives the constraint 3.5r − h ≤ 0.
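Assuming the data above (capacity 200 cc and the restriction h ≥ 3.5r), the problem can be reduced to one variable and sketched as follows; this is an illustration, not the book's worked solution:

```python
# Eliminate h = 200/(pi r^2) and minimize A(r) = 400/r + 2 pi r^2, then check
# the constraint h >= 3.5 r.
import math

# unconstrained stationary point: A'(r) = -400/r^2 + 4 pi r = 0 => r^3 = 100/pi
r_u = (100 / math.pi) ** (1/3)
h_u = 200 / (math.pi * r_u**2)
print(round(r_u, 3), round(h_u, 3))      # 3.169 6.338 -- violates h >= 3.5 r

# so the constraint is active: h = 3.5 r and pi r^2 (3.5 r) = 200
r_c = (200 / (3.5 * math.pi)) ** (1/3)
h_c = 3.5 * r_c
area = 2 * math.pi * r_c * (h_c + r_c)
print(round(r_c, 3), round(h_c, 3))      # constrained optimum: r ~ 2.63, h ~ 9.2
```

Since the unconstrained minimizer violates h ≥ 3.5r, the constraint must be active at the optimum, which fixes h = 3.5r.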
min Σ_{g=1}^{3} F(P_g)   (i)
50 ≤ P1 ≤ 200 (v)
FIGURE 4.2 Simple three bus network.
10 ≤ P2 ≤ 150 (vi)
60 ≤ P3 ≤ 300 (vii)
The goal of this optimization problem is to reduce the total generation cost to a minimum subject to the constraints. There are nine decision variables. The quadratic expressions in the variables P1, P2, and P3 represent the total generation cost. The other variables, the voltage magnitudes V1, V2, and V3 and the angles θ12, θ23, and θ31, have zero cost coefficients. For simplicity, only three equality constraints, (ii), (iii), and (iv), are included; they are the load balance constraints at each bus. The non-linear terms represent the power flowing through the individual transmission lines. The last three sets of inequality constraints are generation limit constraints. This entire formulation is of the non-linear convex type, but it can also be approximated as a linear programming model.
Qh = fc p (tout − t m ) , (4.2)
• Each heat exchanger also has a capital cost that is based on its area Ai , i ∈{c, h, m},
for heat exchange. Here we consider a simple countercurrent, shell, and tube
heat exchanger with an overall heat transfer coefficient, Ui , i ∈{c, h, m}. The
resulting area equations are given by
Q_i = U_i A_i ΔT_lm^i, i ∈ {c, h, m}.  (4.4)
FIGURE 4.3 Example of a simple heat exchanger network.
• The log-mean temperature difference ΔT_lm^i is given by
ΔT_lm^i = (ΔT_a^i − ΔT_b^i) / ln(ΔT_a^i / ΔT_b^i), i ∈ {c, h, m},  (4.5)
and, ∆ Tac = Tm − Tw , ∆ Tbc = Tout − Tw ,
Our objective is to minimize the total cost of the system, i.e., the energy cost as well
as the capital cost of the heat exchangers. This leads to the following NLP:
min Σ_{i ∈ {c, h, m}} (c_i Q_i + c_i^I A_i^β)   (4.6)
where the cost coefficients c_i and c_i^I reflect the energy and amortized capital prices,
the exponent β ∈ (0,1] reflects the economy of scale of the equipment, and a small
constant ∆ > 0 is selected to prevent the log-mean temperature difference from
becoming undefined. This example has one degree of freedom. For instance, if the
heat duty Qm is specified, then the hot and cold stream temperatures and all of the
remaining quantities can be calculated.
FIGURE 4.4 Distillation column example.
The column is specified to recover most of the n-butane (the light key) in the top product and most of the isopentane (the heavy key) in the bottom product. We assume
a total condenser and partial reboiler, and that the liquid and vapour phases are in
equilibrium. A tray-by-tray distillation column model is constructed as follows using
the MESH (Mass–Equilibrium–Summation–Heat) equations:
B + V0 − L1 = 0,  (4.9)
Li + Vi − Li+1 − Vi−1 = 0, i ∈ [1, N], i ∉ S,  (4.10)
Li + Vi − Li+1 − Vi−1 − F = 0, i ∈ S,  (4.11)
LN+1 + D − VN = 0.  (4.12)
(LN+1 + D) xN+1,j − VN yN,j = 0, j ∈ C,  (4.16)
xN+1,j − yN,j = 0, j ∈ C,  (4.17)
Enthalpy Balances
B HB + V0 HV,0 − L1 HL,1 − QR = 0,  (4.18)
Li HL,i + Vi HV,i − Li+1 HL,i+1 − Vi−1 HV,i−1 = 0, i ∈ [1, N], i ∉ S,  (4.19)
Li HL,i + Vi HV,i − Li+1 HL,i+1 − Vi−1 HV,i−1 − F HF = 0, i ∈ S,  (4.20)
VN HV,N − (LN+1 + D) HL,D − QC = 0.  (4.21)
Σ_{j=1}^{m} y_i,j − Σ_{j=1}^{m} x_i,j = 0, i = 0, ..., N + 1  (4.22)
H L ,i = ϕ L ( xi , Ti ) , HV ,i = ϕV ( yi , Ti ) , i = 1,…, N (4.24)
H B = ϕ L ( x0 , T0 ) , H F = ϕ L ( xF , TF ) , H N +1 = ϕ L ( x N +1 , TN +1 ) , (4.25)
where
The feed is a saturated liquid with component mole fractions specified in the order
given above. The column is operated at a constant pressure, and we neglect pressure
drop across the column. This problem has 2 degrees of freedom. For instance, if the
flow rates for V0 and LN +1 are specified, all of the other quantities can be calculated
from equations (4.9)–(4.23). The objective is to minimize the reboiler heat duty which
accounts for a major portion of operating costs, and we specify that the mole fraction
of the light key must be 100 times smaller in the bottom than in the top product. The
optimization problem is therefore given by
Min QR
s.t. (4.9)-(4.23)
xbottom,lk ≤ 0.01xtop,lk ,
Li , Vi , Ti ≥ 0, i = 1,…, N + 1,
D, QR , QC ≥ 0,
per tray, and through phase equilibrium relations, which also increase the non-linearity and number of equations on each tray.
Example:
Find minimum of f ( x ) = x 3 − 2 x − 5 in the interval of (0, 2)
Step 1: Write the function file in a script and save it appropriately (the name of the function is “minimf” in the example; it can be changed as per the user's wish).
Step 2: Call fminbnd with the appropriate syntax to calculate both the minimizer and the minimum value.
The calling syntax and output are shown below.
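For readers without MATLAB, a pure-Python stand-in for fminbnd on the same problem is sketched below using golden-section search (an illustration; fminbnd itself combines golden-section search with parabolic interpolation):

```python
# Golden-section search applied to f(x) = x^3 - 2x - 5 on (0, 2).
import math

def golden_min(f, a, b, tol=1e-8):
    g = (math.sqrt(5) - 1) / 2          # inverse golden ratio, ~0.618
    c, d = b - g*(b - a), a + g*(b - a)
    while b - a > tol:
        if f(c) < f(d):                 # minimum lies in [a, d]
            b, d = d, c
            c = b - g*(b - a)
        else:                           # minimum lies in [c, b]
            a, c = c, d
            d = a + g*(b - a)
    return (a + b) / 2

x_min = golden_min(lambda x: x**3 - 2*x - 5, 0, 2)
print(round(x_min, 4), round(x_min**3 - 2*x_min - 5, 4))   # 0.8165 -6.0887
```

The analytic minimizer is x = sqrt(2/3) ≈ 0.8165, which the search recovers.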
1. fminunc – It is suitable for continuous functions that have first- and second-order derivatives. It uses a quasi-Newton algorithm.
• Write a function m.file for given function.
• Define x0 (initial approximation) and then call fminunc in a script file using
syntax: [x, fval] = fminunc(@myfun, x0)
• OUTPUT
2. fminsearch –It can handle even discontinuous functions. It does not need
derivative information.
• Write a m.file for function
• Define x0 (initial guess)and then call fminsearch in a script file.
• Call the fminsearch by following syntax:
x = fminsearch(@myfun, x0)
• Min f ( x ) = 100( x2 − x12 )2 + (1 − x1 )2 ; X 0 = [ −1.2, 1]
• INPUT
OUTPUT
output
X = 0.2578 0.2578
resnorm = 124.3622
• Function file:
• Calling script:
• Output
min_x f(x) such that c(x) ≤ 0, ceq(x) = 0, A·x ≤ b, Aeq·x = beq, lb ≤ x ≤ ub,
• Example:
Min f(x) = −x1 x2 x3; X0 = [10; 10; 10]
s.t. 0 ≤ x1 + 2x2 + 2x3 ≤ 72
• Function file:
• Calling file:
• Output:
output x =
24.0000
12.0000
12.0000
fval = -3.4560e+03
well. Beginners can use them instead of writing complete programs by themselves.
Here, an illustration is demonstrated to minimize a function with constraint through
GA. MATLAB has optimtool box that can be explored for other inbuilt functions.
Example: f(x) = −exp(−(x/20)^2) if x ≤ 20, and f(x) = −exp(−1) + (x − 20)(x − 22) if x > 20.
• Write function file
Here solver used is GA, fitness function is objective function with name “@goodfun”.
Number of variable is one. Since there is no constraint and no predefined bounds on
the decision variable, we can run the solver. It shows that the run took 51 iterations, with the value of the variable at the minimum being 0.006 and the function value −0.99. Now let us discuss one example with constraints.
Example:
Min f(x) = 100(x1^2 − x2)^2 + (1 − x1)^2
subject to: x1 x2 + x1 − x2 + 1.5 ≤ 0, 10 − x1 x2 ≤ 0, 0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 13
Function file:
Constraint file:
The solver used is ga, the fitness function is the objective function, the number of variables is 2, and the bounds are entered. Please note that linear constraints can be entered directly in the interactive window in array form, whereas for non-linear constraints we need to write a function file and call it appropriately, as shown above. The minimizer is [0.812, 12.312] and the minimum is 13706.1085.
TRY YOURSELF
Q1. Optimize the following function using fminsearch:
Q2. Optimize the following functions using the in-built MATLAB functions “fmincon” and “ga”. The initial approximation can be [1, 1]^T.
Subject to : x1 + 2 x2 ≤ 5
4 x1 + 3 x2 ≤ 10
6 x1 + x2 ≤ 7, x1 , x2 ≥ 0
(ii ) Min f = 4 x1 + 2 x2 + 3 x3 + 4 x4
Subject to : x1 + x3 + x4 ≤ 24
3 x1 + x2 + 2 x3 + 4 x4 ≤ 48
2 x1 + 2 x2 + 3 x3 + 2 x4 ≤ 36, x1 , x2 , x3 , x4 ≥ 0
BIBLIOGRAPHY
1. Rao, S. S., (2009). Engineering Optimization: Theory and Practice (4th Ed.), New Jersey, U.S.A.: John Wiley & Sons, Inc.
2. Sharma, J. K., (2009). Operations Research: Theory and Practices (4th Ed.), New Delhi, India: Macmillan.
3. Pant, K. K., Sinha, S., & Bajpai, S., (2015). Advances in Petroleum Engineering II – Petrochemical, U.S.A.: Studium Press LLC.
4. Biegler, L. T., (2010). Nonlinear Programming: Concepts, Algorithms and Applications to Chemical Processes, U.S.A.: Society for Industrial and Applied Mathematics.
Index
analytical approach 1
ant colony optimization algorithm 47
basics of formulation 51
examples of NLP formulation 51
exhaustive search technique 3
Lagrange Multipliers method 33
Newton method 12
particle swarm optimization 44
problems with equality constraints 31