
Non-Linear Programming

Mathematical Engineering, Manufacturing, and Management Sciences
Series Editor: Mangey Ram, Professor, Assistant Dean (International Affairs), Department of Mathematics, Graphic Era University, Dehradun, India

The aim of this book series is to publish research studies and articles that present the latest developments in mathematics and its applications in the manufacturing and management sciences. Mathematical tools and techniques are the strength of the engineering sciences; they form the common foundation of the new disciplines that emerge as engineering evolves. The series covers a comprehensive range of applied mathematics and its applications in engineering areas such as optimization techniques, mathematical modelling and simulation, stochastic processes and systems engineering, safety-critical system performance, system safety, system security, high-assurance software architecture and design, mathematical modelling in environmental safety sciences, finite element methods, differential equations, reliability engineering, etc.

Circular Economy for the Management of Operations
Edited by Anil Kumar, Jose Arturo Garza-Reyes, and Syed Abdul Rehman Khan

Partial Differential Equations: An Introduction
Nita H. Shah and Mrudul Y. Jani

Linear Transformation: Examples and Solutions
Nita H. Shah and Urmila B. Chaudhari

Matrix and Determinant: Fundamentals and Applications
Nita H. Shah and Foram A. Thakkar

Non-Linear Programming: A Basic Introduction
Nita H. Shah and Poonam Prakash Mishra

For more information about this series, please visit: www.routledge.com/Mathematical-Engineering-Manufacturing-and-Management-Sciences/book-series/CRCMEMMS

Non-Linear Programming
A Basic Introduction

Nita H. Shah and Poonam Prakash Mishra

MATLAB® is a trademark of The MathWorks, Inc. and is used with permission. The MathWorks
does not warrant the accuracy of the text or exercises in this book. This book’s use or discussion of
MATLAB® software or related products does not constitute endorsement or sponsorship by The
MathWorks of a particular pedagogical approach or particular use of the MATLAB® software.
First edition published 2021
by CRC Press
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
and by CRC Press
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN
© 2021 Nita H. Shah and Poonam Prakash Mishra
CRC Press is an imprint of Taylor & Francis Group, LLC
The right of Nita H. Shah and Poonam Prakash Mishra to be identified as authors of this work has been
asserted by them in accordance with sections 77 and 78 of the Copyright, Designs and Patents Act 1988.
Reasonable efforts have been made to publish reliable data and information, but the author and publisher
cannot assume responsibility for the validity of all materials or the consequences of their use. The
authors and publishers have attempted to trace the copyright holders of all material reproduced in
this publication and apologize to copyright holders if permission to publish in this form has not been
obtained. If any copyright material has not been acknowledged please write and let us know so we may
rectify in any future reprint.
Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced,
transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or
hereafter invented, including photocopying, microfilming, and recording, or in any information storage
or retrieval system, without written permission from the publishers.
For permission to photocopy or use material electronically from this work, access www.copyright.com or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. For works that are not available on CCC please contact mpkbookspermissions@tandf.co.uk
Trademark notice: Product or corporate names may be trademarks or registered trademarks and are used
only for identification and explanation without intent to infringe.
Library of Congress Cataloging‑in‑Publication Data
Names: Shah, Nita H., author. | Mishra, Poonam Prakash, author.
Title: Non-linear programming: a basic introduction /
Nita H. Shah and Poonam Prakash Mishra. Description: First edition. |
Boca Raton, FL: CRC Press, an imprint of Taylor & Francis Group, LLC, 2021. |
Series: Mathematical engineering, manufacturing, and management sciences |
Includes bibliographical references and index.
Identifiers: LCCN 2020040934 (print) | LCCN 2020040935 (ebook) |
ISBN 9780367613280 (hardback) | ISBN 9781003105213 (ebook)
Subjects: LCSH: Nonlinear programming.
Classification: LCC T57.8 .S53 2021 (print) |
LCC T57.8 (ebook) | DDC 519.7/6–dc23
LC record available at https://lccn.loc.gov/2020040934
LC ebook record available at https://lccn.loc.gov/2020040935
ISBN: 978-0-367-61328-0 (hbk)
ISBN: 978-1-003-10521-3 (ebk)

Contents

Preface
Acknowledgement
Author/Editor Biographies

Chapter 1  One-Dimensional Optimization Problem
  1.1  Introduction
  1.2  Analytical Approach
  1.3  Search Techniques
    1.3.1  Unrestricted Search Technique
    1.3.2  Exhaustive Search Technique
    1.3.3  Dichotomous Search Technique
    1.3.4  Fibonacci Search Method
    1.3.5  Golden Section Search Method
    1.3.6  Interpolation Method (Without Using Derivative)
      1.3.6.1  Quadratic Interpolation
      1.3.6.2  Cubic Interpolation
  1.4  Gradient-Based Approach
    1.4.1  Newton Method
    1.4.2  Secant Method
  Try Yourself

Chapter 2  Unconstrained Multivariable Optimization
  2.1  Introduction
  2.2  Direct Search Methods
    2.2.1  Random Search Method
    2.2.2  Grid Search Method
    2.2.3  Univariate Search Method
    2.2.4  Pattern Search Algorithm
      2.2.4.1  Hooke–Jeeves Method
      2.2.4.2  Powell's Method
    2.2.5  Simplex Algorithm
  2.3  Gradient-Based Methods
    2.3.1  Using Hessian Matrix
    2.3.2  Steepest Descent Method
    2.3.3  Newton's Method
    2.3.4  Quasi-Newton Method
  Try Yourself

Chapter 3  Constrained Multivariable Optimization
  3.1  Introduction
  3.2  Conventional Methods for Constrained Multivariate Optimization
    3.2.1  Problems with Equality Constraints
      3.2.1.1  Direct Substitution Method
      3.2.1.2  Lagrange Multipliers Method
    3.2.2  Problems with Inequality Constraints
      3.2.2.1  Kuhn–Tucker Necessary Conditions
      3.2.2.2  Kuhn–Tucker Sufficient Conditions
  3.3  Stochastic Search Techniques
    3.3.1  Genetic Algorithm
      3.3.1.1  Crossover
    3.3.2  Particle Swarm Optimization
    3.3.3  Hill Climbing Algorithm
    3.3.4  Simulated Annealing
    3.3.5  Ant Colony Optimization Algorithm
    3.3.6  Tabu Search Algorithm
  Try Yourself

Chapter 4  Applications of Non-Linear Programming
  4.1  Basics of Formulation
  4.2  Examples of NLP Formulation
  4.3  Solving NLP through MATLAB Inbuilt Functions
  4.4  Choice of Method
  Try Yourself

Index

Preface
Optimization is the act of utilizing given resources in the best possible way. We use this concept, knowingly or unknowingly, in all aspects of life. The term "optimization" therefore has a wide range of applications in almost every field: the basic sciences, engineering and technology, business management, medical science, defence, and so on. An optimization algorithm is a procedure that is executed iteratively, comparing candidate solutions until an optimum or a satisfactory solution is found. With the advent of computers, optimization has become part of computer-aided activity. Two classes of algorithms are widely used today: deterministic and stochastic.
To understand and explore this concept we need to look at the mathematical formulation behind it. Mathematically, an optimization problem consists of a function, known as the objective function, that is to be optimized (maximized or minimized). There are decision variables (design variables) on which the value of the objective function depends, and the problem may be posed with or without constraints. If the objective function and all the constraints are linear, the problem falls under the category of linear programming (LPP); if the objective function or any constraint is non-linear, it is a non-linear programming problem (NLP). The focus of this book is on NLP only. Solving non-linear problems is obviously more difficult than solving LPPs. The choice of method for solving an NLP depends on many parameters, such as the number of decision variables, the concavity of the function, the presence of constraints, whether the constraints are equalities or inequalities, and the overall complexity of the objective function in terms of continuity, smoothness, and differentiability. This book proposes a well-synchronized, self-guided approach for beginners to understand and solve different types of NLPs. Algorithms are presented with their basic idea and appropriate illustrations for the reader's better understanding. The language and approach are kept simple to cater to the needs of undergraduate and postgraduate students, and even research scholars, in the formulation and solution of their research problems. We have also given the MATLAB® syntax for using the inbuilt MATLAB functions to solve different NLPs.
In this book we discuss only non-linear programming (NLP). Many conventional methods are available in the literature for optimization, but they are still unable to handle all kinds of problems, so researchers are continuously developing new methods, better known as optimization algorithms. The chapters of the book are as follows:
Chapter 1 discusses NLP for unimodal functions of a single variable without constraints. Here we discuss both conventional gradient-based methods and search algorithms for unimodal functions. These approaches act as the foundation for multivariable problems. We also compare the various approaches available to obtain a solution and illustrate them briefly. Chapter 2 takes the reader to the next level: multivariable NLP problems, still without constraints. Here also we demonstrate different approaches to solving this set of problems. This chapter also covers the limitations of the different methods, which the reader needs to keep in mind while applying them.


Chapter 3 enables the reader to understand the most complex problems, with non-linearity and multivariability in the presence of constraints. Different conventional methods for equality and inequality constraints are explained here. As this is the most complex form, real-world problems often cannot be addressed with conventional approaches; therefore some widely accepted modern stochastic search algorithms are also presented in this chapter.
Chapter 4 helps the reader understand the applicability of the methods above in different areas of the pure sciences, engineering and technology, management, finance, etc. This part covers the formulation of real-world problems in mathematical form so that they can be solved by any appropriate method, allowing readers to use these concepts widely in their research work. It is possible to program all the algorithms mentioned in Chapters 1, 2, and 3 in C or MATLAB, but MATLAB also comes with built-in functions that can simply be called with the appropriate syntax to handle most NLPs. These inbuilt functions are discussed with their syntax for the convenience of readers.

Acknowledgement
First and foremost, I would like to thank the Almighty for giving me the strength and knowledge to undertake and complete the writing of this book successfully. I would like to thank Prof. Nita H. Shah from the bottom of my heart for being my mentor and guide since my Ph.D. tenure. She has always been a lighthouse for my career and research activities. She has played the role of a friend, philosopher, and guide in my life, and during the writing of this book she has helped me as a mentor as well as a co-author.
I am very grateful to the management of PDPU, SoT (Director Prof. S. Khanna and Academic Director Prof. T. P. Singh) for giving me full support and the mental space to write this book successfully. I would also like to extend my gratitude to my departmental colleagues for supporting me consistently.
I sincerely express my gratitude to my parents, Mrs. Kanchan L. Pandey and the late G. N. Pandey, for their generous blessings. Lastly, I would like to extend my special thanks to my true strength, my husband Mr. Prakash Mishra and my beloved daughter Aarushi Mishra, for giving me all the love and affection to cherish my goals in life.


Author/Editor Biographies

Prof. Nita H. Shah received her Ph.D. in Statistics from Gujarat University in 1994. Since February 1990 she has been Head of the Department of Mathematics at Gujarat University, India. She is a postdoctoral visiting research fellow of the University of New Brunswick, Canada. Prof. Shah's research interests include inventory modeling in supply chains, robotic modeling, mathematical modeling of infectious diseases, image processing, dynamical systems and their applications, etc. She has published 13 monographs, 5 textbooks, and 475+ peer-reviewed research papers; four edited books have been prepared for IGI Global and Springer with Dr. Mandeep Mittal as co-editor. Her papers are published in high-impact Elsevier, Inderscience, and Taylor & Francis journals. She is the author of 14 books. According to Google Scholar, her total number of citations is over 3070, and the maximum number of citations for a single paper is over 174. Her h-index is 24 as of March 2020 and her i10-index is 74. She has guided 28 Ph.D. students and 15 M.Phil. students so far, and seven students are pursuing research for their Ph.D. degree. She has travelled to the USA, Singapore, Canada, South Africa, Malaysia, and Indonesia to give talks. She is Vice-President of the Operational Research Society of India and a council member of the Indian Mathematical Society.
Dr. Poonam Prakash Mishra completed her Ph.D. in mathematics in 2010. She also holds a master's degree in business administration with a specialization in operations management. Her core research area is the modelling and formulation of inventory and supply chain management, and she is also interested in the mathematical modelling of real-world problems with stochastic optimization. Beyond supply chain problems, she has applied concepts of modelling and optimization in various fields, such as crude oil exploration, sea-ice route optimization, and the impact of wind power forecasts on the revenue insufficiency of electricity markets. She has successfully guided 3 students to their Ph.D. degrees. She has more than 40 journal publications and 8 book chapters in reputed international journals. She has successfully completed a funded project from SAC-ISRO and is working on other proposals.
Presently she is working on Remote Sensing Investigation of Parameters that Affect Glacial Lake Outburst Flood (GLOF). She is a faculty member in Mathematics at the School of Technology, Pandit Deendayal Petroleum University.


1  One-Dimensional Optimization Problem

1.1  INTRODUCTION
In this section, we discuss the different methods available to optimize (minimize/maximize) a given function of only one variable. The methods used for one-dimensional optimization are highly useful for multivariable optimization. First, we look at methods that can be used for unimodal functions. Unimodal functions are those that have only one peak or valley in the given domain. Mathematically, a function f(x) is unimodal if (i) x1 < x2 < x* implies f(x2) < f(x1), and (ii) x2 > x1 > x* implies f(x1) < f(x2), where x* is the minimum point. Figure 1.1 is a mind map that can help explore the available methods for this set of problems.

1.2  ANALYTICAL APPROACH

The analytical or conventional approach finds extreme values through necessary and sufficient conditions, but it works only for well-defined, continuous, and differentiable functions in the given domain.

Necessary condition: For a point x0 to be a local extremum (local maximum or minimum) of a function y = f(x) defined on the interval a ≤ x ≤ b, the first derivative of f(x) must exist as a finite number at x = x0 and f′(x0) = 0.

Sufficient condition: If at an extreme point x = x0 of f(x) the first (n − 1) derivatives vanish, then:

(i)  a local maximum of f(x) occurs at x = x0 if f^(n)(x0) < 0, for n even;
(ii)  a local minimum of f(x) occurs at x = x0 if f^(n)(x0) > 0, for n even;
(iii)  a point of inflection occurs at x = x0 if f^(n)(x0) ≠ 0, for n odd.

Algorithm:

1.  Compute the first derivative f′(x).
2.  Solve the equation f′(x) = 0 for x; say x = x0.

FIGURE 1.1  Tree diagram of available methods for one-variable optimization. The methods for optimizing (min/max) one-dimensional unimodal functions split into the analytical approach (calculus) and search algorithms. The search algorithms divide into those without derivatives (unrestricted search, exhaustive search, dichotomous search, Fibonacci search, golden section search, and interpolation-based approaches: quadratic and cubic interpolation) and those using derivatives (Newton method, secant method).

3.  Compute f″(x0).
4.  Declare x0 a local minimum if f″(x0) > 0, or a local maximum if f″(x0) < 0.

Example:
The total revenue (R) and total cost (C) functions of a firm are given by R = 30x − x² and C = 20 + 4x, where x is the output. What is the maximum profit?

Solution:
Let the profit function be P = R − C:
P(x) = (30x − x²) − (20 + 4x) = −x² + 26x − 20
P′(x) = −2x + 26 = 0 ⇒ x = 13, using the necessary condition.
P″(x) = −2 < 0, so by the sufficient condition P(x) attains its maximum at x = 13.
Hence, the maximum profit is P(13) = Rs. 149.
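The two conditions above can be checked numerically. The following is an illustrative sketch in Python (ours, not the book's code, which uses MATLAB later on), applied to the profit function of the example:

```python
# Illustrative sketch (not from the book): the analytical approach applied to
# the profit function P(x) = -x^2 + 26x - 20 from the example above.

def P(x):
    return (30 * x - x ** 2) - (20 + 4 * x)   # revenue minus cost

# Necessary condition: P'(x) = -2x + 26 = 0  =>  x0 = 13.
x0 = 26 / 2

# Sufficient condition: P''(x) = -2 < 0 for all x, so x0 is a local maximum.
second_derivative = -2.0
assert second_derivative < 0

print(x0, P(x0))   # 13.0 149.0
```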

1.3  SEARCH TECHNIQUES

We shall discuss five different search techniques, all based on the assumption that the function is unimodal, at least in the given range. This initial range is the interval of uncertainty, and it needs to be made finer and finer with the iterations. These methods are also known as bracketing methods. They calculate, either simultaneously or step by step, the function value at different points in the interval of uncertainty so as to obtain a finer interval of uncertainty in which the minimum/maximum lies.
Let f(x) be our unimodal objective function, and suppose we wish to find the minimum of this function in the range (xF, xL), where xF and xL denote the first and last points of the interval of uncertainty. Then our initial interval of uncertainty is L0 = xL − xF. In all the search techniques except the unrestricted and exhaustive searches, we shall reduce this initial interval of uncertainty through L1, L2, …, Ln over n iterations. Let us see the basic concepts, algorithm, and worked illustrations of each of the methods.

1.3.1  Unrestricted Search Technique

In this method the function value is calculated at an initial point, say x0, and the following points are calculated using a fixed step length. The method can also be practiced with an accelerated step size; here we present the algorithm with a fixed step length.

Algorithm:

1.  Start with an initial guess point, say x0, in the given interval (xF, xL).
2.  Find f(x0) = f0.
3.  Assume a step size s and find x1 = x0 + s.
4.  Find f(x1) = f1.
5.  If f1 < f0 and the problem is one of minimization, then the unimodality of f(x) indicates that the desired minimum cannot lie at x < x0. Hence the search can be continued further along the points x2, x3, x4, …, using the assumption of unimodality while testing each pair of experiments. This procedure is continued until a point xi = x1 + (i − 1)s shows an increase in the function value.
6.  The search is terminated at xi if fi > fi−1. In that case either xi or xi−1 can be taken as the optimum point; even the midpoint of (xi−1, xi) can be considered the optimum.

Example:
Find the minimum of f = x(x − 2.5) with initial point 1.01 and step size 0.1, using the unrestricted search algorithm.

Solution:
Let us calculate the function value at x0 = 1.01 and proceed with step length 0.1, as per the algorithm.

i    xi           f(xi)           Condition    Next iteration required
1    x0 = 1.01    f0 = −1.5049    -            -
2    x1 = 1.11    f1 = −1.5429    f0 > f1      Yes
3    x2 = 1.21    f2 = −1.5609    f1 > f2      Yes
4    x3 = 1.31    f3 = −1.5589    f2 < f3      No
This process shows that the minimum lies between 1.21 and 1.31. Either of these points can be chosen as the minimum, or the algorithm can be repeated with a smaller step size in the interval (1.21, 1.31).
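The fixed-step procedure above can be sketched as follows (an illustration of ours; the helper name is not from the book):

```python
# Sketch of the fixed-step unrestricted search (helper name is ours, not the book's).
def unrestricted_search(f, x0, step):
    """Walk forward from x0 in fixed steps until the function value first rises."""
    x_prev, f_prev = x0, f(x0)
    while True:
        x_next = x_prev + step
        f_next = f(x_next)
        if f_next > f_prev:          # first increase: the minimum is bracketed
            return x_prev, x_next    # bracket (x_{i-1}, x_i) containing the minimum
        x_prev, f_prev = x_next, f_next

f = lambda x: x * (x - 2.5)
lo, hi = unrestricted_search(f, 1.01, 0.1)
print(round(lo, 2), round(hi, 2))    # 1.21 1.31
```

Note that the loop assumes the minimum actually lies ahead of x0, as the unimodality argument in Step 5 guarantees.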

1.3.2  Exhaustive Search Technique

This method evaluates the objective function at a predetermined number of equally spaced points in the interval (xF, xL). Let the function be evaluated at n equally spaced interior points of the initially known interval of uncertainty of length L0 = xL − xF; these points divide L0 into n + 1 subintervals. If the best function value is found at the point xj, then by unimodality the minimum lies between its two neighbouring points, and the final interval of uncertainty is

Ln = xj+1 − xj−1 = (2 / (n + 1)) L0

Algorithm:

1.  Divide the given interval (xF, xL) into (n + 1) equal subintervals by placing n points within the interval.
2.  Find the function value at all of these points (together with the two endpoints, n + 2 points in all).
3.  Find a point xj such that f(xj−1) > f(xj) < f(xj+1).
4.  Declare xj the minimum.

Example:
Find the minimum of f = x(x − 2.5) in the interval (1, 1.4) using the exhaustive search technique.

i       1        2        3        4        5        6        7        8        9
xi      1.00     1.05     1.10     1.15     1.20     1.25     1.30     1.35     1.40
f(xi)   −1.5000  −1.5225  −1.5400  −1.5525  −1.5600  −1.5625  −1.5600  −1.5525  −1.5400

Since f(x5) = f(x7), the minimum lies between x5 = 1.20 and x7 = 1.30. The middle value, x6 = 1.25, can be considered an appropriate approximation.
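A minimal sketch of this technique (our illustration, not the book's code; the grid and function come from the example above):

```python
# Sketch of the exhaustive search: evaluate f on an evenly spaced grid of the
# interval and return the best grid point.
def exhaustive_search(f, a, b, n_intervals):
    """Evaluate f at the n_intervals + 1 equally spaced grid points of [a, b]."""
    grid = [a + i * (b - a) / n_intervals for i in range(n_intervals + 1)]
    return min(grid, key=f)

f = lambda x: x * (x - 2.5)
print(exhaustive_search(f, 1.0, 1.4, 8))   # 1.25, matching x6 in the table above
```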

1.3.3  Dichotomous Search Technique

In this method, the initial interval of uncertainty L0 is reduced in successive iterations by finding the function value at just two points per iteration. On the basis of these function values, the interval of uncertainty is reduced.

Algorithm:

1.  Let the initial interval of uncertainty be L0 = (xF, xL). Assume a very small value δ.
2.  Find two points x1 and x2 placed symmetrically about the midpoint: x1 = (xF + xL)/2 − δ/2, x2 = (xF + xL)/2 + δ/2.
3.  If f(x1) < f(x2), we can discard (x2, xL); the new interval of uncertainty is (xF, x2). If f(x2) < f(x1), we discard (xF, x1); the new interval is (x1, xL).
4.  Repeat Step 2 on the new interval, obtaining the next pair of points x3 and x4.
5.  Continue the process until the interval of uncertainty is reduced to the desired level.

Example:
Find the minimum of f(x) = x(x − 2.5) in the interval (1, 1.4) using dichotomous search.
Here f(x) = x(x − 2.5); let us find x1 and x2 using the above formula, taking δ = 0.001.

Iteration 1

x1 = (xF + xL)/2 − δ/2 = 2.4/2 − 0.0005 = 1.1995
x2 = (xF + xL)/2 + δ/2 = 2.4/2 + 0.0005 = 1.2005

Now f1 = −1.5599 and f2 = −1.5600. Since f1 > f2, the interval (xF, x1) can be discarded. So the next interval of uncertainty is (x1, xL) = (1.1995, 1.4).
Let us find x3 and x4 using the same method.

Iteration 2

x3 = (x1 + xL)/2 − δ/2 = 2.5995/2 − 0.0005 = 1.29925
x4 = (x1 + xL)/2 + δ/2 = 2.5995/2 + 0.0005 = 1.30025

Now f3 = −1.5600 and f4 = −1.5599. Since f3 < f4, the interval (x4, xL) = (1.30025, 1.4) can be discarded. So the next interval of uncertainty is (x1, x4) = (1.1995, 1.30025).
Let us find x5 and x6.

Iteration 3

x5 = (x1 + x4)/2 − δ/2 = 2.49975/2 − 0.0005 = 1.249375
x6 = (x1 + x4)/2 + δ/2 = 2.49975/2 + 0.0005 = 1.250375

Now f5 = −1.5624996 and f6 = −1.5624999. Since f6 < f5, the interval (x1, x5) = (1.1995, 1.249375) can be discarded. So the next interval of uncertainty is (x5, x4) = (1.249375, 1.30025).
So the minimum lies in the interval (1.249375, 1.30025) at the end of three iterations. The midpoint, 1.274813, can be taken as the required optimal value. The interval of uncertainty reduces after every iteration as

L1 = (L0 + δ)/2,  L2 = (L1 + δ)/2 = (L0 + δ)/4 + δ/2,  L3 = (L2 + δ)/2 = (L0 + δ)/8 + 3δ/4

and in general, counting n as the total number of experiments (two per iteration),

Ln = L0 / 2^(n/2) + δ (1 − 1/2^(n/2))
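The iteration above can be sketched in code (our illustration, not the book's; the tolerance is chosen just to reproduce the three iterations of the worked example):

```python
# Sketch of the dichotomous search on a unimodal function.
def dichotomous_search(f, a, b, delta=0.001, tol=0.06):
    """Shrink (a, b) by evaluating f at two points straddling the midpoint."""
    while b - a > tol:
        mid = (a + b) / 2
        x1, x2 = mid - delta / 2, mid + delta / 2
        if f(x1) < f(x2):
            b = x2        # minimum lies in (a, x2); discard (x2, b)
        else:
            a = x1        # minimum lies in (x1, b); discard (a, x1)
    return a, b

f = lambda x: x * (x - 2.5)
a, b = dichotomous_search(f, 1.0, 1.4)
print(round(a, 6), round(b, 6))   # 1.249375 1.30025
```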

1.3.4  Fibonacci Search Method

This is also an iterative method; it uses the sequence of Fibonacci numbers {Fn} for placing the experiments. These numbers are defined by

F0 = F1 = 1,  Fn = Fn−1 + Fn−2,  n = 2, 3, 4, …

which gives 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …
Besides the function and the initial interval of uncertainty, this algorithm needs the value of n (the total number of experiments) in order to execute.

Algorithm:

1.  Let the initial interval of uncertainty be L0 = [a, b], and let the number of experiments be n.
2.  Calculate the two points x1 = a + L* and x2 = b − L*, each L* units in from an endpoint, where L* = (Fn−2 / Fn) L0.
3.  Calculate f1 and f2. If f1 < f2, discard (x2, b); the new interval of uncertainty is [a, x2]. If f1 > f2, discard (a, x1); the new interval is [x1, b].
4.  Place the next point symmetrically to the retained point in the new interval; for example, with L2 = [a, x2] as the present interval, calculate x3 = a + (x2 − x1), so that the distance between x3 and a equals that between x2 and x1.
5.  Compare the function values at the two interior points of the current interval and discard the appropriate end segment, as in Step 3.
6.  Repeat Steps 4 and 5 until all n experiments have been placed, up to xn.

Example:
Find the minimum of f(x) = x(x − 5.3) in the interval [1, 4] using the Fibonacci search algorithm.

Solution:
Here the function is f(x) = x(x − 5.3) and L0 = [1, 4]. Let us assume n = 6.
Let us calculate x1 and x2 using x1 = a + L* and x2 = b − L*, where L* = (Fn−2/Fn) L0 = (5/13)(3) = 1.153846:

x1 = a + L* = 1 + 1.153846 = 2.153846
x2 = b − L* = 4 − 1.153846 = 2.846154

a = 1    x1 = 2.153846    x2 = 2.846154    b = 4

f1 = −6.776331 and f2 = −6.984024. Since f1 > f2, we discard (a, x1), and the next interval of uncertainty is L1 = [x1, b] = [2.153846, 4].
Let us calculate x3 using x3 = x1 + (b − x2) = 2.153846 + (4 − 2.846154) = 3.307692.

x1 = 2.153846    x2 = 2.846154    x3 = 3.307692    b = 4

Here f2 = −6.984024 and f3 = −6.589940. Since f2 < f3, we discard (x3, b), and the next interval of uncertainty is L2 = [x1, x3].
Let us calculate x4 using x4 = x1 + (x3 − x2) = 2.153846 + (3.307692 − 2.846154) = 2.615384.

x1 = 2.153846    x4 = 2.615384    x2 = 2.846154    x3 = 3.307692

Here f4 = −7.021301. Since f2 > f4, we discard (x2, x3), and the new interval of uncertainty is L3 = [x1, x2].
Let us calculate x5 using x5 = x1 + (x2 − x4) = 2.153846 + (2.846154 − 2.615384) = 2.384616.

x1 = 2.153846    x5 = 2.384616    x4 = 2.615384    x2 = 2.846154

Here f5 = −6.952071. Since f5 > f4, we discard (x1, x5), and the new interval of uncertainty is L4 = [x5, x2].
The final point, x6 = x5 + (x2 − x4) = 2.384616 + (2.846154 − 2.615384) = 2.615386, coincides (up to rounding) with x4, so in practice it is shifted by a small ε: x6 = x4 + ε. Since f is still decreasing at x4, f6 < f4, and we discard (x5, x4); the final interval of uncertainty is [x4, x2], of length L6 = 2.846154 − 2.615384 = 0.23077.
Here L6 / L0 = 0.23077/3 = 0.0769 ≈ 1/F6 (the reduction ratio for n = 6).
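The whole procedure can be sketched compactly (our illustration, not the book's code; the ε-shift for the final, coinciding experiment is the standard device mentioned above):

```python
# Sketch of the Fibonacci search with n experiments.
def fibonacci_search(f, a, b, n, eps=1e-6):
    fib = [1, 1]                          # F0 = F1 = 1
    for _ in range(n):
        fib.append(fib[-1] + fib[-2])
    # First two experiments, F(n-2)/F(n) of the interval in from each end.
    x1 = a + fib[n - 2] / fib[n] * (b - a)
    x2 = a + fib[n - 1] / fib[n] * (b - a)
    f1, f2 = f(x1), f(x2)
    for k in range(1, n - 1):
        # The very last experiment would coincide with the retained point,
        # so it is nudged by a small eps.
        shift = eps if k == n - 2 else 0.0
        if f1 > f2:                       # discard [a, x1]
            a, x1, f1 = x1, x2, f2
            x2 = a + fib[n - k - 1] / fib[n - k] * (b - a) + shift
            f2 = f(x2)
        else:                             # discard [x2, b]
            b, x2, f2 = x2, x1, f1
            x1 = a + fib[n - k - 2] / fib[n - k] * (b - a) - shift
            f1 = f(x1)
    return (a, x2) if f1 < f2 else (x1, b)   # final comparison picks the bracket

f = lambda x: x * (x - 5.3)
lo, hi = fibonacci_search(f, 1, 4, 6)
print(round(lo, 5), round(hi, 5))   # 2.61538 2.84615, the [x4, x2] bracket above
```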
1.3.5  Golden Section Search Method

This is again an iterative method that deletes part of the interval of uncertainty in every iteration on the basis of unimodality and the function values. Unlike the Fibonacci method, it does not need a predefined number of experiments n: the process may be terminated according to a tolerance, which amounts to taking n very large. Indeed, letting n tend to infinity in the Fibonacci method leads to the golden ratio. The golden ratio has been used since ancient times by Greek architects and engineers in their designs; they believed that any construction with sides a and b satisfying the ratio

(a + b)/a = a/b = γ

carries happiness and prosperity. From this,

1 + b/a = a/b = γ ⇒ 1 + 1/γ = γ ⇒ γ² − γ − 1 = 0

The positive root of this equation is γ = 1.61803. This is the golden ratio, and we will use it in the golden section search method.

Algorithm:

1. Given interval of uncertainty is L0 = [xF, xL].
2. Find L2* = (1/γ²) L0 and using this obtain x1 = xF + L2* and x2 = xL − L2*.
3. Find f1 and f2. If f1 > f2 then discard [xF, x1] and declare [x1, xL] as the new
   interval of uncertainty.
4. The next experiment is x3, which is obtained using the equation x3 = x1 + (xL − x2).
5. Continue this process till the desired tolerance is obtained.

Example:
Find the minimum of f(x) = x(x − 5.3) in the interval [1, 4] using the golden section
search algorithm.

Solution:
Initial interval of uncertainty is L0 = [1, 4]. Let us find L2* = (1/γ²) L0 = 0.382(3) = 1.146.
Now, x1 = xF + L2* = 1 + 1.146 = 2.146 and x2 = xL − L2* = 4 − 1.146 = 2.854.
f1 = −6.7684, f2 = −6.9808. Since f1 > f2, we can discard [xF, x1] = [1, 2.146] and the
next interval of uncertainty will be [x1, xL] = [2.146, 4].

←→
|| || | |
xF = 1 x1 = 2.146 x2 = 2.854 b=4

TABLE 1.1
Comparison of various search techniques

Method               Formula                                    n = 4             n = 10
Exhaustive search    Ln = 2 L0 / (n + 1)                        Ln = (0.4) L0     Ln = (0.18182) L0
Dichotomous search   Ln = L0 / 2^(n/2) + δ (1 − 1/2^(n/2)),     Ln = (0.25) L0    Ln = (0.03125) L0
                     δ = 0.01                                     + 0.0075          + 0.0096875
Fibonacci search     Ln = L0 / Fn                               Ln = (0.2) L0     Ln = (0.01124) L0
Golden section       Ln = (0.618)^(n−1) L0                      Ln = (0.236) L0   Ln = (0.01315) L0
search

Now the next experiment is x3 = x1 + (xL − x2) = 2.146 + (4 − 2.854) = 3.292.


←→
| | || ||
x1 = 2.146 x2 = 2.854 x3 = 3.292 xL = 4

Here, f3 = −6.610336 and f2 < f3. Hence, we will discard [x3, xL] = [3.292, 4] and the
new interval of uncertainty is [x1, x3] = [2.146, 3.292]. The process can be continued
to attain a better approximation.
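The algorithm can likewise be sketched in Python (the function name and tolerance handling are our choices). On the example above it converges to the true minimizer x* = 2.65:

```python
import math

def golden_section(f, a, b, tol=1e-5):
    """Golden-section search sketch: each iteration keeps the bracket ratio
    1/gamma = 0.618, with interior points 0.382 L from either end."""
    gamma = (1 + math.sqrt(5)) / 2        # 1.61803..., the golden ratio
    r = 1 / gamma**2                      # 0.381966..., the 0.382 used in the text
    x1, x2 = a + r * (b - a), b - r * (b - a)
    f1, f2 = f(x1), f(x2)
    while b - a > tol:
        if f1 > f2:                       # minimum lies in [x1, b]
            a, x1, f1 = x1, x2, f2
            x2 = b - r * (b - a)
            f2 = f(x2)
        else:                             # minimum lies in [a, x2]
            b, x2, f2 = x2, x1, f1
            x1 = a + r * (b - a)
            f1 = f(x1)
    return (a + b) / 2
```

Only one new function evaluation is needed per iteration because one interior point is reused, exactly as in the hand computation above.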
Table 1.1 shows a tabular comparison of the various search techniques in order to
understand their efficiency. The efficiency of a search technique depends on Ln / L0,
where L0 represents the original interval of uncertainty and Ln represents the reduced
interval of uncertainty after n experiments. The comparison clearly shows that the
Fibonacci and golden section searches reduce this ratio faster than the exhaustive and
dichotomous search techniques. As we move to higher iterations, Fibonacci proves to be
a slightly better search technique than golden section.

1.3.6  Interpolation Method (Without Using Derivative)


In this section we will see how interpolation can be used to find the minimum of a
given unimodal function without actually computing its derivative. Two methods are
covered: (1) quadratic interpolation and (2) cubic interpolation.

1.3.6.1  Quadratic Interpolation


A given unimodal function f(x), x ∈ [xF, xL], whose minimum needs to be obtained, is
approximated by a quadratic polynomial p(x). In this approach p(x) gets optimized in
the interval [xF, xL] instead of f(x). The optimum value x* can be accepted if it
satisfies |f(x*) − p(x*)| / |f(x*)| < ε, where ε is a very small predefined value.

Algorithm:

1. Initialize with xF, x1, xL such that xF < x1 < xL, where x1 can be the midpoint of
   the given interval [xF, xL].
2. Approximate the given function f(x) with the quadratic polynomial
   p(x) = a0 + a1 x + a2 x² using the following set of equations:

   f(xF) = a0 + a1 xF + a2 xF²
   f(x1) = a0 + a1 x1 + a2 x1²
   f(xL) = a0 + a1 xL + a2 xL²

   Solving these gives, with D = (xF − x1)(x1 − xL)(xL − xF),

   a0 = [f(xF) x1 xL (xL − x1) + f(x1) xL xF (xF − xL) + f(xL) x1 xF (x1 − xF)] / D
   a1 = [f(xF)(x1² − xL²) + f(x1)(xL² − xF²) + f(xL)(xF² − x1²)] / D
   a2 = −[f(xF)(x1 − xL) + f(x1)(xL − xF) + f(xL)(xF − x1)] / D

   Obtain the optimal x using x* = −a1 / (2 a2).
3. Check |f(x*) − p(x*)| / |f(x*)| < ε.
4. There will be four cases on the basis of the values x* and x1 and their functional
   values. Obtain the next interval of uncertainty as per the cases:
   CASE 1 ⇒ x* < x1, f(x*) < f(x1): new interval is [xF, x1].
            Declare new xF = xF, x1 = x*, xL = x1.
   CASE 2 ⇒ x* < x1, f(x*) > f(x1): new interval is [x*, xL].
            Declare new xF = x*, x1 = x1, xL = xL.
   CASE 3 ⇒ x* > x1, f(x*) < f(x1): new interval is [x1, xL].
            Declare new xF = x1, x1 = x*, xL = xL.
   CASE 4 ⇒ x* > x1, f(x*) > f(x1): new interval is [xF, x*].
            Declare new xF = xF, x1 = x1, xL = x*.
5. Continue to refine the interval of uncertainty till the desired approximation is
   achieved.

Example:
 1 0.9
Find the minimum value of the function f ( x ) = 0.5 − x tan −1   − using quad-
 x  1 + x2
ratic interpolation method in the interval of [0.45, 0.65].

Solution
Here xF = 0.45, x1 = 0.55, xL = 0.65 . Let ε = 0.001

Iteration 1:
Solve for p(x) = a0 + a1 x + a2 x² using the following set of equations:

f(0.45) = a0 + a1 (0.45) + a2 (0.45)²
f(0.55) = a0 + a1 (0.55) + a2 (0.55)²
f(0.65) = a0 + a1 (0.65) + a2 (0.65)²

The values are a0 = −0.548247, a1 = −0.76669, a2 = 0.6333.

    x* = −a1 / (2 a2) = 0.245938

    |f(x*) − p(x*)| / |f(x*)| = |−0.675677 + 0.698499| / |−0.675677| = 0.03377 > ε

Now, f1 = −0.778353, f* = −0.675677. Since f* > f1 (Case 2), the new interval will be
[x*, xL] = [0.2459, 0.65].

Iteration 2:

xF = 0.2459, x1 = 0.55, xL = 0.65

Solving for p(x) = a0 + a1 x + a2 x² using the following set of equations:

f(0.2459) = a0 + a1 (0.2459) + a2 (0.2459)²
f(0.55) = a0 + a1 (0.55) + a2 (0.55)²
f(0.65) = a0 + a1 (0.65) + a2 (0.65)²

    x* = −a1 / (2 a2) = 0.405530

    |f(x*) − p(x*)| / |f(x*)| = |−0.75366275 + 0.748452| / |−0.75366275| = 0.00691 > ε

Since f* > f1, the new interval will be [x*, xL] = [0.405530, 0.65].

Iteration 3:

xF = 0.40553, x1 = 0.55, xL = 0.65



    x* = −a1 / (2 a2) = 1.31452 / (2 × 1.08983) = 0.603084

    |f(x*) − p(x*)| / |f(x*)| = |−0.78 + 0.780758| / |−0.78| = 0.0009717 < ε

We have achieved the desired accuracy and can terminate the process. The minimum point
for the given function is x* = 0.603084.
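One refit step of the method follows directly from the closed-form coefficients a1 and a2 above (a Python sketch; the helper name is ours). Because the fit is exact when f itself is quadratic, a single step recovers the vertex of f(x) = x(x − 5.3) exactly:

```python
def quadratic_step(f, xF, x1, xL):
    """Fit p(x) = a0 + a1 x + a2 x^2 through (xF, x1, xL) and return the
    vertex x* = -a1/(2 a2): one iteration of quadratic interpolation."""
    fF, f1, fL = f(xF), f(x1), f(xL)
    D = (xF - x1) * (x1 - xL) * (xL - xF)
    a1 = (fF * (x1**2 - xL**2) + f1 * (xL**2 - xF**2) + fL * (xF**2 - x1**2)) / D
    a2 = -(fF * (x1 - xL) + f1 * (xL - xF) + fL * (xF - x1)) / D
    return -a1 / (2 * a2)
```

For a non-quadratic f the step is repeated, replacing one of the three bracket points according to the four cases of Step 4.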

1.3.6.2  Cubic Interpolation


The cubic interpolation approach is analogous to quadratic interpolation: the given
unimodal function f(x), x ∈ [xF, xL], whose minimum needs to be determined, is now
approximated by a cubic polynomial. That is, p(x) = a0 + a1 x + a2 x² + a3 x³ gets
optimized in the interval [xF, xL] instead of f(x). The optimum value x* can be
accepted if it satisfies |f(x*) − p(x*)| / |f(x*)| < ε, where ε is a very small
predefined value.

1.4  GRADIENT-​BASED APPROACH


Calculus suggests that a necessary condition for a given function f(x) to have a
minimum at x* is f′(x*) = 0. In this section we shall find the root of the equation
f′(x) = 0 using the Newton method and the secant method.

1.4.1  Newton Method
Newton's method, also known as the Newton–Raphson method, is a root-finding method.
Using this concept we can find a root of f′(x), and this point would be a local minimum
of f(x). In case the initial approximation is not close to x*, the method may diverge.
The quadratic approximation of the function f(x) at x = x* using Taylor's series is

    f(x) = f(x*) + (x − x*) f′(x*) + ((x − x*)²/2!) f″(x*) + ....

Differentiating once and equating to zero gives

    f′(x) = f′(x*) + (x − x*) f″(x*) = 0

Algorithm:

Step 0: Set x0 (initial approximation), ε > 0, k = 0, 1, 2, ….
Step 1: Get xk+1 = xk − f′(xk)/f″(xk)
Step 2: If |f′(xk+1)| < ε declare xk+1 as the optimal point; otherwise repeat Step 1.

Example:
Find the minimum of the function f(x) = x⁴ − x³ + 5 using the Newton method.
Take x0 = 1.

Solution:
We have f(x) = x⁴ − x³ + 5, then f′(x) = 4x³ − 3x² and f″(x) = 12x² − 6x.
First iteration: x1 = x0 − f′(x0)/f″(x0) = 1 − 1/6 = 0.833
Second iteration: x2 = x1 − f′(x1)/f″(x1) = 0.833 − 0.23037/3.32866 = 0.7637
After two iterations, the estimate of the minimum point for the given function
is 0.7637.
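Iterating the update until |f′(x)| is small gives the full method (a Python sketch with our names); continued from the example's x0 = 1 it converges to the exact minimizer x* = 0.75:

```python
def newton_min(df, d2f, x0, eps=1e-10, max_iter=50):
    """Newton's method on f'(x) = 0: x_{k+1} = x_k - f'(x_k)/f''(x_k)."""
    x = x0
    for _ in range(max_iter):
        if abs(df(x)) < eps:       # gradient small enough: stop
            break
        x -= df(x) / d2f(x)        # Newton update
    return x
```

Convergence is quadratic near the root, but as noted above the iteration may diverge if x0 is far from x*.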

1.4.2  Secant Method
This method is again used to approximate a root of f′(x), which eventually happens to
be a minimum of f(x). Here, we use a secant to approximate the root instead of a
tangent. Let us have two points 'a' and 'b' on the function f′(x) such that
f′(a)·f′(b) < 0. Then, we can evaluate the next approximation using the secant through
(a, f′(a)) and (b, f′(b)) as

    xk = b − [(b − a)/(f′(b) − f′(a))] f′(b)

For the next iteration, if f′(a)·f′(xk) < 0, then (a, f′(a)) and (xk, f′(xk)) form the
new secant; otherwise, f′(b)·f′(xk) < 0, and (b, f′(b)) and (xk, f′(xk)) form the new
secant. Continue the process till the desired accuracy is achieved.

Algorithm:

Step 0: Set [a, b] (initial approximation) such that f′(a)·f′(b) < 0, ε > 0,
        k = 0, 1, 2, ….
Step 1: Get xk = b − [(b − a)/(f′(b) − f′(a))] f′(b)
Step 2: If f′(a)·f′(xk) < 0 then (a, f′(a)), (xk, f′(xk)) is the new secant; otherwise,
        if f′(b)·f′(xk) < 0 then (b, f′(b)), (xk, f′(xk)) is the new secant.
Step 3: If |f′(xk)| < ε declare xk as the optimal point; otherwise repeat Step 1.
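The algorithm translates directly into Python (a sketch; names are ours). Applied to f(x) = x⁴ − x³ + 5 from the Newton example, with a bracket [0.5, 1] on which f′ changes sign, it converges to the same minimizer x* = 0.75:

```python
def secant_min(df, a, b, eps=1e-8, max_iter=200):
    """Secant method on f'(x) = 0, keeping a bracket with f'(a) f'(b) < 0."""
    fa, fb = df(a), df(b)
    x = b
    for _ in range(max_iter):
        x = df and b - (b - a) * fb / (fb - fa)   # secant through the two endpoints
        fx = df(x)
        if abs(fx) < eps:
            break
        if fa * fx < 0:            # root bracketed in [a, x]
            b, fb = x, fx
        else:                      # root bracketed in [x, b]
            a, fa = x, fx
    return x
```

Because the sign-change bracket is maintained, the iterate can never escape [a, b], at the cost of somewhat slower (roughly linear) convergence than the unguarded secant step.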

TRY YOURSELF
Q1. Find the minimum of the function f(x) = x⁵ − 5x³ − x + 25 using the following
    methods:
    (a) Unrestricted search technique using initial interval of uncertainty as (0, 3)
    (b) Dichotomous search technique using initial interval of uncertainty as (0, 3).
        Take δ = 0.001

    (c) Golden section search technique using initial interval of uncertainty as (0, 3)
    (d) Fibonacci search algorithm using initial interval of uncertainty as (0, 3)
    Answer: 1.751
Q2. Find the minimum of the function f(x) = (x/2)/log(x/3) using the following
    methods taking the initial guess as x0 = 8.5.
    (a) Quadratic interpolation method
    (b) Cubic interpolation method
    (c) Newton method
    (d) Secant method
    Answer: 8.155
Q3. Find the number of experiments to be conducted in the following methods to
    obtain a value of (1/2)(Ln/L0) = 0.01:
    (a) Exhaustive search
    (b) Dichotomous search with δ = 0.001
    (c) Fibonacci method
    (d) Golden section method
    Answer: (a) n ≥ 99, (b) n ≥ 14, (c) n ≥ 9, (d) n ≥ 10

Q4. Find the maximum of the function f(x) = 6x/(x² − 3x + 5) using the following
    methods taking the initial guess as x0 = 2.5. Comment on which method converges
    faster to the optimum.
    (a) Quadratic interpolation method
    (b) Cubic interpolation method
    (c) Newton method
    (d) Secant method
    Answer: 2.236

2 Unconstrained Multivariable Optimization

2.1  INTRODUCTION
In this chapter, we will discuss several methods, with their algorithms, for solving
unconstrained multivariable optimization problems. We will also demonstrate the working
of these algorithms with suitable examples. There are two approaches to finding the
optimal solution of a multivariable problem. If the function is smooth and
differentiable, gradient-based methods are followed, which are also known as indirect
methods. If this is not the case, then the optimal point of the given function can be
obtained by different search algorithms, popularly known as direct search methods. The
tree diagram in Figure 2.1 illustrates the different methods that are available for
solving multivariable optimization problems.

2.2  DIRECT SEARCH METHODS


These methods are applicable to functions that are not differentiable and may not even
be continuous. Different search techniques, using various logics, approximate the
extreme points. Even though many techniques are available under direct search methods,
we will discuss the simplex method, the Hooke–Jeeves method, and Powell's method in
detail. We will also cover the key points of random search, grid search, and univariate
search, as these are basic and less commonly used methods. The problem is

    Minimize f(x1, x2, …, xn) over x1, x2, …, xn

In general, search algorithms have the following structure:

1. x^(K+1) = x^(K) + Δx^(K) d^(K), where d^(K) is the search direction and Δx^(K) is
   the step length (increment).
2. We need to start with an initial guess and must specify a termination criterion.


OPTIMIZATION (MIN/MAX) FOR MULTIVARIABLE PROBLEM

DIRECT SEARCH METHODS              INDIRECT METHODS (Gradient-based methods)
• Random Search Method             • Using Hessian Matrix
• Grid Search Method               • Steepest Descent Method
• Univariate Search Method         • Newton's Method
• Pattern Search Algorithm         • Quasi Method
  i) Hooke–Jeeves Method
  ii) Powell's Method
• Simplex Method

FIGURE 2.1  Tree diagram of available methods for multivariable optimization problems.

2.2.1  Random Search Method


As the name suggests, here the functional value is calculated at randomly sampled
values of the bounded decision variables. Let xi, i ∈ [1, n], be the decision variables
with lower and upper bounds li and ui respectively for the objective function
Minimize f(x1, x2, …, xn).
The random search approach uses random numbers 0 < {r11, r12, r13, …, r1n} < 1 to
obtain the first approximation as

    X1 = [x1, x2, …, xn]ᵀ = [l1 + r11(u1 − l1), l2 + r12(u2 − l2), …, ln + r1n(un − ln)]ᵀ

Similarly, a new set of random numbers 0 < {r21, r22, r23, …, r2n} < 1 can be
generated, and X2 can be obtained by

    X2 = [l1 + r21(u1 − l1), l2 + r22(u2 − l2), …, ln + r2n(un − ln)]ᵀ

Now, obtain f(X1), f(X2), …, f(XN), … and pick XK such that
f(XK) = Min{f(X1), f(X2), …, f(XN), …}.
This method is also suitable for functions with discontinuities and non-differentiable
points and can find both local and global extremes. Still, this method is not
efficient, as it needs too many functional evaluations to reach a conclusion, but it
can be fused with other methods to obtain the global optimum.
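A minimal Python sketch of this approach (the function name, sample count, and test problem are our choices):

```python
import random

def random_search(f, bounds, n_samples=5000, seed=1):
    """Random search sketch: sample X uniformly inside the variable bounds
    and keep the point with the smallest objective value."""
    rng = random.Random(seed)
    best_x, best_f = None, float("inf")
    for _ in range(n_samples):
        x = [lo + rng.random() * (hi - lo) for lo, hi in bounds]   # li + r(ui - li)
        fx = f(x)
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x, best_f
```

With 5000 samples on [−5, 5]², the best sample for a hypothetical test objective such as f = (x1 − 1)² + (x2 + 2)² lands close to the optimum (1, −2), illustrating both the simplicity and the high evaluation cost of the method.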

2.2.2  Grid Search Method


This method divides the given design space into a grid, and the value of the objective
function is calculated at each node to get an idea of the extremes. An objective
function with two variables has a design grid in two dimensions, which is easy to
interpret. In general, a grid with p points along each of n design variables contains
pⁿ nodes: for example, two design variables with four grid points each give 4² = 16
nodes, whereas five design variables at the same resolution need 4⁵ = 1024 nodes. It is
obvious that the computational cost is too high for problems with a large number of
decision variables, and this method too is not an efficient search method to find the
optimal solution.

2.2.3  Univariate Search Method


In this method, only one variable is varied at a time, keeping the other variables
constant. The value of that variable can be optimized by any of the methods discussed
in Chapter 1.

Algorithm:

Step 0: Set x0, k = 0, 1, 2, ….
Step 1: Find the search direction dk as

        dkᵀ = (1, 0, 0, …, 0) for k = 1, n + 1, 2n + 1, …
              (0, 1, 0, …, 0) for k = 2, n + 2, 2n + 2, …
              (0, 0, 1, …, 0) for k = 3, n + 3, 2n + 3, …
              ……………………………………
              (0, 0, 0, …, 1) for k = n, 2n, 3n, …

Step 2: Find fk = f(xk), f⁺ = f(xk + ε dk), f⁻ = f(xk − ε dk).
        For a minimization problem:
        if f⁺ < fk, then dk is the correct direction;
        if f⁻ < fk, then −dk is the correct direction.
Step 3: Find the optimal step length Δxk* such that
        f(xk ± Δxk* dk) = min over Δxk of f(xk ± Δxk dk).
Step 4: Set xk+1 = xk ± Δxk* dk and f(xk+1) = fk+1.
Step 5: Set k = k + 1 and go to Step 1. Continue the process till the desired accuracy
        is achieved.

Example:
Find Min f(x1, x2) = x1 − x2 + 2x1² + 2x1x2 + x2² using the univariate method.

Solution:
0 
Let us set x0 =   , ε = 0.01.
0 
1 
Step 1: let d0 =  
0 
 0  1  0.01
Step 2:  f0 = 0 ,  f + = f ( x0 + εd0 ) = f    + 0.01   = f   = 0.0102
 0  0   0 
18

18 Unconstrained Multivariable Optimization

 0  1   −0.01
f − = f ( x0 − εd0 ) = f    − 0.01   = f   = −0.0098
 0  0   0 

Since, f − < f0, −d0 will be direction for minimization.


 0 1   −∆x0  2
Step 3:  f ( x0 + ∆x0 d0 ) = f    − ∆x0   = f   = 2( ∆x0 ) − ∆x0 .
 0  0
    0 
df
= 0 ⇒ ∆x0* = 0.25
d ∆x0
0  1   −0.25
Thus x1 = x0 − ∆x0* d0 =   − 0.25   =  
0
  0   0 
Here,  f1 = −0.125
0 
Step 4: Choose the search direction d1 =  
1 
  −0.25 0  −0.25
f + = f ( x1 + εd1 ) = f    + 0.01   = f   = −0.1399
 0  1  0.01 
  −0.25 0  −0.25
f − = f ( x1 − εd1 ) = f    − 0.01   = f   = −0.1099.
 0  1  −0.01
Since f + < f1 , d1 is correct choice for the direction.

  −0.25 0   −0.25
Step 5: f ( x1 + ∆x1d1 ) = f    − ∆x1 1  = f  ∆x 
 0     1 
= ( ∆x1 )2 − (1.5)∆x1 − 0.375.

df
= 0 ⇒ ∆x1* = 0.75
d ∆x1

 −0.25 0   −0.25
Thus x2 = x1 + ∆x1* d1 =   + 0.75 1  =  0.75 
 0     
f2 = −0.6875
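The scheme can be automated as cyclic coordinate descent (a Python sketch; the crude shrinking-step line search stands in for the analytic step-length calculation used above). On the example function it converges to the true minimum at (−1, 1.5):

```python
def univariate_search(f, x, n_cycles=25, step=1.0, h_min=1e-9):
    """Cyclic univariate search sketch: minimize along one coordinate at a
    time with a shrinking-step line search."""
    for _ in range(n_cycles):
        for i in range(len(x)):
            h = step
            while h > h_min:
                for cand in (x[i] + h, x[i] - h):        # try +h and -h moves
                    trial = x[:i] + [cand] + x[i + 1:]
                    if f(trial) < f(x):
                        x = trial                        # accept the improving move
                        break
                else:
                    h /= 2                               # no improvement: shrink step
    return x
```

Each pass optimizes one coordinate while the others are held fixed, exactly as in the hand computation, before cycling to the next coordinate.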

2.2.4  Pattern Search Algorithm


There are certain methods that use pattern directions; these are called pattern search
algorithms. In this section we shall discuss the Hooke–Jeeves method and Powell's
method. Such an algorithm searches for the minimum along the pattern direction Si
defined by Si = Xi − Xi−1, where Xi is the point at the end of n univariate steps and
Xi−1 is the beginning of the n univariate steps.


2.2.4.1  Hooke–​Jeeves  Method


It is a sequential technique in which each step consists of two kinds of moves:

(a) Exploratory move – It is performed in the neighbourhood of the current point using
    the univariate technique to explore the local behaviour of the objective function.
    The aim of the exploratory move is to obtain the best possible point around the
    current point.
(b) Pattern move – It is performed along a pattern direction. The pattern direction is
    obtained from the current best point and the previous point using the formula
    xp^(k+1) = x^(k) + (x^(k) − x^(k−1)); but if the new point is not an improved
    point, then we need to repeat the exploratory move with a smaller step length.

Algorithm:

Step 0: Define starting point x^(0); increments Δi, i = 1, 2, …, n; step reduction
        factor α > 1; ε > 0 (termination factor). Set k = 0.
Step 1: Perform an exploratory search with x^(k) as the base point. Let x be the output
        of the exploratory move. If the exploratory move is a success, set x^(k) = x
        and go to step 3; otherwise go to step 2.
Step 2: If |Δi| < ε, stop; the current solution ≈ x*. Else set Δi = Δi/α and go to
        step 1.
Step 3: Set k = k + 1 and perform the pattern move: xp^(k+1) = x^(k) + (x^(k) − x^(k−1)).
Step 4: Perform an exploratory search with xp^(k+1) as the base point. Say the output
        is x^(k+1).
Step 5: If f(x^(k+1)) < f(x^(k)), go to step 3; or else go to step 2.

Example:
Find the minimum of f(x, y) = (x² + y − 11)² + (x + y² − 7)². Let the initial
approximation be X^(0) = (x^(0), y^(0)) = (0, 0).

Solution:

1st Iteration
Let X (0 ) = ( x ( 0 ) , y( 0 ) ) = (0, 0), ∆1 = 0.5
Let us go for first exploratory move in x-​direction

(0, 0) ⇒ (0 + 0.5, 0) = (0.5, 0)


(0, 0) ⇒ (0, 0)
(0, 0) ⇒ (0 − 0.5, 0) = ( −0.5, 0)

Functional value at possible moves is

f (0.5, 0) = 157.8, f (0, 0) = 170, f ( −0.5, 0) = 171.81,

Since f(0.5, 0) = 157.81 is the least, we set (x^(0), y^(0)) = (0.5, 0) and use it to
move in the y-direction.

(0.5, 0) ⇒ (0.5, 0 + 0.5) = (0.5, 0.5)


(0.5, 0) ⇒ (0.5, 0)
(0.5, 0) ⇒ (0.5, 0 − 0.5) = (0.5, −0.5)

Functional value at possible moves is

f (0.5, 0.5) = 144.12, f (0.5, 0) = 157.81, f (0.5, −0.5) = 165.62,

Since f(0.5, 0.5) = 144.12 is the least, we finally have XE^(1) = (0.5, 0.5).


Now we will apply pattern move by

( )
X (pk +1) = X ( k ) + X ( k ) − X ( k −1) = 2 X ( k ) − X ( k −1)
X (p2 ) = 2(0.5, 0.5) − (0, 0) = (1,1)

2nd Iteration: let X (2 ) = (1,1). We shall move on with x-​direction

(1,1) ⇒ (1 + 0.5,1) = (1.5,1)


(1,1) ⇒ (1,1) = (1,1)
(1,1) ⇒ (1 − 0.5,1) = (0.5,1)

f (1.5,1) = 80.3125, f (1,1) = 106, f (0.5,1) = 125.3

We will choose f (1.5,1) = 80.3125 to explore in y-​direction.

(1.5,1) ⇒ (1.5,1 + 0.5) = (1.5,1.5)


(1.5,1) ⇒ (1.5,1) = (1.5,1)
(1.5,1) ⇒ (1.5,1 − 0.5) = (1.5, 0.5)

f(1.5, 1.5) = 63.12, f(1.5, 1) = 80.31, f(1.5, 0.5) = 95.63

Now, X E(2 ) = (1.5,1.5). Let us compute pattern move using last two moves

X (p3) = 2(1.5,1.5) − (0.5, 0.5) = (2.5, 2.5)

Continuing the process, we get Xp^(4) = 2(3, 2) − (1.5, 1.5) = (4.5, 2.5), and the
exploratory move around it gives X^(4) = (4, 2).

We have f(X^(1)) = 144.12, f(X^(2)) = 63.12, f(X^(3)) = 0, f(X^(4)) = 50.

Since f(X^(4)) > f(X^(3)), the pattern move has failed; we return to X^(3) = (3, 2) and
restart the exploratory move with the reduced increment Δ/α = (0.25, 0.25).
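Both kinds of moves can be combined into a compact Python sketch (names are ours; the exploratory move accepts the first improving ±Δ step per coordinate instead of evaluating all three points as in the hand computation). Starting from (0, 0) on the Himmelblau function above, it reaches the minimum (3, 2), where f = 0:

```python
def hooke_jeeves(f, x, delta=0.5, alpha=2.0, eps=1e-6):
    """Hooke-Jeeves sketch: exploratory moves of size delta around the base
    point, chained pattern moves, and delta/alpha on failure."""
    def explore(base, d):
        y = list(base)
        for i in range(len(y)):
            for cand in (y[i] + d, y[i] - d):
                trial = y[:i] + [cand] + y[i + 1:]
                if f(trial) < f(y):
                    y = trial              # accept improving coordinate move
                    break
        return y

    while delta > eps:
        new = explore(x, delta)
        if f(new) < f(x):                  # exploratory success
            while True:                    # chain pattern moves while they pay off
                pattern = [2 * n - o for n, o in zip(new, x)]
                x, cand = new, explore(pattern, delta)
                if f(cand) < f(new):
                    new = cand
                else:
                    break
            x = new
        else:
            delta /= alpha                 # failure: reduce the increment
    return x
```

The chained pattern loop mirrors the hand computation above: (0, 0) → (0.5, 0.5) → (1.5, 1.5) → (3, 2), after which the failed move toward (4.5, 2.5) triggers the step reduction.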
2.2.4.2  Powell’s Method
This method is also known as Powell's conjugate direction method and is one of the most
widely used direct search methods. It is a unique extension of the basic pattern search
method. Basically, this method minimizes a quadratic function in a finite number of
steps, exploiting the fact that any non-linear function can be approximated by a
quadratic function near its minimum.
Powell's method generates n linearly independent search directions, and a
unidirectional search is performed from the previous best point along each of them. In
general, for a non-linear unconstrained multivariable function more than one cycle of n
unidirectional searches is required, whereas Powell's method finds the minimum of a
quadratic function in a finite number of such cycles. This works because it uses
conjugate directions instead of arbitrary search directions. Two directions d^(1) and
d^(2) are called conjugate with respect to a positive definite matrix Q if

    (d^(1))ᵀ Q d^(2) = 0

Let Q be an (n × n) square positive definite symmetric matrix. A set of n linearly
independent search directions d^(i) is called conjugate with respect to the matrix Q if

    (d^(i))ᵀ Q d^(j) = 0,  ∀ i ≠ j, i = 1, 2, 3, …, n, j = 1, 2, 3, …, n

Parallel Subspace Property


Consider a quadratic function of two variables:

    q(x) = a + bᵀx + ½ xᵀQx,  where a is a scalar, b is a vector, and Q is a 2 × 2 matrix.

Let y^(1) be the solution to the problem: minimize q(x^(1) + λd) over λ, and let y^(2)
be the solution to the problem: minimize q(x^(2) + λd) over λ. Then the direction
(y^(2) − y^(1)) is conjugate to d. Equivalently,

    (y^(2) − y^(1))ᵀ Q d = 0
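The property is easy to verify numerically (a Python sketch; the particular quadratic, with Q = [[2, 1], [1, 4]] and b = 0, and the closed-form line minimum t* = −(Qx + b)·d / (d·Qd) are our choices):

```python
def exact_line_min(Q, b, x, d):
    """Exact minimizer of q(x + t d) for q(x) = 0.5 x^T Q x + b^T x."""
    g = [Q[0][0] * x[0] + Q[0][1] * x[1] + b[0],
         Q[1][0] * x[0] + Q[1][1] * x[1] + b[1]]        # gradient Qx + b
    Qd = [Q[0][0] * d[0] + Q[0][1] * d[1],
          Q[1][0] * d[0] + Q[1][1] * d[1]]
    t = -(g[0] * d[0] + g[1] * d[1]) / (d[0] * Qd[0] + d[1] * Qd[1])
    return [x[0] + t * d[0], x[1] + t * d[1]]

Q, b, d = [[2.0, 1.0], [1.0, 4.0]], [0.0, 0.0], [1.0, 0.0]
y1 = exact_line_min(Q, b, [0.0, 1.0], d)    # minimize along d from x(1)
y2 = exact_line_min(Q, b, [0.0, -1.0], d)   # minimize along d from x(2)
s = [y2[0] - y1[0], y2[1] - y1[1]]          # direction y(2) - y(1)
Qd = [Q[0][0] * d[0] + Q[0][1] * d[1], Q[1][0] * d[0] + Q[1][1] * d[1]]
conj = s[0] * Qd[0] + s[1] * Qd[1]          # (y2 - y1)^T Q d: zero up to round-off
```

Here y1 = (−0.5, 1) and y2 = (0.5, −1), so (y2 − y1) = (1, −2) is indeed Q-conjugate to d = (1, 0).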

Algorithm:

Step 0: Set x^(0) (initial approximation); set d^(i) as a set of n linearly independent
        directions.
Step 1: Perform unidirectional searches along d^(1) through d^(n).
Step 2: Use the extended parallel subspace property to form a new conjugate
        direction, d.
Step 3: If ‖d‖ is small or the search directions have become linearly dependent,
        terminate the process.
        Else replace d^(j) = d^(j−1) ∀ j = n, n − 1, n − 2, …, 2, set d^(1) = d/‖d‖,
        and go to step 1.

Example:
Find the minimum of f(x1, x2) = 2x1³ + 4x1x2³ − 10x1x2 + x2², x^(0) = (5, 2)ᵀ.

Solution:
Step 0: Set d^(1) = (1, 0)ᵀ, d^(2) = (0, 1)ᵀ, x^(0) = (5, 2)ᵀ.
Step 1: f(x^(0)) = 314; we need to find the minimum of f(x^(0) + λ d^(2)):

    f(x^(0) + λ d^(2)) = f((5, 2 + λ)ᵀ) = 20λ³ + 121λ² + 194λ + 314

    df/dλ = 60λ² + 242λ + 194 = 0 ⇒ λ* = −1.1036, and
    d²f/dλ² = 120λ + 242 = 109.6 > 0 at λ = λ*.

    x^(1) = x^(0) + λ* d^(2) = (5, 2)ᵀ − 1.1036 (0, 1)ᵀ = (5, 0.8964)ᵀ

Step 2: f(x^(1)) = 220.39; we need to find the minimum of f(x^(1) + λ d^(1)):

    f(x^(1) + λ d^(1)) = f((5 + λ, 0.8964)ᵀ) = 2λ³ + 30λ² + 143.92λ + 220.39

    df/dλ = 6λ² + 60λ + 143.92 = 0 ⇒ λ* = −3.9931, and
    d²f/dλ² = 12λ + 60 = 12.1 > 0 at λ = λ*.

    x^(2) = x^(1) + λ* d^(1) = (5, 0.8964)ᵀ − 3.9931 (1, 0)ᵀ = (1.0069, 0.8964)ᵀ,
    f(x^(2)) = −3.2797.

2.2.5  Simplex Algorithm


This is a popular search technique that uses (n + 1) points for a problem with n
decision variables. On the basis of the functional values at these (n + 1) points, the
worst point is identified and omitted, and the centroid of the remaining points is
calculated. Further, a new point is calculated on the basis of reflection, contraction,
and expansion. The precise algorithm is given below:

Algorithm:

Step 0: Set (n + 1) points to define the initial simplex. Set α (≈ 1), γ > 1,
        β ∈ (0, 1), and ε > 0 (for termination).
Step 1: Get the worst and best points from these (n + 1) points. Let xw and xb be the
        worst and best points.
Step 2: Calculate the centroid xc from the remaining n points using

        xc = (1/n) Σ_{i=1, i≠w}^{n+1} xi

Step 3: Replace the worst point with a new point xnew, obtained from xc through the
        process of reflection, xr = (1 + α) xc − α xw:
        (i)   If f(xr) < f(xb), then xnew = (1 + γ) xc − γ xw (expansion)
        (ii)  If f(xr) ≥ f(xw), then xnew = (1 − β) xc + β xw (contraction)
        (iii) If f(xg) < f(xr) < f(xw), where xg is the next-to-worst point, then
              xnew = (1 + β) xc − β xw (contraction)
        Declare xnew as the improved point.
Step 4: If { Σ_{i=1}^{N+1} [f(xi) − f(xb)]² / (N + 1) }^{1/2} ≤ ε, then stop; or else
        go to step 2.
 

Example:
Minimize f(x1, x2) = x1 − x2 + 2x1² + 2x1x2 + x2².
For the initial simplex consider X1 = (4, 4)ᵀ, X2 = (5, 4)ᵀ, X3 = (4, 5)ᵀ and α = 1.0,
β = 0.5, γ = 2.0; set ε = 0.2.

Solution:
Iteration 1:

Step 1: Since there are two unknowns, we initially need three simplex points
        X1, X2, X3: f(X1) = 80, f(X2) = 107, f(X3) = 96.
        Then X1 = (4, 4)ᵀ is the best point, whereas X2 = (5, 4)ᵀ is the worst point.
        Therefore, Xb = (4, 4)ᵀ, Xw = (5, 4)ᵀ.
Step 2: The centroid is Xc = ½(X1 + X3) = (4, 4.5)ᵀ, where f(Xc) = 87.75.
Step 3: The reflection point is Xr = 2Xc − Xw = 2(4, 4.5)ᵀ − (5, 4)ᵀ = (3, 5)ᵀ, where
        f(Xr) = 71.
Step 4: As f(Xr) < f(Xb), we go with expansion:
        Xe = 2Xr − Xc = 2(3, 5)ᵀ − (4, 4.5)ᵀ = (2, 5.5)ᵀ, where f(Xe) = 56.75.
Step 5: Since f(Xe) < f(Xb), we replace Xw by Xe.
Step 6: Calculating Q for convergence gives 19.06 > ε, so we continue with the next
        iteration.
4 2 4
Iteration 2: We haveX1 =   , X 2 =   , X3 =  ,
4
  5
 . 5  5
24

24 Unconstrained Multivariable Optimization

Step 1: f ( X1 ) = 80, f ( X 2 ) = 56.75, f ( X3 ) = 96


2 4
X b =   , X w =   are best and worst points respectively.
5.5  5
1  3 
Step 2: The centroid X c is X c = ( X1 + X 2 ) =   , where f ( X c ) = 67.31
2  4.75
Step 3: The reflection point is
 3  4  2 
Xr = 2 X c − X w = 2   −   =   , where f ( Xr ) = 43.75
 4.75  5   4.5
Step 4: As f ( Xr ) < f ( X b ), we will go with expansion as
2   3   1 
X e = 2 Xr − X c = 2   −  =  , where f ( X e ) = 23.3125
 4.5  4.75  4.25

Step 5:  f ( X e ) < f ( X b ), we replace X w by X e, we obtain the new vertices as


4 2  1 
X1 =   , X 2 =   , X3 =  
4 5.5  4.25
Step 6: For convergence, we compute Q = 26.1 > ε, we will go to next iteration.
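The reflection–expansion–contraction cycle can be sketched in Python (a simplified variant of the algorithm above: names and the fixed iteration count are ours, and as a standard safeguard the expansion point is kept only when it beats the reflected point). Run on the example, it drives the simplex toward the true minimum (−1, 1.5), where f = −1.25:

```python
def simplex_min(f, pts, alpha=1.0, beta=0.5, gamma=2.0, n_iter=300):
    """Simplex (reflection/expansion/contraction) sketch for n variables
    with n + 1 points, following the cases of Step 3."""
    for _ in range(n_iter):
        pts.sort(key=f)                       # pts[0] best ... pts[-1] worst
        best, worst, second_worst = pts[0], pts[-1], pts[-2]
        n = len(pts) - 1
        c = [sum(p[j] for p in pts[:-1]) / n for j in range(n)]   # centroid w/o worst
        xr = [(1 + alpha) * cj - alpha * wj for cj, wj in zip(c, worst)]
        fr = f(xr)
        if fr < f(best):                      # (i) expansion, kept only if better
            xe = [(1 + gamma) * cj - gamma * wj for cj, wj in zip(c, worst)]
            new = xe if f(xe) < fr else xr
        elif fr >= f(worst):                  # (ii) contraction toward the worst point
            new = [(1 - beta) * cj + beta * wj for cj, wj in zip(c, worst)]
        elif fr > f(second_worst):            # (iii) contraction on the reflected side
            new = [(1 + beta) * cj - beta * wj for cj, wj in zip(c, worst)]
        else:
            new = xr                          # plain reflection accepted
        pts[-1] = new                         # replace the worst vertex
    pts.sort(key=f)
    return pts[0]
```

The first two passes reproduce the hand computation, replacing (5, 4) by (2, 5.5) and then (4, 5) by (1, 4.25).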

2.3  GRADIENT-​BASED METHODS


2.3.1  Using Hessian Matrix
Let the objective function be Min z = f(x) = f(x1, x2, …, xn).
The gradient vector of f(x) is denoted by

    ∇f(x) = [∂f(x)/∂x1, ∂f(x)/∂x2, …, ∂f(x)/∂xn]ᵀ

and the Hessian matrix is

           [ ∂²f(x)/∂x1²     ∂²f(x)/∂x1∂x2   …   ∂²f(x)/∂x1∂xn ]
    H(x) = [ ∂²f(x)/∂x2∂x1   ∂²f(x)/∂x2²     …   ∂²f(x)/∂x2∂xn ]
           [      :                :                   :        ]
           [ ∂²f(x)/∂xn∂x1   ∂²f(x)/∂xn∂x2   …   ∂²f(x)/∂xn²   ]

Necessary condition: For a continuous function f(x) to have an extreme point at x = x0,
the gradient must vanish: ∇f(x0) = 0, that is,

    ∂f(x0)/∂x1 = ∂f(x0)/∂x2 = … = ∂f(x0)/∂xn = 0
∂x1 ∂ x2 ∂x n

Sufficient condition: A stationary point x = x0 is an extreme point if the Hessian
matrix H(x0) is

• positive definite, in which case x = x0 is a minimum point, or
• negative definite, in which case x = x0 is a maximum point.

Example:
Determine the maximum of the function f(x1, x2) = x1 + 2x2 + x1x2 − x1² − x2².

Solution:
The necessary condition for a local optimum is that the gradient vanishes:

    ∇f(x) = [∂f/∂x1, ∂f/∂x2] = [1 + x2 − 2x1, 2 + x1 − 2x2] = [0, 0]

The stationary point is x0 = (4/3, 5/3).
The sufficient condition uses the Hessian matrix:

    H(x) = [ ∂²f/∂x1²     ∂²f/∂x1∂x2 ]   [ −2   1 ]
           [ ∂²f/∂x2∂x1   ∂²f/∂x2²   ] = [  1  −2 ]

Since H(x) is negative definite, the stationary point x0 = (4/3, 5/3) is a local
maximum of the function f(x).
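The two conditions are easy to check numerically (a Python sketch; central finite differences are our choice). At (4/3, 5/3) the gradient of the example function vanishes, and the leading minors of H = [[−2, 1], [1, −2]], namely −2 < 0 and det = 3 > 0, confirm negative definiteness:

```python
def num_grad(f, x, h=1e-6):
    """Central-difference gradient (exact for quadratics up to round-off)."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

f = lambda v: v[0] + 2 * v[1] + v[0] * v[1] - v[0] ** 2 - v[1] ** 2
g = num_grad(f, [4 / 3, 5 / 3])        # ~ (0, 0) at the stationary point
H = [[-2.0, 1.0], [1.0, -2.0]]
minors = (H[0][0], H[0][0] * H[1][1] - H[0][1] * H[1][0])   # (-2, 3)
```

Alternating signs of the leading minors (negative, positive) is the standard test for a negative definite 2 × 2 matrix.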

2.3.2  Steepest Descent Method


This is a gradient-based method that uses gradient information to locate a local
minimum of the function. In principle, the functional value increases fastest along the
gradient direction; hence, it decreases fastest in the negative gradient direction. On
this basis we can take steepest ascent steps along the gradient if we need to achieve a
maximum, and steepest descent steps along the negative gradient if we need to achieve a
minimum. Here we demonstrate the algorithm as well as an example for a minimum, which
is the steepest descent method. In general, we start with an initial guess and move in
the direction of −∇f with a step length α.

Algorithm:

Step 0: Set x (0 ) , k = 0
Step 1: d ( K ) = −∇f ( x ( K ) ). If d ( K ) = 0, then stop.

Step 2: Solve min over α of f(x^(K) + α d^(K)) for the step size α^(K), chosen by a
        line search method.
Step 3: Set x^(K+1) = x^(K) + α^(K) d^(K), K = K + 1. Go to step 1.

Note: d^(K) = −∇f(x^(K)) is a descent direction; it follows that f(x^(K+1)) < f(x^(K)).

Example:
Determine the minimum of the function f(x1, x2) = x1 − x2 + 2x1² + 2x1x2 + x2² using
the steepest descent method with initial guess x0 = (0, 0)ᵀ.

Solution:
For f(x1, x2) = x1 − x2 + 2x1² + 2x1x2 + x2², we have
∇f = (1 + 4x1 + 2x2, −1 + 2x1 + 2x2)ᵀ.

First iteration:
We have x^(0) = (0, 0)ᵀ and ∇f at x^(0) = (1, −1)ᵀ ⇒ d^(0) = (−1, 1)ᵀ.
min over α of f(x^(0) + α d^(0)) = min over α of f((−α, α)ᵀ) = α² − 2α;
df/dα = 0 ⇒ α = 1. Now generate x^(1) by x^(K+1) = x^(K) + α d^(K) taking K = 0:
x^(1) = x^(0) + α d^(0) = (0, 0)ᵀ + 1 (−1, 1)ᵀ = (−1, 1)ᵀ. Since ∇f at
x^(1) = (−1, −1)ᵀ ≠ 0, we go for the next iteration.

Second iteration: x^(1) = (−1, 1)ᵀ and ∇f at x^(1) = (−1, −1)ᵀ ⇒ d^(1) = (1, 1)ᵀ.
The optimal step is α = 0.2, so x^(2) = x^(1) + α d^(1) = (−1, 1)ᵀ + 0.2 (1, 1)ᵀ
= (−0.8, 1.2)ᵀ.
Since ∇f at x^(2) = (0.2, −0.2)ᵀ ≠ 0, we continue the process till the desired accuracy
is achieved.
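For a quadratic f(x) = ½xᵀQx + bᵀx the exact line search has the closed form α = (gᵀg)/(gᵀQg), which lets the whole method be sketched in Python (the function name and the closed-form step are our choices). With Q = [[4, 2], [2, 2]] and b = (1, −1), matching the example, it converges to the minimum (−1, 1.5):

```python
def steepest_descent(Q, b, x, tol=1e-12, max_iter=500):
    """Steepest descent sketch for f(x) = 0.5 x^T Q x + b^T x (2 variables),
    with the exact step length alpha = (g.g)/(g.Qg)."""
    for _ in range(max_iter):
        g = [Q[0][0] * x[0] + Q[0][1] * x[1] + b[0],
             Q[1][0] * x[0] + Q[1][1] * x[1] + b[1]]      # gradient Qx + b
        gg = g[0] * g[0] + g[1] * g[1]
        if gg < tol:
            break
        Qg = [Q[0][0] * g[0] + Q[0][1] * g[1],
              Q[1][0] * g[0] + Q[1][1] * g[1]]
        alpha = gg / (g[0] * Qg[0] + g[1] * Qg[1])        # exact line-search step
        x = [x[0] - alpha * g[0], x[1] - alpha * g[1]]    # move along -grad
    return x
```

The first step reproduces the hand computation: g = (1, −1), α = 1, x(1) = (−1, 1).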

2.3.3  Newton’s Method
This method approximates the given function by its second-order Taylor approximation.
That approximation is then optimized using the necessary and sufficient conditions of
calculus for an optimal value:

    f(x) ≈ f(x0) + ∇f(x0)ᵀ h + (1/2!) hᵀ H(x0) h,  x = x0 + h

It can be represented in standard form as q(x) = ½ xᵀHx + bᵀx + c.
Necessary condition: ∇q(x) = 0 ⇒ Hx + b = 0 ⇒ x = −H⁻¹b, where b = ∇f(x).
Sufficient condition: ∇²q(x) = H; f(x) is a minimum at x = x0 if H is positive
definite.

Algorithm:
Step 0: Set x0 ∈ Rⁿ, k = 0, ε (a very small quantity).
Step 1: Find xk+1 = xk − Hk⁻¹ ∇f(xk)
Step 2: If ‖∇f(xk+1)‖ < ε, terminate the process; otherwise go to step 1.

Example:
Determine the minimum of the given function f(x1, x2) = x1 − x2 + 2x1^2 + 2x1x2 + x2^2
using Newton's method with initial guess x^(0) = [0, 0]^T

Solution:
For the given f(x1, x2) = x1 − x2 + 2x1^2 + 2x1x2 + x2^2,

∂f/∂x1 = 1 + 4x1 + 2x2, ∂^2f/∂x1^2 = 4, ∂f/∂x2 = −1 + 2x1 + 2x2, ∂^2f/∂x2^2 = 2, ∂^2f/∂x1∂x2 = 2

First iteration: x^(0) = [0, 0]^T

H_0 = [4 2; 2 2], H_0^(−1) = (1/4)[2 −2; −2 4] = [1/2 −1/2; −1/2 1], and ∇f_0 = [1, −1]^T

x^(1) = x^(0) − H_0^(−1) ∇f(x^(0)) = [0, 0]^T − [1, −3/2]^T = [−1, 3/2]^T

Now ∇f(x^(1)) = [0, 0]^T.

We can terminate the process here since the gradient is 0. Thus the minimum is at [−1, 3/2]^T.

2.3.4  Quasi-Newton Method
This method is a good alternative to Newton's method. It computes the search direction
using only first derivatives, whereas Newton's method uses the Hessian matrix for the
same purpose. This method uses an approximation of the Hessian matrix and thus
has a lower computational cost than Newton's method. The approximation is
initialized as a positive definite matrix and is updated on the basis of previous points
and gradients.

Algorithm:
Step 1: Compute the Newton direction as d_k = −H_k g_k, where H_k and g_k are the approxi-
mate inverse Hessian matrix and the gradient, respectively.
Step 2: Compute the new approximation as x_(k+1) = x_k + α_k d_k
Step 3: Compute g_(k+1) = ∇f(x_(k+1))
Step 4: Update the approximate inverse Hessian matrix using
H_(k+1) = update(H_k, x_(k+1) − x_k, g_(k+1) − g_k)

Calculation of the approximated Hessian matrix

Let the difference of two successive x's be p^(k) = x^(k+1) − x^(k).
Let the difference of two successive gradients be q^(k) = g^(k+1) − g^(k).
From the gradient and Hessian matrix relation H dx = dg, the relation H p^(k) = q^(k) is known
as the secant equation.
Then the rank-one update is H_(k+1) = H_k + u v^T, where u and v are collinear so that the
update is symmetric. Substituting into the secant equation:

(H_k + u v^T) p^(k) = q^(k) ⇒ u (v^T p^(k)) = q^(k) − H_k p^(k) ⇒ u = (1/(v^T p^(k))) (q^(k) − H_k p^(k)),

and choosing v = q^(k) − H_k p^(k) gives the symmetric rank-one update

H_(k+1) = H_k + (1/(v^T p^(k))) v v^T

Sherman–Morrison Formula: (A + u v^T)^(−1) = A^(−1) − (1/(1 + v^T A^(−1) u)) A^(−1) u v^T A^(−1)

To update the Hessian approximation one minimizes ||H_(k+1) − H_k|| subject to H_(k+1)
remaining positive definite and satisfying the secant equation:
W ≈ H ⇐ Broyden–Fletcher–Goldfarb–Shanno (BFGS) method
W ≈ H^(−1) ⇐ Davidon–Fletcher–Powell (DFP) method
These two approaches are widely used to update the Hessian matrix approximation.
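A compact sketch of the quasi-Newton idea on the same quadratic: B approximates the inverse Hessian and the BFGS inverse update B ← (I − ρpqᵀ)B(I − ρqpᵀ) + ρppᵀ, with ρ = 1/(qᵀp), is applied after each step. For brevity, the exact quadratic step size is used in place of a general line search; that shortcut is an assumption of this sketch, not part of the update itself.

```python
# Quasi-Newton (BFGS) sketch for f(x1, x2) = x1 - x2 + 2*x1**2 + 2*x1*x2 + x2**2.
# B approximates the INVERSE Hessian and is updated from (p, q) pairs only.

H = [[4, 2], [2, 2]]   # true Hessian, used here only for the exact step size

def grad(x):
    return [1 + 4*x[0] + 2*x[1], -1 + 2*x[0] + 2*x[1]]

def mv(M, v):          # 2x2 matrix-vector product
    return [M[0][0]*v[0] + M[0][1]*v[1], M[1][0]*v[0] + M[1][1]*v[1]]

def bfgs(x, iters=10, tol=1e-10):
    B = [[1.0, 0.0], [0.0, 1.0]]       # initial inverse-Hessian approximation
    g = grad(x)
    for _ in range(iters):
        if max(abs(v) for v in g) < tol:
            break
        d = [-v for v in mv(B, g)]     # search direction d = -B g
        Hd = mv(H, d)
        alpha = -(g[0]*d[0] + g[1]*d[1]) / (d[0]*Hd[0] + d[1]*Hd[1])
        x_new = [x[0] + alpha*d[0], x[1] + alpha*d[1]]
        g_new = grad(x_new)
        p = [x_new[0] - x[0], x_new[1] - x[1]]   # p = x_{k+1} - x_k
        q = [g_new[0] - g[0], g_new[1] - g[1]]   # q = g_{k+1} - g_k
        rho = 1.0 / (q[0]*p[0] + q[1]*p[1])
        # BFGS inverse update: B = (I - rho p q^T) B (I - rho q p^T) + rho p p^T
        A = [[(1.0 if i == j else 0.0) - rho*p[i]*q[j] for j in range(2)]
             for i in range(2)]
        AB = [[sum(A[i][k]*B[k][j] for k in range(2)) for j in range(2)]
              for i in range(2)]
        B = [[sum(AB[i][k]*A[j][k] for k in range(2)) + rho*p[i]*p[j]
              for j in range(2)] for i in range(2)]
        x, g = x_new, g_new
    return x

x_min = bfgs([0.0, 0.0])   # reaches [-1, 1.5] in two iterations
```

On a quadratic with exact line search, BFGS reaches the exact minimum in at most n iterations, which is why two iterations suffice here.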

TRY YOURSELF
Q1. Find the minimum of f(x1, x2) = 3x1^2 + x2^2 − 10 with initial approximation
[0, 0] using the Univariate Method and the Hooke–Jeeves Method.
Answer: [6, 2]
Q2. Minimize the objective function f(x, y) = x − y + 2x^2 + 2xy + y^2 using 5
iterations of (a) Newton's method (b) Steepest Descent with starting value
x0 = [0, 0]^T. Plot the values of the iterates for each method on the same graph.
Which method seems to be more efficient?
Answer: [–1, 1.5]

Q3. Minimize f(x, y) = x^4 − 2x^2 y + x^2 + y^2 + 2x + 1 by the Simplex method.
Perform two steps of reflection, expansion, and/or contraction.
Answer: [–1, 1]
Q4. Minimize f(x, y) = 4x^2 + 3y^2 − 5xy − 8x starting from point [0, 0] using
Powell's method. Perform four iterations.
Answer: [2.0869, 1.7390]

3 Constrained Multivariable Optimization

3.1  INTRODUCTION
We have already discussed methods for the optimization of single-variable and
multivariable functions without constraints in Chapter 1 and Chapter 2, respectively. In fact,
most real-world problems that need to be optimized have constraints.
Now, it is high time to discuss the optimization of multivariable functions with constraints.
The general structure of a multivariable problem with constraints is given below:

Optimize (max or min) Z = f(x1, x2, ..., xn)

subject to the constraints

hi(x1, x2, ..., xn) {≤, =, ≥} bi;  i = 1, 2, ..., m

This set of problems can be further divided into problems with equality constraints
and problems with inequality constraints. There are some conventional methods available
for both sets of problems, but still not all problems can be
solved using these methods due to the complexity of the problem. At times,
conventional methods get stuck at a local optimum instead of the global optimum. So, we
have another set of methods known as stochastic search techniques. These methods
are search algorithms inspired by natural phenomena like evolution, natural selec-
tion, animal behaviour, or natural laws. In this chapter, we will discuss methods
like the genetic algorithm, particle swarm optimization, simulated annealing, and Tabu
search. For the sake of convenience, we will first discuss the conventional methods,
followed by the stochastic search techniques.

3.2  CONVENTIONAL METHODS FOR CONSTRAINED


MULTIVARIATE OPTIMIZATION
3.2.1  Problems with Equality Constraints
Optimize (max or min)Z = f ( x1 , x2 ,..., xn ) subject to the constraints

hi ( x1 , x2 ,..., xn ) = bi ; i = 1, 2,..., m


In matrix notation the above problem can also be written as:

Z = f ( x)

subject to the constraints

gi ( x ) = 0, i = 1, 2,..., m   where x = ( x1 , x2 ,..., xn ),

and gi ( x ) = hi ( x ) − bi ; bi is constant
Here it is assumed that m < n to get the solution.
There are various methods for solving the above defined problem. But in this
section, we shall discuss only two methods:  (i) Direct Substitution Method and
(ii) Lagrange Multiplier Method.

3.2.1.1  Direct Substitution Method

Example:
Find the optimum solution of the following constrained multivariable problem:

Minimize Z = x12 + ( x2 + 1)2 + ( x3 - 1)2

subject to the constraint

x1 + 5 x2 − 3 x3 = 6

and x1 , x2 , x3 ≥ 0

Solution
Since the given problem has three variables and one equality constraint, any one of
the variables can be eliminated from Z with the help of the equality constraint. Let
us choose variable x3 to be eliminated from Z. Then, from the equality constraint,
we have:

x3 = (x1 + 5x2 − 6)/3

Substituting the value of x3 in the objective function, we get:

Z or f(x) = x1^2 + (x2 + 1)^2 + (1/9)(x1 + 5x2 − 9)^2

The necessary condition for a minimum of Z is that the gradient

∇f(x) = (∂f/∂x1, ∂f/∂x2) = 0

That is, ∂Z/∂x1 = 2x1 + (2/9)(x1 + 5x2 − 9) = 0

∂Z/∂x2 = 2(x2 + 1) + (10/9)(x1 + 5x2 − 9) = 0

On solving these equations, we get x1 = 2/5 and x2 = 1.


To find whether the solution so obtained is a minimum or not, we must apply the
sufficiency condition by forming the Hessian matrix. The Hessian matrix for the given
objective function is

H(x1, x2) = [∂^2Z/∂x1^2  ∂^2Z/∂x1∂x2; ∂^2Z/∂x2∂x1  ∂^2Z/∂x2^2] = [20/9  10/9; 10/9  68/9]

Since both leading principal minors are positive (20/9 > 0 and det H = (20·68 − 100)/81 > 0),
H(x1, x2) is positive definite and the objective function is convex. Hence, the optimum
solution of the given problem is x1 = 2/5, x2 = 1, x3 = −1/5, and Min Z = 28/5.

3.2.1.2  Lagrange Multipliers Method

Optimize Z = f ( x )

subject to the constraint


hi ( x ) = bi

or

gi ( x ) = hi ( x ) − bi = 0, i = 1, 2,..., m and m ≤ n ; x ∈ E n

The necessary conditions for a function to have a local optimum at given points
can be extended to the case of a general problem with n variables and m equality
constraints.
Multiply each constraint by an unknown variable λi (i = 1, 2, ..., m) and subtract
each from the objective function f(x) to be optimized. The new objective function
now becomes:

L(x, λ) = f(x) − Σ_{i=1}^{m} λi gi(x);  x = (x1, x2, ..., xn)^T

where m < n. The function L(x, λ) is called the Lagrange function.


The necessary conditions for an unconstrained optimum of L(x, λ), i.e. that the first
derivatives of L(x, λ) with respect to x and λ must be zero, are also necessary
conditions for the given constrained optimum of f(x), provided the matrix of partial
derivatives ∂gi/∂xj has rank m at the point of optimum.
The necessary conditions for an optimum (max or min) of L(x, λ) or f(x) are the
m + n equations to be solved for the m + n unknowns (x1, x2, ..., xn; λ1, λ2, ..., λm):

∂L/∂xj = ∂f/∂xj − Σ_{i=1}^{m} λi ∂gi/∂xj = 0;  j = 1, 2, ..., n

∂L/∂λi = −gi(x) = 0;  i = 1, 2, ..., m

These m + n necessary conditions also become sufficient conditions for a maximum
(or minimum) of the objective function f(x), in case it is concave (or convex) and the
constraints are equalities, respectively.

Sufficient conditions for a general problem

Let the Lagrangian for a general non-linear programming (NLP) problem, involving
n variables and m (< n) constraints, be

L(x, λ) = f(x) − Σ_{i=1}^{m} λi gi(x)

Further, the necessary conditions

∂L/∂xj = 0 and ∂L/∂λi = 0; for all i and j

for an extreme point to be a local optimum of f(x) also hold for an optimum of L(x, λ).

Let there exist points x̄ and λ̄ that satisfy the equations

∇L(x̄, λ̄) = ∇f(x̄) − Σ_{i=1}^{m} λ̄i ∇gi(x̄) = 0

and gi(x̄) = 0, i = 1, 2, ..., m.
Then the sufficient condition for an extreme point x̄ to be a local minimum (or
local maximum) of f(x) subject to the constraints gi(x) = 0, (i = 1, 2, ..., m) is that the
determinant of the matrix (also called the Bordered Hessian matrix)

D = [Q  H^T; H  0]_(m+n)×(m+n)

is positive (or negative), where

Q = [∂^2 L(x, λ)/∂xi∂xj]_(n×n);  H = [∂gi(x)/∂xj]_(m×n)

Conditions for maxima and minima:

The sufficient condition for a maximum or minimum is determined by the signs of the
last (n − m) principal minors of the matrix D. That is,

1. Starting with the principal minor of order (2m + 1), the extreme point gives the
maximum value of the objective function when the signs of the last (n − m) principal
minors alternate, starting with the (–1)^(m+1) sign.
2. Starting with the principal minor of order (2m + 1), the extreme point gives the
minimum value of the objective function when all the signs of the last (n − m) prin-
cipal minors are the same and are of (–1)^m type.

Example:
Solve the following problem by using the method of Lagrangian multipliers.

Minimize Z = x1^2 + x2^2 + x3^2


subject to the constraints
(i) x1 + x2 + 3 x3 = 2, (ii) 5x1 + 2 x2 + x3 = 5

and x1 , x2 ≥ 0

Solution:
The Lagrangian function is
L(x, λ) = x1^2 + x2^2 + x3^2 − λ1(x1 + x2 + 3x3 − 2) − λ2(5x1 + 2x2 + x3 − 5)

The necessary conditions for the minimum of Z give us the following:

∂L/∂x1 = 2x1 − λ1 − 5λ2 = 0
∂L/∂x2 = 2x2 − λ1 − 2λ2 = 0
∂L/∂x3 = 2x3 − 3λ1 − λ2 = 0
∂L/∂λ1 = −(x1 + x2 + 3x3 − 2) = 0
∂L/∂λ2 = −(5x1 + 2x2 + x3 − 5) = 0

The solution of these simultaneous equations gives:
x = (x1, x2, x3) = (37/46, 16/46, 13/46); λ = (λ1, λ2) = (2/23, 7/23) and Z = 39/46

To see that this solution corresponds to the minimum of Z, apply the sufficient
condition with the help of the matrix:

      | 2  0  0  1  5 |
      | 0  2  0  1  2 |
D =   | 0  0  2  3  1 |
      | 1  1  3  0  0 |
      | 5  2  1  0  0 |

Since m = 2, n = 3, so n − m = 1 and 2m + 1 = 5, only one minor of D, of order 5,
needs to be evaluated and it must have the sign of (–1)^m = (–1)^2 = +1. Since |D| =
460 > 0, the extreme point (x1, x2, x3) = (37/46, 16/46, 13/46) corresponds to the minimum of Z.
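The stated solution can be verified exactly: stationarity of L requires ∇f = λ1∇g1 + λ2∇g2, and both constraints must hold. A quick rational-arithmetic check:

```python
# Exact verification of the Lagrange-multiplier solution of this example.

from fractions import Fraction as F

x1, x2, x3 = F(37, 46), F(16, 46), F(13, 46)
l1, l2 = F(2, 23), F(7, 23)

# Stationarity: grad f = l1 * grad g1 + l2 * grad g2, with f = x1^2+x2^2+x3^2,
# g1 = x1 + x2 + 3*x3 - 2 and g2 = 5*x1 + 2*x2 + x3 - 5.
ok_stationary = (2*x1 == l1 + 5*l2 and
                 2*x2 == l1 + 2*l2 and
                 2*x3 == 3*l1 + l2)
ok_feasible = (x1 + x2 + 3*x3 == 2 and 5*x1 + 2*x2 + x3 == 5)
Z = x1**2 + x2**2 + x3**2        # objective value at the extreme point
```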

Necessary and sufficient conditions when the concavity (convexity) of the objective function
is not known, with a single equality constraint:
Let us consider the non-linear programming problem that involves n decision
variables and a single constraint.

Optimize Z = f(x)

subject to the constraint

g(x) = h(x) − b = 0;  x = (x1, x2, ..., xn)^T ≥ 0

Multiply the constraint by the Lagrange multiplier λ and subtract it from the objective
function. The new unconstrained objective function (Lagrange function) becomes:

L(x, λ) = f(x) − λ g(x)

The necessary conditions for an extreme point to be an optimum (max or min)
point are:

∂L/∂xj = ∂f/∂xj − λ ∂g/∂xj = 0;  j = 1, 2, ..., n
∂L/∂λ = −g(x) = 0

From the first condition we obtain the value of λ as:

λ = (∂f/∂xj) / (∂g/∂xj);  j = 1, 2, ..., n

The sufficient conditions for determining whether the optimal solution so
obtained is a maximum or a minimum need computation of the values of (n − 1)
principal minors of the following determinant for each extreme point. Writing
Lij = ∂^2 f/∂xi∂xj − λ ∂^2 g/∂xi∂xj, the bordered determinant is

          | 0        ∂g/∂x1   ∂g/∂x2   ...   ∂g/∂xn |
          | ∂g/∂x1   L11      L12      ...   L1n    |
Δ_(n+1) = | ∂g/∂x2   L21      L22      ...   L2n    |
          | ...      ...      ...      ...   ...    |
          | ∂g/∂xn   Ln1      Ln2      ...   Lnn    |

If the signs of the minors Δ3, Δ4, ..., Δ_(n+1) are alternately positive and negative, then the
extreme point is a local maximum. But if the signs of all the minors Δ3, Δ4, ..., Δ_(n+1) are
negative, then the extreme point is a local minimum.

Example:
Use the method of Lagrangian multipliers to solve the following NLP problem. Does
the solution maximize or minimize the objective function?

Optimize Z = 2x1^2 + x2^2 + 3x3^2 + 10x1 + 8x2 + 6x3 − 100

subject to the constraint


g( x ) = x1 + x2 + x3 = 20

and
x1 , x2 , x3 ≥ 0

Solution
The Lagrangian function can be formulated as:

L(x, λ) = 2x1^2 + x2^2 + 3x3^2 + 10x1 + 8x2 + 6x3 − 100 − λ(x1 + x2 + x3 − 20)

The necessary conditions for a maximum or minimum are:

∂L/∂x1 = 4x1 + 10 − λ = 0;  ∂L/∂x2 = 2x2 + 8 − λ = 0
∂L/∂x3 = 6x3 + 6 − λ = 0;  ∂L/∂λ = −(x1 + x2 + x3 − 20) = 0

Putting the values of x1, x2 and x3 in the last equation ∂L/∂λ = 0 and solving for
λ, we get λ = 30. Substituting the value of λ in the other three equations, we get the
extreme point: (x1, x2, x3) = (5, 11, 4).
To apply the sufficient condition and determine whether the extreme point gives a
maximum or minimum value of the objective function, we evaluate the (n − 1) principal
minors as follows:

     | 0  1  1 |
Δ3 = | 1  4  0 | = −6
     | 1  0  2 |

     | 0  1  1  1 |
Δ4 = | 1  4  0  0 | = −44
     | 1  0  2  0 |
     | 1  0  0  6 |

Since the signs of Δ3 and Δ4 are both negative, i.e. of (–1)^m type with m = 1, the extreme
point (x1, x2, x3) = (5, 11, 4) is a local minimum. At this point the value of the objective
function is Z = 281.

3.2.2  Problems with Inequality Constraints


3.2.2.1  Kuhn–​Tucker Necessary Conditions

Optimize Z = f ( x )

subject to the constraints

gi ( x ) = hi ( x ) − bi ≤ 0 i = 1, 2,..., m where x = ( x1 , x2 ,..., xn )T

Add non-negative slack variables si (i = 1, 2, ..., m) to each of the constraints to con-
vert them to equality constraints. The problem can then be restated as:

Optimize Z = f(x)

subject to the constraints

gi(x) + si^2 = 0, i = 1, 2, ..., m

The si^2 has only been added to ensure non-negativity (the feasibility requirement) of the slack
and to avoid adding si ≥ 0 as an additional side constraint. The new problem is a
constrained multivariable optimization problem with equality constraints in n + m
variables. Thus, it can be solved using the Lagrangian multiplier method. For this, let us
form the Lagrangian function as:

L(x, s, λ) = f(x) − Σ_{i=1}^{m} λi [gi(x) + si^2]

where λ = (λ1, λ2, ..., λm)^T is the vector of Lagrange multipliers.


The Kuhn–Tucker necessary conditions (when the active constraints are known) to be
satisfied at a local optimum (max or min) point can be stated as follows:

∂f/∂xj − Σ_{i=1}^{m} λi ∂gi/∂xj = 0,  j = 1, 2, ..., n

λ i gi ( x ) = 0,

gi ( x ) ≤ 0,

λ i ≥ 0, i = 1, 2,..., m

3.2.2.2  Kuhn–​Tucker Sufficient Conditions


The Kuhn–​Tucker necessary conditions for the problem

Maximize Z = f ( x )

subject to the constraints

gi ( x ) ≤ 0, i = 1, 2,..., m

are also sufficient conditions if f ( x ) is concave and gi ( x ) are convex functions of x.

Example:
Maximize Z = 12x1 + 21x2 + 2x1x2 − 2x1^2 − 2x2^2 subject to the constraints

(i) x2 ≤ 8, (ii) x1 + x2 ≤ 10,

and

x1, x2 ≥ 0

Solution
Here f(x1, x2) = 12x1 + 21x2 + 2x1x2 − 2x1^2 − 2x2^2

g1(x1, x2) = x2 − 8 ≤ 0
g2(x1, x2) = x1 + x2 − 10 ≤ 0

The Lagrangian function can be formulated as:

L(x, s, λ) = f(x) − λ1[g1(x) + s1^2] − λ2[g2(x) + s2^2]

The Kuhn–Tucker necessary conditions can be stated as:

(i) ∂f/∂xj − Σ_{i=1}^{2} λi ∂gi/∂xj = 0, j = 1, 2:
12 + 2x2 − 4x1 − λ2 = 0
21 + 2x1 − 4x2 − λ1 − λ2 = 0

(ii) λi gi(x) = 0, i = 1, 2:
λ1(x2 − 8) = 0
λ2(x1 + x2 − 10) = 0

(iii) gi(x) ≤ 0:
x2 − 8 ≤ 0
x1 + x2 − 10 ≤ 0

(iv) λi ≥ 0, i = 1, 2

There may arise four cases:


Case 1: If λ1 = 0, λ 2 = 0, then from condition (i) we have:

12 + 2 x2 − 4 x1 = 0 and 21 + 2 x1 − 4 x2 = 0

Solving these equations, we get x1 = 15 / 2, x2 = 9. However, this solution violates


condition (iii) and so it should be discarded.
Case 2: λ1 ≠ 0, λ 2 ≠ 0, then from condition (ii) we have:

x2 − 8 = 0 or x2 = 8

x1 + x2 − 10 = 0 or x1 = 2

Substituting these values in condition (i), we get λ1 = −27 and λ 2 = 20. However, this
solution violates condition (iv) and therefore may be discarded.
Case 3: λ1 ≠ 0, λ2 = 0, then from conditions (i) and (ii) we have x2 = 8 and:

12 + 2x2 − 4x1 = 0
21 + 2x1 − 4x2 − λ1 = 0

Solving these equations, we get x1 = 7, x2 = 8, and λ1 = 3. However, this
solution violates condition (iii), since x1 + x2 − 10 = 5 > 0, and therefore may be discarded.
Case 4: λ1 = 0, λ 2 ≠ 0, then from conditions (i) and (ii) we have:

2x2 − 4x1 = −12 + λ2
2x1 − 4x2 = −21 + λ2
x1 + x2 = 10

Solving these equations, we get

x1 = 17/4, x2 = 23/4, and λ2 = 13/2. This solution does not violate any of the
Kuhn–Tucker conditions and therefore must be accepted. Hence, the optimum solu-
tion of the given problem is x1 = 17/4, x2 = 23/4, λ1 = 0, and λ2 = 13/2, and Max
Z = 947/8.
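The case analysis can be summarized with a hypothetical helper `kkt_ok` (not from the text) that tests all four Kuhn–Tucker conditions for a candidate point; exact rational arithmetic keeps the comparisons clean.

```python
# Mechanical check of the Kuhn-Tucker conditions for this example.

from fractions import Fraction as F

def kkt_ok(x1, x2, l1, l2):
    stationarity = (12 + 2*x2 - 4*x1 - l2 == 0 and
                    21 + 2*x1 - 4*x2 - l1 - l2 == 0)
    complementarity = (l1*(x2 - 8) == 0 and l2*(x1 + x2 - 10) == 0)
    feasibility = (x2 - 8 <= 0 and x1 + x2 - 10 <= 0)
    dual_feasibility = (l1 >= 0 and l2 >= 0)
    return stationarity and complementarity and feasibility and dual_feasibility

case1 = kkt_ok(F(15, 2), F(9), F(0), F(0))          # infeasible, discarded
case4 = kkt_ok(F(17, 4), F(23, 4), F(0), F(13, 2))  # satisfies every condition
```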

3.3  STOCHASTIC SEARCH TECHNIQUES

Many search algorithms are available and are continuously being developed on the
basis of different natural phenomena. We shall discuss some of them to give our
readers the basic idea or inspiration behind them. These methods can be
used for problems with both equality and inequality constraints.

3.3.1  Genetic Algorithm

The genetic algorithm (GA) is a search technique inspired by Charles Darwin's theory
of "survival of the fittest". It is based on the natural selection process that allows only
good chromosomes to pass into the next generation to improve the chance of sur-
vival. John Holland introduced genetic algorithms in the 1960s based on the concept of
Darwin's theory of evolution, and his student David E. Goldberg further extended
GA in 1989.
GA is a search technique used in computing to find true or approximate solutions
for the optimization of any non-linear problem. GA may explore the solution space in
many directions and from many points using a parallel computation process. Complex
environments with non-linear behaviour are good candidates to be worked on with GAs.
The fitness function may be discontinuous and even change over time. Genetic
algorithms are a particular class of evolutionary algorithms that use techniques
inspired by evolutionary biology such as inheritance, mutation, selection, and cross-
over (also called recombination).

Fundamentals of the Genetic algorithm

Initially, a population is created randomly with a group of individuals. These individ-
uals are evaluated on the basis of a fitness function (the objective function). A fitness
function is defined by the programmer over the genetic representation and measures the
quality of the represented solution. The programmer assigns a score to individuals
on the basis of their performance. Two individuals are selected on the basis of their
fitness; a higher fitness score increases the chances of selection. These best
individuals then reproduce offspring, which are further mutated on a random basis. This pro-
cess continues until a feasible solution is attained. Let us discuss all the stages of GA
step-by-step.

Initialization

• Initially many solutions are randomly generated to form an initial popula-
tion. The population size depends on the nature of the problem, but typically
contains several hundred or thousand possible solutions.
• Traditionally, the population is generated randomly, covering the entire range
of possible solutions, called the search space.
• Sometimes, the solutions may be "seeded", i.e., selected from areas where there
is a possibility of getting optimal solutions.

Selection

• During each successive generation, a proportion of the present population is
selected on the basis of the fitness function to produce a new generation.
• Individual solutions are selected through a fitness-based process, where the best-
fit solutions (as measured by the fitness function) are typically more likely to be
selected. Some selection methods rate the fitness of each solution and prefer to
select the best possible solutions; other methods rate only a random sample of
the population, which can be less time-consuming.
• Due to the stochastic approach of GA, even less fit solutions are included in a
small proportion to diversify the population and prevent premature con-
vergence to poor solutions. Roulette wheel selection and tournament selection
are the most widely used selection methods.

Reproduction

• A mating pool is created from the appropriate individuals for the reproduction pro-
cess. Members of the mating pool crossbreed to generate a new population. This
approach is used to generate a next-generation population of solutions from
those selected, through the genetic operators: crossover (also called recombin-
ation) and/or mutation.
• For each new solution to be generated, a pair of "parent" solutions is selected
for breeding to get "child" solutions.
• By generating a "child" solution from either crossover or mutation, a new
solution is created which typically shares many of the characteristics of its
"parents". New parents are then selected for each child, and this process con-
tinues until a feasible solution set of appropriate size is generated.

3.3.1.1 Crossover

Parents

Parent 1   1 1 0 0 1 0 1 0 0 1
Parent 2   0 1 0 1 0 1 0 1 0 1

Two possible offspring (single-point crossover after the fifth bit)

Child 1    1 1 0 0 1 1 0 1 0 1
Child 2    0 1 0 1 0 0 1 0 0 1

Mutation

• After selection and crossover, we have a new population full of individuals,
some of which are simply copied and others produced by crossover of parent
chromosomes.
• To ensure that the individuals are not all exactly the same, we perform mutation.
• For this we examine the alleles of all the individuals, and if an allele is
selected for mutation it can be changed by a small proportion or replaced with a
new value. The probability of mutation of a bit is 1/L, where L is the length
of the binary vector.
• Mutation is fairly simple: you just change the selected alleles as necessary
and move on. Mutation ensures genetic diversity within the population.

Before mutation   1 1 0 0 1 1 0 1 0 1
After mutation    1 1 0 0 0 1 0 1 0 1

The whole string can also be flipped as follows:

Before mutation   1 1 0 0 1 1 0 1 0 1
After mutation    0 0 1 1 0 0 1 0 1 0

Termination

• The above-discussed process is repeated until a termination condition has been
attained.
• Common terminating conditions are as follows:
• Manual inspection
• A solution is found that satisfies minimum criteria
• A fixed number of generations is reached
• The allocated budget (computation time/money) is reached
• The highest-ranking solution's fitness is reaching or has reached a plateau such
that successive iterations no longer produce better results
• Any combination of the above

Limitations

• The representation of the problem may be difficult. You need to identify which
variables are suitable to be treated as genes and which variables must be left
outside the GA process.
• The determination of convenient parameters (population size, mutation
rate) may be time-consuming.
• As in any optimization process, if you don't take enough precautions, the algo-
rithm may converge to a local minimum (or maximum).
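The stages above can be combined into a minimal GA. The sketch below uses the toy "OneMax" fitness (maximize the number of 1-bits in a binary string), tournament selection, single-point crossover, and bit-flip mutation with probability 1/L; the population size, generation count, and seed are illustrative choices, not values from the text.

```python
# A minimal GA on the "OneMax" toy problem: tournament selection,
# single-point crossover, and bit-flip mutation with probability 1/L.

import random

L, POP, GENS = 20, 40, 60
random.seed(1)

def fitness(ind):
    return sum(ind)                       # number of 1-bits

def tournament(pop):
    a, b = random.sample(pop, 2)          # binary tournament selection
    return a if fitness(a) >= fitness(b) else b

def crossover(p1, p2):
    cut = random.randint(1, L - 1)        # single-point crossover
    return p1[:cut] + p2[cut:]

def mutate(ind):
    return [1 - bit if random.random() < 1.0 / L else bit for bit in ind]

pop = [[random.randint(0, 1) for _ in range(L)] for _ in range(POP)]
for _ in range(GENS):
    pop = [mutate(crossover(tournament(pop), tournament(pop)))
           for _ in range(POP)]

best = max(pop, key=fitness)              # close to the all-ones string
```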

3.3.2  Particle Swarm Optimization

Particle swarm optimization (PSO) is a population-based heuristic global search
algorithm based on social interaction and individual experience. It was proposed
by Eberhart and Kennedy in 1995 and has been widely used in finding the solu-
tions of optimization problems. The algorithm is inspired by the social behaviour of
bird flocking and fish schooling. In PSO, the potential solutions, called particles, fly
through the search space of the problem by following the current optimum particles.
PSO is initialized with a population of random particle positions (solutions) and
then searches for the optimum from generation to generation. In every iteration, each par-
ticle is updated using two best positions (solutions). The first one is the best position
(solution) that has been reached so far by the particle itself; this is the personal best
position, called pi^(k). The other one is the best position (solution) obtained so far by
any particle in the population; this is the global best, called pg^(k).
In each generation, the velocity and position of the ith (i = 1, 2, ..., p_size) particle are
updated by the following rules:

vi^(k+1) = w vi^(k) + c1 r1 (pi^(k) − xi^(k)) + c2 r2 (pg^(k) − xi^(k))  and  xi^(k+1) = xi^(k) + vi^(k+1)

where w is the inertia weight, k (= 1, 2, ..., max_gen) indicates the iteration (generation),
and r1 and r2 are uniform random numbers in [0, 1]. The constants c1 (> 0) and c2 (> 0)
are the cognitive learning and social learning rates, respectively, which are the
acceleration constants responsible for varying the particle velocity towards pi^(k)
and pg^(k), respectively.
The updated velocity of the ith particle is calculated by considering three components:
(i) the previous velocity of the particle, (ii) the distance between the particle's best
previous and current positions, and (iii) the distance between the swarm's best experi-
ence (the position of the best particle in the swarm) and the current position of the
particle.
The velocity is also limited to the range [−vmax, vmax], where vmax is called the
maximum velocity of the particle. The choice of too small a value for vmax can cause
very small updates of the velocities and positions of particles at each iteration. Hence,
the algorithm may take a long time to converge and faces the problem of getting
stuck at local minima. To overcome this issue, Clerc (1999) and Clerc and Kennedy
(2002) proposed an improved velocity update rule employing a constriction factor
χ. According to them, the updated velocity is given by

vi^(k+1) = χ [vi^(k) + c1 r1 (pi^(k) − xi^(k)) + c2 r2 (pg^(k) − xi^(k))]

Here, the constriction factor χ is expressed as

χ = 2 / |2 − φ − sqrt(φ^2 − 4φ)|, where φ = c1 + c2, φ > 4.

Algorithm:

Step 0: Initialize the PSO parameters: bounds of the decision variables; population


with random positions and velocities.
Step 1: Evaluate the fitness of all particles.
Step 2: Keep track of the locations where each individual has its highest fitness
so far.
Step 3: Keep track of the position with the global best fitness.
Step 4: Update the velocity of each particle.
Step 5: Update the position of each particle.
Step 6: If the stopping criterion is satisfied, go to Step 7, otherwise go to Step 2.
Step 7: Print the position and fitness of global best particle.
Step 8: End
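The steps above can be sketched in a few lines of Python. The example minimizes the sphere function f(x, y) = x^2 + y^2 with the inertia-weight update rule; the parameter values, swarm size, and seed are illustrative, not prescribed by the text.

```python
# A minimal PSO sketch minimizing the sphere function f(x, y) = x^2 + y^2.

import random

random.seed(3)
W, C1, C2 = 0.7, 1.5, 1.5          # inertia weight and learning rates
N, DIM, ITERS = 30, 2, 200

def f(x):
    return sum(v * v for v in x)

pos = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(N)]
vel = [[0.0] * DIM for _ in range(N)]
pbest = [p[:] for p in pos]        # personal best positions p_i
gbest = min(pbest, key=f)[:]       # global best position p_g

for _ in range(ITERS):
    for i in range(N):
        for d in range(DIM):
            r1, r2 = random.random(), random.random()
            vel[i][d] = (W * vel[i][d]
                         + C1 * r1 * (pbest[i][d] - pos[i][d])
                         + C2 * r2 * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        if f(pos[i]) < f(pbest[i]):          # track personal best
            pbest[i] = pos[i][:]
            if f(pos[i]) < f(gbest):         # track global best
                gbest = pos[i][:]
```

With these convergent parameter settings the global best approaches the origin, the minimizer of the sphere function.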

3.3.3  Hill Climbing Algorithm

As the name suggests, this method mimics the process of climbing a hill from a random
point, continuing the process until the peak is reached. It is a very basic algorithm that
can be used to obtain the maximum of a given function by initializing with a random
point and then moving to the next point. If the functional value of the new point is better
than the previous one, it is accepted and the process is continued. But if at some point
the functional value at the new point is smaller, the process terminates and the last point
is declared the optimum. However, as shown in Figure 3.1, the user can get stuck at a
local maximum if the initial point is to the left of the local maximum. In addition, in the
case of a plateau (Figure 3.2) or ridges (Figure 3.3), this method also fails to give the
global optimum.
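A minimal hill-climbing sketch for a unimodal one-dimensional function is shown below; since the function has a single peak, the local-maximum pitfall described above does not arise here. The step size and iteration count are illustrative.

```python
# Basic hill climbing: random start, small random steps, accept only improvements.

import random

random.seed(0)

def f(x):
    return -(x - 2.0)**2 + 5.0    # single peak at x = 2

x = random.uniform(-10, 10)       # random starting point
for _ in range(10000):
    cand = x + random.uniform(-0.1, 0.1)   # neighbouring point
    if f(cand) > f(x):            # accept only uphill moves
        x = cand
```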

3.3.4  Simulated Annealing

Simulated means mimicking, and annealing is the process of heating metals and then
allowing them to settle down as they cool. So this particular algorithm is inspired
by the annealing process of metallurgy. It is a probabilistic technique for approximating
the global optimum of a given function. It is an extension or modification of hill
climbing that does not get stuck at a local optimum, as it allows downhill moves
along with uphill ones. It is a meta-heuristic approach, as it approximates the global
optimum in a large search space. For problems where finding an approximate global
optimum is more important than finding a precise local optimum in a fixed amount of
time, simulated annealing may be preferable to alternatives such as gradient descent.
This method can be used for scheduling, the travelling salesman problem, the design of
three-dimensional protein structures in biotechnology, printed circuit board layout,
planning paths for robots, etc.

FIGURE 3.1  Graph of an arbitrary function with local and global maxima.

FIGURE 3.2  Plateau.

FIGURE 3.3  Ridges.

This process is analogous to the physical annealing of metals. The energy states of the
metal correspond to the cost function in simulated annealing; similarly, the temperature
is the control parameter and the final cooled-down crystalline structure is the optimal
solution. The metal itself is the optimization problem. Here, the global optimum can be
achieved if the cooling process is slow and steady. In other words, moves may be random
in the beginning but need to be more precise towards the end to obtain the global optimum.
The working rules for SA are to set an initial solution and an initial temperature, then
generate new solutions and update the solution. In case a new solution is not acceptable,
change the temperature. We continue the process until the desired optimum is achieved.

Algorithm

Step 0: Set x^(0) and ε as an initial solution and termination criterion, respectively.
Fix a sufficiently high temperature "T" and the number of iterations "n".
Step 1: Generate a neighbouring point x^(t+1) = N(x^(t)) randomly.
Step 2: If ΔE = E(x^(t+1)) − E(x^(t)) < 0, accept the new point and set t = t + 1.
Here, ΔE is the difference in the energy at x^(t+1) and x^(t), which is analogous
to the difference in functional value at two consecutive points.
Else create a random number "r" in the range (0, 1).
If r ≤ exp(−ΔE/T), accept the new point and set t = t + 1; else go to Step 1.
Step 3: If |x^(t+1) − x^(t)| < ε and T is small enough, terminate the process; else,
if (t mod n) = 0, lower the temperature accordingly and go to Step 1.

3.3.5  Ant Colony Optimization Algorithm


This algorithm was proposed in 1992 by Marco Dorigo and was initially called Ant
System. Since then, many variants of the principle have been developed. It is inspired
by the shortest path or trail that ants use to carry their food home. The main inspiration
behind this algorithm is stigmergy: interaction, coordination, and adaptation with
nature by modifying the local environment. Ant colony optimization (ACO) takes
inspiration from the foraging behaviour of some ant species. These ants deposit
pheromone on the ground in order to mark favourable paths that should be followed
by other members of the ant colony. Pheromone is a chemical substance released into
the environment by animals, especially mammals or insects, that affects the behaviour
of other organisms of the same species.
Basically, ants need to find the shortest possible path from a source of food to their
home. As discussed above, this shortest path emerges through pheromone trails. In
the beginning each ant moves randomly and releases pheromone, which gets deposited
along all the random paths. But the path with more pheromone has the highest
probability of being followed by others. Slowly, with time, the ants concentrate on a
unique shortest path.
The well-known travelling salesman problem can be solved by this algorithm. In this
problem a set of cities is given, and the distance between each pair of them is also
provided. The aim of the salesman is to find the shortest possible route that visits
each city exactly once. Besides the travelling salesman problem, ACO is also useful
for quadratic assignment problems, network model problems, vehicle routing, and
graph colouring.

Algorithm

Step 0: Explore all possible paths.


Step 1: Ant k at node h chooses the next node s to move to using the pseudo-random
proportional rule:

s = argmax u∉Mk { [τ(h, u)] · [η(h, u)]^β }  if q ≤ q0,  and s = S otherwise,

where q is a uniform random number in (0, 1), q0 is a parameter, and Mk is the set
of nodes already visited by ant k. Here S is a random variable that favours shorter
edges with a higher level of pheromone trail, chosen through the probability
distribution mentioned below.

Pk(h, s) = ( [τ(h, s)] · [η(h, s)]^β ) / ( Σ u∉Mk [τ(h, u)] · [η(h, u)]^β ),  if s ∉ Mk,
and Pk(h, s) = 0 otherwise.

Note that τ(h, u) is the amount of pheromone trail on edge (h, u), whereas η(h, u)
is a heuristic function (e.g., the inverse of the length) of edge (h, u).
Step 2: Pheromone amounts on the edges are updated both locally and globally.
Step 3: Global updating rewards edges belonging to the shortest tour. Once the ants
have completed their routes, the ant that travelled the shortest path deposits
additional pheromone on each visited edge:

τ(h, s) ← (1 − α) · τ(h, s) + α · ∆τ(h, s).

Step 4: Every time an edge is chosen, its pheromone is updated locally:

τ(h, s) ← (1 − α) · τ(h, s) + α · τ0.
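The rules above can be sketched on a small TSP instance (a minimal sketch, not from the text; for clarity only the probabilistic branch of the transition rule is used, and the parameter values and the 4-city instance are illustrative assumptions):

```python
import random

def aco_tsp(dist, n_ants=10, n_iter=50, beta=2.0, alpha=0.1, tau0=0.1):
    """Minimal ant colony optimization for a symmetric TSP;
    dist[h][s] is the distance between cities h and s."""
    random.seed(1)                                 # reproducible run
    n = len(dist)
    tau = [[tau0] * n for _ in range(n)]           # pheromone trails
    best_tour, best_len = None, float("inf")
    for _ in range(n_iter):
        for _ in range(n_ants):
            tour = [random.randrange(n)]
            while len(tour) < n:
                h = tour[-1]
                # P_k(h, s) proportional to tau(h,s) * eta(h,s)^beta,
                # with eta(h, s) = 1 / dist[h][s], over unvisited s
                cand = [s for s in range(n) if s not in tour]
                w = [tau[h][s] / dist[h][s] ** beta for s in cand]
                tour.append(random.choices(cand, weights=w)[0])
            length = sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
            if length < best_len:
                best_tour, best_len = tour, length
        # global update: reward the edges of the best tour found so far
        for i in range(n):
            h, s = best_tour[i], best_tour[(i + 1) % n]
            tau[h][s] = (1 - alpha) * tau[h][s] + alpha / best_len
            tau[s][h] = tau[h][s]
    return best_tour, best_len

# Four cities at the corners of a unit square; the optimal tour has length 4
D = [[0, 1.0, 1.414, 1.0], [1.0, 0, 1.0, 1.414],
     [1.414, 1.0, 0, 1.0], [1.0, 1.414, 1.0, 0]]
tour, length = aco_tsp(D)
```

The pheromone reinforcement makes the four unit-length side edges increasingly attractive, so the colony settles on the perimeter tour.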

Advantages:

1. Performs better than other global optimization techniques, such as genetic
algorithms, neural networks, and simulated annealing, for a particular class of
problems.
2. Many modified versions are available and can be used in dynamic applications.
3. Convergence is guaranteed.

Disadvantages:

1. For a small set of nodes, many other search techniques are available with less
computational cost.
2. Although convergence is guaranteed, the time to converge is uncertain.
3. The computational cost can sometimes be high.

3.3.6  Tabu Search Algorithm


It is a meta-heuristic approach that guides a local heuristic search procedure to explore
the solution space for the global optimum by using a tabu list. It is dynamic in nature and
uses flexible memory to restrict the solution choice to some subset neighbourhood
of the current solution through strategic restrictions, while at the same time exploring
the search space with an appropriate aspiration level. The main features of tabu search
are: a check on what enters the tabu list, called the forbidding feature; control of
exits from the tabu list, known as the freeing feature; and, most importantly, a proper
coordination between the forbidding and freeing strategies to select the trial solutions,
better known as the short-term strategy. To identify neighbouring or adjacent
solutions, a "neighbourhood" is constructed for moving to other solutions from the
current one. Choosing a particular solution from the neighbourhood depends
on the search history and on the frequency of solution attributes that have already
produced past solutions. As mentioned earlier, this algorithm has a flexible memory
and therefore records forbidden moves, known as tabu moves, for the future. There is
a provision for exceptions too, through the aspiration criterion: when a tabu move gives
a better result than all the solutions obtained so far, it can be overridden.

Stopping Criterion

1.  There is no feasible solution in the neighbourhood of the solution k of the present
iteration.
2.  The maximum number of iterations has already been reached.
3.  The improvement of the solution is not significant, i.e., less than a predefined
(very small) number.

Algorithm

Step 0: Choose an initial solution k. Set k* = k and the iteration counter m = 0.
Step 1: Set m = m + 1 and generate a candidate subset V* ⊂ N(k, m) with the
condition that either none of the tabu conditions is violated or one of the
aspiration conditions holds.
Step 2: Choose the best possible n in V* and set k = n.
Step 3: If f(k) < f(k*) then set k* = k.
Step 4: Update the tabu and aspiration conditions accordingly.
Step 5: If a stopping criterion is met, terminate the process; otherwise go to Step 1.
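A minimal sketch of these steps on a toy problem (not from the text; the integer search space, the neighbourhood {k−1, k+1}, and the tabu-list length are illustrative assumptions):

```python
from collections import deque

def tabu_search(f, k0, tabu_size=5, max_iter=100):
    """Minimal tabu search over the integers with neighbourhood {k-1, k+1}."""
    k = k_star = k0
    tabu = deque([k0], maxlen=tabu_size)   # flexible short-term memory
    for _ in range(max_iter):
        # Step 1: admissible moves are those not on the tabu list,
        # unless they beat the incumbent (aspiration criterion)
        cand = [n for n in (k - 1, k + 1)
                if n not in tabu or f(n) < f(k_star)]
        if not cand:                       # stopping criterion 1
            break
        # Step 2: take the best admissible neighbour (possibly uphill)
        k = min(cand, key=f)
        tabu.append(k)                     # forbidding/freeing via the deque
        # Step 3: update the incumbent
        if f(k) < f(k_star):
            k_star = k
    return k_star

# f(k) = (k - 7)^2 is minimized at k = 7
best = tabu_search(lambda k: (k - 7) ** 2, k0=0)
```

Because recently visited points are tabu, the search cannot simply oscillate around a point; it is forced to keep moving, while the incumbent k* records the best solution seen.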

Advantages:

1.  Rejects non-improving solutions in order to escape local optima in the search
for the global one.
2.  Useful for both continuous and discrete solution spaces.
3.  Very useful for complex problems of scheduling, vehicle routing, and quadratic
assignment, where other approaches either fail or get stuck at a local optimum.

Disadvantages:

1.  A large number of parameters must be determined.
2.  Computational cost is high.
3.  The number of iterations can be large.
4.  Good parameter settings are required to achieve the global optimum.

TRY YOURSELF

Q1. Obtain the Min f = (x1 − 1)² + (x2 − 5)²
    subject to g1 = −x1² + x2 − 4 ≤ 0
               g2 = −(x1 − 2)² + x2 − 3 ≤ 0
    using KT conditions.
Q2. Write a C program to solve a constrained problem using the genetic
algorithm.
Q3. Compare the advantages and disadvantages of hill climbing and simulated
annealing. Do they have any relation to each other?
Q4. With which kinds of problems is the ant colony algorithm most efficient?
Explain with a suitable illustration.
Q5. Find the minimum of f = x5 − 5x3 − 20x + 5 in the range (0, 3) using the ant
colony optimization method. Show detailed calculations for two iterations
with four ants.
4 Applications of Non-Linear Programming

4.1  BASICS OF FORMULATION


In the previous three chapters, we discussed various conventional as well as modern
approaches for optimizing a given non-linear programming problem. In Section 4.3
of this chapter we will also see various inbuilt functions that can be used to solve a
large set of problems. But the actual challenge lies in the mathematical formulation of
a real-world problem. Non-linear problems exist in management, core sciences, engineering,
medicine, the military, and finance. The problem could be one of cost minimization,
profit maximization, or a design that leads to profit maximization.
Formulation of any real-world problem needs sound knowledge of that field as
well as of NLP. Both together help in establishing relationships between the various design
(decision) variables and thus in arriving at the correct objective function. Establishing
authentic constraints and bounds is equally important. In the next section we
demonstrate some illustrations of NLP formulation in various fields.

4.2  EXAMPLES OF NLP FORMULATION


Example 1: Profit Maximization –​Production Problem
A manufacturer of colour televisions is planning the introduction of two new
products: a 19-inch stereo colour set with a manufacturer's suggested retail price of
$339, and a 21-inch stereo colour set with a suggested retail price of $399. The cost
to the company is $195 per 19-inch set and $225 per 21-inch set, plus additional
fixed costs of $400,000 per year. In the competitive market, the number of sales
will affect the sales price. It is estimated that for each type of set, the sales price
drops by one cent for each additional unit sold. Furthermore, sales of the 19-inch
set will affect sales of the 21-inch set and vice-versa. It is estimated that the price
of the 19-inch set will be reduced by an additional 0.3 cents for each 21-inch set sold,
and the price of the 21-inch set will decrease by 0.4 cents for each 19-inch set sold.
The company believes that when the number of units of each type produced is
consistent with these assumptions, all units will be sold. How many units of each
type of set should be manufactured such that the profit of the company is maximized?

Solution:
The relevant variables of this problem are:

s1: Number of units of the 19-​inch set produced per year


s2: Number of units of the 21-​inch set produced per year
p1: Sales price per unit of the 19-​inch set ($)
p2: Sales price per unit of the 21-​inch set ($)
C: Manufacturing costs ($ per year),
R: Revenue from sales ($ per year),
P: Profit from sales ($ per year).

The market estimates results in the following model equations,

p1 = 339 − 0.01 s1 − 0.003 s2
p2 = 399 − 0.004 s1 − 0.01 s2
R = s1 p1 + s2 p2
C = 400,000 + 195 s1 + 225 s2
P = R − C

The profit then becomes a non-linear function of (s1 , s2 ):

P(s1 , s2 ) = −400,000 + 144 s1 + 174 s2 − 0.01 s1² − 0.007 s1 s2 − 0.01 s2²

If the company has unlimited resources, the only constraints are s1,s2 ≥ 0.
Unconstrained Optimization. We first solve the unconstrained optimization
problem. If P has a maximum in the first quadrant this yields the optimal solution. The
condition for an extreme point of P leads to a linear system of equations for (s1 , s2 ),

∂P/∂s1 = 144 − 0.02 s1 − 0.007 s2 = 0
∂P/∂s2 = 174 − 0.007 s1 − 0.02 s2 = 0

The solution of these equations is s1* = 4735, s2* = 7043 with profit value P* =
P(s1* , s2* ) = 553,641. Since s1* , s2* are positive, the inequality constraints are satisfied.
To determine the type of the extreme point, we inspect the Hessian matrix,

 −0.02 −0.007
HP(s1* , s2* ) =  
 −0.007 −0.02 

A sufficient condition for a maximum is that ( HP )11 < 0 and det( HP ) > 0. Both of
these conditions are satisfied and so our solution point is indeed a maximum, in fact
a global maximum.

Constrained Optimization. Now suppose the company has limited resources


which restrict the number of units of each type produced per year to

s1 ≤ 5, 000, s2 ≤ 8, 000, s1 + s2 ≤ 10, 000.

The first two constraints are satisfied by (s1* , s2* ); however, s1* + s2* = 11,778. The
global maximum point of P is now no longer in the feasible region, thus the optimal
solution must be on the boundary. We therefore solve the constrained optimization
problem of maximizing P(s1 , s2 )
subject to

c (s1 , s2 ) = s1 + s2 − 10, 000 = 0.

We can either substitute s1 or s2 from the constraint equation into P and solve
an unconstrained one-​variable optimization problem, or use Lagrangian multipliers.
Choosing the second approach, the equation ∇P = λ∇c becomes

144 − 0.02 s1 − 0.007s2 = λ


174 − 0.007s1 − 0.02 s2 = λ,

which reduces to a single equation for s1 , s2 . Together with the constraint equation we
then have again a system of two linear equations,

−0.013s1 + 0.013s2 = 30
s1 + s2 = 10, 000.

The solution is s1* = 3846, s2* = 6154 with profit value P* = 532, 308.
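Both linear systems above can be verified numerically (a sketch in Python; the helper names `profit` and `solve2` are not from the text):

```python
def profit(s1, s2):
    """P(s1, s2) from the model equations."""
    return (-400_000 + 144 * s1 + 174 * s2
            - 0.01 * s1**2 - 0.007 * s1 * s2 - 0.01 * s2**2)

def solve2(a, b, c, d, e, f):
    """Solve a*x + b*y = e, c*x + d*y = f by Cramer's rule."""
    det = a * d - b * c
    return (e * d - b * f) / det, (a * f - e * c) / det

# Unconstrained optimum: grad P = 0
s1u, s2u = solve2(0.02, 0.007, 0.007, 0.02, 144.0, 174.0)

# Constrained optimum: -0.013 s1 + 0.013 s2 = 30, s1 + s2 = 10000
s1c, s2c = solve2(-0.013, 0.013, 1.0, 1.0, 30.0, 10_000.0)
```

The first solve returns approximately (4735, 7043) with P ≈ 553,641, and the second (3846, 6154) with P ≈ 532,308, matching the values in the text.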

Example 2: Cost Minimization –​Optimum Designing Problem


We need to design a CAN in such a way that its manufacturing cost is minimized,
assuming the manufacturing cost is Rs 9/cm². The CAN should be designed
to hold exactly 200 ml (200 cc) of liquid. Assuming r and h are the radius and height of
the CAN, 3.3 < r < 5 and 4.7 < h < 20.5 (cm); to cater to aesthetics and comfort of
the user, h ≥ 3.5r.

Solution:
Here, our objective is to minimize the total cost of manufacturing the CAN. The cost of
fabrication depends on the total surface area of the CAN (cylinder).
∴ Total surface area of cylinder = 2πrh + 2πr² = 2πr(h + r) from Figure 4.1

∴ Total cost = 2 πr(h + r) × 9

∴ Min f(h, r) = 18 πr(h + r) [Objective function]



FIGURE 4.1 CAN with radius r and height h.

It is mentioned in the problem that the capacity of the CAN must be 200 ml (cc)

∴ Volume of cylinder: πr²h = 200 [Equality constraint]

For aesthetics and comfort of the user, h ≥ 3.5r

∴ 3.5r − h ≤ 0 [Inequality constraint]

Other constraints on decision variables are

3.3 < r < 5

4.7 < h < 20.5 [Boundary conditions]

Finally, mathematical formulation is given as

Min f (h, r ) = 18πr (h + r )

πr 2 h = 200

3.5r − h ≤ 0

3.3 < r < 5

4.7 < h < 20.5

where r and h are design (decision) variables
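The formulation can be transcribed directly for a numerical solver (a sketch of the encoding only, with hypothetical function names; note that the bounds here are quite tight, so whether a feasible point exists should be checked before solving):

```python
import math

def cost(x):
    """Objective: Min f(h, r) = 18*pi*r*(h + r); x = [r, h]."""
    r, h = x
    return 18 * math.pi * r * (h + r)

def volume_residual(x):
    """Equality constraint: pi*r^2*h - 200 = 0."""
    r, h = x
    return math.pi * r**2 * h - 200

def aesthetics(x):
    """Inequality constraint: 3.5*r - h <= 0."""
    r, h = x
    return 3.5 * r - h

# Boundary conditions: 3.3 < r < 5, 4.7 < h < 20.5
bounds = [(3.3, 5.0), (4.7, 20.5)]
```

This separation of objective, equality constraint, inequality constraint, and bounds mirrors the standard input format of constrained NLP solvers such as MATLAB's fmincon.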

Example 3: Cost Minimization –​Electrical Engineering


Optimal power flow problem is a typical example of optimization from the domain
of electrical engineering. Cost of generating electricity would be different for various
generators depending on the fuel involved, plant efficiency, technology, etc. The
objective of the optimal power flow problem is to minimize the total cost of gener-
ating electricity for different loads subject to the load demand balance constraints,
generation limits, transmission flow limits, and other such system constraints.

To understand this problem, consider the three-bus network shown in
Figure 4.2. For an optimal power flow problem, the objective function is to
minimize the total cost of generation. This is represented in equation (i). The individual
costs of production for the given system are as follows: F(P1) = 0.01P1² + 7.5P1 + 450;
F(P2) = 0.02P2² + 10.5P2 + 150; F(P3) = 0.15P3² + 4.5P3 + 600. These generators have
minimum and maximum generation capacities of 50 MW and 200 MW for generator
1, 10 MW and 150 MW for generator 2, and 60 MW and 300 MW for
generator 3, respectively. The line resistance and reactance of all three lines are 0.05
and 0.14 respectively, and the load demand is 400 MW.
Mathematical formulation of this problem is as shown below

min Σ g=1..3 F(Pg)     (i)

s.t.  V1V2 (cos θ12/0.05 + sin θ12/0.14) + V1V3 (cos θ13/0.05 + sin θ13/0.14) + P1 = 0     (ii)

V2V1 (cos θ12/0.05 + sin θ12/0.14) + V2V3 (cos θ23/0.05 + sin θ23/0.14) + P2 = 0     (iii)

V3V1 (cos θ13/0.05 + sin θ13/0.14) + V3V2 (cos θ23/0.05 + sin θ23/0.14) + P3 = 400     (iv)

50 ≤ P1 ≤ 200     (v)

FIGURE 4.2 Simple three-bus network.

10 ≤ P2 ≤ 150 (vi)

60 ≤ P3 ≤ 300 (vii)

The goal of this optimization problem is to reduce the total generation cost to a
minimum subject to the constraints. There are nine decision variables. The quadratic
expressions in the variables P1, P2, and P3 represent the total generation cost. The
other variables, the voltage terms V1, V2, and V3 and the angles θ12, θ23, and θ13, have
zero cost coefficients. For simplicity, only three equality constraints (ii), (iii), and (iv),
the load balance constraints at each bus, are included. The non-linear terms represent
the power flowing through the individual transmission lines. The last three inequality
constraints are generation limit constraints. This entire formulation is of the non-linear
convex type, but it can also be approximated as a linear programming model.
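Ignoring the network part (voltages and angles) for illustration, the generation-cost piece of this formulation reduces to a classic economic dispatch, which can be sketched with SciPy (the starting point is an assumption, and SLSQP is one of several suitable solvers):

```python
from scipy.optimize import minimize

# Individual cost curves F(P1), F(P2), F(P3) from the text
F = [lambda p: 0.01 * p**2 + 7.5 * p + 450,
     lambda p: 0.02 * p**2 + 10.5 * p + 150,
     lambda p: 0.15 * p**2 + 4.5 * p + 600]

def total_cost(P):
    return sum(f(p) for f, p in zip(F, P))

res = minimize(total_cost, x0=[150.0, 100.0, 150.0], method="SLSQP",
               bounds=[(50, 200), (10, 150), (60, 300)],
               # lossless load balance: P1 + P2 + P3 = 400
               constraints={"type": "eq",
                            "fun": lambda P: P[0] + P[1] + P[2] - 400})
```

Under this simplification the dispatch comes out at roughly P1 = 200 (at its upper limit), P2 = 140, and P3 = 60 (at its lower limit): the cheapest unit is loaded fully and the most expensive one is held at its minimum.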

Example 4: Design of a Small Heat Exchanger Network –​Chemical


Engineering
Consider the optimization of the small process network shown in Figure 4.3 with two
process streams and three heat exchangers. Using temperatures defined by Tin > Tout
and tout > t in, the “hot” stream with a fixed flow rate F and heat capacity C p needs
to be cooled from Tin to Tout , while the “cold” stream with fixed flow rate f and
heat capacity c p needs to be heated from t in to tout . This is accomplished by two
heat exchangers; the heater uses steam at temperature Ts and has a heat duty Qh ,
while the cooler uses cold water at temperature Tw and has a heat duty Qc . However,
considerable energy can be saved by exchanging heat between the hot and cold
streams through the third heat exchanger with heat duty Qm and hot and cold exit
temperatures, Tm and t m, respectively. The model for this system is given as follows:

• The energy balance for this system is given by

Qc = FC p (Tm − Tout ) , (4.1)

Qh = fc p (tout − t m ) , (4.2)

Qm = fc p (t m − tin ) = FC p (Tin − Tm ) . (4.3)

• Each heat exchanger also has a capital cost that is based on its area Ai , i ∈ {c, h, m},
for heat exchange. Here we consider a simple countercurrent shell-and-tube
heat exchanger with an overall heat transfer coefficient Ui , i ∈ {c, h, m}. The
resulting area equations are given by

Qi = Ui Ai ∆Tlm^i , i ∈ {c, h, m}.    (4.4)

FIGURE 4.3 Example of a simple heat exchanger network (hot stream FCp: Tin → Tm → Tout; cold stream fcp: tin → tm → tout; duties Qm, Qc, Qh; utilities Tw, Ts).

• The log-mean temperature difference ∆Tlm^i is given by

∆Tlm^i = (∆Ta^i − ∆Tb^i) / ln(∆Ta^i / ∆Tb^i), i ∈ {c, h, m},    (4.5)
and, ∆ Tac = Tm − Tw , ∆ Tbc = Tout − Tw ,

∆ Tah = Ts − t m , ∆ Tbh = Ts − tout ,

∆ Tam = Tin − t m , ∆ Tbm = Tm − tin .

Our objective is to minimize the total cost of the system, i.e., the energy cost as well
as the capital cost of the heat exchangers. This leads to the following NLP:


min Σ i∈{c,h,m} (ci Qi + ĉi Ai^β)    (4.6)

s.t. (4.1)-(4.5),    (4.7)

Qi ≥ 0, ∆Ta^i ≥ ∆, ∆Tb^i ≥ ∆,    (4.8)

where the cost coefficients ci and ĉi reflect the energy and amortized capital prices,
the exponent β ∈ (0,1] reflects the economy of scale of the equipment, and a small
constant ∆ > 0 is selected to prevent the log-​mean temperature difference from
becoming undefined. This example has one degree of freedom. For instance, if the
heat duty Qm is specified, then the hot and cold stream temperatures and all of the
remaining quantities can be calculated.
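The single degree of freedom can be made concrete in code: fixing Qm recovers every other quantity from (4.1)-(4.3). The stream data below are illustrative assumptions, not values from the text:

```python
import math

# Assumed data: F*Cp = 10 and f*cp = 8 kW/K; temperatures in K
FCp, fcp = 10.0, 8.0
Tin, Tout, tin, tout = 400.0, 320.0, 300.0, 390.0

def network(Qm):
    """Given the match duty Qm, recover Tm, tm, Qc, Qh
    from the energy balances (4.1)-(4.3)."""
    Tm = Tin - Qm / FCp          # (4.3), hot side
    tm = tin + Qm / fcp          # (4.3), cold side
    Qc = FCp * (Tm - Tout)       # (4.1)
    Qh = fcp * (tout - tm)       # (4.2)
    return Tm, tm, Qc, Qh

def lmtd(dTa, dTb):
    """Log-mean temperature difference (4.5); assumes dTa != dTb."""
    return (dTa - dTb) / math.log(dTa / dTb)

Tm, tm, Qc, Qh = network(400.0)   # e.g. Qm = 400 kW
```

With these numbers, increasing Qm shrinks both utility duties Qc and Qh, which is exactly the energy-saving trade-off (against exchanger area) that the NLP (4.6)-(4.8) optimizes.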

FIGURE 4.4 Distillation column example.

Example 5: Real-​Time Optimization of a Distillation Column –​


Petroleum Engineering
Distillation is the most common means for separation of chemical components and
lies at the heart of the petroleum refining process; it has no moving parts and scales easily
and economically to all production levels. However, distillation is highly energy
intensive and can consume 80%–90% of the total energy in a typical chemical or
petrochemical process. As a result, optimization of distillation columns is essential
for the profitability of these processes. Moreover, because distillation feeds, product
demands, and even ambient conditions change over time, real-time optimization in
response to these changes is also a key contributor to successful operation. Consider
the distillation column shown in Figure 4.4 with N trays. As seen in the figure, liquid
and vapour contact each other and approach equilibrium (i.e., boiling) on each tray.
Moreover, the countercurrent flow of liquid and vapour provides an enrichment of
the volatile (light) components in the top product and the remaining components in
the bottom product. Two heat exchangers, the top condenser and the bottom reboiler,
act as sources for the condensed liquid vapour and boiled-up vapour, respectively.
The hydrocarbon feed contains chemical components given by the set

C = {propane, isobutane, n-butane, isopentane, n-pentane}.

The column is specified to recover most of the n-butane (the light key) in the top
product and most of the isopentane (the heavy key) in the bottom product. We assume
a total condenser and partial reboiler, and that the liquid and vapour phases are in
equilibrium. A tray-​by-​tray distillation column model is constructed as follows using
the MESH (Mass–​Equilibrium–​Summation–​Heat) equations:

Total Mass Balances

B + V0 − L1 = 0,    (4.9)

Li + Vi − Li+1 − Vi−1 = 0, i ∈ [1, N], i ∉ S,    (4.10)

Li + Vi − Li+1 − Vi−1 − F = 0, i ∈ S,    (4.11)

LN+1 + D − VN = 0.    (4.12)

Component Mass Balances

B x0,j + V0 y0,j − L1 x1,j = 0, j ∈ C,    (4.13)

Li xi,j + Vi yi,j − Li+1 xi+1,j − Vi−1 yi−1,j = 0, j ∈ C, i ∈ [1, N], i ∉ S,    (4.14)

Li xi,j + Vi yi,j − Li+1 xi+1,j − Vi−1 yi−1,j − F xF,j = 0, j ∈ C, i ∈ S,    (4.15)

(LN+1 + D) xN+1,j − VN yN,j = 0, j ∈ C,    (4.16)

xN+1,j − yN,j = 0, j ∈ C,    (4.17)

Enthalpy Balances

B HB + V0 HV,0 − L1 HL,1 − QR = 0,    (4.18)

Li HL,i + Vi HV,i − Li+1 HL,i+1 − Vi−1 HV,i−1 = 0, i ∈ [1, N], i ∉ S,    (4.19)

Li HL,i + Vi HV,i − Li+1 HL,i+1 − Vi−1 HV,i−1 − F HF = 0, i ∈ S,    (4.20)

VN HV,N − (LN+1 + D) HL,D − QC = 0.    (4.21)

Summation, Enthalpy, and Equilibrium Relations

Σ j∈C yi,j − Σ j∈C xi,j = 0, i = 0, …, N + 1,    (4.22)

yi,j − Ki,j (Ti , P, xi ) xi,j = 0, j ∈ C, i = 0, …, N + 1,    (4.23)

HL,i = φL (xi , Ti ), HV,i = φV (yi , Ti ), i = 1, …, N,    (4.24)

HB = φL (x0 , T0 ), HF = φL (xF , TF ), HN+1 = φL (xN+1 , TN+1 ),    (4.25)

where

i tray index numbered starting from reboiler (=1)


j ∈C components in the feed. The most volatile (lightest) is propane
P pressure in the column
S ⊂ [1, N ] set of feed tray locations in the column, numbered from the bottom
F feed flow rate
Li / Vi flow rate of liquid/​vapour leaving tray i
Ti temperature of tray i
HF feed enthalpy
H L ,i / HV ,i enthalpy of liquid/​vapour leaving tray i
xF feed composition
xi, j mole fraction j in liquid leaving tray i
yi, j mole fraction j in vapour leaving tray i
Ki, j non-​linear vapour/​liquid equilibrium constant
ϕV / ϕ L non-​linear vapour/​liquid enthalpy function
D / B distillate/​bottoms flow rate
QR / QC heat load on reboiler/​condenser
lk / hk ∈C light/​heavy key components that determine the separation.

The feed is a saturated liquid with component mole fractions specified in the order
given above. The column is operated at a constant pressure, and we neglect pressure
drop across the column. This problem has 2 degrees of freedom. For instance, if the
flow rates for V0 and LN +1 are specified, all of the other quantities can be calculated
from equations (4.9)–​(4.23). The objective is to minimize the reboiler heat duty which
accounts for a major portion of operating costs, and we specify that the mole fraction
of the light key must be 100 times smaller in the bottom than in the top product. The
optimization problem is therefore given by

Min QR

s.t. (4.9)-​(4.23)

xbottom,lk ≤ 0.01xtop,lk ,

Li , Vi , Ti ≥ 0, i = 1,…, N + 1,

D, QR , QC ≥ 0,

xi,j , yi,j ∈ [0, 1], j ∈ C, i = 1, …, N + 1.

Distillation optimization is an important and challenging industrial problem, and it
also serves as a very useful testbed for non-linear programming. This challenging
application can be scaled up in three ways: through the number of trays to increase
overall size, through the number of components to increase the size of the equations
per tray, and through the phase equilibrium relations, which also increase the non-linearity
and number of equations on each tray.

4.3  SOLVING NLP THROUGH MATLAB INBUILT FUNCTIONS


CASE 1: To minimize a one-variable function within given bounds

• Write an m-file for the given function
• Define the interval endpoints x1 and x2 and then call fminbnd
• fminbnd uses golden section search and parabolic interpolation
• [x, fval] = fminbnd(fun, x1, x2)

Example:
Find the minimum of f(x) = x³ − 2x − 5 in the interval (0, 2).

Step 1: Write the function file in a script and save it appropriately (the name of the
function is "minimf" in the example; it can be changed as per the user's wish).
Step 2: Call fminbnd with the appropriate syntax to calculate both the minimizer
and the minimum value.
The calling syntax and output are shown below.
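For readers outside MATLAB, the same example can be reproduced with a hand-written golden section search, the bracketing strategy fminbnd is built on (a minimal sketch):

```python
import math

def golden_section(f, a, b, tol=1e-6):
    """Golden-section search for a minimum of a unimodal f on [a, b]."""
    invphi = (math.sqrt(5) - 1) / 2            # 1/phi ~ 0.6180
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    while b - a > tol:
        if f(c) < f(d):                        # minimum lies in [a, d]
            b, d = d, c
            c = b - invphi * (b - a)
        else:                                  # minimum lies in [c, b]
            a, c = c, d
            d = a + invphi * (b - a)
    return (a + b) / 2

x_min = golden_section(lambda x: x**3 - 2 * x - 5, 0.0, 2.0)
f_min = x_min**3 - 2 * x_min - 5
```

The search converges to x ≈ 0.8165 with f ≈ −6.0887 (analytically, the minimizer is √(2/3)); each iteration shrinks the bracket by the golden ratio.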

CASE 2: To minimize an unconstrained multivariable function


Multivariable problems of the kind Min over x of f(x) can be solved using the following
inbuilt functions in MATLAB:

1. fminunc – Suitable for continuous functions that have first- and second-order
derivatives. It uses a quasi-Newton algorithm.
•  Write a function m-file for the given function.
• Define x0 (initial approximation) and then call fminunc in a script file using
the syntax: [x, fval] = fminunc(@myfun, x0)

• Example: Min f(x) = 3x1² + 2x1x2 + x2², x0 = [1, 1]T


•  Function file:

• OUTPUT

2. fminsearch – It can handle even discontinuous functions. It does not need
derivative information.
• Write an m-file for the function.
• Define x0 (initial guess) and then call fminsearch in a script file.
• Call fminsearch using the syntax:
x = fminsearch(@myfun, x0)
• Example: Min f(x) = 100(x2 − x1²)² + (1 − x1)²; X0 = [−1.2, 1]
• INPUT

  OUTPUT


3. lsqnonlin – This minimization inbuilt function is specially programmed for
non-linear least-squares curve-fitting problems of the form
min over x of f(x) = min over x of (f1²(x) + f2²(x) + … + fn²(x)).
• Write an m-file for the given function
• Call lsqnonlin using the syntax: x = lsqnonlin(@fun, x0)
• Example: Find x that minimizes Σ k=1..10 (2 + 2k − e^(k·x1) − e^(k·x2))², X0 = [0.3, 0.4]

• Function file:

• Calling script:

• Output
output
X = 0.2578 0.2578
resnorm = 124.3622
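SciPy offers close counterparts to all three functions, so the examples above can be checked outside MATLAB as well (a sketch; the method choices mirror the MATLAB defaults where possible):

```python
import numpy as np
from scipy.optimize import minimize, least_squares

# fminunc counterpart: quasi-Newton (BFGS) on 3*x1^2 + 2*x1*x2 + x2^2
res1 = minimize(lambda x: 3 * x[0]**2 + 2 * x[0] * x[1] + x[1]**2,
                x0=[1.0, 1.0], method="BFGS")

# fminsearch counterpart: derivative-free Nelder-Mead on the
# Rosenbrock function 100*(x2 - x1^2)^2 + (1 - x1)^2
res2 = minimize(lambda x: 100 * (x[1] - x[0]**2)**2 + (1 - x[0])**2,
                x0=[-1.2, 1.0], method="Nelder-Mead")

# lsqnonlin counterpart: non-linear least squares
k = np.arange(1, 11)
res3 = least_squares(
    lambda x: 2 + 2 * k - np.exp(k * x[0]) - np.exp(k * x[1]),
    x0=[0.3, 0.4])
```

The last solve reproduces the text's lsqnonlin answer: x ≈ (0.2578, 0.2578) with residual norm ≈ 124.3622 (in SciPy, 2*res3.cost).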

CASE 3: To minimize a constrained multivariable function

min over x of f(x) such that: c(x) ≤ 0, ceq(x) = 0, A·x ≤ b, Aeq·x = beq, lb ≤ x ≤ ub

• Write an m-file for the given function
• Call fmincon using the syntax: [x, fval] = fmincon(@myfun, x0, A, b)

•  Example: Min f(x) = −x1x2x3; X0 = [10; 10; 10]
   s.t. 0 ≤ x1 + 2x2 + 2x3 ≤ 72
•  Function file:

•  Calling file:

•  Output:

output x =
    24.0000
    12.0000
    12.0000
fval = -3.4560e+03
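The corresponding SciPy call (a sketch; SLSQP plays the role of fmincon here, and the two-sided constraint is split into a pair of inequalities):

```python
import numpy as np
from scipy.optimize import minimize

res = minimize(
    lambda x: -x[0] * x[1] * x[2],
    x0=[10.0, 10.0, 10.0], method="SLSQP",
    constraints=[
        # 0 <= x1 + 2*x2 + 2*x3 <= 72, written as two g(x) >= 0 inequalities
        {"type": "ineq", "fun": lambda x: 72 - (x[0] + 2 * x[1] + 2 * x[2])},
        {"type": "ineq", "fun": lambda x: x[0] + 2 * x[1] + 2 * x[2]},
    ])
# local solution consistent with the MATLAB output above:
# x ~ [24, 12, 12], fval ~ -3456
```

Note that, like fmincon, SLSQP returns a local solution that depends on the starting point X0.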

CASE 4: Genetic Algorithm through MATLAB


Many heuristic search algorithms, such as the genetic algorithm, simulated annealing, and
PSO, are inbuilt in MATLAB. There is a provision for customization of the algorithms as
well. Beginners can use them instead of writing complete programs by themselves.
Here, an illustration is demonstrated of minimizing a function with a constraint through
GA. MATLAB has the optimtool box, which can be explored for other inbuilt functions.

Example:  f(x) = −exp(−(x/20)²) if x ≤ 20;  f(x) = −exp(−1) + (x − 20)(x − 22) if x > 20

• Write function file

• Call GA through the interactive window of the OPTIMTOOL box

Here the solver used is GA, and the fitness function is the objective function with the
name "@goodfun". The number of variables is one. Since there are no constraints and
no predefined bounds on the decision variable, we can run the solver directly. It shows
that it took 51 iterations, with the value of the variable as 0.006 and the functional
value at the minimum as −0.99. Now let us discuss one example with constraints.

Example:

Min f(x) = 100(x1² − x2)² + (1 − x1)²
subject to: x1x2 + x1 − x2 + 1.5 ≤ 0, 10 − x1x2 ≤ 0, 0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 13

Function file:

Constraint file:

OPTIMTOOL Interactive window:

The solver is GA, the fitness function is the objective function, the number of variables
is 2, and the bounds are mentioned. Please note that linear constraints can be entered
directly in the interactive window in array form, whereas for a non-linear constraint we
need to write a constraint function file and call it appropriately, as shown above. The
minimizer is [0.812, 12.312] and the minimum is 13706.1085.
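For readers without the toolbox, the first (unconstrained) GA example can be reproduced with a generic real-coded genetic algorithm (a minimal sketch, not MATLAB's ga; the population size, bounds, and Gaussian mutation scale are illustrative assumptions, and, as in the run reported above, the search may settle in the local minimum near x = 0 rather than the global one near x = 21):

```python
import math
import random

def goodfun(x):
    """The piecewise fitness function from the first example."""
    if x <= 20:
        return -math.exp(-(x / 20) ** 2)
    return -math.exp(-1) + (x - 20) * (x - 22)

def ga_minimize(f, lo=-30.0, hi=30.0, pop_size=40, gens=60, sigma=1.0):
    """Truncation selection + arithmetic crossover + Gaussian mutation."""
    rnd = random.Random(2)                         # reproducible run
    pop = [rnd.uniform(lo, hi) for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=f)
        survivors = pop[:pop_size // 2]            # keep the fitter half
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rnd.sample(survivors, 2)        # two parents
            child = (a + b) / 2 + rnd.gauss(0, sigma)
            children.append(min(max(child, lo), hi))
        pop = survivors + children
    return min(pop, key=f)

best = ga_minimize(goodfun)
```

Because the fitter half of the population always survives, the best fitness found is monotone non-increasing across generations, which is the same elitist behaviour MATLAB's ga provides by default.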

4.4  CHOICE OF METHOD


1. Type of problem – linear or non-linear, differentiable or not, constrained or
unconstrained, and the nature of the decision variables.
2. Accuracy of the desired solution – whether a local or a global optimum is required.
3. Availability of inbuilt programs, or the need to write customized code for
the problem.
4. Availability of time.

TRY YOURSELF
Q1. Optimize the following function using fminsearch:

Min f ( x ) = 12( x2 − x1 )2 + 5(1 − x1 )3 ; X 0 = [0, 0]

Q2. Optimize the following functions using "fmincon" and "ga" – inbuilt functions
of MATLAB. The initial approximation can be [1, 1]T.

(i ) Min f = ( x1 − 1)2 + ( x2 − 2)2 − 4

Subject to : x1 + 2 x2 ≤ 5
4 x1 + 3 x2 ≤ 10
6 x1 + x2 ≤ 7, x1 , x2 ≥ 0

(ii ) Min f = 4 x1 + 2 x2 + 3 x3 + 4 x4

Subject to : x1 + x3 + x4 ≤ 24
3 x1 + x2 + 2 x3 + 4 x4 ≤ 48
2 x1 + 2 x2 + 3 x3 + 2 x4 ≤ 36, x1 , x2 , x3 , x4 ≥ 0

BIBLIOGRAPHY
1. Rao, S. S. (2009). Engineering Optimization: Theory and Practice (4th ed.). New Jersey,
U.S.A.: John Wiley & Sons, Inc.
2. Sharma, J. K. (2009). Operations Research: Theory and Practices (4th ed.). New Delhi,
India: Macmillan.
3. Pant, K. K., Sinha, S., & Bajpai, S. (2015). Advances in Petroleum Engineering II –
Petrochemical. U.S.A.: Studium Press LLC.
4. Biegler, L. T. (2010). Nonlinear Programming: Concepts, Algorithms, and Applications
to Chemical Processes. U.S.A.: Society for Industrial and Applied Mathematics.
Index

analytical approach 1
ant colony optimization algorithm 47
basics of formulation 51
examples of NLP formulation 51
exhaustive search technique 3
Fibonacci search method 6
fundamentals of genetic algorithm 41
genetic algorithm 41
golden section search method 8
grid search method 25
Hessian Matrix 24
hill climbing algorithm 45
Hooke–Jeeves method 19
interpolation method (without using derivative) 9
Kuhn–Tucker Necessary conditions 38
Kuhn–Tucker Sufficient conditions 39
Lagrange Multipliers method 33
Newton method 12
particle swarm optimization 44
problems with equality constraints 31
quadratic interpolation 9
quasi method 27
random search method 16
search techniques 2
secant method 13
simulated annealing 45
steepest descent method 25
stochastic search techniques 41
tabu search algorithm 48
univariate search method 17
unrestricted search technique 3
