Thermal System Design and Optimization
Second Edition
C. Balaji
Department of Mechanical Engineering
Indian Institute of Technology Madras
Chennai, Tamil Nadu, India
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
A journey of a thousand miles begins with a single step.
Chinese proverb
To
My Parents
who did not live to see this book
Preface to the Second Edition
Nearly 8 years have passed since the first edition of the book was published. While newer algorithms to solve optimization problems continue to be developed, traditional ones are still holding the fort. In this new edition, new sections have been added on integer programming and multi-objective optimization. The linear programming chapter has been fortified by a detailed presentation of the simplex method. A major highlight of the revised edition is the inclusion of workable MATLAB codes for the key algorithms discussed in the book. Readers are encouraged to write their own codes for solving the exercise problems, as appropriate. Needless to say, several new fully worked examples and exercise problems have been added to this edition.
I would like to thank my Ph.D. scholars Sandeep, Sangamesh, Girish, Suraj, Rajesh and Jyotheesh, and my M.S. scholar Karthik for typing out the class notes, developing the MATLAB codes, and painstakingly proofreading the book. Thanks are due to my former research scholar Dr. Srikanth Rangarajan for working out the example on the TOPSIS algorithm.
I would be glad to respond to queries and suggestions at balaji@iitm.ac.in.
Preface to the First Edition
This book is an outgrowth of my lectures for the course “Design and Optimization of Energy Systems” that I have been offering almost continuously since 2001 to the students of IIT Madras, a great temple of learning. There are a few excellent texts on this subject, and the natural question that arises in anyone’s mind is: why another book on “Optimization of Thermal Systems”? The answer to this question lies in the fact that the field is rapidly advancing, newer algorithms supplement and even supplant traditional ones, and the kinds of problems that are amenable to optimization in the area of thermal sciences are ever increasing, all of which makes it imperative to chronicle the progress at ever shorter intervals of time. At the same time, I wanted to write a book that reflects my teaching style, wherein “clarity” takes precedence over “extent of coverage”. I am a true believer in what is said in the Upanishads—Tejasvi Navadhitamastu, meaning “May what we study be well studied”.
The major goals of the book are (i) to present the basic ideas of optimization in a way that would appeal to the engineering student or a practicing engineer who has an interest in or a flair for figuring out the “best” among several solutions in the field of thermal sciences, and (ii) to present only as much material as can be covered in a one-semester course. By design, I have left out some optimization algorithms that are presented elsewhere and, on the flip side, I have included some which, in my opinion, are contemporary and have tremendous scope in thermal sciences.
The significant departure I have made from traditional textbooks is the interactive or conversational style. This, in my opinion, helps get the material across in an interesting way. The book is laced with several fully worked out examples with insights into the solutions. The chapters are backed up by a good number of exercise problems which, together with the example problems, should equip serious students well enough to take on optimization problems of substantial difficulty.
This book is not a one-stop shop for learning all optimization techniques. Neither is it an exhaustive treatise on the theory of optimization, nor is it a simple guide to solving optimization problems. It is an honest attempt to blend the mathematics behind optimization with its usefulness in thermal sciences and to present the content in a way that makes the book “unputdownable”. Whether it eventually achieves this depends on the readers.
I would like to thank Prof. S. P. Venkateshan, IIT Madras, my beloved teacher, who initiated me into research and has been a trusted mentor and colleague over the last two decades.
Thanks are also due to Professors Roddam Narasimha and J. Srinivasan of IISc Bangalore, Prof. Heinz Herwig, Hamburg Institute of Technology, Germany, and Prof. T. Sundararajan, IIT Madras, for supporting me at various stages of my professional life.
I cherish long discussions with Prof. Shankar Narasimhan, IIT Madras, on
several topics presented in this book. I also thank Dr. Sridharakumar Narasimhan, a
recent addition to IIT Madras, with whom I had several telephonic discussions on
slippery aspects of optimization.
Financial assistance from the Centre for Continuing Education, IIT Madras, is
gratefully acknowledged.
I would like to thank NPTEL (National Programme on Technology Enhanced Learning) for giving me an opportunity to bring out a video course on Design and Optimisation of Energy Systems, which served as the starting point for this book. I also acknowledge the support and commitment of Ane Books Pvt. Ltd. for bringing this book out in record time.
I would like to thank my wife Bharathi for painstakingly converting my video lectures into a working document that served as the basis for the first draft of the book, in an amazingly short time, and for proofreading the manuscript more than once. She has been one of the greatest strengths in my life! My graduate students Ramanujam, Gnanasekaran, Chandrasekar and Konda Reddy have spent several sleepless nights, rather weeks, typing out the equations, proofreading the text and cross-checking the example and exercise problems. Thanks are due to the numerous students who have taken this course over the last ten years and who, by their incisive and “stunningly surprising” questions, not only continue to humble me but also keep me on my toes.
Finally, I would like to thank my daughter Jwalika for allowing me to get committed to yet another assignment, namely, this book-writing venture. I must admit that it has led to the hijacking of quite a few weekends and holidays.
I would be glad to respond to queries and suggestions at balaji@iitm.ac.in.
Contents

3 Curve Fitting
  3.1 Introduction
    3.1.1 Uses of Curve Fitting
  3.2 Exact Fit and Its Types
    3.2.1 Polynomial Interpolation
    3.2.2 Lagrange Interpolating Polynomial
    3.2.3 Newton’s Divided Difference Method
    3.2.4 Spline Approximation
  3.3 Best Fit
  3.4 Strategies for Best Fit
    3.4.1 Least Square Regression (LSR)
    3.4.2 Performance Metrics of LSR
    3.4.3 Linear Least Squares in Two Variables
    3.4.4 Linear Least Squares with Matrix Algebra
  3.5 Nonlinear Least Squares
    3.5.1 Introduction
    3.5.2 Gauss–Newton Algorithm
  Reference
4 Optimization—Basic Ideas and Formulation
  4.1 Introduction
  4.2 General Representation of an Optimization Problem
    4.2.1 Properties of Objective Functions
    4.2.2 Cardinal Ideas in Optimization
    4.2.3 Flowchart for Solving an Optimization Problem
    4.2.4 Optimization Techniques
5 Lagrange Multipliers
  5.1 Introduction
  5.2 The Algorithm
    5.2.1 Unconstrained Optimization Problems
    5.2.2 Constrained Optimization Problems
  5.3 Graphical Interpretation of the Lagrange Multiplier Method
  5.4 Mathematical Proof of the Lagrange Multiplier Method
  5.5 Economic Significance of the Lagrange Multipliers
  5.6 Tests for Maxima/Minima
  5.7 Handling Inequality Constraints
  5.8 Why Should U Be Positive?
  Reference
6 Search Methods
  6.1 Introduction
    6.1.1 A Smarter Way of Solving Example 6.1
Summary
Appendix
Random Number Table
Bibliography
Index
Chapter 1
Introduction to Design and System Design
1.1 Introduction
which will do the same job and so on, the original goal (of designing something that
will just work!) is no longer acceptable, as this approach, more often than not, leads
to designs that are conservative and costly.
In the past, improved systems were designed only after break-even was achieved. Break-even usually represents the time required to recover the original investment in a project. Optimization was considered a costly add-on. Many people did not want to invest the time and effort required for optimizing a system.
Now, from what we have discussed thus far, it is clear that the key point is that there are several ways of accomplishing the same task. This is one of the cornerstones
in our quest for an optimum. Optimization is afforded by the “choice” or “space”
the design variables provide us. One has, more often than not, a choice of various
designs. It is different from the evaluation of an integral in calculus, where we think
that everybody should get the same answer. Design is certainly not like that!
There are many ways of accomplishing the same task, and we will have to pick
and choose the one that best suits us. Again, some are better than others. The word “better” necessarily has to be within quotes, because what is better or not has to be decided by the user or the analyst, based on what the “objective function” is; accordingly, he/she goes ahead and optimizes the design.
Suppose we want to join sheets of paper. There are several ways of doing this. We can use a stapler, or we can use a bell clip, or simply bind the sheets. So, even a
seemingly innocuous task like joining sheets of paper opens up several possibilities
of accomplishing the same task.
So, first, the choice is available to us in engineering design, just as it is in life, in
general. Secondly, the design of complex systems requires lengthy calculations, often repeated for various combinations of the design variables. Let us take the problem
of temperature control of the ubiquitous desktop. The goal here is to arrange all the
components of the desktop so that we get optimal cooling. We have a choice here, no
doubt, but it is not like we have infinite choices because there are certain positions
that are fixed. We do not have much leeway in the location of certain components
or in the way they have to be positioned close to each other. These are known more
formally as “constraints” in an optimization problem. But we still have a choice of
where we want to have slits, where we would like to place the fans, and so on. There are usually two fans in a desktop. One sits on top of the heat sink, which in turn sits right on top of the processor. The other fan is very close to the panel on the back side to facilitate the movement of hot air out of the cabinet. The fan dedicated to the processor will turn on only when the temperature reaches a certain level. When we initially start working, this fan usually will not turn on.
When several programs are running concurrently and the outside is also hot, this fan will turn on. We can now consider this as a Computational Fluid Dynamics (CFD) problem to determine the maximum temperature for a particular arrangement of the components. If there are 10 components, there are 10! ≈ 3.6 million ways of arranging them. Each time, we need to create a solid model on the computer, get the converged CFD solution, and look at the maximum temperatures in the system. How many times do we have to repeat this exercise?
1.2 Design Analysis Through a Flowchart
Now we look at analysis and design using flow charts. Let us look at Fig. 1.1.
From the figure, it is clear that first, we have to state the need. This is followed by
a techno-economic feasibility study. A project needs to be both technically feasible
and economically viable. If it is not feasible, we drop the project. 20–30 years back,
desalination was technically feasible, but economically unviable for a developing
country like ours. But now there is a desalination plant in Chennai. Once the techno-
economic feasibility is established, detailed engineering is done and this is followed
by manufacturing. Distribution and consumption are again critical activities that
contribute largely to the success of a product.
The last few steps involved in the design are the incorporation of user feedback
and modifications to the existing design. Research and Development (R&D) may
supplement these efforts or sometimes fresh ideas may come directly from R&D.
From the preceding discussion, it is clear that identifying the need is the first step
in any design. Stating the need is not always so straightforward. For accomplishing
a job, the device we have in mind may be simple or extremely complex. Let us
consider the previously cited example of joining sheets of paper. We can come out
with various options, like using a stapler or punching a hole and tying with a tag or
binding the sheets.
But if the problem is complex, for example, if we want to solve the water problem of a city like Chennai, some may propose that there is a need to enlarge one or more existing reservoirs, desilt them, make them deeper, and let them store more water whenever there is rain during the north-east monsoon season in October, November, and December. So the need is to enlarge the reservoirs. But
unfortunately, the way the need has been stated now is incorrect because that is
the solution! Sometimes, if there are various possible solutions, we incorrectly state
one of the solutions as the need. One can still solve the water problem of Chennai
without enlarging the reservoirs, because there are other ways of doing it. We can do desalination, or charge a very high tax on water consumption that increases steeply with consumption. We can have a water meter for every apartment. So for the first x liters a month, the cost is Rs. y; in the next slab, it will go to 1.25y, then 1.5y for the next x liters, and so on. That is one way of doing it. Then we can have projects to fetch water
from a neighboring state such as the Krishna water project. We can get water through
pipes from some other reservoir, like we have done for Chennai from the Veeranam
lake. So, there are various ways of solving the problem. Therefore, the need has to be stated in such a way that it is very broad. More importantly, the need should not be stated in such a way that some solutions are ruled out before they are even considered.
Sometimes, there is a need to come out with new products lest the customers get tired of the existing range of products of the company. Sometimes, a company is forced to come out with a new product if its competitor introduces one that turns out to be successful.
Needs or opportunities arise in the renovation or expansion of facilities to man-
ufacture or distribute a current product. For example, there is a heavy waiting time
for certain cars now. Some companies can be happy that people are waiting for their
cars. But it could be an opportunity for the competitors. If somebody is able to make
something similar, which can be delivered with minimum waiting time, people can
switch loyalties.
A new product may be developed intentionally or accidentally. The way Velcro was invented is interesting. George de Mestral observed how cockleburs, an herbaceous annual plant species, got attached to his jeans, and this keen observation and inspiration led to the invention of the ubiquitous Velcro. The “Post-it” sticky note or prompt was the outcome of a failed product, because it defied all the conventional properties of an adhesive. Dr. Spencer Silver, an American chemist, discovered an adhesive which did not stick after some time. Mr. Arthur Fry, an American inventor, figured out that if it did not stick for a long time, it could be used as a bookmark, because it could be removed without leaving any trace of having been used. So, by ignoring conventional wisdom, a “failed adhesive” became a very successful office product.
Now let us take a detailed look at some aspects of the flowchart not usually considered by engineers.
Success Criterion
The Cumulative Grade Point Average (CGPA) of a student is one criterion of success in an academic setting. However, it may not be the sole criterion for success. In commercial organizations, the usual criterion for success is profit. A commercial enterprise means there should be a return on the investment. One invests Rs. 100 and the key question is, at the end of the day, how much do we get back? This, as a fraction of how much we have invested, is called the rate of return. In a big project, say, a power plant, one cannot start getting returns at the end of the first year. At the end of 5 years, maybe, we reach the break-even point. After we reach the break-even point, what is the return on investment? This is the figure that all commercial enterprises primarily look at.
In public projects, such as building a new terminal in an airport or a new flyover, the criterion of success is whether the people are happy with it. So sometimes it can become very subjective and qualitative. Even so, the projected earnings of a proposed commercial project exert a lot of influence on whether to go ahead with the project. These concerns are also moderated, to a greater degree, by social considerations: for example, whether we want to have a dam over the Narmada, whether we want to have the Tehri-Garhwal project, and whether we want to stop the Alaknanda river in the fragile Uttarakhand state of North India. There are other considerations which will temper our drive to keep the return on investment as our only criterion of success. Emergency decisions, however, are based on reasons outside the realm of economics. If a nation wants to declare war on another for legitimate reasons, then there is no consideration of economics!
Figure 1.2 shows a typical return on investment as a percentage at various stages of
a typical plant. It can be seen that during the preliminary design stage, the uncertainty
band is high. After a detailed design, we see that the uncertainty is decreasing.
Fig. 1.2 Probability of return on investment at various stages for a typical plant
This shows that our confidence in the design increases as the design evolves. When the uncertainty (σ) is low, the return on the investment will be within a tight band since, according to a normal distribution, the probability of x lying within x̄ ± 3σ is about 99.7%. At first, the curves are very diffuse. After construction, the picture becomes better. After 5 years, we have a good idea, assuming that the market does not change dramatically. This is basically the probability at various stages of decision-making. Please look up Stoecker (1989) for a detailed discussion on this.
The uncertainty reduces as we move from preliminary design to complete design
to construction. After 1 year, it is lower. After 3 or 5 years, the rate of return is
known exactly. The probability degenerates into a small uncertainty. The probability
eventually looks like a Dirac delta function. But the prediction of future behavior is
not deterministic. We have to factor in various things like the future market conditions,
the inflation associated with this, the fluctuation in interest rates, and so on. Therefore,
the design will change from being deterministic to probabilistic.
Stochastic or probabilistic design is very popular nowadays. For example, the fluctuations in the stock price of a company (P) can be treated as

P = P̄ + P′   (1.1)

where P̄ is the mean and P′ is the fluctuation. We can model it along the lines of turbulent fluid flow. We can then come out with stochastic partial differential equations, solve them, and have a predictive model for stock prices. Such activities are part of the new discipline called “financial engineering”.
When a flock of birds is trying to catch food, the probability of getting food
increases if all the birds stay closest to the leader. The leader is the one who is
closest to the food. We can write equations for this and work it out. People have
modeled all this and have applied it to solve practical engineering problems. Such a
technique is known as Particle Swarm Optimization or PSO. The foraging behavior
of ants or how ants search for their food has been studied by various optimization
groups. If ants go in search of food, when they return after eating, they leave a trail
of a chemical called pheromone. This pheromone concentration will, of course, be
stronger if more food is available at the place where they went. This concentration
will also decay exponentially with time. So the ants which follow will look at the pheromone trail, and wherever the pheromone concentration is very weak, they avoid that path. Even along a path where there is a pheromone trail, if the signal is very feeble, the ants know that the last ant that went must have done so 2–3 days back and, most probably, by this time, the food would have got exhausted! Birds and ants have not taken any optimization course, or a 101 course on introduction to finding food. Yet, they are doing it well all the time!
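To make the bird-flock analogy concrete, here is a minimal sketch of the particle swarm velocity and position update, the core of PSO. The objective function, swarm size, and coefficient values (w, c1, c2) below are illustrative assumptions, not prescriptions.

% Minimal PSO sketch: minimize f(x) = x(1)^2 + x(2)^2
f = @(x) sum(x.^2);
n = 20; d = 2;                       % swarm size and problem dimension
x = 4*rand(n,d) - 2; v = zeros(n,d); % random start in [-2,2]^2
pbest = x; pval = arrayfun(@(i) f(x(i,:)), (1:n)');
[~, g] = min(pval); gbest = pbest(g,:);
w = 0.7; c1 = 1.5; c2 = 1.5;         % inertia and acceleration weights (assumed)
for iter = 1:100
    r1 = rand(n,d); r2 = rand(n,d);
    v = w*v + c1*r1.*(pbest - x) + c2*r2.*(gbest - x);  % memory + social pull
    x = x + v;
    for i = 1:n
        fi = f(x(i,:));
        if fi < pval(i), pval(i) = fi; pbest(i,:) = x(i,:); end
    end
    [mv, g] = min(pval);
    if mv < f(gbest), gbest = pbest(g,:); end
end
fprintf('best point: (%.4f, %.4f)\n', gbest);

Each particle is pulled toward its own best position (the memory term) and toward the best position found by the whole flock (the social term), exactly mirroring birds staying close to the leader nearest the food.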
Market Research

Typically, sales volume is inversely proportional to the product price (see Fig. 1.3a). So one expects that as the price increases, sales will decrease.
This is not always true. Can we think of some goods which will defy this? Gold! Gold will buck this trend because when the gold price increases, people will get more scared that it will rise further, and more people will go and buy it. That will cause increased demand, which will cause a short supply and will further increase
its price. So, vanity goods have the opposite trend. The demand for essential goods
like milk, bread, and rice is price insensitive. We call this inelastic demand. The
demand for some goods like tea and coffee displays some elasticity of demand. In
fact, there is a possibility that the consumption of tea will increase when the price
of coffee increases. This is known as cross-elasticity of demand. For example, if we
plot the price of coffee versus the sales of tea, the curve may look like what is shown
in Fig. 1.3b.
In Fig. 1.3a, we have a bunch of lines. They represent different sales and advertising efforts. So, to a limited extent, it is possible to employ aggressive marketing, make people remember a brand, and increase the sales. But after some time, the return on this advertising investment will decrease. That is called the law of diminishing marginal returns in microeconomics.
Let P denote the price and Q denote the quantity. The elasticity of demand Ed is given by

Ed = (ΔQ/Q) / (ΔP/P) = (P/Q)(∂Q/∂P)   (1.2)
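As a quick illustration (with assumed numbers): if a 10% increase in price causes demand to fall by 5%, then Ed = (−0.05)/(0.10) = −0.5, and since |Ed| < 1, the demand is inelastic; for a vanity good such as gold in the discussion above, Ed can even turn positive.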
The results of research and development may be an important input to the decision process. Research efforts may result in an original idea or an improvement of an existing idea, whereas development requires a pilot study. This is the difference between research and development: it is first research, then development, and then we need to transform the result into a product. So if we look at working models or pilot plants, it is development work. The idea need not always originate from within the organization. It may also come from a rival organization or from competition.
From the flowchart (Fig. 1.1), it is evident that the process of design is essentially a trial-and-error or, more formally, an iterative procedure. Each pass through the loop improves the amount and quality of information. What flows through the flowchart is basically information, which gets refined. After we go through sufficient iterations and are confident, we stop the process and make a decision on whether to go ahead with the project or not. The stopping criterion is the prerogative of the design engineer and depends on several factors.
1.3 Optimization
The flowchart in Fig. 1.1 involved no optimization. However, we have user feedback, followed by improvements possible in the design. These improvements can also serve as an impetus to Research and Development. What we have discussed is conventional design. The original design was based on some parameters, but during the operation of the plant, many parameters will change. One becomes interested in identifying the set of parameters for which the design/plant or machine will work at its “best”. However, a better strategy would be to integrate the optimization with the design itself.
1.4 Analysis and Design

The difference between engineering design and analysis is that the analysis problem
is concerned with determining the behavior of an existing system. Let us look at an
example, a steam condenser, a schematic of which is shown in Fig. 1.4. On the shell
side, we have steam entering, which condenses and comes out as hot water. This
will be an isothermal process, as far as the steam in the shell side is concerned. This
condensation is accomplished by cold water circulated in the tubes.
The cooling water will enter at a temperature Tc,in and will go out at a temperature
Tc,out . So its temperature will increase.
Shown in Fig. 1.5 is the temperature—length or the T-x diagram of the heat
exchanger. It does not matter if we have a parallel flow or a counter flow because
one of the fluids is changing phase. There are several ways of working this problem
out. One possibility is that we want to condense, say, “m” kg of steam, which is at a
particular temperature and pressure, into “m” kg of water. We have cooling water at
30 ◦ C. Now we want to come up with the design of a steam condenser. So how do we
go about designing this? Is energy balance alone enough? What are the equations?
Let us start with the following energy balance equation:

Q = ṁw cp (Tc,out − Tc,in) = ṁsteam hfg   (1.3)
With just this alone, we cannot design the heat exchanger. That is the limitation of
thermodynamics. Thermodynamics will tell us what the resulting temperatures are,
and how much heat will be transferred eventually between the hot fluid and the cold
fluid. But it will not tell us what the area of the heat exchanger required to accomplish
this task is.
What do we want to find out actually in the previous example? The design solution
will be how many tubes are required in the shell, what should be the size of the
shell, what should be the diameter of the shell, whether we will have a U-tube kind
of situation, i.e., whether it has several passes. Accordingly, the diameter will get
reduced. So there is an issue of surface area. How much surface area is required for accomplishing this task? What is the equation for this? It involves Q, the heat duty of the heat exchanger:

Q = U A ΔTLMTD   (1.4)

Sometimes, the solution is very tricky, since the LMTD, which is the logarithmic mean temperature difference, is not known, and in many situations, we will have to perform tedious iterations to get this temperature difference.
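For a condenser, where the hot side stays at the saturation temperature, the LMTD is easy to evaluate directly; a short sketch follows, with illustrative temperatures (assumed values, not from any actual design).

% LMTD for a condenser: hot side isothermal at Tsat
Tsat  = 60;                    % condensing steam temperature, degC (assumed)
Tc_in = 30; Tc_out = 35;       % cooling water in/out, degC (assumed)
dT1 = Tsat - Tc_in;            % terminal temperature difference at water inlet
dT2 = Tsat - Tc_out;           % terminal temperature difference at water outlet
LMTD = (dT1 - dT2)/log(dT1/dT2);
fprintf('LMTD = %.2f degC\n', LMTD);

The tedious iterations mentioned above arise when the outlet temperatures are themselves unknown, so that ΔTLMTD and the energy balance have to be satisfied simultaneously.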
What then is a design problem? Let us say we want to design the steam condenser for the 500 MW fast breeder reactor coming up at Kalpakkam.² India will build it in the next few years. We know the conditions, the pressure, and the temperature of the steam. Also, we know the temperature of the cooling water, which is drawn from the sea. So how do we go about designing this? Can we take water at 30 ◦C and send it out at 80 ◦C? What is the problem? Thermal pollution! So there is a limit or a cap on the maximum ΔT that is allowed. What is the value? It is about 5 ◦C. Otherwise, it severely disturbs the nearby aquatic life. A constraint is already coming up in our design. We are allowed a ΔT of only 5 ◦C.
Of course, the sky is the limit for the flow rate, and we can have an infinite flow rate. But then, what will be the size of the pump? We can have a beautiful and wonderfully green design with a ΔT of only 5 ◦C, but what will be the size of the equipment? Heat transfer equipment works best when there is a temperature difference. So we want a larger temperature difference, but we are forced to have a maximum ΔT of 5 ◦C. So there is a constraint. There is also a constraint which comes in the form of losses. The power required for the pump should not be more than the output of the 500 MW plant! It will not and should not happen, but if we come up with some simple design based on the most fantastic fluid mechanics solution, we may go home with negative power output from the plant.
Are there more constraints? Of course, yes. The saltwater will corrode normal
steel. So we have to go for stainless steel or titanium. So what innocuously started as
a harmless simple equation seems to get increasingly tricky. So the design problem is
ṁsteam → fixed
T, p → fixed
Q → known
A → unknown.
For the given Q, what will be the size of the equipment? That is the design problem! For the given U and A, what is Q? That is the analysis problem. Q is not known a priori in the analysis problem; Q is known, to begin with, in the design problem.
So for a simple problem like this, we can branch out into design and analysis. But
originally when people tried to design, they would usually follow a nomogram or a
data book, which would list out the thickness and diameter of the shaft for a given load,
and they would have a factor of safety and decide. So, analysis was divorced from
design. But now, we have tools like finite element analysis and therefore, analysis
is an integral part of design. The idea behind this is that alternative designs can
be analyzed, and we can find out whether stresses are within limits, on the computer itself, using simulations, without having to build new prototypes. Hence, we can study the performance of several choices and choose the best. This is essentially the difference between analysis and design.

² Kalpakkam is a coastal town 70 km south of Chennai, India, where a 500 MW Fast Breeder Nuclear Reactor is being set up.
The sizes and configurations of the parts are known a priori for the analysis
problem. For the design problem, we need to calculate the sizes and shapes of the
various parts of the system to meet performance requirements. In this system, suppose
we change the U or A or we just change the ṁ, what happens? From the design flow
rate of steam, let us say we deviate, ±10% or ±15%, what happens? So we will have
to do what is called a sensitivity study too!
So the design of a system is essentially a trial and error procedure.
• We will select a bunch of components and assemble them; that becomes a system.
• Then we will use all the equations known to us and see if the system is working.
• If the system is working, it is a feasible or an acceptable design.
• If it is not working, we will turn around and try to change some of these components
or parameters, and
• We will keep on doing it till we have an acceptable or a feasible design.
This is how the design procedure is done. In both these cases, we must have the
capacity or ability to analyze designs to make further decisions. Therefore, analysis
has to be embedded and integrated into the design process. Design and analysis are
two sides of the same coin. So, we should not simply do some design based on some
thumb rules or formulae. Of course, for a simple problem like the determination of the
size of the pump required for an apartment, we do not have to conduct a complicated
finite element analysis. We go to our plumber, who will give us a quick-fix solution and say that a “5 hp pump will work”. But there, we are not talking about the optimum. We are trying to choose something that works. That is fine for this simple problem, as we are not looking at costly equipment, wherein optimization really matters.
1.5 Workable System and Optimum System

For a design problem, there are several possible solutions. But all solutions are not
equally desirable. Some solutions are better than others. Whether a solution is better, and how much better it is than the others, depends on the objective function, on the analyst, on the user, and on what he/she wants to call desirable. Actually, when an objective function criterion is defined, like cost, size or weight, heat transfer or pumping power, and so on, invariably only one solution will be the optimum. The goal of this book is to help us identify the optimum in a given situation. Sometimes, the optimum may not exist, and that is fine. Sometimes, the optimum may be infeasible. Sometimes, the constraints themselves may fix the solution, and so on. It suffices to say at this stage that, just as there exists a difference between design and analysis, there exists a difference between a workable system and an optimum system.
Example 1.1 Select the pump and piping for a system to convey 4 kg/s of water from
point A (in the sump) to point B (in the overhead tank). B is located 10 m higher than
A, and the total length of the pipe from A to B is 300 m.
Solution
The depiction is given in Fig. 1.7. There is a sump and an overhead tank. We are taking water from the sump to the overhead tank. The height difference is 10 m and the length of the pipe is 300 m.
f = 0.182 Re^(−0.2)
ρ = 1000 kg/m³
μ = 8 × 10⁻⁴ N s/m²
We now need to determine the pump rating and the diameter of the pipe. The length of the pipe is known. So it is a design problem and is open ended.

Let us assume a pipe diameter d of 1½″ or 38 mm.

ṁ = ρ (πd²/4) v

4 = 1000 × (π/4) × (0.038)² × v

Therefore, v = 3.53 m/s.

The Reynolds number is given by

Re = vd/ν = ρvd/μ = 4ṁ/(πdμ) = 1.7 × 10⁵
Hence, the flow is turbulent (for flow in a pipe, ReD > 2300 marks the transition to turbulence). Therefore, the friction factor is

f = 0.182 Re^(−0.2) = 0.182 × (1.7 × 10⁵)^(−0.2) ≈ 0.0164

If η = 1, then

Power = ṁ ΔP/(η ρ)
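The full sizing calculation is easy to script. Below is a minimal MATLAB sketch of Example 1.1, assuming the Darcy relation ΔP = f (L/d) ρv²/2 for the friction loss plus the static head ρgh; the final rating depends on these assumptions and on the assumed efficiency η = 1.

% Example 1.1: pump rating for 4 kg/s through 300 m of 38 mm pipe, 10 m lift
mdot = 4; rho = 1000; mu = 8e-4;     % kg/s, kg/m^3, N s/m^2
L = 300; h = 10; d = 0.038; g = 9.81;
v  = mdot/(rho*pi*d^2/4);            % mean velocity, m/s
Re = rho*v*d/mu;                     % Reynolds number
f  = 0.182*Re^(-0.2);                % friction factor correlation from the text
dP = f*(L/d)*rho*v^2/2 + rho*g*h;    % friction loss + static head, Pa
P  = mdot*dP/rho;                    % pump power for eta = 1, W
fprintf('v = %.2f m/s, Re = %.2e, f = %.4f, Power = %.2f kW\n', v, Re, f, P/1e3);

With these numbers, the script returns a rating of roughly 3.6 kW, close to the 3.486 kW quoted later in this chapter; the small difference comes from round-off and from exactly which head terms are included.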
A workable system is one which satisfies our performance requirements. In the case
of a pump and piping example, it should be able to pump the water. Is it enough if it
satisfies the performance requirements?
What about the cost? Right now, we do not know if it is the cheapest or not.
However, the design we come out with should have a reasonable cost. For example,
in this problem that we worked out, had we taken a 6 mm or ¼″ diameter pipe, we could have come out with an answer where the rating of the pump would be 10 or 25 kW. Immediately, we would have realized that although, technically, this pump would also do the job, there is something fundamentally incorrect about the value of the diameter which was assumed in the first place. We then turn around and try
to correct it. Therefore, even though several designs may work, all these may not be
workable systems, not because they do not satisfy our performance requirements,
but because they do not satisfy our criterion of a reasonable cost. What a reasonable
cost is cannot be taught. For this case, if we have a pump and piping system that
costs between Rs. 5,000 and Rs. 10,000, it is alright. But suppose it were to cost
Rs. 75,000 or a lakh,³ we know that something has gone wrong. It means some of the fundamental dimensions we have assumed are not correct. This cost consists of two components: the fixed cost and the running cost, which includes the cost of power, maintenance costs, etc. Both of these should be reasonable.
Is that all? If a system satisfies these two criteria, does it become a workable
system? We are almost there but not yet! The system should satisfy all our constraints.
The constraints can be in the form of pressure, temperature, pollution guidelines,
material properties, and so on. For example, asbestos is now banned. Let us say we
are trying to come out with an insulation system made of asbestos. It will satisfy our
performance requirements and the cost factor. But it will fail as it will violate some
regulation which is in place.
So the features of a workable system are that it
1. Satisfies the performance criteria;
2. Satisfies the cost requirements—fixed and running costs included;
3. Satisfies the constraints—pressure, temperature, pollution guidelines, weight,
size, material properties, etc.
Now that we have designed the workable system and know its characteristics and attributes, let us look back at how we carried out this design and how we went about it. What were the steps involved? The requirement was given. What
did we first choose? First, we selected the concept. It may sound silly, but we could simply hire somebody (if that somebody is available and affordable), pay the person some money, give him or her 2 or 3 buckets, and ask that person to fill up the tank. Technically, that is also a solution. The person will take buckets full of water, climb the stairs, and pour them into the overhead tank. Instead, we first decided that there would be a centrifugal pump or a jet pump and that a piping and pump arrangement would do this job more efficiently. Then we went about fixing the pertinent parameters. For
example, in this case, we fixed the diameter and performed the calculations. The final step is to choose the nearest “available solution”. So if we get the pump rating as 3.486 kW and a 3.5 kW pump is available, we use it; else, we go for the nearest rating, which may be 4 or 5 kW. Normally, the market still follows the horsepower system. So a pump will normally be available in multiples of 0.5 horsepower: 1 hp, 1.5 hp, and 2 hp. We round off our calculated pump rating to the nearest horsepower motor
and decide that this is the solution. This basically gives us an overview of how we
go about designing a simple system. If we want to design a complex system, we
may have to do a lot more calculations. We may write a computer program or take
recourse to commercially available software.
³ One lakh of rupees = Rs. 1 × 10⁵; 1 USD ≈ Rs. 70 (as of April 2019).
1.6 Optimum Design
But it is now that the actual story starts. Each of us can come out with a particular
design. But all of these are not equally desirable. When we say all of these are not
equally desirable, we are actually getting more specific. We are introducing what is
called the objective function. The objective function is mathematically denoted by
“y” and has to be declared upfront. For example, in a problem like what we discussed
a little while ago, the “y” is very straightforward; it is basically the cost.
However, in social projects, “y” is very nebulous and cloudy. For the case of a
flyover, what is the objective function? It is whether the flyover has eased the traffic
problem. It could be at several levels. The users can subjectively say yes or no. One
can do a survey or referendum and find out whether they are happy.
So we should not assume that agreement on the objective function “y” is trivial and straightforward. Oftentimes, the definition of this objective function requires a lot of time and effort. For example, if we want to design a heat exchanger, it is not always the case that we want to design it at minimum cost, because, more often than not, we may end up with a minimum cost heat exchanger, which is highly desirable, no doubt, but one that results in a lot of pumping power, which is certainly not desirable. So, shall we say that we want to have the maximum thermal performance divided by the pumping power? The ratio of the heat transfer rate to the pumping power or the pressure drop, Q/ΔP, is that a performance criterion?
So there need not be a consensus on what the objective function is. This is invari-
ably left to the analyst. He/She decides what the objective function is and then goes
about minimizing or maximizing the “y”.
The question now is: What could be the objective function in the case of the
pump and piping problem? How do we optimize the system under question? First
and foremost, we need to agree on what is to be optimized. In this pump and piping
system, what is to be optimized? We have the following costs:
1. Cost of the pump,
2. Cost of the piping,
3. Cost of installation,
4. Maintenance and running costs.
So the objective now is to minimize the lifetime total cost of the system. That is a
fair objective. Now we will have to enumerate the various components that constitute
the lifetime total cost and try to see if we can write down all these costs in terms of
some numbers or equations. We then have to assign values to the various constants
available in these equations and then use an optimization technique of our choice to
determine the optimum operating conditions for the system under question.
The costs involved are
1. Cost of pump,
2. Cost of pipe (and fittings),
3. Lifetime running cost.
Let us leave out the maintenance cost for the present. Now, is it possible for us to write all these costs in terms of the fundamental quantities which are involved in this problem? The fundamental variable in this problem is ΔP, the pressure change. Let us start with the cost of the pump. The cost is proportional to ΔP and the volumetric flow rate Q. But the volumetric flow rate is fixed in this problem, and hence does not enter the picture.

Power = ΔP · Q/η

We assume that the efficiency is constant over this period, and if we assume that the discharge is constant, the power is proportional to ΔP. Therefore,

Lifetime running cost = c + d(ΔP)
Getting c and d is again not that straightforward, as the cost of electricity changes with time. If the pump and piping system needs to work for 15 or 20 years, we have to factor in inflation. The cost of electricity is never constant; it changes. So, a fair approach would be to look at the trend over the last 5 years, assume an average inflation rate, and then factor in the inflation. Things get complicated if we are trying to design and optimize a power plant: just as the costs go up, the revenue will also keep changing with inflation. On the other hand, for the loan taken, interest needs to be paid, which will basically be computed on a diminishing balance. At the end of 5 years, say, a part of the principal would have been paid off. A large program is required to solve such a problem. For the example chosen, we will stick to this simple story. Nevertheless, we can appreciate that eventually the problem could get really messy.
The cost of the pipe is directly proportional to the mass, or, assuming that it is operating under one g, we can say that the cost is proportional to the weight. The length is pretty much fixed, as we know the length and how many bends there are.

a = πd²/4   (1.8)

Cost of pipe:

cost ∝ weight ∝ volume × density ∝ πdtlρ ∝ d

There is an assumption involved in the πdtl. What is this d now? There is a di and a do, the inner and outer diameters, so this d is actually the average diameter. It is normally called the nominal diameter. But the nominal diameter will usually not be the arithmetic average of di and do.

We do not know v directly because we have specified only the mass flow rate.
ṁ = ρAv = ρ (πd²/4) v

v = 4ṁ/(πd²ρ)

ΔP = ρg · f L v²/(2gd) = 8 f L ṁ²/(π² ρ d⁵)
We are not done yet, because f is also a function of the Reynolds number. Of course, as practicing engineers, we may neglect this and get on with it, but let us get to the bottom of it. Assuming that the flow is turbulent,
f = 0.182 Re^(−0.2) = 0.182 (vd/ν)^(−0.2) = 0.182 [4ṁ/(πdμ)]^(−0.2) ∝ d^(0.2)

Therefore,

ΔP ∝ (1/d⁵) · d^(0.2), i.e., ΔP ∝ d^(−4.8)
Now we have written the total cost in terms of ΔP. ΔP is the pressure rise taking place in the system and is under our control. For different values of ΔP, the total cost goes up with ΔP if we look at term no. 2 on the right-hand side of Eq. 1.10, while term no. 3 goes down with ΔP. Because there is one term that increases with ΔP and another which decreases with ΔP, there is hope for us to optimize: there should be a particular value of ΔP at which the total cost is extremized, that is, at which the first derivative of the total cost will be 0. However, we do not know if this will be the maximum or the minimum cost. Even so, after doing all this, intuitively, we can guess that it will be the minimum cost. However, as a purist, one should confirm this with a second derivative test.
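To see the trade-off numerically, here is a small sketch that minimizes a total cost of the form C(ΔP) = a + b·ΔP + c·ΔP^(−1/4.8), where the pump and running costs grow linearly with ΔP and the pipe cost falls as d ∝ ΔP^(−1/4.8); the constants a, b, c below are purely illustrative.

% Trade-off between pumping cost (rises with dP) and pipe cost (falls with dP)
a = 5000; b = 0.02; c = 4e4;             % illustrative cost constants
C = @(dP) a + b*dP + c*dP.^(-1/4.8);     % total cost as a function of dP (Pa)
dPopt = fminbnd(C, 1e3, 1e7);            % search over a plausible range
fprintf('optimum dP = %.0f Pa, minimum cost = %.0f\n', dPopt, C(dPopt));

A quick check of the second derivative, C″(ΔP) = c (1/4.8)(1 + 1/4.8) ΔP^(−1/4.8 − 2) > 0, confirms that the extremum found here is indeed a minimum.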
Reference

Stoecker, W. F. (1989). Design of Thermal Systems, 3rd edn. New York: McGraw-Hill.
Chapter 2
System Simulation

2.1 Introduction
System simulation basically mimics an actual system. It can be defined as the calculation of operating variables such as pressures, temperatures, concentrations, and mass flow rates of fluids in a thermal or energy system operating in a steady state or under transient conditions. For example, in a power plant, most of the time we are interested in its steady-state operation, but there are also issues like the starting up and shutting down of the power plant, which are very critical, especially for a nuclear power plant. When an accident occurs in a nuclear power plant, there is an emergency shutdown of the reactor. However, nuclear fission will continue to proceed, and if all systems break down, we will not have pumps to take the heat from the fission reaction, and therefore we will not be able to dissipate all the heat to the ambient. In this case, natural convection will take over, and so the system should be designed such that even in the case of such an accident, even under natural convection, we do not have catastrophic consequences. Nuclear safety is thus an important part of nuclear plants and is critically linked to the transient behavior of the reactor. So, as thermal engineers or mechanical engineers, though most of the time we are interested in the steady state, we are also interested in transients. A chemical engineer, on the other hand, is more interested in transients, and process control is a big deal in chemical engineering.
Much in the same way as we calculated the diameter of the pipe and the rating
of the pump in the example considered in Chap. 1, we should be able to calculate
the other operating variables too in a problem involving several variables. It could
be possible that we actually have a heat exchanger and we want to work out the
outlet temperatures of the fluids. Now we are talking about an analysis problem.
The design problem on the other hand answers questions like given the flow rate
and other constraints, what will the heat exchanger size be, how many tubes, etc.?
We have already seen the difference between the analysis and design problems. So
simulation is more concerned with the analysis rather than the design. This definition
is basically for a thermal system and for us to do system simulation, we need to
know the (i) performance characteristics of all components and (ii) equations for all
thermodynamic properties of the working substances.
Now, in order to do system simulation, it is not practical to use property tables. The properties must preferably be in the form of equations, and regression is required for this. So one needs to be good at statistics. For example, if the enthalpy (y) is given as a function of pressure (x1) and temperature (x2), we should be able to construct equations such that y is a function of x1 and x2. Similarly, if one knows the operating variables of a compressor or a blower, we should be able to calculate the efficiency in terms of these operating variables. Therefore, a knowledge of regression and of how to represent thermophysical properties, system performance, and so on in the form of equations is imperative. This information has to be embedded; otherwise, we cannot do simulation.
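As a concrete illustration, a property surface such as h = f(p, T) can be fitted by linear least squares. The sketch below assumes a simple bilinear model and synthetic data points; real property data from tables would replace them.

% Fit h = a0 + a1*p + a2*T + a3*p*T by linear least squares
p = [1; 1; 2; 2; 3; 3];                    % pressure samples, bar (synthetic)
T = [300; 350; 300; 350; 300; 350];        % temperature samples, K (synthetic)
h = [2700; 2800; 2690; 2795; 2680; 2790];  % enthalpy, kJ/kg (synthetic)
A = [ones(size(p)) p T p.*T];              % design matrix
a = A\h;                                   % least-squares coefficients
hfit = @(p,T) a(1) + a(2)*p + a(3)*T + a(4)*p.*T;
fprintf('h(1.5 bar, 320 K) = %.1f kJ/kg\n', hfit(1.5, 320));

Once such equations are in place for every property and component, the simulation reduces to solving them together.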
The equations for the performance characteristics of the components and the thermodynamic properties, along with the mass and energy balances, will form a set of simultaneous equations that relate the operating variables. If we solve this set of simultaneous equations, we will be able to solve for all the parameters in question. This is basically system simulation. There may be situations involving a large number of variables, and we will have to do matrix operations for such large systems.
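A minimal sketch of this idea follows: two made-up component characteristics (a pump curve and a system curve, with assumed coefficients) are solved simultaneously for the operating point using fsolve from MATLAB's Optimization Toolbox.

% Operating point of a pump-pipe system: two equations solved simultaneously
% x(1) = flow rate Q, x(2) = head H (all coefficients assumed)
pumpCurve   = @(x) x(2) - (40 - 30*x(1)^2);   % pump: H = 40 - 30*Q^2
systemCurve = @(x) x(2) - (10 + 80*x(1)^2);   % system: H = 10 + 80*Q^2
F = @(x) [pumpCurve(x); systemCurve(x)];
x = fsolve(F, [0.5; 20]);                     % initial guess for [Q; H]
fprintf('Q = %.3f m^3/s, H = %.2f m\n', x(1), x(2));

For two equations, one could even iterate by hand, but a general equation solver scales to the large equation sets typical of real thermal systems.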
System simulation is thus a way of mimicking the performance of a real sys-
tem. So instead of performing a real experiment, we try to find out how the output
will change when each of the inputs is changed. We have a computer model or a
mathematical model of the system and carry out numerical experiments upfront,
even before we design. By way of this, we get an idea of how the performance of
the system changes when the operating variables are changed. This is the key goal
of system simulation. Needless to say, the end product of system simulation is a
workable design or workable designs. In view of this, system simulation becomes a
precursor to optimization.
Why are we interested in all this? Because in a power plant, an air conditioning system for an auditorium, or a heat exchanger, where large costs are involved, we are trying not only to simulate but also to optimize. Costs are going up, and so simulation and optimization are being taken more seriously lately. Accompanying this is the fact that we have powerful computers and software programs that can do all these analyses very quickly. By combining these analyses with an optimization technique, we can attempt to arrive at an optimum.
may take 2 or 3 h only. The hospital administration wants to study this and get to the bottom of it. So what are the various steps involved? For example, in a bypass surgery, the patient is first wheeled in. Then the anesthetist moves in. Then they make measurements (of the chest, where the sternum has to be incised). Surgery is largely engineering nowadays! X-rays and echocardiogram reports are used to see the size and then mark the place where the cut is to be made. Then the main surgeon comes in and anesthesia is administered; the main surgery begins, wherein they hook the patient onto a heart-lung machine, or nowadays, they do a beating heart surgery (the heart is not technically stopped here). Then the incision is closed and the patient is taken to the ICU, where he is monitored.
Each of these events can have a Gaussian distribution. The normal time taken by the anesthetist is, say, 30 minutes with a σ of 5 or 8 minutes. The normal time taken for a bypass to be done on a patient with 3 blocks will be, say, 3 h with a σ of 15 minutes. Suppose one wants to do a Monte Carlo simulation, he/she will generate a random number. If it is between 0 and 50, then the anesthetist will finish his/her job exactly at (mean + σ). If it is between 50 and 70, he/she will take (mean − σ). If it is between 70 and 90, something else. Like this, we pre-assign numbers. We come out with a model, and using sequential random numbers, we add up all this and find out the total time taken. Let us say the total time is 218 minutes for one case. We can run this Monte Carlo simulation several times, calculate the average time, and take the variance, and that will give us an idea of the average time a bypass surgery takes in that hospital. That is one piece of information we can have. Like that, if we do it for all surgeries, we can determine the total number of operations that can be performed in a day and also how they can all be scheduled.
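A compact version of this experiment is sketched below: the stage durations are drawn directly from normal distributions (a simplification of the random-number mapping described above), and the means and standard deviations are illustrative, partly taken from the text and partly assumed.

% Monte Carlo estimate of total bypass surgery time
N = 1e5;                               % number of simulated surgeries
prep     = 15 + 3*randn(N,1);          % prep/measurement, min (assumed)
anesth   = 30 + 5*randn(N,1);          % anesthesia: mean 30, sigma 5 min
surgery  = 180 + 15*randn(N,1);        % bypass: mean 3 h, sigma 15 min
close_up = 25 + 4*randn(N,1);          % closing the incision, min (assumed)
total = prep + anesth + surgery + close_up;
fprintf('mean = %.1f min, std = %.1f min\n', mean(total), std(total));

With these numbers, the mean comes out near 250 minutes with a spread of roughly 17 minutes (one σ); scheduling decisions can then be based on, say, the 95th percentile of the simulated times rather than the mean.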
Another example that comes to our mind is from the hospitality industry where
corporate hotels are interested in the average time taken by a guest between his/her
entering the hotel and his/her entering the room. Can we optimize this? How much
time is taken for checkout? These are all related to queueing theory in operations
research. In all these cases, including the hospital case, we cannot say with certainty that all surgeons will complete the surgery in exactly 2.5 h. They may encounter something unexpected, or they may find something new. This is not deterministic but stochastic, because the variables can change with time.
In a deterministic simulation, all the variables are known a priori with certainty.
We can also have a continuous and discrete simulation. Oftentimes in thermal sys-
tems, we are interested in the continuous operation of power plants, air conditioning
systems, IC engines, and so on. The flow of fluid is assumed to be continuous. We
do not encounter discrete kinds of systems in thermal sciences often. Simulation
of discrete systems is of particular relevance to manufacturing engineering where
we need to look at individual items as they go through various metallurgical and/or
manufacturing processes.
2.4 Information Flow Diagrams
The information flow diagram is a pictorial way of representing all the information
which is required for simulating the overall system by looking at the information
pertinent to a particular component.
The information flow diagram tells us what the inputs to this block are, what the
outputs from the block are, and more importantly, the equations relating these quan-
tities. Figure 2.2 shows the information flow diagram for a typical heat exchanger.
The inputs here are the inlet temperatures of the hot and cold fluids, Th,i and Tc,i ,
respectively, and the flow rates of the hot and cold fluids, ṁ h and ṁ c , respectively. In
fact, if three of the four quantities are known, the fourth one gets fixed automatically by the energy balance, where

Q = (ṁ Cp ΔT)hot/cold   (2.4)
Shown in Fig. 2.3 is the information flow diagram of an air compressor. The
inputs are the mass flow rate, ṁ, the inlet pressure, p1 , and the output is the outlet
pressure, p2. The transfer function f is f(ṁ, p1, p2) and gives the performance of the compressor:

Power − ṁ P/η = 0   (2.5)
Now let us look at the information flow diagram for a vapor compression refrigeration system. The thermodynamic cycle for this system on T-s coordinates is given in Fig. 2.4. The various thermodynamic processes associated with this reversed Rankine cycle are

1-2 → compression
2-2′-3 → condensation
3-4 → throttling/expansion
4-1 → evaporation

The key quantity of interest for this system is the coefficient of performance (COP), given by

COP = (h1 − h4)/(h2 − h1)   (2.6)
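A quick numeric check of Eq. 2.6 is given below, with illustrative enthalpy values at the state points (assumed round numbers, not taken from any refrigerant property table).

% COP of a vapor compression cycle from state-point enthalpies
h1 = 400; h2 = 440; h4 = 250;   % kJ/kg, illustrative values
COP = (h1 - h4)/(h2 - h1);      % refrigerating effect / compressor work
fprintf('COP = %.2f\n', COP);   % (400-250)/(440-400) = 3.75

In the information flow diagram of Fig. 2.5, these enthalpies are themselves outputs of property equations, which is exactly why property regressions are needed before simulation.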
Fig. 2.5 Information flow diagram for a vapor compression refrigeration system
Solution
We do not worry about the free surface. We first draw a schematic, as shown in
Fig. 2.6. Let us assume that the container is fully closed.
The problem under consideration is an unsteady one. Assuming that the coffee is well mixed and that there are no temperature gradients within it, the temperature of the coffee decays exponentially with time, as shown in Fig. 2.7.
Let us start with drawing the blocks involved. We can now list all the heat transfer
processes associated with this problem.
Even this apparently simple problem has so many heat transfer processes associated
with it. The corresponding information flow diagram is given in Fig. 2.8. We can
write the pertinent differential equations, and solve them to obtain the instantaneous
temperature of the coffee. However, what we want to do now is to model it even
before we start the simulations. So modeling precedes simulation and simulation
precedes optimization. This is the story we are going to look at throughout this
book: modeling, simulation, followed by optimization. The temperature T of the
coffee is generally not under our control. We cannot overheat the coffee, as water
boils at 100 °C at 1 atmosphere pressure; so the temperature of the coffee may be
60 or 70 °C. The temperature at which we normally drink coffee may be around
45 or 50 °C. The ambient temperature is 30 or 35 °C. Neither of these is under
our control. Now that we have drawn the information flow diagram, the possible
steps to reduce heat loss are
1. Low emissivity coating on the outside.
2. For the same volume, we can make a tall container so that the air layer is slender
and tall so that we reduce the natural convection. If we look at natural convection
from a cavity like the air gap in this problem, we see that there is a heated wall on
the inside, a cooled wall on the outside, and the top and bottom walls are adiabatic
(an assumption). If we go in for a tall flask, it results in a multicellular pattern,
which resists the flow of heat and the convective heat transfer coefficient reduces.
Hence, another option is to redesign the shape.
3. The third option is to use better insulation material.
4. We can try to maintain a vacuum between the layers.
Between the air gap and layer 2, we can also have radiation. As we start probing
deeper and deeper, this problem itself starts getting messier and messier. We have to
just leave it at some stage as is the case with any problem. The perfect mathematical
model may be elusive!
For this problem, the initial temperature of the coffee is known and is an input. The
temperature of the coffee at various instants is what we desire from the modeling. A
question that arises is this: the temperature of the coffee varies with time and depends
on all the boxes, so is one right in saying that the information is flowing in only one
direction?
The answer to this question is that there is no feedback loop because the inside
of the flask is hotter than the outside. All the heat is flowing from the inside to
the outside. There is no question of something coming back. This is quite different
from the vapor compression system where we have a closed loop. Here, there is
no refrigerant that flows continuously. That is the difference between a sequential
arrangement and an arrangement like this.
By system simulation, we mean solving a set of equations which controls the phe-
nomenon under consideration. If the system consists of more than one component,
then we have to assemble the equations which govern the performance of each of
these components. Now this can be at various levels. We can have just algebraic
equations which give us the performance of the system or these can be governed by
ordinary differential equations or linear partial differential equations or even nonlin-
ear partial differential equations. The latter arise in problems involving fluid flow and
heat transfer (continuity equation, the Navier–Stokes equations, and the equation of
energy). What we have listed above represents escalated levels of difficulty.
The central theme of this topic, in so far as thermal system design is concerned, is
looking at situations where more than one component is involved. Hence, we restrict
our attention to algebraic equations; to make the analysis realistic and more meaningful,
and also to add a little spice, we will look at nonlinear algebraic equations. In many of
the components like compressors, turbines, and pumps, the dependent variable can
be related to the independent variable in terms of simple linear/nonlinear algebraic
equations.
We now look at two techniques, namely, successive substitution and the Newton–
Raphson method.
The goal is to make f (x) = 0 and find out where this happens. These are then
the roots of the equation.
In successive substitution, we write an algorithm for this procedure and can either
do the calculations using a pen and paper or write a program and solve it on the
computer. What will the algorithm for this be?
We make a table, as shown in Table 2.1, where the first column is the serial
number or iteration number, the second column is x_i, the third column is x_{i+1},
and the fourth column is (x_{i+1} − x_i)^2, which serves as a convergence norm.
Whenever the fourth column falls to a reasonably small value, we can stop the
iterations and be satisfied that we have a solution to the problem. We take an initial
guess of x_i and solve. Taking an initial value of x_i as 1, the first 3 iterations are as
shown in the table. As we continue doing this, let us consider the 8th iteration.
As we can see, after a few iterations, the method of successive substitution
works. The problem is that if the function changes very rapidly, there is a
possibility that the method of successive substitution may fail miserably.
If the gradients are very sharp and/or we start with a wrong guess, we may end up
with a solution that diverges and we will be unable to proceed further. So the method
of successive substitution is not a universal cure or panacea that can be used to solve
all kinds of problems. Even so, simple problems can be solved with the help of
this method. We can also demonstrate this with the help of a spreadsheet (left as an
exercise to the student). A graphical depiction of this example is given in Fig. 2.9.
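The tabular procedure is easy to program. Below is a minimal MATLAB sketch of it; since the equation of the example is not reproduced here, the rearrangement g(x) = exp(−x) (i.e., solving x = exp(−x)) is assumed purely for illustration.

g = @(x) exp(-x);                % assumed rearrangement x = g(x)
x = 1;                           % initial guess, as in the text
fprintf(' i       x_i       x_i+1    (x_i+1 - x_i)^2\n');
for i = 1:50
    xNew = g(x);                 % successive substitution step
    fprintf('%2d  %9.6f  %9.6f  %12.3e\n', i, x, xNew, (xNew - x)^2);
    if (xNew - x)^2 < 1e-10      % the norm in the fourth column
        break
    end
    x = xNew;
end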
Solution
The wire is losing heat by both convection and radiation. The surface emissivity and
the local heat transfer coefficient are given. The surface area is not required. Power
density is given. There is no need to draw an information flow diagram for this
problem. It is a single-component system, and listing the various heat transfer
processes in the system is relatively straightforward. First, we will draw a sketch of
the system indicating all the heat transfer processes taking place (see Fig. 2.10). We
then write down the governing equation and start solving. The governing equation is
q = h(T_w − T_∞) + εσ(T_w^4 − T_∞^4) (2.10)
When we write an equation like this, first we state the inherent assumptions as follows:
1. Steady state,
2. No temperature gradients in the conductor: this is not really correct because it is
generating heat at the rate of 900 W/m2 . Where will the conductor be the hottest?
At the center. But we are not considering this because we are not looking at this
level of detail here!
3. Emissivity of the surroundings is 1. The ambient is considered to be a black body at
T∞ . The most important thing is that the ambient temperature for free convection
is the same as the ambient temperature for surface radiation. Oftentimes, without
realizing, we just take this for granted. It is possible that in a room, we may
have a T∞ for convection and because of reflectors and other stuff, we may
have a different ambient temperature, as far as radiation is concerned. But unless
otherwise stated, the T∞ for radiation is the same as that for convection. Even so,
it is good to remember that it need not always be true.
Two types of information flow diagrams are possible. Using one information flow
diagram causes the solution to diverge. Then we have to turn around and use the
other information flow diagram.
Information flow diagram (a)
The above is a highly nonlinear problem, due to the presence of the T^4 term on the
right-hand side. We will not investigate in detail why one algorithm does not work.
Even so, suffice it to say that because of the T^4 term, one of the two algorithms
will not work. The errors
will propagate for one of the two algorithms. We know that it is a physical problem:
a conductor is cooling, it has got emissivity, it has got convection and radiation, and
it has to attain a physical temperature. 900 W/m2 is a reasonable flux; emissivity,
h, and ambient temperature are all reasonable values and hence we “should” get a
reasonable value for the temperature. If we, however, get an unrealistic solution, the
problem is surely not with the physics but with the algorithm!
When we try both, we see that the second one fails. So we find that even a one-
component system is not as simple as we thought. Suppose we have a two-component
system and it is nonlinear. The solution will be a lot more formidable.
In the second algorithm, the moment T_i on the right-hand side exceeds 400 K, the
solution is gone. If we take the slope and investigate it mathematically, we can see
this, but it is not central to our discussion here. So, if we solve a problem using
successive substitution, encounter difficulty, and get an answer that is negative or
diverging so that we are unable to proceed further, we should not declare that this problem has no
solution. Because from the data given to us, there must be a solution to the given
problem. We should immediately think of the other information flow diagram.
There are two information flow diagrams. So in the first algorithm, we are saying
that

Q_conv = Q_total − Q_radn

Had we started with 364 or 365 K as the initial guess, it may have worked. But, had
we started with an initial guess of 300 K or 400 K, it may not have worked.
Now we will go to a real thermal system, namely the fan and duct system. Let us
consider a two-component thermal system, like a ducting system for the air condi-
tioning of an auditorium. We want to decide on the fan capacity. We are not talking
about the chiller capacity or how many tons of refrigeration are required. We want
to send in chilled air and have to decide on the fan and ducting system. It is possible
for us to calculate the total length of the duct, how many bends are there, and so
on. If we can calculate the total pressure drop as we did before, using formulae, this
multiplied by the discharge and divided by the fan efficiency will give us an idea of
the power required.
For overcoming a particular head, we will also know what the discharge is from
the manufacturer’s operating characteristics. These are called fan curves. The fan
curve is obtained from the manufacturer’s catalog. But how would this have been
obtained? Basically through experiments. Someone must have first done an experiment
with almost zero discharge, then slowly increased the discharge and determined
the head, thereby obtaining a set of points. But if we want to do a simulation, these
discrete points alone are not enough for us. We want a curve.
Hence, we will have to convert all these points into an approximate and best-fitting
curve using principles of regression.
Now if we have to design the fan and ducting system, the fan characteristics of the
manufacturer have to match the load characteristics of the particular design. When
these two intersect at a particular point, this is called the operating point. This is
graphically depicted in Fig. 2.11. The head is usually in meters and the discharge in
m3 /s. Using system simulation, we want to calculate this operating point.
The duct curve can easily be plotted if we know the friction factor and other
associated quantities.
Subsequently, we have to determine the sensitivity of the operating point. It is
possible that with time, our fan curve goes down (it cannot possibly go up!). The
duct curve may also change, depending on the factors that affect it, like age, friction
factor, etc. So we will have to study the sensitivity of the performance of the system
or equipment with respect to changes in the design operating conditions.
Example 2.3 It is required to determine the operating point of a fan and duct
system. The equations for the components are

P = 100 + 9.5 Q^1.8 (duct) (2.13)

Q = 20 − 4.8 × 10^−5 P^2 (fan) (2.14)

where P is the pressure in Pa and Q is the discharge in m^3/s.
Solution
An important point here is the initial guess. We can either take P to be 250 Pa or Q to
be 10 m3 /s. So we first draw the information flow diagram for the two components,
the fan and the duct. We draw two rectangular boxes, where information goes in and
comes out. This is a sequential arrangement because the output of the fan becomes
the input to the duct and the output of the duct is the input to the fan. Two types of
information flow diagrams are possible. We can use the first equation to determine
P if we know Q. But there is no hard and fast rule about this.
So we can say

Q^1.8 = (P − 100)/9.5 (2.15)

Q = ((P − 100)/9.5)^(1/1.8) (2.16)
Equation 2.13 can be used to calculate P, given Q or vice versa. By the same token,
the second equation (Eq. 2.14) also need not necessarily be used only to calculate Q;
it can also be used to calculate P.
So if we want to calculate P and Q as shown in the equations, it is one information
flow diagram. If we want to swap the calculations of P and Q in the equations, it is
another information flow diagram. And since the equations are nonlinear (we have
P^2 and Q^1.8), one of the two information flow diagrams will not work. At least, this
is our belief, reinforced by our bitter experience with the previous example.
Total head = Static head + Dynamic head

Dynamic head = f l v^2/(2gd) = 0.182 Re^−0.2 (l v^2)/(2gd) = 0.182 (vd/ν)^−0.2 (l v^2)/(2gd) (2.17)
The dynamic head thus becomes v^2 multiplied by v^−0.2, which gives us v^1.8. Then what
does the value 100 indicate in the first equation (Eq. 2.13)? It is the static pressure;
using regression, we can get that. Equation 2.14 cannot be obtained from physics: the
manufacturer does experiments and reports the result in the catalog. It is called
the fan characteristic, and this is the fundamental experiment the manufacturer has to do
and give the user.
How did we get this Q^1.8? Normally the dimensions are fixed; the diameter of
the pipe or the hydraulic diameter of the rectangular duct is fixed, so the velocity is
proportional to the discharge Q and v^1.8 translates into Q^1.8. P is the head.
We now use the fan characteristic to determine Q, knowing P and the duct charac-
teristic to get P from Q. So at the end of the exercise, whatever P we get, we compare
it with the P we assumed and started with. If they are equal, we have the solution.
In general, they will not be equal, so we keep working on it. What goes through this
is the information on P and Q. As we proceed with the iterations, it will stabilize
and whatever is coming out will be the same as what was sent in. That will result in
a situation where (P_{i+1} − P_i)^2 will be very small. Is this enough for this problem?
What should be the stopping criterion? This alone is not sufficient; we need to
incorporate Q as well. So we write the stopping criterion as

error = ((Q_{i+1} − Q_i)/Q_i)^2 + ((P_{i+1} − P_i)/P_i)^2 ≤ 10^−7 (2.18)
Table 2.3 Successive substitution for the fan and duct problem (using the first information
flow diagram, Fig. 2.12)

S. No.   P_i, Pa   Q_i, m^3/s   P_{i+1}, Pa
1        250       17           1657
2        1657      −111.91      Diverges

Table 2.4 Successive substitution for the fan and duct system (using the second information
flow diagram, Fig. 2.13)

S. No.   P_i, Pa   Q_i, m^3/s   P_{i+1}, Pa   error (Eq. 2.18)
1        250       4.62         566.1         2.367
2        566.1     8.67         485.8         0.0299
3        485.8     7.81         503.9         2.04 × 10^−3
4        503.9     8.01         499.8         9.11 × 10^−5
5        499.8     7.97         500.6         4.1 × 10^−6
6        500.6     7.98         500.4         2 × 10^−7
7        500.4     7.98         500.4         0
Hence, we have to use the other information flow diagram, given in Fig. 2.13:

Q = ((P − 100)/9.5)^0.555 (2.19)

P = ((20 − Q)/(4.8 × 10^−5))^0.5 (2.20)
We see that the solution is rapidly converging (see Table 2.4). The method of
successive substitution converges rapidly, but when it starts diverging, that too will
be rapid. Success or failure is guaranteed and is immediate.
The stopping criterion cannot be 10^−8 when we are rounding off to two decimal
places. If we are retaining up to the fourth decimal, then it can be of the order of
10^−4. So the solution to the problem is P = 500.4 Pa and Q = 7.98 m^3/s.
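A minimal MATLAB sketch of the converging arrangement (Eqs. 2.19 and 2.20) is given below; it reproduces the iterates of Table 2.4.

P = 250;                                 % initial guess for the pressure, Pa
err = 1;
i = 0;
fprintf(' i     P_i, Pa   Q_i, m^3/s   P_i+1, Pa\n');
while err > 1e-7
    i    = i + 1;
    Q    = ((P - 100)/9.5)^0.555;        % duct curve, Eq. 2.19
    Pnew = ((20 - Q)/(4.8e-5))^0.5;      % fan curve, Eq. 2.20
    Qnew = ((Pnew - 100)/9.5)^0.555;     % next Q, needed for the error
    err  = ((Qnew - Q)/Q)^2 + ((Pnew - P)/P)^2;   % stopping criterion, Eq. 2.18
    fprintf('%2d  %9.1f  %10.2f  %10.1f\n', i, P, Q, Pnew);
    P = Pnew;
end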
Let us now examine the stability of the system at its operating point. For this, we
start with an initial guess which is close to the correct answer, but to its left or
right, and see whether the iterations proceed to the same operating point. We have
to go back to the equations and check whether we overshoot. If we start with P as
490 or 510, after 3 iterations, does it come to 500.4 and 7.98? That is one way of
checking it. The other way is to start with Q as 7.5 or 8.5 and see if it approaches
the same value.
We may think that this is not the most important part of the problem, while getting
the algorithm and iteratively solving it is the most important. However, the stopping
criterion is also equally important because in real life, no one will tell us what the
stopping criterion is. It is for us to figure out and for us to get convinced that we have
reached the level at which the solution will not change. This is the level of accuracy
we are comfortable with. If we know what the true value is, we can compare our
result with the true value; if we get to within 1% of the true value, it is fine. But
when we are doing CFD computations or simulations, the true value is elusive. We
do not know what the true value is. We are looking at the difference between successive
iterations. A word of caution may be necessary here. Convergence can be highly
misleading, as we can finally converge to a beautiful but totally incorrect solution.
So if we have reached convergence, it does not necessarily mean that we have
got an accurate solution. Convergence means, with the present numerical scheme,
if we do 100 more iterations, we cannot expect a significant change in the answer.
But whether that answer itself is correct or not, convergence will not tell us! So as
an additional measure, we will have to validate our solution. We need to apply our
numerical technique to some other problem for which either somebody else has done
experiments or for which analytical solutions are possible. If, for that case, after we
run through many iterations with our stopping criterion, we get a solution that agrees
with what others have reported or with the analytical solution, then we can say that
when we apply this stopping criterion, our procedure works. Therefore, when we
apply it to an actual problem, for which we do not know what the true solution is,
there is no reason why it should misbehave.
Why do some information flow diagrams not work?
Let us try to figure this out from a numerical point of view. If we are trying to get
the roots of the equation g(x) = 0 around x = α, the goal is
g(α) ≈ 0 (2.21)
Then α is a root of the equation. It won’t be 0.0000, but within limits, it is alright.
The requirement now is
|g′(α)| < 1 (2.22)

We can do a Taylor series expansion, take an example, and show that when |g′(α)| > 1,
the whole thing will diverge.
Let us try to answer the question of why the information flow diagram does not
work. Consider the same Example 2.3. We substitute the initial condition values and
determine dP/dQ and dQ/dP. In one information flow diagram, one of the derivatives
is very small compared to 1 and the errors will not grow. The other one is trying to
create "some mischief" but, after a few iterations, a balance is reached. If we look
at the other information flow diagram, we get very large values for both
derivatives. This justification does give us an approximate understanding of why
certain information flow diagrams do not work. As we know, since P and Q are
continuously changing, with every iteration, these will fluctuate. But one of the
derivatives is showing promise. If we work these derivatives out for the third or
fourth iteration, the value will be less than 1 and that is how it eventually converges.
In the previous information flow diagram, if we look at the derivative, it is large and
that is why it is diverging. Here, though one of the two derivatives is more than 1,
the subsequent iterates reduce and die down.
Already, we are not allowing the errors to propagate, as we are raising the iterates
to a power of 0.555. When we raise something to the power of 1.8 or 2, mischief is
guaranteed. We do not know if the other information flow diagram works when one
starts very close to the final solution. This is a good exercise, in the sense that it tells
us how to rewrite the governing equations in such a way that even when we are way
off from the final answer, because we are raising something to the power of 0.555,
the errors remain constrained or bounded. We can work it out (readers are encouraged
to do this exercise).
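A minimal MATLAB sketch of this derivative check, evaluated near the operating point, is shown below. Combining the two slopes into a single product around the loop is a convenient summary of the argument; this combined test is not spelled out in the text and is used here only as an illustration.

P = 500; Q = 7.98;                       % values near the operating point
% Converging arrangement: Q from the duct curve, P from the fan curve
dQdP = (0.555/9.5)*((P - 100)/9.5)^(0.555 - 1);       % slope of Eq. 2.19
dPdQ = (-0.5/4.8e-5)*((20 - Q)/(4.8e-5))^(-0.5);      % slope of Eq. 2.20
fprintf('|dQ/dP| = %.4f, |dP/dQ| = %.2f\n', abs(dQdP), abs(dPdQ));
fprintf('Product for this arrangement     : %.3f (< 1, converges)\n', abs(dQdP*dPdQ));
fprintf('Product for the other arrangement: %.3f (> 1, diverges)\n', 1/abs(dQdP*dPdQ));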
We will see one last example, which concerns the movement of a truck/lorry on
a ghat road. There are two characteristics involved in this: the load characteristic
and the engine characteristic, as shown in Fig. 2.14. These two have to match in
order that we obtain the operating point. For different transmission settings, we will
have different
curves and different operating points. When a vehicle is climbing uphill, sometimes
when the vehicle is in the fourth gear or sometimes even in third gear, it may produce
some odd sounds. This means the engine is telling us that we have to downshift.
We shift down to the appropriate gear so that the load curve and engine curve match
and we stay very close to the operating point. This example will highlight how we
employ the successive substitution for such a two-component system.
Solution
The load characteristic is a straight line, while the engine characteristic is a parabola,
as it is quadratic in ω. It has two operating points. We can use two possible information
flow diagrams. We can determine T from ω using the equation

T = 0.7ω + 185

or we can write it as

ω = (T − 185)/0.7 (2.26)

and use it to determine ω from T. The two information flow diagrams should lead
to the two possible solutions. Let us rewrite the equations to find ω and T:

ω = (11 ± √(11^2 − 4 × 0.12 × (T − 18)))/(2 × 0.12)

T = 0.7ω + 185
Let us assume an initial value of torque of 220 N m. The ± appearing in the
equation for ω gives rise to two values, corresponding to the points A and B in the
figure.
After 5 iterations, we find that the solution converges to a torque of 229.9 N m
and a speed of 64.14 rps (see Table 2.5). Let us do the stability analysis around point
A. If we start with T as 300 N m and do two quick iterations using the above
algorithm, we will see that the algorithm hits the same solution for T and ω.
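A minimal MATLAB sketch of this loop, using the root of the engine characteristic corresponding to point A, is given below.

T = 220;                             % initial guess for the torque, N m
for i = 1:10
    w    = (11 + sqrt(11^2 - 4*0.12*(T - 18)))/(2*0.12);  % engine characteristic
    Tnew = 0.7*w + 185;                                   % load characteristic
    fprintf('%2d  T = %7.2f N m, w = %6.2f rps\n', i, Tnew, w);
    if abs(Tnew - T) < 1e-3          % simple stopping criterion
        break
    end
    T = Tnew;
end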
Consider the equation y = f(x). The goal is to determine the values of x at which
y = 0.
Let us draw a tangent at the point on the curve corresponding to x_i and extend it
to cut the x-axis; the point at which it intersects the x-axis is taken as x_{i+1}. The
procedure is graphically elucidated in Fig. 2.15. The algorithm for this method can
be written as

f′(x_i) ≈ (f(x_i) − 0)/(x_i − x_{i+1}) (2.27)

x_i − x_{i+1} = f(x_i)/f′(x_i) (2.28)

x_{i+1} = x_i − f(x_i)/f′(x_i) (2.29)
This is one of the most powerful and potent techniques in solving single-variable
problems and can also be used for multivariable problems. The next step is to go
to xi+1 , extend it to intersect the curve, draw the tangent at that point, and get x
again. Hopefully, it will converge to the correct solution within a few iterations. The
major difference between this method and the successive substitution method is that
we are using the information on the derivatives. Since we are using the information
of the derivatives, the convergence should be very fast. But we cannot say that
convergence is guaranteed. The advantage is obvious: it converges fast. But let
us look at the limitations, which are not so obvious.
f(x_{i+1}) = f(x_i) + f′(x_i)(x_{i+1} − x_i) + (f″(ζ)/2!)(x_{i+1} − x_i)^2 + · · · (2.30)
where ζ lies between x_i and x_{i+1}. Neglecting terms beyond the linear term, we
get the same solution as before. When x_i and x_{i+1} are both close to the true
solution, the higher order terms do not matter. It then becomes a mathematical
question of what this ζ is, where we want to evaluate the second derivative, and so
on; let us not worry about that. What are we trying to make 0 here? We force
f(x_{i+1}) = 0.
So we see that both the Taylor series expansion and the graphical method give the
same result. Let us start with an example where the Newton–Raphson method is not
powerful at all.
Example 2.5 Determine the roots of the equation f(x) = x^8 − 1 using the
Newton–Raphson method, with an initial guess of x = 0.5.
Solution
Before drawing the tabular columns, we write down the algorithm.
x_{i+1} = x_i − (x_i^8 − 1)/(8 x_i^7) (2.35)
Let us plot the function and see how it looks. A plot of the function is given in
Fig. 2.16. Now we try to figure out why it is misbehaving. We started at 0.5, which
seems a reasonable guess. The problem is that where the slope is very gentle, the
method takes forever to converge.
We see from Table 2.6 that the convergence is very slow; it takes some 24
iterations to reach the root at x = 1.000. So the conventional wisdom that this
method is very fast because it uses the derivative does not hold true at all times!
Now let us work out a problem in which the Newton–Raphson method really works.
Example 2.6 Use the Newton–Raphson method to determine the first positive
root of the equation f(x) = x − 2 sin x, where x is in radians, with a starting
guess, x_0 = 2 radians.
Solution
Let us first write the algorithm.
x_{i+1} = x_i − (x_i − 2 sin x_i)/(1 − 2 cos x_i) (2.36)
Figure 2.17 shows the graphical representations of the two functions y = x and
y = 2 sin x. The solution is x = 1.895. The curve with diamond symbols shows
the difference between the two functions, which is actually f(x); one can see where
f(x) becomes zero. We are looking at the first positive root; there are also negative
roots, because f(x) keeps going on the other side too.
The method is so fast that there is no comparison with successive substitution. If
our initial guess is in the right region and the function is not changing very slowly,
there is a good chance that we will get convergence within 3-4 iterations (see Table 2.7
for the solution).
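A minimal MATLAB sketch of the Newton–Raphson loop for Example 2.6 is given below. Replacing f and fp with x^8 − 1 and 8x^7 and starting from x = 0.5 reproduces the slow crawl of Example 2.5.

f  = @(x) x - 2*sin(x);          % function whose root is sought
fp = @(x) 1 - 2*cos(x);          % its derivative
x  = 2;                          % initial guess, radians
for i = 1:50
    xNew = x - f(x)/fp(x);       % Newton-Raphson step, Eq. 2.29
    fprintf('%2d  x = %.6f\n', i, xNew);
    if abs(xNew - x) < 1e-6
        break
    end
    x = xNew;
end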
f(x_t) = f(x_i) + f′(x_i)(x_t − x_i) + (f″(ζ)/2!)(x_t − x_i)^2 (2.38)

Subtracting Eq. 2.37 from Eq. 2.38 gives

(x_t − x_{i+1}) f′(x_i) + (f″(ζ)/2!)(x_t − x_i)^2 = 0 (2.39)
Here, x_t − x_i = E_{t,i} is the error with respect to the true value at the ith iteration.
By the same token, x_t − x_{i+1} = E_{t,i+1} is the error at the (i+1)th iteration. Let us
rewrite the above equation in terms of E_{t,i} and E_{t,i+1}:

E_{t,i+1} f′(x_i) + (f″(ζ)/2!) E_{t,i}^2 = 0 (2.40)

E_{t,i+1} ≈ −(f″(ζ)/(2 f′(x_i))) E_{t,i}^2 (2.41)
Anyway, we are expanding f(x) very close to the solution. Therefore, we say that,
around the true value, f′(x_t) and f″(x_t) are both fixed. Therefore,

E_{t,i+1} ∝ E_{t,i}^2 (2.45)

So the error in the (i+1)th step is proportional to the square of the error in
the ith step, because f′(x_t) and f″(x_t) are constants when we look at values close
to the true value. We are saying that x_i, x_t, and x_{i+1} are all very close. Hence,
if E_{t,i} is 0.1, E_{t,i+1} will be 0.01, then 0.0001 in the next step, and so on. Therefore, the
Newton–Raphson method exhibits quadratic convergence. Readers who are interested
can perform a similar exercise for successive substitution and see that it has only
linear convergence.
Let us revisit Example 2.2 and try solving it using the Newton–Raphson method.
Solution
We are looking at a steady state: energy is input to the conductor, and the conductor
is losing heat by both natural convection and surface radiation. The ambient for
radiation is the same as the ambient for convection. We had a small discussion about
this earlier, where we said that the ambient for convection need not be the same as
that for radiation, though normally both are the same.
The governing equations for the steady-state condition of the conductor, as already
worked out in Example 2.2, are

q = h(T_w − T_∞) + εσ(T_w^4 − T_∞^4) (2.46)

f(T_w) = −900 + 9(T_w − 300) + (0.6)(5.67 × 10^−8)(T_w^4 − 300^4) (2.47)

f′(T_w) = 9 + 1.36 × 10^−7 T_w^3 (2.48)
Algorithm:

T_{i+1} = T_i − f(T_i)/f′(T_i) (2.49)
The solution is presented in Table 2.8. We see that f′(T_i) is extremely stable
here. It took 16 iterations to get the solution using the method of successive
substitution, as seen in Table 2.2; here, it takes just 4 steps. So this has to be quadratic
convergence, while successive substitution had linear convergence. Hence, the
Newton–Raphson method is really a powerful one. If the problem is well posed, it
will not give us any trouble.
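A minimal MATLAB sketch of this iteration is given below. The initial guess of 300 K is an assumption, as the starting value is not stated in this excerpt; any reasonable guess leads to the same root.

f  = @(T) -900 + 9*(T - 300) + 0.6*5.67e-8*(T^4 - 300^4);   % Eq. 2.47
fp = @(T) 9 + 1.36e-7*T^3;                                  % Eq. 2.48
T  = 300;                        % initial guess, K (assumed)
for i = 1:10
    Tnew = T - f(T)/fp(T);       % Newton-Raphson step, Eq. 2.49
    fprintf('%2d  T = %.2f K\n', i, Tnew);
    if abs(Tnew - T) < 1e-3
        break
    end
    T = Tnew;
end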
How do we extend the Newton–Raphson method for multiple unknowns? The prob-
lem of multiple unknowns has great practical relevance, as invariably any thermal
system will have more than two components.
Let us consider a three-variable problem, where the variables are x1 , x2 , and x3 ,
something like pressure P, temperature T, and density ρ. Three independent equations
connecting x1 , x2 , x3 are required to close the problem mathematically.
f 1 (x1 , x2 , x3 ) = 0 (2.50)
f 2 (x1 , x2 , x3 ) = 0 (2.51)
f 3 (x1 , x2 , x3 ) = 0 (2.52)
f_1(x_{1,i+1}, x_{2,i+1}, x_{3,i+1}) = f_1(x_{1,i}, x_{2,i}, x_{3,i}) + (∂f_1/∂x_1)(x_{1,i+1} − x_{1,i}) + (∂f_1/∂x_2)(x_{2,i+1} − x_{2,i}) + (∂f_1/∂x_3)(x_{3,i+1} − x_{3,i}) + O(h^2) (2.53)

where the partial derivatives are evaluated at (x_{1,i}, x_{2,i}, x_{3,i}).
We can obtain similar looking equations for f 2 and f 3 . The goal is to make the
left-hand side equal to 0, because we are seeking the roots to the equations f 1 , f 2 ,
and f 3 . Therefore, we force f 1 = 0, f 2 = 0, and f 3 = 0. If we do that, we get
⎡ ∂f_1/∂x_1  ∂f_1/∂x_2  ∂f_1/∂x_3 ⎤ ⎡ x_{1,i+1} − x_{1,i} ⎤   ⎡ −f_1 ⎤
⎢ ∂f_2/∂x_1  ∂f_2/∂x_2  ∂f_2/∂x_3 ⎥ ⎢ x_{2,i+1} − x_{2,i} ⎥ = ⎢ −f_2 ⎥
⎣ ∂f_3/∂x_1  ∂f_3/∂x_2  ∂f_3/∂x_3 ⎦ ⎣ x_{3,i+1} − x_{3,i} ⎦   ⎣ −f_3 ⎦
The algorithm is written as shown here. The matrix on the left-hand side is what
is called the Jacobian matrix, which is a sensitivity matrix. The partial derivatives
with respect to the various variables are referred to as the sensitivities or, collectively,
the Jacobian. Unfortunately for us, the Jacobian matrix is not fixed, because its
elements keep changing with the iterations. There are certain derivatives, though,
which are fixed. For example, in the truck problem, the load characteristic is

T = 0.7ω + 185

and so

f_2 = T − 0.7ω − 185

Its derivative ∂f_2/∂T will be 1 and will remain 1 throughout. Similarly, ∂f_2/∂ω
will remain −0.7. In a typical problem, the solution will proceed as follows.
• We take initial values for x1 , x2 , x3 .
• We substitute in the respective equations for f 1 , f 2 , f 3 and calculate the values of
f 1 , f 2 , f 3 at the initial starting point. So the forcing vector or the column vector
on the right-hand side is known.
• Once x1 , x2 , x3 are known, the partial derivatives at this point can be evaluated.
Now all the elements of the Jacobian or the sensitivity matrix are known.
• This system of equations can be solved to obtain the corrections x_{1,i+1} − x_{1,i},
x_{2,i+1} − x_{2,i}, and x_{3,i+1} − x_{3,i}.
• x_1, x_2, and x_3 can now be updated.
The above algorithm is easily programmable. But the moment we have more than 10
or 12 variables, there are some issues with handling the matrix. Matrix inversion will
not work very efficiently for more than a certain number of variables. However, the
procedure described above can be extended to any number of variables. In our fan
and duct problem or the truck problem, we will have to solve only a 2 × 2 matrix.
Compared to successive substitution, the Newton–Raphson method is extremely
fast. The Jacobian matrix also gives us the sensitivity of the system to changes in the
variables as it contains information about the partial derivatives of the functions. Let
us now revisit the truck problem (Example 2.4).
Solution
Let us rewrite the equations in terms of f_1 and f_2:

f_1 = T − 18 − 11ω + 0.12ω^2 (engine characteristic)
f_2 = T − 0.7ω − 185 (load characteristic)

For the first iteration, let T = 200 N m and ω = 60 rps. The error is given by

error = ((T_{i+1} − T_i)/T_i)^2 + ((ω_{i+1} − ω_i)/ω_i)^2 (2.60)
After the first iteration, the corrections are still large (ΔT = 30.24 and Δω = 4.63;
the sum of their squares is about 936!). If we look at this, it looks
"hopeless". But Newton–Raphson is quadratically convergent and will converge
quickly. Table 2.9 shows the iterations involved and the corresponding error after
each iteration.
Table 2.9 Newton–Raphson method for two variables (the truck problem)

S. No.   T        ω       f_1       f_2      ∂f_1/∂T  ∂f_1/∂ω  ∂f_2/∂T  ∂f_2/∂ω
1        200      60      −46       −27      1        3.4      1        −0.7
2        230.24   64.63   2.55      −0.001   1        4.5      1        −0.7
3        229.9    64.14   0.0328    0.002    1        4.39     1        −0.7
4        229.89   64.13   3.9e−4    2e−4     1        4.39     1        −0.7

S. No.   T        ω       ΔT        Δω        Error
1        200      60      30.24     4.63      0.0288
2        230.24   64.63   −0.34     −0.49     6.07 × 10^−5
3        229.9    64.14   −0.006    −0.006    8.37 × 10^−9
4        229.89   64.13   −2.3e−4   −3.8e−5   –
By the end of the second iteration, this has reduced to about 0.37! We can see the
quadratic convergence if we plot the error as a curve; we will see a nice curve that
falls off rapidly.
If we are solving a transient problem, the kind of error we have considered may
not be sufficient. If we use an extremely small time step, we are simply not allowing
the variables to change, and then saying that the stopping criterion is 10^−6 or 10^−8 is
meaningless. When we have a microsecond or a nanosecond as our time step, we just
do not allow the variables to change; for this kind of time step, we should allow the
solution to run for a million time steps. We perform additional tests like a grid
independence study, a mass balance check, and an energy balance check to confirm
convergence!
MATLAB code for Example 2.8

clear;
clc;

% (Reconstructed preamble: these lines are omitted in the printed excerpt.)
% Symbolic variables and the two residual functions, as in Table 2.9:
% f1 -- engine characteristic, f2 -- load characteristic
syms T w
f1 = T - 18 - 11*w + 0.12*w^2;
f2 = T - 0.7*w - 185;

% Partial derivatives: the elements of the Jacobian
f1_T = diff(f1,T); f1_w = diff(f1,w);
f2_T = diff(f2,T); f2_w = diff(f2,w);

T_old = 200;                  % initial guess for torque, N m
w_old = 60;                   % initial guess for speed, rps
forcing_vector = zeros(2,1);

J = zeros(2,2);
count_max = 4;                % Maximum no. of iterations needed
errTol = 10^-20;              % Error tolerance
count = 0;
label = 0;

while label == 0

    count = count + 1;
    % calculating value of function f1 for T, w
    f1_value = single(subs(f1, {T,w}, {T_old, w_old}));
    % calculating value of function f2 for T, w
    f2_value = single(subs(f2, {T,w}, {T_old, w_old}));

    forcing_vector(1,1) = -f1_value;
    forcing_vector(2,1) = -f2_value;
    % calculating 1st element value of Jacobian for T, w
    J(1,1) = single(subs(f1_T, {T,w}, {T_old, w_old}));
    % calculating 2nd element value of Jacobian for T, w
    J(1,2) = single(subs(f1_w, {T,w}, {T_old, w_old}));
    % calculating 3rd element value of Jacobian for T, w
    J(2,1) = single(subs(f2_T, {T,w}, {T_old, w_old}));
    % calculating 4th element value of Jacobian for T, w
    J(2,2) = single(subs(f2_w, {T,w}, {T_old, w_old}));

    % (Reconstructed) solve J * delta = -f for the corrections and update
    delta = J \ forcing_vector;
    T_new = T_old + delta(1);
    w_new = w_old + delta(2);

    % finding residue
    err = ((T_new - T_old)/T_old)^2 + ((w_new - w_old)/w_old)^2;

    % (Reconstructed) stopping test
    if err <= errTol || count >= count_max
        label = 1;
    end
    T_old = T_new;
    w_old = w_new;
end

fprintf('T = %.2f N m, w = %.2f rps after %d iterations\n', T_old, w_old, count);
We can write these as resistances in series and parallel and can combine them.
This is one possible way of approaching the above conduction problem.
From the foregoing examples, it is clear that there are several situations where a
system's performance can be mathematically represented as a system of linear
equations. So we need to know how to solve them. Solving a system of equations
having just 2 variables is trivial. Let us try and solve one such system.
2x + 3y = 13 (2.61)
3x + 2y = 12 (2.62)
Solution
How many ways are there of solving this? We can use the method of elimination
by substitution, where in one equation we substitute for one variable from the other
equation and solve for it. The second method that we use is matrices, and a third
possible solution could be graphical.
Let us look at matrix inversion, as the other two are elementary.
(a) Consider the following system of equations:

2x + 3y = 12 (2.64)
4x + 6y = 28 (2.65)

They are parallel lines and never meet. There is no solution, as D = |2 3; 4 6| = 0,
and this is a singular system.
(b) Consider the following system of equations:

2x + 3y = 12 (2.66)
4x + 6y = 24 (2.67)

These have an infinite number of solutions. This is also a singular system, where
D = |2 3; 4 6| = 0.
(c) Consider the following system of equations:
2x + 3y = 12 (2.68)
1.9x + 3y = 11.8 (2.69)
The determinant is very close to zero and the two lines are so close to each
other that we cannot find out exactly where they meet. This is called an
ill-conditioned system. Here D = |2 3; 1.9 3| = 0.3, which is small, but nonzero. If
we have several variables and the system is ill conditioned, the roundoff errors
will eventually propagate and prevent us from obtaining the true solution.
From Fig. 2.20, we see that the two lines are almost indistinguishable, and hence
solving this problem graphically is not possible, though we can get the solution
algebraically. Very small changes in the coefficients can lead to large changes in
the result, and the roundoff errors become very critical, especially for large systems.
Now we consider several variables in the system of equations. The general
representation is a set of n linear equations in n unknowns, [A][x] = [b].
Of course, we have to reiterate that when the number of variables exceeds a certain
number, it becomes very difficult to invert the matrix.
Gauss–Seidel Method
The Gauss–Seidel method is an iterative method, while matrix inversion is a one-
shot procedure. By one shot, we mean we invert the matrix and automatically
get the values of x_1 to x_n. In the Gauss–Seidel method, we have to start from some
initial values, proceed, and stop the iterations only when the solution converges. One
may start with 0 or 1; there should be something nonzero to start driving the system.
Then we should see how it converges.
The algorithm goes like this: in every sweep, each equation is solved for its
diagonal variable using the latest available values of the others, i.e.,

x_i = (b_i − Σ_{j≠i} a_{ij} x_j)/a_{ii}, i = 1, 2, . . . , n

Needless to say, this method will fail when a_{ii} = 0. When we are given a
system of equations, it is important for us to rearrange them such that, in each row,
the diagonal coefficient |a_{ii}| is the largest. For example, 2x + 3y + 6z = 14 should
be used to solve for z, as the expression on the right will be divided by 6, which
ensures that the errors do not grow.
• The first step in the Gauss–Seidel method is to rearrange the equations so that
diagonal dominance is satisfied and the algebraic manipulations required to do
this are permitted.
• We have to start with guess values for all the variables.
• We substitute the guess values for all the variables and get the value of x1 first.
• Now when we go to x_2, we will use the initial guess values for all variables except
x_1, for which the latest computed value will be substituted.
• When we go to x_3, the guess values will be used for all variables except x_1 and x_2,
for which the updated values will be used. There is thus a dynamic updating
of the variables in the Gauss–Seidel method.
• It is also perfectly OK if we use the guess values in all the equations of the algorithm
and get the values for x1 to xn without using the dynamically updated values in
any of these. This is called the Jacobi iteration and needless to say, it will be slow.
So the key point in Gauss–Seidel is dynamic updating.
Stopping criterion:
R^2 = Σ_{i=1}^{N} (x_{i,k+1} − x_{i,k})^2 (2.78)

R^2 < ε, where ε is a prescribed small tolerance.
Let us say that at the end of the fifth iteration we have a value of x, and we also
have the value of x at the end of the fourth iteration. The relaxed update blends the
two:

x_new = ω x_computed + (1 − ω) x_old

It is possible to assign ω = 1, such that we care only about the new value of x and
not the old value. This may be because we have much faith in the new iterate! We
can go ahead with this! Such a scheme is called a fully relaxed numerical scheme.
Suppose ω = 0.5; then we are giving 50% weight to the new iterate and 50%
weight to the old iterate, so we are essentially under-relaxing. Why would one want
to do that? Sometimes there are a lot of oscillations in the variables and we want to
dampen these; the system may be going haywire. Therefore, we deliberately slow
down the system so that convergence is guaranteed.
It is also possible for us to have ω > 1, such as ω = 2. Yes! In this case, we
are giving too much weightage to the present iterate. It is called over-relaxation
or successive over-relaxation and works well for pure diffusion problems (as for
example, the Laplace or Poisson equation in heat conduction).
Q 1 C1 + Q 2 C2 = Q 3 C3 (2.81)
Likewise, if we have multiple reactors and the output of one or two reactors goes
to the third, the third goes to the fourth, the fifth, and so on, we can set up the
system of equations and solve for the concentrations in the various reactors. Apart
from getting the concentrations, because we have declared that under steady-state
conditions Q_1 + Q_2 = Q_3, this C_3 which is coming out will also be the concentration
of the chemicals within the reactor itself. This is additional information we get,
consequent upon the fact that the stirred reactor is in a steady state and is well mixed.
− 6x + 4y + z = 4 (2.83)
2x − 6y + 3z = −2 (2.84)
4x + y − 9z = −16 (2.85)
x, y, and z are in appropriate units. Solve the system of equations using the
Gauss–Seidel method to determine x, y, and z. Rearranging the equations is
allowed, if required. Do you expect convergence for this problem? Initial guess,
x = y = z = 1.
Solution
We write down the algorithm first.
x = (4 − 4y − z)/(−6) (2.86)

y = (−2 − 2x − 3z)/(−6) (2.87)

z = (−16 − 4x − y)/(−9) (2.88)
Does it satisfy diagonal dominance? Yes, because diagonal dominance does not
depend on the signs, but only on the moduli of the coefficients. So convergence is
guaranteed. We can use ω = 1, i.e., a fully relaxed scheme. The Gauss–Seidel
solution is presented in Table 2.10.
The final answer is x = 0.98, y = 1.88, and z = 2.42.
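A minimal MATLAB sketch of the Gauss–Seidel loop for this example is given below. The relaxation factor omega is carried as a parameter, with omega = 1 corresponding to the fully relaxed scheme used above.

A = [-6 4 1; 2 -6 3; 4 1 -9];    % coefficients of Eqs. 2.83-2.85
b = [4; -2; -16];
x = [1; 1; 1];                   % initial guess
omega = 1;                       % relaxation factor (1 = fully relaxed)
for k = 1:25
    xOld = x;
    for i = 1:3
        % latest values are used as soon as they become available
        sigma = A(i,:)*x - A(i,i)*x(i);
        xGS   = (b(i) - sigma)/A(i,i);
        x(i)  = omega*xGS + (1 - omega)*x(i);
    end
    R2 = sum((x - xOld).^2);     % stopping criterion, Eq. 2.78
    if R2 < 1e-8
        break
    end
end
fprintf('x = %.2f, y = %.2f, z = %.2f after %d iterations\n', x, k);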
We can employ ω = 1.8 and solve the problem with a view to accelerating the
convergence (left as an exercise to the reader).
We see that the Gauss–Seidel method is a lot simpler than the Newton–Raphson
method for 2 variables. But if the system of equations is not diagonally dominant, it
can be very painful or can proceed very slowly.
What would the physical representation of the reactors in Example 2.11 look
like?
We now turn around and see how one can get these equations. Let us look at the
steady-state analysis of a series of reactors. Consider 3 reactors, all of which are well
stirred, and call them reactors 1, 2, and 3, as shown in Fig. 2.22. The concentrations
of the species in the reactors are x, y, and z, respectively, in, let us say, mg/m^3.
These reactors are interconnected by pipes, and within each reactor the concentration
is uniform.
In the figure, if we look at the quantity 2x between reactors 1 and 2, 2 denotes
the flow rate in m3 /s while x is the concentration in mg/m3 . So the concentration
multiplied by the flow rate gives us the mass flow rate. Now if we do the steady-state
mass balance for the 3 reactors, will we get the 3 equations stated in Example 2.11?
Let us start with reactor 1. Streams of 2x and 4x are going out of it, which makes
it −6x. A stream of 4y is coming in, hence +4y; since z is also coming in, it becomes
+z. Since an amount of 4 is again going out, the mass balance equation for reactor 1
can be written as

−6x + 4y + z − 4 = 0 (2.89)
Please check the first equation we had in Example 2.11. Alternatively, we can
state the problem as "The steady-state analysis of a series of reactors is shown here,
whose concentrations are x, y, and z. Set up the governing equations for this problem
and, using the Gauss–Seidel method, determine x, y, and z". We can verify
that the other 2 equations are obtained in the same way.
2x − 6y + 3z = −2 (2.90)
4x + y − 9z = −16 (2.91)
The beauty of this example is that mass balance is satisfied and the system is
also diagonally dominant. Now one can understand where the Gauss–Seidel
method will be most useful or, more directly, where we can expect a linear system
of equations in engineering: in chemical reactor engineering, fluidized beds,
fractionating columns, and petroleum engineering.
Recall an earlier example where we designed the pump and piping system for an
apartment complex, with several branches. Suppose we are designing the distribution
system for the city itself; there will be places where the water is pumped again and so
on, and we may end up with a situation similar to Example 2.11.
Problems
2.1 An oil cooler is designed to cool 2 kg/s of hot oil from 80 ◦ C, by using 3 kg/s
of water entering at 30 ◦ C (Tc,i ) in a counter flow heat exchanger. The overall
heat transfer coefficient of the exchanger is 2000 W/m2 K, and the surface area
available for heat transfer is 3 m2 .
(a) Write down the energy balance equations for the above exchanger.
(b) Using the method of successive substitution with an initial guess of the hot
oil outlet temperature, T_h,o = 60 °C, determine the outlet temperatures.
Perform at least 8 iterations.
2.2 The operating point of a centrifugal pump is to be determined. The pump per-
formance curve and the system load characteristics are given as follows.
where P is the static pressure rise in Pa, and Q is the discharge in m3 /s.
• Using the successive substitution method for 2 unknowns, determine the oper-
ating point of the pump.
• Decide on your own stopping criterion and start with an initial guess value of
Q = 0.5 m3 /s.
• At the operating point, if the pump efficiency is known to be 85%, what is the
electrical power input required for the pump?
• For the same electrical power input, if over time, the pump efficiency drops
to 82%, what will be the operating point for the pump?
• Can you quantify the sensitivity of the discharge with respect to the pump
efficiency?
2.3 The demand for an engineering commodity in thousands of units follows a curve:
Q = 1500(0.97)^P
where Q is the demand (in thousands of units) and P is the price in Rupees. This
is frequently referred to as the demand curve in micro-economics. The supply
of the quantity in the market (again in thousands of units) varies with price as
P = 10 + 1.1368 × 10^−4 Q^2
Determine the equilibrium price and the equilibrium quantity. If the supply
curve now changes to
P = 7 + 9.25 × 10^−5 Q^2
what will be the new equilibrium price and the new equilibrium quantity?
2.4 Solve the problem of determining the operating point of the centrifugal pump
(problem # 2.2) by using the Newton–Raphson method for multiple unknowns
with the same initial guess and stopping criterion.
2.5 Steam at the rate of 0.12 kg/s, at a temperature of Ts bled from a turbine, is used
to heat feed water in a closed feed water heater. Feed water at 80 ◦ C (Ti ) flowing
at the rate of 3 kg/s enters the heater. The U value for the feed water heater is
1500 W/m2 K. The area of the heat exchanger (heater) is 9.2 m2 .
The latent heat of vaporization of steam is given by h g = (2530 − 3.79Ts ) kJ/kg
where Ts is the temperature at which steam condenses in ◦ C (which is also the
same as the temperature at which steam is bled assuming no heat losses along
the way).
(a) Set up the energy balance equations for the steam side and the water side.
(b) Write down the expression for the heat duty of the feed water heater.
(c) Identify the relationship between (a) and (b).
(d) Using information from (a), (b), and (c) and the method of successive
substitution, determine the outlet temperature of the feed water (T_0) and the
condensing temperature of the steam (T_s).
(e) Start with an initial guess of T0 = 130 ◦ C and perform at least 4 iterations.
2.6 The mass balance of three species (x1 , x2 and x3 all in kg/s) in a series of inter-
connected chemical reactors is given by the following equations:
Using the Gauss–Seidel method, with initial guess values of 1.0 for all the three
variables, determine the values of x1 , x2 , and x3 . Perform at least 7 iterations and
report the sum of the squares of the residues of the three variables at the end of
every iteration.
2.7 Two-dimensional, steady-state conduction in a square slab with constant ther-
mal conductivity, k = 50 W/m K, and a uniform internal heat generation of
qv = 1 × 106 W/m3 is to be numerically simulated. The details along with the
boundary conditions are given in Fig. 2.23. For simplicity as well for demon-
strative purposes, the number of grid points is intentionally kept small for this
problem.
(a) Identify the equation that governs the temperature distribution T(x,y) for the
given problem.
(b) Using the Gauss–Seidel method, estimate T1 , T2 , T3 , and T4 . Start with an
initial guess of 50 ◦ C for all the four temperatures. Do at least 7 iterations.
(c) What is the approximate center temperature? What would have been your
crude guess of the center temperature? Are these two in agreement?
Chapter 3
Curve Fitting
3.1 Introduction
Next, we will move on to an equally interesting topic, namely, curve fitting. Why
study curve fitting at all? For example, if the properties of a function are available
only at a few points, but we want the values of the function at some other points,
we can use interpolation, if the “unknown points” lie within the interval, where
we have information from other points. Sometimes extrapolation is also required.
For example, we want to forecast maybe the demand of a product or the progress
of a cyclone. Whether the forecasting is in economics or in weather science, we need
to establish the relationship between the independent variable(s) and the dependent
variable to justify our extrapolation. Suppose our data is collected over
40 years: we first develop a model using the data of, say, 35 years and then test it
using the data for the 36th year, the 37th year, and so on. Then we can confidently say
that we have built a model considering the data from years 0-35, and that in the period
of years 36-40 it works reasonably well, so there is no reason why it should misbehave
in the 41st year. This is an extremely short introduction to, say, a "101 course on
Introduction to Forecasting”.
In thermodynamics, oftentimes, thermodynamic properties are measured at only
specific pressures and temperatures. But we want enthalpy and entropy at other values
of pressures and temperatures for our calculations, and so we need a method by which
we can obtain these properties.
This is only one part of the story. Recall the truck problem of Chap. 2. We wanted
to get the operating point for the truck climbing uphill. Here, we started off with
the torque speed characteristic of the engine and the load. These curves have to be
generated from limited and finite data. We want functional forms of the characteristics
because these are more convenient for us to work with. Therefore, the first step is to
look at these points, draw a curve, and get the best function. So when we do curve
fitting, some optimization is already inherent in it because we are only looking at the
“best fit”.
What is the difference between the best fit and some other curve? For example,
if we have some 500 data points, it will be absolutely meaningless for us to have a
curve that passes through all the 500 points, because we know that error is inherent
in this data. When an error is inherent in the data, we want to pass a line which gives
the minimum deviation from all the points. So we get a new concept here that curve
fitting does not mean that the curve has to pass through all the known points. But
there can also be cases when the curve passes through all the points.
So in the light of the above, when will one want an exact fit and when will
one want an approximate fit? The answer is: when the number of parameters
and the number of measurements are small and we have absolute faith and
confidence in our measurements, we can go in for an exact fit. But when the
number of parameters and measurements is larger and we are using a polynomial,
the order of the polynomial keeps increasing. The basic problem with higher
order polynomials is that they tend to get very oscillatory, and if we want to work with
higher order information like derivatives, things get very messy. Therefore, if
we are talking about high accuracy and a limited amount of data, it is possible for us
to do an exact fit. But if we are dealing with a large number of data points that are
error prone, it is fine to go for the best fit.
In the light of the above, curve fitting can be of two types
(i) best fit and
(ii) exact fit.
A bird’s eye view of the general curve fitting problem is shown in Fig. 3.1.
(a) Exact fit: Example, enthalpy h = f(T,P). When we want to write a program
to simulate a thermodynamic system,1 it is very cumbersome to use tables of
temperature and pressure, and it is easier to use functions. For property data like
the above, or calibration data like thermocouple emf versus temperature, exact
fits are possible and desired.
1 System in this book refers to a collection of components. It is different from the concept of system used in thermodynamics.
(b) Best fit: Any regression in engineering problems like a Nusselt number corre-
lation. For example, consider flow over a heated cylinder. Under a steady state,
we measure the total amount of heat which is dissipated in order to maintain
the cylinder temperature a constant. We increase the velocity for every Reynolds
number, and we determine the heat transfer rate. Using Newton’s law of cooling,
the heat transfer rate is converted to a heat transfer coefficient. From this, we
get the dimensionless heat transfer coefficient called the Nusselt number. The
Nusselt number goes with the correlating variables as follows:
Nu = a Re^b Pr^c (3.1)
It is then possible for us to get a, b, and c from experiments and curve fitting.
The basis for all the above ideas comes from calculations or experiments. Regardless
of what we do, we get values of the dependent variables only at a discrete number
of points. Our idea of doing calculations or experiments is not to just get the values
at discrete points. As a scientist, we know that if the experiment is done carefully,
we will get 5 values of the Nusselt number for 5 values of Reynolds number. We are
not going to stop there. We would rather want to have a predictive relationship like
the one given in Eq. 3.1, which can be used for first predicting Nu at various values
of Re and Pr for which we did not do experiments and further, possibly simulate the
whole system of which the heated cylinder may be a part.
So with the data we have, before we can do system simulation and optimization,
there is an intermediate step involved which requires that all the data be converted
into equation forms, which can be played around with.
3.2 Exact Fit and Its Types
Consider a second-order polynomial that passes exactly through three data points:

y = a_0 + a_1 x + a_2 x^2 (3.2)

y_0 = f(x_0) (3.3)
y_1 = f(x_1) (3.4)
y_2 = f(x_2) (3.5)
Example 3.1 The following values of the natural logarithm are known:

ln(1) = 0
ln(3) = 1.099
ln(7) = 1.946

Fit a second-order polynomial through these points and use it to estimate the
logarithm at an intermediate point. Report the error by comparing your answer
with the value obtained from your calculator.
Solution

When x = 1: 0 = a_0 + a_1 + a_2 (3.6)
When x = 3: 1.099 = a_0 + 3a_1 + 9a_2 (3.7)
When x = 7: 1.946 = a_0 + 7a_1 + 49a_2 (3.8)

Solving these three equations simultaneously,

a_0 = −0.719
a_1 = 0.775
a_2 = −0.056
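A minimal MATLAB sketch of this exact fit is given below. Evaluating the fit at x = 5 is an arbitrary choice, made here only to illustrate the size of the interpolation error at an intermediate point.

X = [1; 3; 7];
Y = [0; 1.099; 1.946];           % ln(x) at the three points
A = [ones(3,1) X X.^2];          % rows of the system: 1, x, x^2
a = A \ Y;                       % coefficients a0, a1, a2
fprintf('a0 = %.3f, a1 = %.3f, a2 = %.3f\n', a);
x = 5;                           % an intermediate point (assumed for illustration)
yFit = a(1) + a(2)*x + a(3)*x^2;
fprintf('fit: %.3f, ln(5): %.3f, error: %.1f%%\n', yFit, log(x), 100*abs(yFit - log(x))/log(x));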
Consider three points x0 , x1 , and x2 , and the corresponding y values are y0 , y1 , and
y2 , respectively. A depiction of this is given in Fig. 3.2.
We can use the polynomial interpolation we just saw. However, the Lagrange
interpolating polynomial is much more potent as it can easily be extended to any
number of points. It is easy for us to take the first derivative, the second derivative, and
so on with Lagrange interpolation. The post-processing work of CFD software, after
we get the velocities and temperatures, generally uses the Lagrange interpolating
polynomial to obtain the gradients of velocity and temperature. Once we get the
gradient of temperature, we can directly get the Nusselt number and then correlate
for the Nusselt number if we have data at a few values of velocity (please recall
Eq. 3.1).
In Fig. 3.2, x_0, x_1, and x_2 need not be equally spaced. For the above situation, the
Lagrange interpolating polynomial is given by

y = ((x − x_1)(x − x_2))/((x_0 − x_1)(x_0 − x_2)) y_0 + ((x − x_0)(x − x_2))/((x_1 − x_0)(x_1 − x_2)) y_1 + ((x − x_0)(x − x_1))/((x_2 − x_0)(x_2 − x_1)) y_2
It is possible for us to get the first, second, and higher order derivatives. The first
derivative will still be a function of x. The second derivative will be a constant for
a second-order polynomial. For a third-order polynomial, we need the function y at
four points and the second derivative will be linear in x.
Fig. 3.3 Problem geometry with boundary conditions for Example 3.2
Table 3.1 Temperature of the heat generating wall at various locations (Example 3.2)
x, m T(x), ◦ C
0 80
0.01 82.1
0.03 84.9
0.05 87.6
The temperature distribution is symmetric about the midplane and the origin
is indicated in Fig. 3.3. Using the second-order Lagrange interpolation formula,
determine the heat transfer coefficient "h" at the surface. If the cross-sectional
area of the wall is 1 m^2, determine the volumetric heat generation rate in the
wall for steady-state conditions.
Solution
What is it we are trying to do here? We are trying to get T as a function of x and then
write q = −k dT/dx; the q that comes from the wall is due to conduction. Therefore,
q_cond = q_conv.
The next logical question is: instead of doing all this, why can we not directly insert
thermocouples into the flow? That would affect the flow, probably even make
a laminar flow turbulent, and hence is not a good idea. We do make such measurements
with instruments like the hot wire anemometer, but the boundary layer itself is so
thin that it is very difficult to make those measurements there. Therefore, we would
much rather do the measurements on the wall, where it is easy to embed
thermocouples, use the conduction-convection coupling and Fourier's law,
and then determine the heat transfer coefficient. This is the so-called "standard
operating procedure" in convection heat transfer!
The problem at hand is typically called an inverse problem. We have a mathe-
matical model for this, which is
$$\frac{d^2T}{dx^2} + \frac{q_v}{k} = 0 \qquad (3.12)$$
We have some measurements of temperature at xo , x1 , and x2 . So if we marry
the mathematical model with these measurements, we are able to get much more
information about the system. What is the information we are getting? We are getting
two more parameters, which are the heat transfer coefficient and the volumetric heat
generation rate. The straight problem is very simple; there is a wall and there is
convection at the boundary. The heat transfer coefficient and the volumetric heat
generation rate are given. What is the surface temperature? Or what is the temperature
1 cm away from the surface? These are easy to answer!
But here, we are making some measurements and inferring some properties of
the system. This is what is known as an inverse problem.
Here, we have 4 readings, but the second-order Lagrange method needs only 3. Which 3 do we choose? We choose the points closest to the wall, and so we employ the first 3 readings.
$$y = \frac{(x-x_1)(x-x_2)}{(x_0-x_1)(x_0-x_2)}\,T_0 + \frac{(x-x_0)(x-x_2)}{(x_1-x_0)(x_1-x_2)}\,T_1 + \frac{(x-x_0)(x-x_1)}{(x_2-x_0)(x_2-x_1)}\,T_2 \qquad (3.13)$$

$$\frac{dT}{dx} = \frac{2x-(x_1+x_2)}{(x_0-x_1)(x_0-x_2)}\,T_0 + \frac{2x-(x_0+x_2)}{(x_1-x_0)(x_1-x_2)}\,T_1 + \frac{2x-(x_0+x_1)}{(x_2-x_0)(x_2-x_1)}\,T_2 \qquad (3.14)$$

Evaluating Eq. 3.14 at the wall (x = 0), with $x_0 = 0$, $x_1 = 0.01$ m, $x_2 = 0.03$ m and the corresponding temperatures from Table 3.1, gives $dT/dx = 233.3\ ^{\circ}\mathrm{C/m}$. The magnitude of the conduction flux leaving the wall is then $q = k\,|dT/dx| = 45 \times 233.3 = 10500\ \mathrm{W/m^2}$. Equating this to the convective flux $h(T_s - T_\infty)$, with $T_s = 80\ ^{\circ}$C and the fluid at $T_\infty = 30\ ^{\circ}$C,

$$h = \frac{10500}{50} = 210\ \mathrm{W/m^2K}$$
Figure 3.4 shows the qualitative variation of the temperature across the wall while
the arrow indicates the direction of heat flow.
The last part of the problem is the determination of the volumetric heat generation rate. There are several ways of obtaining this. The heat transfer from the right side is $hA\Delta T$ and that from the left side is also $hA\Delta T$, making a total of $2hA\Delta T$. Where does all this heat come from? From the heat generated. Therefore, equating the two, we have

$$q_v V = 2hA\Delta T \qquad (3.15)$$

So we determine $q_v$ from this. The above procedure does not take recourse to the governing equation. (This is strictly not correct, as the governing equation is itself a differential form of this energy balance with the constitutive relation (Fourier's law) embedded in it.)

When we employ 3 temperatures, we can consider any 3 of the 4. But the pair (0, 80 °C) is very crucial because we are evaluating the gradient at the solid–fluid interface. So, using the overall energy balance with $V = AL$,

$$q_v A L = 2hA\Delta T$$

$$q_v = \frac{2 \times 210 \times 50}{0.10} = 2.1 \times 10^5\ \mathrm{W/m^3}$$
We now have a heat-generating wall of thickness 10 cm, whose thermal conductivity is 45 W/mK and whose heat generation rate is estimated to be 2.1 × 10⁵ W/m³. It is now possible for us to apply our knowledge of heat transfer, obtain the maximum temperature, and see whether it is the same as what was measured at 0.05 m. The governing equation for the problem under consideration is
$$\frac{d^2T}{dx^2} + \frac{q_v}{k} = 0 \qquad (3.16)$$

The solution to the above equation is

$$\frac{dT}{dx} = -\frac{q_v x}{k} + A \qquad (3.17)$$

$$T = -\frac{q_v x^2}{2k} + Ax + B \qquad (3.18)$$

$T = 80\ ^{\circ}$C at $x = 0$, which gives $B = 80\ ^{\circ}$C. At $x = +0.05$ m, $dT/dx = 0$, hence $A = q_v(0.05)/k$.

Substituting for A and B in Eq. 3.18, at $x = 0.05$ m, we get $T = 85.8\ ^{\circ}$C.
This value is quite close to the data given in Table 3.1.
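A minimal MATLAB sketch of the wall-gradient calculation above (not from the text; k = 45 W/mK and T∞ = 30 °C are the values used in this example):

x0 = 0; x1 = 0.01; x2 = 0.03;        % thermocouple locations, m
T0 = 80; T1 = 82.1; T2 = 84.9;       % measured temperatures, degC
k  = 45;                             % thermal conductivity, W/mK
T_inf = 30;                          % fluid temperature, degC

% Lagrange derivative (Eq. 3.14) evaluated at the wall, x = 0
xw = 0;
dTdx = (2*xw-(x1+x2))/((x0-x1)*(x0-x2))*T0 + ...
       (2*xw-(x0+x2))/((x1-x0)*(x1-x2))*T1 + ...
       (2*xw-(x0+x1))/((x2-x0)*(x2-x1))*T2;

q = k*abs(dTdx);          % magnitude of the conduction flux at the wall, W/m^2
h = q/(T0 - T_inf);       % heat transfer coefficient, W/m^2 K
fprintf('dT/dx = %.1f degC/m, h = %.0f W/m^2K\n', dTdx, h)

Running this reproduces dT/dx ≈ 233.3 °C/m and h = 210 W/m²K.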
This is a very simple presentation of inverse heat transfer. Inverse heat transfer can be quite powerful. For example, at the international terminals of airports during the Swine flu seasons of 2009 and 2010, and again during the COVID-19 pandemic in 2020 and 2021, thermal infrared images of arriving passengers were taken to evaluate whether any of them showed symptoms of flu. Here, we are basically working out an inverse problem. When the imager sees that the radiation from the nose and other parts exceeds a certain value, it is a sign that the passenger could be infected.

Thermal infrared radiation can also be used for cancer detection, breast tumors for example. If we use an infrared camera and obtain the surface temperature image, and there is a tumor inside the breast, the metabolism there will be high. This causes the volumetric heat generation to be high compared to the noncancerous tissue, and it shows up as a signature on the surface temperature image of the breast, called a breast thermogram. From these temperatures, one solves an inverse problem to determine the size and location of the tumor.
The Lagrange interpolating polynomial of order 2 for φ = f(x) (refer to Fig. 3.5) is given by Eq. 3.13 with T replaced by φ. For three equally spaced points, $x_1 = x_0 + \Delta x$ and $x_2 = x_0 + 2\Delta x$, differentiating the polynomial twice and evaluating at $x = x_1$ gives

$$\left.\frac{d^2\phi}{dx^2}\right|_{x=x_1} = \frac{2\phi_0}{2\Delta x^2} - \frac{2\phi_1}{\Delta x^2} + \frac{2\phi_2}{2\Delta x^2} = \frac{\phi_0 - 2\phi_1 + \phi_2}{\Delta x^2} \qquad (3.22)$$
Purely by using the concept of an interpolating polynomial, we are able to get
the second derivative of the variable φ, where this variable could be anything like
temperature, stream function, and potential difference. The above is akin to the
central difference method in finite differences, which is one way of getting a discrete
form for derivatives. The central difference method basically uses the Taylor series
approximation. Consider Fig. 3.6, where φ is the variable of interest. Using the finite
difference method, $d^2\phi/dx^2$ can be written as

$$\left.\frac{d^2\phi}{dx^2}\right|_{x_i} = \frac{\left.\dfrac{d\phi}{dx}\right|_{i+\frac{1}{2}} - \left.\dfrac{d\phi}{dx}\right|_{i-\frac{1}{2}}}{\Delta x} \qquad (3.23)$$

$$\left.\frac{d^2\phi}{dx^2}\right|_{x_i} = \frac{\dfrac{\phi_{i+1}-\phi_i}{\Delta x} - \dfrac{\phi_i-\phi_{i-1}}{\Delta x}}{\Delta x} = \frac{\phi_{i+1} - 2\phi_i + \phi_{i-1}}{\Delta x^2} \qquad (3.24)$$
This is the way the central difference is worked out in the finite difference method.
Now we can see that the results obtained using the finite difference method are the
same as what we obtained using the Lagrange interpolating polynomial.
When we use Lagrange polynomials of order 3 and 4, it leads to results that are
similar to ones obtained with higher order schemes in the finite difference method.
The right-hand side term of Eq. 3.24 can be written as

$$\frac{\phi_E - 2\phi_P + \phi_W}{\Delta x^2}$$
Let us now look at an example to make these ideas clearer. Consider two-dimensional steady-state heat conduction in a slab, as given in Fig. 3.7. The governing equation is

$$\nabla^2 T = 0 \qquad (3.25)$$

or

$$\frac{\partial^2 T}{\partial x^2} + \frac{\partial^2 T}{\partial y^2} = 0 \qquad (3.26)$$

In discrete form, for a general variable φ,

$$\frac{d^2\phi}{dx^2} = \frac{\phi_E - 2\phi_P + \phi_W}{\Delta x^2} \qquad (3.27)$$

Similarly,

$$\frac{d^2\phi}{dy^2} = \frac{\phi_N - 2\phi_P + \phi_S}{\Delta y^2} \qquad (3.28)$$

When Δx = Δy,

$$\frac{\partial^2\phi}{\partial x^2} + \frac{\partial^2\phi}{\partial y^2} = 0$$

becomes

$$\frac{\phi_E - 2\phi_P + \phi_W}{\Delta x^2} + \frac{\phi_N - 2\phi_P + \phi_S}{\Delta x^2} = 0 \qquad (3.29)$$

which reduces to

$$\phi_P = \frac{\phi_E + \phi_W + \phi_N + \phi_S}{4} \qquad (3.30)$$
Please note that Eq. 3.30 can also be obtained by using Lagrange interpolating polynomials for φ(x, y), one variable at a time, and getting the second derivative at node P.

Simplifying, we see that the value of the temperature at a particular node is just the algebraic average of the temperatures at its surrounding nodes. This looks reasonable, provided it also agrees with the commonsensical answer. For the problem under consideration, if we apply this formula at the center point, we get (100 + 0 + 0 + 0)/4 = 25 °C. Regardless of the sophistication of the mathematical technique we use, the center temperature must be 25 °C. This serves as a basic validation of the approximation we have made to the governing equation.
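A minimal MATLAB sketch of this check (not from the text; the 5 × 5 grid size and the iteration count are assumed for illustration), iterating Eq. 3.30 with the top wall at 100 °C and the other three walls at 0 °C:

n = 5;
T = zeros(n, n);
T(1, :) = 100;                 % top boundary at 100 degC, others at 0
for iter = 1:500               % simple Jacobi iteration
    Told = T;
    for i = 2:n-1
        for j = 2:n-1
            % Eq. 3.30: each interior node is the average of its neighbours
            T(i,j) = (Told(i-1,j) + Told(i+1,j) + Told(i,j-1) + Told(i,j+1))/4;
        end
    end
end
disp(T(3,3))                   % centre node; converges to 25

The centre value settles at 25 °C, confirming the commonsense answer above.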
Another route to an exact fit is Newton's divided difference polynomial, which for (n + 1) points takes the form

$$y = a_0 + a_1(x - x_0) + a_2(x - x_0)(x - x_1) + a_3(x - x_0)(x - x_1)(x - x_2) + \cdots + a_n(x - x_0)(x - x_1)\cdots(x - x_{n-1}) \qquad (3.31)$$
If f(x) varies with x as shown in Fig. 3.8, we would like to join the 4 points by a smooth curve. The key is to fit lower order polynomials to subsets of the points and match them to get a "nice" smooth curve. But all the pieces do not follow the same equation: locally, for every 3 points there is a parabola, or for every 4 points there is a cubic. In order to avoid discontinuities at the intermediate points, we have to match the functions as well as the slopes at these points. We will again get a set of simultaneous equations which, when solved, give us the values of the coefficients, and this is how spline fitting is done.
Consider a spline approximation with a quadratic fit in each subinterval, as depicted in Fig. 3.9. We can divide the range into 3 intervals, with equations of the form

$$f_1(x) = a_1x^2 + b_1x + c_1, \quad f_2(x) = a_2x^2 + b_2x + c_2, \quad f_3(x) = a_3x^2 + b_3x + c_3$$

Let us now count the equations. If n is the number of intervals, there are n − 1 intermediate points. Here the total number of intervals is 3, i.e., n = 3, so there are 2 intermediate points. At every intermediate point, the two neighbouring quadratics must both pass through the data point, which gives 2 equations per point and hence a total of 2(n − 1) equations. At the two end points, the first and last quadratics must pass through the data, which gives 2 more equations. So the total number of equations so far is 2n − 2 + 2 = 2n. The number of constants, on the other hand, is 3n.

There should also be continuity of the slope at the intermediate points. Hence, $f'(x)$ should be the same whether we approach an intermediate point $x_i$ from the left or the right. That means

$$2a_i x_i + b_i = 2a_{i+1} x_i + b_{i+1}$$

Like this, we can generate equations at the intermediate points; we have (n − 1) such equations. So now we have 2n + n − 1 = 3n − 1 equations in all. We are still short by one!

The second derivative at either of the two end points may be assumed to be 0. This means that either $2a_1$ or $2a_3$ is 0, depending on the end point chosen, which in turn means that either $a_1$ or $a_3$ is 0. So the last condition is $f''(x) = 0$ at one end.
Normally, we use the cubic spline when there are hundreds of points. When we do this, our eyes will not notice that this juggling has been done somewhere; there is no simpler way of mathematically closing the system. This is what a graphing software typically does when we use an approximating polynomial: so much mathematics goes on in the background! (See Chapra and Canale (2009) for a fuller discussion on spline fits.)
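As an aside, MATLAB's built-in spline function performs exactly this kind of piecewise-cubic closure. A minimal sketch (not from the text; the data values are made up for illustration):

x = [1 2 3 4];
y = [2.0 3.1 2.6 4.0];        % hypothetical data
xq = 1:0.01:4;
yq = spline(x, y, xq);        % cubic spline with matched slopes at the knots
plot(x, y, 'o', xq, yq, '-')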
This brings us to the end of exact fits.
All of the methods discussed in Sect. 3.2 are applicable to exactly determined systems, where the number of points equals the order of the polynomial plus one. For example, if we have 3 points, we fit a second-order polynomial. Oftentimes, however, we have overdetermined systems. For example, if we perform experiments for determining the heat transfer coefficient and obtain results for 10 Reynolds numbers, we know that $Nu = aRe^bPr^c$, and when the Prandtl number Pr is fixed, we have two constants. To determine these, 2 data points are sufficient; but we have 10. So we have to get the best fit. What is our criterion of "best"? Do we want to minimize the difference, or the difference in modulus form, or the maximum deviation from any point, or the square of the error, or a higher power of the error?
Oftentimes, we only get data points which are discrete. (For example, the per-
formance of equipment like a turbomachine.) However, we are interested in getting
a functional relationship between the independent and the dependent variables, so
that we are able to do system simulation and eventually optimization. If the goal is
calibration, we want to have an exact fit, where the curve should pass through all the
points. We discussed various strategies like Newton’s divided difference polynomial,
Lagrange interpolation polynomial, and so on. These are basically exact fits where
we have very few points, whose measurements are very accurate. However, there are
several cases where the measurements are error prone. There are also far too many
points, and it is very unwise on our part to come up with a ninth-degree or a twelfth-
degree polynomial that will generally show a highly oscillatory behavior. These are
overdetermined systems as already mentioned. What do we exactly mean by this?
These are systems that have far too many data points compared to the number
of equations required to regress the particular form of the equation.
For example, if we know that the relationship between enthalpy and temperature
is linear, we can state it as h = a + bT .
So if we have enthalpy at two values of temperature, we can get both a and b. But
suppose we have 25 or 30 values of temperature, each of which has an enthalpy and
all of which also have an error associated with it. Now, this is an overdetermined
system as two pairs of temperature-enthalpy can be used to determine a and b. Among
these values of a and b, which is the most desirable, we do not know. Therefore, we
have to come up with a strategy of how to handle the overdetermined system. We
do not want the curve to pass through all the points. So what form of the curve do
we want? Polynomial, exponential, power law, or any other form? Some of the possible representations are

$$y = a + bx + cx^2 \ \text{(polynomial)}, \qquad y = ae^{bt} \ \text{(exponential)}, \qquad y = ax^b \ \text{(power law)}$$

The polynomial given here is a general depiction; we may also have higher order polynomials. The exponential form $y = ae^{bt}$ is typical when we have an initial value problem, for example, a concentration decreasing with time or a population changing with time. The power law form arises, for example, for the Nusselt number or the skin friction coefficient. Who will tell us the best way to get the values of a, b, and c if it is a polynomial, or of a and b if it is exponential or power law? We have chosen the form; but what strategy should be used, since the system is overdetermined? So we now have to discuss the strategies for the best fit.
Let us take a straight line. y = ax + b. We have the data as shown in Table 3.2.
We want (a, b). We have the table containing the values of y corresponding to
different values of x. We require the values of a and b to be determined from this.
How do we get the table of values? We could perform experiments in the laboratory
or calculations on the computer. What could be this x and y? x could be the Reynolds
number while y could be skin friction coefficient, x could be temperature difference
while y could be heat flux. We are looking at a simple one-variable problem.
Strategy 1:
The first strategy is to minimize the sum of the residuals $R_i$, i.e.,

$$\text{Minimize } S = \sum_{i=1}^{n} R_i = \sum_{i=1}^{n}\left[y_i - (ax_i + b)\right] \qquad (3.39)$$

This does not work: large positive and large negative residuals can cancel each other, so that S can be small even for a very poor fit.
Strategy 2:
Another strategy would be to minimize the modulus of the difference between the data and the fit:

$$\text{Minimize } S = \sum_{i=1}^{n} |R_i| = \sum_{i=1}^{n}\left|y_i - (ax_i + b)\right| \qquad (3.40)$$
Here, we minimize the modulus of the difference between ydata and ymodel , where
a large negative error cannot cancel out a large positive error. It now raises a lot of
hope in us. Let us take a hypothetical example to see whether this strategy works.
There are 4 points as seen from Fig. 3.11. Let us join points 1 and 3 by a line and
points 2 and 4 by another line. Any line between these two lines will try to minimize
S. There could be so many lines which will satisfy this. However, the original goal
was to obtain unique values of a and b which will give the best fit. So this too does
not work!
Strategy 3:
Consider a set of data points 1, 2, 3, 4, and 5. Common sense tells us that a line passing
through points 1, 2, 3, and 4 (or very close to them) is a good way of representing the
data. But point 5 gets neglected in this process. In order to accommodate point 5, let
us choose the dashed line as shown in Fig. 3.12 as the “best” fit. On what basis was
this chosen? The dashed line (of Fig. 3.12) minimizes the maximum deviation from
any point. So it satisfies what is known as the minimax criterion, which comes from decision theory: minimax is based on maximizing the minimum gain and minimizing the maximum loss.
As far as we are concerned, we are trying to minimize the maximum distance from any point. But point 5 is a rank outsider; there is something fundamentally wrong with this value. Statistically, point 5 is called an outlier; it is an "outstanding point"! The minimax criterion unnecessarily gives undue importance to an outlier. Often in meetings too, the most vocal person is heard and given importance, though his/her view may not be the most reasonable! For the problem at hand, common sense suggests that we remove the point (here, point 5) and repeat the simulations/experiments. We cannot suddenly have a new physics in which the heat transfer goes up and down; there is something wrong with that point. Maybe an error in taking the reading, or a steady state not being reached, or a fluctuation in operating conditions like voltage, and so on.
With all these strategies having failed, we go on to the least square regression
(LSR), which is Strategy 4. Here, we try to see if by minimizing the square of the
differences between the data and the fit, we can get a good handle on the problem
(an excellent treatment on regression is available in Chapra and Canale, 2009).
Let y = ax + b. If we have two points, we can get a and b right away. If we have 200 points, there are very many possible combinations of a and b. We now try to fit one global pair (a, b) that best fits the data. Whatever the procedure by which we determine a and b, when we substitute a value of x in ax + b we obtain $y_{fit}$, and $(y_{data} - y_{fit})$ is called the residue. The residue will in general be nonzero. We take its square and try to minimize the sum of these squared residues.
$$S = \sum_{i=1}^{N} R_i^2 = \sum_{i=1}^{N}\left[y_i - (ax_i + b)\right]^2 \qquad (3.41)$$

in which $y_i$ refers to $y_{data}$ and $(ax_i + b)$ refers to $y_{fit}$. The square takes care of the negative and positive errors and helps give us the best values of a and b that fit all the data. This is called the L2 norm or the Euclidean norm. In order to obtain the best a and b, we set the partial derivatives of S with respect to each to zero:

$$\frac{\partial S}{\partial a} = 0 = 2\sum (y_i - ax_i - b)(-x_i) \qquad (3.42)$$

$$\frac{\partial S}{\partial b} = 0 = 2\sum (y_i - ax_i - b)(-1) \qquad (3.43)$$

On rearranging the above two equations, we have

$$-\sum x_i y_i + a\sum x_i^2 + b\sum x_i = 0 \qquad (3.44)$$

$$-\sum y_i + a\sum x_i + nb = 0 \qquad (3.45)$$

Solving the two simultaneously,

$$a = \frac{n\sum x_i y_i - \sum x_i \sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} \qquad (3.47)$$

$$b = \frac{\sum y_i - a\sum x_i}{n} \qquad (3.48)$$
Now we have the unique values of a and b that best represent the data. What these mean, and whether there is a substantial improvement in our understanding by forcing y = ax + b, needs to be discussed.
Let us start with a simple example. Let us say we have fully developed turbulent
flow in a pipe. We send water in at the rate of ṁ. Now, we wrap the pipe with strip
heaters, whose power can be varied. We can measure the temperature at the inlet and
the outlet. The heaters can also be individually controlled so that we can maintain a
constant temperature at the outside. We send cold water into the pipe, whose outside is maintained at one particular temperature or is heated electrically, so that it is essentially a heat exchanger trying to heat the water. A schematic
of this situation is shown in Fig. 3.13. It is possible for us to change the velocity and
when we do that, in order to maintain the temperature of the outer wall a constant,
we can change the power settings of the heater. For every change in velocity, we can
change the heater setting and keep the outer temperature a constant. When we change
the velocity, we change the Reynolds number and the heat transfer rate. Hence, we
are studying how the heat transfer rate changes with fluid velocity. This is essentially
what we do in convective heat transfer. If the pipe were assumed to be at a particular
temperature, then the wall temperature of the pipe is a constant while the water
temperature goes up, as shown in Fig. 3.14. The temperature difference at the inlet is $\Delta T_{in}$, while at the other end it is $\Delta T_{out}$.

Because one profile is a straight line and the other is exponential, we cannot use the arithmetic mean temperature difference. It therefore becomes incumbent upon us to use the log mean temperature difference (LMTD), defined as the temperature difference at one end of the heat exchanger (the pipe in this case) minus the temperature difference at the other end, divided by the logarithm of the ratio of the two:

$$\Delta T_{LMTD} = \frac{\Delta T_{out} - \Delta T_{in}}{\ln(\Delta T_{out}/\Delta T_{in})} \qquad (3.49)$$
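As a quick numerical illustration with hypothetical values (not from the text): if the temperature differences at the two ends are $\Delta T_{in} = 50$ K and $\Delta T_{out} = 20$ K, then

$$\Delta T_{LMTD} = \frac{20 - 50}{\ln(20/50)} \approx 32.7\ \mathrm{K}$$

which lies between the two end differences, but below their arithmetic mean of 35 K.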
The heat duty of this heat exchanger can be written as

$$Q = \dot{m}C_p(T_{out} - T_{in}) = UA\,\Delta T_{LMTD} \qquad (3.50)$$

where ṁ is the mass flow rate of the fluid, $C_p$ is the specific heat capacity of the fluid at constant pressure, U is the overall heat transfer coefficient, and A is the surface area. The outside temperature is fixed, so we do not have to use U; since we know the temperature on the outside, we replace U by h:

$$Q = hA\,\Delta T_{LMTD} \qquad (3.51)$$

$$h = \frac{Q}{A\,\Delta T_{LMTD}} = \frac{Nu\,k_f}{d} = \frac{aRe^bPr^c\,k_f}{d} \qquad (3.52)$$

In the above equations, $k_f$ is the thermal conductivity of the fluid, and Nu is the dimensionless heat transfer coefficient or the Nusselt number (in the above formulation, the conduction resistance across the pipe wall is neglected). Let us work this out numerically.
Table 3.3 Nusselt number for various Reynolds numbers (Example 3.3)

S.No.   Re      Nu
1       2500    24
2       4000    36
3       7000    55
4       12000   84
5       18000   119
Solution
We have straightaway introduced a complication. A little while ago, we elucidated
the procedure to regress a straight line but the problem says there is a power law
variation. What should we do as the first step? We take natural logarithms on both
sides and reduce the power law form to a linear equation.
$$Nu = CRe^m \qquad (3.53)$$

$$\ln(Nu) = \ln(C) + m\,\ln(Re) \qquad (3.54)$$

This is of the linear form

$$Y = aX + b \qquad (3.55)$$

with $Y = \ln(Nu)$ (3.56), $a = m$ (3.57), $X = \ln(Re)$ (3.58), and $b = \ln(C)$ (3.59).
Using Eqs. 3.47 and 3.48 and the values from Table 3.4, we get m = 0.801 and b = ln(C) such that C = 0.0464, i.e., $Nu = 0.0464\,Re^{0.801}$.

We now proceed to determine the goodness of the fit. In order to do this, we first calculate $Y_{fit}$ by substituting the value of the Reynolds number, raising it to the power of 0.801, multiplying it by 0.0464, and taking the natural logarithm. (This Reynolds exponent of 0.8 is very typical of turbulent flows.)

It is instructive to look at the last two columns of Table 3.5, where $(Y - \bar{Y})^2$ and $(Y - Y_{fit})^2$ are tabulated:

$$\sum(Y - \bar{Y})^2 = S_t \quad \text{and} \quad \sum(Y - Y_{fit})^2 = S_r$$
There may be a dilemma as to whether we want to check the goodness of fit in the aX + b equation or directly in the equation for the Nusselt number! For evaluating the goodness of the correlation it does not really matter, though the correct procedure would be to compare the natural logarithms of the Nusselt number.
Now we introduce a new term $r^2$, the coefficient of determination:

$$r^2 = \frac{S_t - S_r}{S_t} = \frac{1.65 - 0.05}{1.65} = 0.97$$

The square root of the coefficient of determination, $r = \sqrt{r^2}$, is known as the correlation coefficient; here $r = \sqrt{0.97} = 0.98$.
We are able to determine the constants a and b, and we are able to calculate some
statistical quantities like r which is close to 1. So we believe that the correlation is
good. We did some heat transfer experiments. We varied the Reynolds number and
got the Nusselt number. We believe that these two are related by a power law form
and hence went ahead and did a least square regression and we got some values. Has
all this really helped us or not? We need some statistical measures to answer this
question. Suppose we had no heat transfer knowledge, and just have the Reynolds
number and the Nusselt number, what we will have possibly done as the first step
is to look at the Nusselt number (y) and get its mean. Then we will tell the outside
world that we do not know any functional relationship but the mean value is likely
to be like this. We get the mean value of ln(Nu) as 3.9944. But we do not stop with the mean of the natural logarithm of the Nusselt number. If we report only the mean, the data are expected to have a variance about it, of the form $(Y - \bar{Y})^2$. The total residual, given by the sum of the squares of the differences between ln(Nu) and its mean, is 1.65, and the average of this is 0.33.
However, if we get a little smarter and, instead of qualifying the data only by the mean, propose a functional correlation aX + b, we can regress a and b and see whether doing so really helps. For the problem under consideration, with the fit we are able to reduce the sum of the residuals from $S_t = 1.65$ to $S_r = 0.05$. So, of the 1.65, 1.60 is explained by the fact that the Nusselt number goes as $aRe^m$; that is, 97% of the variance in the data is explained by this correlation. Therefore, it is a good correlation. In the absence of a correlation, we would report only the mean and the standard deviation. But going deeper, we find some physics behind the data, propose a power law relationship, and are able to explain almost all of the variance with a very high $r^2$.
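A minimal MATLAB sketch of this regression (not from the text), using the data of Table 3.3; it should reproduce values close to m = 0.801 and C = 0.0464, small differences being attributable to round-off in the hand calculation:

Re = [2500; 4000; 7000; 12000; 18000];
Nu = [24; 36; 55; 84; 119];
X = log(Re); Y = log(Nu);     % reduce the power law to a straight line
n = length(X);
m = (n*sum(X.*Y) - sum(X)*sum(Y))/(n*sum(X.^2) - (sum(X))^2);   % Eq. 3.47
C = exp((sum(Y) - m*sum(X))/n);                                 % from Eq. 3.48
fprintf('Nu = %.4f Re^%.3f\n', C, m)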
Sometimes, we may get a correlation coefficient which is negative (with a min-
imum of −1), which may arise in some correlation where, when x increases, y
decreases. Suppose we have a correlation that gives us only 60%, either our exper-
iments are erroneous or there are additional variables that we have not taken into
account. So when we do a regression, it tells us many things. It is just not the experi-
ments or the simulations that will help us understand science! Regression itself can
be a good teacher!
A correlation with a high $r^2$ is not necessarily a good correlation; it is good purely from a statistical perspective. We do not know if there is a
causal relationship between x and y. For example, let us calculate the ratio of the
number of rainy days in India in a year to the total number of days (365). Let us look
at the ratio of the number of one-day international cricket matches won by India to
the total number of matches played in a year. Let us say we look at the data for the
last 5 years. The two may have a beautiful correlation. But we are just trying to plot
two irrelevant things. Such a correlation is known as a “spurious correlation”!
So, first we have to really know whether there exists a physical relationship
between variables under question. This can come when we perform experiments
or when we are looking at the nondimensional form of the equations or by using the
Buckingham Pi theorem.
Apart from the above, we can also draw a parity plot. We plot $Y_{data}$ against $Y_{fit}$; the center line is a 45° line, called the parity line, about which the points should be found. The points must be spread roughly equally on both sides of this line. A typical parity plot is given in Fig. 3.15. If all the points lie exactly on the parity line, it is incredible, and may even arouse suspicion, since any experiment has a natural variation! Approximately 50% of the points should be above the parity line and 50% below it. When the points deviate systematically, it means there is some higher order physics at play. For example, if the error grows as the Nusselt number increases, the correlation suggests additional effects, like the variation of viscosity, that we have not taken into account.
This parity plot is also called a scattergram or scatter plot. On the same parity
plot, we can have red, blue, and green colors to indicate fluids of different Prandtl
numbers. Beyond a certain point, it is just creativity, when it comes to how we present
the results and how we put the plots together!
Suppose we want the height of a student in this class: we measure the height of each student and take the average. We then report the average height and the variance. If the class is large, the heights are likely to follow a Gaussian or normal distribution. Now we can go one step further and attempt a correlation. Suppose we know the date of birth of each person; the height is plausibly related to the date of birth, say y = ax + b. With the date of birth, we may get a much better fit than just taking all the heights and reporting the mean. The goodness of the fit y = ax + b is then a measure of whether, with the date of birth, we are able to predict heights much better than by reporting the mean of the population.
Let us take 4 points (1, 2, 3, and 4), as shown in Fig. 3.16, through which we want to draw a straight line, the LSR fit for example. Assuming that all the measurements are made using the same instrument, we can take the distribution of errors in the measurements to follow a Gaussian with standard deviation σ. If $Y_{fit} = ax + b$, the probability of getting $Y_1$ is

$$P_1 = \frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{[Y_1 - (ax_1 + b)]^2}{2\sigma^2}}$$

This comes from the normal or Gaussian distribution. Similarly, the probability of getting $Y_2$ for the same a and b has the same form. Since the measurements are independent, the joint probability, or likelihood, is the product

$$L = P(Y_1, Y_2, \dots | a, b) = \prod_{i=1}^{N} P_i$$

$$L = \frac{1}{(2\pi\sigma^2)^{N/2}}\,e^{-\frac{\sum_{i=1}^{N}[Y_i - (ax_i + b)]^2}{2\sigma^2}} \qquad (3.63)$$

$$-2\ln(L) = N\ln(2\pi\sigma^2) + \frac{\sum_{i=1}^{N} R_i^2}{\sigma^2} \qquad (3.64)$$
Now we want to determine the values of a and b that maximize the likelihood L. This procedure is known as Maximum Likelihood Estimation. The standard deviation σ is not a variable in the problem; it is known (in fact, for the maximum likelihood procedure to work, we do not even have to know σ; knowing that σ is constant across measurements suffices). Writing P = −2 ln(L), maximizing L is the same as minimizing P, and since the first term of Eq. 3.64 does not depend on a and b, this reduces to minimizing $\sum R_i^2$. We make the first derivatives stationary:

$$\frac{\partial P}{\partial a} = 0 = \frac{\partial P}{\partial b} \qquad (3.65)$$

and get a and b. The resulting equations are exactly the same as those obtained earlier in the derivation of LSR for a straight line (Eqs. 3.41–3.43).
Let us summarize the statistical measures introduced so far.

1. Before the correlation, we have the data $Y_i$ and the mean $\bar{Y}$, and hence $S_t$ is the sum of the squared deviations of each value from the mean: $S_t = \sum_{i=1}^{N}(Y_i - \bar{Y})^2$.
2. $S_r$ is the residue, given by $S_r = \sum_{i=1}^{N}(Y_i - Y_{fit,i})^2$.
3. The coefficient of determination is basically the fraction by which $S_t$ is reduced by proposing a correlation wherein we fit Y as a function of x. The square root of the coefficient of determination is the correlation coefficient.
4. There is one more quantity of interest, the standard error of the estimate, which is something like the root mean square error (RMSE):

$$\text{Std. error} = \sqrt{\frac{\sum_{i=1}^{N}(Y_i - Y_{fit,i})^2}{n - 2}} \qquad (3.66)$$
Consider the classic lumped-capacitance cooling problem:

$$mC_p\frac{dT}{dt} = -hA(T - T_\infty) \qquad (3.67)$$

With

$$\theta = T - T_\infty \qquad (3.68)$$

this becomes

$$mC_p\frac{d\theta}{dt} = -hA\theta \qquad (3.69)$$

whose solution is

$$\theta = \theta_i e^{-t/\tau} \qquad (3.70)$$

where $\theta_i$ is the initial temperature excess and $\tau = mC_p/hA$ is the time constant. The assumptions involved are the following:
1. The heat transfer coefficient is constant and does not depend on the temperature of the body (which is a questionable assumption). The mass and specific heat are constant; these are not questionable assumptions. But is the heat transfer coefficient really a constant? Under natural convection it is not; the heat transfer coefficient is really variable, but we keep it constant here.
2. Although the body is cooling and giving heat to the surroundings, the temperature of the surroundings is assumed not to increase. That is why it is a first-order system. But
we can complicate it by having a second-order system. For example, consider
an infant crying for milk. Let us say that the mother is trying to prepare baby
food. She puts in milk powder and hot water, and mixes them in a glass bottle.
If the milk is too hot, the bottle with milk is immersed in a basin of water.
Unfortunately here, T∞ is the temperature of the surrounding water. This starts
increasing as it gets the heat from the milk in the bottle. We have got to model
this as a second-order system.
For the problem under consideration, we assume that $T_\infty$ is a constant. The goal of this exercise is to estimate the time constant when we have the temperature versus time history. Basically, we can generate different curves for $\tau_1$, $\tau_2$, and so on. There will be one particular value of τ at which $\sum_{i=1}^{N}(\theta_i - \theta_{i,fit})^2$ is minimum. That is the best estimate of the time constant for this system. From τ, if we know m, $C_p$, and A, we have a very simple method of determining the heat transfer coefficient by means of an unsteady heat transfer experiment.
S. No.   t, s (x)   T, °C
1        10         93.3
2        30         82.2
3        60         68.1
4        90         57.9
5        130        49.2
6        180        41.4
7        250        36.3
8        300        32.9
Solution
We first reduce the solution to the standard equation of a straight line:

$$\theta = \theta_i e^{-t/\tau} \qquad (3.71)$$

$$\ln(\theta) = \ln(\theta_i) - t/\tau \qquad (3.72)$$

This is of the form Y = ax + b, with Y = ln(θ), x = t, a = −1/τ, and b = ln(θᵢ).
$$r^2 = \frac{S_t - S_r}{S_t} = \frac{8.1099 - 0.0359}{8.1099} = 0.9956$$

$$r = 0.9978 \text{ or } 99.78\%$$
This is a terrific correlation; the correlation coefficient is very high. So the fit is a very accurate representation of the data: if the data can be represented as $e^{-t/\tau}$, then about 99.6% of the variance of the data ($r^2$) is explained by proposing this curve. The standard error of the estimate is

$$\sqrt{\frac{S_r}{n-2}} = \sqrt{\frac{0.0359}{6}} = 0.0774$$

This standard error of the estimate is acceptable. It means that at any time t, if we use the correlation to obtain the value of ln(θ), there is roughly a 99% chance that ln(θ) will lie within about ±3 × 0.08 = ±0.24 of the predicted value. It gives us the confidence with which we can predict ln(θ) and is an independent measure of the validity of the estimate. When ln(θ) is high, 0.08 is relatively small and all is fine; but as ln(θ) goes down, the error becomes comparable to it. Therefore, when we do the experiment, we must use as much data as possible while the system is hot. When the system is approaching the temperature of the surroundings and ln(θ) is very small, when sufficient time has elapsed and the driving force for heat transfer itself is small, the estimates are more prone to error. Other effects may also come into play in this example. In fact, we can subdivide the data into three sets, early, middle, and final phases, and for each of these phases we may get a different value of τ.
clear;
clc;

% Time data
t = [10; 30; 60; 90; 130; 180; 250; 300];

% Temperature data
T = [93.3; 82.2; 68.1; 57.9; 49.2; 41.4; 36.3; 32.9];

% Ambient temperature
T_inf = 30;

% Calculating theta
theta = T - T_inf;

% Number of data points
n = length(t);

y = log(theta);

% Calculating a and b
a = (n*sum(t.*y) - sum(t)*sum(y))/(n*sum(t.^2) - (sum(t))^2);
b = (sum(y)*sum(t.^2) - sum(t.*y)*sum(t))/(n*sum(t.^2) - (sum(t))^2);

% Linear fit equation
y_fit = a.*t + b;

% Time constant
time = -1/a;

theta0 = exp(b);

% Initial temperature
T0 = theta0 + T_inf;

% Total and residual sums of squares (these lines were missing in the
% extracted listing; the formulas below are the standard definitions)
S_t = sum((y - mean(y)).^2);
S_r = sum((y - y_fit).^2);

% Correlation coefficient
r = sqrt((S_t - S_r)/S_t);
% Standard error
error = sqrt(S_r/(n - 2));

% Print
prt = ['a = ', num2str(a), ...
       ', b = ', num2str(b), ...
       ', Correlation coefficient (r) = ', num2str(r), ...
       ', Standard error = ', num2str(error)];
disp(prt)
a = -0.010261, b = 4.2674
Correlation coefficient (r) = 0.99778
Standard error = 0.077388
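From this output, the physical quantities follow directly: τ = −1/a = 1/0.010261 ≈ 97.5 s and $\theta_i = e^b = e^{4.2674} \approx 71.3\ ^{\circ}$C, so the estimated initial temperature is $T_i = \theta_i + T_\infty \approx 101.3\ ^{\circ}$C.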
There are some other forms that are amenable to linear regression. Let us look at functions of more than one variable, $Y = f(x_1, x_2)$. Suppose we propose

$$Y = C_0 + C_1x_1 + C_2x_2 \qquad (3.73)$$

The least squares procedure extends directly: minimizing $\sum\left[Y_i - (C_0 + C_1x_{1,i} + C_2x_{2,i})\right]^2$ with respect to $C_0$, $C_1$, and $C_2$ yields three simultaneous normal equations.
Consider a straight line fit y = ax + b. To make matters simple, let us take only 4
data points, as shown in Table 3.9.
$$S = \sum_{i=1}^{N}\left[y_i - (ax_i + b)\right]^2 \qquad (3.77)$$

$$\frac{\partial S}{\partial a} = 0 \qquad (3.78)$$

$$\frac{\partial S}{\partial b} = 0 \qquad (3.79)$$

$$a = \frac{n\sum x_iy_i - \sum x_i\sum y_i}{n\sum x_i^2 - \left(\sum x_i\right)^2} = 3.84$$

$$b = \frac{\sum y_i - a\sum x_i}{n} = 3.19$$

$$y = 3.84x + 3.19$$
When we propose $[Z^TZ]\{A\} = [Z^T]\{Y\}$, we have written the linear least squares formulation very concisely. Here, for the straight line fit, Z is the N × 2 matrix whose ith row is $[x_i \ \ 1]$, $\{A\} = \{a \ \ b\}^T$ is the vector of unknowns, and $\{Y\}$ is the vector of data values; the right side is called the forcing vector. Now all the tools available for matrix inversion can be used; there is no need to solve two simultaneous equations as we did before. We will see how, by matrix inversion, we can get the same answers. With the matrix formulation, for the data of Table 3.9, we have

$$Z^TZ = \begin{bmatrix} 14 & 6 \\ 6 & 4 \end{bmatrix}, \qquad Z^TY = \begin{Bmatrix} 72.9 \\ 35.8 \end{Bmatrix}$$

$$\begin{bmatrix} 14 & 6 \\ 6 & 4 \end{bmatrix}\begin{Bmatrix} a \\ b \end{Bmatrix} = \begin{Bmatrix} 72.9 \\ 35.8 \end{Bmatrix}$$

$$\begin{Bmatrix} a \\ b \end{Bmatrix} = \frac{1}{20}\begin{bmatrix} 4 & -6 \\ -6 & 14 \end{bmatrix}\begin{Bmatrix} 72.9 \\ 35.8 \end{Bmatrix} = \frac{1}{20}\begin{Bmatrix} 4 \times 72.9 - 6 \times 35.8 \\ -6 \times 72.9 + 14 \times 35.8 \end{Bmatrix} = \begin{Bmatrix} 3.84 \\ 3.19 \end{Bmatrix}$$
The values of a and b are the same as those obtained without using matrix algebra.
This is a very smart way of doing it. When there are many equations and unknowns,
we can use the power of matrix algebra to solve the system of equations elegantly.
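A minimal MATLAB sketch of this matrix route (not from the text; the data below are hypothetical values chosen only so that their sums match Σx = 6, Σx² = 14, Σxy = 72.9, and Σy = 35.8 used above):

x = [0; 1; 2; 3];             % hypothetical data consistent with the sums above
y = [3.2; 7.0; 10.9; 14.7];

Z = [x, ones(length(x), 1)];  % i-th row of Z is [x_i, 1]
A = (Z'*Z)\(Z'*y);            % solve [Z'Z]{A} = [Z']{y}
fprintf('a = %.2f, b = %.2f\n', A(1), A(2))   % gives a = 3.84, b = 3.19

In practice, A = Z\y does the same job even more directly, since MATLAB's backslash operator returns the least squares solution for a tall matrix.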
3.5 Nonlinear Least Squares

3.5.1 Introduction
We first need to establish the need for nonlinear regression in thermal systems. If we disassemble the chassis of our desktop, we will see the processor with an aluminum heat sink on top of it; there may be more than one such sink. There are at least two fans: one near the outlet, and another dedicated to the CPU. When we boot, the second fan does not turn on; only when we run several applications and the CPU exceeds a certain temperature does the second fan come on. Basically, we have a heat sink like what is shown in Fig. 3.17.
Suppose the whole heat sink is such that it is made of a highly conducting material
and can be assumed to be at the same temperature, we can treat it as a lumped
capacitance system. So when the system is turned on, we want to determine its
temperature response. There is a processor that is generating heat at the rate of Q
Watts, has a mass m, specific heat Cp and the heat transfer coefficient afforded is h,
and the surface area is A; if we treat the heat sink and the processor to be at the same
temperature, which is a reasonably good assumption to start with, the governing
equation will be
$$mC_p\frac{dT}{dt} = Q - hA(T - T_\infty) \qquad (3.81)$$

It is to be noted that in Eq. 3.81, m is the combined mass of the CPU and heat sink, and $C_p$ is the mass averaged specific heat.
Initially, the processor generates heat constantly at the rate of Q Watts. At t = 0, T = T∞. So at the start, even though hA is available, ΔT is so small that hAΔT cannot compensate Q; hence (Q − hAΔT) is positive, which forces the system to a higher temperature. The system temperature keeps climbing. While this is happening, the heat generation from the processor, Q, remains the same. As the temperature difference ΔT keeps increasing, a time will come when Q = hAΔT; the left side is then switched off, which means that $mC_p\,dT/dt$ becomes 0 and the system approaches a steady state.

After the system has reached a steady state, when we shut it down, Q = 0. Then $mC_p\,dT/dt = -hA\Delta T$, so the temperature will follow a cooling curve analogous to what we studied previously. Therefore, if we plot the temperature–time history for this problem, starting up, operating at steady state for a long time, and then shutting down, the temperature profile will look like what is shown in Fig. 3.18.
If we are actually a chip designer, before we start to worry about the unsteady state,
we will worry about the steady-state temperature. So for the steady-state temperature,
we do not have to solve all these equations. When equilibrium is established
$$Q = hA(T_s - T_\infty) \qquad (3.82)$$

$$T_s = T_\infty + \frac{Q}{hA} \qquad (3.83)$$
Usually, a limit is set by the manufacturer on the maximum permissible temperature of the chip, say $T_s = 80\ ^{\circ}$C. The ambient $T_\infty = 30\ ^{\circ}$C is also fixed. We know Q, the heat the processor is going to dissipate. Therefore, we do not have much choice: if there is a fan, and if we know that for forced convection h = 100 W/m²K, all that is left is to calculate the surface area of the heat sink from Eq. 3.82 (for instance, if Q were 50 W, $A = Q/[h(T_s - T_\infty)] = 50/(100 \times 50) = 0.01\ \mathrm{m^2}$). This is one aspect of chip design. But if we need information on how long it will take for the system to reach this temperature, or how long it will take to cool to a particular temperature, we have to solve the differential equation. The solution to the differential equation is interesting.
$$mC_p\frac{dT}{dt} = Q - hA(T - T_\infty) \qquad (3.84)$$

$$\text{Let } T - T_\infty = \theta \qquad (3.85)$$

$$mC_p\frac{d\theta}{dt} = Q - hA\theta \qquad (3.86)$$

The complementary function is

$$\theta = Ae^{-t/\tau} \qquad (3.87)$$

$$\text{where } \tau = \frac{mC_p}{hA} \qquad (3.88)$$
The particular integral will be θ = Q/hA; the particular integral is a solution that satisfies the differential equation exactly. If we substitute θ = Q/hA in the differential equation (Eq. 3.86), since it is not a function of time, the first term vanishes and the equation is satisfied. The complete solution to the problem is the sum of the complementary function and the particular integral. The general solution is then

$$\theta = \frac{Q}{hA} + Be^{-t/\tau} \qquad (3.89)$$

The constant B has to be figured out from the initial condition. When we start, at time t = 0, θ = 0. Substituting this in the solution, we get B = −Q/hA. So the final solution is

$$\theta = \frac{Q}{hA}\left[1 - e^{-t/\tau}\right]$$

which is of the form

$$\theta = a\left[1 - e^{-bt}\right] \qquad (3.90)$$

with a = Q/hA and b = 1/τ = hA/(mC_p).
The above form cannot be reduced to a linear regression model, regardless of the number and intensity of algebraic manipulations. We have reached the stage where a and b cannot be determined using linear least squares. In nonlinear least squares, we first assume initial values for a and b, determine the residual, and then keep changing a and b to see whether the residual goes down. We cannot get the solution in one shot because the problem cannot be linearized.

What does this mean? If we have a function $f(x_i)$ whose parameters a and b are known, then for a known value of $x_i$ we can compute a fitted value $Y_i$. But that value of $f(x_i)$ need not agree with $Y_{data}$; after all, $f(x_i)$ is just a form we are trying to fit. The difference between $Y_i$ and $f(x_i)$, even with the best values of a and b, will leave some error $e_i$.
For simplicity, we consider this as a 2-parameter problem. The function need not be only of the form given above; it can be any nonlinear form with two parameters a and b. So for a general 2-parameter problem, where linear regression will not hold, we propose

$$Y_i = f(x_i; a, b) + e_i \qquad (3.91)$$

$$e_i = Y_i - f(x_i; a, b) \qquad (3.92)$$
We have to assume values for a and b and then expand $f(x_i)$ from the jth iteration to the (j + 1)th iteration using the Taylor series. We truncate the series after a certain number of terms and determine the difference $(Y_{data} - Y_{fit})$ for all the data points. Then, in a least squares sense, we try to minimize this difference, comparing the old values of a and b with the new ones. But that alone will not do. With the old values of a and b, we find the values of $Y_{fit}$, take the square of $(Y_{data} - Y_{fit})$ for each i, and sum up to get $S_0 = \sum R_i^2$, the sum of the residues at iteration 0 for the assumed a and b. Going through the procedure, we get new values $a_{new}$ and $b_{new}$, from which we get the new $Y_{fit}$ and the new sum of residues $S = \sum(Y_{fit} - Y_{data})^2$, and we watch how S goes down over the iterations. After the sum of residues S has reached a sufficiently small value, we can take a call and say that this is enough.
So basically, the key is that we are still using linear least squares, which means that
we have to approximate the nonlinear form by an appropriate linear form. Expanding
$f(x_i)$ in the vicinity of iteration j using the Taylor series, we have

$$f(x_i)_{j+1} = f(x_i)_j + \Delta a\,\frac{\partial f(x_i)_j}{\partial a} + \Delta b\,\frac{\partial f(x_i)_j}{\partial b} + \text{higher order terms} \qquad (3.93)$$

When we try to regress starting with this approach, truncating after the first-order terms, we are linearizing the nonlinear form using the Taylor series; this is the Gauss–Newton algorithm or GNA. Now, substituting for $f(x_i)$ using the expression for $f(x_i)_{j+1}$ from the Taylor series expansion in Eq. 3.91 or 3.92, we have

$$Y_i = f(x_i)_j + \Delta a\,\frac{\partial f(x_i)_j}{\partial a} + \Delta b\,\frac{\partial f(x_i)_j}{\partial b} + e_i \qquad (3.94)$$

$$\left[Y_i - f(x_i)_j\right] = \Delta a\,\frac{\partial f(x_i)_j}{\partial a} + \Delta b\,\frac{\partial f(x_i)_j}{\partial b} + e_i \qquad (3.95)$$

In matrix form,

$$\{D\} = [Z_j]\{\Delta A\} + \{E\} \qquad (3.96)$$
$[Z_j]$ is called the Jacobian or the sensitivity matrix. We cannot evaluate the Jacobian matrix unless we have values of a and b, which are what we are actually seeking; one cannot go to the next step without assuming a and b. This was not the case for linear least squares. In nonlinear least squares, we have to assume values for a and b, work all this out, and obtain the new values of a and b.
With D given by

$$\{D\} = \begin{Bmatrix} Y_1 - f(x_1) \\ Y_2 - f(x_2) \\ \vdots \\ Y_n - f(x_n) \end{Bmatrix} \qquad (3.98)$$

the D matrix is in fact the forcing vector, and the ith row of the Jacobian $[Z_j]$ is $\left[\partial f(x_i)_j/\partial a \ \ \ \partial f(x_i)_j/\partial b\right]$; the subscript j signifies that the Jacobian is evaluated at the jth iteration and is not constant. Now, using linear least squares on Eq. 3.96,

$$[Z_j^TZ_j]\{\Delta A\} = [Z_j^T]\{D\} \qquad (3.99)$$

from which the parameters are updated as

$$a_{j+1} = a_j + \Delta a \qquad (3.100)$$

$$b_{j+1} = b_j + \Delta b \qquad (3.101)$$
(i) Basically, nonlinear least squares is an iterative and often cumbersome procedure. The best way to implement it for a multivariable problem is to write a MATLAB script or use a nonlinear regression tool.
(ii) As is the case with any iterative technique, a successful solution depends critically on the initial guess. If the initial guess is way off, the solution may converge very slowly, oscillate, or diverge. So the Gauss–Newton algorithm (GNA) is not a universal cure; it cannot solve all nonlinear regression problems. However, it is a strategy worth trying, and if we get stuck along the way, there are some powerful tools available to correct course.
(iii) In GNA, we do not use the second-order terms, so there is a chance that the scheme can falter along the way.
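Before taking up an example, it is worth writing down the Jacobian entries for the model of Eq. 3.90, θ = a[1 − e^(−bt)] (a short derivation added here for convenience):

$$\frac{\partial f}{\partial a} = 1 - e^{-bt}, \qquad \frac{\partial f}{\partial b} = a\,t\,e^{-bt}$$

For instance, with a = 40, b = 0.005, and t = 10 s, these evaluate to $1 - e^{-0.05} = 0.0488$ and $40 \times 10 \times e^{-0.05} = 380.49$, which are exactly the first-row entries of the matrix $Z_0$ in the example that follows.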
S. No.   t, s   θi, °C
1        10     3.1
2        41     11.8
3        79     21.1
4        139    29.8
5        202    37.4
6        298    42.5
Solution
With the initial guess a = 40 and b = 1/200 = 0.005, the starting fit is

$$\theta = 40\left[1 - e^{-t/200}\right]$$

We get a high value for the residue here (see Table 3.11), which tells us that these values of a and b are not correct.
1st iteration:

$$Z_0 = \begin{bmatrix} 0.0488 & 380.49 \\ 0.1854 & 1336.02 \\ 0.3263 & 2128.83 \\ 0.5009 & 2774.85 \\ 0.6358 & 2942.89 \\ 0.7746 & 2686.44 \end{bmatrix}, \qquad Z_0^TZ_0 = \begin{bmatrix} 1.40 & 6302.89 \\ 6302.89 & 3.00\times10^{7} \end{bmatrix}$$
Table 3.11 Values from the initial guess and first iteration

t, s   θi, °C   θfit    (θi − θfit)²   θfit,new   (θi − θfit,new)²
10     3.1      1.95    1.32           3.38       0.0793
41     11.8     7.41    19.24          12.36      0.3087
79     21.1     13.05   64.76          20.80      0.0878
139    29.8     20.04   95.32          30.01      0.0421
202    37.4     25.43   143.25         36.03      1.8688
298    42.5     30.99   132.59         41.08      2.0083
Sum                     456.4742                  4.3951
2nd iteration:

$$Z_1 = \begin{bmatrix} 0.0753 & 380.4918 \\ 0.2745 & 1339.9448 \\ 0.4611 & 1917.7100 \\ 0.6630 & 2109.8617 \\ 0.7942 & 1872.7426 \\ 0.9029 & 1303.3960 \end{bmatrix}, \qquad Z_1^TZ_1 = \begin{bmatrix} 2.1792 & 5346.4147 \\ 5346.4147 & 1.53\times10^{7} \end{bmatrix}$$

3rd iteration:

$$Z_2 = \begin{bmatrix} 0.0669 & 454.1666 \\ 0.2472 & 1502.2336 \\ 0.4215 & 2224.6416 \\ 0.6182 & 2583.1014 \\ 0.7532 & 2426.3106 \\ 0.8731 & 1840.7771 \end{bmatrix}, \qquad Z_2^TZ_2 = \begin{bmatrix} 1.9551 & 6371.0413 \\ 6371.0413 & 2.34\times10^{7} \end{bmatrix}$$
We now look at the power of the Gauss–Newton algorithm. In the first iteration, when θ was 3.1, θfit was 1.95. Immediately after one iteration, θfit,new is 3.38, which is very close to θ. Looking at the second row, θfit,new has increased from 7.41 to 12.36 in just one iteration, where the actual value is 11.8. Σ(θ − θfit)² has dropped from 456.47 to 4.39 after just one iteration (refer to Table 3.11). There will be a significant improvement in the residuals if the problem is well conditioned. At the end of the third iteration, from the MATLAB output below, Σ(θ − θfit)² has become even smaller and the solution has almost approached the true solution.
MATLAB code for Example 3.5
clc;
clear;

% Input data
xt = [10, 41, 79, 139, 202, 298];            % input X data
yt = [3.1, 11.8, 21.1, 29.8, 37.4, 42.5];    % input Y data
x = xt';
y = yt';

% Initial guess
a = 40;        % initial guess for a and b
b = 0.005;

% Pre-allocation
z   = zeros(length(xt), 2);   % Jacobian matrix
D   = zeros(length(xt), 1);   % forcing vector
y_f = zeros(length(xt), 1);   % y fit vector

% Convergence settings (these lines were missing in the extracted listing;
% the values below are assumed for illustration)
errTol   = 1e-3;   % tolerance on the sum of squared residuals
countmax = 10;     % maximum number of iterations

count = 0;
label = 0;
while label == 0

    syms av bv xv;
    y_ff = av*(1 - exp(-1*bv*xv));   % function y = a[1 - exp(-bx)]
    dyda = diff(y_ff, av);           % derivative of y w.r.t. a
    dydb = diff(y_ff, bv);           % derivative of y w.r.t. b

    y_f    = single(subs(y_ff, {av, bv, xv}, {a, b, x}));
    z(:,1) = single(subs(dyda, {av, bv, xv}, {a, b, x}));
    z(:,2) = single(subs(dydb, {av, bv, xv}, {a, b, x}));
    D = y - y_f;

    if count > 0
        ztz = (z'*z);
        ztd = z'*D;
        DV  = ztz\ztd;    % solve the normal equations for the increments
        a = a + DV(1);
        b = b + DV(2);
    end

    y_fn = single(subs(y_ff, {av, bv, xv}, {a, b, x}));

    err = sum((y - y_fn).^2);

    if err < errTol || count == countmax
        label = 1;
    end

    % Print
    prt = ['Itr = ', num2str(count), ...
           ', a = ', num2str(a), ...
           ', b = ', num2str(b), ...
           ', 1/b = ', num2str(1/b), ...
           ', err = ', num2str(err)];
    disp(prt)

    count = count + 1;
end
Example 3.6 Revisit Example 3.5 and solve it using the Levenberg method.
Perform at least 3 iterations.
Solution
Let us solve this problem using the Levenberg method (Table 3.12).
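In the Levenberg method, the Gauss–Newton normal equations are damped with a parameter λ (the update rule is stated here for completeness):

$$[Z_j^TZ_j + \lambda I]\{\Delta A\} = [Z_j^T]\{D\}$$

With the small value λ = 0.01 used below and in the accompanying code, the step stays close to the Gauss–Newton step; a larger λ shortens the step and biases it toward steepest descent.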
Table 3.12 Values from the initial guess and first iteration

t, s   θi, °C   θfit    (θi − θfit)²   θfit,new   (θi − θfit,new)²
10     3.1      1.95    1.32           3.39       0.0845
41     11.8     7.41    19.24          12.36      0.3174
79     21.1     13.05   64.76          20.77      0.1088
139    29.8     20.04   95.32          29.87      0.0044
202    37.4     25.43   143.25         35.77      2.6441
298    42.5     30.99   132.59         40.67      3.3448
Sum                     456.4742                  6.5038
We get a high value for the residue here (see Table 3.12), which tells us that the values of a and b are not correct.
1st iteration:

$$Z_0 = \begin{bmatrix} 0.0488 & 380.49 \\ 0.1854 & 1336.02 \\ 0.3263 & 2128.83 \\ 0.5009 & 2774.85 \\ 0.6358 & 2942.89 \\ 0.7746 & 2686.44 \end{bmatrix}$$

$$Z_0^TZ_0 + \lambda I = \begin{bmatrix} 1.40 & 6302.89 \\ 6302.89 & 3.00\times10^{7} \end{bmatrix} + 0.01\times\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

2nd iteration:

$$Z_1 = \begin{bmatrix} 0.0753 & 416.5431 \\ 0.2745 & 1339.9448 \\ 0.4611 & 1917.7100 \\ 0.6630 & 2109.8617 \\ 0.7942 & 1872.7426 \\ 0.9029 & 1303.3960 \end{bmatrix}$$

$$Z_1^TZ_1 + \lambda I = \begin{bmatrix} 2.1792 & 5346.4147 \\ 5346.4147 & 1.53\times10^{7} \end{bmatrix} + 0.01\times\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

3rd iteration:

$$Z_2 = \begin{bmatrix} 0.0671 & 452.4807 \\ 0.2479 & 1495.7306 \\ 0.4224 & 2213.3307 \\ 0.6193 & 2566.8899 \\ 0.7542 & 2408.0510 \\ 0.8738 & 1823.4242 \end{bmatrix}$$

$$Z_2^TZ_2 + \lambda I = \begin{bmatrix} 1.9603 & 6335.1352 \\ 6335.1352 & 2.31\times10^{7} \end{bmatrix} + 0.01\times\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
clc;
clear;

% Input data
xt = [10, 41, 79, 139, 202, 298];            % input X data
yt = [3.1, 11.8, 21.1, 29.8, 37.4, 42.5];    % input Y data
L = 0.01;                                    % hyperparameter lambda value
x = xt';
y = yt';

% Initial guess
a = 40;        % initial guess for a and b
b = 0.005;

% Pre-allocation
z   = zeros(length(xt), 2);   % Jacobian matrix
D   = zeros(length(xt), 1);   % forcing vector
y_f = zeros(length(xt), 1);   % y fit vector

% Convergence settings (these lines were missing in the extracted listing;
% the values below are assumed for illustration)
errTol   = 1e-3;   % tolerance on the sum of squared residuals
countmax = 10;     % maximum number of iterations

count = 0;
label = 0;
while label == 0

    syms av bv xv;
    y_ff = av*(1 - exp(-1*bv*xv));   % function y = a[1 - exp(-bx)]
    dyda = diff(y_ff, av);           % derivative of y w.r.t. a
    dydb = diff(y_ff, bv);           % derivative of y w.r.t. b

    y_f    = single(subs(y_ff, {av, bv, xv}, {a, b, x}));
    z(:,1) = single(subs(dyda, {av, bv, xv}, {a, b, x}));
    z(:,2) = single(subs(dydb, {av, bv, xv}, {a, b, x}));
    D = y - y_f;

    if count > 0
        zt  = (z'*z);
        ztz = (zt + (L*eye(size(zt))));   % Levenberg damping
        ztd = z'*D;
        DV  = ztz\ztd;
        a = a + DV(1);
        b = b + DV(2);
    end

    y_fn = single(subs(y_ff, {av, bv, xv}, {a, b, x}));

    err = sum((y - y_fn).^2);

    if err < errTol || count == countmax
        label = 1;
    end

    % Print
    prt = ['Itr = ', num2str(count), ...
           ', a = ', num2str(a), ...
           ', b = ', num2str(b), ...
           ', err = ', num2str(err)];
    disp(prt)

    count = count + 1;
end
Example 3.7 Revisit Example 3.5 and solve it using the Levenberg–Marquardt
method. Perform at least 3 iterations.
Solution
Let us solve this problem using the Levenberg–Marquardt method.
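In the Levenberg–Marquardt method, the damping term is scaled by the diagonal of $Z^TZ$ rather than by the identity (the update rule is stated here for completeness):

$$[Z_j^TZ_j + \lambda\,\mathrm{diag}(Z_j^TZ_j)]\{\Delta A\} = [Z_j^T]\{D\}$$

which makes the damping insensitive to the relative scaling of the two parameters. Again, λ = 0.01 is used below.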
We get a high value for the residue here (see Table 3.13) which tells us that the
values for a and b are not correct.
Table 3.13 Values from the initial guess and first iteration

t, s   θi, °C   θfit    (θi − θfit)²   θfit,new   (θi − θfit,new)²
10     3.1      1.95    1.32           3.39       0.0660
41     11.8     7.41    19.24          12.30      0.2511
79     21.1     13.05   64.76          20.78      0.1023
139    29.8     20.04   95.32          30.11      0.0933
202    37.4     25.43   143.25         36.29      1.2301
298    42.5     30.99   132.59         41.55      0.8949
Sum                     456.4742                  2.6377
1st iteration:

$$Z_0 = \begin{bmatrix} 0.0488 & 380.49 \\ 0.1854 & 1336.02 \\ 0.3263 & 2128.83 \\ 0.5009 & 2774.85 \\ 0.6358 & 2942.89 \\ 0.7746 & 2686.44 \end{bmatrix}$$

$$Z_0^TZ_0 + \lambda\,\mathrm{diag}(Z_0^TZ_0) = \begin{bmatrix} 1.40 & 6302.89 \\ 6302.89 & 3.00\times10^{7} \end{bmatrix} + 0.01\times\begin{bmatrix} 1.40 & 0 \\ 0 & 3.00\times10^{7} \end{bmatrix}$$

2nd iteration:

$$Z_1 = \begin{bmatrix} 0.0721 & 432.0501 \\ 0.2642 & 1404.6893 \\ 0.4463 & 2036.7543 \\ 0.6466 & 2287.4426 \\ 0.7794 & 2074.7312 \\ 0.8924 & 1492.3301 \end{bmatrix}$$

$$Z_1^TZ_1 + \lambda\,\mathrm{diag}(Z_1^TZ_1) = \begin{bmatrix} 2.0962 & 5739.1245 \\ 5739.1245 & 1.81\times10^{7} \end{bmatrix} + 0.01\times\begin{bmatrix} 2.0962 & 0 \\ 0 & 1.81\times10^{7} \end{bmatrix}$$

3rd iteration:

$$Z_2 = \begin{bmatrix} 0.0679 & 452.1065 \\ 0.2504 & 1490.7325 \\ 0.4261 & 2199.1311 \\ 0.6235 & 2538.0128 \\ 0.7582 & 2368.7984 \\ 0.8769 & 1779.7580 \end{bmatrix}$$

$$Z_2^TZ_2 + \lambda\,\mathrm{diag}(Z_2^TZ_2) = \begin{bmatrix} 1.9814 & 6280.1462 \\ 6280.1462 & 2.25\times10^{7} \end{bmatrix} + 0.01\times\begin{bmatrix} 1.9814 & 0 \\ 0 & 2.25\times10^{7} \end{bmatrix}$$
clc;
clear;

% Input data
xt = [10, 41, 79, 139, 202, 298];            % input X data
yt = [3.1, 11.8, 21.1, 29.8, 37.4, 42.5];    % input Y data
L = 0.01;                                    % hyperparameter lambda value
x = xt';
y = yt';

% Initial guess
a = 40;        % initial guess for a and b
b = 0.005;

% Pre-allocation
z   = zeros(length(xt), 2);   % Jacobian matrix
D   = zeros(length(xt), 1);   % forcing vector
y_f = zeros(length(xt), 1);   % y fit vector

% Convergence settings (these lines were missing in the extracted listing;
% the values below are assumed for illustration)
errTol   = 1e-3;   % tolerance on the sum of squared residuals
countmax = 10;     % maximum number of iterations

count = 0;
label = 0;
while label == 0

    syms av bv xv;
    y_ff = av*(1 - exp(-1*bv*xv));   % function y = a[1 - exp(-bx)]
    dyda = diff(y_ff, av);           % derivative of y w.r.t. a
    dydb = diff(y_ff, bv);           % derivative of y w.r.t. b

    y_f    = single(subs(y_ff, {av, bv, xv}, {a, b, x}));
    z(:,1) = single(subs(dyda, {av, bv, xv}, {a, b, x}));
    z(:,2) = single(subs(dydb, {av, bv, xv}, {a, b, x}));
    D = y - y_f;

    if count > 0
        zt  = (z'*z);
        ztz = (zt + (L*diag(diag(zt))));   % Levenberg-Marquardt damping
        ztd = z'*D;
        DV  = ztz\ztd;
        a = a + DV(1);
        b = b + DV(2);
    end

    y_fn = single(subs(y_ff, {av, bv, xv}, {a, b, x}));

    err = sum((y - y_fn).^2);

    if err < errTol || count == countmax
        label = 1;
    end

    % Print
    prt = ['Itr = ', num2str(count), ...
           ', a = ', num2str(a), ...
           ', b = ', num2str(b), ...
           ', err = ', num2str(err)];
    disp(prt)

    count = count + 1;
end
Problems
3.1 The dynamic viscosity of liquid water in Ns/m2 varies with temperature (T in
K) as given in Table 3.14.
(a) Using Newton’s divided difference method, appropriate for 4 data points,
obtain an exact fit to the viscosity as a function of temperature.
(b) Using the fit obtained in (a), estimate the viscosity at 295 K.
(c) Compare the result obtained in (b) with a linear interpolation.
3.2 The specific volume of saturated water vapor (vg ) varies with temperature (T)
as shown in Table 3.15.
Using Lagrange interpolation polynomial of order 2, determine the specific vol-
ume of saturated water vapor at 60◦ C. Compare this with the estimate obtained
using a linear interpolation and comment on the results.
3.3 The saturation pressure of water (P) in kPa varies with temperature (T in K)
as given in Table 3.16. Using Lagrange interpolation, determine the value of
saturation pressure at 305 K.
Table 3.15 Specific volume for various temperatures for problem 3.2

T, °C    vg, m³/kg
30       32.90
50       12.04
80       3.409
Table 3.17 Variation of thermal conductivity with temperature for problem 3.4

Temperature, T in K    400       450       500
k, W/mK                0.0339    0.0372    0.0405
3.4 The thermal conductivity of dry air (k) for various temperatures is given in Table
3.17.
Obtain "k" at 470 K using Lagrange interpolation and compare it with a linear interpolation. Comment on your result.
3.5 In a forced convection heat transfer experiment, the dimensionless heat transfer
coefficient or the Nusselt number is known to vary with the Reynolds number in a
power law fashion, as Nu = aReb , where a and b are constants. The experimental
results are tabulated in Table 3.18.
(a) Using Least Squares Regression, estimate the parameters a and b.
(b) Determine the standard error of the estimate and the correlation coefficient.
3.6 Consider the cooling of an aluminum plate of dimensions 150 × 150 × 3 (all in mm). The plate loses heat by convection from all its faces to still air. Radiation from the plate can be neglected, and the plate can be assumed to be spatially isothermal. The ambient temperature is constant at T∞ = 30 °C. The temperature–time response of the plate, based on experiments, is given in Table 3.19.
The temperature excess θ = T − T∞ is known to vary as θ/θi = exp(−t/τ), where θi is the initial temperature excess, Ti − T∞, and τ = mCp/(hA) is the time constant.
Table 3.19 Transient temperature history of the aluminum plate for problem 3.6

S. No.   Time t (s)   Temperature T (°C)
1        10           98.5
2        40           96.1
3        80           90.2
4        120          85.9
5        180          82.8
6        240          75.1
(a) With the data given above, perform a linear least squares regression and obtain the best estimates of θi and τ. Additionally, determine the correlation coefficient and the standard error of the estimate of the temperature.
(b) Given Cp = 940 J/kg K and ρ = 2700 kg/m³ for aluminum, determine, from the results obtained in part (a), the heat transfer coefficient (h) for this situation.
3.7 An experiment was designed to examine the effect of load, x (in appropriate units)
on the probability of failure of specimens of a certain industrial component. The
following results were obtained from the experiment (see Table 3.20).
The regression model suitable for this problem is of the following form (also known as the logistic regression model):

p = 1 / [1 + e^(−(a+bx))]
Table 3.20 Variation of the number of failures with load for problem 3.7

Load, x   Number of specimens   Number of failures
10        400                   17
30        500                   49
60        450                   112
85        600                   275
95        550                   310
110       350                   253
3.8 A device operating in outer space, undergoing cooling, can be treated as a first-order lumped capacitance system, losing heat to outer space at 0 K purely by radiation (no convection). The hemispherical emissivity of the device is ε, the surface area of the device is A, the mass of the device is m, and its specific heat is Cp. The initial temperature of the device is Ti and there is no heat generation in the system.
(a) State the governing equation for the system temperature (T) as a function of time.
(b) Prove that the general solution to (a) can be given by

T = [1/Ti³ + 3εσAt/(mCp)]^(−1/3)
Reference
Chapra, S. C., & Canale, R. P. (2009). Numerical methods for engineers. New York, USA: McGraw-Hill.
Chapter 4
Optimization—Basic Ideas
and Formulation
4.1 Introduction
We now come to a very important part of the book, namely optimization. Before
we could do this, we needed to know system simulation, regression, and we assume
some prior knowledge of modeling.
What is optimization? Mathematically, optimization is the process of finding the conditions that give the maximum or minimum value of a function. We want to find an extremum, which may be a maximum or a minimum. As far as engineering is concerned, every engineer is expected to optimize: whatever the design, after proposing it, we want to optimize it. Therefore, we can say that optimization is always expected from engineers, but it is not "always" done. It seems perfectly logical that once we design, say, a power plant, we want to optimize it. We want to optimize the clutch, the brake, our time, our resources, and so on. Everyone wants to optimize something or the other. Even so, there are many instances where it is not done, usually because of cost–benefit considerations.
• In very large engineering projects on the other hand, sometimes it may not be
possible for us to bring them down to a mathematical form and write the set
of constraints and objective functions. Even if we are able to do so, it may be
mathematically so complex that it is not possible for us to optimize.
• The second thing is that there are situations where it is possible for us to optimize.
But because of the time and the effort involved in optimizing it, we do not want
that. More often than not, we are satisfied with a design that works reasonably well
and satisfies all the performance requirements. One such example is the pump and
piping system for the apartment complex we saw earlier.
• The third reason may be an absolute lack of knowledge on our part about opti-
mization and the techniques available. We sometimes do not know and hence say
it is not required!
• There are other projects in which it is not worth doing optimization, for example, when an engineering system has reached saturation: an electric motor's efficiency may already be so close to its practical limit that further optimization is not worth the effort.
Example 4.1 An ethylene refining plant receives 500 kg/hr of 50% pure ethy-
lene. It refines it into two types of outputs 1 and 2. Type 1 has 90% purity. Type
2 has 70% purity. The raw material cost is Rs 40/kg. The selling price of Type
1 is Rs 200/kg and that of Type 2 is Rs 120/kg. Packaging facilities allow a
maximum of 200 kg/hr of Type 1 and 225 kg/hr of Type 2. The transportation
cost is Rs 8/kg for Type 1 and Rs 16/kg for Type 2, and the total transportation
cost should not exceed Rs 4000/hr. Set up the optimization problem to maxi-
mize the profit and state all the constraints. Do not try to solve the optimization
problem.
Solution
Let us first get the expression for profit.
Profit = sales − (raw material cost + transportation cost)

Profit = (200x1 + 120x2) − (40 × 500) − (8x1 + 16x2) = 192x1 + 104x2 − 20000 (4.1)

where x1 and x2 are the kg/hr outputs of Type 1 and Type 2, respectively.
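Although the example asks only for the formulation, a small sketch can confirm the setup numerically; this is an illustrative addition, with the mass balance 0.9x1 + 0.7x2 = 250 (discussed later in this chapter) imposed as an equality:

% Sketch: solving the Example 4.1 formulation with linprog
f   = -[192; 104];                 % negated profit terms from Eq. 4.1 (linprog minimizes)
A   = [8 16];    b   = 4000;       % transportation cost limit, Rs/hr
Aeq = [0.9 0.7]; beq = 250;        % mass balance on pure ethylene, kg/hr
lb  = [0; 0];    ub  = [200; 225]; % nonnegativity and packaging limits
x   = linprog(f, A, b, Aeq, beq, lb, ub);
profit = 192*x(1) + 104*x(2) - 20000;   % Rs/hr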
So what is the final feasible region? It is the portion shaded in Fig. 4.3. What is the
revelation here? The objective function is in the background thus far. The feasible
region is completely decided by the constraints. In this region, we have to determine
the point or points that will maximize the objective function.
In certain cases, it is possible that the feasible region itself reduces to a point.
The constraints decide the final solution. We will then have a set of simultaneous
equations and we solve them and get a solution and that is it. Whether it is the
optimum or not, we do not know; that is the only solution available. And far worse,
if the number of constraints is more than the variables, we have an overdetermined
system that cannot be solved at all.
This begs the question “What holds the key to the optimization?”. The answer is
“the constraints”.
Each of the points in the feasible region is a valid solution and will not violate
any of the constraints. But each of these points is not equally desirable because each
will result in a different value of the profit. Therefore, we will now have to develop
an optimization algorithm which will help us find out which of these points in the
feasible region maximizes the objective function y.
We are fortunate here that the objective function and the constraints are all linear. But this is a highly nonlinear world: as thermal engineers, we know that we frequently encounter the term (T⁴ − T∞⁴) in radiation, which is highly nonlinear. Certain problems involving manufacturing, job allocation, and some applications where the objective function and the constraints are linear can be solved by a special technique called Linear Programming (LP). We look at LP in detail in Chap. 7.
Y = Y[x1, x2, x3 ... xn] (4.2)

subject to

φ1 = φ1[x1, x2, x3 ... xn] = c1 (4.3)
φ2 = φ2[x1, x2, x3 ... xn] = c2 (4.4)
...
φm = φm[x1, x2, x3 ... xn] = cm (4.5)

ψ1 = ψ1[x1, x2, x3 ... xn] ≤ r1 (4.6)
ψ2 = ψ2[x1, x2, x3 ... xn] ≤ r2 (4.7)
...
ψk = ψk[x1, x2, x3 ... xn] ≤ rk (4.8)
Equation 4.2 is the objective function, which is also called the figure of merit in many places.
Equations 4.3–4.5 represent the equality constraints. They arise when we have
to satisfy the basic laws of nature like the law of conservation of mass or Newton’s
second law of motion. For instance, suppose we say that the mass balance has to
be satisfied exactly and no mass can be stored within a system, the mass balance
constraint of Example 4.1 will have to be represented as an equality as 0.9x1 + 0.7x2
= 250.
However, when we say the maximum temperature in a laptop, for example, should
not exceed 70 ◦ C, we write the constraint as T ≤ 70. Inequality constraints like
this arise when restrictions or limitations are set on the possible values of certain
parameters, because of safety and other considerations. Equations 4.6–4.8 represent
the inequality constraints.
Sometimes it is possible to treat equality constraints also as inequality constraints.
If x = 5, one can represent it as a set of inequality constraints x < 6 and x > 4. If we
have a powerful solver that works very well for problems with inequality constraints,
such a strategy can be gainfully employed.
We also have nonnegativity constraints, wherein we declare that all x’s are greater
than or equal to 0. Bounds for variables are also prescribed under normal circum-
stances.
The objective function need not be linear, but can be hyperbolic, logarithmic,
exponential, and so on. Solving the objective function with the constraints may be
very formidable for some problems. For example, to get one solution, we may have
to do a finite element or a CFD simulation.
The above is possible when "a" is a constant. So we just optimize y and add the constant to it after the optimization is accomplished.
Fig. 4.4 Depiction of the feasible domain and the optimum for a two-variable problem
1. The constraints allow us to explore the solution space; they do not decide the
final solution themselves. They do not decide the values of all the variables. If
they decide the values of all the variables, there is nothing we can do. Though
the constraints bind the values, they give us the freedom to play around with the
variables.
2. Where does the smartness lie? It lies in getting the global optimum without work-
ing out the value of y for each of the points in the feasible domain. First, there
is the feasible domain in which we have got the choice of so many solutions of
which all are not equally desirable (see Fig. 4.4). The constraints allow us some
breathing space. But in doing so, we do not unimaginatively and exhaustively
search for all the points in the feasible domain. Therefore, there is a need to
develop appropriate optimization techniques to obtain the optimum without hav-
ing to exhaustively search for all the points that satisfy all the constraints. Therein
lies the key to developing a successful optimization technique.
A typical flowchart for solving an optimization problem is given in Fig. 4.5. First, we have to establish the need for optimization, i.e., get convinced that the problem is worth optimizing. Having decided that optimization is required, the next step is to set up the objective function, followed by setting up the constraints. We also need to set limits on the variables, called bounds. The next step is to choose the method that we are going to use. Then comes a decision box, which checks whether the solution obtained is satisfactory before we stop.
There are several ways of classifying optimization techniques and one such is given
in Fig. 4.6. Optimization methods are broadly classified into calculus methods and
search methods. Under each method we can have a single-variable or a multivariable
problem. Again, a single or a multivariable problem can be a constrained or an
unconstrained variable optimization problem.
In calculus methods, we use the information on derivatives to determine the opti-
mum. In search methods, we use objective function information mostly and start
with an initial point and progressively improve the objective function.
In calculus methods, however, we completely ignore the value of the objective function and just determine the values of x1, x2 ... xn at which y becomes an extremum. We do not worry about the value of the optimum: the value of the function y at the optimum is a post-processed quantity in calculus methods, so the calculation of the objective function is pushed to the end. In search methods, by contrast, it is the objective function values that we are always comparing.
The important requirement for a calculus method is that the constraints must be
differentiable and must be equalities. If they are inequalities, then it is a lot of trouble.
Please note that the first division into calculus and search methods is based on
the type of method used. Further down, the divisions are based on the problem. We
have indicated both types of optimization problems we normally encounter and the
methods used for solving this. We can have a multivariable constrained optimization
problem which can be solved by a search or a calculus method. Or we can have
a single-variable unconstrained optimization problem that can be solved using an
appropriate method.
Problems
4.1 In a steam power plant, 5 kg/s steam enters the turbine. Bleeding occurs at two
stages as shown in Fig. 4.7. The bled steam is used for preheating. The prices are Rs. 4/kWh for electricity, Rs. 0.15/kg for low-pressure steam, and Rs. 0.25/kg for high-pressure steam. Assume that each kg/s into the generator can produce 0.025 kWh of electricity.
To prevent overheating of the generator, the mass flow into it should be less than
3 kg/s. To prevent unequal loading on the shaft, the extraction rates should be
such that 2x1 + 3x2 ≤ 10. The design of the bleed outlets allows the constraint
6x1 + 5x2 ≤ 20. Formulate the optimization problem for maximizing the profit
from the plant.
4.2 A one-dimensional pin fin losing heat by natural convection has length L and diameter d (both in m). The volume of the fin is fixed at V m³. The base temperature of the fin, Tb, the ambient temperature, T∞, and the heat transfer coefficient, h, are known and are constant. The fin is adiabatic at the tip. It is desired to maximize the heat transfer from the fin for the given volume V.
(a) Formulate this as a two-variable, one-constraint optimization problem in L and d.
(b) By substituting for L or d from the constraint, convert the problem into a single-variable optimization problem in either L or d.
(c) Do you feel that an optimum L or d exists? Justify your answer.
4.3 Consider a power plant on a spacecraft working on an internally reversible Carnot cycle on an organic fluid between two temperatures TH and TL, where TH is the temperature of evaporation of the organic fluid and TL is the condensation temperature. The heat rejection from the condenser has to be accomplished by radiation to outer space at a temperature of T∞ K. The emissivity of the condenser surface is ε, and the total surface area available is A m².
(a) Formulate the optimization problem for maximizing the work output from the plant, with TH fixed and TL being the variable.
(b) Solve the above problem for the special case of T∞ = 0 K, using calculus.
(c) For the case of TH = 350 K, T∞ = 3 K, ε = 0.9, and A = 10 m², determine the optimal value of TL and the corresponding work output and efficiency of the space power plant. You may use the successive substitution method for solving any nonlinear equation you may encounter.
4.4 Computer-based exercise†
(a) An overhead tank of a big apartment complex has a capacity of 12000 liters.
It is desired to select a pump and piping system to transport water from the
sump to the tank. The distance between the two is 270 m, and the tank is at
a level 22 m above the sump. For operational convenience, the time to fill
the tank shall be 60 min. Losses in the expansions, contractions, bends, and
elbows have to be calculated appropriately. Design a workable system for
the above and sketch the layout.
4.5 Using data for PVC pipes from local market sources, assign realistic values for the cost of the pipe (PVC), the pump, and the running cost including maintenance. With the help of any method known to you, obtain the value of the pressure rise ΔP developed by the pump at which the total cost will be minimum. The pump is expected to work every day, and the average daily consumption of water is 24000 l. The cost of electricity may be assumed to be Rs. 5.50 per unit and invariant with respect to time. The life of the system may be assumed to be 20 years. Let x be the annual percentage increase in the electricity cost. Consider two values of x: 6% and 7%.
Output expected:
1. Setting up of the optimization problem with data
2. Sample calculations
3. Plots in Excel/MATLAB
4. Final configuration and sketch.
† Solution to problem 4.4 is given in the Appendix.
Chapter 5
Lagrange Multipliers

5.1 Introduction
The method of Lagrange multipliers is one of the most powerful optimization techniques. It can be used to solve both unconstrained and constrained problems with multiple variables. So, is it a cure-all that can solve every kind of problem? No! Because (i) the constraints must be equalities, (ii) the number of constraints must be less than the number of variables, and (iii) the objective function and constraints must be differentiable. These restrictions notwithstanding, there is a wide class of problems, including many in thermal engineering, that can be solved using Lagrange multipliers.
For the Lagrange multiplier method to work on an "m" constraint, "n" variable problem, "m" should be less than "n". If m = n, the constraints themselves fix the solution and no optimum exists. If m < n, there is a feasible domain that we can explore. If m > n, we have an over-constrained optimization problem that cannot be solved. Applying the Lagrange multiplier method to an "m" constraint, "n" variable optimization problem amounts to solving a set of equations that are generally written as follows:
∇y − λ∇φ = 0 (5.5)

∂y/∂x1 − λ1 ∂φ1/∂x1 − λ2 ∂φ2/∂x1 − ... − λm ∂φm/∂x1 = 0 (5.6)
...
∂y/∂xn − λ1 ∂φ1/∂xn − λ2 ∂φ2/∂xn − ... − λm ∂φm/∂xn = 0 (5.7)
Consider first the unconstrained case. Here, there are no constraints (the φ's do not exist), so there are no Lagrange multipliers, and the method reduces to ∇y = 0. For a one-variable problem, we have
dy/dx = 0. Typically, we can have three situations as seen in Fig. 5.1. Here, A is a
maximum, B is a minimum while C is an inflection point, where the second derivative
is also 0. The Lagrange multiplier method, therefore, will give us only the values
of the independent variables at which the function becomes stationary. It helps us
to locate the extremum. However, necessary and sufficient second-order conditions
are required to determine whether the optimum is a maximum or a minimum or an
inflection point. A typical depiction of minimum for a two-variable unconstrained
optimization problem is given in Fig. 5.2.
Having said that, we must also add that in many (not all !) engineering problems,
once we have made something stationary it is possible for us to figure out intuitively
whether we are heading toward a maximum or minimum.
Example 5.1 Determine the minimum of the following function using the Lagrange multiplier method:

y = (x1 − 8)² + (x2 − 6)² (5.8)

Solution
First we realize that the above is a two-variable unconstrained optimization problem. For a constant value of y, Eq. 5.8 represents the equation of a circle with center at (8, 6).
We first set ∂y/∂x1 = 0 and ∂y/∂x2 = 0 and thus get the values of x1 and x2 at the optimum. We then substitute for x1 and x2 in Eq. 5.8 and obtain y. We will denote the optimum by (x1+, x2+) to distinguish it from general x1 and x2. The Lagrange multiplier equations for this problem reduce to ∇y = 0:

∂y/∂x1 = 0 and ∂y/∂x2 = 0 (5.9)

These give x1+ = 8, x2+ = 6 and y+ = 0.
Example 5.2 Maximize the following function using the Lagrange multiplier method:

y = 6 − [x1² + (x2/x1) + 1/(x2x3) + (x3/8)] (5.10)

Solution
The above problem is an unconstrained optimization problem. We require x1, x2 and x3 to be greater than 0 because they are physical variables: we cannot produce −4 tables and −6 chairs, for example. The maximization of a + y is equivalent to a plus the maximization of y, and the minimization of (a − y) is equivalent to a − max(y). We can make use of this for solving this unconstrained optimization problem.
Applying the method of Lagrange multipliers, we have

∂y/∂x1 = ∂y/∂x2 = ∂y/∂x3 = 0 (5.11)

∂y/∂x1 = −[2x1 − (x2/x1²)] = 0 (5.12)
∂y/∂x2 = −[−1/(x2²x3) + (1/x1)] = 0 (5.13)
∂y/∂x3 = −[−1/(x2x3²) + (1/8)] = 0 (5.14)

2x1³ = x2 (5.15)
x1 = x2²x3 (5.16)
x2x3² = 8 (5.17)
x2 = 8/x3², x1 = 64/x3³ (5.18)
x3+ = 4.87, x2+ = 0.337, x1+ = 0.55 (5.19)
y+ = 3.86 (5.20)
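These stationarity conditions can be verified with the Symbolic Math Toolbox; the following sketch is an addition, and the objective used is the one reconstructed above:

% Numerical check of Example 5.2
syms x1 x2 x3 positive
y = 6 - (x1^2 + x2/x1 + 1/(x2*x3) + x3/8);
S = vpasolve(gradient(y, [x1 x2 x3]) == 0, [x1 x2 x3], [0.5 0.3 5]);
yopt = double(subs(y, {x1, x2, x3}, {S.x1, S.x2, S.x3}));   % approximately 3.86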
Example 5.3 Consider a solar thermal application where the hot water pro-
duced by a solar collector is kept in a cylindrical storage tank and its use
regulated so that it is also available during night time. The storage tank has
a capacity of 4000 l. Convective losses from the tank have to be minimized.
Radiative losses can be neglected. Ambient temperature T∞ and convection
coefficient h are constant. The hot water temperature may be assumed constant
in the analysis. Solve this as an unconstrained optimization problem in r and
h, where r is the radius of the tank and h is the height of the tank using the
Lagrange multiplier method.
Solution
We want to solve this as an unconstrained optimization problem. However, there is a constraint in this problem, so the constraint is first substituted into the objective function to convert the problem into an unconstrained one. The losses from both the top and the bottom walls also need to be accounted for. Since

Q = hAΔT (5.21)

with h and ΔT constant, we just have to minimize A. With A = 2πr² + 2πrh and the volume constraint πr²h = 4 m³ (4000 l), substituting h = 4/(πr²) gives A = 2πr² + 8/r, so that

∂A/∂r = 4πr − (8/r²) = 0 (5.26)
4πr = 8/r² (5.27)
r = (2/π)^(1/3) (5.28)
r+ = 0.860 m (5.29)
πr²h = 4 (5.30)
h+ = 1.72 m (5.31)
A+ = 13.94 m² (5.32)

To confirm the nature of this stationary point, we examine the second derivative:

∂A/∂r = 4πr − (8/r²) (5.33)
∂²A/∂r² = 4π + (16/r³) (5.34)

Since ∂²A/∂r² is positive, A+ is a minimum. If the temperature and the convection coefficient are constant, the height of a cylindrical tank of minimum convective loss must be twice its radius, or equal to its diameter.
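For a quick numerical confirmation (this snippet is an addition, not from the book), fminbnd applied to the substituted objective A(r) = 2πr² + 8/r reproduces the result:

% Numerical check of Example 5.3
A = @(r) 2*pi*r.^2 + 8./r;
[ropt, Aopt] = fminbnd(A, 0.1, 3);    % search 0.1 <= r <= 3 m
hopt = 4/(pi*ropt^2);                 % height from the volume constraint
fprintf('r+ = %.3f m, h+ = %.3f m, A+ = %.2f m^2\n', ropt, hopt, Aopt);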
Example 5.4 Solve the storage tank problem of Example 5.3 as a constrained optimization problem: minimize A = 2πr² + 2πrh subject to φ = πr²h − 4 = 0.

Solution
The Lagrange multiplier equations for this problem are

∂A/∂r − λ ∂φ/∂r = 0 (5.38)
∂A/∂h − λ ∂φ/∂h = 0 (5.39)
φ = 0 (5.40)

These reduce to

4r + 2h = 2λrh (5.44)
λ = 2/r (5.45)
πr²h = 4 (5.46)

4r + 2h = 2(2/r)rh (5.47)
4r = 2h (5.48)
h = 2r (5.49)
πr² · 2r = 4 (5.50)
r+ = 0.86 m (5.51)
h+ = 2r+ = 1.72 m (5.52)
λ = 2.335 m⁻¹ (5.53)
Now we have the optimum values r+, h+, A+ along with a new parameter λ, which has the units m⁻¹. We have to now interpret what this λ means. In order to do this, let us undertake a small exercise. Suppose we change the volume of the tank from 4000 to 4500 l; we want to see what happens to the solution. For this case, it can be shown that r+ = 0.89 m. The other quantities of interest (for V = 4500 l) are shown below.

r+ = 0.89 m, h+ = 1.79 m, A+ = 15.09 m² (5.54)

We evaluate the ratio of the change in area to the change in volume. When the volume changes from 4000 to 4500 l (i.e., by 0.5 m³), the surface area changes from 13.94 to 15.09 m².

ΔA/ΔV = (15.09 − 13.94)/0.5 = 2.3 m⁻¹ (5.55)

What is the value of λ here? 2.3. What is ΔA/ΔV? 2.3. So λ is nothing but the change in the objective function with respect to a change in the constraint; λ is called the shadow price. So λ is the Lagrange multiplier, or the sensitivity coefficient, or the shadow price. If we relax the constraint from 4000 to 4500 l, how much additional area results? What is the sensitivity? The answer to these questions is λ! In fact, it comes from the governing equation itself.
∇y − λ∇φ = 0 (5.56)

λ = ∇y/∇φ ≈ Δy+/Δφ+ (will be shown in due course) (5.57)
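The interpretation of λ can also be checked numerically. The sketch below (an addition for illustration) uses the optimal radius r+ = (V/(2π))^(1/3), which follows from setting dA/dr = 0 with A = 2πr² + 2V/r, and compares ΔA/ΔV with λ:

% Shadow price check for the tank problem
Aopt = @(V) 2*pi*(V/(2*pi))^(2/3) + 2*V/(V/(2*pi))^(1/3);  % A at the optimal r
dAdV = (Aopt(4.5) - Aopt(4.0))/0.5;   % volumes in m^3 (4000 and 4500 l)
fprintf('dA/dV = %.3f m^-1, close to lambda = 2.335 m^-1\n', dAdV);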
Example 5.5 Determine the shortest distance from the point (0, 1) to the parabola x² = 4y by (a) eliminating x and (b) the Lagrange multiplier technique. Explain why approach (a) fails to solve the problem while approach (b) does not. (This problem is adapted from Engineering Optimization by Ravindran et al. 2006.)

Solution
A plot of the parabola makes the shortest distance from the point (0, 1) visually obvious; the Lagrange multiplier method should give the same answer. We will start with approach (b). The first step is to minimize z given by
Minimize z = [(x − 0)² + (y − 1)²]^(1/2) (5.58)

It is convenient to work with the squared distance

R = z² = x² + (y − 1)² (5.59)
subject to: φ = x² − 4y = 0 (5.60)

∂R/∂x − λ ∂φ/∂x = 0 (5.61)
∂R/∂y − λ ∂φ/∂y = 0 (5.62)
φ = 0 (5.63)

x, y and λ are the 3 unknowns and there are 3 equations, so it is possible for us to solve the 3 equations to determine the 3 unknowns.

2x − 2λx = 0 (5.64)
2(y − 1) + 4λ = 0 (5.65)
φ = x² − 4y = 0 (5.66)

From Eq. 5.64, either x = 0 or λ = 1. Taking λ = 1,

2(y − 1) + 4 = 0 (5.67)
(y − 1) = −2 (5.68)
y = −1 (5.69)

But then x² = 4y = −4, which is impossible, so this branch is rejected; hence x = 0 and, from the constraint, y = 0, giving z+ = 1.

Now consider approach (a), eliminating x:

z² = x² + (y − 1)² (5.70)
x² = 4y (5.71)
z² = 4y + y² − 2y + 1 (5.72)
z² = (y + 1)² (5.73)
z = y + 1 (5.74)
We are stuck here and cannot proceed further with the solution.
The Lagrange method, without the intervention of the analyst, leads us to the solution automatically, because when we solve Eq. 5.64, x = 0 emerges as a distinct possibility. In method (a), on the other hand, we have to supply additional arguments to get the answer. So method (a) is quite inferior to method (b): z = y + 1 is correct, but by itself it does not tell us the value of y+ that gives the minimum distance z+.
The methodology we would like to use for illustrating this is as follows. We first
take a two-variable, one-constraint problem. Using the regular Lagrange multiplier
method, we first obtain the solution to convince ourselves that we do not have a
fictitious problem in hand. Then using graph sheets, we plot what is required and try
to interpret from the solution of the Lagrange multiplier and the plot, if there is a
correlation between the two.
Example 5.6 Minimize y = 4x1 + 3x2 , subject to (x1 − 8)2 + (x2 − 6)2 = 25
Solution
The above problem could be minimization of some cost, subject to some criterion.
We solve it first as a constrained problem using the Lagrange multiplier method.
Using the Lagrange multiplier equations, we have

∂y/∂x1 − λ ∂φ/∂x1 = 0 (5.75)
∂y/∂x2 − λ ∂φ/∂x2 = 0 (5.76)
φ = (x1 − 8)² + (x2 − 6)² − 25 = 0 (5.77)

4 − 2λ(x1 − 8) = 0 (5.78)
3 − 2λ(x2 − 6) = 0 (5.79)
(x1 − 8)² + (x2 − 6)² = 25 (5.80)

2λ(x1 − 8) = 4 (5.81)
2λ(x2 − 6) = 3 (5.82)
(x1 − 8)/(x2 − 6) = 4/3 (5.83)
3x1 − 24 = 4x2 − 24 (5.84)
Furthermore, any point on the constraint is a valid solution to the problem because the constraint cannot be violated. If we move further, the constraint will again be violated. There may be other y = c lines that cut the constraint at one or more points, but none can have a value of y lower than the one obtained when y = c becomes a tangent to the constraint. When we are to the left of the constraint curve, we can get very low values of y, but the constraint (x1 − 8)² + (x2 − 6)² = 25 will not be met.
The final solution to this problem is x1+ = 4, x2+ = 3 and y+ = 25.
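As an aside (not in the original text), the same optimum can be verified with fmincon, expressing the circle as a nonlinear equality constraint:

% Numerical check of Example 5.6
fun     = @(x) 4*x(1) + 3*x(2);
nonlcon = @(x) deal([], (x(1)-8)^2 + (x(2)-6)^2 - 25);   % ceq = 0 on the circle
x = fmincon(fun, [0; 0], [], [], [], [], [], [], nonlcon);
% returns x1+ = 4, x2+ = 3, so that y+ = 25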
What does the Lagrange multiplier method do? The tangent to the constraint
equation at the optimal point and the iso-objective line are parallel to each other. An
alternative way of saying this is the gradient vectors will be parallel to each other. If
the gradient vectors are parallel to each other, we are not saying that they will have
the same magnitude. But what we are saying here is, the gradient vectors have to
be collinear. They can even be pointing in the opposite direction. If we want to say
that the tangent to this curve and the iso-objective line are parallel to each other, it is
analogous to saying that ∇y and ∇φ must be parallel.
However, their magnitudes need not be the same. This is possible only when
∇ y − λ∇φ = 0 (5.95)
or
∇ y + λ∇φ = 0 (5.96)
Therefore we say that ∇ y and ∇φ must be collinear vectors. But is that all?
The solution must lie on the constraint curve so that the constraint equation is also
satisfied. Therefore, the above condition must be satisfied in conjunction with φ = 0,
because the final solution we are seeking is a point on the constraint curve. Therefore,
if we come up with the above set of equations and solve this set of scalar equations
in conjunction with φ = 0, then we will get a solution to the original optimization
problem. This is the graphical interpretation of the Lagrange multiplier.
The important point to remember is that ∇y and ∇φ are collinear vectors. The Lagrange multiplier λ has to be involved because the magnitudes of ∇y and ∇φ need not be the same. λ is only a scalar, so it does not matter whether we use ∇y − λ∇φ = 0 or ∇y + λ∇φ = 0. Please remember that finally the solution must lie on the constraint.
dφ = (∂φ/∂x1) dx1 + (∂φ/∂x2) dx2 (5.99)

When we are seeking a solution to the optimization problem, the constraint has to be necessarily satisfied. For this, it must satisfy φ = 0; therefore dφ = 0.

dφ = 0 (5.100)
dφ = (∂φ/∂x1) dx1 + (∂φ/∂x2) dx2 = 0 (5.101)
∴ dx1 = −[(∂φ/∂x2) dx2] / (∂φ/∂x1) (5.102)

dy = (∂y/∂x1) dx1 + (∂y/∂x2) dx2 (5.103)

It is possible for us to substitute for dx1 in Eq. 5.103 from the expression obtained in Eq. 5.102. Substituting for dx1 from Eq. 5.102, we have

dy = [ −(∂y/∂x1)(∂φ/∂x2)/(∂φ/∂x1) + (∂y/∂x2) ] dx2 (5.104)

Let us say

λ = (∂y/∂x1)/(∂φ/∂x1) (5.105)

dy = [ (∂y/∂x2) − λ (∂φ/∂x2) ] dx2 (5.106)

At the optimum, dy must vanish for an arbitrary dx2, so that

∂y/∂x2 − λ ∂φ/∂x2 = 0 (5.107)

and, from the definition of λ,

∂y/∂x1 − λ ∂φ/∂x1 = 0 (5.108)

and finally

φ = 0 (5.109)

∴ ∇y − λ∇φ = 0 from Eqs. 5.107 and 5.108 (5.110)
These are the Lagrange multiplier equations stated at the beginning of the chapter.
λ = [∇y/∇φ]+ (5.111)
Therefore, λ is the ratio of the change in the objective function to the change in the
constraint. We saw this in an earlier example. This is the sensitivity coefficient. In
operations research, it is called the shadow price.
Let us look at a company that makes furniture. It makes only two types of products,
tables and chairs. So a certain amount of wood is required for making one chair and
a certain amount of wood is required for making one table. A certain amount of labor
is required for making a chair and a certain amount of labor is required for making
a table. The profit from a chair is C1 , while that from the table is C2 . Therefore,
the total profit will be C1 x1 + C2 x2 . We want to maximize the total profit subject to
the condition that there is a finite amount of labor and material available. So in this
problem, the two constraints are labor and material. If more wood or labor is made available to the furniture company, how will the objective function y, the profit, change? That rate of change is the shadow price: if we pump in more resources, what will the additional profit be? It is called a shadow price because it is not realized yet!
dy/dx = 0 (5.114)
d²y/dx² > 0: y is a minimum (5.115)
d²y/dx² < 0: y is a maximum (5.116)
d²y/dx² = 0: we have a saddle/inflection point (5.117)

When d²y/dx² = 0, it means the second-order test is insufficient or the function is varying very gently there, which may in fact be very good for us from an engineering viewpoint.
Mathematically, we would like to know precisely the value of x at which y becomes a maximum or a minimum. But such precision is sometimes dreaded by engineers, since the measurements of all variables are subject to errors. We would love to have an optimum of y that is not very sensitive to the values of x1, x2 ... xn at the optimum, a robust optimum so to speak. We need an objective function that gives us some breathing space: if the optimal value of y is 100, it should be possible for us to get between 95 and 100 for a reasonable range of the independent variables. So we will say that any solution that gives y as 95 or above is fine. Hence, sensitivity, or rather the lack of it, is very important for engineers.
When more than one variable is encountered, which is invariably the case in
optimization problems, we need to go in for detailed tests. Let us consider a two-
variable problem where y = f(x1 , x2 ). We seek the minimum of this function y. Let
the point (a1 , a2 ) be somewhere near the optimum or the optimum itself.
We expand y(x1, x2) around (a1, a2) using a Taylor series:

y(x1, x2) = y(a1, a2) + (∂y/∂x1)(x1 − a1) + (∂y/∂x2)(x2 − a2)
+ (1/2!)(∂²y/∂x1²)(x1 − a1)² + (1/2!)(∂²y/∂x2²)(x2 − a2)²
+ (∂²y/∂x1∂x2)(x1 − a1)(x2 − a2) + higher order terms (5.118)
We ignore the higher order terms assuming that they do not contribute significantly
to y(x1 , x2 ).
If ∂ y/∂x1 or ∂ y/∂x2 is a large value, we can simply move the point to a nearby
value from (a1 , a2 ) and increase the function or decrease the function depending
on whether we are seeking a maximum or a minimum. In that case (a1 , a2 ) will no
longer be a solution to the problem. Therefore, the first-order derivative becoming
zero is a mandatory condition.
For a minimum, if we move from (a1 , a2 ), any perturbation from (a1 , a2 ) should
result in a value of y which is more than that at (a1 , a2 ). Therefore, it is enough
for us to prove that the second-order terms on the RHS of Eq. 5.118 result in a
positive quantity for a minimum or alternatively we find out conditions such that the
second-order terms are positive.
The second-order terms, thus, will decide whether y(a1, a2) is a minimum. Let

a11 = ∂²y/∂x1² (5.119)
a22 = ∂²y/∂x2² (5.120)
a12 = ∂²y/∂x1∂x2 (5.121)
If this condition is satisfied, we get a minimum regardless of the values of x1 and x2. So how do we get the conditions? A very crude way is to keep changing x1 and x2 and find out: we take, say, 20 values of x1 and 20 values of x2 and write a program to see if the condition given in Eq. 5.122 is violated. This is very unimaginative and, apart from being an imperfect procedure, becomes impossible as the number of variables grows. Instead, completing the square with z1 = Δx1 + (a12/a11)Δx2 and z2 = Δx2, the condition becomes

a11 z1² + (a22 − a12²/a11) z2² > 0 (5.126)
If the inequality in Eq. 5.126 is to hold for all values of z1 and z2, the terms must individually be positive; that is, a11 and the term within the brackets have to be greater than 0:

∴ a11 > 0 and (a22 − a12²/a11) > 0 (5.127)
If

D = | a11  a12 |
    | a12  a22 | (5.128)

then

D = | ∂²y/∂x1²     ∂²y/∂x1∂x2 |
    | ∂²y/∂x1∂x2   ∂²y/∂x2²   | > 0, together with a11 > 0, means y is a minimum (5.129)
If the determinant D is greater than 0 and a11 > 0, the Hessian H is called a positive definite matrix and y is a minimum. If H is negative definite, where D > 0 and a11 < 0, y is a maximum. If H is indefinite, then y has a saddle point or an inflection point.
What will happen if D = 0? The solution becomes a critical point where the Hessian test is inconclusive. This very rarely happens: we can set up mathematical equations such that D = 0, but in most engineering problems this will not occur.
In fact, in most engineering problems, without doing the test for the Hessian matrix,
we will be in a position to decide whether the resulting optimum is a maximum or
a minimum. However, we can use this test and be sure that the final extremum we
have obtained is really a maximum or a minimum.
When the Hessian matrix of y is positive definite or positive semi-definite for
all values of x1 . . . xn , then the function f is called a convex function. If the Hessian
is positive definite, then y is said to be strictly convex and has a unique minimum.
By the same token, a function y is a concave function if and only if -y is a convex
function (needless to say over the same range of each of the variables x1 . . . xn ).
Mathematically, a function of n variables y(X), where X = (x1, x2 ... xn), defined on a convex set R is said to be convex if and only if, for any two points X1 and X2 ∈ R and 0 ≤ γ ≤ 1,

y(γX1 + (1 − γ)X2) ≤ γ y(X1) + (1 − γ) y(X2) (5.131)

Equation 5.131 tells us that the weighted average of the function at points X1 and X2 (RHS of Eq. 5.131) will always be equal to or more than the value of the function evaluated at the weighted average of X1 and X2 themselves (LHS of Eq. 5.131).
However, this is often cumbersome to test in a multivariable problem. The Hessian
test may be more useful in such cases.
There is one more way of looking at it. We have the Hessian matrix of the partial
derivatives of the second order. If all the eigenvalues are positive, we have a positive-
definite matrix. If all the eigenvalues are negative, we have a negative definite matrix.
If some values are positive and some negative, the matrix is indefinite.
Example 5.7 Minimize y = f (x1 , x2 ) = (x1 − 8)2 + (x2 − 6)2 using Lagrange
multipliers. Check for minimum using the Hessian matrix.
Solution
This is an unconstrained optimization problem, which we have already solved. It is a straightforward problem, where y equals the square of the radius of a circle centered at (8, 6).

∂y/∂x1 = 2(x1 − 8) = 0; x1 = 8 (5.132)
∂y/∂x2 = 2(x2 − 6) = 0; x2 = 6 (5.133)

∂²y/∂x1² = 2 (5.134)
∂²y/∂x2² = 2 (5.135)
∂²y/∂x1∂x2 = 0 (5.136)

D = | ∂²y/∂x1²     ∂²y/∂x1∂x2 |
    | ∂²y/∂x1∂x2   ∂²y/∂x2²   | (5.137)

D = | 2  0 |
    | 0  2 | (5.138)

a11 = ∂²y/∂x1² > 0 (5.139)
D > 0 (5.140)

Hence the solution (8, 6) is indeed a minimum.
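A two-line check with the Symbolic Math Toolbox (an illustrative addition) computes the Hessian and its eigenvalues directly:

% Hessian test for Example 5.7
syms x1 x2
y = (x1 - 8)^2 + (x2 - 6)^2;
H = hessian(y, [x1, x2]);   % gives [2 0; 0 2]
eig(double(H))              % both eigenvalues positive, so y is a minimum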
Example 5.8 Revisit the cylindrical solar water heater storage problem (see
Example 5.3). Minimize A = 2πr 2 + 2πr h subject to φ = πr 2 h − 4 = 0.
Establish that the solution is a minimum by evaluating the Hessian matrix.
Solution
The Lagrange multiplier equations are

∂A/∂r − λ ∂φ/∂r = 0 (5.141)
∂A/∂h − λ ∂φ/∂h = 0 (5.142)
φ = 0 (5.143)

A = 2πr² + 2πrh (5.144)
∂A/∂r = 4πr + 2πh, ∂²A/∂r² = 4π (5.145)
∂A/∂h = 2πr, ∂²A/∂h² = 0, ∂²A/∂h∂r = 2π (5.146)
H = [ ∂²A/∂r²    ∂²A/∂r∂h ]
    [ ∂²A/∂r∂h   ∂²A/∂h²  ] (5.147)

H = [ 4π  2π ]
    [ 2π  0  ] (5.148)
So we find that D is negative while a11 is positive. Hence, the Hessian test is inconclusive: as we said, D has to be positive for us to conclude whether the point is a minimum or a maximum. Here a22 = 0 because ∂A/∂h = 2πr does not vary with h, so the corresponding second derivative is already zero and D = 4π·0 − (2π)² = −4π² < 0.
But when we considered this as a single-variable problem, we established the point to be a minimum. Alternatively, we can write a program and, for all combinations of r and h with a precision of 10⁻⁵ m, confirm that we cannot get a solution with an area smaller than the one obtained earlier. So we should not get carried away by the Hessian test: sometimes what common sense tells us may not be revealed by the Hessian!
It now becomes apparent that, for a constrained problem, the Hessian needs to be evaluated on a new quantity L, the Lagrangian, defined as L = Y − λφ. We will see if the Hessian test is conclusive for L. In this case L = A − λφ = 2πr² + 2πrh − λ(πr²h − 4). Evaluating the Hessian of the Lagrangian, we have

H_L = [ 4π − 2πλh   2π − 2πλr ]
      [ 2π − 2πλr   0         ]
Example 5.9 This is a problem from fluid mechanics: flow in a duct network is to be optimized using Lagrange multipliers. We have a circular duct whose diameter takes the 3 values d1, d2, and d3 along its length, with air flowing in.
Determine the diameters d1, d2, and d3 of the circular duct shown below such that the static pressure drop between the inlet and the outlet is a minimum. The total quantity of sheet metal available is 120 m², and the Darcy friction factors for pipes 1, 2, and 3 are to be calculated from the relation f = 0.184 Re_D^(−0.2). The density of air is constant at 1.18 kg/m³, and the kinematic viscosity of air may be assumed to be 15 × 10⁻⁶ m²/s. Use the method of Lagrange multipliers.
Solution
Formulation of the optimization problem:

ΔP1 = ρg f1 L1 v1²/(2g d1) (5.151)
ΔP2 = ρg f2 L2 v2²/(2g d2) (5.152)
ΔP3 = ρg f3 L3 v3²/(2g d3) (5.153)

m1 = 4 = ρA1v1 = (ρπd1²/4) v1 (5.154)
m2 = 2.5 = ρA2v2 = (ρπd2²/4) v2 (5.155)
m3 = 1 = ρA3v3 = (ρπd3²/4) v3 (5.156)

d1²v1 = 4.32 (5.157)
d2²v2 = 2.7 (5.158)
d3²v3 = 1.08 (5.159)
Now we are able to see that there is considerable effort involved in formulating the optimization problem; this is what is normally encountered in engineering.

ΔP1 = ρg f1 L1 v1²/(2g d1) (5.160)

ΔP1 = 0.184 [4m1/(πμd1)]^(−0.2) L1 ρ (4.32/d1²)² / (2d1) (5.161)

ΔP1 = 0.971 d1^(−4.8) (5.162)
ΔP2 = 0.564 d2^(−4.8) (5.163)
ΔP3 = 0.135 d3^(−4.8) (5.164)

Min y = ΔP1 + ΔP2 + ΔP3 (5.165)
So we have not yet solved the problem, but just formulated it!! Now we have to use
the Lagrange multiplier method.
∂y/∂d1 − λ ∂φ/∂d1 = 0 (5.170)
∂y/∂d2 − λ ∂φ/∂d2 = 0 (5.171)
∂y/∂d3 − λ ∂φ/∂d3 = 0 (5.172)
φ = 0 (5.173)

There are 4 unknowns here. The resulting Lagrange equations are very simple and solvable. λ is unrestricted in sign; the only condition is that ∇y and ∇φ must be collinear. If we get d1, d2 and d3 in terms of λ and substitute in the equation φ = 0, we are done.

0.971(−4.8) d1^(−5.8) − 3λ = 0 (5.174)
0.564(−4.8) d2^(−5.8) − 4λ = 0 (5.175)
0.135(−4.8) d3^(−5.8) − 5λ = 0 (5.176)

d1 = 1.07(−λ)^(−0.172) (5.177)
d2 = 0.93(−λ)^(−0.172) (5.178)
d3 = 0.68(−λ)^(−0.172) (5.179)

(λ here works out to be negative, so −λ > 0 and the diameters are positive.)

φ = 0 (5.180)
φ = 3d1 + 4d2 + 5d3 − 19.1 = 0 (5.181)
[ 0.2743   0        0      ]
[ 0        0.4221   0      ]   (5.189)
[ 0        0        0.7017 ]
clear;
clc;

syms x1 x2 x3 x4

% Objective function
y = 0.971*(x1^(-4.8)) + 0.564*(x2^(-4.8)) + 0.135*(x3^(-4.8));

% Constraint
f = 3*x1 + 4*x2 + 5*x3 - 19.1;

% ... (lines omitted in this excerpt: the gradients of y and f are formed,
% the Lagrange equations are solved for x1, x2, x3 and the multiplier x4,
% and the second derivatives dydx12, dydx22, dydx32 are computed)

% Hessian matrix
H = [double(subs(dydx12, {x1,x2,x3}, {x(1),x(2),x(3)})) 0 0;
     0 double(subs(dydx22, {x1,x2,x3}, {x(1),x(2),x(3)})) 0;
     0 0 double(subs(dydx32, {x1,x2,x3}, {x(1),x(2),x(3)}))];
the solution. But there is no way, upfront or a priori, to know whether a constraint is active or binding. This problem was thought about by Kuhn and Tucker, and they finally came out with the Kuhn–Tucker Conditions (KTCs).¹
Let us consider a nonlinear optimization problem involving n variables, m equality constraints, and k inequality constraints.
Now we write the KTCs applicable to this problem. These are given by

∇y − Σ_{i=1..m} λi ∇φi − Σ_{r=1..k} ur ∇ψr = 0 (5.198)

φi = 0 for all i ≤ m (5.199)
ψr ≥ 0 for all r (5.200)
Please note that the first two terms of Eq. 5.198 are the same as in the Lagrange multiplier method: they handle the objective function y and all the equality constraints φi = 0 for i = 1 to m.
The last term accounts for the inequality constraints ψr, while u is similar to the Lagrange multiplier; it is a sensitivity coefficient whose nature is not yet known to us.
Now for the new conditions.
¹A little history: Professor Tucker is no longer alive. He was the Ph.D. advisor of Prof. John Nash, who won the Nobel Prize in Economics in 1994 and was the subject of the movie "A Beautiful Mind". Prof. Kuhn, a contemporary of Prof. Nash at Princeton University, was the mathematics consultant for the movie; he was largely responsible for nominating John Nash for the Nobel Prize and for getting the movie made. These professors did a lot of pioneering work in operations research and game theory and are particularly known for work on the Prisoner's Dilemma, the Hungarian method for the assignment problem, and the traveling salesman problem in operations research.
Minimize y = (x1 − 8)² + (x2 − 6)²
Subject to
ψ = x1 + x2 > 9 (5.204)
Solution
The above problem is similar to Example 5.1. However, we now have given an
additional constraint that x1 + x2 > 9. This is an inequality constraint and hence we
want to use the KTC. The first step is to assume it as an active constraint and see
whether u is positive or negative.
∂y/∂x1 − u ∂ψ/∂x1 = 0 (5.205)
∂y/∂x2 − u ∂ψ/∂x2 = 0 (5.206)
x1 + x2 = 9 (5.207)
ψ = 0 (5.208)
u≥0 (5.209)
2(x1 − 8) − u = 0 (5.210)
2(x2 − 6) − u = 0 (5.211)
x1 = 5.5, x2 = 3.5 (5.212)
From this, we get u = −5 and y+ = 12.5. Since u is negative, the original assumption that ψ is an active constraint is incorrect, and ψ does not affect the solution of this problem. The same can be seen graphically in Fig. 5.7: the unconstrained solution y = 0 at x1 = 8 and x2 = 6 does not violate x1 + x2 > 9. The KTCs helped us to identify that x1 + x2 > 9 is not an active constraint.
Now we rework the problem for x1 + x2 > 18.
∂y/∂x1 − u ∂ψ/∂x1 = 0 (5.215)
∂y/∂x2 − u ∂ψ/∂x2 = 0 (5.216)
x1 + x2 = 18 (5.217)
ψ = 0, u ≥ 0 (5.218)
2(x1 − 8) − u = 0 (5.219)
2(x2 − 6) − u = 0 (5.220)
x1 = 10, x2 = 8 (5.221)
ψ = 0, u = +4 (5.222)
Since u has now become positive, ψ is a binding or active constraint and has to be treated as an equality constraint; y+ = 8.
If we disregard the constraint, the solution is x1+ = 8 and x2+ = 6. Substituting 8 and 6 into the constraint x1 + x2 > 18, we see that it is violated: ignoring the constraint would lead to an erroneous solution!
clear;
clc;

syms x1 x2 x3

% Objective function
y = ((x1-8)^2) + ((x2-6)^2);

% Constraint psi = x1 + x2 > 9, assumed active and written as f = 0
f = x1 + x2 - 9;

% ... (lines omitted in this excerpt: dydx1, dydx2, dfdx1, dfdx2 are the
% partial derivatives of y and f; x3 plays the role of the multiplier u)
eqns(x1,x2,x3) = [dydx1 - (x3*(dfdx1)), dydx2 - (x3*(dfdx2)), f];

% Initial guess
x0 = [10, 10, 10];

% ... (omitted: the system eqns = 0 is solved and the result stored in x)

prt1 = ['u = ', num2str(x(3))];
disp(prt1);

if x(3) < 0   % reconstructed branch: u negative means the constraint is inactive

    % Initial guess
    x0_new = [10, 10];

    % ... (omitted: the unconstrained problem is solved and y_value computed)

    % Print
    prt = ['x1 = ', num2str(x(1)), ...
           ', x2 = ', num2str(x(2)), ...
           ', y = ', num2str(y_value)];
    disp(prt)

else

    prt1 = ['u = ', num2str(x(3))];
    disp(prt1);

    fprintf('Since u is positive, constraint is binding \n');

    % ... (omitted: y_value is evaluated at the constrained optimum)

    % Print
    prt = ['x1 = ', num2str(x(1)), ...
           ', x2 = ', num2str(x(2)), ...
           ', y = ', num2str(y_value)];
    disp(prt)

end
u ≈ Δy/Δψ (5.225)

ψ = x1 + x2 − 18, ψ ≥ 0 (5.226)

If we relax the active constraint by one unit, i.e., ψ ≥ 1, and solve again, we get the optimum as (10.5, 8.5). When ψ ≥ 0, yold = 8; when ψ ≥ 1, ynew = 12.5.

u ≈ Δy/Δψ = (ynew − yold)/1 (5.227)
∴ u = (12.5 − 8)/1 = 4.5 (5.228)

When ψ was an inactive or nonbinding constraint, the value of u was 0, while when it is an active constraint, u is positive. There is no other possibility for u. Therefore, u (you!) should be positive, always!
Problems
5.1 Determine the shortest distance from the point (5, 1) to the ellipse x²/9 + y²/4 = 1 by employing the method of Lagrange multipliers. Establish that the solution is a minimum.
5.2 Consider the following minimization problem.
Solve this using the Lagrange multipliers method. Obtain the values of the two
Lagrange multipliers. Confirm (mathematically) that the solution obtained is
indeed a minimum.
5.3 Rubber O-rings as shown in Fig. 5.10 are to be made for sealing a giant pressure
vessel used for an industrial application. The total quantity of molten rubber
available is 10 m³, from which two O-rings need to be molded. The mean diameters of the rings are 1 m and 2 m, respectively. Find the optimum values of d1 and d2 for maximum total surface area.
5.4 A circular orifice in a tank has a radius r1 = 0.25 m. Flow from the orifice is
to be accelerated by means of a convergent nozzle as shown in Fig. 5.11. The nozzle, shaped like the frustum of a cone, is to be made out of a stainless steel sheet.
A total of 0.5 m2 sheet is available. Find the maximum volume of the nozzle that
is achievable.
5.5 There are two electrical generators G1 and G2, whose power outputs are p1 and p2 MW, respectively. The generators are connected to a load line such that

p1 + p2 = 800 MW

The costs of producing power from the generators are given by the following equations:

C1 = a1p1² + b1p1 + c1
C2 = a2p2² + b2p2 + c2

Determine the optimum values of p1 and p2 at which the total cost is minimum.
5.6 If in the previous problem, the cost functions are given by
determine p1 and p2 using the solution obtained to the previous problem. Deter-
mine the value of λ and comment on its significance.
5.7 A shell-and-tube heat exchanger (shown in Fig. 5.12) is to be designed for the minimum total cost. The shell diameter D, the length of the tubes L (which is also, approximately, the length of the shell), and the number of tubes "n" have to be chosen to minimize the total cost. The tubes are all 1 inch (d = 0.025 m) in diameter and have a single pass. The cost of the shell (in lakhs of rupees) is given by 50 D^1.5 L^1.25. The cost of each tube is 0.4 L^0.5, again in lakhs.
A few constraints in the problem are as follows:
(a) The total tube surface area needs to be 47 m2 .
(b) The packing density of the tubes (total tube volume/shell volume) shall not
exceed 50% to allow for shell-side fluid movement.
(c) The length of any tube (all tubes are of the same length) shall not exceed
10 m for ease of maintenance and replacement (Note: D and L are in m)
Set up the optimization problem for minimizing the total cost of the exchanger
and solve it as a constrained optimization problem using the Lagrange multiplier
method to determine the optimal solution. You may use constraint (b) as an
equality to reduce the number of variables (n, D, and L) to 2. This way you can
solve this as a two-variable, one-constraint optimization problem by substituting
for one of the variables from either constraint (a) or (b) into the objective function.
Furthermore, you may ignore constraint (c) and check if it is violated after
obtaining the optimum.
5.8 Establish the Kuhn–Tucker Conditions (KTC) for the following minimization
problem and solve it.
5.9 (a) Write down the Kuhn–Tucker conditions for the following nonlinear opti-
mization problem (NLP).
(b) Solve the above problem and check whether the inequality constraint is
binding.
Reference
Ravindran, A., And, Ragsdell K. M., & Reklaitis, G. V. (2006). Engineering Optimization- Methods
and Applications. New York, USA: Wiley.
Chapter 6
Search Methods
6.1 Introduction
Example 6.1 Revisit the solar water heater storage problem (see Example 5.3) of minimizing the heat loss from a cylindrical hot water storage tank. Treating it as a single-variable unconstrained optimization problem in the radius r, with an initial interval of uncertainty 0.5 ≤ r ≤ 3.5 m, solve the problem by searching with a uniform step size of 0.5 m, evaluating A(r) at 7 points.

Solution:
Minimize A = 2πr² + 2πrh subject to V = πr²h = 4 m³; substituting h = 4/(πr²) gives A(r) = 2πr² + 8/r.
We now tabulate A(r) for 7 values of r, as indicated in Table 6.1. The graphical variation is depicted in Fig. 6.1.
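The tabulation itself is only a few lines in MATLAB (an illustrative addition):

% Equal-interval exhaustive search table for Example 6.1
r = 0.5:0.5:3.5;              % 7 evaluation points with a step of 0.5 m
A = 2*pi*r.^2 + 8./r;         % area after eliminating h
disp([r' A'])                 % A dips between r = 0.5 and r = 1.5 m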
From both Table 6.1 and Fig. 6.1, we are able to see that the solution lies between
r = 0.5 and r = 1.5 m. We do not know whether the minimum is reached between
0.5 and 1 or 1 and 1.5. However, we know that it takes a turn between 0.5 and 1.5 m.
What did we achieve? We made 7 function evaluations and reduced the interval of uncertainty from 3 m (0.5–3.5 m) to 1 m (0.5–1.5 m). That is not great, but not very bad either. What we have done is called the equal-interval exhaustive search method. The original interval of uncertainty, written a ≤ r ≤ b, is b − a = 3.5 − 0.5 = 3 m.
Number of intermediate points = m (for this problem, m = 5).
Interval spacing = (b − a)/(m + 1) (in this problem, 0.5 m).
The final interval of uncertainty = 2(b − a)/(m + 1) (in this problem, 1 m).
The ratio of the original interval of uncertainty to the final interval of uncertainty, RR, is known as the reduction ratio. The RR in the current example is 3/1 = 3. The RR tells us by how much the original interval of uncertainty has been reduced. It is a performance metric, like mileage for a car or CGPA for a student's academic performance, and it has to be read in conjunction with the number of observations. For the equal-interval exhaustive search performed above, with m intermediate points, the reduction ratio is (m + 1)/2. It is therefore
also reasonable to assume that scientists would have developed more advanced methods where, for the same number of observations, we get a reduction ratio RR far superior to what could be achieved by the exhaustive search method. The exhaustive search method is highly unimaginative, to say the least. But if we do not know anything about the nature of the function, we can use it for starters: we can use the exhaustive search method to bracket the optimum between r = 0.5 m and r = 1.5 m, as we have done now, and then switch over to more sophisticated methods to quickly reach the optimum value.
What is the relationship between RR and the number of observations or function evaluations? We say observations because they could also be experimental. Here, the objective function A is just a formula, but in reality A could be the output of a CFD program, the result of experiments, or data from elsewhere. Therefore we are interested in the number of observations. If n is the number of observations (n = 7 here), then n = m + 2, where m is the number of intermediate points.
Can we solve the same problem in a slightly smarter fashion without going for a sophisticated algorithm? We can take 3 points at a time, starting from the left. We have r = 0.5, 1, 1.5, 2, 2.5, 3, and 3.5. Let us take the first 3 points 0.5, 1, and 1.5. If f(x1) > f(x2) and f(x2) < f(x3), then we are home: the solution lies between x1 and x3. It is fortuitous that for this problem we got the solution at one end of the interval.
If the above condition is not satisfied, we make x1 = x2, x2 = x3, and x3 = x4. We keep doing this till we get to the right end of the interval. If we reach the other end point and still do not get an optimum, either the function does not have an optimum or the optimum lies at one of the two boundaries.
On average, if the solution is likely to be around the middle, it may take only half the number of observations. The RR of (m + 1)/2, which we saw a little while ago, is the worst-case scenario for the exhaustive search.
Algorithm for the alternative approach to Example 6.1, with 3 points taken at a time (a MATLAB sketch of these steps follows the list):

1. Δr = (b − a)/(m + 1), where m is the number of intermediate points
2. r1 = a, r2 = r1 + Δr, r3 = r2 + Δr
3. If A(r1) ≥ A(r2) ≤ A(r3), the optimum lies in the range r1 ≤ r ≤ r3; stop
4. Else r1 = r2; r2 = r3; r3 = r2 + Δr; proceed to step 3
5. If r3 = b and the stopping criterion is not satisfied, the optimum does not lie in (a, b) or may lie at a boundary.
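A minimal MATLAB sketch of this three-point marching search, using A(r) from Example 6.1 (an addition, not the book's listing):

% Three-point marching search on A(r) = 2*pi*r^2 + 8/r
A  = @(r) 2*pi*r.^2 + 8./r;
a = 0.5; b = 3.5; m = 5;
dr = (b - a)/(m + 1);                  % step 1: interval spacing
r1 = a; r2 = r1 + dr; r3 = r2 + dr;    % step 2
while r3 <= b
    if A(r1) >= A(r2) && A(r2) <= A(r3)    % step 3: optimum bracketed
        fprintf('Optimum lies in %.2f <= r <= %.2f m\n', r1, r3);
        break
    end
    r1 = r2; r2 = r3; r3 = r2 + dr;        % step 4: march to the right
end
if r3 > b
    fprintf('No interior optimum found in (a, b)\n');   % step 5
end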
First, we need to look at the nature of the objective function. Some important definitions, such as a monotonic function, a unimodal function, and the concepts of global minimum and local minimum, need to be fleshed out. These are best understood in relation to a single-variable problem. Once we understand them with respect to a single-variable problem, we can extrapolate these definitions to multivariable problems without much difficulty.
Solution:
The function is shown in Fig. 6.4a, and its derivative is shown in Fig. 6.4b. This
function (y = |x|) exhibits no discontinuity. However, dy/dx is discontinuous at x =
0 and so y = |x| is not differentiable at x = 0. Yet, y = |x| is a unimodal function.
Suppose a function is multi-modal, as in Fig. 6.5; we then have to divide the domain into intervals and seek the unimodal optimum within a particular interval. Far away from a local optimum x++, y can take a value lower than the value of y at x++. A typical multi-modal function with the global optimum at x4 is given in Fig. 6.5.
A function f (x) on a domain R is said to attain a local minimum at a point
x + ∈ R if and only if f (x + ) ≤ f (x) for all x which lie within a reasonable distance
from x + .
If we locate a local optimum and perturb the independent variables around that,
we will not get a value of “f” that is better than this. However, it is not global because
if we go far away from this x + , there could be other values of x where f (x) will be
significantly lower than this. This is the basic difference between a global minimum
and a local minimum.
Among the 6 points marked in Fig. 6.5, 5 (x1 …x6 , sans x4 ) are local minima. x4
is the global minimum here. There are 3 important points one needs to remember.
1. For a unimodal function, the local minimum and the global minimum coincide.
2. For multi-modal objective functions, several local minima exist. We need to eval-
uate y at all these local optima and select the lowest value of y, which is the global
minima. Hence, it is much harder to solve a multi-modal function.
3. The definition for local and global minimum can be modified for a maximization
problem.
Algorithms have been specifically developed to handle multi-modal functions that
can finally give us the global minima. The problem with Lagrange multipliers or any
calculus-based technique is that if we start from the left of the function that is shown
in Fig. 6.5, the first minimum is determined as the solution. Even if we go on either
side of this, the algorithm will say that this is indeed the solution.
Now, it is possible for us to come out with a rectangular uniform grid like the one shown in Fig. 6.7, where Δx1 = Δx2. We start from a point, say P.
For a two-dimensional problem, where y = f (x1 , x2 ), we evaluate y at P and at
its 8 neighboring nodes. So in total, we do 9 function evaluations. Depending on
where y is lowest, we can assign a new P. So if this is NE, say, then the NE point
becomes new P and we take 8 points around it and proceed further.
Conceptually, it is quite a different story compared to the elimination of the interval
by using the exhaustive search method. If the optimum can be considered as a summit
or a hilltop by adopting the strategy discussed above, we are systematically climbing
toward it. So this is called a hill climbing technique. We are not cutting away portions of the original interval; we just start with one guess value and keep moving around it till the minimum of y is reached. What we saw above is known as
the Lattice method. If we do not have time and money to invest in optimization and
want a quick solution, instead of an exhaustive search, one can go for this. Without
using calculus, we just keep getting y. If getting y is difficult, we just set up a neural
network with 10 values of x1 , 10 values of x2 and corresponding 100 values of y such
that when the network is supplied with any value of (x1 , x2 ), it will automatically
give us the values of y. This may be called a neural-network-based lattice search
method for solving a two variable optimization problem. In sum, conceptually, we
have 2 approaches: regional elimination methods, and the hill climbing methods. We
can now come out with a broad framework for solving optimization problems using
search methods (refer to Fig. 6.8). We can superpose the nature of an optimization
problem (single or multivariable, constrained or unconstrained) with the type of
search method and several possibilities emerge, some of which are shown in Fig. 6.8.
However, it is instructive to mention that some of these techniques will use cal-
culus. We may use calculus to find out the best direction to proceed. Yet, there is
a difference in using the search method for finding the optimum solution. We start
with a guess point. What is the best movement from position 1 to position 2? For
that can we take the help of calculus. We try to answer this question. However, we
are not taking the help of calculus to make the whole function stationary like we did
in the Lagrange multiplier method.
186 6 Search Methods
2. For the case shown in Fig. 6.9b, f (x1 ) < f (x2 ), the region to the right of x2 can
be eliminated.
3. For Case 3 shown in Fig. 6.9c, f (x1 ) = f (x2 ), the regions to the left of x1 , and
right of x2 can be eliminated.
Hence for a unimodal function f (x) in the closed interval a ≤ x ≤ b, for x1 < x2 :
1. If f(x1) > f(x2), the minimum does not lie in (a, x1).
2. If f(x1) < f(x2), the minimum does not lie in (x2, b).
3. If f(x1) = f(x2), the minimum lies in between (x1, x2). This case is very rare in practice; numerically, exact equality is very difficult to achieve except for simple functions. Even so, it is a mathematical possibility.
We can use these rules to develop algorithms for solving the optimization problem
at hand.
n      Dichotomous RR    Exhaustive RR
8      16                3.5
16     256               7.5
If n is very small, we do not see much difference between the RRs of the two methods. But as n increases, either because we want increased accuracy or because the function is so complicated that it is not easy to figure out the optimum, it can be seen that the reduction ratio of the dichotomous search method is far superior to that of the exhaustive search.
Invariably, we want to solve an optimization problem to a desired accuracy, rather than saying upfront that we want to do 30 evaluations, 64 evaluations, and so on. So we start from the last step, the desired final interval, and work backwards:

RR = I0/In   (6.3)

RR is also given by

RR = f(n)   (6.4)

where n is the number of evaluations. From the required RR, we can evaluate n. So, upfront, when we use a search technique, we know how many function evaluations are required.
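As a quick illustration, the sketch below inverts this relation for the dichotomous search, for which (with ε → 0) RR = 2^(n/2), consistent with the table above; the target RR is an assumed value for demonstration.

RRtarget = 150;                    % assumed desired I0/In
n = 2*ceil(log2(RRtarget));        % evaluations needed, rounded up to an even number
fprintf('n = %d evaluations give RR = %g\n', n, 2^(n/2))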
Example 6.3 Consider the cylindrical solar water heater storage problem (see
Example 5.3). Minimize A. Use dichotomous search and perform 8 evalua-
tions.
Solution:
A = 2πr² + 2πrh; V = πr²h = 4 m³; 0.5 ≤ r ≤ 3.5 m; we take ε = 0.02 m.
Second iteration:
The RR is 15 and not 16 because ε is not 0. If we make ε very small, say 0.00001, the RR would have been 15.8 or so.
clear;
clc;
a = 0.5;          % lower limit of radius (m)
b = 3.5;          % upper limit of radius (m)
eps = 0.02;       % offset between the two test points (m)
countmax = 8;     % total number of function evaluations (even number)

for i = 1:countmax/2

    I = b - a;                       % current interval of uncertainty
    r1 = a + I/2 - eps/2;            % test point to the left of center
    r2 = a + I/2 + eps/2;            % test point to the right of center
    % surface area at the two points, with h = 4/(pi*r^2) from the volume
    A1 = 2*pi*r1^2 + 2*pi*r1*(4/(pi*r1^2));
    A2 = 2*pi*r2^2 + 2*pi*r2*(4/(pi*r2^2));
    if A1 > A2
        a = r1;
    else
        b = r2;
    end

    % Print
    prt = ['Itr = ', num2str(i), ...
           ', a = ', num2str(a), ...
           ', b = ', num2str(b)];
    disp(prt)
end
A natural question that arises is: why so much fuss about having a sophisticated technique for a single variable problem, while most optimization problems are multivariable ones? The answer is that each of these problems can be broken down into single variable problems. We can keep all variables except one at some constant value in an iteration and then solve the resulting optimization problem using the most powerful single variable search algorithm for that variable. We then change the variable under consideration and continue to use the best optimization technique available to us. After we finish one round of iterations over all the variables, we go to the next round. Hence, it makes eminent sense to research more efficient single variable searches.
Can one increase the RR beyond this? Intuitively, the answer is no, as it appears that the best one can do is cut the interval by 50%; anything less than that is suboptimal. So if there is a technique that claims a reduction ratio superior to the dichotomous search, what should support that claim? What is the logic that makes such a proposition possible? We do not allow a three point test, since each function evaluation comes with a cost, the cost of computational time. So if somebody claims to have come up with an algorithm that gives a better reduction ratio than the dichotomous search, is it similar to violating the Kelvin–Planck statement of the second law of thermodynamics? We now try to answer these questions.
First, there are indeed algorithms that are superior to the dichotomous search
method. One such is the Fibonacci search method. Needless to say this method
works on the Fibonacci series. Any number in the Fibonacci series can be written as Fn = Fn−1 + Fn−2, with F1 = F2 = 1.
In the Fibonacci search method, we use the Fibonacci series to divide the interval into two and choose two points. In other words, if the original interval of uncertainty is I0 = (b − a), the first reduced interval is given by

I1 = (Fn−1/Fn) I0   (6.23)

or

I1 = (Fn−1/Fn)(b − a)   (6.24)

That is,

I1 = (Fn−1/Fn) I0 = (Fn−1/Fn)(b − a)   (6.25)

Starting from

I0 = (b − a)   (6.26)

the successive intervals of uncertainty are

I1 = I0 × (Fn−1/Fn)   (6.27)

I2 = I1 × (Fn−2/Fn−1) = I0 (Fn−2/Fn)   (6.28)

...

In = I0/Fn   (6.29)

so that

RR = I0/In = I0/(I0/Fn)   (6.30)

RR = Fn   (6.31)
Therefore the Fibonacci number itself becomes the reduction ratio of the algorithm.
Let us now look at the algorithm a little more closely for a unimodal function f(x)
in the interval a ≤ x ≤ b. Consider the first two points in the Fibonacci search
x1 = a + (Fn−1/Fn)(b − a)   (6.32)

x2 = b − (Fn−1/Fn)(b − a)   (6.33)

x1 = [aFn + (b − a)Fn−1]/Fn = [a(Fn−1 + Fn−2) + (b − a)Fn−1]/Fn   (6.34)

x1 = (aFn−2 + bFn−1)/Fn   (6.35)

Similarly, x2 = (aFn−1 + bFn−2)/Fn.
We then perform the two point test, and depending on whether y(x1) < y(x2) or y(x1) > y(x2), we eliminate the region to the right of x1 or the region to the left of x2, consistent with whether the problem under consideration is a maximization or a minimization problem. There is nothing great so far!
Now for the purpose of demonstration, we have to assume how the curve looks
and decide on the region to be eliminated. So let us assume that the region to the left
of x2 (as shown in the figure below) is eliminated and proceed to the next iteration.
So the end points of the new interval are x2 and b, and the new interval is I1.
Now let us choose 2 points x3 and x4 such that they are at a distance I2 from the 2
new ends. I2 then becomes
I2 = (Fn−2/Fn−1) I1   (6.38)

I1 = (Fn−1/Fn) I0   (6.39)

I2 = (Fn−1/Fn)(Fn−2/Fn−1) I0 = (Fn−2/Fn) I0   (6.40)

I2 = (Fn−2/Fn)(b − a)   (6.41)

x3 = x2 + I2   (6.42)

x4 = b − I2   (6.43)

Therefore, x1 = x4. Out of the two points (x3, x4), only one is really new! This happens iteration after iteration, and it is this reuse that contributes to the magical efficiency of the method.
Example 6.4 Execute the Fibonacci search algorithm for the cylindrical solar
water heater storage problem (see Example 5.3). Perform 8 function evalua-
tions. Start with Fn = 34.
Solution:
I1 = (Fn−1/Fn) I0 = 3 × (21/34) = 1.85 m   (6.50)

r1 = 2.35 m   (6.51)

r2 = 1.65 m   (6.52)

We do not worry if r1 is greater than r2 right now; our notation is that whatever is chosen from the left end is r1 and whatever is chosen from the right end is r2.
First, we used 21/34 for getting the two points from the ends; now we use 13/21. We move from the right to the left of the Fibonacci series to get the ratios, till we hit 1, when xn and xn+1 coincide. That is when we stop the Fibonacci search.
There may be a small problem with the decimals, but we adjust them so that we
are able to use one value of x at every iteration from the previous iteration.
I2 = (Fn−2/Fn−1) I1 = 1.85 × (13/21) = 1.15 m   (6.55)

r3 = a + I2 = 1.65 m = r2   (6.56)

r4 = r1 − I2 = 1.20 m   (6.57)

A(r3) = 21.95 m²   (6.58)

A(r4) = 15.7 m²   (6.59)

r5 = 0.5 + 1.15 × (8/13) = 1.2 m   (6.60)

r6 = 1.65 − 1.15 × (8/13) = 0.95 m   (6.61)

A(r5) = 15.7 m²   (6.62)

A(r6) = 14.1 m²   (6.63)

The region to the right of r5 can be eliminated. Please note that we now use a ratio of 5/8 for getting the two points in the next iteration.
Though we seem to be working out a lot, it is just one new evaluation per step, because the other point is already in. In this case, it so happens that the function evaluation is easy, and the other steps seem as laborious or as time consuming as the function evaluation. However, many times in thermal sciences, the function evaluation may be the result of a solution to the Navier–Stokes equations and the energy equation. In such a case, the other steps will be much less time consuming compared to the function evaluation.
r7 = 0.5 + 0.7 × (5/8) = 0.95 m   (6.64)

r8 = 1.2 − 0.7 × (5/8) = 0.76 m   (6.65)

A(r7) = 14.1 m²   (6.66)

A(r8) = 14.2 m²   (6.67)

r9 = 0.76 + 0.44 × (3/5) = 1.02 m   (6.68)

r10 = 1.2 − 0.44 × (3/5) = 0.94 m   (6.69)

A(r9) = 14.38 m²   (6.70)

A(r10) = 14.1 m²   (6.71)

r11 = 0.76 + 0.26 × (2/3) = 0.94 m   (6.72)

r12 = 1.02 − 0.26 × (2/3) = 0.84 m   (6.73)

A(r11) = 14.1 m²   (6.74)

A(r12) = 13.95 m²   (6.75)

A(r11) > A(r12)   (6.76)

The next point is right at the center, and the two points coincide. The solution (r+) now lies between 0.76 m and 0.94 m, so the final interval of uncertainty is 0.94 − 0.76 = 0.18 m, against an original interval of 3 m. With just 7 function evaluations, we are able to reduce the interval of uncertainty from 300 cm to 18 cm! So we get a remarkable reduction ratio:

∴ RR = 3/0.18 = 16.67

Now, what is the theoretical RR? The theoretical RR is supposed to be Fn = 34, but we do not seem to get this performance. Why are we getting this difference? Because we actually stopped with 7 evaluations. We can go for one more and do a three point test in the last step; we can then halve the interval further, which will make our RR = 32. We will take this issue up in Sect. 6.4.4. In an exhaustive search, after 7 evaluations, the final interval of uncertainty would have been 75 cm.
clear;
clc;
a = 0.5;          % lower limit of radius
b = 3.5;          % upper limit of radius
countmax = 12;    % number of evaluations (even number)

% generate the required Fibonacci numbers
%f = fibonacci(3:3+countmax/2);
f(1) = 1;
f(2) = 1;

for i = 3:3+countmax/2

    f(i) = f(i-1) + f(i-2);

end

f(1:2) = [];

% ratios F(n-1)/F(n), used from the largest downwards
for j = 2:length(f)
    ratio(j-1) = f(j-1)/f(j);
end

for i = 1:countmax/2

    I = b - a;
    r1 = b - I*ratio(end-i+1);        % point measured from the right end
    r2 = a + I*ratio(end-i+1);        % point measured from the left end
    % surface area at the two points, with h = 4/(pi*r^2) from the volume
    A1 = 2*pi*r1^2 + 2*pi*r1*(4/(pi*r1^2));
    A2 = 2*pi*r2^2 + 2*pi*r2*(4/(pi*r2^2));
    if A1 > A2
        a = r1;
    else
        b = r2;
    end

    % Print
    prt = ['Itr = ', num2str(i), ...
           ', a = ', num2str(a), ...
           ', b = ', num2str(b)];
    disp(prt)

end
There are some higher order searches that are possible. For example, after we have gotten the center point, r13 = r14, the Fibonacci search ends there. Now, if we are pretty confident that we have bracketed the minimum and that it indeed lies between r = 0.76 m and r = 0.94 m, it is possible for us to use the Lagrange interpolation formula to construct a local polynomial connecting these points and then make the function stationary. That will take us close to the true solution. While it may be an overkill for a simple problem like this, it is possible. If Fn = 34, we have to do the last step, which is the three point test. So if both the points coincide, we do the evaluation and take the three point test. Else, if a desired accuracy is specified, we go on to the next nearest Fibonacci number so that this problem of "getting stuck" in the last iteration does not arise.
Though theoretically the RR is supposed to be Fn, we were not able to reach it because, in the last iteration, both the points were at the center and coincided, and hence we were not able to proceed any further. So the time has come to critically revisit our reduction ratio.
Actual reduction ratio of the Fibonacci search method
The theoretical RR of Fn, which we derived in Eq. 6.31, can never be reached! If we take F8 = 34 and start dividing the interval as 21/34, 13/21, and so on, we are not able to do the 8th evaluation, as both points fall at the center and the interval just reduces to half. Hence, if Fn = 34, it corresponds to n = 8 in the Fibonacci series; when n = 8, we are able to do only (n − 1) function evaluations. So, we can say that the RR of the Fibonacci search with n function evaluations is given by

RR = Fn+1/2 = I0/In

For the present example, n = 7 evaluations give RR = F8/2 = 34/2 = 17, consistent with the value of 16.67 realized above.
We could have stated this problem as follows: minimize the surface area of the cylindrical water storage heater with an initial interval of uncertainty of 0.5–3.5 m, and get a final interval of uncertainty of 0.17 or 0.18 m.
In the two point test, an equal portion of either the left-hand side or the right-hand side will be eliminated. Any disturbance to this, that is, the distance from the left side not being equal to the distance from the right side, may sometimes be advantageous, while at other times it could be terrible. Therefore, in all these methods, including the Fibonacci method, the points are placed at the same distance from both the ends.
The Fibonacci method actually obeys this: in the two point test, the two points are always symmetrically placed about the center and hence about the 2 ends. We get a good reduction every time.
Disadvantages of the Fibonacci search method
There are a few disadvantages with the Fibonacci search method. The value of n is small in this problem (n = 8), but it is small only for trivial problems. When we solve very complex problems, maybe a 100 variable optimization problem, and we desire a high level of accuracy, n may be of the order of 300 or 400. So we have to calculate and store the Fibonacci numbers first. Then we have to recall these numbers in every iteration. Furthermore, the reduction in the interval of uncertainty is not the same in all the iterations.
Now the question is: can we think of some method that enjoys all the advantages of the Fibonacci search method but has the same interval reduction for all iterations? That is, when we go from I0 to I1, whatever the reduction in the interval of uncertainty, it will be exactly the same as when we go from I1 to I2, from I2 to I3, and so on.
x1 = τ, x2 = 1 − τ   (6.77)

We are still talking about a unimodal function. Based on the two point test, one portion of the interval can be eliminated. Let us say the region to the left of x2 can be eliminated. What is I1 now? I1 = τ.
x3 = 1 − τ + τ²   (6.78)

x4 = 1 − τ²   (6.79)

We started out with 2 points in the interval (0, 1), and we are saying that in the first iteration the fraction of the interval retained is τ, and that this fraction has to be maintained. So in the second iteration, we take 2 new points, x3 and x4, but we want to retain all the advantages of the Fibonacci search method. Therefore, we want x1 to be x4. If x1 has to be equal to x4,
τ = 1 − τ²   (6.80)

τ² + τ − 1 = 0   (6.81)

τ = (−1 ± √(1 + 4))/2 = (√5 − 1)/2   (6.82)

τ = 0.618   (6.83)
What is so great about 0.618? The reciprocal of 0.618, that is 1/0.618, is 1.618.
That is why it is satisfying the property x1 = x4 . The number 1.618 is called the
Golden ratio or the Golden number.
If Φ = 0.618, it satisfies the following property:

1/(1 + Φ) = Φ   (6.84)

Φ² + Φ − 1 = 0, Φ = 0.618   (6.85)

So instead of struggling with the Fibonacci search method, we multiply the current interval every time by 0.618, and 38.2% of the interval is eliminated. This is called the Golden section search. We take 2 points at a distance of 0.618 I0 from the two ends; next time, the points will be at 0.618 I1 from both the ends. We get a very good reduction ratio with this algorithm.
The symbol Φ comes from the name of the Greek sculptor Phidias. The ratio of the height to the base of the Egyptian pyramids is claimed to be 0.618, and there are also claims, attributed to Leonardo da Vinci and others, that the height to width ratio of a beautiful face must be 1.618.
What is the connection between the golden section and Fibonacci search meth-
ods?
lim(n→∞) Fn−1/Fn = 0.618   (6.86)

As we move to larger and larger numbers in the Fibonacci series, Fn−1/Fn asymptotically approaches the value 0.618. So the two search methods are interconnected.
So, if we want to enjoy all the advantages of the Fibonacci search, but do not want
to calculate and store the numbers and also do not want unequal interval reduction
across iterations, we can use the Golden section search.
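As a quick check on the interval shrinkage, the lines below apply the 0.618 rule to the storage-tank problem taken up next (a minimal sketch; I0 = 3 m and n = 7 follow from Example 6.5 below):

I0 = 3;                        % initial interval of uncertainty, 0.5 to 3.5 m
n  = 7;                        % number of function evaluations
In = I0*0.618^(n-1);           % each iteration retains 61.8% of the interval
fprintf('In = %.3f m, RR = %.2f\n', In, I0/In)   % In = 0.167 m, RR = 17.94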
Example 6.5 Consider the cylindrical solar water heater storage problem
(Minimize surface area A for a given volume V = 4000 litres). Perform 7
function evaluations with the Golden section search method for one variable.
Initial interval of uncertainty is 0.5 ≤ r ≤ 3.5 m.
Solution:
The region to the right of r11 can be eliminated. We have done 7 function evaluations now.
The final interval of uncertainty is 0.77 ≤ r ≤ 0.94 m, which is the same as what we got using the Fibonacci search method. Now let us calculate the reduction ratio:

RR = I0/In = (1/0.618)^(n−1) = 17.9

where n is the number of function evaluations. In this formula, we get (n − 1) and not n because in the first iteration, we did 2 evaluations!
clear;
clc;
a = 0.5;          % lower limit of radius
b = 3.5;          % upper limit of radius
countmax = 7;     % number of iterations

for i = 1:countmax

    I = b - a;
    % two points at a distance 0.618*I from either end
    r1 = a + I*0.618;
    r2 = b - I*0.618;
    % surface area at the two points, with h = 4/(pi*r^2) from the volume
    A1 = 2*pi*r1^2 + 2*pi*r1*(4/(pi*r1^2));
    A2 = 2*pi*r2^2 + 2*pi*r2*(4/(pi*r2^2));
    if A1 > A2
        b = r1;
    else
        a = r2;
    end

    % Print
    prt = ['Itr = ', num2str(i), ...
           ', a = ', num2str(a), ...
           ', b = ', num2str(b)];
    disp(prt)
end
By now, we have learnt a few powerful techniques for performing single variable searches. They do not exploit any information on the first or higher order derivatives; we do not even look at the nature of the variation of y between 2 points. We just see whether y1 > y2 or y1 < y2. Obviously, optimization specialists and mathematicians would not have left it at this!
There are other powerful methods that exploit all this information. Let us take a sneak peek at how powerful it can get when we employ some superior or more advanced techniques on this problem.
We expand the first derivative around xi using the Taylor series expansion. Since we are looking at an optimization problem, we want to make f(x) stationary, that is, we want f′(xi+1) = 0. This gives

xi+1 − xi = −f′(xi)/f″(xi)   (6.111)

xi+1 = xi − f′(xi)/f″(xi)   (6.112)

In an optimization problem we have f′(x)/f″(x) in the algorithm because we are trying to make f(x) stationary, and not because we are seeking a root of f(x). Since the Newton–Raphson method for optimization uses information on the first and second derivatives, it is demonstrably superior to the other search techniques. But where is the catch? It should be possible for us to evaluate f′(x) and f″(x). Suppose we are solving a CFD problem, or we are trying to determine the stress in a system; getting the value of f(x) itself is difficult, and on top of that we want f′ and f″. So calculus is not the only route to solving optimization problems! Calculus may look very attractive, but it is simply not possible to get the higher order derivatives
when we are working with complicated optimization problems, where each function
evaluation itself is so difficult. Even getting the derivatives numerically does not help
in many situations. All these have been responsible for the development of calculus-
free techniques like Genetic Algorithms and Simulated Annealing, which can be
applied to a wide class of engineering optimization problems. More on these in Chap. 8.
Let us consider the last step of Example 6.5 and apply the Newton–Raphson method. Refer to Fig. 6.12, which displays the final interval of uncertainty for Example 6.5.
r(i+1) = ri − A′(ri)/A″(ri)   (6.118)

r(i+1) = 0.86 − (−9.57 × 10⁻³)/37.7   (6.119)

r(i+1) = 0.8603 m   (6.120)

A′(r) has become a very small value, which shows that A(r) has reached the optimum.
What we have done now is essentially a hybrid optimization technique. We start
with a non-calculus search technique and toward the end we use a calculus-based
optimization technique which exploits the power of the first and second derivatives
in order to narrow down the solution.
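For this particular problem, the refinement can be scripted in a few lines. The sketch below is not from the text: it assumes the closed form A(r) = 2πr² + 8/r (h eliminated through the volume constraint) and iterates on A′(r) = 0 from the center of the final interval.

Aprime  = @(r) 4*pi*r - 8./r.^2;     % A'(r) for A(r) = 2*pi*r^2 + 8/r
Adprime = @(r) 4*pi + 16./r.^3;      % A''(r)
r = 0.86;                            % center of the final interval (m)
for k = 1:5
    r = r - Aprime(r)/Adprime(r);    % Newton step on A'(r) = 0
end
fprintf('r = %.4f m\n', r)           % approaches (2/pi)^(1/3), about 0.8603 m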
There is yet another route to solving this problem. We have the end points r = 0.77 m and r = 0.94 m at the end of the Fibonacci/Golden section search, and the center of this interval is 0.86 m. We can fit a Lagrange interpolation polynomial through these 3 points, get the derivative of this function using Newton's divided difference formula, make it stationary, and determine the solution by finding out where the derivative becomes 0. All this is required only if we want a very narrow final interval.
The assumption behind fitting a function to the 3 points is that if the 3 points are sufficiently close and we approximate the function by a second degree polynomial, not much error is incurred. This cannot be done on the original interval of uncertainty, which may vary from, say, r = 1–200 m. However, when we have sufficiently narrowed down the interval and are pretty sure that the interval is already very tight, it is possible to do the polynomial interpolation.
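A minimal sketch of such a quadratic refinement is given below. It fits a parabola through the three bracketing points and jumps to its vertex; this is the standard successive parabolic interpolation step, written here for this particular problem with A(r) = 2πr² + 8/r (h eliminated via the volume constraint).

r = [0.77 0.86 0.94];                % bracketing points (m)
A = 2*pi*r.^2 + 8./r;                % A(r) with h eliminated via the constraint
% vertex of the parabola through the three points
num = (r(2)-r(1))^2*(A(2)-A(3)) - (r(2)-r(3))^2*(A(2)-A(1));
den = (r(2)-r(1))*(A(2)-A(3)) - (r(2)-r(3))*(A(2)-A(1));
rstar = r(2) - 0.5*num/den;          % estimated optimum, close to 0.86 m
fprintf('r* = %.4f m\n', rstar)

This puts an end to our tour of search techniques for single variable optimization problems.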
Even though we now have a very efficient single variable technique, there are some protocols involved in handling multivariable problems. A multivariable problem can be either constrained or unconstrained. Earlier, we did solve a set of multivariable constrained problems, where the number of (equality) constraints is less than the number of variables, and so on. The Lagrange multiplier method is very elegant when the derivatives are not very cumbersome to evaluate. The major problem with the Lagrange multiplier technique is that, after getting the derivatives, solving the set of simultaneous equations is tough, especially when there are 20, 30, or more variables.
So one possibility is to use the information on the derivatives and see if there is
some other way of getting the solution instead of solving the resultant simultaneous
equations. This means that instead of solving them simultaneously, we search for the
solution.
Multivariable search techniques are very important in thermal sciences because
we rarely encounter single variable problems. The problem of optimization of a
shell and tube heat exchanger is a typical two variable problem, where the objective
function is to minimize the surface area (and so the cost) of the exchanger subject to
the constraints of heat duty and allowable pressure drop. The length and diameter of
the tubes are important variables here.
As mentioned earlier, multivariable optimization problems are of two types: unconstrained and constrained. Needless to say, constrained multivariable optimization problems are a lot harder to solve than unconstrained multivariable optimization problems.
Let us look at a typical two variable problem in x1 and x2. One possibility is to start from an initial point and draw a grid around it, as seen before. We have a node P and mark the eight neighboring points in the 8 directions E, SE, S, SW, W, NW, N, and NE. We evaluate the function at the 8 neighboring points and find out in which direction the rate of change of y is maximum (see Fig. 6.13). For example, if we are seeking a maximum here and it lies at the center of the iso-objective lines, then from node P,
hopefully we go to NE and take that as the next node. We then take 8 points around the new point and continue with our search. This is known as the lattice method, which we discussed briefly earlier.
As we go closer, y keeps increasing but the rate of change of y decreases, and we can make the grid finer and finer. We can have adaptive meshing, wherein we start with coarse grids and then refine the grid. The number of function evaluations in each iteration for a 2 variable problem is 9.
The number of function evaluations at every iteration can then be generalized as 3^n, where n is the dimensionality of the problem, i.e., the number of variables. So, computationally, the above approach is a very costly affair. If each of these evaluations represents a CFD solution, or a solution arising from a set of nonlinear partial differential equations from any branch of science and engineering, the algorithm is terribly expensive. We just keep evaluating, though there clearly is some "method in the madness". Please recall that a technique like this falls under the category of "hill climbing".
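A minimal sketch of such a lattice (hill climbing) search is given below; the objective function and the grid spacing are assumed for illustration (the function is the one solved later in Example 6.6).

y  = @(x) 8 + x(1)^2/2 + 2/(x(1)*x(2)) + 6*x(2);  % assumed test objective
x  = [1 1];                          % starting node P
dx = 0.05;                           % grid spacing, Delta x1 = Delta x2
moved = true;
while moved
    moved = false;
    best  = y(x);
    % 9 evaluations per iteration: node P and its 8 neighbors
    for i = -1:1
        for j = -1:1
            xt = x + dx*[i j];
            if xt(1) > 0 && xt(2) > 0 && y(xt) < best
                best = y(xt); xbest = xt; moved = true;
            end
        end
    end
    if moved, x = xbest; end         % move P to the best neighbor
end
fprintf('x1 = %.2f, x2 = %.2f, y = %.4f\n', x(1), x(2), y(x))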
Let us say that we want a reasonable solution to a two-dimensional CFD problem with two design variables using this method, and let each solution take 2 hours on a powerful workstation. Every iteration will require 9 function evaluations, that is, 18 hours of computing; even 10 iterations would take over a week. This gives us an idea of the total time required to get to the optimum.
A logical question that arises now is whether one can solve for one variable at a time. Such an approach is called a unidirectional search. Let us start with a point in the domain, as shown in Fig. 6.14. We keep x1 fixed and try to get the x2 at which the function becomes a minimum or a maximum. In the next iteration, we keep x2 fixed and determine the x1 at which the function again is a minimum or a maximum.
We then determine the value of x2 at which y becomes a minimum. This may not be the final solution, because it is only the best solution corresponding to the value of x1 obtained after iteration 1, which itself is not final. Now we have got the new values of y and x2. We keep this x2 fixed and get the new value of x1. We now
keep this x1 fixed and get back x2, and keep going this way. So instead of doing 9 evaluations per iteration, we change one variable at a time. If y is a function of several variables, we assume values for all but one variable and write the objective function in terms of that variable, say x1. We then take dy/dx1, equate it to 0, and solve the equation if it is quadratic or cubic, and so on. If the function is too complex to be solved this way, we apply the Golden section search, the Fibonacci search, or the dichotomous search and determine the value of x1 that is optimal when x2 is fixed. In the next step, we solve for, say, x2, and eventually we will reach the final solution.
Example 6.6 Minimize

y = 8 + x1²/2 + 2/(x1x2) + 6x2   (6.121)

subject to (x1, x2) ≥ 0.
Solve it using a unidirectional search method with an initial guess of x1 = 1 and x2 = 1. Decide on an appropriate stopping criterion.
Solution:
∂y/∂x1 = x1 − 2/(x1²x2)   (6.122)

∂y/∂x2 = 6 − 2/(x1x2²)   (6.123)

x1³x2 = 2   (6.124)

x1+ = (2/x2)^(1/3)   (6.125)

x1x2² = 1/3   (6.126)

x2+ = (1/(3x1))^(1/2)   (6.127)
We set the 2 derivatives to zero, because of which we got an expression for x1 in terms of x2 and for x2 in terms of x1. In any iteration, if we have x2, then using ∂y/∂x1 = 0, i.e., making y stationary with respect to x1, we can calculate x1 in terms of x2. Once we calculate x1, we can substitute it in the other equation, which helps us calculate x2 in terms of x1.
So we start with x1 = 1 and determine x2 . For these values of x1 and x2 , we
calculate the value of y.
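A minimal sketch of these alternating updates, using the closed-form expressions in Eqs. 6.125 and 6.127, is given below (the number of rounds is an assumed value):

x1 = 1; x2 = 1;                              % initial guess
y  = @(x1,x2) 8 + x1^2/2 + 2/(x1*x2) + 6*x2;
for k = 1:10                                 % a few rounds suffice here
    x1 = (2/x2)^(1/3);                       % dy/dx1 = 0 with x2 held fixed
    x2 = sqrt(1/(3*x1));                     % dy/dx2 = 0 with x1 held fixed
    fprintf('k = %2d, x1 = %.4f, x2 = %.4f, y = %.4f\n', k, x1, x2, y(x1,x2))
end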
After a few rounds, the changes in x1 and x2 become negligible, and so the iterations can be stopped here. Hence, the optimal solution to this problem is x1+ ≈ 1.64, x2+ ≈ 0.45, with y+ ≈ 14.76. We can check the positive definiteness of the Hessian matrix of the second derivatives of y to confirm that the solution obtained is indeed a minimum.
If the expression for x1+ in terms of x2+, or vice versa, were to involve a function that is difficult to evaluate, it would be a chore to evaluate x1 and x2 at each step. We can then use the Golden section search to find x1 from x2, and again use the Golden section search to find x2 from x1. The advantage here is that we do not have to solve the equations simultaneously. If we have many variables and it becomes messy to solve simultaneous equations, we can use this method instead of the Lagrange multiplier method. In this method, if the derivative cannot be obtained as a closed-form expression, we can get it numerically too. Though this method also employs derivatives, it is better than the Lagrange multiplier method because we do not have to solve the equations simultaneously; they need to be solved only sequentially.
Cauchy's method is a very powerful and popular search technique for multivariable unconstrained problems. If we start a little away from the optimum, the method converges rapidly, but as we get closer to the optimum, it becomes very sluggish. Needless to say, from the name steepest ascent or descent, we know that we are going to use the information on the derivatives to develop the algorithm.
The algorithm:
y = y(x1, x2, . . . , xn)   (6.128)

∇y = (∂y/∂x1) i1 + (∂y/∂x2) i2 + · · · + (∂y/∂xn) in   (6.129)

We start with X0 = (x1, x2, . . . , xn)0; that is, we assume values of x1, x2, up to xn. We also assume that y is continuous and differentiable. So all the derivative values can be calculated at X0, and the values of all the denominators in Eq. 6.130 are known:

Δx1/(∂y/∂x1) = Δx2/(∂y/∂x2) = · · · = Δxn/(∂y/∂xn)   (6.130)

In the above equation, if Δx1 is fixed, all the other Δxs can be evaluated at every iteration. Likewise, all the other values of x can be got, and hence X1 can be obtained. From vector calculus, we know that the rate of change of the function is maximum when we move orthogonal to the iso-objective lines; if we follow Eq. 6.130, we are moving along this direction. But there is a problem here, and that is: how far should we go? The direction may be alright.
The vector X represents x1 to xn, and Eq. 6.130 fixes the direction of movement. But how far will we go in this direction?
A very simple way of implementing this algorithm is to set Δx1/(∂y/∂x1) = Δx2/(∂y/∂x2) = · · · = Δxn/(∂y/∂xn) = 1. If, after some time, there is some problem with the function, Δx1 can be appropriately changed. This is a very simple interpretation of Cauchy's method. A slightly more advanced version comes next. Let us call Δxn/(∂y/∂xn) = α. Though we can decide the direction on this basis, in every iteration we calculate the value of α: what is the value of α that will minimize y at the current point? It is possible to answer this question. The problem now becomes more involved, as in every iteration we have to calculate the value of α; but if we do this and proceed with the algorithm, it will be exceedingly fast. If, instead, we set α = 1, we are pre-setting the value of Δx1 and calculating all the other Δxs, as they have to obey Eq. 6.130.
Example 6.7 Revisit Example 6.6 and solve it using the method of steepest ascent or descent (steepest ascent is for a maximization problem and steepest descent is for a minimization problem).

Solution:

y = 8 + x1²/2 + 2/(x1x2) + 6x2   (6.132)

∂y/∂x1 = x1 − 2/(x1²x2)   (6.133)

∂y/∂x2 = 6 − 2/(x1x2²)   (6.134)
With Δx1 = 0.3, we determine the new value of x2 to be −0.2. But the constraint is that x1 and x2 have to be greater than 0. So let us rework the problem assuming Δx1 = 0.1.
Now the iterations may start behaving funny: the derivatives become very close to 0 because we are very close to the solution, yet even after 8 or 9 iterations, we do not have convergence. If we are far away from the optimum, suppose we had started out with x1 = 5, the method quickly comes down to x1 = 1. But it will struggle to go from x1 = 1 to 1.6, which is the solution in this case. After the second iteration, x2 is tantalizingly close to the correct answer, but x1 is far off. When x1 starts inching forward, x2 will move away!
Let us now make Δx1 = 0.05, as we are close to convergence. In step 7, ∂y/∂x2 changes sign, which tells us that we have overshot the answer. At the end of 8 iterations, x1 = 1.57 and x2 = 0.475. At the end of 15 iterations, it is seen that x1+ = 1.631, x2+ = 0.45, and y+ = 14.76.
So the method seems to be going in the right direction but is now converging very slowly. The solution is given in Table 6.3.
(b) Method where α is calculated at every iteration.
y = 8 + x1²/2 + 2/(x1x2) + 6x2   (6.135)

∂y/∂x1 = x1 − 2/(x1²x2)   (6.136)

∂y/∂x2 = 6 − 2/(x1x2²)   (6.137)

At (1, 1):

∂y/∂x1 = −1   (6.138)

∂y/∂x2 = 4   (6.139)
Solution:
This is the equation of a circle, and the optimum is (8, 6) if we are seeking a minimum of y, which we know (y+ at the optimum is y+ = 0).
We now solve it using Cauchy's method. Let us start with (2, 2) as the initial guess value.
Example 6.9 Minimize y = 2x1² − 2x1x2 + x2² with an initial guess of (3, 5) using Cauchy's steepest descent method and perform at least 4 iterations.
Solution:
We now solve it using Cauchy's method.
dy/dα = 0; (6.172)
α = −0.1923 (6.173)
clear;
clc;

syms x1 x2 a
y = 2*(x1^2) - 2*x1*x2 + x2^2;   % objective function

dydx1 = diff(y, x1);             % symbolic first derivatives
dydx2 = diff(y, x2);

X1 = 3; X2 = 5;                  % initial guess
count = 0;
label = 0;
countmax = 20;                   % maximum number of iterations
errTol = 1e-4;                   % convergence tolerance

% value of derivative
DYDX1 = single(subs(dydx1, {x1,x2}, {X1,X2}));

% value of derivative
DYDX2 = single(subs(dydx2, {x1,x2}, {X1,X2}));

while label == 0

    count = count + 1;

    DYDX1 = single(subs(dydx1, {x1,x2}, {X1,X2}));
    DYDX2 = single(subs(dydx2, {x1,x2}, {X1,X2}));

    if norm([DYDX1; DYDX2]) < errTol
        break                    % gradient has vanished: optimum reached
    end

    % y along the gradient direction, as a function of the step length a
    y_new = subs(y, {x1,x2}, {X1 + a*DYDX1, X2 + a*DYDX2});
    dy_newda = diff(y_new, a);

    alpha = double(solve(dy_newda, a));  % optimal step length

    X1old = X1; X2old = X2;
    X1 = double(X1 + alpha*DYDX1);
    X2 = double(X2 + alpha*DYDX2);
    err = sqrt((X1 - X1old)^2 + (X2 - X2old)^2);

    if count == countmax || err < errTol
        label = 1;
    end

    % Print
    prt = ['Itr = ', num2str(count), ...
           ', x1 = ', num2str(X1), ...
           ', x2 = ', num2str(X2), ...
           ', err = ', num2str(err)];
    disp(prt)

end
Let

y = y(x1, . . . , xn)   (6.174)

c = ∇y = (∂y/∂x1) i1 + (∂y/∂x2) i2 + · · · + (∂y/∂xn) in   (6.175)

d = −c   (6.176)

d is the negative of the gradient vector. The modulus of the c vector is given by

|c| = √[(∂y/∂x1)² + (∂y/∂x2)² + · · · + (∂y/∂xn)²]   (6.177)

We solve for α, get X1, then d1, and so on; this is how the algorithm works. What we have done so far is to write the algorithm in compact mathematical notation.
Let us now look at the product of ci and ci+1. We are trying to see if this product is 0, so that the new direction is always orthogonal to the old direction. The idea is to see if ci · ci+1 = 0.
How did we get α? We minimized y(Xi + αdi):

∂y/∂α|i+1 = 0 = (∂y/∂x)|i+1 (∂x/∂α)|i+1   (6.180)

(∂y/∂x)|i+1 = ci+1   (6.181)

(∂x/∂α)|i+1 = ∂(xi + αdi)/∂α = di = −ci   (6.182)

So dy/dα, which has to be 0, is itself the product of 2 partial derivatives, ∂y/∂x and ∂x/∂α. ∂y/∂x turns out to be ci+1, whereas ∂x/∂α turns out to be di = −ci. From this we get ci · ci+1 = 0. Therefore, at the (i + 1)th iteration, the direction of movement is orthogonal to the previous direction. This is the proof of the orthogonality property of the steepest descent method.
We saw a very crude form of the conjugate gradient method when we discussed the Levenberg–Marquardt algorithm in Chap. 3, when we tried to do nonlinear regression and introduced the damping factor λ.
Consider a two variable problem. Let us start with x0 = (x1,0, x2,0) and hit x1 = (x1,1, x2,1). We keep proceeding until we reach the optimum. This is Cauchy's method. The next logical question to ask is: can we directly go to x3, bypassing x2? It is possible to jump some steps. One possibility is to deflect the direction in which we are moving, instead of always going orthogonal to the previous direction, and come up with a deflected steepest descent method. This deflected steepest descent method is called the conjugate gradient method.
So the first 2 steps are the same as in Cauchy's method. We start with x0 and go to x1 using the steepest descent. Then from x1, we directly go to x3. This version, with the deflection factor given below, is also known as the Fletcher–Reeves method. It is one of the most powerful algorithms ever used.
Conjugate gradient method: the algorithm

• The first 2 steps are the same as in Cauchy's method.
• di = −ci + βi di−1, where βi = (|ci|/|ci−1|)². That is, calculate the magnitude of the gradient at the current and previous iterations, take the ratio, and square it; this becomes β.
• Evaluate αi to minimize y(xi + αdi).
• Update xi+1. If β = 0, the method reduces to Cauchy's method.
To calculate β, we require the value of c at 2 steps. That is why we start with Cauchy's method, get 2 values of c, and then move on to the conjugate gradient method.
Let us revisit Example 6.7. The first two steps from Cauchy's method are shown in Table 6.6.
We use the exhaustive search technique or the Golden section search method to determine the value of α that minimizes y(1.12 + 2.2α, 0.52 − 0.4α).
As can be seen from Table 6.7, the solution is reached at the third iteration itself
compared to 14 iterations for method (a) of the steepest descent and 5 iterations for
method (b) of the steepest descent method.
The progression of the solution using the conjugate gradient method is depicted
in Fig. 6.16.
The flowchart for solving a typical two variable (x1 , x2 ) optimization problem is
shown in Fig. 6.17.
Fig. 6.17 Flowchart for the conjugate gradient method for a two variable unconstrained optimization problem
Example 6.10 Minimize y = 2x1² − 2x1x2 + x2² with an initial guess of (3, 5) using the conjugate gradient method.
Solution:
We now solve it using the conjugate gradient method. Let us start with (3, 5) as the initial guess value.
dy/dα = 0; (6.195)
α = −1.25 (6.196)
clear;
clc;

syms x1 x2 a

% objective function
y = 2*(x1^2) - 2*x1*x2 + x2^2;

dydx1 = diff(y, x1);
dydx2 = diff(y, x2);

X1 = 3; X2 = 5;                  % initial guess
count = 0;
label = 0;
countmax = 20;
errTol = 1e-4;

% value of derivatives
DYDX1 = single(subs(dydx1, {x1,x2}, {X1,X2}));
DYDX2 = single(subs(dydx2, {x1,x2}, {X1,X2}));

c1 = [DYDX1; DYDX2];             % gradient at the starting point

while label == 0

    count = count + 1;

    DYDX1 = single(subs(dydx1, {x1,x2}, {X1,X2}));
    DYDX2 = single(subs(dydx2, {x1,x2}, {X1,X2}));

    if norm([DYDX1; DYDX2]) < errTol
        break                    % gradient has vanished: optimum reached
    end

    if count == 1
        d2 = -c1;                          % first step: pure steepest descent
    else
        c2 = [DYDX1; DYDX2];               % update gradient vector
        b  = ((norm(c2))/(norm(c1)))^2;    % deflection factor beta
        d2 = -c2 + b*d1;                   % deflected (conjugate) direction
        c1 = c2;
    end

    % y along d2, as a function of the step length a
    y_new = subs(y, {x1,x2}, {X1 + a*d2(1), X2 + a*d2(2)});
    dy_newda = diff(y_new, a);

    alpha = double(solve(dy_newda, a));    % optimal step length

    X1old = X1; X2old = X2;
    X1 = double(X1 + alpha*d2(1));
    X2 = double(X2 + alpha*d2(2));
    err = sqrt((X1 - X1old)^2 + (X2 - X2old)^2);

    d1 = d2;

    if count == countmax || err < errTol
        label = 1;
    end

    % Print
    Y = double(subs(y, {x1,x2}, {X1,X2}));
    prt = ['Itr = ', num2str(count), ...
           ', x1 = ', num2str(X1), ...
           ', x2 = ', num2str(X2), ...
           ', Y = ', num2str(Y), ...
           ', err = ', num2str(err)];
    disp(prt)
end
Though the minus and plus signs appear to be incongruous for the maximization and minimization problems, it needs to be reiterated that we give a positive penalty for a cost function and a negative penalty for a profit function. Other penalty functions are also possible. For example, we can have

V = y + Σ(j∈R) Pj |ψj|   (6.222)
Here, j is an element of the set R, which contains all the violated constraints. We can employ individual penalties Pj or a universal P. We take the modulus of ψ for every violated constraint, sum up the violations, and multiply by P, which will be a very huge quantity like 10²⁰ or 10²⁵. The constraint could be of the form ψj > 0: when ψj > 0, the penalty is 0, while if ψj < 0, the penalty term with P = 10²⁰ or 10²⁵ becomes active. P always looks at the value of ψ; whenever ψ becomes negative, P is active and gets added to the cost. Such a penalty is known as an infinite barrier penalty.
Immediately, a host of new ideas may strike us: instead of |ψ|, we can have log(ψ) or log|ψ|, and so on. We can come up with our own concept of a penalty or barrier. Basically, we add to the cost and ensure that ψ never becomes negative; if ψ becomes negative, the penalty term becomes huge and the iterations automatically correct the solution so that the constraint on ψ is satisfied.
In Eq. 6.222, R is the set of violated constraints, and we eventually ensure that they are not violated. Instead of checking each time whether the constraints are violated or not, we make them a part of the objective function and set up an infinite barrier penalty so that they are not violated when convergence is eventually reached.
To start with, we assign small values to the Ps so that convergence is fast. We get some solution and then keep increasing the values of the Ps till the solution proceeds in such a way that the ψs are not violated too much. A stage will come when the ψs are satisfied more or less exactly, and regardless of the values of the Ps, we get the same value of V. That means we have reached convergence. In view of the above, it is evident that it is a lot harder to solve an optimization problem using the penalty function method, as the solution proceeds with guess values of the Ps. However, on a computer, the solution is elegant, and the penalty function method is a powerful tool for solving practical optimization problems.
Minimize A, subject to

V = πr²h = 4   (6.224)

Solution:
The composite objective function is given by

Y = 2πr² + 2πrh + P(πr²h − 4)²
clear;
clc;

a = 0.5;  b = 3.5;     % interval of uncertainty for r (m)
c = 0.1;  d = 5.0;     % interval of uncertainty for h (m); assumed range
p = 1e4;               % penalty parameter; assumed value
countmax = 50;
count = 0;
label = 0;

while label == 0

    count = count + 1;
    % 1st point selection according to Golden method
    I = b - a;
    J = d - c;
    r1 = a + I*0.618;
    h1 = c + J*0.618;
    % 2nd point selection according to Golden method
    r2 = b - I*0.618;
    h2 = d - J*0.618;

    % in the first pass, fix h at the middle
    % of its range.
    if count == 1
        h = (c + d)/2;
    end
    % objective function according to 1st point
    % of r for fixed h
    y1 = 2*pi*r1^2 + 2*pi*r1*h + p*(pi*r1^2*h - 4)^2;

    % objective function according to 2nd point
    % of r for fixed h
    y2 = 2*pi*r2^2 + 2*pi*r2*h + p*(pi*r2^2*h - 4)^2;
    if y1 > y2
        b = r1;
        r = r2;     % retain the better of the two points
    else
        a = r2;
        r = r1;
    end
    % objective function according to 1st point
    % of h for fixed r
    y3 = 2*pi*r^2 + 2*pi*r*h1 + p*(pi*r^2*h1 - 4)^2;

    % objective function according to 2nd point
    % of h for fixed r
    y4 = 2*pi*r^2 + 2*pi*r*h2 + p*(pi*r^2*h2 - 4)^2;
    if y3 > y4
        d = h1;
        h = h2;     % retain the better of the two points
    else
        c = h2;
        h = h1;
    end
    A = 2*pi*r^2 + 2*pi*r*h;
    if count == countmax

        label = 1;
    end

    % Print
    prt = ['Itr = ', num2str(count), ...
           ', y1 = ', num2str(y1), ...
           ', y2 = ', num2str(y2), ...
           ', A = ', num2str(A), ...
           ', r1 = ', num2str(r1), ...
           ', r2 = ', num2str(r2), ...
           ', h1 = ', num2str(h1), ...
           ', h2 = ', num2str(h2)];
    disp(prt)

end
6.7 Multi-objective Optimization
So far in this book, we have looked at optimization problems with a single objective. However, often in engineering, as in life, multiple objectives are present, and these are invariably in conflict with each other. Consider the problem of choosing a personal car. There are many objectives, but comfort and cost often hold the key. Unfortunately, these are orthogonal, as can be seen from Fig. 6.18.
From Fig. 6.18, it can be seen that multiple solutions exist and we cannot minimize
cost and maximize comfort simultaneously. Some higher order information will be
required to choose the most desirable car. This, for example, may be weightage for
cost and weightage for comfort. Let us now consider a problem in thermal engineering
involving heat exchangers. The two key objectives in a heat exchanger are

1. Heat transfer, Q
2. Pressure drop, ΔP.

The goal would often be to maximize Q and minimize ΔP. However, unfortunately, these two objectives are in conflict with each other. If one tries to increase the velocity so as to increase the heat transfer coefficient and thereby increase Q, the ΔP will simultaneously increase. Therefore, an optimization procedure would typically yield a lot of solutions, called Pareto-optimal solutions, instead of just a single solution. Pareto-optimal solutions are those in which no solution on the front is better than the others, as an automatic degradation of one objective results if we try to improve the other objective. Stated more explicitly, a unique optimum does not exist. These fronts typically look like what is shown in Fig. 6.19.
The abscissa of the plot is ΔP and the ordinate is 1/Q. It is apparent that we are seeking a minimum of both these objectives. However, from the Pareto plot, it is seen that it is impossible to get a solution where both the quantities are minimum simultaneously. Each point on the Pareto front actually represents the optimal solution with
Solution:
Let us write a composite objective function in the following form.
∂ y/∂x1 = 2 (6.235)
∂ y/∂x2 = 4 (6.236)
dy/dα = 0; (6.239)
α = −0.5 (6.240)
Iteration 1:
dy/dα = 0; (6.251)
α = −0.5 (6.252)
Iteration 1:
∂ y/∂x1 = 3 (6.259)
∂ y/∂x2 = 4 (6.260)
dy/dα = 0; (6.263)
α = −0.5 (6.264)
Iteration 2:
Iteration 1:
dy/dα = 0; (6.275)
α = −0.5 (6.276)
(v) γ = 1, (1 − γ) = 0

y = x1² + x2²

Let us start with (2, 2) as the initial guess value.
Iteration 1:
∂ y/∂x1 = 4 (6.283)
∂ y/∂x2 = 4 (6.284)
dy/dα = 0; (6.287)
α = −0.5 (6.288)
The optima for the five Cases (i)–(v) are shown in Table 6.10.
The method seen above has actually not treated the multi-objective problem in its full strength. In fact, the multi-objective problem has been converted into an equivalent single objective problem through the introduction of weights. While the introduction of weights seems intellectually appealing, one is often confronted with conflicting multiple objectives that have different dimensions and units.
In view of this, if the equivalent weighted single objective function is to be meaningful, all the objectives have to be scaled with respect to their extrema. This is tantamount to solving n single objective problems, where n is the number of objectives. Over and above this, the choice of weights is entirely subjective, which can cut both ways if one is looking for an engineering solution to the multi-objective problem.
In the light of the foregoing discussion, it is quite clear that the optimization community would have developed better methods to solve multi-objective problems. One such powerful method is TOPSIS (the Technique for Order of Preference by Similarity to Ideal Solution), a multi-criteria decision analysis method. It was first developed by Ching-Lai Hwang and Yoon in 1981 and was further developed by Yoon in 1987 and Hwang et al. in 1993.
The algorithm for this method is as follows:
• Create a matrix containing m different alternatives and n objectives. Each element
ai j represents the value of ith alternative for jth objective, where i = 1, 2, . . . , m
and j = 1, 2, . . . , n. The matrix can be represented as (ai j )m×n .
• The elements of the matrix are normalized as follows:

an,ij = aij / √[Σ(k=1 to m) akj²]   (6.293)

i = 1, 2, . . . , m and j = 1, 2, . . . , n
• Use the pre-decided weights for the objectives and create a weight normalized decision matrix. The weights must be chosen such that Σ(j=1 to n) wj = 1. The weight normalized decision matrix is calculated as

awn,ij = an,ij · wj   (6.294)
i = 1, 2, . . . , m and j = 1, 2, . . . , n
• Determine the best alternative, i.e., the positive ideal solution (Ij+), and the worst alternative, i.e., the negative ideal solution (Ij−), with respect to each objective.
• Calculate the Euclidean distances from the positive ideal solution (di+) and the negative ideal solution (di−):

di+ = √[Σ(j=1 to n) (awn,ij − Ij+)²]   (6.295)

di− = √[Σ(j=1 to n) (awn,ij − Ij−)²]   (6.296)

• Calculate the relative closeness to the positive ideal solution:

Di+ = di− / (di− + di+)   (6.297)

• Rank the alternatives with respect to Di+ such that the alternative with the highest Di+ value is ranked the best.
The above algorithm will become clear after solving Example 6.13.
Example 6.13 Consider a thermal system, whose heat dissipation rate is given
by Q = 2.5 + 6.2v 0.8 , where Q is in kW and v is the velocity in m/s of
the fluid being used as the medium for accomplishing the heat transfer. The
accompanying pumping power is given by P = 1.3 + 0.04v 1.8 , again in kW
with v in m/s ( in both the expressions, the constants ensure that both Q and
P are in kW). It is desired to maximize Q and minimize P. Solve this multi-
objective optimization problem using the TOPSIS method.
Solution:
Given: Q = 2.5 + 6.2v^0.8 kW, P = 1.3 + 0.04v^1.8 kW, with 3 ≤ v ≤ 12 m/s.
This is a two objective, one variable problem. From Fig. 6.20, one can observe that
the two objectives are conflicting with each other. The TOPSIS algorithm described
above can be used to solve this multi-objective optimization problem.
Select some values for velocity in the given range. Evaluate the values of P and
Q for all the selected values of velocity. The values of P and Q are normalized to
Pn and Qn based on Eq. 6.293 (Fig. 6.20 plots the heat dissipation rate Q and the pumping power P against the velocity v). For the values given in Table 6.11, √(ΣPi²) = 9.9301 and √(ΣQi²) = 109.2875. Therefore, we get the values of Pn and Qn by dividing P and Q by 9.9301 and 109.2875, respectively.
Let us give equal weightage to both the objectives, i.e., w p = 0.5 for minimizing
P and wq = 0.5 for maximizing Q. We can now evaluate the weight normalized
values Pwn and Q wn using Eq. 6.294. The values are shown in Table 6.11.
The positive ideal solution for this multi-objective problem is given by the maxi-
mum value of Q wn , i.e., 0.218 and minimum value of Pwn , i.e., 0.080. The negative
ideal solution is given by the minimum value of Q wn , i.e., 0.079 and maximum value
of Pwn , i.e., 0.241. Therefore, the distances (di+ and di− ) of a point (P, Q) from the
positive ideal solution (0.080, 0.218) and negative ideal solution (0.241, 0.079) are
calculated as shown in Table 6.11.
The proximity to the positive ideal solution (D + ) for all the points is calculated
and it is observed that D + = 0.588 is the maximum and it is obtained for v = 7 m/s.
Therefore, the solution to this multi-objective problem using the TOPSIS method
is
v+ = 7 m/s
P+ = 2.628 kW
Q+ = 31.908 kW
The above solution is obtained when both the objectives are given equal weigh-
tage, i.e., (w p = wq = 0.5). The solution changes if both the objectives are given a
different weightage as shown in Table 6.12.
Additionally, the TOPSIS method requires specification of weights which is quite
intuitive from an engineering perspective. If we look at Row 1 from Table 6.12, it
corresponds to the highest velocity which results in the highest heat dissipation (Q)
and the highest pumping power (P). Whereas, if we look at Row 5 from Table 6.12,
it corresponds to the lowest velocity which results in the lowest heat dissipation (Q)
and the lowest pumping power (P).
Hence, the TOPSIS method has helped us arrive at a compromise, where there is a penalty involved in deviating from the solution corresponding to maximum Q and minimum P. This is done by minimizing the distance from the positive ideal solution.
MATLAB code for Example 6.13 is given below.
clear;
clc;

v1 = 3;                      % lower limit of velocity (m/s)
v2 = 12;                     % upper limit of velocity (m/s)

dv = 1;                      % velocity step
n = (v2 - v1)/dv + 1;

v = linspace(v1, v2, n);

P = 1.3 + 0.04.*v.^1.8;      % pumping power (kW)
Q = 2.5 + 6.2.*v.^0.8;       % heat dissipation rate (kW)

wp = 0.5;                    % weight for minimizing P
wq = 1 - wp;                 % weight for maximizing Q

Pn = P./sqrt(sum(P.^2));     % normalized objectives (Eq. 6.293)
Qn = Q./sqrt(sum(Q.^2));

Pwn = Pn*wp;                 % weight normalized matrix (Eq. 6.294)
Qwn = Qn*wq;

p_best = min(Pwn);
p_worst = max(Pwn);
q_best = max(Qwn);
q_worst = min(Qwn);

% distances to the positive and negative ideal solutions (Eqs. 6.295-6.296)
dplus  = sqrt((Pwn - p_best).^2  + (Qwn - q_best).^2);
dminus = sqrt((Pwn - p_worst).^2 + (Qwn - q_worst).^2);
D = dminus./(dminus + dplus);        % relative closeness (Eq. 6.297)
[~, i] = sort(D, 'descend');         % rank the alternatives, best first

% Print
prt = ['velocity (v) = ', num2str(v(i(1))), ...
       ', Heat dissipation rate (Q) = ', num2str(Q(i(1))), ...
       ', Pumping power = ', num2str(P(i(1)))];
disp(prt)
Problems
6.1 Determine the minimum of the function y = x² − [(40x² + 1)/x] + 6 in the interval 5 ≤ x ≤ 45 using the Fibonacci search method. The required final uncertainty in x should be less than 1.
6.9 Revisit Problem 6.5. (i) Develop a composite objective function, which is to be maximized, for this multi-objective optimization problem using the weighted sum approach and with the help of dimensionless objective functions (i.e., Q/Qmax and Pmin/P), along with a weighting factor γ. (For convenience, you may want to use 1/P in the composite objective function.) (ii) For γ = 0, 0.5, and 1, solve the multi-objective problem with the Golden section search (single variable in velocity, v), wherever the solution is not obvious from common sense. Take the initial interval of uncertainty as 3 ≤ v ≤ 12 m/s. A level of uncertainty of 0.5 m/s on the velocity is required.
Chapter 7
Linear Programming and Dynamic
Programming
7.1 Introduction
By now, we have more or less seen the techniques that are applicable to problems frequently encountered in thermal sciences. We now look at two techniques that are not so frequently used in thermal sciences, namely (i) linear programming (LP) and (ii) dynamic programming. Linear programming is a very important technique used in areas like management science, operations research, and sociology. Yet, there are some problems in engineering that can be eminently handled using linear programming or dynamic programming, and it would be instructive to know how they work.
An LP problem is one in which both the objective function and all the constraints can be represented as linear combinations of the variables. So we will not encounter sinh(x), tanh(x), or e^x terms in either the objective function or the constraints. Obviously, we can now see why LP is not frequently used in heat transfer, fluid mechanics, and thermal sciences, as there are only very few situations where both the objective function and the constraints can be written as linear combinations of the variables. But when there is such a situation, there is a body of knowledge available that can be used, instead of relying on the Lagrange multiplier method, the penalty function method, or a conventional search technique.
LP was first used in World War II for the optimal allocation of men, ammunition, aircraft, and artillery so as to maximize the effectiveness of a strategy. It was first tried by the Allied forces, and its origins can be traced to the UK in the 1930s and 1940s. Subsequently, the subject of operations research, in which LP occupies a pride of place, has become a very mature field in its own right.
LP is also used in sociology and industrial engineering but has limited applications
in engineering. Some applications in mechanical engineering where this technique
can be applied are the optimal allocation of jobs to a special purpose machine like a
CNC lathe, optimal use of labor, optimal use of several machines on a shop floor, and
so on. There is a total machine availability and there are cost and time allocations
that have to be done.
We also constantly do this linear programming in our minds. Students often try to
maximize the pleasure such that their CGPA does not fall below say 7.5, when CGPA
is written as a constraint. Alternatively, students may try to maximize the CGPA
subject to minimum effort or maximize the CGPA subject to minimum pleasure.
By nature, everybody wants to optimize! Optimization lies not only at the root of
engineering but at the root of life itself. Consider the example of an optimal product
mix in a petroleum refinery. Here, we have raw material costs, refining costs, and
selling price. With the goal being maximizing profit, the challenge before an analyst
is to decide whether everything we want to sell is only petrol or diesel or kerosene
or a mix of these. This is a classic LP problem in which the objective function
and constraints arising due to mass balance, refining capacities, uneven demand for
different products, transportation costs can all be written as linear combinations of
the variables under consideration.
y = C1x1 + C2x2 + C3x3 + · · · + Cnxn   (7.1)

Here, x1, x2, x3, . . . , xn are all positive. Therefore, we have to write out extra constraints stating that none of these variables can become negative; these are called the non-negativity constraints. If we solve the resultant system of equations, we obtain the optimal values of x1 to xn at which y is maximized or minimized, as the case may be. C1 to Cn and a11 to ajn are all coefficients that are known upfront. So when we get the optimal values of x1 to xn, we can determine the value of y+ right away.
Immediately, some of us may feel that the Lagrange multiplier method can be used with the Kuhn–Tucker conditions for solving the system represented by Eqs. 7.1–7.5. But the catch is: if it were possible to solve such a system using the Lagrange multiplier method and the Kuhn–Tucker conditions, why then were special optimization techniques like LP developed? There must have been some special reasons. What are these?
There are some key differences between this problem and the Lagrange multiplier formulation. Please remember that though we handled inequalities (it was just one or 2), most of the constraints handled through the Lagrange multipliers were equalities. If everything is an inequality, the Kuhn–Tucker conditions get very messy. So, essentially, the Lagrange multiplier method is used for solving problems in which we have primarily equality constraints. For one or a few inequality constraints, the KTC formulation is very advantageous. LP, on the other hand, is basically designed for handling inequality constraints.
This is possible because inequalities only reduce the feasible domain; they prohibit
certain solutions. Equality constraints, on the other hand, have to be exactly obeyed,
which is why an equality constraint is a lot more restrictive than an inequality
constraint. Therefore, in the Lagrange multiplier method, if "m" is the number of
equality constraints and "n" is the number of variables, then m ≤ n. The advantage
of the Lagrange multiplier method is that the objective function can be as complicated
as it gets and need not be a simple linear function as in this case. In LP, by contrast,
we can have any number of inequality constraints.
We now look at the graphical method of solving an LP problem. Though the graphical
method has limited scope as only two-variable problems can be handled, all the
features of an LP problem can be elegantly brought out using the graphical solution
itself.
Example 7.1 A furniture company can make tables or chairs or both. The
amount of wood required for making one table is 30 kg while that for a chair
is 18 kg. The total quantity of wood available is 300 kg. The labor required for
making a table is 20 man hours, while that for a chair is 8 man hours. The
total number of man hours available is 160 (say for one day). The profit from
a table is Rs. 1500, while that from the chair is Rs. 800. Determine the optimal
product mix of the company using the graphical method of LP.
Solution
First we formulate the LP problem and then plot the constraints. Let x1 be the number
of tables and x2 be the number of chairs. The objective is to maximize the profit

Maximize y = 1500x1 + 800x2 (7.6)

Subject to:
(i) Material constraint:

30x1 + 18x2 ≤ 300, i.e., 3x1 + 1.8x2 ≤ 30 (7.7)

It is not that after making x1 tables and x2 chairs, the 300 kg of wood needs to be
exhausted; there could be some left over too, which is why we use the "less than or
equal to" sign rather than the equality sign. (In this sense, it is different from the mass
balance or energy balance equation in the thermal sciences.) There can be residual
wood at the end of the day.
(ii) Labor constraint:

20x1 + 8x2 ≤ 160, i.e., 5x1 + 2x2 ≤ 40 (7.8)

(iii) Non-negativity constraints:

x1 ≥ 0 and x2 ≥ 0 (7.9)
We first plot the labor constraint 5x1 + 2x2 ≤ 40. When x1 = 0, x2 = 20; when
x2 = 0, x1 = 8. So we get a straight line that satisfies the equation 5x1 + 2x2 = 40.
The feasible region lies below this line. Since x1 ≥ 0 and x2 ≥ 0, we insert arrows
pointing inwards as shown in Fig. 7.1. So the solution can only be in the first quadrant
and these constraints only reduce the feasible region. Let us now plot the material
constraint given by 3x1 + 1.8x2 = 30. The feasible region is indicated by the shaded
area in Fig. 7.2. Any point within the feasible region will satisfy all the constraints.
But each point will result in a different value of y. So first, we have identified the
region in which the maximum of y must lie, because the optimum should, first of all,
be a feasible solution. Getting to ymax is not so straightforward, though. We have to
assume some value of y, draw the straight line representing the profit function
y = 1500x1 + 800x2, and keep moving this line parallel to itself till it is just about to
leave the feasible region. The point(s) it last touches in the feasible region is/are the
optimal solution(s) to the problem. So let us take one iso-objective
line y = 1500x1 + 800x2 = 8000. To plot this line, we need two points. These are
(5.33, 0) and (0, 10). This line is shown in Fig. 7.2. Any point on this line will give
combinations of x1 and x2 such that y = Rs. 8000.
We can have x1 = x2 = 0, where we have all the material and labor but make
nothing; this is the trivial solution. Without the constraints, the iso-objective line
could be moved out indefinitely, the optimum would be unbounded, and both x1 and
x2 (and hence y) would go to ∞. This simply tells us that the constraints must be
respected: the solution has to be found within the feasible region.
There is a theorem which states that even though there are infinitely many solutions
within the feasible region, the corners of the polygon formed by the coordinate axes
and the constraint lines qualify as candidates for the optimum solution. This can be
proved; one reason is that the objective function is linear and the feasible region is
convex. So we need to evaluate y only at the vertices of this polygon, which, in this
example, are 4 in number.
• The first solution (0, 0) is trivial and can be discarded.
• The vertex on the x2 axis (0, 16) indicates that all objects made are chairs.
• The vertex on the x1 axis (8, 0) corresponds to only tables being made and no
chairs being produced.
• The fourth point is a mix of tables and chairs.
Each of these will result in a value of y and now we can determine the product mix
that results in the maximum value for y. The y values for the 4 vertices are given in
the accompanying table (Table 7.1).
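Since the optimum must occur at a vertex, the whole graphical search can be compressed into a few lines of computation. A minimal MATLAB sketch is given below; the vertex coordinates are computed directly from the constraint intersections and are not taken from Table 7.1.

% Sketch: profit at the four vertices of the feasible polygon of
% Example 7.1. The corner (4, 10) is the intersection of
% 30x1 + 18x2 = 300 and 20x1 + 8x2 = 160; the corner on the x2 axis
% is 300/18 = 16.67 (16, if only whole chairs are counted).
V = [0 0; 0 300/18; 8 0; 4 10];   % (x1, x2) at the four corners
y = 1500*V(:,1) + 800*V(:,2);     % profit at each vertex
[ymax, i] = max(y);
fprintf('Optimum: x1 = %g, x2 = %g, y = Rs. %g\n', V(i,1), V(i,2), ymax)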
This is how we solve a two-variable LP problem using the graphical method. The
same problem can also be solved by introducing what are called slack variables,
which convert the inequalities into equalities.

Example 7.2 Solve the product mix problem of Example 7.1 by introducing
slack variables.

Solution
Let us consider the same problem.
Maximize y = 1500x1 + 800x2 (7.10)
Subject to:
30x1 + 18x2 + s1 = 300 (7.11)
20x1 + 8x2 + s2 = 160 (7.12)
x1, x2, s1, s2 ≥ 0 (7.13)
s1 and s2 are called slack variables and have to be positive. Why should they be
positive? Because after the optimal solution is determined, s1 tells us the amount of
wood that was available but left unused in the process, and s2 the number of man
hours left unused. Neither of them can be negative; at best, they can be 0. Hence,
they are not unrestricted in sign.
We had 2 variables to start with and have now converted the problem into a four-
variable one. How do we solve this system? With 2 equations and 4 unknowns, there
is no unique solution. One way is to set 2 variables to 0 at a time and determine the
resulting y. Since s1 and s2 also have to be positive, this is equivalent to representing
x1, x2, s1 and s2 in a four-dimensional space and then finding the feasible region. We
then use the same logic, that the iso-objective surface touches the feasible region at
one of its corner points, and that corner emerges as the optimal solution.
How many combinations are there if 2 variables are made 0 at a time? There are 6.
The number of combinations is given by n!/m!(n − m)!, where n is the total number
of variables including the slack variables and m is the number of constraints. In this
example, n = 4 and m = 2. Hence the number of combinations is 4!/(2!2!) = 6. So
we need to evaluate 6 combinations.
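This enumeration is easy to mechanize. Below is a minimal MATLAB sketch, under the constraint data of the table and chair problem: two variables are set to zero at a time, the remaining two are solved for, and solutions with negative components are discarded as infeasible.

% Sketch: enumerating the basic solutions of the slack variable form.
% Columns of A correspond to x1, x2, s1, s2.
A = [30 18 1 0; 20 8 0 1];
b = [300; 160];
pairs = nchoosek(1:4, 2);        % the 6 choices of basic variables
for i = 1:size(pairs, 1)
    cols = pairs(i, :);
    xb = A(:, cols)\b;           % solve for the two basic variables
    x = zeros(4, 1);
    x(cols) = xb;
    if all(x >= 0)               % keep only the feasible solutions
        fprintf('x = [%g %g %g %g], y = %g\n', x, 1500*x(1) + 800*x(2))
    end
end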
Example 7.3 An ethylene refining plant receives 500 kg/h of 50% pure ethy-
lene and refines it into 2 types of output 1 and 2. Type 1 has a purity of 90%
while type 2 has a purity of 70%. The raw material cost is Rs. 40/kg and the
selling price of type 1 is Rs. 200/kg, while that of type 2 is Rs. 120/kg. Pack-
aging facilities allow a maximum of 200 kg/h of type 1 and 225 kg/h of type 2.
The transportation cost of type 1 is Rs. 8/kg while that of type 2 is Rs. 16/kg.
Total transportation cost should not exceed Rs. 4000. Set up the optimization
problem to maximize the profit and solve it using the graphical method of LP.
Declare the mass flow rates of types 1 and 2 as x1 and x2 .
Solution
First we have to plot all the constraints, identify the feasible region, and plot one
iso-objective line. Then, we move the iso-objective line till it cuts the feasible region
at the farthest point because as we move away from the origin, each iso-objective
line represents a higher profit. The highest profit, subject to the constraints, is what
we are seeking.
Besides the packaging limits x1 ≤ 200 and x2 ≤ 225, there are 2 more constraints in
this problem: the mass balance and the transportation cost. The selling price for type
1 is Rs. 200/kg and that for type 2 is Rs. 120/kg, while the transportation costs are
Rs. 8/kg and Rs. 16/kg, respectively. That is why the profit in terms of x1 and x2 is
written as 192x1 + 104x2, obtained by subtracting the transportation cost from the
selling price. The raw material cost of 500 × 40 is further subtracted from this to get
the actual profit.
Mass balance: 0.9x1 + 0.7x2 ≤ 0.5 × 500 or 0.9x1 + 0.7x2 ≤ 250 (7.20)
Transportation: 8x1 + 16x2 ≤ 4000 (7.21)
Non-negativity: x1 ≥ 0 and x2 ≥ 0 (7.22)
When we first plot the non-negativity constraints (Eq. 7.22) we get the feasible region
indicated by the shaded region in Fig. 7.3.
We then plot the mass balance constraint (see Fig. 7.4) and finally plot the trans-
portation constraint (Fig. 7.5).
The shaded region gives us the final feasible region. We saw earlier that a linear
programming problem has the property that we need to search only at the vertices of
the feasible region. There are 6 vertices, A–F, as seen in Fig. 7.5. So we evaluate the
objective function at each of them.
Now we know that y is a maximum at E and this is the optimum solution to the
problem. This is how the graphical method is used to find the optima.
At the optimum, x1+ = 200 kg/h, x2+ = 100 kg/h and y+ = Rs. 28,800.
2. If the selling price of x1 goes down to Rs. 160/kg and the selling price of x2 goes
up to Rs. 140/kg and other things remain the same, what is the optimum solution?
Even in this case, the feasible region remains the same, as the constraints are
not affected, and the optimum solution will again lie at one of the vertices of the
polygon ABCDEFA. But now, the objective function has changed to
ynew = 152x1 + 124x2 − 20000.
So, the additional relaxation of Rs. 500 for the transportation constraint has no
bearing on the final solution.
As discussed earlier, the simplex method may be considered a systematic version of
the slack variable method of solving the LP problem. In real-world problems in
several fields such as the management sciences, operations research, sociology, and
often thermal engineering, the number of unknown variables is far more than two.
The graphical method cannot be used in such scenarios. In the graphical method, we
demonstrated that one of the corner points of the feasible region is the optimum
solution. Consider, then, the general problem: maximize
Y = C1 x1 + C2 x2 + · · · + Cn xn (7.25)
Subject to constraints which, with the addition of slack variables, are converted from
inequalities such as
x1 ≤ 8 (7.30)
into equalities such as
x1 + s1 = 8 (7.31)
In the first equation, x1 is the basic variable and in the latter, x2 is the basic variable.
These equations are said to be in canonical form with the basic variables x1 and x2 .
The basic solution is given by
x3 = x4 = x5 = 0
x1 = 6
x2 = 2
Example 7.4 Solve the following optimization problem using the simplex
method.
Maximize,
Y = 5x1 + 2x2 + 3x3 − x4 + x5
Subject to,
x1 + 2x2 + 2x3 + x4 = 8
3x1 + 4x2 + x3 + x5 = 7
x1 , x2 , x3 , x4 , x5 ≥ 0
Solution
Here, we can observe that the constraints are in their canonical form, eliminating the
need for pivot operations. There x4 and x5 are the basic variables in Eqs. 1 and 2,
x2 = x3 = 0 (7.36)
x4 = 8 (7.37)
x5 = 7 (7.38)
Y = −8 + 7 = −1 (7.39)
Now, let us increase x1 from 0 to 1, keeping x2 = x3 = 0. Then
x4 = 8 − x1 = 8 − 1 = 7 (7.40)
and,
x5 = 7 − 3x1 = 7 − 3 = 4 (7.41)
So,
Ynew = 5 − 7 + 4 = 2 (7.42)
Ynew − Y = 2 − (−1) = 3 (7.43)
Therefore, the relative profit of x1 is 3, which means that, by increasing the variable
x1 by one unit, the total change in the profit is 3 units. The question now to ask is by
how much can x1 increase? x1 can be increased to the extent that it does not violate
the two constraints. So taking both the constraints into consideration, we have
x1 + x4 = 8 (7.44)
3x1 + x5 = 7 (7.45)
In the first constraint, the maximum value that x1 can take is 8, and in the second one
the maximum value is 7/3. Thus, we take the minimum of the two, x1 = 7/3. The
minimum is chosen so that the non-negativity constraints are not violated. If x1 were
taken to be 8, x5 would be equal to −17, which violates non-negativity. This
rule is called the minimum ratio rule. Thus,
x2 = x3 = x5 = 0 (7.46)
x4 = 8 − 7/3 = 17/3 (7.47)
Y = 5 × 7/3 − 17/3 + 0 = 6 (7.48)
Now the constraints are as follows.
This table is known as the Simplex Tableau. C̄ is the relative profit, which is given by
the following equation:
C̄j = Cj − (inner product of CB and the column vector of xj in the canonical system)
C̄1 = 5 − (−1 1)(1 3)T = 3 (7.49)
C̄2 = 2 − (−1 1)(2 4)T = 0 (7.50)
C̄3 = 3 − (−1 1)(2 1)T = 4 (7.51)
C̄4 = −1 − (−1 1)(1 0)T = 0 (7.52)
C̄5 = 1 − (−1 1)(0 1)T = 0 (7.53)
Here, C j is the coefficient of the variables in the objective function. We can observe
that x3 has the highest relative profit. This implies that by increasing the value of x3
by one unit, the objective function increases by 4 units, which is the highest among
all the variables. So, x3 is chosen to be the basic variable. Thereafter, to check to
what extent x3 can be incremented and to find the variable which would leave the
basis, the minimum ratio rule is applied as follows (Table 7.4).
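The relative profit row can also be verified in a single matrix operation. The sketch below assumes the canonical constraint matrix and cost vector of Example 7.4; c_bar reproduces Eqs. 7.49–7.53 in one stroke.

% Sketch: relative profits, c_bar = c - CB*(canonical columns)
Acan = [1 2 2 1 0; 3 4 1 0 1];   % canonical constraint coefficients
c = [5 2 3 -1 1];                % objective coefficients
CB = [-1 1];                     % coefficients of the basic x4, x5
c_bar = c - CB*Acan              % gives [3 0 4 0 0]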
Here, the constants of the equations are divided by the corresponding coefficients
of x3. In Table 7.5, Row 1 has the minimum ratio; thus x4 exits and x3 enters the
basis column. Now the constraints are to be converted into canonical form with
the basic variables (x3, x5), for which the following pivot operations are done:
R1 ⇒ R1/2
R2 ⇒ R2 − R1/2
The constraints in the canonical form then become
(1/2)x1 + x2 + x3 + (1/2)x4 = 4 (7.54)
(5/2)x1 + 3x2 − (1/2)x4 + x5 = 3 (7.55)
x1 = x2 = x4 = 0
x3 = 4, x5 = 3 and Y = 15
In the second iteration, it can be seen that the maximum relative profit is for the vari-
able x1 . Hence, x1 enters the basis. To determine which variable leaves the solution,
the minimum ratio rule is again applied. It can be observed that variable x5 leaves
the basis (Tables 7.6, 7.7).
The next step is to carry out pivot operations to convert the equation in the canon-
ical form with the basic variables (x3 , x1 ).
R2 ⇒ (2/5) × R2
R1 ⇒ R1 − R2/5
In Table 7.8, the elements of the C̄ row are all either 0 or negative, thus confirming
no further improvement is possible in this problem.
Final solution:
x2 = x4 = x5 = 0,
x1+ = 6/5, x3+ = 17/5 and Y + = 81/5
Example 7.5 A furniture company can make tables or chairs or both. The
amount of wood required for making one table is 30 kg while that for a chair
is 18 kg. The total quantity of wood available is 300 kg. The labor required for
making a table is 20 man hours, while that for a chair is 8 man hours. The
total number of man hours available is 160 (say for one day). The profit from
a table is Rs. 1500, while that from the chair is Rs. 800. Determine the optimal
product mix of the company using the simplex method.
Solution
We now solve the above problem using the simplex method. The slack variables are
introduced as discussed in Example 7.2 to convert the inequalities into equalities:
Maximize y = 1500x1 + 800x2
Subject to:
30x1 + 18x2 + s1 = 300
20x1 + 8x2 + s2 = 160
x1, x2, s1, s2 ≥ 0
s1 and s2 are slack variables and have to be positive, as discussed in the previous
example. The next step is to identify the basic variables. s1 and s2 are the basic
variables because each has a unit coefficient in one equation and a zero coefficient in
the other. Let us now formulate the simplex tableau.
We can observe that the relative profit of x1 is the highest of all the variables.
Hence, x1 enters the basis column. The next step is to find the variable which
leaves the basis, for which the following table is made (Tables 7.9–7.11).
Here, s2 has the minimum ratio, and hence it leaves the basis column. The next
step is to convert the equations into their canonical form with basic variables (s1 , x1 ).
R2 ⇒ R2/20
R1 ⇒ R1 − (3/2) × R2
After calculating the relative profit for each of the variables, we can see that the
variable x2 has the maximum value. In the eighth column of the table, ratios are
calculated and we can see that the minimum is for s1 . Hence, s1 leaves and x2 enters
the basis column. The next step is to convert the equations into their canonical form
with basic variables (x2 , x1 ) (Table 7.12).
After the second iteration, we can see that the relative profits of all the variables
are smaller than or equal to zero. This implies that increasing the value of any
variable by one unit would not increase the value of the objective function.
Hence, we have arrived at the optimum value of the objective function.
The final solution is x1+ = 4, x2+ = 10 and Y+ = Rs. 14,000.
Minimization Problem:
In the case of a minimization problem, we can follow the same algorithm with just
one modification: in the C̄ row, the variable with the most negative relative profit is
chosen, and the iterations are stopped when all the values are greater than or equal
to zero. An alternative approach is to convert the minimization into a maximization
problem by multiplying the objective function by minus one.
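For completeness, the product mix problem of Example 7.5 can be cross-checked with MATLAB's linprog, assuming the Optimization Toolbox is available. Since linprog minimizes, the profit coefficients are negated.

% Sketch: Example 7.5 via linprog (Optimization Toolbox assumed)
f = -[1500; 800];      % negated profits per table and per chair
A = [30 18; 20 8];     % wood (kg) and labor (man hours) per unit
b = [300; 160];        % wood and labor available
lb = [0; 0];           % non-negativity constraints
[x, fval] = linprog(f, A, b, [], [], lb, []);
fprintf('x1 = %g, x2 = %g, Y = Rs. %g\n', x(1), x(2), -fval)
% Expected: x1 = 4 tables, x2 = 10 chairs, Y = Rs. 14000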
Optimization problems in which some or all of the variables are restricted to
integer (or discrete) values are commonly referred to as pure integer programming (IP)
or mixed-integer problems. IP problems are inherently nonlinear, since the
functions of the problem are defined only at the discrete values of the variables.
However, for the purpose of developing a solution procedure, if the problem that
results upon removing the integer restrictions on the variables concerned is an LP,
we can treat the IP problem as a linear one; otherwise, it is classified as a nonlinear
problem.
Most IP algorithms are based on the continuous version of the IP model; IP
algorithms not based on the continuous version are generally known to be
unpredictable.
There are mainly two methods to solve IP problems.
1. Cutting plane methods:
We first solve the continuous version of the problem and then add special
"secondary" constraints, which represent necessary conditions for the variables
to be integers. In this way, the continuous solution space is gradually modified
till its optimum extreme points satisfy the integer conditions. The added
constraints cut off portions of the solution space that do not contain feasible
integer points.
2. Searching methods:
These are based on the simple idea of enumerating all the feasible integer
points. The idea is to develop "clever tests" that consider only a "small"
portion of the feasible integer points explicitly but automatically account for
the remaining points implicitly.
From Fig. 7.6, it is seen that OABO is the feasible domain, and evaluating the
objective function at the corner points shows that the solution to the linear
programming problem is x1+ = 0 and x2+ = 8/3. However, x2+ is not an integer
(Table 7.13).
[Figs. 7.6 and 7.7: the feasible domain OABO in the (x1, x2) plane, showing the cuts x2 = 2 and x1 = 1]
The nearest integer value to x2 is 2. So, we draw the line x2 = 2 on the graph, which
cuts the domain into two parts. We now arrive at an integer solution with x1 = 0
and x2 = 2, for which Y = 10. But this need not be the maximum value of the
objective function. Now, we check what other integer values x1 can take without
leaving the feasible domain (Fig. 7.7).
At (1, 2): (3 × 1) + (6 × 2) = 3 + 12 = 15 ≤ 16
At (2, 2): (3 × 2) + (6 × 2) = 6 + 12 = 18 > 16
Here, the constraint is violated at (2, 2). Thus, the optimum value of the objective
function is 12, attained at (1, 2).
Example 7.6 Solve the following integer programming problem using Gomory’s
fractional cut algorithm.
Maximize
Y = 3x1 + 5x2
subject to,
x1 + 4x2 ≤ 9
2x1 + 3x2 ≤ 11
x1 , x2 ≥ 0
Solution
x1 + 4x2 + x3 = 9 (7.71)
2x1 + 3x2 + x4 = 11 (7.72)
Solving this LP by the simplex method gives x1+ = 17/5 and x2+ = 7/5. Since x1+
and x2+ are not integers, the above is not an acceptable solution, and so we need to
work on this problem further.
x2 + (2/5)x3 − (1/5)x4 = 7/5 (7.73)
Writing each coefficient as an integer plus a positive fraction,
(0 + 1)x2 + (0 + 2/5)x3 + (−1 + 4/5)x4 = 1 + 2/5 (7.74)
(2/5)x3 + (4/5)x4 = 2/5 + (1 − x2 + x4) (7.75)
For the above to be true, the following arguments are in order. Since x2, x3 and x4
are all integers, (1 − x2 + x4) is an integer, and so the left hand side (2/5)x3 + (4/5)x4
must equal 2/5 plus an integer. But x3 and x4 are non-negative, so the left hand side
cannot be negative, and the smallest value it can take is 2/5 itself. Hence, to satisfy
the integer requirement, the following additional condition is proposed
(Tables 7.14–7.17).
(2/5)x3 + (4/5)x4 ≥ 2/5 (7.76)
−(2/5)x3 − (4/5)x4 ≤ −2/5 (7.77)
−(2/5)x3 − (4/5)x4 + x5 = −2/5 (7.78)
Please note that in the above tableau the "cut" is implemented as a constraint in
Row 3.
x3 : (−1/5)/(−2/5) = 1/2 (7.79)
x4 : (−7/5)/(−4/5) = 7/4 (7.80)
We now continue with the same approach as in the simplex method. The final
tableau is Table 7.18, from which it is seen that all the entries in the C̄ row are
either negative or zero, implying that no further improvement in the objective
function is possible.
Furthermore, x1+ and x2+ are now integers, thereby satisfying the integer constraint.
Hence, the final solution is x1+ = 4, x2+ = 1 and Y+ = 17.
Example 7.7 Determine the cheapest way of transporting goods from city
A to city B. The constraint is that the path must pass through one of the three
C's (C1, C2, C3), one of the three D's, and one of the three E's. All the costs are
given in Fig. 7.8. Using DP, determine the optimal path that minimizes the
total cost.
Solution
Fig. 7.8 Paths represented by nodes and associated costs for Example 7.7
There are in all 27 paths available. We have to pass through one of the nodes at each
level. The point to bear in mind is that if we are able to calculate the costs of all
27 paths comfortably and cheaply, there is no need for dynamic programming. But
imagine that instead of 3 choices, we have 6 or 7 choices at each level, and instead of
3 intermediate levels, there are 7 or 8 intermediate stages. If the objective function
also involves formidable calculations, then it is worthwhile to develop an algorithm
for this. The problem has to be subdivided into stages.
Cost from A to D:
We have to find out the cost from A to any D first, because the final optimal path
must pass through either D1 or D2 or D3 . First, we get the total cost to reach any
D from A. There are 9 paths to reach any D, and we tabulate them as shown in
Table 7.19. A path from A to any D passes through one of the three Cs. There are
3 choices for C and 3 choices for D, so there are 9 possibilities: A − C1 − D1,
A − C1 − D2, A − C1 − D3, A − C2 − D1, A − C2 − D2, A − C2 − D3, A − C3 − D1,
A − C3 − D2, A − C3 − D3. We evaluate these 9 entries.
Now the optimal paths to D1, D2 and D3 are all indicated by boldface. At this stage,
it is premature to look at the minimum of these three and proceed from there alone,
because we do not know what the story from D to E is. So we should not force
premature convergence by going in for a local optimum. But one thing that is sure is
that if we have to reach the end, it has to be through a D. So, to reach each of these
Ds, we first figure out the optimum path. Once we determine the optimal path to
each of these Ds, when we go from D to E, we use these optimal paths alone for the
calculation; suboptimal paths are left out. This is the way a complex problem is
broken down into stages.
Up to the calculation of these 9 entries, no dynamic programming was involved. The
moment we note the lowest entry in every column (boldface), we have begun using
the algorithm. Two things are to be noted: we are (i) looking at the minimum cost to
each of D1, D2 and D3, (ii) but not picking the single minimum in the table and
discarding the other 8 entries, which may result in a suboptimal solution. These are
the 2 cardinal points of dynamic programming.
Cost from A to E:
How do we reach E1, E2 or E3? It has to be either through D1, D2 or D3. When we
go from D1 to E1, paths like A − C2 − D1 − E1 or A − C3 − D1 − E1 are not
evaluated because we have already eliminated them; if we calculated those again, it
would become an exhaustive search. Up to D1, anyway, we know what the optimal
path is!
path is! We again indicate with boldface, the lowest in every column of the Table
as shown in Table 7.20. Through E 1 , if we want to calculate the total cost to B, it
is 250. Through E 2 it is 240. Through E 3 , it is 290. Therefore, the optimal path is
A − C1 − D1 − E 2 − B.
The control variables in the above problem were basically the cost associated with
each stage. But in the cricket match, there are two control variables: the number of
wickets and the number of overs. So it is a lot more difficult and so Duckworth and
Lewis must have written a program and solved it. So in principle, one can have any
number of variables. Here, it is simple with just one variable.
Computational gain:
What is the big deal? How many evaluations did we do? We did 9 at the first
stage, 9 more at the second and 3 at the last stage, a total of 21 evaluations.
An exhaustive search would have required 27 evaluations. So the saving is 6
evaluations out of 27, a reduction of about 22.2% in our effort, because we applied
dynamic programming.
At each and every stage, we identify the optimal solutions, and in the subsequent
stage, we carry forward only the optimal solutions left behind by the previous stage;
there is no distinction between the start and the finish. We could have started from
B and found the optimal paths from B to E, B to D, B to C, and B to A; we would get
the same solution. It does not matter whether we start from the left or the right. At
each and every stage, we proceed with whatever optima we get and discard the
suboptimal solutions at that stage. But we do not look at the overall minimum and
discard all the others, as that may actually mislead us.
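The stagewise logic described above is compact enough to code directly. The following is a minimal MATLAB sketch with hypothetical cost matrices (the actual values are those of Fig. 7.8 and are not reproduced here); it assumes implicit expansion, available in MATLAB R2016b and later. Only the stage optima are carried forward, exactly as in the tabular calculations.

% Sketch: stagewise DP for the A-C-D-E-B path problem.
% All cost values below are hypothetical placeholders.
cAC = [4 6 5];                 % costs A -> C1, C2, C3
cCD = [7 5 8; 6 9 4; 5 7 6];   % costs Ci -> Dj (rows: C, columns: D)
cDE = [6 8 5; 7 4 9; 8 6 7];   % costs Di -> Ej
cEB = [5 7 6];                 % costs Ei -> B
bestD = min(cAC(:) + cCD, [], 1);   % cheapest cost from A to each D
bestE = min(bestD(:) + cDE, [], 1); % cheapest cost from A to each E
total = min(bestE + cEB);           % cheapest cost from A to B
fprintf('Minimum total cost = %d\n', total)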
The example we considered above closely resembles the classical "stagecoach
problem". This problem is said to originate from a traveling salesman who moves
from one small town A to another small town B through several intermediate towns.
His objective was not to pay more than what is necessary for transport; between
small towns, he is supposed to have used "stagecoaches". Dynamic programming
is also applicable to other problems involving reliability, manpower planning,
optimum layout of gas pipelines, machine replacement, and so on. Dynamic
programming is extensively treated in operations research texts (see, for example,
Srinivasan 2010).
Problems
7.1 Use the graphical method of linear programming to solve the LP problem given
below and obtain the optimal solution.
7.2 Consider the LP problem given above. However, the objective function is now
y = x1 + 4x2 . Other constraints remain the same. Use the graphical method of
LP to solve this problem. Comment on your result.
7.3 A raw bauxite ore consisting of 50% of Aluminum (Al) is processed in a plant
and the output consists of two different grades of Al, one consisting of 70%
purity and the other 90%.
The cost of raw bauxite is Rs. 10/kg and the selling price of processed Al is
Rs.20/kg (70%) while that of the other grade is Rs. 25/kg (90%). Assume that
1000 kg of bauxite is fed into the plant at a time. Let x1 and x2 (in kg) be the
outputs, respectively, of the 70 and 90% grades. The design of the plant gives rise
to the constraint 5x1 + 3x2 ≤ 3000. Determine x1 and x2 for maximum profit
by using the graphical method of linear programming. Assume dumping of Al
is allowed during the processing.
7.4 Revisit exercise Problem 1 of Chap. 4. A turbine receives 5 kg/s of steam and can
produce superheated steam or electricity. The prices are Rs. 4/kWh for electricity,
Rs. 0.15/kg for low-pressure steam and Rs. 0.25/kg for high-pressure steam.
Assume that each kg/s into the generator can produce 0.025 kWh/s of electricity.
To prevent overheating of the generator, the mass flow into the generator should
be less than 3 kg/s. To prevent unequal loading on the shaft, the extraction rates
should be such that 2x1 +3x2 ≤ 10. The design of the bleed outlets allows the
constraint 6x1 + 5x2 ≤ 20. Find x1 and x2 for maximum profit by using the
graphical method of linear programming.
7.5 A typical gas turbine engine for a military aircraft consists of a combustor and an
afterburner for increased thrust. The total energy at the exit has to be maximized
for maximum work output. The aircraft is flying with a speed of 600 m/s and
intakes air at a rate of 10 kg/s. (C p for air = 1000 J/kgK). The temperature at the
inlet of the engine is 250 K and heating value of the fuel is 45000 kJ/kg. Due to
limitations in the spraying of fuel, 4x1 +3x2 ≤ 1.5. The combustor can withstand
more heat than the afterburner and this is employed in the design of the spray
rates of fuel by the constraint: x1 ≥ x2 + 0.1. Further, due to limitations in fuel
storage and distribution, 2x1 +5x2 ≤1.
Determine the optimum values of x1 and x2 for maximum energy at the exit
using the graphical method of linear programming.
7.6 Solve problem no. 7.1 given above, using the method of slack variables. Confirm
that the solution you obtain is the same as that obtained using the graphical
method.
7.7 Solve the gas turbine with afterburner problem (problem no. 7.5 above) using
the method of slack variables. Confirm that the solution you obtain is the same
as that obtained using the graphical method. Comment on the values of the slack
variables in the final solution.
7.8 The first stage of a space shuttle consists of a central liquid engine with three
nozzles and two solid strap-on boosters, as shown.
For convenience, assume complete expansion in the nozzles, i.e., pressure at exit
= ambient pressure.
The exit gas velocities are ve1 = 4000 m/s and ve2 = 1500 m/s, based on the
design of the nozzles. ṁ1 and ṁ2 are the mass flow rates (in kg/s) for each of the
central and external nozzles, respectively. Optimize ṁ1 and ṁ2 for maximum
total thrust. The constraints are
(a) Due to a limit on the maximum propellant weight that can be carried, 5ṁ 1 +
ṁ 2 ≤ 6500.
(b) The total fuel that can be carried in the main central tank is 750 tons and
this has to burn for 8 min.
(c) The total fuel that can be carried in the external casings is 1000 tons and this
has to burn for 2 min.
(d) The nozzle geometry allows the constraint: 4ṁ 1 + ṁ 2 ≤ 6000.
Determine the optimum thrust using (a) the graphical method of linear program-
ming and (b) the method of slack variables.
7.9 Revisit exercise problem 7.1 and solve it using the simplex method.
7.10 Revisit exercise problem 7.3 and solve it using the simplex method.
7.11 Revisit exercise problem 7.4 and solve it using the simplex method.
7.12 Consider a container truck that is climbing a “ghat” (hill) road. There are three
sections on the ghat road denoted by B-C, C-D, and D-E. The fuel consumed
in each section varies with the time taken to cover the particular section and is
given in the accompanying Table 7.21.
Please note that section CD is the toughest part of the ghat road that “drinks” or
“guzzles” so much fuel. The hill climb needs to be completed within a total time
of 34s. Using dynamic programming, determine the optimal time to be spent
by the truck in the three sections so as to minimize the total fuel consumption.
What is the computational gain achieved by using dynamic programming for
this problem? (adapted with modifications from Stoecker 1989).
7.13 A person wants to go from city A to city J. The various options and distances(in
terms of km) are given in the schematic and Table 7.22. Using DP, obtain the
minimum distance to be traveled (Fig. 7.9).
280 7 Linear Programming and Dynamic Programming
Table 7.21 Time and fuel consumption for various sections of the hill climbing (Problem 7.12)
Section Time, t (s) Fuel consumption, (g)
B-C 10 60
11 51
12 43.5
13 37.5
C-D 10 91.5
11 78
12 67.5
13 57
D-E 10 73.5
11 61.5
12 52.5
13 45
Table 7.22 Path and distance for various combinations in (Problem 7.13)
Path A-B A-C A-D B-E B-F B-G C-E C-F
Distance 7 7 9 6 5 8 9 7
Path C-G D-E D-F D-G E-H E-I F-H F-I
Distance 10 6 5 7 8 9 8 6
Path G-H G-I H-J I-J
Distance 9 7 12 13
References
Srinivasan, G. (2010). Operations Research: Principles and Applications. New Delhi, India:
Prentice Hall India.
Stoecker, W. F. (1989). Design of Thermal Systems. Singapore: McGraw-Hill.
Chapter 8
Nontraditional Optimization Techniques
8.1 Introduction
In this chapter, we will look at two optimization techniques, namely, (a) Genetic
algorithms (GA) and (b) Simulated annealing (SA), both of which are unconven-
tional, yet powerful. Both use only the information on the objective function and not
any auxiliary information like derivatives. GA and SA are both search techniques
that also employ probabilistic laws in the search process and have gained immense
popularity in the recent past.
8.2 Genetic Algorithms (GA)
Genetic algorithms are based on the mechanics of natural selection and natural
genetics. The central idea is the survival of the fittest, which stems from Darwinian
theory, wherein the fittest survive and procreate so that successive generations
become progressively better. This is true on almost any metric we look at. For
example, the average life expectancy of an Indian male is now 68 years; it was only
40 years at the time of independence. The life expectancy in Japan is about 88 years,
and the probability that a child born in Germany today will live for 100 years is more
than 0.5.
If we consider the field of medicine, we cannot say that all diseases have been
conquered. Several have been, but new diseases have come, and medical research is
going on at a frenetic pace and intensity. However, it cannot be denied that a lot
more people live longer "with" and "in spite of" diseases. So, outside of Darwinian
evolution, there has been human intervention too. Successive generations are
becoming better on several metrics.
So, basically, genetic algorithms simulate or mimic the process of evolution. The
key idea here is that evolution is an optimizing process. If evolution is an optimizing
process and successive generations are becoming better and better, each generation is
like an iteration in numerical methods. So if we have 5 generations, it is like having
5 iterations in a numerical method and with each iteration, there is a progressive
improvement in the objective function. This means that more or less, GA is like a
hill-climbing technique. For example, if y is a function of x1 , x2 , x3 , we start with
some combination of values of x1 , x2 , x3 , and we get a value of y. As we apply
the genetic algorithm, with successive iterations, y keeps increasing (for a maxi-
mization problem). Based on a suitable convergence criterion, we stop the iterations
(generations).
8.2.2 Origin of GA
Prof. John Holland, along with his colleagues and students, developed this method at
the University of Michigan around 1975. Prof. David Goldberg, an illustrious student
of Holland, is the author of the book "Genetic Algorithms in Search, Optimization and
Machine Learning" (1989). Goldberg is a civil engineer who did his Ph.D. with Prof.
Holland and is now an independent consultant. He worked with Prof. Holland on
genetic algorithms and has contributed significantly to the spectacular growth of this
field, along with a few other key scientists.
8.2.3 Robustness of GA
The central theme of research on genetic algorithms has been robustness. If we say
something is robust, it means it can survive even in hostile environments. Robustness
in optimization refers to a fine balance between efficiency and efficacy that is essential
for survival in varied environments. The robustness of a species or the robustness
of human beings, as a race, can mean many things. How high a temperature can a
man withstand? How long can we go on without food and water? How long can we
survive under hostile conditions? Equivalently from an optimization point of view,
we are looking at how many different classes of problems are amenable to genetic
algorithms.
We do not claim that for all the optimization problems, genetic algorithms will
solve it better than other optimization techniques. But the point is that for a wide
class of problems, genetic algorithms work very well and that is what we mean by
robustness. For a specific two-variable problem, which is continuous and differen-
tiable, where the first and second derivatives can be obtained, the steepest descent
or the conjugate gradient method will work very fast and will defeat genetic algo-
rithms hands down. There is a class of problems where specialized techniques will
be superior to GA.
However, if we apply the conjugate gradient method for a function that has a lot of
local minima and maxima and is oscillating wildly, the conjugate gradient will just
get choked. In the class of problems, where getting the derivatives is messy and is
computationally expensive, GA can be very potent and useful. Needless to say, in a
large majority of exceedingly complex problems, GA would outperform traditional
methods.
regardless of the starting point. So the GA is one algorithm that can be used to get
the correct solution if f(x) looks as shown here.
Several engineering problems may have local minima like this and just one global
optimum and we want to determine the global optimum. Now, this picture should
give us additional ideas. If we ask whether the genetic algorithm will be fast or
slow compared to the conjugate gradient method, we see that the disadvantage of
GA is that it converges slowly; the advantage is that it converges to the global
optimum. Can we combine the speed with the robustness? How do we do this?
We start initially with the genetic algorithm and zero in on region C. Once we get to
this region, we quickly switch to the conjugate gradient or some other faster
traditional technique. This is called a hybrid optimization technique: we start with
GA and switch later to a gradient-based method. This is routinely used nowadays
for complex optimization problems.
The Rastrigin function is one such example. It is given by
Ras(x1, x2) = 20 + x1² + x2² − 10(cos 2πx1 + cos 2πx2) (8.1)
It really tortures y, with several peaks and valleys, but the minimum of the function
is 0 and the global minimum occurs at (0, 0).
The Rosenbrock function, Ban(x1, x2), is called the banana function and is given by
Ban(x1, x2) = 100(x2 − x1²)² + (1 − x1)² (8.2)
The global minimum occurs at (1, 1). If somebody is interested in working on new
optimization methods, one needs to first test his/her algorithm against standard
functions like these. We have to benchmark and find out, on a particular computer,
how many iterations the new algorithm takes and what level of accuracy it reaches.
First, it should reach the global optimum; then the speed and time are tested. These
are all considered standard tests.
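Both test functions can be set up as one-line anonymous functions in MATLAB; the sketch below simply confirms the global minima quoted above.

% Sketch: the two standard test functions of Eqs. 8.1 and 8.2
ras = @(x1, x2) 20 + x1.^2 + x2.^2 - 10*(cos(2*pi*x1) + cos(2*pi*x2));
ban = @(x1, x2) 100*(x2 - x1.^2).^2 + (1 - x1).^2;
disp(ras(0, 0))   % global minimum: 0 at (0, 0)
disp(ban(1, 1))   % global minimum: 0 at (1, 1)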
An optimization problem falls under one of the three categories listed below.
1. Unimodal problems are those that have got only one peak or valley.
2. A combinatorial problem is one that permits only a finite number of combinations
of the variables under question. For example, there are 5 machines and each can
do 4 types of operations. There are various combinations of which machine will
do which job, so that we maximize the profit or minimize the cost as the case
may be.
3. Multimodal problems are mathematical or engineering problems that have sev-
eral peaks and valleys.
Now if we look at unimodal problems, the conjugate gradient or the steepest
ascent/descent method will be very efficient. In Fig. 8.2, we see that the efficiency is almost
close to 1. But if we try to apply the same to a multimodal function, it may not con-
verge or may converge very slowly. That means we have to restart with different
initial points. So the efficiency is very low. Even when the exhaustive search or the
random walk algorithm is used for a combinatorial problem, the efficiency is very
low. This is because, for a combinatorial type of problem, we may try to exhaustively
search for the optima. Now, if we look at GA, its efficiency is more or less the same
for all classes of problems, as shown in Fig. 8.2. So the GA has an efficiency that is
far greater than what a traditional optimization technique or an exhaustive search
has for multimodal and combinatorial problems. However, for a specific unimodal
problem, the efficiency of GA will be lower than that of a sophisticated traditional
technique developed for it.
While the exhaustive search or the random walk works with a uniformly low
efficiency across classes of problems, GA works with a reasonably high efficiency
for a wide range of problems. This is what is known as robustness. We cannot claim
that GA has the highest efficiency for all problems; such a statement would be far
from the truth!
8.2 Genetic Algorithms (GA) 289
Fig. 8.2 Efficiency of various classes of optimization algorithms for three kinds of problems
Now we have to look at the philosophy of optimization. What is the goal of
optimization? We keep improving the performance so as to reach one or more
optimal points. We see that as y improves, we keep going in that direction.
The implications are as follows:
• We seek improvement to approach some optimal point. There is always an optimal
point and the goal is to reach that. That is the belief!
• What Prof. Goldberg alleges is that this is a highly calculus-oriented view of
optimization. We have learned so much of calculus that we feel that dy/dx must
be equal to 0 such that there is a peak and on both sides, y should drop sharply.
He says it never happens in nature or any engineering problem.
• He alleges that it is not a natural emphasis.
According to human perception,
• Goodness is invariably judged relative to the competition.
• Consider the best scientist, tennis player, cricketer, or poet: we cannot define a
maximum y in these cases! When somebody is outstanding, it means that he or
she is far better compared to the others. But after 20 years, someone else may
come along who is better than him/her. We cannot simply specify criteria and
say that whoever satisfies all of them is the best.
• Convergence to the best is beside the point as no one knows the definition of best
or what the criteria for it are.
• So the definition of best in most of the situations is “far better compared to others”,
but there is no objectivity here.
290 8 Nontraditional Optimization Techniques
• Can we say the best human being has arrived? How do we define such a person?
Someone who has the maximum money, or has solved a 200 year old problem, or
has proposed a radically new theory?
• Doing better relative to the others is the key. This also justifies, in a lighter vein,
why many universities in the world have relative grading in place.
Therefore, we have to look at whether the optimization algorithm goes in the right
direction. After it has reached some satisfactory level, we just stop working on the
problem; we do not insist on the one optimum solution. So, from the perspective of a
GA analyst, the priorities of optimization have to be relooked at. The most important
goal of optimization is improvement: we look at how the objective function changes.
The key point is that instead of optimizing, can we reach a satisfying level
of performance quickly? For example, with design variables like temperature,
pressure, and so on, can we quickly reach a good level of efficiency for the power
plant? That is the goal. So the attainment of the exact optimum is not a critical
requirement for complex systems.
A parallel can be drawn here with the Kaizen philosophy used extensively in Japan.
In Japanese, Kaizen means "continuous improvement". The idea is that there is no
separate set of supervisors: usually, someone makes a product and someone else
checks it, but in Kaizen, the person who makes it also checks it. He/she reports the
efficiency and there is self-correction. We cannot see rapid progress in Kaizen, but
there is continuous improvement, and because there is incremental progress over a
period of time, we get substantial progress. The Japanese carmaker Toyota uses this,
and this automobile giant is known for its quality consciousness.
• GA employs the coding of the parameter set and not the parameters themselves.
Generally, the variables x1 to xn are replaced by their binary equivalents, i.e.,
strings of 0s and 1s (a small sketch of this coding step follows this list). While it
is possible to write a genetic algorithm code without the binary representation
too, it is the most popular of them all.
• The GA searches a population of points and not a single point. Instead of taking
a single pair (x1, x2) and evaluating y, we see how y changes over different sets
of x1 and x2. If we have a two-variable problem, we take a, b, c, and d, which
represent different (x1, x2) values; a, b, c, d are known as candidates. For a, b,
c, and d, we now calculate y(a), y(b), y(c), and y(d). We then find which among
them is maximum and rank them. We convert x1 and x2 into strings of 0s and 1s.
We mix and match the bits from "better" parents, the "better" parents being
those for which y is higher (for a maximization problem). We now get new
values of a, b, c, and d. The
“children” are now born, who then become parents. The strategy involved is that
among all these parents, only the fittest can survive and reproduce. The number of
members in a population, (the sample size) is kept fixed (in a constant population
size approach). So if we have 4 or 5 solutions, we look at how these solutions
evolve generation after generation. After a certain stage, we can take the average
of these points and say that is the optimum. We will reach a stage where there is
little difference between members of the species. That is the way it should be. If
there is continuous improvement, all members should be equally strong. That is
what GA strives to achieve.
• GA uses the objective function information and does not use the information of
the derivative or second derivative.
• GA uses stochastic transition rules and not deterministic rules.
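As promised above, here is a minimal sketch of the binary coding step, using the same de2bi/bi2de functions as the listings later in this chapter (these assume the Communications Toolbox).

% Sketch: coding a variable x in [a, b] with 1-decimal accuracy
a = 0.5; b = 25.5; nb = 8;       % interval and number of bits
x = 10.1;                        % a candidate solution
x_bin = de2bi(floor(x*10), nb);  % encode: 101 as an 8-bit string
x_dec = bi2de(x_bin)/10;         % decode back to one decimal place
fprintf('x = %.1f, decoded = %.1f\n', x, x_dec)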
Genetic algorithms have some other techniques too to overcome this problem of
losing strings of high quality. One such strategy is the elitist strategy. Here, out of
the n solutions, the best solution is left untouched and automatically goes to the next
generation. So all the crossover and permutation–combination operations are done
on the (n−1) solutions while the "king" remains as he is. But "he" cannot remain
king forever. When the mixing and matching is done and the (n−1) parents
recombine to produce (n−1) new children, the king is added to the list and ranked
along with the others. If he is no longer king, i.e., he does not have the highest
fitness, he too joins the process of crossover and mutation. Only the king of a
particular generation is not touched. This is the elitist strategy.
This is also used in yet another optimization technique known as particle swarm
optimization. This is also very similar to the genetic algorithm and is an evolutionary
technique. For example, when a group of birds is moving and searching for food,
the group adjusts its orientation such that each of the birds is at the smallest possible
distance from the leader and the leader is the one who is closest to the food. After some
time, the position of the leader may change. This change of leadership is analogous
to the elitist strategy in GA.
when we interpret the final value of these 9 bits, we have to divide it by 100. So if
the accuracy we seek keeps increasing, the number of bits required will also increase
dramatically. The values of the variables alone do not decide the string length; the
accuracy too influences it. If we know for a fact that a variable x lies between 1 and
5, and we need 2-decimal accuracy, then 9 bits are required (and sufficient) for its
representation.
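The required string length follows from a simple rule: the number of distinguishable values, (b − a) × 10^d for d-decimal accuracy over the interval [a, b], must not exceed 2^nb. A one-line check of the statement above:

% Sketch: bits needed for x in [a, b] to d decimal places
a = 1; b = 5; d = 2;
nb = ceil(log2((b - a)*10^d))   % ceil(log2(400)) = 9 bits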
GA has evolved a lot since it first made its appearance in 1975. The main features of
GA are
1. GA is conceptually easy to understand.
2. GA works very well in exceedingly complex problems and in problems where
getting the derivatives of the objective function is very cumbersome.
3. Lots of approaches and possibilities exist, and so considerable subjectivity exists
in the implementation.
4. It is easy to parallelize. That is, when we want the value of y for 4 different
cases, each can be sent to a different processor on a server, and this speeds up
the process. However, GA is not massively parallel, as crossover, reproduction,
mutation, and so on cannot be parallelized.
5. It is not easy to make GA work very efficiently, though it is easy to make it work.
In summary, GA is a contemporary powerful tool available to us to solve complex
optimization problems. However, not all the features of GA can be established rig-
orously by mathematics. Even so, GA is certainly not a helter-skelter technique that
has no philosophy or logic behind it.
Example 8.1 The cost of engines plus fuel for a cargo ship (in lakhs of rupees
per year for 100 tons of cargo carried) varies with speed and is given by 0.2x 2 ,
where x is the speed of the ship in m/s. The fixed cost of hull and crew (again
in the same units) is given by 450/x. Using genetic algorithms, perform 2
iterations to determine the optimal operating speed of the ship. The original
interval of uncertainty can be taken as 0.5 ≤ x ≤ 25.5 m/s, and one decimal
place accuracy is desired.
Solution:
The largest value of x is 25.5 m/s and one decimal accuracy is required, so the
largest integer we need to represent is 255. The largest number with 8 bits is 255,
and so 8 bits are enough to represent x here. The objective is y = 0.2x² + 450/x with
0.5 ≤ x ≤ 25.5 m/s. We will work with 4n solutions, where n is the number of
variables. In this case, n = 1, so the number of design solutions = 4n = 4.
We randomly take 4 values of x that are uniformly spread over the interval 0.5 ≤
x ≤ 25.5. Since we are using 8 bits and the maximum value is 255, we must
remember that after all the operations are done and we reconvert from binary to
decimal, we have to divide by 10 to get the value of x.
Now we convert these values to binary and check for bias: we have 16 ones and
16 zeros, which is very good. We can also generate the initial population by using
a random number table; a typical random number table is given at the end of the
book. In such a table, we proceed from top to bottom, and once a column is over,
we proceed from left to right. We use the first random number: if it is less than
0.5, we generate a "0" as the first bit of the first candidate solution; else it is a "1".
We then go to the next random number, apply the same rule, and proceed till we
generate all 32 bits.
Next we calculate the value of y by substituting the values of x in the equation
y = 0.2x² + 450/x. We determine Σyi = 386.44 and the average fitness ȳ = 96.61.
Please note that the original GA is for maximization, while we are trying to do
minimization in this example.
For a minimization problem, the fittest is string 2 whose y value is 64.95 and
relative fitness is 0.168. Now we want to know the count so that we can decide on the
mating pool. We must keep total count as 4n so that we have a constant number of
members in the mating pool. Now we do the single-point crossover though we can
do a uniform crossover bit by bit. We now generate the new population by deciding
on (i) the mate for a particular parent and (ii) the exact point of crossover. We need
to do both of these randomly (or stochastically).
After these operations with the mates and crossover sites are done, as shown in
Table 8.1, we generate the new population and evaluate the fitness (yi ) of all the
candidates (Table 8.2). We take note of the one with the lowest yi , meaning the one
with the lowest cost.
The beauty of the GA is that in just one iteration, Σyi has come down from 386.44
to 315.89 and the average fitness has come down from 96.61 to 78.97. This means
the cost has come down dramatically. This may sound paradoxical, but we want
the average fitness to decrease, as we are looking at a minimization problem. The
most important thing in a genetic algorithm is that the variance in y or the difference
Table 8.1 Initial population and mating pool for Example 8.1
String No | Initial population, x | Initial population (binary) | yi | yi/Σyi | Count | Mating pool
1 | 4.2 | 00101010 | 110.67 | 0.286 | 1 | 01100101
2 | 10.1 | 01100101 | 64.95 | 0.168 | 2 | 01100101
3 | 16.4 | 10100100 | 81.23 | 0.21 | 1 | 00101010
4 | 23.5 | 11101011 | 129.59 | 0.335 | 0 | 10100100
Σyi = 386.44
between the minimum and maximum values of y will come down dramatically with
generations. Initially, this variation was 64.64 and now it is 39.05. It will come down
even further, and we have to do one more iteration according to the question. The
results of the second iteration are shown in Table 8.3. The average fitness has come
down further to 69.125. Furthermore, the variance in the fitness across the population
comes down with iterations, which is the hallmark of evolution.
We can also have a double crossover at this stage if we want, and in 3 or 4 iterations,
we will get an average fitness that is very close to the correct answer. We may again
say that we can do an exhaustive search, but for multimodal problems, it will not
work. GA, however, searches the whole solution space, and even if a solution is very
weak, it is not discarded outright; its useful patterns may survive through crossover
or mutations. In tournament selection, we compare the fitness of two candidates at
a time, just as in a cricket or tennis tournament, and the winners enter the mating
pool; this gives a better mating pool than merely ranking the trial solutions in
ascending or descending order. This way, some diversity in the population is
preserved and we can avoid premature convergence.
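A minimal sketch of binary tournament selection for this minimization problem is given below; it is illustrative only and is not part of the listing that follows. Two candidates are drawn at random and the lower cost string wins a place in the mating pool.

% Sketch: binary tournament selection (minimization)
y = [110.67 64.95 81.23 129.59];   % costs from Table 8.1
ps = numel(y);
pool = zeros(1, ps);
for i = 1:ps
    c = randi(ps, 1, 2);   % two randomly drawn contestants
    [~, w] = min(y(c));    % the lower cost string wins
    pool(i) = c(w);
end
disp(pool)                 % indices entering the mating pool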
8.2 Genetic Algorithms (GA) 297
% MATLAB code for genetic algorithm, Example 8.1
clear;
close all;
clc;

% Parameters (omitted in the printed listing; reconstructed from the
% example: interval of uncertainty, bits, population size, iterations)
a = 0.5;     % lower limit of x, m/s
b = 25.5;    % upper limit of x, m/s
nb = 8;      % number of bits per string
ps = 4;      % population size (4n, with n = 1)
itr = 2;     % number of iterations (generations)

% Initialize fitness
y = zeros(1, ps);

% Initial population randomly generated and
% rounded to 1 decimal accuracy
x = a + (b - a)*rand(1, ps);
x = round(x, 1);

count = 0;
while count < itr   % starting the iteration
    count = count + 1;

    y = 0.2*x.^2 + 450./x;   % calculating the fitness

    % Converting x from decimal to binary
    x_bin = de2bi(floor(x*10), nb);
    x_bin2 = x_bin;

    % Generating the mating pool

    % Noting the value and index of the maximum
    % (the weakest string, since this is a minimization problem)
    [Vmax, j1] = max(y);
    % Noting the value and index of the minimum (the fittest string)
    [Vmin, j2] = min(y);
    xmin_bin = x_bin(j2, :);   % noting the fittest string
    xmin = bi2de(xmin_bin)/10;

    % Replacing the weakest string by a copy of the fittest one, i.e.,
    % eliminating the one with minimum fitness
    % (reconstructed; these lines are omitted in the printed listing)
    x_bin(j1, :) = xmin_bin;
    x_bin2(j1, :) = xmin_bin;

    % Select mates
    m = ones(2, ps/2);
    k = 1:ps;

    m(1, 1) = j1;
    m(1, 2) = j2;
    r = ceil((ps - 2)*rand(1));
    m(2, 1) = k(r);
    k(r) = [];
    r = ceil((ps - 3)*rand(1));
    m(2, 2) = k(r);
    k(r) = [];

    if ps > 4
        for i = 1:(ps - 4)/2
            r = ceil((ps - 2 - 2*i)*rand(1));
            m(1, 2 + i) = k(r);
            k(r) = [];
            r = ceil((ps - 3 - 2*i)*rand(1));
            m(2, 2 + i) = k(r);
            k(r) = [];
        end
    end

    % Generate random cross-over site
    cr = ceil(rand(1, ps/2)*(nb - 1));

    % Single-point crossover: the bits beyond the crossover site are
    % exchanged between the two mates of each pair
    % (reconstructed; omitted in the printed listing)
    for i = 1:ps/2
        x_bin(m(1, i), cr(i)+1:nb) = x_bin2(m(2, i), cr(i)+1:nb);
        x_bin(m(2, i), cr(i)+1:nb) = x_bin2(m(1, i), cr(i)+1:nb);
    end

    % Converting the new population to decimal
    % for calculating the fitness
    x = bi2de(x_bin)/10;

    % Print
    prt = ['Itr = ', num2str(count), ...
           ', minVel = ', num2str(xmin), ...
           ', minY = ', num2str(Vmin)];
    disp(prt)

end
Example 8.2 Revisit Example 8.1 and solve it using the genetic algorithm with the elitist strategy; perform two iterations to determine the optimal operating speed of the ship.
Solution:
In the elitist strategy, the best solution is left untouched and is carried forward to the next generation. Therefore, in this case, we begin with a population size of 4n + 1, i.e., 5, so that once the best string is set aside, the crossover operation can be performed on an even number of strings.
We randomly take 5 values of x that are uniformly spread over the interval 0.5 ≤
x ≤ 25.5 (Table 8.4).
Next, we calculate the value of y by substituting the values of x in the equation y = 0.2x² + 450/x. We determine Σyᵢ = 456.82 and the average fitness ȳ = 91.36. Please note that the original GA is for maximization, while we are trying to do minimization in this example.
For a minimization problem, the fittest is string 1, whose y value is 65.68 and whose relative fitness is 0.143. Now we want to know the count so that we can decide on the
Table 8.4 Initial population and mating pool for Example 8.2

String No. | Initial population, x | Initial population (binary) | yᵢ | yᵢ/Σyᵢ | Count | Mating pool
1 | 9.4  | 01011101 | 65.68  | 0.143 | 2 | 01011101
2 | 3.5  | 00100011 | 131.02 | 0.287 | 0 | 01011100
3 | 14.7 | 10010011 | 73.83  | 0.162 | 1 | 10010011
4 | 22.4 | 11100000 | 120.44 | 0.264 | 1 | 11100000
5 | 9.2  | 01011100 | 65.84  | 0.144 | 1 | 01011101
Σyᵢ = 456.82
mating pool. We must keep the total count at 4n + 1 = 5 so that we have a constant number of members in the mating pool. Since the fittest string is given a count of 2, it appears twice in the mating pool: one copy is used for the crossover operation, while the other goes directly to the next generation without any crossover.
Now we do a single-point crossover, though we could also do a uniform crossover bit by bit. We generate the new population by deciding on (i) the mate for a particular parent and (ii) the exact point of crossover. We need to do both of these randomly (or stochastically).
After these operations with the mates and crossover sites are done, as shown in Table 8.5, we generate the new population and evaluate the fitness (yᵢ) of all the candidates (Table 8.5). We take note of the one with the lowest yᵢ, meaning the one with the lowest cost.
The beauty of the GA is that in just one iteration, Σyᵢ has come down from 456.82 to 411.57 and the average fitness has come down from 91.36 to 82.31. This means the cost has come down dramatically. This may sound odd, but we do want the "fitness" to decrease here, as we are looking at a minimization problem. It will come down even further, and we have to do one more iteration according to the question.
The results of the second iteration are shown in Table 8.6. The most important point about the genetic algorithm is that the variation in y, i.e., the difference between the minimum and maximum values of y, comes down dramatically with generations. Initially, this variation was 65.34 and now it is 7.04. The average fitness has come down further to 67.58. Furthermore, the variance in the fitness across the population comes down with iterations, which is the hallmark of evolution.
% MATLAB code for Genetic Algorithm with the elitist strategy
% (cargo ship problem of Example 8.1, as solved in Example 8.2)

clear;
close all;
clc;

% (the parameter definitions -- population size ps, number of bits nb,
%  search interval [a, b] and number of iterations itr -- are elided here)

% Initialize fitness
y = zeros(1, ps);

% Initial population randomly generated and
% rounded to 1 decimal accuracy
x = a + (b - a)*rand(1, ps);
x = round(x, 1);

count = 0;
while count < itr              % Starting the iteration
    count = count + 1;

    y = 0.2*x.^2 + 450./x;     % Calculating the fitness
    % Converting x from decimal to binary
    x_bin  = de2bi(floor(x*10), nb);
    x_bin2 = x_bin;

    % Sorting in descending order of y, so that for this minimization
    % problem the fittest (lowest-y) string appears last
    [Y, idx] = sort(y, 'descend');
    for i = 1:ps
        x_bin2(i, :) = x_bin(idx(i), :);
    end

    % (the elitist bookkeeping, which carries the best string forward
    %  untouched, is elided here)

    % Select mates
    m = ones(2, (ps-1)/2);
    k = 1:ps-1;

    for i = 1:(ps-1)/2
        r = ceil((ps-1+2-2*i)*rand(1));
        m(1,i) = k(r);
        k(r) = [];
        r = ceil((ps-1+1-2*i)*rand(1));
        m(2,i) = k(r);
        k(r) = [];
    end

    % Generate random crossover site
    cr = ceil(rand(1, (ps-1)/2)*(nb-1));

    % (the crossover and the reconversion of the new population to
    %  decimal are elided here)

    % Print
    prt = ['Itr = ', num2str(count), ...
           ', v_min = ', num2str(xmin), ...
           ', Y_min = ', num2str(Ymin)];
    disp(prt)

end
Solution:

z = (1 − xy)/(x + y)    (8.3)

The largest value of x or y is given as 2.55 m and two-decimal accuracy is required, so the largest number we need to represent is 255. The largest number with 8 bits is 255, and so 8 bits are enough to represent x or y here in the range 0.01 ≤ x or y ≤ 2.55 m. We will work with 4n solutions, where n is the number of variables. In this case, n = 2, so the number of design solutions = 4n = 8, i.e., 4 for each variable.
We randomly take 4 values each of x and y that are uniformly spread over the
interval 0.01 ≤ x or y ≤ 2.55. Here, when we are using 8 bits and the maximum value
is 255, we must remember after all the operations are done and we are reconverting
Table 8.7 Initial population and mating pool for Example 8.3

S.No. | Initial population (x, y) | Binary (x, y) | Volume | Count | Mating pool (x, y)
1 | 1.55, 0.28 | 10011011, 00011100 | 0.1342 | 1 | 10011011, 00011100
2 | 0.90, 0.48 | 01011010, 00110000 | 0.1778 | 2 | 01011010, 00110000
3 | 0.77, 1.15 | 01001101, 01110011 | 0.0528 | 1 | 01001101, 01110011
4 | 0.40, 2.14 | 00101000, 11010110 | 0.0485 | 0 | 01011010, 00110000
Max. fitness = 0.1778; Min. fitness = 0.0485; Avg. fitness = 0.1026
from binary to decimal, we have to divide by 100 to get the values of x and y
(Table 8.7).
Next, we calculate the value of V by substituting the values of x and y. We note the maximum fitness to be 0.1778, the minimum fitness to be 0.0485, and the average fitness to be 0.1026. We shall look at how these 3 values change across the generations (iterations). Please note that the original GA is for maximization, and this example is indeed a maximization problem, so no recasting of the objective is needed.
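As a quick check on where the Volume column of Table 8.7 comes from, the objective can be evaluated directly in MATLAB (a small illustrative snippet):

% Evaluating V = x.*y.*(1 - x.*y)./(x + y) for the initial population of Table 8.7
x = [1.55 0.90 0.77 0.40];
y = [0.28 0.48 1.15 2.14];
V = x.*y.*(1 - x.*y)./(x + y);
disp(V)   % ~ [0.1342 0.1778 0.0528 0.0485], matching the table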
For a maximization problem, the fittest is string 2, whose V value is 0.1778. Now we want to know the count so that we can decide on the mating pool. We must keep the total count equal to the population size so that we have a constant number of members in the mating pool. Since the fittest string is given a count of 2, only one copy of it is used for the crossover operation; the other copy goes directly to the next generation without any crossover. The string with the minimum fitness is given a count of 0, as it needs to be eliminated.
Now we do a single-point crossover, though we could also do a uniform crossover bit by bit. We generate the new population by deciding on (i) the mate for a particular parent and (ii) the exact point of crossover. We need to do both of these randomly (or stochastically). Since this problem involves two variables, we have the option of choosing the mates and crossover sites differently for the two variables. Here, however, we choose the same mates and crossover sites for both variables.
After these operations with the mates and crossover sites are done, as shown in Table 8.8, we generate the new population and evaluate the volume of all the candidates (Table 8.8). We note the maximum fitness to be 0.1771, minimum fitness
% MATLAB code for the two-variable GA of Example 8.3

clear;
close all;
clc;

% (the parameter definitions -- population size ps, number of bits nb,
%  search interval [a, b] and number of iterations itr -- are elided here)

% Initialize fitness
V    = zeros(1, ps);
Vmax = zeros(1, itr);
Vmin = zeros(1, itr);
Vavg = zeros(1, itr);

% Initial population randomly generated
% and rounded to 2 decimal accuracy
x = a + (b - a)*rand(1, ps);
x = round(x, 2);
y = a + (b - a)*rand(1, ps);
y = round(y, 2);

count = 0;
while count < itr              % Starting the iteration
    count = count + 1;
    % Calculating the volume (fitness)
    for i = 1:ps
        V(i) = x(i)*y(i)*(1 - x(i)*y(i))/(x(i) + y(i));
        % Eliminating samples with negative fitness
        if V(i) <= 0
            % and regenerating new samples
            label = 1;
            while label == 1
                p = rand(1);
                if p > 0.5
                    x(i) = a + (b - a)*rand(1);
                    x = round(x, 2);
                else
                    y(i) = a + (b - a)*rand(1);
                    y = round(y, 2);
                end
                V(i) = x(i)*y(i)*(1 - x(i)*y(i))/(x(i) + y(i));
                if V(i) > 0
                    label = 0;
                end
            end
        end
    end

    % Converting x, y from decimal to binary
    x_bin = de2bi(floor(x*100), nb);
    y_bin = de2bi(floor(y*100), nb);

    x_bin2 = x_bin;
    y_bin2 = y_bin;

    % Noting the value and index for the maximum
    [V_max, j1] = max(V);
    % Noting the value and index for the minimum
    [V_min, j2] = min(V);

    xmax = bi2de(x_bin(j1, :))/100;
    ymax = bi2de(y_bin(j1, :))/100;

    % (a few lines are elided here)

    % Select mates
    m = ones(2, ps/2);
    k = 1:ps;

    % Removing the indices of the best and worst strings
    % from the list of remaining candidates
    if j1 > j2
        k(j1) = [];
        k(j2) = [];
    else
        k(j1) = [];
        if j2 > 1
            k(j2-1) = [];
        else
            k(j2) = [];
        end
    end

    m(1,1) = j1;
    m(1,2) = j2;
    r = ceil((ps-2)*rand(1));
    m(2,1) = k(r);
    k(r) = [];
    r = ceil((ps-3)*rand(1));
    m(2,2) = k(r);
    k(r) = [];

    if ps > 4
        for i = 1:(ps-4)/2
            r = ceil((ps-2-2*i)*rand(1));
            m(1,2+i) = k(r);
            k(r) = [];
            r = ceil((ps-3-2*i)*rand(1));
            m(2,2+i) = k(r);
            k(r) = [];
        end
    end

    % Generate random crossover site
    cr = ceil(rand(1, ps/2)*(nb-1));

    % (the crossover and the reconversion of the new population to
    %  decimal are elided here)

    % Recording maximum fitness in every iteration
    Vmax(count) = V_max;
    % Recording minimum fitness in every iteration
    Vmin(count) = V_min;
    % Recording average fitness in every iteration
    Vavg(count) = mean(V);

    % Print
    prt = ['Itr = ', num2str(count), ...
           ', x_max = ', num2str(xmax), ...
           ', y_max = ', num2str(ymax), ...
           ', Vol_max = ', num2str(max(V))];
    disp(prt)

end
plot(1:itr, Vmax)
xlabel('Iterations')
ylabel('Vmax')
title('Vmax vs. Number of Iterations')
grid on
disp(x)
disp(y)
Then we use the GA and generate 4 new positions. The new configurations are again solved for the maximum temperatures using CFD. We use the GA again and get 4 new positions. We keep repeating this till convergence is obtained (refer to Madadi and Balaji, 2008, for a detailed discussion of this).
Consider the cooling process of molten metals through annealing. At high temper-
atures, the atoms in the molten state can move freely with respect to one another.
However, as the temperature reduces, the movement gets restricted. So analogously,
during the initial iterations, the samples are free to move anywhere in the domain.
For a single-variable problem in x, x can move anywhere in the domain. This is
similar to the situation at the start of annealing, where the atoms have the probability
of being in any state.
However, as the energy becomes low, the probability of attaining a higher energy
state also becomes low. If the energy is high, the system has an equal probability
of attaining any of the states. In short, it means that the freedom gets reduced as
the energy level goes down. Similarly in the initial iterations, the freedom is very
high. If we have variables x₁ and x₂, they can move here and there initially. But as the iterations proceed, the conditions for accepting a particular sample become stricter and stricter: when we proceed from (x₁, x₂)ᵢ to (x₁, x₂)ᵢ₊₁, accepting the latter becomes increasingly difficult whenever the move is not in the direction of decreasing objective function, for a minimization problem.
At the end of annealing, the atoms begin to get ordered and finally form crystals with minimum potential energy. If the cooling takes place very fast, the material may not reach the final state of minimum potential energy. The crucial parameter here is the cooling rate, which decides whether we eventually reach the state of minimum potential energy. In view of this, the cooling rate is tweaked, fine-tuned, or controlled in such a way that we get the optimum end product in metallurgy. Analogously, the convergence rate of the optimization algorithm is controlled in such a way that we reach global convergence.
If the cooling rate is not properly controlled in metallurgy, instead of reaching the
crystalline state, finally one may reach a polycrystalline state that may be at a higher
energy state than the crystalline state. Analogously, for the optimization problem,
we may get a solution that has converged prematurely. The final solution may well
be an optimum but unfortunately it could turn out to be a local optimum. There is no
guarantee that this is the global optimum.
The key to achieving the absolute minimum state is slow cooling. This slow
cooling is known as annealing in metallurgy and simulated annealing mimics this
process. Achieving the global optimum is akin to reaching the minimum energy state
in the end.
The key point is that the cooling is controlled by a parameter called the temperature, which is closely related to the concept of the Boltzmann probability distribution. According to the Boltzmann distribution, a system in thermal equilibrium at a temperature T has its energy distributed probabilistically according to

P(E) = e^(−E/kT)    (8.5)
In Eq. 8.5, k is the Boltzmann constant, 1.38 × 10⁻²³ J/K. Equation 8.5 suggests that as T increases, the system approaches a uniform probability of being in any energy state. If T is very high, the exponent becomes the negative of a very small quantity, which makes the expression close to 1, i.e., P(E) = e^(−very small quantity) ≈ 1, for all E.
While Eq. 8.5 suggests that when T increases, the system has a uniform probability of being at any energy state, it also tells us that when T decreases, P(E) = e^(−E/kT) takes on a very small value, and hence the system has only a small probability of being at a higher energy state (see Fig. 8.5). How does it work?
The probability is an exponential raised to a negative quantity. For small magnitudes of E/kT, P(E) takes values such as e^(−0.1), e^(−0.2), e^(−0.05), and so on, so there is a fair chance of obtaining different values of P(E). However, once we have reached e^(−4), e^(−5), and so on, P(E) is almost 0. Therefore, by controlling T and assuming that the search process follows Eq. 8.5 (the Boltzmann probability distribution), the convergence of the algorithm can be controlled. We use a Boltzmann-distribution-like condition to decide whether the next sample will be accepted or not.
What is the key difference between this and a traditional algorithm? In a conventional search algorithm, when Y(X) has to be minimized and we go from X₀ to X₁ (and correspondingly from Y₀ to Y₁), we accept X₁ only if Y₁ < Y₀. In simulated annealing too, there is no problem with this; X is the design vector with variables x₁ to xₙ. So, if we are seeking a minimum and Y₁ decreases compared to Y₀, there can be no doubt in our mind that X₁ has to be accepted.
But the beauty is that if X₁ is such that Y₁ > Y₀, we do not reject X₁ right away. Instead, we accept it only with a certain probability, which we get from the Boltzmann distribution. We recast Eq. 8.5 as P(ΔE) = e^(−ΔE/kT), where ΔE is the change in energy in the metallurgical context and the change in objective function, Y₁ − Y₀, in the context of optimization. Furthermore, the Boltzmann constant k can be set to 1 for our algorithm, and there are several ways to represent the "temperature" in the Boltzmann distribution. This temperature could be the average of Y for, say, 4 values of X.
The procedure is as follows. To start, if we have only one variable X, we take 4 arbitrary values of X, obtain the 4 corresponding values of Y, and take their average. If we recall, GA also proceeded the same way. Now we set T = Ȳ and generate a new sample X₁ from X₀. There are several ways of doing this: either we use a random number table or a Gaussian distribution. Now we compare Y₁ and Y₀. If Y₁ decreases, we accept X₁. But if Y₁ increases, we apply the probability criterion, which gives a number between 0 and 1. We generate another
So, unlike GA, SA is generally a point-by-point method (SA for multiple points is also available and can be considered a variant of the classic SA algorithm).
• Begin with an initial point and a high temperature T. The initial temperature could be the mean of a few values of Y calculated from randomly chosen values of x.
• A second point is created in the vicinity of the initial point by using a sampling technique, and ΔE is calculated.
• If ΔE (= Yᵢ₊₁ − Yᵢ) is negative, the new point is accepted right away.
• If ΔE is positive (for a minimization problem), the point is accepted with a probability of e^(−ΔE/kT), with k = 1. This completes one iteration.
• In the next generation, again a new point is chosen, but now the temperature T is reduced. This is where the cooling rate is controlled. Usually T_new = T_old/2.
• At every temperature, a number of points (4 or 6) are evaluated before reducing the temperature.
• The stopping criterion is either ΔE ≤ ε₁ or T ≤ ε₂ or |Xᵢ₊₁ − Xᵢ| ≤ ε₃, for suitably small tolerances ε₁, ε₂ and ε₃.
The progress of iterations in a typical two-variable minimization problem in x1 and
x2 is shown in Fig. 8.6. We can see that the initial search covers almost every possible
subregion of the solution space, so that premature convergence is avoided.
A typical flowchart for solving an optimization problem with SA is shown in
Fig. 8.7.
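In the same spirit as the flowchart, a compact, self-contained MATLAB sketch of the basic SA loop for a single-variable minimization is given below; the objective is the cargo ship function of Example 8.1, while the neighbourhood step, cooling factor and iteration count are illustrative assumptions of ours.

% Minimal simulated annealing loop for a single-variable minimization (illustrative).
f   = @(x) 0.2*x.^2 + 450./x;    % cargo ship objective from Example 8.1
a   = 0.5;  b = 25.5;            % search interval
cs  = 0.75;                      % cooling schedule factor (assumed)
itr = 100;                       % number of iterations (assumed)

x = a + (b - a)*rand(1);         % initial point
y = f(x);
T = mean(f(a + (b - a)*rand(1, 4)));   % initial temperature from 4 random points

for count = 1:itr
    xn = min(max(x + 0.1*(b - a)*randn(1), a), b);  % neighbour, clipped to range
    yn = f(xn);
    if yn < y || rand(1) < exp((y - yn)/T)  % downhill always; uphill with probability
        x = xn;  y = yn;
        T = cs*T;                % reduce the temperature on acceptance
    end
end
fprintf('x* = %.3f, y* = %.3f\n', x, y)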
Consider the cargo ship problem we discussed while learning GA:

y = 0.2x² + 450/x;  0.5 ≤ x ≤ 25.5 m/s.
Example 8.4 Consider the cargo ship problem. We would like to solve it
using SA. Perform 4 iterations of the SA with an initial interval of uncertainty
0.5 ≤ x ≤ 25.5. Use the random number table provided to you.
Solution:

y = 0.2x² + 450/x;  0.5 ≤ x ≤ 25.5 m/s.
Calculate the initial temperature T₀. For this, we need to use 4 values of x and the corresponding y. We already did this for GA (refer to Example 8.1), and we will use the same 4 values. We got ȳ = 98.61; hence T₀ = 98.61.
μ = (0.5 + 25.5)/2 = 13 m/s

3σ = (25.5 − 0.5)/2 = 12.5;  σ = 4.17 m/s
We now draw a Gaussian around μ = 13 m/s. The objective of writing the Gaussian distribution is that we have to solve this equation to obtain x_new:

Rand × 1/(√(2π) σ) = [1/(√(2π) σ)] e^(−(x_new − μ)²/(2σ²))

x_new = μ ± √(−2σ² log(Rand)) = x_old ± √(−2σ² log(Rand))
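This update is easy to mechanize; a one-line MATLAB helper (our own illustrative wrapper, not from the text) reproduces the hand iterations below:

% Gaussian-table update: move up if r2 > 0.5, down otherwise,
% by sqrt(-2*sigma^2*log(r1)). (Illustrative helper.)
step = @(xold, sigma, r1, r2) xold + sign(r2 - 0.5)*sqrt(-2*sigma^2*log(r1));

xnew = step(13, 4.17, 0.0012, 0.8989);   % first sub-step of Iteration 1 below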
The random numbers are selected from the random number table in Table 8.11.
Now, the iterations of SA proceed as follows.

Iteration 1

r1 = 0.0012, r2 = 0.8989, r3 = 0.5788

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))
x_new = 13 + √(−2 × 4.17² log(0.0012))

r1 = 0.4996, r2 = 0.2827, r3 = 0.7306

Here, r2 < 0.5, so x_new = x_old − √(−2σ² log(r1))
x_new = 13 − √(−2 × 4.17² log(0.4996))
Iteration 2

r1 = 0.1085, r2 = 0.3862, r3 = 0.7691

Here, r2 < 0.5, so x_new = x_old − √(−2σ² log(r1))
x_new = 8.087 − √(−2 × 4.17² log(0.1085))

r1 = 0.5574, r2 = 0.7998, r3 = 0.4568

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))
x_new = 8.087 + √(−2 × 4.17² log(0.5574))
Iteration 3

r1 = 0.0926, r2 = 0.5896, r3 = 0.3322

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))
x_new = 12.59 + √(−2 × 4.17² log(0.0926))

r1 = 0.7626, r2 = 0.6962, r3 = 0.1703

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))
x_new = 12.59 + √(−2 × 4.17² log(0.7626))

r1 = 0.0327, r2 = 0.2993, r3 = 0.3086

Here, r2 < 0.5, so x_new = x_old − √(−2σ² log(r1))
x_new = 15.66 − √(−2 × 4.17² log(0.0327))

r1 = 0.3528, r2 = 0.5741, r3 = 0.2659

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))
x_new = 15.66 + √(−2 × 4.17² log(0.3528))

r1 = 0.9418, r2 = 0.2400, r3 = 0.6556

Here, r2 < 0.5, so x_new = x_old − √(−2σ² log(r1))
x_new = 15.66 − √(−2 × 4.17² log(0.9418))
So this is how simulated annealing works; after some time, by its zigzag path, it will cover the whole solution space well, so that the global optimum is not missed. If the objective function is computationally expensive, for example, if a CFD solution or something as complex is required for getting each value, we can develop a neural network: we can train it on certain combinations of x and validate it. After we get the optimum, we can substitute the values of x back into our original forward model or governing equations and check whether the values predicted by the neural network are the same as those obtained by substituting in the equations. This completes the loop. The following points are in order.
• We are suggesting that the cooling schedule for this problem can be T/2 at every stage. We can also have different rates. Ultimately, the reduction in temperature has to be decided based on our problem. We do not want to reduce it by 4 or 8 or 10 times, because the rejection will become very strict, and though this may accelerate our convergence, the latter could become premature.
• At the start, we calculate the average of y at 4 points and use it as T.
• Additionally, the new value of x becomes the mean of the distribution for that
iteration. That is the way all sampling algorithms work. When we start with 13,
the mean is around 13. If x reduces to 8, the mean is around 8. Next, if x becomes
8.6, the mean is around 8.6. The new value of x becomes μ automatically.
clear;
close all;
clc;

a   = 0.5;      % Range of v
b   = 25.50;
cs  = 0.75;     % Cooling schedule
itr = 100;      % No. of iterations

% (the initialization of vi, Yi, the temperature T and count is elided here)

while count < itr

    % (the sample generation and the "accept right away if Ynew < Yold"
    %  test are elided here; the block below handles an uphill move)

    % Calculating the probability of accepting the new point
    P = exp((Yi(count) - Yi(count+1))/T);
    if P < rand(1)                  % reject: retain the old point
        vi(count+1) = vi(count);
        Yi(count+1) = Yi(count);
    else                            % accept and reduce the temperature
        T = cs*T;
    end

    % Print
    prt = ['Itr = ', num2str(count), ...
           ', v = ', num2str(vi(count)), ...
           ', Y = ', num2str(Yi(count))];
    disp(prt)
end
Solution:

y = 2x² + 7/x³ − 4;  0.5 ≤ x ≤ 5.5 m.
Calculate the initial temperature T₀. For this, we need to use 4 values of x and the corresponding y (shown in Table 8.12). The mean value of y is 19.28, which is taken as the initial temperature (T₀).

μ = (0.5 + 5.5)/2 = 3

3σ = (5.5 − 0.5)/2 = 2.5;  σ = 0.833
Now, the iterations of SA proceed as follows.

Iteration 1

x_old = μ = 3, y_old = 14.26, T₀ = 19.28
Generate a set of 3 random numbers:

r1 = 0.0012, r2 = 0.8989, r3 = 0.5788

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))
x_new = 3 + √(−2 × 0.833² log(0.0012))

r1 = 0.4996, r2 = 0.2827, r3 = 0.7306

Here, r2 < 0.5, so x_new = x_old − √(−2σ² log(r1))
x_new = 3 − √(−2 × 0.833² log(0.4996))
x_new = 2.018, y_new = 4.996, Δy = −9.264

P = e^(−Δy/T) = 1.62
Iteration 2

r1 = 0.1085, r2 = 0.3862, r3 = 0.7691

Here, r2 < 0.5, so x_new = x_old − √(−2σ² log(r1))
x_new = 2.018 − √(−2 × 0.833² log(0.1085))

r1 = 0.5574, r2 = 0.7998, r3 = 0.4568

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))
x_new = 2.018 + √(−2 × 0.833² log(0.5574))

r1 = 0.0926, r2 = 0.5896, r3 = 0.3322

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))
x_new = 2.018 + √(−2 × 0.833² log(0.0926))

r1 = 0.7626, r2 = 0.6962, r3 = 0.1703

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))
x_new = 2.018 + √(−2 × 0.833² log(0.7626))
Iteration 3

r1 = 0.0327, r2 = 0.2993, r3 = 0.3086

Here, r2 < 0.5, so x_new = x_old − √(−2σ² log(r1))
x_new = 2.631 − √(−2 × 0.833² log(0.0327))

r1 = 0.3528, r2 = 0.5741, r3 = 0.2659

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))
x_new = 2.631 + √(−2 × 0.833² log(0.3528))

r1 = 0.9418, r2 = 0.2400, r3 = 0.6556

Here, r2 < 0.5, so x_new = x_old − √(−2σ² log(r1))
x_new = 2.631 − √(−2 × 0.833² log(0.9418))
clear;
close all;
clc;

a   = 0.5;    % Range of x
b   = 5.5;
cs  = 0.5;    % Cooling schedule
itr = 100;    % No. of iterations

% (the initialization of xi, Yi, the step size sig, the temperature T
%  and count is elided here)

while count < itr
    count = count + 1;
    label = 0;
    while label == 0   % To make sure that the sample is in the range
        if rand(1) > 0.5
            xi(count+1) = xi(count) + sig*rand(1);
        else
            xi(count+1) = xi(count) - sig*rand(1);
        end
        if xi(count+1) > a && xi(count+1) < b
            label = 1;
        end
    end
    Yi(count+1) = 2*xi(count+1)^2 + 7/xi(count+1)^3 - 4;
    if Yi(count+1) < Yi(count)     % Accept if Ynew < Yold
        T = cs*T;                  % Reducing T using the cooling schedule
    else
        % Calculating the probability of accepting the new point
        P = exp((Yi(count) - Yi(count+1))/T);
        if P < rand(1)             % reject: retain the old point
            xi(count+1) = xi(count);
            Yi(count+1) = Yi(count);
        else
            T = cs*T;
        end
    end

    % Print
    prt = ['Itr = ', num2str(count), ...
           ', x = ', num2str(xi(count)), ...
           ', Y = ', num2str(Yi(count))];
    disp(prt)
end
Itr = 1, x = 3, Y = 14.2593
Itr = 2, x = 3.1091, Y = 15.5665
Itr = 3, x = 3.874, Y = 26.1368
Itr = 4, x = 4.2115, Y = 31.5674
Itr = 5, x = 4.2115, Y = 31.5674
...
Itr = 99, x = 1.3912, Y = 2.4706
Itr = 100, x = 1.3912, Y = 2.4706
Example 8.6 Consider a thermal system, whose heat dissipation rate is given
by Q = 2.5 + 6.2v 0.8 , where Q is in kW and v is the velocity in m/s of the fluid
being used as the medium for accomplishing the heat transfer. The accompa-
nying pumping power is given by P = 1.3 + 0.04v 1.8 , again in kW with v in
m/s (in both the expressions, the constants ensure that both Q and P are in
kW). It is desired to maximize the performance parameter Q/P in the range
3 ≤ v ≤ 12 m/s. Perform 5 iterations of Simulated Annealing.
Solution:

Maximize y = Q/P = (2.5 + 6.2v^0.8)/(1.3 + 0.04v^1.8);  3 ≤ v ≤ 12 m/s.
Calculate the initial temperature T₀. For this, we need to use 4 values of v and the corresponding y (shown in Table 8.13). The mean value of y is 11.74, which is taken as the initial temperature (T₀).
Since this is a maximization problem, the probability of accepting a sample is calculated as P = e^(Δy/T).

μ = (12 + 3)/2 = 7.5

3σ = (12 − 3)/2 = 4.5;  σ = 1.5
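Since the acceptance test flips sign relative to the minimization listings given earlier, a minimal illustrative MATLAB sketch of it is:

% Acceptance test for the maximization problem of Example 8.6 (illustrative).
f = @(v) (2.5 + 6.2*v.^0.8)./(1.3 + 0.04*v.^1.8);   % Q/P
T = 11.74;                       % initial temperature from Table 8.13
vold = 7.5;  yold = f(vold);
vnew = 5.73; ynew = f(vnew);     % the sample obtained in Iteration 1 below
if ynew > yold || rand(1) < exp((ynew - yold)/T)
    vold = vnew;  yold = ynew;   % uphill moves accepted freely; downhill
end                              % moves accepted with probability e^(dy/T)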
Now, the iterations of SA proceed as follows.

Iteration 1

r1 = 0.0012, r2 = 0.8989, r3 = 0.5788

Here, r2 > 0.5, so v_new = v_old + √(−2σ² log(r1))
v_new = 7.5 + √(−2 × 1.5² log(0.0012))

r1 = 0.4996, r2 = 0.2827, r3 = 0.7306

Here, r2 < 0.5, so v_new = v_old − √(−2σ² log(r1))
v_new = 7.5 − √(−2 × 1.5² log(0.4996))

P = e^(Δy/T) = 1.03
Iteration 2

r1 = 0.1085, r2 = 0.3862, r3 = 0.7691

Here, r2 < 0.5, so v_new = v_old − √(−2σ² log(r1))
v_new = 5.73 − √(−2 × 1.5² log(0.1085))

r1 = 0.5574, r2 = 0.7998, r3 = 0.4568

Here, r2 > 0.5, so v_new = v_old + √(−2σ² log(r1))
v_new = 5.73 + √(−2 × 1.5² log(0.5574))

Iteration 3

r1 = 0.0926, r2 = 0.5896, r3 = 0.3322
Here, r2 > 0.5, so v_new = v_old + √(−2σ² log(r1))
v_new = 7.35 + √(−2 × 1.5² log(0.0926))

Iteration 4

r1 = 0.7626, r2 = 0.6962, r3 = 0.1703

Here, r2 > 0.5, so v_new = v_old + √(−2σ² log(r1))
v_new = 10.62 + √(−2 × 1.5² log(0.7626))

Iteration 5

r1 = 0.0327, r2 = 0.2993, r3 = 0.3086

Here, r2 < 0.5, so v_new = v_old − √(−2σ² log(r1))
v_new = 11.72 − √(−2 × 1.5² log(0.0327))

P = e^(Δy/T) = 11.46
In some cases, instead of proceeding with a single algorithm till the end, switching to a different algorithm after a certain number of iterations can ensure faster convergence and better accuracy of the solution. This idea will become clear after going through Examples 8.7 and 8.8. In Example 8.7, we start with GA and then switch to the golden section search. In Example 8.8, we start with GA and then switch to simulated annealing.
Example 8.7 Consider the cargo ship problem. Solve it using a hybrid optimization technique by starting with the genetic algorithm and proceeding till 20 iterations. With the results obtained from the genetic algorithm, switch to the golden section search method and perform 7 iterations.
Solution:
Using the genetic algorithm, the population (of 20) after the 20th iteration is shown in Table 8.14. This table is the output of the MATLAB code presented earlier.
From this population, we have x_min = 9.6 and x_max = 12.6.
So we use the golden section search method to find the optimum in the range [9.6, 12.6]. The iterations using the golden section search method are as follows:
Iterations 1 through 7 successively shrink this interval following the standard golden section procedure.
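For readers who wish to reproduce the seven iterations, a minimal golden section search sketch in MATLAB (our own illustration; the interval is taken from the GA output above) is:

% Golden section search on [9.6, 12.6] for the cargo ship objective (illustrative).
f  = @(x) 0.2*x.^2 + 450./x;
a  = 9.6;  b = 12.6;
g  = (sqrt(5) - 1)/2;            % golden ratio fraction, ~0.618

x1 = b - g*(b - a);              % interior points
x2 = a + g*(b - a);
for it = 1:7
    if f(x1) < f(x2)             % minimum lies in [a, x2]
        b  = x2;  x2 = x1;
        x1 = b - g*(b - a);
    else                         % minimum lies in [x1, b]
        a  = x1;  x1 = x2;
        x2 = a + g*(b - a);
    end
    fprintf('Iteration %d: interval = [%.4f, %.4f]\n', it, a, b)
end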
Example 8.8 Consider the cargo ship problem. Solve it using a hybrid optimization technique by starting with the genetic algorithm and proceeding till 20 iterations. With the results obtained from the genetic algorithm, switch to simulated annealing and perform 5 iterations.
Solution:
Calculate the initial temperature T₀. For this, we need to use 4 values of x and the corresponding y (shown in Table 8.15). The mean value of y is 65.37, which is taken as the initial temperature (T₀).

μ = (9.6 + 12.6)/2 = 11.1

3σ = (12.6 − 9.6)/2 = 1.5;  σ = 0.5
Iteration 1

r1 = 0.0012, r2 = 0.8989, r3 = 0.5788

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))

r1 = 0.4996, r2 = 0.2827, r3 = 0.7306

Here, r2 < 0.5, so x_new = x_old − √(−2σ² log(r1))

Iteration 2

r1 = 0.1085, r2 = 0.3862, r3 = 0.7691

Here, r2 < 0.5, so x_new = x_old − √(−2σ² log(r1))

r1 = 0.5574, r2 = 0.7998, r3 = 0.4568

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))

Iteration 3

r1 = 0.0926, r2 = 0.5896, r3 = 0.3322

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))

Iteration 4

r1 = 0.7626, r2 = 0.6962, r3 = 0.1703

Here, r2 > 0.5, so x_new = x_old + √(−2σ² log(r1))

Iteration 5

x_old = μ = 12.51, y_old = 67.27, T = 8.171/2 = 4.085

r1 = 0.0327, r2 = 0.2993, r3 = 0.3086

Here, r2 < 0.5, so x_new = x_old − √(−2σ² log(r1))
Problems
8.1 Consider the problem of minimization of convective heat loss from a cylindrical storage heater that makes use of the solar energy collected by a suitable system. The volume of the tank is 5 m³ and is fixed. The radius of the tank is "r" and the height is "h".
(a) Convert this to a single-variable unconstrained optimization problem in radius
“r”.
(b) We would like to use Simulated Annealing (SA) to solve the above problem,
as a single-variable minimization problem in “r”.
Perform four iterations of the SA with a starting value of r =2 m. You may
assume that 0.1 ≤ r ≤ 4 m. Use random number tables. The “initial temperature
T” (used in the algorithm that does not correspond to the physical temperature
in this problem) may be taken to be the average of four objective function values
for appropriately chosen values of the radius.
8.2 Revisit exercise Problem 8.1 and solve it using genetic algorithm for one variable.
Perform three iterations of the GA with an initial population size of 4. You may
assume that 0.1 ≤ r ≤ 2.55 m.
Reference

Goldberg, D. E. (1989). Genetic algorithms in search, optimization and machine learning. Boston, MA, USA: Addison-Wesley Longman Publishing.
Chapter 9
Inverse Problems
9.1 Introduction
An inverse problem, by definition, is one in which the effect is known (or measured)
and the cause(s) need(s) to be identified. Let us take a simple example of a person
suffering from fever and going to a doctor for treatment. Invariably, the first parameter
the doctor checks is the patient’s temperature (usually the oral temperature). If this
quantity is more than 38 °C, then the patient has a fever. Fever is the effect, and the doctor has to correctly identify the cause, or rather, the correct cause.
The problem is “ill-posed”, as there could be several causes for the fever. The fever
could be because of a viral infection, bacterial infection, inflammation or allergy, or
some very serious underlying disorder. In order to reduce the ill-posedness, the doctor
either goes in for additional tests or simply starts treating the fever empirically, by
guessing the cause based on his prior knowledge and applies midcourse corrections,
if the patient does not feel better in, say, 3–5 days time.
Similarly, in thermal sciences, as in other branches of science and engineering,
there are several situations where the correct cause needs to be identified from the
effect. The “effect” is usually a temperature or heat flux distribution in thermal
sciences. The cause we are seeking could be a thermophysical property like thermal
conductivity, thermal diffusivity or emissivity, or a transport coefficient like heat or
mass transfer coefficient or could even be the heat flux (in this case, the ‘effect’ is
usually the measured temperature distribution).
A familiar example is the surface heat flux on a reentry vehicle that reenters
the atmosphere. The velocities here are terrific and the kinetic energy of the fluid
(because of the relative motion between stagnant air and the fast-moving vehicle)
is “braked” and converted to an enthalpy rise on the outer surface of the vehicle.
This phenomenon is frequently referred to as “aerodynamic heating” and results in
a huge temperature rise on the surface. It is impossible to place a heat flux gauge on
the outside surface of the reentry vehicle as the temperature is of the order of a few
thousand Kelvin. In view of this, thermocouple measurements are made at locations,
where the temperature is “measurable”, and from these one has to correctly estimate
the heat flux at the outer surface. This flux is a critical design parameter that decides the cooling strategies. For instance, the outer surface of a reentry vehicle can be coated with an ablating material of a thickness chosen so that the material absorbing the heat flux at the outer surface is "melted", or more correctly "sublimated", away, thereby protecting the inside of the vehicle.
Other important areas where inverse problems find applications in thermal sci-
ences are as follows:
• Estimation of thermophysical properties.
• Computerized Axial Tomography (CAT) or CT scans, where mapping of the
absorption coefficient in a tissue is diagnostic of the structure and functional status
of the tissue.
• Identifying the distribution of the radiation source in applications like thermal
control of spacecraft, condensers, and internal combustion engines.
• Retrieving the constituents like water vapor, liquid water, and ice in the atmosphere
by remotely making measurements of infrared or microwave radiation from the
earth’s surface by placing sensors onboard a satellite orbiting the earth.
From the foregoing discussion, it can be seen that parameter estimation is an impor-
tant inverse problem in thermal sciences. The parameter estimation problem is invari-
ably posed as an optimization problem, wherein the sum of the squares of the residues is minimized. Mathematically, if y_data,i is the measured data vector and y_sim,i is the simulated or calculated vector of y values for assumed values of the parameters, then we define S as

S = Σᵢ₌₁ᴺ (y_data,i − y_sim,i)²    (9.1)
Example 9.1 Consider a thin aluminum foil coated with a paint of “high”
emissivity with dimensions of 2 cm × 2 cm, 6 mm thickness suspended in
an evacuated chamber (Fig. 9.1). The chamber is maintained at 373 K and the
foil is initially at 303 K. The foil gets heated radiatively and its temperature
response is given in Table 9.1. Propose a strategy for estimating the emissivity
of the coating.
Solution:
One can straightaway see that this is an inverse problem. In a direct (or forward) problem, typically all the properties of the system are known beforehand, or "a priori", and the temperature response of the system is sought. However, in this case, the experimentally measured temperature response is available and we have to estimate or retrieve the value of the emissivity. The first step would be to set up the mathematical model for the direct problem.
The following assumptions are to be made.
1. There is no heat loss from/to the aluminum foil other than that due to radiation
from/to the walls of the enclosure.
2. The temperature of the enclosure remains constant throughout.
3. The properties of the foil do not change with temperature.
4. The foil is spatially isothermal (lumped capacitance formulation).
For the above assumptions, the heating of the foil can be mathematically represented
as
m C_p (dT/dt) = −ε σ A (T⁴ − T∞⁴)    (9.2)
In Eq. 9.2, the left-hand side represents the rate of change of enthalpy and the right-
hand side represents the heat transfer by radiation. Since T<T∞ always, the minus
sign on the RHS of Eq. 9.2 ensures that dT/dt is positive, thereby confirming that the
foil is getting heated radiatively. Equation 9.2 is frequently referred to as the forward
model, in the parlance of inverse problems.
Solution to the forward model

dT/(T⁴ − T∞⁴) = −[(ε σ A)/(m C_p)] dt    (9.3)

Initial condition: T = Tᵢ at t = 0.
The LHS can be integrated using partial fractions as follows:

∫_{Ti}^{T} dT/(T⁴ − T∞⁴) = ∫_{Ti}^{T} dT/[(T² − T∞²)(T² + T∞²)]    (9.4)

= (1/(2T∞²)) ∫_{Ti}^{T} [1/(T² − T∞²) − 1/(T² + T∞²)] dT    (9.5)

= (1/(2T∞²)) ∫_{Ti}^{T} {(1/(2T∞)) [1/(T − T∞) − 1/(T + T∞)] − 1/(T² + T∞²)} dT    (9.6)

= (1/(4T∞³)) ∫_{Ti}^{T} dT/(T − T∞) − (1/(4T∞³)) ∫_{Ti}^{T} dT/(T + T∞) − (1/(2T∞²)) ∫_{Ti}^{T} dT/(T² + T∞²)    (9.7)

= (1/(4T∞³)) [ln((T − T∞)/(T + T∞))]_{Ti}^{T} − (1/(2T∞²))(1/T∞) [tan⁻¹(T/T∞)]_{Ti}^{T}    (9.8)

= (1/(4T∞³)) [ln((T − T∞)/(T + T∞)) − ln((Tᵢ − T∞)/(Tᵢ + T∞)) − 2 tan⁻¹(T/T∞) + 2 tan⁻¹(Tᵢ/T∞)]    (9.9)
Table 9.2 Variation of the residual with emissivity for Example 9.1

ε    | 0.6    | 0.65   | 0.7    | 0.75  | 0.8   | 0.85 | 0.9  | 0.95
S(ε) | 358.02 | 229.76 | 132.59 | 65.19 | 21.57 | 2.94 | 5.11 | 27.04
The RHS of Eq. 9.3 integrates directly to

−(ε σ A t)/(m C_p)    (9.10)

Equating the integrated LHS (Eq. 9.9) to this quantity,

(1/(4T∞³)) [ln((T − T∞)/(T + T∞)) − ln((Tᵢ − T∞)/(Tᵢ + T∞)) − 2 tan⁻¹(T/T∞) + 2 tan⁻¹(Tᵢ/T∞)] = −(ε σ A t)/(m C_p)    (9.11)
Equation 9.11 is the solution to the forward problem or forward model. It can be seen that the solution is algebraically involved and is not explicit in T. The rather complicated nature of the solution arises from the non-linearity associated with the (T⁴ − T∞⁴) term. Equation 9.11 also suggests that it is not possible to apply linear least squares directly to estimate the emissivity, as the resulting equation will be highly difficult to solve.
One possibility of solving the inverse problem is to substitute various values of ε in Eq. 9.11 and determine the temperatures at the various time instants given in the problem. With these, the following can be calculated:

S(ε) = Σᵢ₌₁ᴺ (T_exp,i − T_calc,i)²    (9.12)
Upon doing such an exercise for ε ranging from 0.6 to 0.95 in steps of 0.05 (it is our prior belief that the coating will have its emissivity in this range), we obtain S(ε) as shown in Table 9.2.
Based on our discussion of optima in the previous chapters, it is clear that, at best, we can say that 0.8 ≤ ε ≤ 0.9. The residuals are plotted against the emissivity in Fig. 9.2. What we have essentially done is an exhaustive, equal interval search. We can apply more sophisticated single-variable searches like the golden section search to get a better estimate of ε with the same number of functional evaluations, namely 8. However, even with the inefficient exhaustive search method discussed above, we can do a little better by locally fitting a Lagrangian interpolation polynomial for S(ε), employing three values of ε where the residuals appear to be the smallest. Differentiating the resulting polynomial and setting the derivative to zero, we get
dS/dε = 13024 ε − 11235 = 0    (9.14)

ε = 0.86    (9.15)
Therefore, the best estimate of ε, with the level of computational intensity discussed above and our limited prior belief (0.6 ≤ ε ≤ 0.95), is ε = 0.86. Of course, we can employ the Gauss-Newton method or even the Levenberg-Marquardt method to get better estimates of ε.
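A short MATLAB sketch of this exhaustive search over ε is given below; it assumes the measured times and temperatures of Table 9.1 are available in vectors t and Texp, and, for illustration, it integrates Eq. 9.2 numerically with ode45 instead of using the closed-form Eq. 9.11. The property values are assumptions of ours, consistent with the problem statement.

% Exhaustive (equal interval) search for the emissivity (illustrative sketch).
sigma = 5.67e-8;                 % Stefan-Boltzmann constant, W/m^2 K^4
A   = 2*(0.02*0.02);             % both faces of the 2 cm x 2 cm foil, m^2
m   = 2700*0.02*0.02*0.006;      % foil mass, kg (assumed aluminium density)
Cp  = 900;                       % specific heat of aluminium, J/kg K (assumed)
Tinf = 373;  Ti = 303;           % enclosure and initial foil temperatures, K

eps_grid = 0.6:0.05:0.95;
S = zeros(size(eps_grid));
for j = 1:numel(eps_grid)
    rhs = @(tt, T) -eps_grid(j)*sigma*A*(T^4 - Tinf^4)/(m*Cp);  % Eq. 9.2
    [~, Tc] = ode45(rhs, t, Ti);          % forward model at the measured times
    S(j) = sum((Texp(:) - Tc).^2);        % residual, Eq. 9.12
end
[~, jbest] = min(S);
fprintf('Best grid estimate: eps = %.2f\n', eps_grid(jbest))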
It is instructive to mention that the above effort is for the estimation of just one parameter, and the forward model is just an ordinary differential equation. The computational complexity of the inverse problem will dramatically increase if (i) the number of parameters to be estimated increases, (ii) the direct problem becomes very involved, as, for example, when it becomes a CFD problem or an integrodifferential equation involving gas radiation, or (iii) both (i) and (ii). From the foregoing example, we can come up with a broad framework for depicting the solution to a parameter estimation problem.
From the schematic shown in Fig. 9.3, one can see that the inverse problem often involves a repeated solution of the forward model. The minimization of R² converts an inverse problem into a minimization problem in optimization.
In the previous section, we learned the basic ideas involved in an inverse problem
in thermal sciences and saw an example of single parameter estimation. It is now
intuitively apparent that in a multiparameter problem, there could be several com-
binations of parameters, all of which may lead to the same ‘effect’, namely, the
temperature distribution or a transient response, depending on the problem. Hence,
the inverse problem is essentially ill-posed and suffers from a lack of uniqueness.
Mathematically, several techniques have been developed, which try to address this
problem and these are explained in several texts and journals. However, as engineers,
invariably we have some more information about the parameter than what is seen
from the measurements or data. If, then, we are able to systematically inject these
prior beliefs of ours about the parameters in the estimation process, then what we
are attempting is a Bayesian approach.
Mathematically, Bayes' theorem gives

P(x|Y) = P(Y|x) P(x) / P(Y)

where P(x|Y) is called the Posterior Probability Density Function (PPDF), P(Y|x) is the likelihood density function, P(x) is the prior density function, and P(Y) is the normalizing constant.
In the above equation, the first term in the numerator on the RHS represents the probability of getting Y for an assumed value of x. This can be obtained from a solution to the direct problem for an assumed x, and we convert S = Σᵢ₌₁ᴺ (Y_exp,i − Y_sim,i)² into a PDF (probability density function). Invariably, a Gaussian distribution for the measurement errors is assumed for doing this. P(x) is our prior knowledge or belief about x, even before the measurements are made or the calculations are done. One can call this "expert knowledge" or "domain knowledge". For example, if the goal of an inverse problem is to determine the thermal conductivity of a metal using an inverse methodology, and the material looks like steel, we can construct a Gaussian for P(k) with a mean (μ) of, say, 50 W/mK and a standard deviation (σ) of 5 W/mK. This means that our prior belief is that, about 99% of the time, the thermal conductivity lies between 50 ± 15 W/mK. This is a very reasonable assumption and often reduces the ill-posedness. For instance, there is no need to conduct searches starting from, say, k = 0.1 W/mK all the way to k = 400 W/mK for solving this problem, if a sensible and rational prior is used. This is the hallmark of the Bayesian approach.
Figure 9.4 presents an overview of the Bayesian approach to inverse problems.
Construction of P(x|Y ) requires data, model, and a model for the distribution of
errors. P(x) is a quantification of our beliefs and here again a Gaussian distribution
has been used. P(x|Y ), the PPDF is a joint PDF of the likelihood and the prior.
Fig. 9.4 Schematic of (a) the likelihood density function P(Y|x), (b) the prior density function P(x), and (c) the posterior probability density function P(x|Y)
The first step is done by conducting experiments. Insofar as the likelihood is concerned, we exploit the idea of measurement error in temperature: the data are modeled as Y = F(x) + ω, where ω is a random variable from a normal distribution with mean 0 and standard deviation σ, σ being the standard deviation of the measuring instrument (thermocouple). Assuming that the uncertainty ω follows a normal or Gaussian distribution, the likelihood can be modeled as

P(Y|x) = [1/(2πσ²)^(n/2)] exp[−(Y − F(x))ᵀ(Y − F(x))/(2σ²)]    (9.18)

where Y is a vector of dimension n, i.e., n measurements are available, and F(x) is the solution to the forward model with the parameter vector x (x can consist of several variables).
P(Y|x) = [1/(2πσ²)^(n/2)] exp(−χ²/2)    (9.19)

where χ² = Σᵢ₌₁ⁿ (Y_meas,i − Y_sim,i)²/σ²    (9.20)

In the above equation, Y_sim,i are the simulated values of Y for an assumed x (set of parameters).
The posterior PDF (PPDF) then becomes

P(x|Y) = { [1/(2πσ²)^(n/2)] exp(−χ²/2) P(x) } / { ∫ [1/(2πσ²)^(n/2)] exp(−χ²/2) P(x) dx }    (9.21)
The prior probability density P(x) typically follows a uniform, normal, or log-normal
distribution. In the case of a uniform prior, P(x) is the same for all values of x, i.e.,
we have absolutely no selective preference. This happens in some cases where we
have no knowledge of x. Such a prior is called a non-informative prior. Needless
to say, it is also an objective prior, as there is no subjective input to the prior.
Let us say P(x) follows a normal distribution with mean μ_p and standard deviation σ_p. Mathematically, P(x) is given by

P(x) = [1/(2πσ_p²)^(n/2)] exp[−(x − μ_p)²/(2σ_p²)]    (9.22)
Therefore, for every assumed value of the data vector X (x1 , x2 . . . xn ), P(x/Y) can
be worked out. From this posterior distribution, one can use two possible estimators
(i) Mean estimate also known as expectation or (ii) Maximum a Posteriori (MAP),
that is, the value of x for which P(x/Y) is maximum. Usually, a sampling algorithm
is used to generate samples of x consecutively. In a multiparameter problem, the
marginal PDF of every parameter must be worked out. It is pertinent to note that
P(x/Y) is a joint PDF. The estimators and the concept of sampling will become clear
on further consideration of the example problem that was considered in the earlier
part of the chapter.
When the prior P(x) is a Gaussian, the posterior becomes

P(x|Y) = { [1/((2π)^((n+1)/2) σⁿ σ_p)] exp[−(χ²/2 + (x − μ)²/(2σ_p²))] } / { ∫ [1/((2π)^((n+1)/2) σⁿ σ_p)] exp[−(χ²/2 + (x − μ)²/(2σ_p²))] dx }    (9.24)

Therefore,

P(x|Y) = exp[−(χ²/2 + (x − μ)²/(2σ_p²))] / ∫ exp[−(χ²/2 + (x − μ)²/(2σ_p²))] dx    (9.25)

The mean estimate then follows as

x̄ = ∫ x exp[−(χ²/2 + (x − μ)²/(2σ_p²))] dx / ∫ exp[−(χ²/2 + (x − μ)²/(2σ_p²))] dx    (9.26)

Often, the integral is replaced by a summation when only discrete values of x are used:

x̄ = Σᵢ xᵢ exp[−(χᵢ²/2 + (xᵢ − μ)²/(2σ_p²))] / Σᵢ exp[−(χᵢ²/2 + (xᵢ − μ)²/(2σ_p²))]    (9.27)

The standard deviation of the estimate is obtained in the same way:

σ_x² = Σᵢ (xᵢ − x̄)² exp[−(χᵢ²/2 + (xᵢ − μ)²/(2σ_p²))] / Σᵢ exp[−(χᵢ²/2 + (xᵢ − μ)²/(2σ_p²))]    (9.29)

Now, we can use this framework to estimate the emissivity ε in the example problem. In the above equation, σ_x is the standard deviation of the estimated parameter, which is very diagnostic of the potency of the estimation process.
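As a concrete illustration of Eqs. 9.27 and 9.29, the following MATLAB sketch post-processes a grid of residuals S(ε), such as the one in Table 9.2, into a posterior mean and standard deviation; the numerical inputs are those assumed in Example 9.2 below.

% Posterior mean and standard deviation over a discrete grid (Eqs. 9.27 and 9.29).
eps_grid = 0.6:0.05:0.95;                      % samples of the parameter
S = [358.02 229.76 132.59 65.19 21.57 2.94 5.11 27.04];  % residuals, Table 9.2
sig  = 0.3;                                    % measurement std. deviation, K
mup  = 0.8;  sigp = 0.05;                      % Gaussian prior parameters

chi2 = S/sig^2;                                % chi-square values, Eq. 9.20
w = exp(-(chi2/2 + (eps_grid - mup).^2/(2*sigp^2)));  % unnormalized PPDF
ebar = sum(eps_grid.*w)/sum(w);                % mean estimate, Eq. 9.27
sige = sqrt(sum((eps_grid - ebar).^2.*w)/sum(w));     % std. deviation, Eq. 9.29
fprintf('eps_bar = %.3f, sigma_eps = %.2e\n', ebar, sige)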
Example 9.2 Consider Example 9.1, wherein an aluminum foil was heated in
an evacuated enclosure. Using the same data and the same samples, determine
the mean of the estimate using a Bayesian approach. Take P(ε) to be a Gaussian
with μp = 0.8 and σp = 0.05. The standard deviation of the uncertainty in the
temperature measurement is ± 0.3 K.
Table 9.3 Estimation of ε (no prior)

i | εᵢ | S(ε) | εᵢ exp(−χ²/2) | exp(−χ²/2) | (εᵢ − ε̄)² exp(−χ²/2)
1 | 0.6  | 358.02 | 0 | 0 | 0
2 | 0.65 | 229.76 | 0 | 0 | 0
3 | 0.7  | 132.59 | 8.70×10⁻³²¹ | 1.24×10⁻³²⁰ | 2.85×10⁻³²²
4 | 0.75 | 65.19  | 3.87×10⁻¹⁵⁸ | 5.16×10⁻¹⁵⁸ | 5.16×10⁻¹⁶⁰
5 | 0.8  | 21.57  | 7.25×10⁻⁵³  | 9.06×10⁻⁵³  | 2.26×10⁻⁵⁵
6 | 0.85 | 2.94   | 6.85×10⁻⁸   | 8.06×10⁻⁸   | 6.81×10⁻²¹
7 | 0.90 | 5.11   | 4.22×10⁻¹³  | 4.69×10⁻¹³  | 1.17×10⁻¹⁵
8 | 0.95 | 27.04  | 5.46×10⁻⁶⁶  | 5.75×10⁻⁶⁶  | 5.75×10⁻⁶⁸
Σ |      |        | 6.85×10⁻⁸   | 8.06×10⁻⁸   | 1.17×10⁻¹⁵
Fig. 9.5 The PPDF (plotted against the emissivity ε) for the case without prior
Solution:
Using the Bayesian formulation presented above, for the no-prior case, we have the results presented in Table 9.3.

ε̄ = (6.85×10⁻⁸)/(8.06×10⁻⁸) = 0.85,  σ_ε = √[(1.17×10⁻¹⁵)/(8.06×10⁻⁸)] = 1.21×10⁻⁴    (9.30)

The typical PPDF for the case without prior is shown in Fig. 9.5.
The mean estimate without the prior is 0.85 with σ_ε = 1.21×10⁻⁴. Such an estimate is frequently referred to as the maximum likelihood estimate.
Now, if we include the Gaussian prior, we get the results presented in Table 9.4.
Table 9.4 Estimation of ε using the Bayesian method (with a Gaussian prior)

εᵢ | S(ε) | A = S(εᵢ)/(2σ²) | B = (ε − μ_ε,prior)²/(2σ_p²) | εᵢ exp[−(A+B)] | exp[−(A+B)] | (εᵢ − ε̄)² exp[−(A+B)]
0.6  | 358.02 | 1989    | 8   | 0 | 0 | 0
0.65 | 229.76 | 1276.44 | 4.5 | 0 | 0 | 0
0.7  | 132.59 | 736.61  | 2   | 1.176×10⁻³²¹ | 1.68×10⁻³²¹ | 4.00×10⁻³²³
0.75 | 65.19  | 362.17  | 0.5 | 2.35×10⁻¹⁵⁸  | 3.13×10⁻¹⁵⁸ | 3.13×10⁻¹⁶⁰
0.8  | 21.57  | 119.83  | 0.0 | 7.25×10⁻⁵³   | 9.06×10⁻⁵³  | 2.26×10⁻⁵⁵
0.85 | 2.94   | 16.33   | 0.5 | 4.16×10⁻⁸    | 4.89×10⁻⁸   | 2.056×10⁻²²
0.9  | 5.11   | 28.39   | 2.0 | 5.71×10⁻¹⁴   | 6.34×10⁻¹⁴  | 1.59×10⁻¹⁶
0.95 | 27.04  | 150.22  | 4.5 | 6.06×10⁻⁶⁸   | 6.38×10⁻⁶⁸  | 6.38×10⁻⁷⁰
Σ    |        |         |     | 4.16×10⁻⁸    | 4.89×10⁻⁸   | 1.59×10⁻¹⁶
Fig. 9.6 The PPDF (plotted against the emissivity ε) for the case with the Gaussian prior
From Table 9.4, we can obtain the mean of ε and also the standard deviation of the estimate as

ε̄ = (4.16×10⁻⁸)/(4.89×10⁻⁸) = 0.85,  σ_ε = √[(1.59×10⁻¹⁶)/(4.89×10⁻⁸)] = 5.69×10⁻⁵    (9.31)

The typical PPDF for the case with prior is shown in Fig. 9.6.
We can see that after the incorporation of the Gaussian prior, the standard deviation
of the estimate of ε has gone down. Hence, the subjective informative Gaussian prior
has helped us considerably in the estimation process. Again, instead of taking uniform
samples of ε from 0.6 in steps of 0.05, one can employ a Markov chain, wherein the
next sample of x (ε in this case) depends on only the current value of x. This can be
done by drawing the new sample from a Gaussian distribution with its mean being
the current value of “x” and “σ” being typically 5% of the current mean. The new
sample can be accepted or rejected stochastically, in a manner analogous to what we
did in the Simulated Annealing method. The procedure elucidated above is frequently referred to as a "Markov chain Monte Carlo (MCMC)" technique. Interested readers may look up advanced statistics texts or journals to learn more about MCMC methods for parameter estimation in thermal sciences.
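To give a flavor of this, a bare-bones Metropolis (MCMC) sampler for a single parameter is sketched below; this is our own illustration, and negLogPost is an assumed user-supplied function returning χ²/2 + (x − μ_p)²/(2σ_p²) for a given x.

% Bare-bones Metropolis sampler for one parameter (illustrative sketch).
% negLogPost(x) is assumed to return chi^2/2 + (x - mup)^2/(2*sigp^2).
nSamp = 5000;
xs = zeros(1, nSamp);
xs(1) = 0.8;                            % starting value of the chain
for i = 2:nSamp
    prop = xs(i-1) + 0.05*xs(i-1)*randn(1);   % proposal: sigma ~5% of current value
    logAlpha = negLogPost(xs(i-1)) - negLogPost(prop);
    if log(rand(1)) < logAlpha          % accept or reject stochastically
        xs(i) = prop;
    else
        xs(i) = xs(i-1);
    end
end
xbar = mean(xs(1001:end));              % discard burn-in, then form the mean estimate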
Problems
9.1 Consider a vertical rectangular mild steel fin of constant area of cross section that is 3 mm in thickness and 100 mm long. The depth of the fin is 300 mm
in the direction perpendicular to the plane of the paper. The fin stands on a
horizontal base. The heat transfer coefficient at the fin surface is 8 W/m2 K,
the base temperature is 370 K and the ambient temperature is 300 K. The fin
temperature distribution can be assumed to be one dimensional, varying only
along the height. Steady-state temperature excess as (θ = T − T∞ ) at various
locations along the height of the fin is measured and tabulated below. For the
case of an adiabatic fin tip, estimate the thermal conductivity of the fin material
by using an exhaustive equal interval search, in the range 20 ≤ k ≤ 60 W/mK
with an interval of 5 W/mK and then switching to a Lagrangian interpolation
formula by using a least square approach.
x (mm) 0 10 20 35 50 65 80 100
θ (K) 70 63.9 59.8 52.7 49.1 44.6 43.4 41.3
Curve Fitting
• To perform system simulations, the performances of various components in a
system and also the thermodynamic and thermophysical properties of the media
should be available in the form of equations. These equations relate quantities
like pump efficiency, head loss, or properties like viscosity and conductivity to the
operating variables like flow rate, pressure, temperature, mole fractions, time, and
spatial coordinates.
• Obtaining equations for the above from a set of points is known as curve fitting. Curve fitting is essentially of two types. (a) Exact fit: the curve passes through every point. This is useful for a limited number of measurements or a limited number of parameters; some examples are polynomial interpolation and Lagrange interpolation. (b) Best fit: the curve does not pass through every point; this is what is invariably used in curve fitting. It is usually based on minimization of the sum of squares of the difference between the actual data and the data generated by the fit. This procedure is known as Least Squares Regression (LSR).
• Many forms like y = ax^b, y = ax₁ + bx₂ + c, y = ae^(bt), and so on can be regressed using LSR. Forms like θ = a[1 − e^(−bt)] cannot be linearized; a and b in such an equation have to be regressed using nonlinear least squares.
• Gauss–Newton Algorithm (GNA): a powerful nonlinear regression tool. We start with initial guess values of the parameters and iteratively obtain the parameters (as, for example, a and b of the previous bullet). The Marquardt algorithm and the Levenberg–Marquardt algorithm are improvements to the basic GNA, in which a damping term is introduced.
Optimization
• The process of finding a condition that gives maximum or minimum.
• May not always be feasible or possible, because of the time, labor, or money
involved.
• For small projects, the cost, time, and effort may not justify optimization.
• For complex systems, the design is too complex to optimize. One strategy is to subdivide the system into subsystems and proceed.
Objective Function
• The important decision is what is to be optimized.
• For example, for aircraft and racing cars, it could be the weight.
• For automobiles, it could be the size, cost, or specific fuel consumption.
• For refrigerators, it could be the initial cost when we buy it in the market, while
for an air conditioner, the more important parameter is the running cost.
Search Methods
• Based on eliminating a portion of the interval (Elimination method).
• Based on systematically climbing to the top (Hill climbing method).
Key Point in Search Method—Interval of Uncertainty
• The final solution lies between two limits. The precise point of optimization is
never known.
• Only the final interval of uncertainty can be specified.
• Reduction ratio = (original interval of uncertainty)/(new interval of uncertainty) = I₀/Iₙ
Exhaustive Search Technique
The figure is a simple depiction of how to use the two-point test method. The function is unimodal, and hence we can only say the optimum lies somewhere between y₄ and y₆.

Iₙ = 2I₀/(n + 1), where n is the number of observations.
– At each trial point, the gradient vector is calculated and the search proceeds along this direction.

Δx₁/(∂y/∂x₁) = Δx₂/(∂y/∂x₂) = ··· = Δxₙ/(∂y/∂xₙ) = α

Choose Δx₁, and all the other Δx's can be obtained; or we can simultaneously choose all the Δx's by defining an α and solving for it.
Multivariable Constrained Optimization
• Penalty function method
– The constrained problem is converted into an unconstrained problem by creating
a composite objective function, which takes care of both the objective function
and the constraint.
– The penalty parameter penalizes the objective function for violating the con-
straints.
– The resulting unconstrained problem is solved using known techniques.
– The problem is solved with different values of the penalty parameter.
– If there is no significant change in the optimum with a change in the penalty,
one can assume that the final solution has been reached.
Multi-objective Optimization
• Multi-objective optimization problems are problems in which multiple objectives are present, and these objectives are invariably in conflict with one another.
• The preference method weights the objectives and solves an equivalent single-objective optimization problem.
Other Optimization Techniques
• Linear programming: Applicable only if the objective function and the constraints
are linear combinations of the independent variables. The graphical method can
be used for two-variable problems. LP problems can be algebraically solved by
introducing slack variables.
• Simplex method is a very systematic algebraic solver that makes use of an iterative
approach to determine the optimum solution for the LP problem.
• Integer programming is a type of LP problem where all the variables must be integers. Problems of this type can be solved using cutting plane or search methods.
• Nontraditional optimization techniques like genetic algorithms and simulated annealing are search techniques that are calculus free, robust, and use stochastic principles. GAs use evolution, while SA uses slow cooling or annealing, as the basis for optimization.
Inverse Problems
• If the effect (temperature) is known but the causes, say (k, ε, h) are not known,
such a problem is known as an inverse problem.
• An inverse problem is typically ill-posed, as several causes could lead to the same
effect.
• For simple problems (for example, those involving one parameter), least squares minimization can be used to estimate the parameter. In this method, for guess values of the parameter, the effect (say, the temperature distribution) is calculated, and the sum of the squares of the difference between the calculated and measured quantities (temperatures) is minimized.
• Parameter estimation problems (an important category of inverse problems) are hence eventually minimization problems. For multiple parameters, the ill-posedness can be handled by injecting prior information about the parameters into the estimation process.
• A systematic injection of prior information is afforded by the Bayesian framework.
• In a Bayesian framework, measurements, measurement errors, the forward (or mathe-
matical) model, and prior beliefs are all synergistically combined to work out the
conditional probability of a set of parameters being the cause of an effect (e.g., a
temperature distribution).
• These probabilities are worked out and by post-processing them, the mean of the
estimate and maximum a posteriori are determined.
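As a minimal illustration of the least squares route mentioned above, the MATLAB sketch below estimates a heat transfer coefficient h from synthetic noisy temperature data of a lumped body; the body, fluid, and measurement values are all assumed for illustration:

% Least squares estimation of h for a lumped body: a minimal sketch.
% Assumed model: T(t) = Tinf + (T0 - Tinf)*exp(-h*A*t/(m*c)).
A = 0.01; m = 0.1; c = 500;              % assumed area, mass, specific heat
Tinf = 30; T0 = 100;                     % assumed fluid and initial temperatures
t     = (0:30:600)';                     % measurement times, s
model = @(h) Tinf + (T0 - Tinf)*exp(-h*A*t/(m*c));
rng(1);
Tmeas = model(25) + 0.5*randn(size(t));  % synthetic noisy "measurements"
S     = @(h) sum((Tmeas - model(h)).^2); % sum of squared residuals
hhat  = fminbnd(S, 1, 100);              % minimize over guesses of h
fprintf('Estimated h = %.2f W/m^2 K (true value used: 25)\n', hhat);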
Appendix
hb = 4 × 0.15 × v²/(2 × g) = 0.2643 m (A.3)
Pressure loss
Fig. A.1 Sketch of the layout of the pump and piping arrangement (a design solution)
Table A.1 The costs of different pumps collected from market sources
S.No. Power (hp) ΔP (Pa) Cost (Rs.)
1 0.25 47605.5 2816
2 0.5 95210.2 3088
3 1 190420.4 3551
4 1.5 285630.6 6708
5 2 380840.8 15825
6 3 571261.2 18535
7 5 952102.1 19102
8 7 1332942.9 25000
9 8 1523363.3 26000
10 10 1904204.2 38082
11 12.5 2380255.2 44212
12 15 2856306.3 46622
13 20 3808408.4 65047
14 25 4760510.5 85204
Table A.2 The cost of PVC pipes of different diameters available in market
S.No. Diameter (mm) Cost/m (Rs.)
1 20 28.07
2 25 42.32
3 32 69.11
4 40 107.63
5 50 166.68
Sample Calculation
ΔP3 = (1 × 746 × 0.85)/(3.33 × 10^−3) = 190420.42 Pa
The data in Table A.1 is fitted linearly to obtain the pump cost in terms of ΔP.
From Fig. A.2, the cost of the pump in terms of ΔP is given by the following
equation:
Pump cost = Rs. 0.016 × ΔP + 3526
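This fit can be reproduced with a first-degree polyfit on the Table A.1 data; in the minimal sketch below, the coefficients returned by least squares should come out close to those quoted above:

% Linear fit of pump cost against dP using the Table A.1 data.
dP   = [47605.5 95210.2 190420.4 285630.6 380840.8 571261.2 952102.1 ...
        1332942.9 1523363.3 1904204.2 2380255.2 2856306.3 3808408.4 4760510.5];
cost = [2816 3088 3551 6708 15825 18535 19102 25000 26000 38082 ...
        44212 46622 65047 85204];
p = polyfit(dP, cost, 1);                % p(1) = slope, p(2) = intercept
fprintf('Pump cost = %.4f*dP + %.0f (book fit: 0.016*dP + 3526)\n', p(1), p(2));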
Pipe cost:
In a similar fashion, the pipe costs are also collected from market sources and are
tabulated in Table A.2.
Fig. A.2 Linear fit of the pump cost (Rs.) against ΔP (Pa), from the Table A.1 data
Fig. A.3 Linear fit of the pipe cost (Rs./m) against diameter d (m); the fitted line is y = 4637.37x − 72.126
Again using a linear fit for the cost of the pipe (shown in Fig. A.3), we obtain the
pipe cost in terms of the diameter d as follows:
Pipe cost per meter = 4637.37 × d − 72.126
The pipe cost now needs to be obtained as f(ΔP):
ΔP = (f × l × u² × ρ × g)/(2 × d × g) (A.8)
f = 0.182 × Re^−0.2 = 0.0327/(d^−0.21) (A.9)
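The sketch below shows one way of evaluating Eqs. (A.8) and (A.9) numerically and inverting them for the diameter at a given ΔP, so that the pipe cost can then be expressed as f(ΔP). Q is taken from the sample calculation above; the density ρ, kinematic viscosity ν, and pipe length l are assumed here, as they are not stated in this extract, so the numbers agree with the original problem only for its actual data:

% Pipe cost as a function of dP: a minimal numerical sketch of Eqs. (A.8)-(A.9).
Q = 3.33e-3; rho = 1000; nu = 1e-6; l = 100;   % rho, nu, l are assumed values
u  = @(d) 4*Q./(pi*d.^2);                      % mean velocity from continuity
Re = @(d) u(d).*d/nu;                          % Reynolds number
fr = @(d) 0.182*Re(d).^(-0.2);                 % friction factor, Eq. (A.9)
dP = @(d) fr(d).*l.*u(d).^2*rho./(2*d);        % pressure drop, Eq. (A.8)
% Invert numerically: find the diameter for a target dP, then price the pipe.
dPtarget = 249288;                             % Pa (the Case 1 optimum)
d = fzero(@(dd) dP(dd) - dPtarget, [0.01 0.2]);
fprintf('d = %.4f m, pipe cost = Rs. %.2f per m\n', d, 4637.37*d - 72.126);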
Running Cost
Case 1: Running cost for 20 years with a 6% increase in unit cost per year. Based on
the data given in the problem, the running costs are calculated and tabulated in
Table A.3. The life of the system is assumed to be 20 years, the unit cost of
electricity is Rs. 5.50 per unit and increases by 6% every year, and the pump has to
work two hours daily.
Here, 202.3196 is the sum of the unit costs of electricity (Rs. per unit) over the
20-year period, so that
Running cost = (ΔP × Q)/(η × 1000) × 202.3196 × 2 × 365 (A.16)
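The escalation factor, and the slope of the running cost fit in Eq. (A.20) below, can be verified directly with a few MATLAB lines (Q and η as in the sample calculation):

% Verifying the 20-year electricity cost factors and the running cost slope.
factor6 = sum(5.50*1.06.^(0:19));       % = 202.3196, used in Eq. (A.16)
factor7 = sum(5.50*1.07.^(0:19));       % = 225.47, used later in Eq. (A.25)
Q = 3.33e-3; eta = 0.85;
slope = Q/(eta*1000)*factor6*2*365;     % = 0.578, the coefficient in Eq. (A.20)
fprintf('%.4f  %.2f  %.4f\n', factor6, factor7, slope);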
Table A.3 Running cost for different pumps over 20 years with 6% increase in unit cost per year
S.No. Power (hp) ΔP (Pa) Running cost (Rs.)
1 0.25 47605.5 27544.8
2 0.5 95210.2 55088.6
3 1 190420.4 110177.3
4 1.5 285630.6 165265.8
5 2 380840.8 220354.5
6 3 571261.2 330531.7
7 5 952102.1 550886.3
8 7 1332942.9 771240.8
9 8 1523363.3 881418.1
10 10 1904204.2 1101772.5
11 12.5 2380255.2 1377215.7
12 15 2856306.3 1652658.8
13 20 3808408.4 2203545.1
14 25 4760510.5 2754431.4
Using a linear fit for the lifetime running cost (Fig. A.4), we obtain
Running cost = Rs. 0.578 × ΔP + 0.071 (A.20)
Optimization
The optimum of the total lifetime cost (C) of the pump and piping arrangement is obtained as follows:
Fig. A.4 Linear fit of the lifetime running cost (Rs.) against ΔP (Pa) for Case 1
dC/d(ΔP) = 0 (A.23)
ΔP = 249288 Pa (A.24)
The optimum power of the pump = (Q × ΔP)/(η × 746) = 1.31 hp
The nearest rating of the pump available in the market is 1.50 hp.
Therefore, the optimum pipe diameter and pump power for a life of 20 years with
6% increase in unit cost per year are 66.16 mm and 1.5 hp, respectively.
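A one-line MATLAB check of the optimum power in Eq. (A.24), with Q and η as in the sample calculation:

% Cross-check of the Case 1 optimum pump power.
dP = 249288; Q = 3.33e-3; eta = 0.85;
hp = dP*Q/(eta*746)    % = 1.31 hp, rounded up to the nearest market rating, 1.5 hp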
Case 2: Running cost for 20 years with a 7% increase in unit cost per year. Using the
approach detailed above, the running costs are calculated and presented in Table A.4.
The life of the system is 20 years, the unit cost of electricity is Rs. 5.50 per unit and
increases by 7% every year, and the pump has to work two hours daily.
Table A.4 Running costs for different pumps over 20 years with 7% increase in unit cost per year
S.No. Power (hp) ΔP (Pa) Running cost (Rs.)
1 0.25 47605.5 30695.7
2 0.5 95210.2 61391.5
3 1 190420.4 122783.0
4 1.5 285630.6 184174.6
5 2 380840.8 245566.2
6 3 571261.2 368349.3
7 5 952102.1 613915.4
8 7 1332942.9 859481.6
9 8 1523363.3 982264.7
10 10 1904204.2 1227830.8
11 12.5 2380255.2 1534788.6
12 15 2856306.3 1841746.3
13 20 3808408.4 2455661.7
14 25 4760510.5 3069577.2
Running cost = (ΔP × Q)/(η × 1000) × 225.47 × 2 × 365 (A.25)
Using a linear fit for the lifetime running cost, the running cost of the pump for
20 years with a 7% increase in unit cost is obtained (Fig. A.5).
Fig. A.5 Linear fit of the lifetime running cost (Rs.) against ΔP (Pa) for Case 2
Optimization
The optimum value of the total lifetime cost (C) of the pump and piping arrangement
is obtained as follows:
dC/d(ΔP) = 0 (A.32)
ΔP = 243292 Pa (A.33)
The optimum power of the pump = (Q × ΔP)/(η × 746) = 1.28 hp (A.35)
Fig. A.6 Sketch of the optimum configuration of the pump and piping arrangement (case 2)
Fig. A.7 A diagrammatic representation of the tools required to optimize a thermal system
Therefore, the optimum pipe diameter and pump power for a life of 20 years with
7% increase in unit cost per year are 68.96 mm and 1.5 hp respectively.
A sketch of the final configuration of the pump and piping arrangement for Case 2
is given in Fig. A.6.
In summary, this example epitomizes the three key aspects involved in optimizing
a thermal system, which can be visualized as shown in Fig. A.7.
Random Number Table
Index

A
Air conditioning, 27
Aircraft, 130
Annealing, 310

B
Bayesian inference, 346
Bayes' theorem, 346
Best fit, 69
Bits, 290
Boltzmann constant, 342
Boltzmann distribution, 311
Breast tumor, 78

C
Calculus, 137
Cargo ship, 158, 294
Coefficient of determination, 92
Composite objective function, 231
Computerized axial tomography, 340
Conjugate gradient method, 224
Correlation coefficient, 92
Cricket match, 274
Crossover, 303
Cutting plane methods, 269

D
Darwinian theory, 284
Determinant, 157
Dichotomous search, 188
Duckworth-Lewis method, 274
Dynamic programming, 273

E
Eigen values, 158
Elimination method, 184
Elitist strategy, 292
Emissivity, 339, 341
Enthalpy, 342
Equality constraints, 134
Ethylene refining plant, 131
Exact fit, 96
Exhaustive search, 180
Extremum, 188

F
Fibonacci search, 193
Fitness, 303
Fukushima, 27
Furniture company, 250

G
Gaussian distribution, 346
Gaussian prior, 350
Gauss Newton Algorithm, 107, 128
Gauss–Seidel method, 67
Generator, 278
Genetic algorithms, 283
Global minimum, 181
Golden ratio, 204
Golden section search, 203
Gomory's fractional cut algorithm, 269

H
Heat flux, 84
Helter skelter approach, 136
Hessian matrix, 157

I
Inequality constraints, 165
Infinite barrier penalty, 232
Inflection point, 143
Integer programming, 268
Interval of uncertainty, 356
Inverse, 339

K
Kaizen philosophy, 290
Kuhn Tucker, 251

L
Lagrange interpolation, 210
Lattice method, 212
Least squares regression, 354
Levenberg–Marquardt method, 115
Levenberg method, 113
Linear programming, 133
Local minimum, 181
Logarithmic mean temperature difference, 12
Log-normal, 348

M
MATLAB, 108, 128, 140, 247
MATLAB code, 54, 100, 112, 118, 122, 164, 169, 192, 200, 207, 221, 229, 233, 245, 297, 301, 306, 322, 327
Matrix, 157
Matrix inversion, 52
Maximum, 143, 155, 158
Maximum a posteriori, 348
Micro-economics, 9, 66
Minimum, 143
Monotonic function, 181
Multi-modal function, 182
Multi-objective optimization, 235
Mutation, 303

N
Newton–Raphson method, 52
Nozzle, 173

O
Optimization, 175, 231

P
Pareto front, 235
Parity plot, 93
Particle swarm optimization, 292
Penalty function, 232, 249
Positive definite matrix, 158
Posterior probability density, 346
Post optimality analysis, 252
Prior, 346
Prior probability, 348
Probability, 346
Pump, 136

R
Random number, 316
Random table, 371
Reduction ratio, 189
Refrigerator, 130
Root mean square, 96

S
Satellite, 130
Search methods, 137, 177, 178, 189, 210
Sensitivity analysis, 252
Sequential arrangement, 30
Shadow price, 147
Shell and tube heat exchanger, 12
Simplex method, 256, 260
Simulated annealing, 316
Slack variables, 278
Solar water heater, 159
Standard deviation, 346
Steepest ascent/descent, 214
Street car, 137
String, 295, 299, 304
Successive substitution, 65

T
Taylor series, 346
Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), 242
Tennis tournament, 296
Thermogram, 78
Thermophysical properties, 354
Three points test, 78
Tournament selection, 296
Traveling salesman, 277

U
Uncertainty, 246
Uniform power, 352

W
Water, 178, 285, 340