
DESIGN OF FRICTIONAL MACHINE ELEMENTS

(ME3202)

Module 7: Optimization in Design


Lesson 2: Constrained Multi-Variable Optimization
Problems

B. Tech., Mech. Engg., 6th Sem, 2022-23


Constrained Multi-variable Optimization Problems
• Almost all engineering design and decision-making problems aim to minimize or maximize a function (usually a cost or profit function) while simultaneously satisfying constraints arising from space, strength, or stability considerations.
• A constrained optimization problem comprises an objective function together with a
number of equality and inequality constraints. Often lower and upper bounds on the
design variables are also specified.
• Constraints and bounds enclose a feasible region in the solution space. A point (or
solution) is defined as a feasible point (or solution) if all equality and inequality constraints
and variable bounds are satisfied at that point. The feasible region usually shrinks as more
constraints are added, and vice versa.
• A point x_j^(i) is said to satisfy a constraint if the left-hand-side expression of the constraint,
evaluated at that point, agrees with the right-hand-side value under the relational operator between them.
Such a constraint-qualified point is a candidate feasible (and possibly optimal) solution.
• The problem is called a Non-Linear Programming Problem (NLPP) if either the objective
function or any of the constraints is non-linear. If all are linear, the problem is a Linear Programming Problem (LPP).
Constrained Multi-variable Optimization Problems
• The general structure of a constrained (single objective) multivariable optimization
problem is:
Minimize f(x_j), for j = 1 to N
Subject to,
g_k(x_j) ≥ 0, for k = 1 to K
h_m(x_j) = 0, for m = 1 to M
And,
x_j^lower ≤ x_j ≤ x_j^upper

• There may be 'less than or equal to' type constraints, which can be converted to 'greater
than or equal to' type by multiplying the constraints by −1.
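
As an illustration of this general structure (not part of the original slides), the following minimal Python sketch poses such a problem with SciPy's SLSQP solver; the particular objective, constraints, and bounds below are placeholders chosen only for demonstration.

```python
# Minimal sketch of the general constrained NLP structure using SciPy.
# The specific f, g, h and bounds below are illustrative placeholders.
import numpy as np
from scipy.optimize import minimize

def f(x):                        # objective f(x_j), j = 1..N (here N = 2)
    return (x[0] - 1.5)**2 + (x[1] - 1.5)**2

def g(x):                        # inequality constraint g_k(x_j) >= 0
    return 4.0 - x[0] - x[1]

def h(x):                        # equality constraint h_m(x_j) = 0
    return x[0] - x[1]

constraints = [
    {"type": "ineq", "fun": g},  # SciPy's convention is fun(x) >= 0, matching the slides
    {"type": "eq",   "fun": h},
]
bounds = [(0.0, 5.0), (0.0, 5.0)]           # x_j^lower <= x_j <= x_j^upper

result = minimize(f, x0=np.array([0.5, 0.5]),
                  method="SLSQP", bounds=bounds, constraints=constraints)
print(result.x, result.fun)                 # optimal design variables and objective value
```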
Constrained Multi-variable Optimization Problems
• There is no restriction on the number of inequality constraints. However, the total number of active
constraints (those satisfied as equalities) must be less than, or at most equal to, the number of design
variables.
• The inequality constraints can be scaled by any positive constant, and the equalities by any
constant; this does not affect the feasible region and hence the optimum solution. Such
transformations do, however, affect the values of the Lagrange multipliers and the performance of the
numerical algorithms.
• The number of independent equality constraints must be less than, or at the most, equal to the
number of design variables, i.e. 𝑀 ≤ 𝑁
• If, 𝑀 < 𝑁, we have a feasible region to search for the optimal solution.
• When 𝑀 = 𝑁, no optimization of the system is necessary because the roots of the equality
constraints are the only candidate points for optimum design.
• If M > N, we have an overdetermined system of equations. In that case,
1. Either some equality constraints are redundant (linearly dependent on other constraints). The
redundant constraints can then be deleted and, if M becomes < N, an optimum
solution for the problem is again possible.
2. Or they are inconsistent. In this case, no solution to the design problem is possible and the
problem formulation should be re-examined.
Constrained Multi-variable Optimization Problems
• An inequality constraint g_k(x_j) ≥ 0 is said to be active if it is satisfied as an equality at the
optimal point x_j*, i.e., g_k(x_j*) = 0. Such a constraint is also called tight or binding.
• For a feasible design, an inequality constraint may or may not be active, but the equality
constraints are always active.
• An inequality constraint is inactive if g_k(x_j*) > 0.
• An inequality constraint is violated if g_k(x_j*) < 0 (the opposite relationship w.r.t. the
constraint's original definition).
• An equality constraint is violated if h_m(x_j) ≠ 0. So, by these definitions, an equality
constraint is either active or violated at a given point.
Solution Techniques for
Constrained Multi-Variable Optimization Problems
• Constrained Multi-Variable Optimization Problems can be solved analytically as well as numerically.
• The Lagrange Multiplier method is an analytical technique for multi-variable optimization
problems with equality constraints. It can, however, be extended to inequality-constrained
cases as well.
• Using slack variables and the Lagrange multiplier technique, the inequality and equality constraints can
be adjoined to the objective function to form an unconstrained problem, the solution of which gives the
Kuhn-Tucker points, among which the optimal point may lie. This is a generalized method.
• Often, the necessary conditions of the Lagrange Multiplier Theorem lead to a nonlinear set of
equations that cannot be solved analytically. In such cases, we must use a numerical algorithm,
such as Newton’s method.
• The numerical algorithms can be classified into direct and gradient-based methods.
• A solution to a constrained optimization problem may not exist; this may happen due to over-
constraining or conflicting constraints. In that case, the formulation should be re-examined.
Multivariable Optimization with Equality Constraints
• Let us consider a multivariable optimization problem,
Minimize f(x_j), for j = 1 to N
Subject to,
h_m(x_j) = 0, for m = 1 to M

• Analytical solution techniques:
 Direct Substitution method (difficult to solve for non-linear constraints)
 Method of Constrained Variation (requires difficult calculations of higher-order determinants)
 Lagrange Multiplier method
Direct Substitution method
Consider minimization of f(x1, x2) = (x1 − 1.5)² + (x2 − 1.5)² … (1)
Subject to an equality constraint: h(x1, x2) = x1 + x2 − 2 = 0 … (2)
By the direct substitution method, we can express one variable in terms of the other,
x2 = −x1 + 2 … (3)
or, x2 = φ(x1) … (4)
Substituting the value of x2 in the function definition we get,
f(x1, x2) = f(x1, φ(x1)) = (x1 − 1.5)² + (−x1 + 2 − 1.5)² … (5)
The necessary condition, df/dx1 = 0, now gives x1* = 1 and, from eq.(3), x2* = 1.
From the sufficient condition, d²f/dx1² |_(x1*, x2*) > 0, the function f(x1, x2) is
minimum at (x1*, x2*) and the minimum value is f_min |_(x1*, x2*) = 0.5
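
The direct-substitution result above can be reproduced symbolically; the short sketch below (added for illustration, assuming SymPy is available) substitutes x2 = 2 − x1 into f and applies the first- and second-derivative tests.

```python
# Direct substitution for eqs.(1)-(2): substitute x2 = 2 - x1 into f and
# apply the necessary (df/dx1 = 0) and sufficient (d2f/dx1^2 > 0) conditions.
import sympy as sp

x1 = sp.symbols('x1', real=True)
f = (x1 - 1.5)**2 + ((-x1 + 2) - 1.5)**2       # eq.(5)

stationary = sp.solve(sp.diff(f, x1), x1)      # necessary condition df/dx1 = 0
x1_star = stationary[0]                        # -> 1
x2_star = -x1_star + 2                         # eq.(3) -> 1
curvature = sp.diff(f, x1, 2)                  # -> 4 > 0, hence a minimum
f_min = f.subs(x1, x1_star)                    # -> 0.5

print(x1_star, x2_star, curvature, f_min)
```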
When Function 𝝓 is Not Explicit
• In most complex real-life problems, the function φ(x1) cannot be obtained in an
explicit form. For those cases we need to look for alternatives.
• Let us consider the necessary condition df/dx1 |_(x1*, x2*) = 0
• Using the chain rule of differentiation on the l.h.s., we have:
df/dx1 = ∂f/∂x1 + (∂f/∂x2)(dx2/dx1) = ∂f/∂x1 + (∂f/∂x2)(dφ/dx1) … (6)
• As φ(x1) is not explicitly known, dφ/dx1 needs to be eliminated from the equation. In
order to do that, let us differentiate the constraint equation h(x1, x2) = 0 w.r.t. x1
(assuming h(x1, x2) is differentiable w.r.t. x1), which gives:
dh/dx1 = ∂h/∂x1 + (∂h/∂x2)(dx2/dx1) = ∂h/∂x1 + (∂h/∂x2)(dφ/dx1) = 0 … (7)
Explicit and Implicit Functions
Explicit Function
• The dependent variable is clearly expressed in terms of the independent variables. The general form is: y = f(x)
• Examples:
y = ax² + bx + 1
y = eˣ + 1
y = cos⁻¹x + 4x
• If the dependent and independent variables can be separated without performing any mathematical operations, the function is called explicit. Example: y = x + 2 is explicit, but x − y + 2 = 0 is implicit.

Implicit Function
• The dependent variable is not clearly expressed in terms of the independent variables. The general form is: f(x, y) = 0
• It is not always clear which of the variables is the dependent variable.
• Examples:
x² + xy + y² = 10
ax + by² = cos(xy)
• An implicit function can be converted to explicit form, but the result will be complex.
When 𝝓 is not explicitly obtainable …contd.
• or, dφ/dx1 = −(∂h/∂x1)/(∂h/∂x2) … (8)
• Now replacing dφ/dx1 in eq.(6), we get,
df/dx1 = ∂f/∂x1 + (∂f/∂x2)(dφ/dx1) = ∂f/∂x1 − (∂f/∂x2)(∂h/∂x1)/(∂h/∂x2)
• Rearranging, we get, df/dx1 = ∂f/∂x1 − [(∂f/∂x2)/(∂h/∂x2)](∂h/∂x1) … (9)
• At the optimal point (x1*, x2*), as per the necessary condition, df/dx1 |_(x1*, x2*) = 0
• Therefore, ∂f/∂x1 |_(x1*, x2*) + [−(∂f/∂x2)/(∂h/∂x2)] |_(x1*, x2*) · ∂h/∂x1 |_(x1*, x2*) = 0 … (10)
Lagrange Multiplier
• We define a scalar quantity v, called the Lagrange multiplier, as:
v = −(∂f/∂x2)/(∂h/∂x2) |_(x1*, x2*) … (11)
(Remember, v is a sign-free variable; it can be considered +ve also.)
• Replacing v, eq.(10) can be written as, ∂f/∂x1 |_(x1*, x2*) + v ∂h/∂x1 |_(x1*, x2*) = 0 … (12)
• From eq.(11), we can also write, ∂f/∂x2 |_(x1*, x2*) + v ∂h/∂x2 |_(x1*, x2*) = 0 … (13)
• eq.(12) & (13) can be written in general vector notation as, ∇f(x*) + v ∇h(x*) = 0 … (14)
• where x* = [x1*, x2*]ᵀ, ∇f(x*) = [∂f/∂x1, ∂f/∂x2]ᵀ |_(x1*, x2*) and ∇h(x*) = [∂h/∂x1, ∂h/∂x2]ᵀ |_(x1*, x2*)
Necessary Condition for Optimality
for Lagrange Multiplier method
• Eq.(2) together with eqs.(12) and (13), or eq.(2) and eq.(14), give the necessary conditions of optimality for
the problem by the Lagrange Multiplier method.
• The necessary conditions are more commonly generated by constructing a function L,
known as the Lagrange function, as L(x1, x2, v) = f(x1, x2) + v h(x1, x2). For this case L is
a function of three variables: x1, x2 and v
• The necessary conditions for its extremum are given in terms of 𝐿, by:
• ∂L/∂x1 |_(x1*, x2*) = ∂f/∂x1 |_(x1*, x2*) + v ∂h/∂x1 |_(x1*, x2*) = 0
• ∂L/∂x2 |_(x1*, x2*) = ∂f/∂x2 |_(x1*, x2*) + v ∂h/∂x2 |_(x1*, x2*) = 0
• ∂L/∂v |_(x1*, x2*) = h(x1*, x2*) = 0
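
Applied to the earlier example (eqs. (1)-(2)), these three conditions can be solved directly; a small sketch assuming SymPy is available:

```python
# Lagrange necessary conditions dL/dx1 = dL/dx2 = dL/dv = 0 for
# f = (x1 - 1.5)^2 + (x2 - 1.5)^2 subject to h = x1 + x2 - 2 = 0.
import sympy as sp

x1, x2, v = sp.symbols('x1 x2 v', real=True)
f = (x1 - 1.5)**2 + (x2 - 1.5)**2
h = x1 + x2 - 2
L = f + v*h                                   # Lagrange function L(x1, x2, v)

conditions = [sp.diff(L, var) for var in (x1, x2, v)]
solution = sp.solve(conditions, [x1, x2, v], dict=True)
print(solution)                               # x1 = x2 = 1, v = 1: optimum at (1, 1)
```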
Lagrange Multiplier Method in General Index Notation
• The constraints h_m(x_j) = 0 must be differentiable.
• The number of constraints must be less than or equal to the number of design variables. If
they are equal, solving the constraints themselves gives the solution.
• For a general multivariable (single-objective) optimization problem with equality
constraints,
Minimize f(x_j), for j = 1 to N
Subject to, h_m(x_j) = 0, for m = 1 to M
• if the Lagrange function is L(x_j, v_m) = f(x_j) + v_m h_m(x_j)
• or, expanding, L(x_j, v_m) = f(x_j) + v1 h1(x_j) + v2 h2(x_j) + ⋯ + vM hM(x_j), then,
Lagrange Multiplier Method in Index Notation …contd.
• The necessary conditions (otherwise known as the necessity theorem) for optimality by the
Lagrange multiplier method are given by:
∂L/∂x_j |_(x_j*) = ∂f/∂x_j |_(x_j*) + v_m ∂h_m/∂x_j |_(x_j*) = 0
∂L/∂v_m |_(x_j*) = h_m(x_j*) = 0

• The necessary conditions become sufficient conditions for f(x_j) to have a constrained
relative minimum (or maximum) at x_j* if:
1) The objective function f(x_j) is convex (concave), which is checked through the definiteness of the
Hessian matrix (details can be found at the end of this PPT), and,
2) The constraints are of equality type
Lagrange Multiplier Method in Index Notation …contd.
• Another way of checking the sufficient conditions is by constructing the bordered Hessian matrix
H^B as:
H^B = [ 0    U
        Uᵀ   V ]
which is an (M + N) × (M + N) matrix,
where U = [∂h_m/∂x_j] is the M × N matrix of constraint gradients and
V = [∂²L/∂x_i ∂x_j] is the N × N matrix of second derivatives of the Lagrange function.

• Starting with the principal minor of order 2M + 1, compute the last (N − M) principal minors of
H^B at the point x_j* with λ*
 If the principal minors are of alternate sign, starting with (−1)^(M+N), the point x_j* is a maximum point
 If the principal minors are of the same sign, starting with (−1)^M, the point x_j* is a minimum point
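
For the earlier two-variable example (N = 2, M = 1), the bordered Hessian test can be carried out numerically; the sketch below is an illustrative addition that builds H^B at (x1*, x2*) = (1, 1), v* = 1 and inspects the single remaining principal minor.

```python
# Bordered Hessian for f = (x1-1.5)^2 + (x2-1.5)^2, h = x1 + x2 - 2 = 0
# at the candidate point (1, 1) with v* = 1.  Here M = 1, N = 2, so the
# last N - M = 1 principal minor, of order 2M + 1 = 3, decides the test.
import numpy as np

U = np.array([[1.0, 1.0]])          # dh/dx_j, shape M x N
V = np.array([[2.0, 0.0],           # d2L/dxi dxj = d2f/dxi dxj (h is linear)
              [0.0, 2.0]])
HB = np.block([[np.zeros((1, 1)), U],
               [U.T,              V]])

minor = np.linalg.det(HB)           # order-3 principal minor of H^B
print(minor)                        # -4.0: same sign as (-1)^M = -1 -> minimum point
```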
Multivariable Optimization with Inequality Constraints
• Let us consider a multivariable optimization problem,
Minimize f(x_j), for j = 1 to N
Subject to, g_k(x_j) ≤ 0, for k = 1 to K
(in which the variable bounds x_j^lower ≤ x_j ≤ x_j^upper are included as inequalities, giving
K + 2N inequalities in total)
• The inequality constraints can be transformed into equality constraints by adding non-
negative slack variables s_k² (the square of s_k is taken to impose non-negativity and, at the
same time, to avoid the additional constraints s_k ≥ 0).
• Note:
 'greater-than-or-equal-to' inequalities are first converted to 'less-than-or-equal-to' type by multiplying
by −1, before adding the slack variable.
 if the problem is a maximization or the constraint is of 'greater-than-or-equal-to' type, the slack term
is taken with the opposite (non-positive) sign.
Multivariable Optimization with Inequality Constraints
• Now, the problem definition becomes,
Minimize f(x_j), for j = 1 to N
Subject to, g_k(x_j, s_k) = g_k(x_j) + s_k² = 0, for k = 1 to K
• This problem can now be solved conveniently by the method of Lagrange multipliers.
• For this, we construct the Lagrange function L as,
• L(x_j, u_k, s_k) = f(x_j) + u_k g_k(x_j, s_k), where u_k is the vector of Lagrange multipliers
(the letter u instead of v is used to indicate that, in this case, the inequality has been
converted to an equality)
Multivariable Optimization with Inequality Constraints
• So, the necessary conditions for optimality by the Lagrange multiplier method are given by:
∂L/∂x_j |_(x_j*) = ∂f/∂x_j |_(x_j*) + u_k ∂g_k/∂x_j |_(x_j*) = 0
∂L/∂u_k |_(x_j*) = g_k(x_j, s_k) |_(x_j*) = 0
∂L/∂s_k |_(x_j*) = 2 u_k s_k |_(x_j*) = 0

• It can be shown that u_k ≥ 0, by considering how the gradients change along feasible directions.
• The solution of these equations gives the optimum point x_j*, the values of the
Lagrange multiplier vector u_k, and the values of the slack variable vector s_k evaluated at
x_j*. Any s_k = 0 implies that the corresponding g_k = 0 (active) at x_j*.
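
As an illustration (not part of the original slides), these conditions can be written out for the earlier example with the equality replaced by the inequality x1 + x2 − 2 ≤ 0, using a squared slack variable:

```python
# Lagrange conditions with a squared slack variable for
# f = (x1 - 1.5)^2 + (x2 - 1.5)^2 subject to g = x1 + x2 - 2 <= 0.
import sympy as sp

x1, x2, u, s = sp.symbols('x1 x2 u s', real=True)
f = (x1 - 1.5)**2 + (x2 - 1.5)**2
g = x1 + x2 - 2
L = f + u*(g + s**2)                    # inequality converted via g + s^2 = 0

eqs = [sp.diff(L, var) for var in (x1, x2, u, s)]
sols = sp.solve(eqs, [x1, x2, u, s], dict=True)
for sol in sols:
    if sol[s].is_real and sol[u] >= 0:  # keep real, dual-feasible solutions only
        print(sol)
# -> x1 = x2 = 1, u = 1, s = 0: the constraint is active at the optimum
```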
Karush-Kuhn-Tucker (KKT) Conditions for
Multivariable Optimization with (General) Constraints
• Previously known as the Kuhn-Tucker (KT) conditions, based on first-derivative tests;
sometimes known as the 'first-order necessary conditions'.

• Combining the previous two cases, we get the KKT conditions:

1. Standard Lagrange function for the problem:

L(x_j, v_m, u_k, s_k) = f(x_j) + v_m h_m(x_j) + u_k g_k(x_j, s_k)
Karush-Kuhn-Tucker (KKT) Conditions for
Multivariable Optimization with (General) Constraints
2. Gradient condition:
Stationarity conditions: ∂L/∂x_j |_(x_j*) = ∂f/∂x_j |_(x_j*) + v_m ∂h_m/∂x_j |_(x_j*) + u_k ∂g_k/∂x_j |_(x_j*) = 0

Primal feasibility conditions:
∂L/∂v_m |_(x_j*) = h_m(x_j) |_(x_j*) = 0
∂L/∂u_k |_(x_j*) = g_k(x_j, s_k) |_(x_j*) = 0

3. Feasibility check for inequalities: s_k² ≥ 0, or equivalently g_k ≤ 0

4. Switching conditions or complementary slackness condition: ∂L/∂s_k |_(x_j*) = 2 u_k s_k |_(x_j*) = 0

(remember, although the index k is repeated here, u_k s_k does not undergo Einstein summation)
Karush-Kuhn-Tucker (KKT) Conditions for Multivariable
Optimization with (General) Constraints
5. Non-negativity of the Lagrange multipliers for inequalities, or dual feasibility: u_k |_(x_j*) ≥ 0
(positive or negative depending on the convexity or concavity of the inequality constraints)
6. Regularity check: the gradients of the active constraints must be linearly independent. In such
a case the Lagrange multipliers for the constraints are unique.

• These six conditions are known as the KKT conditions.

• If a point x_j satisfies all the KKT conditions, it is called a KKT point and can be an optimal
solution.
• Not all feasible solutions (points that satisfy the constraints) are KKT points.
• The process involves a large number of equations (growing multiplicatively as constraints are
added) and so can go beyond the scope of manual calculation for practical problems; a numerical
screening of a candidate point is sketched below.
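
The helper below is a hedged sketch (the function name, argument layout, and tolerance are illustrative assumptions, not from the slides) that screens a candidate point for stationarity, primal feasibility, complementary slackness, and dual feasibility, given analytic gradients supplied by the caller.

```python
# Illustrative KKT screening at a candidate point x, using the g_k(x) <= 0 convention.
import numpy as np

def check_kkt(x, grad_f, g_list, grad_g_list, u,
              h_list=(), grad_h_list=(), v=(), tol=1e-8):
    """Gradients are callables returning numpy arrays; u, v are candidate multipliers."""
    grad_L = grad_f(x)
    grad_L += sum(uk * gg(x) for uk, gg in zip(u, grad_g_list))
    grad_L += sum(vm * gh(x) for vm, gh in zip(v, grad_h_list))
    stationary = np.linalg.norm(grad_L) < tol                       # gradient condition
    primal = (all(g(x) <= tol for g in g_list) and
              all(abs(h(x)) < tol for h in h_list))                 # primal feasibility
    slackness = all(abs(uk * g(x)) < tol for uk, g in zip(u, g_list))
    dual = all(uk >= -tol for uk in u)                              # dual feasibility
    return stationary and primal and slackness and dual

# Example: g = x1 + x2 - 2 <= 0, candidate point (1, 1) with u = 1
f_grad = lambda x: np.array([2*(x[0] - 1.5), 2*(x[1] - 1.5)])
g      = lambda x: x[0] + x[1] - 2
g_grad = lambda x: np.array([1.0, 1.0])
print(check_kkt(np.array([1.0, 1.0]), f_grad, [g], [g_grad], u=[1.0]))   # True
```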
Significance of the Switching Condition (KKT condition 4)
• The switching condition is given by: ∂L/∂s_k |_(x_j*) = 2 u_k s_k |_(x_j*) = 0. While solving these
equations, we can have two different scenarios:
1) u_k = 0. This implies that the corresponding constraint is satisfied as a strict inequality,
g_k < 0, i.e., the inequality constraint is inactive, and therefore the corresponding
(inequality) Lagrange multiplier has to be zero. This condition shows that
the corresponding constraint can be relaxed (for instance, if g_k < 0, then of course g_k < 5).
2) s_k = 0. This implies that the corresponding constraint is satisfied as an equality, g_k = 0, i.e.,
the inequality constraint is active, and therefore the corresponding
(inequality) Lagrange multiplier is, in general, either positive or negative according to the
convexity or concavity of the inequality constraints in the problem definition (KKT
condition 5, dual feasibility).
Karush-Kuhn-Tucker Sufficiency Condition
• For minimization of a convex function over a convex feasible region, the KKT conditions are
themselves both necessary and sufficient.
• Similarly, for maximization of a concave function over a convex feasible region, the KKT
conditions are themselves both necessary and sufficient.
• Moreover, for convex or concave objective functions, the local minima or maxima,
respectively, are also the global minima and maxima.
• There may be 4 cases:
1) Maximize f; the function must be concave for a maximum to exist, subject to g_i ≤ 0.
Here, ≤ shows g_i is convex.
2) Maximize f; the function must be concave for a maximum to exist, subject to g_i ≥ 0.
Here, ≥ shows g_i is concave.
3) Minimize f; the function must be convex for a minimum to exist, subject to g_i ≤ 0.
Here, ≤ shows g_i is convex.
4) Minimize f; the function must be convex for a minimum to exist, subject to g_i ≥ 0.
Here, ≥ shows g_i is concave.
Convex Function: Definition
Let C ⊆ ℝⁿ (read: C is a subset of the set of real n-tuples; a tuple is an ordered list of elements)
be a convex set. A function f: C → ℝ (read: a real-valued function defined on C,
which is the domain of f) is said to be convex if, for any x1, x2 ∈ C (read: x1 and x2 belonging
to C) and any scalar λ ∈ [0, 1] (read: λ is in the closed interval from 0 to 1, 0 ≤ λ ≤ 1),

f(λx1 + (1 − λ)x2) ≤ λ f(x1) + (1 − λ) f(x2)

Note: A convex set 𝐶 is a collection of points (vectors, 𝑥𝑗 ) having the following property: If 𝑃1
and 𝑃2 are two arbitrary points in 𝐶, then the entire line segment 𝑃1 𝑃2 must also be in 𝐶.
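
The defining inequality can be exercised numerically; the brief sketch below (an illustrative addition) spot-checks it for the sample convex function f(x) = x² at random points and random values of λ.

```python
# Numerical spot-check of the convexity inequality
# f(lam*x1 + (1-lam)*x2) <= lam*f(x1) + (1-lam)*f(x2) for f(x) = x**2.
import numpy as np

f = lambda x: x**2
rng = np.random.default_rng(0)

ok = True
for _ in range(1000):
    x1, x2 = rng.uniform(-10, 10, size=2)
    lam = rng.uniform(0, 1)
    lhs = f(lam*x1 + (1 - lam)*x2)
    rhs = lam*f(x1) + (1 - lam)*f(x2)
    ok &= lhs <= rhs + 1e-12          # small tolerance for floating point
print(ok)                             # True: x**2 satisfies the definition
```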
Convex Function: Graphical Representation
• Let us try to see this condition graphically.
• Let us consider a convex curve f(x) which has a minimum within the two chosen points on the
curve, point 1 having coordinates (x1, f(x1)) and point 2 having (x2, f(x2)).
• Let us connect these points and get a chord (marked as a red centerline in the figure).
Convex Function: Graphical Representation
• If we choose a scalar λ such that 0 ≤ λ ≤ 1 and form a linear combination of x1
and x2 for a particular choice of λ, we can have a point P with x coordinate (or abscissa)
λx1 + (1 − λ)x2 and y coordinate (or ordinate) 0.
If x1 and x2 remain in the convex set C, then this x will always remain in C as well, and is called
a convex linear combination of x1 and x2.
• From the figure, it is evident that point R divides the chord 12 internally in the ratio
(1 − λ) to λ (by the proportionality of triangles), the same ratio in which the abscissa divides x1x2 on the C (convex set) axis, so we obtain
the coordinates of point R as:
( [(1 − λ)x2 + λx1] / [(1 − λ) + λ] , [(1 − λ)f(x2) + λf(x1)] / [(1 − λ) + λ] )
or, by simplifying we get,
( λx1 + (1 − λ)x2 , λf(x1) + (1 − λ)f(x2) )
The abscissa of point R is λx1 + (1 − λ)x2, a linear combination of x1 and x2. The ordinate of
point R is λf(x1) + (1 − λ)f(x2), which is the linear combination of f(x1) and f(x2).
• If we find the coordinates of point Q on the curve itself, we get the same abscissa as per
the geometry, and f(λx1 + (1 − λ)x2) as the ordinate, conforming to the curve equation.
Convex Function: Graphical Representation
• Now, if we compare the lengths PQ and PR in the figure, we find that generally PQ ≤ PR
between any x1 and x2.
• Now, PQ = √[ (f(λx1 + (1 − λ)x2) − 0)² + (λx1 + (1 − λ)x2 − λx1 − (1 − λ)x2)² ]

or, PQ = f(λx1 + (1 − λ)x2)
• And, similarly, PR = λf(x1) + (1 − λ)f(x2)
• Therefore, PQ ≤ PR means, f(λx1 + (1 − λ)x2) ≤ λf(x1) + (1 − λ)f(x2)
Convexity of a Single Variable Function
• For a single-variable function,
 if the second derivative of the function is greater than or equal to zero, f″(x) ≥ 0,
throughout the domain of x, then the function is convex. Example: f(x) = x². The
domain of a convex function is a convex set.
 if the second derivative of the function is less than or equal to zero, f″(x) ≤ 0,
throughout the domain of x, then the function is concave. Examples: f(x) = log x, or
f(x) = √x, or f(x) = 1 − x², etc.
 if the second derivative of the function is equal to zero, f″(x) = 0, throughout the
domain of x, then the function is both convex and concave, i.e. linear.
Example: f(x) = 4x
 if the second derivative of the function is greater than or equal to zero, f″(x) ≥ 0, for
some points in the domain of x and less than or equal to zero, f″(x) ≤ 0, for some
other points in the domain of x, then the function is neither convex nor concave.
Examples: f(x) = sin x, or f(x) = x³, etc.
A quick numerical check of these cases is sketched below.
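
The following sketch (an illustrative addition; the sampled domain 0.1 to 10 is an arbitrary choice made to keep log x defined) classifies the examples above by sampling the sign of the second derivative.

```python
# Classify a single-variable function by the sign of its second derivative,
# sampled over an illustrative domain (0.1, 10).
import numpy as np
import sympy as sp

x = sp.symbols('x', real=True)

def classify(expr, lo=0.1, hi=10.0, n=400):
    f2 = sp.lambdify(x, sp.diff(expr, x, 2), "numpy")
    vals = f2(np.linspace(lo, hi, n)) * np.ones(n)   # broadcast constant f'' values
    if np.all(vals >= 0):
        return "convex (both convex and concave, i.e. linear, if f'' == 0)"
    if np.all(vals <= 0):
        return "concave"
    return "neither convex nor concave"

for expr in (x**2, sp.log(x), 4*x, sp.sin(x)):
    print(expr, "->", classify(expr))
```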
Convexity of a Multi-Variable Function
• For a multi-variable function, if the Hessian matrix is positive semidefinite, the function is
convex; if the Hessian matrix is positive definite, the function is strictly convex.
• Correspondingly, if the Hessian matrix of the function is negative semidefinite, the
function is concave.
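
A numerical check of this Hessian condition, added here as an illustrative sketch: the eigenvalues of the Hessian are inspected (for the quadratic example used earlier, the Hessian is constant and equal to 2I).

```python
# Convexity check for a multi-variable function via Hessian eigenvalues.
# Example: f(x1, x2) = (x1 - 1.5)^2 + (x2 - 1.5)^2 has constant Hessian 2*I.
import numpy as np

H = np.array([[2.0, 0.0],
              [0.0, 2.0]])                 # Hessian of the example objective

eigvals = np.linalg.eigvalsh(H)            # symmetric matrix -> real eigenvalues
if np.all(eigvals >= 0):
    kind = "positive semidefinite -> convex"
    if np.all(eigvals > 0):
        kind = "positive definite -> strictly convex"
elif np.all(eigvals <= 0):
    kind = "negative semidefinite -> concave"
else:
    kind = "indefinite -> neither convex nor concave"
print(eigvals, kind)
```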
Properties of Convex / Concave Functions
• A function f(x) is said to be concave if the negative function, −f(x), is convex, and vice versa.
• A convex / concave function need not be differentiable. Example: f(x) = |x| is a convex function
but is not differentiable at x = 0.
• A convex function need not be continuous. Example: f(x) = x² for −1 ≤ x < 1 and f(x) = 2 for x = 1;
the function is, however, always continuous in the interior of the domain: −1 < x < 1.
• If f(x) is convex, then for constants a and b, f(ax + b) will also be convex.
• If f and g are two convex functions, then f + g, max(f, g) and αf (where α is a non-negative
scalar, α ≥ 0) are also convex.
• If f and g are two concave functions, then f + g, min(f, g) and αf (where α is a non-negative
scalar, α ≥ 0) are also concave.
• A stationary point of a convex function is always a minimum, and a stationary point of a concave function is always a maximum.
• If f is a convex function defined on a convex set, then every local minimum is a global minimum of f.
