LP Book
Huseyin Topaloglu
School of Operations Research and Information Engineering,
Cornell Tech, New York, NY 10044
As far as I remember, I constructed all of the examples in the book myself. The book directly uses the tableau to show strong duality, to
justify why we can fetch an optimal dual solution from the final primal tableau, and to derive
the economic interpretation of an optimal dual solution. Bob often uses such tableau-based
derivations in his book, and they are a great way to make a photographic argument. I
did not see the specific derivations I mentioned in other material. They are obviously not
revolutionary, but I hope someone will find them uplifting. There might be several other
derivations scattered in the book that may be new.
I sincerely thank Cornell University for the wonderful academic environment it
provides. Using this material with students over the years has been a great source of joy. I look
forward to doing so for many iterations.
Huseyin Topaloglu
New York City, NY
August, 2021
Preface
7 Min-Cost Network Flow Problem
7.1 Min-Cost Network Flow Problem
7.2 Integrality of the Optimal Solution
7.3 Min-Cost Network Flow Problem in Compact Form
After we identify the decision variables, we need to express the objective function as a
function of the decision variables. In this problem, our objective is to maximize the revenue
per day. We obtain a revenue of $3 for each standard user that we serve and we obtain
a revenue of $4 for each power user that we serve. As a function of the decision variables
above, we can express the revenue per day as 3 xs + 4 xp , which is our objective.
Next, we need to express the constraints as a function of the decision variables. The
number of available servers and the amount of energy available per day restrict our
decisions. As a function of our decision variables, the total number of servers that we lease
is given by xs + xp and the number of servers that we lease cannot exceed 1000. Thus,
one constraint we have is xs + xp ≤ 1000. Also, we consume 1 unit of energy per day for
each server that we lease to a standard user and 2 units of energy per day for each server
that we lease to a power user. So, as a function of our decision variables, the total energy
consumption per day is xs + 2 xp and the total energy consumption per day cannot exceed
1600 units. Thus, another constraint we have is xs + 2 xp ≤ 1600. Note that both of our
constraints are expressed with a less than or equal to sign, but we can express constraints
with a greater than or equal to sign or with an equal to sign. Which constraint type we use
depends on the problem statement. Finally, the number of servers that we lease to each type
of users cannot be negative. Therefore, we have the constraints xs ≥ 0 and xp ≥ 0.
Putting the discussion above together, the optimization problem that we want to solve
can be expressed as

max 3 xs + 4 xp
st xs + xp ≤ 1000
xs + 2 xp ≤ 1600
xs , xp ≥ 0.
The set of equations above characterizes an optimization problem. The first row shows
the objective function and max emphasizes the fact that we are maximizing our objective
function. The second, third and fourth rows show the constraints. The abbreviation st stands
for subject to and it emphasizes that we are maximizing the objective function subject to
the constraints in the second, third and fourth rows. Since the objective function and
the constraints are linear functions of the decision variables, the optimization problem
characterized by the set of equations above is called a linear program. We study linear
programs for a significant portion of this course, but there are optimization problems
whose objective functions and constraints are not necessarily linear functions of the decision
variables. Such optimization problems are called nonlinear programs.
A pair of values of the decision variables (xs , xp ) that satisfies all of the constraints in
the linear program above is called a feasible solution to the linear program. For example, if
we set (xs , xp ) = (200, 700), then we have 200 + 700 ≤ 1000, 200 + 2 × 700 ≤ 1600, 200 ≥ 0
and 700 ≥ 0. Thus, the solution (200, 700) is a feasible solution to the linear program. This
solution provides an objective value of 3 × 200 + 4 × 700 = 3400. On the other hand, a
pair of values for the decision variables (xs , xp ) that maximizes the objective function, while
satisfying all of the constraints is called an optimal solution. There is no feasible solution to
the linear program that provides an objective value exceeding the objective value provided
by the optimal solution. In certain problems, there can be multiple optimal solutions. We
will come back to the possibility of multiple optimal solutions later on.
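The calculations in this paragraph are easy to script. Below is a minimal sketch in plain Python (not part of the text) that checks whether a candidate pair (xs , xp ) satisfies all of the constraints and, if it does, reports the revenue 3 xs + 4 xp that it provides; the candidate pairs in the loop are chosen only for illustration.

```python
# Check feasibility and objective value of candidate solutions for the
# server-leasing linear program described above (plain Python, no libraries).

def is_feasible(xs, xp):
    """All constraints of the linear program must hold."""
    return (xs + xp <= 1000          # server availability
            and xs + 2 * xp <= 1600  # daily energy availability
            and xs >= 0 and xp >= 0)

def revenue(xs, xp):
    """Daily revenue of the solution (xs, xp)."""
    return 3 * xs + 4 * xp

for candidate in [(200, 700), (400, 600), (500, 600)]:
    xs, xp = candidate
    if is_feasible(xs, xp):
        print(candidate, "is feasible with revenue", revenue(xs, xp))
    else:
        print(candidate, "is infeasible")
```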
In a few lectures, we discuss algorithms to find an optimal solution to the linear program
above. Before we go into these algorithms, we demonstrate how to use Microsoft Excel’s
solver to obtain an optimal solution. We set up a spreadsheet where two cells include the
values of our decision variables. In the figure below, we use cells A1 and B1 to include
the values of our decision variables. For the time being, we put dummy values of 1 and
1 into these cells. Next, we set up a formula that computes the objective function as a function of the values in the cells A1 and B1; in the figure below, this formula is in cell A2. We also set up formulas in the cells A3 and A4 that compute the left sides of the two constraints.
Once we set up the formulas, we choose Solver under the Tools menu. This action brings
up a window titled Solver Parameters. In the box labeled Set Objective, we put the
reference for the cell that includes the formula for the objective function, which is A2. In
the box labeled To, we choose Max since we want to maximize the value of the objective
function. In the box labeled By Changing Variable Cells, we put =$A$1:$B$1, which is
the range of cells that includes our decision variables. Next, we click on Add to specify the
constraints for our problem. This action brings up a window titled Add Constraints.
In the box labeled Cell Reference, we put A3, which includes the formula for the left
side of the first constraint. In the middle box, we keep <=. In the box labeled Constraint,
we put 1000, which is the right side of the first constraint. We click on Add, which adds the
first constraint into the linear program. In the same way, we include the second constraint
into the linear program. In particular, in the box labeled Cell Reference, we put A4,
which includes the formula for the left side of the second constraint. In the middle box, we
keep <=. In the box labeled Constraint, we put 1600, which is the right side of the second
constraint. We click on OK to note that we added all of the constraints that we want to
add. This action brings us back to the window titled Solver Parameters.
In this window, we make sure that Make Unconstrained Variables Non-Negative is
checked so that the decision variables are constrained to be non-negative. In the drop down
menu titled Select a Solving Method, we choose Simplex LP, which is the algorithm that
is appropriate for solving linear programs. After constructing the linear program as described
above, the window titled Solver Parameters should look like the one in the figure below. We
click on Solve in the window titled Solver Parameters. Microsoft Excel’s solver adjusts
the values in the cells A1 and B1, which include the values of our decision variables. A dialog
box appears to inform us that the optimal solution to the problem has been reached. The
values in the cells A1 and B1 correspond to the optimal values of our decision variables.
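If a spreadsheet solver is not available, the same linear program can also be solved programmatically. The sketch below assumes Python with scipy.optimize.linprog, which is not mentioned in the text; since linprog minimizes by default, the objective coefficients are negated to maximize the revenue.

```python
# Solve the server-leasing linear program with SciPy instead of Excel's solver.
from scipy.optimize import linprog

c = [-3, -4]                 # negated objective coefficients for maximization
A_ub = [[1, 1],              # xs +   xp <= 1000  (servers)
        [1, 2]]              # xs + 2 xp <= 1600  (energy)
b_ub = [1000, 1600]

# Decision variables are non-negative by default (bounds = (0, None)).
result = linprog(c, A_ub=A_ub, b_ub=b_ub)
print("optimal (xs, xp):", result.x)       # expected: [400. 600.]
print("optimal revenue:", -result.fun)     # expected: 3600.0
```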
        A      B      C
1      1.5    3.5    2.5
2      2      1      3
3      1.5    4      2
We are interested in figuring out how many ads from each advertiser to show to how many
viewers of each type to maximize the revenue per day. To formulate the problem as a linear
program, we use the following decision variables.
Compared with practical applications, the linear program above is actually not that large. Many linear programs that appear in practical applications include thousands of decision variables and thousands of constraints. Explicitly listing all of the decision variables and the constraints in such linear programs can easily become tedious. To overcome this difficulty, we often express a linear program in compact form. We use the example above to demonstrate how to express a linear program in compact form.
For example, we have R1A = 1.5, R3B = 4, V1 = 1500, V2 = 2000, V3 = 2500, DA = 2000,
DB = 3000, DC = 1000. Note that {Rij : i = 1, 2, 3, j = A, B, C}, {Vi : i = 1, 2, 3},
{Dj : j = A, B, C} are known data of the problem. Furthermore, we express the decision
variables as follows.
In particular, we use xij to denote the number of viewers of type i that are shown an ad from advertiser j per day. Noting that the total daily number of viewers of types 1, 2 and 3 that are shown an ad from advertiser j should be equal to the daily number of viewers desired by advertiser j, we have the constraints x1j + x2j + x3j = Dj for all j = A, B, C.
We can find the numbers of the two types of customers to serve to maximize the revenue per day by solving the linear program

max 3 x1 + 4 x2
st 2 x1 + x2 ≤ 1000
x1 + 2 x2 ≤ 1200
x1 ≤ 400
x1 , x2 ≥ 0.
The objective function above accounts for the revenue that we obtain per day. The first
constraint ensures that the total number of CPU’s used by the two customer types on each
day does not exceed the number of available CPU’s. The second constraint ensures that
the total amount of memory used by the two customer types on each day does not exceed
the available memory. The third constraint ensures that we do not serve more than 400
CPU-intensive customers per day.
The set of (x1 , x2 ) pairs that satisfy all of the constraints is called the set of feasible
solutions to the linear program. To understand the set of feasible solutions to the linear
program above, we plot the set of (x1 , x2 ) pairs that satisfy each constraint. Consider the
constraint 2 x1 +x2 ≤ 1000. Note that 2 x1 +x2 = 1000 describes a line in the two-dimensional
plane. We plot this line in the left side of the figure below. The point (x1 , x2 ) = (0, 0) is to
the lower left side of this line and it satisfies 2 × 0 + 0 ≤ 1000. Therefore, all points to the
lower left side of the line satisfy the constraint 2 x1 + x2 ≤ 1000. We shade these points in
light blue. Similarly, consider the constraint x1 + 2 x2 ≤ 1200. As before, x1 + 2 x2 = 1200
describes a line in the two-dimensional plane. We plot this line in the right side of the
figure below. The point (x1 , x2 ) = (0, 0) is to the lower left side of this line and it satisfies
0 + 2 × 0 ≤ 1200. So, all points to the lower left side of the line satisfy the constraint
x1 + 2 x2 ≤ 1200. We shade these points in light red.
[Figure: the lines 2 x1 + x2 = 1000 (left) and x1 + 2 x2 = 1200 (right) plotted in the (x1 , x2 ) plane, with the points satisfying each constraint shaded in light blue and light red, respectively.]
In the figure below, we take the intersection of the light blue and light red regions in the
previous figure, which means that the set of points that satisfy both of the constraints
2 x1 + x2 ≤ 1000 and x1 + 2 x2 ≤ 1200 is given by the light orange region below.
[Figure: the intersection of the two shaded regions, containing the points that satisfy both 2 x1 + x2 ≤ 1000 and x1 + 2 x2 ≤ 1200.]
Carrying out the same argument for the constraints x1 ≤ 400, x1 ≥ 0 and x2 ≥ 0, it follows that the set of points that satisfy all of the constraints in the linear program is given by the light green region shown in the figure below.
Any (x1 , x2 ) pair in the light green region above is a feasible solution to our linear
program. We want to find the feasible solution that maximizes the objective function.
[Figure: the set of feasible solutions together with the line 3 x1 + 4 x2 = 900 (left) and the line 3 x1 + 4 x2 = 3000 (right).]
Observe that the lines 3 x1 + 4 x2 = 900 and 3 x1 + 4 x2 = 3000 are parallel to each
other. Therefore, to check whether there exists a feasible solution that provides a revenue
of K, we can shift the line 3 x1 + 4 x2 = 900 parallel to itself until we obtain the line
3 x1 + 4 x2 = K. If the line 3 x1 + 4 x2 = K is still in contact with the set of feasible solutions,
then there exists a feasible solution that provides a revenue of K.
[Figure: the set of feasible solutions together with the line 3 x1 + 4 x2 = 8000/3, which touches the set of feasible solutions at a single corner point marked with a black dot.]
The coordinates of the black dot in the figure above give the optimal solution to the
problem. To compute these coordinates, we observe that the black dot lies on the lines
that represent the first two constraints in the linear program and these lines are given
by the equations 2 x1 + x2 = 1000 and x1 + 2 x2 = 1200. Solving these two equations
simultaneously, we obtain x1 = 800/3 and x2 = 1400/3. Therefore, the optimal solution to
the linear program is given by (x∗1 , x∗2 ) = (800/3, 1400/3). The revenue from this solution is
3 x∗1 + 4 x∗2 = 3 × 800/3 + 4 × 1400/3 = 8000/3, which is the optimal objective value.
A key observation from the discussion above is that the optimal solution to a linear
program is achieved at one of the corner points of the set of feasible solutions. This
observation is critical for the following reason. There are infinitely many possible feasible
solutions to the linear program. So, we cannot check the objective value provided by each
possible feasible solution. However, there are only finitely many possible corner points of the
set of feasible solutions. If we know that the optimal solution to a linear program occurs at
one of the corner points, then we can check the objective value achieved at the corner points
and pick the corner point that provides the largest objective value. Using this observation,
we will develop an algorithm to efficiently solve linear programs when there are more than
two decision variables and we cannot even plot the set of feasible solutions.
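As a rough illustration of the corner-point idea (not the algorithm developed later in the text), the sketch below assumes Python with numpy: it intersects every pair of constraint boundaries of the linear program above, keeps the intersections that satisfy all of the constraints, and evaluates the objective 3 x1 + 4 x2 at each feasible corner.

```python
# Enumerate the corner points of the feasible region of
# max 3 x1 + 4 x2 st 2 x1 + x2 <= 1000, x1 + 2 x2 <= 1200, x1 <= 400, x1, x2 >= 0.
from itertools import combinations
import numpy as np

# Each constraint is written as a x1 + b x2 <= c, including x1 >= 0 and x2 >= 0
# rewritten as -x1 <= 0 and -x2 <= 0.
constraints = [(2, 1, 1000), (1, 2, 1200), (1, 0, 400), (-1, 0, 0), (0, -1, 0)]

best_point, best_value = None, -np.inf
for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
    A = np.array([[a1, b1], [a2, b2]], dtype=float)
    if abs(np.linalg.det(A)) < 1e-9:
        continue                      # the two boundary lines are parallel
    point = np.linalg.solve(A, [c1, c2])
    feasible = all(a * point[0] + b * point[1] <= c + 1e-9
                   for a, b, c in constraints)
    if feasible:
        value = 3 * point[0] + 4 * point[1]
        if value > best_value:
            best_point, best_value = point, value

print(best_point, best_value)   # approximately [266.67 466.67] with value 2666.67 = 8000/3
```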
It is also useful to observe that in the optimal solution (x∗1 , x∗2 ) = (800/3, 1400/3), we
have x∗1 < 400. Thus, the third constraint in the linear program does not play a role in
determining the optimal solution, which implies that the optimal solution would not change
even if we dropped this constraint from the linear program.
[Figure: the set of feasible solutions together with the line 2 x1 + 4 x2 = 2400.]
Note that we follow the convention that a vector is always a column vector, which is
essentially a matrix with one column and multiple rows.
3.2 Matrix Addition and Multiplication
If we have two matrices that are of the same dimension, then we can add them. To
demonstrate matrix addition, we have
[ 1  2 −1 ]   [ 2  4  9 ]   [ 3  6  8 ]
[ 3  1  7 ] + [ 1  3  8 ] = [ 4  4 15 ]
[ 1  4  2 ]   [ 7  7  1 ]   [ 8 11  3 ]
[ 4  5  1 ]   [ 2 −1  0 ]   [ 6  4  1 ]
If the number of columns of the first matrix is equal to the number of rows of the second matrix, then we can multiply the two matrices. To demonstrate matrix multiplication, we have
[ 1  3  4  2 ]   [ 2  3 ]   [ 17  6 ]
[ 2  4  1  3 ]   [ 1  1 ] = [ 21 15 ]
[ 5  1  2  3 ]   [ 1 −1 ]   [ 25 20 ]
                 [ 4  2 ]
To verify the computation above, note that the entry in row 1 and column 1 of the product matrix is ∑_{k=1}^{4} a1k bk1 = 1 × 2 + 3 × 1 + 4 × 1 + 2 × 4 = 17. Similarly, the entry in row 3 and column 1 of the product matrix is ∑_{k=1}^{4} a3k bk1 = 5 × 2 + 1 × 1 + 2 × 1 + 3 × 4 = 25.
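The same product can be checked numerically. The sketch below assumes Python with numpy, which is not part of the text.

```python
# Verify the matrix product above with numpy.
import numpy as np

A = np.array([[1, 3, 4, 2],
              [2, 4, 1, 3],
              [5, 1, 2, 3]])
B = np.array([[2, 3],
              [1, 1],
              [1, -1],
              [4, 2]])

print(A @ B)                 # [[17  6] [21 15] [25 20]]
print(A[0, :] @ B[:, 0])     # 17, the entry in row 1 and column 1
```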
5 x1 + 6 x 2 + 3 x3 + x4 = 7
6 x1 + 2 x2 + 4 x4 = 8
9 x1 + 6 x3 + 2 x4 = 1.
The inverse of a square matrix A, denoted by A−1 , is the matrix that satisfies

AA−1 = A−1 A = I,
where I is the identity matrix. Computing the inverse of a matrix is related to row
operations. Consider computing the inverse of the matrix
      [ 1  2  2 ]
A  =  [ 2 −1  1 ] .
      [ 1  0  2 ]
To compute the inverse of this matrix, we augment this matrix with the identity matrix on
the right side to obtain the matrix
A row operation refers to multiplying one row of a matrix with a constant or adding a
multiple of one row to another row. To compute the inverse of the 3 × 3 matrix A above,
we carry out a sequence of row operations on the matrix [A | I] to bring [A | I] in the form
of [I | B]. In this case, B is the inverse of A. Consider the matrix [A | I] given by

[ 1  2  2 | 1  0  0 ]
[ 2 −1  1 | 0  1  0 ]
[ 1  0  2 | 0  0  1 ]
Multiply the second row by −2 and add to the first row. Also, multiply the second row by
2 and add to the third row. Thus, we get
Multiply the third row by −4/5 and add to the first row. Also, multiply the third row by
−3/5 and add to the second row. Thus, we get
To check that our computations are correct, we can multiply the two matrices above to see
that we get the identity matrix. In particular, we have
[ 1  2  2 ]   [  1/3   2/3  −2/3 ]   [ 1  0  0 ]
[ 2 −1  1 ]   [  1/2    0   −1/2 ] = [ 0  1  0 ]
[ 1  0  2 ]   [ −1/6  −1/3   5/6 ]   [ 0  0  1 ]
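The inverse computed above can also be verified numerically. The sketch below assumes Python with numpy; numpy.linalg.inv is used here in place of the hand row operations.

```python
# Compute the inverse of A numerically and check that A times its inverse is I.
import numpy as np

A = np.array([[1.0, 2.0, 2.0],
              [2.0, -1.0, 1.0],
              [1.0, 0.0, 2.0]])

A_inv = np.linalg.inv(A)
print(A_inv)        # approximately [[ 1/3  2/3 -2/3] [ 1/2  0 -1/2] [-1/6 -1/3  5/6]]
print(A @ A_inv)    # approximately the 3 x 3 identity matrix
```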
x1 + 2 x2 + 2 x3 = 12
2x1 − x2 + x3 = 4
x1 + 2 x3 = 8,
which is equivalent to
[ 1  2  2 ] [ x1 ]   [ 12 ]
[ 2 −1  1 ] [ x2 ] = [  4 ] .
[ 1  0  2 ] [ x3 ]   [  8 ]
Using A ∈ ℝ3×3 to denote the matrix on the left side, x ∈ ℝ3×1 to denote the vector on the left side and b ∈ ℝ3×1 to denote the vector on the right side, the equation above is of the form A x = b. Multiplying both sides of this equality by A−1 , we get x = A−1 b, which gives the solution to the system of equations.
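As a companion sketch (again assuming Python with numpy, not part of the text), the system A x = b above can be solved directly; numpy.linalg.solve is equivalent to multiplying b by A−1 but is numerically more stable.

```python
# Solve the system A x = b from the matrix form above.
import numpy as np

A = np.array([[1.0, 2.0, 2.0],
              [2.0, -1.0, 1.0],
              [1.0, 0.0, 2.0]])
b = np.array([12.0, 4.0, 8.0])

x = np.linalg.solve(A, b)         # the unique solution of the system
print(x)
print(np.allclose(A @ x, b))      # True: the solution satisfies the system
```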
Row operations are also useful for understanding when a system of equations keeps the same set of solutions. Consider the system of equations

2 x1 + x2 = 4
−x1 + x2 = 1.
For example, add the first row to the second row to get
2 x1 + x2 = 4
x1 + 2 x2 = 5.
Multiply the second row by −4/5 and add to the first row to get
(6/5) x1 − (3/5) x2 = 0
x1 + 2 x2 = 5.
Noting the last system of equations above, consider the set of points that satisfy the system of equations (6/5) x1 − (3/5) x2 = 0 and x1 + 2 x2 = 5. Plotting the lines that are characterized by each of these equations in the figure below, we observe that the set of points that satisfy the system of equations (6/5) x1 − (3/5) x2 = 0 and x1 + 2 x2 = 5 is the single point given by (x1 , x2 ) = (1, 2). Observe that this is the same point that satisfies the system of equations
that we started with. Therefore, the critical observation from this discussion is that the set
of points that satisfy a system of equations does not change when we apply any sequence of
row operations to a system of equations.
Continuing to apply row operations in this fashion, we can bring the system of equations into the form

x1 = 1
x2 = 2.
We can immediately see that the set of points that satisfy the last system of equations is the
single point (x1 , x2 ) = (1, 2). Therefore, the set of points that satisfy the original system of
equations is also the single point (x1 , x2 ) = (1, 2). This discussion shows why row operations
are useful to solve a system of equations.
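A small sketch of this observation, assuming Python with numpy (not part of the text): applying the row operations described above to the augmented matrix of the system leaves the solution (x1 , x2 ) = (1, 2) unchanged.

```python
# Apply the row operations from the text to the augmented matrix [A | b] of
# the system 2 x1 + x2 = 4, -x1 + x2 = 1 and confirm the solution is unchanged.
import numpy as np

augmented = np.array([[2.0, 1.0, 4.0],    # 2 x1 + x2 = 4
                      [-1.0, 1.0, 1.0]])  #  -x1 + x2 = 1

# Row operation: add the first row to the second row.
augmented[1] += augmented[0]
# Row operation: multiply the second row by -4/5 and add it to the first row.
augmented[0] += -4.0 / 5.0 * augmented[1]

solution = np.linalg.solve(augmented[:, :2], augmented[:, 2])
print(augmented)
print(solution)    # [1. 2.], the same solution as the original system
```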
We want to solve this linear program without using graphical methods. The first thing
we do is to introduce the decision variable w1 to represent how much the right side
of the first constraint above exceeds the left side of the constraint. That is, we have
w1 = 1000 − 2 x1 − x2 . If (x1 , x2 ) is a feasible solution to the linear program above, then
we must have w1 ≥ 0 and w1 = 1000 − 2 x1 − x2 . We refer to w1 as the slack variable
associated with the first constraint. Similarly, we associate the slack variable w2 with the
second constraint so that w2 = 1200 − x1 − 2 x2 . Thus, if (x1 , x2 ) is a feasible solution to
the linear program above, then we must have w2 ≥ 0 and w2 = 1200 − x1 − 2 x2 . Finally,
we associate the slack variable w3 with the third constraint so that w3 = 400 − x1 . So, if
(x1 , x2 ) is a feasible solution to the linear program above, then we must have w3 ≥ 0 and
w3 = 400 − x1 . In this case, we can write the linear program above equivalently as
Since the two linear programs above are equivalent to each other, we focus on solving the
second linear program. The advantage of the second linear program is that its constraints
are of equality form. The simplex method expresses the system of equations associated with
the second linear program above as
3 x1 + 4 x2 = z
2 x1 + x2 + w 1 = 1000
x1 + 2 x 2 + w2 = 1200
x1 + w3 = 400,
where the first row corresponds to the objective function and the other three rows correspond
to the three constraints. The system of equations captures all the information that we
have on the linear program. We do not explicitly express the non-negativity constraints
on the decision variables, but we always keep in mind that all of the decision variables are
constrained to be non-negative.
We make two observations for the system of equations above. First, if we keep on applying
row operations to the system of equations, then the system of equations that we obtain
through the row operations remain equivalent to the original system of equations. Second,
the decision variable w1 appears only in the first constraint row with a coefficient of 1
and nowhere else. Similarly, w2 and w3 respectively appear only in the second and third
constraint rows with a coefficient of 1 and nowhere else. Thus, it is simple to spot a solution
(x1 , x2 , w1 , w2 , w3 ) and z to the system of equations above. We can set
w1 = 1000, w2 = 1200, w3 = 400, x1 = 0, x2 = 0, z = 0.
Note that the solution above is feasible to the linear program. Also note that the value of
the decision variable z corresponds to the value of the objective function provided by the
solution above. Now, we iteratively apply row operations to the system of equations above
to obtain other solutions that are feasible to the linear program and provide larger objective
function values. As we apply the row operations, we make sure that there is always a set of three variables such that each one of these variables appears in only one constraint row with a coefficient of 1. Furthermore, these variables do not appear in the objective function row and each of these three variables appears in a different constraint row. For example, in the system of equations above, each one of the three decision variables w1 , w2 and w3 appears in a different constraint row with a coefficient of 1, and they do not appear in the objective
function row. We start with the system of equations above. Noting the objective function row, each unit of increase in x2 increases the objective function by 4 units, which is more than the 3 units that we obtain from each unit of increase in x1 , so we increase the value of x2 . Considering the constraint rows, we can increase x2 up to min{1000/1, 1200/2} = 600 while making sure that all of the other variables remain non-negative, so the new value of x2 is determined by the second constraint row. Carrying out row operations so that x2 appears only in the second constraint row with a coefficient of 1, we obtain the system of equations

x1 − 2 w2 = z − 2400
(3/2) x1 + w1 − (1/2) w2 = 400
(1/2) x1 + x2 + (1/2) w2 = 600
x1 + w3 = 400.

We just carried out row operations. So, the system of equations above is equivalent to the original one. Each one of the decision variables w1 , x2 and w3 appears only in one of the three constraint rows above and nowhere else. Thus, it is simple to spot a solution

w1 = 400, x2 = 600, w3 = 400, x1 = 0, w2 = 0, z = 2400.
The solution above is feasible to the linear program. It is actually not surprising that this
solution is feasible, since this solution is obtained from the original system of equations by
using row operations. Furthermore, the value of the decision variable z corresponds to the
value of the objective function provided by the solution above. From the objective function
row, for each unit of increase in the decision variable x1 , the objective function increases by
1 unit, whereas for each unit of increase in the decision variable w2 , the objective function
decreases by 2 units. Therefore, we will increase the value of x1 .
Next, we ask how much we can increase the value of the decision variable x1 while making
sure that the other variables stay non-negative. Considering the first constraint row above,
w1 is the decision variable that appears only in this row. Thus, if we increase x1 , then we will
make up for the increase in x1 by a decrease in w1 . We can increase x1 up to 400/(3/2) = 800/3,
while making sure that w1 remains non-negative.
Considering the second constraint row above, x2 is the decision variable that appears
only in this row. Thus, if we increase x1 , then we will make up for the increase in x1 by a
decrease in x2 . We can increase x1 up to 600/(1/2) = 1200, while making sure that x2 remains
non-negative.
Considering the third constraint row above, w3 is the decision variable that appears only
in this row. Thus, if we increase x1 , then we will make up for the increase in x1 by a decrease
in w3 . We can increase x1 up to 400, while making sure that w3 remains non-negative. Since
min{800/3, 1200, 400} = 800/3, we can increase x1 at most up to 800/3 while making sure
that all of the variables remain non-negative and all of the constraints remain satisfied.
When we increase x1 up to 800/3, the new value of the decision variable x1 is determined
by the first constraint. Therefore, we carry out row operations in the system of equations
above to make sure that x1 appears only in the first constraint row with a coefficient of 1. So,
we multiply the first constraint row by −2/3 and add it to the objective row. We multiply
the first constraint row by −1/3 and add it to the second constraint row. We multiply the
first constraint row by −2/3 and add it to the third constraint row. Finally, we multiply the
first constraint row by 2/3. These row operations yield the system of equations

− (2/3) w1 − (5/3) w2 = z − 8000/3
x1 + (2/3) w1 − (1/3) w2 = 800/3
− (1/3) w1 + x2 + (2/3) w2 = 1400/3
− (2/3) w1 + (1/3) w2 + w3 = 400/3.
In the system of equations above, the basic variables are x1 , x2 and w3 , whereas the non-basic
variables are w1 and w2 . Thus, through the row operations that we applied, the variable
x1 became basic and the variable w1 became non-basic. At any iteration, the variable that
becomes basic is called the entering variable. The variable that becomes non-basic is called
the leaving variable. Solutions with m basic variables and n non-basic variables are called
basic solutions. The solutions visited by the simplex method are basic solutions.
max 5 x1 + 3 x2 − x3
st 4 x1 − x2 + x3 ≤ 6
3 x1 + 2 x2 + x3 ≤ 9
4 x1 + x2 − x3 ≤ 3
x1 , x2 , x3 ≥ 0.
Using the slack variables, w1 , w2 and w3 , we can write the linear program above as
5 x1 + 3 x2 − x3 = z
4 x1 − x2 + x3 + w1 = 6
3 x1 + 2 x2 + x3 + w2 = 9
4 x1 + x2 − x3 + w3 = 3,
In the system of equations above, the basic variables are w1 , w2 and w3 , whereas the non-
basic variables are x1 , x2 and x3 . The values of the variables are given by
w1 = 6, w2 = 9, w3 = 3, x1 = 0, x2 = 0, x3 = 0, z = 0.
From the objective row, we observe that each unit of increase in x1 increases the objective
function by 5 units. Each unit of increase in x2 increases the objective function by 3
units. Each unit of increase in x3 decreases the objective function by 1 unit. Thus, we
will increase the value of x1 .
Considering the first constraint row, w1 is the decision variable that appears only
in this row. Thus, if we increase x1 , then we will make up for the increase in x1 by
a decrease in w1 . We can increase x1 up to 6/4, while making sure that w1 remains
non-negative. Considering the second constraint row, w2 is the decision variable that appears
only in this row. Thus, if we increase x1 , then we will make up for the increase in x1 by
a decrease in w2 . We can increase x1 up to 9/3 = 3, while making sure that w2 remains
non-negative. Considering the third constraint row, w3 is the decision variable that appears
only in this row. Thus, if we increase x1 , then we will make up for the increase in x1
by a decrease in w3 . We can increase x1 up to 3/4, while making sure that w3 remains
non-negative. By the preceding discussion, since min{6/4, 3, 3/4} = 3/4, we can increase x1
up to 3/4 while making sure that all of the other decision variables remain non-negative.
If we increase x1 up to 3/4, then the new value of x1 is determined by the third constraint
row. Thus, we carry out row operations in the system of equations above to make sure that
x1 appears only in the third constraint row with a coefficient of 1. In other words, the
entering variable is x1 and the leaving variable is w3 . So, we multiply the third constraint
row by −5/4 and add it to the objective function row. We multiply the third constraint row
by −1 and add it to the first constraint row. We multiply the third constraint row by −3/4
and add it to the second constraint row. Finally, we multiply the third constraint row by
1/4. In this case, we obtain the system of equations
The basic variables above are w1 , w2 and x2 . The non-basic variables are x1 , x3 and w3 . The
values of the variables are given by
w1 = 9, w2 = 3, x2 = 3, x1 = 0, x3 = 0, w3 = 0, z = 9.
The basic variables above are w1 , x3 and x2 . The non-basic variables are x1 , w2 and w3 . The
values of the variables are given by
w1 = 9, x3 = 1, x2 = 4, x1 = 0, w2 = 0, w3 = 0, z = 11.
From the last system of equations, we observe that increasing the value of one of the decision
variables x1 , w2 and w3 decreases the objective function value, since these variables have
negative coefficients in the objective function row. So, we stop and conclude that the last
solution above is an optimal solution to the linear program. In other words, the solution
(x1 , x2 , x3 , w1 , w2 , w3 ) = (0, 4, 1, 9, 0, 0) is an optimal solution providing the optimal objective
value 11 for the linear program.
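As a quick cross-check of this example (assuming Python with scipy.optimize.linprog, which the text does not use), the same optimal solution and objective value are recovered below; the objective is negated because linprog minimizes.

```python
# Cross-check the worked simplex example with SciPy.
from scipy.optimize import linprog

c = [-5, -3, 1]              # negated objective of max 5 x1 + 3 x2 - x3
A_ub = [[4, -1, 1],
        [3, 2, 1],
        [4, 1, -1]]
b_ub = [6, 9, 3]

result = linprog(c, A_ub=A_ub, b_ub=b_ub)   # variables are >= 0 by default
print(result.x)       # expected: [0. 4. 1.]
print(-result.fun)    # expected: 11.0
```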
max ∑_{j=1}^{n} cj xj
st ∑_{j=1}^{n} aij xj ≤ bi ∀ i = 1, . . . , m
xj ≥ 0 ∀ j = 1, . . . , n.
In the linear program above, there are n decision variables given by x1 , . . . , xn . The objective
function coefficient of decision variable xj is cj . There are m constraints. The right side
coefficient of the i-th constraint is given by bi . The decision variable xj has the coefficient
aij in the left side of the i-th constraint. Using the slack variables w1 , . . . , wm , we write the
linear program above equivalently as
max ∑_{j=1}^{n} cj xj
st ∑_{j=1}^{n} aij xj + wi = bi ∀ i = 1, . . . , m
xj ≥ 0, wi ≥ 0 ∀ j = 1, . . . , n, i = 1, . . . , m.
c1 x 1 + c2 x 2 + . . . + cn x n = z
a11 x1 + a12 x2 + . . . + a1n xn + w1 = b1
a21 x1 + a22 x2 + . . . + a2n xn + w2 = b2
. . .
am1 x1 + am2 x2 + . . . + amn xn + wm = bm .
To make our notation uniform, we label the variables w1 , . . . , wm as xn+1 , . . . , xn+m , in which
case the system of equations above looks like
c1 x 1 + c2 x 2 + . . . + cn x n = z
a11 x1 + a12 x2 + . . . + a1n xn + xn+1 = b1
a21 x1 + a22 x2 + . . . + a2n xn + xn+2 = b2
. . .
am1 x1 + am2 x2 + . . . + amn xn + xn+m = bm .
At any iteration of the simplex method, the variables x1 , . . . , xn+m are classified into two
groups as basic variables and non-basic variables. Let B be the set of basic variables and N be the set of non-basic variables. The current system of equations has the form

∑_{j∈N} c̄j xj = z − α
xi + ∑_{j∈N} āij xj = b̄i ∀ i ∈ B,

where the first row above corresponds to the objective row and the remaining rows correspond
to the constraint rows. The objective function coefficient of the non-basic variable xj in the
current system of equations is c̄j . There is one constraint row associated with each one of the
basic variables {xi : i ∈ B}. The non-basic variable xj appears with a coefficient of āij in the
constraint corresponding to the basic variable xi . The right side of the constraint associated
with the basic variable xi is b̄i . We can obtain a solution to the system of equations above
by setting xi = b̄i for all i ∈ B, xj = 0 for all j ∈ N and z = α.
If c̄j ≤ 0 for all j ∈ N , then we stop. The solution corresponding to the current
system of equations is optimal. Otherwise, we pick a non-basic variable k ∈ N such that
k = arg max{c̄j : j ∈ N }, which is the non-basic variable with the largest coefficient in the
objective function row. We will increase the value of the non-basic variable xk .
Consider each constraint row i ∈ B. The basic variable xi is the decision variable that
appears only in this row. If āik > 0, then an increase in xk can be made up for by a
decrease in xi . In particular, we can increase xk up to b̄i /āik , while making sure that xi
remains non-negative. If āik < 0, then an increase in xk can be made up for by an increase in
xi . Therefore, if we increase xk , we do not run into the danger xi going negative. If āik = 0,
then increasing xk makes no change in constraint i. Thus, we can increase xk up to
min_{i∈B} { b̄i / āik : āik > 0 },
while making sure that none of the other variables become negative and all of the constraints
remain satisfied. If we increase xk to the value above, then the new value of the decision
variable xk is determined by the constraint row ℓ = arg min_{i∈B} { b̄i / āik : āik > 0 }. Thus, the entering
variable is xk and the leaving variable is xℓ . We carry out row operations such that the
decision variable xk appears with a coefficient of 1 only in constraint row ℓ.
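The iteration described above can be written compactly in code. The sketch below, in Python with numpy, is an illustration under simplifying assumptions rather than a full implementation: it handles problems of the form max c'x subject to A x ≤ b, x ≥ 0 with b ≥ 0 (so the slack variables form an initial feasible basis), uses the largest-coefficient entering rule and the minimum-ratio leaving rule described above, and includes no anti-cycling rule.

```python
# A sketch of the simplex iteration described above (illustrative names, not
# the book's notation). Solves max c'x st A x <= b, x >= 0 with b >= 0.
import numpy as np

def simplex(c, A, b):
    m, n = A.shape
    # Tableau: constraint rows [A | I | b] followed by the objective row [c | 0 | 0].
    tableau = np.zeros((m + 1, n + m + 1))
    tableau[:m, :n] = A
    tableau[:m, n:n + m] = np.eye(m)
    tableau[:m, -1] = b
    tableau[-1, :n] = c
    basis = list(range(n, n + m))            # the slack variables start as basic

    while True:
        obj = tableau[-1, :-1]
        k = int(np.argmax(obj))              # entering variable: largest coefficient
        if obj[k] <= 1e-9:
            break                            # no positive coefficient left: optimal
        col = tableau[:m, k]
        if np.all(col <= 1e-9):
            raise ValueError("the linear program is unbounded")
        ratios = np.full(m, np.inf)
        positive = col > 1e-9
        ratios[positive] = tableau[:m, -1][positive] / col[positive]
        r = int(np.argmin(ratios))           # leaving variable: minimum-ratio row
        tableau[r] /= tableau[r, k]          # make the pivot element equal to 1
        for i in range(m + 1):
            if i != r:
                tableau[i] -= tableau[i, k] * tableau[r]
        basis[r] = k

    x = np.zeros(n + m)
    x[basis] = tableau[:m, -1]
    return x[:n], -tableau[-1, -1]           # optimal solution and objective value

# The linear program solved graphically earlier:
# max 3 x1 + 4 x2 st 2 x1 + x2 <= 1000, x1 + 2 x2 <= 1200, x1 <= 400, x1, x2 >= 0.
x, value = simplex(np.array([3.0, 4.0]),
                   np.array([[2.0, 1.0], [1.0, 2.0], [1.0, 0.0]]),
                   np.array([1000.0, 1200.0, 400.0]))
print(x, value)   # approximately [266.67 466.67] and 2666.67, that is, 8000/3
```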
max 3 x1 + 2 x 2
st x1 + 2 x2 ≤ 12
2 x1 + x2 ≤ 11
x1 + x 2 ≤ 7
x1 , x2 ≥ 0.
Using the slack variables w1 , w2 and w3 associated with the three constraints above, this
linear program is equivalent to
max 3 x1 + 2 x2
st x1 + 2 x2 + w1 = 12
2 x1 + x2 + w2 = 11
x1 + x2 + w3 = 7
x1 , x2 , w1 , w2 , w3 ≥ 0.
In this case, the simplex method starts with the system of equations
3 x1 + 2 x2 = z
x1 + 2 x2 + w 1 = 12
2 x1 + x2 + w2 = 11
x1 + x2 + w3 = 7.
Recall the following properties of basic variables. First, each basic variable appears in exactly
one constraint row with a coefficient of one. Second, each basic variable appears in a different constraint row. Third, the basic variables do not appear in the objective function row. Due
to these properties, it is simple to spot a solution that satisfies the system of equations that
the simplex method visits.
In the system of equations above, the basic variables are w1 , w2 and w3 , whereas the
non-basic variables are x1 and x2 . The solution corresponding to the system of equations
above is
w1 = 12, w2 = 11, w3 = 7, x1 = 0, x2 = 0.
Also, since the basic variables do not appear in the objective function row and the non-basic
variables take the value 0, we can easily find the value of z that satisfies the system of
equations above. In particular, we have z = 0.
The solution (x1 , x2 , w1 , w2 , w3 ) = (0, 0, 12, 11, 7) is feasible to our linear program. The
simplex method starts with this feasible solution and visits other feasible solutions while
improving the value of the objective function. In the linear program above, it was simple to
find a feasible solution for the simplex method to start with. As we show in the next section,
it may not always be easy to find an initial feasible solution. To deal with this difficulty, we
develop a structured approach to find an initial feasible solution.
max x1 + x2
st x1 − 3 x2 ≤ −28
x2 ≤ 20
− x1 − x2 ≤ −24
x1 , x2 ≥ 0.
If we associate slack variables w1 , w2 and w3 with the three constraints above, then this
linear program is equivalent to
x1 + x2 = z
x1 − 3 x2 + w 1 = −28
x2 + w2 = 20
−x1 − x2 + w3 = −24.
In the system of equations above, the basic variables are w1 , w2 and w3 , whereas the non-basic
variables are x1 and x2 . For the system of equations above, we have the solution

w1 = −28, w2 = 20, w3 = −24, x1 = 0, x2 = 0, z = 0.
This solution is not feasible for the linear program above. In fact, we do not even know that
there exists a feasible solution to the linear program! So, we focus on the question of how
we can find a feasible solution to the linear program and how we can use this solution as the
initial solution for the simplex method.
Consider the linear program

min u
st x1 − 3 x2 − u ≤ −28
x2 − u ≤ 20
− x1 − x2 − u ≤ −24
x1 , x2 , u ≥ 0.

We call this linear program the phase-1 linear program since we will use this linear program to obtain an initial feasible solution for the simplex method. We call the decision variable u the artificial decision variable. The phase-1 linear program always has a feasible solution
since setting u = 28, x1 = 0 and x2 = 0 provides a feasible solution to it. In the objective
function of the phase-1 linear program, we minimize u. So, if possible at all, at the optimal
solution to the phase-1 linear program, we want to set the value of the decision variable u
to 0. Observe that if u = 0 at the optimal solution to the phase-1 linear program, then the
optimal values of the decision variables x1 and x2 satisfy

x1 − 3 x2 ≤ −28, x2 ≤ 20, −x1 − x2 ≤ −24, x1 ≥ 0, x2 ≥ 0,

which implies that these values of the decision variables are feasible to the original linear
program that we want to solve. Therefore, if we solve the phase-1 linear program and the
value of the decision variable u is 0 at the optimal solution, then we can use the optimal
values of the decision variables x1 and x2 as an initial feasible solution to the original linear
program that we want to solve.
On the other hand, if we have u > 0 at the optimal solution to the phase-1 linear program,
then it is not possible to set the value of the decision variable u to 0 and still obtain a feasible solution to the phase-1 linear program.
Thus, if we have u > 0 at the optimal solution to the phase-1 linear program, then the
original linear program that we want to solve does not have a feasible solution. In other
words, the original linear program that we want to solve is not feasible.
This discussion shows that to obtain a feasible solution to the linear program that we
want to solve, we can first solve the phase-1 linear program. If we have u = 0 at the optimal
solution to the phase-1 linear program, then the values of the decision variables x1 and x2
provide a feasible solution to the original linear program that we want to solve. If we have
u > 0 at the optimal solution to the phase-1 linear program, then there does not exist a
feasible solution to the original linear program.
So, we proceed to solving the phase-1 linear program. Associating the slack variables
w1 , w2 and w3 with the three constraints and moving the decision variable u to the left side
of the constraints, the simplex method starts with the system of equations

u = z
x1 − 3 x2 + w1 − u = −28
x2 + w2 − u = 20
−x1 − x2 + w3 − u = −24.

In the system of equations above, the basic variables are w1 , w2 and w3 , whereas the non-basic variables are x1 , x2 and u. The solution corresponding to the system of equations above is given by

w1 = −28, w2 = 20, w3 = −24, x1 = 0, x2 = 0, u = 0, z = 0.
Note that this solution is not feasible to the phase-1 linear program because x1 − 3 x2 =
0 > −28 = −28 + u. However, with only one set of row operations on the system of
equations above, we can immediately obtain an equivalent system of equations such that we
can spot a feasible solution to the phase-1 linear program from the new equivalent system
of equations. In particular, we focus on the constraint row that has the most negative right
side. We subtract this constraint row from every other constraint row and we add this
constraint row to the objective function row. In particular, we focus on the first constraint
row above. We subtract this constraint row from every other constraint row and add it to
the objective function row. Also, we multiply the first constraint row by −1. In this case,
we obtain the system of equations

x1 − 3 x2 + w1 = z − 28
−x1 + 3 x2 − w1 + u = 28
−x1 + 4 x2 − w1 + w2 = 48
−2 x1 + 2 x2 − w1 + w3 = 4.

In this system of equations, the basic variables are u, w2 and w3 , whereas the non-basic variables are x1 , x2 and w1 . The corresponding solution is

u = 28, w2 = 48, w3 = 4, x1 = 0, x2 = 0, w1 = 0, z = 28.
This solution is feasible for the phase-1 linear program. Now, we can apply the simplex
method as before to obtain an optimal solution to the phase-1 linear program.
Since we are minimizing the objective function in the phase-1 linear program, in the
system of equations above, we pick the decision variable that has the largest negative
objective function coefficient, which is x2 with an objective function coefficient of −3. We
increase the value of this decision variable. Applying the simplex method as before, we can
increase x2 up to min{28/3, 48/4, 4/2} = 2, while making sure that all of the other decision
variables remain non-negative. In this case, the new value of the decision variable x2 is
determined by the third constraint row. Thus, we carry out row operations such that x2
appears only in the third constraint row with a coefficient of 1. In other words, the entering
variable is x2 and the leaving variable is w3 . Carrying out the appropriate row operations,
we obtain the system of equations
In the system of equations above, the basic variables are u, w2 and x2 , whereas the non-basic
variables are x1 , w1 and w3 . Noting the objective function row, we increase the value of
x1 . We can increase x1 up to min{22/2, 40/3} = 11, while making sure that all of the other
decision variables remain non-negative. In this case, the new value of the decision variable
x1 is determined by the first constraint row. Thus, we carry out row operations such that x1
appears only in the first constraint row with a coefficient of 1. In other words, the entering
variable is x1 and the leaving variable is u. Through appropriate row operations, we obtain
the system of equations

u = z
x1 + (1/4) w1 − (3/4) w3 + (1/2) u = 11
(1/4) w1 + w2 + (1/4) w3 − (3/2) u = 7
x2 − (1/4) w1 − (1/4) w3 + (1/2) u = 13.

In this system of equations, the basic variables are x1 , w2 and x2 , whereas the non-basic variables are w1 , w3 and u. The solution corresponding to the system of equations above is

x1 = 11, w2 = 7, x2 = 13, w1 = 0, w3 = 0, u = 0, z = 0.
Since we have u = 0 at the optimal solution to the phase-1 linear program, we can use the
values of the decision variables x1 and x2 as an initial feasible solution to the original linear
program. In particular, x1 = 11 and x2 = 13 provides a feasible solution to the original
linear program that we want to solve. We use this solution as an initial feasible solution
when we use the simplex method to solve the original linear program.
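The phase-1 idea can also be checked with a solver. The sketch below assumes Python with scipy.optimize.linprog, which is not part of the text; an optimal objective value of 0 for u certifies that the original linear program is feasible, although the particular (x1 , x2 ) returned by the solver may differ from the solution found by hand.

```python
# Solve the phase-1 linear program: minimize u subject to the relaxed constraints.
from scipy.optimize import linprog

c = [0, 0, 1]                      # minimize u; variables are (x1, x2, u)
A_ub = [[1, -3, -1],               #  x1 - 3 x2 - u <= -28
        [0, 1, -1],                #        x2 - u <= 20
        [-1, -1, -1]]              # -x1 -   x2 - u <= -24
b_ub = [-28, 20, -24]

result = linprog(c, A_ub=A_ub, b_ub=b_ub)    # x1, x2, u >= 0 by default
print("optimal u:", result.fun)              # 0.0, so the original LP is feasible
print("a feasible (x1, x2):", result.x[:2])  # may differ from the (11, 13) found by hand
```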
x1 −3 x2 +w1 −u = −28
x2 +w2 −u = 20
−x1 −x2 +w3 −u = −24
for the constraints. After applying a sequence of row operations, we ended up with the
system of equations
x1 + (1/4) w1 − (3/4) w3 + (1/2) u = 11
(1/4) w1 + w2 + (1/4) w3 − (3/2) u = 7
x2 − (1/4) w1 − (1/4) w3 + (1/2) u = 13.
Note that the constraints of the original linear program that we want to solve, written by using the slack variables w1 , w2 and w3 , are the same constraints with the decision variable u dropped:

x1 − 3 x2 + w1 = −28
x2 + w2 = 20
−x1 − x2 + w3 = −24.
Thus, if we apply the same sequence of row operations that we applied in the previous
section, then we would end up with the system of equations
x1 + (1/4) w1 − (3/4) w3 = 11
(1/4) w1 + w2 + (1/4) w3 = 7
x2 − (1/4) w1 − (1/4) w3 = 13.
Since a system of equations remains equivalent after applying a sequence of row operations,
this discussion implies that the last system of equations above is equivalent to the
constraints of the original linear program. Thus, noting that we maximize x1 + x2 in the
objective function of the original linear program, to solve the original linear program, we
can start with the system of equations

x1 + x2 = z
x1 + (1/4) w1 − (3/4) w3 = 11
(1/4) w1 + w2 + (1/4) w3 = 7
x2 − (1/4) w1 − (1/4) w3 = 13.
In the system of equations above, we are tempted to identify x1 , w2 and x2 as the basic
variables, but we observe that the decision variables x1 and x2 have non-zero coefficients in
the objective function row, while the basic variables need to have a coefficient of 0 in the
objective row. However, with only one set of row operations, we can immediately obtain a
new system of equations that is equivalent to the one above and the decision variables x1 ,
w2 and x2 appear only in one of the constraints with a coefficient of 1 without appearing
in the objective function row. In particular, we multiply the first constraint row by −1 and
add it to the objective row. We multiply the third constraint row by −1 and add it to the
objective row. Thus, we obtain the system of equations

w3 = z − 24
x1 + (1/4) w1 − (3/4) w3 = 11
(1/4) w1 + w2 + (1/4) w3 = 7
x2 − (1/4) w1 − (1/4) w3 = 13.

The solution corresponding to this system of equations is

x1 = 11, w2 = 7, x2 = 13, w1 = 0, w3 = 0, z = 24,

which is a feasible solution to the linear program that we want to solve and this solution
provides an objective value of 24.
Since we are maximizing the objective function in our linear program, noting the objective
function row in the system of equations above, we increase the value of the decision variable
w3 . We can increase w3 up to 7/(1/4) = 28 while making sure that all of the other decision
variables remain non-negative. In this case, the new value of the decision variable w3 is
determined by the second constraint row. Thus, we carry out row operations such that w3
appears only in the second constraint row with a coefficient of 1. In other words, the entering
variable is w3 and the leaving variable is w2 . Applying the appropriate row operations, we
obtain the system of equations
In this system of equations, x1 , w3 and x2 are the basic variables, whereas w1 and w2 are
the non-basic variables. The solution corresponding to the system of equations above is

x1 = 32, x2 = 20, w3 = 28, w1 = 0, w2 = 0, z = 52.
Since all of the objective function row coefficients are non-positive, we conclude that the
solution (x1 , x2 ) = (32, 20) is an optimal solution providing an objective value of 52.
max ∑_{j=1}^{n} cj xj
st ∑_{j=1}^{n} aij xj ≤ bi ∀ i = 1, . . . , m
xj ≥ 0 ∀ j = 1, . . . , n.
If we have a greater than or equal to constraint of the form ∑_{j=1}^{n} aij xj ≥ bi , then we can equivalently write this constraint as the less than or equal to constraint −∑_{j=1}^{n} aij xj ≤ −bi . If we have an equal to constraint of the form ∑_{j=1}^{n} aij xj = bi , then we can equivalently write this constraint as the two inequality constraints ∑_{j=1}^{n} aij xj ≤ bi and ∑_{j=1}^{n} aij xj ≥ bi .
If we have a decision variable xj that takes non-positive values, then we can use a new
decision variable yj that takes non-negative values and replace all occurrences of xj by −yj .
Finally, if we have a non-restricted decision variable xj that takes both positive and
negative values, then we can use two new non-negative decision variables x̂j and x̄j to
replace all occurrences of xj by x̂j − x̄j . By using the transformations above, we can convert
any linear program into a form where we maximize the objective function with less than or
equal to constraints and non-negative decision variables. Consider the linear program
min 5 x1 − 9 x2 + 3 x3 + 4 x4
st x1 + 7 x2 + 5 x3 + 2 x4 = 9
2 x1 + 3 x3 ≥ 7
x2 + 6 x4 ≤ 4
x1 ≥ 0, x2 is free, x3 ≤ 0, x4 ≥ 0.
max 4 x 1 + 6 x2 − 3 x3
st 2 x1 + x2 − 2 x3 ≤ 3
3 x 1 + 3 x2 − 2 x3 ≤ 4
x1 , x2 , x3 ≥ 0.
4 x1 + 6 x2 − 3 x3 = z
2 x1 + x2 − 2 x3 + w1 = 3
3 x1 + 3 x2 − 2 x3 + w2 = 4.
The basic variables above are w1 and w2 . The non-basic variables are x1 , x2 and x3 . This
system of equations has the corresponding solution w1 = 3, w2 = 4, x1 = 0, x2 = 0, x3 =
0, z = 0. We choose to increase the value of the decision variable x2 , since x2 has the largest
positive coefficient in the objective function row. We can increase x2 up to min{3/1, 4/3} =
4/3, while making sure that all of the other decision variables remain non-negative. Thus,
we carry out row operations so that x2 appears only in the second constraint row with a
coefficient of 1. These row operations provide the system of equations
In the system of equations above, the basic variables are w1 and x2 . The non-basic variables
are x1 , x3 and w2 . This system of equations yields the solution w1 = 5/3, x2 = 4/3, x1 =
0, x3 = 0, w2 = 0, z = 8. We increase the value of the decision variable x3 since it has the
largest positive coefficient in the objective function row.
Considering the first constraint row above, since the decision variable x3 appears with
a negative coefficient in this constraint row, if we increase x3 , then we can make up for
the increase in x3 by increasing w1 . Thus, we can increase x3 as much as we want without
running into the danger of w1 going negative, which implies that the first constraint row
does not impose any restrictions on how much we can increase the value of x3 . Similarly,
considering the second constraint row above, if we increase x3 , then we can make up for
the increase in x3 by increasing x2 . So, we can increase x3 as much as we want without
running into the danger of x2 going negative. This discussion shows that we can increase x3 as much as we want without running into the danger of any of the other variables going negative. Also, the objective function row coefficient of x3 in the last system of equations is positive, which implies that the increase in x3 will make the value of the objective function
larger. Therefore, we can make the objective function value as large as we want without
violating the constraints. In other words, this linear program is unbounded.
The moral of this story is that if the system of equations at any iteration of the simplex
method has a non-basic variable such that this non-basic variable has a positive coefficient
in the objective function row and has a non-positive coefficient in all of the constraint rows,
then the linear program is unbounded.
We note that it is difficult to see a priori that the linear program we want to solve is
unbounded. However, the simplex method detects the unboundedness of the linear program
during the course of its iterations. Once the simplex method detects that the linear program
is unbounded, we can actually provide explanation for the unboundedness. For the linear
program above, for some t ≥ 0, consider the solution
x3 = t, w1 = 5/3 + (4/3) t, x2 = 4/3 + (2/3) t, x1 = 0, w2 = 0.
For any t ≥ 0, we have
2 x1 + x2 − 2 x3 = 4/3 + (2/3) t − 2 t = 4/3 − (4/3) t ≤ 3
3 x1 + 3 x2 − 2 x3 = 3 (4/3 + (2/3) t) − 2 t = 4
x1 = 0, x2 = 4/3 + (2/3) t ≥ 0, x3 = t ≥ 0.
Therefore, the solution above is feasible to the linear program that we want to solve for
any value of t ≥ 0. Also, this solution provides an objective value of 4 x1 + 6 x2 − 3 x3 = 6 (4/3 + (2/3) t) − 3 t = 8 + t. If we choose t arbitrarily large, then the solution above is feasible
to the linear program that we want to solve, but the objective value 8 + t provided by this
solution is arbitrarily large. So, the linear program is unbounded.
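A solver detects the same unboundedness. The sketch below assumes Python with scipy.optimize.linprog, which is not part of the text; linprog reports the condition through its status code.

```python
# Ask SciPy to solve the unbounded linear program above.
from scipy.optimize import linprog

c = [-4, -6, 3]                 # negated, since linprog minimizes
A_ub = [[2, 1, -2],
        [3, 3, -2]]
b_ub = [3, 4]

result = linprog(c, A_ub=A_ub, b_ub=b_ub)
print(result.status)            # expected 3, which indicates an unbounded problem
print(result.message)
```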
max 7 x1 + 12 x2 − 3 x3
st 6 x1 + 8 x2 − 2 x3 ≤ 1
− 3 x1 − 3 x2 + x3 ≤ 2
x1 , x2 , x3 ≥ 0.
To solve the linear program above, the simplex method starts with the system of equations
7 x1 + 12 x2 − 3 x3 = z
6 x1 + 8 x2 − 2 x3 + w 1 = 1
−3 x1 − 3 x2 + x3 + w2 = 2.
The basic variables above are w1 and w2 . The non-basic variables are x1 , x2 and x3 . The
solution corresponding to this system of equations is w1 = 1, w2 = 2, x1 = 0, x2 = 0, x3 =
0, z = 0. Since x2 has the largest positive coefficient in the objective function row, we choose
to increase the value of the decision variable x2 . We can increase x2 up to 1/8, while making
sure that all of the other decision variables remain non-negative. Thus, we carry out row
operations so that x2 appears only in the first constraint row with a coefficient of 1. Through
these row operations, we obtain the system of equations
In the system of equations above, the basic variables are x2 and w2 . The non-basic variables
are x1 , x3 and w1 . The solution corresponding to this system of equations is
x2 = 1/8, w2 = 19/8, x1 = 0, x3 = 0, w1 = 0, z = 3/2.
Since the objective function row coefficients of all variables are non-positive, increasing any
of the variables does not improve the value of the objective function. Thus, the solution
above is optimal and the optimal objective value of the linear program is 3/2.
Now, the critical observation is that x3 is a non-basic variable whose objective function
row coefficient happened to be 0. If we increase the value of this decision variable, then
the value of the objective function does not increase, but the value of the objective function
does not decrease either! So, it is harmless to try to increase the decision variable x3 . Let us increase x3 . We can increase x3 up to (19/8)/(1/4) = 19/2, and the new value of x3 is determined by the second constraint row, so the entering variable is x3 and the leaving variable is w2 . Carrying out the appropriate row operations, we obtain a new system of equations.
The basic variables are x2 and x3 . The non-basic variables are x1 , w1 and w2 . The solution
corresponding to the system of equations above is
x2 = 5/2, x3 = 19/2, x1 = 0, w1 = 0, w2 = 0, z = 3/2.
In the last system of equations, the objective function row coefficients are non-positive. Thus,
the solution above is also optimal for the linear program and it provides an objective value
of 3/2. The two solutions that we obtained are quite different from each other, but they are
both optimal for the linear program, providing an objective value of 3/2.
The moral of this story is that if the final system of equations in the simplex method
includes a non-basic variable whose coefficient in the objective function row is 0, then we
have multiple optimal solutions to the linear program.
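The two optimal solutions can be checked directly. The sketch below is plain Python (not part of the text) and simply verifies that both solutions are feasible and provide the same objective value of 3/2.

```python
# Verify the two optimal solutions of max 7 x1 + 12 x2 - 3 x3 found above.
def feasible(x1, x2, x3):
    return (6 * x1 + 8 * x2 - 2 * x3 <= 1 + 1e-9
            and -3 * x1 - 3 * x2 + x3 <= 2 + 1e-9
            and min(x1, x2, x3) >= -1e-9)

def objective(x1, x2, x3):
    return 7 * x1 + 12 * x2 - 3 * x3

for solution in [(0, 1 / 8, 0), (0, 5 / 2, 19 / 2)]:
    print(solution, feasible(*solution), objective(*solution))   # both True, 1.5
```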
6.3 Degeneracy
In the linear programs that we considered so far, the basic variables always took strictly
positive values. However, it is possible that some basic variables take value 0. In such cases,
we say that there is degeneracy in the current solution and the simplex method may have to
carry out multiple iterations without improving the value of the objective function. Consider
the linear program
max 12 x1 + 6 x2 + 16 x3
st x1 − 4 x2 + 4 x3 ≤ 2
x1 + 2 x2 + 2 x 3 ≤ 1
x1 , x2 , x3 ≥ 0.
We start with the system of equations
12 x1 + 6 x2 + 16 x3 = z
x1 − 4 x2 + 4 x3 + w 1 = 2
x1 + 2 x2 + 2 x3 + w2 = 1.
The basic variables are w1 and w2 . The non-basic variables are x1 , x2 and x3 . The solution
corresponding to the system of equations above is
w1 = 2, w2 = 1, x1 = 0, x2 = 0, x3 = 0, z = 0.
The basic variables are x3 and w2 . The non-basic variables are x1 , x2 and w1 . The solution
corresponding to the system of equations above is given by
x3 = 1/2, w2 = 0, x1 = 0, x2 = 0, w1 = 0, z = 8.
In the solution above, the basic variable w2 takes value 0. This solution provides an objective
value of 8.
We increase the value of the decision variable x2 , whose objective function row coefficient
is 22 in the system of equations above. We can increase x2 up to 0/4 = 0 while making sure
that all of the other decision variables remain non-negative. So, we carry out row operations
so that x2 appears only in the second constraint row with a coefficient of 1. In this case, we
get the system of equations
In this system of equations, the basic variables are x3 and x2 , whereas the non-basic variables
are x1 , w1 and w2 . The solution corresponding to the system of equations above is
x3 = 1/2, x2 = 0, x1 = 0, w1 = 0, w2 = 0, z = 8.
Now, the basic variable x2 takes value 0. We observe that the values of the decision variables
in the last two solutions we obtained are identical. Only the classification of the variables
as basic and non-basic has changed. Furthermore, the last two solutions both provide an
objective value of 8 for the linear program. Thus, this iteration of the simplex method did
not improve the objective value for the linear program at all.
We increase the decision variable x1 . We can increase x1 up to min{(1/2)/(3/8), 0/(1/8)} = 0. Thus,
the new value of the decision variable x1 is determined by the second constraint. In this
case, we carry out row operations to make sure that x1 appears only in the second constraint
row with a coefficient of 1. We obtain the system of equations
The basic variables are w1 and x1 , whereas the non-basic variables are x2 , x3 and w2 . The
solution corresponding to the system of equations above is
w1 = 1, x1 = 1, x2 = 0, x3 = 0, w2 = 0, z = 12.
The objective value provided by the solution above is 12. So, we finally obtained a solution
that improves the value of the objective function from 8 to 12. In the last system of equations,
since the objective function row coefficients of all of the decision variables are non-positive,
the solution above is optimal for the linear program. We can stop.
The moral of this story is that we can have basic variables that take value 0. When we
have basic variables that take value 0, we say that the current solution is degenerate. If we
encounter a degenerate solution, then the simplex method may have to carry out multiple
iterations without improving the value of the objective function.
To formulate this problem as a linear program, we use the decision variable xij to
capture the number of units that we ship over arc (i, j). Thus, our decision variables are
x12 , x13 , x24 , x32 , x35 , x45 and x54 . To understand how we can set up the constraints in our
linear program, the figure below shows one possible feasible solution to the problem. The
labels on the arcs show the number of units shipped on each arc. The arcs that do not have
any flow of product on them are indicated in dotted lines. In particular, for the solution in
the figure below, the values of the decision variables are
Concentrating on the supply node 2 with a supply of 2 units, this node receives 4 units from
node 3. Also, counting the 2 units of supply at node 2, node 2 now has 6 units of product. So,
the flow out of node 2 in the feasible solution is 6. Therefore, the flow in and out of a supply
node i in a feasible solution must satisfy
Total Flow into Node i + Supply at Node i = Total Flow out of Node i,
which can equivalently be written as
Total Flow out of Node i − Total Flow into Node i = Supply at Node i.
On the other hand, concentrating on the demand node 4 with a demand of 3 units, this node
receives 6 units from node 2. Out of these 6 units, 3 of them are used to serve the demand
at node 4 and the remaining 3 become the flow out of node 4. Thus, the flow in and out of
a demand node i in a feasible solution must satisfy
Total Flow into Node i = Demand at Node i + Total Flow out of Node i,
which can equivalently be written as
Total Flow into Node i − Total Flow out of Node i = Demand at Node i.
Node 3 is neither a demand node nor a supply node. For such a node, the total flow out
of the node must be equal to the total flow into the node. Thus, the linear programming
formulation of the problem is given by
Now, all of the constraints in this linear program are of the form
Total Flow out of Node i − Total Flow into Node i = Availability at Node i,
where availability is a positive number at supply nodes and a negative number at demand
nodes. The last linear program avoids the necessity to remember two different forms
of constraints for the supply and demand nodes. Our constraints always have the form
(total flow out) − (total flow in) = (availability at the node). We only need to remember
that availability is positive at supply nodes and negative at demand nodes.
An interesting observation for the min-cost network flow problem is that one of the
constraints in the problem is always redundant. For example, assume that we have a solution
that satisfies the first, second, fourth and fifth constraints. If we add the first, second, fourth
and fifth constraints, then we obtain
x12 + x13 = 5
−x12 + x24 − x32 = 2
− x24 + x45 − x54 = −3
− x35 − x45 + x54 = −4
x13 − x32 − x35 = 0,

which is exactly the third constraint in the problem. In other words, any solution that satisfies the first, second, fourth and fifth constraints automatically satisfies the third constraint. Therefore, we can drop the third constraint and work with the constraints
x12 + x13 = 5
−x12 + x24 − x32 = 2
− x24 + x45 − x54 = −3
− x35 − x45 + x54 = −4.
We know that in a system of equations with four constraints, we have four basic
variables. Assume that we use the simplex method to solve the min-cost network flow
problem. We want to answer the question of what the system of equations for the constraints
would look like when the basic variables are, for example, x13 , x24 , x32 and x45 . To answer
this question, we carry out row operations in the system of equations above to make sure that
x13 , x24 , x32 and x45 appear in a different constraint with coefficients of 1. The variable x13
already appears in the first constraint with a coefficient of 1 and nowhere else. We multiply
the second constraint by −1 to get
x12 + x13 = 5
x12 − x24 + x32 = −2
− x24 + x45 − x54 = −3
− x35 − x45 + x54 = −4,

so that x32 appears only in the second constraint with a coefficient of 1 and nowhere else. Next, we multiply the third constraint by −1 and add the new third constraint to the second constraint to get
x12 + x13 = 5
x12 + x32 − x45 + x54 = 1
x24 − x45 + x54 = 3
− x35 − x45 + x54 = −4.
Thus, x24 appears only in the third constraint with a coefficient of 1 and nowhere else. Finally,
we subtract the fourth constraint from the second and third constraints, and multiply the
fourth constraint by −1 to get
x12 + x13 = 5
x12 + x32 + x35 = 5
x24 + x35 = 7
x35 + x45 − x54 = 4.
So, x45 now appears in the fourth constraint only with a coefficient of 1. Thus, if the simplex
method visited the solution with basic variables x13 , x24 , x32 and x45 , then the values of these
decision variables would be x13 = 5, x24 = 7, x32 = 5 and x45 = 4. Note that we did not
have to carry out a division operation to obtain these values. Also, all of the multiplication
operations were multiplication by −1. As a result, the values of the decision variables x13 ,
x24 , x32 and x45 are obtained by adding and subtracting the supply and demand quantities
in the original min-cost network flow problem. If the supply and demand quantities are
integers, then the values of x13 , x24 , x32 and x45 are integers as well.
As another example, let us check what the system of equations for the constraints in the
simplex method would look like when the basic variables are x12 , x13 , x24 and x35 . We start
from the last system of equations above. Since this system of equations was obtained from
the original constraints of the min-cost network flow problem by using row operations, this
system of equations is equivalent to the original constraints of the min-cost network flow
problem. The variable x13 appears only in the first constraint, with a coefficient of 1. To
make sure that x12 appears only in the second constraint with a coefficient of 1, we subtract
the second constraint from the first constraint to obtain
x13 − x32 − x35 = 0
x12 + x32 + x35 = 5
x24 + x35 = 7
x35 + x45 − x54 = 4.
The variable x24 already appears only in the third constraint, with a coefficient of 1. To make sure that x35 appears only in the fourth constraint with a coefficient of 1, we add the fourth constraint to the first constraint and subtract the fourth constraint from the second and third constraints to obtain
x13 − x32 + x45 − x54 = 4
x12 + x32 − x45 + x54 = 1
x24 − x45 + x54 = 3
x35 + x45 − x54 = 4.
Thus, if the simplex method visited the solution with basic variables x12 , x13 , x24 and
x35 , then the values of these decision variables would be x12 = 1, x13 = 4, x24 = 3 and
x35 = 4. Again, we only used addition and subtraction to obtain these values. In particular,
we did not use any division operation.
Although this discussion is not a theoretical proof, it convinces us that when we apply
the simplex method on the min-cost network flow problem, we never have to use division
and the only multiplication operation we use is multiplication by −1. So, the values of the
decision variables in any solution visited by the simplex method are obtained by adding and
subtracting the supply and demand quantities in the original problem. Thus, as long as
the supply and demand quantities in the original problem take integer values, the decision
variables will also take integer values in any solution visited by the simplex method. Since
this observation applies to the final solution visited by the simplex method, the optimal
solution to the min-cost network flow problem will be integer valued.
Supply"of"3"
1" Demand"of"3"
2" 4"
5"
1"
3" 5"
6" Demand"of"4"
We want to figure out how to ship the product from the supply nodes to the demand nodes
so that we incur the minimum shipment cost, while making sure that we do not violate the
supply availabilities at the supply nodes and satisfy the demands at the demand nodes, but
we do not need to ship out all the supply from the supply nodes. So, the flow in and out of
a supply node i in a feasible solution must satisfy
Total Flow into Node i + Supply at Node i ≥ Total Flow out of Node i,
which can equivalently be written as
Total Flow out of Node i − Total Flow into Node i ≤ Supply at Node i.
We only need to adjust our constraints for the supply nodes. The constraints for the other
nodes do not change. Using the decision variable xij with the same interpretation as before,
we can formulate the problem as the linear program
min 5 x12 + x13 + x24 + 2 x32 + 6 x35 + 2 x45 + 5 x54
st x12 + x13 ≤ 6
x24 − x12 − x32 ≤ 3
x32 + x35 − x13 = 0
x45 − x24 − x54 = −3
x54 − x35 − x45 = −4
x12 , x13 , x24 , x32 , x35 , x45 , x54 ≥ 0.
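To see the formulation in action, here is a minimal sketch of this linear program in Gurobi's Python interface, assuming the gurobipy package is available; the model and variable names are chosen for illustration only.

from gurobipy import Model, GRB, quicksum

# arc costs taken from the linear program above
cost = { (1, 2): 5, (1, 3): 1, (2, 4): 1, (3, 2): 2,
         (3, 5): 6, (4, 5): 2, (5, 4): 5 }
# availability at each node: positive for supply, negative for demand
avail = { 1: 6, 2: 3, 3: 0, 4: -3, 5: -4 }

myModel = Model( "minCostFlow" )
x = { a: myModel.addVar( lb = 0.0, name = "x%d%d" % a ) for a in cost }
myModel.update()

myModel.setObjective( quicksum( cost[ a ] * x[ a ] for a in cost ), GRB.MINIMIZE )
for i in avail:
    flowOut = quicksum( x[ a ] for a in cost if a[ 0 ] == i )
    flowIn = quicksum( x[ a ] for a in cost if a[ 1 ] == i )
    if avail[ i ] > 0:
        # supply node: we do not have to ship out all of the supply
        myModel.addConstr( flowOut - flowIn <= avail[ i ] )
    else:
        myModel.addConstr( flowOut - flowIn == avail[ i ] )
myModel.optimize()

Solving this model should return integer flows, in line with the integrality discussion earlier in the chapter.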
           Job 1   Job 2   Job 3
Tech 1       2       4       5
Tech 2       3       6       8
Tech 3       8       4       9
This problem can be formulated as a special min-cost network flow problem. In the figure
below, we put one node on the left side for each technician. Each one of these nodes is a
supply node with a supply of 1 unit. We put one node on the right side for each job. Each
one of these nodes is a demand node with a demand of 1 unit. There is an arc from each
technician node to each job node. Assigning technicians to jobs is equivalent to shipping out
the supplies from the technician nodes to satisfy the demand at the job nodes. If we ship
the supply at technician node i to satisfy the demand at job node j, then we are assigning
technician i to job j, in which case, we get the reward of assigning technician i to job j.
We use xij to capture the number of units flowing on arc (i, j) in the figure above. We can
figure out how to ship the supplies from the technician nodes to cover the demand at the
job nodes to maximize the total reward by solving the linear program
In this linear program, we maximize the objective function, but maximizing the objective
function is equivalent to minimizing the negative of this objective function. We kept all of
the constraints of the form (total flow out) − (total flow in) = (availability at the node),
which is the form we used when formulating min-cost network flow problems in the previous
chapter. Thus, this linear program corresponds to the min-cost network flow problem for the
network depicted in the figure above. Since we know that min-cost network flow problems
have integer valued optimal solutions, we do not need to worry about the possibility of
sending half a unit of flow from a technician to one job and half a unit to another job in the
optimal solution. Thus, the optimal solution to the linear program above provides a valid
assignment of the technicians to the jobs. We refer to the problem above as the assignment
problem. It is common to multiply the last three constraints in the formulation above by
−1 and write the assignment problem as
max 2 x11 + 4 x12 + 5 x13 + 3 x21 + 6 x22 + 8 x23 + 8 x31 + 4 x32 + 9 x33
st x11 + x12 + x13 = 1
x21 + x22 + x23 = 1
x31 + x32 + x33 = 1
x11 + x21 + x31 = 1
x12 + x22 + x32 = 1
x13 + x23 + x33 = 1
xij ≥ 0 ∀ i = 1, 2, 3, j = 1, 2, 3,
in which case, the last three constraints ensure that we have a total flow of 1 into the demand
node corresponding to each job. So, each job gets one technician. The first three constraints ensure that we have a total flow of 1 out of the supply node corresponding to each technician, so each technician is assigned to one job. In general, when there are n technicians and n jobs, the total flow out of the supply node corresponding to technician i is Σ_{j=1}^{n} xij and the total flow into the demand node corresponding to job j is Σ_{i=1}^{n} xij . Thus, letting rij denote the reward of assigning technician i to job j, the compact formulation of the assignment problem is
max  Σ_{i=1}^{n} Σ_{j=1}^{n} rij xij
st   Σ_{j=1}^{n} xij = 1   ∀ i = 1, . . . , n
     Σ_{i=1}^{n} xij = 1   ∀ j = 1, . . . , n
     xij ≥ 0   ∀ i = 1, . . . , n, j = 1, . . . , n.
The moral of this story is that we can find the optimal assignment of technicians to jobs
by a linear program without explicitly imposing the constraint that the assignment decisions
should take integer values. This result follows from the fact that the assignment problem
can be formulated as a min-cost network flow problem.
This problem can also be formulated as a special min-cost network flow problem. In
the figure above, we put 1 unit of supply at the origin node 0 and 1 unit of demand at the
destination node 5. If we ship a unit of flow over an arc, then we incur the cost indicated on
the arc. Consider the problem of shipping the unit of supply at node 0 to satisfy the demand
at node 5 while minimizing the cost of the shipment. This unit supply will travel over the
path with the total minimum cost from node 0 to node 5. So, this unit will travel over the
shortest path from node 0 to node 5. Thus, figuring out how to ship the unit of supply from
node 0 to node 5 in the cheapest possible manner is equivalent to finding the shortest path
from node 0 to node 5. We use the decision variable xij to capture the flow on arc (i, j).
The problem of finding the cheapest possible way to ship the unit of supply from node 0 to
node 5 can be solved as the min-cost network flow problem
The problem above is called the shortest path problem. In the constraints, we follow the
convention that (total flow out) − (total flow in) = (availability at the node). The optimal
solution to the problem above is given by x∗03 = 1, x∗31 = 1, x∗12 = 1, x∗24 = 1, x∗45 = 1. The
other decision variables are 0 in the optimal solution. Thus, to go from node 0 to node 5
with the smallest possible cost, we go from node 0 to 3, from node 3 to node 1, from node 1
to node 2, from node 2 to node 4 and from node 4 to node 5.
To give a compact formulation of the shortest path problem, we use N = {0, 1, . . . , n}
to denote the set of nodes and A to denote the set of arcs in the network. We let 0 be the origin node and n be the destination node. In the compact formulation, the first and third constraints are the flow balance constraints for the origin and destination nodes, whereas the second set of constraints corresponds to the flow balance constraints for all nodes other than the origin and destination nodes.
7" 6"
1" 3" 5" ,t"
4"
4" 3" 3" 5" 7"
We can find the maximum amount of flow we can push from node 0 to node 5 by using
a special min-cost network flow problem. In the network above, we put t units of supply at node 0 and t units of demand at node 5, and we choose the flows on the arcs, together with t, to make t as large as possible. This problem is called the max-flow problem. We emphasize that t is a decision variable in
the problem above. The first constraint is the flow balance constraint for node 0. The sixth
constraint is the flow balance constraint for node 5. The second to fifth constraints are the
flow balance constraints for the nodes other than nodes 0 and 5. The last set of constraints
ensures that the flows on the arcs adhere to the maximum flow allowed on each arc.
The optimal solution to the problem above is given by t∗ = 11, x∗01 = 4, x∗02 = 4, x∗03 =
3, x∗13 = 4, x∗24 = 4, x∗34 = 1, x∗35 = 6, x∗45 = 5. The other decision variables are 0 in the
optimal solution. Since t∗ = 11, the maximum amount of flow we can push from node 0 to
node 5 is 11 units.
To give a compact formulation of the max-flow problem, we use N = {0, 1, . . . , n} to
denote the set of nodes and A to denote the set of arcs in the network. We use Uij to denote
the maximum flow allowed on arc (i, j). We want to find the maximum flow we can push
from node 0 to node n. We use the decision variable xij to capture the flow on arc (i, j)
The first and third constraints are the flow balance constraints for nodes 0 and n. The second
set of constraints corresponds to the flow balance constraints for the nodes other than nodes
0 and n. The fourth set of constraints ensures that the flows on the arcs do not exceed the
maximum flow allowed on each arc.
We use xm and xs to respectively denote the number of contracts we sign with memory-intensive and storage-intensive customers. The problem we want to solve can be formulated as the linear program
max 2400 xm + 3200 xs
st  100 xm + 40 xs ≤ 10000
    200 xm + 400 xs ≤ 60000
    xs ≤ 140
    xm , xs ≥ 0.
To solve the linear program above by using Gurobi, we construct a text file with the following
contents and save it in a file named cloud.lp.
Maximize
2400 xm + 3200 xs
Subject To
ramConst : 100 xm + 40 xs <= 10000
stoConst : 200 xm + 400 xs <= 60000
Bounds
xs <= 140
End
The section titled Maximize indicates that we are maximizing the objective function. We
provide the formula for the objective function by using the decision variables. We do not
need to declare the decision variables separately. The section titled Subject To defines
the constraints. We name the first constraint ramConst and provide the formula for this
constraint. We define the second constraint similarly. The section Bounds gives the upper
bounds on our decision variables. The decision variable xs has an upper bound of 140. We
could list the bound on the decision variable xs as another constraint under the section titled
Subject To, but if we list the upper bounds on the decision variables under the section titled
Bounds, then Gurobi deals with the upper bounds more efficiently.
Once we have the text file that includes our linear programming model, we open a
terminal window and type the command gurobi.sh, which runs Gurobi as a standalone
linear programming solver. We can solve the linear program as follows.
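As a sketch, assuming the file cloud.lp sits in the current working directory, the interactive session looks roughly as follows (solver output omitted); read and optimize are the standard Gurobi shell commands for loading and solving a model.

gurobi> myModel = read("cloud.lp")
gurobi> myModel.optimize()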
The command myModel = read("cloud.lp") reads the linear programming model in the
file cloud.lp and stores this model to the variable myModel. If the file cloud.lp is not stored
under the current working directory, then we need to provide the full path when reading the file.
After solving the linear program, Gurobi informs us that the optimal objective value is
520,000. We can explore the optimal solution to the linear program as follows.
gurobi> myModel.printAttr("X")
Variable X
-------------------------
xm 50
xs 125
gurobi> myVars = myModel.getVars()
gurobi> print myVars
[<gurobi.Var xm (value 50.0)>, <gurobi.Var xs (value 125.0)>]
gurobi> print myVars[0].varName, myVars[0].x
xm 50.0
The command myModel.printAttr("X") prints the "X" attribute of the model stored in
the variable myModel. This attribute includes the names and the values of the decision
variables. The command myVars = myModel.getVars() stores the decision variables of the
linear program in the array myVars. We can print this array by using the command print
myVars. Note that printing the array myVars shows the names and the values of the decision
variables. The command print myVars[0].varName, myVars[0].x prints the name and
the value of the first decision variable. In particular, myVars[0] returns the first decision
variable in the array myVars and we access the name and the value of this decision
variable by using the fields varName and x.
gurobi> myModel.printAttr("pi")
Constraint pi
-------------------------
ramConst 10
stoConst 7
gurobi> myConsts = myModel.getConstrs()
gurobi> print myConsts
[<gurobi.Constr ramConst>, <gurobi.Constr stoConst>]
gurobi> print myConsts[0].constrName, myConsts[0].pi
ramConst 10.0
In a following chapter, we will study duality theory. When we study duality theory, we will
see that there is a dual variable associated with each constraint of a linear program. The
command myModel.printAttr("pi") prints the "pi" attribute of our model. This attribute
includes the names of the constraints and the values of the dual variables associated with
the constraints. From the output above, the optimal value of the dual variable associated
with the first constraint is 10. The command myConsts = myModel.getConstrs() stores
the constraints in the array myConsts. We can print this array by using the command print
myConsts. The output from printing the array myConsts is uninformative. It only shows
the constraint names. The command print myConsts[0].constrName, myConsts[0].pi
prints the name of the first constraint along with the value of the dual variable associated
with this constraint. In particular, myConsts[0] returns the first constraint in the array
myConsts and we access the name of this constraint and the optimal value of the dual
variable associated with this constraint by using the fields constrName and pi.
We can use the following set of commands to open a file and write the names and the
values of the decision variables into the file.
gurobi> outFile = open( "solution.txt", "w" )
gurobi> for curVar in myVars:
....... outFile.write( curVar.varName + " " + str( curVar.x ) + "\n" )
.......
gurobi> outFile.close()
The command outFile = open( "solution.txt", "w" ) opens the file solution.txt
for writing and assigns this file to the variable outFile. Recall that we stored the decision
variables of our linear program in the array myVars. We use a for loop to go through all
elements of this array and write the varName and x fields of each decision variable into the
file. Lastly, we close the file. As may have been clear by now, interacting with Gurobi is
similar to writing a Python script. Many other constructions that are available for writing
Python scripts are also available when interacting with Gurobi.
We start by creating a model and store our model in the variable myModel. Next, we create
the decision variables and add them into our model. When creating a decision variable, we
specify that the variable takes continuous values and give a name for the decision variable. If
there is an upper or a lower bound on the decision variable, then we can specify these bounds
as well. If we do not specify any upper and lower bounds, then the default choices are infinity
for the upper bound and zero for the lower bound. Giving a name to the decision variable is
optional. We store the two decision variables that we create in the variables xm and xs. The
command myModel.update() is easy to overlook, but it is important. It ensures that our
model myModel recognizes the variables xm and xs.
We proceed to creating the objective function and constraints of our model. Both the
objective and the constraints are created by using the call LinExpr(), which creates an
empty linear function. We construct the components of the linear function one by one. For
the objective function, we create a linear function and store this linear function in the
variable objExpr. Next, we indicate the coefficient of each decision variable in the objective
function. Finally, we set the objective function of our model myModel to be the linear function
objExpr. While doing so, we specify that we are maximizing the objective function. We
create the constraints of our model somewhat similarly. For the first constraint, we create a
linear function and store the linear function in the variable firstConst. Next, we indicate
the coefficients of each decision variable in the constraint. Finally, we add the constraint to our model, and we create the second constraint in the same way.
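For reference, a minimal sketch of the complete model along the lines described above could look as follows; it follows the names used in this section (myModel, xm, xs, objExpr, firstConst), while the remaining details are illustrative.

from gurobipy import Model, GRB, LinExpr

myModel = Model( "cloud" )

# decision variables; the default lower bound is zero
xm = myModel.addVar( vtype = GRB.CONTINUOUS , name = "xm" )
xs = myModel.addVar( ub = 140 , vtype = GRB.CONTINUOUS , name = "xs" )
myModel.update()

# objective: maximize 2400 xm + 3200 xs
objExpr = LinExpr()
objExpr += 2400 * xm
objExpr += 3200 * xs
myModel.setObjective( objExpr , GRB.MAXIMIZE )

# constraint: 100 xm + 40 xs <= 10000
firstConst = LinExpr()
firstConst += 100 * xm
firstConst += 40 * xs
myModel.addConstr( lhs = firstConst , sense = GRB.LESS_EQUAL , rhs = 10000 , name = "ramConst" )

# constraint: 200 xm + 400 xs <= 60000
secondConst = LinExpr()
secondConst += 200 * xm
secondConst += 400 * xs
myModel.addConstr( lhs = secondConst , sense = GRB.LESS_EQUAL , rhs = 60000 , name = "stoConst" )

myModel.optimize()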
           Job 1   Job 2   Job 3
Tech 1       2       4       5
Tech 2       3       6       8
Tech 3       8       4       9
max 2 x11 + 4 x12 + 5 x13 + 3 x21 + 6 x22 + 8 x23 + 8 x31 + 4 x32 + 9 x33
st x11 + x12 + x13 = 1
x21 + x22 + x23 = 1
x31 + x32 + x33 = 1
x11 + x21 + x31 = 1
x12 + x22 + x32 = 1
x13 + x23 + x33 = 1
xij ≥ 0 ∀ i = 1, 2, 3, j = 1, 2, 3,
where the first three constraints ensure that each technician is assigned to one job and the
last three constraints ensure that each job gets one technician. The problem above has nine
decision variables and six constraints. In our Python program, we can certainly create nine
decision variables and six constraints one by one, but this task would be tedious when the
numbers of technicians and jobs get large. In the following Python program, we use loops
to create the decision variables and the constraints. We present each portion of the program
separately. We start by initializing the data for the problem.
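A minimal sketch of this step, consistent with the names noTechs, noJobs and rewards used below, is the following; the reward values mirror the table above.

from gurobipy import Model, GRB, LinExpr

noTechs = 3
noJobs = 3
# rewards[ i ][ j ] is the reward of assigning technician i to job j
rewards = [ [ 2 , 4 , 5 ] , [ 3 , 6 , 8 ] , [ 8 , 4 , 9 ] ]

myModel = Model( "assignment" )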
The variables noTechs and noJobs keep the numbers of technicians and jobs. Since the
numbers of technicians and jobs are equal to each other, there is really no reason to define
two variables, but having two variables will be useful when we want to emphasize whether we
are looping over the technicians or the jobs in the subsequent portions of our program. We use the two-dimensional array rewards to keep the reward of assigning each technician to each job.
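The decision variables can then be created along the following lines, again as a sketch rather than a definitive listing.

# one continuous decision variable for each technician and job pair
myVars = [ [ None for j in range( noJobs ) ] for i in range( noTechs ) ]
for i in range( noTechs ):
    for j in range( noJobs ):
        myVars[ i ][ j ] = myModel.addVar( vtype = GRB.CONTINUOUS , \
                                           name = "x" + str( i ) + str( j ) )
myModel.update()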
We have one decision variable for each technician and job pair. Each one of these decision
variables takes continuous values. We name the decision variables by using the technician
and job corresponding to each decision variable. Lastly, we store all of the decision
variables in the two-dimensional array myVars, so that the (i, j)-th element of the array
myVars includes the decision variable corresponding to assigning technician i to job j. As
in the previous section, by using the call myModel.update(), we make sure that our model
myModel recognizes the decision variables we created.
After creating the decision variables in our linear program, we move on to defining the
objective function as follows.
# create a linear expression for the objective
objExpr = LinExpr()
for i in range( noTechs ):
    for j in range( noJobs ):
        curVar = myVars[ i ][ j ]
        objExpr += rewards[ i ][ j ] * curVar
myModel.setObjective( objExpr , GRB.MAXIMIZE )
The call LinExpr() above creates a new linear function and we store this linear function in the variable objExpr. Recalling that there is one decision variable for each technician and job pair, we loop over all of these pairs and add each decision variable into the objective function with the corresponding reward as its coefficient. Next, we create the constraints that ensure that each technician is assigned to one job.
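A sketch of the loop that creates these technician-side constraints, mirroring the job-side loop shown later in this section, is given below; the constraint names starting with "t" are an arbitrary choice.

# create constraints so that each tech is assigned to one job
for i in range( noTechs ):
    constExpr = LinExpr()
    for j in range( noJobs ):
        curVar = myVars[ i ][ j ]
        constExpr += 1 * curVar
    myModel.addConstr( lhs = constExpr , sense = GRB.EQUAL , rhs = 1 , \
                       name = "t" + str( i ) )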
We loop over all technicians. For technician i, we need to create a constraint that ensures
that this technician is assigned to one job. We create a linear function that keeps the
left side of this constraint and store this linear function in the variable constExpr. The
decision variables that correspond to assigning technician i to any of the jobs appear in the
constraint with a coefficient of 1. Thus, we loop over each job j and add the decision variable
corresponding to assigning technician i to each job j into the constraint with a coefficient
of 1. Now, we have a linear function corresponding to the left side of the constraint that
ensures that technician i is assigned to one job. We add this constraint into our model as an
equality constraint with a right side of 1. We name the constraint by using the technician
corresponding to the constraint. By following the same approach, we create the constraints
that ensure that each job gets one technician.
# create constraints so that each job gets one tech
for j in range( noJobs ):
    constExpr = LinExpr()
    for i in range( noTechs ):
        curVar = myVars[ i ][ j ]
        constExpr += 1 * curVar
    myModel.addConstr( lhs = constExpr , sense = GRB.EQUAL , rhs = 1 , \
                       name = "j" + str( j ) )
After creating the objective function and the constraints, we use the call myModel.update() once more so that our model recognizes all of these changes.
op&mal))
z*) objec&ve)
value))
objec&ve)value)
provided)by)
current)solu&on)
0) itera&on)number)
Imagine that we are able to construct another linear program such that this linear
program is a minimization problem and the optimal objective value of this linear program is
greater than or equal to the optimal objective value of the original linear program we want to solve.
For the moment, we refer to this linear program as the upper bounding linear program. As
we solve the original linear program we want to solve, we also solve the upper bounding
linear program on another computer by using the simplex method. Since we minimize the
objective function in the upper bounding linear program, as the iterations of the simplex
method for the upper bounding linear program proceed, we obtain feasible solutions for the upper bounding linear program that provide smaller and smaller objective function
values. In the figure below, we depict the objective value provided by the solution on hand
for the upper bounding linear program as a function of the iteration number in the simplex
method. Since the optimal objective value of the upper bounding linear program is greater
than or equal to the optimal objective value of the original linear program we want to solve,
the objective value provided by the current solution in the figure below never dips below the
optimal objective value of the original linear program we want to solve.
objec&ve)value)
provided)by)
current)solu&on)
of)upper)bounding))
linear)program)
z*)
0) itera&on)number)
After two days of computation time, we terminate the simplex method for both the
original linear program we want to solve and the upper bounding linear program. We want
to understand how close the solution that we have for the original linear program is to being
optimal. In the figure below, z1 corresponds to the objective value provided by the solution
that we have for the original linear program after two days of computation time. The percent
optimality gap of this solution is (z ∗ −z1 )/z ∗ . We cannot compute this optimality gap because
we do not know z ∗ . On the other hand, z2 corresponds to the objective value provided by the
solution that we have for the upper bounding linear program after two days of computation
time. Note that we know z2 , which implies that we can compute (z2 − z1 )/z1 . Furthermore,
since the optimal objective value of the upper bounding linear program is greater than or
equal to the optimal objective value of the original linear program we want to solve, we have
z2 ≥ z ∗ ≥ z1 . Thus, we obtain
(z∗ − z1 ) / z∗ ≤ (z2 − z1 ) / z1 .
objec&ve)value)
provided)by)
current)solu&on)
of)upper)bounding))
z2) linear)program)
z*)
z1)
objec&ve)value)
provided)by)
current)solu&on)
of)original)linear)
program)
0) itera&on)number)
Motivated by the discussion above, the key question is how we can come up with an
upper bounding linear program that satisfies two properties. First, the upper bounding
linear program should be a minimization problem. Second, the optimal objective value of
the upper bounding linear program should be an upper bound on the optimal objective value
of the original linear program we want to solve.
max 5 x1 + 3 x2 − x3
st 3 x1 + 2 x2 + x3 ≤ 9
4 x1 + x 2 − x3 ≤ 3
x1 , x2 , x3 ≥ 0.
For the sake of illustration, assume that this linear program is large enough that we cannot
obtain its optimal objective value in reasonable computation time and we want to obtain
an upper bound on its optimal objective value. Let (x∗1 , x∗2 , x∗3 ) be the optimal solution to
the linear program providing the optimal objective value 5 x∗1 + 3 x∗2 − x∗3 . Since (x∗1 , x∗2 , x∗3 ) is the optimal solution, it satisfies the constraints of the linear program. Multiplying the first constraint by 1 and the second constraint by 2 and adding them up, we obtain
11 x∗1 + 4 x∗2 − x∗3 ≤ 9 + 2 × 3 = 15.
Also, since x∗1 ≥ 0, x∗2 ≥ 0 and x∗3 ≥ 0, we have 11 x∗1 ≥ 5 x∗1 , 4 x∗2 ≥ 3 x∗2 and −x∗3 ≥ −x∗3 . Adding these inequalities yields
5 x∗1 + 3 x∗2 − x∗3 ≤ 11 x∗1 + 4 x∗2 − x∗3 .
Combining the two displayed inequalities above, we get 5 x∗1 + 3 x∗2 − x∗3 ≤ 11 x∗1 + 4 x∗2 − x∗3 ≤ 15. Since the optimal objective value of the linear program is 5 x∗1 + 3 x∗2 − x∗3 , the last
inequality shows that 15 is an upper bound on the optimal objective value.
The key to the argument here is to combine the constraints by multiplying them with
positive numbers in such a way that the coefficient of each variable in the combined constraint
dominates its corresponding coefficient in the objective function.
A natural question is whether we can obtain an upper bound tighter than 15 by
multiplying the two constraints with numbers other than 1 and 2. To answer this question, we
generalize the idea by multiplying the constraints with generic numbers y1 ≥ 0 and y2 ≥ 0,
instead of 1 and 2. As before, since (x∗1 , x∗2 , x∗3 ) is the optimal solution to the linear program,
it should satisfy the constraints of the linear program, so that we have
3 x∗1 + 2 x∗2 + x∗3 ≤ 9, 4 x∗1 + x∗2 − x∗3 ≤ 3, x∗1 ≥ 0, x∗2 ≥ 0, x∗3 ≥ 0.
We multiply the first constraint by y1 ≥ 0 and the second constraint by y2 ≥ 0 and add
them up. Thus, if y1 ≥ 0 and y2 ≥ 0, then we have
(3 y1 + 4 y2 ) x∗1 + (2 y1 + y2 ) x∗2 + (y1 − y2 ) x∗3 ≤ 9 y1 + 3 y2 .
On the other hand, if 3 y1 + 4 y2 ≥ 5, 2 y1 + y2 ≥ 3 and y1 − y2 ≥ −1, then since x∗1 ≥ 0, x∗2 ≥ 0 and x∗3 ≥ 0, we have
5 x∗1 + 3 x∗2 − x∗3 ≤ (3 y1 + 4 y2 ) x∗1 + (2 y1 + y2 ) x∗2 + (y1 − y2 ) x∗3 .
Considering the last two displayed inequalities, the first one holds under the assumption that
y1 ≥ 0 and y2 ≥ 0, whereas the second one holds under the assumption that 3 y1 + 4 y2 ≥ 5,
2 y1 + y2 ≥ 3 and y1 − y2 ≥ −1. Combining these two inequalities, it follows that if
y1 ≥ 0, y2 ≥ 0, 3 y1 + 4 y2 ≥ 5, 2 y1 + y2 ≥ 3, y1 − y2 ≥ −1,
then we have
5 x∗1 + 3 x∗2 − x∗3 ≤ (3 y1 + 4 y2 ) x∗1 + (2 y1 + y2 ) x∗2 + (y1 − y2 ) x∗3 ≤ 9 y1 + 3 y2 .
Thus, since the optimal objective value of the linear program we want to solve is 5 x∗1 +
3 x∗2 − x∗3 , the last inequality above shows that 9 y1 + 3 y2 is an upper bound on the optimal
objective value of the linear program.
This discussion shows that as long as y1 and y2 satisfy the conditions y1 ≥ 0, y2 ≥ 0,
3 y1 + 4 y2 ≥ 5, 2 y1 + y2 ≥ 3 and y1 − y2 ≥ −1, the quantity 9 y1 + 3 y2 is an upper
bound on the optimal objective value of the linear program we want to solve. To obtain the
tightest possible upper bound on the optimal objective value, we can push the upper bound
9 y1 + 3 y2 as small as possible while making sure that the conditions imposed on y1 and y2
are satisfied. In other words, we can obtain the tightest possible upper bound on the optimal
objective value by solving the linear program
min 9 y1 + 3 y2
st 3 y1 + 4 y2 ≥ 5
2 y1 + y2 ≥ 3
y1 − y2 ≥ −1
y1 , y2 ≥ 0.
The optimal objective value of the linear program above is an upper bound on the optimal
objective value of the original linear program we want to solve. Furthermore, this linear
program is a minimization problem. Therefore, we can use the linear program above as the
upper bounding linear program as discussed in the previous section! In linear programming
vocabulary, we refer to the upper bounding linear program above as the dual problem. We
refer to the original linear program we want to solve as the primal problem.
max 5 x1 + 3 x2 − x3 min 9 y1 + 3 y2
st 3 x1 + 2 x2 + x3 ≤ 9 (y1 ) st 3 y1 + 4 y2 ≥ 5 (x1 )
4 x1 + x2 − x3 ≤ 3 (y2 ) 2 y1 + y2 ≥ 3 (x2 )
x1 , x2 , x3 ≥ 0, y1 − y2 ≥ −1 (x3 )
y1 , y2 ≥ 0.
By the discussion in the previous section, we can use the dual problem as the upper bounding
linear program when we solve the primal problem. Note that for each constraint in the primal
problem, we have a dual decision variable yi . For each decision variable xj in the primal
problem, we have a constraint in the dual problem. The objective coefficient of dual variable
yi in the dual problem is the same as the right side of the primal constraint corresponding
to variable yi . The right side of dual constraint corresponding to variable xj is the objective
coefficient of primal variable xj in the primal problem. The constraint coefficient of variable
yi in the dual constraint corresponding to variable xj is the same as the constraint coefficient
of variable xj in the primal constraint corresponding to variable yi .
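As a quick numerical check of this correspondence, the sketch below solves both the primal and the dual problems above with Gurobi's Python interface, assuming gurobipy is available; both optimal objective values should come out as 11.

from gurobipy import Model, GRB

primal = Model( "primal" )
x1 = primal.addVar( name = "x1" )
x2 = primal.addVar( name = "x2" )
x3 = primal.addVar( name = "x3" )
primal.update()
primal.setObjective( 5 * x1 + 3 * x2 - x3 , GRB.MAXIMIZE )
primal.addConstr( 3 * x1 + 2 * x2 + x3 <= 9 )
primal.addConstr( 4 * x1 + x2 - x3 <= 3 )
primal.optimize()

dual = Model( "dual" )
y1 = dual.addVar( name = "y1" )
y2 = dual.addVar( name = "y2" )
dual.update()
dual.setObjective( 9 * y1 + 3 * y2 , GRB.MINIMIZE )
dual.addConstr( 3 * y1 + 4 * y2 >= 5 )
dual.addConstr( 2 * y1 + y2 >= 3 )
dual.addConstr( y1 - y2 >= -1 )
dual.optimize()

print( primal.objVal , dual.objVal )   # both values should equal 11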
Using the slack variables w1 and w2 for the primal constraints and the slack variables z1 ,
z2 and z3 for the dual constraints, we also can write the primal and dual problems as
max 5 x1 + 3 x2 − x3 min 9 y1 + 3 y2
st 3 x1 + 2 x2 + x3 + w1 = 9 (y1 ) st 3 y1 + 4 y2 − z1 = 5 (x1 )
4 x1 + x2 − x3 + w 2 = 3 (y2 ) 2 y1 + y2 − z2 = 3 (x2 )
x1 , x2 , x3 , w1 , w2 ≥ 0, y1 − y2 − z3 = −1 (x3 )
y1 , y2 , z1 , z2 , z3 ≥ 0.
Recall that for each constraint in the primal problem, we have a dual decision variable
yi . Also, for each constraint in the primal problem, we have a primal slack variable wi . So,
each dual decision variable yi is associated with a primal slack variable wi . On the other
hand, for each decision variable xj in the primal problem, we have a constraint in the dual
problem. Also, for each constraint in the dual problem, we have a dual slack variable zj . So,
each primal decision variable xj is associated with a dual slack variable zj .
Another way to look at the relationship between the primal and dual problems is to write
these problems in matrix notation. We define the matrices and the vectors
c = (5, 3, −1)t ,   b = (9, 3)t ,   x = (x1 , x2 , x3 )t ,   y = (y1 , y2 )t ,
A = ( 3  2   1
      4  1  −1 ).
With these definitions, the primal and dual problems above can be written as
max ct x min bt y
st Ax ≤ b st At y ≥ c
x ≥ 0, y ≥ 0.
We can use the template above to write the dual problem corresponding to any general
primal problem. Consider a primal problem in the general form
max  Σ_{j=1}^{n} cj xj
st   Σ_{j=1}^{n} aij xj ≤ bi    ∀ i = 1, . . . , m
     xj ≥ 0    ∀ j = 1, . . . , n.
In matrix notation, the primal problem above is of the form max ct x subject to A x ≤ b and x ≥ 0, which implies
that the dual problem corresponding to this primal problem has the form min bt y subject
to At y ≥ c and y ≥ 0. We can write the last problem equivalently as
min  Σ_{i=1}^{m} bi yi
st   Σ_{i=1}^{m} aij yi ≥ cj    ∀ j = 1, . . . , n
     yi ≥ 0    ∀ i = 1, . . . , m.
Thus, in general form, a primal problem and its corresponding dual problem are given by
max  Σ_{j=1}^{n} cj xj                              min  Σ_{i=1}^{m} bi yi
st   Σ_{j=1}^{n} aij xj ≤ bi  ∀ i = 1, . . . , m    st   Σ_{i=1}^{m} aij yi ≥ cj  ∀ j = 1, . . . , n
     xj ≥ 0  ∀ j = 1, . . . , n,                         yi ≥ 0  ∀ i = 1, . . . , m.
max 5 x1 + 3 x2 − x3 min 9 y1 + 3 y2
st 3 x1 + 2 x2 + x3 ≤ 9 st 3 y1 + 4 y2 ≥ 5
4 x1 + x2 − x3 ≤ 3 2 y1 + y2 ≥ 3
x1 , x2 , x3 ≥ 0, y1 − y2 ≥ −1
y1 , y2 ≥ 0.
Let (x̂1 , x̂2 , x̂3 ) be a feasible solution to the primal problem and (ŷ1 , ŷ2 ) be a feasible solution
to the dual problem. Since (x̂1 , x̂2 , x̂3 ) is a feasible solution to the primal problem, we have
9 ≥ 3 x̂1 + 2 x̂2 + x̂3 and 3 ≥ 4 x̂1 + x̂2 − x̂3 . Also, ŷ1 ≥ 0 and ŷ2 ≥ 0, since (ŷ1 , ŷ2 ) is a
feasible solution to the dual problem. Therefore, we have
9 ŷ1 + 3 ŷ2 ≥ (3 x̂1 + 2 x̂2 + x̂3 ) ŷ1 + (4 x̂1 + x̂2 − x̂3 ) ŷ2 = (3 ŷ1 + 4 ŷ2 ) x̂1 + (2 ŷ1 + ŷ2 ) x̂2 + (ŷ1 − ŷ2 ) x̂3 .
On the other hand, since (ŷ1 , ŷ2 ) is a feasible solution to the dual problem, we have 3 ŷ1 +
4 ŷ2 ≥ 5, 2 ŷ1 + ŷ2 ≥ 3 and ŷ1 − ŷ2 ≥ −1. Also, x̂1 ≥ 0, x̂2 ≥ 0 and x̂3 ≥ 0, since (x̂1 , x̂2 , x̂3 )
is a feasible solution to the primal problem. In this case, we obtain
9 ŷ1 + 3 ŷ2 ≥ (3 ŷ1 + 4 ŷ2 ) x̂1 + (2 ŷ1 + ŷ2 ) x̂2 + (ŷ1 − ŷ2 ) x̂3 ≥ 5 x̂1 + 3 x̂2 − x̂3 .
So, we got 9 ŷ1 + 3 ŷ2 ≥ 5 x̂1 + 3 x̂2 − x̂3 , saying that the objective value of the dual problem
at the feasible dual solution (ŷ1 , ŷ2 ) is at least as large as the objective value of the primal
problem at the feasible primal solution (x̂1 , x̂2 , x̂3 ), which is exactly what weak duality says!
The moral of this story is that the objective value of the dual problem at any feasible
solution to the dual problem is at least as large as the objective value of the primal problem
at any feasible solution to the primal problem. This result is called weak duality. As
discussed in the next section, weak duality has an important implication that allows us to
check whether a pair of feasible solutions to the primal and dual problems are optimal to
their respective problems.
On the other hand, let (y1∗ , y2∗ ) be the optimal solution to the dual problem. Note that we
minimize the objective function in the dual problem. Thus, since (ŷ1 , ŷ2 ) is a feasible, but
not necessarily an optimal, solution to the dual problem, the objective value provided by the
solution (ŷ1 , ŷ2 ) for the dual problem cannot dip below the objective value provided by the
optimal solution (y1∗ , y2∗ ). So, we also have
9 ŷ1 + 3 ŷ2 ≥ 9 y1∗ + 3 y2∗ .
Lastly, since (x∗1 , x∗2 , x∗3 ) is optimal to the primal problem, it is also a feasible solution to the
primal problem. By the same reasoning, (y1∗ , y2∗ ) is a feasible solution to the dual problem.
Thus, since (x∗1 , x∗2 , x∗3 ) is a feasible solution to the primal problem and (y1∗ , y2∗ ) is a feasible
solution to the dual problem, by weak duality, (y1∗ , y2∗ ) provides an objective value for the
dual problem that is at least as large as the objective value provided by (x∗1 , x∗2 , x∗3 ) for the
primal problem. In other words, we have
9 y1∗ + 3 y2∗ ≥ 5 x∗1 + 3 x∗2 − x∗3 .
max 5 x1 + 3 x2 − x3 min 9 y1 + 3 y2
st 3 x1 + 2 x2 + x3 ≤ 9 st 3 y1 + 4 y2 ≥ 5
4 x1 + x2 − x3 ≤ 3 2 y1 + y2 ≥ 3
x1 , x2 , x3 ≥ 0, y1 − y2 ≥ −1
y1 , y2 ≥ 0.
For reference, we also write the versions of these linear programs with slack variables. Using
the slack variables w1 and w2 for the primal constraints and the slack variables z1 , z2 and z3
for the dual constraints, the linear programs above are equivalent to
max 5 x1 + 3 x2 − x3 min 9 y1 + 3 y2
st 3 x1 + 2 x2 + x3 + w1 = 9 st 3 y1 + 4 y2 − z1 = 5
4 x1 + x2 − x3 + w 2 = 3 2 y1 + y2 − z2 = 3
x1 , x2 , x3 , w1 , w2 ≥ 0, y1 − y2 − z3 = −1
y1 , y2 , z1 , z2 , z3 ≥ 0.
Consider solving the primal problem by using the simplex method. We start with the system
of equations
5 x1 + 3 x2 − x3 = ζ
3 x 1 + 2 x2 + x3 + w 1 = 9
4 x1 + x2 − x3 + w2 = 3.
We increase x1 up to min{9/3, 3/4} = 3/4. Thus, the entering variable is x1 and the leaving
variable is w2 . Doing the appropriate row operations, the next system of equations is
(7/4) x2 + (1/4) x3 − (5/4) w2 = ζ − 15/4
(5/4) x2 + (7/4) x3 + w1 − (3/4) w2 = 27/4
x1 + (1/4) x2 − (1/4) x3 + (1/4) w2 = 3/4.
Next, we increase x2 up to min{27/5, 3} = 3. Thus, the entering variable is x2 and the leaving variable is x1 . Doing the appropriate row operations, the next system of equations is
−7 x1 + 2 x3 − 3 w2 = ζ − 9
−5 x1 + 3 x3 + w1 − 2 w2 = 3
4 x1 + x2 − x3 + w2 = 3.
We increase x3 up to 3/3 = 1. In this case, the entering variable is x3 and the leaving
variable is w1 . Carrying out the necessary row operations gives
−(11/3) x1 − (2/3) w1 − (5/3) w2 = ζ − 11
−(5/3) x1 + x3 + (1/3) w1 − (2/3) w2 = 1
(7/3) x1 + x2 + (1/3) w1 + (1/3) w2 = 4.
We reached the optimal solution. The solution (x∗1 , x∗2 , x∗3 , w1∗ , w2∗ ) = (0, 4, 1, 0, 0) is optimal
for the primal linear program yielding the optimal objective value of 11.
Consider a possible solution (y1∗ , y2∗ , z1∗ , z2∗ , z3∗ ) to the dual problem constructed as
follows. The values of the dual variables y1 and y2 are set to the negative of the objective
function row coefficients of w1 and w2 in the final system of equations that the simplex
method obtains. The values of the dual slack variables z1 , z2 and z3 are set to the
negative of the objective function row coefficients of x1 , x2 and x3 . Therefore, the solution
(y1∗ , y2∗ , z1∗ , z2∗ , z3∗ ) is given by
y1∗ = 2/3,   y2∗ = 5/3,   z1∗ = 11/3,   z2∗ = 0,   z3∗ = 0.
Note that this solution satisfies
3 y1∗ + 4 y2∗ = 26/3 ≥ 5,    2 y1∗ + y2∗ = 3 ≥ 3,    y1∗ − y2∗ = −1 ≥ −1.
Thus, the solution (y1∗ , y2∗ ) is feasible to the dual problem and it provides the objective value
of 9 y1∗ + 3 y2∗ = 9 × (2/3) + 3 × (5/3) = 11 for the dual problem.
• First, the solution (y1∗ , y2∗ ) is feasible to the dual problem, satisfying all of the constraints
in the dual problem.
• Second, the solution (y1∗ , y2∗ ) provides an objective value of 11 for the dual problem,
which is the objective value provided by the solution (x∗1 , x∗2 , x∗3 ) for the primal problem.
• Third, the solution (y1∗ , y2∗ , z1∗ , z2∗ , z3∗ ) is feasible to the version of the dual problem with
slack variables.
Using the first two properties, we were able to conclude that the solution (y1∗ , y2∗ , z1∗ , z2∗ , z3∗ )
constructed by using the objective function row coefficients in the final iteration of the
simplex method is optimal to the dual problem. In this section, we understand why the
three properties above hold. The simplex method started with the system of equations
5 x1 + 3 x2 − x3 = ζ
3 x 1 + 2 x2 + x3 + w 1 = 9
4 x1 + x2 − x3 + w2 = 3.
and terminated with the system of equations
−(11/3) x1 − (2/3) w1 − (5/3) w2 = ζ − 11
−(5/3) x1 + x3 + (1/3) w1 − (2/3) w2 = 1
(7/3) x1 + x2 + (1/3) w1 + (1/3) w2 = 4.
The last system of equations is obtained by carrying out row operations starting from the
initial system of equations. Therefore, we have to be able to obtain the objective function
row in the last system of equations by multiplying the equations in the initial system of
equations with some constants and adding them up.
In the objective function row in the last system of equations, w1 appears with a coefficient
of −2/3. Thus, if we are to obtain the objective function row in the last system of equations
by multiplying the equations in the initial system of equations with some constants and
adding them up, then we must multiply the first constraint row in the initial system of
equations by −2/3 because w1 appears nowhere else in the initial system of equations
and there is no other way of having a coefficient of −2/3 for w1 in the final system of equations. By the same reasoning, we must multiply the second constraint row in the initial system of equations by −5/3, because w2 appears nowhere else in the initial system of equations. Thus, rearranging the terms in the equalities that result from matching the coefficients, if we let y1∗ and y2∗ be the negative of the objective function row coefficients of the primal slack variables w1 and w2 in the final system of equations, and z1∗ , z2∗ and z3∗ be the negative of the objective function row coefficients of x1 , x2 and x3 , then we obtain
3 y1∗ + 4 y2∗ − z1∗ = 5,    2 y1∗ + y2∗ − z2∗ = 3,    y1∗ − y2∗ − z3∗ = −1,    9 y1∗ + 3 y2∗ = 11.
The first three equalities above show that (y1∗ , y2∗ , z1∗ , z2∗ , z3∗ ) satisfies the constraints in the
version of the dual problem with slack variables. In addition, we note that the objective
function row coefficients are all non-positive in the final iteration of the simplex method. Since
the values of the decision variables (y1∗ , y2∗ , z1∗ , z2∗ , z3∗ ) are set to the negative of the objective
function row coefficients, they are all non-negative. Thus, the solution (y1∗ , y2∗ , z1∗ , z2∗ , z3∗ ) is
feasible to the version of the dual problem with slack variables, which establishes the third
property that we set out to prove at the beginning of this section. On the other hand, the
last equality above shows that the solution (y1∗ , y2∗ ) provides the objective value of 11 for
the dual problem, which is the objective value provided by the solution (x∗1 , x∗2 , x∗3 ) for the
primal problem, showing the second property at the beginning of this section. Finally, since
z1∗ ≥ 0, z2∗ ≥ 0 and z3∗ ≥ 0, the first three equalities above yield
3 y1∗ + 4 y2∗ ≥ 5,    2 y1∗ + y2∗ ≥ 3,    y1∗ − y2∗ ≥ −1.
Thus, the solution (y1∗ , y2∗ ) is feasible to the dual problem, which establishes the first property
at the beginning of this section.
As long as the simplex method terminates with an optimal solution, the objective function
row coefficients in the final iteration will be all non-positive. In this case, we can always
use the trick described in this chapter to construct an optimal solution to the dual problem
by using the negative objective function row coefficients in the final iteration of the simplex
method. The objective value provided by the solution that we obtain for the dual problem is
always the same as the objective value provided by the solution that we have for the primal
problem. We note that these observations will not hold when the simplex method does not
terminate with an optimal solution, which is the case when there is no feasible solution to
the problem or the problem is unbounded.
The moral of this story is the following. Consider a linear program with n decision
variables and m constraints. Assume that the simplex method terminates with an optimal
solution (x∗1 , . . . , x∗n , w1∗ , . . . , wm∗ ) providing the optimal objective value of ζ∗ for the primal problem. We construct a solution (y1∗ , . . . , ym∗ , z1∗ , . . . , zn∗ ) to the dual problem by letting yi∗ be the negative of the objective function row coefficient of wi in the final iteration of the simplex method and zj∗ be the negative of the objective function row coefficient of xj in the final iteration of the simplex method. Then, the solution (y1∗ , . . . , ym∗ , z1∗ , . . . , zn∗ ) is optimal for the dual problem and it provides the objective value ζ∗ for the dual problem as well.
max 5 x1 + 3 x2 − x3 min 9 y1 + 3 y2
st 3 x1 + 2 x2 + x3 + w1 = 9 st 3 y1 + 4 y2 − z1 = 5
4 x1 + x2 − x3 + w 2 = 3 2 y1 + y2 − z2 = 3
x1 , x2 , x3 , w1 , w2 ≥ 0, y1 − y2 − z3 = −1
y1 , y2 , z1 , z2 , z3 ≥ 0.
Assume that we have a solution (x̂1 , x̂2 , x̂3 , ŵ1 , ŵ2 ) to the primal problem and a solution
(ŷ1 , ŷ2 , ẑ1 , ẑ2 , ẑ3 ) to the dual problem satisfying the following three properties.
• The solution (x̂1 , x̂2 , x̂3 , ŵ1 , ŵ2 ) is feasible for the primal problem and the solution
(ŷ1 , ŷ2 , ẑ1 , ẑ2 , ẑ3 ) is feasible for the dual problem.
• We have x̂1 ẑ1 = 0, x̂2 ẑ2 = 0 and x̂3 ẑ3 = 0.
• We have ŷ1 ŵ1 = 0 and ŷ2 ŵ2 = 0.
It turns out that satisfying the three properties above ensures that the solution (x̂1 , x̂2 , x̂3 , ŵ1 , ŵ2 )
is optimal for the primal problem and the solution (ŷ1 , ŷ2 , ẑ1 , ẑ2 , ẑ3 ) is optimal for the
dual problem. To see this result, it is enough to show that the objective value provided
by the solution (x̂1 , x̂2 , x̂3 , ŵ1 , ŵ2 ) for the primal problem is equal to the objective value provided by the solution (ŷ1 , ŷ2 , ẑ1 , ẑ2 , ẑ3 ) for the dual problem. To see this, note that
5 x̂1 + 3 x̂2 − x̂3 = (3 ŷ1 + 4 ŷ2 − ẑ1 ) x̂1 + (2 ŷ1 + ŷ2 − ẑ2 ) x̂2 + (ŷ1 − ŷ2 − ẑ3 ) x̂3
= (3 x̂1 + 2 x̂2 + x̂3 ) ŷ1 + (4 x̂1 + x̂2 − x̂3 ) ŷ2 − x̂1 ẑ1 − x̂2 ẑ2 − x̂3 ẑ3
= (3 x̂1 + 2 x̂2 + x̂3 ) ŷ1 + (4 x̂1 + x̂2 − x̂3 ) ŷ2 + ŷ1 ŵ1 + ŷ2 ŵ2
= (3 x̂1 + 2 x̂2 + x̂3 + ŵ1 ) ŷ1 + (4 x̂1 + x̂2 − x̂3 + ŵ2 ) ŷ2
= 9 ŷ1 + 3 ŷ2 .
The first equality above uses the fact that (ŷ1 , ŷ2 , ẑ1 , ẑ2 , ẑ3 ) is feasible to the dual problem
because of the first property above, which is to say that 3 ŷ1 + 4 ŷ2 − ẑ1 = 5, 2 ŷ1 + ŷ2 − ẑ2 = 3
and ŷ1 − ŷ2 − ẑ3 = −1. The second equality follows by arranging the terms. The third
equality uses the fact that x̂1 ẑ1 = x̂2 ẑ2 = x̂3 ẑ3 = ŷ1 ŵ1 = ŷ2 ŵ2 = 0 by the second and third
properties above. The fourth equality can be obtained by rearranging the terms. The last
equality uses the fact that (x̂1 , x̂2 , x̂3 , ŵ1 , ŵ2 ) is feasible to the primal problem by the first
property above, which is to say that 3 x̂1 +2 x̂2 +x̂3 + ŵ1 = 9 and 4 x̂1 +x̂2 −x̂3 + ŵ2 = 3. So, the
chain of equalities shows that the solutions (x̂1 , x̂2 , x̂3 , ŵ1 , ŵ2 ) and (ŷ1 , ŷ2 , ẑ1 , ẑ2 , ẑ3 ) provide
the same objective value for their respective problems. By the discussion in the previous
paragraph, it must be the case that the solution (x̂1 , x̂2 , x̂3 , ŵ1 , ŵ2 ) is optimal for the primal
problem and the solution (ŷ1 , ŷ2 , ẑ1 , ẑ2 , ẑ3 ) is optimal for the dual problem.
The moral of this story is the following. Consider a linear program with n decision
variables and m constraints. Assume that we have a feasible solution (x̂1 , . . . , x̂n , ŵ1 , . . . , ŵm )
to the primal problem and a feasible solution (ŷ1 , . . . , ŷm , ẑ1 , . . . , ẑn ) to the dual problem. If
these solutions satisfy
x̂j × ẑj = 0 ∀ j = 1, . . . , n
ŷi × ŵi = 0 ∀ i = 1, . . . , m,
then the solution (x̂1 , . . . , x̂n , ŵ1 , . . . , ŵm ) must be optimal for the primal problem and the
solution (ŷ1 , . . . , ŷm , ẑ1 , . . . , ẑn ) must be optimal for the dual problem. This result is known as
complementary slackness. Note that the first equality above can be interpreted as whenever
x̂j takes a strictly positive value, ẑj is 0, and whenever ẑj takes a strictly positive value, x̂j
is 0. A similar interpretation holds for the second equality above.
As an example, let us check whether the solution (x∗1 , x∗2 , x∗3 , w1∗ , w2∗ ) = (0, 4, 1, 0, 0) is optimal for the primal problem above. By complementary slackness, we look for a feasible solution (y1∗ , y2∗ , z1∗ , z2∗ , z3∗ ) to the dual problem satisfying
x∗1 z1∗ = 0, x∗2 z2∗ = 0, x∗3 z3∗ = 0, y1∗ w1∗ = 0, y2∗ w2∗ = 0.
Since x∗1 = w1∗ = w2∗ = 0, we immediately have x∗1 z1∗ = 0, y1∗ w1∗ = 0 and y2∗ w2∗ = 0,
irrespective of the values of z1∗ , y1∗ and y2∗ . Since x∗2 = 4 and x∗3 = 1, to satisfy x∗2 z2∗ = 0 and
x∗3 z3∗ = 0, we must have z2∗ = 0 and z3∗ = 0. Also, since we want (y1∗ , y2∗ , z1∗ , z2∗ , z3∗ ) to be a
feasible solution to the dual problem, we must have
3 y1∗ + 4 y2∗ − z1∗ = 5,    2 y1∗ + y2∗ − z2∗ = 3,    y1∗ − y2∗ − z3∗ = −1.
Since we must have z2∗ = 0 and z3∗ = 0, the system of equations above is equivalent to
3 y1∗ + 4 y2∗ − z1∗ = 5,    2 y1∗ + y2∗ = 3,    y1∗ − y2∗ = −1.
The system of equations above has three unknowns and three equations. Solving this
system of equations, we obtain y1∗ = 2/3, y2∗ = 5/3 and z1∗ = 11/3. Thus, the
solution (x∗1 , x∗2 , x∗3 , w1∗ , w2∗ ) = (0, 4, 1, 0, 0) is feasible for the primal problem, the solution (y1∗ , y2∗ , z1∗ , z2∗ , z3∗ ) = (2/3, 5/3, 11/3, 0, 0) is feasible for the dual problem and these solutions satisfy x∗1 z1∗ = x∗2 z2∗ = x∗3 z3∗ = y1∗ w1∗ = y2∗ w2∗ = 0. In this case, by complementary slackness, it follows that the solution (x∗1 , x∗2 , x∗3 , w1∗ , w2∗ ) = (0, 4, 1, 0, 0) is optimal for the primal problem and the solution (y1∗ , y2∗ , z1∗ , z2∗ , z3∗ ) = (2/3, 5/3, 11/3, 0, 0) is optimal for the dual problem.
Although we will not prove it explicitly, it is possible to show that the converse of the statement in complementary slackness also holds. In particular, assume that we have a feasible solution (x∗1 , . . . , x∗n , w1∗ , . . . , wm∗ ) to the primal problem and a feasible solution (y1∗ , . . . , ym∗ , z1∗ , . . . , zn∗ ) to the dual problem. If these solutions are optimal for their respective problems, then it must be the case that x∗j × zj∗ = 0 for all j = 1, . . . , n and yi∗ × wi∗ = 0 for all i = 1, . . . , m.
We use z0 to denote the optimal objective value of the problem above, which corresponds to the optimal revenue we can obtain from yearly contracts in the current situation.
Assume that we can purchase additional disk space to enlarge our cloud computing business. We have a supplier that offers to sell us additional disk space at a cost of $5 per GB for each year of use. Should we be willing to consider this offer? To answer this question, we assume that we have 60000 + ε GB of disk space rather than 60000, where ε is a small amount. When we have 60000 + ε GB of disk space, we can compute the optimal revenue from
yearly contracts by solving the linear program
max 2400 xm + 3200 xs
st  100 xm + 40 xs ≤ 10000
    200 xm + 400 xs ≤ 60000 + ε
    xs ≤ 140
    xm , xs ≥ 0.
We use zε to denote the optimal objective value of the problem above, which corresponds to the optimal revenue we can obtain from yearly contracts when we have 60000 + ε GB of disk space. If we have zε − z0 ≥ 5 ε, then the increase in our yearly revenue with ε GB of extra disk space exceeds the cost of the ε GB of extra disk space we get from our supplier. Thus, we should be willing to consider the offer from our supplier, at least for a few GB of disk space. On the other hand, if we have zε − z0 < 5 ε, then the increase in our yearly revenue with ε GB of extra disk space is not worth the cost of the extra disk space. So, we should not consider the offer from our supplier. Note that zε − z0 corresponds to the change in the optimal objective value of the linear program when we increase the right side of the second constraint by a small amount ε.
The approach described above is a reasonable approach to assess the offer from our
supplier, but it requires solving two linear programs, one to compute z0 and one to compute zε . Perhaps solving two linear programs is not a big deal, but assume that an airline solves a
linear program to assess the optimal revenue that it can obtain when it operates a certain set
of flight legs with certain capacities on them. The airline wants to understand the revenue
improvement from increasing the capacity on each one of its flight legs. If there are L flight
legs in the network that the airline operates, then the airline may need to solve 1 + L linear
programs, where the first linear program corresponds to the current situation and each one
of the remaining L linear programs corresponds to the case where we increase the capacity on
each of the L flight legs by a small amount. If the airline network is large, then L can be
large and solving these linear programs can consume a lot of time.
A natural question is whether we can get away with solving a single linear program to
assess how much the optimal objective value of a linear program changes when we increase
the right side of a constraint by a small amount. To answer this question, consider the linear
program for the cloud computing example and its dual given by
max 2400 x1 + 3200 x2                    min 10000 y1 + 60000 y2 + 140 y3
st  100 x1 + 40 x2 ≤ 10000               st  100 y1 + 200 y2 ≥ 2400
    200 x1 + 400 x2 ≤ 60000                  40 y1 + 400 y2 + y3 ≥ 3200
    x2 ≤ 140                                 y1 , y2 , y3 ≥ 0.
    x1 , x2 ≥ 0,
Consider solving the primal problem by using the simplex method. We start with the system of equations
2400 x1 + 3200 x2 = z
100 x1 + 40 x2 + w1 = 10000
200 x1 + 400 x2 + w2 = 60000
x2 + w3 = 140.
We increase x2 up to min{10000/40, 60000/400, 140} = 140. So, the entering variable is x2 and the leaving variable is w3. The necessary row operations yield the system of equations
2400 x1 − 3200 w3 = z − 448000
100 x1 + w1 − 40 w3 = 4400
200 x1 + w2 − 400 w3 = 4000
x2 + w3 = 140.
We increase x1 up to min{4400/100, 4000/200} = 20. So, the entering variable is x1 and the
leaving variable is w2 . The necessary row operations yield the system of equations
−12 w2 + 1600 w3 = z − 496000
w1 − (1/2) w2 + 160 w3 = 2400
x1 + (1/200) w2 − 2 w3 = 20
x2 + w3 = 140.
We increase w3 up to min{2400/160, 140} = 15. In this case, the entering variable is w3 and
the leaving variable is w1 . Appropriate row operations give the system of equations
−10 w1 − 7 w2 = z − 520000
(1/160) w1 − (1/320) w2 + w3 = 15
x1 + (1/80) w1 − (1/800) w2 = 50
x2 − (1/160) w1 + (1/320) w2 = 125.
All coefficients in the objective function row are non-positive. Thus, we obtained the
optimal solution, which is given by (x∗1 , x∗2 ) = (50, 125). The optimal objective value is
520000. Furthermore, we know that if we define the solution (y1∗ , y2∗ , y3∗ ) such that y1∗ , y2∗ and
y3∗ are respectively the negative of the objective function row coefficients of w1 , w2 and w3 in
the final system of equations, then the solution (y1∗ , y2∗ , y3∗ ) is optimal to the dual. Therefore,
the solution (y1∗ , y2∗ , y3∗ ) with y1∗ = 10, y2∗ = 7 and y3∗ = 0 is optimal to the dual problem. We want to understand why y2∗ = 7 gives how much the optimal objective value of the primal problem changes when we increase the right side of the second constraint by a small amount ε.
Let us reflect on the row operations in the execution of the simplex method above. We
started with the system of equations
2400 x1 + 3200 x2 = z
100 x1 + 40 x2 + w1 = 10000
200 x1 + 400 x2 + w2 = 60000
x2 + w3 = 140.
and ended with the system of equations
−10 w1 − 7 w2 = z − 520000
(1/160) w1 − (1/320) w2 + w3 = 15
x1 + (1/80) w1 − (1/800) w2 = 50
x2 − (1/160) w1 + (1/320) w2 = 125.
Consider replacing all of the appearances of w2 in the two systems of equations above with
w2 − ε. Therefore, if we start with the system of equations
2400 x1 + 3200 x2 = z
100 x1 + 40 x2 + w1 = 10000
200 x1 + 400 x2 + w2 − ε = 60000
x2 + w3 = 140
and apply the same sequence of row operations on this system, then we would obtain the system of equations
−10 w1 − 7 (w2 − ε) = z − 520000
(1/160) w1 − (1/320) (w2 − ε) + w3 = 15
x1 + (1/80) w1 − (1/800) (w2 − ε) = 50
x2 − (1/160) w1 + (1/320) (w2 − ε) = 125.
Moving all of the terms that involve an ε to the right side, it follows that if we start with
the system of equations
2400 x1 + 3200 x2 = z
100 x1 + 40 x2 + w1 = 10000
200 x1 + 400 x2 + w2 = 60000 + ε
x2 + w3 = 140
and apply the same sequence of row operations that we did in the simplex method, then we would obtain the system of equations
−10 w1 − 7 w2 = z − 520000 − 7 ε
(1/160) w1 − (1/320) w2 + w3 = 15 − (1/320) ε
x1 + (1/80) w1 − (1/800) w2 = 50 − (1/800) ε
x2 − (1/160) w1 + (1/320) w2 = 125 + (1/320) ε.
Note that w2 and ε have the same coefficient in each equation above.
Now, consider solving the linear program after we increase the right side of the second
constraint by a small amount ε. The simplex method starts with the system of equations
2400 x1 + 3200 x2 = z
100 x1 + 40 x2 + w1 = 10000
200 x1 + 400 x2 + w2 = 60000 + ε
x2 + w3 = 140.
Starting from the system of equations above, let us apply the same sequence of row operations
in the earlier execution of the simplex method. These row operations may or may not exactly
be the ones followed by the simplex method when we solve the problem after we increase
the right side of the second constraint by ε. Nevertheless, applying these row operations is harmless in the sense that the set of solutions of a system of equations remains unchanged when we apply a sequence of row operations on it. If we apply these row operations, then by the just
preceding argument, we would obtain the system of equations
−10 w1 − 7 w2 = z − 520000 − 7 ε
(1/160) w1 − (1/320) w2 + w3 = 15 − (1/320) ε
x1 + (1/80) w1 − (1/800) w2 = 50 − (1/800) ε
x2 − (1/160) w1 + (1/320) w2 = 125 + (1/320) ε.
The objective function row coefficients are all non-positive in the system of equations above, which means that we reached the optimal solution. Thus, if we increase the right side of the second constraint by ε, then the optimal solution is given by (x∗1 , x∗2 ) = (50 − (1/800) ε, 125 + (1/320) ε) and the optimal objective value is 520000 + 7 ε. As long as ε is small, we have x∗1 = 50 − (1/800) ε ≥ 0 and x∗2 = 125 + (1/320) ε ≥ 0. Also, for small ε, observe that the optimal objective value changes by exactly 7 ε = y2∗ ε. Therefore, the optimal value y2∗ = 7 of the dual variable associated with the second constraint gives precisely how much the optimal objective value changes per unit increase in the right side of that constraint, without solving a second linear program.
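The sketch below, assuming gurobipy is available, illustrates this conclusion numerically: it solves the cloud computing linear program with 60000 GB of disk space and then with 60001 GB, and compares the change in the optimal objective value with the optimal dual variable of the storage constraint.

from gurobipy import Model, GRB

def solveCloud( storage ):
    # cloud computing linear program with a given storage capacity
    m = Model( "cloud" )
    m.setParam( "OutputFlag" , 0 )            # suppress solver output
    x1 = m.addVar( name = "x1" )              # memory-intensive contracts
    x2 = m.addVar( ub = 140 , name = "x2" )   # storage-intensive contracts
    m.update()
    m.setObjective( 2400 * x1 + 3200 * x2 , GRB.MAXIMIZE )
    m.addConstr( 100 * x1 + 40 * x2 <= 10000 )
    stoConst = m.addConstr( 200 * x1 + 400 * x2 <= storage )
    m.optimize()
    return m.objVal , stoConst.pi

base , dualSto = solveCloud( 60000 )
perturbed , _ = solveCloud( 60001 )
print( dualSto )              # expected to be 7
print( perturbed - base )     # expected to be 7 as well, matching y2* = 7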
           Dist. 1   Dist. 2   Dist. 3   Dist. 4   Dist. 5
Base 1        1         1         0         1         0
Base 2        1         0         1         0         1
Base 3        0         1         0         0         1
To formulate the problem as an integer program, we use two sets of decision variables. The
first set of decision variables captures whether we station an ambulance at each base. Thus,
for all i = 1, 2, 3, we define the decision variable
xi = 1 if we station an ambulance at base i, and xi = 0 otherwise.
The second set of decision variables captures whether a district is under coverage. Therefore,
for all j = 1, 2, 3, 4, 5, we define the decision variable
yj = 1 if we cover district j, and yj = 0 otherwise.
We have a logical relationship between the two types of decision variables. For example,
district 5 can be covered only from bases 2 and 3, which implies that if we do not have an
ambulance at bases 2 and 3, then district 5 is not covered. We can represent this relationship
by the constraint y5 ≤ x2 + x3 . Thus, if x2 = 0 and x3 = 0 so that we do not have an
ambulance at bases 2 and 3, then it must be the case that y5 = 0, which indicates that we
cannot cover district 5. Using similar logical relationships for the coverage of other districts,
we can figure out where to station the ambulances to maximize the total population under
coverage by solving the integer program
In the objective function, we add up the populations of the districts that we cover. The
first five constraints above ensure that if we do not have ambulances at any of the stations
that cover a district, then we do not cover the district. For example, consider the constraint
y5 ≤ x2 + x3 . District 5 can be covered only from bases 2 and 3. If x2 = 0 and x3 = 0 so that
there are no ambulances at bases 2 and 3, then the right side of the constraint y5 ≤ x2 + x3
is 0, which implies that we must have y5 = 0. Thus, we do not cover district 5. On the
other hand, if x2 = 1 or x3 = 1, then the right side of the constraint y5 ≤ x2 + x3 is
at least 1, which implies that we can have y5 = 1 or y5 = 0. Since we are maximizing the
objective function, the optimal solution would set y5 = 1, which implies that we cover district
5. The last constraint in the problem above ensures that since we have 2 ambulances, we can
station an ambulance at no more than 2 bases. We impose the requirement xi ∈ {0, 1} on
the decision variable xi for all i = 1, 2, 3. This requirement is equivalent to 0 ≤ xi ≤ 1 and
xi is an integer. The same argument holds for the decision variable yj . Thus, the problem above is an integer program.
If we satisfy goal j, then we make a reward of Rj . Thus, the data for the problem is
{aij : i = 1, . . . , m, j = 1, . . . , n} and {Rj : j = 1, . . . , n}. We want to figure out which
actions to take to maximize the reward from the satisfied goals while making sure that we
do not take more than k actions. To draw parallels with our previous example, action i
corresponds to stationing an ambulance at base i and goal j corresponds to covering district
j. In the data, aij indicates whether an ambulance at base i covers district j or not, whereas
Rj corresponds to the population of district j. To formulate the problem as an integer
program, we define the decision variables
xi = 1 if we take action i, and xi = 0 otherwise,
yj = 1 if we satisfy goal j, and yj = 0 otherwise.
In this case, we can formulate the problem as the integer program
max  Σ_{j=1}^{n} Rj yj
st   yj ≤ Σ_{i=1}^{m} aij xi    ∀ j = 1, . . . , n
     Σ_{i=1}^{m} xi ≤ k
     xi ∈ {0, 1}, yj ∈ {0, 1}    ∀ i = 1, . . . , m, j = 1, . . . , n.
Note that aij takes value 1 if action i satisfies goal j, and aij takes value 0 otherwise. Thus, the sum Σ_{i=1}^{m} aij xi on the right side of the first constraint corresponds to the number of actions that we take that satisfy goal j. If we do not take any actions that satisfy goal j, so that Σ_{i=1}^{m} aij xi = 0, then we must have yj = 0, indicating that we cannot satisfy goal j. The second constraint ensures that we take at most k actions.
second constraint ensures that we take at most k actions.
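A sketch of this integer program in Gurobi's Python interface is given below; the coverage matrix is taken from the ambulance example, while the rewards Rj and the limit k are illustrative placeholders rather than data from the text.

from gurobipy import Model, GRB, quicksum

# a[ i ][ j ] = 1 if action i satisfies goal j (ambulance at base i covers district j)
a = [ [ 1 , 1 , 0 , 1 , 0 ] ,
      [ 1 , 0 , 1 , 0 , 1 ] ,
      [ 0 , 1 , 0 , 0 , 1 ] ]
R = [ 30 , 20 , 40 , 10 , 25 ]   # illustrative rewards for satisfying each goal
k = 2                            # at most k actions

model = Model( "coverage" )
x = [ model.addVar( vtype = GRB.BINARY , name = "x" + str( i ) ) for i in range( 3 ) ]
y = [ model.addVar( vtype = GRB.BINARY , name = "y" + str( j ) ) for j in range( 5 ) ]
model.update()

model.setObjective( quicksum( R[ j ] * y[ j ] for j in range( 5 ) ), GRB.MAXIMIZE )
for j in range( 5 ):
    # goal j can be satisfied only if some chosen action satisfies it
    model.addConstr( y[ j ] <= quicksum( a[ i ][ j ] * x[ i ] for i in range( 3 ) ) )
model.addConstr( quicksum( x[ i ] for i in range( 3 ) ) <= k )
model.optimize()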
Facility 1 2 3 4
Fixed Charge 500 1200 800 500
Per Unit Cost 4 2 3 5
If xj = 0, which means that we do not use facility j for production, then we must have
yj = 0 as well, indicating that the amount produced at facility j must be 0. On the other
hand, if xj = 1, meaning that we use facility j for production, then yj is upper bounded by
the capacity at the production facility, which is 500. Thus, we can represent the relationship
between xj and yj by using the constraint yj ≤ 500 xj . In this case, we can figure out how
many units to produce at each facility by solving the integer program
st   yj ≤ Uj xj    ∀ j = 1, . . . , n
     Σ_{j=1}^{n} yj ≥ D
     xj ∈ {0, 1}, yj ≥ 0    ∀ j = 1, . . . , n.
If there is no production capacity at facility j, then we can replace the constraint yj ≤ Uj xj
with yj ≤ M xj for some large number M . In the problem above, we know that we will never
produce more than D units at a production facility. So, using M = D suffices.
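A sketch of the fixed-charge model in Gurobi's Python interface follows; the fixed charges, per-unit costs and the capacity of 500 come from the discussion above, while the total demand D is an illustrative placeholder.

from gurobipy import Model, GRB, quicksum

fixedCharge = [ 500 , 1200 , 800 , 500 ]
unitCost = [ 4 , 2 , 3 , 5 ]
capacity = [ 500 , 500 , 500 , 500 ]
D = 1200                          # illustrative total demand
n = len( fixedCharge )

m = Model( "fixedCharge" )
x = [ m.addVar( vtype = GRB.BINARY , name = "x" + str( j ) ) for j in range( n ) ]
y = [ m.addVar( lb = 0.0 , name = "y" + str( j ) ) for j in range( n ) ]
m.update()

m.setObjective( quicksum( fixedCharge[ j ] * x[ j ] + unitCost[ j ] * y[ j ]
                          for j in range( n ) ), GRB.MINIMIZE )
for j in range( n ):
    # production at facility j is allowed only if the facility is opened
    m.addConstr( y[ j ] <= capacity[ j ] * x[ j ] )
m.addConstr( quicksum( y[ j ] for j in range( n ) ) >= D )
m.optimize()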
Supplier 1 2 3 4
Price 5 6 3 7
Distance 450 700 800 200
Low Thresh. 10 15 5 10
High Thresh. 50 40 30 45
Consider the amount that we purchase from supplier 1. If x1 = 1, then the purchase
quantity from supplier 1 is smaller than the low threshold, which implies that we must have
y1 ≤ 10. On the other hand, if x1 = 0, then the purchase quantity from supplier 1 is larger
the high threshold, which implies that we must have y1 ≥ 50. To capture this relationship
between the decision variables x1 and y1 , we use the two constraints
y1 ≤ 10 + M (1 − x1 ) and y1 ≥ 50 − M x1 ,
where M is a large number. In this case, if x1 = 1, then the two constraints above take the
form y1 ≤ 10 and y1 ≥ 50 − M . Since M is a large number, 50 − M is a small number. Thus,
y1 ≥ 50 − M is always satisfied. Thus, if x1 = 1, then we must have y1 ≤ 10, as desired. On
the other hand, if x1 = 0, then the two constraints above take the form y1 ≤ 10 + M and
y1 ≥ 50. Since M is a large number, y1 ≤ 10 + M is always satisfied. Thus, if x1 = 0,
then we must have y1 ≥ 50, as desired. We can follow a similar reasoning to ensure that
the purchase quantities from the other suppliers are either smaller than the low threshold or
larger than the high threshold.
Another constraint we need to impose on our decisions is that the average distance that all of our purchases travel is no larger than 400 miles. The average distance traveled by all of our purchases is the distance-weighted purchase quantity divided by the total purchase quantity, which is (450 y1 + 700 y2 + 800 y3 + 200 y4) / (y1 + y2 + y3 + y4). Requiring this average to be no larger than 400 corresponds to the linear constraint 450 y1 + 700 y2 + 800 y3 + 200 y4 ≤ 400 (y1 + y2 + y3 + y4).
Putting the discussion so far together, we can figure out how many units to purchase from each supplier by solving the integer program
The objective function above accounts for the total cost of the purchases. The first eight
constraints ensure that the purchase quantity from each supplier should be either smaller
than the low threshold or larger than the high threshold. The last two constraints ensure
that the average distance traveled by our orders is no larger than 400 miles and the total
quantity we purchase is at least the desired amount of 100.
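For readers who want to experiment with the formulation just described, here is a sketch in Python, assuming PuLP. The supplier data come from the table above and M = 1000 follows the big-M reasoning; the objective of minimizing price times purchase quantity is my reading of "the total cost of the purchases".

from pulp import LpProblem, LpVariable, LpMinimize, LpBinary, lpSum, value

price = [5, 6, 3, 7]
dist = [450, 700, 800, 200]
low = [10, 15, 5, 10]
high = [50, 40, 30, 45]
M = 1000

prob = LpProblem("supplier_purchasing", LpMinimize)
x = [LpVariable(f"x_{s}", cat=LpBinary) for s in range(4)]
y = [LpVariable(f"y_{s}", lowBound=0) for s in range(4)]

prob += lpSum(price[s] * y[s] for s in range(4))
for s in range(4):
    prob += y[s] <= low[s] + M * (1 - x[s])    # if x_s = 1, stay at or below the low threshold
    prob += y[s] >= high[s] - M * x[s]         # if x_s = 0, stay at or above the high threshold
prob += lpSum(dist[s] * y[s] for s in range(4)) <= 400 * lpSum(y)   # average distance at most 400 miles
prob += lpSum(y) >= 100                                             # purchase at least 100 units in total

prob.solve()
print([v.value() for v in y], value(prob.objective))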
The integer program above involves either-or constraints. In particular, out of two
constraints, we need to satisfy either one constraint or the other but not necessarily
both. To describe a more general form of either-or constraints, consider a linear program
with n non-negative decision variables {yj : j = 1, . . . , n}. In the objective function of this linear program, the decision variable yj has the coefficient cj. Also, the linear program has m constraints, where constraint i is of the form Σ_{j=1}^{n} aij yj ≤ bi for i = 1, . . . , m. We want to impose the constraint that at least k out of the m constraints above are satisfied. To formulate this
problem as an integer program, in addition to the decision variables {yj : j = 1, . . . , n}, we
define the decision variables {xi : i = 1, . . . , m} such that
xi = 1 if the constraint Σ_{j=1}^{n} aij yj ≤ bi is satisfied, and xi = 0 otherwise.
In our integer program, letting M be a large number, we replace the constraint Σ_{j=1}^{n} aij yj ≤ bi with the constraint

Σ_{j=1}^{n} aij yj ≤ bi + M (1 − xi).
In this case, we can maximize the objective function Σ_{j=1}^{n} cj yj subject to the constraint that at least k out of the m constraints are satisfied by solving the integer program

max   Σ_{j=1}^{n} cj yj
st    Σ_{j=1}^{n} aij yj ≤ bi + M (1 − xi)    ∀ i = 1, . . . , m
      Σ_{i=1}^{m} xi ≥ k
      xi ∈ {0, 1}, yj ≥ 0    ∀ i = 1, . . . , m, j = 1, . . . , n.
In the integer program above, if xi = 1, then the constraint Σ_{j=1}^{n} aij yj ≤ bi + M (1 − xi) takes the form Σ_{j=1}^{n} aij yj ≤ bi. Thus, if xi = 1, then we ensure that the decision variables {yj : j = 1, . . . , n} satisfy the constraint Σ_{j=1}^{n} aij yj ≤ bi. On the other hand, if xi = 0, then the constraint takes the form Σ_{j=1}^{n} aij yj ≤ bi + M, which is a constraint that is always satisfied since bi + M is a large number. Thus, if xi = 0, we do not care whether the decision variables {yj : j = 1, . . . , n} satisfy the constraint Σ_{j=1}^{n} aij yj ≤ bi. The second constraint above imposes the condition that the decision variables {yj : j = 1, . . . , n} must satisfy at least k of the constraints Σ_{j=1}^{n} aij yj ≤ bi, i = 1, . . . , m.
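The same construction can be scripted generically. The sketch below, assuming PuLP, chooses which constraints to relax through the binary variables xi; the data c, A, b, k and M are illustrative.

from pulp import LpProblem, LpVariable, LpMaximize, LpBinary, lpSum, value

c = [3, 2]                      # objective coefficients c_j
A = [[1, 2], [3, 1], [1, 1]]    # coefficient rows of the m constraints
b = [4, 6, 3]
k, M = 2, 1000                  # satisfy at least k of the m constraints

prob = LpProblem("at_least_k_of_m", LpMaximize)
y = [LpVariable(f"y_{j}", lowBound=0) for j in range(len(c))]
x = [LpVariable(f"x_{i}", cat=LpBinary) for i in range(len(b))]

prob += lpSum(c[j] * y[j] for j in range(len(c)))
for i in range(len(b)):
    prob += lpSum(A[i][j] * y[j] for j in range(len(c))) <= b[i] + M * (1 - x[i])
prob += lpSum(x) >= k

prob.solve()
print([v.value() for v in y], value(prob.objective))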
[Figure: the cost of power generation at plants A and B as piecewise-linear functions of the power generated. Each cost function has three segments; the slopes shown are 2, 4 and 1 for plant A and 1, 5 and 1 for plant B, and the cost axis labels include the values 40, 70, 140, 170, 180 and 210.]
To understand the decision variables that we need, we focus on plant A and take a closer
look at the graph in the figure above that gives the cost of generation as a function of the
power generated. There are three segments in the horizontal axis of the graph and these
three segments are labeled as 1, 2 and 3. For each one of the segments i = 1, 2, 3, we define
the decision variable
xiA = 1 if the power generated at plant A utilizes segment i, and xiA = 0 otherwise.
For example, if we generate 45 units of power at plant A, then x1A = 1, x2A = 1 and x3A = 0.
Also, for each one of the segments i = 1, 2, 3, we define the decision variable yiA to be the amount of power generated at plant A by using segment i.
For each unit of power generated in segment 1, we incur an additional cost of $2. For each
unit of power generated in segment 2, we incur an additional cost of $4. Finally, for each unit
of power generated in segment 3, we incur an additional cost of $1. Therefore, we can write
the total cost of power generated at plant A as 2 y1A + 4 y2A + 1 y3A.
On the other hand, if x1A = 1, then we use segment 1 when generating power at plant A. In
this case, noting that the width of segment 1 is 35, we must have y1A ≤ 35. If x1A = 0,
then we do not use segment 1 when generating power at plant A. In this case, we must have
y1A = 0. To capture this relationship, we use the constraint y1A ≤ 35 x1A . Note that since
x1A ∈ {0, 1}, this constraint implies that we always have y1A ≤ 35. Furthermore, if x2A = 1,
then we use segment 2 when generating power at plant A, which means that we must have used segment 1 in its entirety. Therefore, if x2A = 1, then we must have y1A ≥ 35. To capture this relationship, we use the constraint y1A ≥ 35 x2A . Thus, the decision variable y1A is connected to the decision variables x1A and x2A through the constraints y1A ≤ 35 x1A and y1A ≥ 35 x2A .
By using the same argument for segment 2, the decision variable y2A is connected to the
decision variables x2A and x3A through the constraints
Lastly, if x3A = 1, then we use segment 3 when generating power at plant A so that y3A ≤
40. If x3A = 0, then we do not use segment 3 when generating power at plant A so that we
must have y3A = 0. To capture this relationship, we use the constraint
y3A ≤ 40 x3A .
We can use the same approach to capture the cost of power generation at plant B. For
each one of the segments i = 1, 2, 3, we define the decision variable
xiB = 1 if the power generated at plant B utilizes segment i, and xiB = 0 otherwise.
Also, for each one of the segments i = 1, 2, 3, we define the decision variable yiB to be the amount of power generated at plant B by using segment i.
In this case, the total amount of power generated at plant B is given by y1B + y2B + y3B ,
whereas the total cost of power generated at plant B is given by 1 y1B + 5 y2B + 1 y3B . By
using the same approach that we used for plant A, the decision variables y1B , y2B and y3B
are connected to the decision variables x1B , x2B and x3B through the constraints
y1B ≤ 40 x1B , y1B ≥ 40 x2B , y2B ≤ 20 x2B , y2B ≥ 20 x3B , y3B ≤ 40 x3B .
Collecting all of our discussion so far together, if we want to figure out how much power
to generate at each plant to generate a total of 100 units of power with minimum generation
cost, then we can solve the integer program
The integer program above provides an approach for dealing with single-dimensional
piecewise-linear objective functions in our optimization problems. Any single-dimensional
nonlinear function can be approximated arbitrarily well with a piecewise-linear
function. Therefore, by using the approach described in this section, we can use rather
accurate approximations of single-dimensional nonlinear functions as objective functions in
our optimization problems.
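As an illustration of this approach, here is a sketch of the two-plant generation problem in Python, assuming PuLP. The segment slopes, the plant B widths and the total requirement of 100 units come from the text; the middle segment width of 25 for plant A is inferred from the cost figure and should be treated as an assumption.

from pulp import LpProblem, LpVariable, LpMinimize, LpBinary, lpSum, value

width = {"A": [35, 25, 40], "B": [40, 20, 40]}   # segment widths (25 for plant A is inferred)
slope = {"A": [2, 4, 1], "B": [1, 5, 1]}         # per-unit cost in each segment

prob = LpProblem("piecewise_generation", LpMinimize)
x = {(p, i): LpVariable(f"x_{p}{i}", cat=LpBinary) for p in "AB" for i in range(3)}
y = {(p, i): LpVariable(f"y_{p}{i}", lowBound=0) for p in "AB" for i in range(3)}

prob += lpSum(slope[p][i] * y[p, i] for p in "AB" for i in range(3))
for p in "AB":
    for i in range(3):
        prob += y[p, i] <= width[p][i] * x[p, i]             # use segment i only if it is opened
        if i + 1 < 3:
            prob += y[p, i] >= width[p][i] * x[p, i + 1]     # fill segment i before opening segment i + 1
prob += lpSum(y.values()) >= 100                             # generate 100 units in total

prob.solve()
print({key: var.value() for key, var in y.items()}, value(prob.objective))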
max 5 x1 + 4 x2 + 4 x3 + 3 x4
st 2 x1 + 4 x2 + 3 x3 + 2 x4 ≤ 20
6 x1 + 5 x2 + 4 x3 + 5 x4 ≤ 25
x1 + x2 + x3 + x4 ≥ 5
x2 + 2 x 3 ≤ 7
x1 , x2 , x3 , x4 ≥ 0
x1 , x2 , x3 are integers.
Note that the decision variables x1 , x2 and x3 in the problem above are restricted to be
integers but the decision variable x4 can take fractional values. In the branch-and-bound
method, we start by solving the integer program above without paying attention to any of
the integrality requirements. In particular, we start by solving the problem
max 5 x1 + 4 x2 + 4 x3 + 3 x4
st 2 x1 + 4 x2 + 3 x3 + 2 x4 ≤ 20
6 x1 + 5 x2 + 4 x3 + 5 x4 ≤ 25
x1 + x2 + x3 + x4 ≥ 5
x2 + 2 x 3 ≤ 7
x1 , x2 , x3 , x4 ≥ 0.
The problem above is referred to as the linear programming relaxation of the integer program
we want to solve. Since there are no integrality requirements on the decision variables, we
can solve the problem above by using the simplex method. The optimal objective value of
the problem above is 23.167 with the optimal solution
x1 = 1.833, x2 = 0, x3 = 3.5, x4 = 0.
The solution above satisfies the first four constraints in the integer program because these
constraints are already included in the linear program that we just solved. However, the
solution above is not a feasible solution to the integer program that we want to solve because
the decision variables x1 and x3 take fractional values in the solution, whereas our integer
program imposes integrality constraints on these decision variables. We focus on one of
these decision variables, say x1 . We have x1 = 1.833 in the solution above. Note that in the
optimal solution to the integer program, we must have either x1 ≤ 1 or x1 ≥ 2. Thus, based
on the optimal solution of the linear program that we just solved, we consider two cases. The
first case focuses on x1 ≤ 1 and the second case focuses on x1 ≥ 2. These two cases yield two
linear programs to consider, where the first linear program imposes the additional constraint
x1 ≤ 1 and the second linear program imposes the additional constraint x1 ≥ 2. Thus, these
two linear programs are given by
max 5 x1 + 4 x2 + 4 x3 + 3 x4 max 5 x1 + 4 x2 + 4 x3 + 3 x4
st 2 x1 + 4 x2 + 3 x3 + 2 x4 ≤ 20 st 2 x1 + 4 x2 + 3 x3 + 2 x4 ≤ 20
6 x1 + 5 x2 + 4 x3 + 5 x4 ≤ 25 6 x1 + 5 x2 + 4 x3 + 5 x4 ≤ 25
x1 + x2 + x3 + x4 ≥ 5 x1 + x2 + x3 + x4 ≥ 5
x2 + 2 x 3 ≤ 7 x2 + 2 x 3 ≤ 7
x1 ≤ 1 x1 ≥ 2
x1 , x2 , x3 , x4 ≥ 0 x1 , x2 , x3 , x4 ≥ 0.
An important observation is that the optimal solution to either of the two linear programs
above will necessarily be different from the optimal solution to the linear program that we
just solved because noting the constraints x1 ≤ 1 and x1 ≥ 2 in the two linear programs
above, having x1 = 1.833 in a solution would be infeasible to either of the two linear
programs. Solving the linear program on the left above, the optimal objective value is 22.333
with the optimal solution
x1 = 1, x2 = 1.667, x3 = 2.667, x4 = 0.
We summarize our progress so far in the figure below. We started with the linear
programming relaxation to the original integer program that we want to solve. This linear
programming relaxation corresponds to node 0 in the figure. The optimal solution to
the linear program at node 0 is (x1 , x2 , x3 , x4 ) = (1.833, 0, 3.5, 0) with the objective value
23.167. Observe how we display this solution and the objective value at node 0 in the figure
below. Examining this solution, since the integer decision variable x1 takes the fractional
value 1.833 in the solution, we branch into two cases, x1 ≤ 1 and x1 ≥ 2. Branching into
these two cases gives us the linear programs at nodes 1 and 2 in the figure. The linear
program at node 1 includes all of the constraints in the linear program at node 0, along with
the constraint x1 ≤ 1. The linear program at node 2 includes all of the constraints in the
linear program at node 0, along with the constraint x1 ≥ 2. Solving the linear program at
[Figure: branch-and-bound tree. Node 0: Obj. = 23.167, (1.833, 0, 3.5, 0); branching on x1 ≤ 1 gives node 1 with Obj. = 22.333 and solution (1, 1.667, 2.667, 0), and branching on x1 ≥ 2 gives node 2, which is not yet solved.]
The solution (x1 , x2 , x3 , x4 ) = (1, 1.667, 2.667, 0) provided by the linear program at node
1 is not feasible to the integer program we want to solve because the decision variables x2
and x3 take fractional values in this solution, but our integer program imposes integrality
constraints on these decision variables. We choose one of these decision variables, say x2 . We
have x2 = 1.667 in the solution at node 1, but in the optimal solution to the integer program,
we must have either x2 ≤ 1 or x2 ≥ 2. Thus, at node 1, we branch into two cases, x2 ≤ 1
and x2 ≥ 2. Branching into these two cases at node 1 gives us the linear programs at nodes 3
and 4 shown in the figure below. The linear program at node 3 includes all of the constraints
in the linear program at node 1, plus the constraint x2 ≤ 1. The linear program at node 4
includes all of the constraints in the linear program at node 1, plus the constraint x2 ≥ 2. In
other words, the linear program at node 3 includes all of the constraints in the linear program
at node 0, along with the constraints x1 ≤ 1 and x2 ≤ 1. The linear program at node 4
includes all of the constraints in the linear program at node 0, along with the constraints
x1 ≤ 1 and x2 ≥ 2.
[Figure: branch-and-bound tree after branching at node 1. Node 0: Obj. = 23.167, (1.833, 0, 3.5, 0); node 1 (x1 ≤ 1): Obj. = 22.333, (1, 1.667, 2.667, 0); node 2 (x1 ≥ 2) is not yet solved; nodes 3 (x2 ≤ 1) and 4 (x2 ≥ 2) are the children of node 1.]
If node i lies immediately below node j, then we say that node i is a child of node j. If node j lies immediately above node i, then we say that node j is the parent of node i. For example, node 3 and node 4 in the figure above are the children of node 1 and node 1 is the parent of nodes 3 and 4. Since the linear program at a child node includes all of the constraints in the linear program at its parent node along with one additional constraint, the optimal objective value of the linear program at a child node can be no better than the optimal objective value of the linear program at its parent node.

We proceed by solving the linear program at node 3. The optimal objective value of the linear program at node 3 is 22.2 with the optimal solution
x1 = 1, x2 = 1, x3 = 3, x4 = 0.4.
The decision variables x1 , x2 and x3 take integer values in this solution. So, this solution is
feasible to the integer program we want to solve. Thus, we obtained a feasible solution to
the integer program providing an objective value of 22.2. There is no need to explore any
child nodes of node 3 further, since by the argument in the previous paragraph, the linear
programs at the children of node 3 will give us objective values that are no better than 22.2
and we already have a solution to the integer program that provides an objective value of
22.2. Therefore, we can stop exploring the tree further below node 3. The best feasible
solution we found so far for the integer program provides an objective value of 22.2.
The linear programs at nodes 2 and 4 are yet unsolved. Following the depth-first strategy,
we solve the linear program at node 4. Solving the linear program at node 4, we obtain the
optimal objective value of 22.167 and the optimal solution is
x1 = 0.833, x2 = 2, x3 = 2.5, x4 = 0.
The solution above is not feasible to the integer program we want to solve, because x1 and
x3 take fractional values in this solution, whereas the integer program we want to solve
requires these decision variables to be integer. However, the key observation is that the
optimal objective value of the linear program at node 4 is 22.167. We know that if we
explore the tree further below node 4, then the linear programs at the children of node 4
will give us objective values that are no better than 22.167. On the other hand, we already have a feasible solution to the integer program with an objective value of 22.2, which is better than 22.167. Thus, there is no need to explore the tree further below node 4, and we can stop the search at node 4.
[Figure: branch-and-bound tree after solving nodes 3 and 4. Node 3: Obj. = 22.2, (1, 1, 3, 0.4), stop; node 4: Obj. = 22.167, (0.833, 2, 2.5, 0), stop; node 2 is still unsolved.]
The moral of the discussion in this section is that we can stop the search at the current
node for two reasons. First, if the linear program at the current node provides a feasible
solution to the integer program we want to solve, satisfying all integrality requirements, then
we can stop the search at the current node. Second, as our search proceeds, we keep the
best feasible solution to the integer program we have found so far. If the optimal objective
value of the linear program at the current node is worse than the objective value provided
by the best feasible solution we have found so far, then we can stop the search at the current
node. Recall that the best feasible solution we have found so far for the integer program
provides an objective value of 22.2. In the figure above, only the linear program at node 2
is unsolved. We explore node 2 and its children in the next section.
Solving the linear program at node 2, the optimal objective value turns out to be 23 with the optimal solution
x1 = 2, x2 = 0, x3 = 3.25, x4 = 0.
This solution is not feasible to the integer program we want to solve because the integer decision variable x3 takes the fractional value 3.25, and the optimal objective value of 23 is not worse than the objective value of 22.2 provided by the best feasible solution we have found so far, so we cannot stop the search at node 2. Branching into the two cases x3 ≤ 3 and x3 ≥ 4 gives the linear programs at nodes 5 and 6.
[Figure: branch-and-bound tree after solving node 2. Node 0: Obj. = 23.167, (1.833, 0, 3.5, 0); node 1: Obj. = 22.333, (1, 1.667, 2.667, 0); node 2: Obj. = 23, (2, 0, 3.25, 0).]
Following the depth-first strategy, we need to solve the linear program either at node 5
or at node 6. Breaking the tie arbitrarily, we proceed to solving the linear program at node
5. The optimal solution to the linear program at node 5 is
x1 = 2.167, x2 = 0, x3 = 3, x4 = 0
with the corresponding optimal objective value 22.833. This solution does not satisfy the integrality requirements in the integer program we want to solve. Furthermore, the optimal objective value of 22.833 is not worse than the objective value of 22.2 provided by the best feasible solution we have found so far, so we cannot stop the search at node 5. Since the integer decision variable x1 takes the fractional value 2.167 in this solution, we branch into the two cases x1 ≤ 2 and x1 ≥ 3, which gives the linear programs at nodes 7 and 8 in the figure below.
[Figure: branch-and-bound tree after branching at node 5 into node 7 (x1 ≤ 2) and node 8 (x1 ≥ 3); nodes 7 and 8 are not yet solved.]
Now, the linear programs at nodes 6, 7 and 8 are yet unsolved. Following the depth-first
strategy, we need to solve the linear program either at node 7 or node 8. Breaking the tie
arbitrarily, we choose to solve the linear program at node 8. Note that the linear program
at node 8 includes all of the constraints in the linear program at node 5, along with the
constraint x1 ≥ 3. Solving the linear program at node 8, we find out that this linear
program is infeasible. Since the linear programs at the children of node 8 will include all of
the constraints in the linear program at node 8, the linear programs at the children of node
8 will also be infeasible. Thus, we can stop searching the tree further below node 8.
At the end of the previous section, we discussed two reasons for stopping the search at a particular node. First, if the linear program at the current node provides a solution that satisfies all of the integrality requirements in the integer program we want to solve, then we can stop the search at the current node. Second, if the optimal objective value of the linear program at the current node is worse than the objective value provided by the best feasible solution we have found so far, then we can stop the search at the current node. The linear program at node 8 points to a third reason: if the linear program at the current node is infeasible, then we can stop the search at the current node.
[Figure: branch-and-bound tree after solving node 8; node 8 is infeasible, so the search stops below it, and node 7 is not yet solved.]
Next, we solve the linear program at node 7. The optimal solution to the linear program at node 7 is
x1 = 2, x2 = 0.2, x3 = 3, x4 = 0
with the corresponding optimal objective value 22.8. The solution above does not satisfy the
integrality requirements in the integer program we want to solve. Also, the optimal objective
value of the linear program at node 7 is not worse than the objective value provided by the
best feasible solution to the integer program we have found so far. So, we have no reason
to stop the search at node 7. The decision variable x2 needs to take an integer value in the
integer program we want to solve, but we have x2 = 0.2 in the solution to the linear program
at node 7. Based on the solution of the linear program at node 7, we branch into the cases
x2 ≤ 0 and x2 ≥ 1. Branching into these cases yields the linear programs at nodes 9 and
10 shown in the figure below. The linear program at node 9 includes all of the constraints in the linear program at node 7, along with the constraint x2 ≤ 0, whereas the linear program at node 10 includes all of the constraints in the linear program at node 7, along with the constraint x2 ≥ 1.
[Figure: branch-and-bound tree after solving node 7. Node 7: Obj. = 22.8, (2, 0.2, 3, 0); node 8: infeasible, stop; branching at node 7 on x2 ≤ 0 and x2 ≥ 1 gives nodes 9 and 10.]
Now, the linear programs at nodes 6, 9 and 10 are unsolved. By the depth-first strategy,
we solve the linear program at node 9 or node 10. Breaking the tie arbitrarily, we solve the
linear program at node 9. The optimal solution to the linear program at node 9 is
x1 = 2, x2 = 0, x3 = 3, x4 = 0.2
with the corresponding optimal objective value 22.6. This solution satisfies all of the
integrality requirements in the integer program we want to solve. So, we do not need to
explore the children of node 9. The solution provided by the linear program at node 9 is a
feasible solution to the integer program we want to solve. Before node 9, the best feasible
solution we had for the integer program provided an objective value of 22.2. However, the
solution that we obtained at node 9 is feasible to the integer program we want to solve and
it provides an objective value of 22.6. Thus, we update the best feasible solution we have
found so far as the solution obtained at node 9.
At this point, the linear programs at nodes 6 and 10 are unsolved. Following the
depth-first strategy, we solve the linear program at node 10. The optimal objective value of
this linear program is 22 and the optimal solution is
x1 = 2, x2 = 1, x3 = 2, x4 = 0.
[Figure: the branch-and-bound tree at the end of the search. Node 9: Obj. = 22.6, (2, 0, 3, 0.2), stop; node 10: Obj. = 22, (2, 1, 2, 0), stop; node 7: Obj. = 22.8, (2, 0.2, 3, 0); node 8: infeasible, stop.]
There are no unsolved linear programs in the figure above. So, our search is complete! The
best feasible solution to the integer program is the solution that we obtained at node
9. Therefore, we can conclude that the solution (x1 , x2 , x3 , x4 ) = (2, 0, 3, 0.2) is optimal
to the integer program we want to solve.
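The search we carried out by hand can be automated. Below is a compact branch-and-bound sketch for the same example, assuming SciPy is available; linprog minimizes, so the maximization objective is negated, and the code is an illustration rather than the book's implementation.

import math
from scipy.optimize import linprog

c = [-5, -4, -4, -3]                           # negate to maximize 5x1 + 4x2 + 4x3 + 3x4
A = [[2, 4, 3, 2], [6, 5, 4, 5], [-1, -1, -1, -1], [0, 1, 2, 0]]
b = [20, 25, -5, 7]                            # the >= 5 constraint is negated into a <= constraint
integer_vars = [0, 1, 2]                       # x1, x2, x3 must be integer

best = {"obj": -math.inf, "sol": None}         # best feasible solution found so far

def branch_and_bound(bounds):
    res = linprog(c, A_ub=A, b_ub=b, bounds=bounds, method="highs")
    if not res.success:                        # infeasible node: stop
        return
    obj = -res.fun
    if obj <= best["obj"]:                     # bound no better than the incumbent: stop
        return
    frac = [j for j in integer_vars if abs(res.x[j] - round(res.x[j])) > 1e-6]
    if not frac:                               # all integrality requirements satisfied
        best["obj"], best["sol"] = obj, res.x
        return
    j = frac[0]                                # branch on the first fractional integer variable
    lo, hi = bounds[j]
    branch_and_bound(bounds[:j] + [(lo, math.floor(res.x[j]))] + bounds[j + 1:])
    branch_and_bound(bounds[:j] + [(math.ceil(res.x[j]), hi)] + bounds[j + 1:])

branch_and_bound([(0, None)] * 4)
print(best["obj"], best["sol"])

With the data above, the sketch recovers the same optimal value of 22.6 and the solution (2, 0, 3, 0.2).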
To summarize, we can stop the search at the current node for one of three reasons.
• The solution to the linear program at the current node provides a feasible solution to the integer program we want to solve, satisfying all of the integrality requirements.
• The linear program at the current node is infeasible.
• The optimal objective value of the linear program at the current node is worse than
the objective value provided by the best feasible solution to the integer program we
have found so far.
If none of the three reasons above hold and we cannot stop the search at the current node,
then we branch into two cases, yielding two more linear programs to solve. The second reason
above is critical to the success of the branch-and-bound method. In particular, if we have a
good feasible solution to the integer program on hand, then the optimal objective value of
the linear program at the current node is more likely to be worse than the objective value
provided by the feasible solution we have on hand. Thus, we can immediately terminate
the search at the current node. The good feasible solution to the integer program we have
on hand could either be obtained during the course of the search in the branch-and-bound
method or be obtained by using a separate heuristic solution algorithm.
Throughout this chapter, we used the depth-first strategy when selecting the next linear
program to solve. The advantage of the depth-first strategy is that it allows us to obtain
a feasible solution to the integer program quickly. In particular, the nodes towards the beginning of the tree do not have many constraints added to them. Thus, they are less likely to provide feasible solutions satisfying the integrality requirements in the integer program we want to solve. On the other hand, the nodes towards the bottom of the tree have many constraints added to them and they are likely to provide solutions that satisfy the integrality requirements. Since the depth-first strategy moves toward the bottom of the tree quickly, it tends to reach such nodes early in the search. As discussed in the previous paragraph, having a good feasible solution on
hand is critical to the success of the branch-and-bound method. Another approach for
selecting the next linear program to solve is to focus on the node that includes the linear
program with the largest optimal objective value and solve the linear program corresponding
to one of its children.
After solving the linear program at a particular node, there may be several variables that
violate the integrality requirements of the integer program we are interested in solving. In
this case, we can use any one of these decision variables to branch on. For example, if the
decision variables x1 and x2 are restricted to be integers, but we have x1 = 2.5 and x2 = 4.7 in the optimal solution to the linear program at the current node, then we have two options for the
decision variable to branch on. First, we can branch on the decision variable x1 and use the
two cases x1 ≤ 2 and x1 ≥ 3 to construct the child nodes of the current node. Second, we can
branch on the decision variable x2 and use the two cases x2 ≤ 4 and x2 ≥ 5 to construct the
child nodes of the current node. The choice of a good variable to branch on is hard to figure out a priori, but choosing a good variable to branch on may have a dramatic impact on the size of the search tree. A general rule of thumb is that if there is some hierarchical ordering
Note that if xj = 0, meaning that we do not have a facility at location j, then we cannot
serve demand point i from a facility at location j, meaning that we must have yij = 0. To
capture this relationship between the decision variables xj and yij , we use the constraint
yij ≤ xj . Thus, to choose the locations for facilities and to decide which facilities to use to
serve each demand point, we can solve the integer program
The objective function accounts for the total cost of opening the facilities and serving the
demand points. Noting the definition of the decision variable yij above, Σ_{j∈F} yij in the
first constraint corresponds to the number of facilities that serve demand point i. Thus,
the first constraint ensures that each demand point i is served by one facility. The second
constraint ensures that if we do not have a facility at location j, then we cannot use a facility
at location j to serve demand point i. The problem above is known as the uncapacitated
facility location problem. In particular, our formulation assumes that as long as we have
a facility at a certain location, we can serve as many demand points as we like from that
location. So, our formulation of the facility location problem assumes that there is infinite
capacity at the facilities. That is, the facilities are uncapacitated.
There is a capacitated version of the facility location problem. The setup for the
capacitated facility location is the same as before. The only difference is that demand point i
has a demand of di units. The total demand served by any facility cannot exceed U . Similar
to our formulation of the uncapacitated facility location problem, we continue assuming
that each demand point is served by one facility. We want to figure out the locations for
facilities and the facilities used to serve each demand point, while making sure that the total
demand served by a facility does not exceed the capacity at the facility. This problem can
be formulated as the integer program
The objective function and the first constraint are identical in the uncapacitated and
capacitated facility location problems. If we have xj = 0, then the second constraint above
reads Σ_{i∈D} di yij ≤ 0. To satisfy this constraint, we must set yij = 0 for all i ∈ D. Thus, if we have xj = 0, meaning that we do not have a facility at location j, then we must have yij = 0 for all i ∈ D, meaning that no demand point can be served from a facility at location j. If we have xj = 1, then the second constraint above reads Σ_{i∈D} di yij ≤ U. Note that Σ_{i∈D} di yij is the total demand at the demand points served by the facility at location j. Thus, if we have xj = 1, meaning that we have a facility at location j, then we must have Σ_{i∈D} di yij ≤ U, meaning that the total demand at the demand points served by the facility
at location j must be no larger than the capacity of the facility.
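A sketch of the capacitated formulation in Python, assuming PuLP, may help in experimenting with the capacity constraint; the opening costs, service costs, demands and capacity below are illustrative.

from pulp import LpProblem, LpVariable, LpMinimize, LpBinary, lpSum, value

D = range(5)                     # demand points
F = range(3)                     # candidate facility locations
fixed = [30, 25, 40]             # cost of opening a facility at location j
c = [[4, 7, 9], [5, 3, 8], [6, 4, 2], [9, 6, 3], [3, 8, 5]]   # c[i][j]: cost of serving i from j
d = [10, 20, 15, 10, 25]         # demand at demand point i
U = 50                           # capacity of a facility

prob = LpProblem("capacitated_facility_location", LpMinimize)
x = {j: LpVariable(f"x_{j}", cat=LpBinary) for j in F}
y = {(i, j): LpVariable(f"y_{i}_{j}", cat=LpBinary) for i in D for j in F}

prob += lpSum(fixed[j] * x[j] for j in F) + lpSum(c[i][j] * y[i, j] for i in D for j in F)
for i in D:
    prob += lpSum(y[i, j] for j in F) == 1                   # each demand point is served by one facility
for j in F:
    prob += lpSum(d[i] * y[i, j] for i in D) <= U * x[j]     # serve from j only if open, respecting capacity

prob.solve()
print([x[j].value() for j in F], value(prob.objective))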
In our formulation of the capacitated facility location problem above, we could add the
constraints yij ≤ xj for all i ∈ D, j ∈ F . These constraints would be redundant because the second constraint already forces yij = 0 for all i ∈ D whenever xj = 0, while yij can never exceed 1 when xj = 1.
To decide which loads to carry during the course of T days, we can solve the problem
[Figure: time-space network for the dynamic driver assignment problem. Each node (i, t) represents location i on day t; for example, node (C, 1) has a supply of sC drivers, and arcs such as those carrying the flows xAC2, xCB1 and xCC3 connect nodes on consecutive days.]
Noting the discussion in the previous paragraph, the decision variable xijt corresponds to
the flow on an arc that goes from node (i, t) to (j, t+1). The decision variable zit corresponds
to the flow on an arc that goes from node (i, t) to (i, t + 1). Thus, the total flow out of node
(i, t) is Σ_{j∈N} xijt + zit. On the other hand, the decision variable xji,t−1 corresponds to the flow on an arc that goes from node (j, t − 1) to (i, t). Similarly, the decision variable zi,t−1 corresponds to the flow on an arc that goes from node (i, t − 1) to (i, t). Thus, the total flow into node (i, t) is Σ_{j∈N} xji,t−1 + zi,t−1. Therefore, the second set of constraints in the dynamic
driver assignment problem captures the flow balance constraints for the node (i, t) for all
i ∈ N and t = 2, . . . , T . The node (i, 1) does not have any incoming arcs, but the node (i, 1)
has a supply of si units, which is the number of drivers available at node i at the beginning of
day 1. Thus, the first set of constraints in the dynamic driver assignment problem captures
the flow balance constraints for the node (i, 1) for all i ∈ N . We have an upper bound of dijt
on the flow over the arc corresponding to the decision variable xijt . Our formulation of the
dynamic driver assignment problem does not include a flow balance constraint for the sink
node in the figure above, but we know that in a min-cost network flow problem, the flow
balance constraint of one node is always redundant. Thus, our formulation of the dynamic
driver assignment problem omits the flow balance constraint for the sink node. Lastly, the
dynamic driver assignment problem maximizes its objective function rather than minimizing
as in a min-cost network flow problem, but we can always minimize the negative of the
objective function in the dynamic driver assignment problem. Thus, the dynamic driver
assignment problem corresponds to a min-cost network flow problem over the network shown
above with upper bounds on the flows over some of the arcs.
Recall that if all of the demand and supply data in a min-cost network flow problem
are integer-valued, then there exists an integer-valued optimal solution even when we do not
impose integrality requirements on the decision variables. It turns out this result continues to
hold when we have upper bounds on the flows over some of the arcs and these upper bounds
are also integer-valued. Therefore, it follows that if the numbers of drivers at different
locations at the beginning of day 1 are integers and the numbers of loads that need to be carried between the locations on each day are integers, then the dynamic driver assignment problem has an optimal solution in which all of the decision variables take integer values.
In an acceptable tour, we must depart each city i exactly once. In other words, we must use
exactly one of the arcs that go out of each city i. We can represent this requirement by using
the constraint Σ_{j∈N} xij = 1. Similarly, we must enter each city i exactly once. So, we must use exactly one of the arcs that go into each city i. This requirement can be represented by using the constraint Σ_{j∈N} xji = 1. In this case, we can formulate the traveling salesman
problem as the integer program
3"
3"
1"
1"
4"
4" 6"
6" 2"
2"
7"
7" 5"
5"
We have one subtour elimination constraint for each subset of the cities. Thus, if there are n cities, then there are 2^n subtour elimination constraints, which can easily get large. With
this many constraints, our formulation of the traveling salesman problem appears to be
useless! The trick to using our formulation is to add the subtour elimination constraints as
needed. To illustrate the idea, consider the 10 cities on the left side of the figure below. On
the right side, we show the distance from city i to city j for all i, j ∈ N .
We begin by solving the formulation of the traveling salesman problem without any
subtour elimination constraints. In particular, we minimize the objective function in the
traveling salesman problem subject to the constraints Σ_{j∈N} xij = 1 for all i ∈ N and Σ_{j∈N} xji = 1 for all i ∈ N only. The figure below shows the optimal solution that we
obtain when we solve the formulation of the traveling salesman problem without any subtour
elimination constraints. In particular, we have x13 = x31 = x24 = x48 = x85 = x52 = x67 =
x76 = x9,10 = x10,9 = 1 in the optimal solution and the other decision variables are zero.
In the solution in the figure above, we have a subtour that includes the set of cities
S = {1, 3}. That is, the solution above does not include an arc that connects a city in
S = {1, 3} directly to a city in N \ S = {2, 4, 5, 6, 7, 8, 9, 10}. Thus, we add the subtour
elimination constraint corresponding to the set S = {1, 3} into our formulation. Note that
this subtour elimination constraint is given by
x12 + x14 + x15 + x16 + x17 + x18 + x19 + x1,10 + x32 + x34 + x35 + x36 + x37 + x38 + x39 + x3,10 ≥ 1.
In the constraint above, the first index of the decision variables is a city in set S and the second index is a city in N \ S. The solution in the figure above also includes a subtour over the set of cities S = {2, 4, 5, 8}, so we add the corresponding subtour elimination constraint as well, which is given by
x21 + x23 + x26 + x27 + x29 + x2,10 + x41 + x43 + x46 + x47 + x49 + x4,10
+ x51 + x53 + x56 + x57 + x59 + x5,10 + x81 + x83 + x86 + x87 + x89 + x8,10 ≥ 1.
There is another subtour in the solution above that includes the set of cities S =
{6, 7}. We add the subtour elimination constraint corresponding to this set S into our
formulation. Lastly, the solution above has one more subtour that includes the set of cities
S = {9, 10}. We add the subtour elimination constraint corresponding to this set of
cities as well. The subtour elimination constraints corresponding to the sets S = {6, 7}
and S = {9, 10} can be written by using an argument similar to the one used in the
two subtour elimination constraints above. Therefore, we added 4 subtour elimination
constraints. Solving our formulation of the traveling salesman problem with these 4 subtour
elimination constraints, we obtain the solution in the figure below.
The solution in the figure above includes three subtours. Noting the cities involved in each
one of these subtours, we further add the 3 subtour elimination constraints corresponding to
the sets S = {1, 3, 4, 6, 7}, S = {2, 5} and S = {8, 9, 10} into our formulation of the traveling
salesman problem. Considering the 4 subtour elimination constraints that we added earlier,
we now have a total of 7 subtour elimination constraints. Solving the formulation of the
traveling salesman problem with these 7 subtour elimination constraints, we obtain the
solution given in the figure below.
6"
3" 7"
4"
9"
8" 10"
2" 5"
The solution above does not include any subtours. By adding 7 subtour elimination
constraints into the formulation of the traveling salesman problem, we obtained a solution
that does not have any subtours. Since this solution does not include any subtours, it must
be the optimal solution when we solve the traveling salesman problem with all subtour
elimination constraints. Therefore, the solution shown above is the optimal solution for the
traveling salesman problem. The total distance of this tour is 27. Note that the traveling
salesman problem we dealt with involves 10 cities. Thus, if we constructed all of the subtour
elimination constraints at the beginning, then we would have to construct 2^10 = 1024 subtour
elimination constraints. By generating the subtour elimination constraints as needed, we
were able to obtain the optimal solution to the traveling salesman problem by generating
only 7 subtour elimination constraints. For a problem with 10 cities, constructing all of the
1024 subtour elimination constraints may not be difficult. However, if we have a problem with
100 cities, then there are 2^100 ≈ 10^30 subtour elimination constraints and it is impossible to
construct all of these subtour elimination constraints. Although it is not possible to construct
all of the subtour elimination constraints, traveling salesman problems with hundreds of cities
are routinely solved today. Lastly, we emphasize that the idea of adding the constraints to
an optimization problem as needed is an effective approach to tackle problems with a large
number of constraints. In this section, we used this approach to solve the traveling salesman
problem, but we can use the same approach when dealing with other optimization problems
with large numbers of constraints.
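The idea of adding subtour elimination constraints as needed is easy to script. The following sketch, assuming PuLP and using random distances rather than the 10-city data of the example, repeatedly solves the formulation, detects the subtours in the solution and adds the corresponding constraints.

import random
from pulp import LpProblem, LpVariable, LpMinimize, LpBinary, lpSum

random.seed(0)
n = 8
N = range(n)
d = {(i, j): random.randint(1, 20) for i in N for j in N if i != j}

prob = LpProblem("tsp", LpMinimize)
x = {(i, j): LpVariable(f"x_{i}_{j}", cat=LpBinary) for (i, j) in d}
prob += lpSum(d[a] * x[a] for a in d)
for i in N:
    prob += lpSum(x[i, j] for j in N if j != i) == 1    # leave each city exactly once
    prob += lpSum(x[j, i] for j in N if j != i) == 1    # enter each city exactly once

def subtours(arcs):
    # Split the chosen arcs into the sets of cities of the subtours they form.
    succ = dict(arcs)
    unvisited, tours = set(N), []
    while unvisited:
        tour, city = [], next(iter(unvisited))
        while city in unvisited:
            unvisited.remove(city)
            tour.append(city)
            city = succ[city]
        tours.append(tour)
    return tours

while True:
    prob.solve()
    chosen = [a for a in x if x[a].value() > 0.5]
    tours = subtours(chosen)
    if len(tours) == 1:                                  # a single tour covering all cities: done
        break
    for S in tours:                                      # add a subtour elimination constraint per subtour
        prob += lpSum(x[i, j] for i in S for j in N if j not in S) >= 1

print(tours[0])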
f(j0, j1, . . . , jn) = rj1 + . . . + rjn − cj0,j1 − cj1,j2 − . . . − cjn−1,jn − cjn,j0 .
In the profit expression above, we do not include a reward for city 0 because we know that this
city must be visited in any tour anyway. Also, since we must go back to city 0 after visiting
the last city jn , we include the cost cjn ,j0 in the profit expression above. Generally speaking,
there are two classes of heuristics, construction heuristics and improvement heuristics. In
the next two sections, we discuss these two classes of heuristics within the context of the
prize-collecting traveling salesman problem.
16.2 Construction Heuristics
In a construction heuristic, we start with an empty solution. What we mean by an empty
solution depends on the specific problem on hand. For the prize-collecting traveling salesman
problem, an empty solution could correspond to the tour where we only visit city 0 to collect
a profit of 0. In a construction heuristic, we start with an empty solution and progressively
construct better and better solutions. A common idea to design a construction heuristic
is to be greedy and include an additional component into the solution that provides the
largest immediate increase in the objective value. In the prize-collecting traveling salesman
problem, this idea could result in inserting a city into the current tour such that the inserted
city provides the largest immediate increase in the profit of the current tour.
To give the details of a construction heuristic for the prize-collecting traveling salesman
problem, assume that the current tour on hand is (j0 , j1 , . . . , jn ). We consider each city k
that is not in the current tour. We try inserting city k into the current tour at each possible
position and check the increase in the profit. We choose the city that provides the largest
increase in the profit of the current tour and insert this city into the tour at the position
that provides the largest increase in the profit. In particular, assume that we currently have
the solution (j0 , j1 , . . . , jn ) with n cities in it. We consider a city k ∈ N \ {j0 , j1 , . . . , jn } that
is not in the current tour. If we add city k into the current tour after the ℓ-th city, then the increase in the profit is given by rk − cjℓ,k − ck,jℓ+1 + cjℓ,jℓ+1 , where jn+1 is interpreted as j0 so that the return arc to city 0 is handled correctly.
We note that the increase in the profit given above can be a negative quantity. We choose
the city k∗ and the position ℓ∗ that maximize the increase in the profit. That is, the city k∗ and the position ℓ∗ are given by the pair that attains the largest value of the profit increase above over all cities k not in the current tour and all positions ℓ = 0, 1, . . . , n.
If inserting city k∗ at the position ℓ∗ into the current tour yields a positive increase in the profit of the current tour, then we insert city k∗ at the position ℓ∗. In this case, we have the tour (j0, j1, . . . , jℓ∗, k∗, jℓ∗+1, . . . , jn) with n + 1 cities in it. Starting from the new tour with
n + 1 cities, we try to find another city to insert into the current tour until we cannot find
a city providing a positive increase in the profit of the current tour.
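A sketch of this greedy insertion heuristic in plain Python is given below; the coordinates and rewards are randomly generated for illustration and the distances are Euclidean, so they do not match the 15-city instance discussed next.

import math, random

random.seed(1)
n = 10
coords = {i: (random.uniform(0, 10), random.uniform(0, 10)) for i in range(n)}
reward = {i: random.uniform(1, 3) for i in range(1, n)}     # no reward for city 0

def dist(i, j):
    (x1, y1), (x2, y2) = coords[i], coords[j]
    return math.hypot(x1 - x2, y1 - y2)

tour = [0]                                  # the empty tour only visits city 0
while True:
    best_gain, best_insert = 0.0, None
    for k in set(coords) - set(tour):
        for pos in range(len(tour)):        # try inserting k after the pos-th city of the tour
            nxt = tour[(pos + 1) % len(tour)]
            gain = reward[k] - dist(tour[pos], k) - dist(k, nxt) + dist(tour[pos], nxt)
            if gain > best_gain:
                best_gain, best_insert = gain, (k, pos)
    if best_insert is None:                 # no insertion gives a positive increase in the profit
        break
    k, pos = best_insert
    tour.insert(pos + 1, k)

print(tour)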
The chart on the left side of the figure below shows 15 cities over a 10 × 10 geographical
region. The distance associated with arc (i, j) is the Euclidean distance between cities i and
j. The reward associated with visiting each city is indicated in brackets next to the label of the
city. For example, if we visit city 4, then we collect a reward of 2.1. We apply the greedy
heuristic described above on the prize-collecting traveling salesman problem that takes place
over these cities. The output of the greedy heuristic is shown on the right side of the figure
below. The total profit from the tour is 23.06. The tour in the figure below may look
reasonable, but we can improve this tour with simple inspection. For example, if we connect
Construction heuristics are intuitive and they are not computationally intensive, but they
often end up with solutions that are clearly suboptimal. In the figure above, since the portion
of the tour that visits the cities 0, 3, 4, 12, 13 and 14 has a crossing and the distances of the
arcs are given by the Euclidean distances between the cities, it was relatively simple to spot
that we could improve this tour. In the next section, we discuss improvement heuristics that
are substantially more powerful than construction heuristics.
for all choices of k, ℓ = 1, 2, . . . , n with k < ℓ. For example, if the set of cities is N =
{0, 1, 2, 3, 4, 5} and the current solution we have on hand visits the cities (0, 2, 3, 5), then the
solutions in the neighborhood of this solution are given by (0, 3, 2, 5), (0, 5, 3, 2) and (0, 2, 5, 3).
The first tour above is obtained by reversing the portion (2, 3) in the tour (0, 2, 3, 5), the
second tour above is obtained by reversing the portion (2, 3, 5) in the tour (0, 2, 3, 5) and the
third tour above is obtained by reversing the portion (3, 5) of the tour (0, 2, 3, 5).
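The reversal neighborhood is straightforward to generate in code. The sketch below keeps city 0 in the first position and assumes a user-supplied helper tour_profit that evaluates the profit of a tour; both the helper and the improvement loop are illustrative.

def neighbors(tour):
    # All tours obtained by reversing a portion of the tour, keeping city 0 first.
    n = len(tour) - 1                        # number of cities after city 0
    for k in range(1, n + 1):
        for l in range(k + 1, n + 1):
            yield tour[:k] + tour[k:l + 1][::-1] + tour[l + 1:]

def improve(tour, tour_profit):
    # Move to the best neighbor until no neighbor improves the profit.
    if len(tour) < 3:                        # need at least two cities besides city 0
        return tour
    while True:
        best = max(neighbors(tour), key=tour_profit)
        if tour_profit(best) <= tour_profit(tour):
            return tour
        tour = best

For the tour [0, 2, 3, 5], the generator above produces exactly the three neighbors listed in the example: [0, 3, 2, 5], [0, 5, 3, 2] and [0, 2, 5, 3].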
In general, the definition of a neighborhood requires some insight into the problem on
hand. For example, the definition of a neighborhood given above can be useful to remove crossings in a tour. To see how this works, assume that the current solution we have on hand
corresponds to the tour given on the left side of the figure below. The sequence of the cities
visited in this tour is (0, 1, 2, 3, 4, 5, 6, 7). This tour has a crossing. If we focus on the portion
(3, 4, 5, 6) of the tour and reverse the order of the cities visited in this portion, then we obtain
the tour (0, 1, 2, 6, 5, 4, 3, 7). We show this tour on the right side of the figure below. Note
that the tour on the right side of the figure does not have the crossing on the left side. If the
distances on the arcs are given by the Euclidean distances between the cities, then the length of the tour without the crossing is no larger than the length of the tour with the crossing.
we release. During the month, the random rainfall is realized. At the end of the month, we
observe the water level, compute if and how much we are short of the desired water level
and incur the cost associated with each unit of water we are short. On the right side of the
figure, we give a tree that gives a more detailed description of the sequence of events and the
decisions in the problem. The nodes of the tree correspond to the states of the world. The
branches of the tree correspond to the realizations of the random quantities. Node A in the
tree corresponds to the state of the world here and now at the beginning of June. The three
branches leaving node A correspond to the three different realizations of the rainfall. Node B
corresponds to the state of the world at the end of June after having observed that the realization
of the rainfall is low, at which point we need to check if and how much we are short of the
desired water level and incur the cost for each unit of water we are short. The interpretations
of nodes C and D are similar, but these nodes correspond to the cases where the rainfall
was observed to be medium and high.
Next, we think about the decisions that we need to make at each node. As a result of
this process, we will associate decision variables with each node in the tree. At node A, we
decide how much water to release from the reservoir. Therefore, associated with node A, we
define the following decision variable.
xA = Given that we are at node A, the amount of water released from the reservoir.
At node B, we measure the level of water in the reservoir and we incur a cost for each unit
we are short of the desired water level. Associated with node B, we define the following
decision variables.
yB = Given that we are at node B, the water level in the reservoir.
zB = Given that we are at node B, the amount we are short of the desired water level.
Note that yB captures the water level at the end of June given that the rainfall was low during
the month and zB captures the amount we are short at the end of June given that the rainfall was low. We define the decision variables yC , zC , yD and zD with similar interpretations,
In the expression above, we multiply the cost incurred at each node by the probability of
reaching that node to compute the total expected cost incurred at the end of June.
We now construct the constraints in the problem. The decision variable yB corresponds
to the water level at the end of June given that the rainfall during the month turned out
to be low. The water level at the end of the month depends on how much water we had
at the beginning of the month, how much water we released and the rainfall during the
month. Noting that we have 150 units in the reservoir at the beginning of June, we release
xA units of water from the reservoir and low rainfall corresponds to a rainfall of 125 units,
we can relate yB to the decision variable xA as
yB = 150 − xA + 125.
For each unit we are short of the desired water level of 100, we incur a cost of $5. The
decision variable zB corresponds to the amount we are short given that the rainfall during
the month turned out to be low. Thus, if yB is less than 100, then zB = 100 − yB , whereas
if yB is greater than 100, then zB = 0. To capture this relationship between the decision
variables zB and yB , we use the constraints
zB ≥ 100 − yB and zB ≥ 0.
Since zB appears in the objective function with a negative coefficient and we maximize the
objective function, we want to make the decision variable zB as small as possible. If yB is
less than 100, then 100 − yB ≥ 0. Therefore, due to the two constraints above, if yB is less
than 100, then the smallest value that zB can take is 100 − yB . In other words, if yB is
less than 100, then the decision variable zB takes the value 100 − yB , as desired. On the
other hand, if yB is greater than 100, then we have 100 − yB ≤ 0. In this case, due to the
two constraints above, if yB is greater than 100, then the smallest value that zB can take
is 0. In other words, if yB is greater than 100, then the decision variable zB takes the value
0, as desired. By using the same argument, we have the constraints
The optimal objective value of the problem above is 637.5 with the optimal values of the
decision variables given by
xA = 250, yB = 25, zB = 75, yC = 100, zC = 0, yD = 200, zD = 0.
We show the optimal solution in the tree below. According to the solution in the tree, we
release 250 units of water at the beginning of June. If the rainfall turns out to be low, then
the water level at the end of the month is 25 and we are 75 units short of the desired water
level. If the rainfall turns out to be medium or high, then the water level at the end of the
month is respectively 100 or 200, in which case, we are not short. Note that to maximize the
expected profit, we release 250 units of water at the beginning of the month, which implies
that we are willing to be short of the desired water level when the rainfall during the month
turns out to be low. The revenue that we obtain from the released water justifies the cost
incurred at the end of the month if the rainfall turns out to be low.
[Figure: scenario tree showing the optimal solution, with xA = 250 units of water released at node A.]
Let us think about the decision variables in the problem. At each node in the tree except
for the leaf nodes that are at the very bottom, we need to decide how much water to release
from the reservoir. Thus, we define the following decision variables.
xi = Given that we are at node i, the amount of water released from the reservoir, for
i = A, B, C, D, E, F, G.
For example, the decision variable xC represents how much water we release at the beginning
of July given that we are at node C. In other words, xC represents how much water we release
at the beginning of July given that the rainfall during June was high. Similarly, xE represents
how much water we release at the beginning of August given that the rainfall during June
and July was respectively low and high. Since the planning horizon ends at the end of
August, we do not worry about how much water to release at the end of August. Thus, we
do not worry about defining decision variables that capture the amount of water released at
the nodes H, I, J, K, L, M, N and O. On the other hand, at each node in the tree except for
the root node at the very top, we need to measure the level of water and how much we are
short of the desired level. So, we define the following decision variables.
yi = Given that we are at node i, the water level in the reservoir, for i = B, C, . . . , O.
zi = Given that we are at node i, the amount we are short of the desired water level, for i = B, C, . . . , O.
The water level in the reservoir at node A is known to be 150. Therefore, we do not need
decision variables that measure the water level and how much we are short of the desired
water level at node A.
Next, we construct the objective function in the problem. Each node in the tree
contributes to the expected profit. As an example, we consider node G. At node G, the
amount of water we release is given by the decision variable xG . Thus, we make a revenue
of 3 xG . At this node, the amount we are short of the desired water level is given by the
decision variable zG . Thus, we incur a cost of 5 zG at node G. So, the profit at node G is
given by 3 xG − 5 zG . We reach node G when the rainfall in June and July are respectively
high and high. Thus, the probability of reaching node G is 0.6 × 0.6 = 0.36. In this case,
the contribution of node G to the expected profit is given by 0.36 (3 xG − 5 zG ). Considering
all the nodes in the tree, the objective function is given by
3 xA + 0.4 (3 xB − 5 zB ) + 0.6 (3 xC − 5 zC )
+ 0.16 (3 xD − 5 zD ) + 0.24 (3 xE − 5 zE ) + 0.24 (3 xF − 5 zF ) + 0.36 (3 xG − 5 zG )
− 0.064 × 5 zH − 0.096 × 5 zI − 0.096 × 5 zJ − 0.144 × 5 zK − 0.096 × 5zL
− 0.144 × 5 zM − 0.144 × 5 zN − 0.216 × 5 zO .
We proceed to constructing the constraints in the problem. For each node in the tree,
we need to construct a constraint that computes the water level at the current node as a
function of the water level at the parent node of the current node, the amount of water
released at the parent node and the rainfall over the branch that connects the current node
to its parent node. For example, for nodes B, G and H, we have the constraints
Furthermore, for each node in the tree, we need to compute how much we are short of the
desired water level. For example, for nodes B, G and H, we compute how much we are short
of the desired water level by using the constraints
The idea behind the constraints above is identical to the one we used when we formulated
the two-stage problem in the previous section. We construct the two types of constraints
for all nodes in the tree except for node A. Since the water level at node A is known to
be 150 units, we do not need to compute the water level and how much we are short at
node A. Putting the discussion in this section together, we can maximize the total expected
The optimal objective value of the problem above is 2005. There are quite a few decision
variables in the problem. Thus, we go over the optimal values of only a few of the decision
variables. For example, we have xE = 700 and yE = 575 in the optimal solution. According
to this solution, given that the rainfall during June and July was respectively low and high,
it is optimal to release xE = 700 units of water at the beginning of August. Given that the
rainfall during June and July was respectively low and high, the optimal water level at the
beginning of August is yE = 575 units.
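The node-by-node construction of such a problem is mechanical and easy to script. Below is a sketch, assuming PuLP, of a two-period version of the reservoir problem built over a scenario tree; the revenue of 3, the shortage cost of 5, the initial level of 150, the desired level of 100 and the probabilities 0.4 and 0.6 follow the text, while the rainfall amounts of 125 and 300 and the requirement that water levels stay non-negative are assumptions for illustration.

from pulp import LpProblem, LpVariable, LpMaximize, lpSum, value

p = {"low": 0.4, "high": 0.6}
rain = {"low": 125, "high": 300}

# nodes: (name, parent, branch that leads to the node, probability of reaching the node)
nodes = [("A", None, None, 1.0)]
for b1 in ("low", "high"):
    n1 = "A" + b1[0]
    nodes.append((n1, "A", b1, p[b1]))
    for b2 in ("low", "high"):
        nodes.append((n1 + b2[0], n1, b2, p[b1] * p[b2]))

model = LpProblem("reservoir", LpMaximize)
x = {n: LpVariable(f"x_{n}", lowBound=0) for n, *rest in nodes if len(n) < 3}   # releases at non-leaf nodes
y = {n: LpVariable(f"y_{n}", lowBound=0) for n, *rest in nodes if n != "A"}     # water levels
z = {n: LpVariable(f"z_{n}", lowBound=0) for n, *rest in nodes if n != "A"}     # shortfalls

model += (lpSum(3 * prob * x[n] for n, parent, branch, prob in nodes if n in x)
          - lpSum(5 * prob * z[n] for n, parent, branch, prob in nodes if n != "A"))

for n, parent, branch, prob in nodes:
    if parent is None:
        continue
    level_before = 150 if parent == "A" else y[parent]
    model += y[n] == level_before - x[parent] + rain[branch]    # water balance along the branch
    model += z[n] >= 100 - y[n]                                 # shortfall below the desired level

model.solve()
print(value(model.objective), {n: var.value() for n, var in x.items()})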
[Figure: a production plant connected to three warehouses and two retail centers; shipping a unit of product from the plant to any warehouse costs $1, and the unit shipping costs from the warehouses to the retail centers are shown on the arcs, with the costs from warehouse 1 to retail center 2 and from warehouse 3 to retail center 1 noticeably higher than the others.]

Scenario   Prob.   Dem. at Ret. Cen. 1   Dem. at Ret. Cen. 2
1          0.6     70                    20
2          0.4     10                    80
On the left side of the figure below, we show the time line of the events. At the beginning,
we decide how much product to ship to each warehouse. Then, we observe the realization of
the demands. After observing the realization of the demands, we decide how much product
to ship from the warehouses to the retail centers to cover the demands. On the right side of
the figure, we give a tree that shows a more detailed description of the sequence of events
and the decisions in the problem. Node A in the tree corresponds to the state of the world
here and now. At this node, we decide how much product to ship to the warehouses. The
two branches leaving node A correspond to the two demand scenarios given in the table
above. Node B corresponds to the state of the world where the demands turned out to be
the one in scenario 1. At this node, we need to decide how much product to ship from the
warehouses to the retail centers. Similarly, node C corresponds to the state of the world where the demands turned out to be the one in scenario 2. At this node, we also need to decide how much product to ship from the warehouses to the retail centers. To capture
the decisions in the problem, we define the following decision variables.
xiA = Given that we are at node A, amount of product shipped to warehouse i, for i = 1, 2, 3.
yijB = Given that we are at node B, amount of product shipped from warehouse i to retail
center j, for i = 1, 2, 3, j = 1, 2.
yijC = Given that we are at node C, amount of product shipped from warehouse i to retail
center j, for i = 1, 2, 3, j = 1, 2.
We indicate these decision variables in the tree shown above. Note that since node B
corresponds to the case where the demands turned out to be the one in scenario 1, the
decision variables {yijB : i = 1, 2, 3, j = 1, 2} capture the products shipped from the
warehouses to the retail centers under scenario 1. Since it costs $1 to ship one unit of
product from the production plant to each one of the warehouses, the cost incurred at node
A is Σ_{i=1}^{3} xiA. For notational brevity, we use cij to denote the cost of shipping a unit of product from warehouse i to retail center j. So, the costs incurred at nodes B and C are Σ_{i=1}^{3} Σ_{j=1}^{2} cij yijB and Σ_{i=1}^{3} Σ_{j=1}^{2} cij yijC. Since the probabilities of reaching nodes B and C are 0.6 and 0.4, the total expected cost can be written as

Σ_{i=1}^{3} xiA + 0.6 Σ_{i=1}^{3} Σ_{j=1}^{2} cij yijB + 0.4 Σ_{i=1}^{3} Σ_{j=1}^{2} cij yijC.
Next, we construct the constraints in the problem. At node A, the total amount of
product that we ship out of the production plant cannot exceed the product availability at
the production plant. Therefore, we have the constraint
Σ_{i=1}^{3} xiA ≤ 100.
At node B, the total amount of product that we ship out of each warehouse i cannot exceed
the amount of product shipped to the warehouse. Noting that the amount of product shipped
to warehouse i is given by xiA , for all i = 1, 2, 3, we have the constraint
Σ_{j=1}^{2} yijB ≤ xiA.
In this case, to figure out how to ship the products from the production plant to the
warehouses and from the warehouses to the retail centers to minimize the total expected
cost, we can solve the linear program
min   Σ_{i=1}^{3} xiA + 0.6 Σ_{i=1}^{3} Σ_{j=1}^{2} cij yijB + 0.4 Σ_{i=1}^{3} Σ_{j=1}^{2} cij yijC
st    Σ_{i=1}^{3} xiA ≤ 100
      Σ_{j=1}^{2} yijB ≤ xiA    ∀ i = 1, 2, 3
      Σ_{i=1}^{3} yi1B ≥ 70
      Σ_{i=1}^{3} yi2B ≥ 20
      Σ_{j=1}^{2} yijC ≤ xiA    ∀ i = 1, 2, 3
      Σ_{i=1}^{3} yi1C ≥ 10
      Σ_{i=1}^{3} yi2C ≥ 80
      xiA ≥ 0, yijB ≥ 0, yijC ≥ 0    ∀ i = 1, 2, 3, j = 1, 2.
The optimal objective value of the problem above is 340. We focus on the values of some
of the decision variables in the optimal solution. In particular, the optimal solution has
x1A = 20, x2A = 50 and x3A = 30. There are two interesting observations. First, observe
that the costs of shipping products from warehouse 1 to retail center 2 and from warehouse
3 to retail center 1 are rather high. Thus, if we ship a large amount of product to warehouse
1 and retail center 2 ends up having a large demand, then we incur a high cost to cover the
demand at retail center 2. Similarly, if we ship a large amount of product to warehouse 3
and retail center 1 ends up having a large demand, then we incur a high cost to cover the