Thesis A Komen Final June 6
Master Thesis
Author: Abel Komen
Utrecht University
First Supervisor: Dr. J. A. Hoogeveen
Second Supervisor: Dr. Ir. J. M. van den Akker
June 6, 2017
Abstract
In this thesis an attempt is made to find out how the type of uncertainty (discrete and finite, or polyhedral) influences the performance of Benders' decomposition [4] and Column & Constraint Generation [24] when solving the demand robust location-transportation problem. A generalization of Benders' decomposition is presented that makes it applicable to a large class of demand robust optimization problems. In addition, Column & Constraint Generation is adapted for use on discrete and finite uncertainty sets. In [24] it was shown that Column & Constraint Generation solves the problem considerably faster than a standard implementation of Benders' decomposition. The performance comparison for discrete and finite uncertainty sets made in this thesis is new. Furthermore, a number of techniques for speeding up Benders' decomposition are applied. Special attention is paid to the role of the MIP-solver that is used as a black box by both algorithms.
Contents
1 Introduction 3
3 Benders’ Decomposition 14
3.1 Benders’ Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Benders’ Decomposition for Demand Robust Optimization . . . . 17
3.2.1 Discrete Uncertainty Sets . . . . . . . . . . . . . . . . . . 17
3.2.2 Polyhedral Uncertainty Sets . . . . . . . . . . . . . . . . . 20
3.3 Demand Robust Location-Transportation . . . . . . . . . . . . . 21
3.3.1 Discrete Uncertainty Sets . . . . . . . . . . . . . . . . . . 21
3.3.2 Polyhedral Uncertainty Sets . . . . . . . . . . . . . . . . . 22
3.4 Algorithmic Enhancements . . . . . . . . . . . . . . . . . . . . . 23
3.4.1 2 Phase Method . . . . . . . . . . . . . . . . . . . . . . . 23
3.4.2 Using Incumbent Solutions . . . . . . . . . . . . . . . . . 24
5 Computational Results 29
5.1 Problem Generation . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
5.2.1 Discrete Uncertainty Sets . . . . . . . . . . . . . . . . . . 30
5.2.2 Polyhedral Uncertainty Sets . . . . . . . . . . . . . . . . . 33
6 Interpretation of Results 35
6.1 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
6.1.1 Problem Selection . . . . . . . . . . . . . . . . . . . . . . 36
6.1.2 Code and Algorithmic Choices . . . . . . . . . . . . . . . 37
6.1.3 Performance Variability . . . . . . . . . . . . . . . . . . . 37
6.2 Possible Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
7 Conclusion 40
Chapter 1
Introduction
Real world optimization problems often include uncertainty in some form. The
easiest way to deal with this uncertainty is to use a deterministic model whose
solutions are close enough to the real outcome. Unfortunately, this is not always possible, and even when it is, explicitly capturing the uncertainty in the optimization problem may yield superior results. A number of ways to do this will be covered here.
There is ample literature regarding real world optimization problems that
deal with uncertainty. In the field of robustness, examples include railway-related optimization problems ([8], [16]), airline scheduling problems ([19]) and problems regarding electricity grid stability ([1], [14], [15], [23], [25]).
To illustrate the possibilities, consider the cutting-stock problem. The cutting-stock problem concerns the division of standard-sized pieces of supplied material into pieces of specific sizes. The objective is to minimize the waste
that is left over after the demand is fulfilled. This problem could for example be
encountered in a company that is supplied with pieces of sheet metal of a fixed
size that have to be cut into numerous pieces of specific sizes. The company uses m different sizes, and requires q_j pieces of the size with index j ∈ {1, . . . , m}. Each
standard-sized piece can be cut using a so-called pattern. The company has n
patterns and for each pattern i ∈ {1, . . . , n} parameter aij indicates the number
of pieces of size j it produces. Also, each pattern i has waste ci associated with
it. The variable xi indicates how often pattern i is utilized. Of course, xi is
integer. The deterministic version of cutting stock is given by

\[
\min_{x \in \mathbb{N}^n} \left\{ \sum_{i=1}^{n} c_i x_i \;\middle|\; \sum_{i=1}^{n} a_{ij} x_i \ge q_j \;\; \forall j \in \{1, \dots, m\} \right\}.
\]
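To make the formulation concrete, the deterministic model can be solved by brute force on a tiny made-up instance (all data below are hypothetical; a realistic instance would be handed to an ILP solver instead):

```python
from itertools import product

# Hypothetical toy instance: n = 3 patterns, m = 2 piece sizes.
waste = [2, 3, 1]                   # c_i: waste per use of pattern i
pieces = [[2, 0], [1, 2], [0, 1]]   # a_ij: pieces of size j cut by pattern i
demand = [4, 3]                     # q_j: required pieces of size j
MAX_USE = 6                         # crude upper bound on x_i for enumeration

best = None
for x in product(range(MAX_USE + 1), repeat=len(waste)):
    # demand constraints: sum_i a_ij * x_i >= q_j for every size j
    if all(sum(pieces[i][j] * x[i] for i in range(len(x))) >= demand[j]
           for j in range(len(demand))):
        cost = sum(waste[i] * x[i] for i in range(len(x)))
        if best is None or cost < best[0]:
            best = (cost, x)

print(best)   # -> (7, (2, 0, 3)): use pattern 0 twice and pattern 2 three times
```

Enumeration is only viable because the instance is tiny; it serves purely to illustrate what the model asks for.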
In the cutting-stock problem, uncertainty can arise in a number of ways.
It could be that demand qj for the cuts is unknown at the time of decision
making or that the number of cuts aij produced by a pattern is uncertain (due
to low quality of supplied material for example). Also, the amount of waste
produced by pattern i can be uncertain. For ease of exposition, assume that
the uncertainty can be adequately captured by some K scenarios {1, . . . , K}. This means that scenario k ∈ {1, . . . , K} is a set of parameters (c^k, q^k, a^k).
The strictest way to deal with this uncertainty is given by robust optimization. In robust optimization the chosen solution has to be feasible for all
possible scenarios and the objective is to minimize the costs of the most expen-
sive scenario. So the robust optimization version of cutting stock is given by

\[
\min_{x \in \mathbb{N}^n} \max_{k \in \{1, \dots, K\}} \left\{ \sum_{i=1}^{n} c_i^k x_i \;\middle|\; \sum_{i=1}^{n} a_{ij}^k x_i \ge q_j^k \;\; \forall j \in \{1, \dots, m\} \right\}.
\]
Robust optimization is a one stage approach to handling uncertainty, be-
cause all decisions have to be made at once before the outcome of the uncertain
processes is known and it is not possible to adjust the solution afterwards. On
the other hand, there are also two stage approaches. These allow for a decision
maker to observe the outcome of his plans and adjust where necessary. For cutting-stock this means that it is possible to execute a plan, check whether all demands are met, and if not, cut some more pieces of material.
A well known two stage approach is stochastic programming. The goal of two
stage stochastic programming is to minimize the cost of the first stage decisions
plus the average cost over all scenarios of the second stage decisions. Let the
first stage decisions be called yi . The second stage decisions for scenario k are
called xki . The cost for the first stage decisions are denoted by di and to avoid
a trivial situation where all decisions can be postponed to the second stage it
is necessary that di < cki . The two stage stochastic programming version of
cutting-stock is given by
\[
\begin{aligned}
\min_{y, x^k} \quad & \sum_{i=1}^{n} d_i y_i + \frac{1}{K} \sum_{k=1}^{K} \sum_{i=1}^{n} c_i^k x_i^k \\
\text{s.t.} \quad & \sum_{i=1}^{n} a_{ij}^k (y_i + x_i^k) \ge q_j^k \quad \forall j \in \{1, \dots, m\}, \; k \in \{1, \dots, K\} \\
& y, x^k \ge 0 \text{ and integer}
\end{aligned}
\]
\[
\begin{aligned}
\min_{y} \quad & \sum_{i=1}^{n} d_i y_i + \max_{k} \min_{x^k} \sum_{i=1}^{n} c_i^k x_i^k \\
\text{s.t.} \quad & \sum_{i=1}^{n} a_{ij}^k (y_i + x_i^k) \ge q_j^k \quad \forall j \in \{1, \dots, m\}, \; k \in \{1, \dots, K\} \\
& y, x^k \ge 0 \text{ and integer}
\end{aligned}
\]
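This min-max-min objective can be sketched on a made-up instance (two piece sizes, pattern matrix equal to the identity so pattern j yields one piece of size j, all numbers hypothetical): for every first-stage plan y, evaluate the worst-case recourse cost over the scenarios.

```python
from itertools import product

# Hypothetical toy: 2 piece sizes, 2 demand scenarios asking for opposite sizes.
d = [2, 2]              # first-stage cost d_i (d_i < c_i^k as the text requires)
c = [3, 3]              # second-stage cost c_i^k, the same in both scenarios
q = [(4, 0), (0, 4)]    # demand q_j^k per scenario k
B = 5                   # enumeration bound on every first-stage variable

def recourse(y, qk):
    # cheapest x^k with y_i + x_i^k >= q_i^k (the pattern matrix is the identity)
    return sum(c[i] * max(0, qk[i] - y[i]) for i in range(2))

best = min(
    (sum(d[i] * y[i] for i in range(2)) + max(recourse(y, qk) for qk in q), y)
    for y in product(range(B + 1), repeat=2)
)
print(best)   # first element is the optimal worst-case total cost
```

Because each scenario demands only one of the two sizes, committing to nothing in the first stage and patching per scenario beats covering both demands up front, which is exactly the flexibility the two-stage model buys.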
uncertainty in that paper. A polyhedral scenario set is used which could put
Benders’ at a disadvantage because the problem no longer can be solved by a
classical Benders’ but has to be solved by a variant of the algorithm.
Therefore, in this thesis Benders’ decomposition will be compared to column
and constraint generation for demand robust location-transportation with a
discrete scenario set and with a polyhedral uncertainty set. This will hopefully
shed some light on the strengths and weaknesses of both algorithms and could
help with future algorithm selection.
Also, the problem with discrete uncertainty sets is solved by a standard
Gurobi MIP-solver to compare the performance of the specialized algorithms to
an off-the-shelf standard solver. The results of these experiments can be found
in Chapter 5.
In Chapter 6 we will take a closer look at how to interpret the results of the
computational experiments.
Chapter 2
Demand Robust
Optimization
value of some decision variables can be chosen after the uncertainty is lifted.
From now on, variables that have to be chosen before the uncertainty is lifted
will be denoted by y and variables that can be chosen after the uncertainty
is lifted will be marked x, unless explicitly stated otherwise. The variables y
are called non-adjustable or first stage and the variables x are called adjustable
or second stage. It is possible that there are constraints that only affect the y
variables. The set Y will be the set of allowed values of y and is defined by
Y = {y|Dy ≥ e} ∩ N (throughout this report, the symbol N will be used to
denote the natural numbers including 0). The Adjustable RC (ARC, [3]) can
now be formulated as
The matrix A is called the recourse matrix. The problem is called a fixed
recourse problem when A is not uncertain. The vector u is called the demand
vector.
The concept of demand robustness (DR) is introduced in [11]. Without calling it demand robustness, [24] also gives a description of demand robustness.
The description provided here is modeled after the description in [24] but contin-
uing the notation used in the beginning of [3]. A DR problem aims to optimize
the total cost of the non-adjustable variables y plus the worst case costs of the
adjustable variables x while only the demand vector u is uncertain. To fit this
into the framework of 2.3 a variable η is added to the non-adjustable variables
accompanied by constraints of the form η ≥ cx. In this case, the costs of the
second stage decisions (captured by c) can also vary. The uncertainty set Z
is renamed U to emphasize this property and furthermore [11] only deals with
uncertainty sets that consist of scenarios, meaning that U is a discrete and finite set. So U = {u^1, . . . , u^K}. The discrete and finite nature of U allows a separate vector of variables x^k to be introduced for every scenario u^k.
A general DR problem in the style of 2.3 can be formulated as follows:
A different kind of uncertainty set for demand robust problems was intro-
duced in [5] where the uncertain values are not in a discrete and finite set but
in a special type of polyhedron. Every element u_j of u ∈ U has a base value u_j, to which some extra demand of at most ū_j can be added. So every u_j takes a value in u_j + g_j ū_j with g_j ∈ [0, 1]. On top of that, there is a limit Γ on the total deviation from the base values, which is enforced by the constraint \(\sum_j g_j \le \Gamma\). The uncertainty set can now be defined as

\[
U = \left\{ (u_1 + g_1 \bar{u}_1, \dots, u_m + g_m \bar{u}_m) \;\middle|\; \sum_j g_j \le \Gamma, \; g \in [0,1]^m \right\} \qquad (2.5)
\]
A DR problem with an uncertainty set as in 2.5 can be summarized as

\[
\min_{y \in Y} \left\{ dy + \max_{u \in U} \min_{x \ge 0} \{ cx \mid Ax \ge u - By \} \right\} \qquad (2.6)
\]
There is a large difference in solving 2.4 and 2.6. The discrete and finite
uncertainty set of 2.4 allows the problem to be formulated as one linear program.
This is not the case for 2.6. The way to solve the problem for some fixed ȳ can
be found in [13] and is repeated here. How to use this to solve the complete
problem is explained in the chapters on Benders’ decomposition (Chapter 3)
and Column & Constraint Generation (Chapter 4).
For fixed ȳ, problem 2.6 reduces to ([13])

\[
\begin{aligned}
\max_{u \in U} \min_{x} \quad & cx && (2.7) \\
\text{s.t.} \quad & Ax \ge u - B\bar{y} && (2.8) \\
& x \ge 0 && (2.9)
\end{aligned}
\]
Taking the dual of the inner minimization of 2.7 - 2.9 and combining the maximization problems yields

\[
\begin{aligned}
\max_{u, \pi} \quad & \pi (u - B\bar{y}) && (2.10) \\
\text{s.t.} \quad & \pi A \le c && (2.11) \\
& u \in U, \; \pi \ge 0 && (2.12)
\end{aligned}
\]

Writing every u_j as u_j + g_j ū_j according to 2.5 gives

\[
\begin{aligned}
\max_{g, \pi} \quad & \pi (u + g\bar{u} - B\bar{y}) && (2.13) \\
\text{s.t.} \quad & \pi A \le c && (2.14) \\
& \mathbf{1} g \le \Gamma && (2.15) \\
& g \in \{0, 1\}, \; \pi \ge 0 && (2.16)
\end{aligned}
\]

where gū denotes the componentwise product.
Note that g ∈ {0, 1} and not g ∈ [0, 1] because an optimal solution will
always be in an extreme point of the uncertainty set when Γ is integer ([13],
[24]).
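This extreme-point property can be checked numerically on a small made-up example: enumerating all g ∈ {0,1}^m with Σ_j g_j ≤ Γ gives the same worst case as greedily raising the Γ demands with the largest π_j ū_j (all data below are hypothetical).

```python
from itertools import product

# Hypothetical data for a budget uncertainty set: base demands u_j,
# deviations ubar_j, budget Gamma, and dual prices pi_j.
u     = [5.0, 3.0, 4.0]
ubar  = [2.0, 6.0, 1.0]
pi    = [1.0, 0.5, 2.0]
Gamma = 2

# Brute force over the extreme points g in {0,1}^m with sum(g) <= Gamma.
best = max(
    sum(pi[j] * (u[j] + g[j] * ubar[j]) for j in range(3))
    for g in product((0, 1), repeat=3) if sum(g) <= Gamma
)

# Equivalent greedy: raise the Gamma demands with the largest pi_j * ubar_j.
base = sum(pi[j] * u[j] for j in range(3))
greedy = base + sum(sorted((pi[j] * ubar[j] for j in range(3)), reverse=True)[:Gamma])

print(best, greedy)   # both equal 19.5 on this instance
```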
Problem 2.13 - 2.16 contains the quadratic terms π(gū). Due to these terms
the problem cannot be solved by a standard MIP-solver. These quadratic terms
can be eliminated by introducing a new variable vector ω such that ωj = πj
when gj = 1 and 0 otherwise. Because ū ≥ 0 this can be achieved by the
constraints ω ≤ π and ω ≤ M g for some sufficiently large real number M . This
gives the following problem
\[
\begin{aligned}
\max_{g, \pi, \omega} \quad & \pi u + \omega \bar{u} - \pi B \bar{y} && (2.17) \\
\text{s.t.} \quad & \pi A \le c && (2.18) \\
& \omega \le \pi && (2.19) \\
& \omega \le M g && (2.20) \\
& \mathbf{1} g \le \Gamma && (2.21) \\
& g_j \in \{0, 1\} \; \forall j, \quad \pi \ge 0, \quad \omega \ge 0 && (2.22)
\end{aligned}
\]
The solution to Problem 2.17 - 2.22 can be used for Column & Constraint
Generation or Benders’ decomposition.
\[
\min \; \sum_{(i,j) \in A} c_{(i,j)} x_{(i,j)} \qquad (2.23)
\]
Problem 2.23 - 2.27 can be expanded by the realization that the implicit constraints p_r − p_i ≥ 0 for all i ≠ t are left out. This makes sense for the deterministic version of the problem, but mentioning them explicitly makes it easier to see how this problem can be cast into the demand robust framework of 2.4. The
demand robust version of min-cut is such that the terminal vertex t is uncer-
tain and it is cheaper to select edges before the terminal is known. Due to the
nature of this problem (a vertex is the terminal or not) there is only a demand
robust version of this problem with a discrete and finite uncertainty set. For
every edge there is a variable y(i,j) that indicates if (i, j) is selected before the
terminal is known. For every edge in every scenario k there is a variable xk(i,j)
that indicates if edge (i, j) is selected after the terminal is revealed in scenario
k. Selecting (i, j) before the terminal is revealed costs c(i,j) and selecting (i, j)
after the terminal is known costs σ k c(i,j) in scenario k for some σ k ≥ 1. The
uncertain demand vector consists of the right-hand sides of the constraints on p_r − p_i^k: these right-hand sides are 0 if i ≠ t^k in scenario k and 1 otherwise. If t^k denotes the terminal in scenario k, the demand robust equivalent of 2.23 - 2.27 is
\[
\min \; \sum_{(i,j) \in E} c_{(i,j)} y_{(i,j)} + \eta \qquad (2.28)
\]
\[
p_i^k \ge 0 \quad \forall i \in V, \; \forall k \qquad (2.32)
\]
\[
x_{(i,j)}^k \ge 0 \quad \forall (i,j) \in E, \; \forall k \qquad (2.33)
\]
\[
\begin{aligned}
\min_{(y,z,x)} \quad & \sum_i (f_i y_i + a_i z_i) + \sum_{ij} c_{ij} x_{ij} && (2.34) \\
\text{s.t.} \quad & z_i \le K_i y_i && (2.35) \\
& \sum_j x_{ij} \le z_i && (2.36) \\
& \sum_i x_{ij} \ge u_j && (2.37) \\
& y_i \in \{0,1\}, \; z_i \ge 0, \; x_{ij} \ge 0 && (2.38)
\end{aligned}
\]
In the demand robust version of this problem the variables yi and zi are the
non-adjustable or first stage variables and the variables xij are the adjustable
or second stage variables. The uncertain demand vector u is drawn from a
polyhedral uncertainty set U defined as in 2.5. Let u_max be the largest possible total demand over all scenarios. The constraint \(\sum_i z_i \ge u_{max}\) is added to make
sure that the second stage problem is always feasible. The demand robust
location-transportation problem is given by
\[
\begin{aligned}
\min_{(y,z,x)} \quad & \sum_i (f_i y_i + a_i z_i) + \max_{u \in U} \min_x \sum_{ij} c_{ij} x_{ij} && (2.39) \\
\text{s.t.} \quad & z_i \le K_i y_i && \forall i \quad (2.40) \\
& \sum_i z_i \ge u_{max} && (2.41) \\
& \sum_j x_{ij} \le z_i && \forall i \quad (2.42) \\
& \sum_i x_{ij} \ge u_j && \forall j \quad (2.43) \\
& y_i \in \{0,1\}, \; z_i \ge 0, \; x_{ij} \ge 0 && (2.44)
\end{aligned}
\]
\[
\begin{aligned}
\min_{(y,z,\eta)} \quad & \sum_i (f_i y_i + a_i z_i) + \eta && (2.45) \\
\text{s.t.} \quad & z_i \le K_i y_i && \forall i \quad (2.46) \\
& \sum_i z_i \ge u_{max} && (2.47) \\
& \sum_j x_{ij}^k \le z_i && \forall i, k \quad (2.48) \\
& \sum_i x_{ij}^k \ge u_j^k && \forall j, k \quad (2.49) \\
& \eta - \sum_{ij} c_{ij} x_{ij}^k \ge 0 && \forall k \quad (2.50)
\end{aligned}
\]
Chapter 3
Benders’ Decomposition
min cx + dy (3.1)
s.t. Ax + By ≥ b (3.2)
Dy ≥ e (3.3)
x ≥ 0, y∈N (3.4)
Problems 3.5 and 3.6 are equivalent. The inner maximization problem of
3.6 is a linear program so it can be either bounded, unbounded or infeasible.
If it is infeasible the inner minimization problem of 3.5 is either unbounded
or infeasible. This means that the original problem 3.1 - 3.4 is unbounded or
infeasible in that case. Therefore, the inner maximization of 3.6 is assumed to be
feasible. Note that feasibility does not depend on y, so the inner maximization
problem is infeasible for all y or feasible for all y. If it is unbounded for some
ȳ ∈ Y then the inner minimization problem of 3.5 is infeasible and that means
there exists no solution for the original problem with y = ȳ. Also, an unbounded
dual problem means there exists some extreme ray ρ with ρ(b − By) > 0. This
situation can be avoided by adding a constraint
\[
\begin{aligned}
\min \quad & dy + \zeta && (3.9) \\
\text{s.t.} \quad & \zeta \ge \pi^p (b - By) && \forall p \in \{1, \dots, P\} \quad (3.10) \\
& \rho^q (b - By) \le 0 && \forall q \in \{1, \dots, Q\} \quad (3.11) \\
& y \in Y && (3.12)
\end{aligned}
\]

\[
\begin{aligned}
\min \quad & dy + \zeta && (3.13) \\
\text{s.t.} \quad & \zeta \ge \pi^p (b - By) && \forall p \in \{1, \dots, P'\} \quad (3.14) \\
& \rho^q (b - By) \le 0 && \forall q \in \{1, \dots, Q'\} \quad (3.15) \\
& y \in Y && (3.16)
\end{aligned}
\]
The master problem after t iterations is the same as problem 3.9 - 3.12 with
a number of constraints removed, and hence the objective value of an optimal
solution for the master problem is a lower bound (LB) for problem 3.9 - 3.12.
Therefore, the master problem after t iterations provides a LB for the original
problem 3.1 - 3.4.
The feasibility and optimality constraints are generated by solving the inner
maximization of 3.6. The y variables in the objective function get the values
of the y variables of the most recent optimal solution for the master problem.
This is called the subproblem. Optimization of the subproblem will yield an
extreme ray or extreme point that can be used to either build a feasibility or
an optimality constraint, respectively. When the subproblem has been solved to optimality (so it is not unbounded), its optimal objective value plus dȳ is an upper bound (UB) for the original problem. If this UB is smaller than the incumbent UB, then a new upper bound has been found.
So the algorithm starts by solving the master problem and storing the LB.
Then the solution to the master problem is used to build a new subproblem.
The subproblem is solved and either a feasibility or optimality constraint is
added to the master. If a new UB is found and it is better than the old one the
UB is updated. Now the master (including the new constraint) is solved, which
leads to a new LB and a new subproblem to be solved. The algorithm stops
when LB=UB, because at that point the solution to the original problem has
been found. Pseudocode of the algorithm can be found in Algorithm 3.1.
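The loop just described can be sketched in Python on a hypothetical single-variable toy problem (all data made up; the master is solved by enumeration here, whereas the implementations in this thesis use a MIP-solver):

```python
# Toy instance: min y + zeta with optimality cuts zeta >= pi_p * (3 - y).
# Second stage: min {2x | x >= 3 - y, x >= 0}; its dual optimum is
# pi = 2 when 3 - y > 0 and pi = 0 otherwise.
D_COST, C_COST, DEMAND = 1, 2, 3
cuts = []                           # dual points pi_p collected so far

def solve_master():
    # brute-force master: zeta is the tightest cut value (at least 0)
    best = None
    for y in range(10):
        zeta = max([0.0] + [pi * (DEMAND - y) for pi in cuts])
        val = D_COST * y + zeta
        if best is None or val < best[0]:
            best = (val, y)
    return best                     # (lower bound, y)

def solve_subproblem(y):
    shortfall = max(0, DEMAND - y)
    pi = C_COST if shortfall > 0 else 0
    return C_COST * shortfall, pi   # (second-stage cost, dual point)

ub = float("inf")
while True:
    lb, y = solve_master()
    sub_cost, pi = solve_subproblem(y)
    ub = min(ub, D_COST * y + sub_cost)
    if ub - lb < 1e-9:              # LB = UB: optimum found
        break
    cuts.append(pi)                 # add an optimality constraint

print(y, ub)   # optimal first stage and objective: 3 and 3
```

The instance has no feasibility cuts because the second stage is always feasible; only the optimality-cut branch of the algorithm is exercised.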
3.2 Benders’ Decomposition for Demand Robust
Optimization
Benders’ is also well suited to be applied to robust optimization problems. For
discrete uncertainty sets the general Benders’ decomposition as described in
Section 3.1 can be used in a straightforward way. For polyhedral uncertainty
sets Benders’ has to be adapted to the fact that the subproblem is no longer a
linear program. In the case described in this report the subproblem is a bilinear
program which calls for a different implementation of Benders’.
\[
\begin{aligned}
\min \quad & dy + \eta && (3.17) \\
\text{s.t.} \quad & A x^k + B y \ge b^k && \forall k \in \{1, \dots, K\} \quad (3.18) \\
& \eta - c^k x^k \ge 0 && \forall k \in \{1, \dots, K\} \quad (3.19) \\
& y \in Y, \; x^k \ge 0 && \forall k \in \{1, \dots, K\} \quad (3.20)
\end{aligned}
\]
where y are the first stage variables and xk are the second stage variables
for scenario k. The set Y is defined as Y = {y|y ∈ N, Dy ≥ e}. An expanded
representation of 3.17 - 3.20 is given by
\[
\min \; dy + \eta \qquad (3.21)
\]
\[
\text{s.t.} \quad
\begin{bmatrix}
0 & A & 0 & \cdots & 0 \\
0 & 0 & A & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & A \\
1 & -c^1 & 0 & \cdots & 0 \\
1 & 0 & -c^2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & 0 & 0 & \cdots & -c^K
\end{bmatrix}
\begin{bmatrix} \eta \\ x^1 \\ \vdots \\ x^K \end{bmatrix}
+
\begin{bmatrix} B \\ B \\ \vdots \\ B \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} y
\ge
\begin{bmatrix} b^1 \\ b^2 \\ \vdots \\ b^K \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\qquad (3.22)
\]
\[
\eta \ge 0, \quad x^1, \dots, x^K \ge 0, \quad y \in Y \qquad (3.23)
\]
\[
\min \; \eta \qquad (3.24)
\]
\[
\text{s.t.} \quad
\begin{bmatrix}
0 & A & 0 & \cdots & 0 \\
0 & 0 & A & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & A \\
1 & -c^1 & 0 & \cdots & 0 \\
1 & 0 & -c^2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & 0 & 0 & \cdots & -c^K
\end{bmatrix}
\begin{bmatrix} \eta \\ x^1 \\ \vdots \\ x^K \end{bmatrix}
\ge
\begin{bmatrix} b^1 \\ \vdots \\ b^K \\ 0 \\ \vdots \\ 0 \end{bmatrix}
-
\begin{bmatrix} B \\ \vdots \\ B \\ 0 \\ \vdots \\ 0 \end{bmatrix} \bar{y}
\qquad (3.25)
\]
\[
\eta \ge 0, \quad x^1, \dots, x^K \ge 0 \qquad (3.26)
\]
The dual of problem 3.24 - 3.26 is the Benders' subproblem. The dual variables will be

\[
\pi = (\pi^{u,1}, \dots, \pi^{u,K}, \pi^{\eta,1}, \dots, \pi^{\eta,K}) \qquad (3.27)
\]

In 3.27, π^{u,k} refers to the set of variables that correspond to the scenario constraints Ax^k ≥ b^k − Bȳ for every scenario k, and the π^{η,k} variables are the dual variables of the cost constraint η − c^k x^k ≥ 0 for every k. The Benders' subproblem for some master solution ȳ can now be expressed as
\[
\max_{\pi} \quad \pi \left(
\begin{bmatrix} b^1 \\ \vdots \\ b^K \\ 0 \\ \vdots \\ 0 \end{bmatrix}
-
\begin{bmatrix} B \\ \vdots \\ B \\ 0 \\ \vdots \\ 0 \end{bmatrix} \bar{y}
\right) \qquad (3.28)
\]
\[
\text{s.t.} \quad \pi
\begin{bmatrix}
0 & A & 0 & \cdots & 0 \\
0 & 0 & A & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & A \\
1 & -c^1 & 0 & \cdots & 0 \\
1 & 0 & -c^2 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & 0 & 0 & \cdots & -c^K
\end{bmatrix}
\le \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix} \qquad (3.29)
\]
\[
\pi \ge 0 \qquad (3.30)
\]
The objective 3.28 can be reduced: for all dual variables π^{η,k} of the cost constraints the objective parameter is 0, so 3.28 can also be expressed as

\[
\begin{bmatrix} \pi^{u,1} & \cdots & \pi^{u,K} \end{bmatrix}
\left(
\begin{bmatrix} b^1 \\ \vdots \\ b^K \end{bmatrix}
-
\begin{bmatrix} B \\ \vdots \\ B \end{bmatrix} \bar{y}
\right) \qquad (3.31)
\]
The problem can be reduced further in the following way. Suppose k* is the most expensive scenario for some optimal solution to 3.24 - 3.26 for fixed ȳ. In other words, η ≥ c^{k*}x^{k*} > c^k x^k for all k ≠ k* (if multiple scenarios are tied for most expensive then the first one that is encountered is chosen). This means that the cost constraints η − c^k x^k ≥ 0 are not binding for k ≠ k*, and therefore the dual variables π^{η,k} are 0 for k ≠ k* due to complementary slackness. The constraint that linked the dual problem together is now no longer necessary, so the problem can be split into K independent problems.
For all k ≠ k* the problem is
\[
\begin{aligned}
\max \quad & \pi^{u,k^*} (b^{k^*} - B\bar{y}) && (3.35) \\
\text{s.t.} \quad & \pi^{u,k^*} A - \pi^{\eta,k^*} c^{k^*} \le 0 && (3.36) \\
& \pi^{u,k^*} \ge 0, \quad 0 \le \pi^{\eta,k^*} \le 1 && (3.37)
\end{aligned}
\]

Assuming c^{k*} ≥ 0 for all k, choosing π^{η,k*} = 1 gives the largest feasible region, so problem 3.35 - 3.37 reduces to

\[
\begin{aligned}
\max \quad & \pi^{u,k^*} (b^{k^*} - B\bar{y}) && (3.38) \\
\text{s.t.} \quad & \pi^{u,k^*} A \le c^{k^*} && (3.39) \\
& \pi^{u,k^*} \ge 0 && (3.40)
\end{aligned}
\]

Problem 3.38 - 3.40 is precisely the dual of the second-stage problem for scenario k*:

\[
\begin{aligned}
\min \quad & c^{k^*} x^{k^*} && (3.41) \\
\text{s.t.} \quad & A x^{k^*} \ge b^{k^*} - B\bar{y} && (3.42) \\
& x^{k^*} \ge 0 && (3.43)
\end{aligned}
\]
This observation can be used to devise a solution strategy for 3.28 - 3.30. It starts by solving \(\min_{x^k \ge 0} \{ c^k x^k \mid A x^k \ge b^k - B\bar{y} \}\) for every scenario k. The optimal objective values are then compared, and the most expensive scenario is called k*. For all scenarios k ≠ k* the values π^{η,k} = 0 and π^{u,k} = 0 are assigned. For k*: π^{η,k*} = 1 and π^{u,k*} equals the vector of shadow prices of the constraints 3.42.
This yields the following Benders’ reformulation
\[
\begin{aligned}
\min \quad & dy + \zeta && (3.44) \\
\text{s.t.} \quad & \zeta \ge \pi^{u,k^*_p} (b^{k^*_p} - By) && \forall p \in \{1, \dots, P\} \quad (3.45) \\
& y \in Y && (3.46)
\end{aligned}
\]

where k^*_p denotes the most expensive scenario found in iteration p.
scenario set are explicitly generated, the solution method of the previous section could be applied, because the set of extreme points of the scenario set is discrete and finite. Unfortunately this is impractical. Therefore the solution to problem 2.17 - 2.22 has to be used. An optimal solution π*, g* suffices to generate an optimality constraint ζ ≥ π*(u* − By) for the Benders' master problem.
\[
\begin{aligned}
\min_{(y,z,\zeta)} \quad & \sum_i (f_i y_i + a_i z_i) + \zeta && (3.47) \\
\text{s.t.} \quad & z_i \le K_i y_i && \forall i \quad (3.48) \\
& \sum_i z_i \ge u_{max} && (3.49) \\
& y_i \in \{0,1\}, \; z_i \ge 0 && (3.50)
\end{aligned}
\]
In the starting master problem variable ζ is unconstrained, which leads to
an unbounded problem. This can be prevented by choosing ζ = 0 or removing
ζ from the master and introducing it the first time an optimality constraint is
added. Be careful to ignore the LB found by optimizing the master problem
with ζ = 0 because it could be too high.
The K subproblems (one for every scenario k) based on an (optimal) solution
(y ∗ , z ∗ ) of the master problem are
\[
\begin{aligned}
\min \quad & \sum_{ij} c_{ij} x_{ij}^k && (3.51) \\
\text{s.t.} \quad & -\sum_j x_{ij}^k \ge -z_i^* && \forall i \quad (3.52) \\
& \sum_i x_{ij}^k \ge u_j^k && \forall j \quad (3.53) \\
& x_{ij}^k \ge 0 && (3.54)
\end{aligned}
\]
Optimizing the subproblem for every scenario k yields a dual solution π^{sup,k*}, π^{dem,k*} for the most expensive scenario k*, where π^{sup,k*} are the dual values of the supply constraints 3.52 and π^{dem,k*} are the dual values of the demand constraints 3.53. This gives Benders' extreme point constraint

\[
\zeta - \sum_j \pi_j^{dem,k^*} u_j^{k^*} + \sum_i \pi_i^{sup,k^*} z_i \ge 0
\]
that can be added to the master. So after T iterations the master problem is
given by
\[
\begin{aligned}
\min_{(y,z,\zeta)} \quad & \sum_i (f_i y_i + a_i z_i) + \zeta && (3.55) \\
\text{s.t.} \quad & z_i \le K_i y_i && \forall i \quad (3.56) \\
& \sum_i z_i \ge u_{max} && (3.57) \\
& \zeta - \sum_j \pi_j^{(dem,k^*),t} u_j^{k^*} + \sum_i \pi_i^{(sup,k^*),t} z_i \ge 0 && \forall t \in \{1, \dots, T\} \quad (3.58)
\end{aligned}
\]
\[
\begin{aligned}
\max \quad & \sum_j u_j \pi_j^{dem} + \sum_j \bar{u}_j \omega_j - \sum_i z_i^* \pi_i^{sup} && (3.60) \\
& \omega_j \le \pi_j^{dem} && \forall j \quad (3.63) \\
& \omega_j \le M g_j && \forall j \quad (3.64) \\
& \pi_j^{dem}, \; \pi_i^{sup}, \; \omega_j \ge 0, \quad g_j \in \{0, 1\} && (3.65)
\end{aligned}
\]
M is some sufficiently large number. According to [13] it can be set to the value π_j^{dem,*}, where π_j^{dem,*} is the value of π_j^{dem} in the optimal solution of the subproblem with Γ equal to the number of customers, so that all demands are at their maximum value. In that case g_j = 1 for all j and ω_j = π_j^{dem}, so these values can be found by solving

\[
\max \; \sum_j (u_j + \bar{u}_j) \pi_j^{dem} - \sum_i z_i^* \pi_i^{sup} \qquad (3.66)
\]
3.4 Algorithmic Enhancements
Benders’ decomposition can work well for some problems but solving numerous
(increasingly large) MIPs can hurt performance. A number of ways of overcom-
ing this problem exist. Below are two ways of improving Benders’ decomposi-
tion that can be used separately or combined. The 2-phase method described
in 3.4.1 and the method of using incumbent solutions (3.4.2) are ways to utilize
the observation that optimal subproblem solutions based on non-optimal master
solutions also provide valid constraints. This can be used to reduce the number
of times the master MIP has to be solved.
the second phase the integer constraints are reinstated and the master problem
is solved.
The time spent in the first phase can be chosen in a number of ways:
Method 1 leads to the longest possible first phase: once the first phase has converged, no new constraints can be found using this method. Method 2 is straightforward: a number of first-phase iterations is chosen in advance, after which the algorithm reapplies the integer constraints to the master variables that were relaxed in the first phase. Method 3 compares the value of the ζ-variable in the master problem with the objective value of the optimized subproblem; once these are deemed close enough, the algorithm continues to the second phase.
It is hard to say in advance which method of stopping the first phase gives
the biggest performance boost (if any). The best implementation of this 2-phase
algorithm can only be determined experimentally, also because it is likely to be
problem dependent.
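The three stopping rules could be sketched as simple predicates (function names and thresholds below are hypothetical, chosen only for illustration):

```python
def stop_method_1(prev_lb, lb, tol=1e-9):
    # Method 1: stop once the relaxed master has converged (the LB stalls).
    return abs(lb - prev_lb) < tol

def stop_method_2(iteration, max_first_phase_iters=5):
    # Method 2: stop after a fixed, pre-chosen number of first-phase iterations.
    return iteration >= max_first_phase_iters

def stop_method_3(zeta, sub_objective, rel_tol=1e-2):
    # Method 3: stop once the master's zeta and the subproblem objective
    # are close enough in relative terms.
    return abs(sub_objective - zeta) <= rel_tol * max(1.0, abs(sub_objective))

print(stop_method_2(5), stop_method_3(99.5, 100.0))   # -> True True
```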
Chapter 4
A rather novel way of solving robust optimization problems, called Column & Constraint Generation (C&CG), was first presented in [24]. C&CG is a decomposition strategy that uses a master and subproblem framework similar to Benders' decomposition. The main difference between Benders' decomposition and
ders’ decomposition. The main difference between Benders’ decomposition and
C&CG is that Benders’ is a general solving procedure that can be applied to a
wide range of mixed integer programming problems while C&CG is specifically
suited to solving robust optimization problems.
Despite being quite new there are already a number of applications for
C&CG. It has been mostly used in robust optimization problems concerning
power grids ([1], [14], [15], [23], [25]) but it also seems to work well for facility
location problems ([2], [24]).
The name Column & Constraint Generation is based on the fact that the
algorithm iteratively adds variables (columns) and constraints to the problem
until the optimal solution is found. The variables that are added correspond to
second stage decision variables and the constraints are from scenarios that are
added to the problem. The scenarios are selected on basis of being the worst
case at some point in the optimization.
\[
\min_{y, \eta} \; dy + \eta \qquad (4.1)
\]
In the problem formulation above the vector y consists of the first stage
variables and the vector xk is made up of the second stage variables for scenario
k. ck is the cost vector for the second stage variables in scenario k.
The main idea of C&CG is iteratively adding scenarios and the corresponding
variables until an optimal solution is found. The optimal solution is found when
the lower bound (LB) and upper bound (UB) maintained by the algorithm are
equal, so when LB = U B.
The LB is based on the idea that the objective value of the optimal solution
for a restricted set of scenarios is never worse than the objective value of the
optimal solution for the complete set of scenarios. Let U be the complete set
of scenarios for problem (4.1)-(4.4) and let U 0 ⊆ U be some restricted scenario
set, then the objective value of the optimal solution for the problem with a
restricted scenario set is a lower bound for the original problem if the subproblem
is bounded and has a feasible solution. Suppose that y ∗ , x1,∗ , . . . , xK,∗ is an
optimal solution for the problem with scenario set U , then y ∗ can be used as a
partial solution for the problem with scenario set U 0 that has an objective value
that is never larger than the objective value for this first stage solution with
scenario set U . Therefore, the optimal solution of the restricted problem leads
to a lower bound for the complete problem.
The UB is found by solving the restricted problem and using the resulting optimal first stage vector y′* to solve the second stage problem for all scenarios. This leads to a solution (y′*, x′^1, . . . , x′^K) that is feasible for the problem with the complete scenario set U, and the objective value of this solution is therefore an upper bound on the objective value of the optimal solution with scenario set U.
The above leads to the C&CG algorithm where the master problem is the
original problem with a restricted set of scenarios and the subproblem is opti-
mizing the second stage variables xk for all scenarios separately given the first
stage decision y found by optimizing the master problem. The scenario k′ that has the highest second stage costs c^{k′}x^{k′} is then added to the master problem and the master problem is solved again. This continues until UB = LB.
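The C&CG loop can be sketched on a hypothetical toy problem of the same flavor as before (made-up data; enumeration replaces the MIP-solver):

```python
# Toy: min_y y + eta, second-stage cost 2 * max(0, b_k - y) for demands b_k.
SCENARIOS = [3, 5]          # the full scenario set U
C2 = 2                      # second-stage unit cost

def second_stage(y, b):
    return C2 * max(0, b - y)

def solve_master(active):
    # brute-force master over the restricted scenario set `active`
    best = None
    for y in range(10):
        eta = max([0] + [second_stage(y, b) for b in active])
        if best is None or y + eta < best[0]:
            best = (y + eta, y)
    return best             # (lower bound, y)

active, ub = [], float("inf")
while True:
    lb, y = solve_master(active)
    # subproblem: evaluate every scenario for this first-stage decision
    worst = max(SCENARIOS, key=lambda b: second_stage(y, b))
    ub = min(ub, y + second_stage(y, worst))
    if ub - lb < 1e-9:
        break
    active.append(worst)    # add the worst scenario: new columns + constraints

print(y, ub)   # -> 5 5
```

Note the structural contrast with the Benders' sketch: instead of adding a dual cut, each iteration enlarges the master with a whole scenario and its second-stage variables.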
of time. Therefore, scenarios will be generated by solving problem 2.17 - 2.22.
Solving this problem also results in finding the most expensive scenario.
\[
\begin{aligned}
\min_{(y,z,\eta)} \quad & \sum_i (f_i y_i + a_i z_i) + \eta && (4.5) \\
\text{s.t.} \quad & z_i \le K_i y_i && \forall i \quad (4.6) \\
& \sum_i z_i \ge u_{max} && (4.7) \\
& \sum_j x_{ij}^t \le z_i && \forall i, t \quad (4.8) \\
& \sum_i x_{ij}^t \ge u_j^t && \forall j, t \quad (4.9) \\
& \eta - \sum_{ij} c_{ij} x_{ij}^t \ge 0 && \forall t \quad (4.10)
\end{aligned}
\]
\[
\begin{aligned}
\min \quad & \sum_{ij} c_{ij} x_{ij}^k && (4.12) \\
\text{s.t.} \quad & -\sum_j x_{ij}^k \ge -z_i^* && \forall i \quad (4.13) \\
& \sum_i x_{ij}^k \ge u_j^k && \forall j \quad (4.14) \\
& x_{ij}^k \ge 0 && (4.15)
\end{aligned}
\]
Chapter 5
Computational Results
\[
\begin{aligned}
\min_{(y,z,x)} \quad & \sum_i (f_i y_i + a_i z_i) + \max_{u \in U} \min_x \sum_{ij} c_{ij} x_{ij} && (5.1) \\
\text{s.t.} \quad & z_i \le K_i y_i && (5.2) \\
& \sum_j x_{ij} \le z_i && (5.3) \\
& \sum_i x_{ij} \ge u_j && (5.4) \\
& y_i \in \{0,1\}, \; z_i \ge 0, \; x_{ij} \ge 0 && (5.5)
\end{aligned}
\]
drawn from [100, 1000], variable facility costs per unit ai from [10, 100] and
maximal allowable capacity Ki from [200, 700]. Transportation costs cij are in
the interval [1, 1000].
To ensure feasibility, the inequality \(\sum_i K_i \ge \max_{u \in U} \sum_j u_j\) has to be respected. Neither [24] nor [13] makes clear how this is taken care of during generation, so instances that violate the feasibility constraint are simply discarded: an instance is generated, its feasibility is checked, and only if it is feasible is it entered into the set of test instances.
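A sketch of such a generator with rejection sampling follows. The cost and capacity intervals match the text; the demand intervals `base`/`dev` are assumptions for illustration only, since the text does not state how demands are drawn.

```python
import random

def generate_instance(n_fac, n_cust, gamma, rng):
    # keep sampling until sum_i K_i >= the worst-case total demand
    while True:
        f = [rng.uniform(100, 1000) for _ in range(n_fac)]   # fixed facility costs
        a = [rng.uniform(10, 100) for _ in range(n_fac)]     # variable facility costs
        K = [rng.uniform(200, 700) for _ in range(n_fac)]    # maximal capacities
        c = [[rng.uniform(1, 1000) for _ in range(n_cust)]
             for _ in range(n_fac)]                          # transportation costs
        base = [rng.uniform(10, 50) for _ in range(n_cust)]  # assumed base demands
        dev = [rng.uniform(10, 50) for _ in range(n_cust)]   # assumed deviations
        # worst-case total demand: all base plus the gamma largest deviations
        umax = sum(base) + sum(sorted(dev, reverse=True)[:gamma])
        if sum(K) >= umax:                                   # rejection sampling
            return f, a, K, c, base, dev, umax

rng = random.Random(0)
f, a, K, c, base, dev, umax = generate_instance(5, 8, 3, rng)
print(sum(K) >= umax)   # -> True by construction
```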
The scenarios for the version of the problem with discrete uncertainty sets were generated based on the corresponding polyhedral uncertainty sets. To generate a scenario, Γ customers are chosen to have their maximum demand in that scenario; all other customers have base demand. All discrete problems have 100 scenarios.
All algorithms are implemented in C# and executed on a computer with an
Intel Core2Duo 2.20 GHz processor.
5.2 Results
To evaluate the performance of the various optimization algorithms, 10 instances of size 30x30 were generated. Each of these problems was extended with 100 random scenarios for every value of Γ. The values of Γ are 3, 6, 9, 12, 15, 18, 21, 24 and 27.
Benders’ Decomposition
Three variations of Benders decomposition are compared. A classic implementa-
tion (denoted by BenClass) that alternatingly solves the master and subproblem
to optimality until the algorithm converges, an implementation where the sub-
problem is solved every time the solver of the master problem encounters a
new incumbent solution (BenCB) and an implementation that implements the
2 Phase method (Ben2Phase). The 2 Phase method was implemented in such
a way that the relaxed master problem was solved for 5 iterations before the
standard Benders' algorithm took over. As can be seen in Table 5.2.1, the classic implementation outperforms BenCB and is comparable to Ben2Phase. This can
be explained by the fact that both enhancement methods are aimed at reduc-
ing the burden caused by repeatedly solving the master problem to optimality.
Apparently, the master problem is not such a huge bottleneck for this problem
that these methods provide a significant boost in performance.
Γ                       3      6      9      12     15     18     21     24     27
BenClass    time(s)     33.5   69.3   63.5   58.5   57.6   51.7   48.0   45.0   17.6
            iterations  67.4   58.7   55.1   53     52.5   48.8   47.7   45.7   45.4
Ben2Phase   time(s)     36.4   69.5   24.0   48.2   31.5   19.1   42.1   39.1   38.7
            iterations  64.2   58     52     46     43.2   44.3   42     40     40.2
BenCB       time(s)     133.9  105.4  106.5  123.4  68.2   89.1   102.9  87.7   84.0
            iterations  653.3  517.3  506.7  464.4  427.6  433.7  383.3  339.7  370.5
Gurobi MIP-solver
The Gurobi MIP-solver provides a wide range of settings that can be used to
tune its performance. All were left in their default setting, except for one: the
optimality gap. The optimality gap is used to determine when the algorithm
can stop optimizing. When the relative gap between the lower bound and the
upper bound is smaller than the optimality gap, the algorithm terminates. It
would be more elegant if the algorithm terminated when the upper and lower
bound are equal, but in practice this is not feasible, mainly due to rounding
errors: a rounding error could leave the upper bound slightly above the lower
bound even though the optimal solution has been found, causing a failure to
terminate. So the optimality gap is a necessary evil.
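The termination rule can be written down in a few lines. The gap formula (UB − LB)/UB used below is the simplified definition used later in this chapter, not necessarily the solver’s exact internal one.

```python
def relative_gap(lower, upper):
    """Relative optimality gap: (UB - LB) / UB."""
    return (upper - lower) / abs(upper)

def should_terminate(lower, upper, tol=1e-4):
    """Stop optimizing once the relative gap is within the tolerance."""
    return relative_gap(lower, upper) <= tol

# Bounds that are equal up to a rounding error still trigger termination
# under the default tolerance of 1e-4:
print(should_terminate(99.999999, 100.0))  # True
```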
The standard optimality gap in Gurobi is 10−4. There is no fundamental
reason for the gap to have this value. A value of 10−4 still allows for
termination before an optimal solution is certain to be found, as can be seen
in the results. Therefore it is justified to see how this parameter affects
algorithmic performance. A comparison is made with the solution times obtained
when setting the optimality gap to 10−2.
The results are quite surprising. Solution times are much better for the larger
optimality gap. The concern is that the solutions obtained with the larger gap
could be considerably worse, since 10−2 is a hundred times larger than 10−4.
This seems to be the case when looking at the results in Table 5.2.1: when the
allowed gap is 10−4 the solver terminates with a provably optimal solution for
a considerable number of instances, and the average final optimality gap is
roughly 100 times worse when the allowed gap is 10−2, as would be expected.
Γ                                   3      6      9     12     15     18     21     24     27
MIP (gap = 10−4)  time (s)        37.8   68.1   70.0   71.8   58.4   49.2   48.5   44.2   52.8
                  exact           5/10   3/10   4/10   6/10   5/10   7/10   3/10   2/10   2/10
                  avg gap (∗10−5)  2.29   4.40   3.68   1.77   3.00   1.09   4.14   4.89   4.83
MIP (gap = 10−2)  time (s)         5.0    9.1    9.2    9.3    9.2    9.0    9.3    9.4    7.3
                  exact           0/10   0/10   0/10   0/10   0/10   0/10   0/10   0/10   0/10
                  avg gap (∗10−5)   379    362    310    298    266    259    247    220    214
The actual difference between the solutions found by the two settings is much
smaller. The average relative difference (see Table 5.2.1) is less than 1 in
1000, whereas, based on the difference between the allowed optimality gaps, it
could have been almost 1 percent. The relatively small difference can be
explained by the way the optimality gap is calculated: it is simply (UB −
LB)/UB, so a larger optimality gap does not necessarily mean that the current
solution is bad. A larger gap can also be caused by a lower bound that is less
tight, which seems to be the case for this problem. This shows that performance
can be dramatically increased while the obtained solution is, on average, less
than 1 in 1000 worse.
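The point that a weaker lower bound, rather than a worse incumbent, can account for the larger reported gap is easy to illustrate with hypothetical numbers:

```python
def relative_gap(lower, upper):
    # Gap definition used in this chapter: (UB - LB) / UB.
    return (upper - lower) / upper

# Hypothetical: two runs end with the same incumbent of 1000 but with
# lower bounds of different quality.
print(relative_gap(999.9, 1000.0))  # about 1e-4: reads as (near) optimal
print(relative_gap(990.0, 1000.0))  # 0.01: a 100x larger gap, same solution
```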
Γ                     3     6     9    12    15    18    21    24    27
avg diff. (∗10−5)   44.5  69.6  37.1  39.5  37.9  31.1  22.5   2.0   1.9

Table 5.3: Relative difference between the solutions obtained by the Gurobi
MIP-solver with optimality gaps 10−2 and 10−4. The number for each value of
Γ is the average relative difference over ten instances. The relative difference
is calculated as (UB(10−2) − UB(10−4)) / UB(10−4).
Table 5.2.1 makes it clear that a large part of the difference in optimality gap
can be explained by a weaker lower bound. The Gurobi MIP-solver could therefore
potentially be sped up by instructing it to spend more effort on improving the
lower bound. Fortunately, it provides this option: the parameter MIPFocus can
make the solver spend more resources on improving the lower bound. According to
Gurobi’s documentation: ”If you believe the solver is having no trouble finding
good quality solutions, and wish to focus more attention on proving optimality,
select MIPFocus=2.” The result can be found in Table 5.2.1. For this comparison
the allowed optimality gap was left at its default value of 10−4. The results
show that changing the focus of the solver has a positive influence on its
performance: the average time it takes to solve a problem decreases.
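In the gurobipy API, both settings used in these experiments are plain parameter assignments. The snippet below is a configuration sketch rather than the thesis code: it assumes a Gurobi installation, and the construction of the model itself is omitted.

```python
import gurobipy as gp

model = gp.Model("master")
# ... build variables, constraints and the objective here ...

model.Params.MIPGap = 1e-2   # widen the default 1e-4 optimality gap
model.Params.MIPFocus = 2    # spend more effort on proving optimality
model.optimize()
```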
Γ                                      3      6      9     12     15     18     21     24     27
MIP (default)        time (s)        37.8   68.1   70.0   71.8   58.4   49.2   48.5   44.2   52.8
                     exact           5/10   3/10   4/10   6/10   5/10   7/10   3/10   2/10   2/10
                     avg gap (∗10−5)  2.29   4.40   3.68   1.77   3.00   1.09   4.14   4.89   4.83
MIP (MIPFocus = 2)   time (s)        40.0   13.2   20.9   18.3   19.1   20.9   14.4   20.2   16.0
                     exact           5/10   4/10   4/10   5/10   1/10   4/10   3/10   1/10   3/10
                     avg gap (∗10−5)  3.44   3.47   2.44   4.03   4.18   4.23   2.82   4.80   1.88

Table 5.4: Results for different focus settings of the Gurobi MIP-solver, discrete
uncertainty sets, 100 scenarios.
The experiments with a different optimality gap and a different focus show
that the performance of a solver can be influenced by changes in its settings.
These experiments were not aimed at finding the optimal settings for the Gurobi
MIP-solver, but they are an indication that this solver can be tuned and that
such tuning can have a large effect on its performance.
Both Benders’ decomposition and Column & Constraint Generation use a
MIP-solver and an LP-solver as subroutines. When comparing the performance
of these algorithms it is important to keep in mind that solver performance can
be greatly affected by changes in settings.
Γ                              3     6     9    12    15    18    21    24    27
BenClass          time (s)   33.5  69.3  63.5  58.5  57.6  51.7  48.0  45.0  17.6
                  iterations 67.4  58.7  55.1  53    52.5  48.8  47.7  45.7  45.4
C&CG              time (s)    3.2   3.0   5.2   4.1   3.4   3.7   4.6   3.3   1.6
                  iterations  4.4   3.2   4.2   3.8   3.5   3.6   4.1   3.3   3.2
MIP (gap = 10−2)  time (s)    5.0   9.1   9.2   9.3   9.2   9.0   9.3   9.4   7.3
advantage over Benders’ decomposition. It should be noted that adding a sce-
nario to the master problem when performing Column & Constraint Generation
involves adding 900 continuous variables and 60 constraints to the master.
During Benders’ decomposition no variables are added to the master problem. The
advantage of C&CG stems from its much smaller number of iterations, while its
master problem does not become slow enough for the added size to be a
disadvantage. That the master problem can be expanded with so many variables
and constraints without its performance being crippled is an interesting
observation.
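The growth figures quoted above are easy to reproduce: for a 30x30 instance, each added scenario contributes one 30 × 30 block of transport variables plus one supply and one demand constraint per node. A small helper (names are illustrative, not from the thesis code):

```python
def ccg_master_growth(m, n, scenarios):
    """Variables and constraints added to the C&CG master problem after a
    number of scenario additions: m*n continuous transport variables and
    m + n supply/demand constraints per scenario."""
    return scenarios * m * n, scenarios * (m + n)

print(ccg_master_growth(30, 30, 1))  # (900, 60), the figures quoted above
print(ccg_master_growth(30, 30, 5))  # (4500, 300) after five iterations
```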
Γ                        3        6        9        12        15        18        21        24      27
BenClass  time (s)    249.44  5536.17  2620.42  3718.39∗  3698.02∗  3706.67∗  1083.17∗  1867.93∗  20.45
          iterations    65      59.5     34.5     34.5      22.5      18.5      33.5      5770     27
C&CG      time (s)     10.42   101.43   509.25   611.17    477.70    230.56     97.75     14.40    3.17
          iterations     4.7      4.8      6.5      6.5       5.9       5.6       5.4       4       3.4
Chapter 6
Interpretation of Results
Generation performs better than Benders’ decomposition for the instances tested,
using the code and computer system of the author. To claim a general result,
one would have to prove the following:
In the next three sections, the challenges accompanying these three points
will be explained. After that, some possible ways to mitigate these effects will
be discussed.
6.1 Challenges
6.1.1 Problem Selection
It is unclear how the performance of a solver on a number of problem instances
should be elevated to a general result for some problem. The generation of
problems for this report was done using the description in [24], which is based
on [13]. The parameters used to generate problem instances seem to be picked at
random, or at least they are not thoroughly justified. Such a justification could
for example be that these problems are a very good representation of a general
problem or that they are close to a problem that is relevant in the real world. For
example, when solver performance for the recourse problem is analysed in section
4.1 of [13], the number of customers is larger than the number of warehouses ‘to
be closer to reality’. However, when the full problem is solved, the number of
customers equals the number of warehouses. This way of generating problems
is not wrong in any way but it will not lead to a general result, simply because
it is unclear how representative the sample is for what population.
A question that arises when thinking about the problem selection is the
required number of tested instances. No attempt is made to link the sample
size to the reliability of the result. So something that could be given more
attention in papers like [24] and [13] is which set of problems was tested and
how the algorithmic performance figures translate to a larger group of problems.
Another possible issue arising from the lack of a transparent way of selecting
test problems is the possibility of cherry-picking results. If authors are free
to choose the problem instances on which they test their algorithms, there is
the possibility of trying an algorithm on a set of instances and only publishing
the results for the subset of instances on which it performed well. It is hard
to say how often this is done, but the major problem is that it is impossible to
check, because everyone can select their problem instances without having to
justify their choices.
6.1.2 Code and Algorithmic Choices
An algorithm like Benders’ decomposition is not so much one algorithm as it is a
family of algorithms. This can be seen in this report, for example, in the fact
that Benders’ can be implemented with the 2 Phase method or with a technique
that uses incumbent solutions found during the solving of the master problem to
generate constraints. The Gurobi MIP-solver that was used for the problem with
a discrete uncertainty set has a wide range of settings that can be changed,
one of which, the optimality gap, affected performance considerably. Moreover,
both Benders’ decomposition and C&CG employ MIP- and LP-solvers to solve the
master problem and subproblem respectively. These solvers can be configured in
many ways that affect performance.
Besides algorithmic choices that influence performance, there is also the way
in which an algorithm is implemented that affects how well it does. On top of
that, the programming language in which the algorithm was written can have an
effect on its speed.
Two specific implementations of an algorithm that are given the same name
by their respective authors could have differences in performance that are caused
by different algorithmic choices and differences in implementation that are not
immediately clear to the reader of an academic paper.
A fundamental problem with the way results are usually presented is that it is
unclear how much an algorithm was tuned to perform optimally on the problem
instances it is tested on. This could lead to a problem equivalent to the
statistical notion of overfitting: an algorithm that was tuned to perform
optimally on the test instances might not perform very well on instances it was
not initially tested on.
6.2 Possible Solution
The best way to deal with these issues is transparency. Transparency comes
in many different shades and it is not the objective of this paper to present a
ready-made solution, but it could be beneficial to future scientific endeavors to
think about how results are made public.
To be able to judge the results published in an academic paper it is important
that they can be reproduced. This is by definition impossible if the problem
instances are not available in some form. If the problem instances are generated
in some random and reproducible way the experiment can be repeated, but there
is still no way to repeat the experiments of the original publication. This could
be dealt with by demanding that the set of problem instances is published in
an easily accessible way along with the article. Some kind of public repository
is an option for this. An initiative like miplib.zib.de is a start, but as of
now it only contains 361 problem instances spread out over a large number of
different problems.
However, due to the proprietary nature of some data, not all problem instances
can be made public. Depositing such a problem at some kind of neutral third
party that is able to run an algorithm for an interested researcher could be a
solution in these cases.
Some kind of database that shows what kind of algorithms have been applied
to some problem would speed up the way in which research is done. This
would provide an overview over the large amount of papers that are produced.
An example from medical science shows what this could look like. The
International Committee of Medical Journal Editors requires that all medical
trials are registered with clinicaltrials.gov as a prerequisite for publication
of trial results.
Such a database does not completely prevent the possibility of cherry picking
but it will at least result in a nice overview of what has been tried and also allows
for negative results to be made public without being published in a journal
article.
The greatest opportunity lies in the sharing of code. If code is required to be
made public along with an article presenting results, doubts about code qual-
ity and algorithmic choices can be easily addressed. The availability of code
together with the publication of problem instances leads to simple reproduc-
tion and verification of results. Testing how well the results translate to other
instances is made easier.
A comprehensive open source optimization library would be even better.
Open source projects can be very successful, see for example the Python pro-
gramming language, the R project for statistical computing or the Linux operat-
ing system. If such a library existed for optimization, it would make applying
different algorithms to problems a lot easier. Scientists would no longer have
to write their own implementations of algorithms to be able to test them.
Comparison of performance would also be made easier if an algorithm is available
in an open source library, so there are no doubts about the way it was
implemented.
Of course, computer code can also be deemed proprietary in some cases.
Here, a neutral third party could also serve the purpose of making experiments
reproducible without making everything public.
Overfitting of an algorithm on a set of problem instances can be dealt with
in two ways. The simplest one is applying the algorithm to problem instances
that it was not initially tested on. This could be done after a result is published,
but it can also be incorporated in the development of the algorithm by splitting
the available problem instances into a training set and a test set. Similarly to
the way in which this scheme is applied in machine learning, the algorithm can
be tuned on the training set and its performance judged on the test set.
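Such a split can be organized in a few lines; the instance names below are placeholders.

```python
import random

def split_instances(instances, test_fraction=0.3, seed=0):
    """Partition problem instances into a tuning (training) set and a
    held-out test set, mirroring the machine-learning scheme."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(instances)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * (1 - test_fraction)))
    return shuffled[:cut], shuffled[cut:]

train, test = split_instances([f"instance_{i}" for i in range(10)])
# Tune algorithm settings on `train` only; judge performance on `test`.
print(len(train), len(test))  # 7 3
```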
Addressing the issue of performance variability is the toughest nut to crack.
This issue will always exist in some form unless all computational experiments
are run within the same (virtual) computing system. However, such a stan-
dardized system would be one among many and it is hard to see how it could
be designed to allow maximal crossover between the results of experiments and
applications in the real world. That does not mean performance variability
is something that can just be ignored when publishing results. More research
could shed light on what algorithms are particularly affected by it and how the
effects could be alleviated.
Chapter 7
Conclusion
Bibliography
[11] K. Dhamdhere, V. Goyal, R. Ravi, and M. Singh. “How to pay, come what
may: Approximation algorithms for demand-robust covering problems”.
In: 46th Annual IEEE Symposium on Foundations of Computer Science
(FOCS’05). IEEE. 2005, pp. 367–376.
[12] M. Fischetti, A. Lodi, M. Monaci, D. Salvagnin, and A. Tramontani. “Im-
proving branch-and-cut performance by random sampling”. In: Mathe-
matical Programming Computation 8.1 (2016), pp. 113–132.
[13] V. Gabrel, M. Lacroix, C. Murat, and N. Remli. “Robust location trans-
portation problems under uncertain demands”. In: Discrete Applied Math-
ematics 164 (2014), pp. 100–111.
[14] R. A. Jabr, I. Džafić, and B. C. Pal. “Robust optimization of storage
investment on transmission networks”. In: IEEE Transactions on Power
Systems 30.1 (2015), pp. 531–539.
[15] C. Lee, C. Liu, S. Mehrotra, and M. Shahidehpour. “Modeling transmis-
sion line constraints in two-stage robust unit commitment problem”. In:
IEEE Transactions on Power Systems 29.3 (2014), pp. 1221–1231.
[16] C. Liebchen, M. Lübbecke, R. Möhring, and S. Stiller. “The concept of
recoverable robustness, linear programming recovery, and railway appli-
cations”. In: Robust and online large-scale optimization. Springer, 2009,
pp. 1–27.
[17] T. L. Magnanti and R. T. Wong. “Accelerating Benders decomposition:
Algorithmic enhancement and model selection criteria”. In: Operations
Research 29.3 (1981), pp. 464–484.
[18] D. McDaniel and M. Devine. “A modified Benders’ partitioning algorithm
for mixed integer programming”. In: Management Science 24.3 (1977),
pp. 312–319.
[19] A. Mercier, J. F. Cordeau, and F. Soumis. “A computational study of Ben-
ders decomposition for the integrated aircraft routing and crew scheduling
problem”. In: Computers & Operations Research 32.6 (2005), pp. 1451–
1476.
[20] R. H. Pearce and M. Forbes. “Disaggregated Benders Decomposition for
solving a Network Maintenance Scheduling Problem”. In: arXiv preprint
arXiv:1603.02378 (2016).
[21] T. Santoso, S. Ahmed, M. Goetschalckx, and A. Shapiro. “A stochas-
tic programming approach for supply chain network design under un-
certainty”. In: European Journal of Operational Research 167.1 (2005),
pp. 96–115.
[22] L. Tang, W. Jiang, and G. K. Saharidis. “An improved Benders decom-
position algorithm for the logistics facility location problem with capacity
expansions”. In: Annals of Operations Research 210.1 (2013), pp. 165–190.
[23] W. Wei, F. Liu, S. Mei, and Y. Hou. “Robust energy and reserve
dispatch under variable renewable generation”. In: IEEE Transactions on
Smart Grid 6.1 (2015), pp. 369–380.
[24] B. Zeng and L. Zhao. “Solving two-stage robust optimization problems us-
ing a column-and-constraint generation method”. In: Operations Research
Letters 41.5 (2013), pp. 457–461.
[25] M. Zugno and A. J. Conejo. “A robust optimization approach to energy
and reserve dispatch in electricity markets”. In: European Journal of Op-
erational Research 247.2 (2015), pp. 659–671.