
System optimum vs.
user equilibrium
in static and dynamic traffic routing
by
Jon Marius Venstad
Thesis
for the degree of
Master of Mathematics
(Master i Matematikk)
Faculty of Mathematics and Natural Sciences

University of Oslo
June 2007
Contents
1 Introduction
1.1 Transportation planning
1.2 This text
2 Background theory
2.1 Graphs
2.1.1 Walks and paths
2.1.2 Trees
2.1.3 Distance, weighted graphs
2.1.4 Capacity, flows
2.2 Optimization
2.2.1 Convex sets, cones
2.2.2 Halfspaces and polyhedra
2.2.3 Linear Programming
2.2.4 The dual problem
2.2.5 Convex optimization
2.2.6 Complexity
3 The traffic routing problem
3.1 Traffic model
3.2 Static system optimum
3.3 Static user equilibrium
3.4 Dynamic system optimum
3.5 Dynamic user equilibrium
4 Existing work and solution algorithms
4.1 Distance and shortest paths
4.2 Weighted shortest path
4.3 The Simplex algorithm
4.3.1 Basic idea
4.3.2 Basic and non-basic variables
4.3.3 Correctness and complexity
4.4 Matrix notation, bases
4.5 Network flow
4.5.1 Multi-commodity flows
4.5.2 Maximum flows
5 Analysis
5.1 Special cases and simplifications
5.1.1 Number of commodities
5.1.2 Special latency functions
5.1.3 Simplified networks
5.1.4 Alternative optimality criteria
5.2 System optimal vs. user equilibrium
5.3 The dynamic case, simplified latency model
5.3.1 System optimal planning
5.3.2 Existence and uniqueness of the system optimal solution
5.3.3 The time discrete graph
5.3.4 System optimal planning by use of the augmented graph
5.3.5 Variable preferred arrival time
5.3.6 Properties of the augmented graph
5.3.7 Examples of the augmented graph
5.3.8 Multi-commodity planning
5.4 Chain decomposable flows
5.4.1 System optimal solution with chain flows
6 Summary
7 Appendix
1 Introduction
Optimization is a large discipline of mathematics that, loosely speaking,
concerns finding the best solution to some given problem. Numerous branches
exist within the field of optimization, such as linear optimization, con-
vex optimization, and integer optimization. Typically an optimization
problem has a function to minimize or maximize over a given domain,
and the problem might be easy or hard to solve, depending on both the
objective function and the feasible domain. Some problems can be
solved in a time that is polynomial in a measure of the size of
the problem, while for other problems no algorithm faster than
exponential in the size of the problem is known, or can exist.
When solving a problem by hand any big problem can become almost
impossible to solve because of its sheer size, but with the aid of fast
computers mathematicians today can solve bigger and bigger problems.
In particular more and more real world problems can be solved to opti-
mality with the emerging possibilities.
1.1 Transportation planning

One of the branches of real world problems that certainly benefits from
optimization theory is transportation planning. In countless scenarios
one or more kinds of commodities are to be transported between differ-
ent locations, and it is often desirable to find the best way of doing this.
What defines the best way can be different from problem to problem,
but some examples are:
• Routing traffic through a city, with as little congestion as possible.
• Routing internet traffic, with as little delay as possible.
• Routing containers between harbors, while transporting empty ones
as short distances as possible.
Now in the case of routing traffic through a city, another possible
criterion to consider could be minimizing the total travel time of all the
commuters, and yet another could be minimizing the difference between
desired and actual departure and arrival times for all commuters. And
of course a combination of any of these criteria could be used. This is
then the quantity we wish to minimize in our problem, and a function to
compute this quantity is required for any mathematical optimization to
be done. The mathematical model of our problem is what allows us to do
this calculation; it is a mathematical representation of our problem, and
as such has a way of representing the traffic throughout the network
in a precise and quantitative way. We can then use the mathematical
model to check which different traffic routings are the best with regard
to our choice of minimization goal. It is very important that the chosen
mathematical model has properties that resemble those of the original
problem. Often finding a good mathematical model is not very hard,
but finding one that is not too complex for efficient optimization to take
place might be harder. Typically this kind of transportation problem
is regarded as a network problem where the network is represented as
a graph, and each edge in the graph has a cost associated to it that
depends on the amount of traffic flowing along it.
Another interesting viewpoint in traffic problems is that of each com-
muter, assuming the users of the network behave according to the ego-
istic goal of minimizing their own travel time in the network. This is by
many considered the situation that will occur in a real world traffic net-
work, and the "solution" we get from this approach can differ from the
solution to the similar optimization problem of e.g. least travel time. In-
terestingly the user approach yields a solution that is often much worse
in terms of total travel time. Bridging the gap between these two solu-
tions to the traffic flow problem might, at least for the environmentalists,
be of great interest.
1.2 This text

In this thesis we will have a look at the problem of optimally routing
traffic through a network that does not have enough capacity for all
the traffic to follow the same, fastest route. I will also try to compare
the optimal solution to the user solution that is assumed to occur if
no measures are taken to direct the traffic. In order to do this I will
need a mathematical model for both problems, which might be slightly
different from one another. The models used to represent the networks
will in both cases be directed graphs with cost functions along each of
the arcs. In addition I will need optimization theory to find the optimal
solution to the given problems. In the simplest case we can use linear
optimization, but might need other areas for the general case.
The outline of the thesis will be as follows: I will give the basic termi-
nology of the text in section 2, and then go on to describe our problem
and different varieties of it in section 3. Section 4 will be used to exam-
ine theoretical results that may be applied to our problems. In section 5
I will conduct my own analysis of the problems, using the theory from
the previous section, and section 6 will contain a short discussion of
what I have achieved in the thesis.
2 Background theory
This section will contain an overview of terminology and concepts used
in the rest of the text. This will be from the fields of graph theory and
optimization (in particular linear optimization).
2.1 Graphs
A graph is a structure used to describe how different entities are related
to each other, through the means of representing each entity and each
relation by nodes and edges, respectively. More precisely an undirected
graph (or just graph) G consists of a set of nodes V and a set E of pairs
of these nodes. Each edge e = (v1 , v2 ) or v1 v2 represents a connection
between these two nodes. The nodes are said to be the endpoints of e,
and v1 and v2 are said to be adjacent. The edge is also incident to each
of the nodes, and vice versa. Two edges are adjacent if they are incident
to a common node.
If we demand that the set of pairs of nodes be a set of ordered pairs,
we obtain a directed graph, or digraph for short. It is then also common
to name the nodes vertices and the edges arcs instead, and the set of
arcs is called A instead of E. In this case an arc a = v1 v2 represents a
connection from v1 to v2 , but not the other way. These nodes are called
the source and target of a respectively. Note that removing the direction
of the arcs in a directed graph simply yields an undirected graph often
referred to as the underlying (undirected) graph. When using the term
graph without qualification we might mean undirected or directed graph,
based on the context.
Throughout this section I will give definitions for undirected graphs,
with supplements for the directed case where needed.
The degree of a node is the number of edges incident to it. For a
vertex we distinguish the indegree, which is the number of arcs entering
the vertex, and the outdegree, which is the number of arcs leaving the
vertex.
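As a small illustration of these definitions, the following Python sketch stores a digraph as a list of arcs and counts the indegree and outdegree of each vertex. The example graph and the function name degrees are hypothetical, chosen only for illustration; they are not taken from the figures of this text.

```python
def degrees(vertices, arcs):
    """Return (indegree, outdegree) dictionaries for a digraph.

    `arcs` is a list of ordered pairs (source, target).
    """
    indeg = {v: 0 for v in vertices}
    outdeg = {v: 0 for v in vertices}
    for source, target in arcs:
        outdeg[source] += 1  # the arc leaves its source
        indeg[target] += 1   # the arc enters its target
    return indeg, outdeg


# Hypothetical example digraph.
vertices = ["s", "a", "b", "t"]
arcs = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]
indeg, outdeg = degrees(vertices, arcs)

# In the underlying undirected graph the degree of a node is the sum
# of its indegree and outdegree.
degree = {v: indeg[v] + outdeg[v] for v in vertices}
```

Here the vertex s has outdegree 2 and indegree 0, while t has indegree 2 and outdegree 0.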
2.1.1 Walks and paths
A walk in a graph is an alternating sequence of nodes and edges
(v0, v0v1, v1, . . . , vn−1vn, vn) starting and ending with a node, such that
for each edge in the walk the preceding and succeeding nodes are the
endpoints of that edge. For a digraph they must be the source and target
of the arc, respectively. A walk is closed if it begins and ends with the
same node, and open if not.
Figure 1: An example graph

A walk in which each node is present only once (except possibly the
first and last nodes, which may be the same) is called simple. Simple
walks often occur naturally as solutions to various problems. Think
for instance of the problem of finding a shortest path through a graph.
Intuitively we may believe this to always be a simple walk, and this will
indeed be proven true later in the text (at least for graphs where such
a path exists).
A walk that is both open and simple is called a path, whereas a closed
and simple walk is called a cycle. A cycle with only one edge is called a
loop. A (directed) graph containing no (directed) cycles is called acyclic.
A (directed) graph is (strongly) connected if for any node there exists a
path to any other node. Each maximal connected subgraph H = (U, F ),
where U ⊂ V and F consists of the edges in E with both endpoints in U,
is called a component of G. A directed graph
where for any pair of vertices there exists a path from one of the vertices
to the other is weakly connected.
All graphs in this text are henceforth assumed to be connected, unless
otherwise stated.
2.1.2 Trees
Contained in the set of all possible graphs are several interesting subsets
or classes of graphs. Among the most important of these are trees. A
tree is a connected graph with no cycles. This, however, implies a few
other properties that the trees must have.
Theorem 2.1 For a graph G = (V , E) the following are equivalent:
a) G is connected and acyclic.
b) G is connected, and |V | = |E| + 1.
c) Between any pair of nodes in G there exists exactly one path.
Figure 2: A tree with an s − t-path highlighted
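Characterization b) of Theorem 2.1 gives a simple computational test for trees. The following Python sketch checks connectivity with a depth-first search and then compares the number of nodes and edges. The function names are hypothetical, and the sketch assumes an undirected graph given as a list of nodes and a list of node pairs.

```python
def is_connected(vertices, edges):
    """Check connectivity of an undirected graph by depth-first search."""
    vertices = list(vertices)
    if not vertices:
        return False  # assumption: the empty graph is not counted as connected
    adjacent = {v: set() for v in vertices}
    for u, v in edges:
        adjacent[u].add(v)
        adjacent[v].add(u)
    seen = {vertices[0]}
    stack = [vertices[0]]
    while stack:
        u = stack.pop()
        for w in adjacent[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(vertices)


def is_tree(vertices, edges):
    """Theorem 2.1 b): G is a tree iff it is connected and |V| = |E| + 1."""
    return is_connected(vertices, edges) and len(list(vertices)) == len(edges) + 1
```

For example, a triangle together with an isolated fourth node satisfies |V| = |E| + 1 but is not connected, so it is not a tree.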
2.1.3 Distance, weighted graphs
The notion of distance comes to mind when thinking of a traffic network,
be it distance in terms of travel time or in terms of spatial distance.
There is also a corresponding definition of distance in graphs.
Definition 2.1 The length of a walk is equal to the number of edges in it.
This leads us to the following definition of distance.
Definition 2.2 The distance between two nodes is equal to the length of a
shortest walk between them. If no such walk exists the distance is defined
to be ∞.
This may be used to define a metric d in any undirected graph. Distances
are non-negative by definition, and the remaining requirements hold as
follows.
i d(v1, v2) = 0 ⇐⇒ v1 = v2
Follows from the definition.
ii d(v1, v2) = d(v2, v1)
Any walk from v1 to v2 is also a walk from v2 to v1 with the same
length, when reversed.
iii d(v1, v2) + d(v2, v3) ≥ d(v1, v3)
Any walk from v1 to v2 with length l1 can be combined with a walk
from v2 to v3 with length l2 to form a walk from v1 to v3 with
length l1 + l2.
In a directed graph this measure of distance is in general not a metric,
because the symmetry requirement fails; it only yields a metric on the
underlying undirected graph.
We will still define the distance between vertices v1 and v2 in a directed
graph as the length of a shortest path from v1 to v2 .
Expanding the definition of a graph to also include a function l : E → R
we get a weighted graph, where l(e) is the length (or weight) of edge e.
These weights are often assumed to be non-negative, i.e. l : E → R+, and
this will also be the case in this text. Now we can give another definition
of length and distance in a graph:
Definition 2.3 The (weighted) length of a walk is equal to the sum of the
lengths of the edges in it.
If we assume the length of each edge to be 1 we see that we recover the
first definition of length. From here on we will mean the weighted
length/distance whenever we say length/distance.
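To make the two notions of distance concrete, the following Python sketch computes weighted distances from one vertex in a directed graph with non-negative arc lengths. It uses Dijkstra's algorithm, a standard technique that I have chosen for this illustration; the function name distances and the example arc lengths are hypothetical.

```python
import heapq


def distances(lengths, source):
    """Weighted distances from `source` in a digraph.

    `lengths` maps each arc (u, v) to its non-negative length l(u, v).
    Unreachable vertices get distance infinity, as in Definition 2.2.
    """
    out_arcs = {source: []}
    for (u, v), length in lengths.items():
        out_arcs.setdefault(u, []).append((v, length))
        out_arcs.setdefault(v, [])
    dist = {v: float("inf") for v in out_arcs}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # outdated heap entry
        for v, length in out_arcs[u]:
            if d + length < dist[v]:
                dist[v] = d + length
                heapq.heappush(heap, (dist[v], v))
    return dist


# Hypothetical example lengths.
lengths = {("s", "a"): 2, ("s", "b"): 5, ("a", "b"): 1,
           ("a", "t"): 6, ("b", "t"): 3}
dist = distances(lengths, "s")                   # weighted distance
unit = distances({a: 1 for a in lengths}, "s")   # every arc has length 1
```

With these lengths the shortest route from s to t passes through a and b. Setting every length to 1 recovers the first definition of distance, where only the number of arcs counts.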
2.1.4 Capacity, flows
In a road network, the internet, or in any other real life network in which
commodities are transported there is some kind of limit to how much
can be moved around per unit of time. In a graph this capacity
constraint is easily added as another function c : E → R + where c(e)
denotes the maximum amount of commodity that can be moved along
the edge e in one time unit. We will, however, also be interested in the
direction of flow along each edge, and because of this we hereby switch
our attention over to the directed graphs for the rest of the text.
Figure 3: The distance from s to t is 6.
When working with flow in graphs we also have a function f : A → R+
where f (a) denotes the flow currently assigned to arc a. The cost of
assigning a flow f (a) along the arc a is
\[ f(a)\, l(a) \tag{1} \]
and the total cost of a flow f is then defined:
Definition 2.4 The cost of a flow f is
\[ \sum_{a \in A} f(a)\, l(a) \]
Thinking of a flow situation in which the picture is not altered over
time we realize that for each vertex the inflow and the outflow must be
equal. Now some nodes may have an innate supply of commodity, such
that the flow out into the graph from such a node is greater than
the flow into it. This is a node with a positive supply, and it is called a
source. In the opposite case the node is a sink, with a negative supply.
We define the supply function b : V → R such that for each source s we
have b(s) > 0, for each sink t we have b(t) < 0 and for all other nodes
v we have b(v) = 0. We can then characterize a balanced flow.
Definition 2.5 The flow f is balanced if for each vertex v ∈ V we have
\[ \sum_{a \in \delta^{out}(v)} f(a) = b(v) + \sum_{a \in \delta^{in}(v)} f(a) \tag{2} \]
where δout (v) denotes the leaving arcs of v and δin (v) denotes the
entering arcs.
This is also called the flow conservation property, and will be assumed
to hold for any flow f unless otherwise stated.
Definition 2.6 A flow that satisfies both the flow conservation property
and also the capacity constraint
\[ 0 \le f(a) \le c(a) \quad \forall a \in A \tag{3} \]
is called a feasible flow.
For a feasible flow we also talk about the value of the flow:
Definition 2.7 The value of an s − t-flow f is
\[ \sum_{a \in \delta^{out}(s)} f(a) = \sum_{a \in \delta^{in}(t)} f(a) \]
Figure 4: A graph with costs and capacities on each arc, and a feasible
flow in the same graph.
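The definitions of this subsection translate directly into a few checks. The following Python sketch tests flow conservation and the capacity constraint, and computes the cost and value of a flow. All function names and the numbers in the example are hypothetical and serve only as an illustration; they are not the data of Figure 4.

```python
def is_balanced(vertices, flow, supply, tol=1e-9):
    """Flow conservation: outflow = b(v) + inflow for every vertex v."""
    for v in vertices:
        outflow = sum(f for (u, w), f in flow.items() if u == v)
        inflow = sum(f for (u, w), f in flow.items() if w == v)
        if abs(outflow - (supply.get(v, 0) + inflow)) > tol:
            return False
    return True


def is_feasible(vertices, flow, supply, capacity, tol=1e-9):
    """Feasible flow: balanced and 0 <= f(a) <= c(a) on every arc."""
    within = all(-tol <= f <= capacity[a] + tol for a, f in flow.items())
    return within and is_balanced(vertices, flow, supply, tol)


def cost(flow, length):
    """Cost of a flow: sum of f(a) * l(a) over all arcs."""
    return sum(f * length[a] for a, f in flow.items())


def value(flow, s):
    """Value of an s-t-flow, measured as the flow leaving s."""
    return sum(f for (u, w), f in flow.items() if u == s)


# Hypothetical example data.
vertices = ["s", "a", "b", "t"]
length = {("s", "a"): 2, ("s", "b"): 5, ("a", "b"): 1,
          ("a", "t"): 6, ("b", "t"): 3}
capacity = {("s", "a"): 2, ("s", "b"): 1, ("a", "b"): 1,
            ("a", "t"): 2, ("b", "t"): 2}
supply = {"s": 2, "t": -2}  # all other vertices have supply 0
flow = {("s", "a"): 2, ("s", "b"): 0, ("a", "b"): 1,
        ("a", "t"): 1, ("b", "t"): 1}
```

In this example the flow sends two units from s to t, one along s, a, t and one along s, a, b, t, and it respects every capacity.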
2.2 Optimization
The most general form of an optimization problem may be written
max{f (x) : x ∈ D}
or
min{f (x) : x ∈ D}
where D is some domain called the feasible region. Multiplying f (x)
with −1 we see that the two problems are really the same. About such
a general problem there is not much to be said, and thus optimization
problems are divided into several categories according to the form of
both f and D:
• If both f and D are convex we have convex optimization
• If f is also linear and D is a polyhedron we have linear optimization
or linear programming
• If D is a finite set we have combinatorial optimization
• If D is also the integer points of a polyhedron we have integer
programming
We will in this section look at minimizing a linear function over a
polyhedron, which is then linear programming. Our aim is to see that all
polyhedra are a combination of polytopes and finitely generated cones,
and that a linear function over such a domain attains its minimum value
in a vertex of the domain, unless the minimum is unbounded.
2.2.1 Convex sets, cones
An important kind of set is the convex set.
Definition 2.8 A set C ⊂ R^n is convex if for any pair of points c, d ∈ C
and for any 0 ≤ λ ≤ 1 we have
λc + (1 − λ)d ∈ C
i.e. any convex combination of the two points is again in C.
Examples of convex sets are R^n, the n-dimensional real space, I^n, the
n-dimensional solid box, and D^n, the n-dimensional ball.
Similar to the convex set we have the convex cone.
Definition 2.9 A set C ⊂ R^n is a convex cone if for any x, y ∈ C we also
have
λx + µy ∈ C, λ, µ ≥ 0
i.e. any conical combination of the two points is again in C.
Note that the convex cone is also convex.

We define the intersection and sum of two sets X, Y .
Definition 2.10 The intersection of two sets X, Y is given by
X ∩ Y = {z : z ∈ X, z ∈ Y }
Definition 2.11 The sum of two sets X, Y is given by
X + Y = {x + y : x ∈ X, y ∈ Y }
We can easily verify that convex sets and convex cones are closed under
both intersection and sum.
A useful construction is the convex hull of a set X.
Definition 2.12 The convex hull conv.hull(X) of a set X is the intersection
of all convex sets containing X. Consequently it is the minimal convex set (with
regard to inclusion) containing X. If X is finite conv.hull(X) is a polytope.
Although this definition is rather abstract, it can be shown that the defi-
nition is equivalent to a more useful characterization.
Theorem 2.2 For a set X ⊂ R^n we have
\[ \operatorname{conv.hull}(X) = \Big\{ x : x = \lambda_1 x_1 + \dots + \lambda_m x_m,\ x_i \in X,\ \sum_i \lambda_i = 1,\ \lambda_i \ge 0 \Big\} \]
Moreover, each x ∈ conv.hull(X) can be expressed as a convex combination
of at most n + 1 points of X. If X is finite, the vertices of the polytope
conv.hull(X) form a subset X ′ ⊂ X.
Again we have the similar definition of the cone of a set X.
Definition 2.13 The cone cone(X) of a set X is the smallest convex cone
containing X. If X is finite cone(X) is finitely generated.
Again it can be shown that the definition is equivalent to a more useful
characterization.
Theorem 2.3 For a set X ⊂ R^n we have
\[ \operatorname{cone}(X) = \{ x : x = \lambda_1 x_1 + \dots + \lambda_m x_m,\ x_i \in X,\ \lambda_i \ge 0 \} \]
Moreover, each x ∈ cone(X) can be expressed as a conical combination of
at most n points of X.
2.2.2 Halfspaces and polyhedra
The polytopes and cones are closely related to another kind of convex
set, the polyhedron, which is the intersection of a finite number of half-
spaces.
Definition 2.14 A halfspace H ⊂ R^n is a subset of R^n such that there
exist a vector r ∈ R^n, r ≠ 0 and a real number δ ∈ R such that
H = {x : r^T x ≤ δ}
Since the scalar product distributes over vector addition in R^n, we see
immediately that all halfspaces are also closed convex sets.
Similar to the halfspace we have the hyperplane.
Definition 2.15 A hyperplane P ⊂ R^n is a subset of R^n such that there
exist a vector r ∈ R^n, r ≠ 0 and a real number c ∈ R such that
P = {x : r^T x = c}.
A hyperplane is an affine subspace, and in the case when
c = 0 a linear subspace.
Definition 2.16 A polyhedron P ⊂ R^n is an intersection of finitely many
halfspaces, i.e.
P = {x : Ax ≤ b}
where A is an m × n matrix that determines the m halfspaces that P is an
intersection of.
Since a halfspace is a closed, convex set, any intersection of halfspaces
is also a closed, convex set. So all polyhedra are closed and convex,
and intuitively they look very much like polytopes and cones. The
relation between the different kinds of sets is given by the following
theorem.
Theorem 2.4 Any polyhedron P is a sum of a polytope Q and a finitely
generated cone C, i.e.
P = Q + C
where Q and P have the same vertex set. If P is bounded it equals the
polytope Q (i.e. C = {0}).
For a proof refer to [1]. This means that everything that is true for
polytopes (or sums of polytopes and finitely generated cones) is also true
for bounded (or general) polyhedra. Due to the explicit definition of
the polytopes and finitely generated cones it is often easier to prove
properties of these than it is for the polyhedra with their implicit
definition.
The converse to this theorem is also true, but for us this theorem is of
most interest, as we will see in a moment.
2.2.3 Linear Programming
Misleading as the name may be, Linear Programming (LP) has little to do
with actual programming, but concerns rather the problem of finding the
maximum or minimum value of a linear function over a polyhedral domain.
To this end several algorithms have been developed over the years, and
among those that stand out are the Simplex algorithm and the interior
point methods. Formally we have the objective function (i.e. the function
to maximize or minimize; from here on we will assume minimization) f :
P → R, f (x) = c^T x, where P is the domain polyhedron P = {x : Ax ≤ b}.
Our problem is thus to find
min{c^T x : Ax ≤ b}
where A is the constraint matrix of the LP problem.
Intuitively the LP problem is very easy to solve, as the sets for which
the objective function has a constant value are hyperplanes. Thus solving
an LP problem is really just the same as moving this hyperplane
along its normal vector, decreasing the value of the objective function,
until it reaches the boundary of the convex domain. This intuition also
tells us that the minimum value of the objective function is attained
in at least one of the vertices of the domain, if at all. The objective
function might decrease along a direction in which the polyhedron is
unbounded. In this case we say that the LP problem is unbounded. The
other extreme case is when the polyhedron is empty, i.e. no solution
exists at all. This is an infeasible LP problem.
Of course finding the solution to an LP problem is not as easy as
intuition leads us to believe, but this is where the power of theorem
(2.4) can be used:
Theorem 2.5 For a feasible and bounded LP problem the optimal value
is always attained in a vertex of the domain polyhedron.
Proof. We start with the case of a bounded LP problem, and show that for a
nonempty polytope Q and a linear function f : Q → R the minimum of f is
attained in a vertex of Q. Let Q have vertices v1 , . . . , vn . Then any point
x ∈ Q can be written
\[ x = \lambda_1 v_1 + \dots + \lambda_n v_n, \quad \lambda_i \in [0, 1], \quad \sum_i \lambda_i = 1 \]
Now since f is linear, we get
\[ f(x) = f(\lambda_1 v_1 + \dots + \lambda_n v_n) = \lambda_1 f(v_1) + \dots + \lambda_n f(v_n) \ge \min\{f(v_1), \dots, f(v_n)\} \]
Since any bounded polyhedron is also a polytope, we then have for
bounded LP problems that the minimum is attained in a vertex. Now
we might have an LP problem where the domain is unbounded, but
the minimum value for the objective function might still exist and be
finite. Proving that the optimal value here is attained in a vertex as
well requires some extra details. Assume now that the polyhedron P
is unbounded and has at least one vertex, and that f has a bounded
minimum value that is attained in P . Now we know that P = Q + C
where C ≠ {0}. Assume that there exists a vector z ∈ C such that
f (z) = f (0) − d, d > 0. But then f (nz) = f (0) − nd for every positive
integer n, and since nz ∈ C, f is unbounded below on P , which contradicts
the assumption that the problem was bounded. Thus f (z) ≥ f (0) for all
z ∈ C, so f (x + y) ≥ f (x) for every point x + y ∈ P with x ∈ Q, y ∈ C,
and we may consider only the points with y = 0, that is, only the points
of the polytope Q. Now since we know
that the minimum value is attained in the polytope Q, we also know that
it is attained in a vertex of Q, which is again a vertex of P by Theorem 2.4.
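Theorem 2.5 can be illustrated by brute force in the plane. The following Python sketch finds the vertices of a bounded polyhedron {x : Ax ≤ b} in R^2 by intersecting every pair of constraint lines and keeping the feasible intersection points, and then minimizes the objective over these vertices only. This is merely an illustration under the assumption of a small, nonempty, bounded two-dimensional problem; it is not the Simplex algorithm, and the function names and example data are hypothetical.

```python
from itertools import combinations


def vertices_2d(A, b, tol=1e-9):
    """Vertices of {x in R^2 : Ax <= b}, found by pairwise intersection."""
    found = []
    for i, j in combinations(range(len(A)), 2):
        (a11, a12), (a21, a22) = A[i], A[j]
        det = a11 * a22 - a12 * a21
        if abs(det) < tol:
            continue  # parallel constraint lines never meet in one point
        # Solve the 2x2 system of the two tight constraints (Cramer's rule).
        x = (b[i] * a22 - a12 * b[j]) / det
        y = (a11 * b[j] - a21 * b[i]) / det
        # Keep the point only if it satisfies all constraints.
        if all(r[0] * x + r[1] * y <= rhs + tol for r, rhs in zip(A, b)):
            found.append((x, y))
    return found


def minimize_over_vertices(c, A, b):
    """Return a vertex minimizing c^T x (assumes a nonempty, bounded domain)."""
    return min(vertices_2d(A, b), key=lambda v: c[0] * v[0] + c[1] * v[1])


# Hypothetical example: x >= 0, y >= 0, x + y <= 4, x <= 3.
A = [(-1, 0), (0, -1), (1, 1), (1, 0)]
b = [0, 0, 4, 3]
c = (-1, -2)  # minimize -x - 2y
best = minimize_over_vertices(c, A, b)
```

The example domain has the four vertices (0, 0), (0, 4), (3, 0) and (3, 1), and the objective is smallest in the vertex (0, 4).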
2.2.4 The dual problem
An important theorem in linear programming concerns a problem related
to an original LP problem, called the dual problem. For an LP problem
min{c^T x : Ax ≤ b, x ≥ 0}
the dual problem is
max{b^T y : A^T y ≤ c, y ≤ 0}
Now the famous duality theorem states that if both the original and the
dual problems are feasible, then their optimal values are the same:
Theorem 2.6 (LP duality theorem) For a linear optimization problem and
its dual we have min{c^T x : Ax ≤ b, x ≥ 0} = max{b^T y : A^T y ≤ c, y ≤ 0}
if both problems are feasible.
For a proof refer to [5].

In the case that one of the problems is infeasible we can make use of
Farkas’ lemma to show that the other problem must be either unbounded
or infeasible. Not so much a lemma as a theorem, it states the following:
Theorem 2.7 (Farkas’ lemma) The system Ax = b has a nonnegative
solution if and only if there is no vector y satisfying y^T A ≥ 0, y^T b < 0.
For a proof refer to [1].
2.2.5 Convex optimization
Not to be confused with a convex set is a convex function.
Definition 2.17 A function f : D → R, defined on a convex set D, is convex
if for any x1 , x2 ∈ D and 0 < λ < 1 we have
f (λx1 + (1 − λ)x2 ) ≤ λf (x1 ) + (1 − λ)f (x2 )
The field of convex optimization concerns minimizing convex functions
(or, equivalently, maximizing concave functions) over convex domains.
Thus linear optimization is a subfield of convex optimization.
Several algorithms exist for solving convex optimization problems, and
among the most significant is the Frank-Wolfe algorithm. Briefly this
algorithm works as follows:
Start with an initial guess at the solution, x0 . Then approximate the
objective function with a linear approximation around xi , and solve the
resulting LP problem, obtaining the solution xi′ . Now use a convex
combination of xi and xi′ as xi+1 , and do a new approximation around xi+1 .
Terminate when some criterion is met, e.g. when the improvement in an
iteration is less than some ε.
The Frank-Wolfe algorithm does not give an exact solution, but rather an
approximation to the optimal solution. Unfortunately the improvements
made in each iteration of the algorithm decrease rapidly, and obtaining
an accurate approximation might require a large number of iterations,
each consisting of constructing and solving a linear optimization problem.
Nevertheless the algorithm is often used for convex optimization
problems.
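The steps of the Frank-Wolfe algorithm can be sketched in a few lines of Python for one particularly simple domain. I assume here, purely for illustration, that the domain is the set of vectors with non-negative entries summing to 1. Over this polytope the LP in each iteration is solved by the unit vector belonging to the smallest entry of the gradient, so no LP solver is needed. The step size 2/(k + 2) and the stopping rule based on the linearized improvement are common choices that I have picked for the sketch, and the function name is hypothetical.

```python
def frank_wolfe_simplex(gradient, x0, max_iter=1000, eps=1e-6):
    """Frank-Wolfe iterations over {x : x >= 0, sum(x) = 1}."""
    x = list(x0)
    n = len(x)
    for k in range(max_iter):
        g = gradient(x)
        # LP step: a linear function over this domain is minimized in the
        # vertex e_j, where j is the index of the smallest gradient entry.
        j = min(range(n), key=lambda i: g[i])
        # Improvement promised by the linear approximation.
        gap = sum(g[i] * x[i] for i in range(n)) - g[j]
        if gap < eps:
            break
        step = 2.0 / (k + 2)
        # New point: convex combination of x and the vertex e_j.
        x = [(1.0 - step) * xi for xi in x]
        x[j] += step
    return x


# Hypothetical example: minimize the squared distance to a target point
# that itself lies in the domain, so the optimum is the target.
target = [0.2, 0.3, 0.5]
gradient = lambda x: [2.0 * (xi - ti) for xi, ti in zip(x, target)]
approx = frank_wolfe_simplex(gradient, [1.0, 0.0, 0.0])
```

Every iterate stays in the domain, since it is a convex combination of points in the domain. The result is only an approximation of the target, in line with the slow convergence described above.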
2.2.6 Complexity
The complexity of an algorithm refers to roughly how many operations
need to be performed for the algorithm to terminate. The complexity of
a problem equals the smallest complexity of a solution algorithm.
Complexity is used as a measure of how long a problem will take
to solve, based on some measure of problem size. For instance we can
consider the problem of finding max(S) for some finite set S of integers.
A fastest algorithm that solves this problem is
max = S(1)
for (i = 2; i <= |S|; i++)
if (S(i) > max) max = S(i)
return max
If |S| = n this algorithm performs roughly n loops, of which roughly
log2 (n) contain four operations, and the rest contain three. In total this
is 3n + log2 (n) operations. However keeping exact track of the number
of operations is not of very much interest for the average algorithm
theoretician. When n grows large it is clear that 3n dominates log2 (n),
so our algorithm has a running time which is roughly 3n. Again constant
factors are not really significant. When comparing 3n to n^2 it is clear that
for large enough values of n the extra factor n dominates the factor 3,
and so it is customary to also strip all constants. Thus we end up with a
running time roughly proportional to n for our algorithm, or we say that
it has complexity O(n), of the order of n. And the problem of finding
the largest of n integers then also has complexity O(n).
Note that if we made the assumption that the integers in S were sorted,
a fastest algorithm would simply be
return S(|S|)
which clearly has complexity O(1).
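The two cases can be compared in a runnable form. The following Python sketch performs the linear scan while counting the comparisons it makes, and also gives the constant time variant for sorted input. The function names are hypothetical, and the input is assumed to be a nonempty list.

```python
def find_max(values):
    """Linear scan; returns the maximum and the number of comparisons."""
    largest = values[0]
    comparisons = 0
    for value in values[1:]:
        comparisons += 1
        if value > largest:
            largest = value
    return largest, comparisons


def find_max_sorted(sorted_values):
    """For input sorted in increasing order the maximum is the last element."""
    return sorted_values[-1]
```

For a list of n integers the scan always makes n − 1 comparisons, whatever the order of the input, which is the linear growth behind the O(n) estimate.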

We usually consider the algorithms that have complexity O(n^m) for
some fixed m, the polynomial algorithms, to be good. And of course
the lower the exponent m, the better. If the complexity of an algorithm
is O(e^n) we say that the time is exponential, which is bad. For many
problems only exponential algorithms are known, and these are ones we’d
rather not solve precisely. In these cases approximation algorithms with
polynomial complexity may often be used to find an approximate solution of
the problem instead.
3 The traffic routing problem
In fact there is no single traffic routing problem, but rather
a large number of related problems. In the introduction the following
examples were given:
• Routing traffic through a city, with as little congestion as possible.
• Routing internet traffic, with as little delay as possible.
• Routing containers between harbors, while transporting empty ones
as short distances as possible.
There was also a mention of the situation in which traffic is not directed
in any way, and that this could lead to another traffic routing. The
listed situations are examples of system optimal routing where we try
to minimize some measure of badness. The second situation, on the other
hand, would be a user equilibrium where the behavior of the users
decides the solution. Both scenarios can be studied using static network
flow formulations.
In addition one might add the dimension of time, leading to some slightly
harder problems. An example of this could be to route a given amount
of traffic through a network over time, and in such a way that the total
travel time of all the traffic was minimized. This would typically involve
avoiding congestion, and would be calculating the dynamic system op-
timal routing. On the other hand one could also study how this traffic
would route itself if no interference was done, and thus obtain the
dynamic user equilibrium. Much effort is also being put into researching
the relation between the system optimum and the user equilibrium, as
finding a way to use e.g. proper taxing to obtain a user equilibrium that
equals the system optimal solution would be valuable in a lot of real world
scenarios.
3.1 Traffic model

In order to bring the real world problem of examining traffic in road
networks over to the mathematical workbench, we need a model that
represents the original problem. This subsection will draw some out-
lines for such models.
We always represent the network in question as a directed graph where
traffic flows along the arcs of the graph. Now the two most important
factors that characterize a stretch of road in the real world are, at least
for me, the speed at which traffic flows and the amount of traffic that
there is room for along the road. Typically these two are inversely
proportional, something we see if we assume that each car desires to keep a
certain amount of time to the car in front. Then the distance between cars
increases proportionally with the speed, and the number of cars there
is room for per unit length decreases. Let’s say that a sensible amount
of time to have between two cars is τ. In a sense this determines an
absolute capacity on this road, where the inflow rate of traffic cannot
be greater than 1/τ cars per time unit for each lane of the road. But in
fact there is another mechanism that causes slowdown before this limit
is reached: When the concentration of cars is rather high, and one car
brakes, the one behind it will also have to brake immediately to retain
the τ time distance to the car in front. But the second car probably does
not react instantly to the car in front, and thus has to brake more than
the first car in order to stay far enough behind. Now the third car,
behind the second one, will have to brake even more, and so on.
This all leads us to think of the travel time along a stretch of road as
a non-decreasing function of traffic concentration. And of course the
actual length of the road also factors into the travel time as one would
expect.
So we represent the network as a digraph where each arc a has a latency
function $l_a(x_a)$, $l_a : \mathbb{R}_+ \to \mathbb{R}_+$, dependent on the flow $x_a$ assigned to the
arc. Note that this is the function we use as the length of each arc when
considering the network only as a graph. This latency function is usually
assumed to be convex and nondecreasing, and so we will assume here.
In addition each arc may have a capacity constraint $c(a)$, $c : A \to \mathbb{R}_+$,
but when the latency function is increasing, the latency itself may also
play the role of a capacity in limiting the amount of flow assigned to the arc.
In the dynamic case we cannot use the simple network flow model
anymore, but must expand or change the traffic model to describe the
added time dimension. In this case our latency functions are often much
more complex, consisting of differential equations or the like, to
accommodate queues and variable latency situations. However this will not
be the main focus of this text, and when examining the dynamic
problems I will assume the latency functions to be of a rather simple kind
that allows for only a small expansion of the network flow model.
Each traffic agent must have an origin and a destination, but since we
are not treating the commuters individually we use sources and sinks of
continuous flow. Since it is significant where the flows run from and to,
we have to treat travelers originating from a vertex s and destined to a
vertex t differently from other travelers, that is, we need to distinguish
s−t-flows from all other flows for each pair (s, t). To this end we
introduce one commodity for each origin-destination pair, and index these
with i ∈ I. Thus we have |I| flow functions $f_i(a)$, $f_i : A \to \mathbb{R}_+$, and the
total flow along arc a is
\[ x_a = \sum_{i \in I} f_i(a) \]
Everyone knows how boring it is to be stuck in traffic jams. In fact
most people would agree that spending time at home or at work is
preferable to spending it in a car or on a bus. So we assume
that all traffic agents are interested in minimizing their time spent in
traffic; their travel time. And of course the travel time experienced by
one traffic agent is equal to the sum of the travel times along each road
of the agent’s route, from origin to destination, or in other words the
length of the path (or route) the traveller takes in the network.
In the dynamic cases another assumption we make is that each of our
commuters has a certain time at which they wish to arrive at their
destination (and possibly also a desired departure time). When too many
have the same desired arrival time the roads get crowded at some time
intervals, causing congestion and delays. In this case each commodity
has its own departure cost function $g(t)$, $g : \mathbb{R} \to \mathbb{R}_+$, and arrival cost
function $h(t)$, $h : \mathbb{R} \to \mathbb{R}_+$. Both of these are assumed to be convex, and
to avoid that spending time in traffic is preferable to spending time at
home or at work we also require that $h'(t) \ge -1$ and $g'(t) \le 1$ for all t.
To avoid congested traffic situations we can consider routing traffic
along alternative paths to relieve the most heavily used roads. Another
possibility is to hurry or delay departures such that not everyone enters
the network at the same time. But how do we route traffic? The general
assumption is that each traffic agent does exactly what’s best for them-
selves, i.e. totally selfish behavior. And as we shall see this may cause
much more congestion than what is indeed necessary! Finding a way
to make the selfish user equilibrium and the altruistic system optimal
solution coincide will be the ultimate goal of this analysis.
3.2 Static system optimum

The first problem we will look at is finding the system optimal routing
in a static setting. The motivation for this problem is that we wish to
route a static flow - perhaps the peak traffic causing the usual jams in
cities during the morning and afternoon hours - through a given traffic
network. The data we are given is the network itself, represented
as a graph, where each arc has a travel time function (or latency
function) dependent on the flow along it. The arcs may also have capacity
constraints. In addition we have a set of origin-destination pairs, each
with a flow of a certain magnitude that needs to flow between them.
Each of these pairs corresponds to one commodity.
What we wish to do is minimize the total travel time of all commuters.
The total travel time of all traffic is given by
\[ \sum_{a \in A} x_a l_a(x_a) \tag{4} \]
where $x_a$ is the total flow along arc a and $l_a$ is the latency function giving
the travel time along the arc as a function of traffic flow. Now this flow
is a composition of flows between several origin-destination pairs. Let
these pairs be indexed by the set I, and let $f_i(a)$ denote the amount of
flow of commodity i along arc a, such that
\[ x_a = \sum_{i \in I} f_i(a) \]
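We also require flow conservation for each of the flows $f_i$. If $b_i(v)$ is the
supply of commodity i at vertex v we can express these constraints as
\[ \sum_{a \in \delta_{out}(v)} f_i(a) = b_i(v) + \sum_{a \in \delta_{in}(v)} f_i(a) \quad \forall v \in V,\ i \in I \tag{5} \]
where $\delta_{out}(v)$ and $\delta_{in}(v)$ are the arcs leaving and entering v.
In addition we have non-negativity constraints on the flows:
\[ f_i(a) \ge 0 \quad \forall a \in A,\ i \in I \tag{6} \]
and capacity constraints along each arc:
\[ \sum_{i \in I} f_i(a) \le c(a) \quad \forall a \in A \tag{7} \]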
This is quite a problem to solve, but if we study it more closely we
see that all our constraints are linear equalities or inequalities. In other
words our feasible domain is a polyhedron. Since each latency function
is convex and nondecreasing on the non-negative reals, each term
$x_a l_a(x_a)$ of (4) is convex, and we see that our problem fulfills the
criteria for being a convex
optimization problem! Thus we know one way of solving it, although a
rather general and possibly not very efficient way. In an attempt to gain
some more insight into this problem, I will look at some special cases of
it later, in section 5.
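To make the convex program (4)-(7) a little more concrete, here is a minimal sketch in Python. The network is a hypothetical one chosen only for illustration and is not taken from the text: a single commodity with demand 1 and two parallel arcs with latencies l1(x) = 1 and l2(x) = x, without capacity constraints. Flow conservation leaves the flow on the second arc as the only free variable, and since the objective is convex a simple ternary search is enough. The function names are assumptions made for this sketch.

```python
def total_travel_time(x2, demand, l1, l2):
    """Objective (4) for two parallel arcs: x1*l1(x1) + x2*l2(x2)."""
    x1 = demand - x2  # flow conservation (5) for a single commodity
    return x1 * l1(x1) + x2 * l2(x2)


def system_optimum(demand, l1, l2, tol=1e-10):
    """Ternary search for the minimizer of the convex objective on [0, demand]."""
    lo, hi = 0.0, demand
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if total_travel_time(m1, demand, l1, l2) < total_travel_time(m2, demand, l1, l2):
            hi = m2  # the minimizer cannot lie to the right of m2
        else:
            lo = m1  # the minimizer cannot lie to the left of m1
    x2 = (lo + hi) / 2
    return x2, total_travel_time(x2, demand, l1, l2)


# Hypothetical example network (an assumption, not from the text).
x2_opt, cost_opt = system_optimum(1.0, lambda x: 1.0, lambda x: x)
print(x2_opt, cost_opt)
```

On this small network the objective is (1 − x2) + x2², so the search should settle near x2 = 1/2 with total travel time 3/4.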
3.3 Static user equilibrium
The second problem I will consider is finding the user equilibrium rout-
ing in a static setting. The motivation for this problem is to calculate
what flow we will actually get in a given network if we let the users de-
termine the flow of traffic. Another aim is to find some characterizations
of these user equilibria, which may then be used in making a system op-
timal solution become a user equilibrium by taxation. The data we are
given is exactly the same as in the system optimal problem above.
How to solve this problem is not intuitively easy, but we can start with
the famous principle of Wardrop:
Postulate 3.1 (Wardrop’s first principle) The journey times in all utilized
routes are equal, and equal to or less than those which would be experi-
enced by a single vehicle on any unused route.
An equivalent formulation, viewing each traffic agent as a player, is that
the user equilibrium is a Nash equilibrium:
Postulate 3.2 In a user equilibrium no traffic agents can improve their
travel times by unilaterally changing routes.
This definition cannot be used directly to calculate the user
equilibrium, as the number of players is too great. We don’t even treat them
individually in our flow model. Luckily we can formulate an optimization
problem that gives us the user equilibrium! Consider the function
\[ \sum_{a \in A} \int_0^{x_a} l_a(x)\,dx \tag{8} \]
where
\[ x_a = \sum_{i \in I} f_i(a) \]
We want to show that this function actually is minimal exactly when the
postulates above hold. Let vertices s, t be the source and sink of $f_i$ for
some i ∈ I, let P, Q be two s−t-paths where Q carries a positive amount
of flow of commodity i, and assume the cost of P is less
than the cost of Q. Now let $A_P$ be the arcs in P that are not also in Q,
and similarly for $A_Q$. Then
\[ \sum_{a \in A_P} l_a(x_a) < \sum_{a \in A_Q} l_a(x_a) \]
and shifting a flow of magnitude δx from Q to P changes the value of
(8) by
\[ \sum_{a \in A_P} \int_{x_a}^{x_a + \delta x} l_a(x)\,dx - \sum_{a \in A_Q} \int_{x_a - \delta x}^{x_a} l_a(x)\,dx \tag{9} \]
which is negative for sufficiently small δx, since each $l_a$ is non-decreasing
(and, being convex, continuous).
Thus all paths that carry flow between an origin-destination pair have
equal cost when (8) is minimal, and no unused path is cheaper. And we can
formulate the problem of finding the user
equilibrium as minimizing (8) subject to the same constraints (5 - 7) as
the system optimal problem.
Again the assumption that each latency function is convex (or just
non-decreasing, in fact) makes this a convex optimization problem, solv-
able by known algorithms. I will also look at special cases of this prob-
lem later, as well as compare the user equilibrium flow and the system
optimal flow in the same network.
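As a small illustration of minimizing (8), the following Python sketch computes the user equilibrium on a hypothetical network that is not taken from the text: one commodity with demand 1 and two parallel arcs with latencies l1(x) = 1 and l2(x) = x, without capacity constraints. The antiderivatives L1, L2 of the latencies are written out by hand, and the ternary search is just one convenient way of minimizing a convex function of one variable; all names are assumptions made for this sketch.

```python
def minimize_convex(g, lo, hi, tol=1e-10):
    """Ternary search for a minimizer of a convex function g on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


demand = 1.0
l1, l2 = (lambda x: 1.0), (lambda x: x)          # hypothetical latencies
L1, L2 = (lambda x: x), (lambda x: 0.5 * x * x)  # their integrals from 0


def potential(x2):
    """Objective (8) for two parallel arcs, with x1 = demand - x2."""
    return L1(demand - x2) + L2(x2)


x2_ue = minimize_convex(potential, 0.0, demand)
x1_ue = demand - x2_ue
ue_cost = x1_ue * l1(x1_ue) + x2_ue * l2(x2_ue)  # total travel time (4)
print(x2_ue, ue_cost)
```

Here all traffic should end up on the second arc, where the latency equals that of the unused first arc, in agreement with Wardrop's principle. The total travel time (4) of this equilibrium is 1, whereas minimizing (4) directly on the same network gives 3/4, so the two solutions need not coincide.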
3.4 Dynamic system optimum

The third and by far the hardest problem is that of finding a system
optimal solution to a flow that changes over time. The motivation here
is that we have a certain amount of traffic we want to route through a
given network, as opposed to a flow of a certain magnitude. Now all the
traffic cannot be moved at the same time due to capacity constraints,
so we must spread it out over a period of time. This period of time is
assumed to be t ∈ [0, T]. For the commuters this means the extra cost of
departing and arriving at less than optimal times. We expect the system
optimal solution to respect this, by finding a routing of the traffic that
minimizes both time spent in traffic and deviations from the preferred
departure and arrival times.
For the traffic planner it means we are no longer dealing with flows and
capacities along the arcs as real numbers, but as real valued functions
fi (a, t) of arcs and time. Thus we have a vastly larger domain to opti-
mize over.
In addition we must be able to calculate the travel time along each arc at
any given time τ, and we need a latency model for doing this. Now this
calculation is dependent not only on the inflow at t = τ, but also on the
inflow to the arc at all times t < τ. We assume that inflow at t > τ does
not influence the travel time at t = τ, and this is called causality of our
latency model. This ensures that we can in fact calculate the travel time
at t = τ if we know the inflow and travel times up to this point of time.
It should also be impossible to arrive at an earlier time by choosing the
same route, but departing later. This encompasses the FIFO, or queue,
principle; the First In are the First Out. Lastly we require that the total
outflow from each arc must equal the total inflow to that arc, which is the
conservation of traffic. We end up with inflow functions $f_i^{in}(a, t)$, $x_a^{in}(t)$,
outflow functions $f_i^{out}(a, t)$, $x_a^{out}(t)$ and latency functions $l_a(x_a^{in}, t)$ for
each arc a at time t, where again
\[ x_a^{in}(t) = \sum_{i \in I} f_i^{in}(a, t) \]
and similarly for $x_a^{out}$, $f_i^{out}$. The relation between $x_a^{out}$ and $x_a^{in}$ needed
to satisfy the conservation of traffic is given by
\[ x_a^{out}(t + l_a(x_a^{in}, t)) = x_a^{in}(t) \, \frac{1}{1 + \frac{\partial}{\partial t} l_a(x_a^{in}, t)} \tag{10} \]
and similarly for $f_i^{in}(a, t)$, $f_i^{out}(a, t + l_a(x_a^{in}, t))$. This is obtained by
differentiating the integral of inflow up to time t and outflow up to time
$t + l_a(x_a^{in}, t)$, which must be equal.
The FIFO principle directly translates as
\[ l_a(x_a^{in}, t) \le l_a(x_a^{in}, t + \Delta t) + \Delta t \quad \text{for all } \Delta t \ge 0 \]
which implies
\[ \frac{\partial}{\partial t} l_a(x_a^{in}, t) \ge -1 \]
And in addition we assume that whenever $x_a^{in}(t) > 0$ we have
\[ \frac{\partial}{\partial t} l_a(x_a^{in}, t) > -1 \]
These are all properties that need to be satisfied by our latency model,
and the models that are usable in this sense range from very simple to
very complex. The model I choose later in the text is quite simple.
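The FIFO requirement can be checked numerically for a given travel time profile. The sketch below is only an illustration under simplifying assumptions: the inflow is taken as fixed, so that the latency of one arc is a function of the entry time t alone, and the two linear profiles are hypothetical examples. It verifies on a grid that the arrival time t + l(t) never decreases, which is the statement that departing later never leads to an earlier arrival.

```python
def satisfies_fifo(latency, t_start, t_end, steps=1000):
    """Check on a uniform grid that the arrival time t + latency(t) never decreases."""
    dt = (t_end - t_start) / steps
    previous_arrival = t_start + latency(t_start)
    for k in range(1, steps + 1):
        t = t_start + k * dt
        arrival = t + latency(t)
        if arrival < previous_arrival:
            return False  # someone entering later would arrive earlier
        previous_arrival = arrival
    return True


# Hypothetical travel time profiles on the time interval [0, 2].
fifo_ok = satisfies_fifo(lambda t: 2.0 - 0.5 * t, 0.0, 2.0)   # slope -0.5 >= -1
fifo_bad = satisfies_fifo(lambda t: 4.0 - 2.0 * t, 0.0, 2.0)  # slope -2 < -1
print(fifo_ok, fifo_bad)
```

A grid check like this can of course only detect violations between grid points, not prove that none exist.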
We are ready to state the dynamic system optimal traffic routing
problem:
The total travel time of all traffic agents is given by
\[ \sum_{a \in A} \int_0^T x_a^{in}(t)\, l_a(x_a^{in}, t)\,dt \tag{11} \]
Assuming each commodity has a common departure cost function $g_i(t)$
and arrival cost function $h_i(t)$, denoting the source and sink of
commodity i by $s_i$, $t_i$, and letting
\[ b_i(v, t) = \sum_{a \in \delta_{out}(v)} f_i^{in}(a, t) - \sum_{a \in \delta_{in}(v)} f_i^{out}(a, t) \]
denote the difference in outflow and inflow of commodity i at vertex v
at time t, the total departure and arrival time deviation cost is
\[ \sum_{i \in I} \int_0^T \bigl( b_i(s_i, t)\, g_i(t) - b_i(t_i, t)\, h_i(t) \bigr)\,dt \tag{12} \]
Thus the minimization in the dynamic system optimality problem is
\[ \min \; \sum_{a \in A} \int_0^T x_a^{in}(t)\, l_a(x_a^{in}, t)\,dt + \sum_{i \in I} \int_0^T \bigl( b_i(s_i, t)\, g_i(t) - b_i(t_i, t)\, h_i(t) \bigr)\,dt \tag{13} \]
such that the below constraints all hold.
The flow balance constraints, when allowing excess traffic to remain
temporarily at each vertex, are for the vertices that are neither the
source nor the sink of commodity i
\[ \int_0^t b_i(v, \tau)\,d\tau \le 0 \tag{14} \]
or written out
\[ \int_0^t \sum_{a \in \delta_{out}(v)} f_i^{in}(a, \tau)\,d\tau \le \int_0^t \sum_{a \in \delta_{in}(v)} f_i^{out}(a, \tau)\,d\tau \tag{15} \]
with the inequality replaced by an equality at time t = T. If we do
not allow excess traffic to remain at internal vertices, we replace the
inequality with an equality at all times. For the source vertex $s_i$ the
inequality is relaxed by adding the total supply $b_i$ of commodity i on the
right hand side
\[ \int_0^t b_i(s_i, \tau)\,d\tau \le b_i \]
and for the sink $t_i$ the equality at time t = T is
\[ \int_0^T b_i(t_i, \tau)\,d\tau = -b_i \]
Note that the supplies at sources and sinks are not given explicitly as
functions of time, but are consequences of flow balance and total supply
at the terminal time t = T.
In addition the capacity constraints
\[ x_a^{in}(t) \le c(a, t) \tag{16} \]
and the non-negativity constraints
\[ f_i(a, t) \ge 0 \tag{17} \]
apply as usual.
In this case we are very far from having solved the problem, even
though we have formulated it precisely. The unknowns are no longer
vectors of real numbers, but functions from R+ into R+. This is a problem since the
optimization methods we have mentioned will no longer be applicable.
In addition the latency functions $l_a(x_a^{in}, t)$ are functions of $x_a^{in}$, which
are themselves functions of t, and the functions $l_a$ may not at all be
simple; they may not even have a closed-form expression.
3.5 Dynamic user equilibrium

Finding the dynamic user equilibrium will not be treated as a separate
problem here, but I will look at some characterizations of dynamic user
equilibrium flows in the analysis concerning the system optimal case.
The reason for this is twofold: The problem is rather hard, and I am
not as interested in finding the user equilibrium as I am in finding the
system optimal solution. What I am interested in is rather conditions
that ensure a flow is a user equilibrium.
4 Existing work and solution algorithms
As expected both optimization and graph theory in general, and traffic
planning specifically, have received lots of attention through the years,
and the number of articles on the latter is vast. This section contains
the theory and algorithms I have found useful for solving the traffic
assignment problems, and most of it is general theory found everywhere
in the literature.
4.1 Distance and shortest paths

In section 2 we defined the length of a walk as the sum of the length
of each edge in the walk, counting multiplicity. We also defined the
distance from one node s to another t as the length of a shortest walk
from s to t, but we never said how to find this distance. This is clearly
something we might be interested in. Let’s try to solve this problem in a
graph where all edges have length 1:
One way of doing this could be to try all possible walks from s to t, and
find the minimum of the lengths of these. But there is one problem: if
the graph contains any (directed) cycle reachable from s we will end up
trying to go through this cycle one time, two times, three times and so
on to each time form a different walk. This produces infinitely many
different walks, and we will never terminate our search for the shortest
walk!
Let us therefore try to focus our attention on just the simple walks, or
paths, from s to t. Again we might try all different paths from s to t and
use the length of one of the shortest ones as the distance. Since there are
only finitely many nodes in our graph any path will be of finite length,
and we thus have only a finite amount of possible paths to examine.
This number might nevertheless be horrendously huge! Imagine a graph
with n nodes, where there is an edge between any pair of nodes. This
is called the complete graph of order n. Then the number of different
s − t-paths is
\[ \sum_{i=1}^{n-1} \binom{n-2}{i-1} (i-1)! \]
Considering that a graph with 100 nodes is not at all large, this certainly
is a problem.
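To get a feeling for how fast this number grows, the formula can be evaluated directly. The Python sketch below does this, and also checks the formula against a brute force enumeration for small n; the function names are made up for this illustration. A path with i edges is determined by an ordered choice of i − 1 intermediate nodes among the n − 2 nodes other than s and t, which is what the enumeration counts.

```python
from itertools import permutations
from math import comb, factorial


def count_st_paths(n):
    """Number of s-t paths in the complete graph of order n, by the formula above."""
    return sum(comb(n - 2, i - 1) * factorial(i - 1) for i in range(1, n))


def count_st_paths_brute_force(n):
    """Count every ordered selection of intermediate nodes (only sensible for small n)."""
    intermediates = range(n - 2)
    return sum(1 for k in range(n - 1) for _ in permutations(intermediates, k))


for n in range(2, 8):
    print(n, count_st_paths(n), count_st_paths_brute_force(n))

print(len(str(count_st_paths(100))), "digits for n = 100")
```

For n = 100 the number of paths exceeds 10^150, so trying them all is out of the question.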
A different approach is needed. We will pursue a simple but nice idea
that actually inspires several more advanced algorithms later on:
Knowing all nodes reachable from s in k steps, find all nodes reachable
in k + 1 steps. When node t is encountered in the l-th step, we have that
the length of a shortest walk from s to t is exactly l. Let’s describe an
algorithm, called a breadth first search (BFS), in more detail:
Let Vi , i = 0 . . . |V | be the set of nodes reachable from s in minimum i
steps. Let U be the set of unvisited nodes, let d(s, t) be the distance from
s to t and let π : V → V be a mapping we will use to determine the actual
shortest path from s to any other node.
Initialization
    V0 ← {s}, Vi ← ∅ ∀i > 0
    U ← V \ V0
    d(s, s) ← 0, d(s, u) ← ∞ ∀u ∈ U
    k ← 0
Loop while t ∉ Vk and k < |V |:
    for v ∈ Vk :
        for u ∈ δout (v) ∩ U:
            d(s, u) ← k + 1
            π (u) ← v
            Vk+1 ← Vk+1 ∪ {u}
            U ← U \ {u}
    k ← k + 1
When the algorithm terminates we will have calculated d(s, t), and we
can also find the reverse of the walk used to reach t by repeatedly
applying π , beginning with π (t).
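The algorithm translates almost line by line into code. The following Python sketch is one possible rendering, not a reference implementation: the digraph is assumed to be given as a dictionary `adj` mapping each node to a list of its out-neighbours, and the dictionary `pred` plays the role of π, with its keys being exactly the visited nodes (the complement of U). The function name and the example graph are made up for this illustration.

```python
def bfs_shortest_path(adj, s, t):
    """Layer-by-layer breadth first search in a digraph with unit edge lengths.

    Returns (distance, path) or (None, None) if t is not reachable from s.
    """
    pred = {s: None}  # the mapping pi; its keys are the visited nodes
    layer = [s]       # V_k, the nodes at distance exactly k from s
    while layer and t not in pred:
        next_layer = []
        for v in layer:
            for u in adj.get(v, []):
                if u not in pred:  # u is still unvisited, i.e. in U
                    pred[u] = v
                    next_layer.append(u)
        layer = next_layer
    if t not in pred:
        return None, None
    # Follow pi backwards from t and reverse to obtain the path from s.
    path = [t]
    while pred[path[-1]] is not None:
        path.append(pred[path[-1]])
    path.reverse()
    return len(path) - 1, path


# Hypothetical example graph.
adj = {"s": ["a", "b"], "a": ["c"], "b": ["t"], "c": ["t"], "t": []}
print(bfs_shortest_path(adj, "s", "t"))
```

Unlike the pseudocode, the sketch stops as soon as a layer is empty instead of counting up to |V|, which does not change the result.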
Theorem 4.1 The breadth first search algorithm finds a shortest path
from s to t, if such a path exists.
Proof. Assume that t ∈ Vk for some k, and also that the nodes in Vi
are precisely the nodes reachable from s in minimum i steps. Then
d(s, t) = k after the termination of the algorithm is in fact the distance
from s to t. Since each application of π on a node in Vi yields a node
in Vi−1 we will reach a node in V0 after k steps, beginning with t. This
node must be s, and reversing the direction we find a path from s to t of
length k.
We must show that the nodes in Vi are precisely what we claim they are,
and will do so by induction on i:
For i = 0 the claim is obviously true. Now assume the claim holds for
i = 0 . . . l. Then for u, v as in the loop section of the algorithm a path
from s to u of length l + 1 is easily obtained by combining a path from
s to v with the edge (v, u). Assume that there exists a path from s to u
with length j < l + 1. Then u ∈ Vj , and has thus already been removed
from U, contradicting the assumption that u ∈ U.
Now assume that t ∈ U after the algorithm terminates, so that d(s, t) =
∞. We must show t ∈ U ⇐⇒ no walk from s to t exists.
⇒: In a graph with n nodes (of order n) there are no paths of length ≥ n.
This is because in a path each node is visited only once, and at each step
a new node is visited, making the maximum possible length of a path
n − 1. So if t is not reachable from s in n steps, we can conclude that no
path of any length exists from s to t.
⇐: If there exists a walk from s to t, there must also exist a path from s
to t, obtained by eliminating all cycles from the walk. So if no walk from
s to t exists, neither does a path of any length, so t is unreachable from
s and remains in U through the whole algorithm.
Directly from the algorithm we can also see that the algorithm is
quite fast:
Corollary 4.1 The above algorithm finds a shortest path from s to t in at
most |E| steps, if such a path exists.
Proof. Each edge is processed at most once.
Thus we have not only obtained an algorithm for finding a shortest
path (and the distance) between a pair of nodes (or vertices!), but we
have also obtained a fast algorithm. And the idea we have used here, of
examining the ’closest’ nodes first, will be the basis of more advanced
algorithms later, among which the Dijkstra-Prim algorithm is probably
the most famous.
4.2 Weighted shortest path

In the section above we assumed all edges had length 1. We will now
look at the case where the length of each edge may be any positive real
number. Finding a shortest path between a pair of nodes is not quite as
easy as when all edges have unit length, but thanks to Dijkstra and Prim
there exists a not too difficult algorithm nonetheless. We will define and
prove the algorithm for directed graphs:
We want to find the shortest path from s to t for all t ∈ V . We assume
that G has no directed cycles of negative length.
Let U be the set of unvisited nodes, let f : V → R + be the s − v-distances
we wish to calculate and let π : V → V be used for keeping the reverses
of the shortest paths from s to all visited nodes v.
Initialization
    U ← V
    f (s) ← 0, f (t) ← ∞ ∀ t ∈ U \ {s}
Loop while U ≠ ∅:
    find u ∈ U s.t. f (u) = min{f (w) : w ∈ U}
    for a = uv ∈ δout (u) s.t. f (v) > f (u) + l(a):
        f (v) ← f (u) + l(a)
        π (v) ← u
    U ← U \ {u}
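A possible rendering of this algorithm in Python, with the set U handled implicitly through a binary heap from the standard `heapq` module, is sketched below. The graph representation (a dictionary `adj` mapping each node to a list of (neighbour, length) pairs), the function name and the example graph are assumptions made for this illustration. Instead of decreasing keys inside the heap, the sketch pushes a new entry whenever f(v) improves and skips outdated entries when they are popped, which is a common simplification.

```python
import heapq


def dijkstra(adj, s):
    """Shortest distances from s; adj maps u to a list of (v, length), length >= 0.

    Returns (dist, pred). Nodes missing from dist are unreachable (distance infinity).
    """
    dist = {s: 0.0}    # the function f
    pred = {s: None}   # the mapping pi
    done = set()       # V \ U, the nodes already removed from U
    heap = [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)  # node in U with minimal f
        if u in done:
            continue                # outdated heap entry
        done.add(u)
        for v, length in adj.get(u, []):
            if v not in dist or d + length < dist[v]:
                dist[v] = d + length
                pred[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, pred


# Hypothetical example graph.
adj = {"s": [("a", 1.0), ("b", 4.0)], "a": [("b", 2.0), ("t", 6.0)], "b": [("t", 1.0)]}
dist, pred = dijkstra(adj, "s")
print(dist, pred)
```

In the example the direct arc from s to b of length 4 is beaten by the detour through a, and the shortest path to t follows s, a, b, t.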
Theorem 4.2 The function f gives the distance (of a shortest path) from
s to t for all t ∈ V . If no such path exists the distance is ∞.
Proof. Let d(s, t) be the distance from s to t. We will show that for
each u chosen in the loop, we have f (u) = d(s, u). Thus f (u) =
d(s, u) ∀ u ∈ V \ U by induction. Clearly this holds initially, when
V = U. Note that f (u) ≥ d(s, u) ∀ u ∈ V always holds, since f (u)
is the length of some path from s to u. Now assume f (u) > d(s, u).
Then a shortest s − u-path (s, a1 , v1 . . . , vn−1 , an , u) must pass through
U. Let i be the smallest index for which vi ∈ U. Now if we can show
f (vi ) ≤ d(s, vi ) ≤ d(s, u) < f (u), we have a contradiction to f (u)
being minimal. So we need to show f (vi ) ≤ d(s, vi ): If i = 0 then
f (vi ) = f (s) = 0 = d(s, s) = d(s, vi ). If i > 0 then we must have
f (vi ) ≤ f (vi−1 ) + l(vi−1 vi ) = d(s, vi−1 ) + l(vi−1 vi ) = d(s, vi ).
It can also be shown that the running time of the algorithm, with the set
U implemented as a heap, is rather good:
Theorem 4.3 The Dijkstra-Prim shortest path algorithm (with heaps) has
a running time of $O(|A| \log_2 |V|)$.
A typical use for the Dijkstra-Prim algorithm is to find the shortest path
through a graph for someone who wishes to travel from one place to an-
other in the graph, but since this is such an abstract notion the algorithm
obviously has many uses. For instance it can be used for determining
maximum flows through graphs when applied repeatedly to a series of
residual graphs, as we shall see later in this section. Another smart use
of it is to construct a graph in such a way that finding a shortest path
through it solves another, maybe more confusing, problem.
Figure 5: Snapshot of the Dijkstra-Prim algorithm. The vertices are
labeled with the order in which they are chosen, and arcs used for the
shortest paths are dashed. u is the next vertex to be picked. w is not
examined at all yet.
In the case the graph has negative cost arcs we can no longer use
Dijkstra’s algorithm to find shortest paths. But in this case we can still
use the Bellman-Ford algorithm for the same purpose. This algorithm
looks perhaps more like the breadth first search. Assume we want to
find the shortest s − v-paths in the graph, for some s. Let f and π be as
above.
Initialization
    f (s) ← 0, f (v) ← ∞ ∀v ∈ V \ {s}
Loop for i = 1 . . . |V |:
    for a = uv ∈ A:
        if f (u) + l(a) < f (v):
            f (v) ← f (u) + l(a)
            π (v) ← u
Theorem 4.4 The Bellman-Ford algorithm computes a shortest s − t path
in O(|V ||A|) time, if such a path exists.
Note that in the case when G contains a negative cost directed cycle
this can be detected by examining π . If applying π several times results
in returning to some vertex, then there is a negative cost directed cycle,
and the distances computed by the algorithm may be wrong.
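A short Python sketch of the Bellman-Ford algorithm is given below. The input format (a list of nodes and a list of (u, v, length) triples), the function name and the example data are assumptions made for this illustration. Note that the sketch detects negative cycles in a different way than described above: instead of examining π it makes |V| − 1 relaxation passes and then checks whether one more pass would still improve some distance, which is a common alternative test.

```python
import math


def bellman_ford(nodes, arcs, s):
    """Shortest distances from s; arcs is a list of (u, v, length), lengths may be negative.

    Returns (dist, pred, has_negative_cycle).
    """
    dist = {v: math.inf for v in nodes}  # the function f
    pred = {v: None for v in nodes}      # the mapping pi
    dist[s] = 0.0
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, length in arcs:
            if dist[u] + length < dist[v]:
                dist[v] = dist[u] + length
                pred[v] = u
                changed = True
        if not changed:
            break  # no arc could be relaxed, the distances are final
    # Extra pass: a further improvement is only possible with a negative cycle.
    has_negative_cycle = any(dist[u] + length < dist[v] for u, v, length in arcs)
    return dist, pred, has_negative_cycle


# Hypothetical example with one negative arc but no negative cycle.
nodes = ["s", "a", "b", "t"]
arcs = [("s", "a", 4.0), ("s", "b", 2.0), ("b", "a", -1.0), ("a", "t", 1.0)]
dist, pred, has_negative_cycle = bellman_ford(nodes, arcs, "s")
print(dist, has_negative_cycle)
```

In the example the negative arc from b to a makes the route s, b, a, t shorter than the one using the arc from s to a directly.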
4.3 The Simplex algorithm

Having proved that an optimal value of an LP problem, if it exists, is
always attained in a vertex, an algorithm that examines the vertices of
the feasible polyhedron sounds like a good idea for solving the given LP
problem. And in fact such an algorithm exists, thanks to Dantzig. This
algorithm is the famous Simplex algorithm, about which several books
have been written. I will give a short description of the algorithm and
why it works, some discussion on complexity and average number of
iterations, and then suggest some alternative methods for solving LP
problems.
4.3.1 Basic idea
The basic idea of the Simplex algorithm is that since the optimal
solution of the LP problem is attained in a vertex, we can look at only the
vertices of our feasible region, if any. Now finding a vertex of the feasible
region is not necessarily easy, but if we have found one vertex, finding
an adjacent one is no problem. Remember that a point is a vertex of the
polyhedron if and only if it satisfies n linearly independent inequalities
to equality, where n is the dimension of the space. Then
moving from one vertex to an adjacent one is done by exchanging one of
those equalities with another of the inequalities currently not satisfied
to equality, by moving along the remaining n − 1 equalities until the boundary of
another halfspace is encountered. Now choosing to always travel in a
direction in which the objective function does not get worse (taking care
not to go back to the same vertex twice) will eventually lead to a vertex
with the optimal value of the objective function! Doing all this sounds
like a lot of bookkeeping, but all this is beautifully kept track of by the
Simplex algorithm, as we shall see.
4.3.2 Basic and non-basic variables
Let’s consider the problem $\min\{c^T x : Ax \le b,\ x \ge 0\}$, where A is an
m × n matrix, i.e. $x \in \mathbb{R}^n$, and there are m + n inequalities (m from
the matrix, and n from requiring non-negativity of x). Now obviously
expressing a point x can be done with n basis vectors. However the
simplex algorithm introduces m extra variables, so that there is one $x_i$, i =
1, . . . , n for each dimension and one $x_i$, i = n + 1, . . . , n + m for each
inequality from the matrix. Now $x_i = 0$ means that inequality i is satisfied to equality,
thus making it easy to see which ones are. For example $x_1 = 0$ means
that the first coordinate of x is 0, and $x_{n+m} = 0$ means that the last
inequality from the matrix A is satisfied to equality. Note that since the
extra variables $x_{n+1}, \ldots, x_{n+m}$ measure how far from equality
each of the m inequalities is, these variables are also called slack variables. Now an
example will certainly help clear things up. Consider the following LP:
\[ \min\{c^T x : Ax \le b,\ x \ge 0\} \]
\[ c = \begin{pmatrix} 5 \\ -4 \\ 3 \end{pmatrix}, \quad A = \begin{pmatrix} 2 & 3 & 1 \\ 4 & 1 & 2 \\ 3 & 4 & 2 \end{pmatrix}, \quad b = \begin{pmatrix} 5 \\ 11 \\ 8 \end{pmatrix} \]
Here we will use x1 , x2 , x3 as the basis vectors of R^3, and the variables
x4 , x5 , x6 for each of the three inequalities in A, by transforming the
problem into the following, equivalent problem:
\[ \min\{c^T x : x' = b - Ax,\ x, x' \ge 0\} \quad \text{where } x' = \begin{pmatrix} x_{n+1} \\ \vdots \\ x_{n+m} \end{pmatrix} \]
Another equivalent problem which we will use when dealing with the
matrix notation for LP problems is the following:
\[ \min\{c'^T x' : A' x' = b,\ x' \ge 0\} \quad \text{where } x' = \begin{pmatrix} x_1 \\ \vdots \\ x_{n+m} \end{pmatrix}, \quad c' = \begin{pmatrix} c \\ 0 \end{pmatrix}, \quad A' = [A \ I] \]
Written out the first formulation of the problem looks like:
minimize: f = 5x1 − 4x2 + 3x3
subject to: x4 = 5 − 2x1 − 3x2 − x3
x5 = 11 − 4x1 − x2 − 2x3
x6 = 8 − 3x1 − 4x2 − 2x3
x≥0
Now the bookkeeping of the Simplex algorithm is done by assuming that
all the variables appearing on the first line, i.e. x1 , x2 , x3 in this case,
are all 0. This means we are in fact looking at the point (0, 0, 0) ∈ R 3 ,
since x1 , x2 , x3 correspond to the basis vectors of R 3 . Now this makes
it very easy to check the value of the objective function: It is 0. These
variables are called the non-basic variables. Now checking the value of
the variables x4 , x5 , x6 , we see that they are 5, 11, 8 respectively, and
so they are all greater than or equal to 0, and we see that our point is
feasible. Now this might not always be the case in the starting set-up
like here, but there are ways to deal with that. The variables appearing
on the left hand side of the equations, in this case x4 , x5 , x6 , are called
the basic variables.
Having seen that we are in fact at a feasible point of our LP we can
begin looking for a new vertex that improves the value of the objective
function. Looking at the expression f = 5x1 − 4x2 + 3x3 we see that
increasing x1 or x3 would lead to an increase in f , whereas increasing x2
would in fact lead to a decrease in f ! Now let’s try to do exactly this.
We do this by exchanging x2 with one of the basic variables,
making x2 basic and setting the other variable to 0. This is called a
pivot. We already have each of the basic variables
expressed as linear functions of the non-basic variables, and it is then
easy to find an expression also for a non-basic variable in terms of the
other non-basic variables and one basic one. Then after choosing a
non-basic and a basic variable we can simply substitute all occurrences of
the non-basic variable with its expression in terms of the chosen basic
and the other non-basic variables. Now the point is to choose the right
variable to exchange x2 with. We see that increasing x2 will lead to a
decrease in all of the basic variables, and taking into consideration that
each of them must still be non-negative after the pivot, we
check which one of the basic variables first becomes 0 as we
increase x2 . In this example x4 becomes 0 when x2 = 5/3, x6 when
x2 = 2 and x5 when x2 = 11, so 5/3
is the smallest value of x2 that will make any of the basic variables
equal to 0, and the variable we must pivot on is thus x4 . Looking at the
expression x4 = 5 − 2x1 − 3x2 − x3 we see that we can express x2 as
x2 = (1/3)(5 − 2x1 − x4 − x3 ). We then substitute each of the occurrences
of x2 by this expression and obtain the following:
minimize: f = −20/3 + (23/3)x1 + (4/3)x4 + (13/3)x3
subject to: x2 = 5/3 − (2/3)x1 − (1/3)x4 − (1/3)x3
x5 = 28/3 − (10/3)x1 + (1/3)x4 − (5/3)x3
x6 = 4/3 − (1/3)x1 + (4/3)x4 − (2/3)x3
x≥0
Now all the non-basic variables appear with positive signs in front,
meaning that increasing any of them above 0 will make the objective function
greater. In other words we have already obtained an optimal solution,
which is f = −20/3, and the point in which this value is attained is x1 = 0,
being a non-basic variable, x2 = 5/3, being a basic variable, and x3 = 0,
again being a non-basic variable. Or (0, 5/3, 0) in short.
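The pivot carried out above is easy to reproduce with exact rational arithmetic. The following Python sketch is only meant as an illustration of the bookkeeping, not as a full Simplex implementation: a dictionary is stored as plain Python dicts mapping "const" and the names of the non-basic variables to coefficients, and the function names `ratio_test` and `pivot` are made up for this example.

```python
from fractions import Fraction as F


def ratio_test(rows, entering):
    """Return the basic variable that first reaches 0 as `entering` increases."""
    limits = {
        basic: row["const"] / -row[entering]
        for basic, row in rows.items()
        if row.get(entering, 0) < 0
    }
    return min(limits, key=limits.get)


def pivot(objective, rows, entering, leaving):
    """Exchange the non-basic variable `entering` with the basic variable `leaving`."""
    row = rows[leaving]
    a = row[entering]
    # Solve the equation of the leaving variable for the entering variable.
    solved = {k: -c / a for k, c in row.items() if k != entering}
    solved[leaving] = 1 / a

    def substitute(expr):
        coeff = expr.get(entering, F(0))
        result = {k: c for k, c in expr.items() if k != entering}
        for k, c in solved.items():
            result[k] = result.get(k, F(0)) + coeff * c
        return result

    new_rows = {b: substitute(r) for b, r in rows.items() if b != leaving}
    new_rows[entering] = solved
    return substitute(objective), new_rows


# The starting dictionary of the example above.
objective = {"const": F(0), "x1": F(5), "x2": F(-4), "x3": F(3)}
rows = {
    "x4": {"const": F(5), "x1": F(-2), "x2": F(-3), "x3": F(-1)},
    "x5": {"const": F(11), "x1": F(-4), "x2": F(-1), "x3": F(-2)},
    "x6": {"const": F(8), "x1": F(-3), "x2": F(-4), "x3": F(-2)},
}
leaving = ratio_test(rows, "x2")
new_objective, new_rows = pivot(objective, rows, "x2", leaving)
print(leaving, new_objective["const"], new_rows["x2"]["const"])
```

The ratio test picks x4 as the leaving variable, and after the pivot the constant term of the objective is −20/3 with x2 = 5/3, in agreement with the dictionary above.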
Now in general we can expect to have several non-basic variables with
negative sign in the objective function, and choosing which one should
enter the basis can be done in several ways. One common method is sim-
ply to choose the variable with the greatest negative coefficient, and if
there are ties, just choose one of them. This is known as the greatest co-
efficient rule. Now to determine the variable leaving the basis when there
are ties, a common method is the lexicographical pivot rule in which each
of the slack variables are increased by a arbitrarily small value at the
start of the algorithm. We define
0 < ǫn+m << ǫn+m−1 << · · · << ǫn+1 << all other data
and for each slack variable xi we add ǫi to the right hand side of the
equation determining xi . Using this perturbation our starting dictionary
in the problem above would look like this:
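\[
\begin{aligned}
\text{minimize:} \quad f &= 5x_1 - 4x_2 + 3x_3 \\
\text{subject to:} \quad x_4 &= 5 + \epsilon_4 - 2x_1 - 3x_2 - x_3 \\
x_5 &= 11 + \epsilon_5 - 4x_1 - x_2 - 2x_3 \\
x_6 &= 8 + \epsilon_6 - 3x_1 - 4x_2 - 2x_3 \\
x &\ge 0
\end{aligned}
\]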
Now the idea here is that this perturbation of the original problem is
so small that it does not change the solution, and can thus be removed
again when an optimal dictionary is found, but that it makes the choices
of leaving variables during the algorithm unambiguous, thus preventing
the algorithm from going in circles.
4.3.3 Correctness and complexity
Without going into details on this, it can be shown that the Simplex
algorithm (with proper pivot rules) terminates for a given LP problem,
and that it finds an optimal solution to the given problem if one exists.
Otherwise it determines if the problem is either unbounded or infeasible.
The complexity analysis will not be done properly here, but in short it
is believed that there is no variant of the Simplex algorithm that has
better worst-case time than exponential. However the average running
time of the algorithm is rather good. Using n and m as a measure of
the size of an LP problem we see that in general we must expect nm
updates for each pivot, as we can expect all the equations to be affected
by the pivot, and these contain nm variables in total. Now the number
of pivots is the hard part to analyze thoroughly, but we can imagine
a worst case scenario where all the vertices of the feasible domain are
visited once. Since the number of vertices can be exponentially large in
n, this could potentially be bad for the algorithm. In practice, however,
the expected number of pivots is no more than O(m), which is rather good.
Although polynomial time algorithms for solving LP problems exist,
they are often outperformed by the Simplex algorithm in practice.
4.4 Matrix notation, bases

As mentioned above it is also possible to give a formulation of both an
LP problem and of the Simplex algorithm that looks like
\[ \min\{c^T x : Ax = b,\ x \ge 0\} \tag{18} \]
where the matrix A contains the identity matrix as a submatrix as in the
example. This way of working with the problem will be used in the work
with the Tree-Simplex algorithm.
The Simplex algorithm works by choosing which m of the m + n vari-
ables are basic variables. Now let us split A into two parts: B containing
the columns that correspond to the basic variables and N containing the
columns corresponding to the non-basic variables. After possibly rear-
ranging the columns of A and the rows of x, c, b we can then rewrite:
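\[ A = [B \;\; N] \tag{19} \]
\[ x = \begin{bmatrix} x_B \\ x_N \end{bmatrix} \tag{20} \]
Our constraints are then
\[ Ax = [B \;\; N] \begin{bmatrix} x_B \\ x_N \end{bmatrix} = B x_B + N x_N = b \tag{21} \]
We also partition the cost vector and the objective function:
\[ c^T x = \begin{bmatrix} c_B \\ c_N \end{bmatrix}^T \begin{bmatrix} x_B \\ x_N \end{bmatrix} = c_B^T x_B + c_N^T x_N \tag{22} \]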
Now the fact that the basic variables can be written as functions of the
non-basic variables corresponds to the matrix B being invertible in (21),
and we get:
\[ x_B = B^{-1} b - B^{-1} N x_N \tag{23} \]
Now the algorithm consists of keeping track of which variables are basic
and which are non-basic, updating the values of xB and the objective
function, and pivoting on chosen variables. The computationally heavy
part is solving the set of equations involving B, as B changes each time
a pivot is performed. Note that mathematically this is exactly the same
as the above approach to the Simplex algorithm.
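To make the correspondence with the dictionary form concrete, the sketch below evaluates (23) with x_N = 0 for the example of section 4.3, using the final basis {x2, x5, x6}. The Gauss-Jordan helper `solve` and the zero-based column indices are assumptions made for this illustration; a real implementation would factorize B rather than re-solve from scratch after each pivot.

```python
from fractions import Fraction as F

def solve(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination with exact fractions."""
    n = len(M)
    aug = [[F(v) for v in row] + [F(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        # Find a row with a non-zero pivot; M is assumed to be invertible.
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# The example LP in the form Ax = b, columns ordered x1, ..., x6.
A = [[2, 3, 1, 1, 0, 0],
     [4, 1, 2, 0, 1, 0],
     [3, 4, 2, 0, 0, 1]]
b = [5, 11, 8]
c = [5, -4, 3, 0, 0, 0]
basis = [1, 4, 5]  # zero-based column indices of x2, x5, x6

B = [[row[j] for j in basis] for row in A]
x_B = solve(B, b)                                   # x_B = B^{-1} b, since x_N = 0
objective = sum(c[j] * v for j, v in zip(basis, x_B))
```

The values in `x_B` are the constant terms of the final dictionary, and `objective` is the optimal value found there.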
4.5 Network flow

As promised we return to the problem of network flow, and will here
look at how to solve such a problem. Letting A be the incidence matrix
of the given digraph, the network flow problem is
\[ \min\{l^T x : 0 \le x \le c,\ Ax = -b\} \]
To solve this we will first look at the problem with the simplification that
we are ignoring the capacities, i.e.
\[ \min\{l^T x : 0 \le x,\ Ax = -b\} \]
What makes the network flow problem interesting is the special form of
the matrix B which is used for the basic variables here. It can be shown
that the matrix A has rank m − 1. We delete one row from A to obtain
a new matrix A′ and the corresponding entry from b to get b′ . We call
the node corresponding to the deleted row the root node. The following
theorem contains the main idea for the network Simplex algorithm:
Theorem 4.5 A square submatrix of A′ is a basis if and only if its columns
correspond to arcs forming a spanning tree in the network.
For a proof refer to [5].
Now solving the set of equations BxB = −b′ actually is very simple.
It corresponds to fulfilling the flow balance equations at each node of
the graph, assuming all non-tree arcs have 0 flow. The following is an
efficient method of calculating the flow along the spanning (basis) tree
arcs:
Pick a leaf node. The flows along all arcs entering and leaving this node
are known, except one. The supply of the node is also known. Calculate
the flow along the last arc, and remove the node and this arc from the
spanning tree, producing a smaller tree. Repeat the process until the
tree is empty.
Of course the matrix B must have properties that allow us to solve the
set of equations in the same way, and we can verify this by examining it
closer. In fact we never have to do any multiplications or divisions,
which speeds things up a bit in a computer.
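The leaf-stripping procedure can be sketched as follows. This is an illustration under stated assumptions rather than the method of [5]: arcs are pairs (u, v) directed from u to v, the supply convention is outflow minus inflow equals supply, and the function name `tree_flows` is hypothetical. A negative result on some arc would mean that the chosen tree is not a feasible basis.

```python
def tree_flows(nodes, tree_arcs, supply):
    """Flows on spanning-tree arcs such that out(v) - in(v) = supply[v] for all v.

    Assumes `tree_arcs` forms a spanning tree on `nodes` and that the
    supplies sum to zero. Only additions and negations are used.
    """
    remaining = dict(supply)   # net supply still to be balanced at each node
    arcs = set(tree_arcs)
    alive = set(nodes)
    flow = {}
    while len(alive) > 1:
        # A leaf touches exactly one remaining tree arc.
        for v in alive:
            incident = [a for a in arcs if v in a]
            if len(incident) == 1:
                break
        (u, w) = incident[0]
        if v == u:
            # The arc leaves the leaf and must carry its whole supply.
            flow[(u, w)] = remaining[v]
            remaining[w] += remaining[v]
        else:
            # The arc enters the leaf and must carry its whole demand.
            flow[(u, w)] = -remaining[v]
            remaining[u] += remaining[v]
        arcs.remove((u, w))
        alive.remove(v)
    return flow

example_flow = tree_flows(
    nodes=[1, 2, 3, 4],
    tree_arcs=[(1, 2), (3, 2), (2, 4)],
    supply={1: 3, 2: 0, 3: 1, 4: -4},
)
```

In the example, nodes 1 and 3 supply 3 and 1 units, node 4 demands 4 units, and the result is independent of the order in which leaves are picked.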
Now doing a pivot in the tree simplex algorithm corresponds to choosing
a non-basic arc to enter the basis, as usual. Adding this arc to the tree
results in exactly one (undirected) cycle in the tree, and we then update
the flows along only the arcs of this cycle as we increase the flow along
the chosen non-basic arc. Which arc to leave the basis is determined by
which arc has the least potential for change, as usual.
To add the capacity constraints along the arcs to our problem we use
the trick of introducing some extra nodes and arcs to our graph. Assume
we have vertices vi, vj with the arc aij having a capacity cij and cost
lij. To enforce the capacity constraint on the flow along aij we can
introduce an extra node vk and replace aij by aik and ajk. Here we let
aik have cost lik = lij and ajk have cost ljk = 0. In addition we increase
the supply of vj by cij and give the new node vk a supply of −cij. Now we
are again in the situation of a network without capacities, but one that
corresponds to the original one with capacities. To recover the solution
of the original network flow problem, simply take f(aik) to be f(aij)
of the original problem, ignoring f(ajk). Since the cost ljk = 0
the two problems will also have the same cost. The operation on the
matrix A is even simpler:
Add a new column for the new arc ajk (and just keep the aij as aik).
Add a new row expressing
f(aik) + f(ajk) = cij
Subtract this row from the row
· · · + f(aik) + · · · = −bj
to obtain
· · · − f(ajk) + · · · = −bj − cij
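On the graph itself the transformation is a few lines of bookkeeping. The sketch below is illustrative only; the tuple layout of the arcs and the name given to each extra node vk are assumptions made here.

```python
def remove_capacities(supply, arcs):
    """Replace every capacitated arc by two uncapacitated arcs and a new node.

    arcs: list of (tail, head, cost, capacity).
    Returns (new_supply, new_arcs) where new_arcs holds (tail, head, cost).
    """
    new_supply = dict(supply)
    new_arcs = []
    for n, (i, j, cost, cap) in enumerate(arcs):
        k = ("split", n)                 # hypothetical name for the extra node v_k
        new_arcs.append((i, k, cost))    # a_ik keeps the original cost l_ij
        new_arcs.append((j, k, 0))       # a_jk has cost 0
        new_supply[j] = new_supply.get(j, 0) + cap   # v_j gets c_ij extra supply
        new_supply[k] = -cap                         # v_k demands c_ij
    return new_supply, new_arcs

new_supply, new_arcs = remove_capacities({"s": 2, "t": -2}, [("s", "t", 3, 5)])
```

The total supply still sums to zero after the transformation, as it must for the flow balance equations to be solvable.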
4.5.1 Multi-commodity flows
The situation we have looked at so far with network flows has had only
one kind of flow commodity. In practice we often encounter problems
where there is not one kind of commodity, but several. These problems
are called multi-commodity flow problems, and are a rather straightfor-
ward generalization of the single-commodity flow problem, as we shall
see now.
Instead of having a single commodity with sources and sinks, we now
have several commodities, each with its own supply in each vertex of the
network. The difference now is that it is not irrelevant which commodity
ends up where, but the flow balance property must hold individually for
each commodity. Let the n commodities be defined by the index set I,
such that the function bi : V → R defines the supply of commodity i at
each vertex v, and fi : A → R+ denotes the flow of commodity i along
each arc a. Then we require that
\[ \sum_{a \in \delta^{out}(v)} f_i(a) = b_i(v) + \sum_{a \in \delta^{in}(v)} f_i(a) \quad \forall i \in I,\ \forall v \in V \]
And to satisfy the capacity constraints, we must also require that
\[ \sum_{i \in I} f_i(a) \le c(a) \quad \forall a \in A \]
as well as assuming non-negativity
\[ f_i(a) \ge 0 \quad \forall i \in I,\ \forall a \in A \]
If we wish to minimize the total cost of our flow, which is given by
\[ \sum_{a \in A} l(a) \Bigl( \sum_{i \in I} f_i(a) \Bigr) \]
we see that this problem is in fact an LP-problem, and we can thus solve
it by the tools we have for these. The Tree-Simplex algorithm no longer
works, but the general one does, and can thus be used for solving these
problems rather efficiently, even for quite large networks with many
commodities.
4.5.2 Maximum flows
Another approach to network flow is not finding the minimum cost flow
satisfying some supplies/demands, but finding the maximum flow be-
tween a pair of vertices s, t. That is the flow with the greatest value. Let
G = (V , A) be a graph, with a length function l : A → R and a capacity
function c : A → R + . We assume G has no negative cost directed cycles.
Let f : A → R + be a flow in G. We then construct the residual graph Gf
of f as follows:
For each arc a = uv with f(a) > 0 add a residual arc a^{-1} = vu with
length l(a^{-1}) = −l(a) and capacity c(a^{-1}) = f(a). Then reduce the
capacity of a to c(a) − f(a), and if the new capacity equals 0 remove a
from Gf.
We are ready to formulate the flow augmenting algorithm of Ford
and Fulkerson (or the Successive Shortest Path algorithm since we are
choosing shortest paths in the algorithm). The algorithm is based on
calculating flows g in the residual graph Gf, and then augmenting the
existing flow f with the new flow, resulting in a flow f + g with a greater
value. This process is repeated until the residual graph no longer
contains any s − t-paths, at which point we have a maximum value flow.
We choose to augment the existing flow along the shortest path in the
residual graph.
Initialization: f = 0
Loop while true:
    P ← shortest s − t-path in Gf
    if l(P) = ∞ STOP
    µ ← min{c(a) : a ∈ P}
    g(a) ← µ if a ∈ P ∩ A, −µ if a ∈ P ∩ A^{-1}, 0 if a ∉ P
    f ← f + g
It can be shown [1] that the algorithm terminates when the capacities
c(a) are rational, and that the resulting flow is maximal. But since we
chose to augment the existing flow along a shortest path in each itera-
tion of the algorithm, we can show even more: The algorithm computes
a maximum flow with a minimal cost, and in fact each flow f during the
course of the algorithm is minimum cost among all flows with the same
value! This deserves a theorem:
Theorem 4.6 The Ford-Fulkerson algorithm with shortest augmenting paths
(or the Successive Shortest Path algorithm) computes for each step a flow
which is minimum cost among all flows with the same value, and
terminates with a maximum value flow if the capacities c(a) are rational.
If c is integer and bounded by M the algorithm has running time
O(M|E||A|).
Figure 6: The same graph as in figure (2.1.4), and the residual graph
corresponding to the flow in that example. Negative, or residual, arcs
are dashed.
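The augmenting loop above can be written out as a short program. The following is a sketch under assumptions, not a reference implementation: it rebuilds the residual graph in every iteration, uses the Bellman-Ford method for the shortest path because residual arcs have negative lengths, and assumes that the graph has no pair of opposite arcs uv and vu. The function name and data layout are hypothetical.

```python
def successive_shortest_paths(nodes, arcs, s, t):
    """arcs: dict {(u, v): (length, capacity)}. Returns (flow, value, cost).

    Assumes no negative cost directed cycles and no pair of opposite arcs.
    """
    flow = {a: 0 for a in arcs}
    while True:
        # Residual graph G_f: forward arcs with leftover capacity, and
        # reverse arcs with negated length wherever f(a) > 0.
        residual = {}
        for (u, v), (length, cap) in arcs.items():
            if cap - flow[(u, v)] > 0:
                residual[(u, v)] = (length, cap - flow[(u, v)])
            if flow[(u, v)] > 0:
                residual[(v, u)] = (-length, flow[(u, v)])
        # Bellman-Ford shortest paths from s.
        dist = {v: float("inf") for v in nodes}
        dist[s] = 0
        pred = {}
        for _ in range(len(nodes) - 1):
            for (u, v), (length, _cap) in residual.items():
                if dist[u] + length < dist[v]:
                    dist[v] = dist[u] + length
                    pred[v] = u
        if dist[t] == float("inf"):
            break                      # no s-t-path left: f is maximum
        # Recover the path P and its bottleneck capacity mu.
        path, v = [], t
        while v != s:
            path.append((pred[v], v))
            v = pred[v]
        mu = min(residual[a][1] for a in path)
        for (u, v) in path:
            if (u, v) in arcs:
                flow[(u, v)] += mu     # forward arc: push mu
            else:
                flow[(v, u)] -= mu     # residual arc: cancel mu on the original
    value = (sum(f for (u, v), f in flow.items() if u == s)
             - sum(f for (u, v), f in flow.items() if v == s))
    cost = sum(arcs[a][0] * f for a, f in flow.items())
    return flow, value, cost

example_arcs = {("s", "a"): (1, 2), ("s", "b"): (4, 2), ("a", "b"): (1, 1),
                ("a", "t"): (5, 1), ("b", "t"): (1, 2)}
flow, value, cost = successive_shortest_paths(["s", "a", "b", "t"], example_arcs, "s", "t")
```

On this small example the algorithm augments along s-a-b-t, then s-b-t, then s-a-t, each time by one unit, and stops when t is no longer reachable in the residual graph.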
5 Analysis
In this section I will examine the problems posed in section 3. I will look
at simplifications and special cases in an attempt to gain some insight
into the problems, and to see which problems are solvable and how good
solution methods we have for each of them.
This is done first for the static problems, and then I will go on and study
the connection between the system optimum and the user equilibrium
in the static setting, and actually present an algorithm that makes the
system optimum a user equilibrium!
In the last part of this section I take a closer look at the dynamic system
optimum problem. First by the rather intuitive approach of discretizing
the time dimension of the dynamic network. And then by the less intu-
itive, but computationally faster and more compact, approach of chain
decomposition of flow; flows that exist during certain time intervals.
5.1 Special cases and simplifications

When faced with a large problem in mathematics one often looks at spe-
cial cases of the problem at hand to see if some useful results might
be found for the special case, and then perhaps generalized back to the
original problem. In this subsection I look at special cases of static net-
works in the system optimum and user equilibrium problems.
5.1.1 Number of commodities
In all original problems we had several commodities making up the total
flow along each arc. These commodities represented different
origin-destination pairs, and were indexed with the set I. It is clear
that reducing the number of commodities greatly reduces the size of our
problem, as the number of unknown variables is proportional to |I|. As
we shall see having just a single commodity, together with some other
simplifications, could allow for some more specialized algorithms to be
used.
The first thing I will do here is to reduce the number of commodities
needed to represent our problem. As stated we had one commodity for
each origin-destination pair. But we can do with fewer! Treating all flow
from the same source s but to different sinks ti as one flow is possible.
To do this we have to add a universal sink node Ts to our graph. For
each sink ti of the OD-pairs that have s as a source, we then add an arc
ti Ts with capacity equal to b(ti) and cost 0. This ensures that the correct
amount of flow goes from s to each sink ti without altering the flow in
any other way. Using this we can expect to reduce the number of
commodities by a great amount. In a graph where all vertices are sources
of flow to all other vertices, this trick would reduce the number of
commodities to roughly the square root of the original amount. And in a
graph with only one source the number of commodities would be reduced to
just one!
Figure 7: A network with several sinks for the same commodity is shown
to the left. To the right is shown a network with the same routing
problems, but with only one sink and thus only one commodity.
Alternatively we could treat all flow going to the same sink t as one flow
by adding a universal source in the same way as above. And in a specific
problem we might choose which of the two approaches to use based on
the number of sinks vs. sources. Let us summarize this:
Observation 5.1 To reduce the number of commodities it is possible to
treat all flow with a common source (or sink) as one flow.
5.1.2 Special latency functions
Remember that the total travel time of all commuters was given as
\[ \sum_{a \in A} x_a l_a(x_a) \]
where
\[ x_a = \sum_{i \in I} f_i(a) \]
This was also the objective function of the static system optimality
problem, whereas the user equilibrium objective function was
\[ \sum_{a \in A} \int_0^{x_a} l_a(x)\,dx \]
One of the simplifications we can do is restricting the latency functions
on each arc of our graph to certain kinds. This can lead to problems that
are solvable by more specialized and faster algorithms than what are
needed for the general convex optimization case. The simplest choice
for a latency function is the constant function
\[ l_a(x) = l_a, \quad l_a \in \mathbb{R}_+ \]
Inserting this expression in the static system optimality objective we
obtain the following objective function
\[ \sum_{a \in A} x_a l_a \tag{24} \]
which is linear. As we already stated the feasible domain of this problem
is a polyhedron, due to the linear inequalities (5 - 7), and we recognize
the current simplification to be an LP problem since the objective
function is now also linear. We can then use the Simplex method to solve
this efficiently, and this is a huge improvement compared to the rather
slow approximation methods used for general convex optimization.
Note that assuming constant latency functions has exactly the same
effect on the user equilibrium objective (8), and the objective functions
for the two problems coincide in this case.
If we also assume that we have just a single commodity our problem
becomes exactly the minimum cost network flow problem described in
section 2, with capacities. We can then use the even faster Network
Simplex algorithm to solve our problem, or alternatively the successive
shortest path algorithm.
It is important that we do not drop the capacity constraints on the
arcs in this case, as doing so would simplify the whole problem to find-
ing the shortest path from the source to the sink, and then routing all
flow along this path. Although very simple to solve, this problem would
probably not reflect the real world problem very well. This also applies
to the multi commodity case with constant latency functions. This prob-
lem, without capacities, would decompose to finding a shortest path for
each OD-pair like in the single commodity case.
Observation 5.2 In the case when our latency functions are constant the
objective functions of both the system optimum and the user equilibrium
problems become linear, allowing the use of the Simplex algorithm. If
we also have only a single commodity we can use the Network Simplex
algorithm or the successive shortest path algorithm to find the solutions.
A note on discontinuous latency functions might be needed. We defined
the user equilibrium as a situation in which all utilized paths had
equal cost, or latency. With latency functions of the kind la (x) = la this
might clearly be impossible to satisfy. If the network has only two paral-
lel arcs a, b with la < lb , and c(a), c(b) finite, then we cannot really find
a user equilibrium flow. What we could do in this situation is to think of
the latency functions as increasing very rapidly at exactly the capacity
of the edge, so adding just an infinitesimal flow over the capacity causes
an infinitely expensive flow. This justifies the existence of the user equi-
librium also in this graph.
Note also that the argument that minimizing the user equilibrium objec-
tive (8) corresponds to satisfying Wardrop’s first principle is still valid,
as it only required the latency functions to be non-decreasing.
5.1.3 Simplified networks
Another type of special cases we can look at is when the network graph
itself has special properties. I will also look for simplifications that can
be done without altering the solution, like the commodity number re-
duction above.
It is clear that any vertex v with no supply, b(v) = 0, and with a single
entering arc a and a single leaving arc b can be removed, joining the two
arcs a, b to a new arc c. The latency is then summed together
\[ l_c(x) = l_a(x) + l_b(x) \]
and the capacity is the minimum of the two
\[ c(c) = \min\{c(a), c(b)\} \]
We could also try joining two parallel arcs a, b to form one arc c with
the same source and target as a and b. It is then clear that the capacity
of the new arc would be the sum of the capacities of the two original,
c(c) = c(a) + c(b). Unfortunately finding the latency function of the
new arc is a bit harder, and is actually influenced by whether we want to
find the user equilibrium or the system optimum.
If we want the user equilibrium we should make the assumption that
flow xc is spread between the two arcs a and b in such a way that
\[ l_a(x_a) = l_b(x_b) \]
For a given cost L this means
\[ x_a = l_a^{-1}(L), \quad x_b = l_b^{-1}(L) \]
Since
\[ x_c = x_a + x_b \]
this implies
\[ l_c^{-1}(L) = l_a^{-1}(L) + l_b^{-1}(L) \]
which finally gives the expression
\[ l_c(x) = \left( l_a^{-1} + l_b^{-1} \right)^{-1}(x) \]
The problem with this expression for lc is that in many cases la^{-1} or
lb^{-1} might not be defined. Take for instance la(x) = la, a constant
latency function. Then la^{-1} is not defined. But even if we found
a way of working around this problem, which I’m sure we could, finding
inverses can be a problem in itself. And on top of that many functions
which could result from adding la^{-1} and lb^{-1} don’t even have
expressible inverses! Assume for instance la(x) = x, lb(x) = x^2, then
lc(x) = (x + x^{1/2})^{-1}, which to the best of my knowledge is not
expressible.
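Even without a closed form, the merged latency can be evaluated numerically for a given total flow. The sketch below is one possible approach and an assumption of this illustration, not a method from the text: instead of inverting the latency functions it bisects on the share of the flow sent along arc a until the two latencies agree.

```python
def merged_latency(l_a, l_b, x, iterations=100):
    """Latency of the merged arc at total flow x under the equal-latency split.

    Assumes l_a and l_b are continuous and non-decreasing, with
    l_a(0) <= l_b(x) and l_b(0) <= l_a(x), so that an equal-latency
    split of x between the two arcs exists.
    Returns (latency, flow on a, flow on b).
    """
    lo, hi = 0.0, x                      # candidate flow on arc a
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if l_a(mid) < l_b(x - mid):      # arc a is cheaper: give it more flow
            lo = mid
        else:
            hi = mid
    x_a = (lo + hi) / 2
    return l_a(x_a), x_a, x - x_a

# The example from the text: l_a(x) = x and l_b(x) = x^2, with total flow 2.
latency, x_a, x_b = merged_latency(lambda y: y, lambda y: y * y, 2.0)
```

For a total flow of 2 the split is one unit on each arc, since 1 = 1^2, so the merged arc has latency 1 at that flow.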
If we want the system optimum we should assume the flow xc spread
between a and b in a way that minimizes the total cost of using those
two arcs. As shown below this is equivalent to
\[ (x_a l_a(x_a))' = (x_b l_b(x_b))' \]
Pursuing the same idea as above we get from this that
\[ \left[(x l_c(x))'\right]^{-1} = \left[(x l_a(x))'\right]^{-1} + \left[(x l_b(x))'\right]^{-1} \]
With the identification
\[ f'^{-1} = \frac{1}{f'} \]
this is then
\[ \frac{1}{(x l_c(x))'} = \frac{1}{(x l_a(x))'} + \frac{1}{(x l_b(x))'} \]
Since x lc(x) = 0 when x = 0 we then finally get
\[ l_c(x) = \frac{1}{x} \int_0^x \frac{1}{\frac{1}{(y l_a(y))'} + \frac{1}{(y l_b(y))'}} \, dy \]
or without the identification above
\[ l_c(x) = \frac{1}{x} \int_0^x \left( \left[(y l_a(y))'\right]^{-1} + \left[(y l_b(y))'\right]^{-1} \right)^{-1} dy \]
Unfortunately this is even more impractical than in the user equilibrium
case.
5.1.4 Alternative optimality criteria
The ways we defined the two problems of user equilibrium and system
optimality in the static network were very different. The user
equilibrium problem was initially stated as finding a flow such that for
each origin-destination pair s, t all paths in use from s to t were of
equal cost. This in turn led to a minimization problem with the objective
function
\[ \sum_{a \in A} \int_0^{x_a} l_a(y)\,dy \]
which then turned out to be a convex optimization problem.
For the system optimality problem the initial formulation was that of
minimizing the total travel time of all traffic, given by
\[ \sum_{a \in A} x_a l_a(x_a) \tag{25} \]
This problem also has a formulation similar to that of the user
equilibrium, expressed locally on each s − t-path for each
origin-destination pair.
Let P be an s − t-path and let xP be the flow along this path. Then I claim
the following:
Theorem 5.1 (System optimality condition) Minimizing the system
optimality objective function (25) is equivalent to requiring that
\[ L_P = \frac{d}{dx_P} \Bigl( \sum_{a \in P} x_a l_a(x_a) \Bigr) \]
is equal for all used paths P and less than or equal to that of any
unused path.
Proof. →: Let P, Q be two s − t-paths with LP < LQ and with Q in use,
and let AP be the arcs in P not in Q and similarly for AQ. Then shifting
a flow of value δx from Q to P changes the value of (25) by
\[ \sum_{a \in A_P} \bigl( (x_a + \delta x)\, l_a(x_a + \delta x) - x_a l_a(x_a) \bigr) - \sum_{a \in A_Q} \bigl( x_a l_a(x_a) - (x_a - \delta x)\, l_a(x_a - \delta x) \bigr) \]
But this expression is just the same as
\[ \sum_{a \in A_P} \int_{x_a}^{x_a + \delta x} \frac{d}{dx}\bigl(x l_a(x)\bigr)\,dx - \sum_{a \in A_Q} \int_{x_a - \delta x}^{x_a} \frac{d}{dx}\bigl(x l_a(x)\bigr)\,dx \]
which is negative for small enough δx, since we have assumed that LP <
LQ.
←: We have assumed that each latency function la(x) is convex and
non-decreasing in the interval 0 ≤ x ≤ c(a). Then
\[ (x l_a(x))' = l_a(x) + x l_a'(x) \]
is non-decreasing in the same interval, for each arc a. Thus each LP is
also non-decreasing as a function of xP. Assume LP is equal for each
used path P and less than or equal for any unused path, for some flow
x. Say LP = Lx. Then any other flow y for which the same holds,
with Ly < Lx, must have yP < xP for all paths P. But since the total
value of the flow is the sum of yP over all paths P, this means that the
value of y is less than the value of x, so y cannot be feasible. Likewise
we cannot have a flow y for which the incremental equality condition
holds, but for which Ly > Lx. So let y be another flow with the same
value as x satisfying the incremental equality condition, and for which
Ly = Lx. It is clear that we can transform x to y by a series of flow
shifts of values ∆x from one path P used more by x to another path Q used
more by y, such that LP = LQ. And then this does not change the value of
the objective function. So x and y have exactly the same cost, and then
so do all flows that satisfy the incremental equality condition. Since
minimizing the objective function produces such a flow, we finally get
that all flows satisfying the incremental equality condition also
minimize the objective function!
5.2 System optimal vs. user equilibrium

To illustrate the relation between the system optimal and the user
equilibrium solutions to the traffic assignment problem in the static case,
consider the following example.
We have a network as shown in figure 8, where la(x) = 1, a constant
latency function, and lb(x) = x, a linear one. This means that the first
arc has the same travel time regardless of how much flow is assigned to
it, while the other arc has a travel time directly proportional to the flow.
Figure 8: A small network with two arcs. Both arcs have unlimited
capacity, and latencies are shown in the figure.
Now assume we are to route a flow of value 1 from s to t. The objective
function for the system optimal solution is
\[ \tau_s = x_a + x_b^2 \]
Differentiating this and using the relation xa = 1 − xb we get
\[ \tau_s' = -1 + 2x_b \]
And we see that τs has a minimum at xb = 1/2, which gives τs = 3/4. While
the total travel time is given by the same function here, the objective
function for the user equilibrium is
\[ \tau_u = x_a + \tfrac{1}{2} x_b^2 \]
Differentiated this is
\[ \tau_u' = -1 + x_b \]
And we see that τu has a minimum at xb = 1, which gives τs = 1. So the
user equilibrium has in this case 4/3 of the total travel time of the system
optimal solution. None of the users accept that any other can travel
faster than themselves, and thus everyone ends up on the same road b,
worsening the travel time for all the other commuters, and ultimately
themselves.
How great can the difference between the system optimal and user
equilibrium solutions be? It has been shown [3] that for linear latency
functions the ratio is never greater than 4/3. But with arbitrary
latency functions, even convex ones, we can get an arbitrarily large
difference. Replacing lb(x) = x with lb(x) = x^n we can do the same
analysis again.
\[ \tau_s = x_a + x_b^{n+1} \]
\[ \tau_s' = -1 + (n+1) x_b^n \]
\[ x_{b0} = \left( \frac{1}{n+1} \right)^{1/n} \]
And the optimal solution is
\[ 1 - x_{b0} + x_{b0}^{n+1} = 1 - x_{b0} \left( 1 - \frac{1}{n+1} \right) \]
which gets arbitrarily small with a sufficiently large choice of n.
For the user equilibrium we get
\[ \tau_u = x_a + \frac{1}{n+1} x_b^{n+1} \]
\[ \tau_u' = -1 + x_b^n \]
\[ x_{b0} = 1 \]
And the total travel time is again 1. Comparing with the system optimal
we see that the quotient becomes arbitrarily large with sufficiently large
n!
Theorem 5.2 The quotient of the total travel time of the user equilibrium
and the system optimal solution may be arbitrarily large for arbitrary
latency functions.
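A few numbers make the growth of the quotient visible. The sketch below simply evaluates the closed-form expressions for the two-arc network with la(x) = 1 and lb(x) = x^n; the function names are chosen for this illustration.

```python
def system_optimum_time(n):
    """Total travel time of the system optimum for l_a = 1, l_b(x) = x**n."""
    x_b = (1.0 / (n + 1)) ** (1.0 / n)      # minimizer of x_a + x_b^(n+1)
    return (1.0 - x_b) + x_b ** (n + 1)

def travel_time_quotient(n):
    """User equilibrium travel time divided by system optimum travel time."""
    # The user equilibrium sends all flow along b, at total travel time 1.
    return 1.0 / system_optimum_time(n)

quotients = {n: travel_time_quotient(n) for n in (1, 2, 10, 100, 1000)}
```

For n = 1 this reproduces the quotient 4/3 of the linear example, and the values increase steadily with n.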
The other extreme case to consider is when all latency functions are
constant. Then we see that the terms in the objective functions for the
system optimal and user equilibrium problems coincide:
\[ x\,l(x) = x l_0 = \int_0^x l_0 \,dx' = \int_0^x l(x')\,dx' \]
And we therefore get that the system optimal and the user equilibrium
problems are exactly the same!
Theorem 5.3 The user equilibrium and the system optimal solutions co-
incide if all latency functions are constant functions.
Usually the latency functions are somewhere in between the two ex-
tremes, and it is meaningful to work with both cases.
How is it possible to force the user equilibrium and the system
optimum to coincide? In the first example above we could guess at adding
a toll of value 1/2 to the road with the linear latency function, thus
replacing lb(x) = x with lb(x) = x + 1/2. Here the constant term is not
actually time delay, but rather a tax you would have to pay to drive along
that road. This should work because the two arcs would then have the same
cost with the system optimal flow, and this should then also be a user
equilibrium flow. Repeating the analysis we get
\[ \tau_u = x_a + \tfrac{1}{2} x_b^2 + \tfrac{1}{2} x_b \]
\[ \tau_u' = -1 + x_b + \tfrac{1}{2} = -\tfrac{1}{2} + x_b \]
And we see that the minimum is now at xb = 1/2, which is indeed the
system optimal solution! What about the general case?
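The three minimizations of the two-arc example can also be checked numerically. The following sketch is illustrative only: it assumes a flow of value 1, uses a simple ternary search for the minimizer (a choice made here, not taken from the text), and parametrizes everything by xb with xa = 1 − xb.

```python
def argmin(fn, lo=0.0, hi=1.0, iterations=200):
    """Ternary search for the minimizer of a convex function on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if fn(m1) <= fn(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2

def tau_s(x_b):
    """Total travel time: x_a * 1 + x_b * x_b with x_a = 1 - x_b."""
    return (1 - x_b) + x_b ** 2

def tau_u(x_b):
    """User equilibrium objective without toll."""
    return (1 - x_b) + x_b ** 2 / 2

def tau_u_toll(x_b, toll=0.5):
    """User equilibrium objective with a constant toll on arc b."""
    return (1 - x_b) + x_b ** 2 / 2 + toll * x_b

x_so = argmin(tau_s)         # system optimum
x_ue = argmin(tau_u)         # user equilibrium, no toll
x_toll = argmin(tau_u_toll)  # user equilibrium with toll 1/2
```

Up to numerical tolerance, `x_so` and `x_toll` agree at 1/2 while `x_ue` is 1, and the quotient `tau_s(x_ue) / tau_s(x_so)` is 4/3.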
Remember Wardrop’s characterization of a user equilibrium: that all
paths from s to t that are in use have the same cost, and cost equal to
or less than that of any unused path. In a system optimum this is not
necessarily the case. I want to introduce taxes to some of the arcs in a
given network such that the cost incurred by the travelers along arc a is
l(a) + T(a) where T : A → R+ is the tax function. Then there is a system
optimal solution fs that minimizes the total travel cost, and there is
also a user equilibrium fu corresponding to the new tax modified cost
functions. Let it be absolutely clear that these taxes do not directly
affect the latency along the arcs, but are only perceived by the travelers
as some generalized cost which they want to minimize together with travel
time. Then I claim that if these taxes are chosen appropriately we can
force the system optimum fs and the user equilibrium fu to coincide! To
solve the problem of choosing an appropriate tax function I came up with
the following:
Theorem 5.4 Given an acyclic graph G, a length function l : A → R and
vertices s with in-degree 0 and t with out-degree 0 it is possible to find a
function T : A → R+ such that all s − t-paths have equal length when
considering the length function l + T, and such that the length of all these
equals the length of the longest s − t-path when considering only l. This
can be done in time O(|A|).
Proof. Since G is acyclic and s, t have in- and out-degrees 0 respectively
we can find a topological ordering v0, v1, . . . , vn of G where v0 = s and
vn = t, and where all arcs are of the form vi vj, i < j. This can be done
in time O(|A|).
Let Pvi denote all vi − t-paths and let Lvi = max{l(P) : P ∈ Pvi}. I will
prove by induction that for any vertex vi we can find T(a) for all
outgoing arcs a from vi such that
\[ (l + T)(P) = L_{v_i} \quad \forall P \in P_{v_i} \]
For vi = t the claim is trivial.
Now assume that the hypothesis holds for all vj, j > i. Consider an
outgoing arc a = vi vj from vi. Then the claim holds for vj, and thus all
vi − t-paths starting with a are of equal (l + T)-length
Lvj + l(a) + T(a), with Lvj + l(a) ≤ Lvi. Now setting
T(a) = Lvi − Lvj − l(a) ≥ 0 for all outgoing arcs a makes the hypothesis
true for vi.
Each arc is examined exactly once, so the time usage follows.
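The construction in the proof translates directly into code. The sketch below is an illustration under assumptions: the graph is given as a dictionary of arc lengths, every vertex is assumed to reach t, and memoized recursion replaces the explicit topological ordering. The function name `equalizing_tolls` and the example graph are hypothetical.

```python
from functools import lru_cache

def equalizing_tolls(arcs, t):
    """arcs: dict {(u, v): length} of an acyclic graph where every vertex reaches t.

    Returns (L, T): L[v] is the length of the longest v-t-path, and T[a] >= 0
    is a toll making every v-t-path have (l + T)-length exactly L[v].
    """
    out = {}
    for (u, v) in arcs:
        out.setdefault(u, []).append(v)

    @lru_cache(maxsize=None)
    def longest(v):
        # L_v: longest distance from v to t; each vertex is computed once.
        if v == t:
            return 0
        return max(arcs[(v, w)] + longest(w) for w in out[v])

    # T(a) = L_u - L_v - l(a) for each arc a = uv, as in the proof.
    T = {(u, v): longest(u) - longest(v) - l for (u, v), l in arcs.items()}
    L = {v: longest(v) for v in set(out) | {t}}
    return L, T

example = {("s", "u"): 1, ("s", "v"): 4, ("u", "v"): 1, ("u", "t"): 5, ("v", "t"): 1}
L, T = equalizing_tolls(example, "t")
```

In the example the longest s − t-path is s − u − t with length 6, and with the computed tolls the paths s − u − v − t and s − v − t also get length 6.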
Corollary 5.1 The tax function T above also makes all vi − vj-paths
equally long when considering the length function l + T.
Proof. This follows immediately from the theorem, as any concatenation
of a vi − vj-path and a vj − t-path results in a vi − t-path. Then since all
vi − t-paths and all vj − t-paths have equal (l + T)-length all vi − vj-paths
must also have equal (l + T)-length.
Let us return to the problem of making the system optimum fs of
latency function l and the user equilibrium fu of generalized cost
function l + T coincide in the graph G. We could try to accomplish this by
considering the subgraph Gfs ⊂ G consisting of only those arcs used by
fs, with length function l^fs(a) = la(fs(a)). Since fs is system optimal
and la(fs(a)) ≥ 0 we can assume that Gfs is acyclic. Then we can find the
tax function T such that all s − t-paths in Gfs are of (l + T)-length equal
to Ls. And adding this same tax function to the arcs in the original G
we see that all s − t-paths in use by fs have equal (l + T)-length under
fs! However some path not used by fs might be shorter than Ls, and fs
then fails to be a user equilibrium for the whole graph G. Now let PLs
denote all s − t-paths in G with l-length less than or equal to Ls.
Including all arcs in these paths in Gfs might cause Gfs to no longer be
acyclic. Thus we just have to take extra care when calculating T for our
graph G and system optimal flow fs, to ensure T really causes fs to be a
user equilibrium when considering l + T.
The following algorithm solves our problem of finding a tax function
T for a graph G with a system optimal s − t-flow fs considering latencies
l such that fs also becomes a user equilibrium when considering l + T.
Let Ls be the length of the longest path in use by fs.
Algorithm for finding optimal tolls first in Gfs and then in all of G.
Lv ← −1 ∀v ∈ V
Lt ← 0
call findMaxDistance(s)
call unusedTolls()

function findMaxDistance(v)
    if Lv ≥ 0 return Lv
    Lv ← max{call findMaxDistance(u) + la(fs(a)) : a = vu ∈ δout(v), fs(a) > 0}
    T(a) ← Lv − Lu − la(fs(a)), ∀a = vu ∈ δout(v), fs(a) > 0
    return Lv

function unusedTolls()
    d(v) ← Ls − Lv ∀v
    U ← V
    while U ≠ ∅
        v ← u ∈ U s.t. d(u) is minimal
        if d(v) > Ls BREAK
        U ← U \ {v}
        for a = vw ∈ δout(v)
            if Lw = −1 d(w) ← min{d(v) + la(fs(a)), d(w)}
            else T(a) ← max{d(w) − d(v) − la(fs(a)), 0}
Here the findMaxDistance function works within the acyclic subgraph

Gfs consisting of the arcs used by fs and finds tolls such that all s − t-
paths used by fs become equally long. This is the part of the algorithm
that corresponds to Theorem equalizer. The vertices v covered by fs
get Lv > −1. Then the unusedTolls function searches for shortest paths
(Dijkstra-Prim based) with length less than or equal to Ls in the whole
graph G, and assigns a correction toll to such paths each time it finds
one, such that all these paths get cost at least Ls . Note that when a cor-
rection toll is assigned to an arc already in use by fs , and thus with a tax
already assigned to it, the new assignment equals the old, so no changes
are done. So at the end all s − t-paths in use by fs have l + T -length Ls ,
and all other paths have l + T -length greater than or equal to Ls .
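As a minimal sketch of the first half of the algorithm (the findMaxDistance part only), the following Python code computes equalizing tolls on an acyclic subgraph of used arcs. The function name `equalizing_tolls`, the dictionary representation of the arcs and the example latencies are assumptions made for illustration; parallel arcs and the correction tolls of unusedTolls are not handled.

```python
from functools import lru_cache

def equalizing_tolls(used_arcs, sink):
    """Tolls that give every path of an acyclic 'used' subgraph the same cost.

    used_arcs maps (v, u) to the latency of arc vu at the given flow; only arcs
    carrying flow are listed, and every listed vertex is assumed to reach sink.
    """
    successors = {}
    for (v, u), latency in used_arcs.items():
        successors.setdefault(v, []).append((u, latency))

    @lru_cache(maxsize=None)
    def longest(v):
        # L_v: length of the longest used path from v to the sink.
        if v == sink:
            return 0
        return max(longest(u) + latency for u, latency in successors[v])

    # T(a) = L_v - L_u - l_a for every used arc a = vu.
    tolls = {(v, u): longest(v) - longest(u) - latency
             for (v, u), latency in used_arcs.items()}
    return tolls, longest


# Hypothetical latencies on a small acyclic graph (not the thesis example).
arcs = {("s", "u"): 1, ("u", "t"): 4, ("s", "v"): 2, ("v", "t"): 1, ("u", "v"): 1}
tolls, longest = equalizing_tolls(arcs, "t")
paths = [["s", "u", "t"], ["s", "v", "t"], ["s", "u", "v", "t"]]
costs = [sum(arcs[a] + tolls[a] for a in zip(p, p[1:])) for p in paths]
```

In this small instance all three s − t-paths end up with the same l + T -length, equal to the length of the longest path before taxation, and no toll is negative.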
Note that we could actually have chosen any acyclic flow f in the
algorithm above, not just the system optimal fs , as the theorem (5.4)
only requires the graph to be acyclic. So we see that we can make any
acyclic flow a user equilibrium by proper taxation! But for our purposes,
of course, the system optimum makes the most sense.
What about the case when there is no longer just one commodity? We
can also solve the more general multi commodity problem by calculating
a set of tolls for each commodity. Then we would require knowledge of
the origin and destination of each traveler in the network. Although this
is currently impractical for real world traffic uses, it is an interesting
theoretical result. And with the ever increasing importance of computer
networks in our daily life, it might be possible to implement successfully
in the future.
It could also be possible to find a way of calculating just one tax function
for several commodities, such that the system optimum becomes a user
equilibrium when including taxes. Alas, I have not been able to solve this
problem.
Figure 9: An example graph with one source/sink pair and with latencies
and capacities as given on the arcs.
I end this subsection with an example calculation of a static
system optimal solution and corresponding taxes. Consider the graph
shown in figure 9. We are to route a flow of value 2 from s to t. To find
the system optimal solution I will use the alternative optimality criterion
derived in the previous subsection. Let the paths be P1 = s − u − t,
P2 = s − u − v − t, P3 = s − v − t, P4 = s − t along the arc with cost 6 and
P5 = s − t along the arc with cost 7, and let xi be the flow along Pi .
Then x1 + x2 + x3 + x4 + x5 = 2. And
L1 = 2(x1 + x2 ) + 4
L2 = 2(x1 + 3x2 + x3 ) + 1
L3 = 2(x2 + x3 ) + 4
L4 = 6
L5 = 7
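Let us assume that x5 = 0. This assumption is consistent, since L5 = 7
exceeds the common cost 6 of the paths in use found below. Setting
L1 = L2 = L3 = L4 then gives, together with the flow constraint, a linear
set of equations. Solving this we get
$$x_1 = x_2 = x_3 = x_4 = \tfrac{1}{2}$$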
The corresponding flow is shown in figure 10, with the latencies in
parentheses.
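The elimination behind this solution can be retraced with exact arithmetic. The sketch below uses only the expressions L1, . . . , L5 and the flow value 2 from the text; the helper name `path_costs` is an assumption made for illustration.

```python
from fractions import Fraction as Fr

def path_costs(x1, x2, x3):
    """The expressions L1, L2, L3 from the text, as functions of the path flows."""
    return (2 * (x1 + x2) + 4,
            2 * (x1 + 3 * x2 + x3) + 1,
            2 * (x2 + x3) + 4)

# With x5 = 0 and P4 in use, all used paths must have cost L4 = 6.
# L1 = 6 gives x1 + x2 = 1, and L3 = 6 gives x2 + x3 = 1.
# L2 = 6 gives (x1 + x2) + (x2 + x3) + x2 = 5/2, hence x2 = 1/2.
x2 = Fr(5, 2) - 1 - 1
x1 = 1 - x2
x3 = 1 - x2
x4 = 2 - (x1 + x2 + x3)   # the flow constraint with x5 = 0

costs = path_costs(x1, x2, x3)
```

All three expressions evaluate to 6 at this point, which is below L5 = 7, so leaving P5 unused is consistent.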
Using the latencies, or costs, in this graph, we run the taxing algorithm
on the graph shown in figure 11, where the resulting taxes are shown
as +T on each arc. Adding these taxes to the original graph we finally
get the graph in figure 12, where the latencies and taxes add up to
form the new costs along each arc. We easily check that the flow
x1 = x2 = x3 = x4 = 1/2 makes all s − t-paths in use equally expensive when
considering both latencies and taxes. And we also know that this flow
minimizes the total travel time, since it was the solution to the system
optimality problem we started with.
Out of curiosity we can also find the user equilibrium in the original
problem in the same way as we found the system optimum. This turns
out to be
$$x_1 = x_3 = \tfrac{1}{3}, \quad x_2 = \tfrac{4}{3}$$
The total cost of the user equilibrium flow is then 34/3 = 136/12, whereas
the system optimal flow has cost 39/4 = 117/12.
Figure 10: The system optimal flow of this graph, along with latencies
corresponding to this particular flow fs .
Figure 11: The graph Gfs after computing taxes. Total arc costs l + T are
latencies + taxes on each arc.
Figure 12: The original graph, with necessary taxes added along the arcs.
The user equilibrium when considering latencies + taxes is now the same
as the system optimum when considering only the latencies.
5.3 The dynamic case, simplified latency model
One interesting special case of network models is the one where there
are absolute capacities on the arcs, and where the travel time (or cost)
along each arc remains constant with regard to loading. For a vehicle
traffic situation this seems a bit unrealistic, but for e.g. information flow
in networks these assumptions might very well be acceptable. In fact,
the deterministic queue model [2] exhibits exactly this behavior, under
the assumption that there are no queues! In this case very much is
known about our graph, and computing shortest paths, minimum cost
flows, maximum flows etc. is all possible with well known polynomial
time algorithms. Perhaps we might derive some useful results from this
already well established area?
5.3.1 System optimal planning
A dynamic system optimal solution to a route and departure time planning
problem is one which minimizes the total cost of all users, as defined
in section 3. Assume we have a bottleneck with an absolute capacity, and
a desired through-flow greater than this capacity for a certain time
interval. This could be a graph with just two nodes s, t and one arc
a = st. We assume the deterministic queue model for this arc, that is
$$\frac{d}{dt}\, l_a(x_a, t) = \frac{x_a(t)}{c(a)} - 1 \tag{26}$$
if there is already a queue, or one is forming, or
$$l_a(x_a, t) = l_a \tag{27}$$
if there is no queue, and the inflow xa (t) is not great enough for one to
form.
Let us also assume the departure deviation cost function to be zero,
g(t) = 0, and the arrival deviation cost function to be h(t) = ½|t|.
In this case a system optimal solution will consist of a constant inflow
equal to the capacity of the bottleneck, such that the bottleneck is
maximally utilized, but also such that no queues arise. The arrival time
interval will, for a dynamic flow of value V , be [−V/(2c(a)), V/(2c(a))],
which causes the departure time interval to be
[−V/(2c(a)) − la , V/(2c(a)) − la ].
The dynamic user equilibrium is again such that the total cost experienced
by each traveller is equal. So the dynamic system optimum above
certainly differs from the user equilibrium, in that several of the travel
agents could have done better by choosing a departure time that would
bring their arrival time closer to their desired arrival time t = 0. But in
doing so they would have caused all later entrants to be delayed by the
time it would have taken themselves to pass the bottleneck, and thus
a queue would have arisen, increasing the total cost of all later travel
agents. Here the dynamic flow that causes all travel agents to experience
the same cost has a queue that starts to form at time
t = −V/(2c(a)) − la and grows at a rate such that
$$l_a'(x_a, t) = -h'\big(l_a(x_a, t) + t\big)$$
where the prime on h is read as the time derivative of the arrival cost
t ↦ h(la (xa , t) + t), so that travel time plus arrival cost stays
constant. This continues until the time when la (xa , t) + t = 0, that is,
until the arrival time reaches the desired arrival time, at which point
the queue starts to shrink again at a rate such that
$$l_a'(x_a, t) = -h'\big(l_a(x_a, t) + t\big)$$
The dynamic inflow that satisfies this is
$$x_a(t) = \begin{cases} 2c(a) & : -\frac{V}{2c(a)} - l_a \le t < -\frac{V}{4c(a)} - l_a \\ \frac{2}{3}c(a) & : -\frac{V}{4c(a)} - l_a \le t < \frac{V}{2c(a)} - l_a \end{cases}$$
Comparing the two solutions we see that the inflow happens during ex-
actly the same time interval, but in the user equilibrium case the inflow
is great enough to cause a queue, so that none of the later travelers get
a lower total cost than the very first ones.
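The equal-cost property of this inflow can be checked numerically. The sketch below assumes the two-phase inflow described above (rate 2c(a) until the departure time −V/(4c(a)) − la , then rate (2/3)c(a)), the deterministic queue model, g = 0 and h(t) = ½|t|. The concrete values of V , c and la are hypothetical.

```python
from fractions import Fraction as Fr

V, c, la = Fr(12), Fr(2), Fr(1)      # hypothetical flow value, capacity, free travel time

t0 = -V / (2 * c) - la               # first departure
t_switch = -V / (4 * c) - la         # the inflow drops from 2c to (2/3)c here
t_end = V / (2 * c) - la             # last departure

def cumulative_inflow(t):
    """Total flow that has entered the arc by time t, for t0 <= t <= t_end."""
    first = 2 * c * (min(t, t_switch) - t0)
    second = Fr(2, 3) * c * max(t - t_switch, 0)
    return first + second

def cost(t):
    """Travel time plus arrival deviation cost for a departure at time t."""
    # The queue is never empty between t0 and t_end, so the bottleneck
    # serves flow at rate c from t0 on.
    arrival = t0 + cumulative_inflow(t) / c + la
    return (arrival - t) + abs(arrival) / 2

samples = [t0 + (t_end - t0) * Fr(k, 8) for k in range(9)]
costs = {cost(t) for t in samples}
```

Every sampled departure time gives the same individual cost la + V/(4c(a)), which is the cost of the very first and very last traveler.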
In the rest of the treatment on dynamic flows I will, however, assume
there are no queues. We see that adding a time dependent toll
$$\xi(t) = \frac{V}{2c(a)} - \frac{1}{2}\,|t + l_a|$$
at the entrance to the bottleneck would cause the dynamic system opti-
mal flow to be a user equilibrium! And I expect it to be easy to find some
tolls that can be used to make the system optimal flow a user equilib-
rium, also in the more general case. The crudest approach could just
be to add tolls at the sink of a flow, equal to some constant minus the
arrival deviation cost function for the flow arriving at that sink, and pos-
sibly also negate the departure deviation cost in the same way. Thus the
rest of this section will focus on constant latency function networks. No
queues!
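To see why such a toll works in the bottleneck example, the following short calculation adds the toll ξ(t) = V/(2c(a)) − ½|t + la | to the individual cost of a traveler departing at time t. It is a sketch under the assumptions used above: no queue, so the arrival time is t + la , with g = 0 and h(t) = ½|t|.

```latex
\begin{align*}
l_a + h(t + l_a) + \xi(t)
  &= l_a + \tfrac{1}{2}\,|t + l_a| + \frac{V}{2c(a)} - \tfrac{1}{2}\,|t + l_a| \\
  &= l_a + \frac{V}{2c(a)}
\end{align*}
```

The cost including the toll is thus the same for every departure time in the system optimal departure interval, so no traveler gains by shifting within it.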
5.3.2 Existence and uniqueness of the system optimal solution
The system optimal solution is the one that minimizes the total
cost incurred by all users of the network, and its existence is not hard
to prove. Since the total cost function is a continuous function into R
with a bounded minimum, this minimum is attained by some dynamic flow.
In general it might be hard to say whether the system optimal solution
is unique or not, but in the case of the constant latency simplification
we can see that it is not necessarily unique, but that the set of system
optimal solutions is convex: Let f1 and f2 be two system optimal
solutions with no queues, and with total costs C1 = C2 = C. Both these
are governed by (27), since there are no queues. Consider a convex
combination f3 = λf1 + (1 − λ)f2 , λ ∈ [0, 1]. Along each arc the flow
in f1 is always less than or equal to the capacity of that arc, and the
same holds for f2 , so a convex combination of the flows along each arc
will again never exceed the capacity of that arc. Thus there will not be
any congestion in f3 either. Now since the latency of each arc is
constant with regard to flow, if Ca,i is the total cost associated with
using arc a in solution fi , then Ca,3 = λCa,1 + (1 − λ)Ca,2 , and thus the
total cost of f3 is just C3 = λC + (1 − λ)C = C, so f3 is also system
optimal.
5.3.3 The time discrete graph
Finding flows with various properties is a well established area in graph

theory. Optimal peak routing is one example of an application of this
to real life problems, where we solve the problem of routing a given
flow of traffic through a network of available roads. This is exactly the
static system optimality problem we have studied already. But again
this is just a snapshot of the traffic situation throughout the whole day,
or throughout the time of the day where congestion is a problem. If
we wish to route traffic not only through different routes, but also at
different times, the problem becomes a bit harder.
One way of transforming this problem into an already well known
and solvable problem is to discretize the time dimension of the problem
and make a graph where the vertices of the new graph are the vertices
of the original graph at different times. The arcs then go from a vertex
to vertices that are reachable from that vertex in the original graph, but
with a later time coordinate. Let the original graph G have vertices V and
arcs A, and let the time discretization be T = {tn }. We assume that the
latencies la are all positive integers, i.e. l : A → N. Then the
vertices of the augmented graph G = (V × T , A × T ) are (v, tn ), and there
is an arc from (v, tn ) to (w, tm ) if there is an arc a from v to w and the
travel time l(a) is equal to tm − tn . The cost of this arc is equal to
the original travel time plus any fixed cost of that arc (tolls etc.), while
the capacity is the same as that of the original arc a multiplied by the
time discretization unit ∆t. There are also arcs from (v, tn ) to (v, tn+1 )
with cost equal to tn+1 − tn , representing waiting one time unit at node
v.
In addition there are two nodes for each origin destination pair; one
representing the origin at all time steps, the universal origin su , and one
representing the destination at all time steps, the universal destination
tu . Let (s, tn ) be the original origin node in each time step; then there
are arcs from su to (s, tn ) with cost equal to the departure deviation cost
at time tn , and similarly for the destinations. The universal origins and
destinations are sources and sinks respectively, with supply and demand
equal to the total amount of traffic that must pass from the origin to
the destination in the original graph. An example of this construction is
shown in figures 13 - the original graph - and 14 - the augmented graph.
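The construction can be sketched in a few lines of Python. The function name `augmented_graph`, the tuple layout of the arcs and the `kind` labels are assumptions made for illustration, as is the choice of a unit time discretization and of infinite capacity on the waiting arcs (the text does not specify their capacity). The sample network and cost functions are hypothetical.

```python
def augmented_graph(arcs, steps, origin, dest, g, h):
    """Arcs of the time-expanded graph as (tail, head, capacity, cost, kind).

    arcs: list of (v, w, travel_time, capacity) with integer travel times,
    steps: a range of integer time steps (unit time discretization).
    """
    inf = float("inf")
    vertices = sorted({x for v, w, _, _ in arcs for x in (v, w)})
    out = []
    for n in steps:
        # Copies of the original arcs, shifted forward by their travel time.
        for v, w, travel_time, capacity in arcs:
            if n + travel_time in steps:
                out.append(((v, n), (w, n + travel_time),
                            capacity, travel_time, "travel"))
        # Waiting one time unit at a vertex.
        if n + 1 in steps:
            for v in vertices:
                out.append(((v, n), (v, n + 1), inf, 1, "wait"))
        # Universal origin and destination, carrying the deviation costs.
        out.append(("su", (origin, n), inf, g(n), "depart"))
        out.append(((dest, n), "tu", inf, h(n), "arrive"))
    return out


# Hypothetical two-arc network: travel times 1 and 2, capacities 10 and 5.
sample = [("s", "t", 1, 10), ("s", "t", 2, 5)]
graph = augmented_graph(sample, range(-1, 5), "s", "t",
                        g=lambda t: 0, h=lambda t: abs(t - 3))
```

The resulting arc list can be handed to any minimum cost flow routine, with su as the source and tu as the sink.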
My claim is that finding a system optimal travel plan is equivalent to
finding a minimum cost flow through the augmented graph, that satis-
fies the source and sink constraints at the origins and destinations. This
can be seen by letting the time discretization steps go towards 0. Then
the total cost of the multi-commodity flow in the augmented graph ap-
proaches the dynamic system optimality objective function, as the sums
over all time steps approach the integrals. And since we minimize the
cost of the multi-commodity flow in the augmented graph, we also min-
imize the dynamic system optimality objective function.
Proposition 5.1 A minimum cost (multi-commodity) flow in the augmented

graph is an approximate solution to the minimum cost (multi-commodity)
dynamic flow in the dynamic network.
We then see immediately that this model has good flexibility in several
aspects: varying capacity with time, using any kinds of departure and
arrival specific cost, time varying toll functions.
5.3.4 System optimal planning by use of the augmented graph
The system optimal solution is one which minimizes the total cost in-
curred by all travel agents. Since the cost of following a path from su via
(s, tn ) and (t, tm ) to tu in the augmented graph is the same as the cost
that a travel agent departing at time tn and arriving at time tm incurs,
then a minimum cost flow through the augmented graph must be the
same as a system optimal solution in the continuous case. This is with
the reservation that all time dependent cost functions are piecewise con-
stant in the time discrete graph, but may be continuous in the original
problem. But by choosing a fine enough discretization we can get arbi-
trarily close to the original continuous cost functions. This will, however,
lead to a graph with an arbitrarily huge number of nodes, something we
do not want, and thus choosing a fine enough but also not too fine dis-
cretization will be important.
Thus the problem of finding a system optimal flow in a network where
we assume constant travel time and absolute capacities on the arcs can
be approximated by finding a minimum cost flow in the augmented
graph. This results in a dynamic flow that is really a different static
flow for each time step. But since it is a feasible flow for each of
these, the concatenation of these static flows results in a feasible
dynamic flow. So by finding a minimum cost static flow that satisfies
the constraints of the augmented graph, we have really found a dynamic
flow that satisfies the dynamic constraints (14 - 17).
It is also possible to solve the multi-commodity minimum cost problem
by use of linear programming, and thus we can even approximate sys-
tem optimal solutions to networks with several origin destination pairs!
5.3.5 Variable preferred arrival time
Something the first outline of our time discrete graph did not allow for
was the option of having different preferred arrival times (or departure
times). I will give a solution to this for the arrival time case. The depar-
ture one is treated similarly.
By introducing some extra nodes and altering the arcs that enter the
universal sink we get a graph where different choices of the last two
arcs of a path correspond to different arrival time preferences. Let t be
the original destination node and (t, tn ) its time discretization as
usual. Then instead of having arcs from each of the (t, tn ) to tu we
introduce an extra set of nodes, δhk , between these, such that each of
these extra nodes corresponds to a different arrival deviation cost
function hk . From (t, tn ) to δhk there are arcs with unlimited capacity
and cost equal to hk (tn ), i.e. the cost of arriving at time tn with cost
function hk . And from each of the δhk to tu there is an arc with capacity
equal to the amount of traffic having hk as its arrival time cost function.
We see immediately that this allows for different preferences in arrival
time, as this is just the same as translating the cost function along the
time axis. But it also allows for any different kinds of cost functions!
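A minimal sketch of this modification is given below. The function name `arrival_preference_arcs` and the arc layout (tail, head, capacity, cost) are assumptions made for illustration. The example uses three translated copies of one piecewise linear cost function, here taken to be zero at t = 3 with slopes 0.5 before and 2 after, and hypothetical traffic amounts 20, 30 and 10.

```python
def arrival_preference_arcs(dest, steps, classes):
    """Arcs replacing the direct arcs (dest, n) -> tu.

    classes: list of (h_k, amount_k), one entry per arrival cost function.
    """
    inf = float("inf")
    arcs = []
    for k, (h_k, amount) in enumerate(classes, start=1):
        delta = ("delta", k)                       # the extra node for h_k
        for n in steps:
            # Arriving at time n with cost function h_k costs h_k(n).
            arcs.append(((dest, n), delta, inf, h_k(n)))
        # Only the traffic with cost function h_k may pass through delta.
        arcs.append((delta, "tu", amount, 0))
    return arcs


def h2(t):
    # Assumed cost function with desired arrival time t = 3.
    return 0.5 * (3 - t) if t < 3 else 2 * (t - 3)

classes = [(lambda t: h2(t + 1), 20), (h2, 30), (lambda t: h2(t - 1), 10)]
sink_arcs = arrival_preference_arcs("t", range(-1, 5), classes)
```

Since each cost function enters only through the arc costs, the same code covers both translated copies of one function and entirely different functions.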
5.3.6 Properties of the augmented graph
As the augmented graph is a special construction, we expect it to have

some properties that might be of use when solving the minimum cost
flow problem in it.
Proposition 5.2 The augmented graph has no directed cycles.
Proof. All nodes of the graph are one of the following types:
• Universal origin/destination. These only have arcs exiting/entering,

and can thus not be part of any cycle.
• Departure/arrival cost nodes. These only have arcs entering from/exiting

to the universal origin/destination, and can therefore not be part
of any cycles either, as such a cycle then would have to include the
universal origin/destination.
• Time discretization nodes. Since all arcs representing time dis-

cretizations of original arcs have a positive travel time, it is impos-
sible to depart from any time discretization node and return to this
node within the same time step.
In fact the graph is not only acyclic, but in the case when the latency
and capacity functions are constant over time, which is the case we are
the most interested in, it contains a plethora of equal paths from say
(s, tn ) to (t, tm ), and then also from (s, tn+k ) to (t, tm+k ) with k ∈ Z.
We might expect that if one of these is used, then so will many of the
other equal paths be. Unfortunately this information is not utilized by
the Simplex algorithms. In the algorithm in the last part of this section I
do utilize this information, to find a much faster solution method than
this augmented graph method. This comes at the cost of the loss of
flexibility in time varying latencies and capacities.
5.3.7 Examples of the augmented graph
I will demonstrate how to create and use the augmented graph by
presenting a small example. The network we will use is rather simple,
consisting only of an origin vertex and a destination vertex, and two
different arcs between these. One of the arcs has a shorter length than
the other, and also greater capacity, but as the desired flow through the
graph exceeds the time unit capacity of the shorter arc we expect that
both arcs will be used. The original graph is drawn in figure 13.
Figure 13: A sample network we discretize and do some calculations
with.
In the first example I will show the first approach to our time discrete
graph, that is, allowing only one arrival cost function. The time
discretization is done here by choosing ∆t = 1, and by using 6 time steps
tn , n = −1, . . . , 4. The arrival cost function h(t) used here is piecewise
linear around the desired arrival time t = 3, with h(t) = −0.5(t − 3),
t < 3 and h(t) = 2(t − 3), t > 3, and the departure cost is g(t) = 0. Thus
the costs along the arcs from the universal origin su to the time
discretizations (s, tn ) of the origin node are all zero, and the
capacities are infinite, and the arcs from the time discretizations of
the destination node (t, tn ) to the universal destination tu have costs
h(tn ) and infinite capacities. And also, the arcs from (s, tn ) to
(t, tn+1 ) have cost 1 and capacity 10, and the arcs from (s, tn ) to
(t, tn+2 ) have cost 2 and capacity 5. The resulting graph is shown in
figure 14.
Solving the problem of a minimum cost flow of value 60 we end up
with fully utilizing all the arcs marked in blue and cyan (dashed) colors,
and we also see that we could have used the green arcs (dotted) at the
expense of the cyan ones, something which would not make any
difference to the total cost. Interpreting this we see that the faster arc
will be used over a greater time interval than the slower one, but that
both of them will indeed be used. Adding together the cost of the different
components of the flow, we get a total cost of 122.5. Now the way we
have defined the cost along the arcs that correspond to arrival costs, we
have assumed that everyone arrives at exactly the same time, i.e. we have
chosen l((t, tn ) tu ) = h(tn ). This is of course slightly wrong. We could
instead have used the average cost of the flow with average arrival time
tn , and this will henceforth be used. Now since h is piecewise linear
this actually gives the same cost everywhere except where h has a break
point. In this example h only has one break point, namely at t = 3,
and the cost l((t, 3) tu ) here becomes 5/8 instead. Recalculating the total
cost then gives 131 7/8, which is indeed the cost of this flow when viewed
continuously as well.
Figure 14: An augmentation of the sample graph, with time discretization
unit 1. A minimum cost flow is shown in thick blue and cyan
(dashed) lines. The green (dotted) arcs have the same cost as the cyan,
and could have been used as well.
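Both totals can be reproduced with a small minimum cost flow computation. The sketch below is not the solver used in the thesis; it is a plain successive shortest path routine with Bellman-Ford, written for illustration. It assumes the arrival cost is zero at t = 3 with slopes 0.5 before and 2 after, leaves out the waiting arcs (they cannot lower the cost in this instance), and replaces infinite capacities by the flow value 60.

```python
from fractions import Fraction as Fr

def min_cost_flow(n, arcs, source, sink, demand):
    """Cost of a cheapest flow of value demand; arcs are (tail, head, cap, cost)."""
    head, cap, cost = [], [], []
    adj = [[] for _ in range(n)]
    for u, v, c, w in arcs:            # edge 2k is forward, edge 2k+1 its reverse
        adj[u].append(len(head)); head.append(v); cap.append(c); cost.append(w)
        adj[v].append(len(head)); head.append(u); cap.append(0); cost.append(-w)
    total, sent = 0, 0
    while sent < demand:
        dist, pred = [None] * n, [None] * n
        dist[source] = 0
        for _ in range(n - 1):         # Bellman-Ford on the residual graph
            for u in range(n):
                if dist[u] is None:
                    continue
                for e in adj[u]:
                    v = head[e]
                    if cap[e] > 0 and (dist[v] is None or dist[u] + cost[e] < dist[v]):
                        dist[v], pred[v] = dist[u] + cost[e], e
        if dist[sink] is None:
            raise ValueError("demand exceeds the capacity of the graph")
        push, v = demand - sent, sink
        while v != source:             # bottleneck capacity along the path
            push, v = min(push, cap[pred[v]]), head[pred[v] ^ 1]
        v = sink
        while v != source:             # augment along the path
            e = pred[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = head[e ^ 1]
        total, sent = total + push * dist[sink], sent + push
    return total

def example_cost(h):
    steps = range(-1, 5)
    su, tu = 0, 1
    s = {n: 2 + i for i, n in enumerate(steps)}
    t = {n: 8 + i for i, n in enumerate(steps)}
    arcs = []
    for n in steps:
        arcs.append((su, s[n], 60, 0))
        arcs.append((t[n], tu, 60, h(n)))
        if n + 1 in steps:
            arcs.append((s[n], t[n + 1], 10, 1))   # faster arc
        if n + 2 in steps:
            arcs.append((s[n], t[n + 2], 5, 2))    # slower arc
    return min_cost_flow(14, arcs, su, tu, 60)

def h_point(t):
    return Fr(1, 2) * (3 - t) if t < 3 else Fr(2) * (t - 3)

def h_average(t):
    return Fr(5, 8) if t == 3 else h_point(t)

cost_point = example_cost(h_point)      # 245/2 = 122.5
cost_average = example_cost(h_average)  # 1055/8 = 131 7/8
```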
Solving analytically we get that the faster arc will be in full use in the
time interval that causes arrivals in the interval (−26/30, 119/30) and the
slower arc in the time interval that causes arrivals in the interval
(34/30, 104/30). Integrating the flow cost terms here we get a total cost
of 123 5/6. We see that we here have a better solution than the one we got
by using the augmented graph, which gave a solution that was roughly 6.5%
more expensive. But remember that in the solution of the minimum cost flow
problem we had several equally expensive choices for routing the most
expensive part of the flow. If we had chosen to divert some flow to each
of them, we should expect to get a better result than we did. This is
again because the costs along the arrival cost arcs are based on arrival at
the mean time of the time interval represented by that arc, and if we had
only used a bit of that interval, we could have chosen the cheaper part
of it, thus obtaining a lower cost than our graph model shows.
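The analytic solution can be retraced with exact arithmetic. The sketch below assumes that both arcs are in use, that g = 0, and that h is zero at t = 3 with slopes 0.5 before and 2 after. Each arc is then used for exactly those arrival times at which travel time plus h(t) does not exceed a common cost level C, and C is fixed by the total flow of 60. The variable names are chosen for illustration.

```python
from fractions import Fraction as Fr

arcs = [(Fr(1), Fr(10)), (Fr(2), Fr(5))]        # (travel time, capacity)
demand = Fr(60)
early, late, t_star = Fr(1, 2), Fr(2), Fr(3)    # slopes of h, desired arrival time

# An arc with travel time l is used for arrivals t with l + h(t) <= C,
# which is an interval of length (C - l) * (1/early + 1/late).
width = 1 / early + 1 / late
C = (demand + sum(cap * l for l, cap in arcs) * width) / (
    sum(cap for _, cap in arcs) * width)

total = Fr(0)
intervals = []
for l, cap in arcs:
    lo, hi = t_star - (C - l) / early, t_star + (C - l) / late
    intervals.append((lo, hi))
    # The integral of h over [lo, hi] is two triangles of common height C - l.
    h_integral = (C - l) * (hi - lo) / 2
    total += cap * (l * (hi - lo) + h_integral)
```

With these numbers the common cost level is C = 44/15, the intervals are (−26/30, 119/30) and (34/30, 104/30), and the total cost is 743/6 = 123 5/6.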
Pursuing this idea I tried moving the desired arrival time from t = 3
to t = 2.5, which then changes only the costs along the last arcs in the
graph. Solving the minimum cost flow problem here gives a flow with
value 60 and cost only 125, or less than 1% more than the optimal cost!
This solution is shown in figure 15. We see that here we have no
alternative choices of arcs that would give the same total cost. Now the
optimal solution would still use some of the unused arcs to a very small
extent, at the expense of the most expensive choices here, but I think we
are pretty close, and with a rather simple discretization. We see, how-
ever, that we could also have bad luck when choosing our discretization.
An upper bound on the error of the cost obtained could be nice.
Now for the case with different arrival cost functions. Assume three
different arrival cost functions h1 , h2 , h3 with h2 as the cost function h
used above, with desired arrival time t = 3, and with h1 (t) = h2 (t + 1)
and h3 (t) = h2 (t − 1). The amount of traffic that has each cost function
is 20, 30 and 10 respectively. The augmented graph is set up as above,
except for the last arcs. These now go to the three different vertices
for the different arrival cost functions, and then there are arcs from
each of these to the universal destination vertex. The graph is shown in
figure 16. Again the solution to the minimum cost flow problem with
a flow of value 60 is shown with the arcs in blue being fully utilized. The
solution has a total cost of 105.
Figure 15: The same graph augmented, but with the time discretization
translated ½ time unit. This results in a much cheaper minimum cost
flow, in thick blue.
Figure 16: Again the same graph, but with three different arrival cost
functions. The thick blue lines indicate again the minimum cost flow.
5.3.8 Multi-commodity planning
Having dealt with the case of routing a flow from an origin to a desti-
nation in a network across different times, we see that there’s a fairly
obvious generalization we should look at, and which we have already
mentioned: Dynamic networks with several origin/destination pairs.
How to construct the augmented graph in this case is rather straightfor-
ward: Just add universal origins and destinations (or sinks and sources),
possibly with corresponding departure or arrival cost nodes, for each
of the sources and sinks in the original network. Thus we get several
universal sources and sinks that will be used as the sources and sinks of
our new multi-commodity flow problem in the time discrete graph.
To solve the multi-commodity flow problem we need only solve the LP
problem we get from formulating the graph’s flow constraints and costs,
like in the static multi commodity case.
Thus we see that practically any dynamic system optimality problem can
be solved this way, but possibly resulting in a large graph for which the
multi-commodity flow problem takes a long time to solve. Both the number
of unknowns and the number of constraints are proportional to both the
number of commodities and the number of vertices. In the brief time
analysis of the Simplex algorithm we used the dimensions n, m of the
constraint matrix as a measure of problem size. Here n, m ∈ O(|T ||I|).
We get slightly better results for the single commodity case, but the
time discretization can still be a problem if the network is already large.
5.4 Chain decomposable flows

As we saw in the previous section we could use a time augmented graph
to solve the dynamic system optimal problem, assuming all latency func-
tions were constant with regard to flow. Although problems are solvable
by this method, the size of the augmented graph can be a problem, as
the number of arcs and nodes are proportional to the number of time
discretization steps. Recent work on dynamic flows [4] has been a break-
through in this area, and Hoppe has come up with an algorithm to solve
several dynamic flow problems in true polynomial time, regardless of
the time discretization chosen! The problems solved in this fashion are
• The dynamic flow problem: Finding a dynamic flow from one source
to one sink satisfying a given supply within a certain time interval.
• Quickest dynamic flow problem: Finding a dynamic flow in as short

a time as possible. Uses the above algorithm with binary search for
smallest possible time interval.
• Lexicographical maximum dynamic flow problem: Maximizing the

flow between source/sink pairs in a prescribed order of importance
during a given time interval.
• The dynamic transshipment problem: Finding a dynamic flow sat-

isfying given supplies for several source/sink pairs within a certain
time interval. Modifies the network so the problem is solvable by
the above algorithm.
• Quickest dynamic transshipment problem: Finding a dynamic trans-
shipment flow in as short a time as possible. Uses the above algo-
rithm with binary search for smallest possible time interval.
The algorithms devised by Hoppe build on the rather simple successive

shortest path algorithm for finding a maximum s −t-flow in a graph with
capacities, but include considerations regarding the time discretization.
Instead of only assigning flow along the computed shortest paths of the
augmented graph, the flows are assigned during a maximal time interval;
starting from the source at time step 0 and ending so that the last part
of the flow reaches the sink in the terminal time step T . The work in [4]
assumes an integer time discretization of the original graph, and thus
integer latency functions, as this makes some theoretical results easier.
I try to work with arbitrary positive latency functions.
5.4.1 System optimal solution with chain flows
Let us now reconsider the dynamic system optimal planning in the con-
stant latency setting. To summarize we have a network consisting of a
graph G with several origin-destination vertex pairs (si , ti ), each with a
supply/demand. Each arc a has a constant latency function la (x) = la
and a capacity c(a), c : A → Q+ . Each commodity also has a departure
cost function gi (t) and an arrival cost function hi (t) applying when
flow leaves the source or enters the sink.
Looking back at the general form of the system optimal planning prob-
lem (11 - 17) we see that the constant latency functions simplify the ob-
jective function slightly and the constraints massively. This is because
we no longer have the complicated relation between inflow and outflow
to each arc. We denote the inflow to arc a at time t by xa (t), and the
outflow at time t + la is then equal to xa (t).
We still have the objective function split in two parts. This really is just
like in the augmented graph above, but from a slightly different view-
point, as we no longer make a time discretization. The total travel time
is now:
$$\sum_{a \in A} \int_0^T x_a(t)\, l_a \, dt \tag{28}$$
And the total departure and arrival time deviation cost is:
$$\sum_i \int_0^T b_i(s_i, t)\, g_i(t) - b_i(t_i, t)\, h_i(t) \, dt \tag{29}$$
The constraints apply just as before, but with the much simpler relation
between inflows and outflows.
If we choose to optimize with regards to the travel time only our
problem really simplifies. Since the total travel time is also the integral
over all traffic of the travel time experienced by each travel agent, it is
clear that minimizing the travel time of each infinitesimal piece of flow
minimizes the total travel time. But this is just the same as calculating
a shortest path through the graph, and routing traffic only along this
path until all the supply is satisfied! Of course this might lead to a
really long time during which traffic flows in the graph, and commuters
might be almost arbitrarily late (or early) for work. Thus minimizing
the total travel time alone makes little sense in the dynamic setting. On
the other hand optimizing only with regard to the departure and arrival
costs we try to find a cheapest (and thus shortest) possible set of time
intervals in which flow leaves the sources and enters the sinks. This is
close to the quickest dynamic flow problem for one origin/destination
pair, or the quickest transshipment problem with several pairs [4]. And
as we remember these problems are now solvable in polynomial time
with chain decomposition! But since travel time is not regarded at all,
disproportionally slow routes may be utilized, leading to a solution that
is not system optimal in the full sense. It is again not good enough
to consider only one part of the objective function. However using the
ideas pursued by Hoppe we might hope to devise a faster way of solving
the full dynamic system optimality problem by eliminating the use of
the time-augmented graph!
We still assume that all latency functions are constant, and that we
have capacity constraints on the arcs of our graph. We also assume that
all traffic shares the same convex departure and arrival cost function
g(t), h(t). I’ll start with the case of a single commodity, or dynamic s −t-
flow. What I propose is that the following Dynamic System Optimality
algorithm solves the system optimality problem with these assumptions.
This algorithm is based on the Successive Shortest Path algorithm in
section 4.5.2.
Let us first define a chain (flow):
Definition 5.1 A chain Fi in network G with source s and sink t is a
quadruple Fi = (Pi , ci , t_i^s , t_i^e ). Here Pi is an s − t-path in the
undirected underlying graph of G. ci is a real number denoting the
constant amount of flow along Pi , such that Fi sends flow of value ci
along arcs included in Pi with their positive direction, and canceling
flow of value −ci along arcs included in Pi with their negative
direction. And finally there are two reals denoting the time interval in
which flow runs along Pi ; the starting time t_i^s at source s and the
ending time t_i^e at sink t.
Let F = {Fi } be a set of chains, and let li = l(Pi ) be the length of Pi for
compactness. Then the total dynamic flow F : A × R → R induced by F
is the sum of all chains in F , and the total amount of flow is
$$f_{tot} = \sum_i c_i \left( t_i^e - t_i^s - l_i \right)$$
We also denote the static flow induced by the k first elements of F by fk .
Let fdemand be the total amount of flow needed from s to t. Let Cmax be
the maximum individual cost associated with the current flow, and let
C_max^old , C_max^new be lower and upper bounds on Cmax . Let Gfi be the
residual graph associated with G and the static flow fi . Let also
$$H_i(t) = l_i + g(t - l_i) + h(t)$$
Hi (τ) is then the total cost incurred by a traveler using path Pi at a time
such that she arrives at the sink t at time τ. We assume that each Hi (t)
attains a minimum for some t.
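The chain representation is compact enough to write down directly. In the sketch below the class name `Chain`, its field names and the helper functions are assumptions made for illustration; the path length li is stored alongside the quadruple for convenience, although it is determined by the path.

```python
from dataclasses import dataclass
from fractions import Fraction as Fr

@dataclass
class Chain:
    path: tuple       # vertices of P_i, from s to t
    rate: Fr          # c_i, constant flow rate along the path
    start: Fr         # t_i^s, first departure from the source
    end: Fr           # t_i^e, last arrival at the sink
    length: Fr        # l_i = l(P_i)

    def flow_value(self):
        # c_i * (t_i^e - t_i^s - l_i): rate times length of the arrival interval
        return self.rate * (self.end - self.start - self.length)

def total_flow(chains):
    """f_tot for a set of chains."""
    return sum(chain.flow_value() for chain in chains)

def individual_cost(chain, arrival, g, h):
    """H_i(arrival) = l_i + g(arrival - l_i) + h(arrival)."""
    return chain.length + g(arrival - chain.length) + h(arrival)


# Two hypothetical chains on parallel s-t arcs with lengths 1 and 2.
chains = [
    Chain(("s", "t"), Fr(10), Fr(-13, 15) - 1, Fr(119, 30), Fr(1)),
    Chain(("s", "t"), Fr(5), Fr(17, 15) - 2, Fr(52, 15), Fr(2)),
]
f_tot = total_flow(chains)
```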
Dynamic System Optimum algorithm
C_max^old , C_max^new ← 0
F ← ∅
G′_0 ← G
i ← 0
while ftot < fdemand
    i ← i + 1
    Pi ← shortest path in G′_{i−1}
    if ∄ Pi
        C_max^new ← ∞
        BREAK
    else
        ci ← capacity of Pi
        C_max^new ← min{Hi (t)}
        t_i^s ← min{t : Hi (t + li ) = C_max^new }
        t_i^e ← max{t : Hi (t) = C_max^new }
        for j = 1, . . . , i − 1
            t_j^s ← min{t : Hj (t + lj ) = C_max^new }
            t_j^e ← max{t : Hj (t) = C_max^new }
        if ftot ≥ fdemand
            BREAK
        else
            F ← F ∪ {(Pi , ci , t_i^s , t_i^e )}
            Gfi ← Gfi−1 updated with flow ci along Pi
            C_max^old ← C_max^new
find Cmax ∈ [C_max^old , C_max^new ] s.t. ftot = fdemand
    when t_i^s , t_i^e are updated according to Cmax
F induces a system optimal dynamic flow F
The idea behind this algorithm is simple enough: Find a shortest path
in the residual graph and determine the minimum cost for anyone using
it, and the time at which this minimum is attained. Use this path ini-
tially at only the minimum cost time. Then increase the time interval for
which this path is in use, until some other path (possibly with negative
arcs, meaning a modification to already existing flow) becomes equally
expensive at its minimum cost time. Then increase the time interval for
which both these paths are in use, until a third path becomes equally
expensive. And so on.
The chains are chosen and updated such that no infinitesimal chain in
use is more expensive than any not in use, and such that no infinitesimal
chain in use has cost greater than $C_{max}^{new}$. In addition all infinitesimal
chains with cost less than or equal to $C_{max}^{old}$ are in use at the start of each
loop iteration, so the $C_{max}$ that gives a feasible total flow is always in
the interval $[C_{max}^{old}, C_{max}^{new}]$.
The total cost of the dynamic flow is the same as the sum of the costs
of each of the chains. So a chain with flow along negative arcs takes into
account the modification done to already existing static flow, and these
chains are then chosen in such a way that the total cost is always the
lowest. Note also that the total cost of dynamic flow F induced by F can
be much more compactly given with the chain representation. The total
cost of chain Fi is
\[ \int_{t_i^s + l_i}^{t_i^e} H_i(t)\,dt
   = \int_{t_i^s}^{t_i^e - l_i} g(t)\,dt + \int_{t_i^s + l_i}^{t_i^e} h(t)\,dt + (t_i^e - t_i^s - l_i)\, l_i \]
Notice that which arcs the chains use is of no direct importance, as the
total cost is fully determined by outflow from the source and inflow to
the sink.
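The identity above can be checked numerically. The following is a minimal sketch, not part of the thesis code: the class name ChainCostCheck and its helper methods are hypothetical, and the functions g(t) = -t/2, h(t) = t/2 + |t| together with the chain data l = 3, t^s = -7, t^e = 4 are chosen here purely for illustration. It integrates both sides with the midpoint rule and compares them.

```java
import java.util.function.DoubleUnaryOperator;

public class ChainCostCheck {

    /** Midpoint-rule approximation of the integral of f over [a, b] using n cells. */
    static double integrate(DoubleUnaryOperator f, double a, double b, int n) {
        double width = (b - a) / n;
        double sum = 0.0;
        for (int k = 0; k < n; k++) {
            sum += f.applyAsDouble(a + (k + 0.5) * width);
        }
        return sum * width;
    }

    /** Left-hand side: integral of H(t) = l + g(t - l) + h(t) over arrival times [ts + l, te]. */
    static double costByArrival(DoubleUnaryOperator g, DoubleUnaryOperator h,
                                double l, double ts, double te, int n) {
        DoubleUnaryOperator bigH = t -> l + g.applyAsDouble(t - l) + h.applyAsDouble(t);
        return integrate(bigH, ts + l, te, n);
    }

    /** Right-hand side: departure deviation + arrival deviation + travel time. */
    static double costBySplit(DoubleUnaryOperator g, DoubleUnaryOperator h,
                              double l, double ts, double te, int n) {
        return integrate(g, ts, te - l, n)
                + integrate(h, ts + l, te, n)
                + (te - ts - l) * l;
    }

    public static void main(String[] args) {
        // Assumed illustration data (not prescribed by the formula itself).
        DoubleUnaryOperator g = t -> -0.5 * t;
        DoubleUnaryOperator h = t -> 0.5 * t + Math.abs(t);
        double l = 3, ts = -7, te = 4;
        System.out.println("integral of H : " + costByArrival(g, h, l, ts, te, 1000));
        System.out.println("split form    : " + costBySplit(g, h, l, ts, te, 1000));
    }
}
```

Both numbers should agree up to rounding, which reflects that only the departure interval at the source, the arrival interval at the sink and the path length enter the cost.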
Before proving correctness of the algorithm, let’s look at an example
application. The graph we consider is shown in figure (17). Here the
departure and arrival deviation cost functions are
\[ g(t) = -\frac{1}{2} t \]
\[ h(t) = \frac{1}{2} t + |t| \]
The first chain found by the algorithm follows path $P_1 = (s - u - v - t)$
which has total latency $l_1 = 3$ and capacity $c_1 = 3$. We see that
$H_1(t) = \frac{9}{2} + |t|$ which is minimal at t = 0. Thus $F_1 = ((s - u - v - t), 3, -3, 0)$ is
added to $\mathcal{F}$ at the end of the first loop iteration, with $C_{max}^{old} = \frac{9}{2}$.
The residual graph $G'_1$ resulting from $f_1$ is shown in figure (18). The
shortest path in $G'_1$ is $P_2 = (s - v - u - t)$ with total latency $l_2 = 5$ and
capacity $c_2 = 1$. This gives $H_2(t) = \frac{15}{2} + |t|$ which is minimal at t = 0.
Thus $C_{max}^{new}$ is raised to $\frac{15}{2}$, which increases the time interval of use of $F_1$
to $t_1^s = -6$, $t_1^e = 3$. This gives a total flow of $3(3 - (-6) - 3) = 18 < 26$,
so $F_2 = ((s - v - u - t), 1, -5, 0)$ is added to $\mathcal{F}$ at the end of the second
iteration, with $C_{max}^{old} = \frac{15}{2}$.

Figure 17: An example constant latency graph, with latencies and capacities
shown on each edge, and the total supply shown in the terminal
vertices.

In the third iteration the only s − t-path in $G'_2$ is the slow arc st. We
get $P_3 = (s - t)$ with latency $l_3 = 7$ and capacity $c_3 = 2$. This gives
$H_3(t) = \frac{21}{2} + |t|$ which is minimal at t = 0 with minimum value $\frac{21}{2}$. Raising
$C_{max}^{new}$ to $\frac{21}{2}$ we increase the intervals of use of $F_1, F_2$ to $t_1^s = -9$, $t_1^e = 6$ and
$t_2^s = -8$, $t_2^e = 3$ which gives a total flow of $3(6 - (-9) - 3) + 1(3 - (-8) - 5) = 42 > 26$.
Thus we break the loop and search for a $C_{max} \in [\frac{15}{2}, \frac{21}{2}]$
that gives the correct amount of total flow. Since we have piecewise
linear cost functions in this example we quickly find that $C_{max} = \frac{17}{2}$
gives the intervals $t_1^s = -7$, $t_1^e = 4$ and $t_2^s = -6$, $t_2^e = 1$ resulting in exactly
26 total flow.
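The arithmetic of this example can be retraced with a short program. This is only a sketch under stated assumptions: the class DynamicOptimumExample and the type PathData are hypothetical names, the three paths are fed in by hand in the order the algorithm finds them (no shortest path computation is done), and the closed forms for the interval end points rely on $H_i(t) = \frac{3}{2} l_i + |t|$, which holds only for the g and h of this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class DynamicOptimumExample {

    /** Latency l and capacity c of one path (hypothetical helper type). */
    static final class PathData {
        final double latency;
        final double capacity;

        PathData(double latency, double capacity) {
            this.latency = latency;
            this.capacity = capacity;
        }
    }

    /** min H_i(t); with g(t) = -t/2 and h(t) = t/2 + |t| we get H_i(t) = 3l/2 + |t|. */
    static double minCost(PathData p) {
        return 1.5 * p.latency;
    }

    /** Latest arrival time t^e with H_i(t^e) = cMax. */
    static double endTime(PathData p, double cMax) {
        return cMax - minCost(p);
    }

    /** Earliest departure time t^s with H_i(t^s + l) = cMax. */
    static double startTime(PathData p, double cMax) {
        return -(cMax - minCost(p)) - p.latency;
    }

    /** Total flow sum_i c_i (t_i^e - t_i^s - l_i) at cost level cMax. */
    static double totalFlow(List<PathData> chains, double cMax) {
        double total = 0.0;
        for (PathData p : chains) {
            double duration = endTime(p, cMax) - startTime(p, cMax) - p.latency;
            total += p.capacity * Math.max(0.0, duration);
        }
        return total;
    }

    /**
     * Walks through the given successive shortest paths, fills chainsInUse,
     * and returns C_max. Assumes positive demand and positive capacities.
     */
    static double solve(List<PathData> paths, double demand, List<PathData> chainsInUse) {
        double cOld = 0.0;
        double cNew = Double.POSITIVE_INFINITY;
        for (PathData next : paths) {
            cNew = minCost(next);
            if (totalFlow(chainsInUse, cNew) >= demand) {
                break; // the chains already in use suffice
            }
            chainsInUse.add(next);
            cOld = cNew;
            cNew = Double.POSITIVE_INFINITY; // no upper bound known yet
        }
        if (chainsInUse.isEmpty()) {
            throw new IllegalArgumentException("no usable s-t path");
        }
        if (Double.isInfinite(cNew)) {
            // Ran out of paths: grow the upper bound until the demand is covered.
            cNew = cOld + 1.0;
            while (totalFlow(chainsInUse, cNew) < demand) {
                cNew = cOld + 2.0 * (cNew - cOld);
            }
        }
        // Bisection for C_max in [cOld, cNew]; totalFlow is non-decreasing in cMax.
        double lo = cOld, hi = cNew;
        for (int k = 0; k < 100; k++) {
            double mid = 0.5 * (lo + hi);
            if (totalFlow(chainsInUse, mid) < demand) {
                lo = mid;
            } else {
                hi = mid;
            }
        }
        return hi;
    }

    public static void main(String[] args) {
        List<PathData> paths = Arrays.asList(
                new PathData(3, 3), new PathData(5, 1), new PathData(7, 2));
        List<PathData> used = new ArrayList<>();
        double cMax = solve(paths, 26, used);
        System.out.println("C_max = " + cMax);
        for (PathData p : used) {
            System.out.println("l = " + p.latency + ": t^s = " + startTime(p, cMax)
                    + ", t^e = " + endTime(p, cMax));
        }
    }
}
```

Run on the three paths of the example with demand 26, the sketch should end with two chains in use and a cost level of 17/2, matching the intervals found by hand above.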
Figure 18: The residual graph G1′ . We see that the shortest s − t-path is
now (svut).
The resulting dynamic flow is shown in 12 time snaps in figure (19).

Here F1 is shown in light green (full-drawn) and F2 is shown in red (dot-
ted). Note that F2 starts flowing from u before it reaches v, which is
before u in P2 . This is of course because it travels along uv in the nega-
tive direction, with negative travel time. This results in a flow of value 2
along uv when both chains use the arc, and 3 when only F1 uses it. The
arcs unique to F1 are not affected by F2 .
The rest of this section will be an attempt to prove the following
theorem:
Figure 19: The system optimal dynamic flow shown at 12 different time
steps. Green (full drawn) represents the chain F1 , while red (dotted)
represents F2 .
The numbers inside the vertices are the numbers of the slices they belong
to (Definition 5.3 below).
Theorem 5.5 The Dynamic System Optimum algorithm above finds a dynamic
s − t-flow that is feasible and minimum cost.
Proving that this theorem is correct will be some work. First let’s show
that it produces a dynamic flow that is feasible at all times. For this we
need the following lemma:
Lemma 5.1 Suppose f is a minimum cost static flow in G, and let static
flow g augment f along a shortest s − t-path in residual graph Gf . Then
(1) f + g is a minimum cost static flow in G and (2) for any vertex v the
distance from s to v in Gf is less than or equal to the distance in Gf +g , or
df (s, v) ≤ df +g (s, v), and df (v, t) ≤ df +g (v, t).

If we denote the static flows induced by $\bigcup_{i=1}^k F_i$, the first k elements of $\mathcal{F}$,
by $f_k$ for each k, it is clear that all these are feasible minimum cost static
flows for their respective values, as these are calculated exactly as in the
Successive Shortest Path algorithm [1].
If we can show that for each vertex v the time intervals $\tau_{F_{i_m}}(v)$ in which
each $F_{i_m}$ covers v are ordered with regard to inclusion, we must therefore
have that all constraints are satisfied at all vertices (and arcs) at all times.
Lemma 5.2 For any vertex v and any chains Fi , Fj , j < i covering v at
time intervals τFi (v), τFj (v) we have
τFi (v) ⊂ τFj (v)
Proof. For the source and sink vertices this is easy to show, as for each
$F_i$ we have
\[ H_i(t_i^s + l_i) = C_{max} = H_i(t_i^e) \]
So for the starting times we have
\[ g(t_i^s) + h(t_i^s + l_i) + l_i = g(t_{i-1}^s) + h(t_{i-1}^s + l_{i-1}) + l_{i-1} \]
By assumption on h we have $h'(t) \ge -1$ which gives
\[ g(t_{i-1}^s) + h(t_{i-1}^s + l_{i-1}) + l_{i-1} \le g(t_{i-1}^s) + h(t_{i-1}^s + l_i) + l_i \]
Now since $t_i^s$ is such that
\[ g(t_i^s) + h(t_i^s + l_i) + l_i = C_{max} \]
where the left hand side is non-increasing, and since
\[ g(t_i^s) + h(t_i^s + l_i) \le g(t_{i-1}^s) + h(t_{i-1}^s + l_i) \]
we can conclude that
\[ t_i^s \ge t_{i-1}^s \]
Similarly, with the assumption that $g'(t) \le 1$, we get that
\[ g(t_i^e - l_i) + h(t_i^e) \le g(t_{i-1}^e - l_i) + h(t_{i-1}^e) \]
and again $t_i^e$ is such that
\[ g(t_i^e - l_i) + h(t_i^e) = C_{max} \]
where the left hand side is non-decreasing. So we also get
\[ t_i^e \le t_{i-1}^e \]
So the inclusion holds at the terminal nodes.
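The step that uses the bound on h' can be spelled out as follows. This is a sketch under two assumptions that are mine rather than stated at this point in the text: h is absolutely continuous (so that it is the integral of its derivative), and $l_i \ge l_{i-1}$, which is what Lemma 5.1 gives for successive shortest paths when applied to v = t.

```latex
\begin{align*}
\Delta &:= l_i - l_{i-1} \ge 0, \\
h(x + \Delta) - h(x) &= \int_x^{x + \Delta} h'(u)\,du \;\ge\; -\Delta
  \qquad \text{for every } x, \\
\text{so with } x = t_{i-1}^s + l_{i-1}: \quad
h(t_{i-1}^s + l_{i-1}) + l_{i-1} &\le h(t_{i-1}^s + l_i) + l_i .
\end{align*}
```

The argument for the ending times is symmetric, using $g'(t) \le 1$ and shifting the argument of g by $-\Delta$.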

Now assume $P_i$ is a shortest path in $G'_{i-1}$, and consider any vertex
v covered by $F_i$ at some time. Assume that $F_i$ reaches v before some
$F_j$, j < i, and let $Q_i, Q_j$ be the s − v-path components of $P_i, P_j$ respectively.
Since $t_j^s \le t_i^s$ this means that $l(Q_i) < l(Q_j)$. Now $Q_i$ is a shortest
s − v-path in $G'_{i-1}$. If it wasn't then a shorter s − v-path could be combined
with the v − t-component of $P_i$ to make a shorter s − t-path than
$P_i$, which contradicts $P_i$ being a shortest s − t-path in $G'_{i-1}$. Similarly $Q_j$
is a shortest s − v-path in $G'_{j-1}$. Then we have
\[ l(Q_i) = d_{f_i}(s, v) < d_{f_j}(s, v) = l(Q_j) \]
This contradicts lemma (5.1), and thus $F_i$ cannot reach any vertex v before
any $F_j$, j < i.
Similarly we prove that each chain Fi leaves each vertex v before each
Fj , j < i. So the interval in which each chain covers each vertex is or-
dered by inclusion, and thus all constraints are satisfied at all vertices
(and arcs) at all times, and the dynamic flow is at all times feasible.
What remains is to show that the dynamic flow F is minimum cost
of all dynamic flows with the same value, when considering both travel
time and arrival deviation costs. This is of course the tricky part.
Let the dynamic flow found by the algorithm be F and let Z be a
minimum cost dynamic flow with the same value. What I want to show
is that the cost of Z is equal to the cost of F . My strategy for proving
this is to first split the difference Z − F between the two dynamic flows
up into smaller, independent parts, and then show that for each of these
parts the total cost is non-negative. Then since the difference Z − F is
the sum of all the smaller parts, we also get that the cost of Z − F is
non-negative, which implies that F is also minimum cost.
I will need some lemmas for this second part of the proof. First a
definition.
Definition 5.2 A static s −t-flow f is extreme if it is minimum cost among
all static s − t-flows with the same value.
So all flows in the Successive Shortest Path algorithm, and also the static
flows in this algorithm, are extreme. There is a useful characterization
of an extreme flow:
Lemma 5.3 A flow f is extreme if and only if the residual graph Gf con-
tains no negative cost cycles (or circulations).
A proof can be found in [1].
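The characterization in Lemma 5.3 suggests a simple computational test for extremeness: build the residual graph and look for a negative cost cycle. The sketch below uses the Bellman-Ford relaxation scheme for this. It is an illustration only; the class NegativeCycleCheck, the array-based arc representation and the three-vertex residual graphs in main are assumptions made here, not something taken from [1] or from the appendix code.

```java
public class NegativeCycleCheck {

    /**
     * Returns true if the directed graph with vertices 0..n-1 and arcs
     * (from[k], to[k]) of cost cost[k] contains a negative cost cycle.
     */
    static boolean hasNegativeCycle(int n, int[] from, int[] to, double[] cost) {
        // All labels start at 0, which acts like a virtual source joined to
        // every vertex by a zero cost arc, so cycles anywhere are detected.
        double[] dist = new double[n];
        for (int round = 0; round < n; round++) {
            boolean changed = false;
            for (int k = 0; k < from.length; k++) {
                double candidate = dist[from[k]] + cost[k];
                if (candidate < dist[to[k]] - 1e-12) {
                    dist[to[k]] = candidate;
                    changed = true;
                }
            }
            if (!changed) {
                return false; // labels are stable: no negative cycle
            }
        }
        return true; // still improving after n rounds: negative cycle
    }

    public static void main(String[] args) {
        // Hypothetical network: s=0, a=1, t=2 with unit capacity arcs
        // s->a (cost 1), a->t (cost 1), s->t (cost 3), and one unit of flow.

        // Flow sent along s->t: residual arcs s->a, a->t and the reverse arc t->s.
        boolean slowFlow = hasNegativeCycle(3,
                new int[] {0, 1, 2}, new int[] {1, 2, 0}, new double[] {1, 1, -3});

        // Flow sent along s->a->t: residual arcs a->s, t->a and the unused s->t.
        boolean fastFlow = hasNegativeCycle(3,
                new int[] {1, 2, 0}, new int[] {0, 1, 2}, new double[] {-1, -1, 3});

        System.out.println("flow on s->t has a negative residual cycle: " + slowFlow);
        System.out.println("flow on s->a->t has a negative residual cycle: " + fastFlow);
    }
}
```

In this toy network the unit of flow on the direct arc costs 3 while the detour costs 2, so only the second flow is extreme, and only the first residual graph contains a negative cycle (s, a, t, s with cost 1 + 1 - 3).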

This is only valid for static flows and networks, whereas my network
is dynamic. I want to prove the similar claim that the dynamic residual
graph $G_F$ contains no negative length cycle. To do this I introduce
the concept of a slice. In the proof of feasibility above we saw
that for a vertex v the time intervals $\tau_{F_{i_m}}(v)$ for which chains $F_{i_m}$ cover
v are ordered by inclusion. Then we can look at the time intervals
in which each vertex v sees the residual graph $G_{f_j}$, that is the
time intervals in which $F_{i_m}$, $i_m \le j$ cover v, but no $F_{i_m}$, $i_m > j$ cover
v. (Note that the residual graphs $G_{f_j}$ seen by vertex v will be the
same for $i = i_m, \ldots, i_{m+1} - 1$, but we still consider them separately.)
Then the (possibly empty) time intervals in which vertex v sees residual
graph $G_{f_j}$ are
\[ T_j^1(v) = \left[ t_j^s + d_{f_{j-1}}(s, v),\; t_{j+1}^s + d_{f_j}(s, v) \right] \]
and
\[ T_j^2(v) = \left[ t_{j+1}^e - l_{j+1} + d_{f_j}(s, v),\; t_j^e - l_j + d_{f_{j-1}}(s, v) \right] \]
for j < K, and for j = K the time interval will be
\[ T_K^1(v) = T_K^2(v) = \left[ t_j^s + d_{f_{j-1}}(s, v),\; t_j^e - l_j + d_{f_{j-1}}(s, v) \right] \]
Also define $t_0^s = -\infty$, $t_0^e = \infty$. Then we define the slice:
Definition 5.3 The j-th slice is the union
\[ \bigcup_{v \in V} v \times T_j^1(v) \]
of all vertices v at their respective time intervals $T_j^1(v)$, the (2K − j)-th
slice is the union
\[ \bigcup_{v \in V} v \times T_j^2(v) \]
of all vertices v at their respective time intervals $T_j^2(v)$.
And with this we can prove the important lemma:
Lemma 5.4 It is impossible to go from an i-th slice to a j-th slice with

j < i in the dynamic residual graph GF .
Proof. Look at (v, t) in the i-th slice. Then either
(1) $t \in \left[ t_i^s + d_{f_{i-1}}(s, v),\; t_{i+1}^s + d_{f_i}(s, v) \right]$ or
(2) $t \in \left[ t_{i+1}^e - l_{i+1} + d_{f_i}(s, v),\; t_i^e - l_i + d_{f_{i-1}}(s, v) \right]$ or
(3) if i = K then $t \in \left[ t_i^s + d_{f_{i-1}}(s, v),\; t_i^e - l_i + d_{f_{i-1}}(s, v) \right]$.
(2) Look at vertex u. Then by traveling within the i-th slice the earliest
we can get to u is
\[ t + d_{f_i}(v, u) \ge t_{i+1}^e - l_{i+1} + d_{f_i}(s, v) + d_{f_i}(v, u) \]
But we know that
\[ d_{f_i}(s, v) + d_{f_i}(v, u) \ge d_{f_i}(s, u) \]
so this leads to
\[ t + d_{f_i}(v, u) \ge t_{i+1}^e - l_{i+1} + d_{f_i}(s, u) \]
which is also in the i-th slice.

(1,3) Look at vertex u. Then by traveling within the i-th slice the earliest
we can get to u is
t + dfi (v, u) ≥ tis + dfi−1 (s, v) + dfi (v, u)
Now assume
dfi−1 (s, v) + dfi (v, u) < dfi−1 (s, u)
Let Psv be a shortest s − v-path in Gfi−1 and Pvu a shortest v − u-path
in Gfi . Then clearly Pvu ∩ Pi−1 ≠ ∅, since Fi must have made a shorter
path from v to u possible. With Pvu = a1 a2 . . . am let w be such that i
is maximal when ai = yw ∈ Pvu ∩ Pi−1 for some vertex y, and let x be
such that i is minimal when ai = zx ∈ Pvu ∩ Pi−1 for some vertex z. Let
Pwx be the w − x-component of Pi and Pxw the x − w-component of Pvu .
Then we must have
\[ l(P_{xw}) + l(P_{wx}) = 0 \qquad (30) \]
Clearly it cannot be greater, as $P_{xw}$ can be no longer than $P_{wx}^{-1}$, and if it
was less then a flow of value ε along $P_{wx}$ and back along $P_{xw}$ would be
a negative cost circulation, contradicting the fact that $f_{i-1}$ is extreme.
We can rewrite the assumption above as
l(Psv ) + l(Pvu ) < l(Psu ) (31)
Now since Fi only affects Pvu along Pxw we can split Pvu in Pvx , Pxw , Pwu
where the first and last ones are paths in Gfi−1 . It is clear that
l(Psw ) + l(Pwu ) ≥ l(Psu ) (32)
But then (32) and (31) together with the splitting of Pvu become
l(Psw ) + l(Pwu ) > l(Psv ) + l(Pvx ) + l(Pxw ) + l(Pwu ) (33)
Together with (30) we finally get
l(Psw ) + l(Pwx ) > l(Psv ) + l(Pvx ) (34)
which contradicts Pi being a shortest s − t-path in Gfi−1 !

Thus
tis + dfi−1 (s, v) + dfi (v, u) ≥ tis + dfi−1 (s, u)
which is again in the i-th slice.
So it is impossible to get to an earlier slice by traveling within one slice
(or one static residual graph). Now it could be possible to get to some
j-th slice with j < i from a k-th slice, with k > i. But since there is a last
slice, the (2K)-th slice, we see by induction that this is also impossible.
From this it is clear that a negative length cycle in the dynamic residual
graph must stay within one slice. But a negative cost cycle within one
slice would imply a negative cost cycle in the static residual graph corre-
sponding to that slice, which contradicts the fact that all the static flows
are extreme. Thus we have:
Corollary 5.2 The dynamic residual graph GF contains no negative length

cycles.
Another result we get is
Corollary 5.3 All shortest (1) s −t-paths and (2) t−s-paths in the dynamic
residual graph stay within one slice.
Proof. (1) A path ending at t in the i-th slice can leave s in the i-th
slice, but not later. (2) Similarly a path starting from t in the j-th slice
must follow a flow from s to t, all of which follows paths of length
0 ≤ l ≤ dfj−1 (s, t). Thus this flow cannot have started from s before the
j-th slice.
Note that the second statement is equivalent to saying that the slowest
flows in use by F are not slow enough to fall through to later time slices.
I am finally ready to prove Theorem 5.5.
Proof. [Theorem 5.5] Let F be the dynamic flow found by the algorithm
and let Z be a minimum cost dynamic flow with the same value. Then the
difference between the two dynamic flows Z − F can be written as a sum
of non-canceling dynamic circulations, possibly with non-constant flow
values, and including waiting at vertices. By non-canceling I mean that if
a dynamic circulation y has positive flow along arc a at time t then no
other dynamic circulation x can have negative flow along a (or positive
flow along $a^{-1}$) at time t that cancels the flow of y. This independence is
important because we then know that each of these dynamic circulations
has to be feasible when added together with F , which again means that
any positive flow in such a circulation y has to be feasible together with
flow in F , and any negative flow in y can only cancel flow in F .
Now remember that the total cost of F was determined fully by outflow
from s and inflow to t. Consider the cost of an infinitesimal piece of one
of the circulations y, with constant travel time ly and unit value. If this
circulation contains neither s nor t the cost is 0. If it contains only s,
and leaves s at time τ the cost is
g(τ) − g(τ + ly ) + ly
If it contains only t, and leaves t at time τ the cost is
−h(τ) + h(τ + ly ) + ly
In both these cases we see that for this cost to be negative we must have
ly < 0, because of the assumptions g ′ (τ) ≤ 1, h′ (τ) ≥ −1. But we have
already proven in Corollary 5.2 that this is impossible.
If it contains both s and t, leaves s at $\tau_s$, travels to t along a path of
length $l_y^+$, leaves t at $\tau_t$ and travels to s again along a path of length $l_y^-$
the cost is
\[ g(\tau_s) + h(\tau_s + l_y^+) + l_y^+ - h(\tau_t) - g(\tau_t + l_y^-) + l_y^- \]
In this case we can consider the infinitesimal dynamic circulation as two
separate chains $y^+, y^-$, where $y^+$ comes in addition to F , and $y^-$ cancels
some part of F . Let the costs of these two chains be $C_{y^+}, C_{y^-}$. Now
we already know that $-C_{y^-} \le C_{max}$. So for the total cost of y to be negative
we need $C_{y^+} < C_{max}$. $y^+$ has minimum cost if it follows a shortest
path in the dynamic residual graph. Then look at a shortest path $P_j$ in
the i-th slice, with cost function $H_j(t)$ as in the algorithm. Then for
$t \in \mathbb{R} \setminus \tau_{F_j}(t)$ we have $H_j(t) \ge C_{max}$ since $H_j$ is convex with a minimum
in $\tau_{F_j}(t)$ and because of the way $t_j^s, t_j^e$ are selected in the algorithm. So
$C_{y^+} \ge C_{max}$, and we finally have that $C_{y^-} + C_{y^+} \ge 0$, and again that
$C_{Z-F} \ge 0$. And this means that F is also minimum cost!
6 Summary
In this paper I have studied system optimal and user equilibrium flows
in static and dynamic networks, and how to find these in both general
and more specific situations. Of most interest are the alternative char-
acterization of static system optimal flows, the algorithm for turning a
static system optimal flow into a user equilibrium by taxes along the
arcs, and the chain flow algorithm for solving the dynamic system op-
timality problem with the assumption of constant latency functions on
the arcs. I will summarize these here.
• Alternative static system optimality criterion

The user equilibrium was defined locally on each s − t-path: each
s − t-path in use has equal cost, and cost less than or equal to that
of any unused s − t-path. We saw that it was possible to formulate
this also as a minimization problem, which turned out to be a
convex optimization problem.
The system optimum was defined globally as a minimization prob-
lem, also a convex optimization problem. What I showed in section
5.1.4 was that in this case there was also an equivalent path-local
definition, similar to that of the user equilibrium. Each s − t-path
in use has incremental cost
\[ L_P = \frac{d}{dx_P} \left( \sum_{a \in P} x_a l_a(x_a) \right) \]
that is equal, and equal to or less than that of any unused s−t-path.
• Optimal taxing in a static network

I then went on in section 5.2 to find a way of turning a system opti-
mal flow into a user equilibrium by adding taxes along the arcs of
the network. This came from the idea that any static flow fs gives
rise to certain latencies along each arc in the graph G. Using the
fixed graph Gfs consisting of the arcs used by this flow and with
latencies caused by this flow, we can find constant taxes for each
arc in Gfs such that each s − t-path in Gfs has equal length. Then
adding these taxes to the original graph G, with non-constant la-
tencies, causes exactly the flow fs to be a user equilibrium in G, as
each s − t-path in G then has equal length exactly with fs .
I also pointed out that any flow fs can be turned into a user equilib-
rium by this process, but choosing the system optimal flow makes
the most sense in my case.
I have also programmed this algorithm, and tested it for some sim-
ple graphs. The code is found in the appendix.
• Dynamic system optimum algorithms

Although I showed in section 5.3 that it is possible to compute the
dynamic system optimal solution by making a time discretization
of the problem, this will often be impractical due to the sheer size
of the discretized optimization problem, even though the problem
is such that the Simplex algorithm, or even the Network Simplex
algorithm, can be used to solve it.
For the single commodity case I found an algorithm in section 5.4
that works by gradually increasing the number of paths used and
the time intervals of use of each of these. This is done such that
we always add flow at the lowest possible cost. The result of this
is a dynamic flow with the desired value, and with minimum total
cost, represented as a set of chains. Each chain is a path, possibly
with negative arcs, a value of constant flow along this path, and
a time interval during which the flow is present along this path.
Adding all the chains together gives the total dynamic flow. The
algorithm runs in time polynomial in the number of vertices and
arcs, and proportional to the total supply, assuming the capacities
are integer. This is a huge improvement compared to the time
discretization approach!
Although I only had time to develop the algorithm for the single
commodity case, I also think it would be possible to develop a similar
algorithm for the multi-commodity case, based on the algorithm
presented here and the quickest transshipment problem in [4].
It should also be mentioned that although the time discretization
approach might be slow, it is however capable of solving more gen-
eral problems. It handles with ease time varying capacities, arbi-
trary departure/arrival deviation cost functions and multiple com-
modities. I also programmed code for turning a directed graph
with supplies and departure/arrival cost functions into a time dis-
cretized graph, removing unused vertices and arcs. I also wrote
some code for converting the graph to a format readable by a
graph drawing program, and more importantly for formulating the
LP problem of finding the dynamic system optimal flow through
the graph in the AMPL language. This code is also found in the
appendix.
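To make the first two summary points concrete, the path-local criterion can be tried on the smallest possible network. The following sketch is an illustration under assumptions of my own: two parallel s − t arcs with latencies l1(x) = x and l2(x) = 1 and unit demand (the classical Pigou setting), a hypothetical class IncrementalCostExample, and a numerical derivative in place of the exact one. It is not the tax algorithm of section 5.2; it only shows that equal incremental costs give the system optimum here, and that the gap between incremental cost and latency is a toll that equalizes the path costs.

```java
import java.util.function.DoubleUnaryOperator;

public class IncrementalCostExample {

    /** Incremental cost d/dx [x * l(x)], approximated by a central difference. */
    static double incrementalCost(DoubleUnaryOperator latency, double x) {
        double eps = 1e-6;
        double up = (x + eps) * latency.applyAsDouble(x + eps);
        double down = (x - eps) * latency.applyAsDouble(x - eps);
        return (up - down) / (2 * eps);
    }

    /**
     * Flow on the first of two parallel arcs at which both incremental costs
     * agree, found by bisection. Assumes non-decreasing incremental costs and
     * that an interior solution exists.
     */
    static double optimalFlowOnFirstArc(DoubleUnaryOperator l1, DoubleUnaryOperator l2,
                                        double demand) {
        double lo = 0.0, hi = demand;
        for (int k = 0; k < 60; k++) {
            double x = 0.5 * (lo + hi);
            if (incrementalCost(l1, x) < incrementalCost(l2, demand - x)) {
                lo = x; // arc 1 is still the cheaper place to add flow
            } else {
                hi = x;
            }
        }
        return 0.5 * (lo + hi);
    }

    public static void main(String[] args) {
        DoubleUnaryOperator l1 = x -> x;   // congestible arc (assumed)
        DoubleUnaryOperator l2 = x -> 1.0; // constant latency arc (assumed)
        double x1 = optimalFlowOnFirstArc(l1, l2, 1.0);
        double toll = incrementalCost(l1, x1) - l1.applyAsDouble(x1);
        System.out.println("system optimal flow on arc 1: " + x1);
        System.out.println("toll on arc 1: " + toll);
        System.out.println("tolled cost of arc 1: " + (l1.applyAsDouble(x1) + toll));
        System.out.println("cost of arc 2: " + l2.applyAsDouble(1.0 - x1));
    }
}
```

With these latencies half of the demand ends up on each arc, and a toll of about one half on the first arc makes both paths cost the same, so the system optimal flow is also a user equilibrium of the tolled network.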
7 Appendix
Below follow the .java files containing the code I have written for han-
dling these graphs on the computer.
Arc represents an arc used in the graph implementation.
package trafficGraphs;
public class Arc {

private double length, capacity, toll, flow;
private Vertex source, target;
private String name;
public Arc(String name, double length, double capacity, double toll,

Vertex source, Vertex target) {
// this.name = name;
this.name = source.getName() + target.getName() + (length + toll);
this.length = length;
this.capacity = capacity;
this.toll = toll;
this.source = source;
this.target = target;
source.getOutArcs().add(this);
target.getInArcs().add(this);
}
public Arc(String name, double length, double capacity, Vertex source,

Vertex target) {
this(name, length, capacity, 0, source, target);
}
public Arc(String name, double length, Vertex source, Vertex target) {

this(name, length, Double.POSITIVE_INFINITY, source, target);
}
public double getCapacity() {

return capacity;
}
public double getCost() {

return length + toll;
}
public double getLength() {

return length;
}
public String getName() {

return name;
}
public String getQuotedName() {

return "\"" + name + "\"";
}
public Vertex getSource() {

return source;
}
public Vertex getTarget() {

return target;
}
public double getToll() {

return toll;
}
public double getFlow() {

return flow;
}
public void setToll(double toll) {

this.toll = toll;
}
public void setFlow(double flow) {

this.flow = flow;
}
public String toGraphString() {
return source.getQuotedName() + " -> " + target.getQuotedName()
+ " [label = \""
+ (capacity == Double.POSITIVE_INFINITY ? "-" : capacity)
+ ", " + getCost() + "\"]";
}
public String toTextFileString() {

return "A " + name + " " + length + " " + capacity + " " + toll + " "
+ source.getName() + " " + target.getName();
}
}
AugmentedGraph extends Graph, and represents the time discretization
of the original graph.
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.util.*;
public class AugmentedGraph extends Graph {

protected HashMap<String, Vertex[]> augVertices;
protected double step;
protected int max;
public AugmentedGraph(int max, double step) {

super();
this.max = max;
this.step = step;
augVertices = new HashMap<String, Vertex[]>();
}
public void toGraphFile(String filename) {

PrintWriter writer;
try {
writer = new PrintWriter(filename + ".dot");
} catch (FileNotFoundException e) {
System.err.println("The file was not found!");
e.printStackTrace();
return;
}
writer.println("digraph {\nrankdir = LR\nsplines = false\n");
for (Vertex[] vertexs : augVertices.values()) {
writer.println("{\nrank = same;\n");
for (Vertex vertex : vertexs)
if (vertex != null)
writer.println(vertex.getQuotedName());
writer.println("}\n");
}
for (Vertex vertex : vertices.values())
writer.println();
for (Arc arc : arcs.values())
writer.println(arc.toGraphString());
writer.println("}");
writer.close();
}
public void removeVertex(Vertex vertex) {

super.removeVertex(vertex);
String[] vinfo = vertex.getName().split(",");
augVertices.get(vinfo[0])[(int) (Double.parseDouble(vinfo[1]) / step)] = null;
}
public AugmentedGraph augment() {

throw new GraphAlreadyAugmentedException();
}
public class GraphAlreadyAugmentedException extends RuntimeException {

private static final long serialVersionUID = -1;
}
}
CostFunction is used to represent an arrival or departure cost function.

public abstract class CostFunction {

    // Average deviation cost over the time step of width `step` centred at `time`.
    public abstract double getCost(double time, double step);

    // Cost function that is zero everywhere.
    public static final CostFunction nullFunction = new CostFunction() {
        public double getCost(double time, double step) {
            return 0;
        }
    };

    // Builds a cost function from "null" or from the format "a b t":
    // slope a before the target time t, slope b after it.
    public static CostFunction makeCostFunction(final String format) {
        if (format.equals("null"))
            return nullFunction;
        if (format.split(" ").length == 3)
            return new CostFunction() {
                private double a, b, t;
                {
                    String[] s = format.split(" ");
                    a = Double.parseDouble(s[0]);
                    b = Double.parseDouble(s[1]);
                    t = Double.parseDouble(s[2]);
                }

                public double getCost(double time, double step) {
                    double lower = time - 0.5 * step;
                    double upper = time + 0.5 * step;
                    lower = (t - lower > 0 ? Math.min((t + 0.5 * step - lower)
                            / step, 1) : 1)
                            * costAt(lower);
                    upper = (t - upper < 0 ? Math.min((-t + 0.5 * step + upper)
                            / step, 1) : 1)
                            * costAt(upper);
                    return 0.5 * (lower + upper);
                }

                private double costAt(double time) {
                    return time < t ? -a * (time - t) : b * (time - t);
                }
            };
        // Any other format falls back to a default cost function.
        return makeCostFunction("0.5 2 5");
    }
}
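The three-number format read by makeCostFunction describes a V-shaped deviation cost, as can be seen from costAt: a per unit of time before the target time t, and b per unit of time after it. The standalone sketch below only mirrors that pointwise formula so it can be tried without the rest of the graph classes; the class DeviationCostSketch and the helper parse are hypothetical and not part of the thesis code, and the averaging over a time step done by getCost is left out.

```java
public class DeviationCostSketch {

    /** Pointwise cost, mirroring costAt: slope a before target time t, slope b after. */
    static double costAt(double a, double b, double t, double time) {
        return time < t ? -a * (time - t) : b * (time - t);
    }

    /** Parses the "a b t" format into {a, b, t} (hypothetical helper). */
    static double[] parse(String format) {
        String[] s = format.trim().split(" ");
        if (s.length != 3) {
            throw new IllegalArgumentException("expected \"a b t\", got: " + format);
        }
        return new double[] {
            Double.parseDouble(s[0]), Double.parseDouble(s[1]), Double.parseDouble(s[2])
        };
    }

    public static void main(String[] args) {
        // "0.5 2 5" is the default format used in makeCostFunction.
        double[] p = parse("0.5 2 5");
        for (double time = 3; time <= 7; time += 1) {
            System.out.println("time " + time + " -> cost "
                    + costAt(p[0], p[1], p[2], time));
        }
    }
}
```

With the default parameters, being two time units early costs 1 while being one time unit late already costs 2, which is the usual asymmetry between early and late arrival.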
CostVertex extends Vertex, and is used for vertices with supplies, and
therefore cost functions.
public class CostVertex extends Vertex {

    private CostFunction function;
    private double magnitude;

    public CostVertex(String name, CostFunction function) {
        super(name);
        this.function = function;
    }

    public CostVertex(String name, double magnitude, CostFunction function) {
        super(name);
        setMagnitude(magnitude);
        this.function = function;
    }

    public CostFunction getCostFunction() {
        return function;
    }

    // Deviation cost for the time step centred at `time`; zero without a function.
    public double getCost(double time, double step) {
        return function == null ? 0 : function.getCost(time, step);
    }

    public String toGraphString() {
        return getQuotedName()
                + (magnitude != 0 ? " [style = doublecircle, label = \""
                        + getName() + " : " + magnitude + "\"]" : "") + ":";
    }

    public String toTextFileString() {
        return "C" + super.toTextFileString() + " " + magnitude + " ";
    }

    public double getMagnitude() {
        return magnitude;
    }

    public void setMagnitude(double magnitude) {
        this.magnitude = magnitude;
    }

    // A vertex with non-zero supply or demand must never be pruned.
    public boolean isRemovable() {
        return super.isRemovable() && magnitude == 0;
    }
}
Graph is the main class for representing a graph. It uses most of the
other classes here, and also contains the code that performs the dis-
cretization.
import java.util.*;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
public class Graph {

protected HashMap<String, Vertex> vertices;
protected HashMap<String, Arc> arcs;
public Graph() {
vertices = new HashMap<String, Vertex>();
arcs = new HashMap<String, Arc>();
}
public Graph(String filename) {

this();
fromTextFile(filename);
}
/**
* Reads a graph from a text file.
*
* Assumes all vertices first, format: ’V’ name ’CV’ name magnitude [cost
* function]
*
* Then arcs, format: ’A’ name length [capacity [toll]] source target
*
*
* @param filename
*/
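protected void fromTextFile(String filename) {
    Scanner scanner;
    try {
        scanner = new Scanner(new File(filename + ".txt"));
    } catch (FileNotFoundException e) {
        System.err.println("The file was not found!");
        e.printStackTrace();
        return;
    }
    String type;
    int linen = 0;
    while (scanner.hasNextLine()) {
        try {
            type = scanner.next();
            if (type.equals("V")) {
                addVertex(new Vertex(scanner.next()));
            } else if (type.equals("CV")) {// TODO: Fixit cost function.
                addVertex(new CostVertex(scanner.next(), scanner.nextDouble(),
                        CostFunction.makeCostFunction(scanner.nextLine().trim())));
            } else if (type.equals("A")) {
                String[] line = scanner.nextLine().trim().split(" ");
                switch (line.length) {
                case 4: // name length source target
                    addArc(new Arc(line[0], Double.parseDouble(line[1]),
                            vertices.get(line[2]), vertices.get(line[3])));
                    break;
                case 5: // name length capacity source target
                    addArc(new Arc(line[0], Double.parseDouble(line[1]),
                            Double.parseDouble(line[2]),
                            vertices.get(line[3]), vertices.get(line[4])));
                    break;
                case 6: // name length capacity toll source target
                    addArc(new Arc(line[0], Double.parseDouble(line[1]),
                            Double.parseDouble(line[2]), Double.parseDouble(line[3]),
                            vertices.get(line[4]), vertices.get(line[5])));
                    break;
                default:
                    System.err.println("Wrong number of arguments for arc "
                            + line[0]);
                }
            }
        } catch (Exception e) {
            System.err.println("Error at line " + linen);
        }
        linen++;
    }
    scanner.close();
}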
public void toAMPLFile(String filename) {

PrintWriter writer;
try {
writer = new PrintWriter(filename + ".dat");
return;
}
writer.print("set Arcs := ");

writer.print(arc.getQuotedName() + " ");
writer.println(";\n");
writer.print("set Vertices := ");

writer.print(vertex.getQuotedName() + " ");
writer.println("param:\td\tc\t:=");
writer.println(arc.getQuotedName() + " " + arc.getCost() + " "
+ (arc.getCapacity() == Double.POSITIVE_INFINITY ?
"Infinity" : arc.getCapacity()));
writer.println("param:\tm\t:=");
writer.println(vertex.getQuotedName() + " "
+ vertex.getMagnitude());
writer.println("set Entering :=");

for (Vertex vertex : vertices.values()) {
// writer.printf("(%s, *)", vertex.getQuotedName());
for (Arc arc : vertex.getInArcs())
writer.print(" " + vertex.getQuotedName() + " "
+ arc.getQuotedName());
writer.println();
}
writer.println("set Leaving :=");
for (Vertex vertex : vertices.values()) {
// writer.printf("(%s, *)", vertex.getQuotedName());
for (Arc arc : vertex.getOutArcs())
writer.print(" " + vertex.getQuotedName() + " "
+ arc.getQuotedName());
writer.println();
}
/*
* for (Vertex vertex : vertices.values()) { writer.printf("set
* Entering%s := ", vertex.getName()); for (Arc arc :
* vertex.getInArcs()) writer.print(arc.getName() + "\t");
* writer.println(";\n");
*
* writer.printf("set Leaving%s := ", vertex.getName()); for (Arc arc :
* vertex.getOutArcs()) writer.print(arc.getName() + "\t");
* writer.println(";\n"); }
*/
writer.close();
try {
writer = new PrintWriter(filename + ".mod");
return;
}
writer.println("set Vertices;");
writer.println("set Arcs;");
writer.println("set Entering within {Vertices, Arcs};");

writer.println("set Leaving within {Vertices, Arcs};");
/*
* for (Vertex vertex : vertices.values()) { writer.printf("set
* Entering%s;%n", vertex.getName()); writer.printf("set Leaving%s;%n",
* vertex.getName()); }
*/
writer.println("\nparam d {Arcs} >= 0;");

writer.println("param c {Arcs} > 0;");
writer.println("param m {Vertices};");
writer.println("\nvar f {a in Arcs} >= 0, <= c[a];");

writer.println("\nminimize Cost: sum {a in Arcs} f[a] * d[a];");
writer.println("\nsubject to Flow {v in Vertices}: m[v] + "
+ "sum {(v, a) in Entering} f[a] = sum {(v, a) in Leaving} f[a];");
/*
* for (Vertex vertex : vertices.values()) { writer.printf("subject to
* Flow%s: sum {a in Entering%s} = m[%s] + sum {a in Leaving%s};\n",
* vertex.getQuotedName(), vertex.getQuotedName(),
* vertex.getQuotedName(), vertex.getQuotedName()); }
*/
writer.close();
}
public void toTextFile(String filename) {

PrintWriter writer;
try {
writer = new PrintWriter(filename + ".txt");
return;
}
for (Vertex v : vertices.values())
writer.println(v.toTextFileString());
for (Arc a : arcs.values())
writer.println(a.toTextFileString());
// TODO: To file.
writer.close();
}
public void toGraphFile(String filename) {

PrintWriter writer;
try {
writer = new PrintWriter(filename + ".dot");
return;
}
writer.println("digraph {\nrankdir = LR\nsplines = false\n");
writer.println();
writer.println(arc.toGraphString());
writer.println("}");
writer.close();
}
public void addVertex(Vertex vertex) {

vertices.put(vertex.getName(), vertex);
}
public void addArc(Arc arc) {

arcs.put(arc.getName(), arc);
}
public void removeArc(Arc arc) {

arc.getSource().getOutArcs().remove(arc);
arc.getTarget().getInArcs().remove(arc);
arcs.remove(arc.getName());
}
public void removeVertex(Vertex vertex) {

for (Arc arc : vertex.getOutArcs()) {
arc.getTarget().getInArcs().remove(arc);
}
for (Arc arc : vertex.getInArcs()) {
arc.getSource().getOutArcs().remove(arc);
}
vertices.remove(vertex.getName());
}
public void removeArc(String arcName) {

removeArc(arcs.get(arcName));
}
public void removeVertex(String vertexName) {

removeVertex(vertices.get(vertexName));
}
public AugmentedGraph augment(int max, double step) {

AugmentedGraph aug = new AugmentedGraph(max, step);
// Create new vertices and arcs.

for (Vertex v : vertices.values()) {
Vertex[] vs = new Vertex[max];
String vn = v.getName();
for (int i = 0; i < max; i++) {
vs[i] = new Vertex(vn + "," + (i * step), i * step);
aug.addVertex(vs[i]);
if (i > 0)
aug.addArc(new WaitArc(vs[i - 1].getName()
+ vs[i].getName(), step, vs[i - 1], vs[i]));
}
aug.augVertices.put(vn, vs);
// Vertex is source or sink.

if (v instanceof CostVertex) {
CostVertex cv = (CostVertex) v;
CostVertex uv = new CostVertex(vn, null);
aug.addVertex(uv);
uv.setMagnitude(cv.getMagnitude());
int it = cv.getMagnitude() > 0 ? 0 : 1;
Vertex[] sts = new Vertex[] { uv, null };
for (Vertex av : vs) {
sts[1] = av;
aug.addArc(new Arc(vn + av.getName(), 0,
Double.POSITIVE_INFINITY, cv.getCost(av.getTime(),
step), sts[it], sts[1 - it]));
}
}
}
// Add arcs from the old graph.

for (Vertex v : vertices.values()) {
Vertex[] vs = aug.augVertices.get(v.getName());
for (Arc a : v.getOutArcs()) {
Vertex u = a.getTarget();
int l = (int) (a.getLength() / step);
double cap = a.getCapacity() * step
* (l * step + 1 - a.getLength());
double carry = a.getCapacity() * step
* (a.getLength() - l * step);
Vertex[] us = aug.augVertices.get(u.getName());
for (int i = 0; i + l < max; i++)
// TODO: Maybe remove carry again?
if (Math.abs(cap) > 1e-6)
aug.addArc(new Arc(vs[i].getName()
+ us[i + l].getName(), a.getLength(), cap,
a.getToll(), vs[i], us[i + l]));
for (int i = 0; i + l + 1 < max; i++)
if (Math.abs(carry) > 1e-6)
aug.addArc(new Arc(vs[i].getName()
+ us[i + l + 1].getName(), a.getLength(),
carry, a.getToll(), vs[i], us[i + l + 1]));
}
}
// Remove superfluous vertices and arcs.
LinkedList<Vertex> removables = new LinkedList<Vertex>();
for (Vertex v : aug.vertices.values())
if (v.isRemovable())
removables.add(v);
while (!removables.isEmpty()) {
Vertex v = removables.poll();
aug.removeVertex(v);
for (Arc arc : v.getInArcs())
if (arc.getSource().isRemovable())
removables.add(arc.getSource());
for (Arc arc : v.getOutArcs())
if (arc.getTarget().isRemovable())
removables.add(arc.getTarget());
}
return aug;
}
public static void test(String filename, int max, double step) {

Graph g = new Graph(filename);
g.toGraphFile(filename);
AugmentedGraph aug = g.augment(max, step);
aug.toGraphFile(filename + "aug");
aug.toAMPLFile(filename + "aug");
}
public static void main(String[] argh) {

test("grafenminja", 30, 0.5);
}
}
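The core idea of the augment method above, copying every vertex once per time step and connecting the copies by wait arcs and travel arcs, can be illustrated with a much smaller sketch. The class TimeExpansionSketch and its helper SimpleArc are hypothetical names introduced only for this illustration. The sketch assumes that every travel time is a whole number of time steps, so it leaves out the cap/carry split, the capacities, the source and sink arcs and the pruning of superfluous vertices that the real method handles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimeExpansionSketch {

    /** Hypothetical stand-in for Arc: the travel time is a whole number of time steps. */
    static final class SimpleArc {
        final String from, to;
        final int steps;

        SimpleArc(String from, String to, int steps) {
            this.from = from;
            this.to = to;
            this.steps = steps;
        }
    }

    /** Returns the arcs of the time-expanded graph as "u,i -> v,j" strings. */
    static List<String> expand(List<String> vertices, List<SimpleArc> arcs, int layers) {
        List<String> expanded = new ArrayList<String>();
        // Wait arcs: stay at the same vertex from layer i - 1 to layer i.
        for (String v : vertices)
            for (int i = 1; i < layers; i++)
                expanded.add(v + "," + (i - 1) + " -> " + v + "," + i);
        // Travel arcs: leave a.from at layer i and arrive at a.to at layer i + a.steps.
        // Departures that would arrive after the last layer are dropped.
        for (SimpleArc a : arcs)
            for (int i = 0; i + a.steps < layers; i++)
                expanded.add(a.from + "," + i + " -> " + a.to + "," + (i + a.steps));
        return expanded;
    }

    public static void main(String[] args) {
        // Two vertices, one arc taking two time steps, four time layers.
        List<String> expanded = expand(Arrays.asList("s", "t"),
                Arrays.asList(new SimpleArc("s", "t", 2)), 4);
        for (String arc : expanded)
            System.out.println(arc);
    }
}
```

With four layers, the single arc from s to t yields two travel arcs (departing at layers 0 and 1) in addition to three wait arcs at each vertex. Later departures are dropped because they would arrive outside the time horizon, which mirrors the loop bound i + l < max in augment.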
TollFinder has the code for finding tolls to make all paths from a given
vertex s to a given vertex t equally expensive. It assumes an acyclic
graph. This code is no longer correct, as I discovered an error in the
theory around this algorithm, but didn’t have time to rewrite the code.
import java.util.Collection;
import java.util.HashMap;
public class TollFinder {

private Collection<Arc> used;
private HashMap<Vertex, VertexWrapper> wraps;
private TollFinder(Collection<Arc> used) {

this.used = used;
wraps = new HashMap<Vertex, VertexWrapper>();
}
public static void findTolls(Vertex s, Vertex t, Collection<Arc> used) {

TollFinder tf = new TollFinder(used);
tf.makeSubgraphOfUsedArcs(s, t);
tf.calculateTolls(tf.wraps.get(s));
}
private void makeSubgraphOfUsedArcs(Vertex s, Vertex t) {

putInSubgraph(s);
wraps.get(t).distance = 0;
}
private void putInSubgraph(Vertex v) {

if (wraps.containsKey(v))
return;
VertexWrapper vw = new VertexWrapper(v);
wraps.put(v, vw);
for (Arc a : vw.v.getOutArcs())
if (used.contains(a))
putInSubgraph(a.getTarget());
}
private double calculateTolls(VertexWrapper vw) {

if (vw.distance >= 0)
return vw.distance;
if (vw.distance == -2)
throw new RuntimeException("The graph was not acyclic! " + vw.v
+ " encountered while active.");
vw.distance = -2;
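double maxd = 0;
// Longest remaining distance to t over the used out-arcs.
for (Arc a : vw.v.getOutArcs())
if (used.contains(a))
maxd = Math.max(maxd, calculateTolls(wraps.get(a.getTarget()))
+ a.getLength());
// Toll each used out-arc so that every continuation costs maxd.
for (Arc a : vw.v.getOutArcs())
if (used.contains(a))
a.setToll(maxd - wraps.get(a.getTarget()).distance
- a.getLength());
return vw.distance = maxd;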
}
private class VertexWrapper {

private Vertex v;
private double distance;
private VertexWrapper(Vertex v) {
this.v = v;
distance = -1;
}
}
public static void main(String[] args) {

Vertex s = new Vertex("s");
Vertex t = new Vertex("t");
Vertex u = new Vertex("u");
Vertex v = new Vertex("v");
Vertex w = new Vertex("w");
Arc su = new Arc("su", 1, s, u);
Arc uv = new Arc("uv", 3, u, v);
Arc vt1 = new Arc("vt1", 1, v, t);
Arc vt2 = new Arc("vt2", 2, v, t);
Arc uw = new Arc("uw", 1, u, w);
Arc wt = new Arc("wt", 1, w, t);
Collection<Arc> used = new java.util.HashSet<Arc>();
used.add(su);
used.add(uv);
used.add(vt1);
used.add(vt2);
used.add(uw);
used.add(wt);
findTolls(s, t, used);
for (Arc a : used)
System.out.format(
"%s: Total length: %.0f, of which %.0f is toll.\n",
a.getName(), a.getCost(), a.getToll());
}
}
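The property TollFinder aims for, that all paths from s to t cost the same once tolls are added, can be checked on the example from its main method with a self-contained sketch. The class TollSketch, the integer vertex numbering and the array layout are assumptions made for this illustration only. The sketch further assumes an acyclic graph with non-negative lengths in which every arc lies on some path to t. It only demonstrates the equal-cost property and is not a corrected version of the algorithm, whose underlying theory is noted above to contain an error.

```java
import java.util.Arrays;

public class TollSketch {

    /**
     * Each row of arcs is {from, to, length}; vertices are numbered 0..n-1.
     * Returns one toll per arc such that every path to t has the same total cost.
     */
    static double[] tolls(int n, double[][] arcs, int t) {
        double[] dist = new double[n];
        Arrays.fill(dist, -1); // -1 marks "not yet computed"
        dist[t] = 0;
        for (int v = 0; v < n; v++)
            longest(v, arcs, dist);
        double[] toll = new double[arcs.length];
        for (int k = 0; k < arcs.length; k++) {
            int from = (int) arcs[k][0], to = (int) arcs[k][1];
            // The toll fills the gap between this arc and the longest continuation.
            toll[k] = dist[from] - dist[to] - arcs[k][2];
        }
        return toll;
    }

    /** Longest distance from v to t, memoized in dist; requires an acyclic graph. */
    static double longest(int v, double[][] arcs, double[] dist) {
        if (dist[v] >= 0)
            return dist[v];
        double best = 0;
        for (double[] a : arcs)
            if ((int) a[0] == v)
                best = Math.max(best, longest((int) a[1], arcs, dist) + a[2]);
        return dist[v] = best;
    }

    public static void main(String[] args) {
        // s = 0, u = 1, v = 2, w = 3, t = 4; arcs su, uv, vt1, vt2, uw, wt.
        double[][] arcs = { { 0, 1, 1 }, { 1, 2, 3 }, { 2, 4, 1 },
                { 2, 4, 2 }, { 1, 3, 1 }, { 3, 4, 1 } };
        System.out.println(Arrays.toString(tolls(5, arcs, 4)));
    }
}
```

On this example the longest s-t path has length 6, and the sketch assigns a toll of 1 to vt1 and 3 to uw, with all other tolls zero. Each of the three s-t paths then costs 6 in total, because the tolls along any path telescope to the difference between the longest distance from s and the length of that path.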
Vertex represents vertices in the graph.

import java.util.HashSet;
import java.util.Collection;
public class Vertex {

private HashSet<Arc> inArcs, outArcs;
private double time;
private String name;
public Vertex(String name, double time) {

this.name = name;
this.time = time;
inArcs = new HashSet<Arc>();
outArcs = new HashSet<Arc>();
}
public Vertex(String name) {

this(name, Double.NaN);
}
public String getName() {

return name;
}
public double getTime() {

return time;
}
public double getMagnitude() {

return 0;
}
public Collection<Arc> getInArcs() {
return inArcs;
}
public Collection<Arc> getOutArcs() {

return outArcs;
}
public String getQuotedName() {

return "\"" + name + "\"";
}
public boolean isRemovable() {

return inArcs.size() * outArcs.size() == 0;
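}
// Text-file representation of this vertex, as written by Graph.toTextFile.
public String toTextFileString() {

return "V " + name;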
}
}
WaitArc extends Arc, and represents the arcs that are actually just waiting
at the same vertex for one time step.
public class WaitArc extends Arc {

public WaitArc(String name, double length, Vertex source, Vertex target) {
super(name, length, source, target);
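}
// Wait arcs are drawn without a label and do not influence the layout ranks.
public String toGraphString() {

return getSource().getQuotedName() + " -> "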
+ getTarget().getQuotedName()
+ " [label = \"\", constraint = false]";
// " [label = \"-, " + getCost() + "\", constraint = false]";
}
}
References
[1] Alexander Schrijver. A Course in Combinatorial Optimization. 2007.
[2] B. G. Heydecker, J. D. Addison. Analysis of dynamic traffic equilibrium with departure time choice. Transportation Science, Vol. 39, No. 1, February 2005, pp. 39-57.
[3] Jose R. Correa, Andreas S. Schulz, Nicolas E. Stier-Moses. Selfish routing in capacitated networks. Mathematics of Operations Research, Vol. 29, No. 4, November 2004, pp. 961-976.
[4] Bruce Edward Hoppe. Efficient dynamic network flow algorithms. Ph.D. thesis, Cornell University, 1995.
[5] Robert J. Vanderbei. Linear Programming: Foundations and Extensions. Kluwer Academic Publishers, 2nd edition, 2001.