Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

Lecture 5

Download as pdf or txt
Download as pdf or txt
You are on page 1of 13

CSC2411 - Linear Programming and

Combinatorial Optimization
Lecture 5: Smoothed Analysis, Randomized
Combinatorial Algorithms, and Linear
Programming Duality
Notes taken by Mea Wang
February 11, 2005

Summary: In this class, we discuss a few post-simplex-algorithm issues. We will first study the smoothed case analysis of Linear Programming problems. We then learn the Seidels algorithm, a randomized combinatorial algorithm that run in subexponential time, and its extensions.
Last, we will be introduced to the duality theorem of Linear Programs.

1 Overview
In the previous lecture, we learned the procedure of the simplex algorithm. The algorithm leads us to an optimum solution. The question now is how long it will take
the algorithm to reach this final solution. Moreover, we may wonder how well the
algorithm perform under different settings. In this lecture, we will first examine the
complexity of the simplex algorithm. Following this analysis, we will learn three randomized combinatorial algorithms which improve the running time, at least for some
sets of parameters. Last, we will start our discussion on Linear Program duality.

2 Smoothed Analysis
The worst case analysis estimates the maximum running time T of an algorithm A on
all possible inputs In of a given length n, whereas the average case analysis provides
the average running time of a program on a distribution of inputs of a given length.
The smoothed analysis was invented a few years ago motivated by the strictness of the
worst-case and average-case analyses. The smoothed analysis presents the maximum
running time of cases within certain range of the average cases. The range is defined

Lecture Notes for a course given by Avner Magen, Dept. of Computer Sciecne, University of Toronto.

by a scalar variable . We can intuitively think of as the distance away from the
average case we are interested in. It enabled us to study the behavior of an algorithm
under different settings. In mathematical symbols, these analyses are
worst case(A; n)

average case(A; n)
smoothed case(A; n, )

=
=

max T (x)

(1)

avgwIn T (x)
max avg T (w in neighborhood of x)

(2)
(3)

wIn

wIn

It is easy to see that Eqn. 1 Eqn. 3 Eqn. 2. For Linear Programming problems,
the algorithm is actually A(Ax b), and all possible inputs In are the all possible
combinations of A, b, and c. In general, the parameter varies depending on the problem being solved. In the smoothed case analysis, we let G be a matrix of independently
chosen Gaussian random variables of mean 0 and variance 1. Since we seek to analyze
the behavior of the algorithm within the neighborhood, we vary the constrain matrix
A by a G. Hence, the algorithm is A((A + G)x b).
Theorem 2.1. [ST01] Smoothed case (Simplex; n, ) = poly(m, n, 1/).
Theorem 2.1 essentially says that the simplex algorithm solves every Linear Program in time polynomial in m, n, and 1/ as long as we perturb the program in some
appropriate way scaled by . In fact, the worst case is a special case of the smoothed
case in which = 0. If is so large so that A becomes insignificant, we obtain the
average-case analysis. When is polynomially small, the analysis is a combination
of the worst-case and average-case analyses. In Linear Programming problems, the
worst case is exponential and the average case is polynomial. Theorem 2.1 tells us the
simplex is better than the just in average.

3 Randomized Combinatorial Algorithms


Lets consider a Linear Program Ax b with small number n of variables and m of
constrains. Assume H is the set of corresponding linear constraints. Fig. 1 shows a
polytope formed by a set of linear constraints for an instance of this problem. Recall
that if x is a vertex, then n of the linearly independent inequalities are satisfied as
equalities. In fact, the solution to the problem with only these n constraints is the same
as the original problem. We refer to these constraints as the critical constraints. In
Fig. 1, the critical constraints are represented by h1 and h2 . The cone bounded by the
hyperplanes h1 and h2 has the same solution as the original problem.
Claim 3.1. The cone formed by the critical constraints has the same optimum solution
as the original polytope.

3.1

Seidels Algorithm

Seidels algorithm [Sei91] is a simple randomized, incremental algorithm for Linear


Programs, that works well for small values of n.
2

max

critical constraints

h1

h2
optimum
all feasible solutions

Figure 1: Critical constrains and optimum solution.

Algorithm 3.2 (Seidels Algorithm (LP(H)).


Randomly pick a constraint h H
x := Seidels Algorithm (LP(H \ {h}))
if (x satisfies h)
return x
else
Project all constraints in H (input of current iteration)
onto h to form H 0
return Seidels Algorithm(new LP(H 0 ))
In Algorithm 3.2, if x satisfies h, it is the optimal solution of the original problem.
If x does not satisfy h, the solution must lie on the hyperplane defined by h. That is
h is satisfied as an equality in the optimum solution. In this case, we project all other
constraints onto h by casting these constraints onto their intersections with h. Last, we
solve the new Linear Program recursively with (n-1) variables on (m-1) constraints.
Lets go through the algorithm with a simple example shown in Fig. 2, in which
the objective function is directed upward and the polytope is defined by 6 hyperplanes.
Obviously, the optimal solution is at the intersection of hyperplanes 3 and 4. In fact,
these two hyperplanes correspond to the two critical constraints of this problem.

max
2

Figure 2: An example for Seidels algorithm.

Any one of the hyperplanes 1, 2, 5, or 6 can be removed without rendering x infeasible. If h = 4 in a particular iteration, the solution x should be in the shaded region
and would not satisfy h. We then know that hyperplane 4 is one of the critical constraints, and project all remaining constraints onto it. The polytope is now reduced to a
new Linear Program with one less dimension. The optimum solution remains the same
in this new Linear Program.
Assume there are k critical constraints out of m constraints. In the best case, we
want the algorithm to throw away the m k non-critical constraints before hitting a
critical constraint h. In this case, we have less number of constraints to project onto h.
The algorithm terminates faster. In the worse case, the algorithm consider all k critical
ones before the non-critical ones. For each critical constraints h, We need to project all
non-critical and subset of the critical constraints to it. It takes longer for the algotirhm
to terminate. Next, we will examine the complexity of this algorithm.

3.2

Analysis of Seidels Algorithm

Let A be an m n matrix, and m n. In every recursion, the number of constraints


is reduced by one. It takes the algoritm O(n) time to check satisfiability of x in each
recursive call. When x does not satisfy h, it takes the algorithm O(mn) to project
remaining constraints onto h. The complexity of this recursive algorithm is as follows:

T (n, m) = T (n, m1)+O(n)+

0
O(mn) + T (n 1, m 1)

if x satisfies h,
(4)
otherwise.

n criticals

......

m constraints

Figure 3: m constraints on n variables


Fig. 3 presents an intuitive way of this analysis. The rank of A can never exceed
n, that is at most n out of the m constraints are critical. In other words, every vertex
of the polytope is defined by at most n constraints. Hence, the case where x does not
n
satisfy h happens with probability m
. Eqn. 4 is then reduced to:
T (n, m) = T (n, m 1) + O(n) +

n
m (O(mn)

+ T (n 1, m 1))

Claim 3.3. The Seidels algorithm runs in exponential time O(n!m).


Proof. We prove this by double induction on n and m. We do induction on m within
the induction step for n.
Basis: Let n = 1. It takes the algorithm need to verify the only one constraint with
all m variables. Thus, the execution time is O(m) and O(n!m) = O(m).

Inductive step: Assume that for n 2, 1 i < n, T (i, m) = O(i!m). We want


to show that T (n, m) = O(n!m).
T (n, m) = T (n, m 1) + O(n) +

n
m (O(mn)

+ T (n 1, m 1))

We need to show that T (n, m 1) = O(n!(m 1)). We prove this by induction


on m.
Basis: Let m = 1. T (n, 0) = O(0) since the there are no variables left in this case.
Inductive step: Assume that for m 2, 1 j < m, T (n, j 1) = O(n!(j 1)).
We want to show that T (n, m 1) = O(n!(m 1)).
T (n, m 1)

=
=
=
=

T (n, m 2) + O(n) + O(n2 ) +


O(n!(m 2)) + O(n2 ) +

n
m2 T (n

n
m2 O((n

O(n!(m 2)) + O(n ) + O(n!)

1, m 2)

1)!(m 2))

O(n!(m 1))

Hence, T (n, m 1) = O(n!(m 1) for all m 1.


Now, we continue with the induction on n.
T (n, m)

=
=
=
=

O(n!(m 1)) + O(n) + O(n2 ) +


O(n! m1
m )

O(n!(m 1)) +
O(n!(m 1)) + O(n!)

n
m T ((n

1)!(m 1))

O(n!m)

Therefore, T (n, m) = O(n!m) for all n, m 1.


To summarize, the Seidels algorithm runs in exponential time O(n!m). This upper
bound is linear when the dimension n is constant.
Note that the set of critical constraints may or may not be unique. As shown in
Fig. 4, the point at the tip of the pyramid is the optimum solution. Let H 0 be the set
of constraints corresponding to the four hyperplanes intersecting at the optimum point.
Any three of H 0 are sufficient to define this optimum point. As soon as the Seidels
algorithm removes the first constraint in H 0 , the remaining three become the critical
constraints.
optimum
max

Figure 4: The set of critical constraints may not be unique.

3.3

Matousek Sharir Welzls Improvement

The Seidels algorithm throws away the solution x if it does not satisfy h and solves
the subproblem with all constraints being projected onto h. In 1992, Matousek, Sharir,
and Welzl [MSW92, SW92] identified this flaw and improved the algorithm using a
more sophisticated structure. They used information in the recursion regardless
the
q
satisfaction of h. This improvement leads to a smaller complexity, exp(2 n ln mn +

O( n + ln m)).

3.4

Kalais results

Let (n, m) be the maximal possible diameter of a polyhedron defined by m constraints on n variables. Then (n, 2n) n follows. We illustrate this with an example.
Example 3.4. The following constraints,
0 x1 1
..
.
0 xn 1
define an n-dimensional cube as shown in Fig. 5. The path between any two points
in this polyhedron involves at most n steps. Since the diameter of a polyhedron is
the shortest path between the two points furthest apart, the diameter of this cube is n.
Hence, the maximal possible diameter is at least n.

A
Figure 5: An n-dimensional cube.
Conjecture 3.5 (Hirsch [Dan63]). In general (n, m) = (n + m)
Obviously, the path defined by (n, m) might not be always moving in the direction of the objective function. In fact, it is the shortest path between two points. The
maximal directed diameter 0 (n, m) is the longest path along the direction of the objective function hc, xi between two points in a polyhedron defined by m constraints on
n variables. Thus, 0 (n, m) (n, m).
In the simplex algorithm, we start at one vertex of the polyhedron and move to
another vertex that has greater value of the objective function until the optimum is
found. If a simplex algorithm terminates in T iterations for every Linear Program with
m constraints on n variables, then T 0 (n, m) (n, m).
6

In 1992, Kalais algorithm [Kal92] emerged from the study of the diameter problem
for graphs of polyhedra. The algorithm is a randomized simplex
algorithm that gives

an upper bound on the number of iterations which is exp(O n log n). This shows, for
the first time, a similar bound on 0 (n, m).

4 Duality Theorem
Lets first consider two examples.
Example 4.1. Given the following Linear Program
min x2
s.t.
x1 + x 2 8

(5)

3x1 + 2x2 6

(6)

how do we lower bound the value of the optimum solution?


3Eqn. 5:
3(x1 + x2 ) 24

(7)

Eqn. 6+Eqn. 7:
3(x1 + x2 ) + (3x1 + 2x2 ) 30

(8)

Solve Eqn. 8:
x2 6
For any feasible solution x2 is at least 6.
Example 4.2. Given the following Linear Program
max 5x1 + 6x2 + 9x3 + 8x4
s.t.
x1 + 2x2 + 3x3 + x4 5

(9)

x1 + x2 + 2x3 + 3x4 3

(10)

xi 0, i = 1, 2, 3, 4
how do we upper bound the value of the optimum solution?
To make each term in Eqn. 9 and Eqn. 10 bigger than the terms in the objective
function, we apply the following operations.
8Eqn. 9:
5x1 + 6x2 + 9x3 + 8x4 8x1 + 16x2 + 24x3 + 8x4 40

(11)

5x1 + 6x2 + 9x3 + 8x4 6x1 + 6x2 + 12x3 + 18x4 18

(12)

6Eqn. 10:

Eqn. 12 provides a tighter upper bound than Eqn. 11. Consider a feasible solution
x1 = 1, x2 = 2, x3 = x4 = 0,
5x1 + 6x2 + 9x3 + 8x4 = 5 + 12 + 0 + 0 = 17
Another possible upper bound is 1Eqn. 9+4Eqn. 10:
5x1 + 6x2 + 9x3 + 8x4 5x1 + 6x2 + 11x3 + 13x4 17
Therefore we have an upper bound of 17.
This can be thought of as a search for the best upper bound for a maximization problem or the best lower bound for a minimization problem. We had to choose y1 , y2 0
as the coefficient in our proof. It was necessary to get a combination that dominated
the objective function. Consider the constraints in Example 4.2,

y1 (x1 + 2x2 + 3x3 + x4 ) + y2 (x1 + x2 + 2x3 + 3x4 )


(y1 + y2 )x1 + (2y1 + y2 )x2 + (3y1 + 2y2 )x3 + (y1 + 3y2 )x4

(13)

In order to have each term in Eqn. 13 bigger than the terms in the objective function,
we need to have
y1 + y 2
2y1 + y2
3y1 + 2y2
y1 + 3y2

5
6
9
8

Note that the objective function is less than Eqn. 13, which is less than 5y1 + 3y2 .
Hence, seeking for the tightest upper bound for Example 4.2 is the same as minimizing
value of 5y1 + 3y2 . This gives rise to a new Linear Programming problem as follows:
min 5y1 + 3y2
s.t.
y1 + y 2 5
2y1 + y2 6
3y1 + y2 9
y1 + 3y2 8
y1 , y 2 0
Note that the coefficients in the objective function are exactly the numbers on the
right side of the inequalities in the original Linear Program. The coefficients on the
left side of the inequalities are actually the transpose of matrix A, and the numbers
on the right side are the vector c in the original problem. Everything here satisfy the
constraints is an upper bound of the original problem. This is called the dual, and the
original problem is referred to as the primal.
Lets return to the diet problem given in the first lecture.

Example 4.3 (the diet problem). A farmer wants to spend as little as possible while
keeping the cows healthy. There are n different food types available, the j th food
containing cj R calories per kilogram, 1 j n, and aij R milligrams of
vitamin i per kilogram, 1 i m. The cow requires at least bi R milligrams of
vitamin i to stay healthy. Given that the goal is to minimize the money spent while
having enough of each vitamin, how should the cow be fed?
The problem in Linear Program form is as follow and is presented in a graphically
in Fig. 6.
minhc, xi
s.t.
Ax b
xi 0, 1 i n

food type
Vitamin

x b

mininum vitamin
required

C
cost per unit of food
Figure 6: The diet problem.
How do we bound the minimum cost? In other words, how do we convert this to
the dual?
Following our observations from Example 4.2, we introduce a vector y and want to
minimize hy, bi. The new constraints are yA c. Formally, we have
maxhy, bi
s.t.
yA c
yi 0, 1 i m
To see this results intuitively, we can think of the vector y as the price of vitamin
pills in the drug mart and the cost vector c as the price of each unit of food. The
pharmacist always wants to maximize his/her profit while being competitive against
each food type.
This naturally leads to the duality theorem, which we will be introduced to in the
next lecture. For every maximization problem, there is a corresponding minimization
problem, and vice versa. They provide upper/lower bound for each other. In fact, as
we will see, their optimum solutions are the same.

5 Tutorial: February 2
In this tutorial, we will discuss the dual of different forms of Linear Programs, and
learn the technique to transform an Linear Program between its primal and dual. We
will wrap this tutorial with an application of duality theorem.

5.1

Duality Theorem

Any Linear Program problem can be expressed as either the primal or the dual. Conventionally, we referred to the original problem as the primal, and the other one as
the dual. Given a Linear Program in general form, the primal and dual are defined as
follows:
Primal
minhc0 , xi
a0i x = bi
a0i x bi
xj 0
xj 0

iM

iM
jN

jN

Dual
maxh 0 , bi
i 0
i 0
0 Aj c j
0 Aj = c j

In general, if the primal is feasible, the dual is feasible as well. It is possible to have
both the primal and dual infeasible. If one of them is unbounded, the other one must
be infeasible.
Here, we want to focus on how to transform between the primal and dual of a Linear
Program. The objective functions of the primal and dual are related by the scalar 0 Ax.
More precisely, we have c0 x 0 Ax 0 b. The primal and dual of a Linear Program
can be visualized as in Fig. 7 . These two tables provide us a good reference on how to
map the problem from one form to another.

Primal (min)

Dual (max)

c
0

b
<> 0

<> 0

AT

_
N

_
M
Figure 7: Primal and dual of an LP.

10

5.2

Application: Shortest Path

Duality theorem has a wide range of applications. We will see its usefulness through
an example of the shortest path problem.
Example 5.1. Given a directed graph G = (V, E) (Fig. 8), a cost cj 0 is assigned
to each edge ej E. We want to find the least-cost path from node s to node t.

a
1
s

e1
e2
2

e3

3
2 e4

e5
1

Figure 8: An example of the shortest path problem.


The graph G can be represented
follows,

+1

aij = 1

+1
0
A=
1
0

by a matrix A, in which each entry is defined as


if edge ej leaves node vi ,
if edge ej arrives node vi ,
otherwise.

+1 0
0
0
0
0 1 1

0 +1 +1 0
1 1 0 +1

An edge ej has positive flow fj > 0 if it is selected to construct the path from s to
t. The objective is to minimize the total cost of the path. Thus, the Linear Program for
the problem is
minhc, f i
s.t.
Af = b
fi 0, i = 1, 2, 3, 4, 5
The vector b specifies the flow conservation on each node in the final path. In other
words, the entry for s is +1 indicating that a unit flow is leaving s, and the entry for
t is 1 indicating that a unit flow is arriving at t. The rest of the entries should be 0
since each node should forward all flows it received. A BFS of this problem must have
|E| |V | positive entries.
It is possible to have a non-integral optimum solution s0 . In fact, we can derive a
integral (a BFS) optimum solution from s0 . To do so, we let the set S = (s1 , , sn )
0
be all BFSs. Since all feasible
Psolution are in the convex hall of the BFSs, then s =
P
i i = 1. Let j be the index corresponding to the BFS
i i si for 0 i 1 and
with lowest cost. Then

11

c s0 =

X
i

i (c si ) (c sj )

X
i

i = c s j

which shows that sj is not only a BFS, but also the optimum integral solution.
The dual form of this problem is then
maxh 0 , bi
s.t.
0 A c
i0 0, i V
We can rewrite the Linear Program as
max s t
s.t.
i j cij , (i, j) E
i 0, i V
Now, we examine the feasibility of this dual. Let dij be the shortest path between
any two points in G. For each vertex v in V , let v = dvt . The constraints of the dual
become dit djt cij , which satisfies the triangle inequality as shown in Fig. 9. In
words, if there exists a shortest path from node i to t, its cost must be no more than the
cost of the path from i to t via node j. Hence, the constraints are feasible.

Figure 9: Triangle inequality.


At last, we want to show that the optimum solution of the dual is the shortest path
from s to t. Since dst is a feasible solution to the dual, dst
0 b, where
0 b is the
optimum solution. Obviously, the shortest path from s to t is the path s b t.
According to the dual, the following two inequality follows,
s b csb
b t cbt

(14)
(15)

Eqn. 14 + Eqn. 15 gives us s t csb + cbt = dst . Since 0 b = s t , we


have 0 b dst
0 b. Hence, dst is the optimum solution. Therefore, the optimum
solution of the dual is the shortest path from s to t.

References
[Dan63]

G. B. Dantzig. Linear programming and extensions. Princeton University


Press, Princeton, NJ, 1963.
12

[Kal92]

G. Kalai. A subexponential randomized simplex algorithm. in Proceedings


of 24th Annual ACM Symposium on Theory of Computing, pages 475482,
1992.

[MSW92] J. Matousek, M. Sharir, and E Welzl. A subexponential bound for linear


programming. in Proceedings of 8th Annual ACM Symposium on Computational Geometry, pages 18, 1992.
[Sei91]

R. Seidel. A simple and fast incremental randomized algorithm for computing trapezoidal decompositions and for triangulating polygons. Computational Geometry: Theory and Applications, 1:5164, 1991.

[ST01]

D. A. Spielman and S. Teng. Smoothed analysis: Why the simplex algorithm usually takes polynomial time. in Proceedings of the Thirty-Third
Annual ACM Symposium on Theory of Computing, pages 296305, 2001.

[SW92]

M. Sharir and E Welzl. A combinatorial bound for linear programming and


related problems. in Proceedings of 9th Symposium on Theoretical Aspects
of Computer Science, 577 of Lecture Notes in Computer Science:569579,
1992.

13

You might also like