1 Minimum Spanning Tree (MST) : Lecture Notes CS:5360 Randomized Algorithms
Lectures 10, 11, and 12: Sep 20, Sep 25, and Sep 27, 2018
Scribe: Hankyu Jang
• Input: An edge-weighted, connected graph G = (V, E). Let w(e) denote the weight of edge e, where w(e) > 0 for all e.
Example Some of the well-known "greedy" algorithms are (1) Kruskal's algorithm, (2) Prim's algorithm, and (3) Boruvka's algorithm. Take a look at Figure 1 to understand how Kruskal's algorithm works.
Figure 1: At each step, Kruskal's algorithm adds to the tree the minimum-weight edge that does not create a cycle. As depicted in the picture, edges are added one at a time.
Let m be the number of edges and n be the number of vertices. Then, using simple data structures, these algorithms can be implemented in O((m + n) log n) time. (In the state of the art, log n can be replaced by β(m, n), an extremely slowly growing function.)
Question Does MST have an O(m + n) time algorithm? In other words, does it have a linear
time algorithm?
The answer to the above question is that no one knows whether there is a deterministic linear time algorithm; the problem remains open if we focus on deterministic algorithms. There was a breakthrough in the early 1990s, when Karger, Klein, and Tarjan showed that there is a Las Vegas algorithm for MST running in expected O(m + n) time. The key idea of this algorithm comes from sampling.
Before we move on, let’s review Kruskal’s algorithm.
Algorithm 1: Kruskal’s algorithm
1 Sort the edges in increasing order of weight;
2 T ← (V, ∅) (containing all the vertices but no edges);
3 for each edge e = {u, v} considered in order do
4 if u and v are in distinct connected components of T then
5 add e to T ;
6 end
7 end
Figure 2: The black lines are the edges in T . If the next edge is the green dashed line, then
Kruskal’s algorithm will add the edge to T . If the next edge is the red dashed line, then Kruskal’s
algorithm will not add the edge to T .
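As a concrete illustration, Algorithm 1 can be sketched in Python. The union-find structure below uses path compression; the function name and edge representation are our own choices for this sketch.

```python
def kruskal(n, edges):
    """Minimum spanning tree by Kruskal's algorithm.

    n: number of vertices, labeled 0..n-1
    edges: list of (weight, u, v) tuples
    Returns the list of MST edges in the order they were added.
    """
    parent = list(range(n))

    def find(x):
        # Union-find "find" with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):       # increasing order of weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # u, v in distinct components of T
            parent[ru] = rv             # merge the two components
            tree.append((w, u, v))
    return tree
```

With a heap-free sort and this union-find, the running time is dominated by the O(m log m) sort, consistent with the O((m + n) log n) bound mentioned above.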
Definition An edge {u, v} ∈ E is F-heavy if w(u, v) > wF (u, v). Otherwise, i.e., if w(u, v) ≤ wF (u, v), then {u, v} is F-light. Figure 4 depicts the different cases.
Note that every edge in F is F-light, and the remaining edges of E are partitioned into F-light and F-heavy edges.
Figure 3: In the graph on the left, u and v are in the same connected component. Hence wF (u, v) = 8, the heaviest weight of an edge on the path between u and v. In the graph on the right, u and v are in different components. Therefore, wF (u, v) = ∞.
Figure 4: In the graph on the left, the edge connecting u and v is F-light because w(u, v) = 6 < wF (u, v) = 8. In the graph in the middle, the edge connecting u and v is F-heavy because w(u, v) = 20 > wF (u, v) = 8. In the graph on the right, the edge connecting u and v is also F-light because, regardless of the weight of the edge, w(u, v) < wF (u, v) = ∞.
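For illustration, wF (u, v) can be computed by a brute-force walk through the forest. This sketch is not the linear-time verification algorithm used later; the adjacency-dict representation and function names are our own assumptions.

```python
import math

def w_F(forest_adj, u, v):
    """Maximum edge weight on the u-v path in forest F,
    or infinity if u and v lie in different trees.

    forest_adj: dict mapping vertex -> list of (neighbor, weight)
    """
    # Depth-first search carrying the heaviest weight seen so far.
    # In a forest, checking the parent is enough to avoid revisits.
    stack = [(u, None, 0)]
    while stack:
        x, parent, heaviest = stack.pop()
        if x == v:
            return heaviest
        for y, w in forest_adj.get(x, []):
            if y != parent:
                stack.append((y, x, max(heaviest, w)))
    return math.inf

def is_F_light(forest_adj, u, v, w):
    # F-light: w(u, v) <= wF(u, v); otherwise F-heavy.
    return w <= w_F(forest_adj, u, v)
```

Running this on the forest of Figure 3 (path u–x–v with weights 3 and 8) gives wF (u, v) = 8, matching the figure.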
Question How can we use randomization to find F quickly so that we can focus on computing
MST on F-light edges?
At a high level, the KKT algorithm (1) samples each edge of G independently with probability p to obtain a subgraph G(p), (2) computes a minimum spanning forest F of G(p), (3) classifies each edge of G as F-light or F-heavy and discards the F-heavy edges (which cannot belong to the MST), and (4) computes the MST of the F-light edges. There are known deterministic linear time algorithms for step (3). Steps (2) and (4) are MST computation steps and step (3) is the MST verification step. The higher you choose p, the more computation is needed in step (2); but the number of F-light edges, about n/p, is smaller, so less computation is needed in step (4). We need a balance!
Example To balance work in steps (2) and (4), set mp = n/p, where n/p is the expected number of F-light edges processed in step (4). This leads to choosing p = √(n/m). Suppose m = Θ(n²) (a very dense graph). Then

mp = n² · √(n/n²) = n² · (1/√n) = n^1.5
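The balancing calculation can be checked numerically; the values of n and m below are arbitrary illustrative choices.

```python
import math

n = 100
m = n * n              # very dense graph, m = Θ(n²)
p = math.sqrt(n / m)   # balance: mp = n/p  =>  p = sqrt(n/m)

# Work in step (2) is proportional to mp, in step (4) to n/p;
# with this choice of p both equal n^1.5.
assert math.isclose(m * p, n / p)
assert math.isclose(m * p, n ** 1.5)
```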
Algorithm 2: KKT sampling at ei (step i)
1 if coin toss = H then
2 add ei to G(p);
3 if ei connects two different components of F then
4 add ei to F ;
5 classify ei as F-light;
6 else
7 classify ei as F-heavy;
8 end
9 else
10 if ei connects two different components of F then
11 classify ei as F-light;
12 else
13 classify ei as F-heavy;
14 end
15 end
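A minimal simulation of this sampling step, assuming the edges arrive sorted by weight and using union-find to track the components of F. The function name and the `rng` parameter are our own choices for this sketch.

```python
import random

def kkt_sample(n, edges, p, rng=random.random):
    """Process edges (assumed sorted by weight) as in Algorithm 2.

    Returns (G_p, F, light): the sampled edges, the forest built
    from Heads, and the F-light edges, respectively.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    G_p, F, light = [], [], []
    for w, u, v in edges:
        heads = rng() < p
        if heads:
            G_p.append((w, u, v))
        ru, rv = find(u), find(v)
        if ru != rv:                 # connects two components: F-light
            light.append((w, u, v))
            if heads:                # only Heads edges enter F
                parent[ru] = rv
                F.append((w, u, v))
        # else: F-heavy, never added to F regardless of the toss
    return G_p, F, light
```

Passing a deterministic `rng` (e.g. `lambda: 0.0` for all Heads) makes the behavior easy to inspect.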
The table below illustrates Algorithm 2. At the end of Phase 0, ei is added to F since it is the first Head. Likewise, at the end of Phase 1, ej is added to F. One thing to notice is that in Phase 2 the coin toss is H for the edge ej+1, but this edge is not added to F because it is F-heavy: adding it would create a cycle. Only edges that are F-light with respect to the F of the previous phase can be added to F when the coin toss is Heads. In general, at a typical phase, a phase ends when the first F-light edge whose coin toss is Heads is added to F.
KKT Sampling

-               Phase 0               Phase 1                    Phase 2
Coin Toss       T T ··· T H           T T ··· T H                H T ··· T H
Edge selected   e1 e2 ··· ei−1 ei     ei+1 ei+2 ··· ej−1 ej      ej+1 ej+2 ··· el−1 el
Edges in F      0                     1                          2

(At the end of Phases 0 and 1, the final Heads edge satisfies ei ∈ F and ei is F-light; the Heads edge ej+1 in Phase 2 is instead F-heavy.)
Example ep is F-heavy and the coin toss is H. What happens? In this case, do not add the edge to F, but add it to G(p).
F-heavy edges in Phase k are not added to F, independent of the coin tosses. So we are essentially just processing F-light edges until we get the first F-light edge whose coin toss comes up Heads.
Let Xk = the number of F-light edges in Phase k. What is the distribution of Xk? Xk ∼ Geom(p). We know that E[Xk] = 1/p.
Let X = the total number of F-light edges. Note that if an edge is F-light when it is processed, it remains F-light forever. Therefore, X = Σ_{k≥0} Xk.
Suppose we add s edges to F for some 0 ≤ s ≤ n − 1. The random variables X0, X1, ···, X_{s−1} follow Geom(p). Hence, E[X0 + X1 + ··· + X_{s−1}] = s/p ≤ (n−1)/p.
What about Xs? In other words, what about the remaining part?
To deal with this issue, imagine that we append infinitely many dummy F-light edges after e1, e2, ···, em. Let the dummy edges be em+1, em+2, em+3, ···, and declare them all F-light. Also, we toss coins until we have tossed n Heads for F-light edges.
Let Y = the number of F-light edges we process in order to get n Heads. Then, E[X0 + X1 + ··· + Xs−1 + Xs] ≤ E[Y]. Note that up to Phase s − 1, there are at most n − 1 Heads corresponding to F-light edges. In Phase s (counted by Xs), the remaining Heads correspond to F-light edges.
Definition A random variable Y has negative binomial distribution with parameters n (number
of successes we want) and p (probability of head) if it equals the number of coin tosses of a biased
coin needed to get n Heads.
Here y = n, n + 1, ···, and

P(Y = y) = (y−1 choose n−1) · p^n · (1 − p)^(y−n)

Since Y is a sum of n independent Geom(p) random variables, E[Y] = n/p. Hence E[X] ≤ E[Y] = n/p.
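The fact that a negative binomial variable with parameters n and p has mean n/p can be checked by a quick Monte Carlo simulation; the sample sizes below are arbitrary.

```python
import random

def neg_binomial_sample(n, p, rng):
    """Number of tosses of a p-biased coin needed to see n Heads."""
    tosses = heads = 0
    while heads < n:
        tosses += 1
        if rng.random() < p:
            heads += 1
    return tosses

rng = random.Random(0)                   # fixed seed for reproducibility
n, p, trials = 10, 0.5, 20000
avg = sum(neg_binomial_sample(n, p, rng) for _ in range(trials)) / trials
# E[Y] = n/p = 20; the empirical mean should be close to that.
```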
If F is an MST, the output (the set of F-heavy edges) is every edge other than the edges of F. However, if F is a spanning forest, edges joining different components of F are F-light (since wF (u, v) = ∞ for them), so they too are excluded from the output; this case is depicted in Figure 5.
There are several deterministic linear time algorithms for this problem. The simplest one is due to Valerie King in the mid-1990s. Another ingredient is Boruvka's MST algorithm.
Algorithm 3: Boruvka’s MST algorithm
1 while there is more than one connected component do
2 For each vertex v, pick a minimum weight edge ev incident on v;
3 Add all ev ’s to solution;
4 Contract connected components to supervertices;
5 end
Figure 5: The lines are the edges in a spanning forest F. If this is the case, red dashed lines are
the output of MST verification.
Figure 6: Red arrows on the second graph are ev s picked per node. These red arrows are all added
to solution. Then, the connected components are contracted to super-vertices.
There are two facts about Boruvka's algorithm: (1) if we start a Boruvka iteration with n vertices, we have ≤ n/2 vertices after the iteration, and (2) each Boruvka iteration can be implemented in O(m + n) time.
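Algorithm 3 can be sketched as follows. This simple version scans all edges in every iteration rather than contracting to supervertices, so it is not O(m + n) per iteration; it also assumes distinct edge weights (ties would need consistent tie-breaking).

```python
def boruvka(n, edges):
    """Boruvka's MST algorithm on a connected graph.

    n: number of vertices, edges: list of (weight, u, v)
    with distinct weights. Returns the set of MST edges.
    """
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = set()
    components = n
    while components > 1:
        # For each component, the minimum weight edge leaving it.
        best = {}
        for e in edges:
            w, u, v = e
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for r in (ru, rv):
                if r not in best or w < best[r][0]:
                    best[r] = e
        # Add all chosen edges and merge the touched components.
        for w, u, v in best.values():
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.add((w, u, v))
                components -= 1
    return tree
```

Since every component picks an edge and components merge in pairs at worst, the number of components at least halves per iteration, matching fact (1) above.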
Algorithm 4: KKT MST algorithm
1 We apply 3 iterations of Boruvka’s algorithm. Let G1 be the resulting graph. Let C be edges selected to be
in MST by these iterations of Boruvka’s algorithm;
2 If G1 contains a single vertex then return C and exit;
3 Set p = 1/2 and compute G2 = G1 (p);
4 By recursively calling the KKT MST algorithm, compute a minimum spanning forest F2 of G2 ;
5 Call MST verification algorithm to compute F2 -light edges of G1 . (MST verification algorithm will give the
heavy edges. Use the complement of this);
6 Recursively call the KKT MST algorithm to compute MST of (V (G1 ), F2 -light edges);
7 return C∪ (edges picked in step (6));
Figure 7: This figure depicts the KKT MST algorithm for an unconnected graph G.
Running time analysis Let T (m, n) denote the expected running time of the KKT MST algo-
rithm on an n-vertex graph with m edges. Here are the running time of the KKT MST algorithm
per step:
• Step (3): O(m + n) (visiting every edge and then sampling it)
• Step (4): T (m/2, n/8)
• Step (6): T (n/4, n/8)
The running time of the KKT MST algorithm is the sum of the running times of the steps above:
T (m, n) = O(m + n) + T (m/2, n/8) + T (n/4, n/8)

We upper bound the O(m + n) term by c(m + n) for some constant c. We claim that T (m, n) ≤ 2c(m + n), and verify the claim by induction:

T (m, n) ≤ c(m + n) + T (m/2, n/8) + T (n/4, n/8)
         ≤ c(m + n) + 2c(m/2 + n/8) + 2c(n/4 + n/8)
         = c(m + n) + c(m + n/4 + n/2 + n/4)
         = c(m + n) + c(m + n)
         = 2c(m + n)

Hence the expected running time of the KKT MST algorithm is O(m + n).
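The recurrence can be sanity-checked by evaluating it directly, taking the O(m + n) term to be exactly c(m + n); the base-case cutoff and c = 1 are our own choices for this sketch.

```python
def T(m, n, c=1.0):
    """Evaluate T(m, n) = c(m + n) + T(m/2, n/8) + T(n/4, n/8),
    with T = 0 once the graph is trivially small."""
    if m < 1 or n < 1:
        return 0.0
    return c * (m + n) + T(m / 2, n / 8, c) + T(n / 4, n / 8, c)

# The closed-form bound T(m, n) <= 2c(m + n) holds at every size tried.
for m, n in [(10_000, 100), (1_000_000, 1000)]:
    assert T(m, n) <= 2 * (m + n)
```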