4 Greedy Updated
Term 2, 2023
Table of Contents
1. Introduction
2. Assorted problems
3. Applications to graphs
3.1 Directed graph structure
3.2 Single source shortest paths
3.3 Minimum spanning trees
4. Puzzle
The Greedy Method
Question
What is a greedy algorithm?
Answer
A greedy algorithm is one that solves a problem in stages, at each
stage considering only the choice that appears to be the best at
that stage of construction.
This obviously reduces the search space, but it works correctly only
in cases when the locally optimal choices lead to the globally
optimal outcome.
Activity Selection
Problem
Instance: A list of n activities, with starting times s_i and finishing
times f_i. No two activities can take place simultaneously.
Task: Find a largest possible subset of non-conflicting activities.
Attempt 1
Always choose the shortest activity which does not conflict with
the previously chosen activities, then remove the conflicting
activities and repeat.
Attempt 2
Maybe we should always choose an activity which conflicts with
the fewest possible number of the remaining activities? It may
appear that in this way we minimally restrict our next choice . . .
As appealing as this idea is, the above figure shows that this again does
not work!
Solution
Among those activities which do not conflict with the previously
chosen activities, always choose the activity with the earliest end
time (breaking ties arbitrarily).
Thus, the algorithm runs in total time O(n log n), dominated
by sorting.
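The earliest-finish-time rule can be sketched as follows. This is a minimal illustration, assuming activities are given as (start, finish) pairs and that an activity may start exactly when the previous one finishes; the function name is hypothetical.

```python
def select_activities(activities):
    """Greedy selection: scan activities in order of finishing time and
    keep each one that does not conflict with the last chosen activity."""
    chosen = []
    last_finish = float("-inf")
    # Sorting by finishing time dominates the running time: O(n log n).
    for start, finish in sorted(activities, key=lambda a: a[1]):
        if start >= last_finish:  # assumption: touching endpoints do not conflict
            chosen.append((start, finish))
            last_finish = finish
    return chosen


activities = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9),
              (6, 10), (8, 11), (8, 12), (2, 14), (12, 16)]
print(select_activities(activities))
```

After sorting, a single linear pass suffices because only the finishing time of the most recently chosen activity needs to be remembered.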
A related problem
Instance: A list of n activities with starting times s_i and finishing
times f_i = s_i + d; thus, all activities are of the same duration d. No
two activities can take place simultaneously.
Task: Find a subset of non-conflicting activities of maximal total duration.
Solution
Since all activities are of the same duration, this is equivalent to
finding a selection with the largest number of non-conflicting
activities, i.e., the previous problem.
Question
What happens if the activities are not all of the same duration and
we have to select activities of maximal total duration?
Solution
The greedy strategy no longer works; we will need a more
sophisticated technique.
Petrol stations
Problem
Instance: You are travelling by car on a road from Loololong (in
the West) to Goolagong (in the East). You start with a full tank of
petrol, which you know is sufficient to travel K kilometres.
You also know the distances d_i from Loololong to each of the N petrol
stations on the road, and you wish to reach Goolagong with the
smallest possible number of stops to refuel. How should you
choose at which petrol stations to stop?
Solution
Always travel to the furthest petrol station you can reach without
running out of petrol and always fill the tank to its capacity.
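A minimal sketch of this rule, assuming the station distances are given in increasing order and that the total distance to Goolagong is known. The parameter `total` and the function name are assumptions made for illustration; they do not appear in the problem statement.

```python
def refuel_stops(distances, K, total):
    """Return the distances of the stations at which to stop, or None
    if Goolagong cannot be reached.

    distances: station distances from the start, in increasing order
    K: kilometres that can be travelled on a full tank
    total: distance from the start to the destination (assumed known)
    """
    stops = []
    position = 0  # where the tank was last filled
    i = 0
    while position + K < total:
        furthest = None
        # Advance to the furthest station reachable on the current tank.
        while i < len(distances) and distances[i] <= position + K:
            furthest = distances[i]
            i += 1
        if furthest is None:
            return None  # no new station within range
        stops.append(furthest)
        position = furthest
    return stops


print(refuel_stops([10, 20, 30, 40, 50], K=25, total=60))  # [20, 40]
```

Each station is examined once, so the scan is linear in N once the distances are sorted.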
Question
So, in general, how do we prove that a greedy algorithm produces
a correct solution?
Answer
There are two main methods of proof:
1. Exchange argument: consider an alternative solution, and
gradually transform it into the solution found by our proposed
algorithm without making it any worse (as we did in the
Activity Selection Problem).
2. Greedy stays ahead: prove that at every stage, no other
sequence of choices could do better than our proposed
algorithm (as we did in the Petrol Stations Problem).
Greedy Method Correctness Proofs
Problem
Instance: Along the long, straight road from Loololong (in the
West) to Goolagong (in the East), houses are scattered quite
sparsely, sometimes with long gaps between two consecutive
houses. Telstra must provide mobile phone service to people who
live alongside the road, and the range of Telstra’s cell tower is 5km.
Exercise
Prove the correctness of this algorithm using an exchange
argument.
Cell Towers
His junior associate did exactly the same but starting from
Goolagong and moving westwards and claimed that his
method required fewer towers.
Problem
Instance: A start time T_0 and a list of n jobs, with duration times
t_i and deadlines d_i. Only one job can be performed at any time; all
jobs have to be completed. If a job i is completed at a finishing
time f_i > d_i then we say that it has incurred lateness l_i = f_i − d_i.
Task: Schedule all the jobs so that the lateness of the job with the
largest lateness is minimised.
Minimising Job Lateness
Solution
Ignore job durations and schedule jobs in the increasing order of
deadlines.
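The rule above can be sketched in a few lines. This is an illustration only, assuming jobs are given as (duration, deadline) pairs; the function name is hypothetical, and a job that finishes on time is counted as having lateness 0.

```python
def schedule_by_deadline(jobs, t0=0):
    """Schedule jobs in increasing order of deadline.

    jobs: list of (duration, deadline) pairs
    t0: the start time T_0
    Returns (order, max_lateness), where order lists job indices.
    """
    order = sorted(range(len(jobs)), key=lambda i: jobs[i][1])
    time = t0
    max_lateness = 0  # a job finishing by its deadline has lateness 0
    for i in order:
        duration, deadline = jobs[i]
        time += duration  # finishing time f_i of job i
        max_lateness = max(max_lateness, time - deadline)
    return order, max_lateness


jobs = [(3, 6), (2, 8), (1, 9), (4, 9), (3, 14), (2, 15)]
print(schedule_by_deadline(jobs))  # ([0, 1, 2, 3, 4, 5], 1)
```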
Proof of optimality
Consider any alternative schedule. We say that jobs i and j form
an inversion if job i is scheduled before job j but d_j < d_i.
Recall that bubble sort only swaps adjacent array entries, and
eventually sorts the array.
Problem
Instance: A list of n files of lengths l_i which have to be stored on
a tape. Each file is equally likely to be needed. To retrieve a file,
one must start from the beginning of the tape and scan it until the
file is found and read.
Task: Order the files on the tape so that the average (expected)
retrieval time is minimised.
Tape Storage
Problem
Instance: A list of n files of lengths l_i and probabilities p_i of being
needed, with Σ_{i=1}^{n} p_i = 1, which have to be stored on a tape. To
retrieve a file, one must start from the beginning of the tape and
scan it until the file is found and read.
Task: Order the files on the tape so that the expected retrieval
time is minimised.
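The solution to this problem is not reproduced in the text above. As a hedged sketch: an exchange argument on two adjacent files shows that placing file i before file j is no worse exactly when p_i / l_i ≥ p_j / l_j, which suggests ordering the files by decreasing p_i / l_i. The function names below are hypothetical, and file lengths are assumed to be positive.

```python
def expected_retrieval_time(order, lengths, probs):
    """Expected scan length when files are stored in the given order."""
    total = 0.0
    position = 0
    for i in order:
        position += lengths[i]          # tape scanned up to the end of file i
        total += probs[i] * position
    return total


def greedy_tape_order(lengths, probs):
    """Order files by decreasing probability per unit length (assumed rule)."""
    return sorted(range(len(lengths)),
                  key=lambda i: probs[i] / lengths[i],
                  reverse=True)


lengths = [5, 1, 3]
probs = [0.5, 0.2, 0.3]
order = greedy_tape_order(lengths, probs)
print(order, expected_retrieval_time(order, lengths, probs))
```

When all probabilities are equal, this rule reduces to storing the files in increasing order of length.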
Tape Storage II
Problem
Instance: Let X be a set of n intervals on the real line, described
by two arrays X_L[1..n] and X_R[1..n], representing their left and
right endpoints. We say that a set P of points stabs X if every
interval in X contains at least one point in P.
Attempt 1
Is it a good idea to stab the largest possible number of intervals?
Hint
The interval which ends the earliest has to be stabbed somewhere.
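The hint suggests one possible greedy rule, sketched here under two assumptions: the task is to find a smallest set of points stabbing X, and the intervals are closed. Since the interval that ends earliest must be stabbed somewhere, stabbing it at its right endpoint also stabs as many later intervals as possible. The function name is hypothetical.

```python
def stab_intervals(XL, XR):
    """Return a list of points stabbing every closed interval [XL[i], XR[i]]."""
    points = []
    last = float("-inf")  # the most recently placed point
    # Process intervals in order of right endpoint.
    for left, right in sorted(zip(XL, XR), key=lambda iv: iv[1]):
        if left > last:           # interval not yet stabbed
            points.append(right)  # stab it as far right as possible
            last = right
    return points


print(stab_intervals([1, 2, 6, 7], [3, 5, 8, 9]))  # [3, 8]
```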
Problem
Instance: A list of n powder substances, described by their
weights w_i and values v_i, and a maximal weight limit W of your
knapsack. You can take any fraction available of each substance
(not necessarily integer).
Solution
Take the maximal available amount of the substance with the highest
value per unit weight, and repeat with the remaining substances until
the knapsack is full!
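A minimal sketch of this strategy, assuming the task is to maximise the total value carried, that v_i is the value of the whole available amount w_i of substance i, and that all weights are positive. The function name is hypothetical.

```python
def fractional_knapsack(weights, values, W):
    """Return (total_value, taken), where taken[i] is the weight of
    substance i placed in the knapsack."""
    remaining = W
    total = 0.0
    taken = [0.0] * len(weights)
    # Consider substances in decreasing order of value per unit weight.
    order = sorted(range(len(weights)),
                   key=lambda i: values[i] / weights[i],
                   reverse=True)
    for i in order:
        if remaining <= 0:
            break
        amount = min(weights[i], remaining)
        taken[i] = amount
        total += values[i] * amount / weights[i]
        remaining -= amount
    return total, taken


print(fractional_knapsack([10, 20, 30], [60, 100, 120], 50))
```

In this example the first two substances are taken in full and 20 of the 30 available units of the third, for a total value of 240.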
0-1 Knapsack
Problem
Instance: A list of n discrete items described by their weights w_i
and values v_i, and a maximal weight limit W of your knapsack.
Assume there are just three items with weights and values:
You are allowed to merge any two arrays into a single new
sorted array and proceed in this manner until only one array is
left.
Exercise
Design an algorithm which achieves this task and moves array
elements as few times as possible.
Give an informal justification why your algorithm is optimal.
However this might not be the most economical way: all the
symbols have codes of equal length but the symbols are not
equally frequent.
Tsunami Warning
Problem
Instance: There are n radio towers for broadcasting tsunami
warnings. You are given the (x, y ) coordinates of each tower and
its radius of range. When a tower is activated, all towers within
the radius of range of the tower will also activate, and those can
cause other towers to activate and so on.
Attempt 1
Find the unactivated tower with the largest radius (breaking ties
arbitrarily), and place a sensor at this tower. Find and remove all
towers activated as a result. Repeat.
Attempt 2
Find the unactivated tower with the largest number of towers
within its range (breaking ties arbitrarily), and place a sensor at this
tower. Find and remove all towers activated as a result. Repeat.
Exercise
Give examples which show that neither of these algorithms solves
the problem correctly.
Observation
Suppose that activating tower a causes tower b to also be
activated, and vice versa. Then we never want to place sensors at
both towers; indeed, placing a sensor at a is equivalent to placing a
sensor at b.
Observation
Let S be a subset of the towers such that activating any tower
in S causes the activation of all towers in S.
Definition
Given a directed graph G = (V, E) and a vertex v, the strongly
connected component of G containing v consists of all vertices
u ∈ V such that there is a path in G from v to u and a path from
u to v. We will denote it by C_v.
Claim
u is in C_v if and only if u is reachable from v and v is reachable
from u.
In the example graph, the set of vertices reachable from e is
R_e = {d, e, f, g, h}.
Strongly Connected Components
The set of vertices from which e is reachable is
R'_e = {a, b, c, d, e, f}.
Combining
R_e = {d, e, f, g, h}
with
R'_e = {a, b, c, d, e, f},
we have
C_e = R_e ∩ R'_e = {d, e, f}.
Note that C_d and C_f are also the same set, namely {d, e, f}.
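The computation of a single strongly connected component as an intersection of two reachability sets can be sketched as below. The edge list is hypothetical: the original figure is not recoverable, so the edges were chosen only to be consistent with the sets stated above, reading R_e as the vertices reachable from e and R'_e as the vertices from which e is reachable.

```python
from collections import deque


def reachable(adj, start):
    """Set of vertices reachable from start, found by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in adj.get(x, []):
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return seen


def reverse_graph(adj):
    """Graph with every edge reversed."""
    rev = {}
    for x, ys in adj.items():
        for y in ys:
            rev.setdefault(y, []).append(x)
    return rev


def scc_of(adj, v):
    """C_v: vertices reachable from v that can also reach v."""
    return reachable(adj, v) & reachable(reverse_graph(adj), v)


# Hypothetical edges, consistent with the sets in the example.
adj = {"a": ["b"], "b": ["c"], "c": ["a", "d"], "d": ["e"],
       "e": ["f"], "f": ["d", "g"], "g": ["h"], "h": ["g"]}
print(sorted(scc_of(adj, "e")))  # ['d', 'e', 'f']
```

Each call performs two graph searches, so finding one component takes time linear in the size of the graph.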
Definition
Define the condensation graph Σ_G = (C_G, E*), where
The Condensation Graph
Vertices of the condensation graph for the example: {a, b, c}, {d, e, f}, {g, h}.
Claim
The condensation graph Σ_G is a directed acyclic graph.
Proof Outline
Suppose there is a cycle in Σ_G. Then the vertices on this cycle are
not maximal strongly connected sets, as they can be merged into
an even larger strongly connected set.
Tsunami Warning
Solution
The correct greedy strategy is to place a sensor in each super-tower
(strongly connected component) that has no incoming edges in the
condensation graph, and nowhere else.
Proof
These super-towers cannot be activated by another super-tower, so
they each require a sensor. This shows that there is no solution
using fewer sensors.
Proof (continued)
We still have to prove that this solution activates all super-towers.
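The strategy can be sketched as follows, assuming the activation graph has already been built from the coordinates and ranges: towers are numbered 0 to n − 1 and an edge (a, b) means that activating a activates b. Kosaraju's two-pass search is used here to find the strongly connected components; it is one standard choice and not necessarily the method intended in the lecture. The function name is hypothetical.

```python
def sensor_towers(n, edges):
    """Return one tower from each component with no incoming edges."""
    adj = [[] for _ in range(n)]
    radj = [[] for _ in range(n)]
    for a, b in edges:
        adj[a].append(b)
        radj[b].append(a)

    # Pass 1: record vertices in order of depth-first search completion.
    order, seen = [], [False] * n
    for root in range(n):
        if seen[root]:
            continue
        seen[root] = True
        stack = [(root, iter(adj[root]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if not seen[w]:
                    seen[w] = True
                    stack.append((w, iter(adj[w])))
                    break
            else:  # all neighbours of v explored
                order.append(v)
                stack.pop()

    # Pass 2: search the reversed graph in decreasing completion order;
    # each search collects exactly one strongly connected component.
    comp, count = [-1] * n, 0
    for root in reversed(order):
        if comp[root] != -1:
            continue
        comp[root] = count
        stack = [root]
        while stack:
            v = stack.pop()
            for w in radj[v]:
                if comp[w] == -1:
                    comp[w] = count
                    stack.append(w)
        count += 1

    # Mark components that have an edge coming in from another component.
    has_incoming = [False] * count
    for a, b in edges:
        if comp[a] != comp[b]:
            has_incoming[comp[b]] = True

    sensors = {}
    for v in range(n):
        if not has_incoming[comp[v]] and comp[v] not in sensors:
            sensors[comp[v]] = v  # any tower of the component will do
    return sorted(sensors.values())


edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5),
         (5, 3), (5, 6), (6, 7), (7, 6)]
print(sensor_towers(8, edges))  # [0]
```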
Definition
Let G = (V, E) be a directed graph, and let n = |V|. A
topological sort of G is a linear ordering (enumeration) of its
vertices σ : V → {1, ..., n} such that if there exists an edge
(v, w) ∈ E then v precedes w in the ordering, i.e., σ(v) < σ(w).
Property
A directed acyclic graph permits a topological sort of its vertices.
Note that the topological sort is not necessarily unique, i.e., there
may be more than one valid topological ordering of the vertices.
Topological Sorting
Algorithm
Maintain:
a list L of vertices, initially empty,
an array D consisting of the in-degrees of the vertices, and
a set S of vertices with no incoming edges.
While set S is non-empty, select a vertex u in the set.
Remove it from S and append it to L.
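The steps listed above stop after appending u to L. A common way to complete the algorithm, assumed here, is to decrement the in-degree of each out-neighbour of u and add it to S when its in-degree reaches zero. The sketch below keeps the names L, D and S from the text; the function name and the 0-based vertex numbering are assumptions.

```python
from collections import deque


def topological_sort(n, edges):
    """Return a topological ordering of vertices 0..n-1, or None if the
    graph contains a cycle."""
    adj = [[] for _ in range(n)]
    D = [0] * n                      # in-degrees
    for v, w in edges:
        adj[v].append(w)
        D[w] += 1
    S = deque(v for v in range(n) if D[v] == 0)  # no incoming edges
    L = []
    while S:
        u = S.popleft()
        L.append(u)
        for w in adj[u]:
            D[w] -= 1                # edge (u, w) is now accounted for
            if D[w] == 0:
                S.append(w)
    return L if len(L) == n else None


print(topological_sort(4, [(0, 1), (0, 2), (1, 3), (2, 3)]))  # [0, 1, 2, 3]
```

Every vertex and every edge is handled once, so the running time is O(|V| + |E|).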
Single Source Shortest Paths
Problem
Instance: a directed graph G = (V, E) with a non-negative weight
w(e) on each edge e ∈ E, and a designated source vertex s ∈ V.
Task: find the weight of the shortest path from s to v for every
v ∈ V.
Note
To find shortest paths from s in an undirected graph, simply
replace each undirected edge with two directed edges in opposite
directions.
Note
There isn’t necessarily a unique shortest path from s to each
vertex.
Dijkstra’s Algorithm
Algorithm Outline
Maintain a set S of vertices for which the shortest path weight has
been found, initially empty. S is represented by a boolean array.
[Figure: worked example on vertices s, u, v, w, z.
Initially S = {s}: d(s) = 0, d(u) = 3, d(v) = 5, d(w) = 7, d(z) = ∞.
After adding u, S = {s, u}: d(v) = 3 + 1 = 4, d(w) = 3 + 3 = 6, d(z) = ∞.
After adding v, S = {s, u, v}: d(w) = 3 + 1 + 1 = 5, d(z) = 3 + 1 + 2 = 6.]
Claim
Suppose v is the next vertex to be added to S. Then d_v is the
length of the shortest path from s to v.
Proof
Dijkstra’s Algorithm: Updates
Question
Earlier, we said that when we add a vertex v to S, we may have to
update some d_z values. What updates could be required?
Answer
If there is an edge from v to z with weight w(v, z), the shortest
known path to z may be improved by taking the shortest path to v
followed by this edge. Therefore we check whether
d_z > d_v + w(v, z),
and if so, we update d_z to d_v + w(v, z).
Claim
If d_z changes as a result of adding v to S, the new shortest known
path to z must have penultimate vertex v, i.e. the last edge must
go from v to z.
Proof
p = s → · · · → v → · · · → u → z.
Dijkstra’s Algorithm: Data Structures
At each stage:
Question
What is the time complexity of this algorithm?
Answer
The first two of these suggest the use of a min-heap, but the
standard heap doesn’t allow us to update arbitrary elements.
Augmented Heaps
Accessing the top of the heap still takes O(1), and popping
the heap still takes O(log n).
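Heap entries (vertex, key) at the top of the heap: (7, 1) at the root, with children (6, 2) and (5, 6).

j (heap position)      1  2  3  4  5  6  7
A[j] (vertex i)        7  6  5  2  4  1  3
d_i for i = A[j]       1  2  6  3  8  7  9

i (vertex)             1  2  3  4  5  6  7
P[i] (heap position)   6  4  7  5  3  2  1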
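One way to realise such a heap is sketched below, keeping the array names A and P from the table. It is an illustration under assumptions rather than the lecture's implementation: indices are 0-based (the table is 1-based), and the class and method names are hypothetical.

```python
class AugmentedHeap:
    """Min-heap over items 0..n-1 keyed by d[i], with a position array
    so that the key of any item can be decreased in O(log n)."""

    def __init__(self, keys):
        self.d = list(keys)
        self.A = list(range(len(keys)))  # A[j]: item stored at heap index j
        self.P = list(range(len(keys)))  # P[i]: heap index of item i (-1 once popped)
        for j in reversed(range(len(self.A) // 2)):
            self._sift_down(j)

    def _swap(self, j, k):
        A, P = self.A, self.P
        A[j], A[k] = A[k], A[j]
        P[A[j]], P[A[k]] = j, k          # keep positions in sync with A

    def _sift_up(self, j):
        while j > 0:
            parent = (j - 1) // 2
            if self.d[self.A[j]] >= self.d[self.A[parent]]:
                break
            self._swap(j, parent)
            j = parent

    def _sift_down(self, j):
        n = len(self.A)
        while True:
            smallest = j
            for child in (2 * j + 1, 2 * j + 2):
                if child < n and self.d[self.A[child]] < self.d[self.A[smallest]]:
                    smallest = child
            if smallest == j:
                return
            self._swap(j, smallest)
            j = smallest

    def decrease_key(self, i, new_key):
        """Lower the key of item i, assumed to still be in the heap."""
        self.d[i] = new_key
        self._sift_up(self.P[i])

    def pop_min(self):
        """Remove and return (item, key) with the smallest key."""
        top = self.A[0]
        last = self.A.pop()
        self.P[top] = -1
        if self.A:
            self.A[0] = last
            self.P[last] = 0
            self._sift_down(0)
        return top, self.d[top]


# Keys of vertices 1..7 from the table, stored as items 0..6.
heap = AugmentedHeap([7, 3, 9, 8, 6, 2, 1])
print(heap.pop_min())        # (6, 1), i.e. vertex 7 with key 1
heap.decrease_key(2, 0)
print(heap.pop_min())        # (2, 0)
```

The position array P is what allows `decrease_key` to locate an arbitrary item without searching the heap.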
Dijkstra’s Algorithm: Augmented Heap
Algorithm
Store the d_i values in an augmented heap of size n.
At each stage:
Question
What is the time complexity of our algorithm?
Answer
Note
In COMP2521/9024, you may have seen that the time complexity
of Dijkstra’s algorithm can be improved to O(m + n log n). This is
true, but it relies on an advanced data structure called the
Fibonacci heap, which has not been taught in this course or any
prior course.
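For comparison, here is a short sketch of Dijkstra's algorithm using Python's standard `heapq` module. Since `heapq` does not support updating arbitrary elements, this version substitutes lazy deletion for the augmented heap: an improved d_z is pushed as a new entry, and stale entries are skipped when popped. The edge weights are an assumption, chosen to agree with the d values in the worked example.

```python
import heapq


def dijkstra(adj, s):
    """Shortest path weights from s.

    adj: dict mapping every vertex to a list of (neighbour, weight)
    pairs, with all weights non-negative.
    """
    d = {v: float("inf") for v in adj}
    d[s] = 0
    S = set()                 # vertices whose shortest path weight is final
    heap = [(0, s)]
    while heap:
        dv, v = heapq.heappop(heap)
        if v in S:
            continue          # stale entry left behind by an earlier update
        S.add(v)
        for z, weight in adj[v]:
            if d[z] > dv + weight:        # the update check from the notes
                d[z] = dv + weight
                heapq.heappush(heap, (d[z], z))
    return d


# Assumed edges, consistent with the worked example.
adj = {
    "s": [("u", 3), ("v", 5), ("w", 7)],
    "u": [("v", 1), ("w", 3)],
    "v": [("w", 1), ("z", 2)],
    "w": [],
    "z": [],
}
print(dijkstra(adj, "s"))  # {'s': 0, 'u': 3, 'v': 4, 'w': 5, 'z': 6}
```

With lazy deletion the heap may hold up to one entry per edge, so each heap operation costs O(log m), which is O(log n) for a graph without parallel edges.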
Minimum Spanning Trees
Definition
A minimum spanning tree T of a connected graph G is a subgraph
of G (with the same set of vertices) which is a tree, and among all
such trees it minimises the total length of all edges in T .
Lemma
Let G be a connected graph with all lengths of edges E of G
distinct, and let S be a non-empty proper subset of the set of all
vertices V of G. Assume that e = (u, v) is an edge such that u ∈ S and
v ∉ S, and that e is of minimal length among all the edges having this
property. Then e must belong to every minimum spanning tree T
of G.
Proof
Assume that there exists a minimum spanning tree T which does
not contain such an edge e = (u, v).
Since T is a spanning tree, it contains a path from u to v. As
u ∈ S and v ∉ S, this path must contain an edge (p, q) with p ∈ S
and q ∉ S.
However, (u, v) is shorter than any other edge with one end in
S and one end outside S, including (p, q).
Replacing the edge (p, q) with the edge (u, v) produces a new
spanning tree T′ with smaller total edge weight, contradicting the
minimality of T.
Claim
Kruskal’s algorithm produces a minimum spanning tree, and if all
weights are distinct, then this minimum spanning tree is
unique.
Proof
We consider the case when all weights are distinct.
Consider an edge e = (u, v ) added in the course of Kruskal’s
algorithm, and let F be the forest in its state before adding e.
Kruskal’s Algorithm
Proof (continued)
The original graph does not contain any edges shorter than e
with one end in S and the other outside S. If such an edge
existed, it would have been considered before e and included
in F , but then both its endpoints would be in S, contradicting
the definition.
Proof (continued)
Any one Union operation might be Θ(n), but the total time
taken by the first k is O(k log k), i.e. each takes ‘on average’
O(log k).
Note
If i is not the representative of any set, then B[i] is zero and the
list L[i] is empty.
Note
The list array L allows us to iterate through the members of one of
the disjoint sets, which is used in the Union operation.
Union-Find
append the list L[j] to the list L[i] and replace L[j] with an
empty list.
Observation
The new value of B[i] is at least twice the old value of B[j].
Observation
Suppose m is an element of the smaller set J, so its label A[m]
changed from j to i.
Then the observation above tells us that B[A[m]] (formerly the old
B[j], now the new B[i]) at least doubled.
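The arrays A, B and L described above can be sketched as follows, together with the usual Kruskal loop (sort the edges, and add an edge whenever its endpoints lie in different sets). This is a minimal illustration with 0-based vertices; the class, method and function names are assumptions.

```python
class UnionFind:
    def __init__(self, n):
        self.A = list(range(n))           # A[m]: representative of m's set
        self.B = [1] * n                  # B[i]: size of set i, or 0 if i is not a representative
        self.L = [[i] for i in range(n)]  # L[i]: members of set i

    def find(self, m):
        return self.A[m]

    def union(self, x, y):
        """Merge the sets containing x and y; return False if already merged."""
        i, j = self.A[x], self.A[y]
        if i == j:
            return False
        if self.B[i] < self.B[j]:
            i, j = j, i                   # ensure j is the smaller set
        for m in self.L[j]:
            self.A[m] = i                 # relabel only the smaller set
        self.B[i] += self.B[j]
        self.B[j] = 0
        self.L[i].extend(self.L[j])
        self.L[j] = []
        return True


def kruskal(n, edges):
    """edges: list of (weight, u, v). Returns (total_weight, tree_edges)."""
    uf = UnionFind(n)
    total, tree = 0, []
    for weight, u, v in sorted(edges):
        if uf.union(u, v):                # endpoints were in different sets
            total += weight
            tree.append((u, v, weight))
    return total, tree


print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]))
```

Because an element is relabelled only when its set at least doubles in size, each element is relabelled O(log n) times in total.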
Problem
Instance: A complete graph G with weighted edges representing
distances between the two vertices.
Solution
Sort the edges in increasing order of length and run the usual
Kruskal’s algorithm for building a minimum spanning tree, but stop
when you obtain k connected components, rather than a single
spanning tree.
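A self-contained sketch of this procedure, assuming 1 ≤ k ≤ n and edges given as (distance, u, v) triples for every pair of vertices. The function name is hypothetical, and the set merging is a compact form of union by size.

```python
def k_clustering(n, edges, k):
    """Merge closest components until exactly k remain; return the clusters."""
    label = list(range(n))             # label[v]: representative of v's cluster
    members = [[v] for v in range(n)]  # members[i]: vertices in cluster i
    components = n
    for dist, u, v in sorted(edges):
        if components == k:
            break                      # stop early, unlike full Kruskal
        i, j = label[u], label[v]
        if i == j:
            continue                   # edge lies inside one cluster
        if len(members[i]) < len(members[j]):
            i, j = j, i                # merge the smaller cluster into the larger
        for m in members[j]:
            label[m] = i
        members[i].extend(members[j])
        members[j] = []
        components -= 1
    return [sorted(c) for c in members if c]


# Six points on a line, with distances between every pair.
pos = [0, 1, 2, 10, 11, 20]
edges = [(abs(pos[u] - pos[v]), u, v)
         for u in range(6) for v in range(u + 1, 6)]
print(k_clustering(6, edges, 3))  # [[0, 1, 2], [3, 4], [5]]
```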
k-clustering of maximum spacing
Proof of optimality
Puzzle
Problem
Bob is visiting Elbonia and wishes to send his teddy bear to Alice,
who is staying at a different hotel. Both Bob and Alice have boxes
like the one illustrated above, as well as padlocks which can be
used to lock the boxes.
However, there is a problem. The Elbonian postal service
mandates that when a nonempty box is sent, it must be locked.
Also, they do not allow keys to be sent, so the key must remain
with the sender. Finally, you can send padlocks only if they are
locked. How can Bob safely send his teddy bear to Alice?
Hint
The way in which the boxes are locked (via a padlock) is
important. It is also crucial that both Bob and Alice have padlocks
and boxes. They can also communicate over the phone to agree on
the strategy.
There are two possible solutions; one can be called the “AND”
solution, the other the “OR” solution. The “AND” solution requires
four one-way mail services, while the “OR” solution requires only
two.
That’s All, Folks!!