AUU Lec4
Online load balancing is another classical setting. The setting itself is simple. We have a set of 'machines', which we denote by $M$, with $|M| = m$. At every round, a job $j$ arrives and we learn its 'size' $p_j$; let $J$ denote the set of jobs (since this is an online setting, the set keeps growing, but we keep notation simple). The task is to immediately assign the job to one of the machines. If job $j$ is assigned to machine $i$, it puts a 'load' of $p_j$ on machine $i$. Formally, we need to build an assignment $\sigma : J \to M$, and we define $\mathrm{load}_\sigma(i) = \sum_{j : \sigma(j) = i} p_j$. The load balancing (also known as makespan minimization) problem seeks an assignment $\sigma$ which minimizes the maximum load across all machines. Formally, the problem is
$$\min_{\sigma} \max_{i \in M} \mathrm{load}_\sigma(i).$$
1 Identical Machines
This is the simplest possible setting: every job $j \in J$ can be assigned to every machine $i \in M$, and it puts a load of $p_j$ on whichever machine it is assigned to. We use a very simple greedy algorithm: when a job $j$ arrives, assign it to the machine which has the least current load.
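This rule can be written in a few lines (a sketch, not from the lecture; the function name and the heap-based bookkeeping are our own choices):

```python
import heapq

def greedy_identical(jobs, m):
    """Online greedy for identical machines: assign each arriving job
    to the currently least loaded machine. Returns the final makespan."""
    heap = [(0, i) for i in range(m)]   # (current load, machine index)
    heapq.heapify(heap)
    for p in jobs:
        load, i = heapq.heappop(heap)   # least loaded machine
        heapq.heappush(heap, (load + p, i))
    return max(load for load, _ in heap)
```

For example, with $m = 3$, six unit jobs followed by one job of size 3 give greedy a makespan of 5, while the optimum is 3 (the big job alone on one machine), matching the $(2 - 1/m)$ ratio proved below.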
Theorem 1. Greedy is $(2 - 1/m)$-competitive for identical machines.

Proof. We will first figure out two good lower bounds for the (offline) optimal solution $opt$.
1. $opt \ge p_{\max} = \max_{j \in J} p_j$: this is clearly true since the job with maximum size must be assigned to some machine in any solution.
2. $opt \ge \frac{\sum_{j \in J} p_j}{m}$: this is true since there are $m$ machines, and in any solution the maximum load has to be at least the average load across all machines.
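Both bounds are trivial to compute online, which is handy for sanity checks (a sketch; the helper name is ours):

```python
def opt_lower_bound(jobs, m):
    """max(largest job, average load) lower-bounds the optimal makespan."""
    return max(max(jobs), sum(jobs) / m)
```

On six unit jobs plus one job of size 3 with $m = 3$, this returns 3, which happens to equal the true optimum on that instance.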
Now let us consider the machine which has the highest load in greedy, call it $i^\star$, and let $j^\star$ be the last job assigned to $i^\star$ by the greedy algorithm. By the greedy nature of the algorithm, $i^\star$ was the least loaded machine when $j^\star$ was assigned to it. Moreover, the load on the least loaded machine can be at most $\frac{1}{m}\sum_{j \in J \setminus \{j^\star\}} p_j$, that is, at most the average over all jobs except $j^\star$. So the total load on $i^\star$ after assigning $j^\star$ is
$$\frac{\sum_{j \in J \setminus \{j^\star\}} p_j}{m} + p_{j^\star} = \frac{\sum_{j \in J} p_j}{m} + \left(1 - \frac{1}{m}\right) p_{j^\star} \le opt + \left(1 - \frac{1}{m}\right) opt = \left(2 - \frac{1}{m}\right) opt,$$
where the inequality follows by applying the two lower bounds on $opt$.
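As a sanity check of the $(2 - 1/m)$ guarantee, one can brute-force the optimum on tiny instances and compare (a sketch; the brute force is exponential-time and for illustration only):

```python
from itertools import product

def greedy_makespan(jobs, m):
    """Greedy list scheduling: each job goes to the least loaded machine."""
    loads = [0] * m
    for p in jobs:
        i = min(range(m), key=lambda k: loads[k])
        loads[i] += p
    return max(loads)

def opt_makespan(jobs, m):
    """Exact optimum by trying every assignment (tiny inputs only)."""
    best = float("inf")
    for assign in product(range(m), repeat=len(jobs)):
        loads = [0] * m
        for p, i in zip(jobs, assign):
            loads[i] += p
        best = min(best, max(loads))
    return best
```

On the instance $[3, 1, 4, 1, 5, 2]$ with $m = 3$, greedy achieves makespan 7 against an optimum of 6, safely within the $(2 - 1/3) \cdot 6 = 10$ guarantee.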
2 Restricted Assignment
Next we consider a setting which is more general than the identical machines case. It is no longer true that every job can be assigned to every machine; rather, each job $j$ specifies a subset of machines $S_j \subseteq M$ on which it can be assigned. Everything else remains the same as before.
First we prove a rather strong negative result.
Theorem 2. Any deterministic online algorithm is $\Omega(\log m)$-competitive for restricted assignment.
Proof. We use the standard adaptive adversary to defeat any deterministic online algorithm. Let $m = 2^k$ for some integer $k$ (this assumption is not entirely without loss of generality, but the proof can easily be adapted). Name the machines $i_1, i_2, \ldots, i_m$ in arbitrary order.

The adversary sends $m - 1$ jobs in total, in batches. In the first batch it sends $m/2$ jobs, where the $p$th job can be assigned only to machines $i_{2p-1}$ and $i_{2p}$. (Note that here, crucially, we are using the fact that this is a restricted assignment setting; such an adversary cannot be created for the identical machines case.) Assume without loss of generality that the algorithm assigns the $p$th job to $i_{2p-1}$; this is without loss of generality since we can simply rename the machines accordingly, so that after the batch the loaded machines are $i_1, \ldots, i_{m/2}$. The adversary now sends the second batch of $m/4$ jobs: the $q$th job can go only to machines $i_{2q-1}$ and $i_{2q}$, for $q = 1, \ldots, m/4$; that is, the adversary again pairs up exactly the machines that already carry load. Again the algorithm must assign each job to one of its two allowed machines, and w.l.o.g. (renaming once more) the loaded machines are $i_1, \ldots, i_{m/4}$, each now with load 2. This process can be repeated $k = \log m$ times (see figure for a succinct description of the procedure with 8 machines). Hence, at the end of all batches, the load on at least one machine is $k$.

On the other hand, the offline optimal solution, which knows the sequence of jobs beforehand, can assign each job to the machine of its pair that receives no further jobs, achieving a load of just 1 on each machine. Thus the competitive ratio is at least $\log m$, proving the theorem.
[Figure: the adversarial construction, comparing the final loads of ALG and OPT.]
The above result in fact holds even for randomized algorithms, which is rather disappointing, but intriguing!
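The construction is easy to simulate when the deterministic algorithm is greedy with ties broken toward the lower-indexed machine (a sketch; against an arbitrary algorithm the adversary would instead rename machines after each batch, as in the proof):

```python
def run_adversary(m):
    """Adversary from the proof, run against greedy that breaks ties
    toward the lower-indexed machine (m must be a power of two).
    Returns (greedy makespan, makespan of the adversary's own schedule)."""
    loads = [0] * m          # greedy's loads
    off_loads = [0] * m      # offline schedule built alongside
    active = list(range(m))  # machines the adversary keeps targeting
    while len(active) > 1:
        survivors = []
        for q in range(len(active) // 2):
            a, b = active[2 * q], active[2 * q + 1]
            # this unit job is allowed only on machines a and b
            pick = a if loads[a] <= loads[b] else b
            loads[pick] += 1
            survivors.append(pick)
            # offline puts the job on the other machine, which is
            # never targeted again, so it ends with load exactly 1
            other = b if pick == a else a
            off_loads[other] += 1
        active = survivors
    return max(loads), max(off_loads)
```

With $m = 8$ this returns a greedy makespan of $3 = \log_2 8$ against an offline makespan of 1.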
Now we prove a matching upper bound result. We consider a natural extension of the greedy algorithm
for restricted assignments. When job j arrives, we assign it to the least loaded machine in the subset Sj .
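A sketch of this variant (assuming each $S_j$ is given as a list of allowed machine indices; the names are ours):

```python
def greedy_restricted(jobs, m):
    """jobs: list of (p_j, S_j) pairs in arrival order, where S_j is the
    list of machines allowed for job j. Returns the final load vector."""
    loads = [0] * m
    for p, allowed in jobs:
        i = min(allowed, key=lambda k: loads[k])  # least loaded in S_j
        loads[i] += p
    return loads
```

For instance, a job restricted to machines {0, 1} followed by one restricted to {0} alone forces both onto machine 0 if the first job was placed there.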
Theorem 3. Greedy is (2 log m + 2)-competitive for restricted assignments.
We will prove the above theorem in a simpler setting where each job has size 1. Already this captures most of the non-trivial ideas of the proof; the general case is left as one of the homeworks.
Proof. Let $\lambda^\star$ denote the makespan of the offline optimal solution. Consider the machine with the highest load under greedy, call it $i_1$, and suppose for the sake of contradiction that its load is $L > (2\log m + 2)\lambda^\star$. Let us focus our attention on the last $2\lambda^\star$ jobs assigned to $i_1$; since $L > 2\lambda^\star$, such a set exists. Call this set $J_1$. Here is a simple but crucial observation. Since $\lambda^\star$ is the maximum load in the optimal solution, opt can place at most $\lambda^\star$ of the $2\lambda^\star$ jobs of $J_1$ on $i_1$, so there must be at least one other machine, say $i_2$, to which opt assigns some jobs of $J_1$; call this subset $J_1'$. Consider the time when the first job from $J_1'$ was assigned by greedy to machine $i_1$. Why did greedy do this? Because $i_2$, which is an allowed machine for that job, must have had load at least as large as $i_1$ at that moment. But the load on $i_1$ was already at least $L - 2\lambda^\star$, and hence $i_2$ must also have had at least this much load. We have thus proven the existence of two machines, each of which has load at least $L - 2\lambda^\star$ in the greedy schedule.
Let us try to repeat this process. We now have two machines, $i_1$ and $i_2$, each with sufficiently large load. Consider the jobs responsible for increasing the load on each of these machines from $L - 4\lambda^\star$ to $L - 2\lambda^\star$; there are $4\lambda^\star$ such jobs in total. Again it can be argued that opt needs at least 4 machines to schedule all these jobs, since these are $4\lambda^\star$ unit-size jobs and otherwise some machine would have load larger than $\lambda^\star$; and by the same argument as before, each of these machines has load at least $L - 4\lambda^\star$ under greedy.
The moral of the story is that each time we repeat the above argument, the number of heavily loaded machines doubles. So the question is: how many times $r$ can we repeat the process? We can do so as long as
$$L - 2r\lambda^\star \ge 2\lambda^\star,$$
which allows
$$r = \frac{L - 2\lambda^\star}{2\lambda^\star}$$
repetitions. Now recall our assumption that $L > (2\log m + 2)\lambda^\star$. Plugging this in gives
$$r > \log m.$$
But this means greedy has more than $2^{\log m} = m$ heavily loaded machines, a clear contradiction. Hence $L$ must be at most $(2\log m + 2)\lambda^\star$, which concludes the proof.
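For unit-size jobs the bound is easy to sanity-check by brute-forcing opt on a tiny instance (a sketch; the instance below is ours, and the exponential brute force is for illustration only):

```python
from itertools import product
from math import log2

def greedy_restricted_makespan(jobs, m):
    """Unit-size jobs: each entry of jobs is the allowed-machine list S_j."""
    loads = [0] * m
    for allowed in jobs:
        i = min(allowed, key=lambda k: loads[k])  # least loaded in S_j
        loads[i] += 1
    return max(loads)

def opt_restricted_makespan(jobs, m):
    """Exact optimum by trying every allowed assignment (tiny inputs only)."""
    best = float("inf")
    for assign in product(*jobs):   # pick one allowed machine per job
        loads = [0] * m
        for i in assign:
            loads[i] += 1
        best = min(best, max(loads))
    return best

m = 4
jobs = [[0, 1], [0, 1], [0, 2], [0, 2], [1, 3], [2, 3]]
alg = greedy_restricted_makespan(jobs, m)
opt = opt_restricted_makespan(jobs, m)
assert alg <= (2 * log2(m) + 2) * opt
```

On this instance both greedy and opt achieve makespan 2, comfortably within the $(2\log m + 2)$ guarantee.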