Van Emde Boas Trees
• Name: Tree(n)
• Operations:
insert(x) Add x to T.
delete(x) Remove x from T.
findsucc(x) Return the smallest element in T that is ≥ x.
Note that the translation from i to a, b (and conversely) is simple and
expedient, since a and b are, respectively, the most significant and the
least significant half of the bits in the binary representation of i (√n is
rounded to a power of 2).
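The translation can be illustrated with two bit operations. The following is a minimal sketch, assuming 0-based indices and a hypothetical parameter low_bits that gives the number of bits in the least significant half; the function names split and join are chosen here for illustration only.

```python
def split(i, low_bits):
    """Return (a, b): the high and low halves of the bits of i."""
    a = i >> low_bits                 # most significant half
    b = i & ((1 << low_bits) - 1)     # least significant half
    return a, b


def join(a, b, low_bits):
    """Inverse of split: rebuild i from its two halves."""
    return (a << low_bits) | b


# 44 is 101100 in binary; with low_bits = 3 the halves are 101 and 100.
print(split(44, 3))    # (5, 4)
print(join(5, 4, 3))   # 44
```

Both directions take constant time on a machine with word operations, which is what the recursive structure relies on.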
Tree(n) uses an additional version of Tree(√n) called top. top contains
a if and only if bottom[a] is nonempty. It is now possible to implement
an operation of Tree(n) in constant time plus the time for calling a single
operation in Tree(√n) (implying O(log log n) time per operation).
This method may be viewed as binary search on the path from the root
to a leaf in a balanced binary tree. Initially, we spend constant time deciding
whether to call a top operation (upper half of path) or a bottom operation
(lower half of path). This is followed by the recursive call.
In this way, findsucc can be done in time O(log log n). It requires more
sophistication to obtain the same bound for delete and insert, but a partial
implementation is sketched in data structure 1.
Note that max and min may be updated in time O(1) using the updated
values of max and min from top and bottom.
In the specific case when we insert the first element in a bottom tree or
delete the last element from a bottom tree, we need in addition to insert,
respectively remove, an element from the top tree. This makes a worst case
recurrence for time usage of T(n) = 2T(√n) + O(1) with the solution T(n) =
O(log n). To obtain a better result, we must modify the implementation to
allow insertion of the first element into an empty tree in constant time and
to allow deletion of the last element from a tree in constant time, i.e. we
should avoid recursive calls. It is possible to modify the implementation in
the suggested manner and still make it work. In every nonempty tree we keep
one element (say the minimum) out of the recursive structure. The details are
shown in data structure 2, and it leads to the recurrence T(n) = T(√n) + O(1)
that has the solution T(n) = O(log log n).
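The gap between the two recurrences can be checked numerically. Taking the square root of n halves the number of bits of n, so with m = log n the recurrences become S(m) = 2S(m/2) + O(1) and S(m) = S(m/2) + O(1), with solutions O(m) and O(log m). The sketch below assumes n = 2^bits with bits a power of 2, a constant cost of 1 per level, and a base case of one bit; the function names are hypothetical.

```python
def two_calls(bits):
    """Cost under T(n) = 2 T(sqrt n) + 1, measured in bits = log2(n)."""
    return 1 if bits <= 1 else 2 * two_calls(bits // 2) + 1


def one_call(bits):
    """Cost under T(n) = T(sqrt n) + 1, measured in bits = log2(n)."""
    return 1 if bits <= 1 else one_call(bits // 2) + 1


for bits in (4, 16, 64):
    print(bits, two_calls(bits), one_call(bits))
# 4 7 3
# 16 31 5
# 64 127 7
```

The first column of costs grows linearly in the number of bits (that is, as log n), the second only as the logarithm of the number of bits (log log n).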
Note that we might also have kept max out of the recursive structure by
a straightforward generalisation of data structure 2.
The implementation is described recursively, but for small n the set should
be represented directly in a bitstring or a red-black tree.
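The scheme with the minimum kept out of the recursive structure might be sketched as follows. This is not the data structure 2 of the text but an illustration under stated assumptions: the universe is {0, ..., 2^bits - 1} (0-based, unlike the text), a universe of two elements is represented directly by min and max, and top and the bottom trees are created on demand in a Python dict (expected constant time per lookup). The class name VEB and all attribute names are chosen here.

```python
class VEB:
    """Sketch of Tree(n) over the assumed universe {0, ..., 2**bits - 1}."""

    def __init__(self, bits):
        self.bits = bits
        self.min = None   # smallest element, kept OUT of the recursive structure
        self.max = None   # largest element (stored recursively unless equal to min)
        if bits > 1:
            self.lo_bits = bits // 2   # number of bits of b
            self.top = None            # Tree over the a-values, created on demand
            self.bottom = {}           # a -> Tree over the b-values, created on demand

    def _split(self, x):
        return x >> self.lo_bits, x & ((1 << self.lo_bits) - 1)

    def _join(self, a, b):
        return (a << self.lo_bits) | b

    def insert(self, x):
        if self.min is None:               # empty tree: O(1), no recursion
            self.min = self.max = x
            return
        if x == self.min or x == self.max:
            return                         # already present
        if x < self.min:
            x, self.min = self.min, x      # old minimum moves into the structure
        if x > self.max:
            self.max = x
        if self.bits > 1:
            a, b = self._split(x)
            child = self.bottom.get(a)
            if child is None:
                child = self.bottom[a] = VEB(self.lo_bits)
            if child.min is None:          # first element of bottom[a]:
                if self.top is None:
                    self.top = VEB(self.bits - self.lo_bits)
                self.top.insert(a)         # recurse on top; child.insert is then O(1)
            child.insert(b)

    def findsucc(self, x):
        """Smallest element >= x, or None if there is none."""
        if self.min is None or x > self.max:
            return None
        if x <= self.min:
            return self.min
        if self.bits == 1:
            return self.max
        a, b = self._split(x)
        child = self.bottom.get(a)
        if child is not None and child.min is not None and b <= child.max:
            return self._join(a, child.findsucc(b))   # answer lies in bottom[a]
        nxt = self.top.findsucc(a + 1)                # next nonempty bottom tree
        return self._join(nxt, self.bottom[nxt].min)

    def delete(self, x):
        if self.min is None or x < self.min or x > self.max:
            return
        if self.min == self.max:           # last element: O(1), no recursion
            if x == self.min:
                self.min = self.max = None
            return
        if self.bits == 1:                 # the set is {0, 1}; keep the other one
            self.min = self.max = 1 - x
            return
        if x == self.min:                  # promote the next element to min,
            a = self.top.min               # then remove it from the structure
            x = self._join(a, self.bottom[a].min)
            self.min = x
        a, b = self._split(x)
        child = self.bottom.get(a)
        if child is None or child.min is None:
            return                         # x is not in the set
        child.delete(b)
        if child.min is None:              # child.delete was O(1) in this case
            self.top.delete(a)
        if x == self.max:                  # recompute max from top and bottom
            if self.top.min is None:
                self.max = self.min
            else:
                a = self.top.max
                self.max = self._join(a, self.bottom[a].max)
```

A short usage example:

```python
t = VEB(4)                     # universe {0, ..., 15}
for x in (2, 3, 4, 5, 7, 14, 15):
    t.insert(x)
print(t.findsucc(6))           # 7
t.delete(7)
print(t.findsucc(6))           # 14
```

In each operation at most one of the recursive calls does more than constant work, which is the property behind the recurrence T(n) = T(√n) + O(1).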
In practice we start with a well-defined tree structure that is either empty
or completely full. Either possibility can be handled in time O(1) using lazy
initialization.
We have restricted the description to the operations insert, delete and
findsucc, but have treated these in detail. These operations are fundamental,
and it is easy to implement additional operations such as deletemin, which
equals delete(findsucc(1)), and decreasekey, which may be implemented by
combining delete and insert.
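The two derived operations can be written against any object offering insert, delete and findsucc. The sketch below uses a hypothetical stand-in class SortedListTree, backed by a sorted Python list, only to make the example runnable; it has the same interface as Tree(n) but not its time bounds. The function signatures are assumptions made here.

```python
import bisect


class SortedListTree:
    """Hypothetical stand-in for Tree(n); NOT a van Emde Boas tree."""

    def __init__(self):
        self.items = []   # sorted list of the elements

    def insert(self, x):
        i = bisect.bisect_left(self.items, x)
        if i == len(self.items) or self.items[i] != x:
            self.items.insert(i, x)

    def delete(self, x):
        i = bisect.bisect_left(self.items, x)
        if i < len(self.items) and self.items[i] == x:
            del self.items[i]

    def findsucc(self, x):
        i = bisect.bisect_left(self.items, x)
        return self.items[i] if i < len(self.items) else None


def deletemin(tree):
    """delete(findsucc(1)): remove and return the smallest element."""
    smallest = tree.findsucc(1)   # the universe starts at 1, as in the text
    if smallest is not None:
        tree.delete(smallest)
    return smallest


def decreasekey(tree, old, new):
    """Replace the element old by the smaller element new."""
    tree.delete(old)
    tree.insert(new)


t = SortedListTree()
for x in (5, 9, 12):
    t.insert(x)
decreasekey(t, 9, 3)
print(deletemin(t))   # 3
print(deletemin(t))   # 5
```

Each derived operation makes a constant number of calls to the basic operations, so it inherits their time bound.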
This ordered dictionary has several applications (see the problem set).
We describe one of them here, namely Union-Split-Find on intervals.
Union-Split-Find on intervals.
The usual union-find data structure is very efficient for representing a
partitioning of a set into classes with operations for uniting two classes and for
getting the name of the class containing some given element. If one wants
an extra operation for splitting a class into two classes, it is necessary to
define more precisely what such a split operation should do. A class with
k elements may be split in 2^k − 2 different ways. For a survey of different
union-split-find problems and known solutions, see [1].
We present here a very efficient solution (time O(log log n) per operation)
for a variant of union-split-find on intervals, i.e. we consider the following
datatype:
Union-Split-Find:
• Name: Interval(n)
• Operations:
find(x): Returns the name of the interval containing x.
union(x): Unites the interval containing x with the immediately following
interval.
split(x): The interval I containing x is split into two intervals I ∩ [1; x]
and I ∩ [x + 1; n].
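One way to realise this datatype on top of an ordered dictionary is to store the right endpoint of every interval, so that find becomes findsucc, split becomes insert, and union becomes delete. The sketch below makes several assumptions not stated in the text: an interval is named by its largest element, the structure starts as the single interval [1; n], and a sorted Python list with bisect stands in for Tree(n) (so the O(log log n) bound does not apply to this illustration).

```python
import bisect


class Interval:
    """Sketch of Interval(n); a sorted list stands in for Tree(n)."""

    def __init__(self, n):
        self.n = n
        self.ends = [n]   # right endpoints; initially one interval [1; n]

    def find(self, x):
        # findsucc(x): the right endpoint of the interval containing x
        return self.ends[bisect.bisect_left(self.ends, x)]

    def union(self, x):
        end = self.find(x)
        if end != self.n:             # the last interval has no successor
            self.ends.remove(end)     # delete(findsucc(x))

    def split(self, x):
        if self.find(x) != x:         # otherwise I ∩ [x + 1; n] would be empty
            bisect.insort(self.ends, x)   # insert(x)


s = Interval(10)
s.split(4)                       # [1; 4] [5; 10]
s.split(7)                       # [1; 4] [5; 7] [8; 10]
print(s.find(3), s.find(5))      # 4 7
s.union(2)                       # [1; 7] [8; 10]
print(s.find(3), s.find(8))      # 7 10
```

Every operation uses a constant number of ordered-dictionary operations, so replacing the sorted list with Tree(n) would give the time bound claimed in the text.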
References
[1] Zvi Galil and Giuseppe F. Italiano. Data structures and algorithms for disjoint
set union problems. ACM Computing Surveys (CSUR) 23(3) (1991), 319–344.
[2] P. van Emde Boas. Preserving order in a forest in less than logarithmic time
and linear space. Inform. Process. Lett. 6(3) (1977), 80–82.
[3] P. van Emde Boas, R. Kaas, and E. Zijlstra. Design and implementation of
an efficient priority queue. Math. Systems Theory 10(2) (1976/77), 99–127.
Sixteenth Annual Symposium on Foundations of Computer Science (Berkeley,
Calif., 1975), selected papers.