Algorithm Lecture
Computational Geometry1
David M. Mount
Department of Computer Science
University of Maryland
Fall 2002
1 Copyright, David M. Mount, 2002, Dept. of Computer Science, University of Maryland, College Park, MD, 20742. These lecture notes were
prepared by David Mount for the course CMSC 754, Computational Geometry, at the University of Maryland. Permission to use, copy, modify, and
distribute these notes for educational purposes and without fee is hereby granted, provided that this copyright notice appear in all copies.
The measure of the quality of an algorithm in computational geometry has traditionally been its asymptotic
worst-case running time. Thus, an algorithm running in O(n) time is better than one running in O(n log n) time,
which is better than one running in O(n^2) time. (This particular problem can be solved in O(n^2 log n) time
by a fairly simple algorithm, in O(n log n) time by a relatively complex algorithm, and it can be approximated quite
well by an algorithm whose running time is O(n log n).) In some cases average-case running time is considered
instead. However, for many types of geometric inputs it is difficult to define input distributions that are both
easy to analyze and representative of typical inputs.
There are many fields of computer science that deal with solving problems of a geometric nature. These include
computer graphics, computer vision and image processing, robotics, computer-aided design and manufacturing,
computational fluid-dynamics, and geographic information systems, to name a few. One of the goals of com-
putational geometry is to provide the basic geometric tools from which application areas can then build their applications.
Convex Hulls: Convexity is a very important geometric property. A geometric set is convex if for every two
points in the set, the line segment joining them is also in the set. One of the first problems identified in
the field of computational geometry is that of computing the smallest convex shape, called the convex hull,
that encloses a set of points.
Intersections: One of the most basic geometric problems is that of determining when two sets of objects in-
tersect one another. Determining whether complex objects intersect often reduces to determining which
individual pairs of primitive entities (e.g., line segments) intersect. We will discuss efficient algorithms for
computing the intersections of a set of line segments.
Triangulation and Partitioning: Triangulation is a catchword for the more general problem of subdividing a
complex domain into a disjoint collection of “simple” objects. The simplest region into which one can
decompose a planar object is a triangle (a tetrahedron in 3-d and simplex in general). We will discuss
how to subdivide a polygon into triangles and later in the semester discuss more general subdivisions into
trapezoids.
Low-dimensional Linear Programming: Many optimization problems in computational geometry can be stated
in the form of a linear programming problem: find the extreme point (e.g., the highest or lowest point) that
satisfies a collection of linear inequalities. Linear programming is an important problem in combinatorial
optimization, and people often need to solve such problems in hundreds to perhaps thousands of dimensions.
However, there are many interesting problems (e.g., finding the smallest disc enclosing a set of points)
that can be posed as low-dimensional linear programming problems. In low-dimensional spaces, very
simple and efficient solutions exist.
Line arrangements and duality: Perhaps one of the most important mathematical structures in computational
geometry is that of an arrangement of lines (or generally the arrangement of curves and surfaces). Given
n lines in the plane, an arrangement is just the graph formed by considering the intersection points as
vertices and line segments joining them as edges. We will show that such a structure can be constructed in
O(n^2) time. The reason that this structure is so important is that many problems involving points can be
transformed into problems involving lines by a method of duality. For example, suppose that you want to
determine whether any three points of a planar point set are collinear. This could be determined in O(n^3)
time by brute-force checking of each triple. However, if the points are dualized into lines, then (as we will
see later this semester) this reduces to the question of whether there is a vertex of degree greater than 4 in
the arrangement.
Voronoi Diagrams and Delaunay Triangulations: Given a set S of points in space, one of the most important
problems is the nearest neighbor problem. Given a point that is not in S which point of S is closest to it?
One of the techniques used for solving this problem is to subdivide space into regions, according to which
point is closest. This gives rise to a geometric partition of space called a Voronoi diagram.
Search: Geometric search problems are of the following general form. Given a data set (e.g. points, lines,
polygons) which will not change, preprocess this data set into a data structure so that some type of query
can be answered as efficiently as possible. For example, a nearest neighbor search query is: determine the
point of the data set that is closest to a given query point. A range query is: determine the set of points (or
count the number of points) from the data set that lie within a given region. The region may be a rectangle,
disc, or polygonal shape, like a triangle.
Fixed-Radius Near Neighbor Problem: As a warm-up exercise for the course, we begin by considering one of the
oldest results in computational geometry. This problem was considered back in the mid-1970s, and is a funda-
mental problem involving a set of points in dimension d. We will consider the problem in the plane, but the
generalization to higher dimensions will be straightforward.
We assume that we are given a set P of n points in the plane. As will be our usual practice, we assume that each
point p is represented by its (x, y) coordinates, denoted (px , py ). The Euclidean distance between two points p
and q, denoted |pq|, is
|pq| = sqrt((px − qx)^2 + (py − qy)^2).
Given the set P and a distance r > 0, our goal is to report all pairs of distinct points p, q ∈ P such that |pq| ≤ r.
This is called the fixed-radius near neighbor (reporting) problem.
Reporting versus Counting: We note that this is a reporting problem, which means that our objective is to report all
such pairs. This is in contrast to the corresponding counting problem, in which the objective is to return a count
of the number of pairs satisfying the distance condition.
It is usually easier to solve reporting problems optimally than counting problems. This may seem counterin-
tuitive at first (after all, if you can report, then you can certainly count). The reason is that we know that any
algorithm that reports some number k of pairs must take at least Ω(k) time. Thus if k is large, a reporting
algorithm has the luxury of being able to run for a longer time and still claim to be optimal. In contrast, we
cannot apply such a lower bound to a counting algorithm.
Now, for i running from 1 to n, we consider the successive points xi+1 , xi+2 , xi+3 , and so on, until we first find
a point whose distance exceeds r. We report xi together with all succeeding points that are within distance r.
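The sorted-scan approach just described can be sketched as follows (a minimal Python sketch; the function and variable names are illustrative):

```python
def fixed_radius_pairs_sorted(points, r):
    """Report all pairs (x, x') with x < x' and x' - x <= r by sorting
    the points and scanning forward from each one."""
    xs = sorted(points)
    pairs = []
    for i, x in enumerate(xs):
        j = i + 1
        # Scan successors until the first point whose distance exceeds r.
        while j < len(xs) and xs[j] - x <= r:
            pairs.append((x, xs[j]))
            j += 1
    return pairs
```

The sort costs O(n log n) and the scan costs O(n + k), matching the analysis that follows.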
The running time of this algorithm involves the O(n log n) time needed to sort the points and the time required
for distance computations. Let ki denote the number of pairs generated when we visit pi . Observe that the
processing of pi involves ki + 1 distance computations (one additional computation for the first point whose distance
exceeds r). Thus, up to constant factors, the total running time is:
T(n, k) = n log n + sum_{i=1}^{n} (ki + 1) = n log n + n + sum_{i=1}^{n} ki
        = n log n + n + k = O(k + n log n).
This is close to the O(k + n) time we were hoping for. It seems that any approach based on sorting is doomed to
take at least Ω(n log n) time. So, if we are to improve upon this, we cannot sort. But is sorting really necessary?
Let us consider an approach based on bucketing.
1-dimensional Solution with Bucketing: Rather than sorting the points, suppose that we subdivide the line into in-
tervals of length r. In particular, we can take the line to be composed of an infinite collection of half-open
intervals:
. . . , [−3r, −2r), [−2r, −r), [−r, 0), [0, r), [r, 2r), [2r, 3r), . . .
In general, interval b is [br, (b + 1)r).
A point x lies in the interval b = ⌊x/r⌋. Thus, in O(n) time we can associate the n points of P with a set
of n integer bucket indices bi . Although there are an infinite number of intervals, at most n will be occupied,
meaning that they contain a point of P .
There are a number of ways to organize the occupied buckets. They could be sorted, but then we are back to
O(n log n) time. Since bucket indices are integers, a better approach is to store the occupied bucket indices
in a hash table. To avoid duplicates, we need only report pairs (x, x′) where x′ > x. The key question in determining the time
complexity of this algorithm is how many distance computations are performed in step (2b). We compare each
point in bucket b with all the points in buckets b and b + 1. However, not all of these distance computations
will result in a pair of points whose distance is within r. Might it be that we waste a great deal of time
performing computations for which we receive no benefit? The lemma below shows that we perform no more
than a constant factor times as many distance computations as pairs produced.
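A minimal sketch of the bucketing algorithm, using a Python dict as the hash table of occupied buckets (the function name is illustrative, and distinct coordinates are assumed); the inner comparison loop plays the role of step (2b):

```python
from collections import defaultdict
from math import floor

def fixed_radius_pairs_bucketed(points, r):
    """1-d fixed-radius reporting via bucketing: each point in bucket b
    is compared against the points in buckets b and b+1, and pairs
    (x, x') with x' > x and x' - x <= r are reported."""
    buckets = defaultdict(list)          # bucket index -> points in it
    for x in points:
        buckets[floor(x / r)].append(x)  # bucket the points in O(n) time
    pairs = []
    for b, pts in buckets.items():
        for x in pts:
            # Compare with candidates in buckets b and b+1 (step (2b)).
            for x2 in pts + buckets.get(b + 1, []):
                if x2 > x and x2 - x <= r:
                    pairs.append((x, x2))
    return pairs
```

No sorting is performed; with hashing, the expected running time is O(n + k), as the lemma below establishes.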
It will simplify things considerably if, rather than counting distinct pairs of points, we simply count all (ordered)
pairs of points that lie within distance r of each other. Thus each pair of points will be counted twice, (p, q) and
(q, p). Note that this includes reporting each point as a pair (p, p) as well, since each point is within distance r
of itself. This does not affect the asymptotic bounds, since the number of distinct pairs is smaller by a factor of
roughly 1/2.
Lemma: Let k denote the number of (not necessarily distinct) pairs of points of P that are within distance r of
each other. Let D denote the total number of distance computations made in step (2b) of the above algorithm.
Then D ≤ 2k.
Proof: We will make use of the following inequality in the proof:
xy ≤ (x^2 + y^2)/2.
This follows by expanding the obvious inequality (x − y)^2 ≥ 0.
Let B denote the (infinite) set of buckets. For any bucket b ∈ B, let b + 1 denote its successor bucket on
the line, and let nb denote the number of points of P in b. Define
S = sum_{b∈B} nb^2.
First we bound the total number of distance computations D as a function of S. Each point in bucket b
computes the distance to every other point in bucket b and every point in bucket b + 1, and hence
D = sum_{b∈B} nb (nb + nb+1) = sum_{b∈B} nb^2 + sum_{b∈B} nb nb+1
  ≤ sum_{b∈B} nb^2 + sum_{b∈B} (nb^2 + nb+1^2)/2 = 2S,
where the last step applies the inequality xy ≤ (x^2 + y^2)/2 term by term (and uses the fact that sum_{b∈B} nb+1^2 = S).
Next we bound the number of pairs reported from below as a function of S. Since any two points lying
in bucket b are within distance r of each other, there are nb^2 (ordered) pairs in bucket b alone that are within distance
r of each other, and hence (considering just the pairs generated within each bucket) we have k ≥ S.
Therefore we have
D ≤ 2S ≤ 2k,
which completes the proof.
By combining this with the O(n) expected time needed to bucket the points, it follows that the total expected
running time is O(n + k).
Generalization to d dimensions: This bucketing algorithm is easy to extend to multiple dimensions. For example, in
dimension 2, we bucket points into a square grid in which each grid square is of side length r. The bucket index
of a point (x, y) is the pair (⌊x/r⌋, ⌊y/r⌋). We apply a hash function that accepts two arguments. To generalize
the algorithm, for each point we consider the points in its surrounding 3 × 3 subgrid of buckets. By generalizing
the above arguments, it can be shown that the algorithm’s expected running time is O(n + k). The details are
left as an exercise.
Figure 5: Fixed radius nearest neighbor on the plane.
This example problem serves to illustrate some of the typical elements of computational geometry. Geometry
itself did not play a significant role in the problem, other than the relatively easy task of computing distances. We
will see examples later this semester where geometry plays a much more important role. The major emphasis
was on accounting for the algorithm’s running time. Also note that, although we discussed the possibility of gen-
eralizing the algorithm to higher dimensions, we did not treat the dimension as an asymptotic quantity. In fact,
a more careful analysis reveals that this algorithm’s running time increases exponentially with the dimension.
(Can you see why?)
Geometry Basics: As we go through the semester, we will introduce many of the geometric facts and computational
primitives that we will need. For the most part, we will assume that any geometric primitive involving a
constant number of objects of constant complexity can be computed in O(1) time.
Figure: vector addition, point subtraction, and point-vector addition.
A number of operations can be derived from these. For example, we can define the subtraction of two vectors
~u − ~v as ~u + (−1) · ~v or scalar-vector division ~v /α as (1/α) · ~v provided α ≠ 0. There is one special vector,
called the zero vector, ~0, which has no magnitude, such that ~v + ~0 = ~v .
Note that it is not possible to multiply a point times a scalar or to add two points together. However there is a
special operation that combines these two elements, called an affine combination. Given two points p0 and p1
and two scalars α0 and α1 , such that α0 + α1 = 1, we define the affine combination
aff(p0 , p1 ; α0 , α1 ) = α0 p0 + α1 p1 = p0 + α1 (p1 − p0 ).
Note that the middle term of the above equation is not legal given our list of operations. But this is how the
affine combination is typically expressed, namely as the weighted average of two points. The right-hand side
(which is easily seen to be algebraically equivalent) is legal. An important observation is that, if p0 ≠ p1 , then
the point aff(p0 , p1 ; α0 , α1 ) lies on the line joining p0 and p1 . As α1 varies from −∞ to +∞ it traces out all
the points on this line.
In the special case where 0 ≤ α0 , α1 ≤ 1, aff(p0 , p1 ; α0 , α1 ) is a point that subdivides the line segment p0 p1
into two subsegments of relative sizes α1 to α0 . The resulting operation is called a convex combination, and the
set of all convex combinations traces out the line segment p0 p1 .
It is easy to extend both types of combinations to more than two points, by adding the condition that the sum
α0 + α1 + α2 = 1.
The set of all affine combinations of three (noncollinear) points generates a plane. The set of all convex combi-
nations of three points generates all the points of the triangle defined by the points. These shapes are called the
affine span or affine closure, and convex closure of the points, respectively.
Euclidean Geometry: In affine geometry we have provided no way to talk about angles or distances. Euclidean
geometry is an extension of affine geometry that includes one additional operation, called the inner product,
which maps two vectors (not points) to a real number. One important example of an inner product is
the dot product, defined as follows. Suppose that the d-dimensional vector ~u is represented by the (nonhomoge-
neous) coordinate vector (u1 , u2 , . . . , ud ). Then define
~u · ~v = sum_{i=1}^{d} ui vi.
The length of a vector ~v is |~v | = sqrt(~v · ~v ), and the normalization of ~v is the unit vector
v̂ = ~v / |~v |.
Distance between points: Denoted either dist(p, q) or |pq| is the length of the vector between them, |p − q|.
Angle: between two nonzero vectors ~u and ~v (ranging from 0 to π) is
ang(~u, ~v ) = cos^{−1}( (~u · ~v ) / (|~u||~v |) ) = cos^{−1}(û · v̂).
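The angle formula translates directly into code (a sketch; the clamp guards against floating-point roundoff pushing the cosine slightly outside [−1, 1]):

```python
from math import acos, sqrt

def angle(u, v):
    """Angle between nonzero vectors u and v, in radians (0 to pi),
    computed as the arccosine of the dot product over the norms."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = sqrt(sum(ui * ui for ui in u))
    norm_v = sqrt(sum(vi * vi for vi in v))
    c = dot / (norm_u * norm_v)
    return acos(max(-1.0, min(1.0, c)))  # clamp to guard against roundoff
```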
Orientation: In order to make discrete decisions, we would like a geometric operation that operates on points in
a manner that is analogous to the relational operations (<, =, >) with numbers. There does not seem to be
any natural intrinsic way to compare two points in d-dimensional space, but there is a natural relation between
ordered (d + 1)-tuples of points in d-space, which extends the notion of binary relations in 1-space, called
orientation.
Given an ordered triple of points hp, q, ri in the plane, we say that they have positive orientation if they define
a counterclockwise oriented triangle, negative orientation if they define a clockwise oriented triangle, and zero
orientation if they are collinear (which includes as well the case where two or more of the points are identical).
Note that orientation depends on the order in which the points are given.
Figure: triples of points with positive, negative, and zero orientations.
Orientation is formally defined as the sign of the determinant of the points given in homogeneous coordinates,
that is, by prepending a 1 to each coordinate. For example, in the plane, we define
Orient(p, q, r) = det | 1 px py |
                      | 1 qx qy |
                      | 1 rx ry | .
Observe that in the 1-dimensional case, Orient(p, q) is just q − p. Hence it is positive if p < q, zero if p = q, and
negative if p > q. Thus orientation generalizes <, =, > in 1-dimensional space. Also note that the sign of the
orientation of an ordered triple is unchanged if the points are translated, rotated, or scaled (by a positive scale
factor). A reflection transformation, e.g., f (x, y) = (−x, y), reverses the sign of the orientation. In general,
applying any affine transformation to the points alters the sign of the orientation according to the sign of the
determinant of the matrix used in the transformation.
In general, given an ordered 4-tuple of points in 3-space, we can also define their orientation as being either positive
(forming a right-handed screw), negative (a left-handed screw), or zero (coplanar). This can be generalized to
any ordered (d + 1)-tuple of points in d-space.
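In the plane, the determinant above expands to a two-term expression, so the orientation test is a few arithmetic operations (a sketch; points are assumed to be coordinate pairs):

```python
def orient(p, q, r):
    """Sign of the homogeneous-coordinate determinant for three points
    in the plane: +1 for a counterclockwise triple, -1 for clockwise,
    0 for collinear."""
    # Expanding det |1 px py; 1 qx qy; 1 rx ry| along the first column:
    d = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (d > 0) - (d < 0)
```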
Areas and Angles: The orientation determinant, together with the Euclidean norm, can be used to compute angles in
the plane. The determinant Orient(p, q, r) is equal to twice the signed area of the triangle pqr (positive if
CCW and negative otherwise). Thus the area of the triangle can be determined by dividing this quantity by 2.
In general, in dimension d, the volume of the simplex spanned by d + 1 points can be determined by taking this
determinant and dividing by d!. Once you know the area of a triangle, you can use this to compute the area of
a polygon, by expressing it as the sum of triangle areas. (Although there are other methods that may be faster or
easier to implement.)
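The triangle-sum method can be sketched by fanning triangles from the first vertex; because the triangle areas are signed, the sum is correct for any simple polygon, not just convex ones (a sketch; the function name is illustrative):

```python
def polygon_area(poly):
    """Area of a simple polygon as a sum of signed triangle areas,
    fanned from the first vertex: each term is the orientation
    determinant Orient(p0, p_i, p_{i+1}), i.e. twice a signed area."""
    def orient_det(p, q, r):
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    p0 = poly[0]
    twice = sum(orient_det(p0, poly[i], poly[i + 1])
                for i in range(1, len(poly) - 1))
    return abs(twice) / 2.0
```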
Recall that the dot product returns the cosine of an angle. However, this is not helpful for distinguishing positive
from negative angles. The sine of the angle θ = ∠pqr (the signed angle from vector p − q to vector r − q) can
be computed using the orientation determinant.
There are a number of reasons that the convex hull of a point set is an important geometric structure. One is
that it is one of the simplest shape approximations for a set of points. It can also be used for approximating
more complex shapes. For example, the convex hull of a polygon in the plane or polyhedron in 3-space is the
convex hull of its vertices. (Perhaps the most common shape approximation used is the minimum axis-parallel
bounding box, but this is trivial to compute.)
Also many algorithms compute the convex hull as an initial stage in their execution or to filter out irrelevant
points. For example, in order to find the smallest rectangle or triangle that encloses a set of points, it suffices to
first compute the convex hull of the points, and then find the smallest rectangle or triangle enclosing the hull.
Convexity: A set S is convex if given any points p, q ∈ S any convex combination of p and q is in S, or
equivalently, the line segment pq ⊆ S.
Convex hull: The convex hull of any set S is the intersection of all convex sets that contain S, or more intu-
itively, the smallest convex set that contains S. Following our book’s notation, we will denote this CH(S).
An equivalent definition of convex hull is the set of points that can be expressed as convex combinations of the
points in S. (A proof can be found in any book on convexity theory.) Recall that a convex combination of three
or more points is an affine combination of the points in which the coefficients sum to 1 and all the coefficients
are in the interval [0, 1].
Some Terminology: Although we will not discuss topology with any degree of formalism, we will need to use some
terminology from topology. These terms deserve formal definitions, but we are going to cheat and rely on
intuitive definitions, which will suffice for the sorts of simple, well-behaved geometric objects that we will be
dealing with. Beware that these definitions are not fully general, and you are referred to a good text on topology
for formal definitions. For our purposes, define a neighborhood of a point p to be the set of points whose
distance to p is strictly less than some positive r, that is, it is the set of points lying within an open ball of radius
r centered about p. Given a set S, a point p is an interior point of S if for some radius r the neighborhood about
p of radius r is contained within S. A point is an exterior point if it is an interior point of the complement of S.
Points that are neither interior nor exterior are boundary points. A set is open if it contains none of its boundary
points and closed if its complement is open. If p is in S but is not an interior point, we will call it a boundary
point. We say that a geometric set is bounded if it can be enclosed in a ball of finite radius. A compact set is one
that is both closed and bounded.
Convex hull problem: The (planar) convex hull problem is, given a set of n points P in the plane, output a rep-
resentation of P ’s convex hull. The convex hull is a closed convex polygon; the simplest representation is a
counterclockwise enumeration of the vertices of the convex hull. (A clockwise enumeration is also possible. We usually
prefer counterclockwise enumerations, since they correspond to positive orientations, but obviously one repre-
sentation is easily converted into the other.) Ideally, the hull should consist only of extreme points, in the sense
that if three points lie on an edge of the boundary of the convex hull, then the middle point should not be output
as part of the hull.
There is a simple O(n^3) convex hull algorithm, which operates by considering each ordered pair of points (p, q),
and then determining whether all the remaining points of the set lie within the half-plane lying to the right of the
directed line from p to q. (Observe that this can be tested using the orientation test.) The question is, can we do
better?
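The brute-force algorithm can be sketched as follows (a sketch assuming points in general position; since all other points lie to the right of each reported edge, the edges come out oriented for a clockwise traversal of the hull):

```python
def hull_edges_bruteforce(points):
    """O(n^3) convex hull: report the directed edges (p, q) such that
    every remaining point lies strictly to the right of the directed
    line from p to q (tested with the orientation determinant)."""
    def orient(p, q, r):
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    edges = []
    for p in points:
        for q in points:
            if p == q:
                continue
            # (p, q) is a hull edge iff all other points are to its right.
            if all(orient(p, q, x) < 0 for x in points if x not in (p, q)):
                edges.append((p, q))
    return edges
```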
Graham’s scan: We will present an O(n log n) algorithm for convex hulls. It is a simple variation of a famous
algorithm for convex hulls, called Graham’s scan. This algorithm dates back to the early 70’s. The algorithm is
based on an approach called incremental construction, in which points are added one at a time, and the hull is
updated with each new insertion. If we were to add points in some arbitrary order, we would need some method
of testing whether points are inside the existing hull or not. To avoid the need for this test, we will add points
in increasing order of x-coordinate, thus guaranteeing that each newly added point is outside the current hull.
(Note that Graham’s original algorithm sorted points in a different way. It found the lowest point in the data set
and then sorted points cyclically around this point.)
Since we are working from left to right, it would be convenient if the convex hull vertices were also ordered
from left to right. The convex hull is a cyclically ordered set. Cyclically ordered sets are somewhat messier to
work with than simple linearly ordered sets, so we will break the hull into two hulls, an upper hull and lower
hull. The break points common to both hulls will be the leftmost and rightmost vertices of the convex hull. After
building both, the two hulls can be concatenated into a single cyclic counterclockwise list.
Here is a brief presentation of the algorithm for computing the upper hull. We will store the hull vertices in a
stack U , where the top of the stack corresponds to the most recently added point. Let first(U ) and second(U )
denote the top and second element from the top of U , respectively. Observe that as we read the stack from
top to bottom, the points should make a (strict) left-hand turn, that is, they should have a positive orientation.
Thus, after adding the last point, if the previous two points fail to have a positive orientation, we pop them off
the stack. Since the orientations of remaining points on the stack are unaffected, there is no need to check any
points other than the most recent point and its top two neighbors on the stack.
Let us consider the upper hull, since the lower hull is symmetric. Let hp1 , p2 , . . . , pn i denote the set of points,
sorted by increase x-coordinates. As we walk around the upper hull from left to right, observe that each consec-
utive triple along the hull makes a right-hand turn. That is, if p, q, r are consecutive points along the upper hull,
then Orient(p, q, r) < 0. When a new point pi is added to the current hull, this may violate the right-hand turn
Figure: the upper hull while processing p[i] and after adding p[i]; points violating the turn condition are popped.
invariant. So we check the last three points on the upper hull, including pi . If they fail to form a right-hand turn,
then we delete the point prior to pi . This is repeated until the number of points on the upper hull (including pi )
is less than three, or the right-hand turn condition is reestablished. See the text for a complete description of the
code. We have ignored a number of special cases. We will consider these next time.
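The upper-hull scan can be sketched as follows (a sketch using a Python list as the stack; it ignores the same special cases, such as duplicate points, that the text defers):

```python
def upper_hull(points):
    """Upper hull by the left-to-right scan: points are sorted by
    x-coordinate, and while the last three stack entries fail to make
    a strict right-hand turn (Orient >= 0), the middle one is popped."""
    def orient(p, q, r):
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    pts = sorted(points)
    U = []  # stack of upper-hull vertices, leftmost at the bottom
    for p in pts:
        # Pop while the turn through the top two entries is not a right turn.
        while len(U) >= 2 and orient(U[-2], U[-1], p) >= 0:
            U.pop()
        U.append(p)
    return U
```

Each point is pushed once and popped at most once, which is the heart of the O(n log n) bound (the sort dominates).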
Analysis: Let us prove the main result about the running time of Graham’s scan.
Convex Hull by Divide-and-Conquer: As with sorting, there are many different approaches to solving the convex
hull problem for a planar point set P . Next we will consider another O(n log n) algorithm, which is based on
the divide-and-conquer design technique. It can be viewed as a generalization of the famous MergeSort sorting
algorithm (see Cormen, Leiserson, and Rivest). Here is an outline of the algorithm. It begins by sorting the
points by their x-coordinate, in O(n log n) time.
The asymptotic running time of the algorithm can be expressed by a recurrence. Given an input of size n,
consider the time needed to perform all the parts of the procedure, ignoring the recursive calls. This includes the
time to partition the point set, compute the two tangents, and return the final result. Clearly the first and third of
these steps can be performed in O(n) time, assuming a linked list representation of the hull vertices. Below we
Figure: the hulls A and B and their upper and lower tangents.
will show that the tangents can be computed in O(n) time. Thus, ignoring constant factors, we can describe the
running time by the following recurrence.
T (n) = 1 if n ≤ 3, and T (n) = n + 2T (n/2) otherwise.
This is the same recurrence that arises in Mergesort. It is easy to show that it solves to T (n) ∈ O(n log n) (see
CLR). All that remains is showing how to compute the two tangents.
One thing that simplifies the process of computing the tangents is that the two point sets A and B are separated
from each other by a vertical line (assuming no duplicate x-coordinates). Let’s concentrate on the lower tangent,
since the upper tangent is symmetric. The algorithm operates by a simple “walking” procedure. We initialize a
to be the rightmost point of HA and b to be the leftmost point of HB . (These can be found in linear time.) Lower
tangency is a condition that can be tested locally by an orientation test of the two vertices and neighboring
vertices on the hull. (This is a simple exercise.) We iterate the following two loops, which march a and b down,
until they reach the points of lower tangency.
Proving the correctness of this procedure is a little tricky, but not too bad. See O’Rourke’s book for a
careful proof. The important thing is that each vertex on each hull can be visited at most once by the search, and
hence the walk runs in O(n) time.
QuickHull: If the divide-and-conquer algorithm can be viewed as a sort of generalization of MergeSort, one might
ask whether there is a corresponding generalization of other sorting algorithms for computing convex hulls. In
particular, the next algorithm that we will consider can be thought of as a generalization of the QuickSort
sorting procedure. The resulting algorithm is called QuickHull.
Like QuickSort, this algorithm runs in O(n log n) time for favorable inputs but can take as long as O(n^2) time
for unfavorable inputs. However, unlike QuickSort, there is no obvious way to convert it into a randomized al-
gorithm with O(n log n) expected running time. Nonetheless, QuickHull tends to perform very well in practice.
The intuition is that in many applications most of the points lie in the interior of the hull. For example, if the
points are uniformly distributed in a unit square, then it can be shown that the expected number of points on the
convex hull is O(log n).
The idea behind QuickHull is to discard points that are not on the hull as quickly as possible. QuickHull begins
by computing the points with the maximum and minimum x- and y-coordinates. Clearly these points must
be on the hull. Horizontal and vertical lines passing through these points are support lines for the hull, and so
define a bounding rectangle, within which the hull is contained. Furthermore, the convex quadrilateral defined
by these four points lies within the convex hull, so the points lying within this quadrilateral can be eliminated
from further consideration. All of this can be done in O(n) time.
To continue the algorithm, we classify the remaining points into the four corner triangles that remain. In general,
as this algorithm executes, we will have an inner convex polygon, and associated with each edge we have a set
of points that lie “outside” of that edge. (More formally, these points are witnesses to the fact that this edge is
not on the convex hull, because they lie outside the half-plane defined by this edge.) When this set of points is
empty, the edge is a final edge of the hull. Consider some edge ab. Assume that the points that lie “outside” of
this hull edge have been placed in a bucket that is associated with ab. Our job is to find a point c among these
points that lies on the hull, discard the points in the triangle abc, and split the remaining points into two subsets,
those that lie outside ac and those that lie outside of cb. We can classify each point by making two orientation
tests.
How should c be selected? There are a number of possible selection criteria that one might think of. The method
that is most often proposed is to let c be the point that maximizes the perpendicular distance from the line ab.
(For example, another possible choice might be the point that maximizes the angle ∠cba or ∠cab. It turns out that
these can be very poor choices because they tend to produce imbalanced partitions of the remaining points.)
We replace the edge ab with the two edges ac and cb, and classify the points as lying in one of three groups:
those that lie in the triangle abc, which are discarded, those that lie outside of ac, and those that lie outside of cb.
We put these points in buckets for these edges, and recurse. (We claim that it is not hard to classify each point
p, by computing the orientations of the triples acp and cbp.)
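The edge-splitting step can be sketched as follows (a sketch under the convention that the "outside" points of the directed edge a→b lie to its left; for a fixed edge, the orientation determinant is proportional to perpendicular distance, so maximizing it selects the farthest point c):

```python
def quickhull_chain(a, b, outside):
    """One QuickHull splitting step: pick the point c farthest from the
    line ab, drop the points inside triangle abc, split the rest
    between the new edges ac and cb, and recurse.  Returns the hull
    chain from a up to (but excluding) b."""
    def orient(p, q, r):
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    if not outside:
        return [a]                      # ab is a final hull edge
    c = max(outside, key=lambda p: orient(a, b, p))
    # Two orientation tests classify each point; those inside abc vanish.
    outside_ac = [p for p in outside if orient(a, c, p) > 0]
    outside_cb = [p for p in outside if orient(c, b, p) > 0]
    return quickhull_chain(a, c, outside_ac) + quickhull_chain(c, b, outside_cb)
```

Running this on the edge between the leftmost and rightmost points (and again with the edge reversed) yields the full hull.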
The running time of QuickHull, as with QuickSort, depends on how evenly the points are split at each stage. Let
T (n) denote the running time of the algorithm assuming that n points remain outside of some edge. In O(n)
time we can select a candidate splitting point c and classify all the points in the bucket. Let n1 and n2
denote the number of remaining points, where n1 + n2 ≤ n. Then the running time is given by the recurrence:
½
1 if n = 1
T (n) =
T (n1 ) + T (n2 ) where n1 + n2 ≤ n.
In order to solve this recurrence, it would be necessary to determine the “reasonable” values for n1 and n2 .
If we assume that the points are “evenly” distributed, in the sense that max(n1 , n2 ) ≤ αn for some constant
α < 1, then by applying the same analysis as that used in QuickSort (see Cormen, Leiserson, Rivest) the running
time will solve to O(n log n), where the constant factor depends on α. On the other hand, if the splits are not
balanced, then the running time can easily increase to O(n2 ).
Does QuickHull outperform Graham’s scan? This depends to a great extent on the distribution of the point
set. There are variations of QuickHull that are designed for specific point distributions (e.g. points uniformly
distributed in a square) and their authors claim that they manage to eliminate almost all of the points in a matter
of only a few iterations.
Gift-Wrapping and Jarvis’s March: The next algorithm that we will consider is a variant on an O(n2 ) sorting al-
gorithm called SelectionSort. For sorting, this algorithm repeatedly finds the next element to add to the sorted
order from the remaining items. The corresponding convex hull algorithm is called Jarvis’s march, which
builds the hull in O(nh) time by a process called “gift-wrapping”. The algorithm operates by considering any
one point that is on the hull, say, the lowest point. We then find the “next” edge on the hull in counterclockwise
order. Assuming that pk and pk−1 were the last two points added to the hull, compute the point q that maximizes
the angle ∠pk−1 pk q. By scanning all n points, we can find the point q in O(n) time. After repeating this h times, we will return
back to the starting point and we are done. Thus, the overall running time is O(nh). Note that if h is o(log n)
(asymptotically smaller than log n) then this is a better method than Graham’s algorithm.
One technical detail is how to find an edge from which to start. One easy way to do this is to let p1 be
the point with the lowest y-coordinate, and let p0 be the point (−∞, 0), which is infinitely far to the left. The
point p0 is only used for computing the initial angles, after which it is discarded.
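The whole march can be sketched as follows. This Python sketch is ours; rather than using a sentinel point, it starts at the lowest point and wraps counterclockwise, breaking collinear ties by distance.

```python
def orient(p, q, r):
    """Positive iff p, q, r make a counterclockwise (left) turn."""
    return (q[0]-p[0])*(r[1]-p[1]) - (q[1]-p[1])*(r[0]-p[0])

def jarvis_march(points):
    """Gift wrapping in O(nh) time. Returns hull vertices in CCW order."""
    start = min(points, key=lambda p: (p[1], p[0]))   # lowest point is on the hull
    hull = [start]
    while True:
        cur = hull[-1]
        # find the point cand such that all points lie left of cur -> cand
        cand = next(p for p in points if p != cur)
        for p in points:
            if p == cur:
                continue
            o = orient(cur, cand, p)
            if o < 0 or (o == 0 and
                         (p[0]-cur[0])**2 + (p[1]-cur[1])**2 >
                         (cand[0]-cur[0])**2 + (cand[1]-cur[1])**2):
                cand = p
        if cand == start:
            return hull            # wrapped all the way around
        hull.append(cand)
```

Each pass over the points costs O(n), and there are h passes, matching the O(nh) bound.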
Output Sensitive Convex Hull Algorithms: It turns out that in the worst-case, convex hulls cannot be computed
faster than in Ω(n log n) time. One way to see this intuitively is to observe that the convex hull itself is sorted
along its boundary, and hence if every point lies on the hull, then computing the hull requires sorting of some
form. Yao proved the much harder result that determining which points are on the hull (without sorting them
along the boundary) still requires Ω(n log n) time. However both of these results rely on the fact that all (or at
least a constant fraction) of the points lie on the convex hull. This is often not true in practice.
(Figure: Jarvis’s march, starting from p1 with the sentinel p0 = (−∞, 0) and wrapping to p2 , p3 .)
The QuickHull and Jarvis’s March algorithms that we saw last time suggest the question of how fast can convex
hulls be computed if we allow the running time to be described in terms of both the input size n and the output
size h. Many geometric algorithms have the property that the output size can be a widely varying function of
the input size, and worst-case output size may not be a good indicator of what happens typically. An algorithm
which attempts to be more efficient for small output sizes is called an output sensitive algorithm, and its running
time is described as an asymptotic function of both input size and output size.
Chan’s Algorithm: Given that any convex hull algorithm must take at least Ω(n) time, and given that the “log n” factor
arises from the fact that you need to sort the at most n points on the hull, if you were told that there are only h
points on the hull, then a reasonable target running time is O(n log h). (Below we will see that this is optimal.)
Kirkpatrick and Seidel discovered a relatively complicated O(n log h) time algorithm, based on a clever pruning
method in 1986. The problem was considered closed until around 10 years later when Timothy Chan came up
with a much simpler algorithm with the same running time. One of the interesting aspects of Chan’s algorithm
is that it involves combining two slower algorithms (Graham’s scan and Jarvis’s March) together to form an
algorithm that is faster than either one.
The problem with Graham’s scan is that it sorts all the points, and hence is doomed to having an Ω(n log n)
running time, irrespective of the size of the hull. On the other hand, Jarvis’s march can perform better if you
have few vertices on the hull, but it takes Ω(n) time for each hull vertex.
Chan’s idea was to partition the points into groups of equal size. There are m points in each group, and so
the number of groups is r = ⌈n/m⌉. For each group we compute its hull using Graham’s scan, which takes
O(m log m) time per group, for a total time of O(rm log m) = O(n log m). Next, we run Jarvis’s march
on the groups. Here we take advantage of the fact that you can compute the tangent between a point and a
convex m-gon in O(log m) time. (We will leave this as an exercise.) So, as before there are h steps of Jarvis’s
march, but because we are applying it to r convex hulls, each step takes only O(r log m) time, for a total of
O(hr log m) = O((hn/m) log m) time. Combining these two parts, we get a total of

    O((n + hn/m) log m)
time. Observe that if we set m = h then the total running time will be O(n log h), as desired.
There is only one small problem here. We do not know what h is in advance, and therefore we do not know
what m should be when running the algorithm. We will see how to remedy this later. For now, let’s imagine that
someone tells us the value of m. The following algorithm works correctly as long as m ≥ h. If we are wrong,
it returns a special error status.
We assume that we store the convex hulls from step (2a) in an ordered array so that the step inside the for-loop
of step (4a) can be solved in O(log m) time using binary search. Otherwise, the analysis follows directly from
the comments made earlier.
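A hedged Python sketch of this partial-hull computation appears below. The names are ours; we substitute Andrew's monotone chain for Graham's scan, and the tangent-finding step is written as a linear scan for clarity, whereas the bound in the text requires the O(log m) binary search over the ordered hull array.

```python
def cross(o, a, b):
    """Positive iff o, a, b make a counterclockwise turn."""
    return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])

def dist2(p, q):
    return (p[0]-q[0])**2 + (p[1]-q[1])**2

def convex_hull(pts):
    """Andrew's monotone chain (used here in place of Graham's scan), CCW."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def jarvis_next(cur, candidates):
    """Gift-wrap step: the candidate q with every other candidate to the
    left of the directed line cur -> q (farthest wins ties)."""
    best = None
    for q in candidates:
        if q == cur:
            continue
        if best is None or cross(cur, best, q) < 0 or \
           (cross(cur, best, q) == 0 and dist2(cur, q) > dist2(cur, best)):
            best = q
    return best

def partial_hull(points, m):
    """Chan's PartialHull: returns the CCW hull if it has at most m
    vertices, and None (failure status) otherwise."""
    groups = [points[i:i + m] for i in range(0, len(points), m)]
    hulls = [convex_hull(g) for g in groups]
    hull = [min(points, key=lambda p: (p[1], p[0]))]  # lowest point is on the hull
    for _ in range(m):
        cur = hull[-1]
        # tangent from cur to each group hull -- linear scan for clarity;
        # binary search here gives the O(log m) step from the text
        cand = [q for q in (jarvis_next(cur, h) for h in hulls) if q is not None]
        nxt = jarvis_next(cur, cand)
        if nxt == hull[0]:
            return hull
        hull.append(nxt)
    return None                  # more than m hull vertices
```

If the march fails to close within m steps, the hull has more than m vertices and the routine reports failure, as required by the algorithm.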
(Figure: one step of Jarvis’s march applied to the group hulls, with tangent points q1 , q2 , q3 , q4 from the current hull vertex pk .)
Note that m = 2^(2^t) has the effect of squaring the previous value of m. How long does this take? The t-th
iteration takes O(n log 2^(2^t)) = O(n 2^t) time. We know that it will stop as soon as 2^(2^t) ≥ h, that is, when
t = ⌈lg lg h⌉. (We will use lg to denote logarithm base 2.) So the total running time (ignoring the constant
factors) is

    sum_{t=1}^{⌈lg lg h⌉} n 2^t = n · sum_{t=1}^{⌈lg lg h⌉} 2^t ≤ n · 2^(1+lg lg h) = 2n lg h = O(n log h),
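The guessing strategy can be sketched as follows. In this Python sketch (names ours) the grouping-based PartialHull is replaced by a simple stand-in that computes the full hull and reports failure when it has more than m vertices; only the squaring control flow is the point here.

```python
def cross(o, a, b):
    return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])

def convex_hull(pts):
    """Andrew's monotone chain, returning the hull in CCW order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def partial_hull_stub(points, m):
    """Stand-in for the grouping-based PartialHull: succeeds exactly when
    the hull has at most m vertices, failing otherwise."""
    h = convex_hull(points)
    return h if len(h) <= m else None

def chan_hull(points):
    """Restart with m = 2^(2^t), squaring the guess until m >= h."""
    t = 1
    while True:
        m = min(len(points), 2 ** (2 ** t))
        h = partial_hull_stub(points, m)
        if h is not None:
            return h
        t += 1
```

With the real PartialHull, each failed round costs O(n 2^t), which is what the geometric sum above charges.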
Geometric intersections: One of the most basic problems in computational geometry is that of computing intersec-
tions. Intersection computation in 2- and 3-space is basic to many different application areas.
• In solid modeling people often build up complex shapes by applying various boolean operations (intersec-
tion, union, and difference) to simple primitive shapes. The process is called constructive solid geometry
(CSG). In order to perform these operations, the most basic step is determining the points where the
boundaries of the two objects intersect.
• In robotics and motion planning it is important to know when two objects intersect for collision detection
and collision avoidance.
• In geographic information systems it is often useful to overlay two subdivisions (e.g. a road network and
county boundaries to determine where road maintenance responsibilities lie). Since these networks are
formed from collections of line segments, this generates a problem of determining intersections of line
segments.
• In computer graphics, ray shooting is an important method for rendering scenes. The computationally
most intensive part of ray shooting is determining the intersection of the ray with other objects.
Most complex intersection problems are broken down to successively simpler and simpler intersection problems.
Today, we will discuss the most basic algorithm, upon which most complex algorithms are based.
Element uniqueness: Given a list of n real numbers, are all of these numbers distinct? (That is, are there no
duplicates?)
Given a list of n numbers, (x1 , x2 , . . . , xn ), in O(n) time we can construct a set of n vertical line segments,
all having the same y-coordinates. Observe that if the numbers are distinct, then there are no intersections and
otherwise there is at least one intersection. Thus, if we could detect intersections in o(n log n) time (meaning
strictly faster than Θ(n log n) time), then we could solve element uniqueness in o(n log n) time as well.
However, this would contradict the lower bound on element uniqueness.
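The construction itself is a one-liner; the sketch below (function name ours) builds the segments, and any intersection detector could then be applied to them.

```python
def uniqueness_segments(numbers):
    """Map each number x to the vertical segment from (x, 0) to (x, 1).
    Distinct inputs yield pairwise disjoint segments; any duplicate yields
    two identical, and hence intersecting, segments."""
    return [((x, 0.0), (x, 1.0)) for x in numbers]
```

The reduction takes O(n) time, so a faster intersection detector would imply a faster element-uniqueness algorithm.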
We will present a (not quite optimal) O(n log n + I log n) time algorithm for the line segment intersection
problem. A natural question is whether this is optimal. Later in the semester we will discuss an optimal
randomized O(n log n + I) time algorithm for this problem.
Line segment intersection: Like many geometric primitives, computing the point at which two line segments inter-
sect can be reduced to solving a linear system of equations. Let ab and cd be two line segments, given by
their endpoints. First observe that it is possible to determine whether these line segments intersect, simply by
applying an appropriate combination of orientation tests. (We will leave this as an exercise.)
One way to determine the point at which the segments intersect is to use an old trick from computer graphics.
We represent the segments using a parametric representation. Recall that any point on the line segment ab can
be written as a convex combination involving a real parameter s:

    p(s) = (1 − s)a + sb,    0 ≤ s ≤ 1,

and similarly any point on cd can be written q(t) = (1 − t)c + td for 0 ≤ t ≤ 1. An intersection occurs if and
only if we can find s and t in the desired ranges such that p(s) = q(t). Thus we get the two equations:

    (1 − s)ax + sbx = (1 − t)cx + tdx
    (1 − s)ay + sby = (1 − t)cy + tdy .

The coordinates of the points are all known, so it is just a simple exercise in linear algebra to solve for s and
t. The computation of s and t will involve a division. If the divisor is 0, this means that the line segments are
parallel (or collinear).
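Solving this system can be sketched directly. The Python function below (our naming) rearranges the two equations into a 2×2 linear system and applies Cramer's rule; a zero determinant signals parallel segments.

```python
def segment_intersection(a, b, c, d):
    """Solve p(s) = q(t), where p(s) = (1-s)a + s*b and q(t) = (1-t)c + t*d.
    Returns the intersection point, or None when the segments are parallel
    or the solution falls outside 0 <= s, t <= 1."""
    r = (b[0] - a[0], b[1] - a[1])        # direction of ab
    u = (d[0] - c[0], d[1] - c[1])        # direction of cd
    denom = r[0]*u[1] - r[1]*u[0]         # determinant; zero iff lines parallel
    if denom == 0:
        return None                       # parallel (possibly collinear)
    w = (c[0] - a[0], c[1] - a[1])
    s = (w[0]*u[1] - w[1]*u[0]) / denom
    t = (w[0]*r[1] - w[1]*r[0]) / denom
    if 0 <= s <= 1 and 0 <= t <= 1:
        return (a[0] + s*r[0], a[1] + s*r[1])
    return None
```

With exact (e.g. rational) arithmetic this is robust; with floating point, the usual caveats about near-parallel segments apply.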
Sweep line: We will simulate the sweeping of a vertical line `, called the sweep line from left to right. (Our text
uses a horizontal line, but there is obviously no significant difference.) We will maintain the line segments
that intersect the sweep line in sorted order (say from top to bottom).
Events: Although we might think of the sweep line as moving continuously, we only need to update data
structures at points of some significant change in the sweep-line contents, called event points.
Different applications of plane sweep will have different notions of what event points are. For this appli-
cation, event points will correspond to the following:
Endpoint events: where the sweep line encounters an endpoint of a line segment, and
Intersection events: where the sweep line encounters an intersection point of two line segments.
Note that endpoint events can be presorted before the sweep runs. In contrast, intersection events will be
discovered as the sweep executes. For example, in the figure below, some of the intersection points lying
to the right of the sweep line have not yet been “discovered” as events. However, we will show that every
intersection point will be discovered as an event before the sweep line arrives there.
Event updates: When an event is encountered, we must update the data structures associated with the event.
It is a good idea to be careful in specifying exactly what invariants you intend to maintain. For example,
when we encounter an intersection point, we must interchange the order of the intersecting line segments
along the sweep line.
(Figure: a sweep line; intersection points to its left have been detected, while those to its right have not.)
There are a great number of nasty special cases that complicate the algorithm and obscure the main points. We
will make a number of simplifying assumptions. They can be overcome through a more careful handling of
these cases.
(1) No line segment is vertical.
(2) If two segments intersect, then they intersect in a single point (that is, they are not collinear).
(3) No three lines intersect in a common point.
Lemma: Given two segments si and sj , which intersect in a single point p (and assuming no other line segment
passes through this point) there is a placement of the sweep line prior to this event, such that si and sj are
adjacent along the sweep line (and hence will be tested for intersection).
Proof: From our general position assumption it follows that no three lines intersect in a common point. There-
fore if we consider a placement of the sweep line that is infinitesimally to the left of the intersection
point, lines si and sj will be adjacent along this sweepline. Consider the event point q with the largest
x-coordinate that is strictly less than px . The order of lines along the sweep-line after processing q will be
identical to the order of the lines along the sweep line just prior to p, and hence si and sj will be adjacent
at this point.
Data structures: In order to perform the sweep we will need two data structures.
Event queue: This holds the set of future events, sorted according to increasing x-coordinate. Each event
contains the auxiliary information of what type of event this is (segment endpoint or intersection) and
which segment(s) are involved. The operations that this data structure should support are inserting an
event (if it is not already present in the queue) and extracting the minimum event.
It seems like a heap data structure would be ideal for this, since it supports insertions and extract-min in
O(log M ) time, where M is the number of entries in the queue. (See Cormen, Leiserson, and Rivest for
details). However, a heap cannot support the operation of checking for duplicate events.
There are two ways to handle this. One is to use a more sophisticated data structure, such as a balanced
binary tree or skip-list. This adds a small constant factor, but can check that there are no duplicates easily.
The second is to use the heap, but when an extraction is performed, you may have to perform many extractions
to deal with multiple instances of the same event. (Our book recommends the former solution.)
If events have the same x-coordinate, then we can handle this by sorting points lexicographically by (x, y).
(This results in events being processed from bottom to top along the sweep line, and has the same geometric
effect as imagining that the sweep line is rotated infinitesimally counterclockwise.)
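One lightweight way to get heap behavior together with duplicate rejection is to pair the heap with a hash set, as in the sketch below. This is a practical variant of our own, not the book's recommended balanced-tree solution.

```python
import heapq

class EventQueue:
    """Min-heap of events keyed lexicographically by (x, y), with a
    companion set that rejects duplicate events at insertion time."""

    def __init__(self):
        self._heap = []
        self._seen = set()

    def insert(self, point):
        # insert only if this event is not already present
        if point not in self._seen:
            self._seen.add(point)
            heapq.heappush(self._heap, point)

    def extract_min(self):
        return heapq.heappop(self._heap)

    def __len__(self):
        return len(self._heap)
```

Both operations run in O(log M) expected time, matching the bounds discussed above.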
Sweep line status: To store the sweep line status, we maintain a balanced binary tree or perhaps a skiplist whose
entries are pointers to the line segments, stored in increasing order of y-coordinate along the current sweep
line.
The complete plane-sweep algorithm is presented below. The various cases are illustrated in the following
figure.
Plane-Sweep Algorithm for Line Segment Intersection
(1) Insert all of the endpoints of the line segments of S into the event queue. The initial sweep status is empty.
(2) While the event queue is nonempty, extract the next event in the queue. There are three cases, depending on the
type of event:
Segment left endpoint: Insert this line segment into the sweep line status, based on the y-coordinate of this
endpoint and the y-coordinates of the other segments currently along the sweep line. Test for intersections
with the segment immediately above and below.
Segment right endpoint: Delete this line segment from the sweep line status. For the entries immediately
preceding and succeeding this entry, test them for intersections.
Intersection point: Swap the two line segments in order along the sweep line. For the new upper segment, test
it against its predecessor for an intersection. For the new lower segment, test it against its successor for an
intersection.
Analysis: The work done by the algorithm is dominated by the time spent updating the various data structures (since
otherwise we spend only constant time per sweep event). We need to count two things: the number of operations
applied to each data structure and the amount of time needed to process each operation.
For the sweep line status, there are at most n elements intersecting the sweep line at any time, and therefore the
time needed to perform any single operation is O(log n), from standard results on balanced binary trees.
Since we do not allow duplicate events to exist in the event queue, the total number of elements in the queue
at any time is at most 2n + I. Since we use a balanced binary tree to store the event queue, each operation
takes O(log(2n + I)) time. Because I = O(n2 ), this is O(log n) time per operation. The algorithm processes
O(n + I) events in all, and so the total running time of the plane sweep algorithm is O((n + I) log n) =
O(n log n + I log n).
Topological Information: In many applications of segment intersection problems, we are not interested in just a
listing of the segment intersections, but want to know how the segments are connected together. Typically, the
plane has been subdivided into regions, and we want to store these regions in a way that allows us to reason
about their properties efficiently.
This leads to the concept of a planar straight line graph (PSLG) or planar subdivision (or what might be called
a cell complex in topology). A PSLG is a graph embedded in the plane with straight-line edges so that no two
edges intersect, except possibly at their endpoints. (The condition that the edges be straight line segments may
be relaxed to allow curved segments, but we will assume line segments here.) Such a graph naturally subdivides
the plane into regions, consisting of 0-dimensional vertices, 1-dimensional edges, and 2-dimensional faces. We
consider these three types of objects to be disjoint, implying that each edge is topologically open (it does not
include its endpoints) and that each face is open (it does not include its boundary). There is always one
unbounded face, that stretches to infinity. Note that the underlying planar graph need not be a connected graph.
In particular, faces may contain holes (and these holes may contain other holes). A subdivision is called a
convex subdivision if all the faces are convex.
(Figure: a planar subdivision with a vertex, an edge, and a face labeled; at right, a convex subdivision.)
Planar subdivisions form the basic objects of many different structures that we will discuss later this semester
(triangulations and Voronoi diagrams in particular) so this is a good time to consider them in greater detail. The
first question is how should we represent such structures so that they are easy to manipulate and reason about.
For example, at a minimum we would like to be able to list the edges that bound each face of the subdivision in
cyclic order, and we would like to be able to list the edges that surround each vertex.
Planar graphs: There are a number of important facts about planar graphs that we should discuss. Generally speak-
ing, an (undirected) graph is just a finite set of vertices, and collection of unordered pairs of distinct vertices
called edges. A graph is planar if it can be drawn in the plane (the edges need not be straight lines) so that no
two distinct edges cross each other. An embedding of a planar graph is any such drawing. In fact, in specify-
ing an embedding it is sufficient just to specify the counterclockwise cyclic list of the edges that are incident
to each vertex. Euler’s formula states that a connected graph embedded on an orientable surface of genus g
(g = 0 for the plane) satisfies

    V − E + F = 2 − 2g.
Returning to planar graphs, if we allow the graph to be disconnected, and let C denote the number of connected
components, then we have the somewhat more general formula
V − E + F − C = 1.
In our example above we have V = 13, E = 12, F = 4 and C = 4, which clearly satisfies this formula. An
important fact about planar graphs follows from this.
Theorem: A planar graph with V vertices has at most 3(V − 2) edges and at most 2(V − 2) faces.
Proof: We assume (as is typical for graphs) that there are no multiple edges between the same pair of vertices
and no self-loop edges.
We begin by triangulating the graph. For each face that is bounded by more than three edges (or whose
boundary is not connected) we repeatedly insert new edges until every face in the graph is bounded by ex-
actly three edges. (Note that this is not a “straight line” planar graph, but it is a planar graph, nonetheless.)
An example is shown in the figure below in which the original graph edges are shown as solid lines.
Let E′ ≥ E and F′ ≥ F denote the number of edges and faces in the modified graph. The resulting graph
has the property that it has one connected component, every face is bounded by exactly three edges, and
each edge has a different face on either side of it. (The last claim may involve a little thought.)
If we count the number of faces and multiply by 3, then every edge will be counted exactly twice, once
by the face on either side of the edge. Thus, 3F′ = 2E′, that is, E′ = 3F′/2. Euler’s formula states that
V − E′ + F′ = 2, and hence

    V − 3F′/2 + F′ = 2   ⇒   F ≤ F′ = 2(V − 2),

and using the fact that F′ = 2E′/3 we have

    V − E′ + 2E′/3 = 2   ⇒   E ≤ E′ = 3(V − 2).

This completes the proof.
(Figure: a triangulated planar graph on vertices v0 , . . . , v6 ; the original graph edges are shown as solid lines.)
The famous Jordan curve theorem states that every simple closed plane curve divides the plane into two regions
(the interior and the exterior). (Although the theorem seems intuitively obvious, it is quite difficult to prove.)
We define a polygon to be the region of the plane bounded by a simple, closed polygonal curve. The term simple
polygon is also often used to emphasize the simplicity of the polygonal curve. We will assume that the vertices
are listed in counterclockwise order around the boundary of the polygon.
Art Gallery Problem: We say that two points x and y in a simple polygon can see each other (or x and y are visible)
if the open line segment xy lies entirely within the interior of P . (Note that such a line segment can start and
end on the boundary of the polygon, but it cannot pass through any vertices or edges.)
If we think of a polygon as the floor plan of an art gallery, consider the problem of where to place “guards”,
and how many guards to place, so that every point of the gallery can be seen by some guard. Victor Klee posed
the following question: Suppose we have an art gallery whose floor plan can be modeled as a polygon with
n vertices. As a function of n, what is the minimum number of guards that suffice to guard such a gallery?
Observe that all you are told about the polygon is the number of sides, not its actual structure. We want to know
the fewest number of guards that suffice to guard all polygons with n sides.
Before getting into a solution, let’s consider some basic facts. Could there be polygons for which no finite
number of guards suffice? It turns out that the answer is no, but the proof is not immediately obvious. You
might consider placing a guard at each of the vertices. Such a set of guards will suffice in the plane. But to
show how counterintuitive geometry can be, it is interesting to note that there are simple nonconvex polyhedra in
3-space, such that even if you place a guard at every vertex there would still be points in the polygon that are
not visible to any guard. (As a challenge, try to come up with one with the fewest number of vertices.)
An interesting question in combinatorial geometry is how does the number of guards needed to guard any simple
polygon with n sides grow as a function of n? If you play around with the problem for a while (trying polygons
with n = 3, 4, 5, 6 . . . sides, for example) you will eventually come to the conclusion that ⌊n/3⌋ is the right
value. The figure above shows a worst-case example, where ⌊n/3⌋ guards are required. A cute result from
combinatorial geometry is that this number always suffices. The proof is based on three concepts: polygon
triangulation, dual graphs, and graph coloring. The remarkably clever and simple proof was discovered by Fisk.
Theorem: (The Art-Gallery Theorem) Given a simple polygon with n vertices, there exists a guarding set with
at most ⌊n/3⌋ guards.
Before giving the proof, we explore some aspects of polygon triangulations. We begin by introducing a triangu-
lation of P . A triangulation of a simple polygon is a planar subdivision of (the interior of) P whose vertices are
the vertices of P and whose faces are all triangles. An important concept in polygon triangulation is the notion
of a diagonal, that is, a line segment between two vertices of P that are visible to one another. A triangulation
can be viewed as the union of the edges of P and a maximal set of noncrossing diagonals.
Lemma: Every simple polygon with n vertices has a triangulation consisting of n − 3 diagonals and n − 2
triangles.
(We leave the proof as an exercise.) The proof is based on the fact that given any n-vertex polygon, with n ≥ 4
it has a diagonal. (This may seem utterly trivial, but actually takes a little bit of work to prove. In fact it fails to
hold for polyhedra in 3-space.) The addition of the diagonal breaks the polygon into two polygons, of say m1
and m2 vertices, such that m1 + m2 = n + 2 (since both share the vertices of the diagonal). Thus by induction,
there are (m1 − 2) + (m2 − 2) = n + 2 − 4 = n − 2 triangles total. A similar argument holds for the case of
diagonals.
It is a well known fact from graph theory that any planar graph can be colored with 4 colors. (The famous
4-color theorem.) This means that we can assign a color to each of the vertices of the graph, from a collection
of 4 different colors, so that no two adjacent vertices have the same color. However we can do even better for
the graph we have just described.
Lemma: Let T be the triangulation graph of a triangulation of a simple polygon. Then T is 3-colorable.
Proof: For every planar graph G there is another planar graph G∗ called its dual. The dual G∗ is the graph
whose vertices are the faces of G, and two vertices of G∗ are connected by an edge if the two corresponding
faces of G share a common edge.
Since a triangulation is a planar graph, it has a dual, shown in the figure below. (We do not include the
external face in the dual.) Because each diagonal of the triangulation splits the polygon into two, it follows
that each edge of the dual graph is a cut edge, meaning that its deletion would disconnect the graph. As a
(Figure: a 3-colored triangulation of a simple polygon together with its dual tree; an ear is indicated.)
result it is easy to see that the dual graph is a free tree (that is, a connected, acyclic graph), and its maximum
degree is 3. (This would not be true if the polygon had holes.)
The coloring will be performed inductively. If the polygon consists of a single triangle, then just assign any
3 colors to its vertices. An important fact about any free tree is that it has at least one leaf (in fact it has at
least two). Remove this leaf from the tree. This corresponds to removing a triangle that is connected to the
rest of the triangulation by a single edge. (Such a triangle is called an ear.) By induction 3-color the remaining
triangulation. When you add back the deleted triangle, two of its vertices have already been colored, and
the remaining vertex is adjacent to only these two vertices. Give it the remaining color. In this way the
entire triangulation will be 3-colored.
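The inductive coloring can be turned into a short program. The sketch below (names ours) builds the dual adjacency from a list of triangles, peels off ears in leaf order, and then colors the vertices in reverse removal order, so that each ear contributes exactly one fresh vertex.

```python
from collections import defaultdict

def three_color(n, triangles):
    """3-color vertices 0..n-1 of a simple polygon's triangulation, given the
    triangles as vertex triples. Relies on the dual graph being a free tree."""
    # dual adjacency: two triangles are neighbors iff they share an edge
    edge_owner, nbrs = {}, defaultdict(set)
    for idx, tri in enumerate(triangles):
        for e in ((tri[0], tri[1]), (tri[1], tri[2]), (tri[2], tri[0])):
            e = tuple(sorted(e))
            if e in edge_owner:
                other = edge_owner[e]
                nbrs[idx].add(other)
                nbrs[other].add(idx)
            else:
                edge_owner[e] = idx
    # peel off ears (leaves of the dual tree), recording the removal order
    degree = {i: len(nbrs[i]) for i in range(len(triangles))}
    leaves = [i for i in degree if degree[i] <= 1]
    order, removed = [], set()
    while leaves:
        i = leaves.pop()
        if i in removed:
            continue
        removed.add(i)
        order.append(i)
        for j in nbrs[i]:
            if j not in removed:
                degree[j] -= 1
                if degree[j] <= 1:
                    leaves.append(j)
    # color in reverse removal order: each ear has one uncolored vertex,
    # which receives the one color unused by its two colored neighbors
    color = {}
    for i in reversed(order):
        used = {color[v] for v in triangles[i] if v in color}
        free = [c for c in (0, 1, 2) if c not in used]
        for v in triangles[i]:
            if v not in color:
                color[v] = free.pop()
    return [color[v] for v in range(n)]
```

Each triangle then carries all three colors, which is exactly what the art-gallery argument needs.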
Proof: (of the Art-Gallery Theorem:) Consider any 3-coloring of the vertices of the polygon. At least one color
occurs at most ⌊n/3⌋ times. (Otherwise the three color classes would together contain more than n vertices,
a contradiction.) Place a guard at each vertex with this color. We use at most ⌊n/3⌋ guards. Observe that
every triangle has at least one vertex of each of the three colors (since you cannot use the same color twice
on a triangle).
Thus, every point in the interior of this triangle is guarded, implying that the interior of P is guarded. A
somewhat messy detail is whether you allow guards placed at a vertex to see along the wall. However,
it is not a difficult matter to push each guard infinitesimally out from his vertex, and so guard the entire
polygon.
The Polygon Triangulation Problem: Triangulating simple polygons is an operation used in many other applications
where complex shapes are to be decomposed into a set of disjoint simpler shapes. There are many applications
Henceforth, let us consider monotonicity with respect to the x-axis. We will call these polygons horizontally
monotone. It is easy to test whether a polygon is horizontally monotone. How?
(a) Find the leftmost and rightmost vertices (min and max x-coordinate) in O(n) time.
(b) These vertices split the polygon’s boundary into two chains, an upper chain and a lower chain. Walk from
left to right along each chain, verifying that the x-coordinates are nondecreasing. This takes O(n) time.
As a challenge, consider the problem of determining whether a polygon is monotone in any (unspecified) direc-
tion. This can be done in O(n) time, but is quite a bit harder.
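The two-step test in (a) and (b) can be sketched as follows (naming ours; ties in x are ignored, consistent with our simplifying assumptions):

```python
def is_x_monotone(polygon):
    """polygon: list of vertices in counterclockwise order.
    Splits the boundary at the leftmost and rightmost vertices and checks
    that x is nondecreasing along both resulting chains."""
    n = len(polygon)
    lo = min(range(n), key=lambda i: polygon[i][0])   # leftmost vertex
    hi = max(range(n), key=lambda i: polygon[i][0])   # rightmost vertex

    def nondecreasing(start, end, step):
        # walk from start to end in the given direction, checking x
        i = start
        while i != end:
            j = (i + step) % n
            if polygon[j][0] < polygon[i][0]:
                return False
            i = j
        return True

    # one chain runs forward from lo to hi, the other backward
    return nondecreasing(lo, hi, +1) and nondecreasing(lo, hi, -1)
```

Both walks together visit each vertex a constant number of times, so the test runs in O(n) time as claimed.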
Triangulation of Monotone Polygons: We can triangulate a monotone polygon by a simple variation of the plane-
sweep method. We begin with the assumption that the vertices of the polygon have been sorted in increasing
order of their x-coordinates. (For simplicity we assume no duplicate x-coordinates. Otherwise, break ties
between the upper and lower chains arbitrarily, and within a chain break ties so that the chain order is preserved.)
Observe that this does not require sorting. We can simply extract the upper and lower chain, and merge them
(as done in mergesort) in O(n) time.
The idea behind the triangulation algorithm is quite simple: Try to triangulate everything you can to the left of
the current vertex by adding diagonals, and then remove the triangulated region from further consideration.
(Figure: four snapshots of the triangulation of a monotone polygon whose vertices are numbered 1–13 in sweep order.)
In the example, there is obviously nothing to do until we have at least 3 vertices. With vertex 3, it is possible
to add the diagonal to vertex 2, and so we do this. In adding vertex 4, we can add the diagonal to vertex 2.
However, vertices 5 and 6 are not visible to any other nonadjacent vertices so no new diagonals can be added.
When we get to vertex 7, it can be connected to 4, 5, and 6. The process continues until reaching the final vertex.
The important thing that makes the algorithm efficient is the fact that when we arrive at a vertex the untrian-
gulated region that lies to the left of this vertex always has a very simple structure. This structure allows us
to determine in constant time whether it is possible to add another diagonal. And in general we can add each
additional diagonal in constant time. Since any triangulation consists of n − 3 diagonals, the process runs in
O(n) total time. This structure is described in the lemma below.
Lemma: (Main Invariant) For i ≥ 2, let vi be the vertex just processed by the triangulation algorithm. The
untriangulated region lying to the left of vi consists of two x-monotone chains, a lower chain and an upper
chain, each containing at least one edge. Let u denote the leftmost vertex of this region. If the chain from
vi to u has two or more edges, then these edges form a reflex chain (that is, a sequence of vertices with
interior angles all at least 180 degrees). The other chain consists of a single edge whose left endpoint is u
and whose right endpoint lies to the right of vi .
We will prove the invariant by induction. As the basis case, consider the case of v2 . Here u = v1 , and one chain
consists of the single edge v2 v1 and the other chain consists of the other edge adjacent to v1 .
To prove the main invariant, we will give a case analysis of how to handle the next event, involving vi , assuming
that the invariant holds at vi−1 , and show that the invariant is satisfied after each event has been processed.
There are the following cases that the algorithm needs to deal with.
(Figure: the initial invariant at u and vi−1 , followed by Cases 1, 2a, and 2b for the next vertex vi .)
How is this implemented? The vertices on the reflex chain can be stored in a stack. We keep a flag indicating
whether the stack is on the upper chain or lower chain, and assume that with each new vertex we know which
chain of the polygon it is on. Note that decisions about visibility can be based simply on orientation tests
involving vi and the top two entries on the stack. When we connect vi by a diagonal, we just pop the stack.
Analysis: We claim that this algorithm runs in O(n) time. As we mentioned earlier, the sorted list of vertices can be
constructed in O(n) time through merging. The reflex chain is stored on a stack. In O(1) time per diagonal, we
can perform an orientation test to determine whether to add the diagonal and (assuming a DCEL) the diagonal
can be added in constant time. Since the number of diagonals is n − 3, the total time is O(n).
Monotone Subdivision: In order to run the above triangulation algorithm, we first need to subdivide an arbitrary
simple polygon P into monotone polygons. This is also done by a plane-sweep approach. We will add a set of
nonintersecting diagonals that partition the polygon into monotone pieces.
Observe that the absence of x-monotonicity occurs only at vertices in which the interior angle is greater than
180 degrees and both edges lie either to the left of the vertex or both to the right. Following our book’s notation,
we call the first type a merge vertex (since as the sweep passes over this vertex the edges seem to be merging)
and the latter type a split vertex.
Let’s discuss the case of a split vertex first (both edges lie to the right of the vertex). When a split vertex v is
encountered in the sweep, there will be an edge ea of the polygon lying above it and an edge eb lying below it.

[Figure: the sweep line crossing edges e1 through e6 at a split vertex v, with helper(e1), helper(e3), and helper(e5) indicated.]
helper(ea ) : Let eb be the edge of the polygon lying just below ea on the sweep line. The helper is the rightmost
vertically visible vertex below ea on the polygonal chain between ea and eb .
We join each split vertex to helper(ea), where ea is the edge of P immediately above the split vertex. Note that
it is possible that the helper is the left endpoint of ea. Also note that helper(ea) is defined with respect to the
current location of the sweep line; as the sweep line moves, the helper may change. Also, the helper is defined
only for those edges intersected by the sweep line.
Events: The endpoints of the edges of the polygon. These are sorted by increasing order of x-coordinates.
Since no new events are generated, the events may be stored in a simple sorted list (i.e., no priority queue
is needed).
Sweep status: The sweep line status consists of the list of edges that intersect the sweep line, sorted from top
to bottom. Our book notes that we actually only need to store edges such that the interior of the polygon
lies just below them (since these are the only edges for which helper() is evaluated).
These edges are stored in a dictionary (e.g., a balanced binary tree or a skip list), so that the operations of
insert, delete, find, predecessor and successor can be evaluated in O(log n) time each.
Event processing: There are six event types based on a case analysis of the local structure of edges around
each vertex. Let v be the current vertex encountered by the sweep.
Split vertex: Search the sweep line status to find the edge e lying immediately above v. Add a diagonal
connecting v to helper(e). Add the two edges incident to v to the sweep line status. Make v the
helper of the lower of these two edges, and make v the new helper of e.
Merge vertex: Find the two edges incident to this vertex in the sweep line status (they must be adjacent).
Delete them both. Let e be the edge lying immediately above them. Make v the new helper of e.
Start vertex: (Both edges lie to the right of v, but the interior angle is less than 180 degrees.) Insert this
vertex’s edges into the sweep line status. Set the helper of the upper edge to v.
[Figure: the local configurations at the vertex events, showing the edge e lying above each vertex v.]
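As a small illustration of the split-vertex case, the sketch below (hypothetical names, not code from these notes) adds the diagonal to helper(e) and updates the helpers as described above. For simplicity the status is kept as a plain Python list scanned in linear time, where a balanced tree would give the claimed O(log n) bound, and edges are assumed nonvertical.

```python
def y_at(edge, x):
    # y-coordinate of the (nonvertical) edge at sweep position x.
    (x1, y1), (x2, y2) = edge
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

def edge_above(status, v):
    # The status edge lying immediately above v at the sweep line.
    vx, vy = v
    spanning = [e for e in status
                if min(e[0][0], e[1][0]) <= vx <= max(e[0][0], e[1][0])
                and y_at(e, vx) > vy]
    return min(spanning, key=lambda e: y_at(e, vx))

def handle_split_vertex(v, e_upper, e_lower, status, helper, diagonals):
    # e_upper, e_lower: the two edges incident to v (both extend rightward).
    e = edge_above(status, v)
    diagonals.append((v, helper[e]))   # connect v to helper(e)
    helper[e] = v                      # v becomes the new helper of e
    status.extend([e_upper, e_lower])  # both incident edges enter the status
    helper[e_lower] = v                # v is the helper of the lower edge
```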
This only inserts diagonals to fix the split vertices. What about the merge vertices? This could be handled by
applying essentially the same algorithm using a reverse (right to left) sweep. It can be shown that this will never
introduce crossing diagonals, but it might attempt to insert the same diagonal twice. However, the book suggests
a simpler approach. Whenever we change a helper vertex, check whether the original helper vertex is a merge
vertex. If so, the new helper vertex is then connected to the merge vertex by a new diagonal. It is not hard to
show that this essentially has the same effect as a reverse sweep, and it is easier to detect the possibility of a
duplicate insertion (in case the new vertex happens to be a split vertex).
There are many special cases (what a pain!), but each one is fairly easy to deal with, so the algorithm is quite
efficient. As with previous plane sweep algorithms, it is not hard to show that the running time is O(log n) times
the number of events. In this case there is one event per vertex, so the total time is O(n log n). This gives us an
O(n log n) algorithm for polygon triangulation.
Halfplane Intersection: Today we begin studying another very fundamental topic in geometric computing, and along
the way we will show a rather surprising connection between this topic and the topic of convex hulls, which we
discussed earlier. Any line in the plane splits the plane into two regions, called halfplanes, one lying on either
side of the line. We may refer to a halfplane as being either closed or open depending on whether it contains the
line itself. For this lecture we will be interested in closed halfplanes.
How do we represent lines and halfplanes? For the cleanest and most general understanding of representing
lines, it is good to study projective geometry and homogeneous coordinates. However, for the sake of time, we
will skip this. Typically, it will suffice to represent lines in the plane using the following equation:
y = ax − b,
where a denotes the slope and b denotes the negation of the y-intercept. (We will see later why this representation
is convenient.) Unfortunately, it is not fully general, since it cannot represent vertical lines. A more general line
representation will generally involve three parameters, as in:
ax + by = c.

A closed halfplane can then be represented by an inequality of one of these forms:

y ≤ ax − b or ax + by ≤ c.

In the former case, this represents the halfplane lying below the line. The latter representation is more general,
since we can represent halfplanes on either side of the line, simply by multiplying all the coefficients by −1.
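As a quick illustration, a halfplane in the (a, b, c) form can be stored as a coefficient triple and tested against a point directly. This is a minimal sketch with my own encoding, not code from these notes:

```python
def contains(h, p):
    """h = (a, b, c) encodes the closed halfplane a*x + b*y <= c."""
    a, b, c = h
    x, y = p
    return a * x + b * y <= c

# The halfplane below the line y = 2x - 3, i.e. -2x + y <= -3:
below = (-2.0, 1.0, -3.0)
# The halfplane above the same line, obtained by negating all coefficients:
above = (2.0, -1.0, 3.0)
```

A point on the line itself (for example (0, −3)) satisfies both closed halfplanes.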
Halfplane intersection problem: The halfplane intersection problem is: given a set of n closed halfplanes H =
{h1, h2, . . . , hn}, compute their intersection. A halfplane (closed or open) is a convex set, and hence the
intersection of any number of halfplanes is also a convex set. Unlike the convex hull problem, the intersection of n
halfplanes may generally be empty or even unbounded. A reasonable output representation might be to list the
lines bounding the intersection in counterclockwise order, perhaps along with some annotation as to whether the
final figure is bounded or unbounded.
How many sides can bound the intersection of n halfplanes in the worst case? Observe that by convexity, each
of the halfplanes can appear only once as a side, and hence the maximum number of sides is n. How fast can
we compute the intersection of halfspaces? As with the convex hull problem, it can be shown through a suitable
reduction from sorting that the problem has a lower bound of Ω(n log n).
Who cares about this problem? Our book discusses a rather fanciful application in the area of casting. More
realistically, halfplane intersection and halfspace intersection in higher dimensions are used as a method for
generating convex shape approximations. In computer graphics for example, a bounding box is often used to
approximate a complex multi-sided polyhedral shape. If the bounding box is not visible from a given viewpoint
then the object within it is certainly not visible. Testing the visibility of a 6-sided bounding box is much easier
than a multi-sided nonconvex polyhedron, and so this can be used as a filter for a more costly test. A bounding
box is just the intersection of 6 axis-aligned halfspaces in 3-space. If more accurate, but still convex,
approximations are desired, then we may compute the intersection of a larger number of tight bounding halfspaces, in
various orientations, as the final approximation.
Solving the halfspace intersection problem in higher dimensions is quite a bit more challenging than in the
plane. For example, just storing the output as a cyclic sequence of bounding planes is not sufficient. In general
some sort of adjacency structure (a DCEL, for example) is needed.
We will discuss two algorithms for the halfplane intersection problem. The first is given in the text. For the
other, we will consider a somewhat simpler problem of computing what is called the lower envelope of a set
of lines, and show that it is closely related to the convex hull problem.
Divide-and-Conquer Algorithm: We begin by sketching a divide-and-conquer algorithm for computing the inter-
section of halfplanes. The basic approach is very simple: split the set of halfplanes into two groups of n/2 each,
recursively compute the intersection of each group (a convex polygonal region), and then intersect the two
resulting convex polygons.
The running time of the resulting algorithm is most easily described using a recurrence, that is, a recursively
defined equation. If we ignore constant factors, and assume for simplicity that n is a power of 2, then the running
time can be described as:

T(n) = 1, if n = 1,
T(n) = 2T(n/2) + S(n), if n > 1,
where S(n) is the time required to compute the intersection of two convex polygons whose total complexity
is n. If we can show that S(n) = O(n), then by standard results in recurrences it will follow that the overall
running time T (n) is O(n log n). (See CLR, for example.)
Intersecting Two Convex Polygons: The only nontrivial part of the process is implementing an algorithm that in-
tersects two convex polygons, C1 and C2 , into a single convex polygon. Note that these are somewhat special
convex polygons because they may be empty or unbounded.
We know that it is possible to compute the intersection of line segments in O((n + I) log n) time, where I is the
number of intersecting pairs. Two convex polygons cannot intersect in more than I = O(n) pairs. (This follows
from the observation that each edge of one polygon can intersect at most two edges of the other polygon by
convexity.) This would give an O(n log n) algorithm for computing the intersection and an O(n log² n) solution
for T(n), which is not as good as we would like.
There are two common approaches for intersecting convex polygons. Both essentially involve merging the two
boundaries. One works by a plane-sweep approach. The other involves a simultaneous counterclockwise sweep
around the two boundaries. The latter algorithm is described in O’Rourke’s book. We’ll discuss the plane-sweep
algorithm.
We perform a left-to-right plane sweep to compute the intersection. We begin by breaking the boundaries of the
convex polygons into their upper and lower chains. By convexity, the sweep line intersects each convex polygon
Ci in at most two points, and hence, there are at most four points in the sweep line status at any time. Thus we
do not need a dictionary for storing the sweep line status, a simple 4-element list suffices. Also, our event queue
need only be of constant size. At any point there are at most 8 possible candidates for the next event, namely,
the right endpoints of the four edges stabbed by the sweep line and the (up to four) intersection points of these
upper and lower edges of C1 with the upper and lower edges of C2 . Since there are only a constant number of
possible events, and each can be handled in O(1) time, the total running time is O(n).
[Figure: the plane sweep intersecting two convex polygons C1 and C2; the upper and lower envelopes of a set of lines.]
The lower envelope problem is a restriction of the halfplane intersection problem, but it is an interesting restriction.
Notice that any halfplane intersection problem that does not involve any vertical lines can be rephrased as the
intersection of two envelopes: a lower envelope defined by the lower halfplanes and an upper envelope defined
by the upper halfplanes.
I will show that solving the lower envelope problem is essentially equivalent to solving the upper convex hull
problem. In fact, they are so equivalent that exactly the same algorithm will solve both problems, without
changing even a single character of code. All that changes is the way in which you view the two problems.
Duality: Let us begin by considering lines in the plane. Each line can be represented in a number of ways, but for
now, let us assume the representation y = ax − b, for some scalar values a and b. We cannot represent vertical
lines in this way, and for now we will just ignore them. Later in the semester we will fix this up. Why did we
subtract b? We’ll see later that this is just a convenience.
Therefore, in order to describe a line in the plane, you need only give its two coordinates (a, b). In some sense,
lines in the plane can be thought of as points in a new plane in which the coordinate axes are labeled (a, b),
rather than (x, y). Thus the line y = 7x − 4 corresponds to the point (7, 4) in this new plane. Each point in this
new plane of “lines” corresponds to a nonvertical line in the original plane. We will call the original (x, y)-plane
the primal plane and the new (a, b)-plane the dual plane.
What is the equation of a line in the dual plane? Since the coordinate system uses a and b, we might write a line
in a symmetrical form, for example b = 3a − 5, where the values 3 and 5 could be replaced by any scalar values.
Consider a particular point p = (px , py ) in the primal plane, and consider the set of all nonvertical lines passing
through this point. Any such line must satisfy the equation py = apx − b. The images of all these lines in the
dual plane is a set of points:
L = {(a, b) | py = apx − b}
Notice that this set is just the set of points that lie on a line in the dual (a, b)-plane. (And this is why we negated
b.) Thus, not only do lines in the primal plane map to points in the dual plane, but there is a sense in which a
point in the primal plane corresponds to a line in the dual plane.
To make this all more formal, we can define a function that maps points in the primal plane to lines in the dual
plane, and lines in the primal plane to points in the dual plane. We denote it using an asterisk (∗) as a superscript.
Thus, given a point p = (px, py) and a line ℓ : (y = ℓa x − ℓb) in the primal plane, we define ℓ∗ and p∗ to be a point
and a line, respectively, in the dual plane, defined by:

ℓ∗ = (ℓa, ℓb)
p∗ : (b = px a − py).
[Figure: a duality example. Primal: the points p = (−2, 2), q = (−1, 1), r = (−2, −1), s = (1, −1) and the line L : (y = 0x − 1). Dual: the lines p∗ : (b = −2a − 2), q∗ : (b = −a − 1), r∗ : (b = −2a + 1), s∗ : (b = a + 1) and the point L∗ = (0, 1).]
We can define the same mapping from dual to primal as well. Duality has a number of interesting properties,
each of which is easy to verify by substituting the definition and a little algebra.
For example, to verify the order reversing property, observe that p lies above ℓ if and only if

py > ℓa px − ℓb,

which can be rewritten as ℓb > px ℓa − py, and this says exactly that the dual point ℓ∗ lies above the dual line p∗.
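Since a point (px, py) and its dual line b = px a − py are described by the same pair of numbers, the order reversing property can be checked with a single comparison routine used in both planes. This is a small sketch with my own encoding, using the values from the figure above:

```python
def above(point, line):
    """Is the point strictly above the line y = a*x - b, where line = (a, b)?"""
    x, y = point
    a, b = line
    return y > a * x - b

# Primal: p = (-2, 2) lies above L : y = 0x - 1, encoded as (0, 1).
p, L = (-2.0, 2.0), (0.0, 1.0)
# Dual: the point L* = (0, 1) and the line p* : b = -2a - 2 are the very
# same pairs, so order reversal is just the same test with roles swapped.
```

In other words, `above(p, L)` and `above(L, p)` always agree, which is the order reversing property in executable form.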
One rather cryptic feature of this proof is that, although the upper and lower hulls appear to be connected, the
upper and lower envelopes of a set of lines appear to consist of two disconnected sets. To make sense of this,
we should interpret the primal and dual planes from the perspective of projective geometry, and think of the
rightmost line of the lower envelope as “wrapping around” to the leftmost line of the upper envelope, and vice
versa. We will discuss projective geometry later in the semester.
Another interesting question is that of orientation. We know the orientation of three points is positive if the
points have a counterclockwise orientation. What does it mean for three lines to have a positive orientation?
(The definition of line orientation is exactly the same, in terms of a determinant of the coefficients of the lines.)
Linear Programming: Last time we considered the problem of computing the intersection of n halfplanes, and
presented an optimal O(n log n) algorithm for this problem. In many applications it is not important to know
the entire polygon (or generally the entire polytope in higher dimensions), but only to find one particular point
of interest on the polygon.
One particularly important application is that of linear programming. In linear programming (LP) we are given
a set of linear inequalities, or constraints, which we may think of as defining a (possibly empty, possibly un-
bounded) polyhedron in space, called the feasible region, and we are given a linear objective function, which is
to be minimized or maximized subject to the given constraints. A typical description of a d-dimensional linear
programming problem might be:
Maximize: c1 x1 + c2 x2 + · · · + cd xd
Subject to: a1,1 x1 + · · · + a1,d xd ≤ b1
a2,1 x1 + · · · + a2,d xd ≤ b2
...
an,1 x1 + · · · + an,d xd ≤ bn

where ai,j, ci, and bi are given real numbers. This can also be expressed in matrix notation:

Maximize: c^T x,
Subject to: Ax ≤ b.
[Figure: the feasible region of a linear program, the objective vector c, and the optimal vertex.]
Note that the magnitude of ~c is irrelevant, since the problem is unchanged for any positive scalar multiple of ~c.
In many of our examples, we will imagine that the vector ~c is pointing straight down (that is, ~c = (0, −1)) and
hence the problem is just that of finding the lowest point (minimum y-coordinate) of the feasible region. This
involves no loss of generality, since it is always possible to transform the problem into one in which ~c = (0, −1)
by an appropriate rotation.
Normally, this extreme point will be a vertex of the feasible region polyhedron, but there are some other possibil-
ities as well. The feasible region may be empty (in which case the linear programming problem is said to be
infeasible) and there is no solution. It may be unbounded, and if ~c points in the direction of the unbounded part
of the polyhedron, then there may be solutions with infinitely large values of the objective function. In this case
there is no (finite) solution, and the LP problem is said to be unbounded. Finally observe that in degenerate
situations, it is possible to have an infinite number of finite optimum solutions, because an edge or face of the
feasible region is perpendicular to the objective function vector. In such instances it is common to break ties by
requiring that the solution be lexicographically maximal (e.g. among all maximal solutions, take the one with
the lexicographically maximum vector).
Linear Programming in High Dimensional Spaces: Linear programming is a very important technique used in solv-
ing large optimization problems. Typical instances may involve hundreds to thousands of constraints in very
high dimensional space. It is without doubt one of the most important formulations of general optimization
problems.
The principal methods used for solving high-dimensional linear programming problems are the simplex algo-
rithm and various interior-point methods. The simplex algorithm works by finding a vertex on the feasible
polyhedron, then walking edge by edge downwards until reaching a local minimum. By convexity, the local
minimum is the global minimum. It has been long known that there are instances where the simplex algorithm
runs in exponential time. The question of whether linear programming was even solvable in polynomial time
was unknown until Khachiyan’s ellipsoid algorithm (late 70’s) and Karmarkar’s more practical interior-point
algorithm (mid 80’s).
2-dimensional LP: We will restrict ourselves to low dimensional instances of linear programming. There are a num-
ber of interesting optimization problems that can be posed as a low-dimensional linear programming problem, or
as closely related optimization problems. One which we will see later is the problem of finding a minimum
radius circle that encloses a given set of n points.
Let us consider the problem just in the plane. Here we know that there is an O(n log n) algorithm based on just
computing the feasible polyhedron and finding its lowest vertex. However, since we are only interested in one
point, it is natural to try an incremental approach: insert the halfplanes one by one in random order, maintaining
the optimum vertex of the intersection of the halfplanes inserted so far. Let vi−1 denote the optimum vertex after
the first i − 1 insertions. When the next halfplane hi is inserted, either vi−1 satisfies hi, and the optimum is
unchanged (Case 1), or a new optimum vertex vi must be computed (Case 2).

[Figure: Case 1 (vi = vi−1 remains feasible for hi) and Case 2 (the new optimum vi lies on the line bounding hi).]
The important observation is that (assuming that the feasible region is not empty) the new optimum vertex must
lie on the line that bounds hi . Call this line `i . (The book proves this formally, but here is an intuitive argument.
Suppose that the new optimum vertex does not lie on `i . Draw a line segment from vi−1 to the new optimum.
Observe (1) that as you walk along this segment the value of the objective function is decreasing monotonically
(by linearity), and (2) that this segment must cross `i (because it goes from being infeasible with respect to hi to
being feasible). Thus, it is maximized at the crossing point, which lies on `i .) Convexity and linearity are both
very important for the proof.
So this leaves the question of how we find the optimum vertex lying on line ℓi. This turns out to be a 1-
dimensional LP problem. Simply intersect each of the halfplanes with this line; each intersection can be
computed in O(1) time, so the resulting 1-dimensional LP over i − 1 constraints can be solved in O(i) time.
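Restricted to ℓi, each halfplane becomes a scalar inequality a·t ≤ b in the line's parameter t, and the 1-dimensional LP is solved by one pass that maintains the tightest lower and upper bounds. A minimal sketch with my own names (not code from these notes):

```python
import math

def lp_1d(constraints, maximize=True):
    """Optimize t subject to a*t <= b for each (a, b) in constraints.
    Returns the optimal t, None if infeasible, or +/-inf if unbounded."""
    lo, hi = -math.inf, math.inf
    for a, b in constraints:
        if a > 0:
            hi = min(hi, b / a)      # t <= b/a
        elif a < 0:
            lo = max(lo, b / a)      # t >= b/a
        elif b < 0:                  # 0*t <= b with b < 0 is unsatisfiable
            return None
    if lo > hi:                      # the bounds crossed: no feasible t
        return None
    return hi if maximize else lo
```

The single scan over the constraints is exactly the O(i) cost charged to Case 2 in the analysis below.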
Probabilistic Analysis: We analyze the running time in the expected case where we average over all n! possible
permutations. Each permutation has an equal probability of 1/n! of occurring, and an associated running time.
However, presenting the analysis as sum of n! terms does not lead to something that we can easily simplify. We
will apply a technique called backward analysis, which is quite useful.
To motivate how backward analysis works, let us consider a much simpler example, namely the problem of
computing the minimum of a set of n distinct numbers. We permute the numbers and inspect them in this order,
maintaining the minimum value seen so far. For example:

5 9 4 2 6 8 0 3 1 7
Let pi denote the probability that the minimum value changes upon inspecting the ith number of the random
permutation. Thus, with probability pi the minimum changes (contributing 1 change) and with probability
1 − pi it does not (contributing 0). The total expected number of changes is

C(n) = Σ_{i=1}^{n} (pi · 1 + (1 − pi) · 0) = Σ_{i=1}^{n} pi.
It suffices to compute pi . We reason as follows. Let Si be an arbitrary subset of i numbers from our initial set
of n. (In theory, the probability is conditional on the fact that the elements of Si represent the first i elements
to be chosen, but since the analysis will not depend on the particular choice of Si , it follows that the probability
that we compute will hold unconditionally.) Among all i! permutations of the elements of Si , in how many
of these does the minimum change when inspecting the ith value? The answer quite simply is that this only
happens for those sequences in which the minimum element is the last (ith) element of the sequence. Since the
minimum item appears with equal probability in each of the i positions of a random sequence, the probability
that it appears last is 1/i. Thus, pi = 1/i. From this we have
C(n) = Σ_{i=1}^{n} pi = Σ_{i=1}^{n} 1/i = ln n + O(1).
This summation is the Harmonic series, and the fact that it is nearly equal to ln n is well known.
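The pi = 1/i probability is easy to corroborate empirically. The following sketch (function names are my own) counts how often the running minimum changes over random permutations and compares the average with the Harmonic series:

```python
import random

def count_min_changes(seq):
    # Number of times the running minimum changes while scanning seq.
    changes, best = 0, float('inf')
    for x in seq:
        if x < best:
            best, changes = x, changes + 1
    return changes

def average_changes(n, trials, seed=0):
    # Average number of minimum-changes over random permutations of 0..n-1.
    rng = random.Random(seed)
    items, total = list(range(n)), 0
    for _ in range(trials):
        rng.shuffle(items)
        total += count_min_changes(items)
    return total / trials
```

On the example sequence above (5 9 4 2 6 8 0 3 1 7) the minimum changes four times (at 5, 4, 2, and 0), and for n = 100 the average over a few thousand trials comes out close to H_100 ≈ 5.19.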
This is called a backward analysis because the analysis depends on the last item to be considered (as opposed
to looking forward to the next). Let us try to apply this same approach to analyze the running time of the
randomized incremental linear programming algorithm. This time, let pi denote the probability that the insertion
of the ith hyperplane in the random order resulted in a change in the optimum vertex. With probability (1 − pi )
there is no change (Case 1), and it takes us O(1) time to determine this. (O(d) time in general in dimension d.)
With probability pi we need to invoke a 1-dimensional LP on a set of i − 1 halfplanes in dimension 1, with a
running time of O(i). (In general, if T (n, d) is the expected running time for n halfspaces in dimension d, then
this cost would be T (i − 1, d − 1).) Combining this we have a total expected running time of
T(n) = Σ_{i=1}^{n} ((1 − pi) · 1 + pi · i) ≤ n + Σ_{i=1}^{n} pi · i.
All that remains is to determine pi . We will apply the same technique. Let Si denote an arbitrary subset
consisting of i of the original halfplanes. Among all i! permutations of Si , in how many does the optimum
vertex change with the ith step? Let vi denote the optimum vertex for these i halfplanes. It is important to note
that vi only depends on the set Si and not on the order of their insertion. (If you do not see why this is important,
think about it.)
Assuming general position, there are two halfplanes h′ and h″ of Si passing through vi. If neither of these was
the last to be inserted, then vi = vi−1, and there is no change. If either h′ or h″ was the last to be inserted, then
vi did not exist yet, and hence the optimum must have changed as a result of this insertion. Thus, the optimum
changes if and only if either h′ or h″ was the last halfplane inserted. Since all of the i halfplanes are equally
likely to be last, this happens with probability 2/i. Therefore, pi = 2/i.
To illustrate this, consider the example shown in the following figure. We have i = 7 random halfplanes that
have been added so far, and vi = v is the current optimum vertex, defined by h4 and h8.

[Figure: i = 7 halfplanes with objective direction c; the current optimum v is defined by h′ = h4 and h″ = h8, and v′ is the optimum prior to the insertion of h8.]

Let's consider which halfplane was added last. If h5 was the last to be added, imagine the picture without h5. Prior to this v
would already have been the optimum vertex. Therefore, if h5 was added last, it would have been added with
O(1) cost.
On the other hand, if h8 was added last, then prior to its addition, we would have had a different optimum vertex,
namely v 0 . In summary, in 2 out of 7 cases (h4 and h8 ) on the ith insertion we need to solve a 1-dimensional LP
and in 5 out of 7 instances O(1) time suffices.
Returning to our analysis, since pi = 2/i we have

T(n) ≤ n + Σ_{i=1}^{n} pi · i = n + Σ_{i=1}^{n} (2/i) · i = n + 2n = O(n).
Therefore, the expected running time is linear in n. I’ll leave the analysis for the d-dimensional case as an
exercise. (Note that in dimension d the optimum vertex is defined by d halfspaces, not 2.)
Low-Dimensional Linear Programming: Last time we presented an O(n) time algorithm for linear programming in
the plane. We begin by observing that the same algorithm can be generalized to any dimension d. In particular,
let {h1, h2, . . . , hn} be a set of n closed halfspaces in dimension d, where each hi is defined by one linear
inequality of the form

hi : ai,1 x1 + ai,2 x2 + · · · + ai,d xd ≤ bi,

and ~c is a vector in d-space. We begin by selecting d halfspaces whose feasible region is bounded with respect
to ~c. This can be done by a generalization of the method used for the planar case. Next, we add halfspaces one
by one in random order. As before, if the current optimal vertex is feasible with respect to the latest halfspace,
there is no change. This can be checked in O(d) time. Otherwise, let ℓi denote the hyperplane that supports hi
(formed by changing the above inequality into an equality). We intersect all the halfspaces with ℓi (using Gaussian
elimination), and then solve the resulting (d − 1)-dimensional LP problem, involving i − 1 halfspaces, recursively,
and return the result.
The running time is derived in exactly the same way as for the 2-dimensional case. Let T (d, n) be the expected
running time of the algorithm for n halfspaces in dimension d. Let’s consider the ith stage. Assuming general
position, there are d halfspaces whose intersection (of the supporting hyperplanes) defines the optimum vertex.
T(1, n) = n

T(d, n) = Σ_{i=1}^{n} ( ((i − d)/i) · d + (d/i) · T(d − 1, i − 1) ).
We claim that T (d, n) ∈ O(d!n). The proof is by induction on d. This is clearly true in the basis case (d = 1).
In general we have
T(d, n) = Σ_{i=1}^{n} ( ((i − d)/i) · d + (d/i) · T(d − 1, i − 1) )
≤ dn + Σ_{i=1}^{n} (d/i) · (d − 1)! (i − 1)
= dn + d! Σ_{i=1}^{n} (i − 1)/i
≤ dn + d! n.
Oops, this is not quite what we wanted. We wanted T (d, n) ≤ d!n. However, a more careful (but messier)
induction proof will do the job. We leave this as an exercise.
Smallest Enclosing Disk: Although the vast majority of applications of linear programming are in relatively high
dimensions, there are a number of interesting applications in low dimensions. We will present one such example,
called the smallest enclosing disk problem. We are given n points in the plane and we are asked to find the closed
circular disk of minimum radius that encloses all of these points. We will present a randomized algorithm for
this problem that runs in O(n) expected time.
We should say a bit about terminology. A circle is the set of points that are equidistant from some center point.
A disk is the set of points lying within a circle. We can talk about open or closed disks to distinguish whether
the bounding circle itself is part of the disk. In higher dimensions the generalization of a circle is a sphere in
3-space, or hypersphere in higher dimensions. The set of points lying within a sphere or hypersphere is called a
ball.
Before discussing algorithms, we first observe that any circle is uniquely determined by three (noncollinear)
points, as the circumcircle of the triangle they define. We will not prove this, but it follows as an easy
consequence of linearization, which we will discuss later in the lecture.
Claim: For any finite set of points in general position (no four cocircular), the smallest enclosing disk either
has at least three points on its boundary, or it has two points, and these points form the diameter of the
circle. If there are three points then they subdivide the circle bounding the disk into arcs of angle at most
π.
Proof: Clearly if there are no points on the boundary the disk’s radius could be decreased. If there is only
one point on the boundary then this is also clearly true. If there are two points on the boundary, and they
are separated by an arc of length strictly less than π, then observe that we can find a disk that passes
through both points and has a slightly smaller radius. (Consider a disk whose center lies on the
perpendicular bisector of the two points, a small distance closer to the line segment joining the
points.)
Thus, none of these configurations could be a candidate for the minimum enclosing disk. Also observe
that if there are three points that define the smallest enclosing disk they subdivide the circle into three arcs
each of angle at most π (for otherwise we could apply the same operation above). Because points are in
general position we may assume there cannot be four or more cocircular points.
This immediately suggests a simple O(n⁴) time algorithm: enumerate all O(n³) triples of points (and, by the
claim, all O(n²) pairs for the diametral case), and for each generate the resulting circle and test in O(n)
additional time whether it encloses all the points. You might make a few observations to improve this a bit
(e.g. by using only triples of points on the convex hull). But even so a reduction from O(n⁴) to O(n) is quite dramatic.
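A sketch of this brute-force algorithm (the function names are my own, not from these notes; it tries both the two-point diametral disks and the three-point circumscribed disks, per the claim above):

```python
from itertools import combinations
import math

def circumcenter(p, q, r):
    """Center of the circle through p, q, r (None if collinear)."""
    ax, ay = p; bx, by = q; cx, cy = r
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        return None
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy)

def covers(center, radius, pts, eps=1e-9):
    # O(n) test: does the disk enclose every point?
    return all(math.dist(center, p) <= radius + eps for p in pts)

def smallest_enclosing_disk_bruteforce(pts):
    best = None                                  # (center, radius)
    for p, q in combinations(pts, 2):            # two-point (diametral) disks
        c = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
        r = math.dist(p, q) / 2
        if covers(c, r, pts) and (best is None or r < best[1]):
            best = (c, r)
    for p, q, s in combinations(pts, 3):         # three-point disks
        c = circumcenter(p, q, s)
        if c is None:
            continue
        r = math.dist(c, p)
        if covers(c, r, pts) and (best is None or r < best[1]):
            best = (c, r)
    return best
```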
Linearization: We can “almost” reduce this problem to a linear programming problem in 3-space. Although the
method does not work, it does illustrate the similarity between this problem and LP.
Recall that a point p = (px, py) lies within a circle with center point c = (cx, cy) and radius r if

(px − cx)² + (py − cy)² ≤ r².

In our case we are given n such points pi and are asked to determine whether there exist cx, cy and r satisfying
the resulting n inequalities, with r as small as possible. The problem is that these inequalities clearly involve
quantities like cx² and r², and so are not linear inequalities in the parameters of interest.
The technique of linearization can be used to fix this. First let us expand the inequality above and rearrange the
terms:

px² − 2px cx + cx² + py² − 2py cy + cy² ≤ r²
2px cx + 2py cy + (r² − cx² − cy²) ≥ px² + py².

Letting R = r² − cx² − cy², the constraint becomes

2px cx + 2py cy + R ≥ px² + py².
Observe that this is a linear inequality in cx, cy and R. If we let px and py range over the coordinates of
all n points we generate n linear inequalities in 3-space, and so we can apply linear programming to find
the solution, right? The only problem is that the previous objective function was to minimize r. However r is
no longer a parameter in the new version of the problem. Since r² = R + cx² + cy², and minimizing r is
equivalent to minimizing r² (since we are only interested in positive r), we could say that the objective is to
minimize R + cx² + cy². Unfortunately, this is not a linear function of the parameters cx, cy and R. Thus we are
left with an optimization problem in 3-space with linear constraints and a nonlinear objective function.
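Even so, the constraint rewriting itself can be sanity-checked numerically: with R = r² − cx² − cy², the linearized constraint should agree exactly with the direct disk-membership test. A small sketch with my own names (not code from these notes):

```python
def in_disk(p, c, r):
    """Does p lie in the closed disk with center c and radius r?"""
    return (p[0] - c[0])**2 + (p[1] - c[1])**2 <= r**2

def linearized_constraint(p, c, R):
    """2*px*cx + 2*py*cy + R >= px^2 + py^2, which is linear in (cx, cy, R).
    With R = r^2 - cx^2 - cy^2 this is equivalent to in_disk(p, c, r)."""
    return 2 * p[0] * c[0] + 2 * p[1] * c[1] + R >= p[0]**2 + p[1]**2
```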
This shows that LP is closely related, and so perhaps the same techniques can be applied.
Randomized Incremental Algorithm: Let us consider how we can modify the randomized incremental algorithm
for LP directly to solve this problem. The algorithm will mimic each step of the randomized LP algorithm.
To start we randomly permute the points. We select any two points and compute the unique circle with these points as diameter. (We could have started with three just as easily.) Let Di−1 denote the minimum enclosing disk after the insertion of the first i − 1 points.
Claim: If pi ∉ Di−1 then pi is on the boundary of the minimum enclosing disk for the first i points, Di.
Proof: The proof makes use of the following geometric observation. Given a disk of radius r1 and a circle of
radius r2 , where r1 < r2 , the intersection of the disk with the circle is an arc of angle less than π. This is
because an arc of angle π or more contains two (diametrically opposite) points whose distance from each
other is 2r2 , but the disk of radius r1 has diameter only 2r1 and hence could not simultaneously cover two
such points.
Now, suppose to the contrary that pi is not on the boundary of Di. It is easy to see that, because Di covers a point not covered by Di−1, Di must have a larger radius than Di−1. If we let r1 denote the radius of Di−1 and r2 denote the radius of Di, then by the above argument, the disk Di−1 intersects the circle
bounding Di in an arc of angle less than π. (Shown in a heavy line in the figure below.)
[Figure: the disks Di−1 and Di, with the new point pi on the boundary of Di; the arc of the circle bounding Di lying within Di−1 is drawn as a heavy line.]
Since pi is not on the boundary of Di, the points defining Di must be chosen from among the first i − 1 points, from which it follows that they all lie within this arc. However, this would imply that between two of these points there is an arc of angle greater than π (the arc not shown with a heavy line), and a disk whose defining boundary points all avoid an arc of angle greater than π can be shrunk, so by the earlier observation Di could not be a minimum enclosing disk, a contradiction.
The algorithm is identical in structure to the LP algorithm. We will randomly permute the points and insert them
one by one. For each new point pi , if it lies within the current disk then there is nothing to update. Otherwise,
we need to update the disk. We do this by computing the smallest enclosing disk that contains all the points
{p1 , . . . , pi−1 } and is constrained to have pi on its boundary. (The requirement that pi be on the boundary is
analogous to the constraint used in linear programming that optimum vertex lie on the line supporting the current
halfplane.)
This will involve a slightly different recursion. In this recursion, when we encounter a point that lies outside the
current disk, we will then recurse on a subproblem in which two points are constrained to lie on the boundary of
the disk. Finally, if this subproblem requires a recursion, we will have a problem in which there are three points
constrained to lie on the boundary of the disk. But this problem is trivial, since there is only one circle passing through three given (non-collinear) points.
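The three levels of the recursion just described can be sketched as follows. This is a simplified version (the function names are mine): it assumes at least two input points, does not handle degenerate collinear triples in the circumcircle computation, and uses a small epsilon to absorb floating-point error in the containment test.

```python
import random

def circle_two(p, q):
    # Smallest circle with p and q on its boundary: p and q form a diameter.
    cx, cy = (p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0
    r = ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5 / 2.0
    return (cx, cy), r

def circle_three(p, q, s):
    # Unique circle through three non-collinear points (circumcenter formula).
    ax, ay = p; bx, by = q; cx, cy = s
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy), ((ax - ux) ** 2 + (ay - uy) ** 2) ** 0.5

def inside(pt, disk, eps=1e-9):
    (cx, cy), r = disk
    return (pt[0] - cx) ** 2 + (pt[1] - cy) ** 2 <= (r + eps) ** 2

def min_disk(points):
    pts = points[:]
    random.shuffle(pts)                      # random insertion order
    disk = circle_two(pts[0], pts[1])
    for i in range(2, len(pts)):
        if not inside(pts[i], disk):
            disk = min_disk_one(pts[:i], pts[i])   # pts[i] on the boundary
    return disk

def min_disk_one(pts, p):
    # Minimum disk over pts, constrained to have p on its boundary.
    disk = circle_two(pts[0], p)
    for j in range(1, len(pts)):
        if not inside(pts[j], disk):
            disk = min_disk_two(pts[:j], p, pts[j])
    return disk

def min_disk_two(pts, p, q):
    # Two boundary points fixed; a third outside point determines the circle.
    disk = circle_two(p, q)
    for s in pts:
        if not inside(s, disk):
            disk = circle_three(p, q, s)     # trivial case: three boundary points
    return disk

random.seed(2)
center, radius = min_disk([(0, 0), (2, 0), (0, 2), (2, 2)])
```

For the four corners of a square, the result is the circumscribed circle: center (1, 1) and radius √2, regardless of the random insertion order.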
Range Queries: We will shift our focus from algorithm problems to data structures for the next few lectures. We
will consider the following class of problems. Given a collection of objects, preprocess them (storing the results
in a data structure of some variety) so that queries of a particular form can be answered efficiently. Generally
we measure data structures in terms of two quantities, the time needed to answer a query and the amount of
space needed by the data structure. Often there is a tradeoff between these two quantities, but most of the
structures that we will be interested in will have either linear or near linear space. Preprocessing time is an
issue of secondary importance, but most of the algorithms we will consider will have either linear or O(n log n) preprocessing time.
In a range query we are given a set P of points and a region R in space (e.g., a rectangle, polygon, halfspace, or disk) and are asked to list (or count, or compute some accumulation function of) the subset of P lying within the
region. To get the best possible performance, the design of the data structure is tailored to the particular type of
region, but there are some data structures that can be used for a wide variety of regions.
An important concept behind all geometric range searching is that the collection of subsets that can be formed by simple geometric ranges is much smaller than the collection of all possible subsets (the power set) of P. We can define any range search problem abstractly as follows. Given a particular class of ranges, a range space is a pair (P, R) consisting of the points P and the collection R of all subsets of P that can be formed by ranges of this class. For
example, the following figure shows the range space assuming rectangular ranges for a set of points in the plane.
In particular, note that the sets {1, 4} and {1, 2, 4} cannot be formed by rectangular ranges.
Today we consider orthogonal rectangular range queries, that is, ranges defined by rectangles whose sides are
aligned with the coordinate axes. One of the nice things about rectangular ranges is that they can be decomposed
into a collection of 1-dimensional searches.
Canonical Subsets: A common approach used in solving almost all range queries is to represent P as a collection of canonical subsets {S1, S2, . . . , Sk}, each Si ⊆ P (where k is generally a function of n and the type of ranges), such that the answer to any query can be formed as the disjoint union of canonical subsets. Note that these subsets may generally overlap each other.
There are many ways to select canonical subsets, and the choice affects the space and time complexities. For
example, the canonical subsets might be chosen to consist of n singleton sets, each of the form {pi }. This would
be very space efficient, since we need only O(n) total space to store all the canonical subsets, but in order to
answer a query involving k objects we would need k sets. (This might not be bad for reporting queries, but it would be too slow for counting queries.) At the other extreme, we might let the canonical subsets be the power
set of P . Now, any query could be answered with a single canonical subset, but we would have 2n different
canonical subsets to store. (A more realistic solution would be to use the set of all ranges, but this would still
be quite large for most interesting range spaces.) The goal of a good range data structure is to strike a balance
between the total number of canonical subsets (space) and the number of canonical subsets needed to answer a
query (time).
One-dimensional range queries: Before considering how to solve general range queries, let us consider how to answer 1-dimensional range queries, or interval queries. Let us assume that we are given a set of points P = {p1, p2, . . . , pn} on the line, which we will preprocess into a data structure. Then, given an interval [xlo, xhi], the goal is to report all the points lying within the interval. Ideally we would like to answer a query in O(log n + k) time, where k is the number of points reported (an output sensitive result). Range counting queries can be answered in O(log n) time with minor modifications.
Clearly one way to do this is to simply sort the points, apply binary search to find the first point of P that is greater than or equal to xlo and the last point that is less than or equal to xhi, and then list all the points between them. This will not generalize to higher dimensions, however.
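The sort-and-binary-search approach is a few lines with Python's bisect module (a sketch; the point values below are the keys from the figure further on):

```python
from bisect import bisect_left, bisect_right

def range_report(sorted_pts, xlo, xhi):
    i = bisect_left(sorted_pts, xlo)     # index of first point >= xlo
    j = bisect_right(sorted_pts, xhi)    # index just past last point <= xhi
    return sorted_pts[i:j]               # the k points inside [xlo, xhi]

pts = sorted([1, 3, 4, 7, 9, 12, 14, 15, 17, 20, 22, 24, 25, 27, 29, 31])
print(range_report(pts, 2, 23))   # -> [3, 4, 7, 9, 12, 14, 15, 17, 20, 22]
```

The two binary searches cost O(log n) and the slice costs O(k), matching the desired O(log n + k) bound; the point is that this one-dimensional trick does not extend directly to rectangles in the plane.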
Instead, sort the points of P in increasing order and store them in the leaves of a balanced binary search tree.
Each internal node of the tree is labeled with the largest key appearing in its left child. We can associate each
node of this tree (implicitly or explicitly) with the subset of points stored in the leaves that are descendents of
this node. This gives rise to the O(n) canonical subsets. For now, these canonical subsets will not be stored
explicitly as part of the data structure, but this will change later when we talk about range trees. This is illustrated
in the figure below.
We claim that the canonical subsets corresponding to any range can be identified in O(log n) time from this
structure. Given any interval [xlo , xhi ], we search the tree to find the leftmost leaf u whose key is greater than
or equal to xlo and the rightmost leaf v whose key is less than or equal to xhi . Clearly all the leaves between
u and v, together possibly with u and v, constitute the points that lie within the range. If key(u) = xlo then
we include u’s canonical (single point) subset and if key(v) = xhi then we do the same for v. To form the
remaining canonical subsets, we take the subsets of all the maximal subtrees lying between u and v.
Here is how to compute these subtrees. The search paths to u and v may generally share some common subpath, starting at the root of the tree. Once the paths diverge, as we follow the left path to u, whenever the path goes to the left child of some node, we add the canonical subset associated with its right child. Similarly, as we follow the right path to v, whenever the path goes to the right child, we add the canonical subset associated with its left child.

[Figure: a 1-dimensional range tree over the keys 1, 3, 4, 7, 9, 12, 14, 15, 17, 20, 22, 24, 25, 27, 29, 31. The query [xlo = 2, xhi = 23] reaches leaves u and v and selects the canonical subsets {3}, {4, 7}, {9, 12, 14, 15}, {17, 20}, and {22}.]
To answer a range reporting query we simply traverse these canonical subtrees, reporting the points of their
leaves. Each tree can be traversed in time proportional to the number of leaves in each subtree. To answer a
range counting query we store the total number of points in each subtree (as part of the preprocessing) and then
sum all of these over all the canonical subtrees.
Since the search paths are of length O(log n), it follows that O(log n) canonical subsets suffice to represent the
answer to any query. Thus range counting queries can be answered in O(log n) time. For reporting queries,
since the leaves of each subtree can be listed in time that is proportional to the number of leaves in the tree (a
basic fact about binary trees), it follows that the total time in the search is O(log n + k), where k is the number
of points reported.
In summary, 1-dimensional range queries can be answered in O(log n) time, using O(n) storage. This concept
of finding maximal subtrees that are contained within the range is fundamental to all range search data structures.
The only question is how to organize the tree and how to locate the desired sets. Let us see next how we can extend this to higher dimensional range queries.
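The canonical-subset computation above can be sketched directly. In this simplified version (the class and function names are mine), each node stores its canonical subset explicitly for clarity, which costs O(n log n) space; the structure described above keeps them implicit in O(n) space.

```python
class Node:
    def __init__(self, keys):                # keys given in sorted order
        self.keys = keys                     # canonical subset (leaves below)
        if len(keys) > 1:
            mid = len(keys) // 2
            self.split = keys[mid - 1]       # largest key in the left child
            self.left = Node(keys[:mid])
            self.right = Node(keys[mid:])
        else:
            self.left = self.right = None

def canonical_subsets(node, lo, hi, out):
    # Collect the maximal subtrees whose keys all lie in [lo, hi].
    if node is None or node.keys[-1] < lo or node.keys[0] > hi:
        return                               # subtree disjoint from the range
    if lo <= node.keys[0] and node.keys[-1] <= hi:
        out.append(node.keys)                # whole subtree inside the range
        return
    canonical_subsets(node.left, lo, hi, out)    # stabbed: recurse both ways
    canonical_subsets(node.right, lo, hi, out)

root = Node([1, 3, 4, 7, 9, 12, 14, 15, 17, 20, 22, 24, 25, 27, 29, 31])
subsets = []
canonical_subsets(root, 2, 23, subsets)
```

On the keys of the earlier example, the query [2, 23] yields exactly the canonical subsets {3}, {4, 7}, {9, 12, 14, 15}, {17, 20}, {22}: O(log n) of them, since only nodes stabbed by the range are recursed on.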
Kd-trees: The natural question is how to extend 1-dimensional range searching to higher dimensions. First we will
consider kd-trees. This data structure is easy to implement and quite practical and useful for many different
types of searching problems (nearest neighbor searching, for example). However it is not the asymptotically most efficient solution for orthogonal range searching, as we will see later.
Our terminology is a bit nonstandard. The data structure was designed by Jon Bentley. In his notation, these
were called “k-d trees,” short for “k-dimensional trees”. The value k was the dimension, and thus there are 2-d
trees, 3-d trees, and so on. However, over time, the specific value of k was lost. Our text uses the somewhat
nonstandard form “kd-tree” rather than “k-d tree.” By the way, there are many variants of the kd-tree concept.
We will describe the most commonly used one, which is quite similar to Bentley’s original design. In our trees,
points will be stored only at the leaves. There are variants in which points are stored at internal nodes as well.
The idea behind a kd-tree is to extend the notion of a one-dimensional tree, but at each node we subdivide space either by splitting along the x-coordinates of the points or along the y-coordinates. Each internal node t of the kd-tree is associated with a cutting dimension (the coordinate axis along which its points are split) and a cutting value (where the split occurs along that axis); both are discussed below.
[Figure: a kd-tree built on the points p1, . . . , p10: the planar subdivision (left) and the corresponding tree structure (right).]
The cutting process has a geometric interpretation. Each node of the tree is associated implicitly with a rectan-
gular region of space, called a cell. (In general these rectangles may be unbounded, but in many applications it
is common to restrict ourselves to some bounded rectangular region of space before splitting begins, and so all
these rectangles are bounded.) The cells are nested in the sense that a child’s cell is contained within its parent’s
cell. Hence, these cells define a hierarchical decomposition of space. This is illustrated in the figure above.
There are two key decisions in the design of the tree.
How is the cutting dimension chosen? The simplest method is to cycle through the dimensions one by one.
(This method is shown in the above figure.) Since the cutting dimension depends only on the level of a node in the tree, one advantage of this rule is that the cutting dimension need not be stored explicitly in each node; instead, we keep track of it while traversing the tree.
One disadvantage of this splitting rule is that, depending on the data distribution, this simple cyclic rule
may produce very skinny (elongated) cells, and such cells may adversely affect query times. Another
method is to select the cutting dimension to be the one along which the points have the greatest spread, defined to be the difference between the largest and smallest coordinates. Bentley calls the resulting tree an optimized kd-tree.
How is the cutting value chosen? To guarantee that the tree has height O(log n), the best method is to let the
cutting value be the median coordinate along the cutting dimension. If there are an even number of points
in the subtree, we may take either the upper or lower median, or we may simply take the midpoint between
these two points. In our example, when there are an odd number of points, the median is associated with
the left (or lower) subtree.
A kd-tree is a special case of a more general class of hierarchical spatial subdivisions, called binary space
partition trees (or BSP trees) in which the splitting lines (or hyperplanes in general) may be oriented in any
direction. In the case of BSP trees, the cells are convex polygons.
Constructing the kd-tree: It is possible to build a kd-tree in O(n log n) time by a simple top-down recursive procedure. The most costly step of the process is determining the median coordinate for splitting purposes. One way to do this is to maintain two lists of pointers to the points, one sorted by x-coordinate and the other containing pointers to the points sorted according to their y-coordinates. (In dimension d, d such arrays would be maintained.) Using these lists, it is an easy matter to find the median at each step in constant time, and in linear time we can split each list about this median value, so the entire construction runs in O(n log n) time.
The figure below shows an example of a range search. White nodes have been visited in the search. Light
shaded nodes were not visited because their cell is contained entirely within Q. Dark shaded nodes are not
visited because their cell is disjoint from Q.
Figure 43: Range search in a kd-tree. (Note: This particular tree was not generated by the algorithm described above.)
Analysis of query time: How many nodes does this method visit altogether? We claim that the total number of nodes is O(√n), assuming a balanced kd-tree. Recall from the discussion above that a node is processed (both children visited) if and only if the cell overlaps the range without being contained within the range. We say that such a cell is stabbed by the query. To bound the total number of nodes that are processed in the search, we first bound the number of nodes that are stabbed.
Theorem: Given a balanced kd-tree with n points, orthogonal range counting queries can be answered in O(√n) time and reporting queries can be answered in O(√n + k) time. The data structure uses O(n) space.
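The build-and-count procedure can be sketched compactly. This is my own minimal version, not the text's exact construction: it stores each node's bounding box rather than deriving cells from the cutting values, and it re-sorts at every level, so construction takes O(n log^2 n) rather than the O(n log n) achievable with the presorted-lists trick described above.

```python
class KDNode:
    def __init__(self, points, depth=0):
        pts = sorted(points, key=lambda p: p[depth % 2])  # cycle x, y, x, ...
        self.count = len(pts)                             # for counting queries
        # The cell: bounding box of the points stored in this subtree.
        self.lo = (min(p[0] for p in pts), min(p[1] for p in pts))
        self.hi = (max(p[0] for p in pts), max(p[1] for p in pts))
        if len(pts) == 1:
            self.left = self.right = None
        else:
            mid = len(pts) // 2                           # median split
            self.left = KDNode(pts[:mid], depth + 1)
            self.right = KDNode(pts[mid:], depth + 1)

def range_count(node, qlo, qhi):
    # Cell disjoint from the query rectangle: contributes nothing.
    if (node is None or node.lo[0] > qhi[0] or node.hi[0] < qlo[0]
            or node.lo[1] > qhi[1] or node.hi[1] < qlo[1]):
        return 0
    # Cell contained in the query rectangle: use the precomputed count.
    if (qlo[0] <= node.lo[0] and node.hi[0] <= qhi[0]
            and qlo[1] <= node.lo[1] and node.hi[1] <= qhi[1]):
        return node.count
    # Stabbed cell: recurse on both children.  (A leaf never reaches here,
    # since its cell is a single point, which is either disjoint or contained.)
    return range_count(node.left, qlo, qhi) + range_count(node.right, qlo, qhi)

tree = KDNode([(x, y) for x in range(4) for y in range(4)])
```

Only stabbed cells cause recursion, so on a balanced tree the number of nodes visited is bounded by the O(√n) stabbing argument above, plus one stored count per contained cell.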
Orthogonal Range Trees: Last time we saw that kd-trees could be used to answer orthogonal range queries in the plane in O(√n + k) time. Today we consider a better data structure, called orthogonal range trees.
An orthogonal range tree is a data structure which, in dimension d ≥ 2, uses O(n log^(d−1) n) space, and can answer orthogonal rectangular range queries in O(log^(d−1) n + k) time, where k is the number of points reported. Preprocessing time is the same as the space bound. Thus, in the plane, we can answer range queries in O(log n + k) time using O(n log n) space. We will present the data structure in two parts: first a version that can answer queries in O(log^2 n) time in the plane, and then we will show how to improve this in order to strip off a factor of log n from the query time.
Multi-level Search Trees: The data structure is based on the concept of a multi-level search tree. In this method,
a complex search is decomposed into a constant number of simpler range searches. We cascade a number of
search structures for simple ranges together to answer the complex range query. In this case we will reduce a
d-dimensional range search to a series of 1-dimensional range searches.
Suppose you have a query which can be stated as the intersection of a small number of simpler queries. For
example, a rectangular range query in the plane can be stated as two range queries: Find all the points whose x-
coordinates are in the range [Q.xlo , Q.xhi ] and all the points whose y-coordinates are in the range [Q.ylo , Q.yhi ].
Let us consider how to do this for 2-dimensional range queries, and then consider how to generalize the process.
First, we assume that we have preprocessed the data by building a range tree for the first range query, which in this case is just a 1-dimensional range tree for the x-coordinates. Recall that this is just a balanced binary tree over the x-coordinates of the points. Each node t of this tree is associated with its canonical subset S(t), the points stored in the leaves descended from t, and for each node t we build an auxiliary structure t.aux: a 1-dimensional range tree over the y-coordinates of the points of S(t).

[Figure: the two-level structure: the x-range tree, a node t, and its auxiliary y-range tree t.aux storing S(t), which is searched with the interval [Q.lo.y, Q.hi.y].]
What is the query time for a range tree? Recall that it takes O(log n) time to locate the nodes representing the canonical subsets for the 1-dimensional range query. For each, we invoke a 1-dimensional range search. Thus there are O(log n) canonical sets, each invoking an O(log n) range search, for a total time of O(log^2 n). As before, listing the elements of these sets can be performed in O(k) additional time by just traversing the trees. Counting queries can be answered by precomputing the subtree sizes, and then just adding them up.
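The two-level search can be sketched as follows. This is a simplified version (the names are mine): the auxiliary structure at each node is just a y-sorted list searched by binary search, standing in for the 1-dimensional y-range tree, which gives the same O(log^2 n) counting bound.

```python
from bisect import bisect_left, bisect_right

class RTNode:
    def __init__(self, pts):                  # pts must be sorted by x
        self.pts = pts
        self.ys = sorted(p[1] for p in pts)   # auxiliary structure on y
        if len(pts) > 1:
            mid = len(pts) // 2
            self.left, self.right = RTNode(pts[:mid]), RTNode(pts[mid:])
        else:
            self.left = self.right = None

def count_query(node, xlo, xhi, ylo, yhi):
    if node is None or node.pts[-1][0] < xlo or node.pts[0][0] > xhi:
        return 0                              # x-range disjoint from the query
    if xlo <= node.pts[0][0] and node.pts[-1][0] <= xhi:
        # Canonical subset for the x-query: answer the y-query by a
        # 1-dimensional search on the auxiliary y-sorted list.
        return bisect_right(node.ys, yhi) - bisect_left(node.ys, ylo)
    return (count_query(node.left, xlo, xhi, ylo, yhi)
            + count_query(node.right, xlo, xhi, ylo, yhi))

tree = RTNode(sorted([(1, 5), (2, 1), (3, 4), (4, 2),
                      (5, 8), (6, 3), (7, 7), (8, 6)]))
```

Each of the O(log n) canonical x-subsets triggers one O(log n) binary search on its y-list, and each point is counted in exactly one canonical subset.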
Space: The space used by the data structure is O(n log n) in the plane (and O(n log^(d−1) n) in dimension d). The reason comes by summing the sizes of the two data structures. The tree for the x-coordinates requires only O(n) storage. But we claim that the total storage in all the auxiliary trees is O(n log n). We want to count the total sizes of all these trees. The number of nodes in a tree is proportional to the number of leaves, and hence to the number of points stored in this tree. Rather than count the number of points in each tree separately, let us instead count the number of trees in which each point appears. This will give the same total. Observe that a point appears in the auxiliary trees of each of its ancestors. Since the tree is balanced, each point has O(log n) ancestors, and hence appears in O(log n) auxiliary trees, for a total of O(n log n) storage.

[Figure: a node v of the x-tree with children vL and vR; the y-sorted list of points p1, . . . , p6 stored at v is split between the y-sorted lists stored at vL and vR.]
At the root of the tree, we need to perform a binary search against all the y-values to determine which points lie within this interval. For all subsequent levels, once we know where the y-interval falls with respect to the ordered points here, we can drop down to the next level in O(1) time. Thus (as with fractional cascading) the running time is O(2 log n), rather than O(log^2 n). It turns out that this trick can only be applied to the last level of the search structure, because all other levels need the full tree search to compute the canonical sets.
Theorem: Given a set of n points in R^d, orthogonal rectangular range queries can be answered in O(log^(d−1) n + k) time, from a data structure of size O(n log^(d−1) n) which can be constructed in O(n log^(d−1) n) time.
Red-Blue Segment Intersection: We have been talking about the use of geometric data structures for solving query
problems. Often data structures are used as intermediate structures for solving traditional input/output problems,
which do not involve preprocessing and queries. (Another famous example of this is HeapSort, which introduces
the heap data structure for sorting a list of numbers.) Today we will discuss a variant of a useful data structure,
the segment tree. The particular variant is called a hereditary segment tree. It will be used to solve the following
problem.
Red-Blue Segment Intersection: Given a set B of m pairwise disjoint “blue” segments in the plane and a set R
of n pairwise disjoint “red” segments, count (or report) all bichromatic pairs of intersecting line segments
(that is, intersections between red and blue segments).
It will make things simpler to think of the segments as being open (not including their endpoints). In this way, the pairwise disjoint segments might be the edges of a planar straight line graph (PSLG). Indeed, one of the most important applications of red-blue segment intersection involves computing the overlay of two PSLGs (one red and the other blue). This is also called the map overlay problem, and is often used in geographic information systems. The most time consuming part of the map overlay problem is determining which pairs of segments intersect. See the figure below.
Let N = n + m denote the total input size and let k denote the total number of bichromatic intersecting pairs. We will present an algorithm for this problem that runs in O(k + N log^2 N) time for the reporting problem and O(N log^2 N) time for the counting problem. Both algorithms use O(N log N) space. Although we will not discuss it (but the original paper does), it is possible to remove a factor of log N from both the running time and space, using a somewhat more sophisticated variant of the algorithm that we will present.
Because the red segments are pairwise disjoint, as are the blue segments, it follows that we could solve the reporting problem by our plane sweep algorithm for segment intersection (as discussed in an earlier lecture) in O((N + k) log N) time and O(N) space. Thus, the more sophisticated algorithm is an improvement on this. However, plane sweep will not allow us to solve the counting problem.
The Hereditary Segment Tree: Recall that we are given two sets B and R, consisting of, respectively, m and n line
segments in the plane, and let N = m + n. Let us make the general position assumption that the 2N endpoints
of these line segments have distinct x-coordinates. The x-coordinates of these endpoints subdivide the x-axis
into 2N + 1 intervals, called atomic intervals. We construct a balanced binary tree whose leaves are in 1–1
correspondence with these intervals, ordered from left to right. Each internal node u of this tree is associated
with an interval Iu of the x-axis, consisting of the union of the intervals of its descendent leaves. We can think
of each such interval as a vertical slab Su whose intersection with the x-axis is Iu . (See the figure below, left.)
Figure 47: Hereditary Segment Tree: Intervals, slabs and the nodes associated with a segment.
We associate a segment s with a set of nodes of the tree. A segment is said to span interval Iu if its projection onto the x-axis contains Iu. The segment s is stored in the standard list of each node u such that s spans Iu but does not span the interval of u's parent. We partition each standard list by color: Bu denotes the blue segments stored at u and Ru the red segments stored at u.
Figure 48: Hereditary Segment Tree with standard lists (left) and hereditary lists (right).
Each node u is also associated with a list Bu∗, called the blue hereditary list, which is the union of the Bv over all proper descendants v of u. The red hereditary list Ru∗ is defined analogously. (Even though a segment may occur in the standard lists of many descendants, there is only one copy of each segment in the hereditary lists.)
The segments of Ru and Bu are called the long segments, since they span the entire interval. The segments of
Ru∗ and Bu∗ are called the short segments, since they do not span the entire interval.
By the way, if we ignored the fact that we have two colors of segments and just considered the standard lists,
the resulting tree is called a segment tree. The addition of the hereditary lists makes this a hereditary segment
tree. Our particular data structure differs from the standard hereditary segment tree in that we have partitioned
the various segment lists according to whether the segment is red or blue.
Time and Space Analysis: We claim that the total size of the hereditary segment tree is O(N log N ). To see this
observe that each segment is stored in the standard list of at most 2 log N nodes. The argument is very similar to
the analysis of the 1-dimensional range tree. If you locate the left and right endpoints of the segment among the
atomic intervals, these define two paths in the tree. In the same manner as canonical sets for the 1-dimensional
range tree, the segment will be stored in all the “inner” nodes between these two paths. (See the figure below.)
The segment will also be stored in the hereditary lists for all the ancestors of these nodes. These ancestors lie
along the two paths to the left and right, and hence there are at most 2 log N of them. Thus, each segment
appears in at most 4 log N lists, for a total size of O(N log N ).
The tree can be built in O(N log N) time. In O(N log N) time we can sort the 2N segment endpoints. Then for each segment, we search for its left and right endpoints and insert the segment into the standard and hereditary lists for the appropriate nodes, spending O(1) time per node as we descend each search path. Since each segment appears in O(log N) lists, this takes O(log N) time per segment and O(N log N) time overall.
Computing Intersections: Let us consider how to use the hereditary segment tree to count and report bichromatic intersections. We will do this on a node-by-node basis. Consider any node u. We classify the intersections into two types: long-long intersections are those between a segment of Bu and Ru, and long-short intersections are those between a segment of Bu∗ and Ru or between Ru∗ and Bu. Later we will show that by considering just these intersection cases, we will count every intersection exactly once.
Long-long intersections: Sort each of the lists Bu and Ru of long segments in ascending order by y-coordinate. (Since the segments of each set are disjoint, this order is constant throughout the interval for each set.) Let ⟨b1, b2, . . . , bmu⟩ and ⟨r1, r2, . . . , rnu⟩ denote these ordered lists. Merge these lists twice, once according to their order along the left side of the slab and once according to their order along the right side of the slab. Observe that for each blue segment b ∈ Bu, this allows us to determine two indices i and j, such that b lies between ri and ri+1 along the left boundary and between rj and rj+1 along the right boundary. (For convenience, we can think of r0 as an imaginary segment at y = −∞.)
It follows that if i < j then b intersects the red segments ri+1 , . . . , rj . (See the figure below, (a)). On the
other hand, if i ≥ j then b intersects the red segments rj+1 , . . . , ri . (See the figure below, (b)). We can
count these intersections in O(1) time or report them in time proportional to the number of intersections.
For example, consider the segment b = b2 in the figure below, (c). On the left boundary it lies between r3
and r4 , and hence i = 3. On the right boundary it lies between r0 and r1 , and hence j = 0. (Recall that r0
is at y = −∞.) Thus, since i ≥ j it follows that b intersects the three red segments {r1 , r2 , r3 }.
[Figure: long-long intersections: (a) the case i < j, where b crosses ri+1, . . . , rj; (b) the case i ≥ j, where b crosses rj+1, . . . , ri; (c) the example in which b2 lies between r3 and r4 on the left boundary and below r1 on the right boundary, so it intersects {r1, r2, r3}.]
The total time to do this is dominated by the O(mu log mu + nu log nu ) time needed to sort both lists. The
merging and counting only requires linear time.
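The index computation can be sketched in code. This is a simplification (the names are mine): binary searches stand in for the two linear-time merges, giving O(log nu) per blue segment rather than amortized O(1), and every segment is given as a pair of endpoints spanning the slab [x0, x1].

```python
from bisect import bisect_left

def y_at(seg, x):
    # y-coordinate of the segment's supporting line at abscissa x.
    (ax, ay), (bx, by) = seg
    return ay + (by - ay) * (x - ax) / (bx - ax)

def count_long_long(blues, reds, x0, x1):
    # Orders of the red segments along the left and right walls of the slab.
    left = sorted(y_at(r, x0) for r in reds)
    right = sorted(y_at(r, x1) for r in reds)
    total = 0
    for b in blues:
        i = bisect_left(left, y_at(b, x0))    # reds below b on the left wall
        j = bisect_left(right, y_at(b, x1))   # reds below b on the right wall
        total += abs(i - j)                   # b crosses exactly |i - j| reds
    return total

# Two disjoint long reds at y = 0 and y = 2 across the slab [0, 1].
reds = [((0, 0), (1, 0)), ((0, 2), (1, 2))]
```

A blue that starts between the two walls' orders in different positions must cross every red whose rank it passes, which is exactly the |i − j| count in the text.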
Long-short intersections: There are two types of long-short intersections to consider. Long red and short blue,
and long blue and short red. Let us consider the first one, since the other one is symmetrical.
As before, sort the long segments of Ru in ascending order according to y-coordinate, letting hr1 , r2 , . . . , rnu i
denote this ordered list. These segments naturally subdivide the slab into nu + 1 trapezoids. For each short
segment b ∈ Bu∗ , perform two binary searches among the segments of Ru to find the lowest segment ri
and the highest segment rj that b intersects. (See the figure above, right.) Then b intersects all the red
segments ri , ri+1 , . . . , rj .
Thus, after O(log nu ) time for the binary searches, the segments of Ru intersecting b can be counted in
O(1) time, for a total time of O(mu∗ log nu). Reporting can be done in time proportional to the number of intersections reported.

[Figure: a short blue segment b and the long red segments ri, . . . , rj that it intersects within the slab.]

Adding this to the time for the long blue and short red case, we have a total time complexity of O(mu∗ log nu + nu∗ log mu).
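The two binary searches can be sketched as follows (an illustrative version with my own names): reds_sorted is the list ⟨r1, . . . , rnu⟩ in bottom-to-top order, and the short blue segment's two endpoints both lie inside the slab.

```python
def y_at(seg, x):
    # y-coordinate of the segment's supporting line at abscissa x.
    (ax, ay), (bx, by) = seg
    return ay + (by - ay) * (x - ax) / (bx - ax)

def reds_below(point, reds_sorted):
    # Binary search for the number of long red segments lying below `point`.
    # Valid because the reds are disjoint and span the slab, so their
    # bottom-to-top order is the same at every x inside the slab.
    lo, hi = 0, len(reds_sorted)
    while lo < hi:
        mid = (lo + hi) // 2
        if y_at(reds_sorted[mid], point[0]) < point[1]:
            lo = mid + 1
        else:
            hi = mid
    return lo

def count_long_short(short_blue, reds_sorted):
    # A red separates the blue's endpoints iff the blue crosses it, so the
    # number of crossings is the difference of the endpoints' ranks.
    p, q = short_blue
    return abs(reds_below(p, reds_sorted) - reds_below(q, reds_sorted))

# Two disjoint long reds at y = 0 and y = 2 spanning the slab [0, 1].
reds = [((0, 0), (1, 0)), ((0, 2), (1, 2))]
```

The difference of the two ranks equals j − i + 1 in the text's notation: the count of reds from the lowest to the highest that the short blue segment intersects.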
If we let Nu = mu + nu + mu∗ + nu∗, then observe that the total time to process node u is O(Nu log Nu). Summing this over all nodes of the tree, and recalling that Σu Nu = O(N log N), we have a total time complexity of

T(N) = Σu Nu log Nu ≤ (Σu Nu) log N = O(N log^2 N).
Correctness: To show that the algorithm is correct, we assert that each bichromatic intersection is counted exactly
once. For any bichromatic intersection between bi and rj consider the leaf associated with the atomic interval
containing this intersection point. As we move up to the ancestors of this leaf, we will encounter bi in the
standard list of one of these ancestors, denoted ui , and will encounter rj at some node, denoted uj . If ui = uj
then this intersection will be detected as a long-long intersection at this node. Otherwise, one is a proper ancestor of the other, and the intersection will be detected as a long-short intersection (with the segment at the ancestor long and the one at the descendant short).
Point Location: The point location problem (in 2-space) is: given a polygonal subdivision of the plane (that is, a
PSLG) with n vertices, preprocess this subdivision so that given a query point q, we can efficiently determine
which face of the subdivision contains q. We may assume that each face has some identifying label, which is to
be returned. We also assume that the subdivision is represented in any “reasonable” form (e.g. as a DCEL). In
general q may coincide with an edge or vertex. To simplify matters, we will assume that q does not lie on an
edge or vertex, but these special cases are not hard to handle.
Our goal is to develop a data structure with O(n) space that can answer queries in O(log n) time. For many years the best methods known had an extra log factor, either in the space or in the query time. Kirkpatrick achieved a breakthrough by presenting a time/space optimal algorithm. Kirkpatrick's algorithm has fairly high constant factors. Somewhat simpler and more practical optimal algorithms have been discovered since then. We will present perhaps the simplest and most practical of the known optimal algorithms. The method is based on a randomized incremental construction, the same technique used in our linear programming algorithm.
Trapezoidal Map: The algorithm is based on a construction called a trapezoidal map (which also goes under many other names in the computational geometry literature). Although we normally think of the input to a point location algorithm as being a planar polygonal subdivision (or PSLG), we will define the algorithm under the assumption that the input is simply a set of pairwise disjoint line segments: from each segment endpoint we shoot a vertical bullet path upwards and downwards until it hits another segment or the enclosing bounding box.
Observe that all the faces of the resulting subdivision are trapezoids with vertical sides. The left or right side might degenerate to a line segment of length zero, implying that the resulting trapezoid degenerates to a triangle.
We claim that the process of converting an arbitrary polygonal subdivision into a trapezoidal decomposition
increases its size by at most a constant factor. Actually this follows from the facts that we only increase the
number of vertices by a constant factor and the graph is planar. But since constant factor expansions in space
are significant, it is a good idea to work this through carefully. We assume that the final trapezoidal map will be
given as a polygonal subdivision of the plane (represented, say, using a DCEL).
Claim: Given a polygonal subdivision with n segments, the resulting trapezoidal map has at most 6n + 4
vertices and 3n + 1 trapezoids.
Proof: To prove the bound on the number of vertices, observe that each vertex shoots two bullet paths, each of
which will result in the creation of a new vertex. Thus each original vertex gives rise to three vertices in
the final map. Since each segment has two vertices, this implies at most 6n vertices. The remaining four
vertices are the corners of the bounding box.
To bound the number of trapezoids, observe that for each trapezoid in the final map, its left side (and its
right as well) is bounded by a vertex of the original polygonal subdivision. The left endpoint of each line
segment can serve as the left bounding vertex for two trapezoids (one above the line segment and the other
below) and the right endpoint of a line segment can serve as the left bounding vertex for one trapezoid.
Thus each segment of the original subdivision gives rise to at most three trapezoids, for a total of 3n
trapezoids. The last trapezoid is the one bounded by the left side of the bounding box.
An important fact to observe about each trapezoid is that it is defined (that is, its existence is determined) by
exactly four entities from the original subdivision: a segment on top, a segment on the bottom, a bounding
vertex on the left, and a bounding vertex on the right. This simple observation will play an important role in the
analysis.
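Since a trapezoid is determined by exactly these four entities, it can be represented as a simple record. The following is a minimal Python sketch (the Trapezoid class and the string placeholders are our own illustration, not from the text):

```python
from dataclasses import dataclass
from typing import Any

# A hypothetical record for one trapezoid of the map.  Following the text,
# a trapezoid is determined by exactly four entities of the subdivision:
# the segment bounding it above, the segment bounding it below, and the
# endpoints defining its left and right vertical sides.
@dataclass(frozen=True)
class Trapezoid:
    top: Any      # segment bounding the trapezoid from above
    bottom: Any   # segment bounding the trapezoid from below
    leftp: Any    # endpoint defining the left vertical side
    rightp: Any   # endpoint defining the right vertical side

# Because the four fields determine the trapezoid, two records with the
# same fields denote the same trapezoid; frozen dataclasses give us this
# equality (and hashability) for free.
t1 = Trapezoid("s_top", "s_bot", "p", "q")
t2 = Trapezoid("s_top", "s_bot", "p", "q")
print(t1 == t2)   # True
```

The observation that four entities suffice is exactly what makes the backwards analysis below work: each trapezoid can be "blamed" on at most four segments.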
Trapezoidal decompositions, like triangulations, are interesting data structures in their own right. They are another
example of the idea of converting a complex shape into a disjoint collection of simpler objects. The fact that
the sides are vertical makes trapezoids simpler than arbitrary quadrilaterals. Finally observe that the trapezoidal
decomposition is a refinement of the original polygonal subdivision, and so once we know which face of the
trapezoidal map a query point lies in, we will know which face of the original subdivision it lies in (either
implicitly, or because we label each face of the trapezoidal map in this way).
Observe that the structure of the trapezoidal decomposition does not depend on the order in which the segments
are added. This observation will be important for the probabilistic analysis. The following is also important to
the analysis.
Claim: Ignoring the time spent to locate the left endpoint of a segment, the time that it takes to insert the ith
segment and update the trapezoidal map is O(ki ), where ki is the number of newly created trapezoids.
Proof: Consider the insertion of the ith segment, and let K denote the number of bullet paths that this segment
intersects. We need to shoot four bullets (two from each endpoint) and then trim each of the K bullet paths,
for a total of K + 4 operations that need to be performed. If the new segment did not cross any of the
bullet paths, then we would get exactly four new trapezoids. For each of the K bullet paths we cross, we
add one more to the number of newly created trapezoids, for a total of K + 4. Thus, letting ki = K + 4 be
the number of trapezoids created, the number of update operations is exactly ki . Each of these operations
can be performed in O(1) time given any reasonable representation of the trapezoidal map (e.g. a DCEL).
Analysis: We left one important detail out, namely, how we locate the trapezoid containing the left endpoint of each new
segment that we add. Let’s ignore this for now. (We will see later that this is O(log n) time on average). We
will show that the expected time to add each new segment is O(1). Since there are n insertions, this will lead to
a total expected time complexity of O(n(1 + log n)) = O(n log n).
We know that the size of the final trapezoidal map is O(n). It turns out that the total size of the point location data
structure will actually be proportional to the number of new trapezoids that are created with each insertion. In the
following lemma, we bound the expected number of such trapezoids.
Lemma: Consider the randomized incremental construction of a trapezoidal map, and let ki denote the number
of new trapezoids created when the ith segment is added. Then E[ki ] = O(1), where the expectation is
taken over all permutations of the segments.
Proof: The analysis will be based on a backwards analysis. Recall that such an analysis is based on analyzing
the expected value assuming that the last insertion was random.
Let Ti denote the trapezoidal map after the insertion of the ith segment. Because we are averaging over all
permutations, among the i segments that are present in Ti , each one has an equal probability 1/i of being
the last one to have been added. For each of the segments s we want to count the number of trapezoids that
would have been created, had s been the last segment to be added. Let’s say that a trapezoid ∆ depends
on a segment s if s would have caused ∆ to be created, had s been added last. We want to count the
number of trapezoids that depend on each segment, and then compute the average over all segments. If we
let δ(∆, s) = 1 if ∆ depends on s, and 0 otherwise, then the expected complexity is
E[ki ] = (1/i) Σ_{s∈Si} Σ_{∆∈Ti} δ(∆, s).
Some segments might have resulted in the creation of lots of trapezoids and other very few. How do we
get a handle on this quantity? The trick is, rather than count the number of trapezoids that depend on each
segment, we count the number of segments that each trapezoid depends on. (The old combinatorial trick of
reversing the order of summation.) In other words we want to compute:
E[ki ] = (1/i) Σ_{∆∈Ti} Σ_{s∈Si} δ(∆, s).
This is much easier to determine. In particular, each trapezoid is bounded by at most four sides (recall that
degenerate trapezoids are possible). The top and bottom sides are each determined by a segment of Si ,
and clearly if either of these was the last to be added, then this trapezoid would have come into existence
as a result. The left and right sides are each determined by an endpoint of a segment in Si , and clearly if
either of these was the last to be added, then this trapezoid would have come into existence. Thus, each
trapezoid depends on at most four segments, so the inner sum is at most 4. Since Ti contains at most
3i + 1 trapezoids, we have E[ki ] ≤ 4(3i + 1)/i = O(1).
Since the expected number of new trapezoids created with each insertion is O(1), it follows that the total
number of trapezoids that are created in the entire process is O(n). This fact is important in bounding the
total time needed for the randomized incremental algorithm. The only question that we have not considered in
the construction is how to locate the trapezoid that contains the left endpoint of each newly added segment. We will
consider this question, and the more general question of how to do point location next time.
Point Location: Last time we presented a randomized incremental algorithm for building a trapezoidal map. Today
we consider how to modify this algorithm to answer point location queries for the resulting trapezoidal de-
composition. The preprocessing time will be O(n log n) in the expected case (as was the time to construct the
trapezoidal map), and the space and query time will be O(n) and O(log n), respectively, in the expected case.
Note that this may be applied to any spatial subdivision, by treating it as a set of line segments, and then building
the resulting trapezoidal decomposition and using this data structure.
Recall that we treat the input as a set of segments S = {s1 , . . . , sn } (permuted randomly), that Si denotes the
subset consisting of the first i segments of S, and Ti denotes the trapezoidal map of Si . One important element
of the analysis to remember from last time is that each time we add a new line segment, it may result in the
creation of a collection of new trapezoids, which are said to depend on this line segment. We presented a
backwards analysis showing that the number of new trapezoids that are created with each stage is expected to be O(1).
This will play an important role in today’s analysis.
Point Location Data Structure: The point location data structure is based on a rooted directed acyclic graph. Each
node will have either two or zero outgoing edges. Nodes with zero outgoing edges are called leaves. There will be
one leaf for each trapezoid in the map. The other nodes are called internal nodes, and they are used to guide the
search to the leaves. This is not a binary tree, however, because subtrees may be shared.
There are two types of internal nodes, x-nodes and y-nodes. Each x-node contains the x-coordinate x0 of an
endpoint of one of the segments. Its two children correspond to the points lying to the left and to the right of
the vertical line x = x0 . Each y-node contains a pointer to a line segment of the subdivision. The left and right
children correspond to whether the query point is above or below the line containing this segment, respectively.
Note that the search will reach a y-node only if we have already verified that the x-coordinate of the query point
lies within the vertical slab that contains this segment.
Our construction of the point location data structure mirrors the incremental construction of the trapezoidal map.
In particular, if we freeze the construction just after the insertion of any segment, the current structure will be a
point location structure for the current trapezoidal map. In the figure below we show a simple example of what
the data structure looks like for two line segments. The circular nodes are the x-nodes and the hexagonal nodes
are the y-nodes. There is one leaf for each trapezoid. For example, if the
query point is in trapezoid D, we would first detect that it is to the right of endpoint p1 , then left of q1 , then below
s1 (the right child), then right of p2 , then above s2 (the left child).
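To make the search concrete, here is a minimal Python sketch of such a DAG (the node class names XNode, YNode, Leaf and the example segment are our own illustration; the real structure shares subtrees among multiple parents):

```python
# A minimal sketch of the point location DAG described above.
class Leaf:
    def __init__(self, trapezoid):
        self.trapezoid = trapezoid

class XNode:
    # Branch on the x-coordinate x0 of a segment endpoint: left child for
    # query points with x < x0, right child otherwise.
    def __init__(self, x0, left, right):
        self.x0, self.left, self.right = x0, left, right

class YNode:
    # Branch on whether the query point lies above or below the line
    # supporting the segment from p to q (p is the left endpoint).
    def __init__(self, p, q, above, below):
        self.p, self.q, self.above, self.below = p, q, above, below

def locate(node, qx, qy):
    """Follow the DAG from the root to the leaf whose trapezoid contains q."""
    while not isinstance(node, Leaf):
        if isinstance(node, XNode):
            node = node.left if qx < node.x0 else node.right
        else:  # YNode: orientation test against the supporting line
            (px, py), (rx, ry) = node.p, node.q
            cross = (rx - px) * (qy - py) - (ry - py) * (qx - px)
            node = node.above if cross > 0 else node.below
    return node.trapezoid

# Example: a single segment s1 from (0, 0) to (4, 0).  Queries left of the
# slab reach leaf "A"; inside the slab, "C" (above s1) or "D" (below s1);
# right of the slab, "B".
s1 = ((0.0, 0.0), (4.0, 0.0))
root = XNode(0.0, Leaf("A"),
             XNode(4.0, YNode(s1[0], s1[1], Leaf("C"), Leaf("D")), Leaf("B")))
print(locate(root, 2.0, 1.0))   # "C": right of x=0, left of x=4, above s1
```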
Incremental Construction: The question is how do we build this data structure incrementally? First observe that
when a new line segment is added, we only need to adjust the portion of the tree that involves the trapezoids that
have been deleted as a result of this new addition. Each trapezoid that is deleted will be replaced with a small search
structure that determines which of the newly created trapezoids contains the query point.
Suppose that we add a line segment s. This results in the replacement of an existing set of trapezoids with a
set of new trapezoids. As a consequence, we will replace the leaves associated with each such deleted trapezoid
with a small search structure, which locates the new trapezoid that contains the query point. There are three
cases that arise, depending on how many endpoints of the segment lie within the current trapezoid.
Single (left or right) endpoint: A single trapezoid A is replaced by three trapezoids, denoted X, Y , and Z.
Letting p denote the endpoint, we create an x-node for p, and one child is a leaf node for the trapezoid X
that lies outside vertical projection of the segment. For the other child, we create a y-node whose children
are the trapezoids Y and Z lying above and below the segment, respectively. (See the figure below left.)
No segment endpoints: This happens when the segment cuts completely through a trapezoid. A single trape-
zoid is replaced by two trapezoids, one above and one below the segment, denoted Y and Z. We replace
the leaf node for the original trapezoid with a y-node whose children are leaf nodes associated with Y and
Z. (This case is not shown in the figure.)
Two segment endpoints: This happens when the segment lies entirely inside the trapezoid. In this case one
trapezoid A is replaced by four trapezoids, U , X, Y , and Z. Letting p and q denote the left and right
endpoints of the segment, we create two x-nodes, one for p and the other for q. We create a y-node for
the line segment, and join everything together as shown in the figure.
Figure 56: Line segment insertion and updates to the point location structure. The single-endpoint case (left) and the
two-endpoint case (right). The no-endpoint case is not shown.
It is important to notice that (through sharing) each trapezoid appears exactly once as a leaf in the resulting
structure. An example showing the complete transformation to the data structure after adding a single segment
is shown in the figure below.
Analysis: We claim that the size of the point location data structure is O(n) and the query time is O(log n), both in
the expected case. As usual, the expectation depends only on the order of insertion, not on the line segments or
the location of the query point.
To prove the space bound of O(n), observe that the number of new nodes added to the structure with each new
segment is proportional to the number of newly created trapezoids. Last time we showed that with each new
insertion, the expected number of trapezoids that were created was O(1). Therefore, we add O(1) new nodes
with each insertion in the expected case, implying that the total size of the data structure is O(n).
Analyzing the query time is a little subtler. In a normal probabilistic analysis of data structures we think of the
data structure as being fixed, and then compute expectations over random queries. Here the approach will be to
imagine that we have exactly one query point to handle. The query point can be chosen arbitrarily (imagine an
adversary that tries to select the worst-possible query point) but this choice is made without knowledge of the
random choices the algorithm makes. We will show that for any query point, most random orderings of the line
segments will lead to a search path of length O(log n) in the resulting tree.
Let q denote the query point. Rather than consider the search path for q in the final search structure, we will
consider how q moves incrementally through the structure with the addition of each new line segment. Let
∆i denote the trapezoid of the map that q lies in after the insertion of the first i segments. Observe that if
∆i−1 = ∆i , then insertion of the ith segment did not affect the trapezoid that q was in, and therefore q will stay
where it is relative to the current search structure. (For example, if q was in trapezoid B prior to adding s3 in the
figure above, then the addition of s3 does not incur any additional cost to locating q.) However, if ∆i−1 ≠ ∆i ,
then the insertion of the ith segment caused q’s trapezoid to be deleted. As a result, q must locate itself with
respect to the newly created trapezoids that overlap ∆i−1 . Since there are a constant number of such trapezoids
(at most four), there will be O(1) work needed to locate q with respect to these. In particular, q may fall as much
as three levels in the search tree. (The worst case occurs in the two-endpoint case, if the query point falls into
one of the trapezoids X or Y lying above or below the segment.)
To compute the expected length of the search path, it suffices to bound the probability Pi that the insertion of
the ith segment changes the trapezoid containing q. We will show that Pi ≤ 4/i. From this it will follow that the
expected path length is at most
Σ_{i=1}^{n} 3(4/i) = 12 Σ_{i=1}^{n} (1/i),
which is roughly 12 ln n = O(log n), by standard bounds on the Harmonic series.
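The harmonic bound is easy to check numerically. The helper below (our own naming, a quick sanity check rather than part of the analysis) evaluates 12·H_n, which stays within an additive constant of 12 ln n:

```python
import math

def expected_depth_bound(n):
    # The bound from the analysis: sum over i of 3 * (4/i) = 12 * H_n,
    # where H_n is the nth Harmonic number.
    return 12.0 * sum(1.0 / i for i in range(1, n + 1))

# H_n = ln n + O(1), so the expected search depth grows as O(log n).
for n in (10, 1000, 100000):
    print(n, expected_depth_bound(n), 12 * math.log(n))
```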
Lemma: Given a set of n non-crossing line segments in the plane, and a parameter λ > 0, the probability that
the total depth of the randomized search structure exceeds 3λ ln(n + 1), is at most 2/(n + 1)^(λ ln 1.25 − 3) .
For example, for λ = 20, the probability that the search path exceeds 60 ln(n + 1) is at most 2/(n + 1)^1.5 . (The
constant factors here are rather weak, but a more careful analysis leads to a better bound.)
Nonetheless, this itself is enough to lead to a variant of the algorithm for which O(log n) time is guaranteed.
Rather than just running the algorithm once and taking what it gives, keep rerunning it and checking the
structure’s depth. As soon as the depth is at most c log n for some suitably chosen c, stop. Depending
on c and n, the above lemma indicates how long you may need to expect to repeat this process until the final
structure has the desired depth. For sufficiently large c, the probability of finding a tree of the desired depth will
be bounded away from 0 by some constant factor, and therefore after a constant number of trials (depending on
this probability) you will eventually succeed in finding a point location structure of the desired depth. A similar
argument can be applied to the space bounds.
Theorem: Given a set of n non-crossing line segments in the plane, in expected O(n log n) time, it is possible
to construct a point location data structure of (worst case) size O(n) that can answer point location queries
in (worst case) time O(log n).
Euclidean Geometry: We will now make a subtle but important shift. Up to now, virtually everything that we have
done has not needed the notion of angles, lengths, or distances (except for our work on circles). All geometric
tests could be reduced to comparisons and orientation tests. From now on, distances will matter. The distance
between two points p and q, denoted dist(p, q) or |pq|, is defined to be |p − q|, the Euclidean length of the
vector p − q.
Voronoi Diagrams: Voronoi diagrams (like convex hulls) are among the most important structures in computational
geometry. A Voronoi diagram records information about what is close to what. Let P = {p1 , p2 , . . . , pn } be a
set of points in the plane (or in any dimensional space), which we call sites. Define V(pi ), the Voronoi cell for
pi , to be the set of points q in the plane that are closer to pi than to any other site. That is, the Voronoi cell for
pi is defined to be:
V(pi ) = {q | |pi q| < |pj q|, ∀j ≠ i}.
Another way to define V(pi ) is in terms of the intersection of halfplanes. Given two sites pi and pj , the set of
points that are strictly closer to pi than to pj is just the open halfplane whose bounding line is the perpendicular
bisector between pi and pj . Denote this halfplane h(pi , pj ). It is easy to see that a point q lies in V(pi ) if and
only if q lies within the intersection of h(pi , pj ) for all j ≠ i. In other words,
V(pi ) = ∩_{j≠i} h(pi , pj ).
Since the intersection of halfplanes is a (possibly unbounded) convex polygon, it is easy to see that V(pi ) is a
(possibly unbounded) convex polygon. Finally, define the Voronoi diagram of P , denoted Vor(P ) to be what
is left of the plane after we remove all the (open) Voronoi cells. It is not hard to prove (see the text) that the
Voronoi diagram consists of a collection of line segments, which may be unbounded, either at one end or both.
An example is shown in the figure below.
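The halfplane characterization gives an immediate (naive) membership test: q lies in the open cell V(pi) if and only if q is strictly closer to pi than to every other site, i.e., q lies in every halfplane h(pi, pj). A small Python sketch (function names are our own):

```python
def dist2(p, q):
    # Squared Euclidean distance; avoids square roots, and squaring
    # preserves the order of distances.
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def in_voronoi_cell(q, i, sites):
    """True iff q lies in the open Voronoi cell V(sites[i]).

    Testing q against the halfplane h(p_i, p_j) is the same as asking
    whether q is strictly closer to p_i than to p_j, so we simply compare
    distances to every other site.
    """
    return all(dist2(q, sites[i]) < dist2(q, s)
               for j, s in enumerate(sites) if j != i)

sites = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
print(in_voronoi_cell((1.0, 1.0), 0, sites))   # True: closest to site 0
print(in_voronoi_cell((3.0, 0.0), 0, sites))   # False: closest to site 1
```

Repeating this test over a grid of query points is a simple (if slow) way to visualize a Voronoi diagram.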
Voronoi edges: Each point on an edge of the Voronoi diagram is equidistant from its two nearest neighbors pi
and pj . Thus, there is a circle centered at such a point such that pi and pj lie on this circle, and no other
site is interior to the circle.
Voronoi vertices: It follows that the vertex at which three Voronoi cells V(pi ), V(pj ), and V(pk ) intersect,
called a Voronoi vertex, is equidistant from all three of these sites. Thus it is the center of the circle passing through these
sites, and this circle contains no other sites in its interior.
Degree: If we make the general position assumption that no four sites are cocircular, then the vertices of the
Voronoi diagram all have degree three.
Convex hull: A cell of the Voronoi diagram is unbounded if and only if the corresponding site lies on the
convex hull. (Observe that a site is on the convex hull if and only if it is the closest point from some point
at infinity.) Thus, given a Voronoi diagram, it is easy to extract the convex hull in linear time.
Size: If n denotes the number of sites, then the Voronoi diagram is a planar graph (if we imagine all the
unbounded edges as going to a common vertex at infinity) with exactly n faces. It follows from Euler’s
formula that the number of Voronoi vertices is at most 2n − 5 and the number of edges is at most 3n − 6.
(See the text for details.)
Computing Voronoi Diagrams: There are a number of algorithms for computing Voronoi diagrams. Of course,
there is a naive O(n2 log n) time algorithm, which operates by computing V(pi ) by intersecting the n − 1
bisector halfplanes h(pi , pj ), for j 6= i. However, there are much more efficient ways, which run in O(n log n)
time. Since the convex hull can be extracted from the Voronoi diagram in O(n) time, it follows that this is
asymptotically optimal in the worst-case.
Historically, O(n2 ) algorithms for computing Voronoi diagrams were known for many years (based on incremental
constructions). When computational geometry came along, an asymptotically superior O(n log n) algorithm based
on divide-and-conquer was discovered, but it was rather complex and somewhat difficult to understand. Later,
Steven Fortune invented a plane sweep algorithm for the
problem, which provided a simpler O(n log n) solution to the problem. It is his algorithm that we will discuss.
Somewhat later still, it was discovered that the incremental algorithm is actually quite efficient, if it is run as a
randomized incremental algorithm. We will discuss this algorithm later when we talk about the dual structure,
called a Delaunay triangulation.
Figure 60: Plane sweep for Voronoi diagrams. Note that the position of the indicated vertices depends on sites that
have not yet been encountered by the sweep line, and hence are unknown to the algorithm. (Note that the sweep line
moves from top to bottom.)
Fortune made the clever observation that, rather than computing the Voronoi diagram through plane sweep in its
final form, one can instead compute a “distorted” but topologically equivalent version of the diagram. This distorted
version of the diagram was based on a transformation that alters the way that distances are measured in the
plane. The resulting diagram had the same topological structure as the Voronoi diagram, but its edges were
parabolic arcs, rather than straight line segments. Once this distorted diagram was generated, it was an easy
matter to “undistort” it to produce the correct Voronoi diagram.
Our presentation will be different from Fortune’s. Rather than distort the diagram, we can think of this algorithm
as distorting the sweep line. Actually, we will think of two objects that control the sweeping process. First, there
will be a horizontal sweep line, moving from top to bottom. We will also maintain an x-monotonic curve called a
beach line. (It is so named because it looks like waves rolling up on a beach.) The beach line is a monotone curve
formed from pieces of parabolic arcs. As the sweep line moves downward, the beach line follows just behind.
The job of the beach line is to prevent us from seeing unanticipated events until the sweep line encounters the
corresponding site.
The Beach Line: In order to make these ideas more concrete, recall that the problem with ordinary plane sweep is
that sites that lie below the sweep line may affect the diagram that lies above the sweep line. To avoid this
problem, we will maintain only the portion of the diagram that cannot be affected by anything that lies below
the sweep line. To do this, we will subdivide the halfplane lying above the sweep line into two regions: those
points that are closer to some site p above the sweep line than they are to the sweep line itself, and those points
that are closer to the sweep line than any site above the sweep line.
What are the geometric properties of the boundary between these two regions? The set of points q that are
equidistant from the sweep line to their nearest site above the sweep line is called the beach line. Observe that
for any point q above the beach line, we know that its closest site cannot be affected by any site that lies below the sweep line.
Figure 61: The beach line. Notice that only the portion of the Voronoi diagram that lies above the beach line is
computed. The sweep line status maintains the intersection of the Voronoi diagram with the beach line.
Each site p above the sweep line L defines a parabola, namely the set of points that are equidistant from p and L.
Thus, the beach line consists of the lower envelope of these parabolas, one for each site. Note that the parabolas
of some sites above the beach line will not touch the lower envelope and hence will not contribute to the beach
line. Because the parabolas are x-monotone, so is the beach line. Also observe that the vertex where two arcs of
the beach line intersect, which we call a breakpoint, is a point that is equidistant from two sites and the sweep
line, and hence must lie on some Voronoi edge. In particular, if the beach line arcs corresponding to sites pi and
pj share a common breakpoint on the beach line, then this breakpoint lies on the Voronoi edge between pi and
pj . From this we have the following important characterization.
Lemma: The beach line is an x-monotone curve made up of parabolic arcs. The breakpoints of the beach line
lie on Voronoi edges of the final diagram.
Fortune’s algorithm consists of simulating the growth of the beach line as the sweep line moves downward,
and in particular tracing the paths of the breakpoints as they travel along the edges of the Voronoi diagram. Of
course, as the sweep line moves the parabolas forming the beach line change their shapes continuously. As with
all plane-sweep algorithms, we will maintain a sweep-line status and we are interested in simulating the discrete
event points where there is a “significant event”, that is, any event that changes the topological structure of the
Voronoi diagram and the beach line.
Sweep Line Status: The algorithm maintains the current location (y-coordinate) of the sweep line. It stores, in
left-to-right order, the set of sites that define the beach line. Important: The algorithm never needs to store
the parabolic arcs of the beach line. They exist solely for conceptual purposes.
Events: There are two types of events.
Site events: When the sweep line passes over a new site a new arc will be inserted into the beach line.
Vertex events: (What our text calls circle events.) When the length of a parabolic arc shrinks to zero, the
arc disappears and a new Voronoi vertex will be created at this point.
The algorithm consists of processing these two types of events. As the Voronoi vertices are being discovered
by vertex events, it will be an easy matter to update a DCEL for the diagram as we go, and so to link the entire
diagram together. Let us consider the two types of events that are encountered.
It is important to consider whether this is the only way that new arcs can be introduced into the beach line. In
fact it is. We will not prove it, but a careful proof is given in the text. As a consequence of this proof, it follows
that the maximum number of arcs on the beach line can be at most 2n − 1, since each new point can result in
creating one new arc, and splitting an existing arc, for a net increase of two arcs per point (except the first).
The nice thing about site events is that they are all known in advance. Thus, after sorting the points by y-
coordinate, all these events are known.
Vertex events: In contrast to site events, vertex events are generated dynamically as the algorithm runs. As with the
line segment plane sweep algorithm, the important idea is that each such event is generated by objects that are
neighbors on the beach line. However, unlike segment intersection, where pairs of consecutive segments
generated events, here triples of points generate the events.
In particular, consider any three consecutive sites pi , pj , and pk whose arcs appear consecutively on the beach
line from left to right. (See the figure below.) Further, suppose that the circumcircle for these three sites lies at
least partially below the current sweep line (meaning that the Voronoi vertex has not yet been generated), and
that this circumcircle contains no points lying below the sweep line (meaning that no future point will block the
creation of the vertex).
Consider the moment at which the sweep line falls to a point where it is tangent to the lowest point of this
circle. At this instant the circumcenter of the circle is equidistant from all three sites and from the sweep line.
Thus all three parabolic arcs pass through this center point, implying that the contribution of the arc from pj has
disappeared from the beach line. In terms of the Voronoi diagram, the bisectors (pi , pj ) and (pj , pk ) have met
each other at the Voronoi vertex, and a single bisector (pi , pk ) remains. (See the figure below.)
Sweep-line algorithm: We can now present the algorithm in greater detail. The main structures that we will maintain
are the following:
(Partial) Voronoi diagram: The partial Voronoi diagram that has been constructed so far will be stored in a
DCEL. There is one technical difficulty caused by the fact that the diagram contains unbounded edges. To
handle this we will assume that the entire diagram is to be stored within a large bounding box. (This box
should be chosen large enough that all of the Voronoi vertices fit within the box.)
Beach line: The beach line is represented using a dictionary (e.g. a balanced binary tree or skip list). An
important fact of the construction is that we do not explicitly store the parabolic arcs. They are just there
for the purposes of deriving the algorithm. Instead for each parabolic arc on the current beach line, we
store the site that gives rise to this arc. Notice that a site may appear multiple times on the beach line (in
fact, a linear number of times in n). But the total number of arcs on the beach line will never exceed 2n − 1. (You should
try to construct an example where a single site contributes multiple arcs to the beach line.)
Between each consecutive pair of sites pi and pj , there is a breakpoint. Although the breakpoint moves as
a function of the sweep line, observe that it is possible to compute the exact location of the breakpoint as a
function of pi , pj , and the current y-coordinate of the sweep line. In particular, the breakpoint is the center
of a circle that passes through pi , pj and is tangent to the sweep line. Thus, as with beach lines, we do not
explicitly store breakpoints. Rather, we compute them only when we need them.
The important operations that we will have to support on the beach line are
(1) Given a fixed location of the sweep line, determine the arc of the beach line that intersects a given
vertical line. This can be done by a binary search on the breakpoints, which are computed “on the
fly”. (Think about this.)
(2) Compute predecessors and successors on the beach line.
(3) Insert a new arc pi within a given arc pj , thus splitting the arc for pj into two. This creates three
arcs: pj , pi , and pj .
(4) Delete an arc from the beach line.
It is not difficult to modify a standard dictionary data structure to perform these operations in O(log n)
time each.
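For concreteness, here is one way to compute a breakpoint “on the fly,” as needed by operation (1). It equates the two parabolas, each the locus of points equidistant from a site and the sweep line y = ly, and solves the resulting quadratic. Which of the two roots is the left or right breakpoint depends on the relative heights of the two sites, so this sketch (our own naming, not the text's implementation) simply returns both:

```python
import math

def parabola_y(site, ly, x):
    # y-coordinate at x of the arc for `site` (the focus), with the sweep
    # line y = ly as directrix: points equidistant from site and sweep line.
    fx, fy = site
    return ((x - fx) ** 2 + fy ** 2 - ly ** 2) / (2.0 * (fy - ly))

def breakpoints_x(p, q, ly):
    """x-coordinates where the arcs of sites p and q intersect.

    Setting the two parabola equations equal yields a quadratic in x,
    which degenerates to a linear equation when the two sites are at the
    same height (then the breakpoint is the midpoint in x).
    """
    (px, py), (qx, qy) = p, q
    dp, dq = py - ly, qy - ly
    a = dq - dp
    b = -2.0 * (px * dq - qx * dp)
    c = dq * (px ** 2 + py ** 2 - ly ** 2) - dp * (qx ** 2 + qy ** 2 - ly ** 2)
    if a == 0:                      # equal heights: a single intersection
        return [-c / b]
    disc = math.sqrt(b * b - 4.0 * a * c)
    return sorted([(-b - disc) / (2.0 * a), (-b + disc) / (2.0 * a)])
```

Since the breakpoint depends only on the two sites and the current sweep-line position, nothing needs to be stored; the binary search of operation (1) can recompute breakpoints as it descends the dictionary.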
Event queue: The event queue is a priority queue with the ability both to insert and delete new events. Also the
event with the largest y-coordinate can be extracted. For each site we store its y-coordinate in the queue.
For each consecutive triple pi , pj , pk on the beach line, we compute the circumcircle of these points.
(We’ll leave the messy algebraic details as an exercise, but this can be done in O(1) time.) If the lower
endpoint of the circle (the minimum y-coordinate on the circle) lies below the sweep line, then we create a
vertex event whose y-coordinate is the y-coordinate of the bottom endpoint of the circumcircle. We store
this in the priority queue. Each such event in the priority queue has a cross link back to the triple of sites
that generated it, and each consecutive triple of sites has a cross link to the event that it generated in the
priority queue.
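The “messy algebraic details” of the circumcircle can be sketched with the standard circumcenter formulas. The code below (names and structure are our own, a sketch rather than the text's implementation) returns the circumcenter and the y-coordinate at which the vertex event fires, namely the bottom of the circumcircle:

```python
import math

def circle_event(pi, pj, pk):
    """Circumcenter of three sites and the y-coordinate of the vertex event.

    The event fires when the sweep line reaches the lowest point of the
    circumcircle, i.e. y = (center y) - (radius).  Returns None if the
    three sites are collinear (no circumcircle exists).
    """
    (ax, ay), (bx, by), (cx, cy) = pi, pj, pk
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    r = math.hypot(ax - ux, ay - uy)
    return (ux, uy), uy - r   # center, and the y at which the event fires

center, event_y = circle_event((0.0, 0.0), (2.0, 0.0), (0.0, 2.0))
print(center, event_y)   # center (1.0, 1.0), event at y = 1 - sqrt(2)
```

The returned y-coordinate is what gets stored in the priority queue, with the cross links to the generating triple maintained as described above.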
The algorithm proceeds like any plane sweep algorithm. We extract an event, process it, and go on to the next
event. Each event may result in a modification of the Voronoi diagram and the beach line, and may result in the
creation of new events or the deletion of existing ones.
Here is how the two types of events are handled:
Site event: Let pi be the new site. We determine the arc of the beach line that lies immediately above pi ; let
pj denote the corresponding site. For example, suppose that prior to insertion we had the beach-line sequence
⟨p1 , p2 , pj , p3 , p4 ⟩.
The insertion of pi splits the arc pj into two arcs, denoted p′j and p′′j . Although these are separate arcs,
they involve the same site, pj . The new sequence is
⟨p1 , p2 , p′j , pi , p′′j , p3 , p4 ⟩.
Any event associated with the old triple p2 , pj , p3 will be deleted. We also consider the creation of new
events for the triples p2 , p0j , pi and pi , p00j , p3 . Note that the new triple p0j , pi , p00j cannot generate an event
because it only involves two distinct sites.
Vertex event: Let pi , pj , and pk be the three sites that generate this event (from left to right). We delete the
arc for pj from the beach line. We create a new vertex in the Voronoi diagram, and tie the edges for the
bisectors (pi , pj ), (pj , pk ) to it, and start a new edge for the bisector (pi , pk ) that starts growing down
below. Finally, we delete any events that arose from triples involving this arc of pj , and generate new
events corresponding to consecutive triples involving pi and pk (there are two of them).
For example, suppose that prior to insertion we had the beach-line sequence
hp1 , pi , pj , pk , p2 i.
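The sequence manipulations for the two events can be mimicked directly on a list (an illustrative toy only; the real beach line is the dictionary structure described earlier, and the tags below are just one way of keeping the two copies of a split arc distinct):

```python
def site_event(beach, j, new_site):
    """Split the arc at index j into two copies with new_site between them.
    Arcs are (site, tag) pairs so the two copies of the split arc remain
    distinct, as in p'j and p''j."""
    site, tag = beach[j]
    return beach[:j] + [(site, tag), (new_site, 0), (site, tag + 1)] + beach[j+1:]

def vertex_event(beach, j):
    """Delete the arc at index j; its two neighbors now meet at a new
    Voronoi vertex."""
    return beach[:j] + beach[j+1:]

beach = [("p1", 0), ("p2", 0), ("pj", 0), ("p3", 0), ("p4", 0)]
after = site_event(beach, 2, "pi")
# after corresponds to the sequence <p1, p2, p'j, pi, p''j, p3, p4>
```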
Delaunay Triangulations: Last time we gave an algorithm for computing Voronoi diagrams. Today we consider the
related structure, called a Delaunay triangulation (DT). Since the Voronoi diagram is a planar graph, we may
naturally ask what is the corresponding dual graph. The vertices for this dual graph can be taken to be the sites
themselves. Since (assuming general position) the vertices of the Voronoi diagram are of degree three, it follows
that the faces of the dual graph (excluding the exterior face) will be triangles. The resulting dual graph is a
triangulation of the sites, the Delaunay triangulation.
Delaunay triangulations have a number of interesting properties that are consequences of the structure of the
Voronoi diagram.
Convex hull: The boundary of the exterior face of the Delaunay triangulation is the boundary of the convex
hull of the point set.
Circumcircle property: The circumcircle of any triangle in the Delaunay triangulation is empty (contains no
sites of P ).
Empty circle property: Two sites pi and pj are connected by an edge in the Delaunay triangulation, if and
only if there is an empty circle passing through pi and pj . (One direction of the proof is trivial from
the circumcircle property. For the other direction, if there is an empty circle passing through pi and pj , then
the center c of this circle is a point on the edge of the Voronoi diagram between pi and pj , because c is
equidistant from each of these sites and there is no closer site.)
Closest pair property: The closest pair of sites in P are neighbors in the Delaunay triangulation. (The circle
having these two sites as its diameter cannot contain any other sites, and so is an empty circle.)
If the sites are not in general position, in the sense that four or more are cocircular, then the Delaunay triangula-
tion may not be a triangulation at all, but just a planar graph (since the Voronoi vertex that is incident to four or
more Voronoi cells will induce a face whose degree is equal to the number of such cells). In this case the more
appropriate term would be Delaunay graph. However, it is common to either assume the sites are in general
position (or to enforce it through some sort of symbolic perturbation) or else to simply triangulate the faces of
degree four or more in any arbitrary way. Henceforth we will assume that sites are in general position, so we do
not have to deal with these messy situations.
Given a point set P with n sites where there are h sites on the convex hull, it is not hard to prove by Euler’s
formula that the Delaunay triangulation has 2n−2−h triangles, and 3n−3−h edges. The ability to determine the
number of triangles from n and h only works in the plane. In 3-space, the number of tetrahedra in the Delaunay
triangulation can range from O(n) up to O(n²). In dimension d, the number of simplices (the d-dimensional
generalization of a triangle) can range as high as O(n^⌈d/2⌉).
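These counts are easy to verify empirically. The following brute-force Python sketch (roughly O(n⁴) time, for illustration only, and assuming general position) extracts the Delaunay triangles as exactly the triples with empty circumcircles, computes h with a convex hull routine, and checks the 2n − 2 − h and 3n − 3 − h counts:

```python
from itertools import combinations

def cross(o, a, b):
    return (a[0]-o[0]) * (b[1]-o[1]) - (a[1]-o[1]) * (b[0]-o[0])

def hull_vertices(pts):
    """Andrew's monotone chain; returns the convex hull vertices in order."""
    pts = sorted(pts)
    lower, upper = [], []
    for chain, seq in ((lower, pts), (upper, reversed(pts))):
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
    return lower[:-1] + upper[:-1]

def circumcircle(a, b, c):
    """Center and squared radius of the circle through a, b, c."""
    d = 2.0 * (a[0]*(b[1]-c[1]) + b[0]*(c[1]-a[1]) + c[0]*(a[1]-b[1]))
    ux = ((a[0]**2 + a[1]**2)*(b[1]-c[1]) + (b[0]**2 + b[1]**2)*(c[1]-a[1])
          + (c[0]**2 + c[1]**2)*(a[1]-b[1])) / d
    uy = ((a[0]**2 + a[1]**2)*(c[0]-b[0]) + (b[0]**2 + b[1]**2)*(a[0]-c[0])
          + (c[0]**2 + c[1]**2)*(b[0]-a[0])) / d
    return (ux, uy), (a[0]-ux)**2 + (a[1]-uy)**2

def delaunay_triangles(pts):
    """All triples of sites whose circumcircle contains no other site."""
    tris = []
    for a, b, c in combinations(pts, 3):
        if cross(a, b, c) == 0:
            continue                        # collinear: no circumcircle
        (ux, uy), r2 = circumcircle(a, b, c)
        if all(p in (a, b, c) or (p[0]-ux)**2 + (p[1]-uy)**2 > r2 + 1e-9
               for p in pts):
            tris.append((a, b, c))
    return tris

pts = [(0, 0), (5, 0), (2, 4), (1, 1), (3, 2)]   # a small general-position set
n, h = len(pts), len(hull_vertices(pts))
tris = delaunay_triangles(pts)
edges = {frozenset(e) for t in tris for e in combinations(t, 2)}
assert len(tris) == 2*n - 2 - h and len(edges) == 3*n - 3 - h
```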
Minimum Spanning Tree: The Delaunay triangulation possesses some interesting properties that are not directly
related to the Voronoi diagram structure. One of these is its relation to the minimum spanning tree. Given a
set of n points in the plane, we can think of the points as defining a Euclidean graph whose edges are all
(n choose 2) (undirected) pairs of distinct points, and edge (pi , pj ) has weight equal to the Euclidean distance from pi to pj .
A minimum spanning tree is a set of n − 1 edges that connect the points (into a free tree) such that the total
weight of edges is minimized. We could compute the MST using Kruskal’s algorithm. Recall that Kruskal’s
algorithm works by first sorting the edges and inserting them one by one. We could first compute the Euclidean
graph, and then pass the result on to Kruskal’s algorithm, for a total running time of O(n2 log n).
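For concreteness, here is a sketch of the Kruskal stage in Python (our own minimal union-find). Here it is run on the complete Euclidean graph; in the faster method described next, the sorted edge list would instead be the O(n) edges of the Delaunay triangulation:

```python
import math
from itertools import combinations

def euclidean_mst(points):
    """Kruskal's algorithm on the complete Euclidean graph.
    Returns (total weight, list of MST edges as index pairs)."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]    # path halving
            i = parent[i]
        return i

    edges = sorted(
        (math.hypot(points[i][0] - points[j][0], points[i][1] - points[j][1]), i, j)
        for i, j in combinations(range(len(points)), 2))
    weight, tree = 0.0, []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:                         # edge joins two distinct components
            parent[ri] = rj
            weight += w
            tree.append((i, j))
    return weight, tree
```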
However there is a much faster method based on Delaunay triangulations. First compute the Delaunay trian-
gulation of the point set. We will see later that it can be done in O(n log n) time. Then compute the MST of
the Delaunay triangulation by Kruskal’s algorithm and return the result. This leads to a total running time of
O(n log n). The reason that this works is given in the following theorem.
Theorem: The minimum spanning tree of a set of points P (in any dimension) is a subgraph of the Delaunay
triangulation.
(Figure: the MST edge ab, the circle with diameter ab containing a site c, and the spanning trees T and T'.)
Proof: Suppose to the contrary that some edge ab of the MST T is not an edge of the Delaunay triangulation.
By the empty circle property, the circle whose diameter is the segment ab must contain some other site c
(for otherwise ab would be a Delaunay edge).
The removal of ab from the MST splits the tree into two subtrees. Assume without loss of generality that
c lies in the same subtree as a. Now, remove the edge ab from the MST and add the edge bc in its place.
The result will be a spanning tree T' whose weight is
w(T') = w(T) + |bc| − |ab| < w(T).
The last inequality follows because ab is the diameter of the circle, implying that |bc| < |ab|. This
contradicts the hypothesis that T is the MST, completing the proof.
By the way, this suggests another interesting question. Among all triangulations, we might ask, does the Delau-
nay triangulation minimize the total edge length? The answer is no (and there is a simple four-point counterex-
ample). However, this claim was made in a famous paper on Delaunay triangulations, and you may still hear
it quoted from time to time. The triangulation that minimizes total edge weight is called the minimum weight
triangulation. To date, no polynomial time algorithm is known for computing it, and the problem is not known
to be NP-complete.
Maximizing Angles and Edge Flipping: Another interesting property of Delaunay triangulations is that among all
triangulations, the Delaunay triangulation maximizes the minimum angle. This property is important, because it
implies that Delaunay triangulations tend to avoid skinny triangles. This is useful for many applications where
triangles are used for the purposes of interpolation.
In fact a much stronger statement holds as well. Among all triangulations with the same smallest angle, the
Delaunay triangulation maximizes the second smallest angle, and so on. In particular, any triangulation can be
associated with a sorted angle sequence, that is, the increasing sequence of angles (α1 , α2 , . . . , αm ) appearing
in the triangles of the triangulation. (Note that the length of the sequence will be the same for all triangulations
of the same point set, since the number depends only on n and h.)
Theorem: Among all triangulations of a given point set, the Delaunay triangulation has the lexicographically
largest angle sequence.
Before getting into the proof, we should recall a few basic facts about angles from basic geometry. First, recall
that if we consider the circumcircle of three points, then each angle of the resulting triangle is exactly half the
angle of the minor arc subtended by the opposite two points along the circumcircle. It follows as well that if
a point is inside this circle then it will subtend a larger angle, and a point that is outside will subtend a smaller
angle. Thus, in part (a) of the figure below, we have θ1 > θ2 > θ3.
We will not give a formal proof of the theorem. (One appears in the text.) The main idea is to show that for
any triangulation that fails to satisfy the empty circle property, it is possible to perform a local operation, called
an edge flip, which increases the lexicographical sequence of angles. An edge flip is an important fundamental
operation on triangulations in the plane. Given two adjacent triangles △abc and △cda, such that their union
forms a convex quadrilateral abcd, the edge flip operation replaces the diagonal ac with bd. Note that it is
only possible when the quadrilateral is convex. Suppose that the initial triangle pair violates the empty circle
condition, in that point d lies inside the circumcircle of △abc. (Note that this implies that b lies inside the
circumcircle of △cda.) If we flip the edge it will follow that the two circumcircles of the two resulting triangles,
△abd and △bcd, are now empty (relative to these four points), and the observation above about circles and
angles proves that the minimum angle increases at the same time. In particular, in the figure above, we have
φab > θab,   φbc > θbc,   φcd > θcd,   φda > θda.
There are two other angles that need to be compared as well (can you spot them?). It is not hard to show that,
after swapping, these other two angles cannot be smaller than the minimum of θab , θbc , θcd , and θda . (Can you
see why?)
Since there are only a finite number of triangulations, this process must eventually terminate with the lexico-
graphically maximum triangulation, and this triangulation must satisfy the empty circle condition, and hence is
the Delaunay triangulation.
Constructing the Delaunay Triangulation: We will present a simple randomized O(n log n) expected time algo-
rithm for constructing Delaunay triangulations for n sites in the plane. The algorithm is remarkably similar in
spirit to the randomized algorithm for the trapezoidal map, in that it not only builds the triangulation but also
can provide a point-location data structure as well. We will not discuss the point-location data structure in detail,
but the details are easy to fill in.
As with any randomized incremental algorithm, the idea is to insert sites in random order, one at a time, and
update the triangulation with each new addition. The issues involved with the analysis will be showing that after
each insertion the expected number of structural changes in the diagram is O(1). As with other incremental
algorithms, we need some way of keeping track of where newly inserted sites are to be placed in the diagram.
We will describe a somewhat simpler method than the one we used in the trapezoidal map. Rather than building
a data structure, this one simply puts each of the uninserted points into a bucket according to the triangle that
contains it in the current triangulation. In this case, we will need to argue that the expected number of times that
a site is rebucketed is O(log n).
Incircle Test: The basic issue in the design of the algorithm is how to update the triangulation when a new site is
added. In order to do this, we first investigate the basic properties of a Delaunay triangulation. Recall that a
triangulation is a Delaunay triangulation if and only if the circumcircle of every triangle is empty. The basic
primitive is the incircle test: given sites a, b, and c, oriented counterclockwise, the site d lies inside their
circumcircle if and only if the following determinant is positive:

    inCircle(a, b, c, d) = det | ax  ay  ax² + ay²  1 |
                               | bx  by  bx² + by²  1 |  >  0.
                               | cx  cy  cx² + cy²  1 |
                               | dx  dy  dx² + dy²  1 |
We will not prove the correctness of this test, but a simpler assertion, namely that if the above determinant is
equal to zero, then the four points are cocircular. The four points are cocircular if there exists a center point
q = (qx , qy ) and a radius r such that
(ax − qx)² + (ay − qy)² = r²,
and similarly for the other three points. Expanding this and collecting common terms we have

(ax² + ay²) − 2qx ax − 2qy ay + (qx² + qy² − r²) = 0,

and similarly for the other three points, b, c, and d. If we let X1, X2, X3 and X4 denote the columns of the
above matrix we have
X3 − 2qx X1 − 2qy X2 + (qx² + qy² − r²) X4 = 0.
Thus, the columns of the above matrix are linearly dependent, implying that their determinant is zero. We will
leave the completion of the proof as an exercise. Next time we will show how to use the incircle test to update
the triangulation, and present the complete algorithm.
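As a concrete sanity check, here is a small Python version of the incircle test (our own sketch; it reduces the 4×4 determinant to a 3×3 one by subtracting the last row from the others, which leaves the value unchanged):

```python
def in_circle(a, b, c, d):
    """Sign of the incircle determinant.  Positive iff d lies inside the
    circumcircle of triangle abc (a, b, c counterclockwise); zero iff the
    four points are cocircular."""
    rows = []
    for p in (a, b, c):
        rows.append((p[0] - d[0], p[1] - d[1],
                     (p[0]**2 + p[1]**2) - (d[0]**2 + d[1]**2)))
    (m00, m01, m02), (m10, m11, m12), (m20, m21, m22) = rows
    return (m00 * (m11 * m22 - m12 * m21)
            - m01 * (m10 * m22 - m12 * m20)
            + m02 * (m10 * m21 - m11 * m20))

# a, b, c counterclockwise; their circumcircle has center (1, 1), radius sqrt(2)
assert in_circle((0, 0), (2, 0), (0, 2), (1, 1)) > 0    # inside
assert in_circle((0, 0), (2, 0), (0, 2), (3, 3)) < 0    # outside
assert in_circle((0, 0), (2, 0), (0, 2), (2, 2)) == 0   # cocircular
```

With integer inputs the computation is exact, which is why the cocircular case returns exactly zero here.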
Incremental update: When we add the next site, pi , the problem is to convert the current Delaunay triangulation into
a new Delaunay triangulation containing this site. This will be done by creating a non-Delaunay triangulation
containing the new site, and then incrementally “fixing” this triangulation to restore the Delaunay properties.
The fundamental changes will be: (1) adding a site to the middle of a triangle, and creating three new edges,
and (2) performing an edge flip. Both of these operations can be performed in O(1) time, assuming that the
triangulation is maintained, say, as a DCEL.
The algorithm that we will describe has been known for many years, but was first analyzed by Guibas, Knuth,
and Sharir. The algorithm starts with an initial triangulation in which all the points lie within the convex hull.
This can be done by enclosing the points in a large triangle. Care must be taken in the construction of this enclosing
triangle. It is not sufficient that it simply contain all the points. It should be the case that these points do not lie
in the circumcircles of any of the triangles of the final triangulation. Our book suggests computing a triangle
that contains all the points, but then fudging with the incircle test so that these points act as if they are invisible.
The sites are added in random order. When a new site p is added, we find the triangle △abc of the current
triangulation that contains this site (we will see how later), insert the site in this triangle, and join this site to
the three vertices of the triangle, replacing △abc with three new triangles. Each such triangle is then checked
with the incircle test against the vertex on the opposite side of its outer edge, and offending edges are flipped;
each flip produces new suspect edges, which are checked in turn.
(Figure: insertion of site p into triangle △abc, and the resulting sequence of edge flips involving the
neighboring vertices d, e, and f.)
The code for the incremental algorithm is shown in the figure below. The current triangulation is kept in a global
data structure. The edges in the following algorithm are actually pointers to the DCEL.
There is only one major issue in establishing the correctness of the algorithm. When we performed empty-circle
tests, we only tested triangles containing the site p, and only sites that lay on the opposite side of an edge of such
SwapTest(ab) {
if (ab is an edge on the exterior face) return;
Let d be the vertex to the right of edge ab;
if (inCircle(p, a, b, d)) { // d violates the incircle test
Flip edge ab for pd;
SwapTest(ad); // Fix the new suspect edges
SwapTest(db);
}
}
a triangle. We need to establish that these tests are sufficient to guarantee that the final triangulation is indeed
Delaunay.
First, we observe that it suffices to consider only triangles that contain p in their circumcircle. The reason is that
since p is the only newly added site, it is the only site that can cause a violation of the empty-circle property. Clearly
the triangle that contained p must be removed, since its circumcircle definitely contains p. Next, we need to
argue that it suffices to check only the neighboring triangles after each edge flip. Consider a triangle △pab that
contains p, and consider the vertex d belonging to the triangle that lies on the opposite side of edge ab. We argue
that if d lies outside the circumcircle of △pab, then no other point of the point set can lie within this circumcircle.
A complete proof of this takes some effort, but here is a simple justification. What could go wrong? It might be
that d lies outside the circumcircle, but there is some other site, say, a vertex e of a triangle adjacent to d, that
lies inside the circumcircle. This is illustrated in the following figure. We claim that this cannot happen. It can
be shown that if e lies within the circumcircle of △pab, then a must lie within the circumcircle of △bde. (The
argument is an exercise in geometry.) However, this violates the assumption that the initial triangulation (before
the insertion of p) was a Delaunay triangulation.
(Figure: the site e, belonging to a triangle adjacent to d, lying inside the circumcircle of △pab.)
As you can see, the algorithm is very simple. The only things that need to be implemented are the DCEL (or
other data structure) to store the triangulation, the incircle test, and locating the triangle that contains p. The first
two tasks are straightforward. The point location involves a little thought.
Point Location: The point location can be accomplished by one of two means. Our text discusses the idea of building
Thus, the total expected time spent in rebucketing is O(n log n), as desired.
There is one place in the proof that we were sloppy. (Can you spot it?) We showed that the number of points that
required rebucketing is O(n log n), but notice that when a point is inserted, many rebucketing operations may be
needed (one for the initial insertion and one for each additional edge flip). We will not give a careful analysis of
the total number of individual rebucketing operations per point, but it is not hard to show that the expected total
number of such operations is also O(n log n).
Arrangements: So far we have studied a few of the most important structures in computational geometry: convex
hulls, Voronoi diagrams and Delaunay triangulations. Perhaps the next most important structure is that of a
line arrangement. As with hulls and Voronoi diagrams, it is possible to define arrangements (of d − 1 dimen-
sional hyperplanes) in any dimension, but we will concentrate on the plane. As with Voronoi diagrams, a line
arrangement is a polygonal subdivision of the plane. Unlike most of the structures we have seen up to now, a
line arrangement is not defined in terms of a set of points, but rather in terms of a set L of lines. However, line
arrangements are used mostly for solving problems on point sets. The connection is that the arrangements are
typically constructed in the dual plane. We will begin by defining arrangements, discussing their combinatorial
properties and how to construct them, and finally discuss applications of arrangements to other problems in
computational geometry.
Before discussing algorithms for computing arrangements and applications, we first provide definitions and
some important theorems that will be used in the construction. A finite set L of lines in the plane subdivides
the plane. The resulting subdivision is called an arrangement, denoted A(L). Arrangements can be defined for
curves as well as lines, and can also be defined for (d − 1)-dimensional hyperplanes in dimension d. But we
will only consider the case of lines in the plane here. In the plane, the arrangement defines a planar graph whose
vertices are the points where two or more lines intersect, edges are the intersection free segments (or rays) of
the lines, and faces are (possibly unbounded) convex regions containing no line. An example is shown below.
(Figure: an arrangement of lines, with a vertex, an edge, a face, and the bounding box labeled.)
An arrangement is said to be simple if no three lines intersect at a common point. We will make the usual general
position assumption that the arrangement is simple. This assumption is easy to overcome by some sort of
symbolic perturbation.
An arrangement is not formally a planar graph, because it has unbounded edges. We can fix this (topologically)
by imagining that a vertex is added at infinity, and all the unbounded edges are attached to this vertex. A
somewhat more geometric way to fix this is to imagine that there is a bounding box which is large enough
to contain all the vertices, and we tie all the unbounded edges off at this box. Rather than computing the
coordinates of this huge box (which is possible in O(n2 ) time), it is possible to treat the sides of the box as
existing at infinity, and handle all comparisons symbolically. For example, the lines that intersect the right side
of the “box at infinity” have slopes between +1 and −1, and the order in which they intersect this side (from
top to bottom) is in decreasing order of slope. (If you don’t see this right away, think about it.)
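This symbolic ordering is easy to check numerically. The following sketch (our own) evaluates each line at a very large x-coordinate, standing in for the right side of the box at infinity, and confirms that the top-to-bottom order is by decreasing slope:

```python
def order_on_right_side(lines, big_x=1e9):
    """Lines are (slope, intercept) pairs with slopes in (-1, +1).
    Returns the lines in the order they cross a far-right vertical line
    x = big_x, from top to bottom.  For big_x large enough, the slope
    term dominates the intercept, so this is decreasing slope order."""
    return sorted(lines, key=lambda ab: ab[0] * big_x + ab[1], reverse=True)

lines = [(0.5, 3.0), (-0.2, 10.0), (0.9, -50.0), (-0.8, 0.0)]
top_to_bottom = order_on_right_side(lines)
slopes = [a for a, b in top_to_bottom]
assert slopes == sorted(slopes, reverse=True)   # decreasing slope, as claimed
```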
Incremental Construction: Arrangements are used for solving many problems in computational geometry. But in
order to use arrangements, we first must be able to construct arrangements. We will present a simple incremental
algorithm, which builds an arrangement by adding lines one at a time. Unlike most of the other incremental
algorithms we have seen so far, this one is not randomized. Its asymptotic running time will be the same,
O(n²), no matter what order we insert the lines. This is asymptotically optimal, since this is the size of the
arrangement. The algorithm will also require O(n²) space, since this is the amount of storage needed to store the final
result. (Later we will consider the question of whether it is possible to traverse an arrangement without actually
building it.)
As usual, we will assume that the arrangement is represented in any reasonable data structure for planar graphs,
a DCEL for example. Let L = {`1 , `2 , . . . , `n } denote the lines. We will simply add lines one by one to the
arrangement. (The order does not matter.) We will show that the i-th line can be inserted in O(i) time. If we
sum over i, this will give O(n2 ) total time.
Suppose that the first i − 1 lines have already been added, and consider the effort involved in adding `i . Recall
our assumption that the arrangement is assumed to lie within a large bounding box. Since each line intersects
this box twice, the first i − 1 lines have subdivided the boundary of the box into 2(i − 1) edges. We determine
where `i intersects the box, and which of these edges it crosses. This will tell us which face of the
arrangement `i first enters.
Next we trace the line through the arrangement, from one face to the next. Whenever we enter a face, the main
question is where does the line exit the face? We answer the question by a very simple strategy. We walk along
the edges of the face, say in a counterclockwise direction (recall that a DCEL allows us to do this), and as we visit each
edge we ask whether `i intersects this edge. When we find the edge through which `i exits the face, we jump to
the face on the other side of this edge (again the DCEL supports this) and continue the trace. This is illustrated
in the figure below.
(Figure: tracing the line ` through the arrangement; the shaded faces form the zone of `.)
Zone Theorem: The most important combinatorial property of arrangements (which is critical to their efficient con-
struction) is a rather surprising result called the zone theorem. Given an arrangement A of a set L of n lines, and
given a line ` that is not in L, the zone of ` in A(L), denoted ZA(`), is the set of faces whose closure intersects
`. The figure above illustrates a zone for the line `. For the purposes of the above construction, we are only
interested in the edges of the zone that lie below `i , but if we bound the total complexity of the zone, then this
will be an upper bound on the number of edges traversed in the above algorithm. Naively, the combinatorial complexity
of a zone is at most O(n²). The Zone theorem states that the complexity is actually much
smaller, only O(n).
Theorem: (The Zone Theorem) Given an arrangement A(L) of n lines in the plane, and given any line ` in the
plane, the total number of edges in all the cells of the zone ZA (`) is at most 6n.
Proof: As with most combinatorial proofs, the key is to organize everything so that the counting can be done
in an easy way. Note that this is not trivial, because it is easy to see that any one line of L might contribute
many segments to the zone of `. The key in the proof is finding a way to add up the edges so that each line
appears to induce only a constant number of edges into the zone.
The proof is based on a simple inductive argument. We will first imagine, through a suitable rotation, that
` is horizontal, and further that none of the lines of L is horizontal (through an infinitesimal rotation). We
split the edges of the zone into two groups, those that bound some face from the left side and those that
bound some face from the right side. More formally, since each face is convex, if we split it at its topmost
and bottommost vertices, we get two convex chains of edges. The left-bounding edges are on the left chain
and the right-bounding edges are on the right chain. We will show that there are at most 3n left-bounding
edges. (Note that an edge of the zone that crosses ` itself contributes only once to
the complexity of the zone. In the book's proof this edge seems to be counted twice, and hence
they get a bound of 4n instead. We will also ignore the edges of the bounding box.)
For the base case, when n = 1, then there is exactly one left bounding edge in `’s zone, and 1 ≤ 3n.
Assume that the hypothesis is true for any set of n − 1 lines. Consider the line whose intersection with ` is
rightmost. Call this line `1 . (Selecting this particular line is very important for the proof.) Suppose that
we consider the arrangement of the other n − 1 lines. By the induction hypothesis there will be at most
3(n − 1) left-bounding edges in the zone for `.
Now let us add back `1 and see how many more left-bounding edges result. Consider the rightmost face
of the arrangement of n − 1 lines. Note that all of its edges are left-bounding edges. Line `1 will intersect
` within this face. Let ea and eb denote the two edges of this face that `1 intersects, one above ` and the
other below `. The insertion of `1 creates a new left-bounding edge along `1 itself, and splits the left-bounding
edges ea and eb each into two new left-bounding edges, for a net increase of three edges. Observe
that `1 cannot contribute any other left-bounding edges to the zone, because (depending on slope) either
the line supporting ea or the line supporting eb blocks `1 ’s visibility from `. (Note that it might provide
right-bounding edges, but we are not counting them here.) Thus, the total number of left-bounding edges
on the zone is at most 3(n − 1) + 3 ≤ 3n, and hence the total number of edges is at most 6n, as desired.
(Figure: the rightmost face of the zone of `, showing the line `1 and the edges ea and eb that it crosses.)
Applications of Arrangements and Duality: The computational and mathematical tools that we have developed
with geometric duality and arrangements allow a large number of problems to be solved. Here are some exam-
ples. Unless otherwise stated, all these problems can be solved in O(n2 ) time and O(n2 ) space by constructing
a line arrangement, or in O(n2 ) time and O(n) space through topological plane sweep.
General position test: Given a set of n points in the plane, determine whether any three are collinear.
Minimum area triangle: Given a set of n points in the plane, determine the minimum area triangle whose
vertices are selected from these points.
Minimum k-corridor: Given a set of n points, and an integer k, determine the narrowest pair of parallel lines
that enclose at least k points of the set. The distance between the lines can be defined either as the vertical
distance between the lines or the perpendicular distance between the lines.
Visibility graph: Given line segments in the plane, we say that two points are visible if the interior of the line
segment joining them intersects none of the segments. Given a set of n non-intersecting line segments,
compute the visibility graph, whose vertices are the endpoints of the segments, and whose edges are pairs of
visible endpoints.
Maximum stabbing line: Given a set of n line segments in the plane, compute the line that stabs (intersects)
the maximum number of these line segments.
Hidden surface removal: Given a set of n non-intersecting polygons in 3-space, imagine projecting these
polygons onto a plane (either orthogonally or using perspective). Determine which portions of the polygons
are visible from the viewpoint under this projection.
Note that in the worst case, the complexity of the final visible scene may be as high as O(n2 ), so this is
asymptotically optimal. However, since such complex scenes rarely occur in practice, this algorithm is
really only of theoretical interest.
Ham Sandwich Cut: Given n red points and m blue points, find a single line that simultaneously bisects these
point sets. It is a famous fact from mathematics (called the Ham-Sandwich Theorem) that such a line
always exists. If the point sets are separated by a line, then this can be done in O(n + m) time and
O(n + m) space.
Sorting all angular sequences: Here is a natural application of duality and arrangements that turns out to be impor-
tant for the problem of computing visibility graphs. Consider a set of n points in the plane. For each point p in
this set we want to perform an angular sweep, say in counterclockwise order, visiting the other n − 1 points of
the set. Recall the duality transformation, under which a line ` : (y = ax − b) in the primal plane maps to the point
`* = (a, b),
and a point p = (px, py) maps to the dual line
p* : (b = px a − py).
Recall that the a-coordinate in the dual plane corresponds to the slope of a line in the primal plane. Suppose that
p is the point that we want to sort around, and let p1 , p2 , . . . , pn be the points in final angular order about p.
(Figure: the points p, p1, ..., p8 in the primal plane, and their dual lines p*, p*1, ..., p*8; the left-to-right
order of the vertices along p* gives the slope ordering of the points about p.)
Consider the arrangement defined by the dual lines p∗i . How is this order revealed in the arrangement? Consider
the dual line p∗ , and its intersection points with each of the dual lines p∗i . These form a sequence of vertices in
the arrangement along p∗ . Consider this sequence ordered from left to right. It would be nice if this order were
the desired circular order, but this is not quite correct. The a-coordinate of each of these vertices in the dual
arrangement is the slope of some line of the form ppi . Thus, the sequence in which the vertices appear on the
line is a slope ordering of the points about p, not an angular ordering.
However, given this slope ordering, we can simply test which primal points lie to the left of p (that is, have a
smaller x-coordinate in the primal plane), and separate them from the points that lie to the right of p (having
a larger x-coordinate). We partition the vertices into two sorted sequences, and then concatenate these two
sequences, with the points on the right side first and the points on the left side second. The result is an angular
sequence starting at the angle −90 degrees and proceeding up to +270 degrees.
Thus, once the arrangement has been constructed, we can reconstruct each of the angular orderings in O(n)
time, for a total of O(n2 ) time.
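The whole reduction can be checked in a few lines of Python (our own sketch; it assumes no other point shares an x-coordinate with p): sort each side by slope, concatenate right side then left side, and compare against a direct angular sort.

```python
import math

def angular_order(p, points):
    """Counterclockwise angular order about p, starting at -90 degrees,
    obtained from the slope ordering: right-side points first, then
    left-side points, each sorted by increasing slope."""
    px, py = p
    slope = lambda q: (q[1] - py) / (q[0] - px)
    right = sorted((q for q in points if q[0] > px), key=slope)
    left = sorted((q for q in points if q[0] < px), key=slope)
    return right + left

p = (0.0, 0.0)
pts = [(1, 2), (2, 1), (-1, 3), (-2, -1), (1, -3), (-1, -2), (3, 3)]

def angle_from_minus_90(q):
    """Angle of q about p, normalized to the range [-90, 270) degrees."""
    a = math.atan2(q[1] - p[1], q[0] - p[0])
    return a + 2 * math.pi if a < -math.pi / 2 else a

assert angular_order(p, pts) == sorted(pts, key=angle_from_minus_90)
```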
Maximum Discrepancy: Next we consider a problem derived from computer graphics and sampling. Suppose that
we are given a collection S of n points lying in a unit square U = [0, 1]². We want to use these points for
random sampling purposes. In particular, the property that we would like these points to have is that for any
halfplane h, the fraction of points of S that lie within h should be roughly equal to the
area of intersection of h with U . That is, if we define µ(h) to be the area of h ∩ U , and µS (h) = |S ∩ h|/|S|,
then we would like µ(h) ≈ µS (h) for all h. This property is important when point sets are used for things like
sampling and Monte Carlo integration.
To this end, we define the discrepancy of S with respect to a halfplane h to be

ΔS(h) = |µ(h) − µS(h)|.

For example, in the figure below (a), the area of h ∩ U is µ(h) = 0.625, and there are 7 out of 13 points in
h, thus µS (h) = 7/13 ≈ 0.538. Thus the discrepancy of h is |0.625 − 0.538| = 0.087. Define the halfplane
discrepancy of S to be the maximum such discrepancy over all halfplanes:

Δ(S) = max_h ΔS(h).

(Figure: (a) a halfplane h cutting the unit square U containing the points of S; (b) a line ` through a point p,
meeting the boundary of U at distances r1 and r2 from p, rotated through an angle θ.)
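For any fixed halfplane h, both µ(h) and µS(h) are easy to compute. The sketch below (our own code) represents h as a·x + b·y ≤ c, clips the unit square against it (Sutherland-Hodgman) to get µ(h) by the shoelace formula, and returns the discrepancy:

```python
def clip(poly, a, b, c):
    """Sutherland-Hodgman clip of a convex polygon (ccw vertex list)
    against the halfplane a*x + b*y <= c."""
    f = lambda v: a * v[0] + b * v[1] - c
    out = []
    for i, v in enumerate(poly):
        u = poly[i - 1]                       # previous vertex (wraps around)
        fu, fv = f(u), f(v)
        if fu * fv < 0:                       # edge crosses the boundary line
            t = fu / (fu - fv)
            out.append((u[0] + t * (v[0] - u[0]), u[1] + t * (v[1] - u[1])))
        if fv <= 0:                           # v lies inside the halfplane
            out.append(v)
    return out

def area(poly):
    """Polygon area by the shoelace formula."""
    return abs(sum(poly[i - 1][0] * v[1] - v[0] * poly[i - 1][1]
                   for i, v in enumerate(poly))) / 2.0

def discrepancy(points, a, b, c):
    """|mu(h) - mu_S(h)| for the halfplane h = {(x, y) : a*x + b*y <= c}."""
    unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    mu = area(clip(unit_square, a, b, c))                    # area of h ∩ U
    mu_s = sum(a * x + b * y <= c for x, y in points) / len(points)
    return abs(mu - mu_s)
```

For instance, for the halfplane y ≤ 0.5 the clipped area is exactly 0.5, and the discrepancy is just the deviation of the fraction of sample points below that line from 0.5.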
Since there are an uncountably infinite number of halfplanes, it is important to derive some sort of finiteness
criterion on the set of halfplanes that might produce the greatest discrepancy.
Lemma: Let h denote the halfplane that generates the maximum discrepancy with respect to S, and let ` denote
the line that bounds h. Then either (i) ` passes through at least two points of S, or (ii) ` passes through one
point of S, and this point is the midpoint of the line segment ` ∩ U .
Remark: If a line passes through one or more points of S, then should this point be included in µS (h)? For
the purposes of computing the maximum discrepancy, the answer is to either include or omit the point,
whichever will generate the larger discrepancy. The justification is that it is possible to perturb h infinites-
imally so that it includes none or all of these points without altering µ(h).
Proof: If ` does not pass through any point of S, then (depending on which is larger µ(h) or µS (h)) we can
move the line up or down without changing µS (h) and increasing or decreasing µ(h) to increase their
difference. If ` passes through a point p ∈ S, but p is not the midpoint of the line segment ` ∩ U , then we
claim that we can rotate this line about p and hence increase or decrease µ(h) without altering µS (h), to
increase their difference.
To establish the claim, consider the figure above (b). Suppose that the line ` passes through point p, and
let r1 < r2 denote the two lengths along ` from p to the sides of the square. Observe that if we rotate `
through a small angle θ, then to a first-order approximation the loss in area due to the triangle on the left is
r1²θ/2, since this triangle can be approximated by an angular sector of a circle of radius r1 and angle θ. The
gain in area due to the triangle on the right is r2²θ/2. Thus, since r1 < r2, this rotation increases the
area of the region lying below h infinitesimally. A rotation in the opposite direction decreases the area infinitesimally.
Since the number of points bounded by h does not change as a function of θ, the maximum discrepancy cannot be
achieved as long as such a rotation is possible.
(Note that this proof reveals something slightly stronger. If ` contacts two points, the line segment between
these points must contain the midpoint of ` ∩ U . Do you see why?)
Since for each point p ∈ S there are only a constant number of lines ` (at most two) through this point
such that p is the midpoint of ` ∩ U , it follows that there are at most O(n) lines of type (ii) above, and hence the
discrepancies of all of these lines can be tested in O(n²) time.
To compute the discrepancies of the other lines, we can dualize the problem. In the primal plane, a line `
that passes through two points pi , pj ∈ S, is mapped in the dual plane to a point `∗ at which the lines p∗i and
p∗j intersect. This is just a vertex in the arrangement of the dual lines for S. So, if we have computed the
arrangement, then all we need to do is to visit each vertex and compute the discrepancy for the corresponding
primal line.
(Figure: levels in an arrangement of lines, with lines L1, L3, and L5 indicated.)
We claim that it is an easy matter to compute the level of each vertex of the arrangement (e.g. by plane sweep).
The initial levels at x = −∞ are determined by the slope order of the lines. As the plane sweep proceeds,
the index of a line in the sweep line status is its level. Thus, by using topological plane sweep, in O(n²) time
we can compute the minimum and maximum level number of each vertex in the arrangement. From the
order-reversing property, for each vertex of the dual arrangement, the minimum level number minus 1 indicates the
number of primal points that lie strictly below the corresponding primal line, and the maximum level number is
the number of primal points that lie on or below this line. Thus, given the level numbers and the fact that areas can
be computed in O(1) time, we can compute the discrepancy in O(n²) time and O(n) space, through topological
plane sweep.
Shortest paths: We are given a set of n disjoint polygonal obstacles in the plane, and two points s and t that lie
outside of the obstacles. The problem is to determine the shortest path from s to t that avoids the interiors of the
obstacles. (It may travel along the edges or pass through the vertices of the obstacles.) The complement of the
interior of the obstacles is called free space. We want to find the shortest path that is constrained to lie entirely
in free space.
Today we consider a simple (but perhaps not the most efficient) way to solve this problem. We assume that we
measure lengths in terms of Euclidean distances. How do we measure path lengths for curved paths? Luckily,
we do not have to, because we claim that the shortest path will always be a polygonal curve.
Claim: The shortest path between any two points that avoids a set of polygonal obstacles is a polygonal curve,
whose vertices are either vertices of the obstacles or the points s and t.
From this it follows that the edges that constitute the shortest path must travel between s and t and vertices of
the obstacles. Each of these edges must have the property that it does not intersect the interior of any obstacle,
implying that the endpoints must be visible to each other. More formally, we say that two points p and q are
mutually visible if the open line segment joining them does not intersect the interior of any obstacle. By this
definition, the two endpoints of an obstacle edge are not mutually visible, so we will explicitly allow for this
case in the definition below.
Definition: The visibility graph of s and t and the obstacle set is a graph whose vertices are s, t, and the obstacle
vertices, and vertices v and w are joined by an edge if v and w are either mutually visible or if (v, w) is an
edge of some obstacle.
(Figure: the visibility graph of a set of obstacles together with s and t, and the shortest path from s to t.)
It follows from the above claim that the shortest path can be computed by first computing the visibility graph and
labeling each edge with its Euclidean length, and then computing the shortest path by, say, Dijkstra’s algorithm
(see CLR). Note that the visibility graph is not necessarily planar, and may consist of Ω(n²) edges. Also note that,
even if the input points have integer coordinates, in order to compute distances we need to compute square
roots, and then sums of square roots. This can be approximated by floating point computations. (If exactness is
important, this can really be a problem, because there is no known polynomial time procedure for performing
arithmetic with arbitrary square roots of integers.)
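The pipeline just described can be sketched as follows (my own illustration; a naive O(n³) visibility test stands in for the faster algorithms discussed next, and obstacles are modeled as line segments in general position):

```python
import heapq, math

def orient(a, b, c):
    """Signed area test: > 0 if a, b, c make a left turn."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def segments_cross(p, q, a, b):
    """Proper crossing of the open segments pq and ab (general position)."""
    return (orient(p, q, a) * orient(p, q, b) < 0 and
            orient(a, b, p) * orient(a, b, q) < 0)

def shortest_path(s, t, obstacles):
    """Length of the shortest s-t path avoiding the obstacle segments.
    obstacles: list of segments ((x1, y1), (x2, y2))."""
    verts = [s, t] + [v for seg in obstacles for v in seg]

    def visible(p, q):
        # the open segment pq must not properly cross any obstacle
        return not any(segments_cross(p, q, a, b) for (a, b) in obstacles)

    # Dijkstra over the (implicit) visibility graph
    dist = {v: math.inf for v in verts}
    dist[s] = 0.0
    pq = [(0.0, s)]
    done = set()
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        if u == t:
            return d
        for v in verts:
            if v in done or not visible(u, v):
                continue
            nd = d + math.dist(u, v)
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return math.inf
```

With no obstacles this simply returns the Euclidean distance from s to t; with a blocking segment the path bends around one of its endpoints, as the claim above guarantees.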
Computing the Visibility Graph: We give an O(n²) procedure for constructing the visibility graph of n line segments
in the plane. The more general task of computing the visibility graph of an arbitrary set of polygonal
obstacles is an easy generalization. In this context, two vertices are visible if the line segment joining them
does not intersect any of the obstacle line segments. However, we allow each line segment to contribute itself as
an edge in the visibility graph. We will make the general-position assumption that no three vertices are collinear;
the degenerate cases are not hard to handle with some care. The algorithm is not output sensitive. If k denotes the number of
edges in the visibility graph, then an O(n log n + k) algorithm does exist, but it is quite complicated.
The text gives an O(n² log n) time algorithm. We will give an O(n²) time algorithm. Both algorithms are based
on the same concept, namely that of performing an angular sweep around each vertex. The text's algorithm
performs this sweep one vertex at a time; ours performs the sweep for all vertices simultaneously.
We use the fact (given in the lecture on arrangements) that this angular sort can be performed for all vertices in
O(n²) time. If we build the entire arrangement, this sorting algorithm will involve O(n²) space. However, it
can be implemented in O(n) space using an algorithm called topological plane sweep. For each vertex v, let
f (v) and b(v) denote the segments hit by “bullet paths” shot from v forwards and backwards along the current
sweep direction. As the sweep passes a vertex w (as seen from some vertex v), one of the following cases applies.
Same segment: If v and w are endpoints of the same segment, then they are visible, and we add the edge (v, w)
to the visibility graph.
Invisible: Consider the distance from v to w. First, determine whether w lies on the same side as f (v) or b(v).
For the remainder, assume that it is f (v). (The case of b(v) is symmetrical).
Compute the contact point of the bullet path shot from v in direction θ with segment f (v). If this path hits
f (v) strictly before w, then we know that w is not visible to v, and so this is a “non-event”.
Segment entry: Consider the segment that is incident to w. Either the sweep is just about to enter this segment
or is just leaving it. If we are entering the segment, then we set f (v) to this segment.
Segment exit: If we are just leaving this segment, then the bullet path will need to shoot out and find the next
segment that it hits. Normally this would require some searching. (In particular, this is one of the reasons
that the text’s algorithm has the extra O(log n) factor—to perform this search.) However, we claim that
the answer is available to us in O(1) time.
In particular, since we are sweeping over w at the same time that we are sweeping over v, we know
that the bullet extension from w hits f (w). All we need to do is to set f (v) = f (w).
This is a pretty simple algorithm (although there are a number of cases). The only information that we need to
keep track of is (1) a priority queue for the events, and (2) the f (v) and b(v) pointers for each vertex v. The
priority queue is not stored explicitly. Instead it is available from the line arrangement of the duals of the line
segment vertices. By performing a topological sweep of the arrangement, we can process all of these events in
O(n²) time.
Motion planning: Last time we considered the problem of computing the shortest path of a point in space around a
set of obstacles. Today we will study a more general approach to the problem of planning the motion
of one or more robots, each with potentially many degrees of freedom in its movement, and
perhaps having articulated joints.
Work Space and Configuration Space: The environment in which the robot operates is called its work space, which
consists of a set of obstacles that the robot is not allowed to intersect. We assume that the work space is static,
that is, the obstacles do not move. We also assume that a complete geometric description of the work space is
available to us.
For our purposes, a robot will be modeled by two main elements. The first is a configuration, which is a finite
sequence of values that fully specifies the position of the robot. The second element is the robot’s geometric
shape description. Combined, these two elements fully define the robot's exact position and shape in space.
For example, suppose that the robot is a 2-dimensional polygon that can translate and rotate in the plane. Its
configuration may be described by the (x, y) coordinates of some reference point for the robot, and an angle θ
that describes its orientation. Its geometric information would include its shape (say at some canonical position),
given, say, as a simple polygon. Given its geometric description and a configuration (x, y, θ), this uniquely
determines the exact position R(x, y, θ) of this robot in the plane. Thus, the position of the robot can be
identified with a point in the robot’s configuration space.
A more complex example would be an articulated arm consisting of a set of links, connected to one another by a
set of revolute joints. The configuration of such a robot would consist of a vector of joint angles. The geometric
description would probably consist of a geometric representation of the links. Given a sequence of joint angles,
the exact shape of the robot could be derived by combining this configuration information with its geometric
description. For example, a typical 3-dimensional industrial robot has six joints, and hence its configuration
can be thought of as a point in a 6-dimensional space. Why six? Generally, there are three degrees of freedom
needed to specify a location in 3-space, and 3 more degrees of freedom needed to specify the direction and
orientation of the robot’s end manipulator.
Given a point p in the robot’s configuration space, let R(p) denote the placement of the robot at this configura-
tion. The figure below illustrates this in the case of the planar robot defined above.
Because of limitations on the robot’s physical structure and the obstacles, not every point in configuration space
corresponds to a legal placement of the robot. Any configuration which is illegal in that it causes the robot
to intersect one of the obstacles is called a forbidden configuration. The set of all forbidden configurations is
denoted Cforb (R, S), and all other placements are called free configurations, and the set of these configurations
is denoted Cfree (R, S), or free space.
Now consider the motion planning problem in robotics. Given a robot R, a work space S, an initial configuration s, and a final configuration t (both points in the robot's free configuration space), determine (if possible)
a way to move the robot from one configuration to the other without intersecting any of the obstacles. This
reduces to the problem of determining whether there is a path from s to t that lies entirely within the robot’s free
configuration space. Thus, we map the task of computing a robot’s motion to the problem of finding a path for
a single point through a collection of obstacles.
Configuration spaces are typically higher dimensional spaces, and can be bounded by curved surfaces (especially
when rotational elements are involved). Perhaps the simplest case to visualize is that of translating a convex
polygonal robot in the plane amidst a collection of polygonal obstacles. In this case both the work space and
the configuration space are 2-dimensional. For each obstacle P , define the corresponding configuration obstacle
CP to be the set of configurations at which the robot intersects P :

CP = {p | R(p) ∩ P ≠ ∅}.
One way to visualize CP is to imagine “scraping” R along the boundary of P and seeing the region traced out
by R’s reference point.
The problem we consider next is, given R and P , compute the configuration obstacle CP. To do this, we first
introduce the notion of a Minkowski sum. Let us violate our notions of affine geometry for a while, and think of
points (x, y) in the plane as vectors. Given any two sets S1 and S2 in the plane, define their Minkowski sum to
be the set of all pairwise sums of points taken from each set:
S1 ⊕ S2 = {p + q | p ∈ S1 , q ∈ S2 }.
(Figure: left, the configuration obstacle CP = P ⊕ (−R), visualized by scraping R around the boundary of P ; right, obstacles oi and oj whose Minkowski sums with R form pseudodisks.)
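As a quick illustration of the definition (a sketch of mine, not from the notes), the Minkowski sum of two convex polygons can be computed brute force as the convex hull of all pairwise vector sums; for convex inputs an O(m + n) merge of the edge sequences is the efficient alternative:

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in CCW order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def minkowski_sum(P, Q):
    """Brute-force Minkowski sum of convex polygons P and Q:
    the hull of all pairwise sums (O(mn log mn))."""
    return convex_hull([(p[0] + q[0], p[1] + q[1]) for p in P for q in Q])
```

In the translational setting above, the C-obstacle is then CP = P ⊕ (−R), i.e. `minkowski_sum(P, [(-x, -y) for (x, y) in R])`.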
Lemma 1: Given a set of convex objects T1 , T2 , . . . , Tn with disjoint interiors, and a convex R, the set
{Ti ⊕ R | 1 ≤ i ≤ n}
is a collection of pseudodisks.
Proof: Consider two polygons T1 and T2 with disjoint interiors. We want to show that T1 ⊕ R and T2 ⊕ R do
not cross over one another.
Given any directional unit vector d, the most extreme point of R in direction d is the point r ∈ R that
maximizes the dot product (d · r). (Recall that we treat the “points” of the polygons as if they were
vectors.) The point of T1 ⊕ R that is most extreme in direction d is the sum of the points t and r that are
most extreme for T1 and R, respectively.
(Figure: directions d1 and d2, and the points of T1 and T2 that are extreme in each.)
Now, if to the contrary T1 ⊕ R and T2 ⊕ R had a crossing intersection, then observe that we can find points
p1 , p2 , p3 , and p4 , in cyclic order around the boundary of the convex hull of (T1 ⊕ R) ∪ (T2 ⊕ R), such
that p1 , p3 ∈ T1 ⊕ R and p2 , p4 ∈ T2 ⊕ R. First consider p1 . Because it is on the convex hull, consider
the direction d1 perpendicular to the supporting line here. Let r, t1 , and t2 be the extreme points of R, T1 ,
and T2 in direction d1 , respectively. From our basic fact about Minkowski sums we have

p1 = r + t1 ,    p2 = r + t2 .
Since p1 is on the convex hull, it follows that t1 is more extreme than t2 in direction d1 , that is, T1 is
more extreme than T2 in direction d1 . By applying this same argument, we find that T1 is more extreme
than T2 in directions d1 and d3 , but that T2 is more extreme than T1 in directions d2 and d4 . But this is
impossible, since by the observation above there can be at most one alternation in extreme points for
nonintersecting convex polygons. See the figure below.
(Figure: T1 ⊕ R and T2 ⊕ R crossing, with T1 extreme in directions d1 and d3 and T2 extreme in directions d2 and d4 .)
Lemma 2: Given a collection of pseudodisks, with a total of n vertices, the complexity of their union is O(n).
Proof: This is a rather cute combinatorial lemma. We are given some collection of pseudodisks, and told that
altogether they have n vertices. We claim that their entire union has complexity O(n). (Recall that in
general the union of n convex polygons can have complexity O(n²), by criss-crossing.) The proof is
based on a clever charging scheme. Each vertex in the union will be charged to a vertex among the original
pseudodisks, such that no vertex is charged more than twice. This will imply that the total complexity is
at most 2n.
(Figure: the charging scheme: a vertex v of the union, formed by edges e1 and e2 , is charged to a vertex u of one of the pseudodisks.)
But what do we do if both e1 shoots straight through P2 and e2 shoots straight through P1 ? Now we have
no vertex to charge. This is okay, because the pseudodisk property implies that this cannot happen. If both
edges shoot completely through, then the two polygons must cross over each other.
Recall that in our application of this lemma, we have n C-obstacles, each of which has at most m + 3 vertices,
for a total input complexity of O(nm). Since they are all pseudodisks, it follows from Lemma 2 that the total
complexity of the free space is O(nm).
Doubly-connected Edge List: We consider the question of how to represent plane straight-line graphs (or PSLG).
The DCEL is a common edge-based representation. Vertex and face information is also included for whatever
geometric application is using the data structure. There are three sets of records, one for each type of element in the
PSLG: vertex records, edge records, and face records. For the purposes of unambiguously defining left and
right, each undirected edge is represented by two directed half-edges.
We will make a simplifying assumption that faces do not have holes inside of them. This assumption can be
satisfied by introducing some number of dummy edges joining each hole either to the outer boundary of the face,
or to some other hole that has been connected to the outer boundary in this way. With this assumption, it may
be assumed that the edges bounding each face form a single cyclic list.
Vertex: Each vertex stores its coordinates, along with a pointer to any incident directed edge that has this vertex
as its origin, v.inc edge.
Edge: Each undirected edge is represented as two directed edges. Each edge has a pointer to the oppositely
directed edge, called its twin. Each directed edge has an origin and a destination vertex. Each directed edge
is associated with two faces, one to its left and one to its right.
We store a pointer to the origin vertex e.org. (We do not need to define the destination, e.dest, since
it may be defined to be e.twin.org.)
(Figure: a half-edge e with its pointers e.twin, e.org, e.next, e.prev, and its incident face e.left.)
The figure shows two ways of visualizing the DCEL. One is in terms of a collection of doubled-up directed
edges. An alternative way of viewing the data structure that gives a better sense of the connectivity structure is
based on covering each edge with a two element block, one for e and the other for its twin. The next and prev
pointers provide links around each face of the polygon. The next pointers are directed counterclockwise around
each face and the prev pointers are directed clockwise.
Of course, in addition the data structure may be enhanced with whatever application data is relevant. In some
applications, it is not necessary to know either the face or vertex information (or both) at all, and if so these
records may be deleted. See the book for a complete example.
For example, suppose that we wanted to enumerate the vertices that lie on some face f . Here is the code:
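Such a traversal can be sketched as follows (the class layout is my own minimal assumption; the org, next, and inc edge fields follow the description above):

```python
class HalfEdge:
    def __init__(self, org):
        self.org = org          # origin vertex
        self.twin = None        # oppositely directed edge
        self.next = None        # next half-edge CCW around the left face
        self.prev = None        # previous half-edge (CW direction)
        self.left = None        # face to the left of this half-edge

class Face:
    def __init__(self):
        self.inc_edge = None    # some half-edge with this face on its left

def face_vertices(f):
    """Enumerate the vertices on face f by walking the next pointers."""
    verts = []
    e = f.inc_edge
    while True:
        verts.append(e.org)
        e = e.next
        if e is f.inc_edge:     # stop once we return to the starting edge
            break
    return verts
```

The walk visits each boundary edge exactly once, so the running time is proportional to the number of edges on the face.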
Merging subdivisions: Let us return to the application problem that led to the segment intersection problem. Suppose
that we have two planar subdivisions, S1 and S2 , and we want to compute their overlay. In particular, this
is a subdivision whose vertices are the union of the vertices of each subdivision and the points of intersection of
the line segments in the subdivision. (Because we assume that each subdivision is a planar graph, the only new
vertices that could arise will arise from the intersection of two edges, one from S1 and the other from S2 .) Sup-
pose that each subdivision is represented using a DCEL. Can we adapt the plane-sweep algorithm to generate
the DCEL of the overlaid subdivision?
The splitting procedure creates the new edge and links it into place. After this, the edges have been split, but they
are not linked to each other. The edge constructor is given the origin and destination of the new edge and creates
the new edge and its twin. The procedure below initializes all the other fields. Also note that the destination of
a1 , that is, the origin of a1 's twin, must be updated, which we have omitted. The splice procedure interlinks
four edges around a common vertex in the counterclockwise order a1 (entering), b1 (entering), a2 (leaving), b2
(leaving).
(Figure: the splice operation interlinking the edges a1 , b1 , a2 , b2 and their twins a1t, b1t, a2t, b2t around a common vertex.)
Segment Data: So far we have considered geometric data structures for storing points. However, there are many
other types of geometric data that we may want to store in a data structure. Today we consider how to store
orthogonal (horizontal and vertical) line segments in the plane. We assume that a line segment is represented by
giving its pair of endpoints. The segments are allowed to intersect one another.
As a basic motivating query, we consider the following window query. Given a set of orthogonal line segments
S, which have been preprocessed, and given an orthogonal query rectangle W , count or report all the line
segments of S that intersect W . We will assume that W is a closed and solid rectangle, so that even if a line
segment lies entirely inside of W or intersects only the boundary of W , it is still reported. For example, given
the window below, the query would report the segments that are shown with solid lines, and segments with
broken lines would not be reported.
Endpoint inside: Report all the segments of S that have at least one endpoint inside W . (This can be done
using a range query.)
Horizontal through segments: Report all the horizontal segments of S that intersect the left side of W . (This
reduces to a vertical segment stabbing query.)
Vertical through segments: Report all the vertical segments of S that intersect the bottom side of W . (This
reduces to a horizontal segment stabbing query.)
We will present a solution to the problem of vertical segment stabbing queries. Before dealing with this, we
will first consider a somewhat simpler problem, and then modify this simple solution to deal with the general
problem.
Vertical Line Stabbing Queries: Let us consider how to answer the following query, which is interesting in its own
right. Suppose that we are given a collection of horizontal line segments S in the plane and are given an (infinite)
vertical query line `q : x = xq . We want to report all the line segments of S that intersect `q . Notice that for
the purposes of this query, the y-coordinates are really irrelevant, and may be ignored. We can think of each
horizontal line segment as being a closed interval along the x-axis. We show an example in the figure below on
the left.
As is true for all our data structures, we want some balanced way to decompose the set of intervals into subsets.
Since it is difficult to define some notion of order on intervals, we instead will order the endpoints. Sort the
interval endpoints along the x-axis. Let hx1 , x2 , . . . , x2n i be the resulting sorted sequence. Let xmed be the
median of these 2n endpoints. Split the intervals into three groups, L, those that lie strictly to the left of xmed ,
R those that lie strictly to the right of xmed , and M those that contain the point xmed . We can then define a
binary tree by putting the intervals of L in the left subtree and recursing, putting the intervals of R in the right
subtree and recursing. Note that if xq < xmed we can eliminate the right subtree and if xq > xmed we can
eliminate the left subtree. See the figure right.
(Figure: a set of intervals along the x-axis, left, and the recursive decomposition about the median endpoint, right.)
But how do we handle the intervals of M that contain xmed ? We want to know which of these intervals
intersects the vertical line `q . At first it may seem that we have made no progress, since it appears that we are
back to the same problem that we started with. However, we have gained the information that all these intervals
intersect the vertical line x = xmed . How can we use this to our advantage?
Let us suppose for now that xq ≤ xmed . How can we store the intervals of M to make it easier to report those
that intersect `q ? The simple trick is to sort these intervals in increasing order of their left endpoint. Let ML denote
the resulting sorted list. Observe that if some interval in ML does not intersect `q , then its left endpoint must be
to the right of xq , and hence none of the subsequent intervals intersects `q . Thus, to report all the segments of
ML that intersect `q , we simply traverse the sorted list and list elements until we find one that does not intersect
`q , that is, whose left endpoint lies to the right of xq . As soon as this happens we terminate. If k′ denotes the
total number of segments of M that intersect `q , then clearly this can be done in O(k′ + 1) time.
On the other hand, what do we do if xq > xmed ? This case is symmetrical. We simply sort all the segments of
M in a sequence, MR , which is sorted from right to left based on the right endpoint of each segment. Thus each
element of M is stored twice, but this will not affect the size of the final data structure by more than a constant
factor. The resulting data structure is called an interval tree.
Interval Trees: The general structure of the interval tree was derived above. Each node of the interval tree has a left
child, right child, and itself contains the median x-value used to split the set, xmed , and the two sorted sets ML
and MR (represented either as arrays or as linked lists) of intervals that overlap xmed . We assume that there is
a constructor that builds a node given these three entities. The following high-level pseudocode describes the
basic recursive step in the construction of the interval tree. The initial call is root = IntTree(S), where
S is the initial set of intervals. Unlike most of the data structures we have seen so far, this one is not built by
the successive insertion of intervals (although it would be possible to do so). Rather we assume that a set of
intervals S is given as part of the constructor, and the entire structure is built all at once. We assume that each
interval in S is represented as a pair (xlo , xhi ). An example is shown in the following figure.
We assert that the height of the tree is O(log n). To see this observe that there are 2n endpoints. Each time
through the recursion we split this into two subsets L and R of sizes at most half the original size (minus the
elements of M ). Thus after at most lg(2n) levels we will reduce the set sizes to 1, after which the recursion
bottoms out. Thus the height of the tree is O(log n).
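The constructor and the line-stabbing query can be sketched as follows (my own simplified version: ties xq = xmed are sent to the ML scan rather than the right subtree, which avoids the inefficiency discussed below):

```python
class IntervalTreeNode:
    def __init__(self, xmed, ML, MR, left, right):
        self.xmed = xmed    # median endpoint used to split the set
        self.ML = ML        # intervals containing xmed, by increasing left endpoint
        self.MR = MR        # same intervals, by decreasing right endpoint
        self.left = left    # intervals strictly left of xmed
        self.right = right  # intervals strictly right of xmed

def build_interval_tree(S):
    """S: list of intervals (xlo, xhi)."""
    if not S:
        return None
    endpoints = sorted(x for iv in S for x in iv)
    xmed = endpoints[len(endpoints) // 2]
    L = [iv for iv in S if iv[1] < xmed]
    R = [iv for iv in S if iv[0] > xmed]
    M = [iv for iv in S if iv[0] <= xmed <= iv[1]]
    return IntervalTreeNode(xmed,
                            sorted(M),
                            sorted(M, key=lambda iv: -iv[1]),
                            build_interval_tree(L),
                            build_interval_tree(R))

def stab(t, xq, out):
    """Append to out all intervals of the tree that contain xq."""
    if t is None:
        return
    if xq <= t.xmed:
        for iv in t.ML:           # scan until a left endpoint passes xq
            if iv[0] > xq:
                break
            out.append(iv)
        stab(t.left, xq, out)
    else:
        for iv in t.MR:           # scan until a right endpoint passes xq
            if iv[1] < xq:
                break
            out.append(iv)
        stab(t.right, xq, out)
```

For simplicity this version sorts at every node (O(n log n) per level); the presorting scheme described below removes that overhead.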
Implementing this constructor efficiently is a bit subtle. We need to compute the median of the set of all
endpoints, and we also need to sort intervals by left endpoint and right endpoint. The fastest way to do this is to
presort all these values and store them in three separate lists. Then, as the sets L, R, and M are computed, we
simply copy the items from these sorted lists into the corresponding lists for the recursive calls, preserving their order.

(Figure: an interval tree for a set of intervals with endpoints between 0 and 30.)
This procedure actually has one small source of inefficiency, which was intentionally included to make the code
look more symmetric. Can you spot it? Suppose that xq = t.xmed . In this case we will recursively search the
right subtree. However, this subtree contains only intervals that are strictly to the right of xmed , and so this is a waste
of effort. However, it does not affect the asymptotic running time.
As mentioned earlier, the time spent processing each node is O(1 + k′), where k′ is the number of intervals
reported at this node. Summing over all nodes, the total reporting time is O(k + v), where k is the
total number of intervals reported, and v is the total number of nodes visited. Since at each node we recurse on
only one child or the other, the total number of nodes visited v is O(log n), the height of the tree. Thus the total
reporting time is O(k + log n).
Vertical Segment Stabbing Queries: Now let us return to the question that brought us here. Given a set of horizontal
line segments in the plane, we want to know how many of these segments intersect a vertical line segment. Our
approach will be exactly the same as in the interval tree, except for how the elements of M (those that intersect
the splitting line x = xmed ) are handled.
Going back to our interval tree solution, let us consider the set M of horizontal line segments that intersect
the splitting line x = xmed and as before let us consider the case where the query segment q with endpoints
(xq , ylo ) and (xq , yhi ) lies to the left of the splitting line. The simple trick of sorting the segments of M by
their left endpoints is not sufficient here, because we need to consider the y-coordinates as well. Observe that
a segment of M stabs the query segment q if and only if the left endpoint of a segment lies in the following
semi-infinite rectangular region.
This is illustrated in the figure below. Observe that this is just an orthogonal range query. (It is easy to generalize
the procedure given last time to handle semi-infinite rectangles.) The case where q lies to the right of xmed is
symmetrical.
Figure 91: The segments that stab q lie within the shaded semi-infinite rectangle.
So the solution is that rather than storing ML as a list sorted by the left endpoint, instead we store the left
endpoints in a 2-dimensional range tree (with cross-links to the associated segments). Similarly, we create a
range tree for the right endpoints and represent MR using this structure.
The segment stabbing queries are answered exactly as above for line stabbing queries, except that the part that
searches ML and MR (the for-loops) is replaced by searches in the appropriate range tree, using the semi-infinite
range given above.
We will not discuss construction time for the tree. (It can be done in O(n log n) time, but this involves some
thought as to how to build all the range trees efficiently). The space needed is O(n log n), dominated primarily
from the O(n log n) space needed for the range trees. The query time is O(k + log³ n), since we need to answer
O(log n) range queries and each takes O(log² n) time plus the time for reporting. If we use the spiffy version
of range trees (which we mentioned but never discussed) that can answer queries in O(k + log n) time, then we
can reduce the total time to O(k + log² n).
Point Location: The point location problem (in 2-space) is: given a polygonal subdivision of the plane (that is, a
PSLG) with n vertices, preprocess this subdivision so that given a query point q, we can efficiently determine
which face of the subdivision contains q. We may assume that each face has some identifying label, which is to
be returned. We also assume that the subdivision is represented in any “reasonable” form (e.g. as a DCEL). In
general q may coincide with an edge or vertex. To simplify matters, we will assume that q does not lie on an
edge or vertex, but these special cases are not hard to handle.
It is remarkable that although this seems like such a simple and natural problem, it took quite a long time to
discover a method that is optimal with respect to both query time and space. It has long been known that
there are data structures that can perform these searches reasonably well (e.g. quad-trees and kd-trees), but for
which no good theoretical bounds could be proved. There were data structures with O(log n) query time but
O(n log n) space, and others with O(n) space but O(log² n) query time.
The first construction to achieve both O(n) space and O(log n) query time was a remarkably clever construction
due to Kirkpatrick. It turns out that Kirkpatrick’s idea has some large embedded constant factors that make it
less attractive practically, but the idea is so clever that it is worth discussing, nonetheless. Later we will discuss
a more practical randomized method that is presented in our text.
Let T0 denote the initial triangulation. What Kirkpatrick’s method does is to produce a sequence of triangula-
tions, T0 , T1 , T2 , . . . , Tk , where k = O(log n), such that Tk consists only of a single triangle (the exterior face
of T0 ), and each triangle in Ti+1 overlaps a constant number of triangles in Ti .
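Granting such a sequence of triangulations, a point-location query simply walks from the single triangle of Tk down to T0, testing the O(1) overlapping triangles at each level; a sketch (the data-structure layout is my own assumption):

```python
def point_in_tri(q, tri):
    """Is q inside the CCW triangle tri = (a, b, c), boundary included?"""
    (ax, ay), (bx, by), (cx, cy) = tri
    qx, qy = q

    def left_of(px, py, rx, ry):
        # q on or to the left of the directed line p -> r
        return (rx - px) * (qy - py) - (ry - py) * (qx - px) >= 0

    return (left_of(ax, ay, bx, by) and left_of(bx, by, cx, cy)
            and left_of(cx, cy, ax, ay))

def locate(q, hierarchy, children):
    """hierarchy[k] lists the triangles of T_k (hierarchy[-1] holds the single
    top triangle); children[tri] lists the O(1) triangles of the next finer
    triangulation that tri overlaps."""
    tri = hierarchy[-1][0]
    if not point_in_tri(q, tri):
        return None                      # q lies outside T_0
    for _ in range(len(hierarchy) - 1):  # walk down one level at a time
        tri = next(t for t in children[tri] if point_in_tri(q, t))
    return tri
```

Each of the O(log n) levels takes O(1) time, giving the desired O(log n) query time.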
We will see how to use such a structure for point location queries later, but for now let us concentrate on how
to build such a sequence of triangulations. Assuming that we have Ti , we wish to compute Ti+1 . In order to
guarantee that this process will terminate after O(log n) stages, we will want to make sure that the number of
vertices in Ti+1 decreases by some constant factor from the number of vertices in Ti . In particular, this will
be done by carefully selecting a subset of vertices of Ti and deleting them (and along with them, all the edges
attached to them). After these vertices have been deleted, we need to retriangulate the resulting graph to form
Ti+1 . The question is: How do we select the vertices of Ti to delete, so that each triangle of Ti+1 overlaps only
a constant number of triangles in Ti ?
There are two things that Kirkpatrick observed at this point, that make the whole scheme work.
Constant degree: We will make sure that each of the vertices that we delete has constant (≤ d) degree (that
is, each is adjacent to at most d edges). Note that when we delete such a vertex, the resulting hole will
consist of at most d − 2 triangles. When we retriangulate, each of the new triangles can overlap at most d
triangles in the previous triangulation.
Independent set: We will make sure that no two of the vertices that are deleted are adjacent to each other,
that is, the vertices to be deleted form an independent set in the current planar graph Ti . This will make
retriangulation easier, because when we remove m independent vertices (and their incident edges), we
create m independent holes (non triangular faces) in the subdivision, which we will have to retriangulate.
However, each of these holes can be triangulated independently of one another. (Since each hole contains
a constant number of vertices, we can use any triangulation algorithm, even brute force, since the running
time will be O(1) in any case.)
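The retriangulation step can be sketched as follows. This is a minimal illustration (the helper name is invented, not from the notes): a fan triangulation assumes the hole is star-shaped as seen from its first boundary vertex, which need not hold in general, but it shows why a d-vertex hole always yields exactly d − 2 triangles.

```python
# Sketch (hypothetical helper): deleting a vertex of degree d leaves a hole
# bounded by its d former neighbors, and any triangulation of a d-vertex
# polygon has exactly d - 2 triangles. A fan triangulation suffices when the
# hole is star-shaped from boundary[0]; since d <= 8 here, even brute force
# runs in O(1) time per hole.

def retriangulate_hole(boundary):
    """Fan-triangulate a polygonal hole given its boundary vertices in order."""
    v0 = boundary[0]
    return [(v0, boundary[i], boundary[i + 1])
            for i in range(1, len(boundary) - 1)]

# Deleting a degree-6 vertex leaves a 6-vertex hole, hence 4 new triangles.
print(len(retriangulate_hole(["n1", "n2", "n3", "n4", "n5", "n6"])))  # 4
```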
An important question for the success of this idea is whether we can always find a sufficiently large independent
set of vertices of bounded degree. We want the size of this set to be at least a constant fraction of the current
number of vertices.
Lemma: Given a planar graph with n vertices, there is an independent set consisting of vertices of degree at
most 8, with at least n/18 vertices. This independent set can be constructed in O(n) time.
We will present the proof of this lemma later. The number 18 seems rather large. The number is probably
smaller in practice, but this is the best bound that this proof generates. However, the size of this constant is
one of the reasons that Kirkpatrick’s algorithm is not used in practice. But the construction is quite clever,
nonetheless, and once a optimal solution is known to a problem it is often not long before a practical optimal
solution follows.
Kirkpatrick Structure: Assuming the above lemma, let us give the description of how the point location data struc-
ture, the Kirkpatrick structure, is constructed. We start with T0 , and repeatedly select an independent set of
vertices of degree at most 8. We never include the three vertices a, b, and c (forming the outer face) in such an
independent set. We delete the vertices from the independent set from the graph, and retriangulate the resulting
holes. Observe that each triangle in the new triangulation can overlap at most 8 triangles in the previous trian-
gulation. Since we can eliminate a constant fraction of vertices with each stage, after O(log n) stages, we will
be down to the last 3 vertices.
The constant factors here are not so great. With each stage, the number of vertices falls by a factor of 17/18. To
reduce to the final three vertices implies that (18/17)^k = n, or that
k = log_{18/17} n ≈ 12 lg n.
It can be shown that by always selecting the vertex of smallest degree, this can be reduced to a more palatable
4.5 lg n.
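The constant 12 above can be checked numerically, since log_{18/17} n = (lg n)/lg(18/17):

```python
import math

# Numeric check of the constant in k = log_{18/17} n ~ 12 lg n:
# log_{18/17} n = (lg n) / lg(18/17), so the constant is 1 / lg(18/17).
c = 1.0 / math.log2(18.0 / 17.0)
print(round(c, 2))  # 12.13
```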
The data structure is based on this decomposition. The root of the structure corresponds to the single triangle
of Tk . The nodes at the next lower level are the triangles of Tk−1 , followed by Tk−2 , until we reach the leaves,
which are the triangles of our initial triangulation, T0 . Each node for a triangle in triangulation Ti+1 , stores
pointers to all the triangles it overlaps in Ti (there are at most 8 of these). Note that this structure is a directed
acyclic graph (DAG) and not a tree, because one triangle may have many parents in the data structure. This is
shown in the following figure.
To locate a point, we start with the root, Tk . If the query point does not lie within this single triangle, then we
are done (it lies in the exterior face). Otherwise, we search each of the (at most 8) triangles in Tk−1 that overlap
this triangle. When we find the correct one, we search each of the triangles in Tk−2 that overlap this triangle,
and so forth. Eventually we will find the triangle containing the query point in the last triangulation, T0 , and this
is the desired output. See the figure below for an example.
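The descent through the DAG can be sketched as follows. This is a hypothetical minimal structure (the Node class and labels are invented for illustration, not the notes' implementation); each node stores one triangle and pointers to the overlapping triangles one level down.

```python
# Sketch of the Kirkpatrick point-location descent (hypothetical structure).

def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_triangle(q, tri):
    a, b, c = tri
    d1, d2, d3 = orient(a, b, q), orient(b, c, q), orient(c, a, q)
    return (d1 >= 0 and d2 >= 0 and d3 >= 0) or (d1 <= 0 and d2 <= 0 and d3 <= 0)

class Node:
    def __init__(self, tri, label=None, children=()):
        self.tri, self.label, self.children = tri, label, list(children)

def locate(q, root):
    if not in_triangle(q, root.tri):
        return None                      # q lies in the exterior face
    node = root
    while node.children:                 # descend one triangulation level
        node = next(c for c in node.children if in_triangle(q, c.tri))
    return node.label

# Two-level example: the outer triangle of T_1 overlaps the three triangles
# of T_0 obtained by connecting an interior vertex d to a, b, c.
a, b, c, d = (0, 0), (4, 0), (0, 4), (1, 1)
leaves = [Node((a, b, d), "abd"), Node((b, c, d), "bcd"), Node((c, a, d), "cad")]
root = Node((a, b, c), children=leaves)
print(locate((2, 0.5), root))  # abd
```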
Construction and Analysis: The structure has O(log n) levels (one for each triangulation), it takes a constant amount
of time to move from one level to the next (at most 8 point-in-triangle tests), thus the total query time is O(log n).
The size of the data structure is the sum of sizes of the triangulations. Since the number of triangles in a
triangulation is proportional to the number of vertices, it follows that the size is proportional to
n + (17/18)n + (17/18)^2 n + . . . ≤ 18n
(using standard formulas for geometric series). Thus the data structure size is O(n) (again, with a pretty hefty
constant).
The last thing that remains is to show how to construct the independent set of the appropriate size. We first
present the algorithm for finding the independent set, and then prove the bound on its size.
(1) Mark all nodes of degree ≥ 9.
(2) While there exists an unmarked vertex, select one (call it v), add v to the independent set, and mark v
and all of its neighbors.
(Figure: the sequence of triangulations T0 , . . . , T4 and the corresponding DAG; each triangle of Ti+1 points to
the triangles of Ti that it overlaps.)
Next, we claim that there must be at least n/2 vertices of degree 8 or less. To see why, suppose to the contrary
that there were more than n/2 vertices of degree 9 or greater. The remaining vertices must have degree at least
3 (with the possible exception of the 3 vertices on the outer face), and thus the sum of all degrees in the graph
would have to be at least as large as
9 (n/2) + 3 (n/2) = 6n,
which contradicts the fact that the sum of the vertex degrees in a planar graph is less than 6n (a consequence of
Euler’s formula).
Now, when the above algorithm starts execution, at least n/2 vertices are initially unmarked. Whenever we
select such a vertex, because its degree is 8 or fewer, we mark at most 9 new vertices (this node and at most 8
of its neighbors). Thus, this step can be repeated at least (n/2)/9 = n/18 times before we run out of unmarked
vertices. This completes the proof.
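The marking procedure just described can be sketched as follows. This is a minimal sketch: the adjacency-list input format is an assumption, and for simplicity it omits the rule that the three outer-face vertices a, b, c are never selected.

```python
# Sketch of the independent-set construction: mark all vertices of degree >= 9,
# then repeatedly pick an unmarked vertex, add it to the independent set, and
# mark it together with all of its neighbors. One linear pass: O(n) total.

def low_degree_independent_set(adj, max_deg=8):
    marked = {v for v in adj if len(adj[v]) > max_deg}
    indep = []
    for v in adj:
        if v not in marked:
            indep.append(v)
            marked.add(v)
            marked.update(adj[v])
    return indep

# K4 (planar): every vertex has degree 3, and any single vertex is a maximal
# independent set, so exactly one vertex is selected.
adj = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
S = low_degree_independent_set(adj)
assert all(u not in adj[v] for u in S for v in S if u != v)   # independence
print(S)  # [0]
```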
Convex Hull Size Verification Problem (CHSV): Given a point set P and integer h, does the convex hull of
P have h distinct vertices?
Clearly if this takes Ω(n log h) time, then computing the hull must take at least as long. As with sorting, we
will assume that the computation is described in the form of a decision tree. Assuming that the algorithm uses
only comparisons is too restrictive for computing convex hulls, so we will generalize the model of computation
to allow for more complex functions. We assume that we are allowed to compute any algebraic function of the
point coordinates, and test the sign of the resulting function. The result is called an algebraic decision tree.
The input to the CHSV problem is a sequence of 2n = N real numbers. We can think of these numbers as
forming a vector (z1 , z2 , . . . , zN ) = ~z ∈ RN , which we will call a configuration. Each node of the decision tree
is associated with a multivariate algebraic formula of degree at most d; for example,
f (~z) = z1 z4 − 2z3 z6 + z2^2
would be an algebraic function of degree 2. The node branches in one of three ways, depending on whether the
result is negative, zero, or positive. (Computing orientations and dot-products both fall into this category.) Each
leaf of the resulting tree corresponds to a possible answer that the algorithm might give.
For each input vector ~z to the CHSV problem, the answer is either “yes” or “no”. The set of all “yes” points
is just a subset of points Y ⊂ RN , that is a region in this space. Given an arbitrary input ~z the purpose of the
decision tree is to tell us whether this point is in Y or not. This is done by walking down the tree, evaluating
the functions on ~z and following the appropriate branches until arriving at a leaf, which is either labeled “yes”
(meaning ~z ∈ Y ) or “no”. An abstract example (not for the convex hull problem) of a region of configuration
space and a possible algebraic decision tree (of degree 1) is shown in the following figure. (We have simplified
it by making it a binary tree.) In this case the input is just a pair of real numbers.
Theorem: Let W ⊆ R^N be any set and let T be any d-th order algebraic decision tree that determines
membership in W . If W has M disjoint connected components, then T must have height at least Ω((log M ) − N ).
Multiset Size Verification Problem (MSV): Given a multiset of n real numbers and an integer k, confirm that
the multiset has exactly k distinct elements.
Lemma: The MSV problem requires Ω(n log k) steps in the worst case in the d-th order algebraic decision tree
model.
Proof: In terms of points in R^n , the set of points for which the answer is “yes” is
Y = { (z1 , . . . , zn ) ∈ R^n : |{z1 , . . . , zn }| = k }.
It suffices to show that there are at least k! k^(n−k) different connected components in this set, because by
Ben-Or’s result it would follow that the time to test membership in Y would be
Ω((log M ) − n) = Ω(log(k! k^(n−k)) − n) = Ω(n log k).
Consider all the tuples (z1 , . . . , zn ) with z1 , . . . , zk set to the distinct integers from 1 to k, and zk+1 , . . . , zn
each set to an arbitrary integer in the same range. Clearly there are k! ways to select the first k elements
and k^(n−k) ways to select the remaining elements. Each such tuple has exactly k distinct items, but it is
not hard to see that if we attempt to continuously modify one of these tuples to equal another one, we
must change the number of distinct elements, implying that each of these tuples is in a different connected
component of Y .
To finish the lower bound proof, we argue that any instance of MSV can be reduced to the convex hull size
verification problem (CHSV). Thus any lower bound for MSV problem applies to CHSV as well.
The proof is rather unsatisfying, because it relies on the fact that there are many duplicate points. You might
wonder, does the lower bound still hold if there are no duplicates? Kirkpatrick and Seidel actually prove a stronger
(but harder) result that the Ω(n log h) lower bound holds even if you assume that the points are distinct.
Divide-and-Conquer: (For both VD and DT.) The first O(n log n) algorithm for this problem. Not widely used
because it is somewhat hard to implement. Can be generalized to higher dimensions with some difficulty.
Can be generalized to computing Voronoi diagrams of line segments with some difficulty.
Randomized Incremental: (For DT and VD.) The simplest. O(n log n) time with high probability. Can be
generalized to higher dimensions as with the randomized algorithm for convex hulls. Can be generalized
to computing Voronoi diagrams of line segments fairly easily.
Fortune’s Plane Sweep: (For VD.) A very clever and fairly simple algorithm. It computes a “deformed”
Voronoi diagram by plane sweep in O(n log n) time, from which the true diagram can be extracted easily.
Can be generalized to computing Voronoi diagrams of line segments fairly easily.
Reduction to convex hulls: (For DT.) Computing a Delaunay triangulation of n points in dimension d can
be reduced to computing a convex hull of n points in dimension d + 1. Use your favorite convex hull
algorithm. Unclear how to generalize to compute Voronoi diagrams of line segments.
We will cover all of these approaches, except Fortune’s algorithm. O’Rourke does not give detailed explanations
of any of these algorithms, but he does discuss the idea behind Fortune’s algorithm. Today we will discuss the
divide-and-conquer algorithm. This algorithm is presented in Mulmuley, Section 2.8.4.
Divide-and-conquer algorithm: The divide-and-conquer approach works like most standard geometric divide-and-
conquer algorithms. We split the points according to x-coordinates into 2 roughly equal sized groups (e.g. by
presorting the points by x-coordinate and selecting medians). We compute the Voronoi diagram of the left side,
and the Voronoi diagram of the right side. Note that since each diagram alone covers the entire plane, these two
diagrams overlap. We then merge the resulting diagrams into a single diagram.
The merging step is where all the work is done. Observe that every point in the plane lies within two
Voronoi polygons, one in V D(L) and one in V D(R). We need to resolve this overlap, by separating overlapping
polygons. Let V (l0 ) be the Voronoi polygon for a point from the left side, and let V (r0 ) be the Voronoi polygon
for a point on the right side, and suppose these polygons overlap one another. Observe that if we insert the
bisector between l0 and r0 , and throw away the portions of the polygons that lie on the “wrong” side of the
bisector, we resolve the overlap. If we do this for every pair of overlapping Voronoi polygons, we get the final
Voronoi diagram. This is illustrated in the figure below.
The union of these bisectors that separate the left Voronoi diagram from the right Voronoi diagram is called the
contour. A point is on the contour if and only if it is equidistant from 2 points in S, one in L and one in R.
If the merge step can be performed in O(n) time, then the total running time is given by the recurrence
T (n) = 2T (n/2) + n,
which solves to O(n log n).
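As a quick sanity check of this recurrence (a sketch; the base case T(1) = 1 is an assumption), for n a power of two it solves exactly to n lg n + n:

```python
import math
from functools import lru_cache

# Numeric check that T(n) = 2 T(n/2) + n with T(1) = 1 solves to O(n log n):
# for n a power of two, T(n) = n lg n + n exactly.
@lru_cache(maxsize=None)
def T(n):
    return 1 if n == 1 else 2 * T(n // 2) + n

n = 1024
print(T(n), int(n * math.log2(n) + n))  # both 11264
```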
Lemma: The contour consists of a single polygonal curve (whose first and last edges are semiinfinite) which is
monotone with respect to the y-axis.
Proof: A detailed proof is a real hassle. Here are the main ideas, though. The contour separates the plane into
two regions, those points whose nearest neighbor lies in L from those points whose nearest neighbor lies
in R. Because the contour locally consists of points that are equidistant from 2 points, it is formed from
pieces that are perpendicular bisectors, with one point from L and the other point from R. Thus, it is a
piecewise polygonal curve. Because no 4 points are cocircular, it follows that all vertices in the Voronoi
diagram can have degree at most 3. However, because the contour separates the plane into only 2 types of
regions, it can contain only vertices of degree 2. Thus it can consist only of the disjoint union of closed
curves (actually this never happens, as we will see) and unbounded curves. Observe that if we orient the
contour counterclockwise with respect to each point in R (clockwise with respect to each point in L), then
each segment must be directed in the −y direction, because L and R are separated by a vertical line. Thus,
the contour contains no horizontal cusps. This implies that the contour cannot contain any closed curves,
and hence contains only vertically monotone unbounded curves. This orientability also implies that
there is only one such curve.
Lemma: The topmost (bottommost) edge of the contour is the perpendicular bisector for the two points forming
the upper (lower) tangent of the left hull and the right hull.
Proof: This follows from the fact that the vertices of the hull correspond to unbounded Voronoi polygons, and
hence upper and lower tangents correspond to unbounded edges of the contour.
These last two lemmas suggest the general approach. We start by computing the upper tangent, which we know
can be done in linear time (once we know the left and right hulls, or by prune and search). Then, we start tracing
the contour from top to bottom. When we are in Voronoi polygons V (l0 ) and V (r0 ) we trace the bisector
between l0 and r0 in the negative y-direction until its first contact with the boundaries of one of these polygons.
Suppose that we hit the boundary of V (l0 ) first. Assuming that we use a good data structure for the Voronoi
diagram (e.g. quad-edge data structure) we can determine the point l1 lying on the other side of this edge in the
left Voronoi diagram. We continue following the contour by tracing the bisector of l1 and r0 .
However, in order to ensure efficiency, we must be careful in how we determine where the bisector hits the edge
of the polygon. Consider the figure shown below. We start tracing the contour between l0 and r0 . By walking
along the boundary of V (l0 ) we can determine the edge that the contour would hit first. This can be done in
time proportional to the number of edges in V (l0 ) (which can be as large as O(n)). However, we discover that
before the contour hits the boundary of V (l0 ) it hits the boundary of V (r0 ). We find the new point r1 and now
trace the bisector between l0 and r1 . Again we can compute the intersection with the boundary of V (l0 ) in time
proportional to its size. However the contour hits the boundary of V (r1 ) first, and so we go on to r2 . As can be
seen, if we are not smart, we can rescan the boundary of V (l0 ) over and over again, until the contour finally hits
the boundary. If we do this O(n) times, and the boundary of V (l0 ) is O(n), then we are stuck with O(n2 ) time
to trace the contour.
We have to avoid this repeated rescanning. However, there is a way to scan the boundary of each Voronoi
polygon at most once. Observe that as we walk along the contour, each time we stay in the same polygon
V (l0 ), we are adding another edge onto its Voronoi polygon. Because the Voronoi polygon is convex, we know
that the edges we are creating turn consistently in the same direction (clockwise for points on the left, and
counterclockwise for points on the right). To test for intersections between the contour and the current Voronoi
polygon, we trace the boundary of the polygon clockwise for polygons on the left side, and counterclockwise
for polygons on the right side. Whenever the contour changes direction, we continue the scan from the point
that we left off. In this way, we know that we will never need to rescan the same edge of any Voronoi polygon
more than once.
Delaunay Triangulations and Convex Hulls: At first, Delaunay triangulations and convex hulls appear to be quite
different structures, one is based on metric properties (distances) and the other on affine properties (collinearity,
coplanarity). Today we show that it is possible to convert the problem of computing a Delaunay triangulation
in dimension d to that of computing a convex hull in dimension d + 1. Thus, there is a remarkable relationship
between these two structures.
We will demonstrate the connection in dimension 2 (by computing a convex hull in dimension 3). Some of
this may be hard to visualize, but see O’Rourke for illustrations. (You can also reason by analogy in one lower
dimension of Delaunay triangulations in 1-d and convex hulls in 2-d, but the real complexities of the structures
are not really apparent in this case.)
The connection between the two structures is the paraboloid z = x2 + y 2 . Observe that this equation defines
a surface whose vertical cross sections (constant x or constant y) are parabolas, and whose horizontal cross
sections (constant z) are circles. For each point in the plane, (x, y), the vertical projection of this point onto
this paraboloid is (x, y, x2 + y 2 ) in 3-space. Given a set of points S in the plane, let S 0 denote the projection
of every point in S onto this paraboloid. Consider the lower convex hull of S 0 . This is the portion of the convex
hull of S 0 which is visible to a viewer standing at z = −∞. We claim that if we take the lower convex hull of
S 0 , and project it back onto the plane, then we get the Delaunay triangulation of S. In particular, let p, q, r ∈ S,
and let p0 , q 0 , r0 denote the projections of these points onto the paraboloid. Then p0 q 0 r0 define a face of the lower
convex hull of S 0 if and only if 4pqr is a triangle of the Delaunay triangulation of S. The process is illustrated
in the following figure.
The question is, why does this work? To see why, we need to establish the connection between the triangles of
the Delaunay triangulation and the faces of the convex hull of transformed points. In particular, recall that
Delaunay condition: Three points p, q, r ∈ S form a Delaunay triangle if and only if the circumcircle of these
points contains no other point of S.
Convex hull condition: Three points p0 , q 0 , r0 ∈ S 0 form a face of the convex hull of S 0 if and only if the plane
passing through p0 , q 0 , and r0 has all the points of S 0 lying to one side.
Clearly, the connection we need to establish is between the emptiness of circumcircles in the plane and the
emptiness of halfspaces in 3-space. We will prove the following claim.
Lemma: Consider 4 distinct points p, q, r, s in the plane, and let p0 , q 0 , r0 , s0 be their respective projections onto
the paraboloid, z = x2 + y 2 . The point s lies within the circumcircle of p, q, r if and only if s0 lies on the
lower side of the plane passing through p0 , q 0 , r0 .
To prove the lemma, first consider an arbitrary (nonvertical) plane in 3-space, which we assume is tangent to
the paraboloid above some point (a, b) in the plane. To determine the equation of this tangent plane, we take
derivatives of the equation z = x^2 + y^2 with respect to x and y, giving
∂z/∂x = 2x,    ∂z/∂y = 2y.
At the point (a, b, a^2 + b^2 ) these evaluate to 2a and 2b. It follows that the plane passing through this point has
the form
z = 2ax + 2by + γ.
To solve for γ we know that the plane passes through (a, b, a^2 + b^2 ), so substituting gives
a^2 + b^2 = 2a · a + 2b · b + γ,
implying that γ = −(a^2 + b^2 ). If we shift this tangent plane upwards by some positive amount r^2 we get the plane
z = 2ax + 2by − (a^2 + b^2 ) + r^2 .
How does this plane intersect the paraboloid? Since the paraboloid is defined by z = x2 + y 2 we can eliminate
z giving
x2 + y 2 = 2ax + 2by − (a2 + b2 ) + r2 ,
which after some simple rearrangements is equal to
(x − a)2 + (y − b)2 = r2 .
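The lemma can be checked numerically. The sketch below (the helper functions are invented for illustration) lifts the points to the paraboloid and compares the plane-side test against a direct circumcircle test:

```python
# Numeric check of the lemma: s lies inside the circumcircle of p, q, r
# if and only if the lifted point s' lies below the plane through p', q', r'.

def lift(p):
    return (p[0], p[1], p[0] ** 2 + p[1] ** 2)

def below_plane(p, q, r, s):
    """True iff s' is on the lower side of the plane through p', q', r'."""
    (px, py, pz), (qx, qy, qz), (rx, ry, rz) = lift(p), lift(q), lift(r)
    u = (qx - px, qy - py, qz - pz)
    v = (rx - px, ry - py, rz - pz)
    # Normal to the plane, oriented to point upward (positive z).
    n = (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    if n[2] < 0:
        n = (-n[0], -n[1], -n[2])
    sx, sy, sz = lift(s)
    return n[0] * (sx - px) + n[1] * (sy - py) + n[2] * (sz - pz) < 0

def in_circumcircle(p, q, r, s):
    """Direct test: is s strictly inside the circle through p, q, r?"""
    ax, ay = p; bx, by = q; cx, cy = r
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    r2 = (ax - ux) ** 2 + (ay - uy) ** 2
    return (s[0] - ux) ** 2 + (s[1] - uy) ** 2 < r2

p, q, r = (0.0, 0.0), (2.0, 0.0), (0.0, 2.0)  # circumcircle: center (1,1)
print(below_plane(p, q, r, (1.0, 1.0)))   # True:  (1,1) is inside the circle
print(below_plane(p, q, r, (3.0, 3.0)))   # False: (3,3) is outside
```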
Theorem: Given a set of points S in the plane (assume no 4 are cocircular), and given 3 points p, q, r ∈ S, the
triangle 4pqr is a triangle of the Delaunay triangulation of S if and only if triangle 4p0 q 0 r0 is a face of
the lower convex hull of the projected set S 0 .
From the definition of Delaunay triangulations we know that 4pqr is in the Delaunay triangulation if and only
if there is no point s ∈ S that lies within the circumcircle of pqr. From the previous lemma this is equivalent to
saying that there is no point s0 lying below the plane passing through p0 , q 0 , r0 , which is equivalent to saying that p0 q 0 r0
is a face of the lower convex hull. This completes the proof.
In order to test whether a point s lies within the circumcircle defined by p, q, r, it suffices to test whether s0
lies within the lower halfspace of the plane passing through p0 , q 0 , r0 . If we assume that p, q, r are oriented
counterclockwise in the plane, this reduces to determining whether the quadruple p0 , q 0 , r0 , s0 is positively
oriented, or equivalently whether s lies to the left of the oriented circle passing through p, q, r.
This leads to the incircle test we presented last time.
                        |  px   py   px^2 + py^2   1  |
in(p, q, r, s)  =  det  |  qx   qy   qx^2 + qy^2   1  |  > 0.
                        |  rx   ry   rx^2 + ry^2   1  |
                        |  sx   sy   sx^2 + sy^2   1  |
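A direct implementation of this test can be sketched as follows. Translating s to the origin reduces the 4 × 4 determinant to a 3 × 3 determinant (the resulting change to the third column is a linear combination of the first two columns, so the determinant is unchanged):

```python
# Sketch of the incircle test: positive iff s lies inside the oriented circle
# through p, q, r (for p, q, r counterclockwise).

def incircle(p, q, r, s):
    m = []
    for (x, y) in (p, q, r):
        dx, dy = x - s[0], y - s[1]
        m.append((dx, dy, dx * dx + dy * dy))
    # 3x3 determinant by cofactor expansion along the first row.
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

p, q, r = (0, 0), (2, 0), (0, 2)          # counterclockwise
print(incircle(p, q, r, (1, 1)) > 0)      # True: inside the circumcircle
print(incircle(p, q, r, (3, 3)) > 0)      # False: outside
```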
Voronoi Diagrams and Upper Envelopes: We know that Voronoi diagrams and Delaunay triangulations are dual
geometric structures. We have also seen (informally) that there is a dual relationship between points and lines in
the plane, and in general, points and planes in 3-space. From this latter connection we argued that the problems
of computing convex hulls of point sets and computing the intersection of halfspaces are somehow “dual” to
one another. It turns out that these two notions of duality are (not surprisingly) interrelated. In particular, we
have the following.
Theorem: Given a set of points S in the plane (assume no 4 are cocircular), let H denote the set of upper half-
spaces defined by the previous transformation. Then the Voronoi diagram of S is equal to the projection
onto the (x, y)-plane of the 1-skeleton of the convex polyhedron formed by intersecting the halfspaces
of H.
It is hard to visualize this surface, but it is not hard to show why this is so. Suppose we have 2 points in the
plane, p = (a, b) and q = (c, d). The corresponding planes are:
z = 2ax + 2by − (a^2 + b^2 )   and   z = 2cx + 2dy − (c^2 + d^2 ).
If we determine the intersection of the corresponding planes and project onto the (x, y)-coordinate plane (by
eliminating z from these equations) we get
(2a − 2c)x + (2b − 2d)y = (a^2 − c^2 ) + (b^2 − d^2 ).
We claim that this is the perpendicular bisector between (a, b) and (c, d). To see this, observe that it passes
through the midpoint ((a + c)/2, (b + d)/2) between the two points since
(2a − 2c) · (a + c)/2 + (2b − 2d) · (b + d)/2 = (a^2 − c^2 ) + (b^2 − d^2 ),
and its slope is −(a − c)/(b − d), which is the negative reciprocal of the slope of the line segment from (a, b) to (c, d). From
this it can be shown that the intersection of the upper halfspaces defines a polyhedron whose edges project onto
the Voronoi diagram edges.
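This derivation can be checked numerically. The sketch below (helper names are invented for illustration) verifies, for two sample sites, that points on the perpendicular bisector lie on both tangent planes and are equidistant from the two sites:

```python
# Numeric check: the tangent planes z = 2ax + 2by - (a^2+b^2) and
# z = 2cx + 2dy - (c^2+d^2) intersect above the perpendicular bisector of
# (a,b) and (c,d).

def plane_z(site, x, y):
    a, b = site
    return 2 * a * x + 2 * b * y - (a * a + b * b)

p, q = (1.0, 2.0), (4.0, -2.0)
# Parameterize the bisector: through the midpoint, perpendicular to pq.
mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
dx, dy = q[0] - p[0], q[1] - p[1]
for t in (-2.0, 0.0, 3.0):
    x, y = mx - t * dy, my + t * dx      # direction (-dy, dx) is perpendicular
    assert abs(plane_z(p, x, y) - plane_z(q, x, y)) < 1e-9   # on both planes
    d1 = (x - p[0]) ** 2 + (y - p[1]) ** 2
    d2 = (x - q[0]) ** 2 + (y - q[1]) ** 2
    assert abs(d1 - d2) < 1e-9           # equidistant from p and q
print("bisector check passed")
```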
Topological Plane Sweep: In the last two lectures we have introduced arrangements of lines and geometric duality
as important tools in solving geometric problems on lines and points. Today we give an efficient algorithm for
sweeping an arrangement of lines.
As we will see, many problems in computational geometry can be solved by applying line-sweep to an arrange-
ment of lines. Since the arrangement has size O(n2 ), and since there are O(n2 ) events to be processed, each
involving an O(log n) heap deletion, this typically leads to algorithms running in O(n2 log n) time, using O(n2 )
space. It is natural to ask whether we can dispense with the additional O(log n) factor in running time, and
whether we need all of O(n2 ) space (since in theory we only need access to the current O(n) contents of the
sweep line).
We discuss a variation of plane sweep called topological plane sweep. This method runs in O(n2 ) time, and
uses only O(n) space (by essentially constructing only the portion of the arrangement that we need at any point).
Although it may appear to be somewhat sophisticated, it can be implemented quite efficiently, and is claimed to
outperform conventional plane sweep on arrangements of any significant size (e.g. over 20 lines).
Cuts and topological lines: The algorithm is called topological plane sweep because we do not sweep a straight ver-
tical line through the arrangement, but rather we sweep a curved topological line that has the essential properties
of a vertical sweep line in the sense that this line intersects each line of the arrangement exactly once. The notion
of a topological line is an intuitive one, but it can be made formal in the form of something called a cut. Recall
that the faces of the arrangement are convex polygons (possibly unbounded). (Assuming no vertical lines) the
edges incident to each face can naturally be partitioned into the edges that are above the face, and those that are
below the face. Define a cut in an arrangement to be a sequence of edges c1 , c2 , . . . , cn , in the arrangement, one
taken from each line of the arrangement, such that for 1 ≤ i ≤ n − 1, ci and ci+1 are incident to the same face
of the arrangement, and ci is above the face and ci+1 is below the face. An example of a topological line and
the associated cut is shown below.
The topological plane sweep starts at the leftmost cut of the arrangement. This consists of all the left-unbounded
edges of the arrangement. Observe that this cut can be computed in O(n log n) time, because the lines intersect
the cut in inverse order of slope. The topological sweep line will sweep to the right until we come to the
rightmost cut, which consists of all the right-unbounded edges of the arrangement. The sweep line advances by
a series of what are called elementary steps. In an elementary step, we find two consecutive edges on the cut
that meet at a vertex of the arrangement (we will discuss later how to determine this), and push the topological
sweep line through this vertex. Observe that on doing so these two lines swap in their order along the sweep
line. This is shown below.
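Since each pair of non-parallel lines swaps exactly once, the sweep performs exactly n(n − 1)/2 elementary steps over n lines in general position. The following simplified simulation is a sketch (it uses a straight sweep line and precomputed crossing events, not the actual topological sweep machinery) illustrating the swap behavior:

```python
# Simplified sweep over 4 lines in general position: process pairwise crossing
# events left to right; at each event the two crossing lines are adjacent on
# the sweep line and swap, giving C(n,2) elementary steps in total.

lines = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (3.0, -1.0)]   # (slope, intercept)

events = sorted(
    ((cb - ca) / (ma - mb), i, j)
    for i, (ma, ca) in enumerate(lines)
    for j, (mb, cb) in enumerate(lines)
    if i < j and ma != mb
)

x0 = events[0][0] - 1.0
order = sorted(range(len(lines)),
               key=lambda i: lines[i][0] * x0 + lines[i][1],
               reverse=True)             # top-to-bottom order left of all events
steps = 0
for _, i, j in events:
    a, b = order.index(i), order.index(j)
    assert abs(a - b) == 1    # crossing lines are adjacent on the sweep line
    order[a], order[b] = order[b], order[a]
    steps += 1
print(steps)  # 6 = C(4, 2)
```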
It is not hard to show that an elementary step is always possible, since for any cut (other than the rightmost cut)
there must be two consecutive edges with a common right endpoint. In particular, consider the edge of the cut
whose right endpoint has the smallest x-coordinate. It is not hard to show that this endpoint will always allow
an elementary step. Unfortunately, determining this vertex would require at least O(log n) time (if we stored
these endpoints in a heap, sorted by x-coordinate), and we want to perform each elementary step in O(1) time.
Hence, we will need to find some other method for finding elementary steps.
Upper and Lower Horizon Trees: To find elementary steps, we introduce two simple data structures, the upper hori-
zon tree (UHT) and the lower horizon tree (LHT). To construct the upper horizon tree, trace each edge of the
cut to the right. When two edges meet, keep only the one with the higher slope, and continue tracing it to the
right. The lower horizon tree is defined symmetrically. There is one little problem in these definitions in the
sense that these trees need not be connected (forming a forest of trees) but this can be fixed conceptually at least
by the addition of a vertical line at x = +∞. For the upper horizon tree we think of this line as having slope
+∞, and for the lower horizon tree as having slope −∞. Note that we consider the left endpoints of the edges of
the cut as not belonging to the trees, since otherwise they would not be trees. It is not hard to show that with
these modifications, these are indeed trees. Each edge of the cut defines exactly one line segment in each tree.
An example is shown below.
The important thing about the UHT and LHT is that they give us an easy way to determine the right endpoints
of the edges on the cut. Observe that for each edge in the cut, its right endpoint results from a line of smaller
slope intersecting it from above (as we trace it from left to right) or from a line of larger slope intersecting it
from below. It is easy to verify that the UHT and LHT determine the first such intersecting line of each type,
respectively. It follows that if we intersect the two trees, then the segments they share in common correspond
exactly to the edges of the cut. Thus, by knowing the UHT and LHT, we know where the right endpoints
are, and from this we can determine easily which pairs of consecutive edges share a common right endpoint,
and from this we can determine all the elementary steps that are legal. We store all the legal steps in a stack (or
queue, or any list is fine), and extract them one by one.
(1) Input the lines and sort by slope. Let C be the initial (leftmost) cut, a list of lines in decreasing order of
slope.
(2) Create the initial UHT incrementally by inserting lines in decreasing order of slope. Create the initial LHT
incrementally by inserting lines in increasing order of slope. (More on this later.)
(3) By consulting the LHT and UHT, determine the right endpoints of all the edges of the initial cut, and for
all pairs of consecutive lines (li , li+1 ) sharing a common right endpoint, store this pair in stack S.
(4) Repeat the following elementary step until the stack is empty (implying that we have arrived at the right-
most cut).
(a) Pop the pair (li , li+1 ) from the top of the stack S.
(b) Swap these lines within C, the cut (we assume that each line keeps track of its position in the cut).
(c) Update the horizon trees. (More on this later.)
(d) Consulting the changed entries in the horizon tree, determine whether there are any new cut edges
sharing right endpoints, and if so push them on the stack S.
The important unfinished business is to show that we can build the initial UHT and LHT in O(n) time, and
to show that, for each elementary step, we can update these trees and all other relevant information in O(1)
amortized time. By amortized time we mean that, even though a single elementary step can take more than O(1)
time, the total time needed to perform all O(n^2) elementary steps is O(n^2), and hence the average time per
step is O(1).
This is done by an adaptation of the same incremental “face walking” technique we used in the incremental
construction of line arrangements. Let’s consider just the UHT, since the LHT is symmetric. To create the initial
(leftmost) UHT we insert the lines one by one in decreasing order of slope. Observe that as each new line is
inserted it will start above all of the current lines. The uppermost face of the current UHT consists of a convex
polygonal chain, see the figure below left. As we trace the newly inserted line from left to right, there will be
some point at which it first hits this upper chain of the current UHT. By walking along the chain from left to
right, we can determine this intersection point. Each segment that is walked over is never visited again by this
initialization process (because it is no longer part of the upper chain), and since the initial UHT has a total of
O(n) segments, this implies that the total time spent in walking is O(n). Thus, after the O(n log n) time for
sorting the segments, the initial UHT tree can be built in O(n) additional time.
[Figure: left, a new line inserted above the upper chain of the initial UHT; right, the vertex v passed over in an elementary step and the resulting UHT update.]
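The initialization walk can be sketched as follows. This hypothetical Python helper maintains only the upper chain (the boundary of the uppermost face) rather than the full tree; each insertion walks the chain from the left and discards the segments it passes over, which is exactly what makes the total walking time O(n).

```python
import math

def cross_x(l1, l2):
    # x-coordinate where y = a1*x + b1 meets y = a2*x + b2
    (a1, b1), (a2, b2) = l1, l2
    return (b2 - b1) / (a1 - a2)

def initial_upper_chain(lines):
    """lines: (slope, intercept) pairs sorted by DECREASING slope.
    Returns the upper chain as a left-to-right list of (line, x_from)
    segments."""
    chain = []
    for ln in lines:
        if not chain:
            chain = [(ln, -math.inf)]
            continue
        # ln has the smallest slope so far, so it starts above the chain
        # at x = -infinity and crosses it exactly once; walk left to right
        k = 0
        while True:
            seg_line, seg_from = chain[k]
            seg_to = chain[k + 1][1] if k + 1 < len(chain) else math.inf
            cx = cross_x(ln, seg_line)
            if seg_from <= cx <= seg_to:
                break
            k += 1
        # segments with index < k are discarded and never revisited
        chain = [(ln, -math.inf), (seg_line, cx)] + chain[k + 1:]
    return chain
```

Each discarded segment is paid for once, so after the O(n log n) sort the whole construction is O(n), matching the argument above.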
Next we show how to update the UHT after an elementary step. The process is quite similar, as shown in the
figure right. Let v be the vertex of the arrangement which is passed over in the sweep step. As we pass over v,
the two edges swap positions along the sweep line. The new lower edge, call it l, which had been cut off of the
UHT by the previous lower edge, now must be reentered into the tree. We extend l to the left until it contacts an
edge of the UHT. At its first contact, it will terminate (and this is the only change to be made to the UHT). In
order to find this contact, we start with the edge immediately below l in the current cut. We traverse the face of the
UHT in counterclockwise order until finding the edge that this line intersects. Observe that we must eventually
find such an edge, since l must cross the boundary of this face somewhere to its left.
Ham Sandwich Cuts of Linearly Separated Point Sets: We are given n red points A, and m blue points B, and
we want to compute a single line that simultaneously bisects both sets. (If the cardinality of either set is odd,
then the line passes through one of the points of the set.) We make the simplifying assumption that the sets are
separated by a line. (This assumption makes the problem much simpler to solve, but the general case can still
be solved in O(n^2) time using arrangements.)
To make matters even simpler we assume that the points have been translated and rotated so this line is the y-
axis. Thus all the red points (set A) have positive x-coordinates, and hence their dual lines have positive slopes,
whereas all the blue points (set B) have negative x-coordinates, and hence their dual lines have negative slopes.
As long as we are simplifying things, let’s make one last simplification, that both sets have an odd number of
points. This is not difficult to get around, but makes the pictures a little easier to understand.
Consider one of the sets, say A. Observe that for each slope there is exactly one line of that slope that bisects the
points. In particular, if we start a line with this slope at positive infinity, so that all the points lie beneath it, and
drop it downwards, eventually we will arrive at a unique placement where there are exactly (n − 1)/2 points above
the line, one point lying on the line, and (n − 1)/2 points below the line (assuming no two points span a line of
this slope). This line is called the median line for this slope.
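In code, the median line for a given slope is easy to compute: sweeping the line y = s·x + b downward corresponds to decreasing b, and the line hits point (x, y) when b reaches y − s·x, so the sought placement has the median intercept. A small illustrative sketch (hypothetical helper name):

```python
from statistics import median

def median_line(points, slope):
    """For an odd-sized point set, return (slope, b) for the unique line
    y = slope*x + b with (n-1)/2 points above it, one point on it, and
    (n-1)/2 points below it."""
    # each point (x, y) is first touched by the descending line when the
    # intercept b equals y - slope*x, so the median intercept bisects
    b = median(y - slope * x for (x, y) in points)
    return (slope, b)
```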
What is the dual of this median line? If we dualize the points using the standard dual transformation: D(a, b) :
y = ax−b, then we get n lines in the plane. By starting a line with a given slope above the points and translating
it downwards, in the dual plane we are moving a point from −∞ upwards along a vertical line. Each time the line passes
a point in the primal plane, the vertically moving point crosses a line in the dual plane. When the translating line
hits the median point, in the dual plane the moving point will hit a dual line such that there are exactly (n − 1)/2
dual lines above this point and (n − 1)/2 dual lines below this point. We define a point to be at level k, Lk, in
an arrangement if there are at most k − 1 lines above this point and at most n − k lines below this point. The
median level in an arrangement of n lines is defined to be the ⌈(n − 1)/2⌉-th level in the arrangement. This is
shown as M (A) in the following figure on the left.
Thus, the set of bisecting lines for set A in dual form consists of a polygonal curve. Because this curve is formed
from edges of the dual lines in A, and because all lines in A have positive slope, this curve is monotonically
increasing. Similarly, the median level for B, M (B), is a polygonal curve which is monotonically decreasing. It
follows that M (A) and M (B) must intersect at a unique point. The dual of this point is a line that bisects both sets.
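For intuition (and for testing faster algorithms), the ham sandwich line of two odd-sized sets in general position can be found by brute force in the primal plane: since both cardinalities are odd, the bisecting line must pass through one point of each set, so it suffices to try every red-blue pair. This hypothetical O(nm(n+m)) checker is not the algorithm described here, just a reference implementation of the statement being proved.

```python
def ham_sandwich_brute(A, B):
    """Brute-force ham sandwich line for two odd-sized point sets in
    general position; returns (p, q) with p in A, q in B spanning a line
    that bisects both sets."""
    def side(p, q, r):
        # > 0 if r lies left of the directed line p -> q, < 0 if right
        return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

    def bisects(p, q, S):
        above = sum(1 for r in S if side(p, q, r) > 0)
        below = sum(1 for r in S if side(p, q, r) < 0)
        return above == below

    for p in A:
        for q in B:
            if bisects(p, q, A) and bisects(p, q, B):
                return p, q
    return None  # cannot happen for odd sets in general position
```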
We could compute the intersection of these two curves by a simultaneous topological plane sweep of both
arrangements. However it turns out that it is possible to do much better, and in fact the problem can be solved
in O(n + m) time. Since the algorithm is rather complicated, I will not describe the details, but here are the
essential ideas. The algorithm operates by prune and search. In O(n + m) time we will generate a hypothesis
for where the ham sandwich point is in the dual plane, and if we are wrong, we will succeed in throwing away
a constant fraction of the lines from future consideration.
First observe that for any vertical line in the dual plane, it is possible to determine in O(n + m) time whether
this line lies to the left or the right of the intersection point of the median levels, M (A) and M (B). This can be
done by computing the intersection of the dual lines of A with this line, and computing their median in O(n)
time, and computing the intersection of the dual lines of B with this line and computing their median in O(m)
time. If A’s median lies below B’s median, then we are to the left of the ham sandwich dual point, and otherwise
we are to the right of the ham sandwich dual point. It turns out that with a little more work, it is possible to
determine in O(n + m) time whether the ham sandwich point lies to the right or left of a line of arbitrary slope.
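The vertical-line test described above is simple to implement. A sketch with hypothetical names, assuming odd-sized sets of dual lines given as (slope, intercept) pairs:

```python
from statistics import median

def query_vertical(x0, A_lines, B_lines):
    """Decide on which side of the vertical line x = x0 the ham sandwich
    dual point lies, in O(n + m) time."""
    medA = median(a * x0 + b for (a, b) in A_lines)  # height of M(A) at x0
    medB = median(a * x0 + b for (a, b) in B_lines)  # height of M(B) at x0
    # M(A) is increasing and M(B) is decreasing, so if A's median lies
    # below B's, their crossing (the ham sandwich dual point) is to the
    # right of x0, i.e., the query line is to its left
    return 'left' if medA < medB else 'right'
```

Each call evaluates every dual line once at x = x0 and takes two medians, which matches the O(n + m) bound claimed in the text.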
The trick is to use prune and search. We find two lines L1 and L2 in the dual plane (by a careful procedure that
I will not describe). These two lines define four quadrants in the plane. By determining on which side of each line
the ham sandwich point lies, we know which quadrant contains it, and we can throw away from further consideration
any line that does not intersect this quadrant. It turns out that by a judicious choice of L1 and L2, we can guarantee
that at least (n + m)/8 of the lines can be thrown away by this process. We recurse on the remaining lines. By the same sort
of analysis we made in the Kirkpatrick and Seidel prune and search algorithm for upper tangents, it follows that
in O(n + m) time we will find the ham sandwich point.