Lectures On Discrete Geometry, Jiri Matousek
Editorial Board
S. Axler F.W. Gehring K.A. Ribet
Springer
New York
Berlin
Heidelberg
Barcelona
Hong Kong
London
Milan
Paris
Singapore
Tokyo
Jiří Matoušek
Lectures on
Discrete Geometry
Springer
Jiří Matoušek
Department of Applied Mathematics
Charles University
Malostranské nám. 25
118 00 Praha 1
Czech Republic
matousek@kam.mff.cuni.cz
Editorial Board
S. Axler, Mathematics Department, San Francisco State University, San Francisco, CA 94132, USA, axler@sfsu.edu
F.W. Gehring, Mathematics Department, East Hall, University of Michigan, Ann Arbor, MI 48109, USA, fgehring@math.lsa.umich.edu
K.A. Ribet, Mathematics Department, University of California, Berkeley, Berkeley, CA 94720-3840, USA, ribet@math.berkeley.edu
The next several pages describe the goals and the main topics of this book.
Questions in discrete geometry typically involve finite sets of points, lines,
circles, planes, or other simple geometric objects. For example, one can ask,
what is the largest number of regions into which n lines can partition the
plane, or what is the minimum possible number of distinct distances occur-
ring among n points in the plane? (The former question is easy, the latter
one is hard.) More complicated objects are investigated, too, such as convex
polytopes or finite families of convex sets. The emphasis is on "combinato-
rial" properties: Which of the given objects intersect, or how many points
are needed to intersect all of them, and so on.
Many questions in discrete geometry are very natural and worth studying
for their own sake. Some of them, such as the structure of 3-dimensional
convex polytopes, go back to the antiquity, and many of them are motivated
by other areas of mathematics. To a working mathematician or computer
scientist, contemporary discrete geometry offers results and techniques of
great diversity, a useful enhancement of the "bag of tricks" for attacking
problems in her or his field. My experience in this respect comes mainly
from combinatorics and the design of efficient algorithms, where, as time
progresses, more and more of the first-rate results are proved by methods
drawn from seemingly distant areas of mathematics and where geometric
methods are among the most prominent.
The development of computational geometry and of geometric methods in
combinatorial optimization in the last 20-30 years has stimulated research in
discrete geometry a great deal and contributed new problems and motivation.
Parts of discrete geometry are indispensable as a foundation for any serious
study of these fields. I personally became involved in discrete geometry while
working on geometric algorithms, and the present book gradually grew out of
lecture notes initially focused on computational geometry. (In the meantime,
several books on computational geometry have appeared, and so I decided to
concentrate on the nonalgorithmic part.)
In order to explain the path chosen in this book for exploring its subject,
let me compare discrete geometry to an Alpine mountain range. Mountains
can be explored by bus tours, by walking, by serious climbing, by playing
in the local casino, and in many other ways. The book should provide safe
trails to a few peaks and lookout points (key results from various subfields
of discrete geometry). To some of them, convenient paths have been marked
in the literature, but for others, where only climbers' routes exist in research
papers, I tried to add some handrails, steps, and ropes at the critical places,
in the form of intuitive explanations, pictures, and concrete and elementary
proofs.¹ However, I do not know how to build cable cars in this landscape:
Reaching the higher peaks, the results traditionally considered difficult, still
needs substantial effort. I wish everyone a clear view of the beautiful ideas in
the area, and I hope that the trails of this book will help some readers climb
yet unconquered summits by their own research. (Here the shortcomings of
the Alpine analogy become clear: The range of discrete geometry is infinite
and no doubt, many discoveries lie ahead, while the Alps are a small spot on
the all too finite Earth.)
This book is primarily an introductory textbook. It does not require any
special background besides the usual undergraduate mathematics (linear al-
gebra, calculus, and a little of combinatorics, graph theory, and probability).
It should be accessible to early graduate students, although mastering the
more advanced proofs probably needs some mathematical maturity. The first
and main part of each section is intended for teaching in class. I have actually
taught most of the material, mainly in an advanced course in Prague whose
contents varied over the years, and a large part has also been presented by
students, based on my writing, in lectures at special seminars (Spring Schools
of Combinatorics). A short summary at the end of the book can be useful for
reviewing the covered material.
The book can also serve as a collection of surveys in several narrower
subfields of discrete geometry, where, as far as I know, no adequate recent
treatment is available. The sections are accompanied by remarks and biblio-
graphic notes. For well-established material, such as convex polytopes, these
parts usually refer to the original sources, point to modern treatments and
surveys, and present a sample of key results in the area. For the less well cov-
ered topics, I have aimed at surveying most of the important recent results.
For some of them, proof outlines are provided, which should convey the main
ideas and make it easy to fill in the details from the original source.
Topics. The material in the book can be divided into several groups:
• Foundations (Sections 1.1-1.3, 2.1, 5.1-5.4, 5.7, 6.1). Here truly basic
things are covered, suitable for any introductory course: linear and affine
subspaces, fundamentals of convex sets, Minkowski's theorem on lattice
points in convex bodies, duality, and the first steps in convex polytopes,
Voronoi diagrams, and hyperplane arrangements. The remaining sections
of Chapters 1, 2, and 5 go a little further in these topics.
1 I also wanted to invent fitting names for the important theorems, in order to
make them easier to remember. Only few of these names are in standard usage.
reliable guide.) Many interesting topics are neglected completely, such as the
wide area of packing and covering, where very accessible treatments exist,
or the celebrated negative solution by Kahn and Kalai of the Borsuk conjec-
ture, which I consider sufficiently popularized by now. Many more chapters
analogous to the fifteen of this book could be added, and each of the fifteen
chapters could be expanded into a thick volume. But the extent of the book,
as well as the time for its writing, are limited.
Exercises. The sections are complemented by exercises. The little framed
numbers indicate their difficulty: 1 is routine, 5 may need quite a bright
idea. Some of the exercises used to be a part of homework assignments in my
courses and the classification is based on some experience, but for others it
is just an unreliable subjective guess. Some of the exercises, especially those
conveying important results, are accompanied by hints given at the end of
the book.
Additional results that did not fit into the main text are often included as
exercises, which saves much space. However, this greatly enlarges the danger
of making false claims, so the reader who wants to use such information may
want to check it carefully.
Sources and further reading. A great inspiration for this book project
and the source of much material was the book Combinatorial Geometry of
Pach and Agarwal [PA95]. Too late did I become aware of the lecture notes by
Ball [Bal97] on modern convex geometry; had I known these earlier I would
probably have hesitated to write Chapters 13 and 14 on high-dimensional
convexity, as I would not dare to compete with this masterpiece of mathe-
matical exposition. Ziegler's book [Zie94] can be recommended for studying
convex polytopes. Many other sources are mentioned in the notes in each
chapter. For looking up information in discrete geometry, a good starting
point can be one of the several handbooks pertaining to the area: Handbook
of Convex Geometry [GW93], Handbook of Discrete and Computational
Geometry [GO97], Handbook of Computational Geometry [SU00], and (to some
extent) Handbook of Combinatorics [GGL95], with numerous valuable
surveys. Many of the important new results in the field keep appearing in the
journal Discrete and Computational Geometry.
Acknowledgments. For invaluable advice and/or very helpful comments on
preliminary versions of this book I would like to thank Micha Sharir, Günter
M. Ziegler, Yuri Rabinovich, Pankaj K. Agarwal, Pavel Valtr, Martin Klazar,
Nati Linial, Günter Rote, János Pach, Keith Ball, Uli Wagner, Imre Bárány,
Eli Goodman, György Elekes, Johannes Blömer, Eva Matoušková, Gil Kalai,
Joram Lindenstrauss, Emo Welzl, Komei Fukuda, Rephael Wenger, Piotr
Indyk, Sariel Har-Peled, Vojtěch Rödl, Géza Tóth, Károly Böröczky Jr., Radoš
Radoičić, Helena Nyklová, Vojtěch Franěk, Jakub Šimek, Avner Magen,
Gregor Baudis, and Andreas Marwinski (I apologize if I forgot someone; my notes
are not perfect, not to speak of my memory). Their remarks and suggestions
Contents
Preface
1 Convexity
1.1 Linear and Affine Subspaces, General Position
1.2 Convex Sets, Convex Combinations, Separation
1.3 Radon's Lemma and Helly's Theorem
1.4 Centerpoint and Ham Sandwich
4 Incidence Problems
4.1 Formulation
4.2 Lower Bounds: Incidences and Unit Distances
4.3 Point-Line Incidences via Crossing Numbers
4.4 Distinct Distances via Crossing Numbers
4.5 Point-Line Incidences via Cuttings
4.6 A Weaker Cutting Lemma
4.7 The Cutting Lemma: A Tight Bound
5 Convex Polytopes
5.1 Geometric Duality
5.2 H-Polytopes and V-Polytopes
5.3 Faces of a Convex Polytope
5.4 Many Faces: The Cyclic Polytopes
5.5 The Upper Bound Theorem
Bibliography
Index
Notation and Terminology
This section summarizes rather standard things, and it is mainly for reference.
More special notions are introduced gradually throughout the book. In order
to facilitate independent reading of various parts, some of the definitions are
even repeated several times.
If $X$ is a set, $|X|$ denotes the number of elements (cardinality) of $X$. If $X$
is a multiset, in which some elements may be repeated, then $|X|$ counts each
element with its multiplicity.
The very slowly growing function $\log^* x$ is defined by $\log^* x = 0$ for $x \le 1$
and $\log^* x = 1 + \log^*(\log_2 x)$ for $x > 1$.
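To get a feeling for how slowly this function grows, the recursive definition can be unrolled into a loop. The following is only a small sketch; the name `log_star` is ours and does not appear elsewhere in the book.

```python
import math


def log_star(x: float) -> int:
    """Iterated logarithm: 0 for x <= 1, otherwise 1 + log_star(log2(x))."""
    count = 0
    while x > 1:
        x = math.log2(x)  # one application of the base-2 logarithm
        count += 1
    return count
```

For instance, the chain 65536, 16, 4, 2, 1 shows that four applications of the logarithm bring 65536 down to 1.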
For a real number $x$, $\lfloor x \rfloor$ denotes the largest integer less than or equal
to $x$, and $\lceil x \rceil$ means the smallest integer greater than or equal to $x$. The
boldface letters $\mathbf{R}$ and $\mathbf{Z}$ stand for the real numbers and for the integers,
respectively, while $\mathbf{R}^d$ denotes the $d$-dimensional Euclidean space. For a point
$x = (x_1, x_2, \ldots, x_d) \in \mathbf{R}^d$, $\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_d^2}$ is the Euclidean norm
of $x$, and for $x, y \in \mathbf{R}^d$, $\langle x, y \rangle = x_1 y_1 + x_2 y_2 + \cdots + x_d y_d$ is the scalar product.
Points of $\mathbf{R}^d$ are usually considered as column vectors.
The symbol $B(x, r)$ denotes the closed ball of radius $r$ centered at $x$ in
some metric space (usually in $\mathbf{R}^d$ with the Euclidean distance), i.e., the set
of all points with distance at most $r$ from $x$. We write $B^n$ for the unit ball
$B(0, 1)$ in $\mathbf{R}^n$. The symbol $\partial A$ denotes the boundary of a set $A \subseteq \mathbf{R}^d$, that
is, the set of points at zero distance from both $A$ and its complement.
For a measurable set $A \subseteq \mathbf{R}^d$, $\mathrm{vol}(A)$ is the $d$-dimensional Lebesgue
measure of $A$ (in most cases the usual volume).
Let $f$ and $g$ be real functions (of one or several variables). The notation
$f = O(g)$ means that there exists a number $C$ such that $|f| \le C|g|$ for all
values of the variables. Normally, $C$ should be an absolute constant, but if
$f$ and $g$ depend on some parameter(s) that we explicitly declare to be fixed
(such as the space dimension $d$), then $C$ may depend on these parameters
as well. The notation $f = \Omega(g)$ is equivalent to $g = O(f)$, $f(n) = o(g(n))$
to $\lim_{n \to \infty} (f(n)/g(n)) = 0$, and $f = \Theta(g)$ means that both $f = O(g)$ and
$f = \Omega(g)$.
For a random variable $X$, the symbol $\mathrm{E}[X]$ denotes the expectation of $X$,
and $\mathrm{Prob}[A]$ stands for the probability of an event $A$.
Graphs are considered simple and undirected in this book unless stated
otherwise, so a graph $G$ is a pair $(V, E)$, where $V$ is a set (the vertex set) and
$E \subseteq \binom{V}{2}$ is the edge set. Here $\binom{V}{k}$ denotes the set of all $k$-element subsets
of $V$. For a multigraph, the edges form a multiset, so two vertices can be
connected by several edges. For a given (multi)graph $G$, we write $V(G)$ for
the vertex set and $E(G)$ for the edge set. A complete graph has all possible
edges; that is, it is of the form $(V, \binom{V}{2})$. A complete graph on $n$ vertices is
denoted by $K_n$. A graph $G$ is bipartite if the vertex set can be partitioned
into two subsets $V_1$ and $V_2$, the (color) classes, in such a way that each edge
connects a vertex of $V_1$ to a vertex of $V_2$. A graph $G' = (V', E')$ is a subgraph
of a graph $G = (V, E)$ if $V' \subseteq V$ and $E' \subseteq E$. We also say that $G$ contains
a copy of $H$ if there is a subgraph $G'$ of $G$ isomorphic to $H$, where $G'$ and
$H$ are isomorphic if there is a bijective map $\varphi\colon V(G') \to V(H)$ such that
$\{u, v\} \in E(G')$ if and only if $\{\varphi(u), \varphi(v)\} \in E(H)$ for all $u, v \in V(G')$. The
degree of a vertex $v$ in a graph $G$ is the number of edges of $G$ containing $v$.
An $r$-regular graph has all degrees equal to $r$. Paths and cycles are graphs as
in the following picture:
[Figure: paths and cycles]
Convexity
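$+\,a_n$. This yields an expression of the form $\beta_1(a_1 - a_n) + \beta_2(a_2 - a_n) + \cdots +
\beta_n(a_n - a_n) + a_n = \beta_1 a_1 + \beta_2 a_2 + \cdots + \beta_{n-1} a_{n-1} + (1 - \beta_1 - \beta_2 - \cdots - \beta_{n-1}) a_n$,
where $\beta_1, \ldots, \beta_n$ are arbitrary real numbers. Thus, an affine combination of
points $a_1, \ldots, a_n \in \mathbf{R}^d$ is an expression of the form
$$\alpha_1 a_1 + \cdots + \alpha_n a_n, \quad \text{where } \alpha_1, \ldots, \alpha_n \in \mathbf{R} \text{ and } \alpha_1 + \cdots + \alpha_n = 1.$$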
Then indeed, it is not hard to check that the affine hull of X is the set of all
affine combinations of points of X.
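The affine dependence of points $a_1, \ldots, a_n$ means that one of them can
be written as an affine combination of the others. This is the same as the
existence of real numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$, at least one of them nonzero, such
that both
$$\alpha_1 a_1 + \alpha_2 a_2 + \cdots + \alpha_n a_n = 0 \quad \text{and} \quad \alpha_1 + \alpha_2 + \cdots + \alpha_n = 0.$$
[Figure: the parallel planes $x_3 = 1$ and $x_3 = 0$]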
What conditions are suitable for including into a "general position"
assumption? In other words, what can be considered as an unlikely coincidence?
For example, let $X$ be an $n$-point set in the plane, and let the coordinates of
the $i$th point be $(x_i, y_i)$. Then the vector $v(X) = (x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n)$
can be regarded as a point of $\mathbf{R}^{2n}$. For a configuration $X$ in which $x_1 = x_2$,
i.e., the first and second points have the same $x$-coordinate, the point $v(X)$
lies on the hyperplane $\{x_1 = x_2\}$ in $\mathbf{R}^{2n}$. The configurations $X$ where some
two points share the $x$-coordinate thus correspond to the union of $\binom{n}{2}$
hyperplanes in $\mathbf{R}^{2n}$. Since a hyperplane in $\mathbf{R}^{2n}$ has ($2n$-dimensional) measure
zero, almost all points of $\mathbf{R}^{2n}$ correspond to planar configurations $X$ with all
the points having distinct $x$-coordinates. In particular, if $X$ is any $n$-point
planar configuration and $\varepsilon > 0$ is any given real number, then there is a
configuration $X'$, obtained from $X$ by moving each point by distance at most $\varepsilon$,
such that all points of $X'$ have distinct $x$-coordinates. Not only that: Almost
all small movements (perturbations) of $X$ result in $X'$ with this property.
This is the key property of general position: Configurations in general
position lie arbitrarily close to any given configuration (and they abound
in any small neighborhood of any given configuration). Here is a fairly
general type of condition with this property. Suppose that a configuration $X$
is specified by a vector $t = (t_1, t_2, \ldots, t_m)$ of $m$ real numbers (coordinates).
The objects of $X$ can be points in $\mathbf{R}^d$, in which case $m = dn$ and the $t_j$
are the coordinates of the points, but they can also be circles in the plane,
with $m = 3n$ and the $t_j$ expressing the center and the radius of each circle,
and so on. The general position condition we can put on the configuration
$X$ is $p(t) = p(t_1, t_2, \ldots, t_m) \ne 0$, where $p$ is some nonzero polynomial in $m$
variables. Here we use the following well-known fact (a consequence of Sard's
theorem; see, e.g., Bredon [Bre93], Appendix C): For any nonzero $m$-variate
polynomial $p(t_1, \ldots, t_m)$, the zero set $\{t \in \mathbf{R}^m : p(t) = 0\}$ has measure 0 in
$\mathbf{R}^m$.
Therefore, almost all configurations $X$ satisfy $p(t) \ne 0$. So any condition
that can be expressed as $p(t) \ne 0$ for a certain polynomial $p$ in $m$ real
variables, or, more generally, as $p_1(t) \ne 0$ or $p_2(t) \ne 0$ or ..., for finitely or
countably many polynomials $p_1, p_2, \ldots$, can be included in a general position
assumption.
For example, let $X$ be an $n$-point set in $\mathbf{R}^d$, and let us consider the
condition "no $d+1$ points of $X$ lie in a common hyperplane." In other words, no
$d+1$ points should be affinely dependent. As we know, the affine dependence
of $d+1$ points means that a suitable $d \times d$ determinant equals 0. This
determinant is a polynomial (of degree $d$) in the coordinates of these $d+1$ points.
Introducing one polynomial for every $(d+1)$-tuple of the points, we obtain
$\binom{n}{d+1}$ polynomials such that at least one of them is 0 for any configuration $X$
with $d+1$ points in a common hyperplane. Other usual conditions for general
position can be expressed similarly.
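The determinant test just described is easy to try out on small examples. Here is a minimal sketch under our own assumptions: the "suitable" $d \times d$ determinant is taken to be the one whose rows are the differences $p_i - p_{d+1}$, exact rational arithmetic is used, and the function names `det`, `affinely_dependent`, and `in_general_position` are purely illustrative.

```python
from fractions import Fraction
from itertools import combinations


def det(matrix):
    """Determinant by Gaussian elimination over exact rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row with a nonzero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def affinely_dependent(points):
    """Test d+1 points in R^d: dependent iff det of the differences is 0."""
    *rest, last = points
    rows = [[a - b for a, b in zip(p, last)] for p in rest]
    return det(rows) == 0


def in_general_position(points, d):
    """Check that no d+1 of the points lie in a common hyperplane."""
    return all(
        not affinely_dependent(list(subset))
        for subset in combinations(points, d + 1)
    )
```

The last function evaluates one polynomial per $(d+1)$-tuple, exactly as in the counting above, so it is meant for small inputs only.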
Exercises
1. Verify that the affine hull of a set $X \subseteq \mathbf{R}^d$ equals the set of all affine
combinations of points of $X$. 0
2. Let $A$ be a $2 \times 3$ matrix and let $b \in \mathbf{R}^2$. Interpret the solution of the
system $Ax = b$ geometrically (in most cases, as an intersection of two
planes) and discuss the possible cases in algebraic and geometric terms.
o
3. (a) What are the possible intersections of two (2-dimensional) planes
in $\mathbf{R}^4$? What is the "typical" case (general position)? What about two
hyperplanes in $\mathbf{R}^4$? 0
(b) Objects in $\mathbf{R}^4$ can sometimes be "visualized" as objects in $\mathbf{R}^3$ moving
in time (so time is interpreted as the fourth coordinate). Try to visualize
the intersection of two planes in $\mathbf{R}^4$ discussed in (a) in this way.
with a finite X:
[Figure: a finite set $X$ and its convex hull $\mathrm{conv}(X)$]
1.2.2 Claim. A point $x$ belongs to $\mathrm{conv}(X)$ if and only if there exist points
$x_1, x_2, \ldots, x_n \in X$ and nonnegative real numbers $t_1, t_2, \ldots, t_n$ with $\sum_{i=1}^{n} t_i = 1$
such that $x = \sum_{i=1}^{n} t_i x_i$.
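The claim can be illustrated in the simplest nontrivial situation, a triangle in the plane. The sketch below is not taken from the text: it assumes a nondegenerate triangle, computes the coefficients $t_1, t_2, t_3$ as barycentric coordinates (ratios of signed areas), and uses the hypothetical names `convex_combination`, `triangle_weights`, and `in_triangle`.

```python
from fractions import Fraction


def convex_combination(points, weights):
    """Return sum of t_i * x_i after checking t_i >= 0 and sum t_i = 1."""
    weights = [Fraction(t) for t in weights]
    if any(t < 0 for t in weights) or sum(weights) != 1:
        raise ValueError("weights must be nonnegative and sum to 1")
    dim = len(points[0])
    return tuple(
        sum(t * p[k] for t, p in zip(weights, points)) for k in range(dim)
    )


def _cross(u, v, w):
    """Twice the signed area of the triangle u, v, w."""
    return (v[0] - u[0]) * (w[1] - u[1]) - (v[1] - u[1]) * (w[0] - u[0])


def triangle_weights(p, a, b, c):
    """Barycentric coordinates of p with respect to the triangle a, b, c."""
    area = Fraction(_cross(a, b, c))
    if area == 0:
        raise ValueError("the triangle is degenerate")
    return (_cross(p, b, c) / area, _cross(a, p, c) / area, _cross(a, b, p) / area)


def in_triangle(p, a, b, c):
    """p lies in conv{a, b, c} iff all three coefficients are nonnegative."""
    return all(t >= 0 for t in triangle_weights(p, a, b, c))
```

The three coefficients always sum to 1, so they describe an affine combination; the point lies in the convex hull exactly when none of them is negative.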
(ii) There exists a $y \in \mathbf{R}^d$ such that $y^T A$ is a vector with all entries strictly
negative. Thus, if we multiply the $j$th equation in the system $Ax = 0$ by
$y_j$ and add these equations together, we obtain an equation that obviously
has no nontrivial nonnegative solution, since all the coefficients on the
left-hand side are strictly negative, while the right-hand side is 0.
Proof. Let us see why this is yet another version of the separation theorem.
Let $V \subset \mathbf{R}^d$ be the set of $n$ points given by the column vectors of the
matrix $A$. We distinguish two cases: Either $0 \in \mathrm{conv}(V)$ or $0 \notin \mathrm{conv}(V)$.
In the former case, we know that 0 is a convex combination of the points
of $V$, and the coefficients of this convex combination determine a nontrivial
nonnegative solution to $Ax = 0$.
In the latter case, there exists a hyperplane strictly separating $V$ from 0,
i.e., a unit vector $y \in \mathbf{R}^d$ such that $\langle y, v \rangle < \langle y, 0 \rangle = 0$ for each $v \in V$. This is
just the $y$ from the second alternative in the Farkas lemma. □
Exercises
1. Give a detailed proof of Claim 1.2.2. 0
2. Write down a detailed proof of the separation theorem. [I]
3. Find an example of two disjoint closed convex sets in the plane that are
not strictly separable. ITl
4. Let $f\colon \mathbf{R}^d \to \mathbf{R}^k$ be an affine map.
(a) Prove that if $C \subseteq \mathbf{R}^d$ is convex, then $f(C)$ is convex as well. Is the
preimage of a convex set always convex? 0
(b) For $X \subseteq \mathbf{R}^d$ arbitrary, prove that $\mathrm{conv}(f(X)) = f(\mathrm{conv}(X))$. ITl
5. Let $X \subseteq \mathbf{R}^d$. Prove that $\mathrm{diam}(\mathrm{conv}(X)) = \mathrm{diam}(X)$, where the diameter
$\mathrm{diam}(Y)$ of a set $Y$ is $\sup\{\|x - y\| : x, y \in Y\}$. [I]
6. A set $C \subseteq \mathbf{R}^d$ is a convex cone if it is convex and for each $x \in C$, the ray
$\overrightarrow{0x}$ is fully contained in $C$.
(a) Analogously to the convex and affine hulls, define the appropriate
"conic hull" and the corresponding notion of "combination" (analogous
to the convex and affine combinations). [I]
(b) Let $C$ be a convex cone in $\mathbf{R}^d$ and $b \notin C$ a point. Prove that there
exists a vector $a$ with $\langle a, x \rangle \ge 0$ for all $x \in C$ and $\langle a, b \rangle < 0$. 0
7. (Variations on the Farkas lemma) Let $A$ be a $d \times n$ matrix and let $b \in \mathbf{R}^d$.
(a) Prove that the system $Ax = b$ has a nonnegative solution $x \in \mathbf{R}^n$ if
and only if every $y \in \mathbf{R}^d$ satisfying $y^T A \ge 0$ also satisfies $y^T b \ge 0$. [I]
(b) Prove that the system of inequalities $Ax \le b$ has a nonnegative
solution $x$ if and only if every nonnegative $y \in \mathbf{R}^d$ with $y^T A \ge 0$ also
satisfies $y^T b \ge 0$. [I]
8. (a) Let $C \subset \mathbf{R}^d$ be a compact convex set with a nonempty interior, and
let $p \in C$ be an interior point. Show that there exists a line $\ell$ passing
through $p$ such that the segment $\ell \cap C$ is at least as long as any segment
parallel to $\ell$ and contained in $C$. [iJ
(b) Show that (a) may fail for C compact but not convex. ITl
Proof. Let $A = \{a_1, a_2, \ldots, a_{d+2}\}$. These $d+2$ points are necessarily affinely
dependent. That is, there exist real numbers $\alpha_1, \ldots, \alpha_{d+2}$, not all of them 0,
such that $\sum_{i=1}^{d+2} \alpha_i = 0$ and $\sum_{i=1}^{d+2} \alpha_i a_i = 0$.
Set $P = \{i : \alpha_i > 0\}$ and $N = \{i : \alpha_i < 0\}$. Both $P$ and $N$ are nonempty.
We claim that $P$ and $N$ determine the desired subsets. Let us put $A_1 =
\{a_i : i \in P\}$ and $A_2 = \{a_i : i \in N\}$. We are going to exhibit a point $x$ that is
contained in the convex hulls of both these sets.
Put $S = \sum_{i \in P} \alpha_i$; we also have $S = -\sum_{i \in N} \alpha_i$. Then we define
(1.1)
(1.2)
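The argument for Radon's lemma is constructive, and it may help to see it run on concrete points. The following sketch makes several assumptions of its own: the coefficients $\alpha_i$ are found by exact Gaussian elimination, the common point is taken to be $x = \sum_{i \in P} (\alpha_i / S)\, a_i$ (the usual choice in this argument), and the names `null_vector` and `radon_partition` are hypothetical.

```python
from fractions import Fraction


def null_vector(rows, n):
    """Nonzero rational solution of a homogeneous system with fewer
    equations than the n unknowns (reduced row echelon form)."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []  # pivots[k] is the pivot column of row k
    r = 0
    for col in range(n):
        if r == len(m):
            break
        pr = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pr is None:
            continue
        m[r], m[pr] = m[pr], m[r]
        lead = m[r][col]
        m[r] = [v / lead for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    # Set one free variable to 1 and solve for the pivot variables.
    free = next(c for c in range(n) if c not in pivots)
    x = [Fraction(0)] * n
    x[free] = Fraction(1)
    for row_index, col in enumerate(pivots):
        x[col] = -m[row_index][free]
    return x


def radon_partition(points):
    """For d+2 points in R^d return index sets P, N and a common point x."""
    d = len(points[0])
    n = len(points)
    if n != d + 2:
        raise ValueError("expected exactly d+2 points in R^d")
    # Equations: sum alpha_i = 0 and sum alpha_i * a_i = 0 (coordinatewise).
    rows = [[1] * n] + [[p[k] for p in points] for k in range(d)]
    alpha = null_vector(rows, n)
    P = [i for i in range(n) if alpha[i] > 0]
    N = [i for i in range(n) if alpha[i] < 0]
    S = sum(alpha[i] for i in P)
    x = tuple(sum(alpha[i] / S * points[i][k] for i in P) for k in range(d))
    return P, N, x
```

For the four corners of a square, the two index sets returned are the two diagonals and $x$ is the center of the square.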
Exercises
1. Prove Caratheodory's theorem (you may use Radon's lemma). 8J
2. Let $K \subset \mathbf{R}^d$ be a convex set and let $C_1, \ldots, C_n \subseteq \mathbf{R}^d$, $n \ge d+1$, be
convex sets such that the intersection of every $d+1$ of them contains a
translated copy of $K$. Prove that then the intersection of all the sets $C_i$
also contains a translated copy of $K$. ~
This result was noted by Vincensini [Vin39] and by Klee [Kle53].
3. Find an example of 4 convex sets in the plane such that the intersection
of each 3 of them contains a segment of length 1, but the intersection of
all 4 contains no segment of length 1. ITl
4. A strip of width w is a part of the plane bounded by two parallel lines at
distance w. The width of a set X ~ R2 is the smallest width of a strip
containing X.
(a) Prove that a compact convex set of width 1 contains a segment of
length 1 of every direction. GJ
(b) Let $\{C_1, C_2, \ldots, C_n\}$ be closed convex sets in the plane, $n \ge 3$, such
that the intersection of every 3 of them has width at least 1. Prove that
$\bigcap_{i=1}^{n} C_i$ has width at least 1. ~
half-space $\gamma$ we thus consider the compact convex set $\mathrm{conv}(X \cap \gamma) \subset \gamma$.
[Figure: an open half-space $\gamma$ and the set $\mathrm{conv}(\gamma \cap X)$]
Letting $\gamma$ run through all open half-spaces $\gamma$ with $|X \cap \gamma| > \frac{d}{d+1} n$, we obtain
a family $\mathcal{C}$ of compact convex sets. Each of them contains more than $\frac{d}{d+1} n$
points of $X$, and so the intersection of any $d+1$ of them contains at least
one point of $X$. The family $\mathcal{C}$ consists of finitely many distinct sets (since $X$
has finitely many distinct subsets), and so $\bigcap \mathcal{C} \ne \emptyset$ by Helly's theorem. Each
point in this intersection is a centerpoint. □
Exercises
1. (Centerpoints for general mass distributions)
(a) Let $\mu$ be a Borel probability measure on $\mathbf{R}^d$; that is, $\mu(\mathbf{R}^d) = 1$ and
each open set is measurable. Show that for each open half-space $\gamma$ with
$\mu(\gamma) > t$ there exists a compact set $C \subset \gamma$ with $\mu(C) > t$. I2l
(b) Prove that each Borel probability measure in $\mathbf{R}^d$ has a centerpoint
(use (a) and the infinite Helly's theorem). I2l
2. Prove that for any $k$ finite sets $A_1, \ldots, A_k \subset \mathbf{R}^d$, where $1 \le k \le d$, there
d
exists a (k-1)-flat such that every hyperplane containing it has at least
l IAil points of Ai in both ofits closed half-spaces for all i = 1,2, ... , k.
2 Lattices and Minkowski's Theorem
This chapter is a quick excursion into the geometry of numbers, a field where
number-theoretic results are proved by geometric arguments, often using
properties of convex bodies in Rd. We formulate the simple but beautiful
theorem of Minkowski on the existence of a nonzero lattice point in every
symmetric convex body of sufficiently large volume. We derive several con-
sequences, concluding with a geometric proof of the famous theorem of La-
grange claiming that every natural number can be written as the sum of at
most 4 squares.
integer vectors in the cube $[-R, R]^d$: $\mathcal{C} = \{C' + v : v \in [-R, R]^d \cap \mathbf{Z}^d\}$,
as is indicated in the drawing (C is painted in gray).
Each such translate is disjoint from $C'$, and thus every two of these
translates are disjoint as well. They are all contained in the enlarged
cube $K = [-R - D, R + D]^d$, where $D$ denotes the diameter of $C'$.
Hence
Proof. Suppose that one could see outside along some line $\ell$ passing through
the origin. This means that the strip $S$ of width 0.16 with $\ell$ as the middle
line contains no lattice point in $K$ except for the origin. In other words, the
symmetric convex set $C = K \cap S$ contains no lattice points but the origin. But
as is easy to calculate, $\mathrm{vol}(C) > 4$, which contradicts Minkowski's theorem.
□
2.1.3 Proposition (Approximating an irrational number by a fraction).
Let $\alpha \in (0, 1)$ be a real number and $N$ a natural number. Then there
exists a pair of natural numbers $m, n$ such that $n \le N$ and
$$\left| \alpha - \frac{m}{n} \right| < \frac{1}{nN}.$$
This proposition implies that there are infinitely many pairs $m, n$ such
that $|\alpha - \frac{m}{n}| < 1/n^2$ (Exercise 4). This is a basic and well-known result
in elementary number theory. It can also be proved using the pigeonhole
principle.
The proposition has an analogue concerning the approximation of several
numbers $\alpha_1, \ldots, \alpha_k$ by fractions with a common denominator (see Exercise 5),
and there a proof via Minkowski's theorem seems to be the simplest.
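Proof of Proposition 2.1.3. Consider the set
$$C = \left\{ (x, y) \in \mathbf{R}^2 : -N - \tfrac{1}{2} \le x \le N + \tfrac{1}{2},\ |\alpha x - y| < \tfrac{1}{N} \right\}.$$
This is a symmetric convex set of area $(2N+1) \cdot \frac{2}{N} > 4$, and therefore it
contains some nonzero integer lattice point $(n, m)$. By symmetry, we may assume
$n > 0$. The definition of $C$ gives $n \le N$ and $|\alpha n - m| < \frac{1}{N}$.
In other words,
$$\left| \alpha - \frac{m}{n} \right| < \frac{1}{nN}.$$
□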
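The proof via Minkowski's theorem only asserts that a suitable pair exists. For small $N$ one can also simply search for it. The sketch below does this by brute force; it is not the method of the proof, the name `approximate` is ours, and we allow $m = 0$, which assumes that 0 counts as a natural number here.

```python
from fractions import Fraction


def approximate(alpha, N):
    """Find (m, n) with 1 <= n <= N and |alpha - m/n| < 1/(n*N).

    For each denominator n the best numerator is the integer nearest to
    alpha * n, so only that one candidate needs to be checked.
    """
    for n in range(1, N + 1):
        m = round(alpha * n)
        # |alpha*n - m| < 1/N is the same as |alpha - m/n| < 1/(n*N).
        if abs(alpha * n - m) < Fraction(1, N):
            return m, n
    raise ValueError("no pair found; check that alpha lies in (0, 1)")
```

Calling `approximate` with $\alpha = \sqrt{2} - 1$ and $N = 10$ returns the pair $(2, 5)$, and indeed $|\alpha - 2/5| \approx 0.0142 < 1/50$.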
Exercises
1. Prove: If $C \subseteq \mathbf{R}^d$ is convex, symmetric around the origin, bounded, and
such that $\mathrm{vol}(C) > k 2^d$, then $C$ contains at least $2k$ lattice points. ~
2. By the method of the proof of Minkowski's theorem, show the following
result (Blichfeldt; Van der Corput): If $S \subseteq \mathbf{R}^d$ is measurable and $\mathrm{vol}(S) >
k$, then there are points $s_1, s_2, \ldots, s_k \in S$ with all $s_i - s_j \in \mathbf{Z}^d$, $1 \le i, j \le
k$. @]
3. Show that the boundedness of $C$ in Minkowski's theorem is not really
necessary. [IJ
4. (a) Verify the claim made after Proposition 2.1.3, namely, that for any
irrational $\alpha$ there are infinitely many pairs $m, n$ such that $|\alpha - m/n| <
1/n^2$. [IJ
(b) Prove that for $\alpha = \sqrt{2}$ there are only finitely many pairs $m, n$ with
$|\alpha - m/n| < 1/4n^2$. ~
(c) Show that for any algebraic irrational number $\alpha$ (i.e., a root of a
univariate polynomial with integer coefficients) there exists a constant $D$
such that $|\alpha - m/n| < 1/n^D$ holds for finitely many pairs $(m, n)$ only.
Conclude that, for example, the number $\sum_{i=1}^{\infty} 2^{-i!}$ is not algebraic. ill
2.2 General Lattices 21
5. (a) Let $\alpha_1, \alpha_2 \in (0, 1)$ be real numbers. Prove that for a given $N \in \mathbf{N}$
there exist $m_1, m_2, n \in \mathbf{N}$, $n \le N$, such that $|\alpha_i - \frac{m_i}{n}| < \frac{1}{n\sqrt{N}}$, $i = 1, 2$.
8J
(b) Formulate and prove an analogous result for the simultaneous
approximation of $d$ real numbers by rationals with a common denominator.
o (This is a result of Dirichlet [Dir42].)
6. Let $K \subset \mathbf{R}^2$ be a compact convex set of area $\alpha$ and let $x$ be a point
chosen uniformly at random in $[0, 1)^2$.
(a) Prove that the expected number of points of $\mathbf{Z}^2$ in the set $K + x$
equals $\alpha$. 0
(b) Show that with probability at least $1 - \alpha$, $K + x$ contains no point
of $\mathbf{Z}^2$. III
Let us remark that this lattice has in general many different bases. For in-
stance, the sets {(O, 1), (1, On and {(I, 0), (3, In are both bases of the "stan-
dard" lattice Z2.
Let us form a $d \times d$ matrix $Z$ with the vectors $z_1, \ldots, z_d$ as columns. We
define the determinant of the lattice $\Lambda = \Lambda(z_1, z_2, \ldots, z_d)$ as
$\det \Lambda = |\det Z|$. Geometrically, $\det \Lambda$ is the volume of the parallelepiped
$\{\alpha_1 z_1 + \alpha_2 z_2 + \cdots + \alpha_d z_d : \alpha_1, \ldots, \alpha_d \in [0, 1]\}$
(the proof is left to Exercise 1). The number $\det \Lambda$ is indeed a property of the
lattice $\Lambda$ (as a point set), and it does not depend on the choice of the basis
of $\Lambda$ (Exercise 2). It is not difficult to show that if $Z$ is the matrix of some
basis of $\Lambda$, then the matrix of every basis of $\Lambda$ has the form $ZU$, where $U$ is
an integer matrix with determinant $\pm 1$.
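As a small illustration of these remarks, the sketch below checks that both example bases of $\mathbf{Z}^2$ have determinant $\pm 1$ and that every integer point has integer coordinates in each of them. The helper names `det2` and `integer_coords` are hypothetical and the code handles only the planar case.

```python
def det2(u, v):
    """Determinant of the 2 x 2 matrix with columns u and v."""
    return u[0] * v[1] - u[1] * v[0]

def integer_coords(basis, point):
    """Coordinates (a, b) with point = a*u + b*v, by Cramer's rule.

    Returns None when the coordinates are not integers, i.e. when `point`
    is not in the lattice generated by `basis`. Assumes det2(u, v) != 0.
    """
    u, v = basis
    d = det2(u, v)
    a_num, b_num = det2(point, v), det2(u, point)
    if a_num % d or b_num % d:
        return None
    return a_num // d, b_num // d

standard = ((0, 1), (1, 0))
skewed = ((1, 0), (3, 1))
print(det2(*standard), det2(*skewed))        # both have absolute value 1
print(integer_coords(skewed, (5, 2)))        # (5, 2) = -1*(1, 0) + 2*(3, 1)
print(integer_coords(((2, 0), (0, 1)), (1, 0)))  # None: determinant 2, not a basis of Z^2
```

A pair of integer vectors with determinant of absolute value 2, such as the last one, generates only a sublattice of $\mathbf{Z}^2$, which is why the point $(1, 0)$ is missed.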
22 Chapter 2: Lattices and Minkowski's Theorem
Proof. We proceed by induction. For some $i$, $1 \le i \le d+1$, suppose that
linearly independent vectors $z_1, z_2, \ldots, z_{i-1} \in \Lambda$ with the following
property have already been constructed. If $F_{i-1}$ denotes the $(i-1)$-dimensional
subspace spanned by $z_1, \ldots, z_{i-1}$, then all points of $\Lambda$ lying in $F_{i-1}$ can be
written as integer linear combinations of $z_1, \ldots, z_{i-1}$. For $i = d+1$, this gives
the statement of the theorem.
So consider an $i \le d$. Since $\Lambda$ generates $\mathbf{R}^d$, there exists a vector $w \in \Lambda$
not lying in the subspace $F_{i-1}$. Let $P$ be the $i$-dimensional parallelepiped
determined by $z_1, z_2, \ldots, z_{i-1}$ and by $w$:
$P = \{\alpha_1 z_1 + \alpha_2 z_2 + \cdots + \alpha_{i-1} z_{i-1} + \alpha_i w : \alpha_1, \ldots, \alpha_i \in [0, 1]\}$.
Among all the (finitely many) points of $\Lambda$ lying in $P$ but not in $F_{i-1}$,
choose one nearest to $F_{i-1}$ and call it $z_i$.
Note that if the points of $\Lambda \cap P$ are written in the form
$\alpha_1 z_1 + \alpha_2 z_2 + \cdots + \alpha_{i-1} z_{i-1} + \alpha_i w$, then $z_i$ is one with the
smallest positive $\alpha_i$. It remains to show that $z_1, z_2, \ldots, z_i$ have the
required property.
So let $v \in \Lambda$ be a point lying in $F_i$ (the linear span of $z_1, \ldots, z_i$). We
can write $v = \beta_1 z_1 + \beta_2 z_2 + \cdots + \beta_i z_i$ for some real numbers
$\beta_1, \ldots, \beta_i$. Let $\gamma_j$ be the fractional part of $\beta_j$, $j = 1, 2, \ldots, i$; that is,
$\gamma_j = \beta_j - \lfloor \beta_j \rfloor$. Put $v' = \gamma_1 z_1 + \gamma_2 z_2 + \cdots + \gamma_i z_i$. This point
also lies in $\Lambda$ (since $v$ and $v'$ differ by an integer linear combination of
vectors of $\Lambda$). We have $0 \le \gamma_j < 1$, and hence $v'$ lies in the parallelepiped $P$.
Therefore, we must have $\gamma_i = 0$, for otherwise, $v'$ would be nearer to $F_{i-1}$
than $z_i$. Hence $v' \in \Lambda \cap F_{i-1}$, and by the inductive hypothesis, we also get
that all the other $\gamma_j$ are 0. So all the $\beta_j$ are in fact integer coefficients,
and the inductive step is finished. □
Therefore, a lattice can also be defined as a full-dimensional discrete
subgroup of $\mathbf{R}^d$.
introduction is in the first half of the book Pach and Agarwal [PA95],
and an authoritative reference is Conway and Sloane [CS99]. Let us
remark that the lattice constant (and hence the maximum lattice packing
density) is not known in general even for Euclidean spheres, and
many ingenious constructions and arguments have been developed for
packing them efficiently. These problems also have close connections
to error-correcting codes.
Successive minima and Minkowski's second theorem. Let $C \subset \mathbf{R}^d$
be a convex body containing 0 in the interior and let $\Lambda \subset \mathbf{R}^d$
be a lattice. The $i$th successive minimum of $C$ with respect to $\Lambda$,
denoted by $\lambda_i = \lambda_i(C, \Lambda)$, is the infimum of the scaling factors
$\lambda > 0$ such that $\lambda C$ contains at least $i$ linearly independent vectors
of $\Lambda$. In particular, $\lambda_1$ is the smallest number for which $\lambda_1 C$
contains a nonzero lattice vector, and Minkowski's theorem guarantees
that $\lambda_1^d \le 2^d \det(\Lambda)/\mathrm{vol}(C)$. Minkowski's second theorem asserts
$(2^d/d!) \det(\Lambda) \le \lambda_1 \lambda_2 \cdots \lambda_d \cdot \mathrm{vol}(C) \le 2^d \det(\Lambda)$.
The flatness theorem. If a convex body $K$ is not required to be symmetric
about 0, then it can have arbitrarily large volume without containing
a lattice point. But any lattice-point free body has to be flat:
For every dimension $d$ there exists $c(d)$ such that any convex body
$K \subseteq \mathbf{R}^d$ with $K \cap \mathbf{Z}^d = \emptyset$ has lattice width at most $c(d)$. The
lattice width of $K$ is defined as
$\min\{\max_{x \in K} \langle x, y\rangle - \min_{x \in K} \langle x, y\rangle : y \in \mathbf{Z}^d \setminus \{0\}\}$;
geometrically, we essentially count the number of hyperplanes
orthogonal to $y$, spanned by points of $\mathbf{Z}^d$, and intersecting $K$.
Such a result was first proved by Khintchine in 1948, and the current
best bound $c(d) = O(d^{3/2})$ is due to Banaszczyk, Litvak, Pajor, and
Szarek [BLPS99]; we also refer to this paper for more references.
Computing lattice points in convex bodies. Minkowski's theorem provides
the existence of nonzero lattice points in certain convex bodies.
Given one of these bodies, how efficiently can one actually compute
a nonzero lattice point in it? More generally, given a convex body in
$\mathbf{R}^d$, how difficult is it to decide whether it contains a lattice point, or
to count all lattice points? For simplicity, we consider only the integer
lattice $\mathbf{Z}^d$ here.
First, if the dimension $d$ is considered as a constant, such problems
can be solved efficiently, at least in theory. An algorithm due to
Lenstra [Len83] finds in polynomial time an integer point, if one exists,
in a given convex polytope in $\mathbf{R}^d$, $d$ fixed. It is based on the flatness
theorem mentioned above (the ideas are also explained in many other
sources, e.g., [GLS88], [Lov86], [Sch86], [Bar97]). More recently, Barvinok
[Bar93] (or see [Bar97]) provided a polynomial-time algorithm for
counting the integer points in a given fixed-dimensional convex polytope.
Both algorithms are nice and certainly nontrivial, and especially
Exercises
1. Let $v_1, \ldots, v_d$ be linearly independent vectors in $\mathbf{R}^d$. Form a matrix $A$
with $v_1, \ldots, v_d$ as rows. Prove that $|\det A|$ is equal to the volume of the
2.3 An Application in Number Theory 27
parallelepiped $\{\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_d v_d : \alpha_1, \ldots, \alpha_d \in [0, 1]\}$. (You may
want to start with $d = 2$.) [2]
2. Prove that if $z_1, \ldots, z_d$ and $z'_1, \ldots, z'_d$ are vectors in $\mathbf{R}^d$ such that
$\Lambda(z_1, \ldots, z_d) = \Lambda(z'_1, \ldots, z'_d)$, then $|\det Z| = |\det Z'|$, where $Z$ is the
$d \times d$ matrix with the $z_i$ as columns, and similarly for $Z'$. [2]
3. Prove that for $n$ rational vectors $v_1, \ldots, v_n$, the set
$\Lambda = \{i_1 v_1 + i_2 v_2 + \cdots + i_n v_n : i_1, i_2, \ldots, i_n \in \mathbf{Z}\}$ is a discrete
subgroup of $\mathbf{R}^d$. [2]
4. (Minkowski's theorem on linear forms) Prove the following from
Minkowski's theorem: Let $\ell_i(x) = \sum_{j=1}^{d} a_{ij} x_j$ be linear forms in $d$ variables,
$i = 1, 2, \ldots, d$, such that the $d \times d$ matrix $(a_{ij})_{i,j}$ has determinant 1.
Let $b_1, \ldots, b_d$ be positive real numbers with $b_1 b_2 \cdots b_d = 1$. Then there
exists a nonzero integer vector $z \in \mathbf{Z}^d \setminus \{0\}$ with $|\ell_i(z)| \le b_i$ for all
$i = 1, 2, \ldots, d$. [2]
Let $F = GF(p)$ stand for the field of residue classes modulo $p$, and let
$F^* = F \setminus \{0\}$. An element $a \in F^*$ is called a quadratic residue modulo $p$
if there exists an $x \in F^*$ with $x^2 \equiv a \pmod{p}$. Otherwise, $a$ is a quadratic
nonresidue.
area of $C$ is $2\pi p > 4p = 4 \det \Lambda$, and so $C$ contains a point $(a, b) \in \Lambda \setminus \{0\}$. We
have $0 < a^2 + b^2 < 2p$. At the same time, $(a, b) = i z_1 + j z_2$ for some $i, j \in \mathbf{Z}$,
which means that $a = i$, $b = iq + jp$. We calculate
$a^2 + b^2 = i^2 + (iq + jp)^2 = i^2 + i^2 q^2 + 2iqjp + j^2 p^2 \equiv i^2(1 + q^2) \equiv 0 \pmod{p}$.
Therefore $a^2 + b^2 = p$. □
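The argument above can be turned into a small search. The sketch below assumes, as the computation $a = i$, $b = iq + jp$ suggests, that the lattice consists of the points $(i, iq + jp)$ with $q^2 \equiv -1 \pmod p$, and that $p$ is a prime with $p \equiv 1 \pmod 4$. The function name `two_squares` and the brute-force search for $q$ are illustrative choices, not the method of the text.

```python
import math

def two_squares(p):
    """Return (a, b) with a*a + b*b == p, for a prime p with p % 4 == 1.

    Lattice points have the form (i, i*q + j*p) with q*q = -1 (mod p).
    Any such nonzero point with a^2 + b^2 < 2p satisfies a^2 + b^2 = p.
    """
    # Brute-force search for q; raises StopIteration if no such q exists.
    q = next(x for x in range(1, p) if (x * x + 1) % p == 0)
    for i in range(1, math.isqrt(2 * p) + 1):
        b = (i * q) % p
        if 2 * b > p:          # pick the representative of i*q closest to 0
            b -= p
        if i * i + b * b < 2 * p:
            return i, abs(b)
    raise ValueError("no short lattice vector found")

for p in (5, 13, 17, 29, 37):
    a, b = two_squares(p)
    print(p, "=", a, "^2 +", b, "^2")
```

Minkowski's theorem guarantees that the loop succeeds; the code merely locates the lattice point whose existence the proof establishes.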
Exercises
1. (Lagrange's four-square theorem) Let $p$ be a prime.
(a) Show that there exist integers $a, b$ with $a^2 + b^2 \equiv -1 \pmod{p}$.
(b) Show that the set $\Lambda = \{(x, y, z, t) \in \mathbf{Z}^4 : z \equiv ax + by \pmod{p},\ t \equiv bx - ay \pmod{p}\}$
is a lattice, and compute $\det(\Lambda)$.
(c) Show the existence of a nonzero point of $\Lambda$ in a ball of a suitable
radius, and infer that $p$ can be written as a sum of 4 squares of integers.
[3]
(d) Show that any natural number can be written as a sum of 4 squares
of integers.
3
Next, we give an inductive proof; it yields an almost tight bound for n(k).
Second proof of the Erdos-Szekeres theorem. In this proof, by a set
in general position we mean a set with no 3 points on a common line and no
2 points having the same x-coordinate. The latter can always be achieved by
rotating the coordinate system.
Let X be a finite point set in the plane in general position. We call X a
cup if X is convex independent and its convex hull is bounded from above by
a single edge (in other words, if the points of X lie on the graph of a convex
function).
3.1 The Erdos-Szekeres Theorem 31
[Picture: a cup.]
Similarly, we define a cap, with a single edge bounding the convex hull from
below.
[Picture: a cap.]
$$f(k, \ell) \le \binom{k + \ell - 4}{k - 2} + 1. \qquad (3.1)$$
Theorem 3.1.3 clearly follows from this, with $n(k) < f(k, k)$. For $k \le 2$
or $\ell \le 2$ the formula holds. Thus, let $k, \ell \ge 3$, and consider a set $X$ in
general position with $N = f(k-1, \ell) + f(k, \ell-1) - 1$ points. We prove that
it contains a $k$-cup or an $\ell$-cap. This will establish the inequality
$f(k, \ell) \le f(k-1, \ell) + f(k, \ell-1) - 1$, and then (3.1) follows by induction; we leave the
simple manipulation of binomial coefficients to the reader.
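The binomial manipulation left to the reader amounts to Pascal's rule. As a quick sanity check, the sketch below verifies for small parameters that the right-hand side of (3.1) satisfies the recurrence with equality; the function names are hypothetical.

```python
from math import comb

def cup_cap_bound(k, l):
    """Right-hand side of (3.1): C(k + l - 4, k - 2) + 1, for k, l >= 2."""
    return comb(k + l - 4, k - 2) + 1

def recurrence_holds(k, l):
    """Check bound(k, l) == bound(k-1, l) + bound(k, l-1) - 1 for k, l >= 3."""
    return cup_cap_bound(k, l) == (
        cup_cap_bound(k - 1, l) + cup_cap_bound(k, l - 1) - 1
    )

print(all(recurrence_holds(k, l) for k in range(3, 12) for l in range(3, 12)))
print(cup_cap_bound(2, 7), cup_cap_bound(7, 2))   # base cases: both equal 2
```

Since the bound equals 2 whenever $k = 2$ or $\ell = 2$ and satisfies the recurrence with equality, induction on $k + \ell$ gives (3.1).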
Suppose that there is no $\ell$-cap in $X$. Let $E \subseteq X$ be the set of points
$p \in X$ such that $X$ contains a $(k-1)$-cup ending with $p$.
We have $|E| \ge N - f(k-1, \ell) + 1 = f(k, \ell-1)$, because $X \setminus E$ contains no
$(k-1)$-cup and so $|X \setminus E| < f(k-1, \ell)$.
Either the set $E$ contains a $k$-cup, and then we are done, or there is an
$(\ell-1)$-cap. The first point $p$ of such an $(\ell-1)$-cap is, by the definition of $E$,
the last point of some $(k-1)$-cup in $X$, and in this situation, either the cup
or the cap can be extended by one point:
[Picture: the $(k-1)$-cup ending at $p$ and the $(\ell-1)$-cap starting at $p$; either the cup or the cap can be extended by one point.]
A lower bound for sets without $k$-cups and $\ell$-caps. Interestingly, the
bound for $f(k, \ell)$ proved above is tight, not only asymptotically but exactly!
This means, in particular, that there are $n$-point planar sets in general
position where any convex independent subset has at most $O(\log n)$ points, which
is somewhat surprising at first sight.
An example of a set $X_{k,\ell}$ of $\binom{k+\ell-4}{k-2}$ points in general position with no
$k$-cup and no $\ell$-cap can be constructed, again by induction on $k + \ell$. If $k \le 2$
or $\ell \le 2$, then $X_{k,\ell}$ can be taken as a one-point set.
32 Chapter 3: Convex Independent Subsets
Supposing both $k \ge 3$ and $\ell \ge 3$, the set $X_{k,\ell}$ is obtained from the sets
$L = X_{k-1,\ell}$ and $R = X_{k,\ell-1}$ according to the following picture:
[Picture: the sets $L = X_{k-1,\ell}$ and $R = X_{k,\ell-1}$.]
The set $L$ is placed to the left of $R$ in such a way that all lines determined
by pairs of points in $L$ go below $R$ and all lines determined by pairs of points
of $R$ go above $L$.
Consider a cup $C$ in the set $X_{k,\ell}$ thus constructed. If $C \cap L = \emptyset$, then
$|C| \le k-1$ by the assumption on $R$. If $C \cap L \ne \emptyset$, then $C$ has at most 1 point
in $R$, and since no cup in $L$ has more than $k-2$ points, we get $|C| \le k-1$ as
well. The argument for caps is symmetric.
We have $|X_{k,\ell}| = |X_{k-1,\ell}| + |X_{k,\ell-1}|$, and the formula for $|X_{k,\ell}|$ follows
by induction; the calculation is almost the same as in the previous proof. □
$$2^{k-2} + 1 \le n(k) \le \binom{2k-5}{k-2} + 2.$$
The upper bound is a small improvement over the bound $f(k, k)$ derived
above; see Exercise 5. The lower bound results from an inductive construction
slightly more complicated than that of $X_{k,\ell}$.
The original upper bound of $n(k) \le \binom{2k-4}{k-2} + 1$ from [ES35] has been
improved only recently and very slightly; the last improvement to the
bound stated in the text above is due to Tóth¹ and Valtr [TV98].
The Erdos-Szekeres theorem was generalized to planar convex sets.
The following somewhat misleading term is used: A family of pairwise
disjoint convex sets is in general position if no set is contained in the
convex hull of the union of two other sets of the family. For every k
there exists n such that in any family of n pairwise disjoint convex sets
in the plane in general position, there are k sets in convex position,
meaning that none of them is contained in the convex hull of the union
of the others. This was shown by Bisztriczky and G. Fejes Tóth [BT89]
and, with a different proof and better quantitative bound, by Pach and
Tóth [PT98]. The assumption of general position is necessary.
An interesting problem is the generalization of the Erdős–Szekeres
theorem to $\mathbf{R}^d$, $d \ge 3$. The existence of $n_d(k)$ such that every $n_d(k)$
points in $\mathbf{R}^d$ in general position contain a $k$-point subset in convex
position is easy to see (Exercise 4), but the order of magnitude is wide
open. The current best upper bound $n_d(k) \le \binom{2k-2d-1}{k-d} + d$ [Kar01]
slightly improves the immediate bound. Füredi [unpublished] conjectured
that $n_3(k) \le e^{O(\sqrt{k})}$. If true, this would be best possible: A
construction of Károlyi and Valtr [KV01] shows that for every fixed
$d \ge 3$, $n_d(k) \ge e^{c_d k^{1/(d-1)}}$ with a suitable $c_d > 0$. The construction
starts with a one-point set $X_0$, and $X_{i+1}$ is obtained from $X_i$ by
replacing each point $x \in X_i$ by the two points
$x - (\varepsilon_i^d, \varepsilon_i^{d-1}, \ldots, \varepsilon_i)$ and $x + (\varepsilon_i^d, \varepsilon_i^{d-1}, \ldots, \varepsilon_i)$,
with $\varepsilon_i > 0$ sufficiently small, and then
perturbing the resulting set very slightly, so that $X_{i+1}$ is in suitable
general position. We have $|X_i| = 2^i$, and the key lemma asserts that
$mc(X_{i+1}) \le mc(X_i) + mc(\pi(X_i))$, where $mc(X)$ denotes the maximum
size of a convex independent subset of $X$ and $\pi$ is the projection to
the hyperplane $\{x_d = 0\}$.
Another interesting generalization of the Erdős–Szekeres theorem
to $\mathbf{R}^d$ is mentioned in Exercise 5.4.3.
The bounds in the Erdős–Szekeres theorem were also investigated
for special point sets, namely, for the so-called dense sets in the plane.
An $n$-point $X \subset \mathbf{R}^2$ is called $c$-dense if the ratio of the maximum and
minimum distances of points in $X$ is at most $c\sqrt{n}$. For every planar
$n$-point set, this ratio is at least $c_0\sqrt{n}$ for a suitable constant $c_0 > 0$,
as an easy volume argument shows, and so the dense sets are quite
well spread. Improving on slightly weaker results of Alon, Katchalski,
and Pulleyblank [AKP89], Valtr [Val92a] showed, by a probabilistic
argument, that every $c$-dense $n$-point set in general position contains
¹ The reader should be warned that four mathematicians named Tóth are
mentioned throughout the book. For two of them, the surname is actually Fejes Tóth
(László and Gábor), and for the other two it is just Tóth (Géza and Csaba).
Exercises
1. Find a configuration of 8 points in general position in the plane with no
5 convex independent points (thereby showing that $n(5) \ge 9$).
2. Prove that the set $\{(i, j) : i = 1, 2, \ldots, m,\ j = 1, 2, \ldots, m\}$ contains no
convex independent subset with more than $Cm^{2/3}$ points (with $C$ some
constant independent of $m$).
3. Prove that for each $k$ there exists $n(k)$ such that each $n(k)$-point set in
the plane contains a $k$-point convex independent subset or $k$ points lying
on a common line.
4. Prove an Erdős–Szekeres theorem in $\mathbf{R}^d$: For every $k$ there exists
$n = n_d(k)$ such that any $n$ points in $\mathbf{R}^d$ in general position contain a $k$-point
convex independent subset.
5. (A small improvement on the upper bound on $n(k)$) Let $X \subset \mathbf{R}^2$ be a
planar set in general position with $f(k, \ell) + 1$ points, where $f$ is as in the
second proof of Erdős–Szekeres, and let $t$ be the (unique) topmost point
of $X$. Prove that $X$ contains a $k$-cup with respect to $t$ or an $\ell$-cap with
respect to $t$, where a cup with respect to $t$ is a subset $Y \subseteq X \setminus \{t\}$ such
that $Y \cup \{t\}$ is in convex position, and a cap with respect to $t$ is a subset
$Y \subseteq X \setminus \{t\}$ such that $\{x, y, z, t\}$ is not in convex position for any triple
$\{x, y, z\} \subseteq Y$. Infer that $n(k) \le f(k-1, k) + 1$.
6. Show that the construction of $X_{k,\ell}$ described in the text can be realized
on a polynomial-size grid. That is, if we let $n = |X_{k,\ell}|$, we may suppose
that the coordinates of all points in $X_{k,\ell}$ are integers between 1 and $n^c$
with a suitable constant $c$. (This was observed by Valtr.)
• If $I = \emptyset$, we have a 6-hole.
• If there is one point $x$ in $I$, we consider a diagonal that partitions the
hexagon into two quadrilaterals:
The point $x$ lies in one of these quadrilaterals, and the vertices of the
other quadrilateral together with $x$ form a 5-hole.
• If $|I| \ge 2$, we choose an edge $xy$ of $\mathrm{conv}(I)$. Let $\gamma$ be an open half-plane
bounded by the line $xy$ and containing no points of $I$ (it is determined
uniquely unless $|I| = 2$).
If $|\gamma \cap H| \ge 3$, we get a 5-hole formed by $x$, $y$, and 3 points of $\gamma \cap H$.
For $|\gamma \cap H| \le 2$, we have one of the two cases indicated in the following
picture:
[Picture: the two cases, with the points $x$, $y$, $u$, $v$ marked.]
Proof. We note that one can produce a smaller Horton set from a larger
one by deleting points from the right. We construct $H^{(k)}$, a Horton set of size
$2^k$, by induction.
We define $H^{(0)}$ as the point $(0, 0)$. Suppose that we can construct a Horton
set $H^{(k)}$ with $2^k$ points whose $x$-coordinates are $0, 1, \ldots, 2^k - 1$. The induction
step goes as follows.
Let $A = 2H^{(k)}$ (i.e., $H^{(k)}$ expanded twice), and $B = A + (1, h_k)$, where
$h_k$ is a sufficiently large number. We set $H^{(k+1)} = A \cup B$. It is easily seen
that if $h_k$ is large enough, $B$ lies high above $A$, and so $H^{(k+1)}$ is Horton as
well. The set $H^{(3)}$ looks like this:
[Picture: the Horton set $H^{(3)}$.]
□
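The inductive construction can be sketched in code. This is an illustration under stated assumptions: "expanded twice" is read as scaling both coordinates by 2, "sufficiently large" $h_k$ is found by doubling until a brute-force test confirms that $B$ lies high above $A$ (every point of $B$ above every line through two points of $A$, and every point of $A$ below every line through two points of $B$), and all function names are hypothetical.

```python
from itertools import combinations

def side(p, q, r):
    """> 0 if r lies above the line pq, < 0 if below (p, q differ in x)."""
    if p[0] > q[0]:
        p, q = q, p
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def lies_high_above(upper, lower):
    """True if `upper` lies high above `lower` in the sense described above."""
    return (
        all(side(p, q, r) > 0 for p, q in combinations(lower, 2) for r in upper)
        and all(side(p, q, r) < 0 for p, q in combinations(upper, 2) for r in lower)
    )

def horton(k):
    """Build H^(k): 2^k integer points with x-coordinates 0, 1, ..., 2^k - 1."""
    H = [(0, 0)]
    for _ in range(k):
        A = [(2 * x, 2 * y) for x, y in H]       # H expanded twice
        h = 1
        while True:                               # double h until it is large enough
            B = [(x + 1, y + h) for x, y in A]
            if lies_high_above(B, A):
                break
            h *= 2
        H = sorted(A + B)
    return H

print(horton(3))
```

The doubling search is only a convenience for small $k$; the values of $h_k$ it finds are not claimed to be the ones behind the picture of $H^{(3)}$.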
3.2.4 Lemma. Every Horton set is both 4-closed from above and 4-closed
from below.
3.2 Horton Sets 37
[Picture: the upper part $H_1$ and the lower part $H_0$.]
This means that $C$ has at least 2 points, $a$ and $b$, in the lower part $H_0$.
Since the points of $H_0$ and $H_1$ alternate along the $x$-axis, there is a point
$e \in H_1$ between $a$ and $b$ in the ordering by $x$-coordinates. This $e$ is above the
segment $ab$, and so it closes the cup $C$ from above. We argue similarly for a
cap. □
Proof. (Very similar to the previous one.) For contradiction, suppose there
is a 7-hole $X$ in the considered Horton set $H$. If $X \subseteq H_0$ or $X \subseteq H_1$, we
use induction. Otherwise, we select the part ($H_0$ or $H_1$) containing the larger
portion of $X$; this has at least 4 points of $X$. If this part is, say, $H_0$, and it lies
deep below $H_1$, these 4 points must form a cup in $H_0$, for if some 3 of them
were a cap, no point of $H_1$ could complete them to a convex independent set.
By Lemma 3.2.4, $H_0$ (being a Horton set) contains a point closing the 4-cup
from above. Such a point must be contained in the convex hull of the 7-hole
$X$, a contradiction. □
Exercises
1. Prove that an $n$-point Horton set contains no convex independent subset
with more than $4 \log_2 n$ points. [2]
2. Find a configuration of 9 points in the plane in general position with no
5-hole. [2]
3. Prove that every sufficiently large set in general position in $\mathbf{R}^3$ has a
7-hole.
4. Let $H$ be a Horton set and let $k \ge 7$. Prove that if $Y \subseteq H$ is a $k$-point
subset in convex position, then $|H \cap \mathrm{conv}(Y)| \ge 2^{\lfloor k/4 \rfloor}$. Thus, not only
does $H$ contain no $k$-holes, but each convex $k$-gon has even exponentially
many points inside.
This result is due to Nyklová [Nyk00], who proved exact bounds for
Horton sets and observed that the number of points inside each convex
$k$-gon can be somewhat increased by replacing each point of a Horton set
by a tiny copy of a small Horton set.
5. Call a set $X \subset \mathbf{R}^2$ in general position almost convex if no triangle with
vertices at points of $X$ contains more than 1 point of $X$ in its interior.
Let $X \subset \mathbf{R}^2$ be a finite set in general position such that no triangle with
vertices at vertices of $\mathrm{conv}(X)$ contains more than 1 point of $X$. Prove
that $X$ is almost convex.
6. (a) Let $q \ge 2$ be an integer and let $k = mq + 2$ for an integer $m \ge 1$. Prove
that every sufficiently large set $X \subset \mathbf{R}^2$ in general position contains a
$k$-point convex independent subset $Y$ such that the number of points of
$X$ in the interior of $\mathrm{conv}(Y)$ is divisible by $q$. Use Ramsey's theorem for
triples.
(b) Extend the result of (a) to all $k \ge q + 2$.
4
Incidence Problems
4.1 Formulation
Point-line incidences. Consider a set $P$ of $m$ points and a set $L$ of $n$ lines
in the plane. What is the maximum possible number of their incidences, i.e.,
pairs $(p, \ell)$ such that $p \in P$, $\ell \in L$, and $p$ lies on $\ell$? We denote the number
of incidences for specific $P$ and $L$ by $I(P, L)$, and we let $I(m, n)$ be the
maximum of $I(P, L)$ over all choices of an $m$-element $P$ and an $n$-element $L$.
For example, the following picture illustrates that $I(3, 3) \ge 6$,
We give two proofs in the sequel, one simpler and one including techniques
useful in more general situations. We will mostly consider only the most
interesting case m = n. The general case needs no new ideas but only a little
more complicated calculation.
Of course, the problem of point-line incidences can be generalized in many
ways. We can consider incidences between points and hyperplanes in higher
dimensions, or between points in the plane and some family of curves, and
so on. A particularly interesting case is that of points and unit circles, which
is closely related to counting unit distances.
Unit distances and distinct distances. Let $U(n)$ denote the maximum
possible number of pairs of points with unit distance in an $n$-point set in the
plane. For $n \le 3$ we have $U(n) = \binom{n}{2}$ (all distances can be 1), but already
for $n = 4$ at most 5 of the 6 distances can be 1; i.e., $U(4) = 5$:
[Picture: a 4-point set with 5 unit distances.]
We are interested in the asymptotic behavior of the function $U(n)$ for $n \to \infty$.
This can also be reformulated as an incidence problem. Namely, consider
an $n$-point set $P$ and draw a unit circle around each point of $P$, thereby
obtaining a set $C$ of $n$ unit circles. Each pair of points at unit distance
contributes two point-circle incidences, and hence $U(n) \le \frac{1}{2} I_{1\mathrm{circ}}(n, n)$, where
$I_{1\mathrm{circ}}(m, n)$ denotes the maximum possible number of incidences between $m$
points and $n$ unit circles.
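The relation between unit distances and point-circle incidences is easy to observe on a small example. The sketch below uses a 4-point set formed by two unit equilateral triangles sharing a side, chosen here as one configuration with 5 unit distances; the function names and the floating-point tolerance are assumptions of this illustration.

```python
import math
from itertools import combinations

def unit_distances(points, tol=1e-9):
    """Number of unordered pairs of points at distance 1 (up to `tol`)."""
    return sum(
        1 for p, q in combinations(points, 2)
        if math.isclose(math.dist(p, q), 1.0, abs_tol=tol)
    )

def unit_circle_incidences(points, tol=1e-9):
    """Incidences between the points and the unit circles centered at them."""
    return sum(
        1 for center in points for p in points
        if math.isclose(math.dist(center, p), 1.0, abs_tol=tol)
    )

h = math.sqrt(3) / 2
rhombus = [(0.0, 0.0), (1.0, 0.0), (0.5, h), (1.5, h)]
print(unit_distances(rhombus))            # 5 of the 6 distances equal 1
print(unit_circle_incidences(rhombus))    # each unit pair is counted twice
```

Each pair at unit distance is seen once from each of its two circle centers, which is the factor $\frac{1}{2}$ in the inequality above.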
Unlike the case of point-line incidences, the correct order of magnitude of
$U(n)$ is not known. An upper bound of $O(n^{4/3})$ can be obtained by modifying
proofs of the Szemerédi–Trotter theorem. But the best known lower bound
is $U(n) \ge n^{1 + c_1/\log\log n}$, for some positive constant $c_1$; this is superlinear in
$n$ but grows more slowly than $n^{1+\varepsilon}$ for every fixed $\varepsilon > 0$.
A related quantity is the minimum possible number of distinct distances
determined by $n$ points in the plane; formally,
The intersections of the lines, indicated by black dots, are called the vertices.
By removing all the vertices lying on a line $\ell \in L$, the line is split into
two unbounded rays and several segments, and these parts are the edges.
Finally, by deleting all the lines of $L$, the plane is divided into open convex
polygons, called the cells. In Chapter 6 we will study arrangements of lines
and hyperplanes further, but here we need only this basic terminology and
(later) the simple fact that an arrangement of $n$ lines in general position has
$\binom{n}{2}$ vertices, $n^2$ edges, and $\binom{n}{2} + n + 1$ cells. For the time being, the reader can
regard this as an exercise, or wait until Chapter 6 for a proof.
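The three counts can be cross-checked against each other. The sketch below, with hypothetical function names, compares the closed formulas with an incremental count of cells and with Euler's relation $V - E + F = 1$ for a connected planar subdivision with unbounded edges; it is a consistency check, not the proof given in Chapter 6.

```python
from math import comb

def arrangement_counts(n):
    """(vertices, edges, cells) for n lines in general position."""
    vertices = comb(n, 2)        # every pair of lines meets in its own point
    edges = n * n                # n - 1 vertices cut each line into n pieces
    cells = comb(n, 2) + n + 1
    return vertices, edges, cells

def cells_incrementally(n):
    """Add lines one by one: the i-th line is cut into i pieces by the
    earlier i - 1 lines, and each piece splits one existing cell in two."""
    cells = 1
    for i in range(1, n + 1):
        cells += i
    return cells

for n in range(1, 8):
    v, e, c = arrangement_counts(n)
    print(n, v, e, c, v - e + c, cells_incrementally(n))
```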
Many cells in arrangements. What is the maximum total number of
vertices of $m$ distinct cells in an arrangement of $n$ lines in the plane? Let us
denote this number by $K(m, n)$. A simple construction shows that the maximum
number of incidences $I(m, n)$ is asymptotically bounded from above by
$K(m, n)$; more exactly, we have $I(m, n) \le \frac{1}{2} K(m, 2n)$. To see this, consider
a set $P$ of $m$ points and a set $L$ of $n$ lines realizing $I(m, n)$, and replace each
line $\ell \in L$ by a pair of lines $\ell'$, $\ell''$ parallel to $\ell$ and lying at distance $\varepsilon$ from $\ell$:
[Picture: a line $\ell$ replaced by the two parallel lines $\ell'$ and $\ell''$.]
For unit distances in the plane Erdős [Erd46] established the lower
bound $U(n) = \Omega(n^{1 + c/\log\log n})$ (Section 4.2) and again conjectured it
to be tight, but the best known upper bound remains $O(n^{4/3})$. This
was first shown by Spencer, Szemerédi, and Trotter [SST84], and it
can be re-proved by modifying each of the proofs mentioned above for
point-line incidences. Further improvement of the upper bound probably
needs different, more "algebraic," methods, which would use the
"circularity" in a strong way, not just in the form of simple combinatorial
axioms (such as that two points determine at most two unit
circles).
For the analogous problem of unit distances among $n$ points in $\mathbf{R}^3$,
Erdős [Erd60] proved $\Omega(n^{4/3} \log\log n)$ from below and $O(n^{5/3})$ from
above. The example for the lower bound is the grid $\{1, 2, \ldots, \lfloor n^{1/3} \rfloor\}^3$
appropriately scaled; the bound $\Omega(n^{4/3})$ is entirely straightforward,
and the extra $\log\log n$ factor needs further number-theoretic considerations.
The upper bound follows by an argument with forbidden
$K_{3,3}$; similar proofs are shown in Section 4.5. The current best bound
is close to $O(n^{3/2})$; more precisely, it is $n^{3/2} 2^{O(\alpha^2(n))}$ [CEG+90]. Here
the function $\alpha(n)$, to be defined in Section 7.2, grows extremely slowly,
more slowly than $\log n$, $\log\log n$, $\log\log\log n$, etc. In dimensions 4 and
higher, the number of unit distances can be $\Omega(n^2)$ (Exercise 2). Here
even the constant at the leading term is known; see [PA95]. Among
other results related to the unit-distance problems and considering
point sets with various restrictions, we mention a neat construction of
Erdős, Hickerson, and Pach [EHP89] showing that, for every $\alpha \in (0, 2)$,
there is an $n$-point set on the 2-dimensional unit sphere with the distance
$\alpha$ occurring at least $\Omega(n \log^* n)$ times (the special distance $\sqrt{2}$
can even occur $\Omega(n^{4/3})$ times), and the annoying (and still unsolved)
problem of Erdős and Moser, whether the number of unit distances in
an $n$-point planar set in convex position is always bounded by $O(n)$
(see [PA95] for partial results and references).
For distinct distances in the plane, the best known upper bound,
due to Erdős, is $O(n/\sqrt{\log n})$. This bound is attained for the $\sqrt{n} \times \sqrt{n}$
square grid. After a series of increases of the lower bound (Moser
[Mos52], Chung [Chu84], Beck [Bec83], Clarkson et al. [CEG+90],
Chung, Szemerédi, and Trotter [CST92], Székely [Sze97], Solymosi and
Tóth [ST01]) the current record is $\Omega(n^{4/(5 - 1/e) - \varepsilon})$ for every fixed $\varepsilon > 0$
(the exponent is approximately 0.863) by Tardos [Tar01], who improved
a number-theoretic lemma in the Solymosi–Tóth proof. Aronov
and Sharir [AS01b] obtained the lower bound of approximately $n^{0.526}$
for distinct distances in $\mathbf{R}^3$.
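For the square grid, the number of distinct distances is easy to compute directly, since the squared distances are exactly the nonzero values $a^2 + b^2$ with $0 \le a, b \le \sqrt{n} - 1$. The following sketch (the function name is hypothetical) counts them and shows the count staying below $n$; it does not by itself establish the $O(n/\sqrt{\log n})$ rate.

```python
def distinct_distances_in_grid(m):
    """Number of distinct distances among the points of {0, ..., m-1}^2.

    Two grid points differ by (dx, dy) with |dx|, |dy| <= m - 1, so the
    squared distances are the nonzero values a*a + b*b, 0 <= a, b <= m - 1.
    """
    squared = {a * a + b * b for a in range(m) for b in range(m)}
    squared.discard(0)           # distance 0 is not counted
    return len(squared)

for m in (2, 4, 8, 16, 32, 64):
    n = m * m
    print(n, distinct_distances_in_grid(m), distinct_distances_in_grid(m) / n)
```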
Another challenging quantity is the number $I_{\mathrm{circ}}(m, n)$ of incidences
of $m$ points with $n$ arbitrary circles in the plane. The lower
bound for point-line incidences can be converted to an example with
46 Chapter 4: Incidence Problems
plane. Earlier, Elekes and Erdős [EE94] proved that this number is
$\Omega(n^{2 - (\log n)^{-c}})$ for all $K$, where $c > 0$ depends on $K$, and it is $\Theta(n^2)$
whenever all the coordinates of the points in $K$ are algebraic numbers.
Building on these results, Laczkovich and Ruzsa proved that the
maximum number of similar copies of $K$ is $\Theta(n^2)$ if and only if the
cross-ratio of every 4 points of $K$ is algebraic, where the cross-ratio
of points $a, b, c, d \in \mathbf{R}^2$ equals $\frac{a-c}{a-d} \cdot \frac{b-d}{b-c}$, with $a, b, c, d$ interpreted as
complex numbers in this formula.
Their proof makes use of very nice results from the additive theory
of numbers, most notably a theorem of Freiman [Fre73] (also see
Ruzsa [Ruz94]): If $A$ is a set of $n$ integers such that $|A + A| \le cn$,
where $A + A = \{a + b : a, b \in A\}$ and $c > 0$ is a constant, then $A$
is contained in a $d$-dimensional generalized arithmetic progression of
size at most $Cn$, with $C$ and $d$ depending on $c$ only. Here a $d$-dimensional
generalized arithmetic progression is a set of integers of the form
$\{z_0 + i_1 q_1 + i_2 q_2 + \cdots + i_d q_d : i_1 = 0, 1, \ldots, n_1,\ i_2 = 0, 1, \ldots, n_2,\ \ldots,\ i_d = 0, 1, \ldots, n_d\}$
for some integers $z_0$ and $q_1, q_2, \ldots, q_d$. It is easy to see that
$|A + A| \le C_d |A|$ for every $d$-dimensional generalized arithmetic
progression, and Freiman's theorem is a sort of converse statement: If
$|A + A| = O(|A|)$, then $A$ is not too far from a generalized arithmetic
progression. (Freiman's theorem has also been used for incidence-related
problems by Erdős, Füredi, Pach, and Ruzsa [EFPR93], and
According to Purdy's conjecture, these are the only possible positions
of $u$ and $v$ if the number of distances is linear: For every $C > 0$ there
is an $n_0$ such that if $n \ge n_0$ and $|\mathrm{Dist}(P, Q)| \le Cn$, then $u$ and $v$ are
parallel or perpendicular.
If we parameterize the line $u$ by a real parameter $x$, and $v$ by $y$, and
denote the cosine of the angle of $u$ and $v$ by $\lambda$, then Purdy's conjecture
can be reformulated in algebraic terms as follows: Whenever $X, Y \subset \mathbf{R}$
are $n$-point sets such that the polynomial $F(x, y) = x^2 + y^2 + 2\lambda xy$
attains at most $Cn$ distinct values on $X \times Y$, i.e.,
$|\{F(x, y) : x \in X, y \in Y\}| \le Cn$, then necessarily $\lambda = 0$ or $\lambda = \pm 1$,
provided that $n \ge n_0(C)$.
Elekes and Rónyai [ER00] characterized all bivariate polynomials
$F(x, y)$ that attain only $O(n)$ values on Cartesian products $X \times Y$.
For every $C, d$ there exists an $n_0$ such that if $F(x, y)$ is a bivariate
polynomial of degree at most $d$ and $X, Y \subset \mathbf{R}$ are $n$-point sets, $n \ge n_0$,
such that $F(x, y)$ attains at most $Cn$ distinct values on $X \times Y$, then
$F(x, y)$ has one of the two special forms $f(g(x) + h(y))$ or $f(g(x)h(y))$,
where $f, g, h$ are univariate polynomials. In fact, we need not consider
the whole $X \times Y$; it suffices to assume that $F$ attains at most $Cn$ values
on an arbitrary subset of $\delta n^2$ pairs from $X \times Y$ (with $n_0$ depending
on $\delta$, too). A similar result holds for a bivariate rational function
$F(x, y)$, with one more special form to consider, namely,
$F(x, y) = f((g(x) + h(y))/(1 - g(x)h(y)))$.
We indicate a proof only for the special case of the polynomial
$F(x, y) = x^2 + y^2 + 2\lambda xy$ from Purdy's conjecture (following Elekes
[Ele99]); the basic idea of the general case is similar, but several more
tools are needed, especially from elementary algebraic geometry. So let
$Z = F(X, Y)$ be the set of values attained by $F$ on $X \times Y$. For each
$y_i \in Y$, put $f_i(x) = F(x, y_i)$, and define the family $\Gamma = \{\gamma_{ij} : i, j =$
Perhaps this latter bound can be improved to $\Omega(n^{2-\varepsilon})$ for every
$\varepsilon > 0$ (so there would be an almost-dichotomy: either the number of
values of $F$ can be linear, or it has to be always near-quadratic). On the
other hand, it is known that the polynomial $x^2 + y^2 + xy$ attains only
$O(n^2/\sqrt{\log n})$ distinct values for $x, y$ ranging over $\{1, 2, \ldots, n\}$, and
so the bound need not always be linear or quadratic. It seems likely
that in the general case of the Elekes–Rónyai theorem the number of
values attained by $F$ should be near-quadratic unless $F$ is one of the
special forms.
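The behavior of $x^2 + y^2 + xy$ on $\{1, 2, \ldots, n\}^2$ can be observed experimentally. The sketch below (with a hypothetical function name) counts the distinct values and prints the ratio to $n^2$, which decreases slowly; small cases like these illustrate the statement but do not confirm the $O(n^2/\sqrt{\log n})$ bound.

```python
def distinct_values(n):
    """Number of distinct values of x^2 + y^2 + x*y for x, y in {1, ..., n}."""
    return len({x * x + y * y + x * y
                for x in range(1, n + 1)
                for y in range(1, n + 1)})

for n in (10, 50, 100, 200):
    count = distinct_values(n)
    print(n, count, count / (n * n))
```

Because the polynomial is symmetric in $x$ and $y$, the count never exceeds $n(n+1)/2$; the interesting part is the additional slow decay of the ratio.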
Further generalizations of the Elekes–Rónyai theorem were obtained
by Elekes and Szabó; see [Ele01].
Exercises
1. Let $I_{1\mathrm{circ}}(m, n)$ be the maximum number of incidences of $m$ points with
$n$ unit circles and let $U(n)$ be the maximum number of unit distances for
an $n$-point set.
(a) Prove that $I_{1\mathrm{circ}}(2n, 2n) = O(I_{1\mathrm{circ}}(n, n))$.
(b) We have seen that $U(n) \le \frac{1}{2} I_{1\mathrm{circ}}(n, n)$. Prove that
$I_{1\mathrm{circ}}(n, n) = O(U(n))$. [2]
2. Show that an $n$-point set in $\mathbf{R}^4$ may determine $\Omega(n^2)$ unit distances.
3. Prove that if $X \subset \mathbf{R}^d$ is a set where every two points have distance 1,
then $|X| \le d + 1$.
4. What can be said about the maximum possible number of incidences of
$n$ lines in $\mathbf{R}^3$ with $m$ points? [2]
5. Use the Szemerédi–Trotter theorem to show that $n$ points in the plane
determine at most
(a) $O(n^{7/3})$ triangles of unit area,
(b) $O(n^{7/3})$ triangles with a given fixed angle $\alpha$. [2]
The result in (a) was first proved by Erdős and Purdy [EP71]. As for
(b), Pach and Sharir [PS92] proved the better bound $O(n^2 \log n)$; also
see [PA95].
6. (a) Using the Szemerédi–Trotter theorem, show that the maximum possible
number of distinct lines such that each of them contains at least $k$
points of a given $m$-point set $P$ in the plane is $O(m^2/k^3 + m/k)$.
(b) Prove that such lines have at most $O(m^2/k^2 + m)$ incidences with $P$.
7. (Many points on a line or many lines)
(a) Let $P$ be an $m$-point set in the plane and let $k \le \sqrt{m}$ be an integer
parameter. Prove (using Exercise 6, say) that at most $O(m^2/k)$ pairs of
points of $P$ lie on lines containing at least $k$ and at most $\sqrt{m}$ points of
$P$.
(b) Similarly, for $K \ge \sqrt{m}$, the number of pairs lying on lines with at
least $\sqrt{m}$ and at most $K$ points is $O(Km)$.
(c) Prove the following theorem of Beck [Bec83]: There is a constant
$c > 0$ such that for any $n$-point $P \subseteq \mathbf{R}^2$, at least $cn^2$ distinct lines are
determined by $P$ or there is a line containing at least $cn$ points of $P$.
(d) Derive that there exists a constant $c > 0$ such that for every $n$-point
set $P$ in the plane that does not lie on a single line there exists a point
$p \in P$ lying on at least $cn$ distinct lines determined by points of $P$.
Part (d) is a weak form of the Dirac–Motzkin conjecture; the full conjecture,
still unsolved, is the same assertion with $c = \frac{1}{2}$.
8. (Many distinct radii)
(a) Assume that $I_{\mathrm{circ}}(m, n) = O(m^{\alpha} n^{\beta} + m + n)$ for some constants $\alpha < 1$
and $\beta < 1$, where $I_{\mathrm{circ}}(m, n)$ is the maximum number of incidences of m
points with n circles in the plane. In analogy with Exercise 7, derive
that there is a constant c > 0 such that for any n-point set $P \subset \mathbf{R}^2$,
there are at least $cn^3$ distinct circles containing at least 3 points of P
each or there is a circle or line containing at least cn points of P. IT!
(b) Using (a), prove the following result of Elekes (an answer to a question
of Balog): For any n-point set $P \subset \mathbf{R}^2$ not lying on a common circle or
line, the circles determined by P (i.e., those containing 3 or more points
of P) have $\Omega(n)$ distinct radii. [I]
(c) Find an example of an n-point set with only O(n) distinct radii. IT!
9. (Sums and products cannot both be few) Let $A \subset \mathbf{R}$ be a set of n distinct
real numbers and let $S = A + A = \{a + b : a, b \in A\}$ and
$P = A \cdot A = \{ab : a, b \in A\}$.
(a) Check that each of the $n^2$ lines $\{(x, y) \in \mathbf{R}^2 : y = a(x - b)\}$, $a, b \in A$,
contains at least n distinct points of $S \times P$. IT]
(b) Conclude using Exercise 6 that $|S \times P| = \Omega(n^{5/2})$, and consequently,
$\max(|S|, |P|) = \Omega(n^{5/4})$; i.e., the set of sums and the set of products can
never both have almost linear size. ~ (This is a theorem of Elekes [Ele97]
improving previous results on a problem raised by Erdős and Szemerédi.)
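The observation behind part (a) of Exercise 9 can be checked numerically. The following is a minimal sketch, assuming a small hypothetical sample set `A` of positive integers (so that all $n^2$ lines are distinct); the helper name `points_on_line` is illustrative and not from the text. For every $c \in A$, the point $(b + c, ac)$ lies on the line $y = a(x - b)$ and belongs to $S \times P$.

```python
from itertools import product

def points_on_line(A, a, b):
    """Points of S x P lying on the line y = a*(x - b), with S = A+A and P = A*A."""
    S = {x + y for x, y in product(A, repeat=2)}
    P = {x * y for x, y in product(A, repeat=2)}
    return {(s, a * (s - b)) for s in S if a * (s - b) in P}

A = [1, 2, 3, 5, 8]   # hypothetical sample set (an assumption, not from the text)
n = len(A)

# Smallest number of points of S x P found on any of the n^2 lines.
min_count = min(len(points_on_line(A, a, b)) for a, b in product(A, repeat=2))
print(min_count >= n)
```

The check only illustrates part (a); the lower bound in part (b) still needs Exercise 6.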
4.2 Lower Bounds: Incidences and Unit Distances 51
10. (a) Find n-point sets in the plane that contain $\Omega(n^2)$ similar copies of
the vertex set of an equilateral triangle. [II
(b) Verify that the following set $P_m$ has $n = O(m^4)$ points and contains
$\Omega(n^2)$ similar copies of the vertex set of a regular pentagon: Identify $\mathbf{R}^2$
with the complex plane $\mathbf{C}$, let $\omega = e^{2\pi i/5}$ denote a primitive 5th root of
unity, and put
$2k^2 + 2k^2 = 4k^2$. Therefore, for each $i = 0, 1, \ldots, k-1$, each line of L contains
a point of P with the x-coordinate equal to i, and so $I(P, L) \ge k \cdot |L| = \Omega(n^{4/3})$.
□
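The counting at the end of this proof can be reproduced for small k. The sketch below assumes the standard grid construction, which is not visible in the surviving text: $P = \{0, \ldots, k-1\} \times \{0, \ldots, 4k^2 - 1\}$ and L consisting of the lines $y = ax + b$ with $a \in \{0, \ldots, 2k-1\}$ and $b \in \{0, \ldots, 2k^2 - 1\}$. Both sets and the function name `grid_incidences` are assumptions made for illustration.

```python
def grid_incidences(k):
    """Return (|P|, |L|, I(P, L)) for the assumed grid construction."""
    # Assumed point set: k columns, 4k^2 rows.
    points = {(x, y) for x in range(k) for y in range(4 * k * k)}
    # Assumed line set: y = a*x + b with small integer slope and intercept.
    lines = [(a, b) for a in range(2 * k) for b in range(2 * k * k)]
    # Every point of P has x in range(k), so this counts all incidences.
    incidences = sum((x, a * x + b) in points for a, b in lines for x in range(k))
    return len(points), len(lines), incidences

for k in (2, 3, 4):
    n_points, n_lines, inc = grid_incidences(k)
    print(k, n_points, n_lines, inc, inc == k * n_lines)
```

Under these assumptions both sets have $n = 4k^3$ elements and each line meets P in exactly k points, which gives the $k \cdot |L|$ count used above.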
Next, we consider unit distances, where the construction is equally simple
but the analysis uses considerable number-theoretic tools.
4.2.2 Theorem (Many unit distances). For all $n \ge 2$, there exist
configurations of n points in the plane determining at least $n^{1 + c_1/\log\log n}$ unit
distances, with a positive constant $c_1$.
A configuration with the asymptotically largest known number of unit
distances is a $\sqrt{n} \times \sqrt{n}$ regular grid with a suitably chosen step. Here unit
distances are related to the number of possible representations of an integer
as a sum of two squares. We begin with the following claim:
4.2.3 Lemma. Let $p_1 < p_2 < \cdots < p_r$ be primes of the form 4k+1, and
put $M = p_1 p_2 \cdots p_r$. Then M can be expressed as a sum of two squares of
integers in at least $2^r$ ways.
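Lemma 4.2.3 is easy to test by brute force for small products. The sketch below counts ordered integer pairs (x, y), signs included, with $x^2 + y^2 = M$; the function name `two_square_representations` is illustrative, and the primes chosen are just the first few of the form 4k+1.

```python
from math import isqrt

def two_square_representations(M):
    """All ordered integer pairs (x, y), signs included, with x*x + y*y == M."""
    reps = []
    for x in range(-isqrt(M), isqrt(M) + 1):
        y_squared = M - x * x
        y = isqrt(y_squared)
        if y * y == y_squared:
            reps.append((x, y))
            if y != 0:
                reps.append((x, -y))
    return reps

primes_1_mod_4 = [5, 13, 17]
M = 1
for r, p in enumerate(primes_1_mod_4, start=1):
    M *= p
    count = len(two_square_representations(M))
    print(M, count, count >= 2 ** r)
```

For these small cases the brute-force counts are larger than the $2^r$ guaranteed by the lemma.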
$$\pi(n) = (1 + o(1)) \frac{n}{\ln n} \quad \text{as } n \to \infty.$$
Proofs of this fact are quite complicated; on the other hand, it is not so hard
to prove weaker bounds $cn/\log n < \pi(n) < Cn/\log n$ for suitable positive
constants c, C.
We consider primes in the arithmetic progression 1,5,9, ... ,4k+ 1, .... A
famous theorem of Dirichlet asserts that every arithmetic progression con-
tains infinitely many primes unless this is impossible for a trivial reason,
namely, unless all the terms have a nontrivial common divisor. The following
theorem is still stronger:
4.2.4 Theorem. Let d and a be relatively prime natural numbers, and let
$\pi_{d,a}(n)$ be the number of primes of the form a + kd (k = 0, 1, 2, ...) not
exceeding n. We have
$$\pi_{d,a}(n) = (1 + o(1)) \frac{1}{\varphi(d)} \cdot \frac{n}{\ln n},$$
where $\varphi$ denotes the Euler function: $\varphi(d)$ is the number of integers between 1
and d that are relatively prime to d.
For every $d \ge 2$, there are $\varphi(d)$ residue classes modulo d that can possibly
contain primes. The theorem shows that the primes are quite uniformly
distributed among these residue classes.
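The uniform distribution asserted by Theorem 4.2.4 can be observed empirically. The following sketch counts primes up to n in each residue class modulo d that is coprime to d and compares the counts with the asymptotic value $n/(\varphi(d) \ln n)$; the function names are illustrative, and the comparison is only approximate for small n because of the $o(1)$ term.

```python
from math import gcd, isqrt, log

def primes_up_to(n):
    """Sieve of Eratosthenes; returns the list of primes <= n (requires n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i in range(n + 1) if sieve[i]]

def residue_counts(n, d):
    """Number of primes <= n in each residue class a mod d with gcd(a, d) = 1."""
    counts = {a: 0 for a in range(d) if gcd(a, d) == 1}
    for p in primes_up_to(n):
        if p % d in counts:
            counts[p % d] += 1
    return counts

def asymptotic_estimate(n, d):
    """The main term n / (phi(d) * ln n) from the theorem."""
    phi = sum(1 for a in range(1, d + 1) if gcd(a, d) == 1)
    return n / (phi * log(n))

for n in (100, 10_000, 1_000_000):
    print(n, residue_counts(n, 4), round(asymptotic_estimate(n, 4), 1))
```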
The proof of the theorem is not simple, and we omit it, but it is very
nice, and we can only recommend to the reader to look it up in a textbook
on number theory.
Proof of the lower bound for unit distances (Theorem 4.2.2). Let us
suppose that n is a square. For the set P we choose the points of the $\sqrt{n} \times \sqrt{n}$
grid with step $1/\sqrt{M}$, where M is the product of the first r-1 primes of the
form 4k+1, and r is chosen as the largest number such that $M \le \frac{n}{4}$.
It is easy to see that each point of the grid participates in at least as many
unit distances as there are representations of M as a sum of two squares of
nonnegative integers. Since one representation by a sum of two squares of
nonnegative integers corresponds to at most 4 representations by a sum of
two squares of arbitrary integers (the signs can be chosen in 4 ways), we have
at least $2^{r-1}/16$ unit distances by Lemma 4.2.3.
By the choice of r, we have $4p_1p_2 \cdots p_{r-1} \le n < 4p_1p_2 \cdots p_r$, and
hence $2^r \le n$ and $p_r > (\frac{n}{4})^{1/r}$. Further, we obtain, by Theorem 4.2.4,
$r = \pi_{4,1}(p_r) \ge (\frac{1}{2} - o(1))\, p_r/\log p_r \ge \sqrt{p_r} \ge n^{1/(3r)}$ for sufficiently large
n, and thus $r^{3r} \ge n$. Taking logarithms, we have $3r \log r \ge \log n$, and hence
$r \ge \log n/(3 \log r) \ge \log n/(3 \log\log n)$. The number of unit distances is at
least $n 2^{r-4} \ge n^{1 + c_1/\log\log n}$, as Theorem 4.2.2 claims. Let us remark that for
sufficiently large n the constant $c_1$ can be made as close to 1 as desired. □
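A small instance of this construction can be counted directly. The sketch below takes the hypothetical value n = 400 (a 20 x 20 grid), for which the largest admissible product is M = 5 · 13 = 65, so r = 3. Rather than scaling the grid by $1/\sqrt{M}$, it works on the integer grid and counts pairs at squared distance M, which is the same thing. The function name is illustrative.

```python
def unit_distances_in_grid(side, M):
    """Unordered pairs of points of a side x side integer grid at squared distance M.

    After rescaling the grid step to 1/sqrt(M), these are exactly the unit distances.
    """
    pts = [(x, y) for x in range(side) for y in range(side)]
    count = 0
    for i, (x1, y1) in enumerate(pts):
        for x2, y2 in pts[i + 1:]:
            if (x1 - x2) ** 2 + (y1 - y2) ** 2 == M:
                count += 1
    return count

side, M, r = 20, 65, 3          # n = 400, M = 5 * 13 <= n / 4, r - 1 = 2 primes
n = side * side
found = unit_distances_in_grid(side, M)
print(found, found >= n * 2 ** (r - 4))
```

At this size the brute-force count comfortably exceeds the bound $n 2^{r-4}$ from the proof; the asymptotic statement of Theorem 4.2.2 of course needs much larger n.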
Exercises
1. By extending the example in the text, prove that for all m, n with $n^2 \ge m$
and $m^2 \ge n$, we have $I(m, n) = \Omega(n^{2/3} m^{2/3})$. [I]
2. (Another example for incidences) Suppose that $n = 4t^6$ for an integer
$t \ge 1$ and let $P = \{(i, j) : 0 \le i, j < \sqrt{n}\}$. Let
$S = \{(a, b) : a, b = 1, 2, \ldots, t, \ \gcd(a, b) = 1\}$, where gcd(a, b) denotes the greatest common
divisor of a and b. For each point $p \in P$, consider the lines passing
through p with slope a/b, for all pairs $(a, b) \in S$. Let L be the union of
all the lines thus obtained for all points $p \in P$.
(a) Check that $|L| \le n$. ~
(b) Prove that $|S| \ge ct^2$ for a suitable positive constant c > 0, and infer
that $I(P, L) = \Omega(nt^2) = \Omega(n^{4/3})$. [IJ
if and only if its crossing number is 0. The crossing number of the graph G
is the smallest possible crossing number of a drawing of G; we denote it by
cr(G). For example, $\mathrm{cr}(K_5) = 1$:
As is well known, for n > 2, a planar graph with n vertices has at most
3n-6 edges. This can be rephrased as follows: If the number of edges is
at least 3n-5 then cr(G) > 0. The following theorem can be viewed as a
generalization of this fact.
$$\mathrm{cr}(G) \ge \frac{1}{64} \frac{|E|^3}{|V|^2} - |V|.$$
The lower bound in this theorem is asymptotically tight; i.e., there exist
graphs with n vertices, m edges, and crossing number $O(m^3/n^2)$; see Exercise 1.
The assumption that the graph is simple cannot be omitted.
For a proof of this theorem, we need a simple lemma:
Proof. If $|E| > 3|V|$ and some drawing of the graph had fewer than $|E| - 3|V|$
crossings, then we could delete one edge from each crossing and obtain a
planar graph with more than $3|V|$ edges. □
$$x \ge \frac{1}{64} \frac{m^3}{n^2}.$$
The best known upper bound on the number of unit distances, $U(n) = O(n^{4/3})$,
can be proved along similar lines; see Exercise 2.
Pach, Spencer, and Tóth [PST00] proved that for graphs with certain
forbidden subgraphs, the bound can be improved substantially:
For example, if G has n vertices, m edges, and contains no cycle of
length 4, then $\mathrm{cr}(G) = \Omega(m^4/n^3)$ for $m \ge 400n$, which is asymptotically
tight. Generally, let $\mathcal{G}$ be a class of graphs that is monotone
(closed under deleting edges) and such that any n-vertex graph
in $\mathcal{G}$ has at most $O(n^{1+\alpha})$ edges, for some $\alpha \in (0, 1)$. Then
$\mathrm{cr}(G) \ge c\, m^{2+1/\alpha}/n^{1+1/\alpha}$ for any $G \in \mathcal{G}$ with n vertices and $m \ge Cn \log^2 n$
edges, with suitable constants C, c > 0 depending on $\mathcal{G}$. The proof
applies a generally useful lower bound on the crossing number, which
we outline next. Let bw(G) denote the bisection width of G, i.e., the
minimum number of edges connecting $V_1$ and $V_2$, over all partitions
$(V_1, V_2)$ of V(G) with $|V_1|, |V_2| \ge \frac{1}{3}|V(G)|$. Leighton [Lei83] proved
that $\mathrm{cr}(G) = \Omega(\mathrm{bw}(G)^2) - |V(G)|$ for any graph G of maximum degree
bounded by a constant. Pach, Shahrokhi, and Szegedy [PSS96],
and independently Sýkora and Vrťo [SV94], extended this to graphs
with arbitrary degrees:
(4.1)
each of the resulting parts has $O(s^{1+\alpha})$ edges, and so there are $O(n s^{\alpha})$
edges within the parts. This number of edges plus the number of edges
deleted in the bisections add up to m, and this provides an inequality
relating cr(G) to n and m; see [PST00] for the calculations.
The notion of crossing number is a subtle one. Actually, one can
give several natural definitions; a study of various notions and of their
relations was made by Pach and Tóth [PT00]. Besides counting the
crossings, as we did in the definition of cr(G), one can count the
number of (unordered) pairs of edges that cross; the resulting notion
is called the pairwise crossing number in [PT00], and we denote
it by pair-cr(G). We always have $\operatorname{pair-cr}(G) \le \operatorname{cr}(G)$, but since two
edges (arcs) are allowed to cross several times, it is not clear whether
pair-cr(G) = cr(G) for all graphs G, and currently this seems to be a
challenging open problem (see Exercise 4 for a typical false attempt at
a proof). A simple argument shows that $\operatorname{cr}(G) \le 2 \operatorname{pair-cr}(G)^2$
(Exercise 4(c)). A stronger claim, proved in [PT00], is $\operatorname{cr}(G) \le 2 \operatorname{odd-cr}(G)^2$,
where odd-cr(G) is the odd-crossing number of G, counting the number
of pairs of edges that cross an odd number of times. An inspiration
for their proof is a theorem of Hanani and Tutte claiming that a graph
G is planar if and only if odd-cr(G) = 0. In a drawing of G, call an
edge e even if there is no edge crossed by e an odd number of times.
Pach and Tóth show, by a somewhat complicated proof, that if we
consider a drawing of G and let $E_0$ be the set of the even edges, then
there is another drawing of G in which the edges of $E_0$ are involved in
no crossings at all. The inequality $\operatorname{cr}(G) \le 2 \operatorname{odd-cr}(G)^2$ then follows
by an argument similar to that in Exercise 4(c).
Finally, let us remark that if we consider rectilinear drawings
(where each edge is drawn as a straight segment), then the resulting
rectilinear crossing number can be much larger than any of the
crossing numbers considered above: Graphs are known with cr(G) = 4
and arbitrarily large rectilinear crossing numbers (Bienstock and Dean
[BD93]).
Exercises
1. Show that for any n and m, $5n \le m \le \binom{n}{2}$, there exist graphs with n
vertices, m edges, and crossing number $O(m^3/n^2)$. 0
2. In a manner similar to the above proof for point-line incidences, prove the
bound $I_{\mathrm{1circ}}(n, n) = O(n^{4/3})$, where $I_{\mathrm{1circ}}(m, n)$ denotes the maximum
possible number of incidences between m points and n unit circles in the
plane (be careful in handling possible multiple edges in the considered
topological graph!). m
3. Let K(n, m) denote the maximum total number of edges of m distinct
cells in an arrangement of n lines in the plane. Prove
$K(n, m) = O(n^{2/3} m^{2/3} + n + m)$ using the method of the present section (it may be
convenient to classify edges into top and bottom ones and bound each
type separately). 0
4. (a) Prove that in a drawing of G with the smallest possible number of
crossings, no two edges cross more than once. 0
(b) Explain why the result in (a) does not imply that pair-cr(G) = cr(G)
(where pair-cr(G) is the minimum number of pairs of crossing edges in a
drawing of G). 0
(c) Prove that if G is a graph with pair-cr(G) = k, then $\operatorname{cr}(G) \le \binom{2k}{2}$. [II
Proof. Fix an n-point set P, and let t be the number of distinct distances
determined by P. This means that for each point $p \in P$, all the other points
are contained in t circles centered at p (the radii correspond to the t distances
appearing in P).
These tn circles obtained for all the n points of P have n(n-1) incidences
with the points of P. The first idea is to bound this number of incidences from
above in terms of n and t, in a way similar to the proof of the Szemerédi-Trotter
theorem in the preceding section, which yields a lower bound for t.
First we delete all circles with at most 2 points on them (the innermost
circle and the second outermost circle in the above picture). We have destroyed
at most 2nt incidences, and so still almost $n^2$ incidences remain (we
may assume that t is much smaller than n, for otherwise, there is nothing
to prove). Now we define a graph G: The vertices are the points of P and
the edges are the arcs of the circles between the points. This graph has n
vertices, almost $n^2$ edges, and there are at most $t^2 n^2$ crossings because every
two circles intersect in at most 2 points.
So the line $\ell_{uv}$ has at least k incidences with the points of P. But the
Szemerédi-Trotter theorem tells us that there cannot be too many distinct lines,
each incident to many points of P. Let us make this precise.
By a consequence of the Szemerédi-Trotter theorem stated in Exercise 4.1.6(b),
lines containing at least k points of P each have altogether
no more than $O(n^2/k^2 + n)$ incidences with P.
Let M be the set of pairs {u, v} of vertices of G connected by at least
k edges in G, and let E be the set of edges (arcs) connecting these pairs.
Each edge in E connecting the pair {u, v} contributes one incidence of the
bisecting line $\ell_{uv}$ with a point $p \in P$. On the other hand, one incidence of
1( k1)
k 1-
k'-l 1
~ 3k
= 0 (IEI
3
kx2 k 3n 2 ) - O(n),
Exercises
1. Let $I_{\mathrm{circ}}(m, n)$ be the maximum number of incidences between m points
and n arbitrary circles in the plane. Fill in the details of the following
approach to bounding $I_{\mathrm{circ}}(n, n)$. Let K be a set of n circles, C the set
of their centers, and P a set of n points.
(a) First, assume that the centers of the circles are mutually distinct, i.e.,
$|C| = |K|$. Proceed as in the proof in the text: Remove circles with at
most 2 incidences, and let the others define a drawing of a multigraph G
with vertex set P and arcs of the circles as edges. Handle the edges with
multiplicity k or larger via Szemerédi-Trotter, using the incidences of the
bisectors with the set C, and those with multiplicity < k by Lemma 4.4.2.
Balance k suitably. What bound is obtained for the total number of
incidences? 0
(b) Extend the argument to handle concentric circles too. 0
2. This exercise provides another bound for $I_{\mathrm{circ}}(n, n)$, the maximum possible
number of incidences between n arbitrary circles and n points in the
plane. Let K be the set of circles and P the set of points. Let $P_i$ be the
points with at least $d_i = 2^i$ and fewer than $2^{i+1}$ incidences; we will argue
for each $P_i$ separately.
Define the multigraph G on $P_i$ as usual, with arcs of circles of K connecting
neighboring points of $P_i$ (the circles with at most 2 incidences
with $P_i$ are deleted). Let E be the set of edges of G. For a point $u \in P_i$,
let N(u) be the set of its neighboring points, and for a $v \in N(u)$, let
Proof. There are at most $\binom{n}{2}$ crossing pairs of lines in total. On the other
hand, a point $p_i \in P$ with $d_i$ incidences "consumes" $\binom{d_i}{2}$ crossing pairs (their
intersections all lie at $p_i$). Therefore, $\sum_{i=1}^{m} \binom{d_i}{2} \le \binom{n}{2}$.
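The double-counting inequality $\sum_i \binom{d_i}{2} \le \binom{n}{2}$ can be checked on a concrete configuration. The sketch below uses a hypothetical example that is not from the text: the points of a small integer grid together with all lines spanned by pairs of them, with lines stored as normalized integer triples (a, b, c) meaning ax + by = c so that the arithmetic is exact.

```python
from itertools import combinations
from math import comb, gcd

def line_through(p, q):
    """Normalized integer triple (a, b, c) of the line a*x + b*y = c through p != q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    c = a * x1 + b * y1
    g = gcd(gcd(a, b), c)
    a, b, c = a // g, b // g, c // g
    if a < 0 or (a == 0 and b < 0):   # fix the sign so each line has one triple
        a, b, c = -a, -b, -c
    return a, b, c

def crossing_pair_check(side):
    """Return (n, sum of C(d_i, 2), C(n, 2)) for the grid and its spanned lines."""
    points = [(x, y) for x in range(side) for y in range(side)]
    lines = {line_through(p, q) for p, q in combinations(points, 2)}
    degrees = [sum(a * x + b * y == c for a, b, c in lines) for x, y in points]
    consumed = sum(comb(d, 2) for d in degrees)
    return len(lines), consumed, comb(len(lines), 2)

n_lines, consumed, available = crossing_pair_check(3)
print(n_lines, consumed, available, consumed <= available)
```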
We want to bound $\sum_{i=1}^{m} d_i$ from above. Since points with no incidences
can be deleted from P in advance, we may assume $d_i \ge 1$ for all i, and then
we have $\binom{d_i}{2} \ge (d_i - 1)^2/2$. By the Cauchy-Schwarz inequality,
Forbidden subgraph arguments. For integers $r, s \ge 1$, let $K_{r,s}$ denote
the complete bipartite graph on r + s vertices; the picture shows $K_{3,4}$:
4.5 Point-Line Incidences via Cuttings 65
The above proof can be expressed using graphs with forbidden $K_{2,2}$ as a
subgraph and thus put into the context of extremal graph theory.
A typical question in extremal graph theory is the maximum possible
number of edges of a (simple) graph on n vertices that does not contain a
given forbidden subgraph, such as $K_{2,2}$. Here the subgraph is understood in
a noninduced sense: For example, the complete graph $K_4$ does contain $K_{2,2}$
as a subgraph. More generally, one can forbid all subgraphs from a finite or
infinite family $\mathcal{F}$ of graphs, or consider "containment" relations other than
being a subgraph, such as "being a minor."
If the forbidden subgraph H is not bipartite, then, for example, the complete
bipartite graph $K_{n,n}$ has 2n vertices, $n^2$ edges, and no subgraph isomorphic
to H. This shows that forbidding a nonbipartite H does not reduce
the maximum number of edges too significantly, and the order of magnitude
remains quadratic.
On the other hand, forbidding $K_{r,s}$ with some fixed r and s decreases
the exponent of n, and forbidden bipartite subgraphs are the key to many
estimates in incidence problems and elsewhere.
4.5.2 Theorem (Kővári-Sós-Turán theorem). Let $r \le s$ be fixed natural
numbers. Then any graph on n vertices containing no $K_{r,s}$ as a subgraph
has at most $O(n^{2-1/r})$ edges.
If G is a bipartite graph with color classes of sizes m and n containing no
subgraph $K_{r,s}$ with the r vertices in the class of size m and the s vertices in
the class of size n, then
As was noted above, for arbitrary bipartite graphs with forbidden $K_{2,2}$,
not necessarily being incidence graphs of points and lines in the plane, the
bound in the Kővári-Sós-Turán theorem cannot be improved. So, in order
to do better for point-line incidences, one has to use some more geometry
than just the excluded $K_{2,2}$. In fact, this was one of the motivations of the
problem of point-line incidences: In a finite projective plane of order q, we
have $n = q^2 + q + 1$ points, n lines, and $(q+1)n \approx n^{3/2}$ incidences, and so the
Szemerédi-Trotter theorem strongly distinguishes the Euclidean plane from
finite projective planes in a combinatorial sense.
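The incidence count for a finite projective plane can be verified directly for small orders. The following sketch assumes q is prime and uses the standard coordinate model over $\mathbf{Z}_q$ (points and lines are nonzero triples up to scaling, and a point is incident with a line when their dot product vanishes modulo q). The model and the function names are illustrative choices, not taken from the text.

```python
from itertools import combinations, product

def projective_points(q):
    """One representative per projective point: first nonzero coordinate equals 1."""
    pts = []
    for v in product(range(q), repeat=3):
        nonzero = [c for c in v if c]
        if nonzero and nonzero[0] == 1:
            pts.append(v)
    return pts

def incident(p, line, q):
    return sum(a * b for a, b in zip(p, line)) % q == 0

def plane_statistics(q):
    """Return (number of points, number of incidences, whether K_{2,2} is absent)."""
    pts = projective_points(q)
    lines = pts                      # lines are indexed by the same triples
    incidences = sum(incident(p, l, q) for p in pts for l in lines)
    k22_free = all(
        sum(incident(p1, l, q) and incident(p2, l, q) for l in lines) == 1
        for p1, p2 in combinations(pts, 2)
    )
    return len(pts), incidences, k22_free

for q in (2, 3, 5):
    n, inc, ok = plane_statistics(q)
    print(q, n, inc, inc == (q + 1) * n, ok)
```

Every two points share exactly one line, so the incidence graph contains no $K_{2,2}$, yet the number of incidences is of order $n^{3/2}$.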
Proof of the Szemerédi-Trotter theorem (Theorem 4.1.1) for m = n.
The bound from Lemma 4.5.1 is weaker than the tight Szemerédi-Trotter
bound, but it is tight if $n^2 \le m$ or $m^2 \le n$. The idea of the present proof
is to convert the "balanced" case (n points and n lines) into a collection of
"unbalanced" subproblems, for which Lemma 4.5.1 is optimal. We apply the
following important result:
4.5.3 Lemma (Cutting lemma). Let L be a set of n lines in the plane,
and let r be a parameter, 1 < r < n. Then the plane can be subdivided
into t generalized triangles (this means intersections of three half-planes)
$\Delta_1, \Delta_2, \ldots, \Delta_t$ in such a way that the interior of each $\Delta_i$ is intersected by at
most $\frac{n}{r}$ lines of L, and we have $t \le Cr^2$ for a certain constant C independent
of n and r.
Such a collection $\Delta_1, \ldots, \Delta_t$ may look like this, for example:
Such a point has to be the vertex of some $\Delta_i$, and so there are no more than
3t such exceptional points. These points have at most I(n, 3t) incidences with
the lines of L. Another exceptional case is a line of L containing a side of $\Delta_i$
but not intersecting its interior and therefore not included in $L_i$, although it
may be incident with some points on the boundary of $\Delta_i$.
There are at most 3t such exceptional lines, and they have at most I(3t, n)
incidences with the points of P. So we have
t
Exercises
1. Let $I_{\mathrm{1circ}}(m, n)$ be the maximum number of incidences between m points
and n unit circles in the plane. Prove that $I_{\mathrm{1circ}}(m, n) = O(m\sqrt{n} + n)$ by
the method of Lemma 4.5.1. III
2. Let $I_{\mathrm{circ}}(m, n)$ be the maximum possible number of incidences between
m points and n arbitrary circles in the plane. Prove that
$I_{\mathrm{circ}}(m, n) = O(n\sqrt{m} + n)$ and $I_{\mathrm{circ}}(m, n) = O(mn^{2/3} + n)$. III
This implies the promised weaker version of the cutting lemma: Since the
probability of the sample S being good is positive, there exists at least one
good S that yields the desired collection of triangles.
Proof of Lemma 4.6.1. Let us say that a triangle T is dangerous if its
interior is intersected by at least $k = \frac{n}{r}$ lines of L. We fix some arbitrary
dangerous triangle T. What is the probability that no line of the sample S
intersects the interior of T? We select a random line s times. The probability
that we never hit one of the k lines intersecting the interior of T is at most
$(1 - k/n)^s$. Using the well-known inequality $1 + x \le e^x$, we can bound this
probability by $e^{-ks/n} = e^{-6 \ln n} = n^{-6}$.
Call a triangle T interesting (for L) if it can appear in a triangulation for
some sample $S \subseteq L$. Any interesting triangle has vertices at some three vertices
of the arrangement of L, and hence there are fewer than $n^6$ interesting
triangles.² Therefore, with a positive probability, a random sample S intersects
the interiors of all the dangerous interesting triangles simultaneously.
In particular, none of the triangles $\Delta_i$ appearing in the triangulation of such
a sample S can be dangerous. This proves Lemma 4.6.1. □
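The probability estimate in this proof is simple to evaluate numerically. The sketch below assumes the sample size $s = \lceil 6r \ln n \rceil$, which is the value implied by the identity $e^{-ks/n} = e^{-6 \ln n}$ with $k = n/r$; the function name and the sample values of n and r are illustrative.

```python
from math import ceil, exp, log

def miss_probability_bounds(n, r):
    """Return ((1 - k/n)^s, e^(-ks/n), n^(-6)) for k = n/r and s = ceil(6 r ln n)."""
    k = n / r                       # a dangerous triangle meets at least n/r lines
    s = ceil(6 * r * log(n))        # assumed sample size, so that k*s/n >= 6 ln n
    exact = (1 - k / n) ** s        # chance that s draws all miss k fixed lines
    exponential = exp(-k * s / n)   # bound obtained from 1 + x <= e^x
    return exact, exponential, n ** -6

for n, r in ((1000, 10), (10_000, 50)):
    exact, exponential, target = miss_probability_bounds(n, r)
    print(n, r, exact <= exponential <= target)
```

Since there are fewer than $n^6$ interesting triangles, a bound of $n^{-6}$ per triangle keeps the total failure probability below 1, which is all the union bound in the proof requires.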
Exercises
1. Calculate the exact expected size of S, a sample drawn from n elements
by s independent random draws with replacements. 0
2. Calculate the number of (generalized) triangles arising by triangulating
an arrangement of n lines in the plane in general position. (First, specify
how exactly the unbounded cells are triangulated.) 0
3. (A cutting lemma for circles) Consider a set K of n circles in the plane.
Select a sample $S \subseteq K$ by s independent random draws with replacement.
Consider the arrangement of S, and construct its vertical decomposition;
that is, from each vertex extend vertical segments upwards and downwards
until they hit a circle of S (or all the way to infinity). Similarly
extend vertical segments from the leftmost and rightmost points of each
circle.
2 The unbounded triangles have only 1 or 2 vertices, but they are completely
determined by their two unbounded rays, and so their number is at most $n^2$.
(a) Show that this partitions the plane into $O(s^2)$ "circular trapezoids"
(shapes bounded by at most two vertical segments and at most two circular
arcs). I2l
(b) Show that for $s = Cr \ln n$ with a sufficiently large constant C, there
is a positive probability that the sample S intersects all the dangerous
interesting circular trapezoids, where "dangerous" and "interesting" are
defined analogously to the definitions in the proof of the weaker version
of the cutting lemma. 0
4. Using Exercises 3 and 4.5.1, show that the number of unit distances
determined by n points in the plane is $O(n^{4/3} \log^{2/3} n)$. 0
5. Using Exercises 3 and 4.5.2, show that $I_{\mathrm{circ}}(n, n) = O(n^{1.4} \log^{c} n)$ (for
some constant c), where $I_{\mathrm{circ}}(m, n)$ is the maximum possible number of
incidences between m points and n arbitrary circles in the plane. 0
The level k is drawn thick, while the thin segments are pieces of lines of L
and they do not belong to the level k.
Let $e_0, e_1, \ldots, e_t$ be the edges of $E_k$ numbered from left to right; $e_0$ and
$e_t$ are the unbounded rays. Let us fix a point $p_i$ in the interior of $e_i$. For an
integer parameter $q \ge 2$, we define the q-simplification of the level k as the
monotone polygonal line containing the left part of $e_0$ up to the point $p_0$, the
segments $p_0p_q, p_qp_{2q}, \ldots, p_{\lfloor (t-1)/q \rfloor q}\,p_t$, and the part of $e_t$ to the right of $p_t$.
Thus, the q-simplification has at most $\lceil t/q \rceil + 2$ edges. Here is an illustration for
t = 9, q = 4:
(We could have defined the q-simplification by connecting every qth vertex
of the level, but the present way makes some future considerations a little
simpler.)
4.7.1 Lemma.
(i) The portion $\Pi$ of the level k (considered as a polygonal line) between the
points $p_j$ and $p_{j+q}$ is intersected by at most q+1 lines of L.
(ii) The segment $p_jp_{j+q}$ is intersected by at most q+1 lines of L.
(iii) The q-simplification of the level k is contained in the strip between the
levels $k - \lceil q/2 \rceil$ and $k + \lceil q/2 \rceil$.
Proof of the cutting lemma for lines in general position. Let r be the
given parameter. If $r = \Omega(n)$, then it suffices to produce a 0-cutting of size
$O(n^2)$ by simply triangulating the arrangement of L. Hence we may assume
that r is much smaller than n.
Set $q = \lceil n/10r \rceil$. Divide the levels $E_0, E_1, \ldots, E_{n-1}$ into q groups: The
ith group contains all $E_j$ with j congruent to i modulo q (i = 0, 1, ..., q-1).
Since the total number of edges in the arrangement is $n^2$, there is an i such
that the ith group contains at most $n^2/q$ edges. We fix one such i; from now
on, we consider only the levels i, q+i, 2q+i, ..., and we construct the desired
$\frac{1}{r}$-cutting from them.
Let $P_j$ be the q-simplification of the level jq+i. If $E_{jq+i}$ has $m_j$ edges,
then $P_j$ has at most $m_j/q + 3$ edges, and the total number of edges of the $P_j$,
$j = 0, 1, \ldots, \lfloor (n-1)/q \rfloor$, can be estimated by $n^2/q^2 + 3(n/q + 1) = O(n^2/q^2)$.
We note that the polygonal chains $P_j$ never intersect properly: If they did, a
vertex of some $P_j$, which has level qj+i, would be above $P_{j+1}$, and this is
ruled out by Lemma 4.7.1(iii).
We form the vertical decomposition for the $P_j$; that is, we extend vertical
segments from each vertex of $P_j$ upwards and downwards until they hit $P_{j-1}$
and $P_{j+1}$:
Exercises
1. (a) Verify that each trapezoid arising in the described construction is
intersected by at most 2.5q+O(1) lines. Setting q appropriately, show that
the plane can be subdivided into $12.5r^2 + O(r)$ trapezoids, each intersected
by at most $\frac{n}{r}$ lines, assuming $1 \ll r \ll n$. ~
(b) Improve the bounds from (a) to 2q+O(1) and $8r^2 + O(r)$, respectively.
[!J
5
Convex Polytopes
Convex polytopes are convex hulls of finite point sets in $\mathbf{R}^d$. They constitute
the most important class of convex sets with an enormous number of applications
and connections.
Three-dimensional convex polytopes, especially the regular ones, have
fascinated people since antiquity. Their investigation was one of
the main sources of the theory of planar graphs, and thanks to this well-
developed theory they are quite well understood. But convex polytopes in
dimension 4 and higher are considerably more challenging, and a surprisingly
deep theory, mainly of algebraic nature, was developed in attempts to under-
stand their structure.
A strong motivation for the study of convex polytopes comes from prac-
tically significant areas such as combinatorial optimization, linear program-
ming, and computational geometry. Let us look at a simple example illus-
trating how polytopes can be associated with combinatorial objects. The
3-dimensional polytope in the picture
[Figure: a 3-dimensional polytope with vertices labeled by permutations of 1234, such as 2341, 1342, 4213, 3214.]
A nice interpretation of duality is obtained by working in $\mathbf{R}^{d+1}$ and identifying
the "primal" $\mathbf{R}^d$ with the hyperplane $\pi = \{x \in \mathbf{R}^{d+1} : x_{d+1} = 1\}$
and the "dual" $\mathbf{R}^d$ with the hyperplane $\rho = \{x \in \mathbf{R}^{d+1} : x_{d+1} = -1\}$. The
hyperplane dual to a point $a \in \pi$ is produced as follows: We construct the
hyperplane in $\mathbf{R}^{d+1}$ perpendicular to 0a and containing 0, and we intersect
it with $\rho$. Here is an illustration for d = 2:
In this way, the duality $\mathcal{D}_0$ can be naturally extended to k-flats in $\mathbf{R}^d$, whose
duals are (d-k-1)-flats. Namely, given a k-flat $f \subset \pi$, we consider the (k+1)-flat
F through 0 and f, we construct the orthogonal complement of F, and
we intersect it with $\rho$, obtaining $\mathcal{D}_0(f)$.
Let us consider the pentagon drawn above and place it so that the origin
lies in the interior. Let $v_i = \mathcal{D}_0(\ell_i)$, where $\ell_i$ is the line containing the side
$a_ia_{i+1}$. Then the points dual to the lines intersecting the pentagon $a_1a_2 \ldots a_5$
fill exactly the exterior of the convex pentagon $v_1v_2 \ldots v_5$:
This follows easily from the properties of duality listed below (of course, there
is nothing special about a pentagon here). Thus, the considered set of lines
can be nicely described in the dual plane. A similar passage from lines to
points or back is useful in many geometric or computational problems.
Properties of the duality transform. Let p be a point of $\mathbf{R}^d$ distinct
from the origin and let h be a hyperplane in $\mathbf{R}^d$ not containing the origin.
Let $h^-$ stand for the closed half-space bounded by h and containing the
origin, while $h^+$ denotes the other closed half-space bounded by h. That is,
if $h = \{x \in \mathbf{R}^d : \langle a, x \rangle = 1\}$, then $h^- = \{x \in \mathbf{R}^d : \langle a, x \rangle \le 1\}$.
5.1.3 Definition (Dual set). For a set $X \subseteq \mathbf{R}^d$, we define the set dual to
X, denoted by $X^*$, as follows:
$$X^* = \{y \in \mathbf{R}^d : \langle x, y \rangle \le 1 \text{ for all } x \in X\}.$$
Another common name used for the duality is polarity; the dual set would
then be called the polar set. Sometimes it is denoted by $X^\circ$.
Geometrically, $X^*$ is the intersection of all half-spaces of the form $\mathcal{D}_0(x)^-$
with $x \in X$. Or in other words, $X^*$ consists of the origin plus all points y
such that $X \subseteq \mathcal{D}_0(y)^-$. For example, if X is the pentagon $a_1a_2 \ldots a_5$ drawn
above, then $X^*$ is the pentagon $v_1v_2 \ldots v_5$.
For any set X, the set $X^*$ is obviously closed and convex and contains the
origin. Using the separation theorem (Theorem 1.2.4), it is easily shown that
for any set $X \subseteq \mathbf{R}^d$, the set $(X^*)^*$ is the closure of $\mathrm{conv}(X \cup \{0\})$. In particular,
for a closed convex set containing the origin we have $(X^*)^* = X$ (Exercise 3).
For a hyperplane h, the dual set $h^*$ is different from the point $\mathcal{D}_0(h)$.¹
For readers familiar with the duality of planar graphs, let us remark that
it is closely related to the geometric duality applied to convex polytopes in
$\mathbf{R}^3$. For example, the next drawing illustrates a planar graph and its dual
graph (dashed):
1 In the literature, however, the "star" notation is sometimes also used for the dual
point or hyperplane, so for a point p, the hyperplane $\mathcal{D}_0(p)$ would be denoted by
$p^*$, and similarly, $h^*$ may stand for $\mathcal{D}_0(h)$.
5.1 Geometric Duality 81
Later we will see that these are graphs of the 3-dimensional cube and of
the regular octahedron, which are polytopes dual to each other in the sense
defined above. A similar relation holds for all 3-dimensional polytopes and
their graphs.
Other variants of duality. The duality transform $\mathcal{D}_0$ defined above is just
one of a class of geometric transforms with similar properties. For some purposes,
other such transforms (dualities) are more convenient. A particularly
important duality, denoted by $\mathcal{D}$, corresponds to moving the origin to the
"minus infinity" of the $x_d$-axis (the $x_d$-axis is considered vertical). A formal
definition is as follows.
5.1.4 Definition (Another duality). A nonvertical hyperplane h can be
uniquely written in the form $h = \{x \in \mathbf{R}^d : x_d = a_1x_1 + \cdots + a_{d-1}x_{d-1} - a_d\}$.
We set $\mathcal{D}(h) = (a_1, \ldots, a_{d-1}, a_d)$. Conversely, the point $a = (a_1, \ldots, a_{d-1}, a_d)$
maps back to h.
The property (i) of Lemma 5.1.2 holds for this $\mathcal{D}$, and an analogue of (ii)
is:
(ii') A point p lies above a hyperplane h if and only if the point $\mathcal{D}(h)$ lies
above the hyperplane $\mathcal{D}(p)$.
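Definition 5.1.4 translates into a few lines of code. In the sketch below, both a point and a nonvertical hyperplane are stored as d-tuples: a point as its coordinates, a hyperplane as the coefficients $(a_1, \ldots, a_d)$ of $x_d = a_1x_1 + \cdots + a_{d-1}x_{d-1} - a_d$. Under this representation (an illustrative choice, not from the text) the duality simply reinterprets the same tuple, and the incidence-preservation property and property (ii') can be tested on random integer data.

```python
import random

def height_gap(point, hyperplane):
    """x_d of the point minus the height of the hyperplane above (x_1..x_{d-1})."""
    *xs, xd = point
    *a, ad = hyperplane
    return xd - (sum(ai * xi for ai, xi in zip(a, xs)) - ad)

def incident(point, hyperplane):
    return height_gap(point, hyperplane) == 0

def above(point, hyperplane):
    return height_gap(point, hyperplane) > 0

def dual(obj):
    """D reads the same d-tuple as a hyperplane instead of a point, or vice versa."""
    return tuple(obj)

random.seed(0)
samples = [
    (tuple(random.randint(-5, 5) for _ in range(3)),
     tuple(random.randint(-5, 5) for _ in range(3)))
    for _ in range(1000)
]
all_ok = all(
    incident(p, h) == incident(dual(h), dual(p))
    and above(p, h) == above(dual(h), dual(p))
    for p, h in samples
)
print(all_ok)
```

Both properties reduce to the fact that the quantity $p_d + a_d - \sum_{i<d} a_i p_i$ is symmetric in p and a.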
Exercises
1. Let $C = \{x \in \mathbf{R}^d : |x_1| + \cdots + |x_d| \le 1\}$. Show that $C^*$ is the d-dimensional
cube $\{x \in \mathbf{R}^d : \max |x_i| \le 1\}$. Picture both bodies for d = 3. 0
2. Prove the assertion made in the text about the lines intersecting a convex
pentagon. 0
3. Show that for any $X \subseteq \mathbf{R}^d$, $(X^*)^*$ equals the closure of $\mathrm{conv}(X \cup \{0\})$,
where $X^*$ stands for the dual set to X. 0
4. Let $C \subseteq \mathbf{R}^d$ be a convex set. Prove that $C^*$ is bounded if and only if 0
lies in the interior of C. 0
5. Show that $C = C^*$ if and only if C is the unit ball centered at the origin.
o
6. (a) Let $C = \mathrm{conv}(X) \subseteq \mathbf{R}^d$. Prove that $C^* = \bigcap_{x \in X} \mathcal{D}_0(x)^-$. 0
(b) Show that if $C = \bigcap_{h \in H} h^-$, where H is a collection of hyperplanes not
passing through 0, and if C is bounded, then $C^* = \mathrm{conv}\{\mathcal{D}_0(h) : h \in H\}$.
o
(c) What is the right analogue of (b) if C is unbounded? 0
7. What is the dual set $h^*$ for a hyperplane h, and what about $h^{**}$? 0
one to handle large real-world problems. This algorithmic question is not yet
satisfactorily solved. Moreover, in some cases the number of required half-
spaces may be astronomically large compared to the number n of points, as
we will see later in this chapter.
As another illustration of the computational difference between V-po-
lytopes and H-polytopes, we consider the maximization of a given linear
function over a given polytope. For V-polytopes it is a trivial problem, since
it suffices to substitute all points of V into the given linear function and select
the maximum of the resulting values. But maximizing a linear function over
the intersection of a collection of half-spaces is the basic problem of linear
programming, and it is certainly nontrivial.
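The V-polytope case can be sketched in a few lines. This is an illustrative example only (the function name and the sample data are assumptions, not from the text): a linear function attains its maximum over conv(V) at a point of V, so it suffices to evaluate it at all points of V.

```python
from itertools import product

def maximize_over_v_polytope(points, c):
    """Return (best value, a maximizing point) of x -> <c, x> over conv(points)."""
    def value(x):
        return sum(ci * xi for ci, xi in zip(c, x))
    best = max(points, key=value)
    return value(best), best

# Example: the 3-dimensional cube given as the convex hull of {-1, 1}^3.
cube_vertices = list(product((-1, 1), repeat=3))
print(maximize_over_v_polytope(cube_vertices, (2, -1, 3)))
```

The running time is linear in the number of points of V, which is what makes the problem trivial in this representation.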
Terminology. The usual terminology does not distinguish V-polytopes and
H-polytopes. A convex polytope means a point set in Rd that is a V-polytope
(and thus also an H-polytope). An arbitrary, possibly unbounded, H-poly-
hedron is called a convex polyhedron. All polytopes and polyhedra considered
in this chapter are convex, and so the adjective "convex" is often omitted.
The dimension of a convex polyhedron P is the dimension of its affine hull.
It is the smallest dimension of a Euclidean space containing a congruent copy
of P.
Basic examples. One of the easiest classes of polytopes is that of cubes.
The d-dimensional cube as a point set is the Cartesian product $[-1, 1]^d$.
[Figure: the cubes for d = 1, 2, 3.]
As a V-polytope, the d-dimensional cube is the convex hull of the set $\{-1, 1\}^d$
($2^d$ points), and as an H-polytope, it can be described by the inequalities
$-1 \le x_i \le 1$, $i = 1, 2, \ldots, d$, i.e., by $2d$ half-spaces. We note that it is also
the unit ball of the maximum norm $\|x\|_\infty = \max_i |x_i|$.
Another important example is the class of crosspolytopes (or generalized
octahedra). The d-dimensional crosspolytope is the convex hull of the
"coordinate cross," i.e., $\mathrm{conv}\{e_1, -e_1, e_2, -e_2, \ldots, e_d, -e_d\}$, where $e_1, \ldots, e_d$ are
the vectors of the standard orthonormal basis.
[Figure: the crosspolytopes for d = 1, 2, 3.]
It is also the unit ball of the $\ell_1$-norm $\|x\|_1 = \sum_{i=1}^{d} |x_i|$. As an H-polytope,
it can be expressed by the $2^d$ half-spaces of the form $\langle \sigma, x \rangle \le 1$, where $\sigma$ runs
through all vectors in $\{-1, 1\}^d$.
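The two descriptions of the crosspolytope can be compared on a small instance. The sketch below (helper names are illustrative) builds the $2d$ vertices and the $2^d$ sign vectors, checks that every vertex satisfies every inequality $\langle \sigma, x \rangle \le 1$, and counts how many vertices lie on each bounding hyperplane.

```python
from itertools import product

def crosspolytope_vertices(d):
    """The 2d points e_1, -e_1, ..., e_d, -e_d."""
    return [tuple(sign if j == i else 0 for j in range(d))
            for i in range(d) for sign in (1, -1)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def check(d):
    verts = crosspolytope_vertices(d)
    sigmas = list(product((-1, 1), repeat=d))   # one half-space per sign vector
    feasible = all(dot(s, v) <= 1 for s in sigmas for v in verts)
    # number of vertices on each bounding hyperplane <sigma, x> = 1
    tight = {sum(dot(s, v) == 1 for v in verts) for s in sigmas}
    return feasible, len(verts), len(sigmas), tight

print(check(3))   # octahedron: 6 vertices, 8 half-spaces, 3 vertices per facet
```

Each bounding hyperplane contains exactly d vertices, consistent with every facet of the crosspolytope being a simplex.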
The polytopes with the smallest possible number of vertices (for a given
dimension) are called simplices.
[Figure: a simplex; the surviving vertex labels are (0, 0, 1) and (1, 0, 0).]
Exercises
1. Verify that a d-dimensional simplex in $\mathbf{R}^d$ can be expressed as the
intersection of d+1 half-spaces.
2. (a) Show that every convex polytope in $\mathbf{R}^d$ is an orthogonal projection
of a simplex of a sufficiently large dimension onto the space $\mathbf{R}^d$ (which
we consider embedded as a d-flat in some $\mathbf{R}^n$).
(b) Prove that every convex polytope P symmetric about 0 (i.e., with
P = -P) is the affine image of a crosspolytope of a sufficiently large
dimension.
For polytopes in $\mathbf{R}^3$, the graph is always planar: Project the polytope from an
interior point onto a circumscribed sphere, and then make a "cartographic
map" of this sphere, say by stereographic projection. Moreover, it can be
shown that the graph is vertex 3-connected. (A graph G is called vertex
k-connected if $|V(G)| \ge k+1$ and deleting any at most k-1 vertices leaves G
connected.) Nicely enough, these properties characterize graphs of convex
3-polytopes:
5.3.3 Theorem (Steinitz theorem). A finite graph is isomorphic to the
graph of a 3-dimensional convex polytope if and only if it is planar and vertex
3-connected.
We omit a proof of the considerably harder "if" part (exhibiting a poly-
tope for every vertex 3-connected planar graph); all known proofs are quite
complicated.
Graphs of higher-dimensional polytopes probably have no nice description
comparable to the 3-dimensional case, and it is likely that the problem of
deciding whether a given graph is isomorphic to a graph of a 4-dimensional
convex polytope is NP-hard. It is known that the graph of every d-dimen-
sional polytope is vertex d-connected (Balinski's theorem), but this is only a
necessary condition.
Examples. A d-dimensional simplex has been defined as the convex hull of
a (d+1)-point affinely independent set V. It is easy to see that each subset of
V determines a face of the simplex. Thus, there are $\binom{d+1}{k+1}$ faces of dimension
k, k = -1, 0, ..., d, and $2^{d+1}$ faces in total.
The d-dimensional crosspolytope has $V = \{e_1, -e_1, \ldots, e_d, -e_d\}$ as the
vertex set. A proper subset $F \subset V$ determines a face if and only if there is
no i such that both $e_i \in F$ and $-e_i \in F$ (Exercise 2). It follows that there
are $3^d + 1$ faces, including the empty one and the whole crosspolytope.
The nonempty faces of the d-dimensional cube $[-1, 1]^d$ correspond to
vectors $v \in \{-1, 1, 0\}^d$. The face corresponding to such v has the vertex
set $\{u \in \{-1, 1\}^d : u_i = v_i \text{ for all } i \text{ with } v_i \ne 0\}$. Geometrically, the vector v
is the center of gravity of its face.
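The three face counts can be confirmed by brute force for small d. This is a sketch under the combinatorial descriptions just given (it takes those descriptions as input rather than proving them); all function names are illustrative.

```python
from itertools import combinations, product
from math import comb

def simplex_face_count(d):
    """Sum of binom(d+1, k+1) over k = -1, 0, ..., d."""
    return sum(comb(d + 1, k + 1) for k in range(-1, d + 1))

def crosspolytope_face_count(d):
    """Count vertex subsets with no antipodal pair, plus the whole crosspolytope."""
    verts = [(i, s) for i in range(d) for s in (1, -1)]   # (axis, sign) encodes +-e_i
    count = 0
    for r in range(len(verts) + 1):
        for F in combinations(verts, r):
            axes = [i for i, _ in F]
            if len(set(axes)) == len(axes):               # no e_i together with -e_i
                count += 1
    return count + 1

def cube_faces(d):
    """Map each v in {-1, 0, 1}^d to the vertex set of the corresponding face."""
    return {v: [u for u in product((-1, 1), repeat=d)
                if all(vi == 0 or ui == vi for ui, vi in zip(u, v))]
            for v in product((-1, 0, 1), repeat=d)}

print(simplex_face_count(3), crosspolytope_face_count(3), len(cube_faces(3)))
```

For d = 3 this reports 16 faces of the tetrahedron, 28 of the octahedron, and 27 nonempty faces of the cube; averaging the vertices of any cube face returns the vector v that labels it.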
The face lattice. Let $\mathcal{F}(P)$ be the set of all faces of a (bounded) convex
polytope P (including the empty face $\emptyset$ of dimension -1). We consider the
partial ordering of $\mathcal{F}(P)$ by inclusion.
5.3 Faces of a Convex Polytope 87
[Figure: a polytope P and the diagram of its face lattice.]
The vertices are numbered 1-5, and the faces are labeled by the vertex sets.
The face lattice is graded, meaning that every maximal chain has the same
length (the rank of a face F is dim(F)+1). Quite obviously, it is atomic: Every
face is the join of its vertices. A little less obviously, it is coatomic; that is,
every face is the meet (intersection) of the facets containing it. An important
consequence is that the combinatorial type of a polytope is determined by the
vertex-facet incidences. More precisely, if we know the dimension and all
subsets of vertices that are vertex sets of facets (but without knowing the
coordinates of the vertices, of course), we can uniquely reconstruct the whole
face lattice in a simple and purely combinatorial way.
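One way to carry out such a reconstruction is sketched below. It relies only on coatomicity: the faces are exactly the intersections of families of facets, so it closes the facet vertex sets under intersection. The example polytope (a pyramid over a square with base 1, 2, 3, 4 and apex 5) and the function names are assumptions chosen for illustration.

```python
from itertools import combinations

def face_lattice(facets):
    """All faces (as frozensets of vertices) generated from the facets' vertex sets."""
    facets = [frozenset(f) for f in facets]
    faces = set(facets)
    faces.add(frozenset().union(*facets))      # the polytope itself
    changed = True
    while changed:                             # close the family under intersection
        changed = False
        for f, g in combinations(list(faces), 2):
            if f & g not in faces:
                faces.add(f & g)
                changed = True
    return faces

def ranks(faces):
    """Rank of a face = length of the longest chain below it (dim = rank - 1)."""
    order = sorted(faces, key=len)             # proper subsets come earlier
    rank = {}
    for f in order:
        below = [rank[g] for g in order if g < f and g in rank]
        rank[f] = 1 + max(below) if below else 0
    return rank

pyramid = [{1, 2, 3, 4}, {1, 2, 5}, {2, 3, 5}, {3, 4, 5}, {4, 1, 5}]
faces = face_lattice(pyramid)
r = ranks(faces)
print([sum(1 for f in faces if r[f] == k) for k in range(5)])
```

The printed list gives the number of faces of each rank, from the empty face up to the whole pyramid.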
Face lattices of convex polytopes have several other nice properties, but no
full algebraic characterization is known, and the problem of deciding whether
Exercises
1. Verify that if $V \subset \mathbf{R}^d$ is affinely independent, then each subset $F \subseteq V$
determines a face of the simplex conv(V).
2. Verify the description of the faces of the cube and of the crosspolytope
given in the text.
3. Consider the (n-1)-dimensional permutahedron as defined in the
introduction to this chapter.
(a) Verify that it really has n! vertices corresponding to the permutations
of {1, 2, ..., n}.
(b) Describe all faces of the permutahedron combinatorially (what sets
of permutations are vertex sets of faces?).
(c) Determine the dimensions of the faces found in (b). In particular, show
that the facets correspond to ordered partitions (A, B) of {1, 2, ..., n},
$A, B \ne \emptyset$, and count them.
4. Let $P = \mathrm{conv}\{\pm e_i \pm e_j : i, j = 1, 2, 3, 4,\ i \ne j\} \subset \mathbf{R}^4$, where $e_1, \ldots, e_4$ is
the standard basis (this P is called the 24-cell). Describe the face lattice
of P and prove that P is combinatorially equivalent to $P^*$ (in fact, P can
be obtained from $P^*$ by an isometry and scaling).
5. Using Proposition 5.3.2, prove the following:
(a) If F is a face of a convex polytope P, then F is the intersection of P
with the affine hull of F.
(b) If F and G are faces of a convex polytope P, then $F \cap G$ is a face,
too.
6. Let P be a convex polytope in $\mathbf{R}^3$ containing the origin as an interior
point, and let F be a j-face of P, j = 0, 1, 2.
(a) Give a precise definition of the face F' of the dual polytope $P^*$
corresponding to F (i.e., describe F' as a subset of $\mathbf{R}^3$).
(b) Verify that F' is indeed a face of $P^*$.
7. Let $V \subset \mathbf{R}^d$ be the vertex set of a convex polytope and let $U \subset V$. Prove
that U is the vertex set of a face of conv(V) if and only if the affine hull
of U is disjoint from $\mathrm{conv}(V \setminus U)$.
8. Prove that the graph of any 3-dimensional convex polytope is 3-connected;
i.e., removing any 2 vertices leaves the graph connected.
9. Let C be a convex set. Call a point $x \in C$ exposed if there is a hyperplane
h with $C \cap h = \{x\}$ and all the rest of C on one side. For convex polytopes,
exposed points are exactly the vertices, and we have shown that any
extremal point is also exposed. Find an example of a compact convex set
$C \subset \mathbf{R}^2$ with an extremal point that is not exposed.
10. (On extremal points) For a set $X \subseteq \mathbf{R}^d$, let $\mathrm{ex}(X) = \{x \in X : x \notin
\mathrm{conv}(X \setminus \{x\})\}$ denote the set of extremal points of X.
(a) Find a convex set $C \subseteq \mathbf{R}^d$ with $C \ne \mathrm{conv}(\mathrm{ex}(C))$.
(b) Find a compact convex $C \subseteq \mathbf{R}^3$ for which ex(C) is not closed.
(c) By modifying the proof of Theorem 5.2.2, prove that $C = \mathrm{conv}(\mathrm{ex}(C))$
for every compact convex $C \subset \mathbf{R}^d$ (this is a finite-dimensional version of
the well known Krein-Milman theorem).
As a corollary, we see that every d points of the moment curve are affinely
independent, for otherwise, we could pass a hyperplane through them plus
one more point of $\gamma$. So the moment curve readily supplies explicit examples
of point sets in general position.
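General position of points on the moment curve can be tested exactly on small instances. The sketch below assumes the usual parametrization $\gamma(t) = (t, t^2, \ldots, t^d)$ and tests the strongest form of the claim, namely that no d+1 of the chosen points lie on a common hyperplane; this amounts to a Vandermonde-type determinant being nonzero. Function names are illustrative, and the determinant routine is plain Gaussian elimination over the rationals.

```python
from fractions import Fraction
from itertools import combinations

def moment_point(t, d):
    """The point (t, t^2, ..., t^d) of the moment curve."""
    return tuple(Fraction(t) ** k for k in range(1, d + 1))

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result                   # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

def in_general_position(points):
    """True if no d+1 of the points lie on a common hyperplane."""
    d = len(points[0])
    return all(det([(1,) + tuple(p) for p in subset]) != 0
               for subset in combinations(points, d + 1))

print(in_general_position([moment_point(t, 3) for t in range(1, 8)]))
```

A contrasting input such as three collinear points in the plane makes the same function return False.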
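$$\binom{n - \lfloor d/2 \rfloor}{\lfloor d/2 \rfloor} + \binom{n - \lfloor d/2 \rfloor - 1}{\lfloor d/2 \rfloor - 1} \quad \text{for } d \text{ even, and}$$
$$2\binom{n - \lfloor d/2 \rfloor - 1}{\lfloor d/2 \rfloor} \quad \text{for } d \text{ odd.}$$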
Proof. The number of facets equals the number of ways of placing d black
circles and n - d white circles in a row in such a way that we have an even
number of black circles between each two white circles.
Let us say that an arrangement of black and white circles is paired if any
contiguous segment of black circles has an even length (the arrangements
permitted by Gale's criterion need not be paired because of the initial and
final segments). The number of paired arrangements of 2k black circles and
n - 2k white circles is $\binom{n-k}{k}$, since by deleting every second black circle we
get a one-to-one correspondence with selections of the positions of k black
circles among n - k possible positions.
Let us return to the original problem, and first consider an odd d = 2k+ 1.
In a valid arrangement of circles, we must have an odd number of consecutive
black circles at the beginning or at the end (but not both). In the former case,
we delete the initial black circle, and we get a paired arrangement of 2k black
and n-1- 2k white circles. In the latter case, we similarly delete the black
circle at the end and again get a paired arrangement as in the first case. This
establishes the formula in the theorem for odd d.
For even d = 2k, the number of initial consecutive black circles is either
odd or even. In the even case, we have a paired arrangement, which
contributes $\binom{n-k}{k}$ possibilities. In the odd case, we also have an odd number
of consecutive black circles at the end, and so by deleting the first and
last black circles we obtain a paired arrangement of 2(k-1) black circles and
n-2k white circles. This contributes $\binom{n-k-1}{k-1}$ possibilities. □
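The count in this proof is easy to confirm by enumeration. The following sketch (function names are illustrative) lists all rows of d black and n - d white circles, keeps those in which every run of black circles lying between two white circles has even length, and compares the result with the closed formula $\binom{n-k}{k} + \binom{n-k-1}{k-1}$ for d = 2k and $2\binom{n-k-1}{k}$ for d = 2k+1.

```python
from itertools import combinations
from math import comb

def count_by_evenness(n, d):
    """Number of rows with d black and n-d white circles whose interior black runs are even."""
    count = 0
    for blacks in combinations(range(n), d):
        row = ["W"] * n
        for i in blacks:
            row[i] = "B"
        runs = "".join(row).split("W")
        interior = runs[1:-1]                 # runs bounded by white circles on both sides
        if all(len(run) % 2 == 0 for run in interior):
            count += 1
    return count

def facets_formula(n, d):
    k = d // 2
    if d % 2 == 0:
        return comb(n - k, k) + comb(n - k - 1, k - 1)
    return 2 * comb(n - k - 1, k)

for d in range(2, 6):
    for n in range(d + 1, 10):
        assert count_by_evenness(n, d) == facets_formula(n, d)
print(facets_formula(6, 4), facets_formula(6, 3))
```

For d = 2 the formula gives n, the number of sides of a convex n-gon, which is a quick sanity check.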
Exercises
1. (a) Show that if V is a finite subset of the moment curve, then all the
points of V are extreme in conv(V); that is, they are vertices of the
corresponding cyclic polytope.
(b) Show that any two cyclic polytopes in $\mathbf{R}^d$ with n vertices are
combinatorially the same: They have isomorphic face lattices. Thus, we can
speak of the cyclic polytope.
2. (Another curve like $\gamma$) Let $\beta \subset \mathbf{R}^d$ be the curve
$\{(\frac{1}{t+1}, \frac{1}{t+2}, \ldots, \frac{1}{t+d}) : t \in \mathbf{R},\ t > 0\}$. Show that any hyperplane intersects $\beta$ in at most d points
(and if there are d intersections, then there is no tangency), and conclude
that any n distinct points on $\beta$ form the vertex set of a polytope
combinatorially isomorphic to the cyclic polytope. (Let us remark that
many other curves have these properties as well; the moment curve is
just the most convenient example.)
3. (Universality of the cyclic polytope)
(a) Let $x_1, \ldots, x_n$ be points in $\mathbf{R}^d$. Let $y_i$ denote the vector arising by
appending 1 as the (d+1)st component of $x_i$. Show that if the determinants
of all matrices with columns $y_{i_1}, \ldots, y_{i_{d+1}}$, for all choices of indices
$i_1 < i_2 < \cdots < i_{d+1}$, have the same nonzero sign, then $x_1, \ldots, x_n$ form
the vertex set of a convex polytope combinatorially equivalent to the
n-vertex cyclic polytope in $\mathbf{R}^d$.
(b) Show that for any integers n and d there exists N such that among any
N points in $\mathbf{R}^d$ in general position, one can choose n points forming the
vertex set of a convex polytope combinatorially equivalent to the n-vertex
cyclic polytope. (This can be seen as a d-dimensional generalization of
the Erdős-Szekeres theorem.)
4. Prove that if n is sufficiently large in terms of d, then for every set of
n points in $\mathbf{R}^d$ in general position, one can choose d+1 simplices of
dimension d with vertices at some of these points such that any hyperplane
avoids at least one of these simplices. Use Exercise 3.
We have exhibited at least one $\lceil d/2 \rceil$-face for which v is the lowest vertex or
the highest vertex. Since the lowest vertex and the highest vertex are unique
for each face, the number of vertices is no more than twice the number of
$\lceil d/2 \rceil$-faces. □
Proof. The basic idea is very simple: Move (perturb) every vertex of P by a
very small amount, in such a way that the vertices are in general position, and
show that each k-face of P gives rise to at least one k-face of the perturbed
polytope. There are several ways of doing this proof.
We process the vertices one by one. Let V be the vertex set of P and
let $v \in V$. The operation of $\varepsilon$-pushing v is as follows: We choose a point v'
lying in the interior of P, at distance at most $\varepsilon$ from v, and on no hyperplane
determined by the points of V, and we set $V' = (V \setminus \{v\}) \cup \{v'\}$. If we
successively $\varepsilon_v$-push each vertex v of the polytope, the resulting vertex set is
in general position and we have a simplicial polytope.
It remains to show that for any polytope P with vertex set V and any
$v \in V$, there is an $\varepsilon > 0$ such that $\varepsilon$-pushing v does not decrease the number
of faces.
Let $U \subset V$ be the vertex set of a k-face of P, $0 \le k \le d-1$, and let V'
arise from V by $\varepsilon$-pushing v. If $v \notin U$, then no doubt, U determines a face of
conv(V'), and so we assume that $v \in U$. First suppose that v lies in the affine
hull of $U \setminus \{v\}$; we claim that then $U \setminus \{v\}$ determines a k-face of conv(V').
This follows easily from the criterion in Exercise 5.3.7: A subset $U \subset V$ is the
vertex set of a face of conv(V) if and only if the affine hull of U is disjoint
from $\mathrm{conv}(V \setminus U)$. We leave a detailed argument to the reader (one must use
the fact that v is pushed inside).
If v lies outside of the affine hull of $U \setminus \{v\}$, then we want to show that
$U' = (U \setminus \{v\}) \cup \{v'\}$ determines a k-face of conv(V'). The affine hull of U
is disjoint from the compact set $\mathrm{conv}(V \setminus U)$. If we move v continuously by
a sufficiently small amount, the affine hull of U moves continuously, and so
there is an $\varepsilon > 0$ such that if we move v within $\varepsilon$ from its original position,
the considered affine hull and $\mathrm{conv}(V \setminus U)$ remain disjoint. □
The h-vector and such. Here we introduce some notions extremely useful
for deeper study of the f-vectors of convex polytopes. In particular, they are
crucial in proofs of the (exact) upper bound theorem.
Let us go back to the setting of the proof of Proposition 5.5.3. There we
considered a simple polytope that used to be called P* but now, for simplicity,
let us call it P. It is positioned in $\mathbf{R}^d$ in such a way that no edge is horizontal,
and so for each vertex v, there are some $i_v$ edges going upwards and $d - i_v$
edges going downwards.
The central definition is this: The h-vector of P is $(h_0, h_1, \ldots, h_d)$, where
$h_i$ is the number of vertices v with exactly i edges going upwards. So, for
example, we have $h_0 = h_d = 1$.
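As a small illustration of the definition, the h-vector of the cube $[-1, 1]^d$ (a simple polytope) can be computed directly by counting upward edges. The sketch below assumes a height function $x \mapsto \langle c, x \rangle$ with all $c_j \ne 0$, which guarantees that no edge of the cube is horizontal; the function name and the particular choices of c are illustrative.

```python
from itertools import product

def h_vector_of_cube(d, c):
    """h[i] = number of vertices of [-1, 1]^d with exactly i incident edges going upwards,
    where the height of x is <c, x>."""
    assert all(cj != 0 for cj in c), "no edge may be horizontal"
    h = [0] * (d + 1)
    for v in product((-1, 1), repeat=d):
        # The neighbor across coordinate j is obtained by flipping v_j;
        # this changes the height by -2 * c_j * v_j.
        up = sum(1 for j in range(d) if -2 * c[j] * v[j] > 0)
        h[up] += 1
    return h

print(h_vector_of_cube(3, (1, 2, 3)))
print(h_vector_of_cube(3, (1, -2, 0.5)))
```

Both choices of direction give the same vector, and its first and last entries equal 1.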
Next, we relate the h-vector to the f-vector. Each vertex v is the lowest
vertex for exactly $\binom{i_v}{k}$ faces of dimension k, and each k-face has exactly one
lowest vertex, and so
$$f_k = \sum_{i=0}^{d} \binom{i}{k} h_i \qquad (5.1)$$
(for i < k we have $\binom{i}{k} = 0$). So the h-vector determines the f-vector. Less
obviously, the h-vector can be uniquely reconstructed from the f-vector! A
quick way of seeing this is via generating functions: If f(x) is the polynomial
$\sum_{k=0}^{d} f_k x^k$ and $h(x) = \sum_{i=0}^{d} h_i x^i$, then (5.1) translates to f(x) = h(x+1),
5.5 The Upper Bound Theorem 103
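and therefore h(x) = f(x-1). Explicitly,
$$h_i = \sum_{k=0}^{d} (-1)^{i+k} \binom{k}{i} f_k. \qquad (5.2)$$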
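The passage between the two vectors can be sketched numerically. The code below implements (5.1) in one direction and, in the other, the expansion of $h(x) = f(x-1)$, which gives $h_i = \sum_{k \ge i} (-1)^{k-i} \binom{k}{i} f_k$. The convention that the f-vector is listed as $(f_0, \ldots, f_d)$ with $f_d = 1$, the function names, and the sample polytopes are assumptions made for this illustration.

```python
from math import comb

def f_from_h(h):
    """Relation (5.1): f_k = sum_i binom(i, k) h_i, for k = 0, ..., d."""
    d = len(h) - 1
    return [sum(comb(i, k) * h[i] for i in range(d + 1)) for k in range(d + 1)]

def h_from_f(f):
    """Coefficients of h(x) = f(x - 1)."""
    d = len(f) - 1
    return [sum((-1) ** (k - i) * comb(k, i) * f[k] for k in range(i, d + 1))
            for i in range(d + 1)]

cube = [8, 12, 6, 1]            # f_0, f_1, f_2, f_3 of the 3-dimensional cube
dodecahedron = [20, 30, 12, 1]  # another simple 3-polytope
print(h_from_f(cube), h_from_f(dodecahedron))
print(f_from_h(h_from_f(cube)))
```

In both examples the resulting h-vector reads the same forwards and backwards.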
We have defined the h-vector using one particular choice of the vertical
direction, but now we know that it is determined by the f-vector and thus
independent of the chosen direction. By turning P upside down, we see that
$h_i = h_{d-i}$ for all i.
[...]
$$h_j = \sum_{k=0}^{d} (-1)^{j+k} \binom{d-k}{d-j} f_{k-1}.$$
The upper bound theorem has the following neat reformulation in terms
of h-vectors: For any d-dimensional simplicial polytope with $f_0 = n$ vertices,
we have
$$h_i \le \binom{n-d+i-1}{i}, \qquad i = 0, 1, \ldots, \lfloor d/2 \rfloor. \qquad (5.3)$$
Proving the upper bound theorem is not one of our main topics, but an
outline of a proof can be found in this book. It starts in the next section
and finishes in Exercise 11.3.6, and it is not among the most direct possible
proofs. Deriving the upper bound theorem from (5.3) is a pure and direct
calculation, verifying that the h-vector of the cyclic polytope satisfies (5.3)
with equality. We omit this part.
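The equality case can at least be observed on a small example. The sketch below assumes the formula $h_j = \sum_{k=0}^{d} (-1)^{j+k} \binom{d-k}{d-j} f_{k-1}$ (with $f_{-1} = 1$) for the h-vector of a simplicial polytope, and it takes as given the f-vector (6, 15, 18, 9) of the 4-dimensional cyclic polytope with 6 vertices; this value is a standard fact supplied here for illustration, not derived in the text. Function names are illustrative.

```python
from math import comb

def h_vector_simplicial(f, d):
    """f = [f_0, ..., f_{d-1}] of a simplicial d-polytope; returns [h_0, ..., h_d]."""
    f_ext = [1] + list(f)                      # f_ext[k] = f_{k-1}, with f_{-1} = 1
    return [sum((-1) ** (j + k) * comb(d - k, d - j) * f_ext[k] for k in range(d + 1))
            for j in range(d + 1)]

def ubt_bound(n, d, i):
    """Right-hand side of (5.3)."""
    return comb(n - d + i - 1, i)

n, d = 6, 4
h = h_vector_simplicial([6, 15, 18, 9], d)
print(h)
print([ubt_bound(n, d, i) for i in range(d // 2 + 1)])
```

The first $\lfloor d/2 \rfloor + 1$ entries of h agree with the bounds, and the remaining entries mirror them.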
using the Gale transform discussed below in Section 5.6, can be found
in Welzl [Wel01] and in Exercises 11.3.5 and 11.3.6. Our exposition of
the asymptotic upper bound theorem is based on Seidel [Sei95].
The ordering of the vertices of a simple polytope P by their height
in the definition of the h-vector corresponds to a linear ordering of the
facets of P*. This ordering of the facets is a shelling. Shelling, even
in the strictly peaceful mathematical sense, is quite important, also
beyond the realm of convex polytopes. Let K be a finite cell complex
whose cells are convex polytopes (such as the boundary complex of a
convex polytope), and suppose that all maximal cells have the same
dimension k. Such K is called shellable if k = 0 or $k \ge 1$ and K has
a shelling. A shelling of K is an enumeration $F_1, F_2, \ldots, F_n$ of the
facets (maximum-dimension cells) of K such that (i) the boundary
complex of $F_1$ is shellable, and (ii) for every i > 1, there is a shelling
of the complex $F_i \cap \bigcup_{j=1}^{i-1} F_j$ that can be extended to a shelling of the
boundary complex of $F_i$. The boundary complex of a convex polytope
is homeomorphic to a sphere, and a shelling builds the sphere in such
a way that each new cell is glued by a contractible part of its boundary
to the previously built part, except for the last cell, which closes the
remaining hole.
McMullen's proof of the upper bound theorem does not generalize
to simplicial spheres (i.e., finite simplicial complexes homeomorphic
to spheres), for example because they need not be shellable, counter-
intuitive as this may look. The upper bound theorem for them was
proved by Stanley [Sta75] using much heavier algebraic and algebraic-
topological tools.
An interesting extension of the upper bound theorem was found
by Kalai [Kal91]. Let P be a simplicial d-dimensional polytope. All
proper faces of P are simplices, and so the boundary is a simplicial
complex. Let K be any subcomplex of the boundary (a subset of the
proper faces of P such that if $F \in K$, then all faces of F also lie in
K). The strong upper bound theorem, as Kalai's result is called, asserts
that if K has at most as many (d-1)-faces as the d-dimensional cyclic
polytope on n vertices, then K has at most as many k-faces as that
cyclic polytope, for all k = 0, 1, ..., d-1. (Note that we do not assume
that P has n vertices!) The proof uses methods developed for the
proof of the g-theorem mentioned below as well as Kalai's technique
of algebraic shifting.
Another major achievement concerning the f-vectors of polytopes
is the so-called g-theorem. The inventive name g-vector of a d-dimensional
simple polytope refers to the vector $(g_0, g_1, \ldots, g_{\lfloor d/2 \rfloor})$, where
$g_0 = h_0$ and $g_i = h_i - h_{i-1}$, $i = 1, 2, \ldots, \lfloor d/2 \rfloor$. The g-theorem
characterizes all possible integer vectors that can appear as the g-vector
of a d-dimensional simple (or simplicial) polytope. Since the g-vector
Exercises
1. (a) Let P be a k-dimensional convex polytope in $\mathbf{R}^k$, and Q an $\ell$-dimensional
convex polytope in $\mathbf{R}^\ell$. Show that the Cartesian product $P \times Q \subset
\mathbf{R}^{k+\ell}$ is a convex polytope of dimension $k + \ell$.
(b) If F is an i-face of P, and G is a j-face of Q, $i, j \ge 0$, then $F \times G$ is
an (i + j)-face of $P \times Q$. Moreover, this yields all the nonempty faces of
$P \times Q$.
(c) Using the product of suitable polytopes, find an example of a
"fat-lattice" polytope, i.e., a polytope for which the total number of faces has
a larger order of magnitude than the number of vertices plus the number
of facets together (the dimension should be a constant).
(d) Show that the following yields a 5-dimensional fat-lattice polytope:
The convex hull of two regular n-gons whose affine hulls are skew 2-flats
in $\mathbf{R}^5$.
For recent results on fat-lattice polytopes see Eppstein, Kuperberg, and
Ziegler [EKZ01].
The Gale transform assigns to a sequence $a = (a_1, a_2, \ldots, a_n)$ of $n \ge d+1$
points in $\mathbf{R}^d$ another sequence $\bar g = (\bar g_1, \bar g_2, \ldots, \bar g_n)$ of n points. The points
$\bar g_1, \bar g_2, \ldots, \bar g_n$ live in a different dimension, namely in $\mathbf{R}^{n-d-1}$. For example,
n points in the plane are transformed to n points in $\mathbf{R}^{n-3}$ and vice versa.
In the literature one finds many results about k-dimensional polytopes with
k+3 or k+4 vertices; this is because their vertex sets have a low-dimensional
Gale transform.
Let us stress that the Gale transform operates on sequences, not individual
points: We cannot say what $\bar g_1$ is without knowing all of $a_1, a_2, \ldots, a_n$. We
also require that the affine hull of the $a_i$ be the whole $\mathbf{R}^d$; otherwise, the
Gale transform is not defined. (On the other hand, we do not need any sort
of general position, and some of the $a_i$ may even coincide.)
The reader might wonder why the points of the Gale transform are written
with bars. This is to indicate that they should be interpreted as vectors
in a vector space, rather than as points in an affine space. As we will see,
"affine" properties of the sequence a, such as affine dependencies, correspond
to "linear" properties of the Gale transform, such as linear dependencies.
In order to obtain the Gale transform of a, we first convert the $a_i$ into
(d+1)-dimensional vectors: $\bar a_i \in \mathbf{R}^{d+1}$ is obtained from $a_i$ by appending a
(d+1)st coordinate equal to 1. This is the embedding $\mathbf{R}^d \to \mathbf{R}^{d+1}$ often used
for relating affine notions in $\mathbf{R}^d$ to linear notions in $\mathbf{R}^{d+1}$; see Section 1.1.
Let A be the $(d+1) \times n$ matrix with $\bar a_i$ as the ith column. Since we assume that
there are d+1 affinely independent points in a, the matrix A has rank d+1,
and so the vector space V generated by the rows of A is a (d+1)-dimensional
subspace of $\mathbf{R}^n$. We let $V^\perp$ be the orthogonal complement of V in $\mathbf{R}^n$; that is,
$V^\perp = \{w \in \mathbf{R}^n : \langle v, w \rangle = 0 \text{ for all } v \in V\}$. We have $\dim(V^\perp) = n-d-1$. Let
us choose some basis $(b_1, b_2, \ldots, b_{n-d-1})$ of $V^\perp$, and let B be the $(n-d-1) \times n$
matrix with $b_j$ as the jth row. Finally, we let $\bar g_i \in \mathbf{R}^{n-d-1}$ be the ith column
of B. The sequence $\bar g = (\bar g_1, \bar g_2, \ldots, \bar g_n)$ is the Gale transform of a. Here is a
pictorial summary:
[Figure: pictorial summary of the construction of B from A.]
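The construction can be followed step by step in code. The sketch below works over the rationals: it forms the matrix A whose rows are the d coordinate rows of the points plus a row of ones, computes a basis of the orthogonal complement of its row space by row reduction, and reads off the columns of B. The function names and the particular basis produced are choices of this illustration; any other basis differs by a nonsingular linear map.

```python
from fractions import Fraction

def null_space_basis(rows, n):
    """Basis of {w in Q^n : <r, w> = 0 for every row r}, via reduced row echelon form."""
    m = [[Fraction(x) for x in r] for r in rows]
    pivots, r = [], 0
    for c in range(n):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]             # scale pivot row to leading 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    basis = []
    for fcol in (c for c in range(n) if c not in pivots):
        w = [Fraction(0)] * n
        w[fcol] = Fraction(1)                          # one free variable set to 1
        for row_idx, pc in enumerate(pivots):
            w[pc] = -m[row_idx][fcol]
        basis.append(w)
    return basis

def gale_transform(points):
    """Return the vectors g_1, ..., g_n (the columns of B) for points a_1, ..., a_n in R^d."""
    n, d = len(points), len(points[0])
    A = [[p[j] for p in points] for j in range(d)] + [[1] * n]
    B = null_space_basis(A, n)
    if len(B) != n - d - 1:
        raise ValueError("the points do not affinely span R^d")
    return [tuple(b[i] for b in B) for i in range(n)]

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(gale_transform(square))
```

For the four vertices of a square, the Gale transform lives in $\mathbf{R}^1$; opposite vertices receive vectors of the same sign and adjacent vertices receive vectors of opposite signs.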
5.6.1 Observation.
(i) (The Gale transform is determined up to linear isomorphism) In the
construction of $\bar g$, we can choose an arbitrary basis of $V^\perp$. Choosing a
different basis corresponds to multiplying the matrix B from the left by a
nonsingular $(n-d-1) \times (n-d-1)$ matrix T (Exercise 1), and this means
transforming $(\bar g_1, \ldots, \bar g_n)$ by a linear isomorphism of $\mathbf{R}^{n-d-1}$.
5.6 The Gale Transform 109
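$\mathrm{LinVal}(\bar a) = \{(f(\bar a_1), f(\bar a_2), \ldots, f(\bar a_n)) : f\colon \mathbf{R}^{d+1} \to \mathbf{R} \text{ is a linear function}\}$,
$\mathrm{LinDep}(\bar a) = \{\alpha \in \mathbf{R}^n : \alpha_1 \bar a_1 + \alpha_2 \bar a_2 + \cdots + \alpha_n \bar a_n = 0\}$.
For a point sequence $a = (a_1, \ldots, a_n)$, we then let $\mathrm{AffVal}(a) = \mathrm{LinVal}(\bar a)$ and
$\mathrm{AffDep}(a) = \mathrm{LinDep}(\bar a)$, where $\bar a$ is obtained from a as above, by appending
1's. Another description is
$\mathrm{AffVal}(a) = \{(f(a_1), f(a_2), \ldots, f(a_n)) : f\colon \mathbf{R}^d \to \mathbf{R} \text{ is an affine function}\}$,
$\mathrm{AffDep}(a) = \{\alpha \in \mathbf{R}^n : \alpha_1 a_1 + \cdots + \alpha_n a_n = 0,\ \alpha_1 + \cdots + \alpha_n = 0\}$.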
{1, 2, ..., n}, and we ask whether the points of the subsequence $\bar a_I = (\bar a_i : i \in
I)$ span a linear hyperplane. First, we observe that they lie in a common linear
hyperplane if and only if there is a nonzero $\varphi \in \mathrm{LinVal}(\bar a)$ such that $\varphi_i = 0$ for
all $i \in I$. It could still happen that all of $\bar a_I$ lies in a lower-dimensional linear
subspace. Using the assumption that $\bar a$ spans $\mathbf{R}^{d+1}$, it is not difficult to see
that $\bar a_I$ spans a linear hyperplane if and only if all nonzero $\varphi \in \mathrm{LinVal}(\bar a)$ that vanish
on $\bar a_I$ have identical zero sets; that is, the set $\{i : \varphi_i = 0\}$ is the same for all
such $\varphi$. If we know that $\bar a_I$ spans a linear hyperplane, we can also see how
the other vectors in $\bar a$ are distributed with respect to this linear hyperplane.
Analogously, knowing AffVal(a), we can determine which subsequences of
a span (affine) hyperplanes and how the other points are partitioned by these
hyperplanes. For example, we can tell whether there are some d+ 1 points on
a common hyperplane, and so we know whether a is in general position. As a
more complicated example, let P = conv(a). We can read off from AffVal(a)
which of the ai are the vertices of P, and also the whole face lattice of P
(Exercise 6).
Similar information can be inferred from AffDep(a) (exactly the same
information, in fact, since $\mathrm{AffDep}(a) = \mathrm{AffVal}(a)^\perp$; see Exercise 7). For
an $\alpha \in \mathrm{AffDep}(a)$ let $I_+(\alpha) = \{i \in \{1, 2, \ldots, n\} : \alpha_i > 0\}$ and $I_-(\alpha) =
\{i \in \{1, 2, \ldots, n\} : \alpha_i < 0\}$. As we learned in the proof of Radon's lemma
(Lemma 1.3.1), $I_+ = I_+(\alpha)$ and $I_- = I_-(\alpha)$ correspond to Radon partitions
of a. Namely, $\sum_{i \in I_+} \alpha_i a_i = \sum_{i \in I_-} (-\alpha_i) a_i$, and dividing by $\sum_{i \in I_+} \alpha_i =
\sum_{i \in I_-} (-\alpha_i)$, we have convex combinations on both sides, and so
$\mathrm{conv}(a_{I_+}) \cap \mathrm{conv}(a_{I_-}) \ne \emptyset$. Conversely, if $I_1$ and $I_2$ are disjoint index sets with
$\mathrm{conv}(a_{I_1}) \cap \mathrm{conv}(a_{I_2}) \ne \emptyset$, then there is a nonzero $\alpha \in \mathrm{AffDep}(a)$ with $I_+(\alpha) \subseteq I_1$ and
$I_-(\alpha) \subseteq I_2$. For example, $a_i$ is a vertex of conv(a) if and only if there is no
$\alpha \in \mathrm{AffDep}(a)$ with $I_+(\alpha) = \{i\}$.
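The step from an affine dependence to a Radon partition can be made concrete. In the sketch below the dependence is supplied by hand rather than computed, indices are 0-based (unlike the 1-based indices of the text), and the function name and sample data are illustrative.

```python
from fractions import Fraction

def radon_partition(points, alpha):
    """Given an affine dependence alpha of `points`, return (I_plus, I_minus, common point)."""
    d = len(points[0])
    assert sum(alpha) == 0, "coefficients must sum to zero"
    assert all(sum(a * p[j] for a, p in zip(alpha, points)) == 0 for j in range(d))
    i_plus = [i for i, a in enumerate(alpha) if a > 0]
    i_minus = [i for i, a in enumerate(alpha) if a < 0]
    total = sum(Fraction(alpha[i]) for i in i_plus)
    assert total > 0, "alpha must be nonzero"
    # Dividing by `total` turns both sides into convex combinations.
    x = tuple(sum(Fraction(alpha[i]) * points[i][j] for i in i_plus) / total
              for j in range(d))
    y = tuple(sum(Fraction(-alpha[i]) * points[i][j] for i in i_minus) / total
              for j in range(d))
    assert x == y
    return i_plus, i_minus, x

# (1, 1) is the midpoint of (2, 0) and (0, 2), so it is not a vertex of the convex hull.
print(radon_partition([(0, 0), (2, 0), (0, 2), (1, 1)], (0, 1, 1, -2)))
# The diagonals of a square cross at its center.
print(radon_partition([(0, 0), (1, 0), (1, 1), (0, 1)], (1, -1, 1, -1)))
```

In the first example, negating the dependence gives a vector whose only positive entry sits at the index of (1, 1), which is exactly the criterion above for that point not being a vertex.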
For a sequence $\bar a$ of vectors, linear dependencies correspond to expressing
0 as a convex combination. Namely, for disjoint index sets $I_1$ and $I_2$, we
have $0 \in \mathrm{conv}(\{\bar a_i : i \in I_1\} \cup \{-\bar a_i : i \in I_2\})$ if and only if there is a nonzero
$\alpha \in \mathrm{LinDep}(\bar a)$ with $I_+(\alpha) \subseteq I_1$ and $I_-(\alpha) \subseteq I_2$.
Together with these geometric interpretations of LinVal(a), AffVal(a),
LinDep(a), and AffDep(a), the following lemma (whose proof is left to Ex-
ercise 8) allows us to translate properties of point configurations to those of
their Gale transforms.
5.6.2 Lemma. Let a be a sequence of n points in $\mathbf{R}^d$ whose points affinely
span $\mathbf{R}^d$, and let $\bar g$ be its Gale transform. Then $\mathrm{LinVal}(\bar g) = \mathrm{AffDep}(a)$ and
$\mathrm{LinDep}(\bar g) = \mathrm{AffVal}(a)$. □
For example, the facet $a_1a_2a_5a_6$ is reflected by the complementary pair $\bar g_3, \bar g_4$
of parallel oppositely oriented vectors, and so on.
Signs suffice. As was noted above, in order to find out whether some
$a_i$ is a vertex of conv(a), we ask whether there is an $\alpha \in \mathrm{AffDep}(a)$ with
$I_+(\alpha) = \{i\}$. Only the signs of the vectors in AffDep(a) are important here,
and this is the case with all the combinatorial-geometric information about
point sequences or vector sequences in Corollary 5.6.3. For such purposes,
the knowledge of $\mathrm{sgn}(\mathrm{AffDep}(a)) = \{(\mathrm{sgn}(\alpha_1), \ldots, \mathrm{sgn}(\alpha_n)) : \alpha \in \mathrm{AffDep}(a)\}$
is as good as the knowledge of AffDep(a).
We can thus declare two sequences a and b combinatorially isomorphic if
sgn(AffDep(a)) = sgn(AffDep(b)) and sgn(AffVal(a)) = sgn(AffVal(b)).² We
will hear a little more about this notion of combinatorial isomorphism in
Section 9.3 when we discuss order types, and also in the notes to Section 6.2
in connection with oriented matroids.
² It is nontrivial but true that either of these equalities implies the other one.
Here we need only one very special case: If $\bar g = (\bar g_1, \ldots, \bar g_n)$ is a sequence
of vectors, $t_1, \ldots, t_n > 0$ are positive real numbers, and $\bar g' = (t_1\bar g_1, \ldots, t_n\bar g_n)$,
then clearly,
The positive $g_i$ are marked by full circles, the negative ones by empty circles,
and we have borrowed the (incomplete) yin-yang symbol for marking the
positions shared by one positive and one negative point. This sequence g of
positive and negative points in $\mathbf{R}^{n-d-2}$, or more formally the pair $(g, \sigma)$,
is called an affine Gale diagram of a. It conveys the same combinatorial
information as $\bar g$, although we cannot reconstruct a from it up to linear
isomorphism, as was the case with $\bar g$. (For this reason, we speak of Gale
diagram rather than Gale transform.) One has to get used to interpreting
the positive and negative points properly. If we put
So far we have assumed that $\bar g_i \ne 0$ for all i. This need not hold in general,
and points $\bar g_i = 0$ need a special treatment in the affine Gale diagram: They
are called the special points, and for a full specification of the affine Gale
diagram, we draw the positive and negative points and give the number
of special points. It is easy to find out how the presence of special points
influences the conditions in the previous proposition.
A nonrational polytope. Configurations of k+4 points in $\mathbf{R}^k$ have planar
affine Gale diagrams. This leads to many interesting constructions of k-dimensional
convex polytopes with k+4 vertices. Here we give just one example: an
8-dimensional polytope with 12 vertices that cannot be realized with rational
coordinates; that is, no polytope with isomorphic face lattice has all vertex
coordinates rational. First one has to become convinced that if 9 distinct
points are placed in $\mathbf{R}^2$ so that they are not all collinear and there are collinear
triples and 4-tuples as is marked by segments in the left drawing below,
then not all coordinates of the points can be rational. We omit the proof,
which has little to do with the Gale transform or convex polytopes.
Next, we declare some points negative, some positive, and some both
positive and negative, as in the right drawing, obtaining 12 points. These
points have a chance of being an affine Gale diagram of the vertex set of
an 8-dimensional convex polytope, since condition (ii) in Proposition 5.6.4
Exercises
1. Let E be a $k \times n$ matrix of rank $k \le n$. Check that for any $k \times n$ matrix E'
whose rows generate the same vector space as the rows of E, there exists
a nonsingular $k \times k$ matrix T with E' = TE. Infer that if $\bar g = (\bar g_1, \ldots, \bar g_n)$
is a Gale transform of a, then any other Gale transform of a has the form
$(T\bar g_1, T\bar g_2, \ldots, T\bar g_n)$ for a nonsingular square matrix T.
2. Let a be a sequence of d+1 affinely independent points in $\mathbf{R}^d$. What is
the Gale transform of a, and what are AffVal(a) and AffDep(a)?
3. Let $\bar g$ be a Gale transform of the vertex set of a convex polytope $P \subset \mathbf{R}^d$,
and let $\bar h$ be obtained from $\bar g$ by appending the zero vector. Check that
$\bar h$ is again a Gale transform of a convex independent set. What is the
relation of this set to P?
5.7 Voronoi Diagrams 115
[Figure: a Voronoi diagram of a planar point set P.]
(Of course, the Voronoi diagram is clipped by a rectangle so that it fits into a
finite page.) The points of P are traditionally called the sites in the context
of Voronoi diagrams.
Indeed,
$$\mathrm{reg}(p) = \bigcap_{q \in P \setminus \{p\}} \{x : \mathrm{dist}(x, p) \le \mathrm{dist}(x, q)\}$$
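This description of a Voronoi region as an intersection of half-spaces translates directly into a membership test. The sketch below (function names and sample sites are illustrative) tests each inequality in squared form, and also writes the half-space for a pair of sites explicitly as $\langle n, x \rangle \le b$, obtained by expanding $|x - p|^2 \le |x - q|^2$.

```python
def squared_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y))

def in_region(x, p, sites):
    """x lies in reg(p) iff dist(x, p) <= dist(x, q) for every other site q."""
    return all(squared_distance(x, p) <= squared_distance(x, q)
               for q in sites if q != p)

def halfspace(p, q):
    """Coefficients (n, b) with dist(x, p) <= dist(x, q)  <=>  <n, x> <= b."""
    n = tuple(2 * (qi - pi) for pi, qi in zip(p, q))
    b = sum(qi * qi for qi in q) - sum(pi * pi for pi in p)
    return n, b

sites = [(0, 0), (4, 0), (0, 4)]
print(in_region((1, 1), (0, 0), sites), in_region((3, 1), (0, 0), sites))
print(halfspace((0, 0), (4, 0)))   # the half-plane 8x <= 16, i.e., x <= 2
```

A point on a bisector, such as (2, 0) here, belongs to both adjacent regions, since the regions are closed.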
If such a passage is possible at all, the robot can always walk along
the edges of the Voronoi diagram of P, except for the initial and final
segments of the tour. This allows one to reduce the robot motion problem
to a graph search problem: We define a subgraph of the Voronoi diagram
consisting of the edges that are passable for the robot.
• (A nice triangulation: the Delaunay triangulation) Let $P \subset \mathbf{R}^2$ be a finite
point set. In many applications one needs to construct a triangulation of
P (that is, to subdivide conv(P) into triangles with vertices at the points
of P) in such a way that the triangles are not too skinny. Of course, for
some sets, some skinny triangles are necessary, but we want to avoid
them as much as possible. One particular triangulation that is usually
very good, and provably optimal with respect to several natural criteria,
is obtained as the dual graph to the Voronoi diagram of P. Two points
of P are connected by an edge if and only if their Voronoi regions share
an edge.
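For small point sets in general position (no three collinear, no four on a common circle), the Delaunay triangles can be found by brute force. The sketch below does not construct the Voronoi diagram; it uses the standard empty-circle characterization, which is not stated in this passage: a triple of points forms a Delaunay triangle exactly when its circumcircle contains no other point of P in its interior. Function names and the sample points are illustrative.

```python
from itertools import combinations

def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circle through a, b, c."""
    rows = [(q[0] - p[0], q[1] - p[1], (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
            for q in (a, b, c)]
    det = (rows[0][0] * (rows[1][1] * rows[2][2] - rows[1][2] * rows[2][1])
           - rows[0][1] * (rows[1][0] * rows[2][2] - rows[1][2] * rows[2][0])
           + rows[0][2] * (rows[1][0] * rows[2][1] - rows[1][1] * rows[2][0]))
    return det * orient(a, b, c) > 0      # the sign is corrected for clockwise triples

def delaunay_triangles(points):
    """All triples whose circumcircle has no other point of `points` inside."""
    triangles = []
    for a, b, c in combinations(points, 3):
        if orient(a, b, c) == 0:
            continue                      # skip collinear triples
        if not any(in_circumcircle(a, b, c, p) for p in points if p not in (a, b, c)):
            triangles.append((a, b, c))
    return triangles

A, B, C, D = (0, 0), (3, 0), (0, 3), (1, 1)
print(delaunay_triangles([A, B, C, D]))
```

With integer coordinates the test is exact. The running time is of order $n^4$, so this is meant only for checking small examples, not as a practical algorithm.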
The region of the new point x cuts off portions of the regions of some of
the old points. Let $w_p$ be the area of the part of reg(p) in the Voronoi
diagram of P that belongs to reg(x) after inserting x. The interpolated
value f(x) is
$$f(x) = \sum_{p \in P} \frac{w_p}{\sum_{q \in P} w_q}\, f(p).$$
e(p)
---~-~=--j4---.j---- Xd+l = 0
Proof. We just substitute into the equations of U and of e(p). The Xd+!-
coordinate of u(x) is x~ + ... + x~, while the Xd+l-coordinate of the point
Let $\mathcal{E}(p)$ denote the half-space lying above the hyperplane $e(p)$. Consider
an $n$-point set $P \subset \mathbf{R}^d$. By Proposition 5.7.2, $x \in \mathrm{reg}(p)$ holds if and only
if $e(p)$ is vertically closest to $U$ at $x$ among all $e(q)$, $q \in P$. Here is what we
have derived:
5.7.3 Corollary. The Voronoi diagram of $P$ is the vertical projection of the
facets of the polyhedron $\bigcap_{p \in P} \mathcal{E}(p)$ onto the hyperplane $x_{d+1} = 0$. □
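The correspondence can be checked on small examples. The sketch below assumes the usual equation of the tangent hyperplane, $x_{d+1} = 2\langle p, x\rangle - \|p\|^2$, for $e(p)$; this equation is taken as an assumption here because the definition lies outside this passage. The function names are illustrative.

```python
def height_on_paraboloid(x):
    """The x_{d+1}-coordinate of u(x), the point of U above x."""
    return sum(xi * xi for xi in x)


def height_on_tangent(p, x):
    """The x_{d+1}-coordinate of the point of e(p) above x.

    Assumes e(p) is given by x_{d+1} = 2<p, x> - |p|^2.
    """
    return 2 * sum(pi * xi for pi, xi in zip(p, x)) - sum(pi * pi for pi in p)


def nearest_site_by_lifting(x, sites):
    """The site whose hyperplane e(p) is highest above x.

    All e(p) lie below U, so the highest one is vertically closest to U.
    """
    return max(sites, key=lambda p: height_on_tangent(p, x))


sites = [(0, 0), (4, 1), (1, 5)]
x = (3, 1)
gap = height_on_paraboloid(x) - height_on_tangent((4, 1), x)
print(gap)                                # 1, the squared distance from x to (4, 1)
print(nearest_site_by_lifting(x, sites))  # (4, 1)
```

Under this assumption the vertical gap between $U$ and $e(p)$ above $x$ equals the squared distance from $x$ to $p$, which is why the upper envelope of the hyperplanes projects onto the Voronoi diagram.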
5.7.4 Corollary. The maximum total number of faces of all regions of the
Voronoi diagram of an $n$-point set in $\mathbf{R}^d$ is $O(n^{\lceil d/2 \rceil})$.
[Figure label: locally Delaunay.]
It can be shown that every sequence of such local flips is finite and
finishes with the Delaunay triangulation of P (Exercise 7). This pro-
cedure has an analogue in higher dimensions, where it gives a simple
and practically successful algorithm for computing Delaunay trian-
gulations (and Voronoi diagrams); see, e.g., Edelsbrunner and Shah
[ES96].
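The local test behind a flip can be sketched with the classical in-circle determinant. This is an illustration under stated assumptions, not the algorithm of [ES96]: points are integer pairs, the quadrilateral is convex with current diagonal ab and opposite vertices c and d, and the names `orient`, `in_circle`, and `should_flip` are hypothetical.

```python
def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counterclockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circle(a, b, c, d):
    """True if d lies strictly inside the circle through a, b, c."""
    rows = [(p[0] - d[0], p[1] - d[1],
             (p[0] - d[0]) ** 2 + (p[1] - d[1]) ** 2) for p in (a, b, c)]
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = rows
    det = (a1 * (b2 * c3 - b3 * c2)
           - a2 * (b1 * c3 - b3 * c1)
           + a3 * (b1 * c2 - b2 * c1))
    # The determinant is positive for d inside when a, b, c are
    # counterclockwise; correct the sign for clockwise input.
    return det * (1 if orient(a, b, c) > 0 else -1) > 0


def should_flip(a, b, c, d):
    """Triangles abc and abd share the diagonal ab of a convex quadrilateral.

    The diagonal is flipped to cd when d is inside the circumcircle of abc.
    """
    return in_circle(a, b, c, d)


a, b, c, d = (0, 0), (4, 0), (2, 1), (2, -1)
print(should_flip(a, b, c, d))  # True: the diagonal ab is not Delaunay
print(should_flip(c, d, a, b))  # False: after the flip the pair is fine
```

Integer input keeps the determinant exact; with floating-point coordinates, nearly cocircular points need a robust predicate.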
Generalizations of Voronoi diagrams. The example in the text with
robot motion planning, as well as other applications, motivates var-
ious notions of generalized Voronoi diagrams. First, instead of the
Euclidean distance, one can take various other distance functions, say
the $\ell_p$-metrics. Second, instead of the spheres of influence of points,
we can consider the spheres of influence of other sites, such as dis-
joint polygons (this is what we get if we have a circular robot moving
amidst polygonal obstacles). We do not attempt to survey the numerous
results concerning such generalizations, again referring to [AK00].
Results on the combinatorial complexity of Voronoi diagrams under
non-Euclidean metrics and/or for nonpoint sites will be mentioned in
the notes to Section 7.7.
In another, very general, approach to Voronoi diagrams, one takes
the Voronoi diagram induced by two objects as a primitive notion. So
for every two objects we are given a partition of space into two regions
separated by a bisector, and Voronoi diagrams for more than two ob-
jects are built using the 2-partitions for all pairs. If one postulates a
few geometric properties of the bisectors, one gets a reasonable theory
of Voronoi diagrams (the so-called abstract Voronoi diagrams), includ-
ing efficient algorithms. So, for example, we do not even need a notion
of distance at this level of generality. Abstract Voronoi diagrams (in
the plane) were suggested by Klein [Kle89].
A geometrically significant generalization of the Euclidean Voronoi
diagram is the power diagram: Each point $p \in P$ is assigned a real
weight $w(p)$, and $\mathrm{reg}(p) = \{x \in \mathbf{R}^d : \|x - p\|^2 - w(p) \le \|x - q\|^2 -
w(q) \text{ for all } q \in P\}$. While Voronoi diagrams in $\mathbf{R}^d$ are projections
of certain convex polyhedra in $\mathbf{R}^{d+1}$, the projection into $\mathbf{R}^d$ of every
intersection of finitely many nonvertical upper half-spaces in $\mathbf{R}^{d+1}$ is
a power diagram. Moreover, a hyperplane section of a power diagram
is again a power diagram. Several other generalized Voronoi diagrams
in $\mathbf{R}^d$ (for example, with multiplicative weights of the sites) can be
obtained by intersecting a suitable power diagram in $\mathbf{R}^{d+1}$ with a
simple surface and projecting into $\mathbf{R}^d$, which yields fast algorithms;
see Aurenhammer and Imai [AI88].
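The definition of the power diagram translates directly into code. The following minimal sketch finds the region containing a point by minimizing $\|x - p\|^2 - w(p)$, and also evaluates a lifted hyperplane $x_{d+1} = 2\langle p, x\rangle - \|p\|^2 + w(p)$; that lifted form is an assumption made here to illustrate the connection with upper half-spaces, and all function names are illustrative.

```python
def power(x, p, w):
    """The power of x with respect to the weighted site (p, w)."""
    return sum((xi - pi) ** 2 for xi, pi in zip(x, p)) - w


def power_region_owner(x, sites, weights):
    """Index of the site whose power-diagram region contains x."""
    return min(range(len(sites)), key=lambda i: power(x, sites[i], weights[i]))


def lifted_height(x, p, w):
    """Height above x of the (assumed) hyperplane 2<p, x> - |p|^2 + w."""
    return 2 * sum(pi * xi for pi, xi in zip(p, x)) - sum(pi * pi for pi in p) + w


sites = [(0, 0), (4, 0)]
print(power_region_owner((2, 0), sites, [0, 0]))  # 0 (a tie, resolved to the first site)
print(power_region_owner((2, 0), sites, [0, 8]))  # 1 (the heavier site wins)
```

Since the power equals $\|x\|^2$ minus the lifted height, the site of minimum power is the one whose lifted hyperplane is highest above x, in agreement with the statement about upper half-spaces.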
Exercises
1. Prove that the region reg(p) of a point $p$ in the Voronoi diagram of a
finite point set $P \subset \mathbf{R}^d$ is unbounded if and only if $p$ lies on the surface
of conv(P). [!]
2. (a) Show that the Voronoi diagram of the $2n$-point set $\{(\frac{i}{n},0,0) : i =
1,2,\ldots,n\} \cup \{(0,1,\frac{j}{n}) : j = 1,2,\ldots,n\}$ in $\mathbf{R}^3$ has $\Omega(n^2)$ vertices. 0
(b) Let $d = 2k+1$ be odd, let $e_1,\ldots,e_d$ be vectors of the standard
orthonormal basis in $\mathbf{R}^d$, and let $e_0$ stand for the zero vector. For
$i = 0,1,\ldots,k$ and $j = 1,2,\ldots,n$, let $p_{i,j} = e_{2i} + \frac{j}{n} e_{2i+1}$. Prove that
for every choice of $j_0, j_1, \ldots, j_k \in \{1,2,\ldots,n\}$, there is a point in $\mathbf{R}^d$ for
which the nearest points among the $p_{i,j}$ are exactly $p_{0,j_0}, p_{1,j_1}, \ldots, p_{k,j_k}$.
Conclude that the Voronoi diagram of the $p_{i,j}$ has combinatorial
complexity $\Omega(n^{k+1}) = \Omega(n^{\lceil d/2 \rceil})$. 0
3. (Voronoi diagram of flats) Let $\varepsilon_1,\ldots,\varepsilon_{d-1}$ be small distinct positive
numbers and for $i = 1,2,\ldots,d-1$ and $j = 1,2,\ldots,n$, let $F_{i,j}$ be the
$(d-2)$-flat $\{x \in \mathbf{R}^d : x_i = j,\ x_d = \varepsilon_i\}$. For every choice of $j_1, j_2, \ldots, j_{d-1} \in
\{1,2,\ldots,n\}$, find a point in $\mathbf{R}^d$ for which the nearest sites (under the
Euclidean distance) among the $F_{i,j}$ are exactly $F_{1,j_1}, F_{2,j_2}, \ldots, F_{d-1,j_{d-1}}$.
Conclude that the Voronoi diagram of the $F_{i,j}$ has combinatorial
complexity $\Omega(n^{d-1})$. 0
This example is from Aronov [Aro00].
4. For a finite point set in the plane, define the farthest-point Voronoi dia-
gram as indicated in the text, verify the claimed correspondence with a
convex polyhedron in R3, and prove that all nonempty regions are un-
bounded. 0
5. (Delaunay triangulation) Let P be a finite point set in the plane with no
3 points collinear and no 4 points cocircular.
(a) Prove that the dual graph of the Voronoi diagram of $P$, where two
points $p, q \in P$ are connected by a straight edge if and only if the
boundaries of reg(p) and reg(q) share a segment, is a plane graph where the
outer face is the complement of conv(P) and every inner face is a
triangle. 0
(b) Define a graph on $P$ as follows: Two points $p$ and $q$ are connected
by an edge if and only if there exists a circular disk with both $p$ and $q$
on the boundary and with no point of $P$ in its interior. Prove that this
graph is the same as in (a), and so we have an alternative definition of
the Delaunay triangulation. 0
6. (Delaunay triangulation and minimum spanning tree) Let $P \subset \mathbf{R}^2$ be a
finite point set with no 3 points collinear and no 4 cocircular. Let $T$ be a
spanning tree of minimum total edge length in the complete graph with
the vertex set P, where the length of an edge is just its Euclidean length.
Prove that all edges of T are also edges of the Delaunay triangulation of
P.0
7. (Delaunay triangulation by local flipping) Let $P \subset \mathbf{R}^2$ be an $n$-point set
with no 3 points collinear and no 4 cocircular. Let $T$ be an arbitrary
triangulation of conv(P). Suppose that triangulations $T_1, T_2, \ldots$ are
obtained from $T$ by successive local flips as described in the notes above (in
each step, we select a convex quadrilateral in the current triangulation
partitioned into two triangles in a way that is not the Delaunay triangu-
lation of the four vertices and we flip the diagonal of the quadrilateral).
(a) Prove that the sequence of triangulations is always finite (and give
as good an estimate for its maximum length as you can). [!]
(b) Show that if no local flipping is possible, then the current triangula-
tion is the Delaunay triangulation of P. 0
8. Consider a finite set of disjoint segments in the plane. What types of
curves may bound the regions in their Voronoi diagram? The region of a
given segment is the set of points for which this segment is a closest one.
o
9. Let $A$ and $B$ be two finite point sets in the plane. Choose $a_0 \in A$
arbitrarily. Having defined $a_0, \ldots, a_i$ and $b_1, \ldots, b_i$, define $b_{i+1}$ as a point
of $B \setminus \{b_1, \ldots, b_i\}$ nearest to $a_i$, and $a_{i+1}$ as a point of $A \setminus \{a_0, \ldots, a_i\}$
nearest to $b_{i+1}$. Continue until one of the sets becomes empty. Prove that
at least one of the pairs $(a_i, b_{i+1})$, $(b_{i+1}, a_{i+1})$, $i = 0,1,2,\ldots$, realizes the
shortest distance between a point of $A$ and a point of $B$. (This was used
by Eppstein [Epp95] in some dynamical geometric algorithms.) [!]
10. (a) Let $C$ be any circle in the plane $x_3 = 0$ (in $\mathbf{R}^3$). Show that there exists
a half-space $h$ such that $C$ is the vertical projection of the set $h \cap U$ onto
$x_3 = 0$, where $U = \{x \in \mathbf{R}^3 : x_3 = x_1^2 + x_2^2\}$ is the unit paraboloid. CD
(b) Consider $n$ arbitrary circular disks $K_1, \ldots, K_n$ in the plane. Show that
there exist only $O(n)$ intersections of their boundaries that lie inside no
other $K_i$ (this means that the boundary of the union of the $K_i$ consists
of $O(n)$ circular arcs). [!]
11. Define a "spherical polytope" as an intersection of n balls in R3 (such
an object has facets, edges, and vertices similar to an ordinary convex
polytope).
(a) Show that any such spherical polytope in $\mathbf{R}^3$ has $O(n^2)$ faces. You
may assume that the spheres are in general position. 0
(b) Find an example of an intersection of n balls having quadratically
many vertices. [!]
(c) Show that the intersection of n unit balls has O(n) complexity only.
o
6 Number of Faces in Arrangements
1 This terminology is not unified in the literature. What we call faces are sometimes
referred to as cells (0-cells, 1-cells, and 2-cells).
6.1 Arrangements of Hyperplanes 127
vectors of the marked faces in a line arrangement. Only the signs are shown,
and the positive half-planes lie above their lines.
Of course, not all possible sign vectors correspond to nonempty faces. For $n$
lines, there are $3^n$ sign vectors but only $O(n^2)$ faces, as we will derive below.
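The gap between $3^n$ sign vectors and the actual number of faces is visible already for three lines. In the sketch below, each line is encoded as a hypothetical triple `(a, b, c)` standing for $ax + by = c$, with the positive side being $ax + by > c$; this encoding and the function names are assumptions made for illustration. Sampling a grid that hits every face collects all sign vectors that actually occur.

```python
def sign(t):
    """Sign of t as -1, 0, or +1."""
    return (t > 0) - (t < 0)


def sign_vector(point, lines):
    """Sign vector of a point with respect to lines given as (a, b, c)."""
    x, y = point
    return tuple(sign(a * x + b * y - c) for a, b, c in lines)


# Three lines in general position: y = 0, x = 0, and x + y = 4.
lines = [(0, 1, 0), (1, 0, 0), (1, 1, 4)]
seen = {sign_vector((x, y), lines)
        for x in range(-3, 8) for y in range(-3, 8)}
print(len(seen), 3 ** len(lines))  # 19 27
```

The 19 faces are 3 vertices, 9 edges, and 7 cells; the integer grid was chosen so that each of them contains at least one sample point, which would need to be re-checked for other line sets.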
Counting the cells in a hyperplane arrangement. We want to count
the maximum number of faces in an arrangement of $n$ hyperplanes in $\mathbf{R}^d$. As
we will see, this is much simpler than the similar task for convex polytopes!
If a set $H$ of hyperplanes is in general position, which means that the
intersection of every $k$ hyperplanes is $(d-k)$-dimensional, $k = 2,3,\ldots,d+1$,
the arrangement of $H$ is called simple. For $|H| \ge d+1$ it suffices to require that
every $d$ hyperplanes intersect at a single point and no $d+1$ have a common
point.
Every $d$-tuple of hyperplanes in a simple arrangement determines exactly
one vertex, and so a simple arrangement of $n$ hyperplanes has exactly $\binom{n}{d}$
vertices. We now calculate the number of cells; it turns out that the order of
magnitude is also $n^d$ for $d$ fixed.
$$\Phi_d(n) = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{d}. \tag{6.1}$$
Together with the initial conditions (for $d = 1$ and for $n = 0$), this recurrence
determines all values of $\Phi$, and so it remains to check that formula (6.1)
satisfies the recurrence. We have
What is the number of faces of the intermediate dimensions $1,2,\ldots,d-1$
in a simple arrangement of $n$ hyperplanes? This is not difficult to calculate
using Proposition 6.1.1 (Exercise 1); the main conclusion is that the total
number of faces is $O(n^d)$ for a fixed $d$.
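The cell count is easy to tabulate. The sketch below assumes the standard closed form $\Phi_d(n) = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{d}$ for the number of cells of a simple arrangement and the recurrence $\Phi_d(n) = \Phi_d(n-1) + \Phi_{d-1}(n-1)$; both are stated here as assumptions, since the displayed formulas are not fully legible in this passage. The function name `cells` is illustrative.

```python
from math import comb


def cells(d, n):
    """Number of cells of a simple arrangement of n hyperplanes in R^d,
    assuming the closed form sum of C(n, i) for i = 0..d."""
    return sum(comb(n, i) for i in range(d + 1))


# Initial conditions: n points cut the line into n + 1 intervals,
# and the empty arrangement has a single cell.
print([cells(1, n) for n in range(5)])  # [1, 2, 3, 4, 5]
print(cells(2, 3), cells(3, 4))         # 7 15

# The closed form satisfies the recurrence obtained by adding one hyperplane.
print(all(cells(d, n) == cells(d, n - 1) + cells(d - 1, n - 1)
          for d in range(2, 6) for n in range(1, 12)))  # True
```

For fixed $d$ the leading term is $\binom{n}{d}$, which matches the order of magnitude $n^d$ mentioned above.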
What about nonsimple arrangements? It turns out that a simple arrange-
ment of n hyperplanes maximizes the number of faces of each dimension
among arrangements of n hyperplanes. This can be verified by a perturbation
argument, which is considerably simpler than the one for convex polytopes
(Lemma 5.5.4), and which we omit.
Exercises
1. (a) Count the number of faces of dimensions 1 and 2 for a simple
arrangement of $n$ planes in $\mathbf{R}^3$. [2]
(b) Express the number of $k$-faces in a simple arrangement of $n$
hyperplanes in $\mathbf{R}^d$. [2]
2. Prove that the number of unbounded cells in an arrangement of $n$
hyperplanes in $\mathbf{R}^d$ is $O(n^{d-1})$ (for a fixed $d$). [2]
3. (a) Check that an arrangement of $d$ or fewer hyperplanes in $\mathbf{R}^d$ has no
bounded cell. [2]
(b) Prove that an arrangement of $d+1$ hyperplanes in general position in
$\mathbf{R}^d$ has exactly one bounded cell. [II
4. How many $d$-dimensional cells are there in the arrangement of the $\binom{d}{2}$
hyperplanes in $\mathbf{R}^d$ with equations $\{x_i = x_j\}$, where $1 \le i < j \le d$? [II
5. How many $d$-dimensional cells are there in the arrangement of the
hyperplanes in $\mathbf{R}^d$ with the equations $\{x_i - x_j = 0\}$, $\{x_i - x_j = 1\}$, and
$\{x_i - x_j = -1\}$, where $1 \le i < j \le d$? IT]
6. (Flags in arrangements)
(a) Let $H$ be a set of $n$ lines in the plane, and let $V$ be the set of vertices
of their arrangement. Prove that the number of pairs $(v, h)$ with $v \in V$,
$h \in H$, and $v \in h$, i.e., the number of incidences $I(V, H)$, is bounded by
$O(n^2)$. (Note that this is trivially true for simple arrangements.) [2]
(b) Prove that the maximum number of $d$-tuples $(F_0, F_1, \ldots, F_d)$ in an
arrangement of $n$ hyperplanes in $\mathbf{R}^d$, where $F_i$ is an $i$-dimensional face
and $F_{i-1}$ is contained in the closure of $F_i$, is $O(n^d)$ ($d$ fixed). Such
$d$-tuples are sometimes called flags of the arrangement. [II
7. Let $P = \{p_1, \ldots, p_n\}$ be a point set in the plane. Let us say that points
$x, y$ have the same view of $P$ if the points of $P$ are visible in the same
cyclic order from them. If rotating light rays emanate from $x$ and from $y$,
the points of $P$ are lit in the same order by these rays. We assume that
130 Chapter 6: Number of Faces in Arrangements
neither x nor y is in P and that neither of them can see two points of P
in occlusion.
(a) Show that the maximum possible number of points with mutually
distinct views of $P$ is $O(n^4)$. ~
(b) Show that the bound in (a) cannot be improved in general. [!]
of the degrees of the $p_i$; when speaking of the arrangement of $Z_1, \ldots, Z_n$,
one usually assumes that $D$ is bounded by some (small) constant. Without
a bound on $D$, even a single $Z_i$ can have arbitrarily many connected
components.
In many cases, the $Z_i$ are algebraic surfaces, such as ellipsoids, paraboloids,
etc., but since we are in the real domain, sometimes they need not look like
surfaces at all. For example, the zero set of the polynomial $p(x_1, x_2) = x_1^2 + x_2^2$
consists of the single point $(0,0)$. Although it is sometimes convenient to think
of the $Z_i$ as surfaces, the results stated below apply to zero sets of arbitrary
polynomials of bounded degree.
It is known that if both $d$ and $D$ are considered as constants, the maximum
number of faces in the arrangement of $Z_1, Z_2, \ldots, Z_n$ as above is at most
$O(n^d)$. This is one of the most useful results about arrangements, with many
surprising applications (a few are outlined below and in the exercises). In
the literature one often finds a (formally weaker) version dealing with sign
patterns of the polynomials $p_i$. A vector $\sigma \in \{-1, 0, +1\}^n$ is called a sign
pattern of $p_1, p_2, \ldots, p_n$ if there exists an $x \in \mathbf{R}^d$ such that the sign of $p_i(x)$
is $\sigma_i$, for all $i = 1,2,\ldots,n$. Trivially, the number of sign patterns for any $n$
polynomials is at most $3^n$. For $d = 1$, it is easy to see that the actual number
of sign patterns is much smaller, namely at most $2nD + 1$ (Exercise 1). It is
not so easy to prove, but still true, that there are at most $C(d, D) \cdot n^d$ sign
patterns in dimension $d$. This result is generally called the Milnor-Thom
theorem (and it was apparently first proved by Oleinik and Petrovskii, which
fits the usual pattern in the history of mathematics). Here is a more precise
(and more recent) version of this result, where the dependence on $D$ and $d$
is specified quite precisely.
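The univariate bound $2nD + 1$ mentioned above can be sanity-checked on small examples. The sketch below makes a simplifying assumption that is not in the text: each polynomial is monic with only real integer roots and is represented by the list of its roots. Under that assumption the sign pattern can change only at a root, so it suffices to sample every root, one point between consecutive roots, and one point on each side.

```python
from fractions import Fraction


def sign(t):
    """Sign of t as -1, 0, or +1."""
    return (t > 0) - (t < 0)


def evaluate(roots, x):
    """Value at x of the monic polynomial with the given roots."""
    value = 1
    for r in roots:
        value *= (x - r)
    return value


def sign_patterns(polys):
    """All sign patterns on the real line of polynomials given by root lists."""
    breakpoints = sorted({r for roots in polys for r in roots})
    samples = list(breakpoints)
    samples += [Fraction(a + b, 2) for a, b in zip(breakpoints, breakpoints[1:])]
    if breakpoints:
        samples += [breakpoints[0] - 1, breakpoints[-1] + 1]
    else:
        samples = [0]
    return {tuple(sign(evaluate(roots, x)) for roots in polys) for x in samples}


polys = [[0, 2], [1, 3]]          # n = 2 polynomials of degree D = 2
patterns = sign_patterns(polys)
print(len(patterns), 2 * 2 * 2 + 1)  # 8 9
```

Here the four roots cut the line into nine pieces, but the two unbounded pieces share the pattern (+1, +1), so only eight distinct patterns occur.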
Proofs of these results are not included here because they would require
at least one more chapter. They belong to the field of real algebraic geometry.
The classical, deep, and extremely extensive field of algebraic geometry mostly
studies algebraic varieties over algebraically closed fields, such as the complex
numbers (and the questions of combinatorial complexity in our sense are
not among its main interests). Real algebraic geometry investigates algebraic
varieties and related concepts over the real numbers or other real-closed fields;
the presence of ordering and the missing roots of polynomials makes its flavor
distinctly different.
Much of what we have proved for arrangements of lines is true for arrange-
ments of pseudolines as well. This holds for the maximum number of vertices,
edges, and cells, but also for more sophisticated results like the Szemerédi-Trotter
theorem on the maximum number of incidences of $m$ points and $n$
lines; these results have proofs that do not use any properties of straight lines
not shared by pseudolines.
One might be tempted to say that pseudolines are curves that behave
topologically like lines, but as we will see below, in at least one sense this is
2 This "affine" definition is a little artificial, and we use it only because we do
not want to assume the reader's familiarity with the topology of the projective
plane. In the literature one usually considers arrangements of pseudolines in
the projective plane, where the definition is very natural: Each pseudoline is a
closed curve whose removal does not disconnect the projective plane, and every
two pseudolines intersect exactly once (which already implies that they cross at
the intersection point). Moreover, one often adds the condition that the curves
do not form a single pencil; i.e., not all of them have a common point, since
otherwise, one would have to exclude the case of a pencil in the formulation of
many theorems. But here we are not going to study pseudoline arrangements in
any depth.
6.2 Arrangements of Other Geometric Objects 133
profoundly wrong. The correct statement is that every two of them behave
topologically like two lines, but arrangements of pseudolines are more general
than arrangements of lines.
We should first point out that there is no problem with the "local"
structure of the pseudolines, since each pseudoline arrangement can be redrawn
equivalently (in a sense defined precisely below) by polygonal lines, as a wiring
diagram:
[Figure: a wiring diagram with pseudolines labeled 1, 2, 3.]
[Figure: a grid of lines, with labels $g_1$ and $g_2$.]
We have $m \approx \frac{n}{3}$, and the lines $h_1, \ldots, h_m$ and $g_1, \ldots, g_m$ form a regular grid.
Each of the about $\frac{n}{3}$ pseudolines $p_i$ in the middle passes near $\Omega(n)$ vertices of
arrangement of $H'$ if there exists a homeomorphism $\varphi$ of the projective plane
onto itself such that each pseudoline $h \in H$ is mapped to a pseudoline $\varphi(h) \in
H'$. For affinely isomorphic arrangements in the affine plane, the corresponding
arrangements in the projective plane are isomorphic, but the isomorphism in the
projective plane also allows for mirror reflection and for "relocating the infinity."
Combinatorially, the isomorphism in the projective plane can be described using
the (quasi)orderings $\pi_1, \ldots, \pi_n$ as well. Here the $\pi_i$ have to agree only up to
a possible reversal and cyclic shift for each $i$, and also the numbering of the
pseudolines by $1,2,\ldots,n$ is not canonical.
We also remark that two arrangements of lines are isomorphic if and only if
the dual point configurations have the same order type, up to a mirror reflection
of the whole configuration (order types are discussed in Section 9.3).
4 For isomorphism in the projective plane, one gets an equivalent notion of stretch-
ability.
this grid, and for each such vertex it has a choice of going below it or above.
This gives $2^{\Omega(n^2)}$ possibilities in total.
Now we use Theorem 6.2.1 to estimate the number of nonisomorphic simple
arrangements of $n$ straight lines. Let the lines be $\ell_1, \ldots, \ell_n$, where $\ell_i$
has the equation $y = a_i x + b_i$ and $a_1 > a_2 > \cdots > a_n$. The $x$-coordinate
of the intersection $\ell_i \cap \ell_j$ is $\frac{b_i - b_j}{a_j - a_i}$. To determine the ordering $\pi_i$ of the
intersections along $\ell_i$, it suffices to know the ordering of the $x$-coordinates of
these intersections, and this can be inferred from the signs of the polynomials
$p_{ijk}(a_i, b_i, a_j, b_j, a_k, b_k) = (b_i - b_j)(a_k - a_i) - (b_i - b_k)(a_j - a_i)$. So the
number of nonisomorphic arrangements of $n$ lines is no larger than the number
of possible sign patterns of the $O(n^3)$ polynomials $p_{ijk}$ in the $2n$ variables
$a_1, b_1, \ldots, a_n, b_n$, and Theorem 6.2.1 yields the upper bound of $2^{O(n \log n)}$. For
large $n$, this is a negligible fraction of the total number of simple pseudoline
arrangements. (Similar considerations apply to nonsimple arrangements as
well.)
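The role of the polynomials $p_{ijk}$ can be verified on a concrete arrangement. In the sketch below, a line $y = ax + b$ is stored as a hypothetical pair `(a, b)`, and exact rational arithmetic is used. The check relies on the identity that the difference of the two crossing coordinates equals $p_{ijk}$ divided by $(a_j - a_i)(a_k - a_i)$, whose sign is known from the ordering of the slopes.

```python
from fractions import Fraction
from itertools import permutations


def sign(t):
    """Sign of t as -1, 0, or +1."""
    return (t > 0) - (t < 0)


def crossing_x(li, lj):
    """x-coordinate of the intersection of lines (a_i, b_i) and (a_j, b_j)."""
    (ai, bi), (aj, bj) = li, lj
    return Fraction(bi - bj, aj - ai)


def p_ijk(li, lj, lk):
    """The polynomial (b_i - b_j)(a_k - a_i) - (b_i - b_k)(a_j - a_i)."""
    (ai, bi), (aj, bj), (ak, bk) = li, lj, lk
    return (bi - bj) * (ak - ai) - (bi - bk) * (aj - ai)


def order_along(i, lines):
    """Indices of the other lines, sorted by where they cross line i."""
    others = [j for j in range(len(lines)) if j != i]
    return sorted(others, key=lambda j: crossing_x(lines[i], lines[j]))


lines = [(3, 0), (1, 2), (0, -1), (-2, 6)]   # slopes strictly decreasing
print(order_along(0, lines))                 # [2, 1, 3]

# The sign of p_ijk, corrected by the known sign of the denominator,
# decides which of the two crossings comes first along line i.
for i, j, k in permutations(range(len(lines)), 3):
    li, lj, lk = lines[i], lines[j], lines[k]
    denom = (lj[0] - li[0]) * (lk[0] - li[0])
    assert sign(crossing_x(li, lj) - crossing_x(li, lk)) == sign(p_ijk(li, lj, lk)) * sign(denom)
```

Because the slopes are sorted, the sign of the denominator depends only on the indices, so the signs of the $p_{ijk}$ alone determine every ordering $\pi_i$.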
The problem of deciding the stretchability of a given pseudoline arrange-
ment has been shown to be algorithmically difficult (at least NP-hard). One
can easily encounter this problem when thinking about line arrangements and
drawing pictures: What we draw by hand are really pseudolines, not lines,
and even with the help of a ruler it may be almost impossible to decide ex-
perimentally whether a given arrangement can really be drawn with straight
lines. But there are computational methods that can decide stretchability in
reasonable time at least for moderate numbers of lines.
5 The correspondence need not really be one-to-one. For example, the oriented
matroids of two projectively isomorphic pseudoline arrangements agree only up
to reorientation.
Exercises
1. Let $p_1(x), \ldots, p_n(x)$ be univariate real polynomials of degree at most $D$.
Check that the number of sign patterns of the $p_i$ is at most $2nD + 1$. 0
2. (Intersection graphs) Let S be a set of n line segments in the plane. The
intersection graph of S is the graph on n vertices, which correspond to
the segments of S, with two vertices connected by an edge if and only if
the corresponding two segments intersect.
(a) Prove that the graph obtained from $K_5$ by subdividing each edge
exactly once is not the intersection graph of segments in the plane (and
not even the intersection graph of any arcwise connected sets in the
plane). 8J
(b) Use Theorem 6.2.1 to prove that most graphs are not intersection
graphs of segments: While the total number of graphs on $n$ given vertices
is $2^{\binom{n}{2}} = 2^{n^2/2 + O(n)}$, only $2^{O(n \log n)}$ of them are intersection graphs of
segments (be careful about collinear segments!). 0
(c) Show that the number of (isomorphism classes of) intersection graphs
of planar arcwise connected sets, and even of planar convex sets, on $n$
vertices cannot be bounded by $2^{O(n \log n)}$. (The right order of magnitude
does not seem to be known for either of these classes of intersection
graphs.) 8J
3. (Number of combinatorially distinct simplicial convex polytopes) Use
Theorem 6.2.1 to prove that for every dimension $d \ge 3$ there exists $C_d > 0$
such that the number of combinatorial types of simplicial polytopes in
$\mathbf{R}^d$ with $n$ vertices is at most $2^{C_d n \log n}$. (The combinatorial equivalence
means isomorphic face lattices; see Definition 5.3.4.) 8J
Such a result was proved by Alon [Alo86b] and by Goodman and Pollack
[GP86].
4. (Sign patterns of matrices and rank) Let $A$ be a real $n \times n$ matrix. The
sign matrix $\sigma(A)$ is the $n \times n$ matrix with entries in $\{-1, 0, +1\}$ given
by the signs of the corresponding entries in $A$.
(a) Check that $A$ has rank at most $q$ if and only if there exist $n \times q$
matrices $U$ and $V$ with $A = UV^T$. [II
(b) Estimate the number of distinct sign matrices of rank $q$ using
Theorem 6.2.1, and conclude that there exists an $n \times n$ matrix $S$ containing
only entries $+1$ and $-1$ such that any real matrix $A$ with $\sigma(A) = S$ has
rank at least $cn$, with a suitable constant $c > 0$. [II
The result in (b) is from Alon, Frankl, and Rödl [AFR85] (for another
application see [Mat96b]).
5. (Extendible pseudosegments) A family of pseudosegments is a finite
collection $S = \{s_1, s_2, \ldots, s_n\}$ of curves in the plane such that each $s_i$ is
$x$-monotone and its vertical projection on the $x$-axis is a closed interval,
every two curves in the family intersect at most once, and whenever they
intersect they cross (tangential contacts are not allowed). Such an $S$ is
called extendible if there is a family $L = \{\ell_1, \ldots, \ell_n\}$ of pseudolines such
that $s_i \subseteq \ell_i$, $i = 1,2,\ldots,n$.
(a) Find an example of a nonextendible family of 3 pseudosegments. IT]
(b) Define an oriented graph $G$ with vertex set $S$ and with an edge from
$s_i$ to $s_j$ if $s_i \cap s_j \ne \emptyset$ and $s_i$ is below $s_j$ on the left of their intersection.
Check that if $S$ is extendible, then $G$ is acyclic. IT]
(c) Prove that, conversely, if $G$ is acyclic, then $S$ is extendible. Extend
the pseudosegments one by one, maintaining the acyclicity of $G$. [II
(d) Let $I_i$ be the projection of $s_i$ on the $x$-axis. Show that if for every
$i < j$, $I_i \cap I_j = \emptyset$ or $I_i \subseteq I_j$ or $I_j \subseteq I_i$, then $G$ is acyclic, and hence $S$ is
extendible. 0
(e) Given a family of closed intervals $I_1, \ldots, I_n \subseteq \mathbf{R}$, show that each
interval in the family can be partitioned into at most $O(\log n)$ subintervals
in such a way that the resulting family of subintervals has the property
as in (d). This implies that an arbitrary family of $n$ pseudosegments can
be cut into a family of $O(n \log n)$ extendible pseudosegments. [II
These notions and results are from Chan [Cha00a].
The vertices of level 0 are the vertices of the cell lying below all the
hyperplanes, and since this cell is the intersection of at most n half-spaces,
it has at most $O(n^{\lfloor d/2 \rfloor})$ vertices, by the asymptotic upper bound theorem
(Theorem 5.5.2). From this result we derive a bound on the maximum number
of vertices of level at most k. The elegant probabilistic technique used in the
proof is generally applicable and probably more important than the particular
result itself.
6.3.1 Theorem (Clarkson's theorem on levels). The total number of
vertices of level at most $k$ in an arrangement of $n$ hyperplanes in $\mathbf{R}^d$ is at
most
$$O\bigl(n^{\lfloor d/2 \rfloor} (k+1)^{\lceil d/2 \rceil}\bigr),$$
with the constant of proportionality depending on $d$.
We are going to prove the theorem for simple arrangements only. The
general case can be derived from the result for simple arrangements by a
standard perturbation argument. But let us stress that the simplicity of the
arrangement is essential for the forthcoming proof.
For all $k$ ($0 \le k \le n - d$), the bound is tight in the worst case. To see this
for $k \ge 1$, consider a set of $\frac{n}{k}$ hyperplanes such that the lower unbounded cell
in their arrangement is a convex polyhedron with $\Omega((\frac{n}{k})^{\lfloor d/2 \rfloor})$ vertices, and
replace each of the hyperplanes by $k$ very close parallel hyperplanes. Then
each vertex of level 0 in the original arrangement gives rise to $\Omega(k^d)$ vertices
of level at most $k$ in the new arrangement.
A much more challenging problem is to estimate the maximum possible
number of vertices of level exactly $k$. This will be discussed in Chapter 11.
One of the main motivations that led to Clarkson's theorem on levels was
an algorithmic problem. Given an $n$-point set $P \subset \mathbf{R}^d$, we want to construct
a data structure for fast answering of queries of the following type: For a
query point $x \in \mathbf{R}^d$ and an integer $t$, report the $t$ points of $P$ that lie nearest
to x.
Clarkson's theorem on levels is needed for bounding the maximum amount
of memory used by a certain efficient algorithm. The connection is not entirely
simple. It uses the lifting transform described in Section 5.7, relating the
algorithmic problem in $\mathbf{R}^d$ to the complexity of levels in $\mathbf{R}^{d+1}$, and we do
not discuss it here.
Proof of Theorem 6.3.1 for d = 2. First we demonstrate this special
case, for which the calculations are somewhat simpler.
Let H be a set of n lines in general position in the plane. Let p denote a
certain suitable number in the interval (0,1) whose value will be determined
at the end of the proof. Let us imagine the following random experiment. We
choose a subset $R \subseteq H$ at random, by including each line $h \in H$ into $R$ with
probability p, the choices being independent for distinct lines h.
Let us consider the arrangement of $R$, temporarily discarding all the other
lines, and let $f(R)$ denote the number of vertices of level 0 in the arrangement
of $R$. Since $R$ is random, $f$ is a random variable. We estimate the expectation
of $f$, denoted by $E[f]$, in two ways.
First, we have $f(R) \le |R|$ for any specific set $R$, and hence $E[f] \le
E[|R|] = pn$.
Now we estimate $E[f]$ differently: We bound it from below using the
number of vertices of the arrangement of $H$ of level at most $k$. For each
vertex $v$ of the arrangement of $H$, we define an event $A_v$ meaning "$v$ becomes
one of the vertices of level 0 in the arrangement of $R$." That is, $A_v$ occurs
if $v$ contributes 1 to the value of $f$. The event $A_v$ occurs if and only if the
following two conditions are satisfied:
• Both lines determining the vertex $v$ lie in $R$.
• None of the lines of $H$ lying below $v$ falls into $R$.
[Figure: the vertex $v$ and the lines below it, labeled "these must not be in $R$".]
We deduce that $\mathrm{Prob}[A_v] = p^2 (1-p)^{\ell(v)}$, where $\ell(v)$ denotes the level of the
vertex $v$.
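The two ways of looking at $E[f]$ can be compared exactly on a small instance. The sketch below is an illustration with assumed conventions: a line $y = ax + b$ is stored as a pair `(a, b)`, the level of a vertex is the number of lines passing strictly below it, and the expectation is computed once by enumerating all $2^n$ subsets $R$ and once by summing $p^2(1-p)^{\ell(v)}$ over the vertices. All function names are illustrative.

```python
from fractions import Fraction
from itertools import combinations


def vertex(l1, l2):
    """Intersection point of two lines given as (slope, intercept)."""
    (a1, b1), (a2, b2) = l1, l2
    x = Fraction(b2 - b1, a1 - a2)
    return x, a1 * x + b1


def level(i, j, lines, subset):
    """Number of lines of the subset passing strictly below the vertex of i, j."""
    vx, vy = vertex(lines[i], lines[j])
    return sum(1 for h in subset
               if h not in (i, j) and lines[h][0] * vx + lines[h][1] < vy)


def f(subset, lines):
    """Number of vertices of level 0 in the arrangement of the subset."""
    return sum(1 for i, j in combinations(sorted(subset), 2)
               if level(i, j, lines, subset) == 0)


def expectation_by_enumeration(lines, p):
    """E[f], summing over all subsets R with their probabilities."""
    n = len(lines)
    return sum(p ** size * (1 - p) ** (n - size) * f(R, lines)
               for size in range(n + 1)
               for R in combinations(range(n), size))


def expectation_by_formula(lines, p):
    """E[f] as the sum of Prob[A_v] = p^2 (1-p)^level(v) over all vertices v."""
    n = len(lines)
    return sum(p ** 2 * (1 - p) ** level(i, j, lines, range(n))
               for i, j in combinations(range(n), 2))


lines = [(2, 0), (1, 1), (0, 3), (-1, 1), (-3, 2)]   # a simple arrangement
p = Fraction(1, 3)
print(expectation_by_enumeration(lines, p) == expectation_by_formula(lines, p))  # True
```

The agreement is just linearity of expectation applied to the indicator variables of the events $A_v$; rational arithmetic makes the comparison exact.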
Let $V$ be the set of all vertices of the arrangement of $H$, and let $V_{\le k} \subseteq V$
be the set of vertices of level at most $k$, whose cardinality we want to estimate.
We have
Proof for an arbitrary dimension. The idea of the proof is the same
as above. As for the technical realization, there are at least two possible
routes. The first is to retain the same probability distribution for selecting
the sample R (picking each hyperplane of the given set H independently with
probability p); in this case, most of the proof remains as before, but we need
a lemma showing that $E[|R|^{\lfloor d/2 \rfloor}] = O((pn)^{\lfloor d/2 \rfloor})$. This is not difficult to
prove, either from a Chernoff-type inequality or by elementary calculations
(see Exercises 6.5.2 and 6.5.3).
The second possibility, which we use here, is to change the probability
distribution. Namely, we define an integer parameter $r$ and choose a random
$r$-element subset $R \subseteq H$, with all the $\binom{n}{r}$ subsets being equally probable.
With this new way of choosing $R$, we proceed as in the proof for $d = 2$.
We define $f(R)$ as the number of vertices of level 0 in the arrangement of $R$
and estimate $E[f]$ in two ways. On the one hand, we have $f(R) = O(r^{\lfloor d/2 \rfloor})$
for all $R$, and so
$$E[f] = O(r^{\lfloor d/2 \rfloor}).$$
The notation $V$ for the set of all vertices of the arrangement of $H$, $V_{\le k}$
for the vertices of level at most $k$, and $A_v$ for the event "$v$ is a vertex of level
0 in the arrangement of $R$," is as in the previous proof. The conditions for
$A_v$ are
• All the $d$ hyperplanes defining the vertex $v$ fall into $R$.
• None of the hyperplanes of $H$ lying below $v$ fall into $R$.
So if $\ell = \ell(v)$ is the level of $v$, then
6.3.2 Lemma. Suppose that $1 \le k \le \frac{n}{2d} - 1$, which implies $2d \le r \le \frac{n}{2}$.
Then
$$P(k) \ge c_d (k+1)^{-d}$$
for a suitable $c_d > 0$ depending only on $d$.
We postpone the proof of the lemma a little and finish the proof of
Theorem 6.3.1. We want to substitute the bound from the lemma into (6.2). In
order to meet the assumptions of the lemma, we must restrict the range of $k$
somewhat. But if, say, $k \ge \frac{n}{2d}$, then the bound claimed by the theorem is of
order $n^d$ and thus trivial, and for $k = 0$ we already know that the theorem
holds. So we may assume $1 \le k \le \frac{n}{2d} - 1$, and we have
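$$P(k) = \frac{\binom{n-d-k}{r-d}}{\binom{n}{r}}
= \frac{(n-d-k)(n-d-k-1) \cdots (n-k-r+1)}{n(n-1) \cdots (n-r+1)} \cdot r(r-1) \cdots (r-d+1)$$
$$= \frac{r(r-1) \cdots (r-d+1)}{n(n-1) \cdots (n-d+1)} \cdot \frac{n-d-k}{n-d} \cdot \frac{n-d-k-1}{n-d-1} \cdots \frac{n-k-r+1}{n-r+1}.$$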
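The rewriting of $P(k)$ as a product of fractions can be confirmed with exact arithmetic. The sketch below assumes $P(k) = \binom{n-d-k}{r-d} / \binom{n}{r}$, as read off from the display above, and compares it with the product form for a few parameter choices; the function names and the sample values are illustrative.

```python
from fractions import Fraction
from math import comb, prod


def P_binomial(n, r, d, k):
    """P(k) as a ratio of binomial coefficients."""
    return Fraction(comb(n - d - k, r - d), comb(n, r))


def P_product(n, r, d, k):
    """P(k) as the product of d leading fractions and r - d further ones."""
    leading = Fraction(prod(range(r - d + 1, r + 1)),
                       prod(range(n - d + 1, n + 1)))
    rest = prod(Fraction(n - d - k - i, n - d - i) for i in range(r - d))
    return leading * rest


n, d, k = 20, 2, 3
r = n // (k + 1)          # a sample choice of r, here r = 5
print(P_binomial(n, r, d, k) == P_product(n, r, d, k))  # True
print(float(P_binomial(n, r, d, k) * (k + 1) ** d))
```

The second printed value shows $P(k)(k+1)^d$ for this instance, the quantity that Lemma 6.3.2 bounds from below by a constant $c_d$; no particular value of the constant is claimed here.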
Now, ~ 2": (k~l - 1)/n 2": 2(k~1) (since k < ~, say) and 1 - n-~+l 2": 1 - 2:
(a somewhat finer calculation actually gives 1 - k~l here). Since k :::; ~, we
can use the inequality $1 - x \ge e^{-2x}$ valid for $x \in [0, \frac{1}{2}]$, and we arrive at
Exercises
1. Show that for $n$ hyperplanes in $\mathbf{R}^d$ in general position, the total number
of vertices of levels $k, k+1, \ldots, n-d$ is at most $O(n^{\lfloor d/2 \rfloor} (n-k)^{\lceil d/2 \rceil})$. 0
2. (a) Consider n lines in the plane in general position (their arrangement
is simple). Call a vertex v of their arrangement an extreme if one of its
defining lines has a positive slope and the other one has a negative slope.
Prove that there are at most $O((k+1)^2)$ extremes of level at most $k$.
Imitate the proof of Clarkson's theorem on levels. 0
(b) Show that the bound in (a) cannot be improved in general. IT]
3. Let $K_1, \ldots, K_n$ be circular disks in the plane. Show that the number of
intersections of their boundary circles that are contained in at most k
disks is bounded by O(nk). Use the result of Exercise 5.7.10 and assume
general position if convenient. 0
4. Let L be a set of n nonvertical lines in the plane in general position.
(a) Let $W$ be an arbitrary subset of vertices of the arrangement of $L$,
and let $X_W$ be the number of pairs $(v, \ell)$, where $v \in W$, $\ell \in L$, and
$\ell$ goes (strictly) below $v$. For every real number $p \in (0,1)$, prove that
$X_W \ge p^{-1}|W| - p^{-2} n$. [II
(b) Let $W$ be a set of vertices in the arrangement of $L$ such that no line
of $L$ lies strictly below more than $k$ vertices of $W$, where $k \ge 1$. Use (a)
to prove $|W| = O(n\sqrt{k})$. 0
(c) Check that the bound in (b) is tight for all k ~ ~. 0
This exercise and the next one are from Sharir [ShaOl].
5. Let P be an n-point set in the plane in general position (no 4 points on
a common circle). Let C be a set of circles such that each circle in C
passes through 3 points of P and contains no more than k points of P
in its interior. Prove that $|C| \le O(nk^{2/3})$, by an approach analogous to
that of Exercise 4. [II
The following result bounds the maximum complexity of the zone. In the
proof we will meet another interesting random sampling technique.
6.4.1 Theorem (Zone theorem). The number of faces in the zone of any
hyperplane in an arrangement of $n$ hyperplanes in $\mathbf{R}^d$ is $O(n^{d-1})$, with the
constant of proportionality depending on $d$.
We prove the result only for simple arrangements; the general case follows,
as usual, by a perturbation argument. Let us also assume that $g \notin H$ and that
$H \cup \{g\}$ is in general position.
6.4 The Zone Theorem 147
It is clear that the zone has $O(n^{d-1})$ cells, because each (d-1)-dimensional
cell of the (d-1)-dimensional arrangement within g intersects only
one d-dimensional cell of the zone. On the other hand, this information is
not sufficient to conclude that the total number of vertices of these cells
is $O(n^{d-1})$: For example, as we know from Chapter 4, n arbitrarily chosen
cells in an arrangement of n lines in the plane can together have as many as
$\Omega(n^{4/3})$ vertices.
Proof. We proceed by induction on the dimension d. The base case is d = 2;
it requires a separate treatment and does not follow from the trivial case
d = 1 by the inductive argument shown below.
The case d = 2. (For another proof see Exercise 7.1.5.) Let H be a set of n
lines in the plane in general position. We consider the zone of a line g. Since
a convex polygon has the same number of vertices and edges, it suffices to
bound the total number of 1-faces (edges) visible from the line g.
Imagine g drawn horizontally. We count the number of visible edges lying
above g. Among those, at most n intersect the line g, since each line of H
gives rise to at most one such edge. The others are disjoint from g.
Consider an edge uv disjoint from g and visible from a point of g. Let
$h \in H$ be the line containing uv, and let a be the intersection of h with g:
[Figure: the horizontal line g with the points a and b marked.]
Let the notation be chosen in such a way that u is closer to a than v, and
let $\ell \in H$ be the second line (besides h) defining the vertex u. Let b denote
the intersection $\ell \cap g$. Let us call the edge uv a right edge of the line $\ell$ if the
point b lies to the right of a, and a left edge of the line $\ell$ if b lies to the left
of a.
We show that for each line $\ell$ there exists at most one right edge. If it were
not the case, there would exist two edges, uv and xy, where u lies lower than
x, which would both be right edges of $\ell$, as in the above drawing. The edge
xy should see some point of the line g, but the part of g lying to the right of
a is obscured by the line h, and the part left of a is obscured by the line $\ell$.
This contradiction shows that the total number of right edges is at most n.
Symmetrically, we see that the number of left edges in the zone is at
most n. The same bounds are obtained for edges of the zone lying below g.
Altogether we have $O(n)$ edges in the zone, and the 2-dimensional
case of the zone theorem is proved.
148 Chapter 6: Number of Faces in Arrangements
The case d > 2. Here we make the inductive step from d-1 to d. We assume
that the total number of faces of a zone in $\mathbb{R}^{d-1}$ is $O(n^{d-2})$, and we want to
bound the total number of zone faces in $\mathbb{R}^d$.
The first idea is to proceed by induction on n, bounding the maximum
possible number of new faces created by adding a new hyperplane to n-1
given ones. However, it is easy to find examples showing that the number
of faces can increase roughly by $n^{d-1}$, and so this straightforward approach
fails.
In the actual proof, we use a clever averaging argument. First, we demon-
strate the method for the slightly simpler case of counting only the facets
(i.e., (d-1)-faces) of the zone.
Let f(n) denote the maximum possible number of (d-1)-faces in the zone
in an arrangement of n hyperplanes in $\mathbb{R}^d$ (the dimension d is not shown in
the notation in order to keep it simple). Let H be an arrangement and g a
base hyperplane such that f(n) is attained for them.
We consider the following random experiment. Color a randomly chosen
hyperplane h E H red and the other hyperplanes of H blue. We investigate
the expected number of blue facets of the zone, where a facet is blue if it lies
in a blue hyperplane.
On the one hand, any facet has probability $\frac{n-1}{n}$ of becoming blue, and
hence the expected number of blue facets is $\frac{n-1}{n} f(n)$.
We bound the expected number of blue facets in a different way. First,
we consider the arrangement of blue hyperplanes only; it has at most $f(n-1)$
blue facets in the zone by the inductive hypothesis. Next, we add the red
hyperplane, and we look by how much the number of blue facets in the zone
can increase.
A new blue facet can arise by adding the red hyperplane only if the red
hyperplane slices some existing blue facet F into two parts $F_1$ and $F_2$, as is
indicated in the picture:
[Figure: the red hyperplane h slicing the blue facet F; label $g \cap h$.]
This increases the number of blue facets in the zone only if both $F_1$ and $F_2$ are
visible from g. In such a case we look at the situation within the hyperplane
h; we claim that $F \cap h$ is visible from $g \cap h$.
Let C be a cell of the zone in the arrangement of the blue hyperplanes
having F on the boundary. We want to exhibit a segment connecting $F \cap h$
to $g \cap h$ within C. If $x_1 \in F_1$ sees a point $y_1 \in g$ and $x_2 \in F_2$ sees $y_2 \in g$,
then the whole interior of the tetrahedron $x_1 x_2 y_1 y_2$ is contained in C. The
intersection of this tetrahedron with the hyperplane h contains a segment
witnessing the visibility of $g \cap h$ from $F \cap h$.
If we intersect all the blue hyperplanes and the hyperplane g with the
red hyperplane h, we get a (d-1)-dimensional arrangement, in which $F \cap h$
is a facet in the zone of the (d-2)-dimensional hyperplane $g \cap h$. By the
inductive hypothesis, this zone has $O(n^{d-2})$ facets. Hence, adding h increases
the number of blue facets of the zone by $O(n^{d-2})$, and so the total number
of blue facets after h has been added is never more than $f(n-1) + O(n^{d-2})$.
We have derived the following inequality:
$$\frac{n-1}{n}\, f(n) \le f(n-1) + O(n^{d-2}).$$
It implies $f(n) = O(n^{d-1})$, as we will demonstrate later for a slightly more
general recurrence.
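For the facet recurrence alone, the passage from the inequality to the bound can be sketched directly. The explicit constant $C$ and the auxiliary function $\varphi(n) = f(n)/n$ are notation introduced here for illustration only; the sketch assumes $d \ge 3$, as in the inductive step.

```latex
\begin{align*}
\frac{n-1}{n}\, f(n) &\le f(n-1) + C n^{d-2}
  && \text{(the recurrence with an explicit constant } C\text{)} \\
\frac{f(n)}{n} &\le \frac{f(n-1)}{n-1} + \frac{C n^{d-2}}{n-1}
  && \text{(divide by } n-1,\ n \ge 2\text{)} \\
\varphi(n) &\le \varphi(n-1) + 2C n^{d-3}
  && \text{(since } \tfrac{n}{n-1} \le 2\text{)} \\
\varphi(n) &\le \varphi(1) + 2C \sum_{m=2}^{n} m^{d-3} = O(n^{d-2})
  && \text{(each term is at most } n^{d-3}\text{)} \\
f(n) &= n\,\varphi(n) = O(n^{d-1}).
\end{align*}
```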
The previous considerations can be generalized for (d-k)-faces, where
$1 \le k \le d-2$. Let $f_j(n)$ denote the maximum possible number of j-faces
in the zone for n hyperplanes in dimension d. Let H be a collection of n
hyperplanes where $f_{d-k}(n)$ is attained.
As before, we color one randomly chosen hyperplane h E H red and the
others blue. A (d-k)-face is blue if its relative interior is disjoint from the red
hyperplane. Then the probability of a fixed (d-k)-face being blue is $\frac{n-k}{n}$, and
the expected number of blue (d-k)-faces in the zone is at most $\frac{n-k}{n} f_{d-k}(n)$.
On the other hand, we find that by adding the red hyperplane, the number
of blue (d-k)-faces can increase by at most $O(n^{d-2})$, by the inductive
hypothesis and by an argument similar to the case of facets. This yields the
recurrence
$$\frac{n-k}{n}\, f_{d-k}(n) \le f_{d-k}(n-1) + O(n^{d-2}).$$
We are going to show that the number of vertices of the zone is at most
proportional to the number of the 2-faces of the zone. Every vertex is con-
tained in some 3-face of the zone. Within each such 3-face, the number of
vertices is at most 3 times the number of 2-faces, because the 3-face is a 3-
dimensional convex polyhedron. Since our arrangement is simple, each 2-face
is contained in a bounded number of 3-faces. It follows that the total number
of vertices is at most proportional to $f_2(n) = O(n^{d-1})$. The analogous bound
for edges follows immediately from the bound for vertices. □
Exercises
1. (Sum of squares of cell complexities)
(a) Let $\mathcal{C}$ be the set of all cells of an arrangement of a set H of n hyperplanes
in $\mathbb{R}^d$. For d = 2, 3, prove that $\sum_{C \in \mathcal{C}} f_0(C)^2 = O(n^d)$, where $f_0(C)$
is the number of vertices of the cell C.
(b) Use the technique explained in this section to prove $\sum_{C \in \mathcal{C}} f_0(C)^2 =
O(n^d (\log n)^{\lfloor d/2 \rfloor - 1})$ for every fixed $d \ge 3$ (or a similar bound with a
larger constant in the exponent of log n if it helps).
The result in (b) is from Aronov, Matousek, and Sharir [AMS94].
2. Define the $(\le k)$-zone of a hyperplane g in an arrangement of hyperplanes
as the collection of all faces for which some point x of their relative interior
can be connected to some point $y \in g$ so that the interior of the segment
xy intersects at most k hyperplanes.
(a) By the technique of Section 6.3 (Clarkson's theorem on levels), show
that the number of vertices of the $(\le k)$-zone is $O(n^{d-1}k)$.
(b) Show that the bound in (a) cannot be improved in general.
3. In this exercise we aim at bounding $K(n, n)$, the maximum total number
of edges of n distinct cells in an arrangement of n lines in the plane,
using the cutting lemma as in Section 4.5 (this proof is due to Clarkson,
Edelsbrunner, Guibas, Sharir, and Welzl [CEG+90]). Let L be a set of n
lines in general position.
(a) Prove the bound $K(n, m) = O(n\sqrt{m} + m)$.
(b) Prove $K(n, n) = O(n^{4/3})$ using the cutting lemma.
4. Consider a set H of n planes in $\mathbb{R}^3$ in general position and a sphere S
(the surface of a ball).
(a) Show that S intersects at most $O(n^2)$ cells of the arrangement of H.
(b) Using (a) and Exercise 1, prove that the zone of S in the arrangement
of H has at most $O(n^{5/2})$ vertices. (This is just an upper bound; the
correct order of magnitude is about $n^2$.)
with ingredients from the proof of Clarkson's theorem on levels and addi-
tional ideas.
We are going to re-prove the cutting lemma 4.5.3: For every set H of n
lines in the plane and every r > 1 there exists a $\frac{1}{r}$-cutting for H of size $O(r^2)$,
i.e., a subdivision of the plane into $O(r^2)$ generalized triangles $\Delta_1, \ldots, \Delta_t$
such that the interior of each $\Delta_i$ is intersected by at most $\frac{n}{r}$ lines of H. The
proof uses random sampling, and unlike the elementary proof in Section 4.7,
it can be generalized to higher dimensions without much trouble. We first give
a complete proof for the planar case and then we discuss the generalizations.
Throughout this section we assume that H is in general position. A per-
turbation argument mentioned in Section 4.7 can be used to derive the cutting
lemma for an arbitrary H.
The first idea is as in the proof of a weaker cutting lemma by random
sampling in Section 4.6: We pick a random sample S of a suitable size and
triangulate its arrangement.
The subsequent calculations become simpler and more elegant if we choose
S by independent Bernoulli trials. That is, instead of picking s random lines
with repetitions as in Section 4.6, we fix a probability $p = \frac{s}{n}$ and we include
each line $h \in H$ into S with probability p, the decisions being mutually
independent (this is as in the proof of the planar case of Clarkson's theorem
on levels). These two ways of random sampling (by s random draws with
repetitions and by independent trials with success probability $\frac{s}{n}$) can usually
be thought of as nearly the same; although the actual calculations differ
significantly, their results tend to be similar.
Sampling and triangulation alone do not work. Considerations similar
to those in Section 4.6 show that with probability close to 1, none of the
triangles in the triangulation for the random sample S as above is intersected
by more than $C \frac{n}{s} \log n$ lines of H, for a suitable constant C. Later we will
see that a similar statement is true with $C \frac{n}{s} \log s$ instead of $C \frac{n}{s} \log n$. But
it is not generally true with $C \frac{n}{s}$, for any C independent of s and n. So the
most direct road to an optimal $\frac{1}{r}$-cutting, namely choosing const · r random
lines and triangulating their arrangement, is impassable.
To see this, consider a 1-dimensional situation, where $H = \{h_1, \ldots, h_n\}$
is a set of n points in $\mathbb{R}$ (or if you prefer, look at the part of a 2-dimensional
arrangement along one of the lines). For simplicity, let us set $s = \frac{n}{2}$; then
$p = \frac12$, and we can imagine that we toss a fair coin n times and we include $h_i$
into S if the ith toss is heads. The picture illustrates the result of 30 tosses,
with black dots indicating heads:
[Figure: a row of 30 circles, black for heads and empty for tails.]
We are interested in the length of the longest consecutive run of tails (empty
circles). For k fixed, it is very likely that k consecutive tails show up in a
sequence of n tosses for n sufficiently large. Indeed, if we divide the tosses
into blocks of length k (suppose for simplicity that n is divisible by k),
[Figure: the row of tosses divided into blocks of length k.]
then in each block, we have probability $2^{-k}$ of receiving all tails. The blocks
are mutually independent, and so the probability of not obtaining all tails
in any of the $\frac{n}{k}$ blocks is $(1 - 2^{-k})^{n/k}$. For k fixed and $n \to \infty$ this goes
to 0, and a more careful calculation shows that for $k = \lfloor \frac12 \log_2 n \rfloor$ we have
exponentially small probability of not receiving any block of k consecutive
tails (Exercise 1). So a sequence of n tosses is very likely to contain about log n
consecutive tails. (Sequences produced by humans that are intended to look
random usually do not have this property; they tend to be "too uniform.")
Similarly, for a smaller s, if we make a circle black with probability $\frac{s}{n}$, then
the longest run typically has about $\frac{n}{s} \log s$ consecutive empty circles.
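The coin-tossing argument is easy to try numerically. The following is a minimal sketch, not part of the text's proof: the function names, the seed, and the choice $n = 2^{16}$ are assumptions made for illustration. It measures the longest run of tails in a simulated sequence and evaluates the block estimate $(1 - 2^{-k})^{n/k}$ from the text.

```python
import math
import random

def longest_tail_run(tosses):
    """Length of the longest run of consecutive tails (False entries)."""
    best = current = 0
    for heads in tosses:
        current = 0 if heads else current + 1
        best = max(best, current)
    return best

def prob_no_all_tails_block(n, k):
    """Probability that none of the n // k disjoint blocks of length k is all tails."""
    return (1.0 - 2.0 ** (-k)) ** (n // k)

if __name__ == "__main__":
    rng = random.Random(0)            # fixed seed, only for reproducibility
    n = 1 << 16
    tosses = [rng.random() < 0.5 for _ in range(n)]
    k = int(0.5 * math.log2(n))       # k = floor((1/2) log2 n)
    print("longest run of tails:", longest_tail_run(tosses))
    print("log2(n):", math.log2(n))
    print("block estimate for k =", k, ":", prob_no_all_tails_block(n, k))
```

The block estimate only bounds the probability of seeing no all-tails block among the aligned blocks; runs that straddle block boundaries make long runs even more likely.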
Of course, in the one-dimensional situation one can define much more
uniform samples, say by making every $\frac{n}{s}$th circle black. But it is not clear
how one could produce such "more uniform" samples for lines in the plane
or for hyperplanes in $\mathbb{R}^d$.
The strategy: a two-level decomposition. Instead of trying to select
better samples we construct a $\frac{1}{r}$-cutting for H in two stages. First we take a
sample S with probability $p = \frac{r}{n}$ and triangulate the arrangement, obtaining
a collection T of triangles. (The expected number of triangles is $O(r^2)$, as we
will verify later.) Typically, T is not yet a $\frac{1}{r}$-cutting. Let $I(\Delta)$ denote the set
of lines of H intersecting the interior of a triangle $\Delta \in T$ and let $n_\Delta = |I(\Delta)|$.
We define the excess of a triangle $\Delta \in T$ as $t_\Delta = n_\Delta \cdot \frac{r}{n}$.
If $t_\Delta \le 1$, then $n_\Delta \le \frac{n}{r}$ and $\Delta$ is a good citizen: It can be included into
the final $\frac{1}{r}$-cutting as is. On the other hand, if $t_\Delta > 1$, then $\Delta$ needs further
treatment: We subdivide it into a collection of finer triangles such that each
of them is intersected by at most $\frac{n}{r}$ lines of H. We do it in a seemingly
naive way: We consider the whole arrangement of $I(\Delta)$, temporarily ignoring
$\Delta$, and we construct a $\frac{1}{t_\Delta}$-cutting for it. Then we intersect the triangles of
this $\frac{1}{t_\Delta}$-cutting with $\Delta$, which can produce triangles but also quadrilaterals,
pentagons, and hexagons. Each of these convex polygons is further subdivided
into triangles, as is illustrated below:
The key insight for the proof of the cutting lemma is that although we
typically do have triangles $\Delta \in T$ with excess as large as about log r, they
are very few. More precisely, we show that under suitable assumptions, the
expected number of triangles in T with excess t or larger decreases exponentially
as a function of t. This will take care of both estimating (6.3) by $O(r^2)$
and establishing Lemma 6.5.1.
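The two-level strategy can be illustrated in the 1-dimensional situation used earlier, where H is a set of points on a line and the "triangles" are the intervals between consecutive sample points. This is only a sketch of the idea under that simplification, not the planar construction: the function names are hypothetical, and in one dimension the second level is trivial (cut an overfull interval at every $(\lfloor n/r \rfloor + 1)$-st point).

```python
import random

def refine(cell, cap):
    """Second level: cut at every (cap+1)-st point, so at most cap points lie between cuts."""
    return cell[cap::cap + 1]

def two_level_cutting_1d(points, r, rng):
    """Breakpoints of a 1/r-cutting for points on a line, plus first-level excesses."""
    pts = sorted(points)
    n = len(pts)
    cap = int(n / r)          # an open interval may contain at most n/r points
    p = r / n                 # first-level sampling probability
    breakpoints, excesses, cell = [], [], []
    for x in pts:
        if rng.random() < p:  # x is selected into the sample S
            excesses.append(len(cell) * r / n)     # excess of the interval just closed
            breakpoints.extend(refine(cell, cap))  # no extra cuts if the excess is <= 1
            breakpoints.append(x)
            cell = []
        else:
            cell.append(x)
    excesses.append(len(cell) * r / n)
    breakpoints.extend(refine(cell, cap))
    return breakpoints, excesses

def max_points_per_interval(points, breakpoints):
    """Largest number of points strictly between two consecutive breakpoints."""
    cuts = set(breakpoints)
    worst = run = 0
    for x in sorted(points):
        run = 0 if x in cuts else run + 1
        worst = max(worst, run)
    return worst
```

Printing the sorted list of excesses for a few random runs shows the phenomenon the proof exploits: intervals with large excess do occur, but they are few.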
Good and bad triangulations. Our collection T of triangles is obtained
by triangulating the cells in the arrangement of the random sample S. Now
is the time to specify how exactly the cells are triangulated, since not every
triangulation works. To see this, consider a set H of n lines, each of them
touching the unit circle, and let S be a random sample, again for simplicity
with probability $p = \frac12$. We have learned that such a sample is very likely to
leave a gap of about log n unselected lines (as we go along the unit circle).
If we maliciously triangulate the central cell in the arrangement of S by the
diagonals from the vertex near such a large gap,
all these about $\frac{n}{2}$ triangles have excess about log n; this is way too large for
our purposes.
The triangulation thus cannot be quite arbitrary. For the subsequent
proof, it has to satisfy simple axioms. In the planar case, it is actually tech-
nically easier not to triangulate but to construct the vertical decomposition
of the arrangement of S. We erect vertical segments upwards and downwards
from each vertex in the arrangement of S and extend them until they meet
another line (or all the way to infinity):
So far we have been speaking of triangles, and now we have trapezoids, but
the difference is immaterial, since we can always split each trapezoid into
two triangles if we wish. Let T(S) denote the set of (generalized) trapezoids
in the vertical decomposition of S. As before, $I(\Delta)$ is the set of lines of H
intersecting the interior of a trapezoid $\Delta$.
6.5.2 Proposition (Trapezoids with large excess are rare). Let H be
a fixed set of n lines in general position, let $p = \frac{r}{n}$, where $1 \le r \le \frac{n}{2}$, let S be
a random sample drawn from H by independent Bernoulli trials with success
probability p, and let $t \ge 0$ be a real parameter. Let $T(S)_{\ge t}$ denote the set
of trapezoids $\Delta \in T(S)$ with excess at least t, i.e., with $|I(\Delta)| \ge t\frac{n}{r}$. Then
the expected number of trapezoids in $T(S)_{\ge t}$ is bounded as follows:
The set $D(\Delta)$ is called the defining set of $\Delta$. Note that the same defining set
can belong to several trapezoids.
Now we list the properties required for the proof; some of them are obvious
or have already been noted.
(C0) We have $|D(\Delta)| \le 4$ for all $\Delta \in \mathrm{Reg}$. Moreover, any set $S_0 \subseteq H$ is the
defining set for at most a constant number of $\Delta \in \mathrm{Reg}$ (certainly no more
than the maximum of $|T(S_0)|$ for $|S_0| \le 4$).
(C1) For any $\Delta \in T(S)$, we have $D(\Delta) \subseteq S$ (the defining set must be present)
and $S \cap I(\Delta) = \emptyset$ (no intersecting line may be present).
(C2) For any $\Delta \in \mathrm{Reg}$ and any $S \subseteq H$ such that $D(\Delta) \subseteq S$ and $I(\Delta) \cap S = \emptyset$,
we have $\Delta \in T(S)$.
(C3) For every $S \subseteq H$, we have $|T(S)| = O(|S|^2 + 1)$. To see this, think of
adding the vertical segments to the arrangement of S one by one. Each
of them splits an existing region in two.
The most interesting condition is (C2), which says that the vertical
decomposition is defined "locally." It implies, in particular, that $\Delta$ is one of the
trapezoids in the vertical decomposition of its defining set. More generally, it
says that $\Delta \in \mathrm{Reg}$ is present in $T(S)$ whenever it is not excluded for simple
local reasons (which can be checked by looking only at $\Delta$). Checking (C2)
in our situation is easy, and we leave it to the reader. Also note that it is
(C2) that is generally violated for the mischievous triangulation considered
earlier.
Proof of Proposition 6.5.2. First we prove that if $S \subseteq H$ is a random
sample drawn with probability $p = \frac{r}{n}$, $0 < r < n$, then
$$\mathrm{E}[|T(S)|] = O(r^2 + 1). \qquad (6.4)$$
This takes care of the case $t \le 1$ in the proposition. By (C3), we have $|T(S)| =
O(|S|^2 + 1)$ for every fixed S, and so it suffices to show that $\mathrm{E}[|S|^2] =
O(r^2 + 1)$. Now, |S| is the sum of independent random variables, each of
them attaining value 1 with probability p and value 0 with probability 1 - p,
and it is easy to check that $\mathrm{E}[|S|^2] \le r^2 + r$ (Exercise 2(a)).
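The bound on the second moment of |S| can be checked numerically. The sketch below sums over the binomial distribution of |S| directly; the function name and the test values of n and r are illustrative assumptions. For a binomial variable the value is $np(1-p) + (np)^2$, which with $np = r$ is at most $r^2 + r$.

```python
from math import comb

def second_moment_binomial(n, p):
    """E[|S|^2] for |S| ~ Binomial(n, p), summed directly over the distribution."""
    return sum(k * k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

if __name__ == "__main__":
    n, r = 50, 5
    p = r / n
    value = second_moment_binomial(n, p)
    # Compare with the closed form n p (1 - p) + (n p)^2 and with the bound r^2 + r.
    print(value, n * p * (1 - p) + (n * p) ** 2, r * r + r)
```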
Next, we assume $t \ge 1$. Let $S \subseteq H$ be a random sample drawn with
probability p. We observe that the conditions (C1) and (C2) allow us to
express the probability $p(\Delta)$ that a certain trapezoid $\Delta \in \mathrm{Reg}$ appears in the
vertical decomposition $T(S)$: Since $\Delta$ appears if and only if all lines of $D(\Delta)$
are selected into S and none of $I(\Delta)$ is selected, we have
$$p(\Delta) = p^{|D(\Delta)|}(1-p)^{|I(\Delta)|}.$$
(An analogous formula appeared in the proof of the planar Clarkson's theorem
on levels, and one can say that the technique of that proof is developed
one step further in the present proof.) If we write $\mathrm{Reg}_{\ge t} = \{\Delta \in
\mathrm{Reg} : |I(\Delta)| \ge t\frac{n}{r}\}$ for the set of all potential trapezoids with excess at least
t, the expected number of trapezoids in $T(S)_{\ge t}$ can be written as
$$\mathrm{E}[|T(S)_{\ge t}|] = \sum_{\Delta \in \mathrm{Reg}_{\ge t}} p(\Delta). \qquad (6.5)$$
It seems difficult to estimate this sum directly; the trick is to compare it with
a similar sum obtained for the expected number of trapezoids for another
sample.
We define another probability $\tilde p = \frac{p}{t}$, and we let $\tilde S$ be a sample drawn
from H by Bernoulli trials with success probability $\tilde p$. On the one hand,
we have $\mathrm{E}[|T(\tilde S)|] = O(r^2/t^2 + 1)$ by (6.4). On the other hand, setting
$\tilde p(\Delta) = \tilde p^{|D(\Delta)|}(1 - \tilde p)^{|I(\Delta)|}$ we have, in analogy to (6.5),
where
$$R = \min\left\{ \frac{\tilde p(\Delta)}{p(\Delta)} : \Delta \in \mathrm{Reg}_{\ge t} \right\}.$$
Now R can be bounded from below. For every $\Delta \in \mathrm{Reg}_{\ge t}$, we have $|I(\Delta)| \ge
t\frac{n}{r}$ and $|D(\Delta)| \le 4$, and so
for a sufficiently large constant C (the proposition assumes $r \ge 1$).
Proposition 6.5.2 is proved. □
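The comparison between the samples $S$ and $\tilde S$ can be spelled out as follows. This is a sketch under stated assumptions, not a quotation of the text's displays: it assumes that the target is a bound of the form $C \cdot 2^{-t} r^2$, and it uses the inequalities $1 - x \ge e^{-2x}$ for $x \in [0, \frac12]$ and $1 - x \le e^{-x}$, together with $pn/r = 1$ and $t \ge 1$.

```latex
\begin{align*}
\mathrm{E}\bigl[|T(\tilde S)|\bigr]
  &\ge \sum_{\Delta \in \mathrm{Reg}_{\ge t}} \tilde p(\Delta)
   \ge R \sum_{\Delta \in \mathrm{Reg}_{\ge t}} p(\Delta)
   = R \cdot \mathrm{E}\bigl[|T(S)_{\ge t}|\bigr], \\
\frac{\tilde p(\Delta)}{p(\Delta)}
  &= t^{-|D(\Delta)|} \left( \frac{1-\tilde p}{1-p} \right)^{|I(\Delta)|}
   \ge t^{-4} \left( \frac{1 - p/t}{1-p} \right)^{tn/r}
   \ge t^{-4} \bigl( e^{-2p/t}\, e^{p} \bigr)^{tn/r}
   = t^{-4} e^{t-2}, \\
\mathrm{E}\bigl[|T(S)_{\ge t}|\bigr]
  &\le R^{-1}\, \mathrm{E}\bigl[|T(\tilde S)|\bigr]
   \le t^{4} e^{2-t} \cdot O\!\left( \frac{r^2}{t^2} + 1 \right)
   \le C \cdot 2^{-t} r^2 .
\end{align*}
```

The second line uses that the ratio $(1-\tilde p)/(1-p)$ is at least 1, so replacing $|I(\Delta)|$ by its lower bound $tn/r$ can only decrease the expression; the last step uses $r \ge 1$ and the fact that $t^4 (2/e)^t$ is bounded.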
The only new part of the proof is the construction of a suitable
triangulation scheme that plays the role of $T(S)$. A vertical decomposition does
not work. More precisely, it is not known whether the vertical decomposition
of an arrangement of n hyperplanes in $\mathbb{R}^d$ always has at most $O(n^d)$ cells
(prisms); this would be needed as the analogue of condition (C3). Instead
one can use the bottom-vertex triangulation, which we define next.
First we specify the bottom-vertex triangulation of a k-dimensional convex
polytope $P \subset \mathbb{R}^d$, $1 \le k \le d$, by induction on k. For k = 1, P is a line
segment, and the triangulation consists of P itself. For k > 1, we let v be the
vertex of P with the smallest last coordinate (the "bottom vertex"); ties can
be broken by lexicographic ordering of the coordinate vectors. We triangulate
all proper faces of P inductively, and we add the simplices obtained by
erecting the cone with apex v over all simplices in the triangulations of the
faces not containing v.
[Figure: bottom-vertex triangulation; panel label d = 3.]
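For k = 2 the inductive definition reduces to a fan of triangles from the bottom vertex of a convex polygon: the proper faces are the edges, each edge is its own triangulation, and we cone from v over the edges not containing v. The following is a minimal sketch of that planar case only; the function names and the tie-breaking key are illustrative assumptions.

```python
def bottom_vertex_triangulation(polygon):
    """Triangulate a convex polygon given as a list of (x, y) vertices in cyclic order."""
    m = len(polygon)
    # Bottom vertex: smallest last coordinate; ties broken by the first coordinate.
    b = min(range(m), key=lambda i: (polygon[i][1], polygon[i][0]))
    v = polygon[b]
    triangles = []
    # The edges not containing v join consecutive vertices b+1, ..., b+m-1.
    for j in range(1, m - 1):
        p = polygon[(b + j) % m]
        q = polygon[(b + j + 1) % m]
        triangles.append((v, p, q))
    return triangles

def triangle_area(a, b, c):
    """Unsigned area of the triangle abc."""
    return abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0
```

A convex polygon with m vertices yields m - 2 triangles, all sharing the bottom vertex.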
The set $I(\Delta)$ consists of all hyperplanes intersecting the interior of a simplex
$\Delta$, and $D(\Delta)$ consists of all the hyperplanes incident to at least one vertex
of $\Delta$. We again need to assume that our hyperplanes are in general position.
Then, obviously, $|D(\Delta)| \le d(d+1)$, and a more careful argument shows
that $|D(\Delta)| \le \frac{d(d+3)}{2}$. The important thing is that an analogue of (C0)
holds, namely, that both $|D(\Delta)|$ and the number of $\Delta$ with a given $D(\Delta)$ are
bounded by constants.
The condition (C1) holds trivially. The "locality" condition (C2) does
need some work, although it is not too difficult, and we refer to Chazelle and
Friedman [CF90] for a detailed argument.
With (C0)-(C3) in place, the whole proof proceeds exactly as in the planar
case. To get the analogue of (6.4), namely $\mathrm{E}[|T(S)|] = O(r^d + 1)$, we need
the fact that $\mathrm{E}[|S|^d] = O(r^d)$ (this is what we avoided in the proof of the
higher-dimensional Clarkson's theorem on levels by passing to another way
of sampling); see Exercise 2(b) or 3.
Further generalizations. An analogue of Proposition 6.5.2 can be derived
from conditions (C0)-(C3) in a general abstract framework. It provides
optimal $\frac{1}{r}$-cuttings not only for arrangements of hyperplanes but also in other
situations, whenever one can define a suitable decomposition scheme satisfying
(C0)-(C3) and bound the maximum number of cells in the decomposition
(the latter is a challenging open problem for arrangements of bounded-degree
algebraic surfaces). The significance of Proposition 6.5.2 reaches beyond the
construction of cuttings; its variations have been used extensively, mainly in
the analysis of geometric algorithms. We are going to encounter a combina-
torial application in Chapter 11.
This weaker axiom was first used instead of (C2) by Chazelle, Edelsbrunner,
Guibas, Sharir, and Snoeyink [CEG+93]. For a proof of a
counterpart of Proposition 6.5.2 under (C2') see Agarwal, Matousek,
and Schwarzkopf [AMS98].
Yet another proof of the cutting lemma in arbitrary dimension was
invented by Chazelle [Cha93a]. An outline of the argument can also
be found in Chazelle's book [Cha00c] or in the chapter by Matousek
in [SU00].
Both the proofs of the higher-dimensional cutting lemma depend
crucially on the fact that the arrangement of n hyperplanes in $\mathbb{R}^d$, d
fixed, can be triangulated using $O(n^d)$ simplices. As was explained in
Section 6.2, the arrangement of n bounded-degree algebraic surfaces
in $\mathbb{R}^d$ has $O(n^d)$ faces in total, but the faces can be arbitrarily complicated.
A challenging open problem is whether each face can be further
decomposed into "simple" pieces (each of them defined by a constant-bounded
number of bounded-degree algebraic inequalities) such that
the total number of pieces for the whole arrangement is $O(n^d)$ or not
much larger. This is easy for d = 2 (the vertical decomposition will
do), but dimension 3 is already quite challenging. Chazelle, Edelsbrunner,
Guibas, and Sharir [CEGS89] found a general argument that
provides an $O(n^{2d-2})$ bound in dimension d using a suitable vertical
decomposition. By proving a near-optimal bound in the 3-dimensional
case and using it as a basis of the induction, they obtained the bound
of $O(n^{2d-3}\beta(n))$, where $\beta$ is a very slowly growing function (much
smaller than $\log^* n$). Recently Koltun [Kol01] established a near-tight
6.5 The Cutting Lemma Revisited 163
Exercises
1. Estimate the largest k = k(n) such that in a row of n tosses of a fair coin
we obtain k consecutive tails with probability at least $\frac12$. In particular,
using the trick with blocks in the text, check that for $k = \lfloor \frac12 \log_2 n \rfloor$, the
probability of not getting all tails in any of the blocks is exponentially
small (as a function of n).
2. Let $X = X_1 + X_2 + \cdots + X_n$, where the $X_i$ are independent random
variables, each attaining the value 1 with probability p and the value 0
with probability 1 - p.
(a) Calculate $\mathrm{E}[X^2]$.
(b) Prove that for every integer $d \ge 1$ there exists $c_d$ such that $\mathrm{E}[X^d] \le
(np + c_d)^d$. (You can use a Chernoff-type inequality, or prove by induction
that $\mathrm{E}[(X + a)^d] \le (np + d + a)^d$ for all nonnegative integers n, d, and
a.)
(c) Use (b) to prove that $\mathrm{E}[X^\alpha] \le (np + c_\alpha)^\alpha$ also holds for nonintegral
$\alpha \ge 1$.
3. Let $X = X_1 + X_2 + \cdots + X_n$ be as in the previous exercise. Show that
$\mathrm{E}\bigl[\binom{X}{d}\bigr] = p^d \binom{n}{d}$ (where $d \ge 0$ is an integer) and conclude that $\mathrm{E}[X^d] \le
c_d (np)^d$ for $np \ge d$ and a suitable $c_d > 0$.
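The identity in Exercise 3 can be checked exactly for small parameters before attempting a proof. The sketch below is only a numerical sanity check with rational arithmetic; the function name and the chosen values of n, p, and d are assumptions made for illustration.

```python
from fractions import Fraction
from math import comb

def expected_binomial_coefficient(n, p, d):
    """E[C(X, d)] for X ~ Binomial(n, p), summed exactly over the distribution of X."""
    p = Fraction(p)
    return sum(comb(k, d) * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))

if __name__ == "__main__":
    n, p = 12, Fraction(1, 3)
    for d in range(6):
        lhs = expected_binomial_coefficient(n, p, d)
        rhs = p**d * comb(n, d)
        print(d, lhs, rhs, lhs == rhs)
```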
4. Let P be a d-dimensional simple convex polytope. Prove that the bottom-vertex
triangulation of P has at most $C_d f_0(P)$ simplices, where $C_d$ depends
only on d and $f_0(P)$ denotes the number of vertices of P.
7
Lower Envelopes
1 2 3 1 2
s + 2 letters a and b
where $a \ne b$. In other words, there are no indices $i_1 < i_2 < i_3 < \cdots < i_{s+2}$
with $a_{i_1} \ne a_{i_2}$, $a_{i_1} = a_{i_3} = a_{i_5} = \cdots$, and $a_{i_2} = a_{i_4} = a_{i_6} = \cdots$.
7.1 Segments and Davenport-Schinzel Sequences 167
Only (iii) needs a little thought: It suffices to note that between an occurrence
of a curve a and an occurrence of a curve b on the lower envelope, a and b
have to intersect.
Any finite sequence satisfying (i)-(iii) is called a Davenport-Schinzel se-
quence of order s over the symbols 1,2, ... , n. It is not important that the
terms of the sequence are the numbers 1,2, ... , n; often it is convenient to
use some other set of n distinct symbols.
Let us remark that every Davenport-Schinzel sequence of order s over n
symbols corresponds to the lower envelope of a suitable set of n curves with at
most s intersections for each pair of curves (Exercise 1). On the other hand,
very little is known about the realizability of Davenport-Schinzel sequences
by graphs of polynomials of degree s, say.
We will mostly consider Davenport-Schinzel sequences of order 3. This is
the simplest nontrivial case and also the one closely related to lower envelopes
of segments. Every two segments intersect at most once, and so it might
seem that their lower envelope gives rise to a Davenport-Schinzel sequence
of order 1, but this is not the case! The segments are graphs of partially defined
functions, while the discussion above concerns graphs of functions defined on
all of R. We can convert each segment into a graph of an everywhere-defined
function by appending very steep rays to both endpoints:
All the left rays are parallel, and all the right ones are parallel. Then every two
of these curves have at most 3 intersections, and so if the considered segments
are numbered 1 through n and we write the sequence of their numbers along
the lower envelope, we get a Davenport-Schinzel sequence of order 3 (no
ababa).
Let $\lambda_s(n)$ denote the maximum possible length of a Davenport-Schinzel
sequence of order s over n symbols. Some work is needed to see that $\lambda_s(n)$ is
finite for all s and n; the reader is invited to try this. The bound $\lambda_1(n) = n$ is
trivial, and $\lambda_2(n) = 2n-1$ is a simple exercise. Determining the asymptotics
of $\lambda_3(n)$ is a hard problem; it was posed in 1965 and solved in the mid-1980s.
We will describe the solution later, but here we start more modestly: with a
reasonable upper bound on $\lambda_3(n)$.
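For very small n the values $\lambda_s(n)$ can be found by exhaustive search, which is a convenient way to get a feeling for the definition. The sketch below assumes the standard conditions: terms are symbols from 1, ..., n, no two adjacent terms coincide, and no two distinct symbols alternate in a subsequence with s+2 letters. The function names are hypothetical, and the search is exponential, so it is meant only for tiny inputs.

```python
from itertools import permutations

def longest_alternation(seq, a, b):
    """Length of the longest subsequence of the form a b a b ... in seq (greedy scan)."""
    expected, other, length = a, b, 0
    for x in seq:
        if x == expected:
            length += 1
            expected, other = other, expected
    return length

def is_davenport_schinzel(seq, s):
    """True if seq has no equal adjacent terms and no alternation with s + 2 letters."""
    if any(x == y for x, y in zip(seq, seq[1:])):
        return False
    return all(longest_alternation(seq, a, b) < s + 2
               for a, b in permutations(set(seq), 2))

def max_length(n, s):
    """lambda_s(n) by depth-first search over all valid sequences (tiny n only)."""
    best = 0

    def extend(seq):
        nonlocal best
        best = max(best, len(seq))
        for x in range(1, n + 1):
            candidate = seq + [x]
            if is_davenport_schinzel(candidate, s):
                extend(candidate)

    extend([])
    return best
```

The search terminates because every prefix of a valid sequence is valid and $\lambda_s(n)$ is finite; for example, `max_length(n, 2)` can be compared with $2n-1$ for n up to 4.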
7.1.1 Proposition. We have $\sigma(n) \le \lambda_3(n) \le 2n \ln n + 3n$.
Indeed, if some a which is neither the first nor the last a in w were surrounded
by some b from both sides, we would have the situation ... a ... bab ... a ...
with the forbidden pattern ababa. So by deleting all the a and at most 2
more symbols, we obtain a Davenport-Schinzel sequence of order 3 over n-1
symbols.
We arrive at the recurrence
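One way to write out the recurrence and its solution is sketched below. It assumes that the deleted symbol a is chosen as one occurring at most $\lambda_3(n)/n$ times in a longest sequence over n symbols (such a symbol exists by the pigeonhole principle), and it uses $\lambda_3(1) = 1$; the intermediate steps are a reconstruction for illustration.

```latex
\begin{align*}
\lambda_3(n-1) &\ge \lambda_3(n) - \frac{\lambda_3(n)}{n} - 2
  && \text{(delete all } a \text{ and at most 2 more symbols)} \\
\frac{\lambda_3(n)}{n} &\le \frac{\lambda_3(n-1)}{n-1} + \frac{2}{n-1}
  && \text{(rearrange and divide by } n-1\text{)} \\
\frac{\lambda_3(n)}{n} &\le \lambda_3(1) + 2 \sum_{i=1}^{n-1} \frac{1}{i}
   \le 1 + 2\,(1 + \ln n)
  && \text{(iterate down to } n = 1\text{)} \\
\lambda_3(n) &\le 2n \ln n + 3n .
\end{align*}
```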
Exercises
1. Let w be a Davenport-Schinzel sequence of order s over the symbols
1, 2, ..., n. Construct a family of planar curves $h_1, h_2, \ldots, h_n$, each of
them intersecting every vertical line exactly once and each two intersecting
in at most s points, such that the sequence of the numbers of the
curves along the lower envelope is exactly w.
2. Prove that $\lambda_2(n) = 2n-1$ (the forbidden pattern is abab).
3. Prove that for every n and s, $\lambda_s(n) \le 1 + (s+1)\binom{n}{2}$.
4. Show that the lower envelope of n rays in the plane has O(n) complexity.
5. (Planar zone theorem via Davenport-Schinzel sequences) Prove the zone
theorem (Theorem 6.4.1) for d = 2 using the fact that $\lambda_2(n) = O(n)$.
Consider only the part above the line g, and assign one symbol to each
side of each line.
6. Let $g_1, g_2, \ldots, g_m \subset \mathbb{R}^2$ be graphs of piecewise linear functions $\mathbb{R} \to
\mathbb{R}$ that together consist of n segments and rays. Prove that the lower
envelope of $g_1, g_2, \ldots, g_m$ has complexity $O(\frac{n}{m} \lambda_3(2m))$; in particular, if
m = O(1), then the complexity is linear.
7.2 Segments; Superlinear Complexity of the Lower Envelope 169
All the segments of a fan have a common left endpoint and positive slopes, and
the length of the segments increases with the slope. Other than forming the
fans, the segments are in general position in an obvious sense. For example,
no endpoint of a segment lies inside another segment, the endpoints do not
coincide unless the segments are in a common fan, and so on.
Let $f_k(m)$ denote the number of fans forming $S_k(m)$; we have $n_k(m) =
m \cdot f_k(m)$.
First we describe the construction of $S_k(m)$ roughly, and later we make
precise some finer aspects. As was already mentioned, we proceed by induction
on k and m. One of the invariants of the construction is that the left
endpoints of all the fans of $S_k(m)$ always show up on the lower envelope.
First we specify the boundary cases with k = 1 or m = 1. For k = 1,
$S_1(m)$ is simply a single fan with m segments. For m = 1, $S_k(1)$ is obtained
from $S_{k-1}(2)$ by the following transformation of each fan (each fan has 2
segments):
The lower segment in each fan is translated by the same tiny amount to the
left.
Now we describe the construction of $S_k(m)$ for general $k, m \ge 2$. First we
construct $S_k(m-1)$ inductively. We shrink this $S_k(m-1)$ both vertically and
horizontally by a suitable affine transform; the vertical shrinking is much
more intensive than the horizontal one, so that all segments become very
short and almost horizontal. Let $S'$ be the transformed $S_k(m-1)$. We will use
many translated copies of $S'$ as "microscopic" ingredients in the construction
of $S_k(m)$.
The "master plan" of the construction is obtained from $S_{k-1}(M)$, where
$M = f_k(m-1)$ is the number of fans in $S'$. Namely, we first shrink $S_{k-1}(M)$
vertically so that all segments become nearly horizontal, and then we apply
the affine transform $(x, y) \mapsto (x, x + y)$ so that the slopes of all the segments
are just a little over 1. Let $S^*$ denote the resulting set.
For each fan F in the master construction S*, we make a copy S'_F of
the microscopic construction S' and place it so that its leftmost endpoint
coincides with the left endpoint of F. Let the segments of F be s1, ..., sM,
numbered by increasing slopes, and let ℓ1, ..., ℓM be the left endpoints of
the fans in S'_F, numbered from left to right. The fan F is gigantic compared
to S'_F. Now we take F apart: We translate each si so that its left endpoint
goes to ℓi. The following drawing shows this schematically, since we have no
chance to make a realistic drawing of Sk(m-1). Only a very small part of F
near its left endpoint is shown.
[Figure: schematic drawing of the translated segments s1, s2, s3, s4 of F near the former left endpoint of F.]
The new vertices lie on the right of S'_F but, in the scale of the master
construction S*, very close to the former left endpoint of F, and so they indeed
appear on the lower envelope.
This is where we need to make the whole construction more precise,
namely, to say more about the structure of the fans in Sk(m). Let us call
a fan r-escalating if the ratio of the slopes of every two successive segments
in the fan is at least r. It is not difficult to check that for any given r > 1,
172 Chapter 7: Lower Envelopes
the construction of Sk(m) described above can be arranged so that all fans
in the resulting set are r-escalating.
Then, in order to guarantee that the M−1 new vertices per fan arise in
the general inductive step described above, we make sure that the fans in the
master construction S* are affine transforms of r-escalating fans for a suitable
very large r. More precisely, let Q be a given number and let r = r(k, Q) be
sufficiently large and δ = δ(k, Q) > 0 sufficiently small. Let F arise from an
r-escalating fan by the affine transformation described above (which makes
all slopes a little bigger than 1), and assume that the shortest segment has
length 1, say. Suppose that we translate the left endpoint of si, the segment
with the ith smallest slope in F, by δ1 + δ2 + ... + δi almost horizontally to
the right, where δ ≤ δi ≤ Qδ. Then it is not difficult to see, or calculate, that
the lower envelope of the translated segments of F looks combinatorially like
that in the last picture and has M−1 new vertices. The reader who is not
satisfied with this informal argument can find real and detailed calculations
in the book [SA95].
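We want to prove that the complexity of the lower envelope of Sk(m) is
at least (1/2)km times the number of fans; in our notation,
\[ e_k(m) \ge \tfrac{1}{2}\, km \cdot f_k(m). \]
For the envelope complexity we get a contribution of ek-1(M) from S*,
ek(m-1) from each copy of S', and M−1 new vertices for each copy of S'.
There are fk-1(M) copies of S', one for each fan of S*.
Putting this together and using the inductive assumption to eliminate the
function e, we have
\begin{align*}
e_k(m) &\ge e_{k-1}(M) + f_{k-1}(M)\cdot\bigl[e_k(m-1) + M - 1\bigr] \\
&\ge f_{k-1}(M)\cdot\bigl[\tfrac{1}{2}(k-1)M + \tfrac{1}{2}k(m-1)M + M - 1\bigr] \\
&\ge f_{k-1}(M)\cdot\bigl[\tfrac{1}{2}kM + \tfrac{1}{2}k(m-1)M\bigr] \\
&= \tfrac{1}{2}km\cdot M\cdot f_{k-1}(M) = \tfrac{1}{2}km\cdot f_k(m).
\end{align*}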
Note how the properties of the construction Sk(m) contradict the intuition
gained from small pictures: Most of the segments appear many times on
7.3 More on Davenport-Schinzel Sequences 173
the lower envelope, and between two successive segment endpoints on the
envelope there is typically a concave arc with quite a large number of vertices.
Exercises
1. Construct Davenport-Schinzel sequences of order 3 of superlinear length
directly. That is, rephrase the construction explained in this section in
terms of Davenport-Schinzel sequences instead of segments.
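\[ A_1(n) = 2n, \]
\[ A_k(n) = A_{k-1} \circ A_{k-1} \circ \cdots \circ A_{k-1}(1) \quad \text{($n$-fold composition)}, \quad k = 2, 3, \ldots. \]
Only the first few of these functions can be described in usual terms: We have
$A_2(n) = 2^n$, and $A_3(n)$ is an exponential tower of $n$ twos. The Ackermann function is
\[ A(n) = A_n(n). \]
And α is the inverse function to A:
\[ \alpha(n) = \min\{k \ge 1 : A(k) \ge n\}. \]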
1 Several versions of the Ackermann function can be found in the literature, dif-
fering in minor details but with similar properties and orders of magnitude.
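To get a feeling for how slowly the inverse grows, the hierarchy can be evaluated by machine for tiny arguments. The following is only a sketch: the function names and the cap parameter are our own devices (the cap stops the computation as soon as a value is known to exceed it), and the inverse is taken as the least k ≥ 1 with A(k) ≥ n.

```python
def ackermann_level(k, n, cap):
    """Return min(A_k(n), cap), where A_1(n) = 2n and A_k(n) is the
    n-fold composition of A_{k-1} applied to 1.

    The cap is an illustrative device: the true values explode, so we
    stop as soon as the running value is known to reach the cap.
    """
    if k == 1:
        return min(2 * n, cap)
    value = 1
    for _ in range(n):
        value = ackermann_level(k - 1, value, cap)
        if value >= cap:
            # Every A_j is increasing with A_j(x) > x, so the true
            # value can only stay above the cap from here on.
            return cap
    return value


def alpha(n):
    """Least k >= 1 with A(k) = A_k(k) >= n (inverse Ackermann)."""
    k = 1
    while ackermann_level(k, k, cap=n) < n:
        k += 1
    return k


if __name__ == "__main__":
    print([ackermann_level(2, n, 10**9) for n in range(1, 6)])
    print([alpha(n) for n in (1, 2, 3, 4, 5, 16, 17, 10**6)])
```

With this version of A, the diagonal values are A(1) = 2, A(2) = 4, A(3) = 16, and A(4) is already a tower of 65536 twos, so alpha(n) is at most 4 for every n one could ever write down explicitly.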
and k pairwise parallel edges, where two edges are called parallel if
they do not cross and their four vertices are in convex position:
A graph with no two crossing edges is planar and thus has O(n) edges.
It seems to be generally believed that forbidding k pairwise crossing
edges forces O(n) edges for every fixed k. This has been proved
for k = 3 by Agarwal, Aronov, Pach, Pollack, and Sharir [AAP+97],
and for all k ≥ 4, the best known bound is O(n log n) due to Valtr
(see [Val99a]). For k forbidden pairwise parallel edges, he derived an
O(n) bound for every fixed k using generalized Davenport-Schinzel
sequences, and the O(n log n) bound for k pairwise crossing edges follows
by a neat simple reduction. In this connection, let us mention
a nice open question: What is the smallest n = n(k) such that any
straight-edge drawing of the complete graph Kn always contains k
pairwise crossing edges? The best known bound is O(k^2) [AEG+94],
but perhaps the truth is O(k) or close to it.
The second application of generalized Davenport-Schinzel sequences
concerns a conjecture of Stanley and Wilf. Let σ be a fixed permutation
of {1, 2, ..., k}. We say that a permutation π of {1, 2, ..., n}
contains σ if there are indices i1 < i2 < ... < ik such that σ(u) < σ(v)
if and only if π(iu) < π(iv), 1 ≤ u < v ≤ k. Let N(σ, n) denote
the number of permutations of {1, 2, ..., n} that do not contain σ.
The Stanley-Wilf conjecture states that for every k and σ
there exists C such that N(σ, n) ≤ C^n for all n. Using generalized
Davenport-Schinzel sequences, Alon and Friedgut [AF00] proved that
log N(σ, n) ≤ nβ(n) for every fixed σ, where β(n) denotes a very
slowly growing function, and established the Stanley-Wilf conjecture
for a wide class of σ (previously, far fewer cases had been known).
Klazar [Kla00] observed that the Stanley-Wilf conjecture is implied by
a conjecture of Füredi and Hajnal [FH92] about the maximum number
of 1's in an n×n matrix of 0's and 1's that does not contain a k×k
submatrix having 1's in positions specified by a given fixed k×k
permutation matrix. Füredi and Hajnal conjectured that at most O(n)
1's are possible. The analogous questions for other types of forbidden
patterns of 1's in 0/1 matrices are also very interesting and very far
from being understood; this is another direction of generalizing the
Davenport-Schinzel sequences.
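The notion of one permutation containing another can be tested by brute force for small n. The sketch below is ours (the function names are hypothetical, and permutations are represented as tuples); it simply tries all index sets and checks the order-isomorphism condition from the definition.

```python
from itertools import combinations, permutations


def contains(pi, sigma):
    """True if the permutation pi contains the pattern sigma, i.e., some
    subsequence of pi has the same relative order as sigma."""
    k = len(sigma)
    for idx in combinations(range(len(pi)), k):
        sub = [pi[i] for i in idx]
        if all((sigma[u] < sigma[v]) == (sub[u] < sub[v])
               for u in range(k) for v in range(u + 1, k)):
            return True
    return False


def count_avoiders(sigma, n):
    """N(sigma, n): number of permutations of {1, ..., n} avoiding sigma
    (exhaustive enumeration, feasible only for small n)."""
    return sum(1 for pi in permutations(range(1, n + 1))
               if not contains(pi, sigma))


if __name__ == "__main__":
    print([count_avoiders((1, 2, 3), n) for n in range(1, 6)])
```

For the pattern (1, 2, 3) the counts for n = 1, ..., 5 are 1, 2, 5, 14, 42, the Catalan numbers, so N(σ, n) ≤ 4^n in this particular case.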
Exercises
1. Let e be a cell in an arrangement of n segments in the plane (assume
general position if convenient).
(a) Number the segments 1 through n and write down the sequence of
the segment numbers along the boundary of e, starting from an arbi-
trarily chosen vertex of the boundary (decide what to do if the boundary
has several connected components!). Check that there is no ababab subsequence,
and hence that the combinatorial complexity of e is no more
than O(λ4(n)).
symbol of u_{j+1} (at the (i+1)st position). We claim that the ith position is
the last occurrence of a or the (i+1)st position is the first occurrence of b.
This will imply that we have at most 2n segments u_i, because each of the n
symbols has (at most) one first and one last occurrence.
Supposing that the claim is not valid, we find the forbidden subsequence
ababa. We have a ≺ b, for otherwise the (i+1)st position could be appended
to u_j, contradicting the maximality. The b at position i+1 is not the first b,
and so there is some b before the ith position. There must be another a even
before that b, for otherwise we would have b ≺ a. Finally, there is an a after
the position i+1, and altogether we have the desired ababa. □
7.4 Towards the Tight Upper Bound for Segments 181
How to prove good bounds from the recurrence. The recurrence just
proved can be used to show that ψ(m, n) = O((m+n)α(m)), and Lemma 7.4.1
then yields the desired conclusion λ3(n) = O(nα(n)). We do not give the full
calculation; we only indicate how the recurrence can be used to prove better
and better bounds starting from the obvious estimate ψ(m, n) ≤ mn.
First we prove that ψ(m, n) ≤ 4m log_2 m + 6n, for m a power of 2. From
our recurrence with p = 2 and m1 = m2 = m/2, we obtain
Exercises
1. For integers s > t ≥ 1, let ψ_s^t(m, n) denote the maximum length of a
Davenport-Schinzel sequence of order s (no subsequence abab... with
s+2 letters) over n symbols that can be partitioned into m contiguous
segments, each of them a Davenport-Schinzel sequence of order t. In
particular, ψ_s(m, n) = ψ_s^1(m, n) is the maximum length of a Davenport-Schinzel
sequence of order s over n symbols that consists of m nonrepetitive
segments.
(a) Prove that λ_s(n) ≤ ψ_s^{s-1}(n, n).
(b) Prove that
(c) Let w be a sequence witnessing ψ_s(m, n) and let m = m1 + m2 +
... + mp be some partition of m. Divide w into p parts as in the proof of
Proposition 7.4.2, the kth part consisting of mk nonrepetitive segments.
With the terminology and notation of that proof, check that the local
symbols contribute at most m + Σ_{k=1}^{p} ψ_s(mk, nk) to the length of w, the
middle symbols at most m + ψ_s^{s-2}(p, n*), and the starting symbols no
more than m + ψ_{s-1}(m, n*).
(d) Prove by induction that ψ_s(m, n) ≤ C_s · (m + n) log^{s-2}(m+1) and
λ_s(n) ≤ C'_s n log^{s-2}(n+1), for all s ≥ 2 and suitable C_s and C'_s depending
only on s (set p = 2 in (c)).
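When experimenting with these exercises it helps to be able to test whether a given sequence is a Davenport-Schinzel sequence of order s. Here is a minimal sketch, assuming the usual convention that no two adjacent symbols are equal and that no two symbols alternate s+2 times; the function names are ours.

```python
from itertools import combinations


def max_alternation(seq, a, b):
    """Length of the longest alternating subsequence of seq that uses only
    the symbols a and b. This equals the number of runs in the restriction
    of seq to {a, b}."""
    runs, last = 0, None
    for x in seq:
        if x in (a, b) and x != last:
            runs += 1
            last = x
    return runs


def is_davenport_schinzel(seq, s):
    """True if seq has no two equal adjacent symbols and no alternating
    subsequence abab... with s+2 letters."""
    if any(x == y for x, y in zip(seq, seq[1:])):
        return False
    return all(max_alternation(seq, a, b) <= s + 1
               for a, b in combinations(set(seq), 2))


if __name__ == "__main__":
    print(is_davenport_schinzel("abaca", 2))   # order 2 forbids abab
    print(is_davenport_schinzel("ababa", 3))   # order 3 forbids ababa
```

The check runs in time quadratic in the number of symbols times the length of the sequence, which is fine for hand-sized examples but not meant for large inputs.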
k = 0, 1, .... Further, let fk(n) be the maximum of fk(H) over all sets H of
n triangles (in general position). So our goal is to estimate f0(n).
The first part of the proof of Proposition 7.5.1 employs a probabilistic
argument, very similar to the one in the proof of the zone theorem
(Theorem 6.4.1), to relate f0(H) and f1(H) to f0(n-1).
7.5.2 Lemma. For every set H of n triangles in general position, we have
\[ \frac{n-3}{n}\, f_0(H) \le f_0(n-1) - \frac{1}{n}\, f_1(H). \]
Proof. We pick one triangle h ∈ H at random and estimate E[f0(H \ {h})],
the expected number of vertices of the lower envelope after removing h. Every
vertex of the lower envelope of H is determined by 3 triangles, and so its
chances of surviving the removal of h are (n−3)/n. For a vertex v of level 1, the
probability of its appearing on the lower envelope is 1/n, since we must remove
the single triangle lying below v. Therefore,
\[ \mathbf{E}[f_0(H \setminus \{h\})] = \frac{n-3}{n}\, f_0(H) + \frac{1}{n}\, f_1(H). \]
The lemma follows by using f0(H \ {h}) ≤ f0(n-1). □
Before proceeding, let us inspect the inequality in the lemma just proved.
Let H be a set of n triangles with f0(H) = f0(n). If we ignored the term
(1/n) f1(H), we would obtain the recurrence ((n−3)/n) f0(n) ≤ f0(n-1). This yields
only the trivial estimate f0(n) = O(n^3), which is not surprising, since we
have used practically no geometric information about the triangles. In order
to do better, we now want to show that f1(H) is almost as big as f0(H),
in which case the term (1/n) f1(H) decreases the right-hand side significantly.
Namely, we prove that
\[ f_1(H) \ge f_0(H) - O(n^2\alpha(n)). \tag{7.1} \]
Substituting this into the inequality in Lemma 7.5.2, we arrive at
\[ \frac{n-2}{n}\, f_0(n) \le f_0(n-1) + O(n\alpha(n)). \]
We practiced this kind of recurrence in Section 6.4: The substitution φ(n) =
f0(n)/(n(n−1)) quickly yields f0(n) = O(n^2 α(n) log n). So in order to prove
Proposition 7.5.1, it remains to derive (7.1), and this is the geometric heart of the
proof.
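For readers who want to see the substitution carried out, here is a short worked calculation. It is a sketch under the assumption that (7.1) holds with error term O(n^2 α(n)), so that the recurrence has the form ((n−2)/n) f0(n) ≤ f0(n−1) + C n α(n); the constant C is our notation. Dividing by (n−1)(n−2) and telescoping gives

```latex
\begin{align*}
\varphi(n) = \frac{f_0(n)}{n(n-1)}
  &\le \frac{f_0(n-1)}{(n-1)(n-2)} + \frac{C\,n\,\alpha(n)}{(n-1)(n-2)}
   = \varphi(n-1) + O\!\left(\frac{\alpha(n)}{n}\right), \\
\varphi(n)
  &\le \varphi(3) + O\!\left(\sum_{j=4}^{n} \frac{\alpha(j)}{j}\right)
   = O(\alpha(n)\log n), \\
f_0(n) &= n(n-1)\,\varphi(n) = O\!\left(n^2\,\alpha(n)\log n\right).
\end{align*}
```

The middle step uses only that α is nondecreasing and that the harmonic sum is O(log n).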
Making someone pay for the level-0 vertices. We are going to relate
the number of level-0 vertices to the number of level-1 vertices by a local
charging scheme: From each vertex v of level 0, we walk around a little and
find suitable vertices of level 1 to pay for v, as follows.
The level-0 vertex v is incident to 6 edges, 3 of them having level 0 and 3
level 1:
7.5 Up to Higher Dimension: Triangles in Space 185
The picture shows only a small square piece from each of the triangles incident
to v. The lower envelope is on the bottom, and the edges of level 1 emanating
from v are marked by arrows. Let e be one of the level-1 edges going from v
away from the lower envelope. We follow it until one of the following events
occurs:
(i) We reach the intersection v' of e with a vertical wall π_a erected from an
edge a of some triangle. This v' pays 1 unit to v.
(ii) We reach the intersection v' of e with another triangle; i.e., v' is a vertex
of the arrangement of H. This v' pays 1/3 of a unit to v.
This is done for all 3 level-1 edges emanating from v and for all vertices v of
level 0. Clearly, every v receives at least 1 unit in total. It remains to discuss
what kind of vertices the v' are and to estimate the total charge paid by
them.
Since there is no other vertex on e between v and v', a particular v' can
be reached from at most 2 distinct v in case (i) and from at most 3 distinct
v in case (ii). So a v' is charged at most 2 according to case (i) or at most 1
according to case (ii) (because of the general position of H, these cases are
never combined, since no intersection of 3 triangles lies in any of the vertical
walls π_a).
Next, we observe that in case (i), v' has level at most 2, and in case (ii), it
has level exactly 1. This is best seen by considering the situation within the
vertical plane containing the edge e. As we move along e, just after leaving
v we are at level 1, with exactly one triangle h below, as is illustrated next:
[Figure: the vertical plane containing e, with the triangle h below e; in case (i), e reaches a vertical wall π_a at v', and in case (ii), e reaches another triangle at v'.]
The level does not change unless we enter a vertical wall π_a or another triangle
h' ∈ H. If we first enter some π_a, then case (i) occurs with v' = e ∩ π_a, and
the level cannot change by more than 1 by entering π_a. If we first reach a
triangle h', we have case (ii) with v' = e ∩ h', and v' has level 1.
Each v' reached in case (i) is a vertex in the arrangement of segments
within one of the walls π_a, and it has level at most 2 there. It is easy to show
Exercises
1. Given a construction of a set of n segments in the plane with lower
envelope of complexity σ(n), show that the lower envelope of n triangles
in R^3 can have complexity Ω(nσ(n)).
2. Show that the number of vertices of level at most k in the arrangement of
n segments (in general position) in the plane is at most O(k^2 σ(⌊n/(k+1)⌋)).
The proof of the general case of Clarkson's theorem on levels
(Theorem 6.3.1) applies almost verbatim.
s for some fixed s, this is no longer true: The edge can immediately go back
to another vertex on the lower envelope. Then we would be trying to charge
one vertex of the lower envelope to another. This can be done, but one must
define an "order" for each vertex, and charge envelope vertices of order i only
to vertices of order smaller than i or to vertices of significantly higher levels.
We show this for the case of curves in the plane. This example is artifi-
cial, since using Davenport-Schinzel sequences leads to much sharper bounds.
But we can thus demonstrate the ideas of the higher-dimensional proof, while
avoiding many technicalities. We remark that this proof is not really an up-
grade of the one for triangles: Here we aim at a much cruder bound, and so
some of the subtleties in the proof for triangles can be neglected.
We consider n planar curves as discussed in Section 7.1: They are graphs
of continuous functions R → R, and every two intersect at most s times.
Moreover, we assume for convenience that the curves cross at each intersec-
tion and no 3 curves have a common point.
7.6.1 Proposition. The maximum possible number of vertices on the lower
envelope of a set H of n curves as above is at most O(n^{1+ε}) for every fixed
ε > 0. That is, for every s and every ε > 0 there exists C such that the bound
is at most Cn^{1+ε} for all n.
Next, we want to convert this into a recurrence involving only f and f^{(i)}.
To this end, we estimate f^{(i)}_{≤k} by following the proof of Clarkson's theorem
on levels almost literally (as for the case of segments in Exercise 7.5.2). We
obtain
\[ f^{(i)}_{\le k}(n) = O\Bigl(k^2 f^{(i)}\bigl(\lfloor n/k \rfloor\bigr)\Bigr). \]
By substituting this bound (and its analogue for f_{≤k}) into the right-hand
side of (7.2), we arrive at the system of inequalities
fast, and so A ~ As. Then the requirement that Ai be larger than the first
term in parentheses yields, after a little simplification,
\[ k_i^{\varepsilon} > C^{s-i+1}\, k_{i+1} k_{i+2} \cdots k_s. \]
Therefore, the k_i should decrease very fast with i. We can set k_s = C^{1/ε}
and k_i = (C^{s−i+1} k_{i+1} k_{i+2} ⋯ k_s)^{1/ε}. Now setting A_1, which is still a free
parameter, sufficiently (enormously) large, we can make sure that the desired
bounds f^{(i)}(n) ≤ A_i n^{1+ε} hold at least up to n = k_1, so that we can really
use the recurrence (7.3) in the induction with the k_i defined above. These
considerations indicate that the induction works; to be completely sure, one
should perform it once more in detail. But we leave this to the reader's
diligence and declare Proposition 7.6.1 proved. □
A = {x ∈ R^d: Φ(p1(x) ≥ 0, p2(x) ≥ 0, ..., pr(x) ≥ 0)}.
Note that the formula Φ may involve negations, and so the sets {x ∈
R^d: p1(x) > 0} and {x ∈ R^d: p1(x) = 0} are semialgebraic, for example.
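A membership test for a set given in this form is straightforward to write down. The following sketch is ours (the helper in_semialgebraic_set and the example polynomials are hypothetical illustrations): polynomials are passed as Python functions, and the Boolean formula Φ as a function of the truth values of the conditions p_i(x) ≥ 0. It also shows how strict inequalities and equalities arise from negations and conjunctions.

```python
def in_semialgebraic_set(x, polys, formula):
    """Evaluate Phi(p_1(x) >= 0, ..., p_r(x) >= 0) at the point x."""
    truth = [p(x) >= 0 for p in polys]
    return formula(*truth)


def p(x):
    """p(x, y) = x^2 + y^2 - 1."""
    return x[0] ** 2 + x[1] ** 2 - 1


def minus_p(x):
    return -p(x)


def in_open_disk(x):
    # Open unit disk: x^2 + y^2 < 1, i.e., NOT (p(x) >= 0).
    return in_semialgebraic_set(x, [p], lambda t: not t)


def on_circle(x):
    # Unit circle: p(x) = 0, i.e., (p(x) >= 0) AND (-p(x) >= 0).
    return in_semialgebraic_set(x, [p, minus_p], lambda t1, t2: t1 and t2)


if __name__ == "__main__":
    print(in_open_disk((0, 0)), in_open_disk((1, 0)))
    print(on_circle((1, 0)), on_circle((0, 0)))
```

In the notation of the text, the open disk has r = 1 and the circle r = 2, both with D = 2 and d = 2.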
One might want to allow for quantifiers, that is, to admit sets like
{(x1, x2) ∈ R^2: ∃y1 ∀y2 p(x1, x2, y1, y2) ≥ 0} for a 4-variate polynomial p.
As is useful to know, but not very easy to prove (and we do not attempt it
here), each such set is semialgebraic, too: According to a famous theorem of
Tarski, it can be defined by a quantifier-free formula.
Let D be the maximum of the degrees of the polynomials p1, ..., pr appearing
in the definition of a semialgebraic set A. Let us call the number
max(d, r, D) the description complexity² of A. The results about lower envelopes
concern semialgebraic sets whose description complexity is bounded
by a constant.
An algebraic surface patch is a special case of a semialgebraic set: It
can be defined as the intersection of the zero set of some polynomial q ∈
R[x1, ..., xd] with a closed semialgebraic set B. Intuitively, q(x) = 0 defines
a "surface" in R^d, and B cuts off a closed patch from that surface. Note
that B can be all of R^d, and so the forthcoming results apply, among others,
to graphs of polynomials or, more generally, to surfaces defined by a single
polynomial equation.
Let us remark that in the papers dealing with algebraic surface patches,
the definition is often more restrictive, and certainly the proofs make several
extra assumptions. Most significantly, they usually suppose that the patches
are smooth and they intersect transversally; that is, near each point com-
mon to the relative interior of k patches, these k patches look locally like k
hyperplanes in general position, 1 ≤ k ≤ d. These conditions follow from a
suitable general position assumption, namely, that the coefficients of all the
polynomials appearing in the descriptions of all the patches are algebraically
independent numbers. 3 This can be achieved by a perturbation, but a rigor-
ous argument, showing that a sufficiently small perturbation cannot decrease
the complexity of the lower envelope too much, is not entirely easy.
The algebraic surface patches are also typically required to be xd-mono-
tone (every vertical line intersects them only once). This can be guaranteed
by partitioning each of the original patches into smaller pieces, slicing them
along the locus of points with vertical tangent hyperplanes (and eliminating
the vertical pieces).
After these preliminaries, we can state the main theorem.
7.7.1 Theorem. For all integers b and d ≥ 2 and every ε > 0, there
exists C = C(d, b, ε) such that the following holds. Whenever γ1, γ2, ..., γn
are algebraic surface patches in R^d, each of description complexity at most
b, the lower envelope of the arrangement of γ1, γ2, ..., γn has combinatorial
complexity at most Cn^{d−1+ε}.
For this case Kedem, Livne, Pach, and Sharir [KLPS86] proved that
the complexity of ⋃_{i=1}^{n} A_i is O(n), where the complexity is measured
as the sum of the complexities of the "exterior" cells of the arrangement,
i.e., the cells that are not contained in any of the A_i.
For s ≥ 4, long and skinny sets can form a grid pattern and have
union complexity about n^2, but linear or near-linear bounds were
proved under additional assumptions. One type of such additional
assumption is metric, namely, that the objects are "fat." A rather
complicated proof of Efrat and Sharir [ES00] shows that if each A_i
is convex, the ratio of the circumradius and inradius is bounded by
some constant K, and every two boundaries intersect at most s times,
then the union complexity is at most O(n^{1+ε}) for any ε > 0, with the
constant of proportionality depending on s, K, ε. Earlier, Matoušek,
Pach, Sharir, Sifrony, and Welzl [MPS+94] gave a simpler and more
precise bound of O(n log log n) for fat triangles. Pach, Safruti, and
Sharir [PSS01] showed that the union of n fat wedges in R^3 (intersections
of two half-spaces with angle at least some α0 > 0), as well as the
union of n cubes in R^3, has complexity O(n^{2+ε}). Various extensions of
these results to nonconvex objects or to higher dimensions seem easy
to conjecture but quite hard to prove.
Several results are known where one assumes that the A_i have
special shapes or bounded complexity. Aronov, Sharir, and Tagansky
[AST97] proved that the complexity of the union of k convex polygons
in the plane with n vertices in total is O(k^2 + nα(k)) and that the union
of k convex polytopes in R^3 with n vertices in total has complexity
O(k^3 + kn log k). Boissonnat, Sharir, Tagansky, and Yvinec [BSTY98]
showed that the union of n axis-parallel cubes in R^d has O(n^{⌈d/2⌉})
complexity, and O(n^{⌊d/2⌋}) complexity if the cubes all have the same
size; both these bounds are tight.
Agarwal and Sharir [AS00c] proved that the union of n infinite
cylinders of equal radius in R^3 has complexity O(n^{2+ε}) (here Ω(n^2)
is a lower bound), and more generally, if A1, ..., An are pairwise disjoint
triangles in R^3 and B is a ball, then ⋃_i (A_i + B) has complexity
O(n^{2+ε}), where A_i + B = {a + b: a ∈ A_i, b ∈ B} is the Minkowski
sum. The proof relies on the result mentioned above about two superimposed
lower envelopes.
Exercises
1. Let p1, ..., pn be points in the plane. At time t = 0, each pi starts moving
along a straight line with a fixed velocity vi. Use Theorem 7.7.1 to prove
that the convex hull of the n moving points changes its combinatorial
structure at most O(n^{2+ε}) times during the time interval [0, ∞).
The tight bound is O(n^2); it was proved, together with many other related
results, by Agarwal, Guibas, Hershberger, and Veach [AGHV01].
8
Intersection Patterns of
Convex Sets
Although simple, this is a key result, and many of the subsequent devel-
opments rely on it.
The best possible value of β is β = 1 − (1−α)^{1/(d+1)}. We prove the weaker
estimate β ≥ α/(d+1).
Proof. For a subset I ⊆ {1, 2, ..., n}, let us write F_I for the intersection
⋂_{i∈I} F_i.
First we observe that it is enough to prove the theorem for the F_i closed
and bounded (and even convex polytopes). Indeed, given some arbitrary
F_1, ..., F_n, we choose a point p_I ∈ F_I for every (d+1)-tuple I with F_I ≠ ∅
and we define F'_i = conv{p_I: F_I ≠ ∅, i ∈ I}, which is a polytope contained in
F_i. If the theorem holds for these F'_i, then it also holds for the original F_i.
In the rest of the proof we thus assume that the F_i, and hence also all the
nonempty F_I, are compact.
Let ≤lex denote the lexicographic ordering of the points of R^d by their
coordinate vectors. It is easy to show that any compact subset of Rd has a
unique lexicographically minimum point (Exercise 1). We need the following
consequence of Helly's theorem.
8.1.2 Lemma. Let I ⊆ {1, 2, ..., n} be an index set with F_I ≠ ∅, and let v
be the (unique) lexicographically minimum point of F_I. Then there exists an
at most d-element subset J ⊆ I such that v is the lexicographically minimum
point of F_J as well.
In other words, the minimum of the intersection F_I is always enforced by
some at most d "constraints" Fi , as is illustrated in the following drawing
(note that the two constraints determining the minimum are not determined
uniquely in the picture):
We can now finish the proof of the fractional Helly theorem. For each of
the $\alpha\binom{n}{d+1}$ index sets I of cardinality d+1 with F_I ≠ ∅, we fix a d-element
set J = J(I) ⊂ I such that F_J has the same lexicographic minimum as F_I.
The theorem follows by double counting. Since the number of distinct
d-tuples J is at most $\binom{n}{d}$, one of them, call it J_0, appears as J(I) for at least
$\alpha\binom{n}{d+1}/\binom{n}{d} = \alpha\frac{n-d}{d+1}$ distinct I. Each such I has the form J_0 ∪ {i} for some
i ∈ {1, 2, ..., n}. The lexicographic minimum of F_{J_0} is contained in at least
$d + \alpha\frac{n-d}{d+1} > \alpha\frac{n}{d+1}$ sets among the F_i. Hence we may set β = α/(d+1). □
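The statement just proved can be checked numerically in the simplest case d = 1, where the convex sets are intervals and the (d+1)-tuples are pairs. The sketch below is ours (function names and the sample intervals are made up for illustration): it counts intersecting pairs to obtain α and compares the maximum number of intervals through a common point with the guaranteed value αn/(d+1).

```python
from itertools import combinations


def intersecting_pairs(intervals):
    """Number of pairs of closed intervals [lo, hi] with a common point."""
    return sum(1 for (a, b), (c, d) in combinations(intervals, 2)
               if max(a, c) <= min(b, d))


def max_depth(intervals):
    """Largest number of closed intervals sharing a point. For closed
    intervals the maximum is attained at some left endpoint."""
    return max(sum(1 for lo, hi in intervals if lo <= x <= hi)
               for x, _ in intervals)


if __name__ == "__main__":
    intervals = [(0, 3), (1, 4), (2, 5), (6, 8), (7, 9), (10, 11)]
    n = len(intervals)
    alpha = intersecting_pairs(intervals) / (n * (n - 1) / 2)
    print(alpha, max_depth(intervals), alpha * n / 2)
```

In this example 4 of the 15 pairs intersect, so α = 4/15 and the theorem guarantees a point in more than 0.8 intervals; the actual maximum is 3, which illustrates how far from tight the simple estimate β ≥ α/(d+1) can be.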
are the complexes where the homology of dimension d and larger vanishes
for all induced subcomplexes). We do not formulate it but mention
one of its consequences, the upper bound theorem for families of
convex sets: If f_r(N(F)) = 0 for a family F of n convex sets in R^d and
some r, d ≤ r ≤ n, then $f_k(N(F)) \le \sum_{j=0}^{d} \binom{r-d}{k-j+1}\binom{n-r+d}{j}$; equality
holds, e.g., in the case mentioned above (several copies of R^d and hyperplanes
in general position).
Exercises
1. Show that any compact set in R^d has a unique point with the lexicographically
smallest coordinate vector.
2. Prove the following colored Helly theorem: Let $\mathcal{C}_1, \ldots, \mathcal{C}_{d+1}$ be finite
families of convex sets in R^d such that for any choice of sets $C_1 \in \mathcal{C}_1$, ...,
$C_{d+1} \in \mathcal{C}_{d+1}$, the intersection $C_1 \cap \cdots \cap C_{d+1}$ is nonempty. Then for
some i, all the sets of $\mathcal{C}_i$ have a nonempty intersection. Apply a method
similar to the proof of the fractional Helly theorem; i.e., consider the
lexicographic minima of the intersections of suitable collections of the sets.
The result is due to Lovász ([Lov74]; also see [Bár82]).
3. Let F_1, F_2, ..., F_n be convex sets in R^d. Prove that there exist convex
polytopes P_1, P_2, ..., P_n such that dim(⋂_{i∈I} F_i) = dim(⋂_{i∈I} P_i) for every
I ⊆ {1, 2, ..., n} (where dim(∅) = −1).
Proof. Call the convex hull of a (d+1)-point rainbow set a rainbow simplex.
We proceed by contradiction: We suppose that no rainbow simplex contains 0,
and we choose a (d+1)-point rainbow set S such that the distance of conv(S)
to 0 is the smallest possible. Let x be the point of conv(S) closest to O.
Consider the hyperplane h containing x and perpendicular to the segment
Ox, as in the picture:
Then all of S lies in the closed half-space h⁻ bounded by h and not containing
0. We have conv(S) ∩ h = conv(S ∩ h), and by Carathéodory's theorem,
there exists an at most d-point subset T ⊆ S ∩ h such that x ∈ conv(T).
Let i be a color not occurring in T (i.e., M_i ∩ T = ∅). If all the points
of M_i lay in the half-space h⁻, then 0 would not be in conv(M_i), which we
assume. Thus, there exists a point y ∈ M_i lying in the complement of h⁻
(strictly, i.e., y ∉ h).
Let us form a new rainbow set S' from S by replacing the (unique) point
of M_i ∩ S by y. We have T ⊂ S', and so x ∈ conv(S'). Hence the segment
xy is contained in conv(S'), and we see that conv(S') lies closer to 0 than
conv(S), a contradiction. The colorful Carathéodory theorem is proved. □
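In the plane the theorem can be verified on small examples by exhaustive search over all rainbow triangles. The following sketch is ours: the function names and the three sample color classes are hypothetical, the points are assumed to be in general position, and a triangle is taken to contain the origin if the origin lies in its closure.

```python
from itertools import product


def orient(a, b, c):
    """Twice the signed area of the triangle abc (positive if a, b, c
    are in counterclockwise order)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def contains_origin(a, b, c):
    """True if the origin lies in the closed triangle abc. Assumes the
    triangle is nondegenerate (a, b, c not collinear)."""
    o = (0, 0)
    signs = (orient(a, b, o), orient(b, c, o), orient(c, a, o))
    return not (any(s < 0 for s in signs) and any(s > 0 for s in signs))


def rainbow_triangle(classes):
    """Return one point from each color class so that the resulting
    triangle contains the origin, or None if no such choice exists."""
    for triple in product(*classes):
        if contains_origin(*triple):
            return triple
    return None


if __name__ == "__main__":
    M1 = [(1, 0), (-1, 1), (-1, -1)]
    M2 = [(2, 1), (-2, 0), (0, -2)]
    M3 = [(0, 2), (-1, -1.5), (2, -1)]
    print(rainbow_triangle([M1, M2, M3]))
```

Since each of the three sample classes contains the origin in its convex hull, the theorem guarantees that the search succeeds. The brute-force search examines every rainbow triple, whereas the proof above suggests a local improvement strategy instead.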
Exercises
1. Let S and T be (d+1)-point sets in R^d, each containing 0 in the convex
hull. Prove that there exists a finite sequence S_0 = S, S_1, S_2, ..., S_m = T
of (d+1)-point sets with S_i ⊆ S ∪ T and 0 ∈ conv(S_i) for all i, such
that S_{i+1} is obtained from S_i by deleting one point and adding another.
Assume general position of S ∪ T if convenient. Warning: better not
try to find a (d+1)-term sequence.
illustration shows what such partitions can look like for d = 2 and r = 3;
both the drawings use the same 7-point set A:
(Are these all Tverberg partitions for this set, or are there more?)
As in the colorful Caratheodory theorem, a very interesting open problem
is the existence of an efficient algorithm for finding a Tverberg partition of
a given set. There is a polynomial-time algorithm if the dimension is fixed,
but some NP-hardness results for closely related problems indicate that if
the dimension is a part of input then the problem might be algorithmically
difficult.
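In dimension 1 the situation is completely transparent, and it may serve as a warm-up: for 2(r−1)+1 points on the line, pairing the ith smallest point with the ith largest and leaving the median alone gives a Tverberg partition, with the median as a Tverberg point. The sketch below is ours (the function name is hypothetical) and covers only this one-dimensional case, not the general problem discussed above.

```python
def tverberg_partition_on_line(points, r):
    """For 2(r-1)+1 real numbers, return (groups, median): r disjoint
    groups whose convex hulls (intervals) all contain the median."""
    if len(points) != 2 * (r - 1) + 1:
        raise ValueError("need exactly 2(r-1)+1 points")
    xs = sorted(points)
    median = xs[r - 1]
    # Pair the i-th smallest with the i-th largest; the median is alone.
    groups = [[xs[i], xs[-1 - i]] for i in range(r - 1)] + [[median]]
    return groups, median


if __name__ == "__main__":
    groups, point = tverberg_partition_on_line([5, 1, 9, 3, 7], r=3)
    print(groups, point)
```

Each pair spans an interval around the median, so all r convex hulls share that point.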
Several proofs of Tverberg's theorem are known. The one demonstrated
below is maybe not the simplest, but it shows an interesting "lifting"
technique. We deduce the theorem by applying the colorful Carathéodory theorem
to a suitable point configuration in a higher-dimensional space.
Proof of Tverberg's theorem. We begin with a reformulation of Tverberg's
theorem that is technically easier to handle. For a set X ⊆ R^d, the
convex cone generated by X is defined as the set of all linear combinations of
points of X with nonnegative coefficients; that is, we set
\[ \operatorname{cone}(X) = \Bigl\{ \sum_{i=1}^{n} \alpha_i x_i : x_1, \ldots, x_n \in X,\ \alpha_1, \ldots, \alpha_n \in \mathbf{R},\ \alpha_i \ge 0 \Bigr\}. \]
Geometrically, cone(X) is the union of all rays starting at the origin and
passing through a point of conv(X). The following statement is equivalent to
Tverberg's theorem:
8.3.2 Proposition (Tverberg's theorem: cone version). Let A be a set
of (d+1)(r−1) + 1 points in R^{d+1} such that 0 ∉ conv(A). Then there exist r
pairwise disjoint subsets A_1, A_2, ..., A_r ⊆ A such that ⋂_{i=1}^{r} cone(A_i) ≠ {0}.
Let us verify that this proposition implies Tverberg's theorem. Embed
R^d into R^{d+1} as the hyperplane x_{d+1} = 1 (as in Section 1.1). A set A ⊂
R^d thus becomes a subset of R^{d+1}; moreover, its convex hull lies in the
x_{d+1} = 1 hyperplane, and thus it does not contain 0. By Proposition 8.3.2, the
set A can be partitioned into groups A_1, ..., A_r with ⋂_{i=1}^{r} cone(A_i) ≠ {0}.
The intersection of these cones thus contains a ray originating at 0. It is
easily checked that such a ray intersects the hyperplane x_{d+1} = 1 and that
the intersection point is a Tverberg point for A. Hence it suffices to prove
Proposition 8.3.2.
\[ \varphi_j(x) = (\underbrace{0 \mid 0 \mid \cdots \mid 0}_{(j-1)\times} \mid x \mid 0 \mid \cdots \mid 0). \]
The last mapping, φ_r, has −x in each block: φ_r(x) = (−x | −x | ⋯ | −x).
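The block maps can be written out concretely. The sketch below is ours and rests on one assumption about the setup: for j < r, φ_j places x in the jth of r−1 blocks of length d+1 and zeros elsewhere, while φ_r puts −x in every block. The names phi and sum_of_images are hypothetical.

```python
def phi(j, x, r):
    """Image of the vector x under the j-th block map (j is 1-based).
    The result has r-1 blocks, each of the same length as x."""
    if not 1 <= j <= r:
        raise ValueError("j must satisfy 1 <= j <= r")
    if j == r:
        # The last map puts -x into every block.
        return tuple(-c for _ in range(r - 1) for c in x)
    zero = (0,) * len(x)
    blocks = [tuple(x) if b == j else zero for b in range(1, r)]
    return tuple(c for block in blocks for c in block)


def sum_of_images(us):
    """phi_1(u_1) + phi_2(u_2) + ... + phi_r(u_r) for a list of r vectors."""
    r = len(us)
    images = [phi(j, u, r) for j, u in enumerate(us, start=1)]
    return tuple(sum(column) for column in zip(*images))


if __name__ == "__main__":
    print(phi(2, (1, 2), 3), phi(3, (1, 2), 3))
    print(sum_of_images([(1, 2), (1, 2), (1, 2)]))
    print(sum_of_images([(1, 2), (1, 2), (0, 0)]))
```

Written out, the sum equals (u_1 − u_r | u_2 − u_r | ⋯ | u_{r−1} − u_r), so under the stated assumption it vanishes exactly when all the u_j coincide.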
These maps have the following property: For any r vectors u_1, ..., u_r ∈
R^{d+1},
r
(the last equality follows from the linearity of each φ_j). Write u_j = Σ_{i∈I_j} α_i a_i.
This is a linear combination of points of Aj with nonnegative coefficients, and
8.3 Tverberg's Theorem 203
technically complicated, but the idea is simple: Start with some point
configuration for which the theorem is valid and convert it to a given
configuration by moving one point at a time. During the movement,
the current partition may stop working at some point, and it must be
shown that it can be replaced by another suitable partition by a local
change.
Later on, Tverberg found a simpler proof [Tve81]. For the proof
presented in the text above, the main idea is due to Sarkaria [Sar92],
and our presentation is based on a simplification by Onn (see [BO97]).
Another proof, also due to Tverberg and inspired by the proof of the
colorful Carathéodory theorem, was published in a paper by Tverberg
and Vrećica [TV93]. Here is an outline.
Let π = (A_1, A_2, ..., A_r) be a partition of (d+1)(r−1)+1 given
points into r disjoint nonempty subsets. Consider a ball intersecting
all the sets conv(A_j), j = 1, 2, ..., r, whose radius ρ = ρ(π) is
the smallest possible. By a suitable general position assumption, it
can be assured that the smallest ball is always unique for any partition.
(Alternatively, among all balls of the smallest possible radius,
one can take the one with the lexicographically smallest center, which
again guarantees uniqueness.) If ρ(π) = 0, then π is a Tverberg partition.
Supposing that ρ(π) > 0, it can be shown that π can be locally
changed (by reassigning one point from one class to another) to another
partition π' with ρ(π') < ρ(π). Another proof, based on a similar
idea, was found by Roudneff [Rou01a]. Instead of ρ(π), he considers
w(π) = min_{x∈R^d} w(π, x), where w(π, x) = Σ_{i=1}^{r} dist(x, conv(A_i))^2.
He actually proves a "cone version" of Tverberg's theorem (but different
from our cone version and stronger).
Several extensions of Tverberg's theorem are known or conjectured.
Here we mention only two conjectures related to the dimension of the
set of Tverberg points. For X ⊂ R^d, let T_r(X) denote the set of all
Tverberg points for r-partitions of X (the points of T_r(X) are usually
called r-divisible). Reay [Rea68] conjectured that if X is in general
position and has k more points than is generally necessary for the
existence of a Tverberg r-partition, i.e., |X| = (d+1)(r−1) + 1 + k,
then dim T_r(X) ≥ k. This holds under various strong general position
assumptions, and special cases for small k have also been established
(see Roudneff [Rou01a], [Rou01b]). Kalai asked the following sophisticated
question in 1974: Does Σ_{r=1}^{|X|} dim T_r(X) ≥ 0 hold for every
finite X ⊂ R^d? Here dim ∅ = −1, and so the nonexistence of Tverberg
r-partitions for large r must be compensated by sufficiently large dimensions
of T_r(X) for small r. Together with other interesting aspects
of Tverberg's theorem, this is briefly discussed in Kalai's lively survey
[Kal01]. There he also notes that edge 3-colorability of a 3-regular
graph can be reformulated as the existence of a Tverberg 3-partition
Exercises
1. Prove (directly, without using Tverberg's theorem) that for any integers
$d, r_1, r_2 \ge 2$, we have $T(d, r_1 r_2) \le T(d, r_1)\, T(d, r_2)$.
2. For each $r \ge 2$ and $d \ge 2$, find $(d+1)(r-1)$ points in $\mathbf{R}^d$ with no Tverberg
r-partition.
3. Prove that Tverberg's theorem implies Proposition 8.3.2. Why is the
assumption $0 \notin \operatorname{conv}(A)$ necessary in Proposition 8.3.2?
4. (a) Derive the following Radon-type theorem (use Radon's lemma): For
every $d \ge 1$ there exists $\ell = \ell(d)$ such that every $\ell$ points in $\mathbf{R}^d$ in general
position can be partitioned into two disjoint subsets A, B such that not
only $\operatorname{conv}(A) \cap \operatorname{conv}(B) \ne \emptyset$, but this property is preserved by deleting
any single point; that is, $\operatorname{conv}(A \setminus \{a\}) \cap \operatorname{conv}(B) \ne \emptyset$ for each $a \in A$ and
$\operatorname{conv}(A) \cap \operatorname{conv}(B \setminus \{b\}) \ne \emptyset$ for each $b \in B$.
(b) Show that $\ell(2) \ge 7$.
Remark. The best known value of $\ell(d)$ is $2d+3$; this was established by
Larman [Lar72], and his proof is difficult. The original question is, What
is the largest $n = n(k)$ such that every n points in $\mathbf{R}^k$ in general position
can be brought to a convex position by some projective transform? Both
formulations are related via the Gale transform.
5. Show that for any $d, r \ge 1$ there is an (N+1)-point set in $\mathbf{R}^d$ in general
position, $N = (d+1)(r-1)$, having no more than $((r-1)!)^d$ Tverberg
partitions.
6. Why does Tverberg's theorem imply the centerpoint theorem
(Theorem 1.4.2)?
9 Geometric Selection Theorems
We want to show that the point a is contained in many X-simplices (so far we
have $\mathrm{const} \cdot n$ and we need $\mathrm{const} \cdot n^{d+1}$).
Let $J = \{j_0, \ldots, j_d\} \subseteq \{1, 2, \ldots, r\}$ be a set of d+1 indices. We apply the
colorful Carathéodory theorem (Theorem 8.2.1) for the (d+1) "color" sets
$M_{j_0}, \ldots, M_{j_d}$, which all contain a in their convex hull. This yields a rainbow
X-simplex $S_J$ containing a and having one vertex from each of the $M_{j_i}$.
If $J' \ne J$ are two (d+1)-tuples of indices, then $S_J \ne S_{J'}$. Hence the
number of X-simplices containing the point a is at least
For n sufficiently large, say $n \ge 2d(d+1)$, this is at least $(d+1)^{-(d+1)} 2^{-d} \binom{n}{d+1}$. □
The second proof: from fractional Helly. Let $\mathcal{F}$ denote the family of
all X-simplices. Put $N = |\mathcal{F}| = \binom{n}{d+1}$. We want to apply the fractional Helly
theorem (Theorem 8.1.1) to $\mathcal{F}$. Call a (d+1)-tuple of sets of $\mathcal{F}$ good if its
d+1 sets have a common point. To prove the first selection lemma, it suffices
to show that there are at least $\alpha \binom{N}{d+1}$ good (d+1)-tuples for some $\alpha > 0$
independent of n, since then the fractional Helly theorem provides a point
common to at least $\beta N$ members of $\mathcal{F}$.
Set $t = (d+1)^2$ and consider a t-point set $Y \subseteq X$. Using Tverberg's
theorem, we find that Y can be partitioned into d+1 pairwise disjoint
sets, of size d+1 each, whose convex hulls have a common point. (Tverberg's
theorem does not guarantee that the parts have size d+1, but if they
don't, we can move points from the larger parts to the smaller ones, using
Carathéodory's theorem.) Therefore, each t-point $Y \subseteq X$ provides at
least one good (d+1)-tuple of members of $\mathcal{F}$. Moreover, the members of this
good (d+1)-tuple are pairwise vertex-disjoint, and therefore the (d+1)-tuple
uniquely determines Y. It follows that the number of good (d+1)-tuples is at
least $\binom{n}{t} = \Omega(n^{(d+1)^2}) \ge \alpha \binom{N}{d+1}$. □
In the first proof we have used Tverberg's theorem for a large point set,
while in the second proof we applied it only to configurations of bounded size.
For the latter application, if we do not care about the constant of propor-
tionality in the first selection lemma, a weaker version of Tverberg's theorem
suffices, namely the finiteness of T( d, d+ 1), which can be proved by quite
simple arguments, as we have seen.
The relation of Tverberg's theorem to the first selection lemma in the
second proof somewhat resembles the derivation of macroscopic properties
in physics (pressure, temperature, etc.) from microscopic properties (laws of
motion of molecules, say). From the information about small (microscopic)
configurations we obtained a global (macroscopic) result, saying that a sig-
nificant portion of the X -simplices have a common point.
A point in the interior of many X -simplices. In applications of the
first selection lemma (or its relatives) we often need to know that there is a
point contained in the interior of many of the X -simplices. To assert anything
like that, we have to assume some kind of nondegenerate position of X. The
following lemma helps in most cases.
But it can easily happen that one of these triples, say {a, b, c}, is not an edge
of our hypergraph. Tverberg's theorem gives us no additional information on
which triples appear in the partition, and so this argument would guarantee
a good triple only if all the triples on the considered 9 points were contained
in F. Unfortunately, a 3-uniform hypergraph on n vertices can contain more
than half of all possible $\binom{n}{3}$ triples without containing all triples on some 9
points (even on 4 points). This is a "higher-dimensional" version of the fact
that the complete bipartite graph on $\frac{n}{2} + \frac{n}{2}$ vertices has about $\frac{1}{4} n^2$ edges
without containing a triangle.
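The graph fact in the last sentence can be checked by brute force for small n. The sketch below is only an illustration for even n; the function names are illustrative and not notation from the text.

```python
from itertools import combinations


def complete_bipartite_edges(n):
    """Edges of the complete bipartite graph with classes {0..n/2-1} and {n/2..n-1}."""
    half = n // 2
    return {frozenset((u, v)) for u in range(half) for v in range(half, n)}


def count_triangles(n, edges):
    """Number of vertex triples whose three pairs are all edges."""
    return sum(
        1
        for a, b, c in combinations(range(n), 3)
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edges
    )


n = 10
edges = complete_bipartite_edges(n)
num_edges = len(edges)                    # n^2 / 4 = 25 for n = 10
num_triangles = count_triangles(n, edges)
```

Any triangle would need two of its vertices in the same class, and no edge joins such a pair, which is why the count is zero.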
Hypergraphs with many edges need not contain complete hypergraphs,
but they have to contain complete multipartite hypergraphs. For example, a
graph on n vertices with significantly more than $n^{3/2}$ edges contains $K_{2,2}$,
the complete bipartite graph on 2 + 2 vertices (see Section 4.5). Concerning
hypergraphs, let $K^{d+1}(t)$ denote the complete (d+1)-partite (d+1)-uniform
hypergraph with t vertices in each of its d+1 vertex classes. The illustration
shows a $K^3(4)$; only three edges are drawn as a sample, although of course,
all triples connecting vertices at different levels are present.
$$f_k(n, m) = c_k\, n^{tk} \left(\frac{m}{n^k}\right)^{t^k} - C_k\, n^{t(k-1)},$$
The idea is to count the number of all pairs $(K, v)$, where $K \in \mathcal{K}$ and v is an
extending vertex of K, in two ways.
On the one hand, if a fixed copy $K \in \mathcal{K}$ has $q_K$ extending vertices, then
it contributes $\binom{q_K}{t}$ distinct copies of $K^k(t)$ in $\mathcal{H}$. We note that one copy of
$K^k(t)$ comes from at most $O(1)$ distinct $K \in \mathcal{K}$ in this way, and therefore it
suffices to bound $\sum_{K \in \mathcal{K}} \binom{q_K}{t}$ from below.
On the other hand, for a fixed vertex v, the hypergraph $\mathcal{H}_v$ contains at
least $f_{k-1}(n, m_v)$ copies $K \in \mathcal{K}$ by the inductive assumption, where $m_v$ is
the number of edges of $\mathcal{H}_v$. Hence
$$\sum_{K \in \mathcal{K}} q_K \ge \sum_{v \in V} f_{k-1}(n, m_v).$$
Using $\sum_{v \in V} m_v = km$, the convexity of $f_{k-1}$ in the second variable, and
Jensen's inequality (see page xvi), we obtain
$$\sum_{K \in \mathcal{K}} q_K \ge n f_{k-1}(n, km/n). \qquad (9.1)$$
To conclude the proof, we define a convex function extending the binomial
coefficient $\binom{x}{t}$ to the domain $\mathbf{R}$:
$$g(x) = \begin{cases} 0 & \text{for } x \le t-1, \\ \dfrac{x(x-1)\cdots(x-t+1)}{t!} & \text{for } x > t-1. \end{cases}$$
We want to bound $\sum_{K \in \mathcal{K}} g(q_K)$ from below, and we have the bound (9.1) for
$\sum_{K \in \mathcal{K}} q_K$. Using the bound $|\mathcal{K}| \le n^{t(k-1)}$ (clear, since $K^{k-1}(t)$ has $t(k-1)$
vertices) and Jensen's inequality, we derive that the number of copies of $K^k(t)$
in $\mathcal{H}$ is at least
$$c\, n^{t(k-1)}\, g\!\left(\frac{n f_{k-1}(n, km/n)}{n^{t(k-1)}}\right).$$
A calculation finishes the induction step; we omit the details. □
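The role of the function g in the proof above is that it agrees with the binomial coefficient at integers while being convex on the whole real line, so that Jensen's inequality applies. The sketch below checks both properties numerically for t = 3; it is a sanity check under that choice of t, not a proof.

```python
from fractions import Fraction
from math import comb, factorial


def g(x, t):
    """Convex extension of the binomial coefficient C(x, t) to real x."""
    if x <= t - 1:
        return 0
    product = 1
    for i in range(t):
        product *= x - i          # x (x-1) ... (x-t+1)
    return product / factorial(t)


t = 3
# g agrees with the binomial coefficient at nonnegative integers.
agrees = all(g(x, t) == comb(x, t) for x in range(0, 12))

# Midpoint convexity on a grid of exact rationals in [0, 10].
grid = [Fraction(k, 4) for k in range(0, 41)]
midpoint_convex = all(
    2 * g((x + y) / 2, t) <= g(x, t) + g(y, t) for x in grid for y in grid
)
```

Both `agrees` and `midpoint_convex` evaluate to `True` here; exact rationals are used for the grid so that the comparison is not affected by rounding.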
Bibliography and remarks. The second selection lemma was
conjectured, and proved in the planar case, by Bárány, Füredi, and
Lovász [BFL90]. The missing part for higher dimensions was the colored
Tverberg theorem (discussed in Section 8.3). A proof for the
planar case by a different technique, with considerably better quantitative
bounds than can be obtained by the method shown above, was
given by Aronov, Chazelle, Edelsbrunner, Guibas, Sharir, and Wenger
[ACE+91] (the bounds were mentioned in the text). The full proof of
the second selection lemma for arbitrary dimension appears in Alon,
Bárány, Füredi, and Kleitman [ABFK92].
Several other "selection lemmas," sometimes involving geometric
objects other than simplices, were proved by Chazelle, Edelsbrunner,
Guibas, Hershberger, Seidel, and Sharir [CEG+94].
Theorem 9.2.2 is from Erdős and Simonovits [ES83].
Exercises
1. (a) Prove a one-dimensional selection lemma: Given an n-point set $X \subset \mathbf{R}$
and a family $\mathcal{F}$ of $\alpha \binom{n}{2}$ X-intervals, there exists a point common
to $\Omega(\alpha^2 \binom{n}{2})$ intervals of $\mathcal{F}$. What is the best value of the constant of
proportionality you can get?
(b) Show that this result is sharp (up to the value of the multiplicative
constant) in the full range of $\alpha$.
2. (a) Show that the exponent $s_2$ in the second selection lemma in the plane
cannot be smaller than 2.
(b) Show that $s_3 \ge 2$. Can you also show that $s_d \ge 2$?
(c) Show that the proof method via the fractional Helly theorem cannot
give a better value of $s_2$ than 3 in Theorem 9.2.1. That is, construct an
n-point set and $\alpha \binom{n}{3}$ triangles on it in such a way that no more than
$O(\alpha^5 n^9)$ triples of these triangles have a point in common.
What is an appropriate equivalence relation that would capture the intuitive
notion of two finite point sets in Rd being "combinatorially the same"? We
have already encountered one suitable notion of combinatorial isomorphism
in Section 5.6. Here we describe an equivalent but perhaps more intuitive
approach based on the order type of a configuration. First we explain this
notion for planar configurations in general position, where it is quite simple.
Let $p = (p_1, p_2, \ldots, p_n)$ and $q = (q_1, q_2, \ldots, q_n)$ be two sequences of points
in $\mathbf{R}^2$, both in general position (no 2 points coincide and no 3 are collinear).
Then p and q have the same order type if for any indices $i < j < k$ we turn
in the same direction (right or left) when going from $p_i$ to $p_k$ via $p_j$ and when
going from $q_i$ to $q_k$ via $q_j$.
We say that both the triples $(p_i, p_j, p_k)$ and $(q_i, q_j, q_k)$ have the same
orientation.
If the point sequences p and q are in $\mathbf{R}^d$, we require that every (d+1)-element
subsequence of p have the same orientation as the corresponding
subsequence of q. The notion of orientation is best explained for d-tuples of
vectors in $\mathbf{R}^d$. If $v_1, \ldots, v_d$ are vectors in $\mathbf{R}^d$, there is a unique linear mapping
sending the vector $e_i$ of the standard basis of $\mathbf{R}^d$ to $v_i$, $i = 1, 2, \ldots, d$. The
matrix A of this mapping has the vectors $v_1, \ldots, v_d$ as the columns. The
orientation of $(v_1, \ldots, v_d)$ is defined as the sign of $\det(A)$; so it can be +1
(positive orientation), -1 (negative orientation), or 0 (the vectors are linearly
dependent and lie in a (d-1)-dimensional linear subspace). For a (d+1)-tuple
of points $(p_1, p_2, \ldots, p_{d+1})$, we define the orientation to be the orientation of
the d vectors $p_2 - p_1, p_3 - p_1, \ldots, p_{d+1} - p_1$. Geometrically, the orientation of
a 4-tuple $(p_1, p_2, p_3, p_4)$ tells us on which side of the plane $p_1 p_2 p_3$ the point
$p_4$ lies (if $p_1, p_2, p_3, p_4$ are affinely independent).
Returning to the order type, let $p = (p_1, p_2, \ldots, p_n)$ be a point sequence
in $\mathbf{R}^d$. The order type of p (also called the chirotope of p) is defined as the
mapping assigning to each (d+1)-tuple $(i_1, i_2, \ldots, i_{d+1})$ of indices,
$1 \le i_1 < i_2 < \cdots < i_{d+1} \le n$, the orientation of the (d+1)-tuple $(p_{i_1}, p_{i_2}, \ldots, p_{i_{d+1}})$.
Thus, the order type of p can be described by a sequence of +1's, -1's, and
0's with $\binom{n}{d+1}$ terms.
The order type makes good sense only for point sequences in Rd con-
taining some d+ 1 affinely independent points. Then one can read off various
properties of the sequence from its order type, such as general position, con-
vex position, and so on; see Exercise 1.
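The definitions of orientation and order type translate directly into a short computation. The following is a minimal sketch, assuming points are given as tuples of integers or rationals; the helper names are illustrative, indices are 0-based rather than 1-based, and the difference vectors are used as matrix rows instead of columns, which does not change the sign of the determinant.

```python
from fractions import Fraction
from itertools import combinations


def det_sign(rows):
    """Sign (+1, -1, or 0) of the determinant of a square matrix, computed exactly."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    sign = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return 0                      # singular matrix
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign                  # a row swap flips the sign
        if a[col][col] < 0:
            sign = -sign                  # track the sign of each pivot
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return sign


def orientation(points):
    """Orientation of a (d+1)-tuple of points in R^d."""
    p1 = points[0]
    vectors = [[q[k] - p1[k] for k in range(len(p1))] for q in points[1:]]
    return det_sign(vectors)


def order_type(p):
    """Map each increasing (d+1)-tuple of indices to the orientation of the chosen points."""
    d = len(p[0])
    return {idx: orientation([p[i] for i in idx])
            for idx in combinations(range(len(p)), d + 1)}


square = [(0, 0), (1, 0), (0, 1), (1, 1)]
ot = order_type(square)   # one sign for each of the C(4, 3) = 4 index triples
```

Exact rational arithmetic is used so that nearly degenerate configurations are not misclassified by rounding; for floating-point input a robust predicate would be needed instead.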
If $(X_1, X_2, \ldots, X_m)$ are very large finite sets such that $X_1 \cup \cdots \cup X_m$
is in general position, we can find not too small subsets $Y_1 \subseteq X_1, \ldots,
Y_m \subseteq X_m$ such that $(Y_1, \ldots, Y_m)$ has same-type transversals. To see this,
color each transversal of $(X_1, X_2, \ldots, X_m)$ by its order type. Since the number
of possible order types of an m-point set in general position cannot exceed
$r = 2^{\binom{m}{d+1}}$, we have a coloring of the edges of the complete m-partite
hypergraph on $(X_1, \ldots, X_m)$ by r colors. By the Erdős-Simonovits theorem
(Theorem 9.2.2), there are sets $Y_i \subseteq X_i$, not too small, such that all edges
induced by $Y_1 \cup \cdots \cup Y_m$ have the same color, i.e., $(Y_1, \ldots, Y_m)$ has same-type
transversals.
As is the case for many other geometric applications of Ramsey-type theorems,
this result can be quantitatively improved tremendously by a geometric
argument: For m and d fixed, the size of the sets $Y_i$ can be made a constant
fraction of $|X_i|$.
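For small planar examples, the property of having same-type transversals can be tested by brute force: enumerate all transversals (one point from each set) and compare their order types. The sketch below assumes d = 2 and integer coordinates; the function names are illustrative.

```python
from itertools import combinations, product


def orient2(p, q, r):
    """+1 for a left turn p -> q -> r, -1 for a right turn, 0 if collinear."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def planar_order_type(seq):
    """Orientations of all index triples i < j < k of a planar point sequence."""
    return tuple(orient2(seq[i], seq[j], seq[k])
                 for i, j, k in combinations(range(len(seq)), 3))


def has_same_type_transversals(sets):
    """True if every choice of one point per set yields the same order type."""
    types = {planar_order_type(t) for t in product(*sets)}
    return len(types) == 1


# Three small clusters placed far apart: every transversal is a left-turning triple.
clusters = [[(0, 0), (1, 0)], [(100, 0), (101, 1)], [(50, 100), (51, 101)]]
# Here the third set has points on both sides of the line through the first two.
mixed = [[(0, 0)], [(1, 0)], [(0, 1), (0, -1)]]
```

`has_same_type_transversals(clusters)` holds while `has_same_type_transversals(mixed)` does not. The enumeration takes time proportional to the product of the set sizes, so it is meant only for checking small cases.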
sets $(Z'_{i_1}, \ldots, Z'_{i_{d+1}})$ such that this (d+1)-tuple has same-type transversals.
After this step is executed for all (d+ 1)-tuples of indices, the resulting current
m-tuple of sets has same-type transversals.
This method gives the rather small lower bound
To handle the crucial case m = d+1, we will use the following criterion
for a (d+ 1)-tuple of sets having same-type transversals.
9.3.2 Lemma. Let $C_1, C_2, \ldots, C_{d+1} \subseteq \mathbf{R}^d$ be convex sets. The following two
conditions are equivalent:
(i) There is no hyperplane simultaneously intersecting all of $C_1, C_2, \ldots, C_{d+1}$.
(ii) For each nonempty index set $I \subset \{1, 2, \ldots, d+1\}$, the sets $\bigcup_{i \in I} C_i$ and
$\bigcup_{j \notin I} C_j$ can be strictly separated by a hyperplane.
Moreover, if $X_1, X_2, \ldots, X_{d+1} \subset \mathbf{R}^d$ are finite sets such that the sets
$C_i = \operatorname{conv}(X_i)$ have property (i) (and (ii)), then $(X_1, \ldots, X_{d+1})$ has same-type
transversals.
In particular, planar convex sets $C_1, C_2, C_3$ have no line transversal if and
only if each of them can be separated by a line from the other two. The proof
of this neat result is left to Exercise 3. We will not need the assertion that
(i) implies (ii).
Same-type lemma for d+1 sets. To prove the same-type lemma for the
case m = d+1, it now suffices to choose the sets $Y_i \subseteq X_i$ in such a way
that their convex hulls are separated in the sense of (ii) in Lemma 9.3.2.
This can be done by an iterative application of the ham-sandwich theorem
(Theorem 1.4.3).
Suppose that for some nonempty index set $I \subset \{1, 2, \ldots, d+1\}$, the sets
$\operatorname{conv}(\bigcup_{i \in I} X_i)$ and $\operatorname{conv}(\bigcup_{j \notin I} X_j)$ cannot be separated by a hyperplane. For
notational convenience, we assume that $d+1 \in I$. Let h be a hyperplane
simultaneously bisecting $X_1, X_2, \ldots, X_d$, whose existence is guaranteed by
the ham-sandwich theorem. Let $\gamma$ be a closed half-space bounded by h and
containing at least half of the points of $X_{d+1}$. For all $i \in I$, including $i = d+1$,
we discard the points of $X_i$ not lying in $\gamma$, and for $j \notin I$ we throw away the
points of $X_j$ that lie in the interior of $\gamma$ (note that points on h are never
discarded); see Figure 9.1.
We claim that the union of the resulting sets with indices in I is now strictly
separated from the union of the remaining sets. If h contains no points of the
sets, then it is a separating hyperplane. Otherwise, let the points contained
in h be $a_1, \ldots, a_t$; we have $t \le d$ by the general position assumption. For
each $a_j$, choose a point $a'_j$ very near to $a_j$. If $a_j$ lies in some $X_i$ with $i \in I$,
then $a'_j$ is chosen in the complement of $\gamma$, and otherwise, it is chosen in the
interior of $\gamma$. We let $h'$ be a hyperplane passing through $a'_1, \ldots, a'_t$ and lying
very close to h. Then $h'$ is the desired separating hyperplane, provided that
the $a'_j$ are sufficiently close to the corresponding $a_j$.
[Figure 9.1. Panel label: initial sets, I = {3}.]
Thus, we have "killed" the index set I, at the price of halving the sizes
of the current sets; more precisely, the size of a set $X_i$ is reduced from $|X_i|$
to $\lceil |X_i|/2 \rceil$ (or larger). We can continue with the other index sets in the
same manner. After no more than $2^d - 1$ halvings, we obtain sets satisfying
the separation condition and thus having same-type transversals. The same-type
lemma is proved. The lower bound for $c(d, d+1)$ is doubly exponential,
roughly $2^{-2^d}$. □
step is showing $n(d, d+1) \le 2n(d-1, d+1)$. The $X_i$ are projected on
a generic hyperplane h and the appropriate partitions are found for
the projections by induction. Let $X'_i \subset h$ be the projection of $X_i$, let
$Y'_1, \ldots, Y'_{d+1}$ be one of the "columns" in the partitions of the $X'_i$ (we
omit the index j for simpler notation), let $k = |Y'_i|$, and let $Y_i \subseteq X_i$ be
the preimage of $Y'_i$. As far as separation by hyperplanes is concerned,
the $Y'_i$ behave like d+1 points in general position in $\mathbf{R}^{d-1}$, and so there
is only one inseparable (Radon) partition (see Exercise 1.3.9), i.e., an
$I \subset \{1, 2, \ldots, d+1\}$ (unique up to complementation) such that $\bigcup_{i \in I} Y'_i$
cannot be separated from $\bigcup_{i \notin I} Y'_i$. By an argument resembling proofs
of the ham-sandwich theorem, it can be shown that there is a half-space
$\gamma$ in $\mathbf{R}^d$ and a number $k_1$ such that $|\gamma \cap Y_i| = k_1$ for $i \in I$ and
$|\gamma \cap Y_i| = k - k_1$ for $i \notin I$. Letting $Z_i = Y_i \cap \gamma$ for $i \in I$ and $Z_i = Y_i \setminus \gamma$
for $i \notin I$ and $T_i = Y_i \setminus Z_i$, one obtains that $(Z_1, \ldots, Z_{d+1})$ satisfy
condition (ii) in Lemma 9.3.2, and so they have same-type transversals,
and similarly for the $T_i$. A 2-dimensional picture illustrates the
construction:
[Picture. Label: I = {1, 3}.]
Exercises
1. Let $p = (p_1, p_2, \ldots, p_n)$ be a sequence of points in $\mathbf{R}^d$ containing d+1
affinely independent points. Explain how we can decide the following
questions, knowing the order type of p and nothing else about it:
(a) Is it true that for every k points among the $p_i$, $k = 2, 3, \ldots, d+1$, the
affine hull has the maximum dimension k-1?
(b) Does $p_{d+2}$ lie in $\operatorname{conv}(\{p_1, \ldots, p_{d+1}\})$?
(c) Are the points $p_1, \ldots, p_n$ convex independent (i.e., is each of them a
vertex of their convex hull)?
2. Let $p = (p_1, p_2, \ldots, p_n)$ be a sequence of points in $\mathbf{R}^d$ whose affine hull
is the whole of $\mathbf{R}^d$. Explain how we can determine the order type of p,
up to a global change of all signs, from the knowledge of $\operatorname{sgn}(\mathrm{AffVal}(p))$
(the signs of affine functions on the $p_i$; see Section 5.6).
of R' intersects each segment of B' or each segment of R' is disjoint from
each segment of B' ($b > 0$ is another absolute constant).
The result in (c) is due to Pach and Solymosi [PS01].
We choose $Y_1, \ldots, Y_k$, $Y_i \subseteq X_i$, as sets of equal size that have the maximum
possible magical density $\mu(Y_1, \ldots, Y_k)$. We denote the common size
$|Y_1| = \cdots = |Y_k|$ by s.
First we derive the condition (i) in the theorem for this choice of the $Y_i$.
We have
and so $e(Y_1, \ldots, Y_k) \ge \beta s^k$, which verifies (i). Since obviously $e(Y_1, \ldots, Y_k) \le s^k$,
we have $\mu(Y_1, \ldots, Y_k) \le s^{\varepsilon^k}$. Combining with $\mu(Y_1, \ldots, Y_k) \ge \beta n^{\varepsilon^k}$
derived above, we also obtain that $s \ge \beta^{1/\varepsilon^k} n$.
It remains to prove (ii). Since $\varepsilon s$ is a large number by the assumptions,
rounding it up to an integer does not matter in the subsequent calculations
(as can be checked by a simple but somewhat tedious analysis). In order
to simplify matters, we will thus assume that $\varepsilon s$ is an integer, and we let
$Z_1 \subseteq Y_1, \ldots, Z_k \subseteq Y_k$ be $\varepsilon s$-element sets. We want to prove $e(Z_1, \ldots, Z_k) > 0$.
We have
We want to show that the negative terms are not too large, using the
assumption that the magical density of $Y_1, \ldots, Y_k$ is maximum. The problem
is that $Y_1, \ldots, Y_k$ maximize the magical density only among the sets of equal
size, while we have sets of different sizes in the terms. To get back to sets of
equal size, we use the following observation. If, say, $R_1$ is a randomly chosen
subset of Y1 of some given size r, we have
p(Y1 \ Zl, R2,.·" Rk) = ((1 - c)s)-e p(Y1 \ Zl, R 2, ... , Rk)
k
Therefore,
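To estimate the term $e(Z_1, Z_2, \ldots, Z_{i-1}, Y_i \setminus Z_i, Y_{i+1}, \ldots, Y_k)$, we use random
subsets $R_i \subset Y_i \setminus Z_i$ and $R_{i+1} \subset Y_{i+1}, \ldots, R_k \subset Y_k$, this time all of size $\varepsilon s$.
A similar calculation as before yields
$$e(Z_1, Z_2, \ldots, Z_{i-1}, Y_i \setminus Z_i, Y_{i+1}, \ldots, Y_k) \le \varepsilon^{\,i-1-\varepsilon^k} (1 - \varepsilon)\, e(Y_1, \ldots, Y_k).$$
(This estimate is also valid for i = 1, but it is worse than the one derived
above and it would not suffice in the subsequent calculation.) From (9.2) we
obtain that $e(Z_1, \ldots, Z_k)$ is at least $e(Y_1, \ldots, Y_k)$ multiplied by the factor
$$1 - (1 - \varepsilon) - (1 - \varepsilon)\, \varepsilon^{-\varepsilon^k} \sum_{i=2}^{k} \varepsilon^{i-1} = \varepsilon - \varepsilon^{1-\varepsilon^k} (1 - \varepsilon^{k-1}) = \varepsilon \left(1 + \varepsilon^{-\varepsilon^k} (\varepsilon^{k-1} - 1)\right) = \varepsilon \left(1 + e^{\varepsilon^k \ln(1/\varepsilon)} (\varepsilon^{k-1} - 1)\right).$$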
Exercises
1. Verify the equality $\mathbf{E}[\rho(R_1, Y_2, \ldots, Y_k)] = \rho(Y_1, \ldots, Y_k)$, where the
expectation is with respect to a random choice of an r-element $R_1 \subseteq Y_1$.
Also derive the other similar equalities used in the proof in the text.
2. (Density Ramsey-type result for segments)
(a) Let $c > 0$ be a given positive constant. Using Exercise 9.3.5(c) and
the weak regularity lemma, prove that there exists $\beta = \beta(c) > 0$ such
that whenever R and B are sets of segments in the plane with $R \cup B$ in
general position and such that the number of pairs (r, b) with $r \in R$,
$b \in B$, and $r \cap b \ne \emptyset$ is at least $cn^2$, then there are subsets $R' \subseteq R$ and
$B' \subseteq B$ such that $|R'| \ge \beta n$, $|B'| \ge \beta n$, and each $r \in R'$ intersects each
$b \in B'$.
(b) Prove the analogue of (a) for noncrossing pairs. Assuming at least $cn^2$
pairs (r, b) with $r \cap b = \emptyset$, select R' and B' of size $\beta n$ such that $r \cap b = \emptyset$
for each $r \in R'$ and $b \in B'$.
These results are from Pach and Solymosi [PS01].
3. (a) Let G = (V, E) be a graph, and let V be partitioned into classes
$V_1, V_2, V_3$ of size m each. Suppose that there are no edges with both
vertices in the same $V_i$, that $|\rho(V_i, V_j) - \frac{1}{2}| \le \varepsilon$ for all $i < j$, and that
each pair $(V_i, V_j)$ is $\varepsilon$-regular (this means that $|\rho(A, B) - \rho(V_i, V_j)| \le \varepsilon$
for any $A \subseteq V_i$ and $B \subseteq V_j$ with $|A|, |B| \ge \varepsilon m$). Prove that the number
of triangles in G is $(\frac{1}{8} + o(1)) m^3$, where the $o(1)$ notation refers to $\varepsilon \to 0$
(while m is considered arbitrary but sufficiently large in terms of $\varepsilon$).
(b) Generalize (a) to counting the number of copies of $K_4$, where G has
4 classes $V_1, \ldots, V_4$ of equal size (if all the densities are about $\frac{1}{2}$, then the
number should be $(2^{-6} + o(1)) m^4$).
4. For every $\varepsilon > 0$ and for arbitrarily large m, construct a 3-uniform
4-partite hypergraph with vertex classes $V_1, \ldots, V_4$, each of size m, that
contains no $K_4^{(3)}$ (the system of all triples on 4 vertices), but where
$|\rho(V_i, V_j, V_k) - \frac{1}{2}| \le \varepsilon$ for all $i < j < k$ and each triple $(V_i, V_j, V_k)$ is
$\varepsilon$-regular. The latter condition means $|\rho(A_i, A_j, A_k) - \rho(V_i, V_j, V_k)| \le \varepsilon$
for every $A_i \subseteq V_i$, $A_j \subseteq V_j$, $A_k \subseteq V_k$ of size at least $\varepsilon m$.
lemma, and we apply the weak regularity lemma (Theorem 9.4.1) to $\mathcal{H}$. This
yields sets $Y_1 \subseteq X_1, \ldots, Y_{d+1} \subseteq X_{d+1}$, whose size is at least a fixed fraction of
the size of the $X_i$, and such that any subsets $Z_1 \subseteq Y_1, \ldots, Z_{d+1} \subseteq Y_{d+1}$ of size
at least $\varepsilon |Y_i|$ induce an edge; this means that there is a rainbow X-simplex
with vertices in the $Z_i$ and containing the point a.
The argument is finished by applying the same-type lemma with the d+2
sets $Y_1, Y_2, \ldots, Y_{d+1}$ and $Y_{d+2} = \{a\}$. We obtain sets $Z_1 \subseteq Y_1, \ldots, Z_{d+1} \subseteq Y_{d+1}$
and $Z_{d+2} = \{a\}$ with same-type transversals, and with $|Z_i| \ge \varepsilon |Y_i|$
for $i = 1, 2, \ldots, d+1$. (Indeed, the same-type lemma guarantees that at least
one point is selected even from a 1-point set.) Now either all transversals
of $(Z_1, \ldots, Z_{d+1})$ contain the point a in their convex hull or none does (use
Exercise 9.3.1(d)). But the latter possibility is excluded by the choice of the
$Y_i$ (by the weak regularity lemma). The positive-fraction selection lemma is
proved. □
It is amazing how many quite heavy tools are used in this proof. It would
be nice to find a more direct argument.
$\nu(\mathcal{F}) \le \tau(\mathcal{F})$.
In the reverse direction, very little can be said in general, since $\tau(\mathcal{F})$ can be
arbitrarily large even if $\nu(\mathcal{F}) = 1$. As a simple geometric example, we can
take the plane as the ground set X and let the sets of $\mathcal{F}$ be n lines in general
position. Then $\nu(\mathcal{F}) = 1$, since every two lines intersect, but $\tau(\mathcal{F}) \ge \frac{1}{2} n$,
because no point is contained in more than two of the lines.
Fractional packing and transversal numbers. Now we introduce another
parameter of a set system, which always lies between $\nu$ and $\tau$ and which
has proved extremely useful in arguments estimating $\tau$ or $\nu$. First we restrict
ourselves to set systems on finite ground sets.
Let $\mathcal{F}$ be a system of subsets of a finite set X. A fractional transversal for
$\mathcal{F}$ is a function $\varphi\colon X \to [0,1]$ such that for each $S \in \mathcal{F}$, we have $\sum_{x \in S} \varphi(x) \ge 1$.
The size of a fractional transversal $\varphi$ is $\sum_{x \in X} \varphi(x)$, and the fractional
transversal number $\tau^*(\mathcal{F})$ is the infimum of the sizes of fractional transversals.
So in a fractional transversal, we can take one-third of one point, one-fifth
of another, etc., but we must put total weight of at least one full point into
every set.
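A tiny example makes the difference between a transversal and a fractional transversal concrete. The sketch below uses the three 2-element subsets of {1, 2, 3} (the edges of a triangle) as the set system; the function names are illustrative, and the transversal number is found by brute force.

```python
from fractions import Fraction
from itertools import combinations


def is_fractional_transversal(phi, family):
    """phi maps points to weights in [0, 1]; every set must get total weight >= 1."""
    weights_ok = all(0 <= w <= 1 for w in phi.values())
    return weights_ok and all(sum(phi.get(x, 0) for x in S) >= 1 for S in family)


def transversal_number(ground, family):
    """Smallest size of a point set meeting every member of the family (brute force)."""
    for k in range(len(ground) + 1):
        for T in combinations(ground, k):
            if all(S & set(T) for S in family):
                return k
    return None  # reached only if the family contains the empty set


ground = [1, 2, 3]
family = [{1, 2}, {2, 3}, {1, 3}]
phi = {x: Fraction(1, 2) for x in ground}   # half a point everywhere
size = sum(phi.values())                    # total weight 3/2
```

Putting weight 1/2 on each point gives a fractional transversal of size 3/2, whereas any genuine transversal of this system needs 2 points, so the fractional transversal number is at most 3/2 and strictly smaller than the transversal number here.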
in particular, both the minimum and the maximum are well-defined and
attained.
This result can be quickly proved by piecing together a larger matrix from
A, b, and c and applying a suitable version of the Farkas lemma (Lemma 1.2.5)
to it (Exercise 6). It can also be derived directly from the separation theorem.
Let us remark that there are several versions of the linear programming
duality (differing, for example, in including or omitting the requirement
$x \ge 0$, or replacing $Ax \ge b$ by $Ax = b$, or exchanging minima and maxima), and
they are easy to mix up.
Proof of Theorem 10.1.1. Set $n = |X|$ and $m = |\mathcal{F}|$, and let A be the
$m \times n$ incidence matrix of the set system $\mathcal{F}$: Rows correspond to sets, columns
to points, and the entry corresponding to a point p and a set S is 1 if $p \in S$
and 0 if $p \notin S$. It is easy to check that $\nu^*(\mathcal{F})$ and $\tau^*(\mathcal{F})$ are solutions to the
following optimization problems:
For the definition of $\tau^*$, the first attempt might be to consider all functions
$\varphi\colon X \to [0, 1]$ attaining only finitely many nonzero values and summing up to
at least 1 over every set. But this does not work very well: For example, if we
let $\mathcal{F}$ be the system of all compact subsets of [0, 1] of Lebesgue measure $\frac{1}{2}$,
say, then $\nu^* \le 2$ but $\tau^*$ would be infinite, since any finite subset is avoided
by some member of $\mathcal{F}$. It is better to define a fractional transversal of $\mathcal{F}$ as
a Borel measure $\mu$ on X such that $\mu(S) \ge 1$ for all $S \in \mathcal{F}$, and $\tau^*(\mathcal{F})$ as
the infimum of $\mu(X)$ over all such $\mu$. With this definition, the validity of the
first part of Theorem 10.1.1 is preserved; i.e., $\nu^*(\mathcal{F}) = \tau^*(\mathcal{F})$ for all systems $\mathcal{F}$
of closed sets in a compact X. The proof uses a little functional analysis,
and we omit it; it can be found in [KM97a]. The rationality of $\nu^*$ and $\tau^*$ no
longer holds in the infinite case.
lution whose size is no more than $(1 + \ln |X|)$ times larger than the
optimal one. Lovász actually observed that the proof implies, for any
finite set system $\mathcal{F}$,
Exercises
1. (a) Find examples of set systems with $\tau^*$ bounded by a constant and $\tau$
arbitrarily large.
(b) Find examples of set systems with $\nu$ bounded by a constant and $\nu^*$
arbitrarily large.
2. Let $\mathcal{F}$ be a system of finitely many closed intervals on the real line. Prove
that $\nu(\mathcal{F}) = \tau(\mathcal{F})$.
3. Prove that
$\tau(\mathcal{F}) \le \tau^*(\mathcal{F}) \cdot \ln(|\mathcal{F}| + 1)$
for all (finite) set systems $\mathcal{F}$. Choose a transversal as a random sample.
4. (Analysis of the greedy algorithm for transversal) Let $\mathcal{F}$ be a finite set
system. We choose points $x_1, x_2, \ldots, x_t$ of a transversal one by one: $x_i$ is
taken as a point contained in the maximum possible number of uncovered
sets (i.e., sets of $\mathcal{F}$ containing none of $x_1, \ldots, x_{i-1}$).
(a) Prove that the size t of the resulting transversal satisfies
$\mathcal{F}|_Y = \{S \cap Y : S \in \mathcal{F}\}$.
It may happen that several distinct sets in $\mathcal{F}$ have the same intersection with
Y; in such a case, the intersection is still present only once in $\mathcal{F}|_Y$.
10.2.3 Definition (VC-dimension). Let $\mathcal{F}$ be a set system on a set X.
Let us say that a subset $A \subseteq X$ is shattered by $\mathcal{F}$ if each of the subsets of A
can be obtained as the intersection of some $S \in \mathcal{F}$ with A, i.e., if $\mathcal{F}|_A = 2^A$.
We define the VC-dimension of $\mathcal{F}$, denoted by $\dim(\mathcal{F})$, as the supremum of
the sizes of all finite shattered subsets of X. If arbitrarily large subsets can
be shattered, the VC-dimension is $\infty$.
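For a finite set system, the definition can be checked directly by enumerating traces. The following brute-force sketch is meant only for small examples; the function names are illustrative, and the family is assumed to be nonempty and given as Python sets. It uses the fact that every subset of a shattered set is shattered, so the search can stop at the first size for which nothing is shattered.

```python
from itertools import combinations


def is_shattered(A, family):
    """True if every subset of A arises as the intersection of A with a member of the family."""
    A = frozenset(A)
    traces = {A & S for S in family}      # the restriction of the family to A
    return len(traces) == 2 ** len(A)


def vc_dimension(ground, family):
    """Largest size of a shattered subset of a finite ground set (family assumed nonempty)."""
    ground = list(ground)
    best = 0
    for k in range(1, len(ground) + 1):
        if any(is_shattered(A, family) for A in combinations(ground, k)):
            best = k
        else:
            break                          # no shattered k-set, hence none larger
    return best


# All intervals {i, ..., j} of the ground set {1, ..., 5}.
intervals = [set(range(i, j + 1)) for i in range(1, 6) for j in range(i, 6)]
```

For this system of intervals the pair {1, 2} is shattered, but no triple a < b < c is, since an interval containing a and c must contain b; so the VC-dimension is 2.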
Let us consider two examples. First, let $\mathcal{H}$ be the system of all closed
half-planes in the plane. We claim that $\dim(\mathcal{H}) = 3$. If we have 3 points in
general position, each of their subsets can be cut off by a half-plane, and so
such a 3-point set is shattered. Next, let us check that no 4-point set can be
shattered. Up to possible degeneracies, there are only two essentially different
positions of 4 points in the plane:
[Figure: the two positions of 4 points, each with some points drawn black and the others white.]
In both these cases, if the black points are contained in a half-plane, then
a white point also lies in that half-plane, and so the 4 points are not shattered.
This is a rather ad hoc argument, and later we will introduce tools
for bounding the VC-dimension in geometric situations. We will see that
bounded VC-dimension is rather common for families of simple geometric
objects in Euclidean spaces.
A rather different example is the system $\mathcal{K}_2$ of all convex sets in the plane.
Here the VC-dimension is infinite, since any finite convex independent set A
is shattered: Each $B \subseteq A$ can be expressed as the intersection of A with a
convex set, namely, $B = A \cap \operatorname{conv}(B)$.
shattered by $\mathcal{F}$. Therefore, $|\mathcal{F}_2| \le \Phi_{d-1}(n-1)$. The resulting recurrence has
already been solved in the first proof of Proposition 6.1.1. □
The rest of the proof of the epsilon net theorem is a clever probabilistic
argument; one might be tempted to believe that it works by some magic.
First we need a technical lemma concerning the binomial distribution.
2 This double sampling resembles the proof of Proposition 6.5.2, and indeed these
proofs have a lot in common, although they work in different settings.
$$\frac{\binom{2s-k}{s}}{\binom{2s}{s}} \le \left(1 - \frac{k}{2s}\right)^{s} \le e^{-(k/2s)s} = e^{-k/2} = e^{-(Cd \ln r)/4} = r^{-Cd/4}.$$
The epsilon net theorem implies that for set systems of small VC-dimension,
the gap between the fractional transversal number and the transversal
number cannot be too large.
10.2.7 Corollary. Let $\mathcal{F}$ be a finite set system on a ground set X with
$\dim(\mathcal{F}) \le d$. Then we have
$$\left| \mu(S) - \frac{|A \cap S|}{|A|} \right| \le \varepsilon.$$
So while an $\varepsilon$-net intersects each large set at least once, an
$\varepsilon$-approximation provides a "proportional representation" up to the
error of $\varepsilon$. Vapnik and Chervonenkis [VC71] proved the existence of
$\frac{1}{r}$-approximations of size $O(d r^2 \log r)$ for all set systems of
VC-dimension d.
Komlós, Pach, and Woginger [KPW92] improved the dependence
on d in the Haussler-Welzl bound on the size of $\varepsilon$-nets. The improvement
is achieved by choosing the second sample M of size t somewhat
larger than s and doing the calculations more carefully. They also
proved an almost matching lower bound using suitable random set
systems. The proofs can be found in [PA95] as well.
The proof in the Vapnik-Chervonenkis style, while short and
clever, does not seem to convey very well the reasons for the existence
of small $\varepsilon$-nets. Somewhat longer but more intuitive proofs have been
found in the investigation of deterministic algorithms for constructing
$\varepsilon$-approximations and $\varepsilon$-nets; one such proof is given in [Mat99a], for
instance.
Exercises
1. Show that for any integer d there exists a convex set C in the plane such
that the family of all isometric copies of C has VC-dimension at least d.
2. Show that the shatter function lemma is tight. That is, for all d and n
construct a system of VC-dimension d on n points with $\Phi_d(n)$ sets.
Proof. The following simple but powerful trick is known as the Veronese
mapping in algebraic geometry (or as linearization; it is also related to the
reduction of Voronoi diagrams to convex polytopes in Section 5.7). Let M
be the set of all possible nonconstant monomials of degree at most D in
$x_1, \ldots, x_d$. For example, for D = d = 2, we have $M = \{x_1, x_2, x_1x_2, x_1^2, x_2^2\}$.
Let m = |M| and let the coordinates in $\mathbb{R}^m$ be indexed by the monomials
in M. Define the map $\varphi\colon \mathbb{R}^d \to \mathbb{R}^m$ by $\varphi(x)_\mu = \mu(x)$, where the monomial $\mu$
serves as a formal symbol (index) on the left-hand side, while on the right-hand
side we have the number obtained by evaluating $\mu$ at the point $x \in \mathbb{R}^d$.
For example, for d = D = 2, the map is
$$\varphi(x_1, x_2) = (x_1, x_2, x_1x_2, x_1^2, x_2^2).$$
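The linearization can be tried out directly. The sketch below is illustrative only: the function names are ours, monomials are encoded as sorted tuples of variable indices, and the coordinates come out ordered by degree (the order of the coordinates is immaterial for the argument).

```python
from itertools import combinations_with_replacement
from math import prod

def monomials(d: int, D: int):
    """All nonconstant monomials of degree <= D in d variables.

    A monomial is a sorted tuple of variable indices, e.g. (0, 1) is x_1*x_2.
    """
    return [m for deg in range(1, D + 1)
            for m in combinations_with_replacement(range(d), deg)]

def veronese(x, D: int):
    """Image of the point x under the Veronese map of degree D."""
    return [prod(x[i] for i in m) for m in monomials(len(x), D)]

def poly_value(coeffs, const, x, D: int):
    """Evaluate a polynomial as an affine function of the lifted point."""
    return sum(c * y for c, y in zip(coeffs, veronese(x, D))) + const

# For d = D = 2 the coordinates are (x1, x2, x1^2, x1*x2, x2^2), so the
# polynomial x1^2 + x2^2 - 1 has coefficient vector [0, 0, 1, 0, 1] and
# constant term -1; the unit disk becomes the preimage of a half-space.
inside = poly_value([0, 0, 1, 0, 1], -1, (0.5, 0.5), 2)
outside = poly_value([0, 0, 1, 0, 1], -1, (1.0, 1.0), 2)
```

The point of the exercise is that a polynomial inequality in $\mathbb{R}^d$ turns into a linear inequality in $\mathbb{R}^m$, so bounds for half-spaces transfer to sets defined by polynomials.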
Let $\mathcal{S}$ be a set system on a ground set X with $\dim(\mathcal{S}) = d < \infty$. Let
$$\mathcal{T} = \{F(S_1, \ldots, S_k) : S_1, \ldots, S_k \in \mathcal{S}\}.$$
Then $\dim(\mathcal{T}) = O(kd\ln k)$.
$\{y_S : S \in \mathcal{F}\}$, where the $y_S$ are pairwise distinct points, and for each $x \in X$
we have the set $\{y_S : S \in \mathcal{F},\ x \in S\}$ (the same set may be obtained for several
different x, but this does not matter for the VC-dimension).
10.3.4 Lemma. Let $(X, \mathcal{F})$ be a set system and let $(Y, \mathcal{G})$ be the dual set
system. Then $\dim(\mathcal{G}) < 2^{\dim(\mathcal{F})+1}$.
Proof. We show that if $\dim(\mathcal{G}) \ge 2^d$, then $\dim(\mathcal{F}) \ge d$. Let A be the
incidence matrix of $(X, \mathcal{F})$, with columns corresponding to points of X and rows
corresponding to sets of $\mathcal{F}$. Then the transposed matrix $A^T$ is the incidence
matrix of $(Y, \mathcal{G})$. If Y contains a shattered set of size $2^d$, then A has a $2^d \times 2^{2^d}$
submatrix M with all the possible 0/1 vectors of length $2^d$ as columns. We
claim that M contains as a submatrix the $2^d \times d$ matrix $M_1$ with all
possible 0/1 vectors of length d as rows. This is simply because the d columns
of $M_1$ are pairwise distinct and they all occur as columns of M. This $M_1$
corresponds to a shattered subset of size d in $(X, \mathcal{F})$. Here is an example for
d = 2:
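$$M = \begin{pmatrix}
0&0&0&\mathbf{0}&0&\mathbf{0}&0&0&1&1&1&1&1&1&1&1\\
0&0&0&\mathbf{0}&1&\mathbf{1}&1&1&0&0&0&0&1&1&1&1\\
0&0&1&\mathbf{1}&0&\mathbf{0}&1&1&0&0&1&1&0&0&1&1\\
0&1&0&\mathbf{1}&0&\mathbf{1}&0&1&0&1&0&1&0&1&0&1
\end{pmatrix};$$
the submatrix $M_1$ is marked bold. □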
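The key step of the proof, picking d suitable columns out of a matrix that has every 0/1 vector of length $2^d$ as a column, can be carried out mechanically. The following sketch is one way to do it; the helper names and the lexicographic ordering of the columns are our assumptions.

```python
from itertools import product

def all_columns_matrix(n: int):
    """n x 2^n matrix whose columns are all 0/1 vectors of length n (lexicographic)."""
    cols = list(product((0, 1), repeat=n))
    rows = [[col[i] for col in cols] for i in range(n)]
    return rows, cols

def find_M1(d: int):
    """Indices of d columns of M whose rows run through all 0/1 vectors of length d."""
    n = 2 ** d
    M, cols = all_columns_matrix(n)
    # Column j of M1 lists bit j (most significant first) of the row index i.
    wanted = [tuple((i >> (d - 1 - j)) & 1 for i in range(n)) for j in range(d)]
    idx = [cols.index(w) for w in wanted]
    M1 = [[M[i][c] for c in idx] for i in range(n)]
    return idx, M1

idx, M1 = find_M1(2)
```

For d = 2 this selects the columns $(0,0,1,1)^T$ and $(0,1,0,1)^T$ of the 4 x 16 matrix (positions 3 and 5 when counting from 0), and the four rows of $M_1$ are 00, 01, 10, 11.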
An art gallery problem. An art gallery, for the purposes of this section, is
a compact set X in the plane, such as the one drawn in the following picture:
The set X is the lightly shaded area, while the black regions are walls that
are not part of X. We want to choose a small set $G \subset X$ of guards that
10.3 Bounding the VC-Dimension and Applications 247
Proof. The bound $O(r\log r)$ for the number of guards is obtained from the
epsilon net theorem (Theorem 10.2.4). Namely, we introduce the set system
$\mathcal{V} = \{V(x) : x \in X\}$, and note that G is a set guarding all of X if and only
if it is a transversal of $\mathcal{V}$. Further, an $\varepsilon$-net for $(X, \mathcal{V})$ with respect to $\mu$ is a
transversal of $\mathcal{V}$, since by the assumption, $\mu(V) \ge \varepsilon = \frac{1}{r}$ for each $V \in \mathcal{V}$. So
the theorem will be proved if we can show that $\dim(\mathcal{V})$ is bounded by some
constant (independent of X).
Tools like Proposition 10.3.2 and Proposition 10.3.3 seem to be of little
use, since the visibility regions can be arbitrarily complicated. We thus need
a different strategy, one that can make use of the simple connectedness. We
248 Chapter 10: Transversals and Epsilon Nets
that each point of $\Sigma_3$ sees all points of $A_3$ within an angle smaller than $\pi$
and in the same clockwise angular order; let $\le_A$ be this linear order of the
points of $A_3$. Similarly, we have a common counterclockwise angular order
$\le_\Sigma$ of points of $\Sigma_3$ around any point of $A_3$.
Suppose that the initial d was so large that $d_3 = |A_3| = 5$. For each
$a \in A_3$, we consider the point $\sigma(a) \in \Sigma_3$ that sees all points of $A_3$ but a.
Let these 5 points form a set $\Sigma_4 \subset \Sigma_3$. We have a situation indicated below,
where dashed connecting segments correspond to invisibility and they form
a matching between $A_3$ and $\Sigma_4$.
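[Figure: the points of $A_3$ and $\Sigma_4$, with dashed segments indicating invisibility; the labels a' and a'' appear in the picture.]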
The segments aa' and a' a both lie above the line aa, and they intersect as
indicated (a' cannot lie in the triangle aaa', because the line aa' would go
between a and a', and neither can the segment aa' be outside that triangle,
because then the line aa' would separate a from a'). Similarly, the segments
aa" and a" a intersect as shown. The four segments aa', a' a, aa", and a" a are
contained in X, and since X is simply connected, the shaded quadrilateral
bounded by them must be a part of X. Hence a and a can see each other.
This contradiction proves Theorem 10.3.5. □
The bound on the VC-dimension obtained from this proof is rather large:
about $10^{12}$. By a more careful analysis, avoiding the use of Lemma 10.3.4 on
the dual VC-dimension where one loses the most, the bound has been im-
proved to 23. Determining the exact VC-dimension in the worst case might
be quite challenging. The art gallery drawn in the initial picture is not chosen
only because of the author's liking for several baroque buildings with pentag-
onal symmetry, but also because it is an example where V has VC-dimension
at least 5 (Exercise 2). A more complicated example gives VC-dimension 6,
and this is the current best lower bound.
Exercises
1. (a) Determine the VC-dimension of the set system consisting of all
triangles in the plane.
(b) What is the VC-dimension of the system of all convex k-gons in the
plane, for a given integer k? [2]
10.4 Weak Epsilon Nets for Convex Sets 251
2. Show that $\dim(\mathcal{V}) \ge 5$ for the art gallery shown above Theorem 10.3.5.
Can you construct an example with VC-dimension 6, or even higher?
3. Show that the unit square cannot be expressed as $\{(x, y) \in \mathbb{R}^2 : p(x, y) \ge 0\}$
for any polynomial p(x, y).
4. (a) Let H be a finite set of lines in the plane. For a triangle T, let $H_T$ be
the set of lines of H intersecting the interior of T, and let $\mathcal{T} \subseteq 2^H$ be the
system of the sets $H_T$ for all triangles T. Show that the VC-dimension
of $\mathcal{T}$ is bounded by a constant.
(b) Using (a) and the epsilon net theorem, prove the suboptimal cutting
lemma (Lemma 6.5.1): For every finite set H of lines in the plane
and for every r, $1 < r < |H|$, there exists a $\frac{1}{r}$-cutting for H consisting
of $O(r^2\log^2 r)$ generalized triangles. Use the proof in Section 4.6 as an
inspiration.
(c) Generalize (a) and (b) to obtain a cutting lemma for circles with the
same bound $O(r^2\log^2 r)$ (see Exercise 4.6.3).
5. Let $d \ge 1$ be an integer, let $U = \{1, 2, \ldots, d\}$ and $V = 2^U$. Let the
shattering graph $SG_d$ have vertex set $U \cup V$ and edge set $\{\{a, A\} : a \in U, A \in V, a \in A\}$.
Prove that if H is a bipartite graph with classes R and
S, |R| = r and |S| = s, such that $r + \log_2 s \le d$, then there is an r-element
subset $R_1 \subseteq U$ and an s-element $S_1 \subseteq V$ such that the subgraph induced
in $SG_d$ by $R_1 \cup S_1$ is isomorphic to H. Thus, the shattering graph is
"universal": It contains all sufficiently small bipartite subgraphs.
6. For a graph G, let $\mathcal{N}(G) = \{N_G(v) : v \in V(G)\}$ be the system of vertex
neighborhoods (where $N_G(v) = \{u \in V(G) : \{u, v\} \in E(G)\}$).
(a) Prove that there is a constant $d_0$ such that $\dim(\mathcal{N}(G)) \le d_0$ for all
planar G.
(b) Show that for every C there exists d = d(C) such that if G is a
graph in which every subgraph on n vertices has at most Cn edges, for
all $n \ge 1$, then $\dim(\mathcal{N}(G)) \le d$. (This implies (a) and, more generally,
shows that bounded genus of G implies bounded $\dim(\mathcal{N}(G))$.)
(c) Show that for every k there exists d = d(k) such that if $\dim(\mathcal{N}(G)) \ge d$,
then G contains a subdivision of the complete graph $K_k$ as a subgraph.
(This gives an alternative proof that if $\dim(\mathcal{N}(G))$ is large, then the genus
of G is large, too.)
Is this the best way? No; according to Definition 10.2.2, three points placed
as in the picture below form a valid $\varepsilon$-net for every $\varepsilon \ge 0$, since any half-plane
cutting into D necessarily contains at least one of them!
[Figure: the disk D and three points placed outside it.]
One may feel that this is cheating. The problem is that the points of this
$\varepsilon$-net are far away from where the measure is concentrated. For some
applications of $\varepsilon$-nets this is not permissible, and for this reason, $\varepsilon$-nets of this kind
are usually called weak $\varepsilon$-nets in the literature, while a "real" $\varepsilon$-net in the
above example would be required to have all of its points inside the disk D.
For $\varepsilon$-nets obtained using the epsilon net theorem (Theorem 10.2.4), this
presents no real problem, since we can always restrict the considered set
system to the subset where we want our $\varepsilon$-net to lie. In the above example
we would simply require an $\varepsilon$-net for the set system $(D, \mathcal{H}|_D)$. The restriction
to a subset does not increase the VC-dimension.
On the other hand, there are set systems of infinite VC-dimension, and
there we cannot require small $\varepsilon$-nets to exist for every restriction of the ground
set. Indeed, if $(X, \mathcal{F})$ has infinite VC-dimension, then by definition, there is
an arbitrarily large $A \subseteq X$ that is shattered by $\mathcal{F}$, meaning that $\mathcal{F}|_A = 2^A$.
And the complete set system $(A, 2^A)$ certainly does not admit small $\varepsilon$-nets:
Any $\frac{1}{2}$-net, say, for $(A, 2^A)$ with respect to the uniform measure on A must
have at least $\frac{1}{2}|A|$ elements! In this sense, the epsilon net theorem is an "if
and only if" result: A set system $(X, \mathcal{F})$ and all of its restrictions to smaller
ground sets admit $\varepsilon$-nets of size depending only on $\varepsilon$ if and only if $\dim(\mathcal{F})$ is
finite.
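The claim about the complete set system is easy to confirm by brute force for small ground sets. The sketch below is ours, and it assumes the convention that an $\varepsilon$-net must meet every set of measure at least $\varepsilon$.

```python
from itertools import combinations

def min_net_size_complete(m: int) -> int:
    """Smallest 1/2-net for (A, 2^A), |A| = m >= 1, uniform measure on A."""
    A = range(m)
    # Sets of measure at least 1/2 are exactly those with at least m/2 elements.
    large_sets = [set(S) for size in range(m + 1)
                  for S in combinations(A, size) if 2 * size >= m]
    for t in range(m + 1):
        for N in combinations(A, t):
            chosen = set(N)
            if all(S & chosen for S in large_sets):
                return t
    return m

sizes = [min_net_size_complete(m) for m in range(1, 8)]
```

A set N fails as soon as its complement has at least m/2 elements, so the minimum comes out as $\lfloor m/2 \rfloor + 1$, in agreement with the bound of $\frac{1}{2}|A|$ stated above.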
As was mentioned after the definition of VC-dimension, the (important)
system $\mathcal{K}_2$ of convex sets in the plane has infinite VC-dimension. Therefore,
the epsilon net theorem is not applicable, and we know that restrictions of
$\mathcal{K}_2$ to some bad ground sets (convex independent sets, in this case) provide
arbitrarily large complete set systems. Yet it turns out that not too
large (weak) $\varepsilon$-nets exist if the ground set is taken to be the whole plane
(or, actually, it can be restricted to any convex set). These are much less
understood than the $\varepsilon$-nets in the case of finite VC-dimension, and many
interesting questions remain open.
As has been done in the literature, we will restrict ourselves to measures
concentrated on finite point sets, and first we will talk about uniform
measures. To be on the safe side, let us restate the definition for this particular
case, keeping the traditional terminology of "weak $\varepsilon$-nets."
10.4.1 Definition (Weak epsilon net for convex sets). Let X be a
finite point set in $\mathbb{R}^d$ and $\varepsilon > 0$ a real number. A set $N \subseteq \mathbb{R}^d$ is called a
weak $\varepsilon$-net for convex sets with respect to X if every convex set containing
at least $\varepsilon|X|$ points of X contains a point of N.
In the rest of this section we consider exclusively $\varepsilon$-nets with respect to
convex sets, and so instead of "weak $\varepsilon$-net for convex sets with respect to X"
we simply say "weak $\varepsilon$-net for X."
The best known bounds are $f(2, \frac{1}{r}) = O(r^2)$ in the plane and $f(d, \frac{1}{r}) = O(r^d(\log r)^{b(d)})$
for every fixed d, with a suitable constant b(d) > 0. The proof
shown below gives $f(d, \frac{1}{r}) = O(r^{d+1})$. On the other hand, no lower bound
superlinear in r is known (for fixed d).
Proof. The proof is simple once we have the first selection lemma
(Theorem 9.1.1) at our disposal.
Let $X \subset \mathbb{R}^d$ be an n-point set. The required weak $\varepsilon$-net N is
constructed by a greedy algorithm. Set $N_0 = \emptyset$. If $N_i$ has already been
constructed, we check whether there is a convex set C containing at least $\varepsilon n$
points of X and no point of $N_i$. If not, $N_i$ is a weak $\varepsilon$-net by definition. If
yes, we set $X_i = X \cap C$, and we apply the first selection lemma to $X_i$. This
gives us a point $a_i$ contained in at least $c_d\binom{|X_i|}{d+1} = \Omega(\varepsilon^{d+1}n^{d+1})$ $X_i$-simplices.
We set $N_{i+1} = N_i \cup \{a_i\}$ and continue with the next step of the algorithm.
Altogether there are $\binom{n}{d+1}$ X-simplices. In each step of the algorithm, at
least $\Omega(\varepsilon^{d+1}n^{d+1})$ of them are "killed," meaning that they were not
intersected by $N_i$ but are intersected by $N_{i+1}$. Hence the algorithm takes at most
$O(\varepsilon^{-(d+1)})$ steps. □
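The greedy scheme is easiest to see on the line, where convex sets are intervals and the point supplied by the first selection lemma can be taken to be a median. The following sketch covers only this one-dimensional case, which is our simplification; in higher dimensions both finding the set C and finding the point $a_i$ are much harder.

```python
from math import ceil

def greedy_weak_net_1d(points, eps):
    """Greedy weak eps-net for intervals with respect to a finite set of reals."""
    xs = sorted(points)
    n = len(xs)
    k = max(1, ceil(eps * n))   # an interval is "large" if it holds >= k points
    net = []
    while True:
        # Some large interval misses the net iff some window of k consecutive
        # points has no net point in its span; take the first such window.
        bad = next((i for i in range(n - k + 1)
                    if not any(xs[i] <= p <= xs[i + k - 1] for p in net)), None)
        if bad is None:
            return net
        window = xs[bad:bad + k]               # plays the role of X_i
        net.append(window[len(window) // 2])   # median, lies in many segments

net = greedy_weak_net_1d(range(20), 0.25)
```

On 20 equally spaced points with $\varepsilon = \frac{1}{4}$ the loop stops after six steps. The chosen points happen to lie in X here, which a weak net does not require.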
Sketch of proof. By taking $\varepsilon$ a little smaller, we can make the point weights
rational. Then the problem is reduced to the weak epsilon net theorem with
X a multiset. One can check that all ingredients of the proof go through in
this case, too. □
Exercises
1. Complete the following sketch of an alternative proof of the weak epsilon
net theorem.
(a) Let X be an n-point set in the plane (assume general position if
convenient). Let h be a vertical line with half of the points of X on each
side, and let $X_1, X_2$ be these halves. Let M be the set of all intersections
of segments of the form $x_1x_2$ with h, where $x_1 \in X_1$ and $x_2 \in X_2$.
Let $N_0$ be a weak $\varepsilon'$-net for M (this is a one-dimensional situation!).
Recursively construct weak $\varepsilon''$-nets $N_1, N_2$ for $X_1$ and $X_2$, respectively,
and set $N = N_0 \cup N_1 \cup N_2$. Show that with a suitable choice of $\varepsilon'$ and
$\varepsilon''$, N is a weak $\varepsilon$-net for X of size $O(\varepsilon^{-2})$.
(b) Generalize the proof from (a) to $\mathbb{R}^d$ (use induction on d). Estimate
the exponent of $\varepsilon$ in the resulting bound on the size of the constructed
weak $\varepsilon$-net.
10.5 The Hadwiger-Debrunner (p, q)-Problem 255
2. The aim of this exercise is to show that if X is a finite set in the plane
in convex position, then for any $\varepsilon > 0$ there exists a weak $\varepsilon$-net for X of
size nearly linear in $\frac{1}{\varepsilon}$.
(a) Let an n-point convex independent set $X \subset \mathbb{R}^2$ be given and let
$\ell \le n$ be a parameter. Choose points $p_0, p_1, \ldots, p_{\ell-1}$ of X, appearing in
this order around the circumference of conv(X), in such a way that the
set $X_i$ of points of X lying (strictly) between $p_{i-1}$ and $p_i$ has at most $n/\ell$
points for each i. Construct a weak $\varepsilon'$-net $N_i$ for each $X_i$ (recursively)
with $\varepsilon' = \ell\varepsilon/3$, and let M be the set containing the intersection of the
segment $p_0p_{j-1}$ with $p_jp_i$, for all pairs i, j, $1 \le i < j-1 \le \ell-2$. Show
that the set $N = \{p_0, \ldots, p_{\ell-1}\} \cup N_1 \cup \cdots \cup N_\ell \cup M$ is a weak $\varepsilon$-net for
X.
(b) If $f(\varepsilon)$ denotes the minimum necessary size of a weak $\varepsilon$-net for a
finite convex independent point set in the plane, derive a recurrence for
$f(\varepsilon)$ using (a) with a suitably chosen $\ell$, and prove the bound
$f(\varepsilon) = O\bigl(\frac{1}{\varepsilon}(\log\frac{1}{\varepsilon})^c\bigr)$. What is the smallest c you can get?
3. In this exercise we want to show that if X is the vertex set of a regular
convex n-gon in the plane, then there exists a weak $\varepsilon$-net for X of size
$O(\frac{1}{\varepsilon})$.
Suppose X lies on the unit circle u centered at 0. For an arc length $\alpha \le \pi$
radians, let $r(\alpha)$ be the radius of the circle centered at 0 and touching a
chord of u connecting two points on u at arc distance $\alpha$. For i = 0, 1, 2, ...,
let $N_i$ be a set of $\lfloor \frac{c}{\varepsilon(1.01)^i} \rfloor$ points placed at regular intervals on the circle
of radius $r(\varepsilon(1.01)^i/10)$ centered at 0 (we take only those i for which
this is well-defined). Show that $\{0\} \cup \bigcup_i N_i$ is a weak $\varepsilon$-net of size $O(\frac{1}{\varepsilon})$
for X (the constants 1.01, etc., are rather arbitrary and can be greatly
improved).
Proof. The first observation is that if F satisfies the (p, d+ 1)-condition, then
many (d+1)-tuples of sets of F intersect. This can be seen by double counting.
Every p-tuple of sets of F contains (at least) one intersecting (d+1)-tuple,
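[Figure: scheme of the proof of the (p, q)-theorem. The (p, d+1)-condition is linked to "$\tau$ bounded" through linear programming duality ($\nu^* = \tau^*$) and the fact that $\tau$ is bounded by a function of d and $\tau^*$ for systems of convex sets.]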
3 By removing these $\beta n$ sets and iterating, we would get that $\mathcal{F}$ can be pierced by
$O(\log n)$ points. The main point of the (p, q)-theorem is to get rid of this $\log n$
factor.
10.6 A (p, q)-Theorem for Hyperplane Transversals 259
Exercises
1. For which values of p and r does the following hold? Let $\mathcal{F}$ be a finite
family of convex sets in $\mathbb{R}^d$, and suppose that any subfamily consisting
of at most p sets can be pierced by at most r points. Then $\mathcal{F}$ can be
pierced by at most C points, for some $C = C_d(p, r)$.
2. Let $p \ge q \ge d+1$ and $p(d-1) < (q-1)d$. Prove that $\mathrm{HD}_d(p, q) \le p-q+1$.
You may want to start with the case of $\mathrm{HD}_2(5, 4)$.
3. Let $X \subset \mathbb{R}^2$ be a (4k+1)-point set, and let $\mathcal{F} = \{\operatorname{conv}(Y) : Y \subset X, |Y| = 2k+1\}$.
(a) Verify that $\mathcal{F}$ has the (4, 3)-property, and show that if X is in convex
position, then $\tau(\mathcal{F}) \ge 3$.
(b) Show that $\tau(\mathcal{F}) \le 5$ (for any X).
These results are due to Alon and Rosenfeld (private communication).
Let L denote the set of all lines that are common tangents to at least
two disjoint members of $\mathcal{F}$. Since two disjoint convex sets in the plane have
exactly 4 common tangents, $|L| \le 4\binom{n}{2}$.
First, to see the idea, let us make the simplifying assumption that no 3
sets of $\mathcal{F}$ have a common tangent. Then each line $\ell \in L$ has a unique defining
pair of disjoint sets for which it is a common tangent. As we have seen, for
each good triple $\{S_1, S_2, S_3\}$ there is a line $\ell \in L$ such that two sets of the
triple are the defining pair of $\ell$ and the third is intersected by $\ell$. Now, since
we have $\frac{\alpha}{2}\binom{n}{3}$ good triples and $|L| \le 4\binom{n}{2}$, there is an $\ell_0 \in L$ playing this role
for at least $\delta n$ of the good triples, $\delta > 0$. Each of these $\delta n$ triples contains
the defining pair of $\ell_0$ plus some other set, so altogether $\ell_0$ intersects at least
$\delta n$ sets. (Note the similarity to the proof of the fractional Helly theorem.)
Now we need to relax the simplifying assumption. Instead of working with
lines, we work with pairs $(\ell, \{S, S'\})$, where $S, S' \in \mathcal{F}$ are disjoint and $\ell$ is
one of their common tangents, and we let L be the set of all such pairs. We
still have $|L| \le 4\binom{n}{2}$, and each good triple $\{S_1, S_2, S_3\}$ gives rise to at least
one $(\ell, \{S, S'\}) \in L$, where $\{S, S'\} \subset \{S_1, S_2, S_3\}$. The rest of the argument
is as before. □
The interesting feature is that while this fractional Helly theorem is valid,
there is no Helly theorem for line transversals! That is, for all n one can
find families of n disjoint planar convex sets (even segments) such that any
n-1 have a line transversal but there is no line transversal for all of them
(Exercise 5.1.9).
Lemma 10.6.2 implies, exactly as in the proof of Lemma 10.5.2, that $\nu^*_{\mathrm{hyp}}$
is bounded for any family satisfying the (p, d+1)-condition. It remains to
prove a weak $\varepsilon$-net result.
10.6.3 Lemma. Let L be a finite set (or multiset) of lines in the plane and
let $r \ge 1$ be given. Then there exists a set N of $O(r^2)$ lines (a weak $\varepsilon$-net)
such that whenever $S \subseteq \mathbb{R}^2$ is an (arcwise) connected set intersecting more
than $\frac{|L|}{r}$ lines of L, then it intersects a line of N.
Proof. Recall from Section 4.5 that a $\frac{1}{r}$-cutting for a set L of lines is a
collection $\{\Delta_1, \ldots, \Delta_t\}$ of generalized triangles covering the plane such that
the interior of each $\Delta_i$ is intersected by at most $\frac{|L|}{r}$ lines of L. The cutting
lemma (Lemma 4.5.3) guarantees the existence of a $\frac{1}{r}$-cutting of size $O(r^2)$.
The cutting lemma does not directly cover multisets of lines. Nevertheless,
with some care one can check that the perturbation argument works for
multisets of lines as well.
Thus, let $\{\Delta_1, \ldots, \Delta_t\}$ be a $\frac{1}{r}$-cutting for the considered L, $t = O(r^2)$.
The weak $\varepsilon$-net N is obtained by extending each side of each $\Delta_i$ into a line.
Indeed, if an arcwise connected set S intersects more than $\frac{|L|}{r}$ lines of L,
then it cannot be contained in the interior of a single $\Delta_i$, and consequently,
it intersects a line of N. □
Exercises
1. (a) Prove that if $\mathcal{F}$ is a finite family of circular disks in the plane such
that every two members of $\mathcal{F}$ intersect, then $\tau(\mathcal{F})$ is bounded by a
constant (this is a very weak version of Gallai's problem mentioned at the
beginning of this chapter).
(b) Show that for every $p \ge 2$ there is an $n_0$ such that if a family of
$n_0$ disks in the plane satisfies the (p, 2)-condition, then there is a point
common to at least 3 disks of the family.
(c) Prove a (p, 2)-theorem for disks in the plane (or for balls in $\mathbb{R}^d$).
2. A d-interval is a set $J \subseteq \mathbb{R}$ of the form $J = I_1 \cup I_2 \cup \cdots \cup I_d$, where
the $I_j \subset \mathbb{R}$ are closed intervals on the real line. (In the literature this is
customarily called a homogeneous d-interval.)
(a) Let $\mathcal{F}$ be a finite family of d-intervals with $\nu(\mathcal{F}) = k$. The family
may contain multiple copies of the same d-interval. Show that there is a
$\beta = \beta(d, k) > 0$ such that for any such $\mathcal{F}$, there is a point contained in
at least $\beta \cdot |\mathcal{F}|$ members of $\mathcal{F}$. Can you prove this with $\beta = \frac{1}{2dk}$?
(b) Prove that $\tau(\mathcal{F}) \le d\,\tau^*(\mathcal{F})$ for any finite family of d-intervals.
(c) Show that $\tau(\mathcal{F}) \le 2d^2\nu(\mathcal{F})$ for any finite family of d-intervals, or at
least that $\tau$ is bounded by a function of d and $\nu$.
3. Let $\mathcal{K}^d_k$ denote the family of all unions of at most k convex sets in $\mathbb{R}^d$
(so the d-intervals from Exercise 2 are in $\mathcal{K}^1_d$). Prove a (p, d+1)-theorem
for this family by the Alon-Kleitman technique: Whenever a finite
family $\mathcal{F} \subset \mathcal{K}^d_k$ satisfies the (p, d+1)-condition, $\tau(\mathcal{F}) \le f(p, d, k)$ for some
function f.
4. (a) Show that the family $\mathcal{K}^d_k$ as in Exercise 3 has no finite Helly number.
That is, for every h there exists a subfamily $\mathcal{F} \subset \mathcal{K}^d_k$ of h+1 sets in which
every h members intersect but $\bigcap\mathcal{F} = \emptyset$.
(b) Use the result of Exercise 3 to derive that for every $k, d \ge 1$, there
exists an h with the following property. Let $\mathcal{F} \subset \mathcal{K}^d_k$ be a finite family
such that the intersection of any subfamily of $\mathcal{F}$ lies in $\mathcal{K}^d_k$ (i.e., is a union
of at most k convex sets). Suppose that every at most h members of $\mathcal{F}$
have a common point. Then all the sets of $\mathcal{F}$ have a common point. (This
is expressed by saying that the family $\mathcal{K}^d_k$ has Helly order at most h.)
11
[Figure: a planar point set with a 4-facet marked.]
266 Chapter 11: Attempts to Count k-Sets
the d hyperplanes passing through the vertex that are not counted in its
level).
The arrangement of H has at most $O(n^{d-1})$ unbounded cells
(Exercise 6.1.2). Therefore, all but at most $O(n^{d-1})$ cells of level k have a
topmost vertex, and the level of such a vertex is between k-d+1 and k. On
the other hand, every vertex is the topmost vertex of at most one cell
of level k. A similar relation exists between cells of level n-k and
vertices of level n-k-d. Therefore, the number of k-sets of X is at most
$O(n^{d-1}) + \sum_{j=0}^{d-1}\mathrm{KFAC}(X, k-j)$. Conversely, KFAC(X, k) can be bounded
in terms of the number of k-sets; this we leave to Exercise 2. From now on,
we thus consider only estimating $\mathrm{KFAC}_d(n, k)$.
Viewing $\mathrm{KFAC}_d(n, k)$ in terms of the k-level in a hyperplane arrangement,
we obtain some immediate bounds from the results of Section 6.3. The k-level
has certainly no more vertices than all the levels 0 through k together, and
hence
$$\mathrm{KFAC}_d(n, k) = O\bigl(n^{\lfloor d/2\rfloor}(k+1)^{\lceil d/2\rceil}\bigr)$$
by Theorem 6.3.1. On the other hand, the arrangements showing that
Theorem 6.3.1 is tight (constructed using cyclic polytopes) prove that for $k \le n/2$,
we have
$$\mathrm{KFAC}_d(n, k) = \Omega\bigl(n^{\lfloor d/2\rfloor}(k+1)^{\lceil d/2\rceil-1}\bigr);$$
Proof. We use the method of the probabilistic proof of the cutting lemma
from Section 6.5 with only small modifications; we assume familiarity with
that proof. We work in the dual setting, and so we need to bound the number
of vertices of level k in the arrangement of a set H of n hyperplanes in general
position. Since for k bounded by a constant, the complexity of the k-level is
asymptotically determined by Clarkson's theorem on levels (Theorem 6.3.1),
we can assume $2 \le k \le \frac{n}{2}$.
We set $r = \frac{n}{k}$ and $p = \frac{r}{n} = \frac{1}{k}$, and we let $S \subseteq H$ be a random sample
obtained by independent Bernoulli trials with success probability p. This time
we let T(S) denote the bottom-vertex triangulation of the bottom unbounded
cell of the arrangement of S (actually, in this case it seems simpler to use the
top-vertex triangulation instead of the bottom-vertex one); the rest of the
arrangement is ignored. (For d = 2, we can take the vertical decomposition
instead.) Here is a schematic illustration for the planar case:
[Figure: the lines of S, the triangulation T(S), and level k of H.]
The conditions (C0)-(C2) as in Section 6.5 are satisfied for this T(S)
(in (C0) we have constants depending on d, of course), and as for (C3),
we have $|T(S)| = O(|S|^{\lfloor d/2\rfloor} + 1)$ for all $S \subseteq H$ by the asymptotic upper
bound theorem (Theorem 5.5.2) and by the properties of the bottom-vertex
triangulation. Thus, the analogue of Proposition 6.5.2 can be derived: For
every $t \ge 0$, the expected number of simplices with excess at least t in T(S)
is bounded as follows:
(11.1)
$V_k(S) \cap \Delta$ have the same level in the arrangement of $H_\Delta$ (it is k minus the
number of hyperplanes below $\Delta$). By the assumption in the theorem, we thus
have $|V_k(S) \cap \Delta| = O(|H_\Delta|^{d-c_d}) = O\bigl((t_\Delta\frac{n}{r})^{d-c_d}\bigr) = O((t_\Delta k)^{d-c_d})$, where $t_\Delta$
is the excess of $\Delta$. Therefore,
line with constant velocity, how many times can the pair of points with
median distance change?). They showed that n parabolas can be cut
into $O(n^{5/3})$ pieces in total so that the resulting collection of curves
is a family of pseudosegments (see Exercise 6). This idea of cutting
curves into pseudosegments proved to be of great importance for other
problems as well; see the notes to Section 4.5. Tamaki and Tokuyama
obtained the bound of $O(n^{2-1/12})$ for the maximum complexity of the
k-level for n parabolas. Using the tools from [AACS98] and a cutting
into extendible pseudosegments, Chan [Cha00a] improved this bound
to $O(nk^{1-2/9}\log^{2/3}(k+1))$.
All these results can be transferred without much difficulty from
parabolas to pseudocircles, which are closed planar Jordan curves,
every two intersecting at most twice. Aronov and Sharir [AS01a] proved
that if the curves are circles, then even cutting into $O(n^{3/2+\varepsilon})$
pseudosegments is possible (the best known lower bound is $\Omega(n^{4/3})$; see
Exercise 5). This upper bound was extended by Nevo, Pach, Pinchasi,
and Sharir [NPPS01] to certain families of pseudocircles: The
pseudocircles in the family should be selected from a 3-parametric family of
real algebraic curves and satisfy an additional condition; for example,
it suffices that their interiors can be pierced by O(1) points (also see
Alon, Last, Pinchasi, and Sharir [ALPS01] for related things).
Tamaki and Tokuyama constructed a family of n curves with at
most 3 pairwise intersections that cannot be cut into fewer than $\Omega(n^2)$
pseudosegments, demonstrating that their approach cannot yield
nontrivial bounds for the complexity of levels for such general curves
(Exercise 5). However, for graphs of polynomials of degree at most s,
Chan [Cha00a] obtained a cutting into roughly $O(n^{2-1/3^{s-1}})$
pseudosegments and consequently a nontrivial upper bound for levels. His
bound was improved by Nevo et al. [NPPS01].
As for higher-dimensional results, Katoh and Tokuyama [KT99]
proved the bound $O(n^2k^{2/3})$ for the complexity of the k-level for n
triangles in $\mathbb{R}^3$.
Bounds on k-sets have surprising applications. For example, Dey's
results for planar k-sets mentioned above imply that if G is a graph
with n vertices and m edges and each edge has weight that is a linear
function of time, then the minimum spanning tree of G changes at
most $O(mn^{1/3})$ times; see Eppstein [Epp98]. The number of k-sets
of the infinite set $(\mathbb{Z}_0^+)^d$ (lattice points in the nonnegative orthant)
appears in computational algebra in connection with Gröbner bases
of certain ideals. The bounds of $O((k\log k)^{d-1})$ and $\Omega(k^{d-1}\log k)$ for
every fixed d, as well as references, can be found in Wagner [Wag01].
Exercises
1. Verify that for all k and all dimensions d, $\mathrm{KFAC}_d(n, k) \le 2\cdot\mathrm{HFAC}_d(2n+d)$.
2. Show that every vertex in an arrangement of hyperplanes in general
position is the topmost vertex of exactly one cell. For $X \subset \mathbb{R}^d$ finite and in
general position, bound KFAC(X, k) using the numbers of j-sets of X,
$k \le j \le k+d-1$.
3. Suppose that we have a construction that provides an n-point set in the
plane with at least f(n) halving edges for all even n. Show that this
implies $\mathrm{KFAC}_2(n, k) = \Omega(\lfloor n/2k\rfloor f(2k))$ for all $k \le \frac{n}{2}$.
4. Suppose that for all even n, we can construct a planar n-point set with at
least f(n) halving edges. Show that one can construct n-point sets with
$\Omega(nf(n))$ halving facets in $\mathbb{R}^3$ (for infinitely many n, say). Can you
extend the construction to $\mathbb{R}^d$, obtaining $\Omega(n^{d-2}f(n))$ halving facets?
5. (Lower bounds for cutting curves into pseudosegments) In this exercise, $\Gamma$
is a family of n curves in the plane, such as those considered in connection
with Davenport-Schinzel sequences: Each curve intersects every vertical
line exactly once, every two curves intersect at most s times, and no 3
have a common point.
(a) Construct such a family $\Gamma$ with s = 2 (a family of pseudoparabolas)
whose arrangement has $\Omega(n^{4/3})$ empty lenses, where an empty lens is
a bounded cell of the arrangement of $\Gamma$ bounded by two of the curves.
(The number of empty lenses is obviously a lower bound for the number
of cuts required to turn $\Gamma$ into a family of pseudosegments.)
(b) Construct a family $\Gamma$ with s = 3 and with $\Omega(n^2)$ empty lenses.
6. (Cutting pseudoparabolas into pseudosegments) Let $\Gamma$ be a family of n
pseudoparabolas in the plane as in Exercise 5(a). For every two curves
$\gamma, \gamma' \in \Gamma$ with exactly two intersection points, the lens defined by $\gamma$ and
$\gamma'$ consists of the portions of $\gamma$ and $\gamma'$ between their two intersection
points, as indicated in the picture:
the upper envelope U of 1'1,1'2, 1'3 and the lower envelope L of I'i , ... , I'~'
(A more careful argument shows that even $K_{3,3}$ is excluded.)
(b) Show that the graph G in (a) can contain a $K_{2,r}$ for arbitrarily large r.
(c) Given $\Gamma$, define the lens set system $(X, \mathcal{L})$ with X consisting of all
bounded edges of the arrangement of $\Gamma$ and the sets of $\mathcal{L}$ corresponding
to lenses (each lens contributes the set of arrangement edges contained
in its two arcs). Check that $\tau(\mathcal{L})$ is the smallest number of cuts needed
to convert $\Gamma$ into a collection of pseudosegments, and that the result of
(a) implies $\nu(\mathcal{L}) = O(n^{5/3})$.
(d) Using the method of the proof of Clarkson's theorem on levels and
the inequality in Exercise 10.1.4(a), prove that $\tau(\mathcal{L}) = O(n^{5/3})$.
7. (The k-set polytope) Let $X \subset \mathbb{R}^d$ be an n-point set in general position
and let $k \in \{1, 2, \ldots, n-1\}$. The k-set polytope $Q_k(X)$ is the convex hull
of the set
$$\Bigl\{\sum_{x \in S} x : S \subset X,\ |S| = k\Bigr\}$$
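For small planar examples the generating points of $Q_k(X)$ and the hull vertices can be listed directly. The sketch below is only an aid for experimenting with the definition; the helper names are ours, and the hull routine is the standard monotone chain method, which is not something the text prescribes.

```python
from itertools import combinations

def k_subset_sums(X, k):
    """All vector sums of k-element subsets of X (X is a list of coordinate tuples)."""
    return [tuple(map(sum, zip(*S))) for S in combinations(X, k)]

def convex_hull(points):
    """Vertices of the convex hull of planar points, counterclockwise (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

X = [(0, 0), (4, 0), (0, 4), (1, 1)]
vertices_k1 = convex_hull(k_subset_sums(X, 1))
vertices_k2 = convex_hull(k_subset_sums(X, 2))
```

For this X the polytope $Q_1(X)$ is a triangle and $Q_2(X)$ is a hexagon, and the vertex lists can be compared by hand with the 1-sets and 2-sets of X.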
Each of the other middle-level vertices yields 2 vertices of the new middle
level:
[Figure: the bundle of $\ell$ crossing the bundle of $\ell'$.]
11.2 Sets with Many Halving Edges 275
Then we add two new lines $\lambda_v$ and $\mu_v$ as indicated in the next picture, and
we obtain $2a_m$ vertices of the middle level:
[Figure: the added line $\lambda_v$.]
Namely, if the lines of $L_m$ are $\ell_1, \ell_2, \ldots, \ell_{n_m}$, then the vertical spacing in the
bundle of $\ell_i$ is set to $\varepsilon^i$, where $\varepsilon > 0$ is a suitable very small number.
Let $\ell_i$ be a line of $L_m$, and let $d_i$ denote the number of indices j < i such
that $\ell_j$ intersects $\ell_i$ in a vertex of the middle level. In the new arrangement
of $L_{m+1}$ we obtain $a_m$ lines of the bundle of $\ell_i$ and $2d_i$ lines of the form $\lambda_v$
and $\mu_v$, which are almost parallel to $\ell_i$, and $d_i$ of them go above the bundle
and $d_i$ below. Thus, for points not very close to $\ell_i$, the effect is as if $\ell_i$ were
replicated $(a_m + 2d_i)$ times. This is still not good; we would need that all lines
have the same multiplicities. So we let D be the maximum of the $d_i$, and for
each i, we add $D - d_i$ more lines parallel to $\ell_i$ below $\ell_i$ and $D - d_i$ parallel
lines above it.
It remains to define the $a_m$, which are free parameters of the construction.
A good choice is to let $a_m = 4D_m$. Then we have $D_0 = 1$, $D_m = 8^m$, and
$a_m = 4 \cdot 8^m$. From the recurrences above, we further calculate
$$n_m = 2\cdot 6^m\cdot 8^{1+2+\cdots+(m-1)},\qquad f_m = 8^m\cdot 8^{1+2+\cdots+(m-1)}.$$
In the plane, T has a single point and $V_T$ are the other endpoints of the
halving edges emanating from it. In 3 dimensions, conv(T) is a segment, and
a typical picture might look as follows:
We claim that for any two angularly consecutive segments, such as at and bt,
the angle opposite the angle atb contains a point of $V_T$ (such as z). Indeed,
the hyperplane passing through t and a has exactly $\frac{n-d}{2}$ points of X in
both of its open half-spaces. If we start rotating it around T towards b, the
point a enters one of the open half-spaces (in the picture, the one below the
rotating hyperplane). But just before we reach b, that half-space again has
$\frac{n-d}{2}$ points. Hence there was a moment when the number of points in this
half-space went from $\frac{n-d}{2}+1$ to $\frac{n-d}{2}$, and this must have been a moment of
reaching a suitable z.
This means that for every two consecutive points of $V_T$, there is at least
one point of $V_T$ in the corresponding opposite wedge. There is actually exactly
one, for if there were two, their opposite wedge would have to contain another
point. Therefore, the numbers of points of $V_T$ in the half-spaces determined
by h differ exactly by 1.
To finish the proof of the lemma, it remains to observe that if we start
rotating the hyperplane h around T in either direction, the first point of $V_T$
encountered must be in the larger half-space. So the larger half-space has
one more point of $V_T$ than the smaller half-space. (Recall that the larger
half-space is defined with respect to X, and so we did not just parrot the
definition here.) □
Proof. We can move $\ell$ a little so that it intersects the relative interiors of the
same halving facets as before but intersects no boundary of a halving facet.
Next, we start translating $\ell$ in a suitably chosen direction. (In the plane there
are just two directions, and both of them will do.) The direction is selected
so that we never cross any $(d-3)$-dimensional flat determined by the points
of $X$. To this end, we need to find a two-dimensional plane passing through
$\ell$ and avoiding finitely many $(d-3)$-dimensional flats in $\mathbf{R}^d$, none of them
intersecting $\ell$; this is always possible.
11.3 The Lovasz Lemma and Upper Bounds in All Dimensions 279
11.3.3 Theorem. For each $d \ge 2$, the maximum number of halving facets
satisfies
$\mathrm{HFAC}_d(n) = O(n^{d-1/s_{d-1}})$,
where $s_{d-1}$ is an exponent for which the statement of the second selection
lemma (Theorem 9.2.1) holds in dimension $d-1$. In particular, in the plane
we obtain $\mathrm{HFAC}_2(n) = O(n^{3/2})$.
For higher dimensions, this result shows that $\mathrm{HFAC}_d(n)$ is asymptotically
somewhat smaller than $n^d$, but the proof method is inadequate for proving
bounds close to $n^{d-1}$.
Theorem 11.3.3 is proved from Corollary 11.3.2 using the second selection
lemma. Let us first give a streamlined proof for the planar case, although
later on we will prove a considerably better planar bound.
Proof of Theorem 11.3.3 for d = 2. Let us project the points of X ver-
tically on the x-axis, obtaining a set Y. The projections of the halving edges
of X define a system of intervals with endpoints in Y. By Corollary 11.3.2,
any point is contained in the interior of at most O( n) of these intervals, for
otherwise, a vertical line through that point would intersect too many halving
edges.
Mark every $q$th point of $Y$ (with $q$ a parameter to be set suitably later).
Divide the intervals into two classes: those containing some marked point
in their interior and those lying in a gap between two marked points. The
number of intervals of the first class is at most $O(n)$ per marked point, i.e.,
at most $O(n^2/q)$ in total. The number of intervals of the second class is no
more than $\binom{q+1}{2}$ per gap, i.e., at most $(\frac{n}{q}+1)\binom{q+1}{2}$ in total. Balancing both
bounds by setting $q = \lceil\sqrt{n}\,\rceil$, we get that the total number of halving edges
is $O(n^{3/2})$ as claimed. □
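The balancing step can be illustrated numerically. In the sketch below the $O(n)$ bound per marked point is replaced by exactly $n$ (constant 1); that constant is an illustrative assumption, not a value from the text, and the function names are hypothetical.

```python
from math import comb, isqrt


def interval_bound(n, q):
    """Bound on the number of intervals when every q-th point of Y is marked.

    Assumption for illustration: each marked point lies in the interior
    of at most n intervals (the text only says O(n)).
    """
    marked = n // q                                # number of marked points
    first_class = marked * n                       # intervals containing a marked point
    second_class = (marked + 1) * comb(q + 1, 2)   # intervals inside a single gap
    return first_class + second_class


def balanced_q(n):
    """ceil(sqrt(n)) for n >= 1, computed with integer arithmetic."""
    return isqrt(n - 1) + 1


if __name__ == "__main__":
    n = 10_000
    for q in (1, 10, balanced_q(n), 1_000, n):
        print(q, interval_bound(n, q))
```

Both extreme choices of $q$ give a bound of order $n^2$, while $q = \lceil\sqrt{n}\,\rceil$ brings the total down to order $n^{3/2}$.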
Note that we have implicitly applied and proved a one-dimensional second
selection lemma (Exercise 9.2.1).
Proof of Theorem 11.3.3. We consider an $n$-point $X \subset \mathbf{R}^d$. We project
$X$ vertically into the coordinate hyperplane $x_d = 0$, obtaining a point set $Y$,
which we regard as lying in $\mathbf{R}^{d-1}$. If the coordinate system is chosen suitably,
$Y$ is in general position.
Each halving facet of $X$ projects to a $(d-1)$-dimensional $Y$-simplex in
$\mathbf{R}^{d-1}$; let $\mathcal{F}$ be the family of these $Y$-simplices. If we write $|\mathcal{F}| = \alpha\binom{n}{d}$, then
Exercises
1. (a) Prove the following version of the Lovász lemma in the planar case:
For a set $X \subset \mathbf{R}^2$ in general position, every vertical line $\ell$ intersects the
interiors of at most $k+1$ of the $k$-edges. [!J
(b) Using (a), prove the bound $\mathrm{KFAC}_2(n, k) = O(n\sqrt{k+1})$ (without
appealing to Theorem 11.1.1). [1]
2. Let $K \subseteq \{1, 2, \ldots, \lfloor n/2 \rfloor\}$. Using Exercise 1, prove that for any $n$-point
set $X \subset \mathbf{R}^2$ in general position, the total number of $k$-edges with $k \in K$
(or equivalently, the total number of vertices of levels $k \in K$ in an
arrangement of $n$ lines) is at most $O\bigl(n\sqrt{\sum_{k\in K} k}\bigr)$. (Note that this is
better than applying the bound $\mathrm{KFAC}_2(n, k) = O(n\sqrt{k})$ for each $k \in K$
separately.) [1]
3. (Exact planar Lovász lemma) Let $X \subset \mathbf{R}^2$ be a $2n$-point set in general
position, and let $\ell$ be a vertical line having $k$ points of $X$ on the left
and $2n-k$ points on the right. Prove that $\ell$ crosses exactly $\min(k, 2n-k)$
halving edges of $X$. [2]
4. Let $X$ be a set of $2n+1$ points in $\mathbf{R}^3$ in general position, and let
$p_1, p_2, \ldots, p_{2n+1}$ be the points of $X$ listed by increasing height ($z$-coordinate).
(a) Using Exercise 3, check that if $p_{k+1}$ is a vertex of conv($X$), then there
are exactly $\min(k, 2n-k)$ halving triangles having $p_{k+1}$ as the middle-height
vertex (that is, the triangle is $p_i p_{k+1} p_j$ with $i < k+1 < j$). [1]
(b) Prove that every $(2n+1)$-point convex independent set $X \subset \mathbf{R}^3$ in
general position has at least $n^2$ halving triangles. [2]
(c) Assuming that each $(2n+1)$-point set in $\mathbf{R}^3$ in general position
has at least $n^2$ halving triangles (which follows from (b) and the result
mentioned in the notes above about the number of halving triangles
being minimized by a set in convex position), infer that if $X =
\{p_1, \ldots, p_{2n+1}\} \subset \mathbf{R}^3$ is in general position, then for every $k$, there are
always at least $\min(k, 2n-k)$ halving triangles having $p_{k+1}$ as the middle-height
vertex (even if $p_{k+1}$ is not extremal in $X$). [1]
(d) Derive from (c) the result about balanced lines mentioned in the notes
to this section: If $R, B \subset \mathbf{R}^2$ are $n$-point sets (red and blue points), with
$R \cup B$ in general position, then there are at least $n$ balanced lines $\ell$ (with
$|R \cap \ell| = |B \cap \ell| = 1$ and such that on both sides of $\ell$ the number of red
points equals the number of blue points). Embed $\mathbf{R}^2$ as the $z = 1$ plane
in $\mathbf{R}^3$ and use a central projection on the unit sphere in $\mathbf{R}^3$ centered at 0.
[1]
See [SW01] for solutions and related results.
5. (Exact Lovász lemma) Let $X \subset \mathbf{R}^d$ be an $n$-point set in general position
and let $\ell$ be a directed line disjoint from the convex hulls of all $(d-1)$-point
subsets of $X$. We think of $\ell$ as being vertical and directed upwards.
We say that $\ell$ enters a $j$-facet $F$ if it passes through $F$ from the positive
side (the one with $j$ points) to the negative side. Let $h_j = h_j(\ell, X)$
denote the number of $j$-facets entered by $\ell$, $j = 0, 1, \ldots, n-d$. Further,
let $s_k(\ell, X)$ be the number of $(d+k)$-element subsets $S \subseteq X$ such that
$\ell \cap \mathrm{conv}(S) \neq \emptyset$.
(a) Prove that for every $X$ and $\ell$ as above, $s_k = \sum_{j=k}^{n-d} \binom{j}{k} h_j$. 8J
(b) Use (a) to show that $h_0, \ldots, h_{n-d}$ are uniquely determined by
$s_0, s_1, \ldots, s_{n-d}$. 0
(c) Infer from (b) that if $X'$ is a set in general position obtained from
$X$ by translating each point in a direction parallel to $\ell$, then $h_j(\ell, X) =
h_j(\ell, X')$ for all $j$. Derive $h_j = h_{n-d-j}$. 0
(d) Prove that for every $x \in X$ and all $j$, we have $h_j(\ell, X \setminus \{x\}) \le
h_j(\ell, X)$. [2]
(e) Choose $x \in X$ uniformly at random. Check that $\mathbf{E}[h_j(\ell, X \setminus \{x\})] =
\frac{n-d-j}{n} h_j + \frac{j+1}{n} h_{j+1}$. 0
(f) From (d) and (e), derive $h_{j+1} \le \frac{j+d}{j+1} h_j$, and conclude the exact Lovász
lemma:
$h_j \le \min\left\{ \binom{j+d-1}{d-1}, \binom{n-j-1}{d-1} \right\}$.
o
6. (The upper bound theorem and $k$-facets) Let $a = (a_1, a_2, \ldots, a_n)$ be a
sequence of $n \ge d+1$ convex independent points in $\mathbf{R}^d$ in general position,
and let $P$ be the $d$-dimensional simplicial convex polytope with vertex set
$\{a_1, \ldots, a_n\}$. Let $\bar g = (\bar g_1, \ldots, \bar g_n)$ be the Gale transform of $a$, $\bar g_1, \ldots, \bar g_n \in
\mathbf{R}^{n-d-1}$, and let $b_i$ be a point in $\mathbf{R}^{n-d}$ obtained from $\bar g_i$ by appending
a number $t_i$ as the last coordinate, where the $t_i$ are chosen so that $X =
\{b_1, \ldots, b_n\}$ is in general position.
(a) Let $\ell$ be the $x_{n-d}$-axis in $\mathbf{R}^{n-d}$ oriented upwards, and let $s_k =
s_k(\ell, X)$ and $h_j = h_j(\ell, X)$ be as in Exercise 5. Show that $f_k(P) =
s_{d-k-1}(\ell, X)$, $k = 0, 1, \ldots, d-1$. [2]
(b) Derive that $h_j(P) = h_j(\ell, X)$, $j = 0, 1, \ldots, d$, where $h_j(P)$ is as at the
end of Section 5.5, and thus (f) of the preceding exercise implies the
upper bound theorem in the formulation with the $h$-vector (5.3). 0
If (a) and (b) are applied to the cyclic polytopes, we get equality in
the bound for $h_j$ in Exercise 5(f). In fact, the reverse passage (from an
$X \subset \mathbf{R}^{n-d}$ in general position to a simplicial polytope in $\mathbf{R}^d$) is possible
as well (see [Wel01]), and so the exact Lovász lemma can also be derived
from the upper bound theorem.
7. This exercise shows limits for what can be proved about $k$-sets using
Corollary 11.3.2 alone.
(a) Construct an $n$-point set $X \subset \mathbf{R}^2$ and a collection of $\Omega(n^{3/2})$ segments
with endpoints in $X$ such that no line intersects more than $O(n)$ of these
segments. 0
(b) Construct an $n$-point set in $\mathbf{R}^3$ and a collection of $\Omega(n^{5/2})$ triangles
with vertices at these points such that no line intersects more than $O(n^2)$
triangles. 8J
8. (The Dey-Edelsbrunner proof of $\mathrm{HFAC}_3(n) = O(n^{8/3})$) Let $X$ be an
$n$-point set in $\mathbf{R}^3$ in general position (make a suitable general position
assumption), and let $\mathcal{T}$ be a collection of $t$ triangles with vertices at points
of $X$. By a crossing we mean a pair $(T, e)$, where $T \in \mathcal{T}$ is a triangle
11.4 A Better Upper Bound in the Plane 283
Theorem 11.4.1 follows from the crossing number theorem (Theorem 4.3.1)
and the following remarkable identity.
11.4.2 Theorem. For each n-point set X in the plane in general position,
where n is even, we have
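$\mathrm{cr}(X) + \sum_{x \in X} \binom{\frac12(\deg(x)+1)}{2} = \binom{n/2}{2}$. (11.2)
Here $\mathrm{cr}(X)$ is the number of pairs of halving edges of $X$ that cross, and $\deg(x)$ is the number of halving edges emanating from $x$.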
We note that {x, z} and {y, z} cannot be halving edges before the mutation.
After the mutation, {x, y} ceases to be halving, while {x, z} and {y, z} become
halving.
Let deg( z) = 2r+ 1 (before the mutation) and let h be the line passing
through z and parallel to xy. The larger side of h, i.e., the one with more
points of X, is the one containing x and y, and by the halving-facet inter-
leaving lemma, r+ 1 of the halving edges emanating from z go into the larger
side of h and thus cross xy. So the following changes in degrees and crossings
are caused by the mutation:
• deg(z), which was 2r+1, increases by 2, and
• cr(X) decreases by r+1.
It is easy to check that the left-hand side of the identity (11.2) remains the
same after this change.
What other mutations are possible? One is the mutation inverse to the
one discussed above, with z moving in the reverse direction. We show that
there are no other types of mutations affecting the graph of halving edges.
Indeed, for any mutation, the notation can be chosen so that z crosses over
the segment xy. Just before the mutation or just after it, it is not possible for
{x, z} to be a halving edge and {y, z} not. The last remaining possibility is a
mutation with no halving edge on {x, y, z}, which leaves the graph unchanged.
Theorem 11.4.2 is proved. □
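The identity can be tested by brute force on small random point sets. The sketch below takes (11.2) in the form $\mathrm{cr}(X) + \sum_{x} \binom{(\deg(x)+1)/2}{2} = \binom{n/2}{2}$, with $\mathrm{cr}(X)$ counting crossing pairs of halving edges; this is the usual statement of the identity and is treated here as an assumption. All function names are hypothetical, and integer coordinates are used so that the orientation tests are exact.

```python
from itertools import combinations
from math import comb
import random


def orient(p, q, r):
    """Twice the signed area of triangle pqr (positive if r is left of pq)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def random_general_position(n, seed, span=10**6):
    """Random integer points with no three collinear (n should be even)."""
    rng = random.Random(seed)
    while True:
        pts = [(rng.randrange(span), rng.randrange(span)) for _ in range(n)]
        if len(set(pts)) == n and all(
            orient(p, q, r) != 0 for p, q, r in combinations(pts, 3)
        ):
            return pts


def halving_edges(pts):
    """Pairs (i, j) whose line leaves exactly (n-2)/2 points on each side."""
    n = len(pts)
    edges = []
    for i, j in combinations(range(n), 2):
        left = sum(
            1 for k in range(n)
            if k != i and k != j and orient(pts[i], pts[j], pts[k]) > 0
        )
        if 2 * left == n - 2:
            edges.append((i, j))
    return edges


def crosses(pts, e, f):
    """True if segments e and f share no endpoint and cross properly."""
    a, b = e
    c, d = f
    if len({a, b, c, d}) < 4:
        return False
    return (
        orient(pts[a], pts[b], pts[c]) * orient(pts[a], pts[b], pts[d]) < 0
        and orient(pts[c], pts[d], pts[a]) * orient(pts[c], pts[d], pts[b]) < 0
    )


def identity_sides(pts):
    """Return (left-hand side, right-hand side) of the assumed identity."""
    n = len(pts)
    edges = halving_edges(pts)
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    cr = sum(1 for e, f in combinations(edges, 2) if crosses(pts, e, f))
    lhs = cr + sum(comb((d + 1) // 2, 2) for d in deg)
    return lhs, comb(n // 2, 2)


if __name__ == "__main__":
    for seed in range(3):
        print(identity_sides(random_general_position(10, seed)))
```

For the mutation analyzed in the proof, the term of $z$ changes from $\binom{r+1}{2}$ to $\binom{r+2}{2}$, an increase of $r+1$ that exactly offsets the loss of $r+1$ crossings.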
Tight bounds for small n. Using the identity (11.2) and the fact that
all vertices of the graph of halving edges must have odd degrees, one can
determine the exact maximum number of halving edges for small point con-
figurations (Exercise 1). Figure 11.1 shows examples of configurations with
the maximum possible number of halving edges for n = 8, 10, and 12. These
small examples seem to be misleading in various respects: For example, we
know that the maximum number of halving edges is superlinear, and so the
graph of halving edges cannot be planar for large n, and yet all the small
pictures are planar.
one convex chain begins or ends at each vertex. Thus, the number $c_p$
of chains equals $\frac12(n_p + r_p)$.
A lower bound for the number of edge crossings $x_p$ is the number
of pairs $\{C_1, C_2\}$ of chains such that an edge of $C_1$ crosses an edge of
$C_2$. The trick is to estimate the number of pairs $\{C_1, C_2\}$ that do not
cross in this way. There are two possibilities for such pairs: $C_1$ and $C_2$
can be disjoint or they can cross at a vertex:
Exercises
1. (a) Find the maximum possible number of halving edges for n = 4 and
n = 6, and construct the corresponding configurations. [II
(b) Check that the three graphs in Figure 11.1 are graphs of halving
edges of the depicted point sets. IT]
(c) Show that the configurations in Figure 11.1 maximize the number of
halving edges. [IJ
12
Two Applications of
High-Dimensional Polytopes
From this chapter on, our journey through discrete geometry leads us to the
high-dimensional world. Up until now, although we have often been consid-
ering geometric objects in arbitrary dimension, we could mostly rely on the
intuition from the familiar dimensions 2 and 3. In the present chapter we can
still use dimensions 2 and 3 to picture examples, but these tend to be rather
trivial. For instance, in the first section we are going to prove things about
graphs via convex polytopes, and for an n-vertex graph we need to consider
an n-dimensional polytope. It is clear that graphs with 2 or 3 vertices cannot
serve as very illuminating examples. In order to underline this shift to high
dimensions, from now on we mostly denote the dimension by n instead of d
as before, in agreement with the habits prevailing in the literature on high-
dimensional topics.
In the first and third sections we touch upon polyhedral combinatorics.
Let E be a finite set, for example the edge set of a graph G, and let F be
some interesting system of subsets of E, such as the set of all matchings in
G or the set of all Hamiltonian circuits of G. In polyhedral combinatorics
one usually considers the convex hull of the characteristic vectors of the sets
of F; the characteristic vectors are points of $\{0,1\}^E \subset \mathbf{R}^E$. For the two
examples above, we thus obtain the matching polytope of G and the traveling
salesman polytope of G. The basic problem of polyhedral combinatorics is to
find, for a given F, inequalities describing the facets of the resulting polytope.
Sometimes one succeeds in describing all facets, as is the case for the matching
polytope. This may give insights into the combinatorial structure of F, and
often it has algorithmic consequences. If we know the facets and they have
a sufficiently nice structure, we can optimize any linear function over the
polytope in polynomial time. This means that given some real weights of the
elements of E, we can find in polynomial time a maximum-weight set in F
290 Chapter 12: Two Applications of High-Dimensional Polytopes
Observations.
• $P(G) \subseteq [0,1]^n$. The inequality $x_v \le 1$ is obtained from (ii) by choosing
$K = \{v\}$.
• The characteristic vector of each independent set lies in P( G).
• If a vector x E P(G) is integral (i.e., it is a 0/1 vector), then it is the
characteristic vector of an independent set.
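These observations can be checked on a small graph. The sketch below assumes that $P(G)$ is defined by (i) $x_v \ge 0$ for every vertex $v$ and (ii) $x(K) \le 1$ for every clique $K$; the definition is not visible in this part of the text, so this form is an assumption consistent with the inequalities used in the proofs. The 5-cycle $C_5$ is used as an illustrative example, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def cliques(n, edges):
    """All nonempty cliques of the graph on vertices 0..n-1 (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    found = []
    for k in range(1, n + 1):
        for K in combinations(range(n), k):
            if all(frozenset(p) in edge_set for p in combinations(K, 2)):
                found.append(K)
    return found


def in_P(x, n, edges):
    """Membership in P(G) under the assumed inequalities (i) and (ii)."""
    return all(xv >= 0 for xv in x) and all(
        sum(x[v] for v in K) <= 1 for K in cliques(n, edges)
    )


def is_independent(x, edges):
    """True if the 0/1 vector x is the characteristic vector of an independent set."""
    return not any(x[u] == 1 and x[v] == 1 for u, v in edges)


if __name__ == "__main__":
    n = 5
    c5 = [(i, (i + 1) % n) for i in range(n)]
    half = [Fraction(1, 2)] * n
    print(in_P(half, n, c5), sum(half))   # a nonintegral point with x(V) = 5/2
```

For $C_5$ the point $(\frac12, \ldots, \frac12)$ lies in $P(C_5)$ although no independent set has more than 2 vertices, which illustrates why integrality of the vertices is a genuine restriction on the graph.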
Before we start proving the weak perfect graph conjecture, let us introduce
some more notation. Let $w\colon V \to \{0, 1, 2, \ldots\}$ be a function assigning
nonnegative integer weights to the vertices of $G$. We define the weighted clique
number $\omega(G, w)$ as the maximum possible weight of a clique, where the weight
of a clique is the sum of the weights of its vertices. We also define the weighted
Proof of (i) $\Rightarrow$ (ii). This part is purely graph-theoretic. For every weight
function $w\colon V \to \{0, 1, 2, \ldots\}$, we need to exhibit a covering of $V$ by
independent sets witnessing $\chi(G, w) = \omega(G, w)$. If $w$ attains only values 0 and 1,
then we can use (i) directly, since selecting an induced subgraph of $G$ is the
same as specifying a 0/1 weight function on the vertices.
For other values of $w$ we proceed by induction on $w(V)$. Let $w$ be given
and let $v_0$ be a vertex with $w(v_0) > 1$. We define a new weight function $w'$:
$$w'(v) = \begin{cases} w(v) - 1 & \text{for } v = v_0, \\ w(v) & \text{for } v \ne v_0. \end{cases}$$
Since $w'(V) < w(V)$, by the inductive hypothesis we assume that we have
independent sets $I_1, I_2, \ldots, I_N$ covering each $v$ exactly $w'(v)$ times, where
$N = \omega(G, w')$. If $\omega(G, w) > N$, then we can obtain the appropriate covering
for $w$ by adding the independent set $\{v_0\}$, so let us suppose $\omega(G, w) = N$.
Let the notation be chosen so that $v_0 \in I_1$. We define another weight
function $w''$:
$$w''(v) = \begin{cases} w(v) - 1 & \text{for } v \in I_1, \\ w(v) & \text{for } v \notin I_1. \end{cases}$$
We claim that $\omega(G, w'') < N$. If not, then there exists a clique $K$ with
$w''(K) = N = \omega(G, w')$. By the choice of the $I_i$ we have $N \le w'(K) =
\sum_{i=1}^{N} |I_i \cap K|$. Since a clique intersects an independent set in at most one
vertex, $K$ has to intersect each $I_i$. In particular, it intersects $I_1$, and so
$w(K) > w''(K) = N$, contradicting $\omega(G, w) = N$.
We thus have $\omega(G, w'') < N$. By the inductive hypothesis, we can produce
a covering by independent sets showing that $\chi(G, w'') < N$. By adding $I_1$ to
it we obtain a covering witnessing $\chi(G, w) = N$.
Proof of (ii) $\Rightarrow$ (iii). Let $x = (x_{v_1}, \ldots, x_{v_n})$ be a vertex of the convex
polytope $P(G)$. Since all the inequalities defining $P(G)$ have rational coefficients,
$x$ has rational coordinates, and we can find a natural number $q$ such that
$w = qx$ is an integral vector. We interpret the coordinates of $w$ as weights of
the vertices of $G$. Let $K$ be a clique with weight $N = \omega(G, w)$. One of the
inequalities defining $P(G)$ is $x(K) \le 1$, and hence $N = w(K) \le q$.
12.1 The Weak Perfect Graph Conjecture 293
By (ii) we have $\chi(G, w) = \omega(G, w) \le q$, and so there are independent sets
$I_1, \ldots, I_q$ (some of them may be empty) covering each vertex $v \in V$ precisely
$w_v$ times. Let $c_i$ be the characteristic vector of $I_i$; then this property of the
sets $I_i$ can be written as $x = \sum_{i=1}^{q} \frac{1}{q} c_i$. Thus $x$ is a convex combination of
the $c_i$, and since it is a vertex of $P(G)$, it must be equal to some $c_i$, which is
a characteristic vector of an independent set in $G$.
Proof of (iii) $\Rightarrow$ (iv). It suffices to prove $\chi(\overline{G}) = \omega(\overline{G})$ for every $G$
satisfying (iii), since (iii) is preserved by passing to an induced subgraph
(right?).
We prove that a graph $G$ fulfilling (iii) has a clique $K$ intersecting all
independent sets of the maximum size $\alpha(G)$. Then the graph $G \setminus K$ has
independence number $\alpha(G) - 1$, and by repeating the same procedure we can
cover $G$ by $\alpha(G)$ cliques.
To find the required $K$, let us consider all the independent sets of size
$\alpha = \alpha(G)$ in $G$ and let $M \subseteq P(G)$ be the convex hull of their characteristic
vectors. We note that $M$ lies in the hyperplane $h = \{x : x(V) = \alpha\}$. This $h$
defines a (proper) face of $P(G)$, for otherwise, we would have vertices of $P(G)$
on both sides of $h$, and in particular, there would be a vertex $z$ with $z(V) > \alpha$.
This is impossible, since by (iii), $z$ would correspond to an independent set
bigger than $\alpha$.
Each facet of $P(G)$ corresponds to an equality in some of the inequalities
defining $P(G)$. The equality can be either of the form $x_v = 0$ or of the form
$x(K) = 1$. The face $F = P(G) \cap h$ is the intersection of some of the facets.
Not all of these facets can be of the type $x_v = 0$, since then their intersection
would contain 0, while $0 \notin h$. Hence all $x \in M$ satisfy $x(K) = 1$ for a certain
clique $K$, and this means that $K \cap I \ne \emptyset$ for each independent set $I$ of size $\alpha$.
Proof of (iv) $\Rightarrow$ (i). This is the implication (i) $\Rightarrow$ (iv) for the graph $\overline{G}$. □
Exercises
1. What are the integral vertices of the polytope $P(C_5)$? Find some nonintegral
vertex (and prove that it is really a vertex!). [!]
2. Prove that for every graph $G$ and every clique $K$ in $G$, the inequality
$x(K) \le 1$ defines a facet of the polytope $P(G)$. In other words, there
is an $x \in P(G)$ for which $x(K) = 1$ is the only inequality among those
defining $P(G)$ that is satisfied with equality. [!]
3. (On König's edge-covering theorem) Explain why bipartite graphs are
perfect, and why the perfectness of the complements of bipartite graphs is
equivalent to König's edge-covering theorem asserting that the maximum
number of vertex-disjoint edges in a bipartite graph equals the minimum
number of vertices needed to intersect all edges (also see Exercise 10.1.5).
[!]
4. (Comparability graphs and Dilworth's theorem) For a finite partially
ordered set $(X, \le)$ (see Section 12.3 for the definition), let $G = (X, E)$
be the graph with $E = \{\{u,v\} \in \binom{X}{2} : u < v \text{ or } v < u\}$; that is, edges
correspond to pairs of comparable elements. Any graph isomorphic to
such a $G$ is called a comparability graph. We also need the notions of a
As we will derive below, the middle cut cannot have area smaller than both
of the other two cuts. Let us choose the coordinate system so that the cuts
are perpendicular to the $x_1$-axis and denote by $v(t)$ the area of the cut by the
plane $x_1 = t$. Then the claim can be stated as follows: For any $t_1 < t < t_2$ we
have $v(t) \ge \min(v(t_1), v(t_2))$. Thus, there is some $t_0$ such that the function
$t \mapsto v(t)$ is nondecreasing on $(-\infty, t_0]$ and nonincreasing on $[t_0, \infty)$. Such a
function is called unimodal. A similar result is true for any convex body $C$ in
$\mathbf{R}^{n+1}$ if $v(t)$ denotes the $n$-dimensional volume of the intersection of $C$ with
the hyperplane $\{x_1 = t\}$.
How can one prove such a statement? In the planar case, with n = 1,
it is easy to see that v(t) is a concave function on the interval obtained by
projecting C on the xl-axis.
This might tempt one to think that $v(t)$ is concave on the appropriate interval
in higher dimension, too, but this is false in general! (See Exercise 1.) There
is concavity in the game, but the right function to look at in $\mathbf{R}^{n+1}$ is $v(t)^{1/n}$.
Perhaps a little more intuitively, we can define $r(t)$ as the radius of the
$n$-dimensional ball whose volume equals $v(t)$. We have $r(t) = R_n v(t)^{1/n}$, where
12.2 The Brunn-Minkowski Inequality 297
$R_n$ is the radius of a unit-volume ball in $\mathbf{R}^n$; let us call $r(t)$ the equivalent
radius of $C$ at $t$.
12.2.1 Theorem (Brunn's inequality for slice volumes). Let $C \subset
\mathbf{R}^{n+1}$ be a compact convex body and let the interval $[t_{\min}, t_{\max}]$ be the
projection of $C$ on the $x_1$-axis. Then the equivalent radius function $r(t)$ (or,
equivalently, the function $v(t)^{1/n}$) is concave on $[t_{\min}, t_{\max}]$. Consequently,
for any $t_1 < t < t_2$ we have $v(t) \ge \min(v(t_1), v(t_2))$.
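A small numerical illustration may help here. The sketch below uses a cone in $\mathbf{R}^3$ (so $n = 2$) with apex at the origin and axis along $x_1$, chosen purely as an example: its slice area is $v(t) = \pi t^2$ on $[0,1]$, which is not concave, while $v(t)^{1/2}$ is linear and hence concave. The function names are hypothetical.

```python
import math


def slice_area(t):
    """Area of the slice {x_1 = t} of the cone {y^2 + z^2 <= t^2, 0 <= t <= 1}."""
    return math.pi * t * t if 0.0 <= t <= 1.0 else 0.0


def midpoint_concave(f, a, b, tol=1e-9):
    """Check the midpoint concavity inequality f((a+b)/2) >= (f(a)+f(b))/2."""
    return f((a + b) / 2) >= (f(a) + f(b)) / 2 - tol


def root_of_area(t):
    """v(t)^(1/n) with n = 2."""
    return math.sqrt(slice_area(t))


if __name__ == "__main__":
    print(midpoint_concave(slice_area, 0.2, 0.8))     # v itself fails the test
    print(midpoint_concave(root_of_area, 0.2, 0.8))   # v^(1/2) passes
```

The unimodality consequence $v(t) \ge \min(v(t_1), v(t_2))$ holds for this cone as well, since $v$ is increasing on $[0,1]$.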
Brunn's inequality is a consequence of the following more general and
more widely applicable statement dealing with two arbitrary compact sets.
Proof. We use a basic fact from measure theory, namely, that if $X_1 \supseteq X_2 \supseteq
X_3 \supseteq \cdots$ is a sequence of measurable sets in $\mathbf{R}^n$ such that $X = \bigcap_{i=1}^{\infty} X_i$,
then the numbers $\mathrm{vol}(X_i)$ converge to $\mathrm{vol}(X)$.
Let $A, B \subset \mathbf{R}^n$ be nonempty and compact. For $k = 1, 2, \ldots$, consider the
closed axis-parallel cubes with side length $2^{-k}$ centered at the points of the
scaled grid $2^{-k}\mathbf{Z}^n$ (these cubes cover $\mathbf{R}^n$ and have disjoint interiors). Let $A_k$
be the union of all such cubes intersecting the set $A$, and similarly for $B_k$.
We have $A_1 \supseteq A_2 \supseteq \cdots$ and $\bigcap_k A_k = A$ (since any point not belonging to $A$
has a positive distance from it, and the distance of any point of $A_k$ from $A$
is at most $2^{-k}\sqrt{n}$). Therefore, $\mathrm{vol}(A_k) \to \mathrm{vol}(A)$ and $\mathrm{vol}(B_k) \to \mathrm{vol}(B)$.
We claim that $A + B \supseteq \bigcap_k (A_k + B_k)$. To see this, let $x \in A_k + B_k$ for all $k$.
We pick $y_k \in A_k$ and $z_k \in B_k$ with $x = y_k + z_k$, and by passing to convergent
subsequences we may assume that $y_k \to y \in A$ and $z_k \to z \in B$. Then we
obtain $x = y + z \in A + B$. Thus $\lim_{k\to\infty} \mathrm{vol}(A_k + B_k) \le \mathrm{vol}(A + B)$. By
the Brunn-Minkowski inequality for the brick sets $A_k$, $B_k$, we have
$\mathrm{vol}(A + B)^{1/n} \ge \lim_{k\to\infty} \mathrm{vol}(A_k + B_k)^{1/n} \ge \lim_{k\to\infty} \bigl(\mathrm{vol}(A_k)^{1/n} + \mathrm{vol}(B_k)^{1/n}\bigr)
= \mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}$. □
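[Figure: the parts $A'$, $A''$ of $A$ and the parts $B'$, $B''$ of $B$.]
\begin{align*}
\mathrm{vol}(A + B) &\ge \mathrm{vol}(A' + B') + \mathrm{vol}(A'' + B'') \\
\text{(induction)} &\ge \bigl[\mathrm{vol}(A')^{1/n} + \mathrm{vol}(B')^{1/n}\bigr]^n + \bigl[\mathrm{vol}(A'')^{1/n} + \mathrm{vol}(B'')^{1/n}\bigr]^n \\
&= \bigl[\rho^{1/n}\,\mathrm{vol}(A)^{1/n} + \rho^{1/n}\,\mathrm{vol}(B)^{1/n}\bigr]^n \\
&\quad + \bigl[(1-\rho)^{1/n}\,\mathrm{vol}(A)^{1/n} + (1-\rho)^{1/n}\,\mathrm{vol}(B)^{1/n}\bigr]^n \\
&= \bigl[\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}\bigr]^n.
\end{align*}
This concludes the proof of the Brunn-Minkowski inequality. □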
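The simplest instance of the inequality, two axis-parallel boxes, can be checked numerically. In the sketch below a box is represented only by its side lengths, so that $A = \prod [0, a_i]$, $B = \prod [0, b_i]$, and $A + B = \prod [0, a_i + b_i]$; this representation and the function name are choices made for the illustration.

```python
import math
import random


def bm_gap(a, b):
    """vol(A+B)^(1/n) - (vol(A)^(1/n) + vol(B)^(1/n)) for boxes with sides a, b."""
    n = len(a)
    vol_sum = math.prod(x + y for x, y in zip(a, b))   # the Minkowski sum is a box
    return vol_sum ** (1 / n) - (math.prod(a) ** (1 / n) + math.prod(b) ** (1 / n))


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (1, 2, 3, 5):
        a = [rng.uniform(0.1, 10.0) for _ in range(n)]
        b = [rng.uniform(0.1, 10.0) for _ in range(n)]
        print(n, bm_gap(a, b))           # nonnegative up to rounding
    print(bm_gap([1, 2, 3], [2, 4, 6]))  # homothetic boxes: gap is essentially 0
```

The gap vanishes when one box is a scaled copy of the other, which matches the equality case of the arithmetic-geometric mean inequality underlying the box case.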
the version in Theorem 12.2.2 follows quickly (see Exercise 5). Advantageously,
the dimension does not appear in the Prékopa-Leindler
inequality, and it is simple to derive the general case from the 1-dimensional
case by induction; see Exercise 7. This passage to a dimension-free
form of the inequality, which can be proved from the 1-dimensional
case by a simple product argument, is typical in the modern theory of
geometric inequalities (a similar phenomenon for measure concentration
inequalities is mentioned in the notes to Section 14.2).
The Brunn-Minkowski inequality is just the first step in a sophisticated
theory; see Schneider [Sch93] or Sangwine-Yager [SY93].
Among the most prominent notions are the mixed volumes. As was
discovered by Minkowski, if $K_1, \ldots, K_r \subset \mathbf{R}^n$ are convex bodies and
$\lambda_1, \lambda_2, \ldots, \lambda_r$ are nonnegative real parameters, then $\mathrm{vol}(\lambda_1 K_1 + \lambda_2 K_2 +
\cdots + \lambda_r K_r)$ is a homogeneous symmetric polynomial of degree $n$.
For $1 \le i_1 \le i_2 \le \cdots \le i_n \le r$, the coefficient of $\lambda_{i_1} \lambda_{i_2} \cdots \lambda_{i_n}$
is denoted by $V(K_{i_1}, K_{i_2}, \ldots, K_{i_n})$ and called the mixed volume of
$K_{i_1}, K_{i_2}, \ldots, K_{i_n}$. A powerful generalization of the Brunn-Minkowski
inequality, the Alexandrov-Fenchel inequality, states that for any convex
$A, B, K_3, K_4, \ldots, K_n \subset \mathbf{R}^n$, we have
$V(A, B, K_3, \ldots, K_n)^2 \ge V(A, A, K_3, \ldots, K_n) \cdot V(B, B, K_3, \ldots, K_n)$.
Exercises
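1. Let $A$ be a single point and $B$ the $n$-dimensional unit cube. What is the
function $v(t) = \mathrm{vol}((1-t)A + tB)$? Show that $v(t)^\beta$ is not concave on
$[0,1]$ for any $\beta > \frac{1}{n}$. II]
2. Let $A, B \subseteq \mathbf{R}^n$ be convex sets. Show that the sets $\mathrm{conv}((\{0\} \times A) \cup
(\{1\} \times B))$ and $\bigcup_{t \in [0,1]} [\{t\} \times ((1-t)A + tB)]$ (in $\mathbf{R}^{n+1}$) are equal. 0
3. Prove that
$\bigl(\prod_{i=1}^{n} x_i\bigr)^{1/n} + \bigl(\prod_{i=1}^{n} y_i\bigr)^{1/n} \le \bigl(\prod_{i=1}^{n} (x_i + y_i)\bigr)^{1/n}$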
Let $E(\preceq)$ denote the set of all linear extensions of a partial ordering $\preceq$
and let $e(\preceq) = |E(\preceq)|$ be the number of linear extensions. To sort means to
select one among the $e(\preceq)$ possible linear extensions. Since a comparison of
distinct elements $a$ and $b$ can have two outcomes, we need at least $\log_2 e(\preceq)$
comparisons in the worst case to distinguish the appropriate linear extension.
Is this lower bound always asymptotically tight? Can one always sort using
$O(\log_2 e(\preceq))$ comparisons, for any $\preceq$? An affirmative answer is implied by
the following theorem:
How do we use this for sorting $\preceq$? For the first comparison, we choose
the two elements $a, b$ as in the theorem. Depending on the outcome of this
comparison, we pass either to the partial ordering $\preceq + (a, b)$ or to $\preceq + (b, a)$.
In both cases, the number of linear extensions has been reduced by the factor
$1 - \delta$: For $a \le b$ this is clear by the theorem, and for $a \ge b$ this follows
from the equality $e(\preceq + (a, b)) + e(\preceq + (b, a)) = e(\preceq)$. Hence, proceeding by
induction, we can sort any partial ordering $\preceq$ using at most $\lceil \log_{1/(1-\delta)} e(\preceq) \rceil$
comparisons.
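The counting identity used above can be seen on a tiny example by enumerating linear extensions directly. The sketch below is a brute-force illustration only (it tries all permutations, so it is usable just for very small posets); a poset is given by a list of pairs $(a, b)$ meaning $a \preceq b$, and the function names are hypothetical.

```python
from itertools import permutations


def linear_extensions(elements, relations):
    """All orderings of `elements` in which a precedes b for each (a, b) in `relations`."""
    result = []
    for perm in permutations(elements):
        pos = {x: i for i, x in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in relations):
            result.append(perm)
    return result


def e(elements, relations):
    """Number of linear extensions."""
    return len(linear_extensions(elements, relations))


if __name__ == "__main__":
    elems = "abcd"
    rel = [("a", "b"), ("c", "d")]           # two disjoint 2-element chains
    total = e(elems, rel)
    with_ac = e(elems, rel + [("a", "c")])   # outcome: a below c
    with_ca = e(elems, rel + [("c", "a")])   # outcome: c below a
    print(total, with_ac, with_ca)
```

Every linear extension places $a$ either below or above $c$, so the two refined posets partition $E(\preceq)$; this is exactly the equality $e(\preceq + (a, b)) + e(\preceq + (b, a)) = e(\preceq)$.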
The conjectured "right" value of $\delta$ in Theorem 12.3.1 is $\frac13 \approx 0.33$; obviously,
one cannot do any better for the poset
[Figure: the poset.]
(meaning that $(a, b)$ is the only pair of distinct elements in the relation $\preceq$).
The proof below gives $\delta = \frac{1}{2e} \approx 0.184$, and more complicated proofs yield
better values, although $\frac13$ seems still elusive.
Order polytopes. We assign certain convex polytopes to partial orderings.
12.3.3 Observation. The vertices of the order polytope $P(\preceq)$ are precisely
the characteristic vectors of all up-sets in $(X, \preceq)$, where an up-set is a subset
$U \subseteq X$ such that if $a \in U$ and $a \preceq b$, then $b \in U$.
$$h_\preceq(a) = \frac{1}{e(\preceq)} \sum_{\le\, \in E(\preceq)} h_\le(a).$$
If $\preceq$ is clear from context, we omit it in the subscript and we write just $h(a)$.
The "good" elements $a, b$ in the efficient comparison theorem can be selected
using the height. Namely, we show that any two distinct $a, b$ with
$|h(a) - h(b)| < 1$ will do. (It is simple to check that if $\preceq$ is not a linear
ordering, then such $a$ and $b$ always exist; see Exercise 1.)
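The selection rule can be tried out on a small poset. The sketch below assumes that the height of $a$ in a linear ordering is its 1-based position (the number of elements not above it), and that $h(a)$ is the average of these heights over all linear extensions; the definition is not legible in this part of the text, so this is an assumption, chosen to be consistent with Lemma 12.3.5. Function names are hypothetical, and the enumeration is brute force.

```python
from fractions import Fraction
from itertools import combinations, permutations


def linear_extensions(elements, relations):
    """All orderings in which a precedes b for each (a, b) in `relations`."""
    result = []
    for perm in permutations(elements):
        pos = {x: i for i, x in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in relations):
            result.append(perm)
    return result


def heights(elements, relations):
    """Average 1-based position of each element over all linear extensions."""
    exts = linear_extensions(elements, relations)
    return {
        a: Fraction(sum(ext.index(a) + 1 for ext in exts), len(exts))
        for a in elements
    }


def good_pairs(elements, relations):
    """Pairs of distinct elements whose heights differ by less than 1."""
    h = heights(elements, relations)
    return [(a, b) for a, b in combinations(elements, 2) if abs(h[a] - h[b]) < 1]


if __name__ == "__main__":
    elems, rel = "abc", [("a", "b")]     # a below b, c incomparable to both
    print(heights(elems, rel))
    print(good_pairs(elems, rel))
```

For this three-element poset the heights are $\frac43$, $\frac83$, and $2$, and comparing $c$ with $a$ (or with $b$) splits the three linear extensions into parts of sizes 2 and 1, so each outcome keeps at least a $\frac13$ fraction.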
We now relate the height to the order polytope.
12.3.5 Lemma. For any $n$-element poset $(X, \preceq)$, the center of gravity of the
order polytope $P(\preceq)$ is $c = (c_a : a \in X)$, where $c_a = \frac{1}{n+1} h_\preceq(a)$.
Proof of the efficient comparison theorem. Given the poset $(X, \preceq)$,
we consider two elements $a, b \in X$ with $|h(a) - h(b)| < 1$. We want to show
that the number of linear extensions of both $\preceq + (a, b)$ and $\preceq + (b, a)$ is at
least a constant fraction of $e(\preceq)$. Consider the order polytopes $P = P(\preceq)$,
$P_\le = P(\preceq + (a, b))$, and $P_\ge = P(\preceq + (b, a))$. Geometrically, $P$ is sliced into
$P_\le$ and $P_\ge$ by the hyperplane $h = \{x \in \mathbf{R}^n : x_a = x_b\}$.
• We have $-\frac{1}{n+1} < c_1 < \frac{1}{n+1}$, since $c_1 = \frac{1}{n+1}(h(a) - h(b))$ and $|h(a) -
h(b)| < 1$.
$\mathrm{vol}(P_\le) \ge \frac{1}{2e} \mathrm{vol}(P)$ and $\mathrm{vol}(P_\ge) \ge \frac{1}{2e} \mathrm{vol}(P)$,
where $P_\le$ is the part of $P$ in the half-space $\{y_1 \le 0\}$ and $P_\ge$ is the other
part.
For $t \in [-1, 1]$, let $P_t$ be the $(n-1)$-dimensional slice of $P$ by the hyperplane
$\{y_1 = t\}$, and let $r(t)$ be the equivalent radius of $P_t$, i.e., the radius of
an $(n-1)$-dimensional ball of volume $\mathrm{vol}_{n-1}(P_t)$. By Brunn's inequality for
slice volumes (Theorem 12.2.1), $r(t)$ is concave on $[-1, 1]$.
The $y_1$-coordinate of the center of gravity of $P$ can be expressed as
[Figure: the graph of $\kappa(t)$, with the points $-1$, $0$, $1$, $u$ marked on the horizontal axis.]
The graph of the function $\kappa(t)$ consists of two linear segments, and so $K$ is
a double cone. First we construct the function $\kappa(t)$ for $t$ positive. Here the
graph is a segment starting at the point $V = (0, r(0))$ and ending at the point
$U = (u, 0)$. The number $u$ is chosen so that $\mathrm{vol}(K_\ge) = \mathrm{vol}(P_\ge)$. Since $r(t)$ is
concave and $\kappa(t)$ is linear on $[0, u]$, we have $u \ge 1$. Moreover, as $t$ grows from
0 to 1, we first have $r(t) \ge \kappa(t)$, and then from some point on $r(t) \le \kappa(t)$.
This ensures that the center of gravity of $K_\ge$ is to the right of the center of
gravity of $P_\ge$ (we can imagine that $P_\ge$ is transformed into $K_\ge$ by peeling
off some mass in the region labeled "-" and moving it right, to the region
labeled "+").
12.3 Sorting Partially Ordered Sets 307
Next, we define $\kappa(t)$ for $t < 0$. We extend the segment $UV$ to the left until
the (unique) point $W$ such that when $YWV$ is the graph of $\kappa(t)$ for negative
$t$, we have $\mathrm{vol}(K_\le) = \mathrm{vol}(P_\le)$. As $t$ goes from 0 down to $-1$, $\kappa(t)$ is first above
$r(t)$ and then below it. This is because at $V$, the segment $WU$ decreases more
steeply than the function $r(t)$. Therefore, we also have $c_1(K_\le) \ge c_1(P_\le)$, and
hence $c_1(K) \ge c_1(P) \ge -\frac{1}{n+1}$. So, as was noted above, it remains to show
that $\mathrm{vol}(K_\ge) \ge \frac{1}{2e} \mathrm{vol}(K)$, which is a more or less routine calculation.
We fix the notation as in the following picture:
[Figure: the double cone $K$ with the notation used below.]
We note that $c_1(K)$ is a weighted average of $c_1(K_1)$ and $c_1(K_2)$; the weights
are the volumes of $K_1$ and $K_2$, whose ratio is $h_1 : h_2$. The center of gravity of
an $n$-dimensional cone is at $\frac{1}{n+1}$ of its height, and hence $c_1(K_1) = -\Delta - \frac{h_1}{n+1}$
and $c_1(K_2) = \frac{h_2}{n+1} - \Delta$. Therefore,
$$c_1(K) = \frac{h_1\bigl(-\frac{h_1}{n+1}\bigr) + h_2\bigl(\frac{h_2}{n+1}\bigr)}{h_1 + h_2} - \Delta = \frac{h_2 - h_1}{n+1} - \Delta.$$
We have $\Delta = 1 - h_1$, and so from the condition $c_1(K) \ge -\frac{1}{n+1}$ we obtain
$h_2 + nh_1 \ge n$. We substitute $h_1 = u - h_2 + 1$ and rearrange, which yields
$$\frac{u}{h_2} \ge 1 - \frac{1}{n}. \qquad (12.2)$$
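The rearrangement leading to (12.2) can be spelled out step by step. The following is only an expansion of the algebra, using $c_1(K) = \frac{h_2 - h_1}{n+1} - \Delta$, $\Delta = 1 - h_1$, and $h_1 = u - h_2 + 1$ as stated in the surrounding argument:

```latex
\begin{align*}
c_1(K) \ge -\frac{1}{n+1}
  &\iff \frac{h_2 - h_1}{n+1} - (1 - h_1) \ge -\frac{1}{n+1} \\
  &\iff h_2 - h_1 - (n+1) + (n+1)h_1 \ge -1 \\
  &\iff h_2 + n h_1 \ge n, \\[4pt]
h_2 + n(u - h_2 + 1) \ge n
  &\iff n u \ge (n-1) h_2 \\
  &\iff \frac{u}{h_2} \ge 1 - \frac{1}{n}.
\end{align*}
```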
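We are interested in bounding $\mathrm{vol}(K_\ge)$ from below. The cone $K_\ge$ is similar
to $K_2$, with ratio $u/h_2$. Since $\mathrm{vol}(K_2) = \frac{h_2}{h_1 + h_2} \mathrm{vol}(K)$ and $h_1 + h_2 = u + 1$,
we get from (12.2)
$$\mathrm{vol}(K_\ge) = \Bigl(\frac{u}{h_2}\Bigr)^{n} \mathrm{vol}(K_2) = \frac{u}{u+1} \Bigl(\frac{u}{h_2}\Bigr)^{n-1} \mathrm{vol}(K) \ge \frac{u}{u+1} \Bigl(1 - \frac{1}{n}\Bigr)^{n-1} \mathrm{vol}(K).$$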
Finally, $\frac{u}{u+1} \ge \frac12$ (as $u \ge 1$) and $\bigl(1 - \frac{1}{n}\bigr)^{n-1} > e^{-1}$ for all $n$, so $\mathrm{vol}(K_\ge) \ge
\frac{1}{2e} \mathrm{vol}(K)$ follows. □
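The last numerical fact is easy to sanity-check. The sketch below evaluates $(1 - \frac1n)^{n-1}$ in floating point for a range of $n$ and compares it with $e^{-1}$; it is a numerical illustration, not a proof, and the function name is hypothetical.

```python
import math


def cone_factor(n):
    """The factor (1 - 1/n)^(n-1) from the final estimate."""
    return (1 - 1 / n) ** (n - 1)


if __name__ == "__main__":
    for n in (1, 2, 3, 10, 100, 1000):
        # the factor decreases towards 1/e from above as n grows
        print(n, cone_factor(n), cone_factor(n) > 1 / math.e)
```

Combined with $\frac{u}{u+1} \ge \frac12$, this gives the constant $\frac{1}{2e} \approx 0.184$ quoted for $\delta$.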
Exercises
1. Let $(X, \preceq)$ be a finite poset. Prove that if $\preceq$ is not a linear ordering, then
there always exist $a, b \in X$ with $|h(a) - h(b)| < 1$. CD
2. Show that the center of gravity of a simplex with vertices $a_0, a_1, \ldots, a_d$
is the same as the center of gravity of its vertex set. ~
3. Let $K$ be a bounded convex body in $\mathbf{R}^n$, $h$ a hyperplane passing through
the center of gravity of $K$, and $K_1$ and $K_2$ the parts into which $K$ is
divided by $h$.
(a) Prove that $\mathrm{vol}(K_1), \mathrm{vol}(K_2) \ge \bigl(\frac{n}{n+1}\bigr)^n \mathrm{vol}(K)$. [II
(b) Show that the bound in (a) cannot be improved in general. ~
13
We begin with comparing the volume of the n-dimensional cube with the
volume of the unit ball inscribed in it, in order to realize that volumes of
"familiar" bodies behave quite differently in high dimensions from what the
3-dimensional intuition suggests. Then we calculate that any convex polytope
in the unit ball $B^n$ whose number of vertices is at most polynomial in $n$
occupies only a tiny fraction of $B^n$ in terms of volume. This has interesting
consequences for deterministic algorithms for approximating the volume of
a given convex body: If they look only at polynomially many points of the
considered body, then they are unable to distinguish a gigantic ball from a
tiny polytope. Finally, we prove a classical result, John's lemma, which states
that for every $n$-dimensional symmetric convex body $K$ there are two similar
ellipsoids with ratio $\sqrt{n}$ such that the smaller ellipsoid lies inside $K$ and the
larger one contains $K$. So, in a very crude scale where the ratio $\sqrt{n}$ can be
ignored, each symmetric convex body looks like an ellipsoid.
Besides presenting nice and important results, this chapter could help
the reader in acquiring proficiency and intuition in geometric computations,
which are skills obtainable mainly by practice. Several calculations of non-
trivial length are presented in detail, and while some parts do not require
any great ideas, they still contain useful small tricks.
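the result, which can be verified in various other ways and found in many
books of formulas, is
$$\mathrm{vol}(B^n) = \frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)}.$$
Here $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,dt$ is the usual gamma function, with $\Gamma(k+1) = k!$
for natural numbers k.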
Let us compare the volume of the unit cube $[0,1]^n$ with that of the inscribed
ball (of radius $\frac{1}{2}$).
[Figure omitted: the unit cube and its inscribed ball.]
(Using Exercise 1, the reader may want to add the crosspolytope inscribed
in both bodies to the comparison.) For dimension n = 3, the volume of
the ball is about 0.52, but for n = 11 it is already less than $10^{-3}$. Using
Stirling's formula, we find that it behaves roughly like $\left(\frac{\pi e}{2n}\right)^{n/2}$. For large n,
the inscribed ball is thus like a negligible dust particle in the cube, as far as
the volume is concerned.
This can be experienced if one tries to generate random points uniformly
distributed in the unit ball $B^n$. A straightforward method is first to generate
a random point x in the cube $[-1,1]^n$, by producing n independent random
numbers $x_1, x_2, \ldots, x_n \in [-1,1]$. If $\|x\| > 1$, then x is discarded and the
experiment is repeated, and if $\|x\| \le 1$, then x is the desired random point
in the unit ball. This works reasonably in dimensions below 10, say, but in
dimension 20, we expect about 40 million discarded points for each accepted
point, and the method is rather useless.
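The rejection method just described can be sketched as follows. This is only an illustration under stated assumptions: the function names (`unit_ball_volume`, `acceptance_probability`, `sample_unit_ball`) and the `max_tries` cutoff are hypothetical and not taken from the text; the volume formula is the standard one, $\pi^{n/2}/\Gamma(n/2+1)$.

```python
import math
import random


def unit_ball_volume(n):
    """Volume of the n-dimensional unit ball: pi^(n/2) / Gamma(n/2 + 1)."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def acceptance_probability(n):
    """Chance that a uniform point of [-1, 1]^n falls in the unit ball."""
    return unit_ball_volume(n) / 2.0 ** n


def sample_unit_ball(n, rng, max_tries=1_000_000):
    """Rejection sampling; returns (point, number of discarded points)."""
    for discarded in range(max_tries):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        if sum(c * c for c in x) <= 1.0:
            return x, discarded
    raise RuntimeError("no point accepted within max_tries")


if __name__ == "__main__":
    for n in (3, 10, 20):
        p = acceptance_probability(n)
        # (1 - p) / p is the expected number of discarded points per accepted one
        print(n, p, (1 - p) / p)
```

For n = 20 the acceptance probability is about $2.5 \cdot 10^{-8}$, which agrees with the figure of roughly 40 million discarded points per accepted point.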
Another way of comparing the ball and the cube is to picture the sizes of
the n-dimensional ball having the same volume as the unit cube:
[Figure omitted: the balls of unit volume for n = 2, n = 10, and n = 50.]
For large n, the radius grows approximately like $0.24\sqrt{n}$. This indicates that
the n-dimensional unit cube is actually quite a huge body; for example, its
diameter (the length of the longest diagonal) is $\sqrt{n}$. Here is another example
illustrating the largeness of the unit cube quite vividly.
13.1 Volumes, Paradoxes of High Dimension, and Nets 313
Balls enclosing a ball. Place balls of radius $\frac{1}{2}$ into each of the $2^n$ vertices
of the unit cube $[0,1]^n$ so that they touch along the edges of the cube, and
consider the ball concentric with the cube and just touching the other balls:
Obviously, this ball is quite small, and it is fully contained in the cube, right?
No: Already for n = 5 it starts protruding out through the facets.
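Both size comparisons above (the ball of unit volume and the ball enclosed by the corner balls) can be checked numerically. The sketch below is illustrative only; the function names are hypothetical, and the radius $(\sqrt{n} - 1)/2$ of the central ball is derived here from the half-diagonal $\sqrt{n}/2$ of the cube rather than quoted from the text.

```python
import math


def unit_volume_ball_radius(n):
    """Radius r with vol(B(0, r)) = 1 in R^n (log-gamma avoids overflow)."""
    log_unit_ball_volume = (n / 2) * math.log(math.pi) - math.lgamma(n / 2 + 1)
    return math.exp(-log_unit_ball_volume / n)


def central_ball_radius(n):
    """Radius of the ball at the center of [0,1]^n touching the corner balls."""
    half_diagonal = math.sqrt(n) / 2  # distance from the center to a vertex
    return half_diagonal - 0.5        # subtract the corner-ball radius 1/2


if __name__ == "__main__":
    for n in (2, 10, 50, 1000):
        r = unit_volume_ball_radius(n)
        print(n, r, r / math.sqrt(n))
    for n in (3, 4, 5, 9):
        # the central ball leaves the cube as soon as its radius exceeds 1/2
        print(n, central_ball_radius(n), central_ball_radius(n) > 0.5)
```

The ratio of the unit-volume radius to $\sqrt{n}$ tends to $1/\sqrt{2\pi e} \approx 0.242$, and the central ball has radius exactly $\frac{1}{2}$ for n = 4, so n = 5 is the first dimension in which it protrudes.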
Proper pictures. If a planar sketch of a high-dimensional convex body
should convey at least a partially correct intuition about the distribution of
the mass, say for the unit cube, it is perhaps best to give up the convexity
in the drawing! According to Milman [Mil98], a "realistic" sketch of a high-
dimensional convex body might look like this:
argument yields an $\eta$-dense set whose size has essentially the best possible
order of magnitude.
Let us call a subset $N \subseteq S^{n-1}$ $\eta$-separated if every two distinct points of
N have (Euclidean) distance greater than $\eta$. In a sense, this is opposite to
being $\eta$-dense.
In order to construct a small $\eta$-dense set, we start with the empty set
and keep adding points one by one. The trick is that we do not worry about
$\eta$-density along the way, but we always keep the current set $\eta$-separated.
Clearly, if no more points can be added, the current set must be $\eta$-dense.
The result of this algorithm is called an $\eta$-net.¹ That is, $N \subseteq S^{n-1}$ is an
$\eta$-net if it is an inclusion-maximal $\eta$-separated subset of $S^{n-1}$; i.e., if N is
$\eta$-separated but $N \cup \{x\}$ is not $\eta$-separated for any $x \in S^{n-1} \setminus N$. (These
definitions apply to an arbitrary metric space in place of $S^{n-1}$.) A volume
argument bounds the maximum size of an $\eta$-net.
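The greedy procedure can be sketched in a few lines. As an assumption made purely for illustration, the sketch runs on a finite sample of random points of the sphere instead of the whole sphere, so it returns an $\eta$-net of that finite metric space; the names `random_unit_vector` and `greedy_net` are hypothetical.

```python
import math
import random


def random_unit_vector(n, rng):
    """Random point of S^(n-1), obtained by normalizing a Gaussian vector."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def greedy_net(points, eta):
    """Return an inclusion-maximal eta-separated subset of `points`."""
    net = []
    for p in points:
        # Add p only if the current set stays eta-separated.
        if all(math.dist(p, q) > eta for q in net):
            net.append(p)
    return net


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [random_unit_vector(3, rng) for _ in range(500)]
    net = greedy_net(sample, 1.0)
    print(len(net))  # size of the net found for this sample
```

Every sample point that was not added lies within distance $\eta$ of some point of the net, which is exactly the observation that a maximal $\eta$-separated set is $\eta$-dense.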
13.1.1 Lemma (Size of $\eta$-nets in the sphere). For each $\eta \in (0,1]$, any
$\eta$-net $N \subseteq S^{n-1}$ satisfies
$$|N| \le \left(\frac{4}{\eta}\right)^n.$$
Later on, we will check that for $\eta$ small, no $\eta$-dense set can be much
smaller (Exercise 14.1.3).
Proof. For each $x \in N$, consider the ball of radius $\frac{\eta}{2}$ centered at x. These
balls are all disjoint, and they are contained in the ball $B(0, 1+\eta) \subseteq B(0,2)$.
Therefore, $\mathrm{vol}(B(0,2)) \ge |N|\,\mathrm{vol}(B(0, \frac{\eta}{2}))$, and since $\mathrm{vol}(B(0,r))$ in $\mathbf{R}^n$ is
proportional to $r^n$, the lemma follows. □
Exercises
1. Calculate the volume of the n-dimensional crosspolytope, i.e., the convex
hull of $\{e_1, -e_1, \ldots, e_n, -e_n\}$, where $e_i$ is the ith vector in the standard
basis of $\mathbf{R}^n$.
2. (Ball volume via the Gaussian distribution)
(a) Let $I_n = \int_{\mathbf{R}^n} e^{-\|x\|^2}\,dx$, where $\|x\| = (x_1^2 + \cdots + x_n^2)^{1/2}$ is the Euclidean
norm. Express $I_n$ using $I_1$.
¹ Not to be confused with the notion of $\varepsilon$-net considered in Chapter 10; unfortunately,
the same name is customarily used for two rather unrelated concepts.
13.2 Hardness of Volume Approximation 315
algorithms that can approximate the volume within a factor of $(1+\varepsilon)$ for each
fixed $\varepsilon > 0$ with high probability. Here "randomized" means that the algorithm
makes random decisions (like coin tosses) during its computation; it
does not imply any randomness of the input. These are marvelous developments,
but they are not treated in this book. We only briefly explain the
relation of Theorem 13.2.1 to the deterministic volume approximation.
To understand this connection, one needs to know how the input con-
vex body is presented to an algorithm. A general convex body cannot be
exactly described by finitely many parameters, so caution is certainly neces-
sary. One way of specifying certain convex bodies, namely, convex polytopes,
is to give them as convex hulls of finite point sets (V-presentation) or as in-
tersections of finite sets of half-spaces (H-presentation). But there are many
other computationally important convex bodies that are not polytopes, or
have no polynomial-size V-presentation or H-presentation. We will meet an
example in Section 15.5, where the convex body lives in the space of $n \times n$
real matrices and is the intersection of a polytope with the cone consisting
of all positive semidefinite matrices.
In order to abstract the considerations from the details of the presentation
of the input body, the oracle model was introduced for computation with
convex bodies. If $K \subset \mathbf{R}^n$ is a convex body, a membership oracle for K is,
roughly speaking, an algorithm (subroutine, black box) that for any given
input point $x \in \mathbf{R}^n$ outputs YES if $x \in K$ and NO if $x \notin K$.
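A membership oracle can be pictured as a function that answers YES or NO and hides the description of the body. The sketch below is an assumption-laden illustration, not an interface from the literature: `ball_oracle`, `halfspace_oracle`, and `count_inside` are hypothetical names, and the second oracle wraps an H-presentation given as pairs (a, b) meaning $\langle a, x \rangle \le b$.

```python
import math


def ball_oracle(center, radius):
    """Membership oracle for the Euclidean ball B(center, radius)."""
    def contains(x):
        return math.dist(x, center) <= radius
    return contains


def halfspace_oracle(halfspaces):
    """Membership oracle for the polytope {x : <a, x> <= b for all (a, b)}."""
    def contains(x):
        return all(sum(ai * xi for ai, xi in zip(a, x)) <= b
                   for a, b in halfspaces)
    return contains


def count_inside(oracle, queries):
    """A caller sees only YES/NO answers, never the body's description."""
    return sum(1 for x in queries if oracle(x))


if __name__ == "__main__":
    disk = ball_oracle([0.0, 0.0], 1.0)
    square = halfspace_oracle([([1, 0], 1), ([-1, 0], 1), ([0, 1], 1), ([0, -1], 1)])
    queries = [[0.5, 0.5], [1.0, 1.0], [0.0, -1.0]]
    print(count_inside(disk, queries), count_inside(square, queries))
```

The point of the abstraction is that `count_inside` works unchanged for both bodies, because it never inspects how the body is presented.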
This is simplified, because in order to be able to compute with the body,
oracle about some points $\{x_1, x_2, \ldots, x_N\}$, gets the correct answers, and
outputs an estimate for $\mathrm{vol}(B^n)$. Next, we call the algorithm with the body
$K = \mathrm{conv}(\{x_1, x_2, \ldots, x_N\} \cap B^n)$. The answers of the oracle are exactly the
same, and since the algorithm has no other information about the body K
and it is deterministic, it has to output the same volume estimate as it did
for $B^n$. But by Theorem 13.2.1, $\mathrm{vol}(B^n)/\mathrm{vol}(K) \ge (cn/\ln(N/n+1))^{n/2}$, and
so the error of the approximation must be at least this factor. If N, the number
of oracle calls, is polynomial in n, it follows that the error is at least
$(c'n/\log n)^{n/2}$.
By more refined consideration, one can improve the lower bound to approximately
the square of the quantity just given. The idea is to input the
dual body $K^*$ into the algorithm, too, for which it gets the same answers, and
then use a deep result (the inverse Blaschke-Santaló inequality) stating that
$\mathrm{vol}(K)\,\mathrm{vol}(K^*) \ge c^n/n!$ for any centrally symmetric n-dimensional convex
body K, with an absolute constant c > 0 (some technical steps are omitted
here). This improvement is interesting because, as was remarked above,
for symmetric convex bodies it almost matches the performance of the best
known algorithm.
Idea of the proof of Theorem 13.2.1. Let V be the set of vertices of the
polytope $P \subset B^n$, $|V| = N$. We choose a suitable parameter k < n and prove
that for every $x \in P$, there is a k-tuple J of points of V such that x is close to
conv(J). Then vol(P) is simply estimated as $\binom{N}{k}$ times the maximum possible
volume of the appropriate neighborhood of the convex hull of k points in $B^n$.
Here is the first step towards realizing this program.
13.2.2 Lemma. Let S in $\mathbf{R}^n$ be an n-dimensional simplex, i.e., the convex
hull of n+1 affinely independent points, and let R = R(S) and $\rho = \rho(S)$ be
the circumradius and inradius of S, respectively, that is, the radius of the
smallest enclosing ball and of the largest inscribed ball. Then $\frac{R}{\rho} \ge n$.
Proof. We first sketch the proof of an auxiliary claim: Among all simplices
contained in $B^n$, the regular simplex inscribed in $B^n$ has the largest volume.
The volume of a simplex is proportional to the (n-1)-dimensional volume of
its base times the corresponding height. It follows that in a maximum-volume
simplex S inscribed in $B^n$, the hyperplane passing through a vertex v of S
and parallel to the facet of S not containing v is tangent to $B^n$, for otherwise,
v could be moved to increase the height:
It can be easily shown (Exercise 2) that this property characterizes the regular
simplex (so the regular simplex is even the unique maximum).
318 Chapter 13: Volumes in High Dimension
$$\rho = \rho(n,k) = \left(\sum_{i=k}^{n} \frac{1}{i^2}\right)^{1/2}.$$
Then $xx' \perp x'y$ (because the whole of $S'$ is perpendicular to $xx'$), and so
$\|x - y\|^2 = \|x - x'\|^2 + \|x' - y\|^2 \le \rho(n,k)^2$. Finally, $xy \perp F$, since both
the vectors $x' - y$ and $x - x'$ lie in the orthogonal complement of the linear
subspace generated by $F - y$. □
$$M(k-1) = \left(\frac{k}{k-1}\right)^{(k-1)/2} \frac{\sqrt{k}}{(k-1)!};$$
see Exercise 1(b). (If we only want to prove the weaker estimate (13.1) and do
not care about the value of C, then $M(k-1)$ can also be trivially estimated
by $\mathrm{vol}_{k-1}(B^{k-1})$ or even by $2^{k-1}$.)
What remains is calculation. We have
$$\frac{\mathrm{vol}(P)}{\mathrm{vol}(B^n)} \le \frac{\binom{N}{k} \cdot M(k-1) \cdot \rho(n,k)^{n-k+1} \cdot \mathrm{vol}_{n-k+1}(B^{n-k+1})}{\mathrm{vol}(B^n)}. \qquad (13.2)$$
We first estimate
-l
We now set
$$k = \left\lfloor \frac{n}{\ln\left(\frac{N}{n}+1\right)} \right\rfloor$$
(for obtaining the weaker estimate (13.1), the simpler value $k = \lfloor \frac{n}{\ln N} \rfloor$ is
more convenient). We may assume that $\ln N$ is much smaller than n, for
otherwise, the bound in the theorem is trivially valid, and so k is larger than
any suitable constant. In particular, we can ignore the integer part in the
definition of k.
For estimating the various terms in (13.2), it is convenient to work with
the natural logarithm of the quantities. The logarithm of the bound we are
heading for is $\frac{n}{2}\left(\ln\ln(\frac{N}{n}+1) - \ln n + O(1)\right)$, and so terms up to O(n) can
be ignored if we do not care about the value of the constant C. Further, we
find that $k \ln k = k \ln n - k \ln\ln(\frac{N}{n}+1) = k \ln n + O(n)$. This is useful for
estimating $\ln(k!) = k \ln k - O(k) = k \ln n - O(n)$.
Now, we can bound the logarithms of the terms in (13.2) one by one.
We have $\ln\binom{N}{k} \le k \ln N - \ln(k!) = k(\ln(\frac{N}{n}) + \ln n) - \ln(k!) \le n + k \ln n - k \ln n + O(n) = O(n)$;
this term is negligible. Next, $\ln M(k-1)$ contributes
about $-\ln(k!) = -k \ln n + O(n)$. The main contribution comes from the term
$\ln \rho(n,k)^{n-k+1} \le -(n-k) \ln \sqrt{k} + O(n) = \frac{n}{2}(-\ln n + \ln\ln(\frac{N}{n}+1)) + \frac{k}{2} \ln n$ +
for the reverse transition. It is much more difficult than the Blaschke-
Santaló inequality and it was proved by Bourgain and Milman; see,
e.g., [Mil98] for discussion and references.
Let us remark that the weaker bound $\left(\frac{C}{n} \ln N\right)^{n/2}$ is relatively
easy to prove in the dual setting with slabs (Exercise 14.1.4), which
together with the Blaschke-Santaló inequality gives (13.1).
Theorem 13.2.1 concerns the situation where vol(P) is small compared
to $\mathrm{vol}(B^n)$. The smallest number of vertices of P such that
$\mathrm{vol}(P) \ge (1-\varepsilon)\,\mathrm{vol}(B^n)$ for a small $\varepsilon > 0$ was investigated by Gordon,
Reisner, and Schütt [GRS97]. In an earlier work they constructed
polytopes with N vertices giving $\varepsilon = O(nN^{-2/(n-1)})$, and in the paper
mentioned they proved that this is asymptotically optimal for
$N \ge (Cn)^{(n-1)/2}$, with a suitable constant C.
The oracle model for computation with convex bodies was introduced
by Grötschel, Lovász, and Schrijver [GLS88]. A deterministic
polynomial-time algorithm approximating the volume of a convex
body given by a suitable oracle (weak separation oracle) achieving the
approximation factor $n!(1+\varepsilon)$, for every $\varepsilon > 0$, was given by Betke
and Henk [BH93] (the geometric idea goes back at least to Macbeath
[Mac50]). The algorithm chooses an arbitrary direction $v_1$ and finds
the supporting hyperplanes $h_1^+$ and $h_1^-$ of K perpendicular to $v_1$. Let
$p_1^+$ and $p_1^-$ be contact points of $h_1^+$ and $h_1^-$ with K. The next direction
$v_2$ is chosen perpendicular to the affine hull of $\{p_1^+, p_1^-\}$, etc.
[Figure omitted: the supporting hyperplanes and contact points found by the algorithm.]
since then; see, e.g., Kannan, Lovász, and Simonovits [KLS97]. A recent
success of these methods is a polynomial-time approximation algorithm
for the permanent of a nonnegative matrix by Jerrum, Sinclair,
and Vigoda [JSV01].
By considerations partially indicated in Exercise 4, Bárány and
Füredi [BF87] showed that in deterministic polynomial time one cannot
approximate the width of a convex body within a factor better
than $\Omega(\sqrt{n/\log n})$. Brieden, Gritzmann, Kannan, Klee, Lovász, and
Simonovits [BGK+99] provided a matching upper bound (up to a constant),
and they showed that in this case even randomized algorithms
are not more powerful. They also considered a variety of other parameters
of the convex body, such as diameter, inradius, and circumradius,
attaining similar results and improving many previous bounds from
[GLS88].
Lemma 13.2.2 appears in Fejes Tóth [Tót65].
Exercises
1. (a) Calculate the inradius and circumradius of a regular n-dimensional
simplex.
(b) Calculate the volume of the regular n-dimensional simplex inscribed
in the unit ball $B^n$.
2. Suppose that the vertices of an n-dimensional simplex S lie on the sphere
$S^{n-1}$ and for each vertex v, the hyperplane tangent to $S^{n-1}$ at v is parallel
to the facet of S opposite to v. Check that S is regular.
3. Let $S \subset \mathbf{R}^n$ be a simplex circumscribed about $B^n$ and let F be a facet
of S touching $B^n$ at a point c. Show that if c is not the center of gravity
of F, then there is another simplex $S'$ (arising by slightly moving the
hyperplane that determines the facet F) that contains $B^n$ and has volume
smaller than vol(S).
4. The width of a convex body K is the minimum distance of two parallel
hyperplanes such that K lies between them. Prove that the convex hull
of N points in $B^n$ has width at most $O(\sqrt{(\ln N)/n})$.
5. (A weaker but simpler estimate) Let $V \subset \mathbf{R}^n$ be a finite set. Prove
that $\mathrm{conv}(V) \subseteq \bigcup_{v \in V} B(\frac{1}{2}v, \frac{1}{2}\|v\|)$, where B(x,r) is the ball of radius r
centered at x. Deduce that the convex hull of N points contained in $B^n$
has volume at most $\frac{N}{2^n}\,\mathrm{vol}(B^n)$.
This is essentially the argument of Elekes [Ele86].
the bound in Theorem 13.2.1 is tight for N ~ 2n, since vol(P)/vol(B n ) ~ rn.
We begin with two extreme cases.
First we construct a k-dimensional polytope $P_0 \subset B^k$ with $4^k$ vertices
containing the ball $\frac{1}{2}B^k$. There are several possible ways; the simplest is based
on $\eta$-nets. We choose a 1-net $V \subset S^{k-1}$ and set $P_0 = \mathrm{conv}(V)$. According to
Lemma 13.1.1, we have $N = |V| \le 4^k$. If there were an x with $\|x\| = \frac{1}{2}$ not
lying in $P_0$,
Bibliography and remarks. Several proofs are known for the lower
bound almost matching Theorem 13.2.1 (Bárány and Füredi [BF87],
Carl and Pajor [CP88], Kochol [Koc94]). In Bárány and Füredi [BF87],
the appropriate polytope is obtained essentially as the convex hull of
N random points on $S^{n-1}$ (for technical reasons, d special vertices are
added), and the volume estimate is derived from an exact formula for
the expected surface measure of the convex hull of N random points
on $S^{n-1}$ due to Buchta, Müller, and Tichy [BMT95].
The idea of the beautifully simple construction in the text is due
to Kochol [Koc94]. His treatment of the basic case with exponentially
large N is different, though: He takes points of a suitably scaled integer
lattice contained in $B^k$ for V, which yields an efficient construction
(unlike the argument with a 1-net used in the text, which is only
existential).
Exercises
1. (Polytopes in $B^n$ with polynomially many facets)
(a) Show that the cube inscribed in the unit ball $B^n$, which is a convex
polytope with 2n facets, has volume of a larger order of magnitude than
any convex polytope in $B^n$ with polynomially many vertices (and so,
concerning volume, "facets are better than vertices").
(b) Prove that the inradius of any convex polytope with N facets contained
in $B^n$ is at most $O(\sqrt{\ln(N/n+1)/n})$ (and so, in this respect,
facets are not better than vertices).
These observations are from Brieden and Kochol [BK00].
$$E = \left\{ x \in \mathbf{R}^n : \frac{x_1^2}{a_1^2} + \frac{x_2^2}{a_2^2} + \cdots + \frac{x_n^2}{a_n^2} \le 1 \right\}, \qquad (13.3)$$
as is easy to check; this is an ellipsoid with center at 0 and with semiaxes
$a_1, a_2, \ldots, a_n$. In this case we have $\mathrm{vol}(E) = a_1 a_2 \cdots a_n \cdot \mathrm{vol}(B^n)$. An arbitrary
ellipsoid E can be brought to this form by a suitable translation and
rotation about the origin. In the language of linear algebra, this corresponds
to diagonalizing a positive definite matrix using an orthonormal basis consisting
of its eigenvectors; see Exercise 1.
Proof of Theorem 13.4.1. In both cases in the theorem, $E_{in}$ is chosen as
an ellipsoid of the largest possible volume contained in K. Easy compactness
considerations show that a maximum-volume ellipsoid exists. In fact, it is
also unique, but we will not prove this. (Alternatively, the proof can be done
starting with the smallest-volume ellipsoid enclosing K, but this has some
technical disadvantages. For example, its existence is not so obvious.)
We prove only the centrally symmetric case of John's lemma. The nonsymmetric
case follows the same idea, but the calculations are different and
more complicated, and we leave them to Exercise 2.
So we suppose that K is symmetric about 0, and we fix the ellipsoid
$E_{in}$ of maximum volume contained in K. It is easily seen that $E_{in}$ can be
assumed to be symmetric, too. We make a linear transformation so that $E_{in}$
becomes the unit ball $B^n$. Assuming that the enlarged ball $\sqrt{n} \cdot B^n$ does not
contain K, we derive a contradiction by exhibiting an ellipsoid $E' \subseteq K$ with
$\mathrm{vol}(E') > \mathrm{vol}(B^n)$.
We know that there is a point $x \in K$ with $\|x\| > \sqrt{n}$. For convenience, we
may suppose that $x = (s, 0, 0, \ldots, 0)$, $s > \sqrt{n}$. To finish the proof, we check
that the region $R = \mathrm{conv}(B^n \cup \{-x, x\})$
[Figure omitted: the region R with the points x and -x, and the ellipsoid $E'$ with semiaxes a, b, ..., b.]
This leads to $a^2 = s^2(1 - b^2) + b^2$. We now choose b just a little smaller than 1;
a suitable parameterization is $b^2 = 1 - \varepsilon$ for a small $\varepsilon > 0$. We want to show
that $ab^{n-1} > 1$, and for convenience, we work with the square. We have
$$a^2 b^{2(n-1)} = \left(1 + (s^2 - 1)\varepsilon\right)(1 - \varepsilon)^{n-1}.$$
The Maclaurin series of the right-hand side in the variable $\varepsilon$ is $1 + (s^2 - n)\varepsilon + O(\varepsilon^2)$.
Since $s^2 > n$, the expression indeed exceeds 1 for all sufficiently small
$\varepsilon > 0$. Theorem 13.4.1 is proved. □
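The final computation is easy to check numerically. The following sketch takes the relation $a^2 = s^2(1-b^2) + b^2$ and the parameterization $b^2 = 1 - \varepsilon$ from the proof and evaluates the volume ratio $ab^{n-1}$; the function names and the sample values of n, s, and $\varepsilon$ are illustrative assumptions.

```python
import math


def stretched_semiaxes(s, eps):
    """Semiaxes a (one direction) and b (all others), with b^2 = 1 - eps."""
    b_squared = 1.0 - eps
    a_squared = s * s * (1.0 - b_squared) + b_squared
    return math.sqrt(a_squared), math.sqrt(b_squared)


def volume_ratio(n, s, eps):
    """vol(E') / vol(B^n) = a * b^(n-1) for semiaxes a, b, ..., b."""
    a, b = stretched_semiaxes(s, eps)
    return a * b ** (n - 1)


def first_order_estimate(n, s, eps):
    """Leading terms 1 + (s^2 - n) * eps of the squared ratio."""
    return 1.0 + (s * s - n) * eps


if __name__ == "__main__":
    n = 10
    for s in (3.0, 4.0):  # sqrt(10) is about 3.16, so only s = 4 exceeds it
        print(s, volume_ratio(n, s, 0.01), first_order_estimate(n, s, 0.01))
```

With n = 10, the ratio exceeds 1 for s = 4 (where $s^2 > n$) and stays below 1 for s = 3, in line with the sign of $s^2 - n$ in the Maclaurin expansion.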
Exercises
1. Let E be the ellipsoid $f(B^n)$, where $f\colon x \mapsto Ax$ for an $n \times n$ nonsingular
matrix A.
(a) Show that $E = \{x \in \mathbf{R}^n : x^T B x \le 1\}$. What is the matrix B?
(b) Recall or look up appropriate theorems in linear algebra showing that
there is an orthonormal matrix T such that $B' = TBT^{-1}$ is a diagonal
matrix with the eigenvalues of B on the diagonal (check and use the fact
that B is positive definite in our case).
(c) What is the geometric meaning of T, and what is the relation of the
entries of $TBT^{-1}$ to the semiaxes of the ellipsoid E?
2. Prove the part of Theorem 13.4.1 dealing with not necessarily symmetric
convex bodies.
3. (Uniqueness of the smallest enclosing ellipsoid) Let $X \subset \mathbf{R}^n$ be a bounded
set that is not contained in a hyperplane (i.e., it contains n+1 affinely
independent points). Let $\mathcal{E}(X)$ be the set of all ellipsoids in $\mathbf{R}^n$ containing X.
(a) Prove that there exists an $E_0 \in \mathcal{E}(X)$ with $\mathrm{vol}(E_0) = \inf\{\mathrm{vol}(E) : E \in \mathcal{E}(X)\}$.
(Show that the infimum can be taken over a suitable compact
subset of $\mathcal{E}(X)$.)
(b) Let $E_1, E_2$ be ellipsoids in $\mathbf{R}^n$; check that after a suitable affine transformation
of coordinates, we may assume that $E_1 = \{x \in \mathbf{R}^n : \sum_{i=1}^n \frac{x_i^2}{a_i^2} \le 1\}$
and $E_2 = \{x \in \mathbf{R}^n : \|x - c\| \le 1\}$. Define
$E = \{x \in \mathbf{R}^n : \frac{1}{2}\sum_{i=1}^n \frac{x_i^2}{a_i^2} + \frac{1}{2}\sum_{i=1}^n (x_i - c_i)^2 \le 1\}$.
Verify that $E_1 \cap E_2 \subseteq E$, that E is an ellipsoid,
and that vol(E) 2 min(vol(Ed, vol(E2 )), with equality only if El = E 2.
Conclude that the smallest-volume enclosing ellipsoid of X is unique. IT]
4. (Uniqueness of the smallest enclosing ball)
(a) In analogy with Exercise 3, prove that for every bounded set $X \subset \mathbf{R}^n$,
there exists a unique minimum-volume ball containing X.
(b) Show that if $X \subset \mathbf{R}^n$ is finite then the smallest enclosing ball is
determined by at most n+1 points of X; that is, there exists an at most
(n+1)-point subset of X whose smallest enclosing ball is the same as that
of X.
5. (a) Let $P \subset \mathbf{R}^2$ be a convex polygon with n vertices. Prove that there
are three consecutive vertices of P such that the area of their convex hull
is at most $O(n^{-3})$ times the area of P.
(b) Using (a) and the fact that every triangle with vertices at integer
points has area at least $\frac{1}{2}$ (check!), prove that every convex n-gon with
integral vertices has area $\Omega(n^3)$.
Remark. Rényi and Sulanke [RS64] proved that the worst case in (a) is
the regular convex n-gon.
14 Measure Concentration and Almost Spherical Sections
As we will see later, w is of order $n^{-1/2}$ for large n. (Of course, one might
ask why the measure is concentrated just around the "equator" $x_n = 0$. But
counterintuitive as it may sound, it is concentrated around any equator, i.e.,
near any hyperplane containing the origin.)
The second, considerably deeper, step shows that the measure on $S^{n-1}$
is concentrated not only around the equator, but near the boundary of any
(measurable) subset $A \subset S^{n-1}$ covering half of the sphere. Here is a precise
quantitative formulation.
14.1 Measure Concentration on the Sphere 331
Thus, if A occupies half of the sphere, almost all points of the sphere
lie at distance at most $O(n^{-1/2})$ from A; only extremely small reserves can
vegetate undisturbed by the nearness of A. (There is nothing very special
about measure $\frac{1}{2}$ here; see Exercise 1 for an analogous result with $P[A] =
a \in (0, \frac{1}{2})$.) To recover the concentration around the equator, it suffices to
choose A as the northern hemisphere and then as the southern hemisphere.
We present a simple and direct geometric proof of a slightly weaker version
of Theorem 14.1.1, with $-t^2 n/4$ in the exponent instead of $-t^2 n/2$. It deals
with both the steps mentioned above in one stroke.
It is based on the Brunn-Minkowski inequality: $\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n} \le
\mathrm{vol}(A + B)^{1/n}$ for any nonempty compact sets $A, B \subset \mathbf{R}^n$ (Theorem 12.2.2).
We actually use a slightly different version of the inequality, which resembles
the well known inequality between the arithmetic and geometric means, at
least optically:
$$\mathrm{vol}\left(\tfrac{1}{2}(A + B)\right) \ge \sqrt{\mathrm{vol}(A)\,\mathrm{vol}(B)}. \qquad (14.1)$$
This is easily derived from the usual version: We have $\mathrm{vol}(\frac{1}{2}(A + B))^{1/n} \ge
\mathrm{vol}(\frac{1}{2}A)^{1/n} + \mathrm{vol}(\frac{1}{2}B)^{1/n} = \frac{1}{2}(\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}) \ge (\mathrm{vol}(A)\,\mathrm{vol}(B))^{1/2n}$
by the inequality $\frac{1}{2}(a + b) \ge \sqrt{ab}$.
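Inequality (14.1) can be sanity-checked on the simplest compact sets. The sketch below restricts attention, as an assumption made for illustration, to axis-parallel boxes, for which $\frac{1}{2}(A+B)$ is again a box whose side lengths are the averages of the side lengths of A and B; the function names are hypothetical.

```python
import math


def box_volume(sides):
    """Volume of an axis-parallel box with the given side lengths."""
    return math.prod(sides)


def half_minkowski_sum(sides_a, sides_b):
    """Side lengths of (A + B)/2 for axis-parallel boxes A and B."""
    return [(a + b) / 2 for a, b in zip(sides_a, sides_b)]


def both_sides_of_14_1(sides_a, sides_b):
    """Return (vol((A+B)/2), sqrt(vol(A) * vol(B)))."""
    lhs = box_volume(half_minkowski_sum(sides_a, sides_b))
    rhs = math.sqrt(box_volume(sides_a) * box_volume(sides_b))
    return lhs, rhs


if __name__ == "__main__":
    print(both_sides_of_14_1([1, 4, 9], [4, 1, 1]))
    print(both_sides_of_14_1([2, 3], [2, 3]))  # equal boxes give equality
```

For boxes, (14.1) reduces to the arithmetic-geometric mean inequality applied to each pair of side lengths, which is the "optical" resemblance mentioned above.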
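Proof of a weaker version of Theorem 14.1.1. For a set $A \subseteq S^{n-1}$,
we define $\tilde{A}$ as the union of all the segments connecting the points of A to 0:
$\tilde{A} = \{\alpha x : x \in A, \alpha \in [0,1]\} \subseteq B^n$. Then we have
$$P[A] = \mu(\tilde{A}),$$
where $\mu(\tilde{A}) = \mathrm{vol}(\tilde{A})/\mathrm{vol}(B^n)$ is the normalized volume of $\tilde{A}$; in fact, this
can be taken as the definition of P[A].
Let $t \in [0,1]$, let $P[A] \ge \frac{1}{2}$, and let $B = S^{n-1} \setminus A_t$. Then $\|a - b\| \ge t$ for
all $a \in A$, $b \in B$.
14.1.2 Lemma. For any $\tilde{x} \in \tilde{A}$ and $\tilde{y} \in \tilde{B}$, we have $\left\|\frac{\tilde{x} + \tilde{y}}{2}\right\| \le 1 - t^2/8$.
Proof of the lemma. Let $\tilde{x} = \alpha x$, $\tilde{y} = \beta y$, $x \in A$, $y \in B$: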
332 Chapter 14: Measure Concentration and Almost Spherical Sections
I x; fJ I I ax 2+ yI : :; a I x; yI + (1 - a) I ~ II
= a(l- ~) + (1 - a)(1 - ~) :::; 1- ~.
The lemma is proved.
So
Exercises
1. Derive the following from Theorem 14.1.1: If $A \subseteq S^{n-1}$ satisfies $P[A] \ge a$,
$0 < a \le \frac{1}{2}$, then $1 - P[A_t] \le 2e^{-(t - t_0)^2 n/2}$, where $t_0$ is such that
$2e^{-t_0^2 n/2} < a$.
2. Let $A, B \subseteq S^{n-1}$ be measurable sets with distance at least 2t. Prove that
$\min(P[A], P[B]) \le 2e^{-t^2 n/2}$.
3. Use Theorem 14.1.1 to show that any 1-dense set in the unit sphere $S^{n-1}$
has at least $\frac{1}{2}e^{n/8}$ points.
4. Let $K = \bigcap_{i=1}^{N} \{x \in \mathbf{R}^n : |\langle u_i, x \rangle| \le 1\}$ be the intersection of symmetric
slabs determined by unit vectors $u_1, \ldots, u_N \in \mathbf{R}^n$. Using Theorem 14.1.1,
prove that $\mathrm{vol}(B^n)/\mathrm{vol}(K) \le \left(\frac{C}{n} \ln N\right)^{n/2}$ for a suitable constant C.
The relation to Theorem 13.2.1 is explained in the notes to Section 13.2.
14.2 Isoperimetric Inequalities and More on Concentration 333
(In the picture, assuming that the dark areas are the same, the light gray area
is the smallest for the disk.) Letting $t \to 0$, one can get a statement involving
the perimeter or surface area. But the formulation with t-neighborhood makes
sense even in spaces where "surface area" is not defined; it suffices to have a
metric and a measure on the considered space.
Here is this "neighborhood" form of isoperimetric inequality for the Euclidean
space $\mathbf{R}^n$ with Lebesgue measure.
14.2.1 Proposition. For any compact set $A \subset \mathbf{R}^n$ and any $t \ge 0$, we have
$\mathrm{vol}(A_t) \ge \mathrm{vol}(B_t)$, where B is a ball of the same volume as A.
Although we do not need this particular result in the further development,
let us digress and mention a nice proof using the Brunn-Minkowski inequality
(Theorem 12.2.2).
Proof. By rescaling, we may assume that B is a ball of unit radius. Then
At = A + tB, and so
isoperimetric inequality states that for all measurable sets $A \subseteq S^{n-1}$ and all
$t \ge 0$, we have $P[A_t] \ge P[C_t]$, where C is a spherical cap with P[C] = P[A].
We are not going to prove this; no really simple proof seems to be known.
The measure concentration on the sphere (Theorem 14.1.1) is a rather
direct consequence of this isoperimetric inequality, by the argument already
indicated above. If $P[A] = \frac{1}{2}$, then $P[A_t] \ge P[C_t]$, where C is a cap with
$P[C] = \frac{1}{2}$, i.e., a hemisphere. Thus, it suffices to estimate the measure of the
complementary cap $S^{n-1} \setminus C_t$.¹
Gaussian concentration. There are many other metric probability spaces
with measure concentration phenomena analogous to Theorem 14.1.1. Perhaps
the most important one is $\mathbf{R}^n$ with the Euclidean metric and with the
n-dimensional Gaussian measure $\gamma$, given by the density $(2\pi)^{-n/2} e^{-\|x\|^2/2}$.
It arises as follows: let $Z_1, Z_2, \ldots, Z_n$ be independent random variables,
each with the standard normal distribution, i.e.,
$$\mathrm{Prob}[Z_i \le z] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt$$
for all $z \in \mathbf{R}$. Then the vector $(Z_1, Z_2, \ldots, Z_n) \in \mathbf{R}^n$ is distributed according
to the measure $\gamma$. This $\gamma$ is spherically symmetric; the density function
$(2\pi)^{-n/2} e^{-\|x\|^2/2}$ depends only on the distance of x from the origin. The distance
from the origin of a point chosen at random according to this distribution is sharply
concentrated around $\sqrt{n}$, and in many respects, choosing a random point
according to $\gamma$ is similar to choosing a random point from the uniform distribution
on the sphere $\sqrt{n}\,S^{n-1}$.
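The concentration of the norm around $\sqrt{n}$ is easy to observe empirically. The sketch below is a small simulation under illustrative assumptions (the function names, the sample size, and the dimension 400 are arbitrary choices, not taken from the text); it draws vectors with independent standard normal coordinates and reports how far their norms stray from $\sqrt{n}$.

```python
import math
import random


def gaussian_norm(n, rng):
    """Euclidean norm of a vector with n independent N(0,1) coordinates."""
    return math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)))


def norm_statistics(n, samples, rng):
    """Mean norm and largest deviation from sqrt(n) over `samples` vectors."""
    norms = [gaussian_norm(n, rng) for _ in range(samples)]
    mean = sum(norms) / samples
    worst = max(abs(r - math.sqrt(n)) for r in norms)
    return mean, worst


if __name__ == "__main__":
    rng = random.Random(0)
    mean, worst = norm_statistics(400, 200, rng)
    print(math.sqrt(400), mean, worst)
```

In such experiments the deviations from $\sqrt{n}$ stay of constant order while $\sqrt{n}$ itself grows, which is the sense in which $\gamma$ resembles the uniform distribution on $\sqrt{n}\,S^{n-1}$.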
The isoperimetric inequality for the Gaussian measure claims that among
all sets A with given $\gamma(A)$, a half-space has the smallest possible measure
of the t-neighborhood. By simple calculation, this yields the corresponding
theorem about measure concentration for the Gaussian measure:
14.2.2 Theorem (Gaussian measure concentration). Let a measurable
set $A \subseteq \mathbf{R}^n$ satisfy $\gamma(A) \ge \frac{1}{2}$. Then $\gamma(A_t) \ge 1 - e^{-t^2/2}$.
¹ Theorem 14.1.1 provides a good upper bound for the measure of a spherical cap,
but sometimes a lower bound is useful, too. Here are fairly precise estimates; for
convenience they are expressed with a different parameterization. Let $C(\tau) =
\{x \in S^{n-1} : x_1 \ge \tau\}$ denote the spherical cap of height $1 - \tau$. Then for $0 \le \tau \le
\sqrt{2/n}$, we have $\frac{1}{12} \le P[C(\tau)] \le \frac{1}{2}$, and for $\sqrt{2/n} \le \tau < 1$, we have
Note that the dimension does not appear in this inequality, and indeed
the Gaussian concentration has infinite-dimensional versions as well. Measure
concentration on $S^{n-1}$, with slightly suboptimal constants, can be proved as
an easy consequence of the Gaussian concentration; see, for example, Milman
and Schechtman [MS86] (Appendix V) or Pisier [Pis89].
Most of the results in the sequel obtained using measure concentration on
the sphere can be derived from the Gaussian concentration as well. In more
advanced applications the Gaussian concentration is often technically preferable,
but here we stick to the perhaps more intuitive measure concentration
on the sphere.
Other important "continuous" spaces with concentration results similar to
Theorem 14.1.1 include the n-dimensional torus (the n-fold Cartesian product
$S^1 \times \cdots \times S^1 \subset \mathbf{R}^{2n}$) and the group SO(n) of all rotations around the origin
in $\mathbf{R}^n$ (see Section 14.4 for a little more about SO(n)).
Discrete metric spaces. Similar concentration inequalities also hold in
many discrete metric spaces encountered in combinatorics. One of the simplest
examples is the n-dimensional Hamming cube $C_n = \{0,1\}^n$. The points
are n-component vectors of 0's and 1's, and their Hamming distance is the
number of positions where they differ. The "volume" of a set $A \subseteq \{0,1\}^n$
is defined as $P[A] = \frac{1}{2^n}|A|$. An r-ball B is the set of all 0/1 vectors that
differ from a given vector in at most r coordinates, and so its volume is
$P[B] = 2^{-n}\left(1 + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{r}\right)$. The isoperimetric inequality for the
Hamming cube, due to Harper, is exactly of the form announced above:
If $A \subseteq C_n$ is any set with $P[A] \ge P[B]$, then $P[A_t] \ge P[B_t]$.
Of course, if A is an r-ball, then $A_t$ is an (r+t)-ball and we have equality.
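For small n the statements about Hamming balls can be verified by brute force. The sketch below is illustrative only: the function names are hypothetical, and the t-neighborhood is taken to mean all vectors at Hamming distance at most t from the set, which is an assumption consistent with the remark that the t-neighborhood of an r-ball is an (r+t)-ball.

```python
from itertools import product
from math import comb


def ball_measure(n, r):
    """P[B] = 2^(-n) * (C(n,0) + C(n,1) + ... + C(n,r)) for an r-ball B."""
    return sum(comb(n, i) for i in range(r + 1)) / 2 ** n


def hamming_distance(u, v):
    """Number of positions where the 0/1 vectors u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def neighborhood(points, t, n):
    """All vectors of {0,1}^n within Hamming distance t of the set `points`."""
    return {v for v in product((0, 1), repeat=n)
            if any(hamming_distance(u, v) <= t for u in points)}


if __name__ == "__main__":
    n = 6
    center = (0,) * n
    ball = neighborhood({center}, 1, n)   # a 1-ball around the zero vector
    grown = neighborhood(ball, 2, n)      # its 2-neighborhood
    print(len(ball) / 2 ** n, ball_measure(n, 1))
    print(len(grown) / 2 ** n, ball_measure(n, 3))
```

The enumeration is exponential in n and is meant only as a check of the formula for P[B] on tiny examples.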
Suitable estimates (tail estimates for the binomial distribution from probability
theory) then give an analogue of Theorem 14.1.1:
Proof. The first inequality can be derived from the $\sigma$-additivity of the
measure P:
i]
00
(the numerical estimate of the last sum is not important; it is important that
it converges to some constant, which is obvious). 0
$$\dim L \ge \frac{\delta^2}{8 \log(8/\delta)} \cdot n - 1.$$
First we show that a bound on $E(\lambda)$ implies concentration of Lipschitz
functions. Assume that $E(\lambda) \le e^{a\lambda^2/2}$ for some $a > 0$ and all $\lambda > 0$,
and let $f\colon \Omega \to \mathbf{R}$ be 1-Lipschitz. We may suppose that E[f] = 0.
Using Markov's inequality for the random variable $Y = e^{\lambda f}$, we have
$P[f \ge t] = P[Y \ge e^{t\lambda}] \le E[Y]/e^{t\lambda} \le E(\lambda)/e^{t\lambda} \le e^{a\lambda^2/2 - \lambda t}$, and
setting $\lambda = \frac{t}{a}$ yields $P[f \ge t] \le e^{-t^2/2a}$.
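The choice $\lambda = t/a$ is simply the minimizer of the exponent $a\lambda^2/2 - \lambda t$. The sketch below checks this numerically; the function names and the sample values of t and a are illustrative assumptions.

```python
import math


def exponent(lam, t, a):
    """Exponent a*lam^2/2 - lam*t in the bound P[f >= t] <= exp(...)."""
    return a * lam * lam / 2 - lam * t


def best_lambda(t, a):
    """Minimizer of the exponent, obtained by setting its derivative to zero."""
    return t / a


def tail_bound(t, a):
    """Resulting bound exp(-t^2 / (2a)) on P[f >= t]."""
    return math.exp(exponent(best_lambda(t, a), t, a))


if __name__ == "__main__":
    t, a = 1.5, 2.0
    print(best_lambda(t, a), tail_bound(t, a), math.exp(-t * t / (2 * a)))
```

Any other positive $\lambda$ still gives a valid bound by Markov's inequality, only a weaker one.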
Next, for some spaces, $E(\lambda)$ can be bounded directly. Here we show
that if $(\Omega, \rho)$ has diameter at most 1, then $E(\lambda) \le e^{\lambda^2/2}$. This can be
proved by the following elegant trick. First we note that $e^{E[f]} \le E[e^f]$
for any f, by Jensen's inequality in integral form, and so if E[f] = 0,
then $E[e^{-f}] \ge 1$. Then, for a 1-Lipschitz f with E[f] = 0, we calculate
$$E[e^{\lambda f}] = \int_{\Omega} e^{\lambda f(x)}\,dP(x)$$
Exercises
1. Derive the measure concentration on the sphere (Theorem 14.1.1) from
Lévy's lemma.
Given a centrally symmetric convex body $K \subset \mathbf{R}^n$ and $\varepsilon > 0$, we are interested
in finding a k-dimensional (linear) subspace L, with k as large as
possible, such that the "section" $K \cap L$ is $(1+\varepsilon)$-almost spherical.
Ellipsoids. First we deal with ellipsoids, where the existence of large spherical
sections is not very surprising. But in the sequel it gives us additional
freedom: Instead of looking for a $(1+\varepsilon)$-spherical section of a given convex
body, we can as well look for a $(1+\varepsilon)$-ellipsoidal section, while losing only a
factor of at most 2 in the dimension. This means that we are free to transform
a given body by any (nonsingular) affine map, which is often convenient.
Let us remark that in the local theory of Banach spaces, almost-ellipsoidal
sections are usually as good as almost-spherical ones, and so the following
lemma is often not even mentioned.
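Proof. Let $E = \{x \in \mathbf{R}^{2k-1} : \sum_{i=1}^{2k-1} \frac{x_i^2}{a_i^2} \le 1\}$ with $0 < a_1 \le a_2 \le \cdots \le
a_{2k-1}$. We define the k-dimensional linear subspace L by a system of k - 1
linear equations. The ith equation is
$$x_i \sqrt{\frac{1}{a_i^2} - \frac{1}{a_k^2}} = x_{2k-i} \sqrt{\frac{1}{a_k^2} - \frac{1}{a_{2k-i}^2}},$$
i = 1, 2, ..., k-1. The coefficients are chosen so that
$\frac{x_i^2}{a_i^2} + \frac{x_{2k-i}^2}{a_{2k-i}^2} = \frac{1}{a_k^2}\left(x_i^2 + x_{2k-i}^2\right)$
for $x \in L$. It follows that for $x \in L$, we have $x \in E$ if and only if $\|x\| \le a_k$,
and so $E \cap L$ is a ball of radius $a_k$. The reader is invited to find a geometric
meaning of this proof and/or express it in the language of eigenvalues. □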
To make formulas simpler, we consider only the case $\varepsilon = 1$ (2-almost
spherical sections) in the rest of this section. An arbitrary $\varepsilon > 0$ can always
be handled very similarly.
14.4 Almost Spherical Sections: The First Steps 343
The cube. The cube $[-1,1]^n$ is a good test case for finding almost-spherical
sections; it seems hard to imagine how a cube could have very round slices.
In some sense, this intuition is not totally wrong, since the almost-spherical
sections of a cube can have only logarithmic dimension, as we verify next.
(But the $n$-dimensional crosspolytope has $(1+\varepsilon)$-spherical sections of
dimension as high as $c(\varepsilon)n$, and yet it does not look any rounder than the cube; so
much for the intuition.)
The intersection of the cube with a $k$-dimensional linear subspace of $\mathbf{R}^n$
is a $k$-dimensional convex polytope with at most $2n$ facets.
Next, we show that the $n$-dimensional cube actually does have 2-almost
spherical sections of dimension $\Omega(\log n)$. First we need a $k$-dimensional
2-almost spherical polytope with $4^k$ facets. We note that if $P$ is a convex
polytope with $B^k \subseteq P \subseteq tB^k$, then the dual polytope $P^*$ satisfies $\frac{1}{t}B^k \subseteq
P^* \subseteq B^k$ (Exercise 1). So it suffices to construct a $k$-dimensional 2-almost
spherical polytope with $4^k$ vertices, and this was done in Section 13.3: We can
take any 1-net in $S^{k-1}$ as the vertex set. (Let us remark that an exponential
lower bound for the number of vertices also follows from Theorem 13.2.1.)
By at most doubling the number of facets, we may assume that our
$k$-dimensional 2-almost spherical polytope is centrally symmetric. It remains
to observe that every $k$-dimensional centrally symmetric convex polytope $P$
with $2n$ facets is an affine image of the section $[-1,1]^n \cap L$ for a suitable
$k$-dimensional linear subspace $L \subseteq \mathbf{R}^n$. Indeed, such a $P$ can be expressed as the
intersection $\bigcap_{i=1}^{n}\{x \in \mathbf{R}^k : |\langle a_i,x\rangle| \le 1\}$, where $\pm a_1,\ldots,\pm a_n$ are suitably
normalized normal vectors of the facets of $P$. Let $f\colon \mathbf{R}^k \to \mathbf{R}^n$ be the linear
map given by
$$f(x) = (\langle a_1,x\rangle, \langle a_2,x\rangle, \ldots, \langle a_n,x\rangle).$$
Since $P$ is bounded, the $a_i$ span all of $\mathbf{R}^k$, and so $f$ has rank $k$. Consequently,
its image $L = f(\mathbf{R}^k)$ is a $k$-dimensional subspace of $\mathbf{R}^n$. We have $P =
f^{-1}([-1,1]^n)$, and so the intersection $[-1,1]^n \cap L$ is the affine image of $P$.
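The map $f$ can be tried out on a small case. The sketch below is an illustration under assumptions of our own: it takes $P$ to be a regular hexagon in the plane with inradius 1 (so $k = 2$ and $n = 3$), and checks that a point lies in $P$ exactly when its image under $f$ lies in the cube $[-1,1]^3$. The names `NORMALS`, `f`, `in_cube`, and `in_hexagon` are hypothetical.

```python
import math

# Facet normals a_1, a_2, a_3 of a regular hexagon with inradius 1
# (an illustrative choice; the facets lie on the lines <a_i, x> = +1 and -1).
NORMALS = [(math.cos(math.radians(d)), math.sin(math.radians(d))) for d in (0, 60, 120)]

def f(x):
    """The linear map f(x) = (<a_1, x>, ..., <a_n, x>) from R^2 to R^3."""
    return tuple(a[0] * x[0] + a[1] * x[1] for a in NORMALS)

def in_cube(y):
    """Membership in the cube [-1, 1]^n."""
    return all(abs(c) <= 1.0 for c in y)

def in_hexagon(x):
    """Membership in P, tested directly facet by facet."""
    return all(abs(a[0] * x[0] + a[1] * x[1]) <= 1.0 for a in NORMALS)

# A vertex of the hexagon: the intersection of the facets with normals a_1 and a_2.
vertex = (1.0, 1.0 / math.sqrt(3.0))
inside = (0.99 * vertex[0], 0.99 * vertex[1])    # slightly shrunk toward the origin
outside = (1.01 * vertex[0], 1.01 * vertex[1])   # slightly pushed outward
```

Here `in_cube(f(x))` and `in_hexagon(x)` agree on both test points, which is the statement $P = f^{-1}([-1,1]^n)$ in this small case.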
We see that the $n$-dimensional cube has 2-almost ellipsoidal sections of
dimension $\Omega(\log n)$ (as well as 2-almost spherical sections, by Lemma 14.4.1).
Next, we make preparatory steps for finding almost-spherical sections of
arbitrary centrally symmetric convex bodies. These considerations are most
conveniently formulated in the language of norms.
Reminder on norms. We recall that a norm on a real vector space $Z$ is
a mapping that assigns a nonnegative real number $\|x\|_Z$ to each $x \in Z$ such
that $\|x\|_Z = 0$ implies $x = 0$, $\|ax\|_Z = |a| \cdot \|x\|_Z$ for all $a \in \mathbf{R}$, and the
triangle inequality holds: $\|x + y\|_Z \le \|x\|_Z + \|y\|_Z$. (Since we have reserved
$\|\cdot\|$ for the Euclidean norm, we write other norms with various subscripts,
or occasionally we use the symbol $|\cdot|$.)
Norms are in one-to-one correspondence with closed bounded convex
bodies symmetric about 0 and containing 0 in their interior. Here we need only
one direction of this correspondence: Given a convex body $K$ with the listed
properties, we assign to it the norm $\|\cdot\|_K$ given by
$$\|x\|_K = \min\{t \ge 0 : x \in tK\}.$$
It is easy to verify the axioms of the norm (the convexity of $K$ is needed for
the triangle inequality). The body $K$ is the unit ball of the norm $\|\cdot\|_K$. The
norm of points decreases by blowing up the body $K$.
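As a small illustration, assume that the norm assigned to $K$ is the standard gauge, that is, $\|x\|_K$ is the least $t \ge 0$ with $x \in tK$. For an axis-parallel box $K = \prod_i [-w_i, w_i]$ this has a closed form, which the sketch below uses. The box, the function name `gauge_of_box`, and the sample points are assumptions made for the example.

```python
def gauge_of_box(x, half_widths):
    """Norm whose unit ball is the box K = prod [-w_i, w_i]:
    the least t >= 0 with x in tK, which equals max_i |x_i| / w_i."""
    return max(abs(xi) / wi for xi, wi in zip(x, half_widths))

x = (1.0, 1.0)
y = (-3.0, 2.0)
small = gauge_of_box(x, (2.0, 4.0))   # unit ball [-2, 2] x [-4, 4]
large = gauge_of_box(x, (4.0, 8.0))   # the same box blown up by a factor of 2

# Triangle inequality on one sample pair, for the first box.
sum_xy = tuple(a + b for a, b in zip(x, y))
lhs = gauge_of_box(sum_xy, (2.0, 4.0))
rhs = gauge_of_box(x, (2.0, 4.0)) + gauge_of_box(y, (2.0, 4.0))
```

Blowing up the box by a factor of 2 halves the norm of every point, in line with the remark that the norm decreases when the body grows.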
General body: the first attempt. Let $K \subset \mathbf{R}^n$ be a convex body defining
a norm (i.e., closed, bounded, symmetric, 0 in the interior). Let us define the
function $f_K\colon S^{n-1} \to \mathbf{R}$ as the restriction of the norm $\|\cdot\|_K$ on $S^{n-1}$; that
is, $f_K(x) = \|x\|_K$. We note that $K$ is $t$-almost spherical if (and only if) there
is a number $a > 0$ such that $a \le f_K(x) \le ta$ for all $x \in S^{n-1}$. So for finding
a large almost-spherical section of $K$, we need a linear subspace $L$ such that
$f_K$ does not vary too much on $S^{n-1} \cap L$, and this is where Proposition 14.3.4,
about subspaces where a Lipschitz function is almost constant, comes in.
Of course, that proposition has its assumptions, and one of them is that
$f_K$ is 1-Lipschitz. A sufficient condition for that is that $K$ should contain the
unit ball:
14.4.3 Observation. Suppose that the convex body $K$ contains the $R$-ball
$B(0, R)$. Then $\|x\|_K \le \frac{1}{R}\|x\|$ for all $x$, and the function $x \mapsto \|x\|_K$ is
$\frac{1}{R}$-Lipschitz with respect to the Euclidean metric. □
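$$\Omega\!\left(\frac{nm^2}{\log(24/m)}\right).$$
$$\dim L = \Omega\!\left(\frac{n\delta^2}{\log(8/\delta)}\right) = \Omega\!\left(\frac{nm^2}{\log(24/m)}\right). \qquad (14.2)$$
The section $K \cap L$ is 2-almost spherical. □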
A slight improvement. It turns out that the factor $\log(24/m)$ in the result
just proved can be eliminated by a refined argument, which uses the fact that
$f_K$ comes from a norm.
14.4.5 Theorem. With the assumptions as in Proposition 14.4.4, a 2-almost
spherical section exists of dimension at least $\beta nm^2$, where $\beta > 0$ is an absolute
constant.
Proof. The main new observation is that for our fK' we can afford a much
less dense net N in the proof of Proposition 14.3.4. Namely, it suffices to let
N be a i-net in sk-l, where k = r!3m 2 nl
If $\beta > 0$ is sufficiently small, Lévy's lemma gives the existence of a rotation
p such that ~~m :::; fK(Y) :::; ~~m for all Y E p(N); this is exactly as in the
proof of Proposition 14.3.4. It remains to verify ~m :::; fK(X) :::; tm for all
x E sn-l n L, where L = p(Lo). This is implied by the following claim with
a = ~~m and 1·1 = /I. 11K:
Claim. Let N be a i-net in Sk-l with respect to the Euclidean metric, and
let I . I be a norm on Rk satisfying ia :::; IYI :::; a for all yEN and for some
number a > O. Then ~a :::; Ixl :::; ~a for all x E Sk-l.
To prove the claim, we begin with the upper bound (this is where the
new trick lies). Let $M = \max\{|x| : x \in S^{k-1}\}$ and let $x_0 \in S^{k-1}$ be a point
where M is attained. Choose a Yo EN at distance at most ~ from xo, and let
$z = (x_0 - y_0)/\|x_0 - y_0\|$ be the unit vector in the direction of $x_0 - y_0$. Then
M = Ixol :::; Iyol + Ixo - yol :::; a + Ilxo - yoll . Izl :::; a + ~M. The resulting
inequality M :::; a + ~M yields M :::; ~a.
The lower bound is now routine: If x E Sk-l and yEN is at distance at
most ~ from it, then Ixl ~ Iyl - Ix - yl ~ ia - . ~ ~a ~ ~a. The claim, as
well as Theorem 14.4.5, is proved. D
vide a dimension this large, but Kashin's result does not give
$(1+\varepsilon)$-almost spherical sections for small $\varepsilon$.
Exercises
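1. Let $K$ be a convex body containing 0 in its interior. Check that $K \subseteq B^n$
if and only if $B^n \subseteq K^*$ (recall that $K^* = \{x \in \mathbf{R}^n : \langle x, y\rangle \le 1 \text{ for all } y \in
K\}$). Derive that if $B^k \subseteq K \subseteq tB^k$, then $\frac{1}{t}B^k \subseteq K^* \subseteq B^k$. [2]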
14.5.1 Theorem. There is a constant $\alpha > 0$ such that for any centrally
symmetric $n$-dimensional convex polytope $P$, we have $\log f_0(P) \cdot \log f_{n-1}(P) \ge \alpha n$
(recall that $f_0(P)$ denotes the number of vertices and $f_{n-1}(P)$ the number
of facets).
For the cube, the expression $\log f_0(P) \cdot \log f_{n-1}(P)$ is about $n \log n$, which
is even slightly larger than the lower bound in the theorem. However,
polytopes can be constructed with both $\log f_0(P)$ and $\log f_{n-1}(P)$ bounded by
$O(\sqrt{n})$ (Exercise 1).
Proof of Theorem 14.5.1. We use the dual polytope $P^*$ with $f_0(P) =
f_{n-1}(P^*)$, and we prove the theorem in the equivalent form $\log f_{n-1}(P^*) \cdot
\log f_{n-1}(P) \ge \alpha n$.
John's lemma (Theorem 13.4.1) claims that for any symmetric convex
body $K$, there exists a (nonsingular) linear map that transforms $K$ into a
$\sqrt{n}$-almost spherical body. We can thus assume that the considered
$n$-dimensional polytope $P$ is $\sqrt{n}$-almost spherical (this is crucial for the proof).
After rescaling, we may suppose $B^n \subseteq P \subseteq \sqrt{n}\,B^n$. Letting $m = \mathrm{med}(f_P)$,
where $f_P$ is the restriction of $\|\cdot\|_P$ on $S^{n-1}$ as usual, Theorem 14.4.5 tells us
that there is a linear subspace $L$ of $\mathbf{R}^n$ with $P \cap L$ being 2-almost spherical
and with $\dim(L) = \Omega(nm^2)$. Thus, since any $k$-dimensional 2-almost spherical
polytope has $e^{\Omega(k)}$ facets, we have $\log f_{n-1}(P) = \Omega(nm^2)$.
Now, we look at $P^*$. Since $B^n \subseteq P \subseteq \sqrt{n}\,B^n$, by Exercise 14.4.1 we have
$n^{-1/2}B^n \subseteq P^* \subseteq B^n$. In order to apply Theorem 14.4.5, we set $\tilde P = \sqrt{n}\,P^*$,
and obtain a 2-almost spherical section $\tilde L$ of $\tilde P$ of dimension $\Omega(n\tilde m^2)$, where
$\tilde m = \mathrm{med}(f_{\tilde P})$. This implies $\log f_{n-1}(P^*) = \Omega(n\tilde m^2)$.
It remains to observe the following inequality:
Exercises
1. Construct an $n$-dimensional convex polytope $P$ with $\log f_0(P) = O(\sqrt{n})$
and $\log f_{n-1}(P) = O(\sqrt{n})$, thereby demonstrating that Theorem 14.5.1
is asymptotically optimal. Start with the interval $[0,1] \subset \mathbf{R}^1$, and alternate
the operations $(\cdot)^*$ (passing to the dual polytope) and $\times$ (Cartesian
product) suitably; see Exercise 5.5.1 for some properties of the Cartesian
product of polytopes.
The polytopes obtained from $[0,1]$ by a sequence of these operations are
called Hanner polytopes, and they form an important class of examples.
2. Let $K$ be a bounded centrally symmetric convex body in $\mathbf{R}^n$ containing 0
in its interior, and let $K^*$ be the dual body.
(a) Show that $\|x\|_K \cdot \|x\|_{K^*} \ge 1$ for all $x \in S^{n-1}$.
(b) Let $f, g\colon S^{n-1} \to \mathbf{R}$ be (measurable) functions with $f(x)g(x) \ge 1$ for
all $x \in S^{n-1}$. Show that $\mathrm{med}(f)\,\mathrm{med}(g) \ge 1$.
The assumption that $K$ is symmetric can in fact be omitted; it suffices
to require that 0 be an interior point of $K$. The proof of this more general
version is not much more difficult than the one shown below.
We prove Dvoretzky's theorem only for $\varepsilon = 1$, since in Section 14.4 we
prepared the tools for this particular setting. But the general case is not very
different.
Preliminary considerations. Since affine transforms of $K$ are practically
for free in view of Lemma 14.4.1, we may assume that $B^n \subseteq K \subseteq \sqrt{n}\,B^n$
by John's lemma (Theorem 13.4.1). So the norm induced by $K$ satisfies
$n^{-1/2}\|x\| \le \|x\|_K \le \|x\|$ for all $x$. If $f_K$ is the restriction of $\|\cdot\|_K$ to
$S^{n-1}$, we have the obvious bound $\mathrm{med}(f_K) \ge n^{-1/2}$. Immediate application
of Theorem 14.4.5 shows the existence of a 2-almost spherical section
of $K$ of dimension $\Omega(n\,\mathrm{med}(f_K)^2) = \Omega(1)$, so this approach gives nothing at
all! On the other hand, it fails only barely, and a small improvement in the order of
magnitude of the lower bound for $\mathrm{med}(f_K)$ already yields Dvoretzky's theorem.
We will not try to improve the estimate for $\mathrm{med}(f_K)$ directly. Instead,
we find a relatively large subspace $Z \subset \mathbf{R}^n$ such that the section $K \cap Z$ can
be enclosed in a not too large parallelotope $P$. Then we estimate, by direct
computation, $\mathrm{med}(f_P)$ (over the unit sphere in $Z$).
The selection of the subspace $Z$ is known as the Dvoretzky-Rogers lemma.
We present a version with a particularly simple proof, where $\dim Z \approx n/\log n$.
(For our purposes, we would be satisfied with even much weaker estimates,
say $\dim Z \ge n^{\delta}$ for some fixed $\delta > 0$, but on the other hand, another proof
gives even dim Z = ~.)
14.6.2 Lemma (A version of the Dvoretzky-Rogers lemma). Let
$K \subset \mathbf{R}^n$ be a centrally symmetric convex body. Then there exist a linear
subspace $Z \subset \mathbf{R}^n$ of dimension $k = \lfloor n/\log_2 n \rfloor$, an orthonormal basis
$u_1, u_2, \ldots, u_k$ of $Z$, and a nonsingular linear transform $T$ of $\mathbf{R}^n$ such that
if we let $\tilde K = T(K) \cap Z$, then $\|x\|_{\tilde K} \le \|x\|$ for all $x \in Z$ and $\|u_i\|_{\tilde K} \ge \frac{1}{2}$ for
all $i = 1, 2, \ldots, k$.
Geometrically, the lemma asserts that $\tilde K$ is sandwiched between the unit ball
$B^k$ and a parallelotope $P$ as in the picture:
(The lemma claims that the points $2u_i$ are outside of $\tilde K$ or on its boundary,
and $P$ is obtained by separating these points from $\tilde K$ by hyperplanes.)
Proof. By John's lemma, we may assume $B^n \subseteq K \subseteq tB^n$, where $t = \sqrt{n}$.
Interestingly, the full power of John's lemma is not needed here; the same
proof works with, say, $t = n$ or $t = n^{10}$, only the bound for $k$ would become
worse by a constant factor.
Let $X_0 = \mathbf{R}^n$ and $K_0 = K$. Here is the main idea of the proof. The
current body $K_i$ is enclosed between an inner ball and an outer ball. Either
$K_i$ approaches the inner ball sufficiently closely at "many" places, and in
this case we can construct the desired $u_1, \ldots, u_k$, or it stays away from the
inner ball on a "large" subspace. In the latter case, we can restrict to that
subspace and inflate the inner ball. But since the outer ball remains the
same, the inflation of the inner ball cannot continue indefinitely. A precise
argument follows; for notational reasons, instead of inflating the inner ball,
we will shrink the body and the outer ball.
We consider the following condition:
(*) Each linear subspace $Y \subseteq X_0$ with $\dim(X_0) - \dim(Y) < k$
contains a vector $u$ with $\|u\| = 1$ and $\|u\|_{K_0} \ge \frac{1}{2}$.
This condition may or may not be satisfied. If it holds, we construct the
orthonormal basis $u_1, u_2, \ldots, u_k$ by an obvious induction. If it is not satisfied,
we obtain a subspace $X_1$ of dimension greater than $n - k$ such that $\|x\|_{K_0} \le
\frac{1}{2}\|x\|$ for all $x \in X_1$. Thus, $K_0 \cap X_1$ is twice "more spherical" than $K_0$. Setting
$K_1 = \frac{1}{2}(K_0 \cap X_1)$, we have
The parallelotope is no worse than the cube. From now on, we work
within the subspace $Z$ as in Lemma 14.6.2. For convenient notation, we
assume that $Z$ is all of $\mathbf{R}^n$ and $K$ is as $\tilde K$ in the above lemma, i.e., $B^n \subseteq K$
and $\|u_i\|_K \ge \frac{1}{2}$, $i = 1, 2, \ldots, n$, where $u_1, \ldots, u_n$ is an orthonormal basis of
$\mathbf{R}^n$. (Note that the reduction of the dimension from $n$ to $n/\log n$ is nearly
insignificant for the estimate of $n(k, \varepsilon)$ in Dvoretzky's theorem.)
The goal is to show that $\mathrm{med}(f_K) = \Omega(\sqrt{(\log n)/n})$, where $f_K$ is $\|\cdot\|_K$
restricted to $S^{n-1}$. Instead of estimating $\mathrm{med}(f_K)$, we bound the expectation
$\mathbf{E}[f_K]$. Since $f_K$ is 1-Lipschitz (we have $B^n \subseteq K$), the difference $|\mathrm{med}(f_K) -
\mathbf{E}[f_K]|$ is $O(n^{-1/2})$ by Proposition 14.3.3, which is negligible compared to
the lower bound we are heading for.
We have $\|\cdot\|_K \ge \|\cdot\|_P$, where $P$ is the parallelotope as in the illustration
to Lemma 14.6.2. So we actually bound $\mathbf{E}[f_P]$ from below.
First we show, by an averaging trick, that $\mathbf{E}[f_P] \ge \mathbf{E}[f_C]$, where $f_C(x) =
\frac{1}{2}\|x\|_\infty = \frac{1}{2}\max_i |x_i|$ is the norm induced by the cube $C$ of side 4. The idea
of the averaging is to consider, together with a point $x = \sum_{i=1}^{n} \alpha_i u_i \in S^{n-1}$,
the $2^n$ points of the form $\sum_{i=1}^{n} \sigma_i \alpha_i u_i$, where $\sigma \in \{-1, 1\}^n$ is a vector of
signs. For any measurable function $f_P\colon S^{n-1} \to \mathbf{R}$, we have
The following lemma with $v_i = \alpha_i u_i$ and $|\cdot| = \|\cdot\|_P$ implies that the integrand
on the left-hand side is always at least $2^n \max_i \|\alpha_i u_i\|_P \ge 2^n \cdot \frac{1}{2}\max_i |\alpha_i|$,
and so indeed $\mathbf{E}[f_P] \ge \mathbf{E}[f_C]$.
14.6.3 Lemma. Let $v_1, v_2, \ldots, v_n$ be arbitrary vectors in a normed space
with norm $|\cdot|$. Then
14.6.4 Lemma. For a suitable positive constant $c$ and for all $n$ we have
$$\mathbf{E}[f_C] = \frac{1}{2}\int_{S^{n-1}} \|x\|_\infty \, dP(x) \ge c\sqrt{\frac{\log n}{n}}.$$
Proof of Lemma 14.6.4. There are various proofs; a neat way is based
on the generally useful fact that the $n$-dimensional normal distribution is
spherically symmetric around the origin. We use probabilistic terminology.
Let $Z_1, Z_2, \ldots, Z_n$ be independent random variables, each of them with the
standard normal distribution $N(0, 1)$. As was mentioned in Section 14.1, the
random vector $Z = (Z_1, Z_2, \ldots, Z_n)$ has a spherically symmetric (Gaussian)
distribution, and consequently, the random variable $Z/\|Z\|$ is uniformly
distributed in $S^{n-1}$. Thus
$$\mathbf{E}[f_C] = \frac{1}{2}\,\mathbf{E}\!\left[\frac{\|Z\|_\infty}{\|Z\|}\right].$$
We show first, that we have IIZII ::; ffn with probability at least ~, and
second, that for a suitable constant Cl > 1, IIZlloo ;::: Cl Jlogn holds with
probability at least ~. It follows that both these events occur simultaneously
with probability at least ~, and so E[IC] ;::: cJlogn/n as claimed.
As for the Euclidean norm $\|Z\|$, we obtain $\mathbf{E}[\|Z\|^2] = n\,\mathbf{E}[Z_1^2] = n$,
since an $N(0, 1)$ random variable has variance 1. By Markov's inequality,
$\mathrm{Prob}[\|Z\| \ge \sqrt{3n}] = \mathrm{Prob}[\|Z\|^2 \ge 3\,\mathbf{E}[\|Z\|^2]] \le \frac{1}{3}$.
Further, by the independence of the $Z_i$ we have
Exercises
1. Prove Lemma 14.6.3.
2. (Large almost spherical sections of the crosspolytope) Use Theorem 14.4.5
and the method of the proof of Lemma 14.6.4 for proving that the
$n$-dimensional unit ball of the $\ell_1$-norm has a 2-almost spherical section of
dimension at least $cn$, for a suitable constant $c > 0$.
15
Embedding Finite Metric
Spaces into Normed Spaces
1 There are various measures of dissimilarity, and not all of them yield a metric,
but many do.
356 Chapter 15: Embedding Finite Metric Spaces into Normed Spaces
This sounds very good, and indeed it is too good to be generally true: It
is easy to find examples of small metric spaces that cannot be represented in
this way by a planar point set. One example is 4 points, each two of them
at distance 1; such points cannot be found in the plane. On the other hand,
they exist in 3-dimensional Euclidean space.
Perhaps less obviously, there are 4-point metric spaces that cannot be
represented (exactly) in any Euclidean space. Here are two examples:
[Figure: two 4-vertex graphs, a square (4-cycle) and a star with a center and three leaves.]
The metrics on these 4-point sets are given by the indicated graphs; that is,
the distance of two points is the number of edges of a shortest path connecting
them in the graph. For example, in the second picture, the center has distance
1 from the leaves, and the mutual distances of the leaves are 2.
So far we have considered isometric embeddings. A mapping $f\colon X \to Y$,
where $X$ is a metric space with a metric $\rho$ and $Y$ is a metric space with
a metric $\sigma$, is called an isometric embedding if it preserves distances, i.e.,
if $\sigma(f(x), f(y)) = \rho(x, y)$ for all $x, y \in X$. But in many applications we
need not insist on preserving the distance exactly; rather, we can allow some
distortion, say by 10%. A notion of an approximate embedding is captured
by the following definition.
15.1.1 Definition (D-embedding of metric spaces). A mapping $f\colon X \to
Y$, where $X$ is a metric space with a metric $\rho$ and $Y$ is a metric space with
a metric $\sigma$, is called a $D$-embedding, where $D \ge 1$ is a real number, if there
exists a number $r > 0$ such that for all $x, y \in X$,
$$r \cdot \rho(x, y) \le \sigma(f(x), f(y)) \le D \cdot r \cdot \rho(x, y).$$
denote the $\ell_p$-norm of $x$. Most of the time, we will consider the case $p = 2$,
i.e., the usual Euclidean norm $\|x\|_2 = \|x\|$. Another particularly important
case is $p = 1$, the $\ell_1$-norm (sometimes called the Manhattan distance). The
$\ell_\infty$-norm, or maximum norm, is given by $\|x\|_\infty = \max_i |x_i|$. It is the limit of
the $\ell_p$-norms as $p \to \infty$.
Let $\ell_p^d$ denote the space $\mathbf{R}^d$ equipped with the $\ell_p$-norm. In particular, we
write $\ell_2^d$ in order to stress that we mean $\mathbf{R}^d$ with the usual Euclidean norm.
Sometimes we are interested in embeddings into some space $\ell_p^d$, with $p$
given but without restrictions on the dimension $d$; for example, we can ask
whether there exists some Euclidean space into which a given metric space
embeds isometrically. Then it is convenient to speak about $\ell_p$, which is the
space of all infinite sequences $x = (x_1, x_2, \ldots)$ of real numbers with $\|x\|_p < \infty$,
where $\|x\|_p = \bigl(\sum_{i=1}^{\infty} |x_i|^p\bigr)^{1/p}$. In particular, $\ell_2$ is the (separable) Hilbert
space. The space $\ell_p$ contains each $\ell_p^d$ isometrically, and it can be shown that
any finite metric space isometrically embeddable into $\ell_p$ can be isometrically
embedded into $\ell_p^d$ for some $d$. (In fact, every $n$-point subspace of $\ell_p$ can be
isometrically embedded into $\ell_p^d$ with $d \le \binom{n}{2}$; see Exercise 15.5.2.)
Although the spaces $\ell_p$ are interesting mathematical objects, we will not
really study them; we only use embeddability into $\ell_p$ as a convenient
shorthand for embeddability into $\ell_p^d$ for some $d$.
subset of a Euclidean space (or a Banach space) and being local
homeomorphisms. These mappings are called quasi-isometries (the
definition of a quasi-isometry is slightly more general, though), and the
main question is how close to an isometry such a mapping has to be,
in terms of the dimension and $\varepsilon$; see Benyamini and Lindenstrauss
[BL99], Chapters 14 and 15, for an introduction.
Exercises
1. Consider the two 4-point examples presented above (the square and the
star); prove that they cannot be isometrically embedded into £~. ~ Can
you determine the minimum necessary distortion for embedding into £~?
2. (a) Prove that a bijective mapping $f$ between metric spaces is a
$D$-embedding if and only if $\|f\|_{\mathrm{Lip}} \cdot \|f^{-1}\|_{\mathrm{Lip}} \le D$.
(b) Let $(X, \rho)$ be a metric space, $|X| \ge 3$. Prove that the distortion
of an embedding $f\colon X \to Y$, where $(Y, \sigma)$ is a metric space, equals the
supremum of the factors by which $f$ "spoils" the ratios of distances; that
is,
This result shows that any metric question about $n$ points in $\ell_2$ can
be considered for points in $\ell_2^{O(\log n)}$, if we do not mind a distortion of the
distances by at most 10%, say. For example, to represent $n$ points of $\ell_2$ in
a computer, we need to store $n^2$ numbers. To store all of their distances, we
need about $n^2$ numbers as well. But by the flattening lemma, we can store
only $O(n \log n)$ numbers and still reconstruct any of the $n^2$ distances with
error at most 10%.
15.2 The Johnson-Lindenstrauss Flattening Lemma 359
Various proofs of the flattening lemma, including the one below, provide
efficient randomized algorithms that find the almost isometric embedding
into $\ell_2^k$ quickly. Numerous algorithmic applications have recently been found:
in fast clustering of high-dimensional point sets, in approximate searching
for nearest neighbors, in approximate multiplication of matrices, and also in
purely graph-theoretic problems, such as approximating the bandwidth of a
graph or multicommodity flows.
Let us set $t = \sqrt{k/5n}$. Since $k \ge 10 \ln n$, we have $2e^{-t^2 n/2} \le \frac{2}{n}$, and from
the above inequality we calculate $m \ge \sqrt{(k-2)/n} - t \ge \frac{1}{2}\sqrt{k/n}$.
Let us remark that a more careful calculation shows that $m = \sqrt{k/n} +
O(1/\sqrt{n})$ for all $k$. □
$$(1 - \tfrac{\varepsilon}{3})\,m\,\|x - y\| \le \|p(x) - p(y)\| \le (1 + \tfrac{\varepsilon}{3})\,m\,\|x - y\| \qquad (15.1)$$
is violated with probability at most $n^{-2}$. Since there are fewer than $n^2$ pairs of
distinct $x, y \in X$, there exists some $L$ such that (15.1) holds for all $x, y \in X$.
In such a case, the mapping $p$ is a $D$-embedding of $X$ into $\ell_2^k$ with $D \le
\frac{1+\varepsilon/3}{1-\varepsilon/3} < 1+\varepsilon$ (for $\varepsilon \le 1$).
Let $x$ and $y$ be fixed. First we reformulate the condition (15.1). Let $u =
x - y$; since $p$ is a linear mapping, we have $p(x) - p(y) = p(u)$, and (15.1) can
be rewritten as $(1 - \frac{\varepsilon}{3})\,m\,\|u\| \le \|p(u)\| \le (1 + \frac{\varepsilon}{3})\,m\,\|u\|$. This is invariant under
scaling, and so we may suppose that $\|u\| = 1$. The condition thus becomes
$$\bigl|\,\|p(u)\| - m\,\bigr| \le \tfrac{\varepsilon}{3}\,m. \qquad (15.2)$$
By Lemma 15.2.2 and the remark following it, the probability of violating
(15.2), for $u$ fixed and $L$ random, is at most
$(\langle b_1, x\rangle, \ldots, \langle b_k, x\rangle)$. Using suitable concentration results, one can verify that
$p$ is a $(1+\varepsilon)$-embedding with probability close to 1. The procedure of picking
the $b_i$ is computationally much simpler.
Another way is to choose each component of each $b_i$ from the normal
distribution $N(0, 1)$, all the $nk$ choices of the components being independent.
The distribution of each $b_i$ in $\mathbf{R}^n$ is rotationally symmetric (as was mentioned
in Section 14.1). Therefore, for every fixed $u \in S^{n-1}$, the scalar product $\langle b_i, u\rangle$
also has the normal distribution $N(0, 1)$ and $\|p(u)\|^2$, the squared length of
the image, has the distribution of $\sum_{i=1}^{k} Z_i^2$, where the $Z_i$ are independent
$N(0, 1)$. This is the well known Chi-Square distribution with $k$ degrees of
freedom, and a strong concentration result analogous to Lemma 15.2.2 can
be found in books on probability theory (or derived from general
measure-concentration results for the Gaussian measure or from Chernoff-type tail
estimates). A still different method, particularly easy to implement but with
a more difficult proof, uses independent random vectors $b_i \in \{-1, 1\}^n$.
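The Gaussian variant can be sketched in a few lines. The code below is an illustration only, written with the Python standard library: it draws $k$ vectors $b_i$ with independent $N(0,1)$ components, maps each point $x$ to $(\langle b_1, x\rangle, \ldots, \langle b_k, x\rangle)/\sqrt{k}$, and reports the smallest and largest ratio of image distance to original distance. The scaling by $1/\sqrt{k}$, the function names, the seed, and the sizes $n = 500$, $k = 200$ are assumptions made for the example, not values from the text.

```python
import math
import random

def gaussian_projection(points, k, rng):
    """Map each x in R^n to (<b_1, x>, ..., <b_k, x>) / sqrt(k),
    where the b_i have independent N(0, 1) components."""
    n = len(points[0])
    b = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    scale = 1.0 / math.sqrt(k)
    return [[scale * sum(bi[j] * x[j] for j in range(n)) for bi in b] for x in points]

def dist(x, y):
    """Euclidean distance."""
    return math.sqrt(sum((a - c) ** 2 for a, c in zip(x, y)))

def distortion_range(points, images):
    """Smallest and largest ratio (image distance) / (original distance) over all pairs."""
    ratios = [dist(images[i], images[j]) / dist(points[i], points[j])
              for i in range(len(points)) for j in range(i + 1, len(points))]
    return min(ratios), max(ratios)

rng = random.Random(0)                       # fixed seed, for reproducibility only
points = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(15)]
images = gaussian_projection(points, 200, rng)
ratio_min, ratio_max = distortion_range(points, images)
```

With these sizes all pairwise ratios should stay close to 1; the exact values depend on the random draw.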
Exercises
1. Let $x, y \in S^{n-1}$ be two points chosen independently and uniformly at
random. Estimate their expected (Euclidean) distance, assuming that $n$
is large.
2. Let $L \subseteq \mathbf{R}^n$ be a fixed $k$-dimensional linear subspace and let $x$ be a
random point of $S^{n-1}$. Estimate the expected distance of $x$ from $L$,
assuming that $n$ is large.
3. (Lower bound for the flattening lemma)
(a) Consider the $n+1$ points $0, e_1, e_2, \ldots, e_n \in \mathbf{R}^n$ (where the $e_i$ are the
vectors of the standard orthonormal basis). Check that if these points
with their Euclidean distances are $(1+\varepsilon)$-embedded into $\ell_2^k$, then there
exist unit vectors $v_1, v_2, \ldots, v_n \in \mathbf{R}^k$ with $|\langle v_i, v_j\rangle| \le 100\varepsilon$ for all $i \ne j$
(the constant can be improved).
(b) Let $A$ be an $n \times n$ symmetric real matrix with $a_{ii} = 1$ for all $i$ and
$|a_{ij}| \le n^{-1/2}$ for all $i, j$, $i \ne j$. Prove that $A$ has rank at least $\frac{n}{2}$.
(c) Let $A$ be an $n \times n$ real matrix of rank $d$, let $k$ be a positive integer,
and let $B$ be the $n \times n$ matrix with $b_{ij} = a_{ij}^k$. Prove that the rank of $B$ is
at most $\binom{k+d}{k}$.
(d) Using (a)-(c), prove that if the set as in (a) is $(1+\varepsilon)$-embedded into
$\ell_2^k$, where $100 n^{-1/2} \le \varepsilon \le \frac{1}{2}$, then
$$k = \Omega\!\left(\frac{1}{\varepsilon^2 \log\frac{1}{\varepsilon}}\,\log n\right).$$
This proof is due to Alon (unpublished manuscript, Tel Aviv University).
2 To see this, divide the vertices of $G$ into two classes $A$ and $B$ arbitrarily, and
while there is a vertex in one of the classes having more neighbors in its class
than in the other class, move such a vertex to the other class; the number of
edges between $A$ and $B$ increases in each step. For another proof, assign each
vertex randomly to $A$ or $B$ and check that the expected number of edges between
$A$ and $B$ is $\frac{1}{2}|E(G)|$.
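The first argument in the footnote is a local search procedure and can be written down directly. The following is a minimal sketch for simple graphs given as edge lists on vertices $0, \ldots, n-1$; the function names and the two test graphs are illustrative assumptions.

```python
def local_search_cut(n, edges):
    """Start from an arbitrary partition and move a vertex to the other class
    whenever it has more neighbors in its own class than in the other one."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    side = [0] * n                    # arbitrary start: every vertex in class A
    improved = True
    while improved:
        improved = False
        for v in range(n):
            same = sum(1 for w in adj[v] if side[w] == side[v])
            if same > len(adj[v]) - same:
                side[v] = 1 - side[v]  # the number of A-B edges strictly increases
                improved = True
    return side

def cut_size(edges, side):
    """Number of edges with endpoints in different classes."""
    return sum(1 for u, v in edges if side[u] != side[v])

# Two small examples: the complete graph K_5 and the 5-cycle.
k5 = [(i, j) for i in range(5) for j in range(i + 1, 5)]
c5 = [(i, (i + 1) % 5) for i in range(5)]
k5_cut = cut_size(k5, local_search_cut(5, k5))
c5_cut = cut_size(c5, local_search_cut(5, c5))
```

The loop terminates because each move strictly increases the number of edges between the classes, and at termination every vertex has at least half of its neighbors in the other class, so the cut contains at least half of all edges.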
Of course, for odd $\ell$ we obtain an $\Omega(n^{1+1/(\ell-2)})$ bound by using the lemma
for $\ell - 1$.
Proof. First we note that we may assume $n \ge 4^{\ell-1} \ge 16$, for otherwise, the
bound in the lemma is verified by a path, say.
We consider the random graph $G(n,p)$ with $n$ vertices, where each of the
$\binom{n}{2}$ possible edges is present with probability $p$, $0 < p < 1$, and these choices
are mutually independent. The value of $p$ is going to be chosen later.
Let $E$ be the set of edges of $G(n,p)$ and let $F \subseteq E$ be the edges contained
in cycles of length $\ell$ or shorter. By deleting all edges of $F$ from $G(n,p)$, we
obtain a graph with no cycles of length $\ell$ or shorter. If we manage to show,
for some $m$, that the expectation $\mathbf{E}[|E \setminus F|]$ is at least $m$, then there is an
instance of $G(n,p)$ with $|E \setminus F| \ge m$, and so there exists a graph with $n$
vertices, $m$ edges, and of girth greater than $\ell$.
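The deletion step can be simulated directly. The sketch below is an illustration, not the proof: it samples $G(n,p)$, removes every edge lying on a cycle of length at most $\ell$, and computes the girth of what remains. An edge $\{u,v\}$ lies on such a cycle exactly when $u$ and $v$ are joined by a path of length at most $\ell - 1$ that avoids the edge itself, which is what the breadth-first search tests. The parameter values, the seed, and all function names are assumptions made for the example.

```python
import random
from collections import deque
from itertools import combinations

def random_graph(n, p, rng):
    """Sample G(n, p): each of the C(n, 2) pairs is an edge independently with probability p."""
    return [e for e in combinations(range(n), 2) if rng.random() < p]

def adjacency(n, edges):
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def distance_avoiding_edge(adj, u, v, limit):
    """BFS distance from u to v that ignores the edge {u, v}; None if it exceeds limit."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        if dist[x] == limit:
            continue
        for y in adj[x]:
            if (x == u and y == v) or y in dist:
                continue
            dist[y] = dist[x] + 1
            if y == v:
                return dist[y]
            queue.append(y)
    return None

def delete_short_cycle_edges(n, edges, ell):
    """Keep only the edges of E \\ F, where F is the set of edges on cycles of length <= ell."""
    adj = adjacency(n, edges)
    return [(u, v) for u, v in edges
            if distance_avoiding_edge(adj, u, v, ell - 1) is None]

def girth(n, edges):
    """Length of a shortest cycle (infinity for a forest)."""
    adj = adjacency(n, edges)
    best = float("inf")
    for u, v in edges:
        d = distance_avoiding_edge(adj, u, v, n)
        if d is not None:
            best = min(best, d + 1)
    return best

rng = random.Random(1)          # fixed seed, for reproducibility only
n, ell, p = 60, 4, 0.04         # illustrative parameters
edges = random_graph(n, p, rng)
kept = delete_short_cycle_edges(n, edges, ell)
```

Whatever the random sample, the surviving graph has girth greater than $\ell$; how many edges survive depends on $p$, which is the quantity the proof goes on to optimize.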
We have $\mathbf{E}[|E|] = \binom{n}{2}p$. What is the probability that a fixed pair $e =
\{u, v\}$ of vertices is an edge of $F$? First, $e$ must be an edge of $G(n,p)$, which
has probability $p$, and second, there must be a path of length between 2 and
$\ell - 1$ connecting $u$ and $v$. The probability that all the edges of a given potential
path of length $k$ are present is $p^k$, and there are fewer than $n^{k-1}$ possible
paths from $u$ to $v$ of length $k$. Therefore, the probability of $e \in F$ is at most
$\sum_{k=2}^{\ell-1} p^{k+1} n^{k-1}$, which can be bounded by $2p^{\ell} n^{\ell-2}$, provided that $np \ge 2$.
Then $\mathbf{E}[|F|] \le \binom{n}{2} \cdot 2p^{\ell} n^{\ell-2}$, and by the linearity of expectation, we have
There are several ways of proving a lower bound for $m(\ell, n)$ similar to that
in Lemma 15.3.2, i.e., roughly $n^{1+1/\ell}$; one of the alternatives is indicated in
Exercise 1 below. But obtaining a significantly better bound in an elementary
way and improving on the best known bounds (of roughly $n^{1+4/3\ell}$) remain
challenging open problems.
We now use the knowledge about graphs without short cycles in lower
bounds for distortion.
15.3.3 Proposition (Distortion versus dimension). Let $Z$ be a
$d$-dimensional normed space, such as some $\ell_p^d$, and suppose that all $n$-point metric
spaces can be $D$-embedded into $Z$. Let $\ell$ be an integer with $D < \ell \le 5D$ (it
is essential that $\ell$ be strictly larger than $D$, while the upper bound is only
for technical convenience). Then
15.3 Lower Bounds By Counting 365
$$d \ge \frac{1}{\log_2 \frac{16 D \ell}{\ell - D}} \cdot \frac{m(\ell, n)}{n}.$$
Proof. Let $G$ be a graph with vertex set $V = \{v_1, v_2, \ldots, v_n\}$ and with
$m = m(\ell, n)$ edges. Let $\mathcal{G}$ denote the set of all subgraphs $H \subseteq G$ obtained
from $G$ by deleting some edges (but retaining all vertices). For each $H \in \mathcal{G}$,
we define a metric $\rho_H$ on the set $V$ by $\rho_H(u, v) = \min(\ell, d_H(u, v))$, where
$d_H(u, v)$ is the length of a shortest path connecting $u$ and $v$ in $H$.
The idea of the proof is that $\mathcal{G}$ contains many essentially different metric
spaces, and if the dimension of $Z$ were small, then there would not be
sufficiently many essentially different placements of $n$ points in $Z$.
Suppose that for every $H \in \mathcal{G}$ there exists a $D$-embedding $f_H\colon (V, \rho_H) \to
Z$. By rescaling, we make sure that $\frac{1}{D}\rho_H(u, v) \le \|f_H(u) - f_H(v)\|_Z \le
\rho_H(u, v)$ for all $u, v \in V$. We may also assume that the images of all points
are contained in the $\ell$-ball $B_Z(0, \ell) = \{x \in Z : \|x\|_Z \le \ell\}$.
Set $\beta = \frac{1}{4}\bigl(\frac{\ell}{D} - 1\bigr)$. We have $0 < \beta \le 1$. Let $N$ be a $\beta$-net in $B_Z(0, \ell)$. The
notion of $\beta$-net was defined above Lemma 13.1.1, and that lemma showed that
a $\beta$-net in the $(d-1)$-dimensional Euclidean sphere has cardinality at most
$\bigl(\frac{4}{\beta}\bigr)^d$. Exactly the same volume argument proves that in our case $|N| \le \bigl(\frac{4\ell}{\beta}\bigr)^d$.
For every $H \in \mathcal{G}$, we define a new mapping $g_H\colon V \to N$ by letting $g_H(v)$
be the nearest point to $f_H(v)$ in $N$ (ties resolved arbitrarily). We prove that
for distinct $H_1, H_2 \in \mathcal{G}$, the mappings $g_{H_1}$ and $g_{H_2}$ are distinct.
The edge sets of $H_1$ and $H_2$ differ, so we can choose a pair $u, v$ of vertices
that form an edge in one of them, say in $H_1$, and not in the other one ($H_2$).
We have $\rho_{H_1}(u, v) = 1$, while $\rho_{H_2}(u, v) = \ell$, for otherwise, a $u$-$v$ path in $H_2$
of length smaller than $\ell$ and the edge $\{u, v\}$ would induce a cycle of length
at most $\ell$ in $G$. Thus
$$\|g_{H_1}(u) - g_{H_1}(v)\|_Z < \|f_{H_1}(u) - f_{H_1}(v)\|_Z + 2\beta \le 1 + 2\beta$$
and
$$\|g_{H_2}(u) - g_{H_2}(v)\|_Z > \|f_{H_2}(u) - f_{H_2}(v)\|_Z - 2\beta \ge \frac{\ell}{D} - 2\beta = 1 + 2\beta.$$
attaining these bounds (they are called Moore graphs for odd girth
and generalized polygon graphs for even girth) are known to exist only
in very few cases (see, e.g., Biggs [Big93] for a nice exposition). Alon,
Hoory, and Linial [AHL01] proved by a neat argument using random
walks that the same formulas still bound the number of vertices from
below if $\delta$ is the average degree (rather than minimum degree) of the
graph. But none of this helps improve the bound on $m(\ell, n)$ by any
substantial amount.
The proof of Lemma 15.3.2 is a variation on well known proofs by
Erdős.
The constructions mentioned in the text attaining the asymptotically
optimal value of $m(\ell, n)$ for several small $\ell$ are due to Benson
[Ben66] (constructions with similar properties appeared earlier in Tits
[Tit59], where they were investigated for different reasons). As for the
other $\ell$, graphs with the parameters given in the text were constructed
by Lazebnik, Ustimenko, and Woldar [LUW95], [LUW96] by algebraic
methods, improving on earlier bounds (such as those in Lubotzky,
Phillips, Sarnak [LPS88]; also see the notes to Section 15.5).
Proposition 15.3.5 and the basic idea of Proposition 15.3.3 were
invented by Bourgain [Bou85]. The explicit use of graphs without
short cycles and the detection of the "thresholds" in the behavior
of the dimension as a function of the distortion appeared in Matoušek
[Mat96b].
Proposition 15.3.3 implies that a normed space that should accom-
modate all n-point metric spaces with a given small distortion must
have large dimension. But what if we consider just one n-point metric
space M, and we ask for the minimum dimension of a normed space Z
such that M can be D-embedded into Z? Here Z can be "customized"
to M, and the counting argument as in the proof of Proposition 15.3.3
cannot work. By a nice different method, using the rank of certain
matrices, Arias-de-Reyna and Rodríguez-Piazza [AR92] proved that
for each $D < 2$, there are $n$-point metric spaces that do not $D$-embed
into any normed space of dimension below $c(D)n$, for some $c(D) > 0$.
In [Mat96b] their technique was extended, and it was shown that for
any $D > 1$, the required dimension is at least $c(\lfloor D \rfloor)\,n^{1/(2\lfloor D \rfloor)}$, so for a
fixed $D$ it is at least a fixed power of $n$. The proof again uses graphs
without short cycles. An interesting open problem is whether the pos-
sibility of selecting the norm in dependence on the metric can ever
help substantially. For example, we know that if we want one normed
space for all n-point metric spaces, then a linear dimension is needed
for all distortions below 3. But the lower bounds in [AR92], [Mat96b]
for a customized normed space force linear dimension only for distor-
tion D < 2. Can every n-point metric space M be 2.99-embedded, say,
into some normed space Z = Z(M) of dimension o(n)?
368 Chapter 15: Embedding Finite Metric Spaces into Normed Spaces
Exercises
1. (Erdős-Sachs construction) This exercise indicates an elegant proof, by
Erdős and Sachs [ES63], of the existence of graphs without short cycles
whose number of edges is not much smaller than in Lemma 15.3.2 and
that are regular. Let $\ell \ge 3$ and $\delta \ge 3$.
(a) (Starting graph) For all $\delta$ and $\ell$, construct a finite $\delta$-regular graph
$G(\delta, \ell)$ with no cycles of length $\ell$ or shorter; the number of vertices does
not matter. One possibility is by double induction: Construct $G(\delta+1, \ell)$
using $G(\delta, \ell)$ and $G(\delta', \ell-1)$ with a suitable $\delta'$.
(b) Let G be a $\delta$-regular graph of girth at least $\ell+1$ and let u and v be
two vertices of G at distance at least $\ell+2$. Delete them together with
their incident edges, and connect their neighbors by a matching:
15.4 A Lower Bound for the Hamming Cube 369
[Figure: u and v are deleted and their former neighbors are joined by a matching.]
Check that the resulting graph still does not contain any cycle of length
at most $\ell$.
(c) Show that starting with a graph as in (a) and reducing it by the
operations as in (b), we arrive at a $\delta$-regular graph of girth $\ell+1$ and with
at most $1 + \delta + \delta(\delta-1) + \cdots + \delta(\delta-1)^{\ell}$ vertices. What is the resulting
asymptotic lower bound for $m(\ell, n)$, with $\ell$ fixed and $n \to \infty$?
2. (Sparse spanners) Let G be a graph with n vertices and with positive real
weights on edges, which represent the edge lengths. A subgraph H of G is
called a t-spanner of G if the distance of any two vertices u, v in H is no
more than t times their distance in G (both the distances are measured
in the shortest-path metric). Using Lemma 15.3.1, prove that for every
G and every integer $t \ge 2$, there exists a t-spanner with $O(n^{1+1/\lfloor t/2\rfloor})$
edges.
3. Let $G_n$ denote the graph arising from $K_5$, the complete graph on 5 vertices,
by subdividing each edge n-1 times; that is, every two of the original
vertices of $K_5$ are connected by a path of length n. Prove that the
vertex set of $G_n$, considered as a metric space with the graph-theoretic
distance, cannot be embedded into the plane with distortion smaller than
const · n.
4. (Another lower bound for the flattening lemma)
(a) Given $\varepsilon \in (0, \tfrac12)$ and n sufficiently large in terms of $\varepsilon$, construct a
collection $\mathcal{V}$ of ordered n-tuples of points of $\ell_2^n$ such that the distance of
every two points in each $V \in \mathcal{V}$ is between two suitable constants, no two
$V \ne V' \in \mathcal{V}$ can have the same $(1+\varepsilon)$-embedding (that is, there are i, j
such that the distances between the ith point and the jth point in V and
in V' differ by a factor of at least $1+\varepsilon$), and $\log|\mathcal{V}| = \Omega(\varepsilon^{-2} n \log n)$.
(b) Use (a) and the method of this section to prove a lower bound of
$\Omega\bigl(\frac{1}{\varepsilon^2\log(1/\varepsilon)}\log n\bigr)$ for the dimension in the Johnson-Lindenstrauss
flattening lemma.
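As an illustration of the spanner notion in Exercise 2, here is a minimal Python sketch of the well-known greedy spanner construction. It is not taken from the text and is only one possible route to the exercise; the function names `shortest_path_length` and `greedy_spanner` are hypothetical. Edges are scanned by increasing length, and an edge is kept only if the spanner built so far does not already connect its endpoints within t times its length. The kept edges then contain no cycle with at most t+1 edges, which is the kind of graph whose edge count the exercise's hint bounds.

```python
import heapq


def shortest_path_length(adj, source, target, cutoff):
    """Dijkstra from source in the graph given by adjacency lists.

    Returns the distance to target if it is at most cutoff; otherwise
    returns some value larger than cutoff (possibly infinity).
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == target:
            return d
        if d > cutoff:
            break  # every remaining vertex is farther than cutoff
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return float("inf")


def greedy_spanner(n, edges, t):
    """Return the edges of a t-spanner of the graph on vertices 0..n-1.

    edges is a list of (u, v, length) triples with positive lengths.
    """
    adj = [[] for _ in range(n)]
    kept = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        # Keep the edge only if the current spanner stretches it by more than t.
        if shortest_path_length(adj, u, v, t * w) > t * w:
            adj[u].append((v, w))
            adj[v].append((u, w))
            kept.append((u, v, w))
    return kept
```

Every edge of G is either kept or already stretched by at most a factor t, so the same holds for all shortest paths; this is why the result is a t-spanner.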
method and exhibit metric spaces with $\Omega(\log n)$ lower bound, which turns
out to be optimal. We recall that $C_m$ denotes the space $\{0,1\}^m$ with the
Hamming (or $\ell_1$) metric, where the distance of two 0/1 sequences is the
number of places where they differ.
[Figure: the sets of pairs E and F in the cube.]
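15.4.2 Lemma (Short diagonals lemma). Let $x_1, x_2, x_3, x_4$ be arbitrary
points in a Euclidean space. Then
$\|x_1 - x_3\|^2 + \|x_2 - x_4\|^2 \le \|x_1 - x_2\|^2 + \|x_2 - x_3\|^2 + \|x_3 - x_4\|^2 + \|x_4 - x_1\|^2.$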
Proof. Four points can be assumed to lie in $R^3$, so one could start some
stereometric calculations. But a better way is to observe that it suffices to
prove the lemma for points on the real line! Indeed, for the $x_i$ in some $R^d$ we
can write the 1-dimensional inequality for each coordinate and then add these
inequalities together. (This is the reason for using squares in the definition
of the ratio $R(\sigma)$: Squares of Euclidean distances split into the contributions
of individual coordinates, and so they are easier to handle than the distances
themselves.)
If the $x_i$ are real numbers, we calculate
$(x_1 - x_2)^2 + (x_2 - x_3)^2 + (x_3 - x_4)^2 + (x_4 - x_1)^2 - (x_1 - x_3)^2 - (x_2 - x_4)^2$
$= (x_1 - x_2 + x_3 - x_4)^2 \ge 0,$
and this is the desired inequality.
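The coordinatewise argument in this proof can be checked numerically. The following Python sketch (the function names are hypothetical, chosen for illustration) computes the sum of the squared sides minus the sum of the squared diagonals of a quadrilateral in $R^d$, and compares it with the sum over coordinates of the square $(x_1 - x_2 + x_3 - x_4)^2$ from the one-dimensional calculation.

```python
def sq_dist(p, q):
    """Squared Euclidean distance of two points given as coordinate tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def short_diagonals_gap(x1, x2, x3, x4):
    """Squared sides minus squared diagonals; the lemma says this is >= 0."""
    sides = sq_dist(x1, x2) + sq_dist(x2, x3) + sq_dist(x3, x4) + sq_dist(x4, x1)
    diagonals = sq_dist(x1, x3) + sq_dist(x2, x4)
    return sides - diagonals


def coordinatewise_gap(x1, x2, x3, x4):
    """Sum over coordinates of (x1 - x2 + x3 - x4)^2."""
    return sum((a - b + c - d) ** 2 for a, b, c, d in zip(x1, x2, x3, x4))
```

For a parallelogram, where $x_1 - x_2 + x_3 - x_4 = 0$, both quantities vanish, so the inequality of the lemma is tight there.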
For $u \in \{0,1\}^{m-1}$, we consider the quadrilateral with vertices $u0$, $\bar u0$, $\bar u1$, $u1$,
where $\bar u$ denotes the complement of u; for u = 00, it is indicated in the picture:
[Figure: the cube with the vertices 000, 001, 110, 111 marked, showing the quadrilateral for u = 00.]
Its sides are two edges of $E_{01}$, one diagonal from $F_0$ and one from $F_1$, and
its diagonals are from F. If we write the inequality of Lemma 15.4.2 for this
quadrilateral and sum up over all such quadrilaterals (there are $2^{m-2}$ of them, since
u and $\bar u$ yield the same quadrilateral), we get
$\sigma^2(F) \le \sigma^2(E_{01}) + \sigma^2(F_0) + \sigma^2(F_1).$
By the inductive assumption for the two subcubes, the right-hand side is at
most $\sigma^2(E_{01}) + \sigma^2(E_0) + \sigma^2(E_1) = \sigma^2(E)$. □
Exercises
1. Consider the second graph in the introductory section, the star with 3
leaves, and prove a lower bound of $2/\sqrt{3}$ for the distortion required to
embed into a Euclidean space. Follow the method used for the 4-cycle.
2. (Planar graphs badly embeddable into $\ell_2$) Let $G_0, G_1, \ldots$ be the following
graphs:
[Figure: the graphs $G_0, G_1, \ldots$]
3. (Almost Euclidean subspaces) Prove that for every k and $\varepsilon > 0$ there
exists $n = n(k, \varepsilon)$ such that every n-point metric space $(X, \rho)$ contains a
k-point subspace that is $(1+\varepsilon)$-embeddable into $\ell_2$. Use Ramsey's theorem.
This result is due to Bourgain, Figiel, and Milman [BFM86]; it is a kind
of analogue of Dvoretzky's theorem for metric spaces.
$(L_G)_{uv} = \deg_G(u)$ for $u = v$,
$(L_G)_{uv} = -1$ if $u \ne v$ and $\{u,v\} \in E(G)$,
$(L_G)_{uv} = 0$ otherwise.
This proves (15.3), and we can also see that $x = b_2$ yields equality in (15.3).
So we can write $\mu_2 = \min\{x^T L_G x : \|x\| = 1, \sum_{v\in V} x_v = 0\}$ (this is a special
case of the variational definition of eigenvalues discussed in many textbooks
of linear algebra).
Now, we are ready to prove the main result of this section.
Proof. We again consider the ratios $R_{E,F}(\rho)$ and $R_{E,F}(\sigma)$ as in the proof
for the cube (Theorem 15.4.1). This time we let E be the edge set of G, and
$F = \binom{V}{2}$ is the set of all pairs of distinct vertices. In the graph metric all pairs in E
have distance 1, while most pairs in F have distance about log n, as we will
check below. On the other hand, it turns out that in any embedding into $\ell_2$
such that all the distances in E are at most 1, a typical distance in F is only
O(1). The calculations follow.
We have $\rho^2(E) = |E| = \frac{nr}{2}$. To bound $\rho^2(F)$ from below, we observe that
for each vertex $v_0$, there are at most $1 + r + r(r-1) + \cdots + r(r-1)^{k-1} \le r^k + 1$
vertices at distance at most k from $v_0$. So for $k = \log_r \frac{n-1}{2}$, at least half of
the pairs in F have distance more than k, and we obtain $\rho^2(F) = \Omega(n^2 k^2) = \Omega(n^2 \log^2 n)$. Thus
$R_{E,F}(\rho) = \Omega(\sqrt{n} \cdot \log n).$
Let $f: V \to \ell_2^d$ be an embedding into a Euclidean space, and let $\sigma$ be the
metric induced by it on V. To prove the theorem, it suffices to show that
$R_{E,F}(\sigma) = O(\sqrt{n})$; that is,
$\sigma^2(F) = O(n) \cdot \sigma^2(E).$
By the observation in the proof of Lemma 15.4.2 about splitting into coordinates,
it is enough to prove this inequality for a one-dimensional embedding.
So for every choice of real numbers $(x_v)_{v \in V}$, we want to show that
$\sum_{\{u,v\}\in F} (x_u - x_v)^2 = O(n) \sum_{\{u,v\}\in E} (x_u - x_v)^2.$ (15.4)
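By adding a suitable number to all the $x_v$, we may assume that $\sum_{v\in V} x_v = 0$.
This does not change anything in (15.4), but it allows us to relate both sides
to the Euclidean norm of the vector x.
We calculate, using $\sum_{v\in V} x_v = 0$,
$\sum_{\{u,v\}\in F} (x_u - x_v)^2 = n \sum_{v\in V} x_v^2 - \Bigl(\sum_{v\in V} x_v\Bigr)^2 = n\|x\|^2 \le \frac{n}{\mu_2}\, x^T L_G x = \frac{n}{\mu_2} \sum_{\{u,v\}\in E} (x_u - x_v)^2,$
the last inequality being (15.3). This establishes (15.4) and concludes the
proof of Theorem 15.5.1. □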
The proof actually shows that the maximum of $R_{E,F}(\sigma)$ over all Euclidean
metrics $\sigma$ equals $\sqrt{n/\mu_2}$ (which is an interesting geometric interpretation of
$\mu_2$). The maximum is attained for the $\sigma$ induced by the mapping $V \to R$
specified by $b_2$, the eigenvector belonging to $\mu_2$.
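The role of $\mu_2$ in this argument can be illustrated on a small example. The sketch below is an assumption-laden illustration, not part of the text: it uses the cycle $C_n$ (a 2-regular graph, chosen only because its Laplacian eigenvalues have the well-known closed form $2 - 2\cos(2\pi k/n)$), and the helper names are hypothetical. It evaluates the ratio of the two sums in (15.4) for an eigenvector $b_2$ and for a random vector with coordinate sum 0, so that it can be compared with $n/\mu_2$.

```python
import math
import random
from itertools import combinations


def cycle_edges(n):
    """Edge list of the cycle C_n on the vertices 0..n-1."""
    return [(i, (i + 1) % n) for i in range(n)]


def sum_over_pairs(pairs, x):
    """Sum of (x_u - x_v)^2 over the given vertex pairs."""
    return sum((x[u] - x[v]) ** 2 for u, v in pairs)


def centered(x):
    """Shift x so that its coordinates sum to zero."""
    mean = sum(x) / len(x)
    return [value - mean for value in x]


n = 12
E = cycle_edges(n)
F = list(combinations(range(n), 2))
# Closed form for the second smallest Laplacian eigenvalue of C_n.
mu2 = 2.0 - 2.0 * math.cos(2.0 * math.pi / n)
# A corresponding eigenvector; its coordinates sum to zero.
b2 = [math.cos(2.0 * math.pi * v / n) for v in range(n)]

ratio_b2 = sum_over_pairs(F, b2) / sum_over_pairs(E, b2)

rng = random.Random(0)
x = centered([rng.gauss(0.0, 1.0) for _ in range(n)])
ratio_random = sum_over_pairs(F, x) / sum_over_pairs(E, x)
```

Here `ratio_b2` agrees with `n / mu2` up to rounding, while `ratio_random` does not exceed it, in line with the remark that the maximum is attained by the eigenvector.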
The cone of squared $\ell_2$-metrics and universality of the lower-bound
method. For the Hamming cubes, we obtained the exact minimum distortion
required for a Euclidean embedding. This was due to the lucky choice of
the sets E and F of point pairs. As we will see below, a "lucky" choice, leading
to an exact bound, exists for every finite metric space if we allow for sets of
weighted pairs. Let $(V, \rho)$ be a finite metric space and let $\eta, \varphi: \binom{V}{2} \to [0, \infty)$
be weight functions. We define
$\rho^2(\eta) = \sum_{\{u,v\}\in\binom{V}{2}} \eta(u,v)\, \rho(u,v)^2$
R'1,<p(p) =
15.5.2 Proposition. Let $(V, \rho)$ be a finite metric space and let $D \ge 1$ be the
smallest number such that $(V, \rho)$ can be D-embedded into $\ell_2$. Then there are
weight functions $\eta, \varphi: \binom{V}{2} \to [0, \infty)$ such that $R_{\eta,\varphi}(\rho) \ge D$ and $R_{\eta,\varphi}(\sigma) \le 1$
for any metric $\sigma$ induced on V by an embedding into $\ell_2$.
Thus, the exact lower bound for the embeddability into Euclidean spaces
always has an "easy" proof, provided that we can guess the right weight
functions $\eta$ and $\varphi$. (As we will see below, there is even an efficient algorithm
for deciding D-embeddability into $\ell_2$.)
15.5 A Tight Lower Bound via Expanders 377
This $\mathcal{K}$ includes all squares of metrics arising by D-embeddings of $(V, \rho)$. But
not all elements of $\mathcal{K}$ are necessarily squares of metrics, since the triangle
inequality may be violated. Since there is no Euclidean D-embedding of $(V, \rho)$,
we have $\mathcal{K} \cap \mathcal{L}_2 = \emptyset$. Both $\mathcal{K}$ and $\mathcal{L}_2$ are convex sets in $R^N$, and so they can
be separated by a hyperplane, by the separation theorem (Theorem 1.2.4).
Moreover, since $\mathcal{L}_2$ is a cone and $\mathcal{K}$ is a cone minus the origin 0, the separating
hyperplane has to pass through 0. So there is an $a \in R^N$ such that
ry(u, v)
°
{ auv if auv 2: 0,
otherwise;
x _ {D 2p(u,v)2
uv - p(u, v)2 l'f auv < .°
if auv 2: 0,
°
Next, let $\sigma$ be a metric induced by a Euclidean embedding of V. This
time we apply $\langle a, x\rangle \le 0$ with the $x \in \mathcal{L}_2$ corresponding to $\sigma$, i.e., $x_{uv} = \sigma(u,v)^2$.
This yields $\sigma^2(\eta) - \sigma^2(\varphi) \le 0$, and so $R_{\eta,\varphi}(\sigma) \le 1$. This proves
Proposition 15.5.2. □
Algorithmic remark: Euclidean embeddings and semidefinite programming.
The problem of deciding whether a given n-point metric space
$(V, \rho)$ admits a D-embedding into $\ell_2$ (i.e., into a Euclidean space without
restriction on the dimension), for a given $D \ge 1$, can be solved by a polynomial-time
algorithm. Let us stress that the dimension of the target Euclidean space
cannot be prescribed in this method. If we insist that the embedding be into
$\ell_2^d$, for some given d, we obtain a different algorithmic problem, and it is not
known how hard it is. Many other similar-looking embedding problems are
known to be NP-hard, such as the problem of D-embedding into $\ell_1$.
The algorithm for D-embedding into $\ell_2$ is based on a powerful technique
called semidefinite programming, where the problem is expressed as the existence
of a positive semidefinite matrix in a suitable convex set of matrices.
Let $(V, \rho)$ be an n-point metric space, let $f: V \to R^n$ be an embedding,
and let X be the n × n matrix whose columns are indexed by the elements
of V and such that the vth column is the vector $f(v) \in R^n$. The matrix
$Q = X^T X$ has both rows and columns indexed by the points of V, and the
entry $q_{uv}$ is the scalar product $\langle f(u), f(v)\rangle$.
The matrix Q is positive semidefinite, since for any $x \in R^n$, we have
$x^T Q x = (x^T X^T)(X x) = \|Xx\|^2 \ge 0$. (In fact, as is not too difficult to check,
a real symmetric n × n matrix P is positive semidefinite if and only if it can
be written as $X^T X$ for some real n × n matrix X.)
Let $\sigma(u,v) = \|f(u) - f(v)\| = \langle f(u) - f(v), f(u) - f(v)\rangle^{1/2}$. We can express
$\sigma(u,v)^2 = \langle f(u), f(u)\rangle + \langle f(v), f(v)\rangle - 2\langle f(u), f(v)\rangle = q_{uu} + q_{vv} - 2q_{uv}.$
Therefore, the space $(V, \rho)$ can be D-embedded into $\ell_2$ if and only if there
exists a symmetric real positive semidefinite matrix Q whose entries satisfy
$\rho(u,v)^2 \le q_{uu} + q_{vv} - 2q_{uv} \le D^2 \rho(u,v)^2$
for all $u, v \in V$. These are linear inequalities for the unknown entries of Q.
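To make the passage from an embedding to the matrix Q concrete, here is a small Python sketch (hypothetical helper names, standard library only). It builds the Gram matrix of a given point configuration, recovers squared distances from its entries, and checks the linear constraints, assuming the normalization $\rho(u,v)^2 \le \sigma(u,v)^2 \le D^2\rho(u,v)^2$. It does not solve a semidefinite program; it only verifies a candidate Q.

```python
import math


def gram_matrix(points):
    """Q = X^T X: the entry (u, v) is the scalar product <f(u), f(v)>."""
    return [[sum(a * b for a, b in zip(p, q)) for q in points] for p in points]


def squared_distance(Q, u, v):
    """sigma(u, v)^2 expressed through the entries of the Gram matrix."""
    return Q[u][u] + Q[v][v] - 2.0 * Q[u][v]


def satisfies_distortion_constraints(Q, rho, D, tol=1e-9):
    """Check rho^2 <= q_uu + q_vv - 2 q_uv <= D^2 rho^2 for all pairs.

    Positive semidefiniteness is NOT checked here; it holds automatically
    when Q was produced by gram_matrix from actual points.
    """
    n = len(Q)
    for u in range(n):
        for v in range(u + 1, n):
            s = squared_distance(Q, u, v)
            r2 = rho[u][v] ** 2
            if s < r2 - tol or s > D * D * r2 + tol:
                return False
    return True


# Illustrative input: the 4-cycle metric mapped onto a square of side sqrt(2).
rho = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
s = math.sqrt(2.0)
points = [(0.0, 0.0), (s, 0.0), (s, s), (0.0, s)]
Q = gram_matrix(points)
```

With these inputs the constraints hold for $D = \sqrt{2}$ and fail for a smaller value such as D = 1.2; the latter only shows that this particular configuration is not good enough, not that no better one exists.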
The problem of finding a positive semidefinite matrix whose entries sat-
isfy a given system of linear inequalities can be solved efficiently, in time
polynomial in the size of the unknown matrix Q and in the number of the
linear inequalities. The algorithm is not simple; we say a little more about it
in the remarks below.
vex set in the space of all real n × n matrices, and in principle it is
not difficult to construct a polynomial-time membership oracle for it
(see the explanation following Theorem 13.2.1). Then the ellipsoid
method can solve the optimization problem in polynomial time; see
Grötschel, Lovász, and Schrijver [GLS88]. More practical algorithms
are based on interior point methods. Semidefinite programming is an
extremely powerful tool in combinatorial optimization and other areas.
For example, it provides the only known polynomial-time algorithms
for computing the chromatic number of perfect graphs and the
best known approximation algorithms for several fundamental NP-hard
graph-theoretic problems. Lovász's recent lecture notes [Lov] are
a beautiful concise introduction. Here we outline at least one lovely
application, concerning the approximation of the maximum cut in a
graph, in Exercise 8 below.
The second eigenvalue. The investigation of graph eigenvalues constitutes
a well-established part of graph theory; see, e.g., Biggs [Big93]
for a nice introduction. The second eigenvalue of the Laplace matrix as
an important graph parameter was first considered by Fiedler [Fie73]
(who called it the algebraic connectivity). Tanner [Tan84] and Alon
and Milman [AM85] gave a lower bound for the so-called vertex expansion
of a regular graph (a notion similar to edge expansion) in
terms of $\mu_2(G)$, and a reverse relation was proved by Alon [Alo86a].
There are many useful analogies of graph eigenvalues with the
eigenvalues of the Laplace operator $\Delta$ on manifolds, whose theory is
classical and well developed; this is pursued to a considerable depth in
Chung [Chu97]. This point of view prefers the eigenvalues of the Laplacian
matrix of a graph, as considered in this section, to the eigenvalues
of the adjacency matrix. In fact, for nonregular graphs, a still closer
correspondence with the setting of manifolds is obtained with a differently
normalized Laplacian matrix $\mathcal{L}_G$: $(\mathcal{L}_G)_{vv} = 1$ for all $v \in V(G)$,
$(\mathcal{L}_G)_{uv} = -(\deg_G(u)\deg_G(v))^{-1/2}$ for $\{u,v\} \in E(G)$, and $(\mathcal{L}_G)_{uv} = 0$
otherwise.
Expanders have been used to address many fundamental problems of
computer science in areas such as network design, theory of computational
complexity, coding theory, on-line computation, and cryptography;
see, e.g., [RVW00] for references.
For random graphs, parameters such as edge expansion or vertex
expansion are usually not too hard to estimate (the technical difficulty
of the arguments depends on the chosen model of a random graph). On
the other hand, estimating the second eigenvalue of a random r-regular
graph is quite challenging, and a satisfactory answer is known only for
r large (and even); see Friedman, Kahn, and Szemerédi [FKS89] or
Friedman [Fri91]. Namely, with high probability, a random r-regular
graph with r even has $\lambda_2 \le 2\sqrt{r-1} + O(\log r)$. Here the number of
Exercises
1. Show that every real symmetric positive semidefinite n × n matrix can
be written as $X^T X$ for a real n × n matrix X.
2. (Dimension for isometric $\ell_p$-embeddings)
(a) Let V be an n-point set and let $N = \binom{n}{2}$. Analogous to the set
$\mathcal{L}_2$ defined in the text, let $\mathcal{L}_1^{(\mathrm{fin})} \subset R^N$ be the set of all metrics on V
induced by embeddings $f: V \to \ell_1^k$, k = 1, 2, .... Show that $\mathcal{L}_1^{(\mathrm{fin})}$ is
the convex hull of line pseudometrics,⁵ i.e., pseudometrics induced by
mappings $f: V \to \ell_1^1$.
(b) Prove that any metric from $\mathcal{L}_1^{(\mathrm{fin})}$ can be isometrically embedded
into $\ell_1^N$. That is, any n-point set in some $\ell_1^k$ can be realized in $\ell_1^N$.
(Examples show that one cannot do much better and that dimension
$\Omega(n^2)$ is necessary, in contrast to Euclidean embeddings, where dimension
n-1 always suffices.)
(c) Let $\mathcal{L}_1 \subset R^N$ be all metrics induced by embeddings of V into $\ell_1$ (the
space of infinite sequences with finite $\ell_1$-norm). Show that $\mathcal{L}_1 = \mathcal{L}_1^{(\mathrm{fin})}$,
and thus that any n-point subset of $\ell_1$ can be realized in $\ell_1^N$.
(d) Extend the considerations in (a)-(c) to $\ell_p$-metrics with arbitrary
$p \in [1, \infty)$.
See Ball [Bal90] for more on the dimension of isometric $\ell_p$-embeddings.
3. With the notation as in Exercise 2, show that every line pseudometric
$\nu$ on an n-point set V is a nonnegative linear combination of at most
n-1 cut pseudometrics: $\nu = \sum_{i=1}^{n-1} \alpha_i \tau_i$, $\alpha_1, \ldots, \alpha_{n-1} \ge 0$, where each
$\tau_i$ is a cut pseudometric, i.e., a line pseudometric induced by a mapping
$\psi_i: V \to \{0, 1\}$. (Consequently, by Exercise 2(a), every finite metric
isometrically embeddable into $\ell_1$ is a nonnegative linear combination of cut
pseudometrics.)
4. (An $\ell_p$-analogue of Proposition 15.5.2) Let $p \in [1, \infty)$ be fixed. Using
Exercise 2, formulate and prove an appropriate $\ell_p$-analogue of Proposition 15.5.2.
5. (Finite $\ell_2$-metrics embed isometrically into $\ell_p$)
(a) Let p be fixed. Check that if for all $\varepsilon > 0$, a finite metric space
$(V, \rho)$ can be $(1+\varepsilon)$-embedded into some $\ell_p^k$, $k = k(\varepsilon)$, then $(V, \rho)$ can be
isometrically embedded into $\ell_p^N$, where $N = \binom{|V|}{2}$. Use Exercise 2.
(b) Prove that every n-point set in $\ell_2$ can be isometrically embedded into
$\ell_p^N$.
6. (The second eigenvalue and edge expansion) Let G be an r-regular graph
with n vertices, and let $A, B \subseteq V$ be disjoint. Prove that the number of
edges connecting A to B is at least $e(A, B) \ge \mu_2(G) \cdot \frac{|A|\,|B|}{n}$ (use (15.3)
with a suitable vector x), and deduce that $\Phi(G) \ge \tfrac12 \mu_2(G)$.
5 A pseudometric $\nu$ satisfies all the axioms of a metric except that we may have
$\nu(x, y) = 0$ even for two distinct points x and y.
$M_{\mathrm{opt}} = \max\Bigl\{ \tfrac12 \sum_{\{u,v\}\in E} (1 - x_u x_v) : x_v \in \{-1, 1\},\ v \in V \Bigr\}.$
(b) Let
$M_{\mathrm{relax}} = \max\Bigl\{ \tfrac12 \sum_{\{u,v\}\in E} (1 - \langle y_u, y_v\rangle) : y_v \in R^n,\ \|y_v\| = 1,\ v \in V \Bigr\}.$
Clearly, $M_{\mathrm{relax}} \ge M_{\mathrm{opt}}$. Verify that this relaxed version of the problem is
an instance of a semidefinite program, that is, the maximum of a linear
function over the intersection of a polytope with the cone of all symmetric
positive semidefinite real matrices.
(c) Let $(y_v: v \in V)$ be some system of unit vectors in $R^n$ for which $M_{\mathrm{relax}}$
is attained. Let $r \in R^n$ be a random unit vector, and set $x_v = \mathrm{sgn}\langle y_v, r\rangle$,
$v \in V$. Let $M_{\mathrm{approx}} = \tfrac12 \sum_{\{u,v\}\in E} (1 - x_u x_v)$ for these $x_v$. Show that
the expectation, with respect to the random choice of r, of $M_{\mathrm{approx}}$ is
at least $0.878 \cdot M_{\mathrm{relax}}$ (consider the expected contribution of each edge
separately). So we obtain a polynomial-time randomized algorithm producing
a solution to MAX CUT whose expected value is at least about
88% of the optimal solution.
Remark. This algorithm is due to Goemans and Williamson [GW95].
Later, Håstad [Has97] proved that no polynomial-time algorithm can
produce better approximation in the worst case than about 94% unless
P=NP (also see Feige and Schechtman [FS01] for nice mathematics showing
that the Goemans-Williamson value 0.878... is, in a certain sense,
optimal for approaches based on semidefinite programming).
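The constant 0.878 in part (c) of the MAX CUT exercise can be explored numerically. The sketch below is only an illustration with hypothetical function names, and it uses the standard fact that a random hyperplane through the origin separates two unit vectors at angle $\theta$ with probability $\theta/\pi$. It compares, for a single edge, the expected rounded contribution with the relaxed contribution $(1 - \cos\theta)/2$ and takes the minimum of the ratio over a grid of angles.

```python
import math


def relaxed_contribution(theta):
    """Value (1 - <y_u, y_v>) / 2 for unit vectors forming the angle theta."""
    return (1.0 - math.cos(theta)) / 2.0


def expected_rounded_contribution(theta):
    """Expected value of (1 - x_u x_v) / 2 after random hyperplane rounding.

    The term equals 1 when the signs differ and 0 otherwise, so its
    expectation is the separation probability theta / pi.
    """
    return theta / math.pi


def worst_case_ratio(steps=20000):
    """Grid minimum of expected / relaxed over the angles theta in (0, pi]."""
    best = float("inf")
    for k in range(1, steps + 1):
        theta = math.pi * k / steps
        ratio = expected_rounded_contribution(theta) / relaxed_contribution(theta)
        best = min(best, ratio)
    return best
```

Calling `worst_case_ratio()` gives a value slightly above 0.878, which is where the guarantee per edge, and hence for the whole sum by linearity of expectation, comes from.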
15.6 Upper Bounds for foo-Embeddings 385
$f: (V, \rho) \to \ell_\infty^d$
means to define d functions $f_1, \ldots, f_d: V \to R$, the coordinates of the embedded
points. If we aim at a D-embedding, without loss of generality we may
require it to be nonexpanding, which means that $|f_i(u) - f_i(v)| \le \rho(u, v)$ for
all $u, v \in V$ and all i = 1, 2, ..., d. The D-embedding condition then means
that for every pair {u, v} of points of V, there is a coordinate i = i(u, v) that
"takes care" of the pair: $|f_i(u) - f_i(v)| \ge \frac{1}{D}\rho(u, v)$.
One of the key tricks in constructions of such embeddings is to take each
$f_i$ as the distance to some suitable subset $A_i \subseteq V$; that is, $f_i(u) = \rho(u, A_i) = \min_{a \in A_i} \rho(u, a)$.
By the triangle inequality, we have $|\rho(u, A_i) - \rho(v, A_i)| \le \rho(u, v)$
for any $u, v \in V$, and so such an embedding is automatically nonexpanding.
We "only" have to choose a suitable collection of the $A_i$ that take
care of all pairs {u, v}.
We begin with a simple case: an old observation showing that every finite
metric space embeds isometrically into $\ell_\infty$.
Proof. Here the coordinates in $\ell_\infty^n$ are indexed by the points of V, and the
vth coordinate is given by $f_v(u) = \rho(u, v)$. In the notation above, we thus put
$A_v = \{v\}$. As we have seen, the embedding is nonexpanding by the triangle
inequality. On the other hand, the coordinate v takes care of the pairs {u, v}
for all $u \in V$:
$\|f(u) - f(v)\|_\infty \ge |f_v(u) - f_v(v)| = \rho(u, v).$
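The embedding in this proof is short enough to write out. The following Python sketch (hypothetical function names) maps each point u to the vector of its distances to all points, and measures distances between images in the maximum norm.

```python
def distance_coordinates(rho):
    """Map the point u to the vector (rho(u, v))_{v in V}.

    rho is a symmetric distance matrix given as a list of lists.
    """
    n = len(rho)
    return [[rho[u][v] for v in range(n)] for u in range(n)]


def linf_distance(p, q):
    """Distance of two vectors in the maximum norm."""
    return max(abs(a - b) for a, b in zip(p, q))


# Illustrative input: the shortest-path metric of the 4-cycle.
rho = [[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]]
images = distance_coordinates(rho)
```

For every pair u, v the value `linf_distance(images[u], images[v])` equals `rho[u][v]`, provided the input really satisfies the triangle inequality.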
$d = O(q\, n^{1/q} \ln n).$
[Figure: three panels labeled $A_{*1}$, $A_{*2}$, $A_{*3}$.]
(15.6)
is at least $p/12$.
First, assuming this lemma, we finish the proof of the theorem. To show
that f is a D-embedding, it suffices to show that with a nonzero probability,
for every pair {u, v} there are i, j such that the event (15.6) in the lemma
occurs for the set $A_{ij}$. Consider a fixed pair {u, v} and select the appropriate
index j as in the lemma. The probability that the event (15.6) does not occur
for any of the m indices i is at most $(1 - \frac{p}{12})^m \le e^{-pm/12} \le n^{-2}$. Since there
are $\binom{n}{2} < n^2$ pairs {u, v}, the probability that we fail to choose a good set
for any of the pairs is smaller than 1. □
If the sequence $(n_1, n_2, \ldots, n_q)$ is not monotone increasing, i.e., if $n_{t+1} < n_t$
for some t, then (15.7) holds for the j such that $I_j$ contains $n_t$. On the other
hand, if $1 = n_0 \le n_1 \le \cdots \le n_q \le n$, then by the pigeonhole principle, there
exist t and j such that the interval $I_j$ contains both $n_t$ and $n_{t+1}$. Then (15.7)
holds for this j as well.
In this way, we have selected the index j whose existence is claimed in the
lemma. We will show that with probability at least $p/12$, the set $A_{ij}$, randomly
selected with point probability $p_j$, includes a point of $B_t$ (event $E_1$) and is
disjoint from the interior of $B_{t+1}$ (event $E_2$); such an $A_{ij}$ satisfies (15.6).
Since $B_t$ and the interior of $B_{t+1}$ are disjoint, the events $E_1$ and $E_2$ are
independent.
We calculate
$\min(\tfrac12, p)$. For $p \ge \tfrac12$, we get $\mathrm{Prob}[E_1] \ge 1 - e^{-1/2} > \tfrac13 \ge \tfrac{p}{3}$, while for
$p < \tfrac12$ we have $\mathrm{Prob}[E_1] \ge 1 - e^{-p}$, and a bit of calculus verifies that the
last expression is well above $\tfrac{p}{3}$ for all $p \in [0, \tfrac12)$.
Further,
Exercises
1. (a) Find an isometric embedding of $\ell_1^d$ into $\ell_\infty^{2^d}$.
(b) Explain how an embedding as in (a) can be used to compute the
diameter of an n-point set in $\ell_1^d$ in time $O(d\, 2^d n)$.
2. Show that if the unit ball K of some finite-dimensional normed space
is a convex polytope with 2m facets, then that normed space embeds
isometrically into $\ell_\infty^m$.
(Using results on approximation of convex bodies by polytopes, this yields
useful approximate embeddings of arbitrary norms into $\ell_\infty^k$.)
3. Deduce from Theorem 15.6.2 that every n-point metric space can be
D-embedded into $\ell_2^k$ with $D = O(\log^2 n)$ and $k = O(\log^2 n)$.
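For Exercise 1, one natural candidate (an assumption of this sketch, not a statement from the text) is the map sending x to the vector of all signed sums $\langle s, x\rangle$, $s \in \{-1, 1\}^d$; it uses the identity $\|x\|_1 = \max_s \langle s, x\rangle$. The Python sketch below, with hypothetical function names, implements this map and the resulting diameter computation, which handles one coordinate of the image at a time.

```python
from itertools import product


def l1_to_linf(point):
    """Map x in l_1^d to the vector (<s, x>) over all sign vectors s."""
    return [sum(s * c for s, c in zip(signs, point))
            for signs in product((-1, 1), repeat=len(point))]


def l1_distance(p, q):
    """Distance of two points in the l_1 norm."""
    return sum(abs(a - b) for a, b in zip(p, q))


def l1_diameter(points):
    """Largest l_1 distance among the points, using O(d 2^d n) operations."""
    d = len(points[0])
    best = 0.0
    for signs in product((-1, 1), repeat=d):
        values = [sum(s * c for s, c in zip(signs, p)) for p in points]
        # In the maximum norm, the diameter is the largest coordinate spread.
        best = max(best, max(values) - min(values))
    return best
```

The point of part (b) is visible in `l1_diameter`: in $\ell_\infty$ the diameter is the largest spread of a single coordinate, so no loop over pairs of points is needed.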
15.7 Upper Bounds for Euclidean Embeddings 389
(the inequality between two (pseudo)metrics on the same point set means
inequality for each pair of points).
The following easy lemma shows that if a metric $\rho$ on V can be approximated
by a convex combination of line pseudometrics, each of them dominated
by $\rho$, then a good embedding of $(V, \rho)$ into $\ell_2$ exists.
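15.7.3 Lemma. Let $(V, \rho)$ be a finite metric space, and let $\nu_1, \ldots, \nu_N$ be
line pseudometrics on V with $\nu_i \le \rho$ for all i and such that
$\sum_{i=1}^{N} \alpha_i \nu_i \ge \frac{1}{D}\rho$
$\|f(u) - f(v)\| = \Bigl(\sum_{i=1}^{N} \alpha_i \nu_i(u,v)^2\Bigr)^{1/2} = \Bigl(\sum_{i=1}^{N} \alpha_i\Bigr)^{1/2} \Bigl(\sum_{i=1}^{N} \alpha_i \nu_i(u,v)^2\Bigr)^{1/2} \ge \sum_{i=1}^{N} \alpha_i \nu_i(u,v)$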
Proof of Theorem 15.7.1. As was remarked above, each of the line pseudometrics
$\nu_A$ corresponding to the mapping $v \mapsto \rho(v, A)$ is dominated by $\rho$.
It remains to observe that Lemma 15.7.2 provides a convex combination of
these line pseudometrics that is bounded from below by $\frac{1}{48q} \cdot \rho$. The coefficient
of each $\nu_A$ in this convex combination is given by the probability of A appearing
as one of the sets $A_j$ in Lemma 15.7.2. More precisely, write $\pi_j(A)$ for
the probability that a random subset of V, with points picked independently
with probability $2^{-j}$, equals A. Then the claim of Lemma 15.7.2 implies, for
every pair {u, v},
$\sum_{A \subseteq V} \pi_j(A) = 1,$
complete binary tree with unit edge lengths, and for that example,
he also constructed an embedding with $O(\sqrt{\log\log n})$ distortion. For
embedding the complete binary tree into $\ell_p$, p > 1, the distortion is
$\Omega((\log\log n)^{\min(1/2,1/p)})$, with the constant of proportionality depending
on p and tending to 0 as $p \to 1$. (For Banach-space specialists, we
also remark that all tree metrics can be embedded into a given Banach
space Z with bounded distortion if and only if Z is not superreflexive.)
In Matousek [Mat99b] it was shown that the complete binary tree is
essentially the worst example; that is, every n-point tree metric can be
embedded into $\ell_p$ with distortion $O((\log\log n)^{\min(1/2,1/p)})$. An alternative,
elementary proof was given for the matching lower bound (see
Exercise 5 for a weaker version). Another proof of the lower bound,
very short but applying only for embeddings into $\ell_2$, was found by
Linial and Saks [LS02] (Exercise 6).
In the notes to Section 15.3 we mentioned that general n-point
metric spaces require worst-case distortion $\Omega(n^{1/\lfloor (d+1)/2 \rfloor})$ for embedding
into $\ell_2^d$, $d \ge 2$ fixed. Gupta [Gup00] proved that for n-point tree
metrics, $O(n^{1/(d-1)})$-embeddings into $\ell_2^d$ are possible. The best known
lower bound is $\Omega(n^{1/d})$, from a straightforward volume argument. Babilon,
Matousek, Maxova, and Valtr [BMMV02] showed that every
n-vertex tree with unit-length edges can be $O(\sqrt{n})$-embedded into $\ell_2^2$.
Planar-graph metrics and metrics with excluded minor. A planar-graph
metric is a P-metric with P standing for the class of all planar
graphs (the shorter but potentially confusing term planar metric
is used in the literature). Rao [Rao99] proved that every n-point
planar-graph metric can be embedded into $\ell_2$ with distortion only
$O(\sqrt{\log n})$, as opposed to log n for general metrics. More generally,
the same method shows that whenever H is a fixed graph and Excl(H)
is the class of all graphs not containing H as a minor, then Excl(H)-metrics
can be $O(\sqrt{\log n})$-embedded into $\ell_2$. For a matching lower
bound, valid already for the class Excl($K_4$) (series-parallel graphs),
and consequently for planar-graph metrics, see Exercise 15.4.2.
We outline Rao's method of embedding. We begin with graphs
where all edges have unit weight (this is the setting in [Rao99], but
our presentation differs in some details), and then we indicate how
graphs with arbitrary edge weights can be treated. The main new
ingredient in Rao's method, compared to Bourgain's approach, is a
result of Klein, Plotkin, and Rao [KPR93] about a decomposition of
graphs with an excluded minor into pieces of low diameter. Here is the
decomposition procedure.
Let G be a graph, let $\rho$ be the corresponding graph metric (with all
edges having unit length), and let $\Delta$ be an integer parameter. We fix a
vertex $v_0 \in V(G)$ arbitrarily, we choose an integer $r \in \{0, 1, \ldots, \Delta-1\}$
uniformly at random, and we let $B_1 = \{v \in V(G): \rho(v, v_0) \equiv$
whenever $C\Delta < \rho(x,y) \le 2C\Delta$. As in the proof of Lemma 15.7.3,
this yields a 1-Lipschitz embedding $f_\Delta: V(G) \to \ell_2^N$ (for some N) that
shortens distances for pairs x, y as above by at most a constant factor.
(It is not really necessary to use all the possible pairs (B, a) in the
embedding; it is easy to show that const · log n independent random
B and a will do.)
To construct the final embedding $f: V(G) \to \ell_2$, we let f(v) be the
concatenation of the vectors $f_\Delta(v)$ for $\Delta \in \{2^j: 1 \le 2^j \le \mathrm{diam}(G)\}$. No
distance is expanded by more than $O(\sqrt{\log \mathrm{diam}(G)}) = O(\sqrt{\log n})$,
and the contraction is at most by a constant factor, and so we have
an embedding into $\ell_2$ with distortion $O(\sqrt{\log n})$.
Why do we get a better bound than for Bourgain's embedding?
In both cases we have about log n groups of coordinates in the embedding.
In Rao's embedding we know that for every pair (x, y), one
of the groups contributes at least a fixed fraction of $\rho(x, y)$ (and no
group contributes more than $\rho(x, y)$). Thus, the sum of squares of the
contributions is between $\rho(x, y)^2$ and $\rho(x, y)^2 \log n$. In Bourgain's embedding
(with a comparable scaling) no group contributes more than
$\rho(x, y)$, and the sum of the contributions of all groups is at least a
fixed fraction of $\rho(x, y)$. But since we do not know how the contributions
are distributed among the groups, we can conclude only that
the sum of squares of the contributions is between $\rho(x, y)^2 / \log n$ and
$\rho(x, y)^2 \log n$.
It remains to sketch the modifications of Rao's embedding for a
graph G with arbitrary nonnegative weights on edges. For the unweighted
case, we defined $B_1$ as the vertices lying exactly at the given
distances from $v_0$. In the weighted case, there need not be vertices
exactly at these distances, but we can add artificial vertices by subdividing
the appropriate edges; this is a minor technical issue. A more
serious problem is that the distances $\rho(x, y)$ can be in a very wide
range, not just from 1 to n. We let $\Delta$ run through all the relevant
powers of 2 (that is, such that $C\Delta < \rho(x,y) \le 2C\Delta$ for some $x \ne y$),
but for producing the decomposition for a particular $\Delta$, we use a modified
graph $G_\Delta$ obtained from G by contracting all edges shorter than
$\Delta$. In this way, we can have many more than log n values of $\Delta$, but
only O(log n) of them are relevant for each pair (x, y), and the analysis
works as before.
Gupta, Newman, Rabinovich, and Sinclair [GNRS99] proved that
any Excl($K_4$)-metric, as well as any Excl($K_{2,3}$)-metric, can be O(1)-embedded
into $\ell_1$, and they conjectured that for any H, Excl(H)-metrics
might be O(1)-embeddable into $\ell_1$ (the constant depending
on H).
Volume-respecting embeddings. Feige [Fei00] introduced an interesting
strengthening of the notion of the distortion of an embedding,
concerning embeddings into Euclidean spaces. Let $f: (V, \rho) \to \ell_2$ be
an embedding that for simplicity we require to be 1-Lipschitz (nonexpanding).
The usual distortion of f is determined by looking at pairs
of points, while Feige's notion takes into account all k-tuples for some
$k \ge 2$. For example, if V has 3 points, every two with distance 1, then
the following two embeddings into $\ell_2^2$ have about the same distortion:
[Figure: two embeddings of the 3-point space into the plane.]
But while the left embedding is good in Feige's sense for k = 3, the
right one is completely unsatisfactory. For a k-point set $P \subset \ell_2$, define
Evol(P) as the (k-1)-dimensional volume of the simplex spanned
by P (so Evol(P) = 0 if P is affinely dependent). For a k-point
metric space $(S, \rho)$, the volume Vol(S) is defined as $\sup_f \mathrm{Evol}(f(S))$,
where the supremum is over all 1-Lipschitz $f: S \to \ell_2$. An embedding
$f: (V, \rho) \to \ell_2$ is (k, D) volume-respecting if for every k-point subset
$S \subseteq V$, we have $D \cdot \mathrm{Evol}(f(S))^{1/(k-1)} \ge \mathrm{Vol}(S)^{1/(k-1)}$. For D small,
this means that the image of any k-tuple spans nearly as large a volume
as it possibly can for a 1-Lipschitz map. (Note, for example, that
an isometric embedding of a path into $\ell_2$ is not volume-respecting.)
Feige showed that Vol(S) can be approximated quite well by an
intrinsic parameter of the metric space (not referring to embeddings),
namely, by the tree volume Tvol(S), which equals the product of the
edge lengths in a minimum spanning tree on S (with respect to the
metric on S). Namely, $\mathrm{Vol}(S) \le \frac{1}{(k-1)!} \mathrm{Tvol}(S) \le 2^{(k-2)/2}\, \mathrm{Vol}(S)$. He
proved that for any n-point metric space and all $k \ge 2$, the embedding
as in the proof of Theorem 15.7.1 is $(k, O(\log n + \sqrt{k \log n \log k}))$
volume-respecting (the result in the conference version of his paper is
slightly weaker).
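The tree volume is easy to compute from the distance matrix. Here is a minimal Python sketch (the function name `tree_volume` is hypothetical) that runs Prim's algorithm on the complete graph with edge lengths given by the metric and multiplies the lengths of the chosen edges. All minimum spanning trees have the same multiset of edge lengths, so the product does not depend on how ties are broken.

```python
def tree_volume(rho):
    """Product of the edge lengths of a minimum spanning tree on the points.

    rho is a symmetric distance matrix given as a list of lists.
    """
    n = len(rho)
    if n <= 1:
        return 1.0
    in_tree = [False] * n
    in_tree[0] = True
    # best[v] = length of the shortest edge joining v to the current tree.
    best = [rho[0][v] for v in range(n)]
    volume = 1.0
    for _ in range(n - 1):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        volume *= best[u]
        in_tree[u] = True
        for v in range(n):
            if not in_tree[v] and rho[u][v] < best[v]:
                best[v] = rho[u][v]
    return volume
```

For the 3-point space with all distances 1, mentioned above, the function returns 1.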
The notion of volume-respecting embeddings currently still looks
somewhat mysterious. In an attempt to convey some feeling about
it, we outline Feige's application and indicate the use of the
volume-respecting condition in it. He considered the problem of approximating
the bandwidth of a given $n$-vertex graph $G$. The bandwidth is
the minimum, over all bijective maps $\varphi\colon V(G) \to \{1,2,\ldots,n\}$, of
$\max\{|\varphi(u) - \varphi(v)| : \{u, v\} \in E(G)\}$ (so it has the flavor of an
approximate embedding problem). Computing the bandwidth is NP-hard,
but Feige's ingenious algorithm approximates it within a factor of
$O((\log n)^{\mathrm{const}})$. The algorithm has two main steps: First, embed the
graph (as a metric space) into $\ell_2^m$, with $m$ being some suitable power
of $\log n$, by a $(k,D)$ volume-respecting embedding $f$, where $k = \log n$
and $D$ is as small as one can get. Second, let $\lambda$ be a random line in
$\ell_2^m$ and let $\psi(v)$ denote the orthogonal projection of $f(v)$ on $\lambda$. This
$\psi\colon V(G) \to \lambda$ is almost surely injective, and so it provides a linear
ordering of the vertices, that is, a bijective map $\varphi\colon V(G) \to \{1, 2, \ldots, n\}$,
and this is used for estimating the bandwidth.
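To make the definition of bandwidth concrete, here is a minimal Python sketch that evaluates the bandwidth of a given vertex ordering and finds the exact bandwidth of a tiny graph by trying all orderings. It illustrates only the definition, not Feige's algorithm; the function names and the edge-list input are assumptions of this example.

```python
from itertools import permutations

def ordering_bandwidth(edges, order):
    """Maximum |phi(u) - phi(v)| over all edges, where phi(v) is the position of v in `order`."""
    phi = {v: i + 1 for i, v in enumerate(order)}
    return max(abs(phi[u] - phi[v]) for u, v in edges)

def bandwidth_bruteforce(vertices, edges):
    """Exact bandwidth by trying all n! orderings (sensible only for very small graphs)."""
    return min(ordering_bandwidth(edges, p) for p in permutations(vertices))

# Two small examples: the 4-cycle and the path on 4 vertices.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
path_edges = [(0, 1), (1, 2), (2, 3)]
```

The natural ordering 0, 1, 2, 3 of the 4-cycle has bandwidth 3 because of the edge {3, 0}, while the ordering 0, 1, 3, 2 achieves the optimum 2.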
To indicate the analysis, we need the notion of local density of the
graph $G$: $\mathrm{ld}(G) = \max\{|B(v, r)|/r : v \in V(G),\ r = 1,2,\ldots,n\}$, where
$B(v, r)$ are all vertices at distance at most $r$ from $v$. It is not hard to
see that $\mathrm{ld}(G)$ is a lower bound for the bandwidth, and Feige's analysis
shows that $O(\mathrm{ld}(G)(\log n)^{\mathrm{const}})$ is an upper bound.
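The local density can be computed directly from the definition by a breadth-first search from every vertex. The sketch below assumes the graph is given as an adjacency dictionary and counts the center v as part of B(v, r); both choices are assumptions of this example. With that convention, in an ordering of bandwidth b every vertex of B(v, r) lies within rb positions of v, so |B(v, r)| is at most 2rb + 1, and the lower bound should be read up to a constant factor.

```python
from collections import deque

def bfs_distances(adj, source):
    """Graph distances from `source` in an unweighted graph given as {vertex: neighbors}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def local_density(adj):
    """Maximum of |B(v, r)| / r over all vertices v and r = 1, ..., n."""
    n = len(adj)
    best = 0.0
    for v in adj:
        dist = bfs_distances(adj, v)
        for r in range(1, n + 1):
            ball = sum(1 for d in dist.values() if d <= r)
            best = max(best, ball / r)
    return best

path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
star4 = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
```

For the path the maximum is attained at an inner vertex with r = 1, and for the star at the center with r = 1.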
One first verifies that with high probability, if $\{u, v\} \in E(G)$, then
the images $\psi(u)$ and $\psi(v)$ on $\lambda$ are close; concretely, $|\psi(u) - \psi(v)| \le
\Delta = O(\sqrt{(\log n)/m})$. For proving this, it suffices to know that $f$ is
1-Lipschitz, and it is an immediate consequence of measure concentration
on the sphere. If $b$ is the bandwidth obtained from the ordering
given by $\psi$, then some interval of length $\Delta$ on $\lambda$ contains the images of
$b$ vertices. Call a $k$-tuple $S \subset V(G)$ squeezed if $\psi(S)$ lies in an interval
of length $\Delta$. If $b$ is large, then there are many squeezed $S$. On the
other hand, one proves that, not surprisingly, if $\mathrm{ld}(G)$ is small, then
$\mathrm{Vol}(S)$ is large for all but a few $k$-tuples $S \subset V(G)$. Now, the
volume-respecting condition enters: If $\mathrm{Vol}(S)$ is large, then $\mathrm{conv}(f(S))$ has
large $(k-1)$-dimensional volume. It turns out that the projection of a
convex set in $\ell_2^m$ with large $(k-1)$-dimensional volume on a random
line is unlikely to be short, and so $S$ with large $\mathrm{Vol}(S)$ is unlikely to be
squeezed. Thus, by estimating the number of squeezed $k$-tuples in two
ways, one gets an inequality bounding $b$ from above in terms of $\mathrm{ld}(G)$.
Vempala [Vem98] applied volume-respecting embeddings in another
algorithmic problem, this time concerning arrangement of graph
Exercises
1. (Embedding into $\ell_p$) Prove that under the assumptions of Lemma 15.7.3,
the metric space $(V, \rho)$ can be embedded into $\ell_p$, $1 \le p \le \infty$, with
distortion at most $D$. (You may want to start with the rather easy cases
$p = 1$ and $p = \infty$, and use Hölder's inequality for an arbitrary $p$.)
2. (Dimension reduction for the embedding)
(a) Let $E_1, \ldots, E_m$ be independent events, each of them having probability
at least $\frac{1}{12}$. Prove that the probability of no more than $\frac{m}{24}$ of the
$E_i$ occurring is at most $e^{-cm}$, for a sufficiently small positive constant $c$.
Use suitable Chernoff-type estimates or direct estimates of binomial
coefficients.
4. Let $P_n$ be the metric space $\{0, 1, \ldots, n\}$ with the metric inherited from
$\mathbf{R}$ (or a path of length $n$ with the graph metric). Prove the following
Ramsey-type result: For every $D > 1$ and every $\varepsilon > 0$ there exists an
$n = n(D, \varepsilon)$ such that whenever $f\colon P_n \to (Z, \sigma)$ is a $D$-embedding of $P_n$
into some metric space, then there are $a < b < c$, $b = \frac{a+c}{2}$, such that $f$
restricted to the subspace $\{a, b, c\}$ of $P_n$ is a $(1+\varepsilon)$-embedding. That is,
if a sufficiently long path is $D$-embedded, then it contains a scaled copy
of a path of length 2 embedded with distortion close to 1.
Can you extend the proof so that it provides a scaled copy of a path of
length $k$?
5. (Lower bound for embedding trees into $\ell_2$)
(a) Show that for every $\varepsilon > 0$ there exists $\delta > 0$ with the following
property. Let $x_0, x_1, x_2, x_2' \in \ell_2$ be points such that $\|x_0 - x_1\|$, $\|x_1 - x_2\|$,
$\|x_1 - x_2'\| \in [1, 1+\delta]$ and $\|x_0 - x_2\|$, $\|x_0 - x_2'\| \in [2, 2+\delta]$ (so all the
distances are almost like the graph distances in the following tree, except
possibly for the one marked by a dotted line).
[Figure: the tree with edges $x_0x_1$, $x_1x_2$, $x_1x_2'$; a dotted line joins $x_2$ and $x_2'$.]
Then $\|x_2 - x_2'\| \le \varepsilon$; that is, the remaining distance must be very short.
(b) Let $T_{k,m}$ denote the complete $k$-ary tree of height $m$; the following
picture shows $T_{3,2}$:
[Figure: the tree $T_{3,2}$.]
Show that for every $r$ and $m$ there exists $k$ such that whenever the leaves
of $T_{k,m}$ are colored by $r$ colors, there is a subtree of $T_{k,m}$ isomorphic to
$T_{2,m}$ with all leaves having the same color.
(c) Use (a), (b), and Exercise 4 to prove that for any $D > 1$ there exist
$m$ and $k$ such that the tree $T_{k,m}$ considered as a metric space with the
shortest-path metric cannot be $D$-embedded into $\ell_2$.
6. (Another lower bound for embedding trees into $\ell_2$)
(a) Let $x_0, x_1, \ldots, x_n$ be arbitrary points in a Euclidean space (we think
of them as images of the vertices of a path of length $n$ under some embedding).
Let $\Gamma = \{(a, a+2^k, a+2^{k+1}) : a = 0,1,2,\ldots,\ a+2^{k+1} \le n,\ k = 0,1,2,\ldots\}$.
Prove that
$$\sum_{(a,b,c)\in\Gamma} \frac{\|x_a - 2x_b + x_c\|^2}{(c-a)^2} \le \sum_{a=0}^{n-1} \|x_a - x_{a+1}\|^2;$$
this shows that an average triple $(x_a, x_b, x_c)$ is "straight" (and provides
an alternative solution to Exercise 4 for $Z = \ell_2$).
(b) Prove that the complete binary tree $T_{2,m}$ requires $\Omega(\sqrt{\log m})$ distortion
for embedding into $\ell_2$. Consider a nonexpanding embedding
$f\colon V(T_{2,m}) \to \ell_2$ and sum the inequalities as in (a) over all images of
the root-to-leaf paths.
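The inequality in part (a), which compares the sum of $\|x_a - 2x_b + x_c\|^2/(c-a)^2$ over the triples $(a, a+2^k, a+2^{k+1})$ with the sum of the squared lengths of consecutive steps, can be checked numerically. The following Python sketch evaluates both sides for a given point sequence; the function name and the representation of points as tuples of floats are assumptions of this example, and a numerical check is of course no substitute for the proof.

```python
import random

def straightness_sums(points):
    """Return (lhs, rhs) of the inequality for the points x_0, ..., x_n."""
    n = len(points) - 1

    def second_difference_sq(a, b, c):
        # squared Euclidean norm of x_a - 2 x_b + x_c
        return sum((p - 2 * q + r) ** 2
                   for p, q, r in zip(points[a], points[b], points[c]))

    lhs = 0.0
    step = 1                                   # step = 2^k, so c - a = 2 * step
    while 2 * step <= n:
        for a in range(0, n - 2 * step + 1):   # all a with a + 2^(k+1) <= n
            lhs += second_difference_sq(a, a + step, a + 2 * step) / (2 * step) ** 2
        step *= 2
    rhs = sum(sum((p - q) ** 2 for p, q in zip(points[a], points[a + 1]))
              for a in range(n))
    return lhs, rhs

random.seed(1)
random_walk = [tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(33)]
straight_path = [(float(i), 0.0) for i in range(17)]
```

For equally spaced points on a line every second difference vanishes, so the left-hand side is 0, while the right-hand side equals the number of steps.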
7. (Bourgain's embedding of complete binary trees into $\ell_2$) Let $B_m = T_{2,m}$
be the complete binary tree of height $m$ (notation as in Exercise 5).
We identify the vertices of $B_m$ with words of length at most $m$ over
the alphabet $\{0, 1\}$: The root of $B_m$ is the empty word, and the sons
of a vertex $w$ are the vertices $w0$ and $w1$. We define the embedding
$f\colon V(B_m) \to \ell_2^{|V(B_m)|-1}$, where the coordinates in the range of $f$ are
indexed by the vertices of $B_m$ distinct from the root, i.e., by nonempty
words. For a word $w \in V(B_m)$ of length $a$, let $f(w)_u = \sqrt{a-b+1}$ if $u$ is
a nonempty initial segment of $w$ of length $b$, and $f(w)_u = 0$ otherwise.
Prove that this embedding has distortion $O(\sqrt{\log m})$.
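The embedding of Exercise 7 is easy to experiment with. The Python sketch below builds the image of each word as a sparse vector indexed by its nonempty prefixes (the coordinate for a prefix of length b of a word of length a is the square root of a - b + 1) and computes the distortion for a small height by brute force over all pairs. The helper names are assumptions of this example.

```python
import math
from itertools import product

def tree_vertices(m):
    """All binary words of length at most m; the empty word is the root."""
    return ["".join(bits) for length in range(m + 1)
            for bits in product("01", repeat=length)]

def embed(w):
    """Sparse image f(w): each nonempty prefix u of w gets the value sqrt(|w| - |u| + 1)."""
    a = len(w)
    return {w[:b]: math.sqrt(a - b + 1) for b in range(1, a + 1)}

def euclid(fu, fv):
    keys = set(fu) | set(fv)
    return math.sqrt(sum((fu.get(k, 0.0) - fv.get(k, 0.0)) ** 2 for k in keys))

def tree_distance(u, v):
    """Shortest-path distance in the tree: go up to the longest common prefix and down."""
    common = 0
    while common < min(len(u), len(v)) and u[common] == v[common]:
        common += 1
    return len(u) + len(v) - 2 * common

def distortion(m):
    """Largest expansion ratio divided by smallest expansion ratio over all pairs."""
    verts = tree_vertices(m)
    images = {w: embed(w) for w in verts}
    ratios = [euclid(images[u], images[v]) / tree_distance(u, v)
              for i, u in enumerate(verts) for v in verts[i + 1:]]
    return max(ratios) / min(ratios)
```

Calling `distortion(m)` for a few small values of m shows how slowly the distortion grows with the height; the brute-force loop is quadratic in the number of vertices, so it is only meant for small trees.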
8. Prove that any finite tree metric can be isometrically embedded into $\ell_1$.
9. (Low-dimensional embedding of trees)
(a) Let $T$ be a tree (in the graph-theoretic sense) on $n \ge 3$ vertices. Prove
that there exist subtrees $T_1$ and $T_2$ of $T$ that share a single vertex and
no edge and together cover $T$, such that $\max(|V(T_1)|, |V(T_2)|) \le 1 + \frac{2}{3}n$.
(b) Using (a), prove that every tree metric space with $n$ points can be
isometrically embedded into $\ell_\infty^d$ with $d = O(\log n)$.
This result is from [LLR95].
What Was It About? An
Informal Summary
Chapter 1
• Linear and affine notions (dependence, hull, subspace, mapping); hyper-
plane, k-flat.
• General position: Degenerate configurations have measure zero in the
space of all configurations, provided that degeneracy can be described by
countably many polynomial equations.
• Convex set, hull, combination.
• Separation theorem: Disjoint convex sets can be separated by a hyper-
plane; strictly so if one of them is compact and the other closed.
• Theorems involving the dimension: Helly (if $\mathcal{F}$ is a finite family of convex
sets with empty intersection, then there is a subfamily of at most $d+1$ sets
with empty intersection), Radon ($d+2$ points can be partitioned into two
subsets with intersecting convex hulls), Carathéodory (if $x \in \mathrm{conv}(X)$,
then $x \in \mathrm{conv}(Y)$ for some at most $(d+1)$-point $Y \subseteq X$).
• Centerpoint of $X$: Every half-space containing it contains at least $\frac{1}{d+1}$
of $X$. It always exists by Helly. Ham-sandwich: Any $d$ mass distributions
in $\mathbf{R}^d$ can be simultaneously bisected by a hyperplane.
Chapter 2
• Minkowski's theorem: A 0-symmetric convex body of volume larger than
$2^d$ contains a nonzero integer point.
• General lattice: a discrete subgroup of $(\mathbf{R}^d, +)$. It can be written as the
set of all integer linear combinations of at most $d$ linearly independent
vectors (basis). Determinant = volume of the parallelotope spanned by
a basis.
• Minkowski for general lattices: Map the lattice onto $\mathbf{Z}^d$ by a linear
mapping.
Chapter 3
• Erdős-Szekeres theorem: Every sufficiently large set in the plane in
general position contains $k$ points in convex position. How large? Exponential
in $k$.
• What about $k$-holes (vertex sets of empty convex $k$-gons)? For $k = 5$
yes (in sufficiently large sets), for $k \ge 7$ no (Horton sets), $k = 6$ is a
challenging open problem.
Chapter 4
• Szemerédi-Trotter theorem: $m$ distinct points and $n$ distinct lines in the
plane have at most $O(m^{2/3}n^{2/3} + m + n)$ incidences.
• This is tight in the worst case. Example for $m = n$: Use the $k \times 4k^2$ grid
and lines $y = ax + b$ with $a = 0, 1, \ldots, 2k-1$ and $b = 0, 1, \ldots, 2k^2-1$.
• Crossing number theorem: A simple graph with $n$ vertices and $m \ge 4n$
edges needs $\Omega(m^3/n^2)$ crossings. Proof: At least $m-3n$ crossings, since
planar graphs have fewer than $3n$ edges, then random sampling.
• Forbidden bipartite subgraphs: A graph on $n$ vertices without $K_{r,s}$ has
$O(n^{2-1/r})$ edges.
• Cutting lemma: Given $n$ lines and $r$, the plane can be subdivided into
$O(r^2)$ generalized triangles such that the interior of each triangle is
intersected by at most $\frac{n}{r}$ lines. Proof of a weaker version: Triangulate the
arrangement of a random sample and show that triangles intersected by
many lines won't survive. Application: geometric divide-and-conquer.
• For unit distances and distinct distances in the plane, bounds can be
proved, but a final answer seems to be far away.
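The grid example from the Chapter 4 summary can be verified by direct counting. The sketch below assumes the grid consists of the integer points (x, y) with x in {0, ..., k-1} and y in {0, ..., 4k^2 - 1}; this coordinate convention and the function name are assumptions of the example.

```python
def grid_incidences(k):
    """Count points, lines, and incidences for the k x 4k^2 grid and the lines y = a*x + b."""
    points = {(x, y) for x in range(k) for y in range(4 * k * k)}
    lines = [(a, b) for a in range(2 * k) for b in range(2 * k * k)]
    incidences = sum(1 for (a, b) in lines
                     for x in range(k) if (x, a * x + b) in points)
    return len(points), len(lines), incidences
```

There are n = 4k^3 points and equally many lines, and every line passes through exactly k grid points, which gives 4k^4 incidences, on the order of n^(4/3).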
Chapter 5
• Geometric duality: Sends a point $a$ to the hyperplane $\langle a, x\rangle = 1$ and vice
versa; preserves incidences and sidedness.
• Convex polytope: the convex hull of a finite set and also the intersection
of finitely many half-spaces.
• Face, vertex, edge, facet, ridge. A polytope is the convex hull of its
vertices. A face of a face is a face. Face lattice. Duality turns it upside down.
Simplex. Simple and simplicial polytopes.
• The convex hull of $n$ points in $\mathbf{R}^d$ can have as many as $\Omega(n^{\lfloor d/2\rfloor})$ facets;
cyclic polytopes.
• This is as bad as it can get: Given the number of vertices, cyclic polytopes
maximize the number of faces in each dimension (upper bound theorem).
• Gale transform: An $n$-point sequence in $\mathbf{R}^d$ (affinely spanning $\mathbf{R}^d$) is
mapped to a sequence of $n$ vectors in $\mathbf{R}^{n-d-1}$. Properties: a simple linear
algebra. Faces of the convex hull go to subsets whose complement contains
0 in the convex hull.
Chapter 6
• Arrangement of hyperplanes (faces, vertices, edges, facets, cells). For $d$
fixed, there are $O(n^d)$ faces.
• Clarkson's theorem on levels: At most $O(n^{\lfloor d/2\rfloor}k^{\lceil d/2\rceil})$ vertices are at
level at most $k$. Proof: Express the expected number of level-0 vertices
of a random sample in two ways!
• Zone theorem: The zone of a hyperplane has $O(n^{d-1})$ vertices. Proof:
Delete a random hyperplane, and look at how many zone faces are sliced
into two by adding it back.
• Proof of the cutting lemma by a finer sampling argument: Vertically
decompose the arrangement of a sample taken with probability $p$, show
that the number of trapezoids intersected by at least $tnp$ lines decreases
exponentially with $t$, take $\frac{1}{t}$-cuttings within the trapezoids.
• Canonical triangulation, cutting lemma in $\mathbf{R}^d$ ($O(r^d)$ simplices).
• Milnor-Thom theorem: The arrangement of the zero sets of $n$ polynomials
of degree at most $D$ in $d$ real variables has at most $O(Dn/d)^d$ faces.
• Most arrangements of pseudolines are nonstretchable (by Milnor-Thom).
Similarly for many other combinatorial descriptions of geometric config-
urations; usually most of them cannot be realized.
Chapter 7
• Davenport-Schinzel sequences of order $s$ (no $abab\ldots$ with $s+2$ letters);
maximum length $\lambda_s(n)$. Correspond to lower envelopes of curves: The
curves are graphs of functions defined everywhere, every two intersecting
at most $s$ times. Lower envelopes of segments yield DS sequences of
order 3.
• $\lambda_3(n) = \Theta(n\alpha(n))$; $\lambda_s(n)$ is almost linear for every fixed $s$.
• The lower envelope of $n$ algebraic surface patches in $\mathbf{R}^d$, as well as a
single cell in their arrangement, have complexity $O(n^{d-1+\varepsilon})$. Charging
schemes and more random sampling.
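The definition of a Davenport-Schinzel sequence in the Chapter 7 summary translates directly into a checker: no two adjacent symbols are equal, and no two distinct symbols alternate in a (not necessarily contiguous) subsequence of s + 2 letters. The sketch below also finds the maximum length by exhaustive search for very small alphabets; the function names and the search limit parameter are assumptions of this example.

```python
from itertools import product

def has_alternation(seq, a, b, length):
    """True if seq contains a subsequence a, b, a, b, ... with `length` letters."""
    want, found = a, 0
    for x in seq:
        if x == want:          # greedy matching is optimal for subsequences
            found += 1
            if found == length:
                return True
            want = b if want == a else a
    return False

def is_ds_sequence(seq, s):
    """Davenport-Schinzel sequence of order s over arbitrary symbols."""
    if any(x == y for x, y in zip(seq, seq[1:])):
        return False
    symbols = set(seq)
    return not any(has_alternation(seq, a, b, s + 2)
                   for a in symbols for b in symbols if a != b)

def max_ds_length(n, s, limit):
    """Longest DS sequence of order s over n symbols, trying all lengths up to `limit`."""
    best = 0
    for length in range(1, limit + 1):
        if any(is_ds_sequence(seq, s) for seq in product(range(n), repeat=length)):
            best = length
    return best
```

For orders 1 and 2 the exhaustive search reproduces the well-known values n and 2n - 1; the almost-linear behavior for higher orders is far out of reach of such a brute-force search.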
Chapter 8
• Fractional Helly theorem: If a family of $n$ convex sets has $\alpha\binom{n}{d+1}$
intersecting $(d+1)$-tuples, then there is a point common to at least $\frac{\alpha}{d+1}n$ of
the sets.
• Colored Carathéodory theorem: If each of $d+1$ sets contains 0 in the
convex hull, then we can pick one point from each set so that the convex
hull of the picked points contains 0.
• Tverberg's theorem: $(d+1)(r-1)+1$ points can be partitioned into $r$
subsets with intersecting convex hulls (the number is the smallest conceivable
one: $r-1$ simplices plus one extra point).
• Colored Tverberg theorem: Given points partitioned into $d+1$ color
classes by $t$ points each, we can choose $r$ disjoint rainbow subsets with
intersecting convex hulls, $t = t(d, r)$. Only topological proofs are known.
Chapter 9
• The dimension is considered fixed in this chapter. First selection lemma:
Given $n$ points, there exists a point contained in a fixed fraction of all
simplices with vertices in the given points.
• Second selection lemma: If $\alpha\binom{n}{d+1}$ of the simplices are marked, we can
find a point in many of the marked simplices (at least $\Omega(\alpha^{s_d}\binom{n}{d+1})$).
Needs colored Tverberg and Erdős-Simonovits.
• Order type. Same-type lemma: Given $n$ points in general position and $k$
fixed, one can find $k$ disjoint subsets of size $\Omega(n)$, all of whose transversals
have the same order type.
• A hypergraph regularity lemma: For an $\varepsilon > 0$ and a $k$-partite hypergraph
of density bounded below by a constant $\beta > 0$ and with color classes
$X_1, \ldots, X_k$ of size $n$, we can choose subsets $Y_1 \subseteq X_1, \ldots, Y_k \subseteq X_k$, $|Y_1| =
\cdots = |Y_k| \ge cn$, $c = c(k, \beta, \varepsilon) > 0$, such that any $Z_1 \subseteq Y_1, \ldots, Z_k \subseteq Y_k$
with $|Z_i| \ge \varepsilon|Y_i|$ induce some edge.
• Positive-fraction selection lemma: Given $n$ red, $n$ white, and $n$ blue points
in the plane, we can choose $\frac{n}{12}$ points of each color so that all
red-white-blue triangles have a common point; similarly in $\mathbf{R}^d$.
Chapter 10
• Set systems; transversal number $\tau$, packing number $\nu$. Fractional
transversal and fractional packing; $\nu^* = \tau^*$ by LP duality.
• Epsilon net, shattered set, VC-dimension. Shatter function lemma: A set
system on $n$ points with VC-dimension $d$ has at most $\sum_{k=0}^{d}\binom{n}{k}$ sets.
• Epsilon net theorem: A random sample of $C\frac{d}{\varepsilon}\log\frac{1}{\varepsilon}$ points in a set system
of VC-dimension $d$ is an $\varepsilon$-net with high probability. In particular, $\varepsilon$-nets
exist of size depending only on $d$ and $\varepsilon$.
• Corollary: $\tau = O(\tau^*\log\tau^*)$ for bounded VC-dimension.
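The shatter function lemma from the Chapter 10 summary can be checked on a small example. The sketch below computes the VC-dimension of a set system by brute force and compares the number of sets with the bound given by the sum of binomial coefficients; the system of all intervals of a 5-point linear order (with the empty set included) is an illustrative choice, and the function names are assumptions of this example.

```python
from itertools import combinations
from math import comb

def is_shattered(sets, subset):
    """True if every subset of `subset` arises as an intersection with some set."""
    traces = {frozenset(s & subset) for s in sets}
    return len(traces) == 2 ** len(subset)

def vc_dimension(ground, sets):
    """Largest size of a shattered subset of the ground set (brute force)."""
    dim = 0
    for size in range(1, len(ground) + 1):
        if any(is_shattered(sets, frozenset(c)) for c in combinations(ground, size)):
            dim = size
        else:
            break   # subsets of shattered sets are shattered, so we can stop here
    return dim

def shatter_bound(n, d):
    """Sum of binomial coefficients C(n, 0) + ... + C(n, d)."""
    return sum(comb(n, i) for i in range(d + 1))

n = 5
ground = range(n)
intervals = {frozenset(range(i, j)) for i in range(n + 1) for j in range(i, n + 1)}
```

Intervals have VC-dimension 2 (no three points can be shattered, since the two outer points cannot be cut out without the middle one), and in this example the bound is attained with equality.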
Chapter 11
• $k$-sets, $k$-facets (only for sets in general position!), halving facets. Dual:
cells of level $k$, vertices of level $k$. The $k$-set problem is still unsolved.
Straightforward bounds from Clarkson's theorem on levels.
• Bounds for halving facets yield bounds for $k$-facets sensitive to $k$.
• A recursive planar construction with a superlinear number of halving
edges.
• Lovász lemma: No line intersects more than $O(n^{d-1})$ halving facets.
Proof: When a moving line crosses the convex hull of $d-1$ points of $X$,
the number of halving facets intersected changes by 1 (halving-facet
interleaving lemma).
• Implies an upper bound of $O(n^{d-\delta(d)})$ for halving facets by the second
selection lemma.
• In the plane a continuous motion argument proves that the crossing
number of the halving-edge graph is $O(n^2)$, and consequently, it has $O(n^{4/3})$
edges by the crossing number theorem. This is the best we can do in the
plane, although $O(n^{1+\varepsilon})$ for every fixed $\varepsilon > 0$ is suspected.
Chapter 12
• Perfect graph ($\chi = \omega$ hereditarily). Weak perfect graph conjecture (now
theorem): A graph is perfect iff its complement is.
• Proof via the polytope $\{x \in \mathbf{R}^V : x \ge 0,\ x(K) \le 1 \text{ for every clique } K\}$.
• Brunn's slice volume inequality: For a compact convex $C \subset \mathbf{R}^{n+1}$,
$\mathrm{vol}_n(\{x \in C : x_1 = t\})^{1/n}$ is a concave function of $t$ (as long as the
slices do not miss the body).
whose average heights differ by less than 1.
• Order polytope: $0 \le x \le 1$, $x_a \le x_b$ whenever $a \preceq b$. Linear extensions
correspond to congruent simplices and good comparison to dividing the
volume evenly by a hyperplane $x_a = x_b$. The best ratio is not known
(conjectured to be $\frac{1}{3} : \frac{2}{3}$).
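The comparison ratio mentioned in the order polytope item can be explored on small posets by listing all linear extensions and, for each pair of elements, computing the fraction of extensions that put the first element before the second. The following Python sketch does this by brute force; the function names and the input format (a list of elements and a list of pairs a before b) are assumptions of this example.

```python
from fractions import Fraction
from itertools import permutations

def linear_extensions(elements, relations):
    """All orderings of `elements` with a before b for every pair (a, b) in `relations`."""
    result = []
    for perm in permutations(elements):
        pos = {x: i for i, x in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in relations):
            result.append(perm)
    return result

def most_balanced_comparison(elements, relations):
    """Return (balance, a, b) for the pair whose comparison splits the extensions most evenly."""
    exts = linear_extensions(elements, relations)
    best = None
    for i, a in enumerate(elements):
        for b in elements[i + 1:]:
            before = sum(1 for e in exts if e.index(a) < e.index(b))
            frac = Fraction(before, len(exts))
            balance = min(frac, 1 - frac)
            if best is None or balance > best[0]:
                best = (balance, a, b)
    return best

elements = ["a", "b", "c"]
relations = [("a", "b")]   # a single relation a < b; c is incomparable to both
```

This three-element poset has three linear extensions, and the best available comparison splits them 1 : 2, which is the extremal example for the conjectured ratio.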
Chapter 13
• Volumes and other things in high dimensions behave differently from
what we know in $\mathbf{R}^2$ and $\mathbf{R}^3$. For example, the ball inscribed in the unit
cube has a tiny volume.
• An $\eta$-net is an inclusion-maximal $\eta$-separated set. It is mainly useful
because it is $\eta$-dense. In $S^{n-1}$, a simple volume argument yields $\eta$-nets
of size at most $(4/\eta)^n$.
• An $N$-vertex convex polytope inscribed in the unit ball $B^n$ occupies at
most $O(\ln(\frac{N}{n}+1)/n)^{n/2}$ of the volume of $B^n$. Thus, with polynomially
many vertices, the error of deterministic volume approximation is
exponential in the worst case.
• Polytopes with such volume can be constructed: For $N = 2n$ use the
crosspolytope, for $N = 4^n$ a 1-net in the dual $S^{n-1}$, and interpolate
using a product.
• Ellipsoid: an affine image of $B^n$. John's lemma: Every $n$-dimensional
convex body has inner and outer ellipsoids with ratio at most $n$, and
a symmetric convex body admits the better ratio $\sqrt{n}$. The
maximum-volume inscribed ellipsoid (which is unique) will do as the inner ellipsoid.
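The first item of the Chapter 13 summary, that the ball inscribed in the unit cube has a tiny volume in high dimension, is easy to quantify with the standard formula for the volume of an n-dimensional ball. The function name below is an assumption of this sketch.

```python
import math

def inscribed_ball_volume(n):
    """Volume of the ball of radius 1/2 inscribed in the unit cube [0, 1]^n."""
    unit_ball = math.pi ** (n / 2) / math.gamma(n / 2 + 1)
    return unit_ball * 0.5 ** n

# Fraction of the unit cube occupied by the inscribed ball in a few dimensions.
volumes = {n: inscribed_ball_volume(n) for n in (2, 3, 10, 20)}
```

In the plane the ball takes up pi/4 of the square, in dimension 3 it takes up pi/6 of the cube, and already in dimension 20 the fraction is below one ten-millionth.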
Chapter 14
• Measure concentration on $S^{n-1}$: For any set $A$ occupying half of the
sphere, almost all of $S^{n-1}$ is at most $O(n^{-1/2})$ away from $A$. Quantitatively,
$1 - P[A_t] \le 2e^{-t^2n/2}$.
• Similar concentration phenomena in many other high-dimensional spaces:
Gaussian measure on $\mathbf{R}^n$, cube $\{0, 1\}^n$, permutations, etc.
• Many concentration inequalities can be proved via isoperimetric
inequalities. Isoperimetric inequality: Among all sets of given volume, the ball
has the smallest volume of a $t$-neighborhood.
• Lévy's lemma: A 1-Lipschitz function $f$ on $S^{n-1}$ is within $O(n^{-1/2})$ of
its median on most of $S^{n-1}$.
• Consequently (using $\eta$-nets), there is a high-dimensional subspace on
which $f$ is almost constant (use a random subspace).
Chapter 15
• Metric space; the distortion of a mapping between two metric spaces,
$D$-embedding. Spaces $\ell_p^d$ and $\ell_p$.
• Flattening lemma: Any $n$-point Euclidean metric space can be
$(1+\varepsilon)$-embedded into $\ell_2^k$, $k = O(\varepsilon^{-2}\log n)$ (project on a random $k$-dimensional
subspace).
• Lower bound for $D$-embedding into a $d$-dimensional normed space:
counting; take all subgraphs of a graph without short cycles and with many
edges.
• The $m$-dimensional Hamming cube needs $\sqrt{m}$ distortion for embedding
into $\ell_2$ (short diagonals and induction).
• Edge expansion (conductance), second eigenvalue of the Laplacian
matrix. Constant-degree expanders need $\Omega(\log n)$ distortion for embedding
into $\ell_2$ (tight). Method: Compare sums of squared distances over the
edges and over all pairs, in the graph and in the target space.
• $D$-embeddability into $\ell_2$ is polynomial-time decidable by semidefinite
programming.
• All $n$-point spaces embed isometrically into $\ell_\infty^n$. For embeddings with
smaller dimension, use distances to random subsets of suitable density
as coordinates. A similar method yields $O(\log n)$-embedding into $\ell_2$ (or
any other $\ell_p$).
• Example of algorithmic application: approximating the sparsest cut.
Embed the graph metric into $\ell_1$ with low distortion; this yields a cut
pseudometric defining a sparse cut.
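The isometric embedding of an n-point space into n-dimensional space with the maximum norm, mentioned in the Chapter 15 summary, can be realized by using the distances to all n points as coordinates. The Python sketch below checks this on a small example; the distance-matrix input and the function names are assumptions of this example.

```python
def distance_vector_embedding(dist):
    """Map point i to the vector of its distances to all n points."""
    return [list(row) for row in dist]

def linf(u, v):
    """Distance in the maximum norm."""
    return max(abs(a - b) for a, b in zip(u, v))

def is_isometric(dist):
    """Check that the maximum-norm distances of the images equal the original distances."""
    image = distance_vector_embedding(dist)
    n = len(dist)
    return all(abs(linf(image[i], image[j]) - dist[i][j]) < 1e-12
               for i in range(n) for j in range(n))

# Shortest-path metric of the 4-cycle 0-1-2-3-0.
cycle_metric = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
```

By the triangle inequality no coordinate can differ by more than the original distance, and the coordinate indexed by one of the two points attains it, which is why the check succeeds for every metric.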
Hints to Selected Exercises
Consider a graph drawing with P as the vertices and the polygonal curves
defining edges.
4.3.4(c). Consider a drawing of G witnessing pair-cr(G) = k. At most 2k
edges are involved in any crossings, and the remaining ones (the good edges)
form a planar graph. Redraw the edges with crossings so that they do not
intersect any of the good edges and, subject to this, have the minimum pos-
sible number of crossings.
4.4.1(a). $O(n^{10/7}) = O(n^{1.43})$.
4.4.1(b). Let $C_i$ be the points of $C$ that are the centers of at least $2^i$ and
at most $2^{i+1}$ circles. We have $|C_i| = q_i \le n/2^i$. One incidence of a line of the
form $\ell_{uv}$ with a $c \in C_i$ contributes at most $2^{i+2}$ edges.
4.4.2(b). Look at u,v with /1(U, v) 2: 4y1(:4, and suppose that at least half
of the uv edges have their partner edges adjacent to u, say. These partner
edges connect u to at least 2y1(:4 distinct neighbor vertices. By (a), at most
yI(:4/2 of these partner edges may belong to E h .
4.4.2(c). We get $|E| = O(|E \setminus E_h|) = O(n^{4/3}d_i^{1/6})$; at the same time, $|E| \ge
nd_i/2$. This gives $d_i = O(n^{2/5})$ and $I_{\mathrm{circ}}(n, n) = O(n^{7/5}) = O(n^{1.4})$.
4.7.1. Consider a trapezoid $ABB'A'$; $AB$ is the bottom side and $A'B'$ the
top side. Suppose $AB$ is contained in an edge $CD$ of $P_j$ and $A'B'$ is an edge
of $P_{j+1}$ (the few other possible cases are discussed similarly). Let $A_1$ be the
intersection of the level $qj + i$ with the vertical line $AA'$, and similarly for $B_1$.
The segments $A'B'$, $A'A_1$, and $B'B_1$ each have at most $q+1$ intersections.
Observe that if $AA_1$ has some $a$ intersections, then $CA$ also has at least $a$
intersections, and similarly for $BB_1$ and $BD$. At the same time $CD$ has at
most $q+1$ intersections altogether. Therefore, $AA_1$, $AB$, and $BB_1$ have no
more than $q+1$ intersections in total.
5.1.9(b). Geometric duality and Helly's theorem.
5.1.9(c). The first segment $s_1$ is a chord of the unit circle passing near the
center. Each $s_{i+1}$ has one endpoint on the unit circle, and the other endpoint
almost touches $s_i$ near the center.
5.3.2. Ask in this way: Given a normal vector $a \in \mathbf{R}^d$ of a hyperplane, which
vertices maximize the linear function $x \mapsto \langle a, x\rangle$? For example, for the cube,
if $a_i > 0$, then $x_i$ has to be $+1$; if $a_i < 0$, then $x_i = -1$; and for $a_i = 0$ both
$x_i = \pm 1$ are possible.
5.3.8. If the removed vertices $u, v$ lie in a common 2-face $f$, let $h$ be the plane
defining $f$; from each vertex there is an edge going "away from $h$," except for
the vertices of a single face $g \ne f$ "opposite" to $f$. The graph of the face $g$
is connected and can be reached from any other vertex. If $u, v$ do not share
a 2-face, pass a plane $h$ through them and one more vertex $w$. The subgraph
on the vertices below $h$ is connected, and so is the subgraph on the vertices
above $h$; they are connected via the vertex $w$.
5.4.2. Do not forget to check that $\beta$ is not contained in any hyperplane.
8.1.2. Make the sets compact as in the proof of the fractional Helly theorem.
Consider all $d$-element collections $\mathcal{K}$ containing one set from each $\mathcal{C}_i$ but one,
and let $v_{\mathcal{K}}$ be the lexicographic minimum of the intersection $\bigcap\mathcal{K}$. Let $\mathcal{K}_0$
be such that $v = v_{\mathcal{K}_0}$ is the lexicographically largest among all $v_{\mathcal{K}}$, and let
$i_0$ be the index such that $\mathcal{K}_0$ contains no set from $\mathcal{C}_{i_0}$. Show that for each
$G \in \mathcal{C}_{i_0}$, $v$ is the minimum of $G \cap \bigcap\mathcal{K}_0$, and in particular, $v \in G$.
8.2.1. Regard $B \cup T$ as a Gale transform of a point sequence and reformulate
the problem using that sequence. Or lift $B \cup T$ into $\mathbf{R}^{d+1}$ suitably.
9.2.2(b). For $d = 3$: Choose $k$ points on the moment curve, say, and replace
each by a cluster of $n/k$ points. Use all tetrahedra having two vertices in one
cluster and the other two vertices in another cluster. There are about $n^4/k^2$
such tetrahedra, and no point is contained in more than $n^4/k^4$ of them if the
clusters are small and $k$ is not too large compared to $n$.
9.3.1(b). Be careful with degenerate cases; first determine the dimension of
the affine hull of $p_1, \ldots, p_{d+1}$ and test whether $p_{d+2}$ lies in it. Then you may
need to use some number of other affinely independent points among the $p_i$.
9.3.3(a). Let $x_i, x_i' \in X_i$ be such that $(x_1, \ldots, x_{d+1})$ and $(x_1', \ldots, x_{d+1}')$
have different orientations. Let $y_i$ be a point moving along the segment $x_ix_i'$
at constant speed, starting at $x_i$ at time 0 and reaching $x_i'$ at time 1. By
continuity of the determinant, all the $y_i$ lie in a common hyperplane at some
moment, and this hyperplane intersects the convex hulls of all the $X_i$.
9.3.3(b). Let the hyperplane $h$ intersect all the $G_i$, and let $a_i \in h \cap G_i$. Use
Radon's lemma.
9.3.3(c). Suppose that $0 \in \mathrm{conv}(\bigcup_{i\in I} G_i) \cap \mathrm{conv}(\bigcup_{j\notin I} G_j)$. Then there are
points $x_i \in G_i$, $i = 1,2,\ldots,d+1$, such that $0 \in \mathrm{conv}\{x_i : i \in I\}$ and $0 \in
\mathrm{conv}\{x_j : j \notin I\}$. Hence the vectors $\{x_i : i \in I\}$ are linearly dependent, as well
as those of $\{x_j : j \notin I\}$. Thus, the linear subspace generated by all the $x_i$ has
dimension at most $d-1$.
9.3.5(a). Partition $P$ into 3 sets and apply the same-type lemma. If $Y_1, Y_2, Y_3$
are the resulting sets, then each line misses at least one $\mathrm{conv}(Y_i)$. Let $P'$ be
the $Y_i$ whose convex hull is missed by the largest number of lines of $L$.
9.3.5(b). First apply (a) with P consisting of the left endpoints of the seg-
ments of B. Then apply (a) again with the right endpoints of the remaining
segments and the remaining lines. Finally, discard either the lines intersected
by all segments or those intersected by no segment.
9.3.5(c). Use (b) twice.
9.4.4. Consider the complete bipartite graphs with classes $V_i$ and $V_j$, $1 \le
i < j \le 4$, and color each of their edges randomly either red or blue with
equal probability. A triple $\{u, v, w\}$ with $u \in V_i$, $v \in V_j$, $w \in V_k$, $i < j < k$,
is present if and only if the edges $\{u, v\}$ and $\{u, w\}$ have distinct colors.
10.1.3. Choose the appropriate number of points independently at random
according to the distribution given by an optimal fractional transversal.
10.1.4(a). Let $m_k$ be the number of yet uncovered sets after the last step
$i$ such that $x_i$ covered more than $k$ previously uncovered sets ($m_d = |\mathcal{F}|$,
$m_0 = 0$). Derive $t \le \sum_{k=1}^{d}\frac{m_k - m_{k-1}}{k}$ and note that $m_k \le \nu_k(\mathcal{F})$.
10.1.6(b). By the Farkas lemma, it suffices to check the following: For all
$u \in \mathbf{R}^m$, $v \in \mathbf{R}^n$, and $z \in \mathbf{R}$ such that $u \ge 0$, $v \ge 0$, $z \ge 0$, $u^TA \le zc^T$, and
$Av \ge zb$, we have $u^Tb \le c^Tv$. For $z \ne 0$ this is (a), and for $z = 0$ choose
$x_0 \in P$ and $y_0 \in D$ and use $u^Tb \le u^TAx_0 \le 0$ and $c^Tv \ge y_0^TAv \ge 0$.
10.2.2. All subsets of size at most d.
10.3.1. 7.
10.3.3. Such a $p$ would have to be 0 on the boundary, but if a polynomial is
0 on a segment, then it is 0 on the whole line containing that segment.
1O.3.4(b). Choose a ~-net S S;; L for the set system (L, T) and triangulate
the arrangement of S. No dangerous triangle appears in this triangulation.
10.3.6(c). The shattering graph SCd considered in Exercise 5 contains a
subdivision of Kd where each edge is subdivided once. Some care is needed,
since some vertices might be both shattering and shattered in C.
10.4.1(b). This method gives size $O(\varepsilon^{-2^{d-1}})$.
10.4.2(b). (a) yields $f(\varepsilon) \le m + \ell f(\ell\varepsilon/3)$; set $\ell = 3/\sqrt{\varepsilon}$. The exponent of
$\log\frac{1}{\varepsilon}$ is $\log_2 3$.
10.4.3. We may assume that $\varepsilon$ is sufficiently small. Let $C$ be convex with
$|C \cap X| \ge \varepsilon n$. Then $C \cap X$ contains points $a, b, c$ such that the shortest of the
3 arcs determined by them, call it $G$, is at least $O(\varepsilon)$. Show that the triangle
$abc$ contains a point of $N_i$, where $i$ is the smallest with $\varepsilon(1.01)^i/10 > G$.
10.5.2. If $x$ is the last among the lexicographic minima of $d$-wise intersections
of $\mathcal{F}$, the family $\{F \in \mathcal{F} : x \notin F\}$ satisfies the $(p-d, q-d+1)$-condition.
10.5.3(b). By ham-sandwich, choose lines $\ell, \ell'$ with $|R_i \cap X| \le k+1$, where
$R_1, \ldots, R_4$ are the "quadrants" determined by $\ell$ and $\ell'$. The point $\ell \cap \ell'$ and
centerpoints of $R_i \cap X$ form a transversal.
10.6.1(a). No need to invoke the Alon-Kleitman machinery here.
10.6.1(b). Use Ramsey's theorem.
10.6.2(a). Count the incidences of endpoints with intervals (it can be
assumed that all the intervals have distinct endpoints). To get a better $\beta$, apply
Turán's theorem.
10.6.3. For $\mathcal{F} \subset \mathcal{K}_k^d$ finite, let $\mathcal{G} = \bigcup_{S\in\mathcal{F}}\{S_1, S_2, \ldots, S_k\}$, where $S = S_1 \cup
\cdots \cup S_k$ with the $S_i$ convex. If $\mathcal{F}$ has many intersecting $(d+1)$-tuples, then $\mathcal{G}$
has many intersecting $(d+1)$-tuples and so fractional Helly for $\mathcal{F}$, with worse
parameters, follows from that for $\mathcal{G}$.
10.6.4. Let C = f(d+1, d, k), where f(p, d, k) is as in Exercise 3, and h =
(d+1)C. Let F' be the family of all intersections of C-tuples of sets of F.
13.2.3. Fix the coordinate system so that $c = 0$ and $F$ lies in the coordinate
hyperplane $h = \{x_n = 0\}$. Since 0 is not the center of gravity, for some $i$
we have $I = \int_F x_i\,dx \ne 0$. Without loss of generality, $i = 1$ and $I > 0$. Let
$h_1$ be $h$ slightly rotated around the flat $\{x_1 = x_n = 0\}$; i.e., $h_1 = \{x \in
\mathbf{R}^n : \langle a, x\rangle = 0\}$ with $a = (\varepsilon, 0, \ldots, 0, 1)$. Let $S_1$ be the simplex determined
by the same facet hyperplanes as $S$ except that $h$ is replaced by $h_1$. The
difference $\mathrm{vol}(S) - \mathrm{vol}(S_1)$ is proportional to $\varepsilon I + O(\varepsilon^2)$ as $\varepsilon \to 0$. Let $h'$
be a parallel translation of $h_1$ that touches $B^n$ (near 0), and let $S'$ be the
corresponding simplex. Calculation shows that $|\mathrm{vol}(S_1) - \mathrm{vol}(S')| = O(\varepsilon^2)$.
13.2.5. The Thales theorem implies that if $x \notin B(\frac{1}{2}v, \frac{1}{2}\|v\|)$, then $v$ lies in
the open half-space $\gamma_x$ containing 0 and bounded by the hyperplane passing
through $x$ and perpendicular to $0x$.
13.3.1(b). Geometric duality and Theorem 13.2.1.
13.4.4(b). Helly's theorem for suitable sets in $\mathbf{R}^{n+1}$.
13.4.5(a). Since the ratio of areas is invariant under affine transforms, we
may assume that $P$ contains $B(0, 1)$ and is contained in $B(0, 2)$. Infer that
99% of the edges of $P$ have length $O(\frac{1}{n})$ and 99% of the angles are $\pi - O(\frac{1}{n})$.
Then there are two consecutive short edges with angle close to $\pi$.
14.1.4. Choose a radius $r$ such that the caps cut off from $rB^n$ by the
considered slabs together cover at most half of the surface of $rB^n$. Then
$\mathrm{vol}(K) \ge \mathrm{vol}(K \cap rB^n) \ge \frac{1}{2}r^n$.
14.6.1. Suppose that $\max_i |v_i| = |v_1|$. For any fixed choice of $\sigma_2, \ldots, \sigma_n$, use
$\frac{1}{2}(|x + y| + |x - y|) \ge |y|$ with $y = v_1$ and $x = \sum_{i=2}^{n}\sigma_iv_i$.
14.6.2. We need to bound n- l / 2EUIZlldIIZlll from below for Z as in
Lemma 14.6.4. Each IZil is at least a small constant f3 > 0 with proba-
bility at least ~; derive that IIZlll = !l(n) with probability at least ~.
15.2.3(b). Let $\lambda_1, \ldots, \lambda_n$ be the eigenvalues of $A$. The rank is the number
of nonzero $\lambda_i$. Estimate $\sum\lambda_i^2$ in two ways: First use the trace of $A^TA$, and
then the trace of $A$ and Cauchy-Schwarz.
15.2.3(d). If $v_1, \ldots, v_n \in \mathbf{R}^k$, then the matrix $A$ with $a_{ij} = \langle v_i, v_j\rangle$ has rank
at most $k$.
15.3.4(a). Let n = 2m+1 and let each n-tuple in V have the form
(0, el, e2,"" em, em+l + 10C:Wl, em+l + 10c:w2,.··, e2m + 10c:wm), where each
Wi is an 0/1 vector with l406e: 2 J ones among the first m positions and zeros
elsewhere.
15.4.2. Let $G_i = (V_i, E_i)$, where $V_0 \subset V_1 \subset \cdots \subset V_m$. For each $e \in E_{i-1}$, we
have a pair $\{u_e, v_e\}$ of new vertices in $G_i$ in the square that replaces $e$; let
$F_i = \{\{u_e, v_e\} : e \in E_{i-1}\}$. With notation as in the proof of Theorem 15.4.1,
put $E = E_m$ and $F = E_0 \cup \bigcup_{i=1}^{m}F_i$ and show that $R_{E,F}(\rho) = \sqrt{m+1}$, while
$R_{E,F}(\sigma) \le 1$. For the latter, sum up the inequalities $\sigma^2(F_i) + \sigma^2(E_{i-1}) \le
\sigma^2(E_i)$, $i = 1,2,\ldots,m$, obtained from the short diagonals lemma.
15.4.3. Color the pairs of points; the color of $\{x, y\}$ is the remainder of
$\lceil\log_{1+\varepsilon/2}\rho(x, y)\rceil$ modulo $r$, where $r$ is a sufficiently large integer. Show by
induction that a homogeneous set can be embedded satisfactorily.
15.5.2(b). By (a) and Carathéodory's theorem, every metric in $\mathcal{L}_1^{(\mathrm{fin})}$ is a
convex combination of at most $N + 1$ line metrics. To get rid of the extra $+1$,
use the fact that $\mathcal{L}_1^{(\mathrm{fin})}$ is a convex cone.
15.5.8(c). The expectation of $\frac{1}{2}(1 - x_ux_v)$ is the probability that the
hyperplane through 0 perpendicular to $r$ separates $y_u$ and $y_v$, and this equals $\frac{\vartheta}{\pi}$,
where $\vartheta \in [0, \pi]$ is the angle of $y_u$ and $y_v$. On the other hand, the contribution
of the edge $\{u, v\}$ to $M_{\mathrm{relax}}$ is $\frac{1}{2}(1 - \langle y_u, y_v\rangle) = (1 - \cos\vartheta)/2$. The constant
0.878... is the minimum of $\frac{\vartheta}{\pi}\cdot\frac{2}{1-\cos\vartheta}$, $0 \le \vartheta \le \pi$.
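The constant in the hint to 15.5.8(c) is the minimum, over angles between 0 and pi, of the separation probability (the angle divided by pi) divided by the edge contribution (one minus the cosine of the angle, halved). It can be approximated by a plain grid search, as in the following Python sketch; the function names and the number of grid steps are assumptions of this example.

```python
import math

def separation_ratio(theta):
    """(theta / pi) divided by (1 - cos(theta)) / 2."""
    return (theta / math.pi) * 2.0 / (1.0 - math.cos(theta))

def approximate_minimum(steps=100000):
    """Grid search over (0, pi]; theta = 0 is skipped because the ratio is 0/0 there."""
    return min(separation_ratio(math.pi * i / steps) for i in range(1, steps + 1))

constant = approximate_minimum()
```

The ratio tends to infinity as the angle approaches 0 and equals 1 at the angle pi, so the minimum is attained strictly inside the interval.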
15.7.5(c). Suppose that there is a $D$-embedding $f$ of $T_{k,m}$. For every leaf $\ell$,
consider $f$ restricted to the path $p(\ell)$ from the root to $\ell$, fix a triple $\{a_\ell, b_\ell, c_\ell\}$
of vertices as in Exercise 4 (a scaled copy of $P_2$), and label the corresponding
leaf by the distances of $a_\ell, b_\ell, c_\ell$ from the root. Using (b), choose a $T_{2,m}$
subtree where all leaves have the same labels, consider leaves $\ell$ and $\ell'$ of this
subtree such that $p(\ell)$ and $p(\ell')$ first meet at $b_\ell = b_{\ell'}$, and use (a) with
$x_0 = f(a_\ell)$, $x_1 = f(b_\ell)$, $x_2 = f(c_\ell)$, $x_2' = f(c_{\ell'})$.
15.7.6(a). Sum the parallelogram identities $\frac{1}{(c-a)^2} \left( \|(x_a - x_b) - (x_b - x_c)\|^2 +
\|(x_a - x_b) + (x_b - x_c)\|^2 \right) = \frac{2}{(c-a)^2} \left( \|x_a - x_b\|^2 + \|x_b - x_c\|^2 \right)$ over $(a, b, c) \in \Gamma$.
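As a sanity check of the identity being summed in 15.7.6(a), the sketch below verifies the unweighted parallelogram identity $\|u - v\|^2 + \|u + v\|^2 = 2(\|u\|^2 + \|v\|^2)$ with $u = x_a - x_b$ and $v = x_b - x_c$ in exact integer arithmetic. The three sample points and the helper names are hypothetical; dividing both sides by $(c-a)^2$ gives the weighted form.

```python
def sub(u, v):
    """Componentwise difference u - v."""
    return tuple(s - t for s, t in zip(u, v))


def add(u, v):
    """Componentwise sum u + v."""
    return tuple(s + t for s, t in zip(u, v))


def sq_norm(v):
    """Squared Euclidean norm of v."""
    return sum(t * t for t in v)


# Hypothetical sample points x_a, x_b, x_c in R^3 with integer coordinates.
x_a, x_b, x_c = (1, 4, -2), (0, 3, 5), (7, -1, 2)

u = sub(x_a, x_b)  # x_a - x_b
v = sub(x_b, x_c)  # x_b - x_c

lhs = sq_norm(sub(u, v)) + sq_norm(add(u, v))
rhs = 2 * (sq_norm(u) + sq_norm(v))
print(lhs, rhs)  # both sides agree
```

Note that $u + v = x_a - x_c$, so the second term on the left-hand side is the squared distance between the two outer points of the triple.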
Bibliography
[AHL01] N. Alon, S. Hoory, and N. Linial. The Moore bound for irregular
graphs. Graphs and Combinatorics, 2001. In press. (ref: p. 367)
[AI88] F. Aurenhammer and H. Imai. Geometric relations among
Voronoi diagrams. Geom. Dedicata, 27:65-75, 1988. (ref:
p. 121)
[Ajt98] M. Ajtai. Worst-case complexity, average-case complexity and
lattice problems. Documenta Math. J. DMV, Extra volume
ICM 1998, vol. III:421-428, 1998. (ref: p. 26)
[AK85] N. Alon and G. Kalai. A simple proof of the upper bound
theorem. European J. Combin., 6:211-214, 1985. (ref: p. 103)
[AK92] N. Alon and D. Kleitman. Piercing convex sets and the
Hadwiger-Debrunner (p,q)-problem. Adv. Math., 96(1):103-112,
1992. (ref: p. 258)
[AK95] N. Alon and G. Kalai. Bounding the piercing number. Discrete
Comput. Geom., 13:245-256, 1995. (ref: p. 261)
[AK00] F. Aurenhammer and R. Klein. Voronoi diagrams. In J.-R. Sack
and J. Urrutia, editors, Handbook of Computational Geometry,
pages 201-290. Elsevier Science Publishers B.V. North-Holland,
Amsterdam, 2000. (refs: pp. 120, 121)
[AKMM01] N. Alon, G. Kalai, J. Matoušek, and R. Meshulam. Transversal
numbers for hypergraphs arising in geometry. Adv. Appl. Math.,
2001. In press. (ref: p. 262)
[AKP89] N. Alon, M. Katchalski, and W. R. Pulleyblank. The maximum
size of a convex polygon in a restricted set of points in the plane.
Discrete Comput. Geom., 4:245-251, 1989. (ref: p. 33)
[AKPW95] N. Alon, R. M. Karp, D. Peleg, and D. West. A graph-theoretic
game and its application to the k-server problem. SIAM J.
Computing, 24(1):78-100, 1995. (ref: p. 398)
[AKV92] R. Adamec, M. Klazar, and P. Valtr. Generalized Davenport-
Schinzel sequences with linear upper bound. Discrete Math.,
108:219-229, 1992. (ref: p. 176)
[Alo] N. Alon. Covering a hypergraph of subgraphs. Discrete Math.
In press. (ref: p. 262)
[Alo86a] N. Alon. Eigenvalues and expanders. Combinatorica, 6:83-96,
1986. (ref: p. 381)
[Alo86b] N. Alon. The number of polytopes, configurations, and real
matroids. Mathematika, 33:62-71, 1986. (ref: p. 140)
[Bec83] J. Beck. On the lattice property of the plane and some
problems of Dirac, Motzkin and Erdős in combinatorial geometry.
Combinatorica, 3(3-4):281-297, 1983. (refs: pp. 45, 50)
[Ben66] C. T. Benson. Minimal regular graphs of girth eight and twelve.
Canad. J. Math., 18:1091-1094, 1966. (ref: p. 367)
[BEPY91] M. Bern, D. Eppstein, P. Plassman, and F. Yao. Horizon
theorems for lines and polygons. In J. Goodman, R. Pollack, and
W. Steiger, editors, Discrete and Computational Geometry:
Papers from the DIMACS Special Year, volume 6 of DIMACS
Series in Discrete Mathematics and Theoretical Computer
Science, pages 45-66. American Mathematical Society, Association
for Computing Machinery, Providence, RI, 1991. (ref: p. 151)
[Ber61] C. Berge. Färbungen von Graphen, deren sämtliche bzw.
deren ungerade Kreise starr sind (Zusammenfassung).
Wissenschaftliche Zeitschrift, Martin Luther Universität
Halle-Wittenberg, Math.-Naturwiss. Reihe, pages 114-115, 1961. (ref:
p. 293)
[Ber62] C. Berge. Sur une conjecture relative au problème des codes
optimaux. Communication, 13ème assemblée générale de l'URSI,
Tokyo, 1962. (ref: p. 293)
[BF84] E. Boros and Z. Füredi. The number of triangles covering the
center of an n-set. Geom. Dedicata, 17:69-77, 1984. (ref:
p. 210)
[BF87] I. Bárány and Z. Füredi. Computing the volume is difficult.
Discrete Comput. Geom., 2:319-326, 1987. (refs: pp. 320, 322,
324)
[BF88] I. Bárány and Z. Füredi. Approximation of the sphere by
polytopes having few vertices. Proc. Amer. Math. Soc., 102(3):651-659,
1988. (ref: p. 320)
[BFL90] I. Bárány, Z. Füredi, and L. Lovász. On the number of halving
planes. Combinatorica, 10:175-183, 1990. (refs: pp. 205, 215,
229, 269, 270, 280)
[BFM86] J. Bourgain, T. Figiel, and V. Milman. On Hilbertian subsets
of finite metric spaces. Israel J. Math., 55:147-152, 1986. (ref:
p. 373)
[BFT95] G. R. Brightwell, S. Felsner, and W. T. Trotter. Balancing pairs
and the cross product conjecture. Order, 12(4):327-349, 1995.
(ref: p. 308)
The index starts with notation composed of special symbols, and Greek let-
ters are listed next. Terms consisting of more than one word mostly appear
in several variants, for example, both "convex set" and "set, convex." An
entry like "armadillo, 19(8.4.1), 22(Ex. 4)" means that the term is located in
theorem (or definition, etc.) 8.4.1 on page 19 and in Exercise 4 on page 22.
For many terms, only the page with the term's definition is shown. Names or
notation used only within a single proof or remark are usually not indexed
at all. For important theorems, the index also points to the pages where they
are applied.