Lectures On Discrete Geometry, Jiri Matousek

Graduate Texts in Mathematics 212

Editorial Board
S. Axler F.W. Gehring K.A. Ribet

Springer
New York
Berlin
Heidelberg
Barcelona
Hong Kong
London
Milan
Paris
Singapore
Tokyo
Jiri Matousek

Lectures on
Discrete Geometry

With 206 Illustrations

Springer
Jiri Matousek
Department of Applied Mathematics
Charles University
Malostranske nam. 25
118 00 Praha 1
Czech Republic
matousek@kam.mff.cuni.cz

Editorial Board
S. Axler
Mathematics Department
San Francisco State University
San Francisco, CA 94132
USA
axler@sfsu.edu

F.W. Gehring
Mathematics Department
East Hall
University of Michigan
Ann Arbor, MI 48109
USA
fgehring@math.lsa.umich.edu

K.A. Ribet
Mathematics Department
University of California, Berkeley
Berkeley, CA 94720-3840
USA
ribet@math.berkeley.edu

Mathematics Subject Classification (2000): 52-01

Library of Congress Cataloging-in-Publication Data


Matousek, Jiri.
Lectures on discrete geometry / Jiri Matousek.
p. cm. - (Graduate texts in mathematics; 212)
Includes bibliographical references and index.
ISBN 978-0-387-95374-8 ISBN 978-1-4613-0039-7 (eBook)
DOI 10.1007/978-1-4613-0039-7
1. Convex geometry. 2. Combinatorial geometry. I. Title. II. Series.
QA639.5 .M37 2002
516--dc21 2001054915

Printed on acid-free paper.

© 2002 Springer-Verlag New York, Inc.


All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY
10010, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in
connection with any form of information storage and retrieval, electronic adaptation, computer
software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they
are not identified as such, is not to be taken as an expression of opinion as to whether or not they are
subject to proprietary rights.

Production managed by Michael Koy; manufacturing supervised by Jacqui Ashri.


Typesetting: Pages created by author using Springer TeX macro package.

9 8 7 6 5 4 3 2 1

Springer-Verlag New York Berlin Heidelberg


A member of BertelsmannSpringer Science+Business Media GmbH
Preface

The next several pages describe the goals and the main topics of this book.
Questions in discrete geometry typically involve finite sets of points, lines,
circles, planes, or other simple geometric objects. For example, one can ask,
what is the largest number of regions into which n lines can partition the
plane, or what is the minimum possible number of distinct distances occur-
ring among n points in the plane? (The former question is easy, the latter
one is hard.) More complicated objects are investigated, too, such as convex
polytopes or finite families of convex sets. The emphasis is on "combinato-
rial" properties: Which of the given objects intersect, or how many points
are needed to intersect all of them, and so on.
Many questions in discrete geometry are very natural and worth studying
for their own sake. Some of them, such as the structure of 3-dimensional
convex polytopes, go back to the antiquity, and many of them are motivated
by other areas of mathematics. To a working mathematician or computer
scientist, contemporary discrete geometry offers results and techniques of
great diversity, a useful enhancement of the "bag of tricks" for attacking
problems in her or his field. My experience in this respect comes mainly
from combinatorics and the design of efficient algorithms, where, as time
progresses, more and more of the first-rate results are proved by methods
drawn from seemingly distant areas of mathematics and where geometric
methods are among the most prominent.
The development of computational geometry and of geometric methods in
combinatorial optimization in the last 20-30 years has stimulated research in
discrete geometry a great deal and contributed new problems and motivation.
Parts of discrete geometry are indispensable as a foundation for any serious
study of these fields. I personally became involved in discrete geometry while
working on geometric algorithms, and the present book gradually grew out of
lecture notes initially focused on computational geometry. (In the meantime,
several books on computational geometry have appeared, and so I decided to
concentrate on the nonalgorithmic part.)
In order to explain the path chosen in this book for exploring its subject,
let me compare discrete geometry to an Alpine mountain range. Mountains
can be explored by bus tours, by walking, by serious climbing, by playing
in the local casino, and in many other ways. The book should provide safe
trails to a few peaks and lookout points (key results from various subfields
of discrete geometry). To some of them, convenient paths have been marked
in the literature, but for others, where only climbers' routes exist in research
papers, I tried to add some handrails, steps, and ropes at the critical places,
in the form of intuitive explanations, pictures, and concrete and elementary
proofs.¹ However, I do not know how to build cable cars in this landscape:
Reaching the higher peaks, the results traditionally considered difficult, still
needs substantial effort. I wish everyone a clear view of the beautiful ideas in
the area, and I hope that the trails of this book will help some readers climb
yet unconquered summits by their own research. (Here the shortcomings of
the Alpine analogy become clear: The range of discrete geometry is infinite
and no doubt, many discoveries lie ahead, while the Alps are a small spot on
the all too finite Earth.)
This book is primarily an introductory textbook. It does not require any
special background besides the usual undergraduate mathematics (linear al-
gebra, calculus, and a little of combinatorics, graph theory, and probability).
It should be accessible to early graduate students, although mastering the
more advanced proofs probably needs some mathematical maturity. The first
and main part of each section is intended for teaching in class. I have actually
taught most of the material, mainly in an advanced course in Prague whose
contents varied over the years, and a large part has also been presented by
students, based on my writing, in lectures at special seminars (Spring Schools
of Combinatorics). A short summary at the end of the book can be useful for
reviewing the covered material.
The book can also serve as a collection of surveys in several narrower
subfields of discrete geometry, where, as far as I know, no adequate recent
treatment is available. The sections are accompanied by remarks and biblio-
graphic notes. For well-established material, such as convex polytopes, these
parts usually refer to the original sources, point to modern treatments and
surveys, and present a sample of key results in the area. For the less well cov-
ered topics, I have aimed at surveying most of the important recent results.
For some of them, proof outlines are provided, which should convey the main
ideas and make it easy to fill in the details from the original source.
Topics. The material in the book can be divided into several groups:
• Foundations (Sections 1.1-1.3, 2.1, 5.1-5.4, 5.7, 6.1). Here truly basic
things are covered, suitable for any introductory course: linear and affine
subspaces, fundamentals of convex sets, Minkowski's theorem on lattice
points in convex bodies, duality, and the first steps in convex polytopes,
Voronoi diagrams, and hyperplane arrangements. The remaining sections
of Chapters 1, 2, and 5 go a little further in these topics.

1 I also wanted to invent fitting names for the important theorems, in order to
make them easier to remember. Only few of these names are in standard usage.
• Combinatorial complexity of geometric configurations (Chapters 4, 6, 7,
and 11). The problems studied here include line-point incidences,
complexity of arrangements and lower envelopes, Davenport-Schinzel
sequences, and the k-set problem. Powerful methods, mainly probabilistic,
developed in this area are explained step by step on concrete nontrivial
examples. Many of the questions were motivated by the analysis of
algorithms in computational geometry.
• Intersection patterns and transversals of convex sets. Chapters 8-10 con-
tain, among others, a proof of the celebrated (p, q)-theorem of Alon and
Kleitman, including all the tools used in it. This theorem gives a suffi-
cient condition guaranteeing that all sets in a given family of convex sets
can be intersected by a bounded (small) number of points. Such results
can be seen as far-reaching generalizations of the well-known Helly's
theorem. Some of the finest pieces of the weaponry of contemporary discrete
and computational geometry, such as the theory of the VC-dimension or
the regularity lemma, appear in these chapters.
• Geometric Ramsey theory (Chapters 3 and 9). Ramsey-type theorems
guarantee the existence of a certain "regular" subconfiguration in every
sufficiently large configuration; in our case we deal with geometric ob-
jects. One of the historically first results here is the theorem of Erdos
and Szekeres on convex independent subsets in every sufficiently large
point set.
• Polyhedral combinatorics and high-dimensional convexity (Chapters 12-
14). Two famous results are proved as a sample of polyhedral combina-
torics, one in graph theory (the weak perfect graph conjecture) and one in
theoretical computer science (on sorting with partial information). Then
the behavior of convex bodies in high dimensions is explored; the high-
lights include a theorem on the volume of an N-vertex convex polytope
in the unit ball (related to algorithmic hardness of volume approxima-
tion), measure concentration on the sphere, and Dvoretzky's theorem on
almost-spherical sections of convex bodies.
• Representing finite metric spaces by coordinates (Chapter 15). Given an
n-point metric space, we would like to visualize it or at least make it com-
putationally more tractable by placing the points in a Euclidean space,
in such a way that the Euclidean distances approximate the given dis-
tances in the finite metric space. We investigate the necessary error of
such approximation. Such results are of great interest in several areas;
for example, recently they have been used in approximation algorithms
in combinatorial optimization (multicommodity flows, VLSI layout, and
others).
These topics surely do not cover all of discrete geometry, which is a rather
vague term anyway. The selection is (necessarily) subjective, and naturally
I preferred areas that I knew better and/or had been working in. (Unfortu-
nately, I have had no access to supernatural opinions on proofs as a more
reliable guide.) Many interesting topics are neglected completely, such as the
wide area of packing and covering, where very accessible treatments exist,
or the celebrated negative solution by Kahn and Kalai of the Borsuk conjec-
ture, which I consider sufficiently popularized by now. Many more chapters
analogous to the fifteen of this book could be added, and each of the fifteen
chapters could be expanded into a thick volume. But the extent of the book,
as well as the time for its writing, are limited.
Exercises. The sections are complemented by exercises. The little framed
numbers indicate their difficulty: 1 is routine, 5 may need quite a bright
idea. Some of the exercises used to be a part of homework assignments in my
courses and the classification is based on some experience, but for others it
is just an unreliable subjective guess. Some of the exercises, especially those
conveying important results, are accompanied by hints given at the end of
the book.
Additional results that did not fit into the main text are often included as
exercises, which saves much space. However, this greatly enlarges the danger
of making false claims, so the reader who wants to use such information may
want to check it carefully.
Sources and further reading. A great inspiration for this book project
and the source of much material was the book Combinatorial Geometry of
Pach and Agarwal [PA95]. Too late did I become aware of the lecture notes by
Ball [Bal97] on modern convex geometry; had I known these earlier I would
probably have hesitated to write Chapters 13 and 14 on high-dimensional
convexity, as I would not dare to compete with this masterpiece of mathe-
matical exposition. Ziegler's book [Zie94] can be recommended for studying
convex polytopes. Many other sources are mentioned in the notes in each
chapter. For looking up information in discrete geometry, a good starting
point can be one of the several handbooks pertaining to the area: Handbook
of Convex Geometry [GW93], Handbook of Discrete and Computational
Geometry [GO97], Handbook of Computational Geometry [SU00], and (to some
extent) Handbook of Combinatorics [GGL95], with numerous valuable
surveys. Many of the important new results in the field keep appearing in the
journal Discrete and Computational Geometry.
Acknowledgments. For invaluable advice and/or very helpful comments on
preliminary versions of this book I would like to thank Micha Sharir, Gunter
M. Ziegler, Yuri Rabinovich, Pankaj K. Agarwal, Pavel Valtr, Martin Klazar,
Nati Linial, Gunter Rote, Janos Pach, Keith Ball, Uli Wagner, Imre Barany,
Eli Goodman, Gyorgy Elekes, Johannes Blomer, Eva Matouskova, Gil Kalai,
Joram Lindenstrauss, Emo Welzl, Komei Fukuda, Rephael Wenger, Piotr
Indyk, Sariel Har-Peled, Vojtech Rodl, Geza Toth, Karoly Boroczky Jr., Rados
Radoicic, Helena Nyklova, Vojtech Franek, Jakub Simek, Avner Magen,
Gregor Baudis, and Andreas Marwinski (I apologize if I forgot someone; my notes
are not perfect, not to speak of my memory). Their remarks and suggestions
allowed me to improve the manuscript considerably and to eliminate many of
the embarrassing mistakes. I thank David Kramer for a careful copy-editing
and finding many more mistakes (as well as offering me a glimpse into the
exotic realm of English punctuation). I also wish to thank everyone who par-
ticipated in creating the friendly and supportive environments in which I
have been working on the book.
Errors. If you find errors in the book, especially serious ones, I would
appreciate it if you would let me know (email: matousek@kam.mff.cuni.cz).
I plan to post a list of errors at http://www.ms.mff.cuni.cz/~matousek.

Prague, July 2001 Jiri Matousek


Contents

Preface v

Notation and Terminology xv

1 Convexity 1
1.1 Linear and Affine Subspaces, General Position ............. 1
1.2 Convex Sets, Convex Combinations, Separation. . . . . . . . . . . . 5
1.3 Radon's Lemma and Helly's Theorem ......................... 9
1.4 Centerpoint and Ham Sandwich. . . . . . . . . . . . . . . . . . . . . . . . .. 14

2 Lattices and Minkowski's Theorem 17


2.1 Minkowski's Theorem. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 17
2.2 General Lattices ....................................... 21
2.3 An Application in Number Theory. . . . . . . . . . . . . . . . . . . . . . .. 27

3 Convex Independent Subsets 29


3.1 The Erdos-Szekeres Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . .. 30
3.2 Horton Sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 34

4 Incidence Problems 41
4.1 Formulation........................................... 41
4.2 Lower Bounds: Incidences and Unit Distances. . . . . . . . . . . . .. 51
4.3 Point-Line Incidences via Crossing Numbers. . . . . . . . . . . . . .. 54
4.4 Distinct Distances via Crossing Numbers . . . . . . . . . . . . . . . . .. 59
4.5 Point-Line Incidences via Cuttings ....................... 64
4.6 A Weaker Cutting Lemma. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 70
4.7 The Cutting Lemma: A Tight Bound ..................... 73

5 Convex Polytopes 77
5.1 Geometric Duality. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 78
5.2 H-Polytopes and V-Polytopes. . . . . . . . . . . . . . . . . . . . . . . . . . .. 82
5.3 Faces of a Convex Polytope. . . . . . . . . . . . . . . . . . . . . . . . . . . . .. 86
5.4 Many Faces: The Cyclic Polytopes. . . . . . . . . . . . . . . . . . . . . . .. 96
5.5 The Upper Bound Theorem ............................. 100
5.6 The Gale Transform .................................... 107


5.7 Voronoi Diagrams ...................................... 115

6 Number of Faces in Arrangements 125


6.1 Arrangements of Hyperplanes ............................ 126
6.2 Arrangements of Other Geometric Objects ................. 130
6.3 Number of Vertices of Level at Most k .................... 140
6.4 The Zone Theorem ..................................... 146
6.5 The Cutting Lemma Revisited ........................... 152

7 Lower Envelopes 165


7.1 Segments and Davenport-Schinzel Sequences ............... 165
7.2 Segments: Superlinear Complexity of the Lower Envelope .... 169
7.3 More on Davenport-Schinzel Sequences ................... 173
7.4 Towards the Tight Upper Bound for Segments ............. 178
7.5 Up to Higher Dimension: Triangles in Space ............... 182
7.6 Curves in the Plane ..................................... 186
7.7 Algebraic Surface Patches ............................... 189

8 Intersection Patterns of Convex Sets 195


8.1 The Fractional Helly Theorem ........................... 195
8.2 The Colorful Caratheodory Theorem ...................... 198
8.3 Tverberg's Theorem .................................... 200

9 Geometric Selection Theorems 207


9.1 A Point in Many Simplices: The First Selection Lemma ..... 207
9.2 The Second Selection Lemma ............................ 210
9.3 Order Types and the Same-Type Lemma .................. 215
9.4 A Hypergraph Regularity Lemma ........................ 223
9.5 A Positive-Fraction Selection Lemma ..................... 228

10 Transversals and Epsilon Nets 231


10.1 General Preliminaries: Transversals and Matchings ......... 231
10.2 Epsilon Nets and VC-Dimension .......................... 237
10.3 Bounding the VC-Dimension and Applications ............. 243
10.4 Weak Epsilon Nets for Convex Sets ....................... 251
10.5 The Hadwiger-Debrunner (p, q)-Problem .................. 255
10.6 A (p, q)-Theorem for Hyperplane Transversals .............. 259

11 Attempts to Count k-Sets 265


11.1 Definitions and First Estimates ........................... 265
11.2 Sets with Many Halving Edges ........................... 273
11.3 The Lovasz Lemma and Upper Bounds in All Dimensions ... 277
11.4 A Better Upper Bound in the Plane ...................... 283
12 Two Applications of High-Dimensional Polytopes 289


12.1 The Weak Perfect Graph Conjecture ...................... 290
12.2 The Brunn-Minkowski Inequality ......................... 296
12.3 Sorting Partially Ordered Sets ........................... 302

13 Volumes in High Dimension 311


13.1 Volumes, Paradoxes of High Dimension, and Nets ........... 311
13.2 Hardness of Volume Approximation ....................... 315
13.3 Constructing Polytopes of Large Volume .................. 322
13.4 Approximating Convex Bodies by Ellipsoids ............... 324

14 Measure Concentration and Almost Spherical Sections 329


14.1 Measure Concentration on the Sphere ..................... 330
14.2 Isoperimetric Inequalities and More on Concentration ....... 333
14.3 Concentration of Lipschitz Functions ...................... 337
14.4 Almost Spherical Sections: The First Steps ................ 341
14.5 Many Faces of Symmetric Polytopes ...................... 347
14.6 Dvoretzky's Theorem ................................... 348

15 Embedding Finite Metric Spaces into Normed Spaces 355


15.1 Introduction: Approximate Embeddings ................... 355
15.2 The Johnson-Lindenstrauss Flattening Lemma ............. 358
15.3 Lower Bounds By Counting .............................. 362
15.4 A Lower Bound for the Hamming Cube ................... 369
15.5 A Tight Lower Bound via Expanders ...................... 373
15.6 Upper Bounds for $\ell_\infty$-Embeddings ........................ 385
15.7 Upper Bounds for Euclidean Embeddings .................. 389

What Was It About? An Informal Summary 401

Hints to Selected Exercises 409

Bibliography 417

Index 459
Notation and Terminology

This section summarizes rather standard things, and it is mainly for reference.
More special notions are introduced gradually throughout the book. In order
to facilitate independent reading of various parts, some of the definitions are
even repeated several times.
If $X$ is a set, $|X|$ denotes the number of elements (cardinality) of $X$. If $X$
is a multiset, in which some elements may be repeated, then $|X|$ counts each
element with its multiplicity.
The very slowly growing function $\log^* x$ is defined by $\log^* x = 0$ for $x \le 1$
and $\log^* x = 1 + \log^*(\log_2 x)$ for $x > 1$.
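To get a feeling for how slowly $\log^*$ grows, the recursive definition can be unrolled into a loop. The following is only a minimal sketch; the name `log_star` is an illustrative choice, not notation from the book, and the computation uses floating-point logarithms, which is an assumption that is harmless for the small inputs shown.

```python
import math

def log_star(x: float) -> int:
    """Iterated logarithm: 0 for x <= 1, otherwise 1 + log_star(log2(x))."""
    count = 0
    while x > 1:
        x = math.log2(x)  # one application of log_2
        count += 1
    return count
```

For instance, under this sketch the value for 65536 is 4, since 65536 → 16 → 4 → 2 → 1 takes four steps.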
For a real number $x$, $\lfloor x \rfloor$ denotes the largest integer less than or equal
to $x$, and $\lceil x \rceil$ means the smallest integer greater than or equal to $x$. The
boldface letters $\mathbf{R}$ and $\mathbf{Z}$ stand for the real numbers and for the integers,
respectively, while $\mathbf{R}^d$ denotes the $d$-dimensional Euclidean space. For a point
$x = (x_1, x_2, \ldots, x_d) \in \mathbf{R}^d$, $\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_d^2}$ is the Euclidean norm
of $x$, and for $x, y \in \mathbf{R}^d$, $\langle x, y \rangle = x_1 y_1 + x_2 y_2 + \cdots + x_d y_d$ is the scalar product.
Points of $\mathbf{R}^d$ are usually considered as column vectors.
The symbol $B(x, r)$ denotes the closed ball of radius $r$ centered at $x$ in
some metric space (usually in $\mathbf{R}^d$ with the Euclidean distance), i.e., the set
of all points with distance at most $r$ from $x$. We write $B^n$ for the unit ball
$B(0, 1)$ in $\mathbf{R}^n$. The symbol $\partial A$ denotes the boundary of a set $A \subseteq \mathbf{R}^d$, that
is, the set of points at zero distance from both $A$ and its complement.
For a measurable set $A \subseteq \mathbf{R}^d$, $\mathrm{vol}(A)$ is the $d$-dimensional Lebesgue measure
of $A$ (in most cases the usual volume).
Let $f$ and $g$ be real functions (of one or several variables). The notation
$f = O(g)$ means that there exists a number $C$ such that $|f| \le C|g|$ for all
values of the variables. Normally, $C$ should be an absolute constant, but if
$f$ and $g$ depend on some parameter(s) that we explicitly declare to be fixed
(such as the space dimension $d$), then $C$ may depend on these parameters
as well. The notation $f = \Omega(g)$ is equivalent to $g = O(f)$, $f(n) = o(g(n))$
to $\lim_{n\to\infty} (f(n)/g(n)) = 0$, and $f = \Theta(g)$ means that both $f = O(g)$ and
$f = \Omega(g)$.
For a random variable $X$, the symbol $\mathbf{E}[X]$ denotes the expectation of $X$,
and $\mathrm{Prob}[A]$ stands for the probability of an event $A$.
Graphs are considered simple and undirected in this book unless stated
otherwise, so a graph $G$ is a pair $(V, E)$, where $V$ is a set (the vertex set) and
$E \subseteq \binom{V}{2}$ is the edge set. Here $\binom{V}{k}$ denotes the set of all $k$-element subsets
of $V$. For a multigraph, the edges form a multiset, so two vertices can be
connected by several edges. For a given (multi)graph $G$, we write $V(G)$ for
the vertex set and $E(G)$ for the edge set. A complete graph has all possible
edges; that is, it is of the form $(V, \binom{V}{2})$. A complete graph on $n$ vertices is
denoted by $K_n$. A graph $G$ is bipartite if the vertex set can be partitioned
into two subsets $V_1$ and $V_2$, the (color) classes, in such a way that each edge
connects a vertex of $V_1$ to a vertex of $V_2$. A graph $G' = (V', E')$ is a subgraph
of a graph $G = (V, E)$ if $V' \subseteq V$ and $E' \subseteq E$. We also say that $G$ contains
a copy of $H$ if there is a subgraph $G'$ of $G$ isomorphic to $H$, where $G'$ and
$H$ are isomorphic if there is a bijective map $\varphi\colon V(G') \to V(H)$ such that
$\{u, v\} \in E(G')$ if and only if $\{\varphi(u), \varphi(v)\} \in E(H)$ for all $u, v \in V(G')$. The
degree of a vertex $v$ in a graph $G$ is the number of edges of $G$ containing $v$.
An $r$-regular graph has all degrees equal to $r$. Paths and cycles are graphs as
in the following picture,

[Figure: examples of paths and of cycles]

and a path or cycle in $G$ is a subgraph isomorphic to a path or cycle, respectively.
A graph $G$ is connected if every two vertices can be connected by a
path in $G$.
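The set-theoretic definitions above translate almost literally into code. The sketch below is one possible encoding, with edges stored as 2-element frozensets; all function names (`k_subsets`, `complete_graph`, `degree`, `is_regular`, `is_bipartition`) are hypothetical and chosen only for this illustration.

```python
from itertools import combinations

def k_subsets(V, k):
    """The set of all k-element subsets of V, each as a frozenset."""
    return {frozenset(c) for c in combinations(V, k)}

def complete_graph(n):
    """K_n as a pair (V, E) with E equal to all 2-element subsets of V."""
    V = set(range(n))
    return V, k_subsets(V, 2)

def degree(E, v):
    """Number of edges of E containing the vertex v."""
    return sum(1 for e in E if v in e)

def is_regular(V, E, r):
    """True if every vertex of V has degree exactly r."""
    return all(degree(E, v) == r for v in V)

def is_bipartition(E, V1, V2):
    """True if V1, V2 are disjoint and every edge joins V1 to V2."""
    return V1.isdisjoint(V2) and all(
        len(e & V1) == 1 and len(e & V2) == 1 for e in E
    )
```

As a usage example, `complete_graph(5)` has 10 edges and is 4-regular, and the 4-cycle on vertices 0, 1, 2, 3 is bipartite with classes {0, 2} and {1, 3}.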
We recall that a set $X \subseteq \mathbf{R}^d$ is compact if and only if it is closed and
bounded, and that a continuous function $f\colon X \to \mathbf{R}$ defined on a compact $X$
attains its minimum (there exists $x_0 \in X$ with $f(x_0) \le f(x)$ for all $x \in X$).
The Cauchy-Schwarz inequality is perhaps best remembered in the form
$\langle x, y \rangle \le \|x\| \cdot \|y\|$ for all $x, y \in \mathbf{R}^n$.
A real function $f$ defined on an interval $A \subseteq \mathbf{R}$ (or, more generally, on a
convex set $A \subseteq \mathbf{R}^d$) is convex if $f(tx + (1-t)y) \le t f(x) + (1-t) f(y)$ for all
$x, y \in A$ and $t \in [0, 1]$. Geometrically, the graph of $f$ on $[x, y]$ lies below the
segment connecting the points $(x, f(x))$ and $(y, f(y))$. If the second derivative
satisfies $f''(x) \ge 0$ for all $x$ in an (open) interval $A \subseteq \mathbf{R}$, then $f$ is convex
on $A$. Jensen's inequality is a straightforward generalization of the definition
of convexity: $f(t_1 x_1 + t_2 x_2 + \cdots + t_n x_n) \le t_1 f(x_1) + t_2 f(x_2) + \cdots + t_n f(x_n)$
for all choices of nonnegative $t_i$ summing to 1 and all $x_1, \ldots, x_n \in A$. Or in
integral form, if $\mu$ is a probability measure on $A$ and $f$ is convex on $A$, we have
$f\left(\int_A x \, d\mu(x)\right) \le \int_A f(x) \, d\mu(x)$. In the language of probability theory, if $X$
is a real random variable and $f\colon \mathbf{R} \to \mathbf{R}$ is convex, then $f(\mathbf{E}[X]) \le \mathbf{E}[f(X)]$;
for example, $(\mathbf{E}[X])^2 \le \mathbf{E}[X^2]$.
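The probabilistic form of Jensen's inequality is easy to check numerically for a finite distribution. The sketch below uses exact rational arithmetic; the helper names `expectation` and `jensen_gap` and the sample distribution in the usage note are assumptions made up for this illustration.

```python
from fractions import Fraction

def expectation(values, probs):
    """E[X] for a finite distribution given by values and probabilities."""
    return sum(p * v for v, p in zip(values, probs))

def jensen_gap(f, values, probs):
    """Return E[f(X)] - f(E[X]); this is nonnegative when f is convex."""
    if sum(probs) != 1 or any(p < 0 for p in probs):
        raise ValueError("probs must be nonnegative and sum to 1")
    mean = expectation(values, probs)
    return expectation([f(v) for v in values], probs) - f(mean)
```

For the convex function $f(x) = x^2$, the gap is exactly the variance of $X$. With values $-1, 0, 3$ taken with probabilities $1/2, 1/4, 1/4$, one gets $\mathbf{E}[X] = 1/4$ and $\mathbf{E}[X^2] = 11/4$, so the gap is $43/16$.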
1

Convexity

We begin with a review of basic geometric notions such as hyperplanes and
affine subspaces in $\mathbf{R}^d$, and we spend some time discussing the notion
of general position. Then we consider fundamental properties of convex sets
in $\mathbf{R}^d$, such as a theorem about the separation of disjoint convex sets by a
hyperplane and Helly's theorem.

1.1 Linear and Affine Subspaces, General Position


Linear subspaces. Let $\mathbf{R}^d$ denote the $d$-dimensional Euclidean space. The
points are $d$-tuples of real numbers, $x = (x_1, x_2, \ldots, x_d)$.
The space $\mathbf{R}^d$ is a vector space, and so we may speak of linear subspaces,
linear dependence of points, linear span of a set, and so on. A linear subspace
of $\mathbf{R}^d$ is a subset closed under addition of vectors and under multiplication
by real numbers. What is the geometric meaning? For instance, the linear
subspaces of $\mathbf{R}^2$ are the origin itself, all lines passing through the origin,
and the whole of $\mathbf{R}^2$. In $\mathbf{R}^3$, we have the origin, all lines and planes passing
through the origin, and $\mathbf{R}^3$.
Affine notions. An arbitrary line in $\mathbf{R}^2$, say, is not a linear subspace unless
it passes through 0. General lines are what are called affine subspaces. An
affine subspace of $\mathbf{R}^d$ has the form $x + L$, where $x \in \mathbf{R}^d$ is some vector and $L$
is a linear subspace of $\mathbf{R}^d$. Having defined affine subspaces, the other "affine"
notions can be constructed by imitating the "linear" notions.
What is the affine hull of a set $X \subseteq \mathbf{R}^d$? It is the intersection of all affine
subspaces of $\mathbf{R}^d$ containing $X$. As is well known, the linear span of a set $X$
can be described as the set of all linear combinations of points of $X$. What
is an affine combination of points $a_1, a_2, \ldots, a_n \in \mathbf{R}^d$ that would play an
analogous role? To see this, we translate the whole set by $-a_n$, so that $a_n$
becomes the origin, we make a linear combination, and we translate back by
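$+a_n$. This yields an expression of the form

$$\beta_1(a_1 - a_n) + \beta_2(a_2 - a_n) + \cdots + \beta_n(a_n - a_n) + a_n = \beta_1 a_1 + \beta_2 a_2 + \cdots + \beta_{n-1} a_{n-1} + (1 - \beta_1 - \beta_2 - \cdots - \beta_{n-1}) a_n,$$

where $\beta_1, \ldots, \beta_n$ are arbitrary real numbers. Thus, an affine combination of
points $a_1, \ldots, a_n \in \mathbf{R}^d$ is an expression of the form

$$\alpha_1 a_1 + \cdots + \alpha_n a_n, \quad \text{where } \alpha_1, \ldots, \alpha_n \in \mathbf{R} \text{ and } \alpha_1 + \cdots + \alpha_n = 1.$$

Then indeed, it is not hard to check that the affine hull of $X$ is the set of all
affine combinations of points of $X$.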
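The translate, combine, translate-back derivation can be replayed on concrete numbers. The following sketch compares the two sides of the identity using exact rationals; the function names `linear_combination`, `affine_combination`, and `via_translation` are hypothetical, introduced only for this example.

```python
from fractions import Fraction

def linear_combination(coeffs, points):
    """Sum of coeffs[i] * points[i], computed coordinate by coordinate."""
    dim = len(points[0])
    return tuple(sum(c * p[i] for c, p in zip(coeffs, points)) for i in range(dim))

def affine_combination(alphas, points):
    """A linear combination whose coefficients are required to sum to 1."""
    if sum(alphas) != 1:
        raise ValueError("coefficients of an affine combination must sum to 1")
    return linear_combination(alphas, points)

def via_translation(betas, points):
    """Translate by -a_n, take a linear combination, translate back by +a_n."""
    a_n = points[-1]
    shifted = [tuple(x - y for x, y in zip(p, a_n)) for p in points]
    combo = linear_combination(betas, shifted)
    return tuple(c + y for c, y in zip(combo, a_n))
```

For example, with points $(0,0)$, $(2,0)$, $(0,4)$ and $\beta = (1/2, 3, 7)$, the translated construction gives $(6, -10)$, which agrees with the affine combination with coefficients $(1/2, 3, -5/2)$. The value of $\beta_n$ does not matter, since it multiplies $a_n - a_n = 0$.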
The affine dependence of points $a_1, \ldots, a_n$ means that one of them can
be written as an affine combination of the others. This is the same as the
existence of real numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$, at least one of them nonzero, such
that both

$$\alpha_1 a_1 + \alpha_2 a_2 + \cdots + \alpha_n a_n = 0 \quad \text{and} \quad \alpha_1 + \alpha_2 + \cdots + \alpha_n = 0.$$

(Note the difference: In an affine combination, the $\alpha_i$ sum to 1, while in an
affine dependence, they sum to 0.)
Affine dependence of $a_1, \ldots, a_n$ is equivalent to linear dependence of the
$n-1$ vectors $a_1 - a_n, a_2 - a_n, \ldots, a_{n-1} - a_n$. Therefore, the maximum possible
number of affinely independent points in $\mathbf{R}^d$ is $d+1$.
Another way of expressing affine dependence uses "lifting" one dimension
higher. Let $b_i = (a_i, 1)$ be the vector in $\mathbf{R}^{d+1}$ obtained by appending a new
coordinate equal to 1 to $a_i$; then $a_1, \ldots, a_n$ are affinely dependent if and only
if $b_1, \ldots, b_n$ are linearly dependent. This correspondence of affine notions in
$\mathbf{R}^d$ with linear notions in $\mathbf{R}^{d+1}$ is quite general. For example, if we identify
$\mathbf{R}^2$ with the plane $x_3 = 1$ in $\mathbf{R}^3$ as in the picture,

[Figure: the plane $x_3 = 1$ drawn above the plane $x_3 = 0$ in $\mathbf{R}^3$]

then we obtain a bijective correspondence of the $k$-dimensional linear subspaces
of $\mathbf{R}^3$ that do not lie in the plane $x_3 = 0$ with $(k-1)$-dimensional affine
subspaces of $\mathbf{R}^2$. The drawing shows a 2-dimensional linear subspace of $\mathbf{R}^3$
and the corresponding line in the plane $x_3 = 1$. (The same works for affine
subspaces of $\mathbf{R}^d$ and linear subspaces of $\mathbf{R}^{d+1}$ not contained in the subspace
$x_{d+1} = 0$.)
This correspondence also leads directly to extending the affine plane $\mathbf{R}^2$
into the projective plane: To the points of $\mathbf{R}^2$ corresponding to nonhorizontal
lines through 0 in $\mathbf{R}^3$ we add points "at infinity," that correspond to horizontal
lines through 0 in $\mathbf{R}^3$. But in this book we remain in the affine space
most of the time, and we do not use the projective notions.
Let $a_1, a_2, \ldots, a_{d+1}$ be points in $\mathbf{R}^d$, and let $A$ be the $d \times d$ matrix with
$a_i - a_{d+1}$ as the $i$th column, $i = 1, 2, \ldots, d$. Then $a_1, \ldots, a_{d+1}$ are affinely
independent if and only if $A$ has $d$ linearly independent columns, and this is
equivalent to $\det(A) \neq 0$. We thus have a useful criterion of affine independence
using a determinant.
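The determinant criterion can be tried out directly. The following is a minimal sketch, not a tuned implementation: it uses exact rational arithmetic from the standard library, and the function names `det` and `affinely_independent` are our own choices for illustration.

```python
from fractions import Fraction


def det(matrix):
    """Determinant of a square matrix by Gaussian elimination over exact rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Look for a row with a nonzero entry in this column to serve as the pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # the column is zero below the diagonal
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign of the determinant
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def affinely_independent(points):
    """Given d+1 points in R^d, test det(A) != 0, where column i of A is a_i - a_{d+1}."""
    d = len(points) - 1
    assert all(len(p) == d for p in points), "expected d+1 points of dimension d"
    last = points[-1]
    a = [[points[i][row] - last[row] for i in range(d)] for row in range(d)]
    return det(a) != 0
```

For instance, `affinely_independent([(0, 0), (1, 0), (0, 1)])` holds, while three collinear points such as `(0, 0), (1, 1), (2, 2)` fail the test.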
Affine subspaces of $\mathbf{R}^d$ of certain dimensions have special names. A $(d-1)$-dimensional
affine subspace of $\mathbf{R}^d$ is called a hyperplane (while the word plane
usually means a 2-dimensional subspace of $\mathbf{R}^d$ for any $d$). One-dimensional
subspaces are lines, and a $k$-dimensional affine subspace is often called a $k$-flat.
A hyperplane is usually specified by a single linear equation of the form
$a_1x_1 + a_2x_2 + \cdots + a_dx_d = b$. We usually write the left-hand side as the scalar
product $\langle a, x\rangle$. So a hyperplane can be expressed as the set $\{x \in \mathbf{R}^d : \langle a, x\rangle = b\}$,
where $a \in \mathbf{R}^d \setminus \{0\}$ and $b \in \mathbf{R}$. A (closed) half-space in $\mathbf{R}^d$ is a set
of the form $\{x \in \mathbf{R}^d : \langle a, x\rangle \geq b\}$ for some $a \in \mathbf{R}^d \setminus \{0\}$; the hyperplane
$\{x \in \mathbf{R}^d : \langle a, x\rangle = b\}$ is its boundary.
General $k$-flats can be given either as intersections of hyperplanes or as
affine images of $\mathbf{R}^k$ (parametric expression). In the first case, an intersection
of $k$ hyperplanes can also be viewed as the set of solutions to a system $Ax = b$ of linear
equations, where $x \in \mathbf{R}^d$ is regarded as a column vector, $A$ is a $k \times d$ matrix,
and $b \in \mathbf{R}^k$. (As a rule, in formulas involving matrices, we interpret points
of $\mathbf{R}^d$ as column vectors.)
An affine mapping $f\colon \mathbf{R}^k \to \mathbf{R}^d$ has the form $f\colon y \mapsto By + c$ for some $d \times k$
matrix $B$ and some $c \in \mathbf{R}^d$, so it is a composition of a linear map with a
translation. The image of $f$ is a $k'$-flat for some $k' \leq \min(k, d)$. This $k'$ equals
the rank of the matrix $B$.
General position. "We assume that the points (lines, hyperplanes, ...) are
in general position." This magical phrase appears in many proofs. Intuitively,
general position means that no "unlikely coincidences" happen in the considered
configuration. For example, if 3 points are chosen in the plane without
any special intention, "randomly," they are unlikely to lie on a common line.
For a planar point set in general position, we always require that no three
of its points be collinear. For points in $\mathbf{R}^d$ in general position, we assume
similarly that no unnecessary affine dependencies exist: No $k \leq d+1$ points
lie in a common $(k-2)$-flat. For lines in the plane in general position, we
postulate that no 3 lines have a common point and no 2 are parallel.
The precise meaning of general position is not fully standard: It may
depend on the particular context, and to the usual conditions mentioned
above we sometimes add others where convenient. For example, for a planar
point set in general position we can also suppose that no two points have the
same $x$-coordinate.
What conditions are suitable for including into a "general position" assumption?
In other words, what can be considered as an unlikely coincidence?
For example, let $X$ be an $n$-point set in the plane, and let the coordinates of
the $i$th point be $(x_i, y_i)$. Then the vector $v(X) = (x_1, x_2, \ldots, x_n, y_1, y_2, \ldots, y_n)$
can be regarded as a point of $\mathbf{R}^{2n}$. For a configuration $X$ in which $x_1 = x_2$,
i.e., the first and second points have the same $x$-coordinate, the point $v(X)$
lies on the hyperplane $\{x_1 = x_2\}$ in $\mathbf{R}^{2n}$. The configurations $X$ where some
two points share the $x$-coordinate thus correspond to the union of $\binom{n}{2}$ hyperplanes
in $\mathbf{R}^{2n}$. Since a hyperplane in $\mathbf{R}^{2n}$ has ($2n$-dimensional) measure
zero, almost all points of $\mathbf{R}^{2n}$ correspond to planar configurations $X$ with all
the points having distinct $x$-coordinates. In particular, if $X$ is any $n$-point
planar configuration and $\varepsilon > 0$ is any given real number, then there is a configuration
$X'$, obtained from $X$ by moving each point by distance at most $\varepsilon$,
such that all points of $X'$ have distinct $x$-coordinates. Not only that: Almost
all small movements (perturbations) of $X$ result in $X'$ with this property.
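A small experiment illustrates the perturbation statement. This is only a sketch under stated assumptions: floating-point numbers stand in for real numbers, so "almost all" becomes "practically always", and the names `perturb` and `distinct_x` are hypothetical helpers introduced here.

```python
import math
import random


def perturb(points, eps, rng=None):
    """Move each planar point by a random vector of length at most eps."""
    rng = rng or random.Random()
    moved = []
    for x, y in points:
        radius = eps * rng.random()           # a length in [0, eps)
        angle = 2 * math.pi * rng.random()    # a uniformly random direction
        moved.append((x + radius * math.cos(angle), y + radius * math.sin(angle)))
    return moved


def distinct_x(points):
    """True iff no two points share the same x-coordinate."""
    xs = [x for x, _ in points]
    return len(set(xs)) == len(xs)
```

Starting from a configuration with several points on a common vertical line, a call such as `perturb(points, 1e-3)` should yield a configuration with distinct $x$-coordinates in essentially every run.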
This is the key property of general position: Configurations in general
position lie arbitrarily close to any given configuration (and they abound
in any small neighborhood of any given configuration). Here is a fairly general
type of condition with this property. Suppose that a configuration $X$
is specified by a vector $t = (t_1, t_2, \ldots, t_m)$ of $m$ real numbers (coordinates).
The objects of $X$ can be points in $\mathbf{R}^d$, in which case $m = dn$ and the $t_j$
are the coordinates of the points, but they can also be circles in the plane,
with $m = 3n$ and the $t_j$ expressing the center and the radius of each circle,
and so on. The general position condition we can put on the configuration
$X$ is $p(t) = p(t_1, t_2, \ldots, t_m) \neq 0$, where $p$ is some nonzero polynomial in $m$
variables. Here we use the following well-known fact (a consequence of Sard's
theorem; see, e.g., Bredon [Bre93], Appendix C): For any nonzero $m$-variate
polynomial $p(t_1, \ldots, t_m)$, the zero set $\{t \in \mathbf{R}^m : p(t) = 0\}$ has measure 0 in
$\mathbf{R}^m$.
Therefore, almost all configurations $X$ satisfy $p(t) \neq 0$. So any condition
that can be expressed as $p(t) \neq 0$ for a certain polynomial $p$ in $m$ real
variables, or, more generally, as $p_1(t) \neq 0$ or $p_2(t) \neq 0$ or ..., for finitely or
countably many polynomials $p_1, p_2, \ldots$, can be included in a general position
assumption.
For example, let $X$ be an $n$-point set in $\mathbf{R}^d$, and let us consider the condition
"no $d+1$ points of $X$ lie in a common hyperplane." In other words, no
$d+1$ points should be affinely dependent. As we know, the affine dependence
of $d+1$ points means that a suitable $d \times d$ determinant equals 0. This determinant
is a polynomial (of degree $d$) in the coordinates of these $d+1$ points.
Introducing one polynomial for every $(d+1)$-tuple of the points, we obtain
$\binom{n}{d+1}$ polynomials such that at least one of them is 0 for any configuration $X$
with $d+1$ points in a common hyperplane. Other usual conditions for general
position can be expressed similarly.
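For $d = 2$ the polynomials in question are the familiar $2 \times 2$ orientation determinants, one per triple of points. The sketch below checks the condition "no three points collinear" in this way; it is a brute-force illustration with hypothetical function names, and it is exact only for integer (or rational) coordinates.

```python
from itertools import combinations


def orientation_det(p, q, r):
    """2x2 determinant with columns p - r and q - r; it vanishes iff p, q, r are collinear."""
    return (p[0] - r[0]) * (q[1] - r[1]) - (q[0] - r[0]) * (p[1] - r[1])


def in_general_position(points):
    """True iff none of the polynomials (one per triple of points) vanishes."""
    return all(orientation_det(p, q, r) != 0 for p, q, r in combinations(points, 3))
```

An $n$-point set gives rise to $\binom{n}{3}$ determinants here, so the check takes time proportional to $n^3$.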
In many proofs, assuming general position simplifies matters considerably.
But what do we do with configurations $X_0$ that are not in general
position? We have to argue, somehow, that if the statement being proved is
valid for configurations $X$ arbitrarily close to our $X_0$, then it must be valid
for $X_0$ itself, too. Such proofs, usually called perturbation arguments, are often
rather simple, and almost always somewhat boring. But sometimes they
can be tricky, and one should not underestimate them, no matter how tempting
this may be. A nontrivial example will be demonstrated in Section 5.5
(Lemma 5.5.4).

Exercises
1. Verify that the affine hull of a set $X \subseteq \mathbf{R}^d$ equals the set of all affine
combinations of points of $X$.
2. Let $A$ be a $2 \times 3$ matrix and let $b \in \mathbf{R}^2$. Interpret the solution of the
system $Ax = b$ geometrically (in most cases, as an intersection of two
planes) and discuss the possible cases in algebraic and geometric terms.
3. (a) What are the possible intersections of two (2-dimensional) planes
in $\mathbf{R}^4$? What is the "typical" case (general position)? What about two
hyperplanes in $\mathbf{R}^4$?
(b) Objects in $\mathbf{R}^4$ can sometimes be "visualized" as objects in $\mathbf{R}^3$ moving
in time (so time is interpreted as the fourth coordinate). Try to visualize
the intersection of two planes in $\mathbf{R}^4$ discussed in (a) in this way.

1.2 Convex Sets, Convex Combinations, Separation


Intuitively, a set is convex if its surface has no "dips":
[Figure: a set with a dip in its boundary, labeled "not allowed in a convex set."]

1.2.1 Definition (Convex set). A set $C \subseteq \mathbf{R}^d$ is convex if for every two
points $x, y \in C$ the whole segment $xy$ is also contained in $C$. In other words,
for every $t \in [0,1]$, the point $tx + (1 - t)y$ belongs to $C$.
The intersection of an arbitrary family of convex sets is obviously convex.
So we can define the convex hull of a set $X \subseteq \mathbf{R}^d$, denoted by $\operatorname{conv}(X)$, as the
intersection of all convex sets in $\mathbf{R}^d$ containing $X$. Here is a planar example
with a finite $X$:

[Figure: a finite planar point set $X$ and its convex hull $\operatorname{conv}(X)$.]

An alternative description of the convex hull can be given using convex
combinations.

1.2.2 Claim. A point $x$ belongs to $\operatorname{conv}(X)$ if and only if there exist points
$x_1, x_2, \ldots, x_n \in X$ and nonnegative real numbers $t_1, t_2, \ldots, t_n$ with $\sum_{i=1}^{n} t_i = 1$
such that $x = \sum_{i=1}^{n} t_i x_i$.

The expression $\sum_{i=1}^{n} t_i x_i$ as in the claim is called a convex combination
of the points $x_1, x_2, \ldots, x_n$. (Compare this with the definitions of linear and
affine combinations.)
Sketch of proof. Each convex combination of points of $X$ must lie in
$\operatorname{conv}(X)$: For $n = 2$ this is by definition, and for larger $n$ by induction.
Conversely, the set of all convex combinations obviously contains $X$, and it
is convex. □
In $\mathbf{R}^d$, it is sufficient to consider convex combinations involving at most
$d+1$ points:
1.2.3 Theorem (Carathéodory's theorem). Let $X \subseteq \mathbf{R}^d$. Then each
point of $\operatorname{conv}(X)$ is a convex combination of at most $d+1$ points of $X$.
For example, in the plane, conv(X) is the union of all triangles with
vertices at points of X. The proof of the theorem is left as an exercise to the
subsequent section.
A basic result about convex sets is the separability of disjoint convex sets
by a hyperplane.
1.2.4 Theorem (Separation theorem). Let $C, D \subseteq \mathbf{R}^d$ be convex sets
with $C \cap D = \emptyset$. Then there exists a hyperplane $h$ such that $C$ lies in one
of the closed half-spaces determined by $h$, and $D$ lies in the opposite closed
half-space. In other words, there exist a unit vector $a \in \mathbf{R}^d$ and a number
$b \in \mathbf{R}$ such that for all $x \in C$ we have $\langle a, x\rangle \geq b$, and for all $x \in D$ we have
$\langle a, x\rangle \leq b$.
If $C$ and $D$ are closed and at least one of them is bounded, they can be
separated strictly, in such a way that $C \cap h = D \cap h = \emptyset$.
In particular, a closed convex set can be strictly separated from a point.
This implies that the convex hull of a closed set $X$ equals the intersection of
all closed half-spaces containing $X$.
Sketch of proof. First assume that $C$ and $D$ are compact (i.e., closed and
bounded). Then the Cartesian product $C \times D$ is a compact space, too, and
the distance function $(x, y) \mapsto \|x - y\|$ attains its minimum on $C \times D$. That
is, there exist points $p \in C$ and $q \in D$ such that the distance of $C$ and $D$
equals the distance of $p$ and $q$.
The desired separating hyperplane $h$ can be taken as the one perpendicular
to the segment $pq$ and passing through its midpoint:

[Figure: the hyperplane $h$ perpendicular to the segment $pq$ at its midpoint, with $C$ and $D$ on opposite sides.]

It is easy to check that $h$ indeed avoids both $C$ and $D$.
If $D$ is compact and $C$ closed, we can intersect $C$ with a large ball and
get a compact set $C'$. If the ball is sufficiently large, then $C$ and $C'$ have the
same distance to $D$. So the distance of $C$ and $D$ is attained at some $p \in C'$
and $q \in D$, and we can use the previous argument.
For arbitrary disjoint convex sets $C$ and $D$, we choose a sequence
$C_1 \subseteq C_2 \subseteq C_3 \subseteq \cdots$ of compact convex subsets of $C$ with $\bigcup_{n=1}^{\infty} C_n = C$. For
example, assuming that $0 \in C$, we can let $C_n$ be the intersection of the
closure of $(1 - \frac{1}{n})C$ with the ball of radius $n$ centered at 0. A similar sequence
$D_1 \subseteq D_2 \subseteq \cdots$ is chosen for $D$, and we let $h_n = \{x \in \mathbf{R}^d : \langle a_n, x\rangle = b_n\}$ be a
hyperplane separating $C_n$ from $D_n$, where $a_n$ is a unit vector and $b_n \in \mathbf{R}$. The
sequence $(b_n)_{n=1}^{\infty}$ is bounded, and by compactness, the sequence of $(d+1)$-component
vectors $(a_n, b_n) \in \mathbf{R}^{d+1}$ has a cluster point $(a, b)$. One can verify,
by contradiction, that the hyperplane $h = \{x \in \mathbf{R}^d : \langle a, x\rangle = b\}$ separates $C$
and $D$ (nonstrictly). □

The importance of the separation theorem is documented by its presence
in several branches of mathematics in various disguises. Its home territory is
probably functional analysis, where it is formulated and proved for infinite-dimensional
spaces; essentially it is the so-called Hahn-Banach theorem. The
usual functional-analytic proof is different from the one we gave, and in a
way it is more elegant and conceptual. The proof sketched above uses more
special properties of $\mathbf{R}^d$, but it is quite short and intuitive in the case of
compact $C$ and $D$.
Connection to linear programming. A basic result in the theory of
linear programming is the Farkas lemma. It is a special case of the duality of
linear programming (discussed in Section 10.1) as well as the key step in its
proof.

1.2.5 Lemma (Farkas lemma, one of many versions). For every $d \times n$
real matrix $A$, exactly one of the following cases occurs:
(i) The system of linear equations $Ax = 0$ has a nontrivial nonnegative
solution $x \in \mathbf{R}^n$ (all components of $x$ are nonnegative and at least one
of them is strictly positive).
(ii) There exists a $y \in \mathbf{R}^d$ such that $y^T A$ is a vector with all entries strictly
negative. Thus, if we multiply the $j$th equation in the system $Ax = 0$ by
$y_j$ and add these equations together, we obtain an equation that obviously
has no nontrivial nonnegative solution, since all the coefficients on the
left-hand side are strictly negative, while the right-hand side is 0.

Proof. Let us see why this is yet another version of the separation theorem.
Let $V \subset \mathbf{R}^d$ be the set of $n$ points given by the column vectors of the
matrix $A$. We distinguish two cases: Either $0 \in \operatorname{conv}(V)$ or $0 \notin \operatorname{conv}(V)$.
In the former case, we know that 0 is a convex combination of the points
of $V$, and the coefficients of this convex combination determine a nontrivial
nonnegative solution to $Ax = 0$.
In the latter case, there exists a hyperplane strictly separating $V$ from 0,
i.e., a unit vector $y \in \mathbf{R}^d$ such that $\langle y, v\rangle < \langle y, 0\rangle = 0$ for each $v \in V$. This is
just the $y$ from the second alternative in the Farkas lemma. □
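The two alternatives of the lemma are easy to verify once a candidate $x$ or $y$ is at hand. The sketch below only checks certificates; it does not find them (that would require linear programming). The function names and the two small matrices in the usage note are made up for illustration.

```python
def satisfies_case_i(A, x):
    """Case (i): x is nonnegative, not identically zero, and A x = 0."""
    nontrivial_nonnegative = all(v >= 0 for v in x) and any(v > 0 for v in x)
    return nontrivial_nonnegative and all(
        sum(a * v for a, v in zip(row, x)) == 0 for row in A
    )


def satisfies_case_ii(A, y):
    """Case (ii): every entry of the row vector y^T A is strictly negative."""
    d, n = len(A), len(A[0])
    return all(sum(y[i] * A[i][j] for i in range(d)) < 0 for j in range(n))
```

For the matrix with columns $(1,0)$ and $(-1,0)$, the origin lies in the convex hull of the columns and $x = (1,1)$ certifies case (i). For the matrix with columns $(1,1)$ and $(2,1)$, the vector $y = (-1,-1)$ certifies case (ii).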

Bibliography and remarks. Most of the material in this chapter is
quite old and can be found in many surveys and textbooks. Providing
historical accounts of such well-covered areas is not among the goals
of this book, and so we mention only a few references for the specific
results discussed in the text and add some remarks concerning related
results.
The concept of convexity and the rudiments of convex geometry
have been around since antiquity. The initial chapter of the Handbook
of Convex Geometry [GW93] succinctly describes the history, and the
handbook can be recommended as the basic source on questions re-
lated to convexity, although knowledge has progressed significantly
since its publication.
For an introduction to functional analysis, including the Hahn-
Banach theorem, see Rudin [Rud91], for example. The Farkas lemma
originated in [Far94] (nineteenth century!). More on the history of the
duality of linear programming can be found, e.g., in Schrijver's book
[Sch86].
As for the origins, generalizations, and applications of Carathéodory's
theorem, as well as of Radon's lemma and Helly's theorem discussed
in the subsequent sections, a recommendable survey is Eckhoff
[Eck93], and an older well-known source is Danzer, Grünbaum, and
Klee [DGK63].
Carathéodory's theorem comes from the paper [Car07], concerning
power series and harmonic analysis. A somewhat similar theorem, due
to Steinitz [Ste16], asserts that if $x$ lies in the interior of $\operatorname{conv}(X)$
for an $X \subseteq \mathbf{R}^d$, then it also lies in the interior of $\operatorname{conv}(Y)$ for some
$Y \subseteq X$ with $|Y| \leq 2d$. Bonnice and Klee [BK63] proved a common
generalization of both these theorems: Any $k$-interior point of $X$ is
a $k$-interior point of $Y$ for some $Y \subseteq X$ with at most $\max(2k, d+1)$
points, where $x$ is called a $k$-interior point of $X$ if it lies in the relative
interior of the convex hull of some $k+1$ affinely independent points
of $X$.

Exercises
1. Give a detailed proof of Claim 1.2.2.
2. Write down a detailed proof of the separation theorem.
3. Find an example of two disjoint closed convex sets in the plane that are
not strictly separable.
4. Let $f\colon \mathbf{R}^d \to \mathbf{R}^k$ be an affine map.
(a) Prove that if $C \subseteq \mathbf{R}^d$ is convex, then $f(C)$ is convex as well. Is the
preimage of a convex set always convex?
(b) For $X \subseteq \mathbf{R}^d$ arbitrary, prove that $\operatorname{conv}(f(X)) = f(\operatorname{conv}(X))$.
5. Let $X \subseteq \mathbf{R}^d$. Prove that $\operatorname{diam}(\operatorname{conv}(X)) = \operatorname{diam}(X)$, where the diameter
$\operatorname{diam}(Y)$ of a set $Y$ is $\sup\{\|x - y\| : x, y \in Y\}$.
6. A set $C \subseteq \mathbf{R}^d$ is a convex cone if it is convex and for each $x \in C$, the ray
$\overrightarrow{0x}$ is fully contained in $C$.
(a) Analogously to the convex and affine hulls, define the appropriate
"conic hull" and the corresponding notion of "combination" (analogous
to the convex and affine combinations).
(b) Let $C$ be a convex cone in $\mathbf{R}^d$ and $b \notin C$ a point. Prove that there
exists a vector $a$ with $\langle a, x\rangle \geq 0$ for all $x \in C$ and $\langle a, b\rangle < 0$.
7. (Variations on the Farkas lemma) Let $A$ be a $d \times n$ matrix and let $b \in \mathbf{R}^d$.
(a) Prove that the system $Ax = b$ has a nonnegative solution $x \in \mathbf{R}^n$ if
and only if every $y \in \mathbf{R}^d$ satisfying $y^T A \geq 0$ also satisfies $y^T b \geq 0$.
(b) Prove that the system of inequalities $Ax \leq b$ has a nonnegative
solution $x$ if and only if every nonnegative $y \in \mathbf{R}^d$ with $y^T A \geq 0$ also
satisfies $y^T b \geq 0$.
8. (a) Let $C \subset \mathbf{R}^d$ be a compact convex set with a nonempty interior, and
let $p \in C$ be an interior point. Show that there exists a line $\ell$ passing
through $p$ such that the segment $\ell \cap C$ is at least as long as any segment
parallel to $\ell$ and contained in $C$.
(b) Show that (a) may fail for $C$ compact but not convex.

1.3 Radon's Lemma and Helly's Theorem


Carathéodory's theorem from the previous section, together with Radon's
lemma and Helly's theorem presented here, are three basic properties of convexity
in $\mathbf{R}^d$ involving the dimension. We begin with Radon's lemma.
1.3.1 Theorem (Radon's lemma). Let $A$ be a set of $d+2$ points in $\mathbf{R}^d$.
Then there exist two disjoint subsets $A_1, A_2 \subset A$ such that
$$\operatorname{conv}(A_1) \cap \operatorname{conv}(A_2) \neq \emptyset.$$
A point $x \in \operatorname{conv}(A_1) \cap \operatorname{conv}(A_2)$, where $A_1$ and $A_2$ are as in the theorem,
is called a Radon point of $A$, and the pair $(A_1, A_2)$ is called a Radon partition
of $A$ (it is easily seen that we can require $A_1 \cup A_2 = A$).
Here are two possible cases in the plane:

[Figure: two examples of Radon partitions of 4 points in the plane.]

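Proof. Let $A = \{a_1, a_2, \ldots, a_{d+2}\}$. These $d+2$ points are necessarily affinely
dependent. That is, there exist real numbers $\alpha_1, \ldots, \alpha_{d+2}$, not all of them 0,
such that $\sum_{i=1}^{d+2} \alpha_i = 0$ and $\sum_{i=1}^{d+2} \alpha_i a_i = 0$.
Set $P = \{i : \alpha_i > 0\}$ and $N = \{i : \alpha_i < 0\}$. Both $P$ and $N$ are nonempty.
We claim that $P$ and $N$ determine the desired subsets. Let us put
$A_1 = \{a_i : i \in P\}$ and $A_2 = \{a_i : i \in N\}$. We are going to exhibit a point $x$ that is
contained in the convex hulls of both these sets.
Put $S = \sum_{i \in P} \alpha_i$; we also have $S = -\sum_{i \in N} \alpha_i$. Then we define
$$x = \sum_{i \in P} \frac{\alpha_i}{S}\, a_i. \tag{1.1}$$
Since $\sum_{i=1}^{d+2} \alpha_i a_i = 0 = \sum_{i \in P} \alpha_i a_i + \sum_{i \in N} \alpha_i a_i$, we also have
$$x = \sum_{i \in N} \frac{-\alpha_i}{S}\, a_i. \tag{1.2}$$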

The coefficients of the $a_i$ in (1.1) are nonnegative and sum to 1, so $x$ is a
convex combination of points of $A_1$. Similarly, (1.2) expresses $x$ as a convex
combination of points of $A_2$. □
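The proof of Radon's lemma is constructive, and it can be followed step by step in code. The sketch below finds an affine dependence by Gaussian elimination over exact rationals, splits the indices by the sign of the coefficients, and evaluates the convex combination over the positive part. The helper names `null_vector` and `radon_partition` are our own, and no attempt is made at efficiency.

```python
from fractions import Fraction


def null_vector(rows, ncols):
    """Return a nonzero solution v of the homogeneous system rows * v = 0 (needs rank < ncols)."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []  # pivots[k] is the pivot column of reduced row k
    r = 0
    for c in range(ncols):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column, so it is a free column
        m[r], m[p] = m[p], m[r]
        m[r] = [v / m[r][c] for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [vi - f * vr for vi, vr in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    free = next(c for c in range(ncols) if c not in pivots)
    v = [Fraction(0)] * ncols
    v[free] = Fraction(1)  # set one free variable to 1, the other free variables to 0
    for k, c in enumerate(pivots):
        v[c] = -m[k][free]
    return v


def radon_partition(points):
    """Given d+2 points in R^d, return (A1, A2, x) with x in conv(A1) and in conv(A2)."""
    d = len(points[0])
    assert len(points) == d + 2
    # One row per coordinate (sum of alpha_i * a_i = 0) and one row for sum of alpha_i = 0.
    rows = [[p[k] for p in points] for k in range(d)] + [[1] * len(points)]
    alpha = null_vector(rows, len(points))
    pos = [i for i, a in enumerate(alpha) if a > 0]
    neg = [i for i, a in enumerate(alpha) if a < 0]
    s = sum(alpha[i] for i in pos)
    x = tuple(sum(alpha[i] / s * points[i][k] for i in pos) for k in range(d))
    return [points[i] for i in pos], [points[i] for i in neg], x
```

For the four corners of a square, the two diagonals form the Radon partition and the Radon point is the center of the square.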
Helly's theorem is one of the most famous results of a combinatorial nature
about convex sets.
1.3.2 Theorem (Helly's theorem). Let $C_1, C_2, \ldots, C_n$ be convex sets in
$\mathbf{R}^d$, $n \geq d+1$. Suppose that the intersection of every $d+1$ of these sets is
nonempty. Then the intersection of all the $C_i$ is nonempty.
The first nontrivial case states that if every 3 among 4 convex sets in
the plane intersect, then there is a point common to all 4 sets. This can be
proved by an elementary geometric argument, perhaps distinguishing a few
cases, and the reader may want to try to find a proof before reading further.
In a contrapositive form, Helly's theorem guarantees that whenever
$C_1, C_2, \ldots, C_n$ are convex sets with $\bigcap_{i=1}^{n} C_i = \emptyset$, then this is witnessed by
some at most $d+1$ sets with empty intersection among the $C_i$. In this way,
many proofs are greatly simplified, since in planar problems, say, one can deal
with 3 convex sets instead of an arbitrary number, as is amply illustrated in
the exercises below.
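The simplest instance of Helly's theorem is $d = 1$, where convex sets are intervals and it suffices to inspect pairs. The following sketch handles closed bounded intervals given as pairs of endpoints; the function names are hypothetical, and at least two intervals are assumed.

```python
from itertools import combinations


def intervals_intersect(intervals):
    """Closed intervals [l, r] share a point iff the largest left end is <= the smallest right end."""
    return max(l for l, _ in intervals) <= min(r for _, r in intervals)


def helly_point_1d(intervals):
    """If every 2 of the intervals intersect, return a point common to all of them; else None."""
    if all(intervals_intersect(pair) for pair in combinations(intervals, 2)):
        # The largest left endpoint lies in every interval once all pairs intersect.
        return max(l for l, _ in intervals)
    return None
```

In the contrapositive direction, whenever `helly_point_1d` returns `None`, some pair of intervals is already disjoint, which is the promised witness of size $d+1 = 2$.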
It is very tempting and quite usual to formulate Helly's theorem as follows:
"If every $d+1$ among $n$ convex sets in $\mathbf{R}^d$ intersect, then all the sets
intersect." But, strictly speaking, this is false, for a trivial reason: For $d \geq 2$,
the assumption as stated here is met by $n = 2$ disjoint convex sets.
Proof of Helly's theorem. (Using Radon's lemma.) For a fixed $d$, we
proceed by induction on $n$. The case $n = d+1$ is clear, so we suppose that
$n \geq d+2$ and that the statement of Helly's theorem holds for smaller $n$.
Actually, $n = d+2$ is the crucial case; the result for larger $n$ follows at once
by a simple induction.
Consider sets $C_1, C_2, \ldots, C_n$ satisfying the assumptions. If we leave out
any one of these sets, the remaining sets have a nonempty intersection by
the inductive assumption. Let us fix a point $a_i \in \bigcap_{j \neq i} C_j$ and consider the
points $a_1, a_2, \ldots, a_{d+2}$. By Radon's lemma, there exist disjoint index sets
$I_1, I_2 \subset \{1, 2, \ldots, d+2\}$ such that
$$\operatorname{conv}(\{a_i : i \in I_1\}) \cap \operatorname{conv}(\{a_i : i \in I_2\}) \neq \emptyset.$$
We pick a point $x$ in this intersection. The following picture illustrates the
case $d = 2$ and $n = 4$:

[Figure: the case $d = 2$, $n = 4$.]

We claim that $x$ lies in the intersection of all the $C_i$. Consider some
$i \in \{1, 2, \ldots, n\}$; then $i \notin I_1$ or $i \notin I_2$. In the former case, each $a_j$ with $j \in I_1$ lies
in $C_i$, and so $x \in \operatorname{conv}(\{a_j : j \in I_1\}) \subseteq C_i$. For $i \notin I_2$ we similarly conclude
that $x \in \operatorname{conv}(\{a_j : j \in I_2\}) \subseteq C_i$. Therefore, $x \in \bigcap_{i=1}^{n} C_i$. □

An infinite version of Helly's theorem. If we have an infinite collection
of convex sets in $\mathbf{R}^d$ such that any $d+1$ of them have a common point, the
entire collection still need not have a common point. Two examples in $\mathbf{R}^1$ are
the families of intervals $\{(0, 1/n) : n = 1, 2, \ldots\}$ and $\{[n, \infty) : n = 1, 2, \ldots\}$.
The sets in the first example are not closed, and the second example uses
unbounded sets. For compact (i.e., closed and bounded) sets, the theorem
holds:

1.3.3 Theorem (Infinite version of Helly's theorem). Let $\mathcal{C}$ be an arbitrary
infinite family of compact convex sets in $\mathbf{R}^d$ such that any $d+1$ of the
sets have a nonempty intersection. Then all the sets of $\mathcal{C}$ have a nonempty
intersection.
Proof. By Helly's theorem, any finite subfamily of $\mathcal{C}$ has a nonempty intersection.
By a basic property of compactness, if we have an arbitrary family
of compact sets such that each of its finite subfamilies has a nonempty intersection,
then the entire family has a nonempty intersection. □

Several nice applications of Helly's theorem are indicated in the exercises
below, and we will meet a few more later in this book.

Bibliography and remarks. Helly proved Theorem 1.3.2 in 1913
and communicated it to Radon, who published a proof in [Rad21]. This
proof uses Radon's lemma, although the statement wasn't explicitly
formulated in Radon's paper. References to many other proofs and
generalizations can be found in the already mentioned surveys [Eck93]
and [DGK63].
Helly's theorem inspired a whole industry of Helly-type theorems.
A family $\mathcal{B}$ of sets is said to have Helly number $h$ if the following holds:
Whenever a finite subfamily $\mathcal{F} \subseteq \mathcal{B}$ is such that every $h$ or fewer sets
of $\mathcal{F}$ have a common point, then $\bigcap \mathcal{F} \neq \emptyset$. So Helly's theorem says
that the family of all convex sets in $\mathbf{R}^d$ has Helly number $d+1$. More
generally, let $P$ be some property of families of sets that is hereditary,
meaning that if $\mathcal{F}$ has property $P$ and $\mathcal{F}' \subseteq \mathcal{F}$, then $\mathcal{F}'$ has $P$ as well.
A family $\mathcal{B}$ is said to have Helly number $h$ with respect to $P$ if for
every finite $\mathcal{F} \subseteq \mathcal{B}$, all subfamilies of $\mathcal{F}$ of size at most $h$ having $P$
implies $\mathcal{F}$ having $P$. That is, the absence of $P$ is always witnessed by
some at most $h$ sets, so it is a "local" property.

Exercises
1. Prove Carathéodory's theorem (you may use Radon's lemma).
2. Let $K \subset \mathbf{R}^d$ be a convex set and let $C_1, \ldots, C_n \subseteq \mathbf{R}^d$, $n \geq d+1$, be
convex sets such that the intersection of every $d+1$ of them contains a
translated copy of $K$. Prove that then the intersection of all the sets $C_i$
also contains a translated copy of $K$.
This result was noted by Vincensini [Vin39] and by Klee [Kle53].
3. Find an example of 4 convex sets in the plane such that the intersection
of each 3 of them contains a segment of length 1, but the intersection of
all 4 contains no segment of length 1.
4. A strip of width $w$ is a part of the plane bounded by two parallel lines at
distance $w$. The width of a set $X \subseteq \mathbf{R}^2$ is the smallest width of a strip
containing $X$.
(a) Prove that a compact convex set of width 1 contains a segment of
length 1 of every direction.
(b) Let $\{C_1, C_2, \ldots, C_n\}$ be closed convex sets in the plane, $n \geq 3$, such
that the intersection of every 3 of them has width at least 1. Prove that
$\bigcap_{i=1}^{n} C_i$ has width at least 1.
The result as in (b), for arbitrary dimension $d$, was proved by Sallee
[Sal75], and a simple argument using Helly's theorem was noted by Buchman
and Valentine [BV82].
5. Statement: Each set $X \subset \mathbf{R}^2$ of diameter at most 1 (i.e., any 2 points
have distance at most 1) is contained in some disc of radius $1/\sqrt{3}$.
(a) Prove the statement for 3-element sets $X$.
(b) Prove the statement for all finite sets $X$.
(c) Generalize the statement to $\mathbf{R}^d$: determine the smallest $r = r(d)$ such
that every set of diameter 1 in $\mathbf{R}^d$ is contained in a ball of radius $r$ (prove
your claim).
The result as in (c) is due to Jung; see [DGK63].
6. Let $C \subset \mathbf{R}^d$ be a compact convex set. Prove that the mirror image of $C$
can be covered by a suitable translate of $C$ blown up by the factor of $d$;
that is, there is an $x \in \mathbf{R}^d$ with $-C \subseteq x + dC$.
7. (a) Prove that if the intersection of each 4 or fewer among convex sets
$C_1, \ldots, C_n \subseteq \mathbf{R}^2$ contains a ray then $\bigcap_{i=1}^{n} C_i$ also contains a ray.
(b) Show that the number 4 in (a) cannot be replaced by 3.
This result, and an analogous one in $\mathbf{R}^d$ with the Helly number $2d$, are
due to Katchalski [Kat78].
8. For a set $X \subseteq \mathbf{R}^2$ and a point $x \in X$, let us denote by $V(x)$ the set of all
points $y \in X$ that can "see" $x$, i.e., points such that the segment $xy$ is
contained in $X$. The kernel of $X$ is defined as the set of all points $x \in X$
such that $V(x) = X$. A set with a nonempty kernel is called star-shaped.
(a) Prove that the kernel of any set is convex.
(b) Prove that if $V(x) \cap V(y) \cap V(z) \neq \emptyset$ for every $x, y, z \in X$ and $X$ is
compact, then $X$ is star-shaped. That is, if every 3 paintings in a (planar)
art gallery can be seen at the same time from some location (possibly
different for different triples of paintings), then all paintings can be seen
simultaneously from somewhere. If it helps, assume that $X$ is a polygon.
(c) Construct a nonempty set $X \subseteq \mathbf{R}^2$ such that each of its finite subsets
can be seen from some point of $X$ but $X$ is not star-shaped.
The result in (b), as well as the $d$-dimensional generalization (with every
$d+1$ regions $V(x)$ intersecting), is called Krasnosel'skiĭ's theorem; see
[Eck93] for references and related results.
9. In the situation of Radon's lemma ($A$ is a $(d+2)$-point set in $\mathbf{R}^d$), call
a point $x \in \mathbf{R}^d$ a Radon point of $A$ if it is contained in convex hulls of
two disjoint subsets of $A$. Prove that if $A$ is in general position (no $d+1$
points affinely dependent), then its Radon point is unique.
10. (a) Let $X, Y \subset \mathbf{R}^2$ be finite point sets, and suppose that for every subset
$S \subseteq X \cup Y$ of at most 4 points, $S \cap X$ can be separated (strictly) by a
line from $S \cap Y$. Prove that $X$ and $Y$ are line-separable.
(b) Extend (a) to sets $X, Y \subset \mathbf{R}^d$, with $|S| \leq d+2$.
The result (b) is called Kirchberger's theorem [Kir03].
1.4 Centerpoint and Ham Sandwich


We prove an interesting result as an application of Helly's theorem.

1.4.1 Definition (Centerpoint). Let $X$ be an $n$-point set in $\mathbf{R}^d$. A point
$x \in \mathbf{R}^d$ is called a centerpoint of $X$ if each closed half-space containing $x$
contains at least $\frac{n}{d+1}$ points of $X$.
Let us stress that one set may generally have many centerpoints, and a
centerpoint need not belong to X.
The notion of centerpoint can be viewed as a generalization of the median
of one-dimensional data. Suppose that $x_1, \ldots, x_n \in \mathbf{R}$ are results of
measurements of an unknown real parameter $x$. How do we estimate $x$ from
the $x_i$? We can use the arithmetic mean, but if one of the measurements is
completely wrong (say, 100 times larger than the others), we may get quite
a bad estimate. A more "robust" estimate is a median, i.e., a point $x$ such
that at least $\frac{n}{2}$ of the $x_i$ lie in the interval $(-\infty, x]$ and at least $\frac{n}{2}$ of them lie
in $[x, \infty)$. The centerpoint can be regarded as a generalization of the median
for higher-dimensional data.
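The difference between the mean and a median is easy to see numerically. The measurements below are made-up values, with the last one "completely wrong", and `is_median` is a hypothetical helper that checks the definition of a median given above.

```python
import statistics


def is_median(x, data):
    """Check the definition: at least n/2 values are <= x and at least n/2 values are >= x."""
    n = len(data)
    at_most = sum(1 for v in data if v <= x)
    at_least = sum(1 for v in data if v >= x)
    return 2 * at_most >= n and 2 * at_least >= n


# Hypothetical measurements of a parameter close to 10; the last value is an outlier.
measurements = [9.8, 10.1, 10.0, 9.9, 1000.0]
mean_estimate = statistics.mean(measurements)
median_estimate = statistics.median(measurements)
```

Here the mean is pulled far away from 10 by the single outlier, while the median stays among the reasonable measurements.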
In the definition of centerpoint we could replace the fraction $\frac{1}{d+1}$ by some
other parameter $\alpha \in (0,1)$. For $\alpha > \frac{1}{d+1}$, such an "$\alpha$-centerpoint" need not
always exist: Take $d+1$ points in general position for $X$. With $\alpha = \frac{1}{d+1}$ as in
the definition above, a centerpoint always exists, as we prove next.
Centerpoints are important, for example, in some algorithms of divide-and-conquer
type, where they help divide the considered problem into smaller
subproblems. Since no really efficient algorithms are known for finding
"exact" centerpoints, the algorithms often use $\alpha$-centerpoints with a suitable
$\alpha < \frac{1}{d+1}$, which are easier to find.

1.4.2 Theorem (Centerpoint theorem). Each finite point set in $\mathbf{R}^d$ has
at least one centerpoint.

Proof. First we note an equivalent definition of a centerpoint: $x$ is a centerpoint
of $X$ if and only if it lies in each open half-space $\gamma$ such that
$|X \cap \gamma| > \frac{d}{d+1} n$.
We would like to apply Helly's theorem to conclude that all these open
half-spaces intersect. But we cannot proceed directly, since we have infinitely
many half-spaces and they are open and unbounded. Instead of such an open
half-space $\gamma$, we thus consider the compact convex set $\operatorname{conv}(X \cap \gamma) \subset \gamma$.

[Figure: an open half-space $\gamma$ and the compact convex set $\operatorname{conv}(\gamma \cap X)$.]

Letting $\gamma$ run through all open half-spaces $\gamma$ with $|X \cap \gamma| > \frac{d}{d+1} n$, we obtain
a family $\mathcal{C}$ of compact convex sets. Each of them contains more than $\frac{d}{d+1} n$
points of $X$, and so the intersection of any $d+1$ of them contains at least
one point of $X$ (each of the $d+1$ sets misses fewer than $\frac{n}{d+1}$ points of $X$, so
together they miss fewer than $n$ points). The family $\mathcal{C}$ consists of finitely many distinct sets (since $X$
has finitely many distinct subsets), and so $\bigcap \mathcal{C} \neq \emptyset$ by Helly's theorem. Each
point in this intersection is a centerpoint. □
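For small planar examples, the definition of a centerpoint can be checked by brute force. The sketch below computes the smallest number of points of $X$ in a closed half-plane containing a given point $x$; it suffices to consider half-planes with $x$ on the boundary, and the count can change only at directions perpendicular to some $p - x$. This is a floating-point illustration with hypothetical function names, not one of the efficient algorithms mentioned in the text, and it may misjudge degenerate inputs (for example, two points of $X$ exactly opposite each other as seen from $x$).

```python
import math


def halfplane_depth(x, points):
    """Smallest number of the given points in a closed half-plane containing x (planar case)."""
    coincident = sum(1 for p in points if p == x)
    angles = [math.atan2(p[1] - x[1], p[0] - x[0]) for p in points if p != x]
    if not angles:
        return coincident
    # Directions of the inward normal at which some point lies on the boundary line.
    critical = sorted({(a + s * math.pi / 2) % (2 * math.pi) for a in angles for s in (1, -1)})
    best = len(points)
    for i, start in enumerate(critical):
        end = critical[i + 1] if i + 1 < len(critical) else critical[0] + 2 * math.pi
        phi = (start + end) / 2  # a generic direction strictly between two critical ones
        count = coincident + sum(1 for a in angles if math.cos(a - phi) >= 0)
        best = min(best, count)
    return best


def is_centerpoint(x, points):
    """Planar centerpoint test: every closed half-plane containing x holds at least n/3 points."""
    return 3 * halfplane_depth(x, points) >= len(points)
```

For the three vertices of a triangle, every point of the triangle has depth 1 and is therefore a centerpoint, while points outside the triangle have depth 0.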

In the definition of a centerpoint we can regard the finite set $X$ as defining
a distribution of mass in $\mathbf{R}^d$. The centerpoint theorem asserts that for some
point $x$, any half-space containing $x$ encloses at least $\frac{1}{d+1}$ of the total mass.
It is not difficult to show that this remains valid for continuous mass distri-
butions, or even for arbitrary Borel probability measures on Rd (Exercise 1).
Ham-sandwich theorem and its relatives. Here is another important
result, not much related to convexity but with a flavor resembling the cen-
terpoint theorem.
1.4.3 Theorem (Ham-sandwich theorem). Every $d$ finite sets in $\mathbf{R}^d$ can
be simultaneously bisected by a hyperplane. A hyperplane $h$ bisects a finite
set $A$ if each of the open half-spaces defined by $h$ contains at most $\lfloor |A|/2 \rfloor$
points of $A$.
This theorem is usually proved via continuous mass distributions using
a tool from algebraic topology: the Borsuk-Ulam theorem. Here we omit a
proof.
Note that if $A_i$ has an odd number of points, then every $h$ bisecting $A_i$
passes through a point of $A_i$. Thus if $A_1, \ldots, A_d$ all have odd sizes and their
union is in general position, then every hyperplane simultaneously bisecting
them is determined by $d$ points, one of each $A_i$. In particular, there are only
finitely many such hyperplanes.
Again, an analogous ham-sandwich theorem holds for arbitrary d Borel
probability measures in Rd.
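
The remark about odd sizes suggests a naive search in the plane: if $A_1$ and $A_2$ are disjoint and both have an odd number of points, every bisecting line passes through a point of each, so it suffices to try all lines through one point of $A_1$ and one point of $A_2$. The sketch below does exactly that; it is only an illustration (the function names are hypothetical, and this is far from the linear-time algorithms mentioned in the remarks).

```python
def side(a, b, p):
    """Sign of this value tells on which side of the line ab the point p lies."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def bisects(a, b, pts):
    """True if each open half-plane of the line ab has at most floor(|pts|/2) points."""
    left = sum(1 for p in pts if side(a, b, p) > 0)
    right = sum(1 for p in pts if side(a, b, p) < 0)
    half = len(pts) // 2
    return left <= half and right <= half


def ham_sandwich_line(A1, A2):
    """Two points spanning a line bisecting both sets (odd sizes, disjoint sets assumed)."""
    for a in A1:
        for b in A2:
            if a != b and bisects(a, b, A1) and bisects(a, b, A2):
                return a, b
    return None
```

The running time is cubic in the total number of points, which is acceptable only for small examples.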
Center transversal theorem. There can be beautiful new things to dis-
cover even in well-studied areas of mathematics. A good example is the fol-
lowing recent result, which "interpolates" between the centerpoint theorem
and the ham-sandwich theorem.
1.4.4 Theorem (Center transversal theorem). Let $1 \le k \le d$ and let
$A_1, A_2, \ldots, A_k$ be finite point sets in $\mathbf{R}^d$. Then there exists a $(k-1)$-flat $f$
such that for every hyperplane $h$ containing $f$, both the closed half-spaces
defined by $h$ contain at least $\frac{1}{d-k+2}|A_i|$ points of $A_i$, $i = 1, 2, \ldots, k$.
The ham-sandwich theorem is obtained for k = d and the centerpoint
theorem for k = 1. The proof, which we again have to omit, is based on a
result of algebraic topology, too, but it uses a considerably more advanced
machinery than the ham-sandwich theorem. However, the weaker result with
$\frac{1}{d+1}$ instead of $\frac{1}{d-k+2}$ is easy to prove; see Exercise 2.

Bibliography and remarks. The centerpoint theorem was established
by Rado [Rad47]. According to Steinlein's survey [Ste85],
the ham-sandwich theorem was conjectured by Steinhaus (who also
invented the popular 3-dimensional interpretation, namely, that the
ham, the cheese, and the bread in any ham sandwich can be simultaneously
bisected by a single straight motion of the knife) and proved
by Banach. The center transversal theorem was found by Dol'nikov
[Dol'92] and, independently, by Živaljević and Vrećica [ZV90].
Significant effort has been devoted to efficient algorithms for finding
(approximate) centerpoints and ham-sandwich cuts (i.e., hyperplanes
as in the ham-sandwich theorem). In the plane, a ham-sandwich
cut for two $n$-point sets can be computed in linear time (Lo, Matoušek,
and Steiger [LMS94]). In a higher but fixed dimension, the complexity
of the best exact algorithms is currently slightly better than $O(n^{d-1})$.
A centerpoint in the plane, too, can be found in linear time (Jadhav
and Mukhopadhyay [JM94]). Both approximate ham-sandwich cuts
(in the ratio $1 : 1+\varepsilon$ for a fixed $\varepsilon > 0$) and approximate centerpoints
($(\frac{1}{d+1} - \varepsilon)$-centerpoints) can be computed in time $O(n)$ for every fixed
dimension $d$ and every fixed $\varepsilon > 0$, but the constant depends exponentially
on $d$, and the algorithms are impractical if the dimension is
not quite small. A practically efficient randomized algorithm for computing
approximate centerpoints in high dimensions ($\alpha$-centerpoints
with $\alpha \approx 1/d^2$) was given by Clarkson, Eppstein, Miller, Sturtivant,
and Teng [CEM+96].

Exercises
1. (Centerpoints for general mass distributions)
(a) Let $\mu$ be a Borel probability measure on $\mathbf{R}^d$; that is, $\mu(\mathbf{R}^d) = 1$ and
each open set is measurable. Show that for each open half-space $\gamma$ with
$\mu(\gamma) > t$ there exists a compact set $C \subset \gamma$ with $\mu(C) > t$. [2]
(b) Prove that each Borel probability measure in $\mathbf{R}^d$ has a centerpoint
(use (a) and the infinite Helly's theorem). [2]
2. Prove that for any $k$ finite sets $A_1, \ldots, A_k \subset \mathbf{R}^d$, where $1 \le k \le d$, there
exists a $(k-1)$-flat such that every hyperplane containing it has at least
$\frac{1}{d+1}|A_i|$ points of $A_i$ in both of its closed half-spaces for all $i = 1, 2, \ldots, k$.
2

Lattices and Minkowski's Theorem

This chapter is a quick excursion into the geometry of numbers, a field where
number-theoretic results are proved by geometric arguments, often using
properties of convex bodies in Rd. We formulate the simple but beautiful
theorem of Minkowski on the existence of a nonzero lattice point in every
symmetric convex body of sufficiently large volume. We derive several con-
sequences, concluding with a geometric proof of the famous theorem of La-
grange claiming that every natural number can be written as the sum of at
most 4 squares.

2.1 Minkowski's Theorem


In this section we consider the integer lattice $\mathbf{Z}^d$, and so a lattice point is a
point in $\mathbf{R}^d$ with integer coordinates. The following theorem can be used in
many interesting situations to establish the existence of lattice points with
certain properties.

2.1.1 Theorem (Minkowski's theorem). Let $C \subseteq \mathbf{R}^d$ be symmetric
(around the origin, i.e., $C = -C$), convex, bounded, and suppose that
$\operatorname{vol}(C) > 2^d$. Then $C$ contains at least one lattice point different from $0$.

Proof. We put $C' = \frac{1}{2}C = \{\frac{1}{2}x : x \in C\}$.

Claim: There exists a nonzero integer vector $v \in \mathbf{Z}^d \setminus \{0\}$ such that
$C' \cap (C' + v) \neq \emptyset$; i.e., $C'$ and a translate of $C'$ by an integer vector intersect.

Proof. By contradiction; suppose the claim is false. Let $R$ be a large
integer number. Consider the family $\mathcal{C}$ of translates of $C'$ by the
integer vectors in the cube $[-R, R]^d$: $\mathcal{C} = \{C' + v : v \in [-R, R]^d \cap \mathbf{Z}^d\}$,
as is indicated in the drawing ($C'$ is painted in gray).

Each such translate is disjoint from $C'$, and thus every two of these
translates are disjoint as well. They are all contained in the enlarged
cube $K = [-R - D, R + D]^d$, where $D$ denotes the diameter of $C'$.
Hence

$$\operatorname{vol}(K) = (2R + 2D)^d \ge |\mathcal{C}| \operatorname{vol}(C') = (2R + 1)^d \operatorname{vol}(C'),$$

and

$$\operatorname{vol}(C') \le \left(1 + \frac{2D - 1}{2R + 1}\right)^d.$$

The expression on the right-hand side is arbitrarily close to 1 for
sufficiently large $R$. On the other hand, $\operatorname{vol}(C') = 2^{-d} \operatorname{vol}(C) > 1$ is
a fixed number exceeding 1 by a certain amount independent of $R$,
a contradiction. The claim thus holds. $\Box$

Now let us fix a $v \in \mathbf{Z}^d$ as in the claim and let us choose a point
$x \in C' \cap (C' + v)$. Then we have $x - v \in C'$, and since $C'$ is symmetric, we obtain
$v - x \in C'$. Since $C'$ is convex, the midpoint of the segment $x(v - x)$ lies in
$C'$ too, and so we have $\frac{1}{2}x + \frac{1}{2}(v - x) = \frac{1}{2}v \in C'$. This means that $v \in C$,
which proves Minkowski's theorem. $\Box$

2.1.2 Example (About a regular forest). Let $K$ be a circle of diameter
26 (meters, say) centered at the origin. Trees of diameter 0.16 grow at each
lattice point within $K$ except for the origin, which is where you are standing.
Prove that you cannot see outside this miniforest.

[Figure: the miniforest of trees at the lattice points inside $K$, with the observer at the origin.]

Proof. Suppose that one could see outside along some line $\ell$ passing through
the origin. This means that the strip $S$ of width 0.16 with $\ell$ as the middle
line contains no lattice point in $K$ except for the origin. In other words, the
symmetric convex set $C = K \cap S$ contains no lattice points but the origin. But
as is easy to calculate, $\operatorname{vol}(C) > 4$, which contradicts Minkowski's theorem.
$\Box$
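
The volume estimate in the proof can be checked numerically. The sketch below (variable names are ours) bounds the area of $K \cap S$ from below by the rectangle inscribed in the strip between the two points where each boundary line of $S$ meets the circle, and also evaluates the exact area of the strip inside the disk for comparison.

```python
import math

R = 13.0          # radius of the circle K (diameter 26)
w = 0.16          # width of the strip S
h = w / 2         # distance from the middle line to the boundary of S

# Half-length of the chord cut from the circle by a boundary line of S.
s = math.sqrt(R * R - h * h)

# Rectangle of width w and length 2s inscribed in K ∩ S: a lower bound.
rect_area = w * 2 * s

# Exact area of K ∩ S: twice the integral of 2*sqrt(R^2 - x^2) over [0, h].
exact_area = 2 * (h * s + R * R * math.asin(h / R))

print(rect_area, exact_area)
```

Both values are approximately 4.16, so the lower bound already exceeds $4 = 2^2$, as Minkowski's theorem requires.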
2.1.3 Proposition (Approximating an irrational number by a fraction).
Let $\alpha \in (0,1)$ be a real number and $N$ a natural number. Then there
exists a pair of natural numbers $m, n$ such that $n \le N$ and

$$\left|\alpha - \frac{m}{n}\right| < \frac{1}{nN}.$$

This proposition implies that there are infinitely many pairs $m, n$ such
that $|\alpha - \frac{m}{n}| < 1/n^2$ (Exercise 4). This is a basic and well-known result
in elementary number theory. It can also be proved using the pigeonhole
principle.
The proposition has an analogue concerning the approximation of several
numbers $\alpha_1, \ldots, \alpha_k$ by fractions with a common denominator (see Exercise 5),
and there a proof via Minkowski's theorem seems to be the simplest.
Proof of Proposition 2.1.3. Consider the set

$$C = \{(x, y) \in \mathbf{R}^2 : -N - \tfrac{1}{2} \le x \le N + \tfrac{1}{2},\ |\alpha x - y| < \tfrac{1}{N}\}.$$

[Figure: the set $C$, a strip around the line $y = \alpha x$.]

This is a symmetric convex set of area $(2N+1)\frac{2}{N} > 4$, and therefore it contains
some nonzero integer lattice point $(n, m)$. By symmetry, we may assume
$n > 0$. The definition of $C$ gives $n \le N$ and $|\alpha n - m| < \frac{1}{N}$.
In other words, $|\alpha - \frac{m}{n}| < \frac{1}{nN}$. $\Box$
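
For small $N$ the pair $(m, n)$ promised by Proposition 2.1.3 can simply be searched for. The following sketch is an illustration only: the name `approximate` is hypothetical, the input is converted to an exact rational so that the strict inequality is tested without rounding errors (a floating-point input is therefore treated as the rational number it represents), and $m$ is allowed to be 0 when $\alpha$ is very small.

```python
from fractions import Fraction
import math


def approximate(alpha, N):
    """Return (m, n) with 1 <= n <= N and |alpha - m/n| < 1/(n*N)."""
    alpha = Fraction(alpha)
    for n in range(1, N + 1):
        m = round(alpha * n)              # nearest integer to alpha * n
        if abs(alpha - Fraction(m, n)) < Fraction(1, n * N):
            return m, n
    return None                           # not reached, by the proposition
```

For instance, for $\alpha = \sqrt{2} - 1$ and $N = 10$ the first denominator that works is $n = 5$, with $m = 2$.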

Bibliography and remarks. The name "geometry of numbers"
was coined by Minkowski, who initiated a systematic study of this
field (although related ideas appeared in earlier works). He proved
Theorem 2.1.1, in a more general form mentioned later on, in 1891
(see [Min96]). His first application was a theorem on simultaneously
making linear forms small (Exercise 2.2.4). While geometry of numbers
originated as a tool in number theory, for questions in Diophantine
approximation and quadratic forms, today it also plays a significant
role in several other diverse areas, such as coding theory, cryptography,
the theory of uniform distribution, and numerical integration.
Theorem 2.1.1 is often called Minkowski's first theorem. What is,
then, Minkowski's second theorem? We answer this natural question
in the notes to Section 2.2, where we also review a few more of the
basic results in the geometry of numbers and point to some interesting
connections and directions of research.
Most of our exposition in this chapter follows a similar chapter in
Pach and Agarwal [PA95]. Older books on the geometry of numbers
are Cassels [Cas59] and Gruber and Lekkerkerker [GL87]. A pleasant
but somewhat aged introduction is Siegel [Sie89]. Gruber [Gru93]
provides a concise recent overview.

Exercises
1. Prove: If $C \subseteq \mathbf{R}^d$ is convex, symmetric around the origin, bounded, and
such that $\operatorname{vol}(C) > k 2^d$, then $C$ contains at least $2k$ lattice points. ~
2. By the method of the proof of Minkowski's theorem, show the following
result (Blichfeldt; Van der Corput): If $S \subseteq \mathbf{R}^d$ is measurable and $\operatorname{vol}(S) > k$,
then there are points $s_1, s_2, \ldots, s_k \in S$ with all $s_i - s_j \in \mathbf{Z}^d$, $1 \le i, j \le k$. @]
3. Show that the boundedness of $C$ in Minkowski's theorem is not really
necessary. [1]
4. (a) Verify the claim made after Proposition 2.1.3, namely, that for any
irrational $\alpha$ there are infinitely many pairs $m, n$ such that $|\alpha - m/n| < 1/n^2$. [1]
(b) Prove that for $\alpha = \sqrt{2}$ there are only finitely many pairs $m, n$ with
$|\alpha - m/n| < 1/(4n^2)$. ~
(c) Show that for any algebraic irrational number $\alpha$ (i.e., a root of a
univariate polynomial with integer coefficients) there exists a constant $D$
such that $|\alpha - m/n| < 1/n^D$ holds for finitely many pairs $(m, n)$ only.
Conclude that, for example, the number $\sum_{i=1}^{\infty} 2^{-i!}$ is not algebraic. ill

5. (a) Let $\alpha_1, \alpha_2 \in (0,1)$ be real numbers. Prove that for a given $N \in \mathbf{N}$
there exist $m_1, m_2, n \in \mathbf{N}$, $n \le N$, such that $|\alpha_i - \frac{m_i}{n}| < \frac{1}{n\sqrt{N}}$, $i = 1, 2$.
8J
(b) Formulate and prove an analogous result for the simultaneous approximation
of $d$ real numbers by rationals with a common denominator.
o (This is a result of Dirichlet [Dir42].)
6. Let $K \subset \mathbf{R}^2$ be a compact convex set of area $\alpha$ and let $x$ be a point
chosen uniformly at random in $[0, 1)^2$.
(a) Prove that the expected number of points of $\mathbf{Z}^2$ in the set $K + x$
equals $\alpha$. 0
(b) Show that with probability at least $1 - \alpha$, $K + x$ contains no point
of $\mathbf{Z}^2$. III

2.2 General Lattices


Let $z_1, z_2, \ldots, z_d$ be a $d$-tuple of linearly independent vectors in $\mathbf{R}^d$. We define
the lattice with basis $\{z_1, z_2, \ldots, z_d\}$ as the set of all linear combinations of
the $z_i$ with integer coefficients; that is,

$$\Lambda = \Lambda(z_1, z_2, \ldots, z_d) = \{i_1 z_1 + i_2 z_2 + \cdots + i_d z_d : (i_1, i_2, \ldots, i_d) \in \mathbf{Z}^d\}.$$

Let us remark that this lattice has in general many different bases. For instance,
the sets $\{(0, 1), (1, 0)\}$ and $\{(1, 0), (3, 1)\}$ are both bases of the "standard"
lattice $\mathbf{Z}^2$.
Let us form a $d \times d$ matrix $Z$ with the vectors $z_1, \ldots, z_d$ as columns. We
define the determinant of the lattice $\Lambda = \Lambda(z_1, z_2, \ldots, z_d)$ as $\det \Lambda = |\det Z|$.
Geometrically, $\det \Lambda$ is the volume of the parallelepiped
$\{\alpha_1 z_1 + \alpha_2 z_2 + \cdots + \alpha_d z_d : \alpha_1, \ldots, \alpha_d \in [0, 1]\}$
(the proof is left to Exercise 1). The number $\det \Lambda$ is indeed a property of the
lattice $\Lambda$ (as a point set), and it does not depend on the choice of the basis
of $\Lambda$ (Exercise 2). It is not difficult to show that if $Z$ is the matrix of some
basis of $\Lambda$, then the matrix of every basis of $\Lambda$ has the form $ZU$, where $U$ is
an integer matrix with determinant $\pm 1$.
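
The two bases of $\mathbf{Z}^2$ mentioned above give a quick sanity check of these statements. The sketch below is restricted to $d = 2$ and uses hypothetical helper names; a matrix is stored as a pair of rows, with the basis vectors as its columns. It computes the change-of-basis matrix $U = Z_1^{-1} Z_2$ exactly and tests whether it is an integer matrix with determinant $\pm 1$.

```python
from fractions import Fraction


def det2(Z):
    """Determinant of a 2x2 matrix given as a pair of rows."""
    (a, b), (c, d) = Z
    return a * d - b * c


def change_of_basis(Z1, Z2):
    """Return the matrix U with Z2 = Z1 * U (exact rational arithmetic)."""
    (a, b), (c, d) = Z1
    det = Fraction(det2(Z1))
    inv = ((d / det, -b / det), (-c / det, a / det))
    return tuple(
        tuple(sum(inv[i][k] * Z2[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def same_lattice(Z1, Z2):
    """True if the columns of Z1 and of Z2 generate the same planar lattice."""
    U = change_of_basis(Z1, Z2)
    is_integer = all(x.denominator == 1 for row in U for x in row)
    return is_integer and abs(det2(U)) == 1


Z1 = ((0, 1), (1, 0))   # columns (0, 1) and (1, 0)
Z2 = ((1, 3), (0, 1))   # columns (1, 0) and (3, 1)
```

Here `same_lattice(Z1, Z2)` holds and both determinants have absolute value 1, whereas the columns $(2, 0)$, $(0, 1)$ generate a proper sublattice of determinant 2.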

2.2.1 Theorem (Minkowski's theorem for general lattices). Let $\Lambda$ be
a lattice in $\mathbf{R}^d$, and let $C \subseteq \mathbf{R}^d$ be a symmetric convex set with
$\operatorname{vol}(C) > 2^d \det \Lambda$. Then $C$ contains a point of $\Lambda$ different from $0$.

Proof. Let $\{z_1, \ldots, z_d\}$ be a basis of $\Lambda$. We define a linear mapping
$f\colon \mathbf{R}^d \to \mathbf{R}^d$ by $f(x_1, x_2, \ldots, x_d) = x_1 z_1 + x_2 z_2 + \cdots + x_d z_d$. Then $f$ is a bijection and
$\Lambda = f(\mathbf{Z}^d)$. For any convex set $X$, we have $\operatorname{vol}(f(X)) = \det(\Lambda) \operatorname{vol}(X)$.
(Sketch of proof: This holds if $X$ is a cube, and a convex set can be approximated
by a disjoint union of sufficiently small cubes with arbitrary
precision.) Let us put $C' = f^{-1}(C)$. This is a symmetric convex set with
$\operatorname{vol}(C') = \operatorname{vol}(C)/\det \Lambda > 2^d$. Minkowski's theorem provides a nonzero vector
$v \in C' \cap \mathbf{Z}^d$, and $f(v)$ is the desired point as in the theorem. $\Box$

A seemingly more general definition of a lattice. What if we consider
integer linear combinations of more than $d$ vectors in $\mathbf{R}^d$? Some caution is
necessary: If we take $d = 1$ and the vectors $v_1 = (1)$, $v_2 = (\sqrt{2})$, then
the integer linear combinations $i_1 v_1 + i_2 v_2$ are dense in the real line (by
Proposition 2.1.3), and such a set is not what we would like to call a lattice.
In order to exclude such pathology, we define a discrete subgroup of $\mathbf{R}^d$
as a set $\Lambda \subset \mathbf{R}^d$ such that whenever $x, y \in \Lambda$, then also $x - y \in \Lambda$, and such
that the distance of any two distinct points of $\Lambda$ is at least $\delta$, for some fixed
positive real number $\delta > 0$.
It can be shown, for instance, that if $v_1, v_2, \ldots, v_n \in \mathbf{R}^d$ are vectors with
rational coordinates, then the set $\Lambda$ of all their integer linear combinations
is a discrete subgroup of $\mathbf{R}^d$ (Exercise 3). As the following theorem shows,
any discrete subgroup of $\mathbf{R}^d$ whose linear span is all of $\mathbf{R}^d$ is a lattice in the
sense of the definition given at the beginning of this section.

2.2.2 Theorem (Lattice basis theorem). Let $\Lambda \subset \mathbf{R}^d$ be a discrete
subgroup of $\mathbf{R}^d$ whose linear span is $\mathbf{R}^d$. Then $\Lambda$ has a basis; that is,
there exist $d$ linearly independent vectors $z_1, z_2, \ldots, z_d \in \mathbf{R}^d$ such that
$\Lambda = \Lambda(z_1, z_2, \ldots, z_d)$.

Proof. We proceed by induction. For some $i$, $1 \le i \le d+1$, suppose that
linearly independent vectors $z_1, z_2, \ldots, z_{i-1} \in \Lambda$ with the following property
have already been constructed. If $F_{i-1}$ denotes the $(i-1)$-dimensional
subspace spanned by $z_1, \ldots, z_{i-1}$, then all points of $\Lambda$ lying in $F_{i-1}$ can be
written as integer linear combinations of $z_1, \ldots, z_{i-1}$. For $i = d+1$, this gives
the statement of the theorem.
So consider an $i \le d$. Since $\Lambda$ generates $\mathbf{R}^d$, there exists a vector $w \in \Lambda$
not lying in the subspace $F_{i-1}$. Let $P$ be the $i$-dimensional parallelepiped
determined by $z_1, z_2, \ldots, z_{i-1}$ and by $w$:
$P = \{\alpha_1 z_1 + \alpha_2 z_2 + \cdots + \alpha_{i-1} z_{i-1} + \alpha_i w : \alpha_1, \ldots, \alpha_i \in [0, 1]\}$.
Among all the (finitely many) points of $\Lambda$ lying in
$P$ but not in $F_{i-1}$, choose one nearest to $F_{i-1}$ and call it $z_i$.

[Figure: the parallelepiped $P$ and the choice of $z_i$.]

Note that if the points of $\Lambda \cap P$ are written in the form
$\alpha_1 z_1 + \alpha_2 z_2 + \cdots + \alpha_{i-1} z_{i-1} + \alpha_i w$, then $z_i$ is one with the smallest $\alpha_i$.
It remains to show that $z_1, z_2, \ldots, z_i$ have the required property.
So let $v \in \Lambda$ be a point lying in $F_i$ (the linear span of $z_1, \ldots, z_i$). We
can write $v = \beta_1 z_1 + \beta_2 z_2 + \cdots + \beta_i z_i$ for some real numbers $\beta_1, \ldots, \beta_i$. Let
$\gamma_j$ be the fractional part of $\beta_j$, $j = 1, 2, \ldots, i$; that is, $\gamma_j = \beta_j - \lfloor \beta_j \rfloor$. Put
$v' = \gamma_1 z_1 + \gamma_2 z_2 + \cdots + \gamma_i z_i$. This point also lies in $\Lambda$ (since $v$ and $v'$ differ
by an integer linear combination of vectors of $\Lambda$). We have $0 \le \gamma_j < 1$, and
hence $v'$ lies in the parallelepiped $P$. Therefore, we must have $\gamma_i = 0$, for
otherwise, $v'$ would be nearer to $F_{i-1}$ than $z_i$. Hence $v' \in \Lambda \cap F_{i-1}$, and by
the inductive hypothesis, we also get that all the other $\gamma_j$ are 0. So all the $\beta_j$
are in fact integer coefficients, and the inductive step is finished. $\Box$
Therefore, a lattice can also be defined as a full-dimensional discrete sub-
group of Rd.

Bibliography and remarks. First we mention several fundamental
theorems in the "classical" geometry of numbers.
Lattice packing and the Minkowski-Hlawka theorem. For a compact
$C \subset \mathbf{R}^d$, the lattice constant $\Delta(C)$ is defined as
$\min\{\det(\Lambda) : \Lambda \cap C = \{0\}\}$, where the minimum is over all lattices $\Lambda$ in $\mathbf{R}^d$ (it can be shown
by a suitable compactness argument, known as the compactness theorem
of Mahler, that the minimum is attained). The ratio $\operatorname{vol}(C)/\Delta(C)$
is the smallest number $D = D(C)$ for which the Minkowski-like result
holds: Whenever $\operatorname{vol}(C) > D \det(\Lambda)$, we have $C \cap \Lambda \neq \{0\}$. It is also
easy to check that $2^{-d} D(C)$ equals the maximum density of a lattice
packing of $C$; i.e., the fraction of $\mathbf{R}^d$ that can be filled by the set
$C + \Lambda$ for some lattice $\Lambda$ such that all the translates $C + v$, $v \in \Lambda$,
have pairwise disjoint interiors. A basic result (obtained by an averaging
argument) is the Minkowski-Hlawka theorem, which shows that
$D \ge 1$ for all star-shaped compact sets $C$. If $C$ is star-shaped and
symmetric, then we have the improved lower bound (better packing)
$D \ge 2\zeta(d) = 2\sum_{n=1}^{\infty} n^{-d}$. This brings us to the fascinating field of
lattice packings, which we do not pursue in this book; a nice geometric
introduction is in the first half of the book Pach and Agarwal [PA95],
and an authoritative reference is Conway and Sloane [CS99]. Let us
remark that the lattice constant (and hence the maximum lattice packing
density) is not known in general even for Euclidean spheres, and
many ingenious constructions and arguments have been developed for
packing them efficiently. These problems also have close connections
to error-correcting codes.
Successive minima and Minkowski's second theorem. Let $C \subset \mathbf{R}^d$
be a convex body containing $0$ in the interior and let $\Lambda \subset \mathbf{R}^d$
be a lattice. The $i$th successive minimum of $C$ with respect to $\Lambda$,
denoted by $\lambda_i = \lambda_i(C, \Lambda)$, is the infimum of the scaling factors
$\lambda > 0$ such that $\lambda C$ contains at least $i$ linearly independent vectors
of $\Lambda$. In particular, $\lambda_1$ is the smallest number for which $\lambda_1 C$
contains a nonzero lattice vector, and Minkowski's theorem guarantees
that $\lambda_1^d \le 2^d \det(\Lambda)/\operatorname{vol}(C)$. Minkowski's second theorem asserts
$(2^d/d!) \det(\Lambda) \le \lambda_1 \lambda_2 \cdots \lambda_d \cdot \operatorname{vol}(C) \le 2^d \det(\Lambda)$.
The flatness theorem. If a convex body $K$ is not required to be symmetric
about $0$, then it can have arbitrarily large volume without containing
a lattice point. But any lattice-point free body has to be flat:
For every dimension $d$ there exists $c(d)$ such that any convex body
$K \subseteq \mathbf{R}^d$ with $K \cap \mathbf{Z}^d = \emptyset$ has lattice width at most $c(d)$. The lattice
width of $K$ is defined as
$\min\{\max_{x \in K} \langle x, y \rangle - \min_{x \in K} \langle x, y \rangle : y \in \mathbf{Z}^d \setminus \{0\}\}$;
geometrically, we essentially count the number of hyperplanes
orthogonal to $y$, spanned by points of $\mathbf{Z}^d$, and intersecting $K$.
Such a result was first proved by Khintchine in 1948, and the current
best bound $c(d) = O(d^{3/2})$ is due to Banaszczyk, Litvak, Pajor, and
Szarek [BLPS99]; we also refer to this paper for more references.
Computing lattice points in convex bodies. Minkowski's theorem pro-
vides the existence of nonzero lattice points in certain convex bodies.
Given one of these bodies, how efficiently can one actually compute
a nonzero lattice point in it? More generally, given a convex body in
Rd, how difficult is it to decide whether it contains a lattice point, or
to count all lattice points? For simplicity, we consider only the integer
lattice Zd here.
First, if the dimension d is considered as a constant, such prob-
lems can be solved efficiently, at least in theory. An algorithm due to
Lenstra [Len83] finds in polynomial time an integer point, if one exists,
in a given convex polytope in R d, d fixed. It is based on the flatness
theorem mentioned above (the ideas are also explained in many other
sources, e.g., [GLS88], [Lov86], [Sch86], [Bar97]). More recently, Barvi-
nok [Bar93] (or see [Bar97]) provided a polynomial-time algorithm for
counting the integer points in a given fixed-dimensional convex polytope.
Both algorithms are nice and certainly nontrivial, and especially
the latter can be recommended as a neat application of classical mathematical
results in a new context.
On the other hand, if the dimension d is considered as a part of the
input then (exact) calculations with lattices tend to be algorithmically
difficult. Most of the difficult problems of combinatorial optimization
can be formulated as instances of integer programming, where a given
linear function should be minimized over the set of integer points in a
given convex polytope. This problem is well known to be NP-hard, and
so is the problem of deciding whether a given convex polytope contains
an integer point (both problems are actually polynomially equivalent).
For an introduction to integer programming see, e.g., Schrijver [Sch86].
Some much more special problems concerning lattices have also
been shown to be algorithmically difficult. For example, finding a
shortest (nonzero) vector in a given lattice $\Lambda$ specified by a basis is
NP-hard (with respect to randomized polynomial-time reductions). (In
the notation introduced above, we are asking for $\lambda_1(B^d, \Lambda)$, the first
successive minimum of the ball.) This took quite some time to prove
(Micciancio [Mic98] has obtained the strongest result to date, inapproximability
up to the factor of $\sqrt{2}$, building on earlier work mainly
of Ajtai), although the analogous hardness result for the shortest vector
in the maximum norm (i.e., $\lambda_1([-1, 1]^d, \Lambda)$) has been known for a
long time.
Basis reduction and applications. Although finding the shortest vector
of a lattice $\Lambda$ is algorithmically difficult, the shortest vector can
be approximated in the following sense. For every $\varepsilon > 0$ there is a
polynomial-time algorithm that, given a basis of a lattice $\Lambda$ in $\mathbf{R}^d$,
computes a nonzero vector of $\Lambda$ whose length is at most $(1 + \varepsilon)^d$ times
the length of the shortest vector of $\Lambda$; this was proved by Schnorr
[Sch87]. The first result of this type, with a worse bound on the approximation
factor, was obtained in the seminal work of Lenstra, Lenstra,
and Lovász [LLL82]. The LLL algorithm, as it is called, computes not
only a single short vector but a whole "short" basis of $\Lambda$.
The key notion in the algorithm is that of a reduced basis of $\Lambda$;
intuitively, this means a basis that cannot be much improved (made
significantly shorter) by a simple local transformation. There are many
technically different notions of reduced bases. Some of them are classical
and have been considered by mathematicians such as Gauss and
Lagrange. The definition of the Lovász-reduced basis used in the LLL
algorithm is sufficiently relaxed so that a reduced basis can be computed
from any initial basis by polynomially many local improvements,
and, at the same time, is strong enough to guarantee that a reduced
basis is relatively short. These results are covered in many sources; the
thin book by Lovász [Lov86] can still be recommended as a delightful
introduction. Numerous refinements of the LLL algorithm, as well as
efficient implementations, are available.
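
To give a feeling for "local improvements" of a basis, here is a sketch of the classical two-dimensional reduction usually attributed to Lagrange and Gauss. It is not the LLL algorithm and is offered only as an illustration under stated assumptions: the function name is hypothetical, the input is assumed to be a basis of a planar lattice with integer coordinates, and the output is a basis $(u, v)$ with $|u| \le |v|$ and $|\langle u, v \rangle| \le |u|^2/2$, in which case $u$ is a shortest nonzero lattice vector.

```python
def lagrange_gauss(u, v):
    """Reduce a basis (u, v) of a planar integer lattice.

    Repeatedly subtracts from the longer vector the nearest-integer
    multiple of the shorter one, until no further shortening is possible.
    """
    def norm2(w):
        return w[0] * w[0] + w[1] * w[1]

    while True:
        if norm2(v) < norm2(u):
            u, v = v, u                       # keep u the shorter vector
        n = norm2(u)
        dot = u[0] * v[0] + u[1] * v[1]
        m = (2 * dot + n) // (2 * n)          # nearest integer to dot / n
        if m == 0:
            return u, v
        v = (v[0] - m * u[0], v[1] - m * u[1])
```

Each step replaces one basis vector by its sum with an integer multiple of the other, so the generated lattice and its determinant do not change. For the basis $(1, 5)$, $(0, 13)$ the procedure ends with a shortest vector of squared length 13.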
We sketch an ingenious application of the LLL algorithm for polynomial
factorization (from Kannan, Lenstra, and Lovász [KLL88]; the
original LLL technique is somewhat different). Assume for simplicity
that we want to factor a monic polynomial $p(x) \in \mathbf{Z}[x]$ (integer coefficients,
leading coefficient 1) into a product of factors irreducible over
$\mathbf{Z}[x]$. By numerical methods we can compute a root $\alpha$ of $p(x)$ with
very high precision. If we can find the minimal polynomial of $\alpha$, i.e.,
the lowest-degree monic polynomial $q(x) \in \mathbf{Z}[x]$ with $q(\alpha) = 0$, then
we are done, since $q(x)$ is irreducible and divides $p(x)$. Let us write
$q(x) = x^d + a_{d-1} x^{d-1} + \cdots + a_0$. Let $K$ be a large number and let us
consider the $d$-dimensional lattice $\Lambda$ in $\mathbf{R}^{d+1}$ with basis $(K, 1, 0, \ldots, 0)$,
$(K\alpha, 0, 1, 0, \ldots, 0)$, $(K\alpha^2, 0, 0, 1, 0, \ldots, 0)$, ..., $(K\alpha^d, 0, \ldots, 0, 1)$. Combining
the basis vectors with the coefficients $a_0, a_1, \ldots, a_{d-1}, 1$, respectively,
we obtain the vector $v_0 = (0, a_0, a_1, \ldots, a_{d-1}, 1) \in \Lambda$. It turns
out that if $K$ is sufficiently large compared to the $a_i$, then $v_0$ is the
shortest nonzero vector, and moreover, every vector not much longer
than $v_0$ is a multiple of $v_0$. The LLL algorithm applied to $\Lambda$ thus finds
$v_0$, and this yields $q(x)$. Of course, we do not know the degree of $q(x)$,
but we can test all possible degrees one by one, and the required magnitude
of $K$ can be estimated from the coefficients of $p(x)$.
The LLL algorithm has been used for the knapsack problem and for
the subset sum problem. Typically, the applications are problems where
one needs to express a given number (or vector) as a linear combination
of given numbers (or vectors) with small integer coefficients. For
example, the subset sum problem asks, for given integers $a_1, a_2, \ldots, a_n$
and $b$, for a subset $I \subseteq \{1, 2, \ldots, n\}$ with $\sum_{i \in I} a_i = b$; i.e., $b$ should be
expressed as a linear combination of the $a_i$ with 0/1 coefficients. These
and many other significant applications can be found in Grötschel,
Lovász, and Schrijver [GLS88]. In cryptography, several cryptographic
systems proposed in the literature were broken with the help of the
LLL algorithm (references are listed, e.g., in [GLS88], [Dwo97]). On
the other hand, lattices play a prominent role in recent constructions,
mainly due to Ajtai, of new cryptographic systems. While currently
the security of every known efficient cryptographic system depends
on an (unproven) assumption of hardness of a certain computational
problem, Ajtai's methods suffice with a considerably weaker and more
plausible assumption than those required by the previous systems (see
[Ajt98] or [Dwo97] for an introduction).

Exercises
1. Let $v_1, \ldots, v_d$ be linearly independent vectors in $\mathbf{R}^d$. Form a matrix $A$
with $v_1, \ldots, v_d$ as rows. Prove that $|\det A|$ is equal to the volume of the
parallelepiped $\{\alpha_1 v_1 + \alpha_2 v_2 + \cdots + \alpha_d v_d : \alpha_1, \ldots, \alpha_d \in [0, 1]\}$. (You may
want to start with $d = 2$.) [2]
2. Prove that if $z_1, \ldots, z_d$ and $z_1', \ldots, z_d'$ are vectors in $\mathbf{R}^d$ such that
$\Lambda(z_1, \ldots, z_d) = \Lambda(z_1', \ldots, z_d')$, then $|\det Z| = |\det Z'|$, where $Z$ is the
$d \times d$ matrix with the $z_i$ as columns, and similarly for $Z'$. [2]
3. Prove that for $n$ rational vectors $v_1, \ldots, v_n$, the set
$\Lambda = \{i_1 v_1 + i_2 v_2 + \cdots + i_n v_n : i_1, i_2, \ldots, i_n \in \mathbf{Z}\}$ is a discrete subgroup of $\mathbf{R}^d$. [2]
4. (Minkowski's theorem on linear forms) Prove the following from Minkowski's
theorem: Let $\ell_i(x) = \sum_{j=1}^{d} a_{ij} x_j$ be linear forms in $d$ variables,
$i = 1, 2, \ldots, d$, such that the $d \times d$ matrix $(a_{ij})_{i,j}$ has determinant 1.
Let $b_1, \ldots, b_d$ be positive real numbers with $b_1 b_2 \cdots b_d = 1$. Then there
exists a nonzero integer vector $z \in \mathbf{Z}^d \setminus \{0\}$ with $|\ell_i(z)| \le b_i$ for all
$i = 1, 2, \ldots, d$. [2]

2.3 An Application in Number Theory


We prove one nontrivial result of elementary number theory. The proof via
Minkowski's theorem is one of several possible proofs. Another proof uses the
pigeonhole principle in a clever way.

2.3.1 Theorem (Two-square theorem). Each prime $p \equiv 1 \pmod 4$ can
be written as a sum of two squares: $p = a^2 + b^2$, $a, b \in \mathbf{Z}$.

Let $F = GF(p)$ stand for the field of residue classes modulo $p$, and let
$F^* = F \setminus \{0\}$. An element $a \in F^*$ is called a quadratic residue modulo $p$
if there exists an $x \in F^*$ with $x^2 \equiv a \pmod p$. Otherwise, $a$ is a quadratic
nonresidue.

2.3.2 Lemma. If $p$ is a prime with $p \equiv 1 \pmod 4$ then $-1$ is a quadratic
residue modulo $p$.

Proof. The equation $i^2 = 1$ has two solutions in the field $F$, namely $i = 1$
and $i = -1$. Hence for any $i \neq \pm 1$ there exists exactly one $j \neq i$ with
$ij = 1$ (namely, $j = i^{-1}$, the inverse element in $F$), and all the elements of
$F^* \setminus \{-1, 1\}$ can be divided into pairs such that the product of elements in
each pair is 1. Therefore, $(p-1)! = 1 \cdot 2 \cdots (p-1) \equiv -1 \pmod p$.
For a contradiction, suppose that the equation $i^2 = -1$ has no solution
in $F$. Then all the elements of $F^*$ can be divided into pairs such that the
product of the elements in each pair is $-1$. There are $(p-1)/2$ pairs, which
is an even number. Hence $(p-1)! \equiv (-1)^{(p-1)/2} = 1$, a contradiction. $\Box$
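
Both facts used in this proof are easy to test for small primes. The sketch below (with hypothetical function names) computes $(p-1)!$ modulo $p$ and lists the solutions of $i^2 \equiv -1 \pmod p$ by exhaustive search; it is meant as a sanity check, not as a practical way of finding square roots modulo $p$.

```python
def factorial_mod(p):
    """(p - 1)! reduced modulo p."""
    r = 1
    for i in range(1, p):
        r = r * i % p
    return r


def square_roots_of_minus_one(p):
    """All i in {1, ..., p-1} with i*i congruent to -1 modulo p."""
    return [i for i in range(1, p) if (i * i + 1) % p == 0]
```

For $p = 13$ the search returns 5 and 8, while for primes $p \equiv 3 \pmod 4$ such as 7 or 11 it returns nothing.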

Proof of Theorem 2.3.1. By the lemma, we can choose a number $q$ such
that $q^2 \equiv -1 \pmod p$. Consider the lattice $\Lambda = \Lambda(z_1, z_2)$, where $z_1 = (1, q)$
and $z_2 = (0, p)$. We have $\det \Lambda = p$. We use Minkowski's theorem for general
lattices (Theorem 2.2.1) for the disk $C = \{(x, y) \in \mathbf{R}^2 : x^2 + y^2 < 2p\}$. The
area of $C$ is $2\pi p > 4p = 4 \det \Lambda$, and so $C$ contains a point $(a, b) \in \Lambda \setminus \{0\}$. We
have $0 < a^2 + b^2 < 2p$. At the same time, $(a, b) = i z_1 + j z_2$ for some $i, j \in \mathbf{Z}$,
which means that $a = i$, $b = iq + jp$. We calculate
$a^2 + b^2 = i^2 + (iq + jp)^2 = i^2 + i^2 q^2 + 2iqjp + j^2 p^2 \equiv i^2 (1 + q^2) \equiv 0 \pmod p$.
Therefore $a^2 + b^2 = p$. $\Box$
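
The proof translates into a small search procedure. The sketch below follows it literally under stated assumptions: `two_squares` is a hypothetical name, $p$ is assumed to be a prime with $p \equiv 1 \pmod 4$ (otherwise no $q$ exists and `next` raises `StopIteration`), and $q$ is found by exhaustive search. A nonzero lattice point in the disk must have $a \neq 0$ and $|b| < p$, so for each $a$ only two values of $b$ need to be examined.

```python
from math import isqrt


def two_squares(p):
    """Return (a, b) with a*a + b*b == p, for a prime p = 1 (mod 4)."""
    # A square root of -1 modulo p, which exists by Lemma 2.3.2.
    q = next(x for x in range(1, p) if (x * x + 1) % p == 0)
    # Lattice points have the form (a, a*q + j*p); look inside the disk
    # x^2 + y^2 < 2p, where by symmetry we may take a > 0.
    for a in range(1, isqrt(2 * p) + 1):
        r = a * q % p
        for b in (r, r - p):                  # the only candidates with |b| < p
            if a * a + b * b < 2 * p:
                return a, abs(b)
    return None                               # not reached, by Minkowski's theorem
```

Any point found this way satisfies $a^2 + b^2 \equiv 0 \pmod p$ and $0 < a^2 + b^2 < 2p$, hence $a^2 + b^2 = p$; for example, $p = 13$ gives $13 = 2^2 + 3^2$.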

Bibliography and remarks. The fact that every prime congruent
to 1 mod 4 can be written as the sum of two squares was already known
to Fermat (a more rigorous proof was given by Euler). The possibility
of expressing every natural number as a sum of at most 4 squares was
proved by Lagrange in 1770, as a part of his work on quadratic forms.
The proof indicated in Exercise 1 below is due to Davenport.

Exercises
1. (Lagrange's four-square theorem) Let $p$ be a prime.
(a) Show that there exist integers $a, b$ with $a^2 + b^2 \equiv -1 \pmod p$. 0
(b) Show that the set $\Lambda = \{(x, y, z, t) \in \mathbf{Z}^4 : z \equiv ax + by \pmod p,\ t \equiv bx - ay \pmod p\}$
is a lattice, and compute $\det(\Lambda)$. Iil
(c) Show the existence of a nonzero point of $\Lambda$ in a ball of a suitable
radius, and infer that $p$ can be written as a sum of 4 squares of integers.
[3]
(d) Show that any natural number can be written as a sum of 4 squares
of integers. 0
3

Convex Independent Subsets

Here we consider geometric Ramsey-type results about finite point sets in
the plane. Ramsey-type theorems are generally statements of the following
type: Every sufficiently large structure of a given type contains a "regular"
substructure of a prescribed size. In the forthcoming Erdős-Szekeres theorem
(Theorem 3.1.3), the "structure of a given type" is simply a finite set of points
in general position in $\mathbf{R}^2$, and the "regular substructure" is a set of points
forming the vertex set of a convex polygon, as is indicated in the picture:
[Figure: a planar point set in which the vertices of a convex polygon are marked.]
A prototype of Ramsey-type results is Ramsey's theorem itself: For every
choice of natural numbers $p, r, n$, there exists a natural number $N$ such that
whenever $X$ is an $N$-element set and $c\colon \binom{X}{p} \to \{1, 2, \ldots, r\}$ is an arbitrary
coloring of the system of all $p$-element subsets of $X$ by $r$ colors, then there
is an $n$-element subset $Y \subseteq X$ such that all the $p$-tuples in $\binom{Y}{p}$ have the
same color. The most famous special case is with $p = r = 2$, where $\binom{X}{2}$ is
interpreted as the edge set of the complete graph $K_N$ on $N$ vertices. Ramsey's
theorem asserts that if each of the edges of $K_N$ is colored red or blue, we can
always find a complete subgraph on $n$ vertices with all edges red or all edges
blue.
Many of the geometric Ramsey-type theorems, including the Erdős-Szekeres
theorem, can be derived from Ramsey's theorem. But the quantitative
bound for the $N$ in Ramsey's theorem is very large, and consequently,
the size of the "regular" configurations guaranteed by proofs via Ramsey's
theorem is very small. Other proofs tailored to the particular problems and
using more of their geometric structure often yield much better quantitative
results.

3.1 The Erdős-Szekeres Theorem


3.1.1 Definition (Convex independent set). We say that a set $X \subseteq \mathbf{R}^d$
is convex independent if for every $x \in X$, we have $x \notin \operatorname{conv}(X \setminus \{x\})$.
The phrase "in convex position" is sometimes used synonymously with
"convex independent." In the plane, a finite convex independent set is the
set of vertices of a convex polygon. We will discuss results concerning the
occurrence of convex independent subsets in sufficiently large point sets. Here
is a simple example of such a statement.
3.1.2 Proposition. Among any 5 points in the plane in general position (no
3 collinear), we can find 4 points forming a convex independent set.

Proof. If the convex hull has 4 or 5 vertices, we are done. Otherwise, we
have a triangle with two points inside, and the two interior points together
with one of the sides of the triangle define a convex quadrilateral. $\Box$
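
For small planar sets in general position, convex independence can be tested directly from Definition 3.1.1: by Carathéodory's theorem, a point lies in the convex hull of the others exactly when it lies in a triangle spanned by three of them. The following sketch (hypothetical function names, integer coordinates and no 3 collinear points assumed) uses this to search for a convex quadrilateral as in Proposition 3.1.2.

```python
from itertools import combinations


def orient(a, b, c):
    """Positive if a, b, c make a left turn, negative for a right turn."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_triangle(p, a, b, c):
    """True if p lies strictly inside the triangle abc."""
    signs = [orient(a, b, p), orient(b, c, p), orient(c, a, p)]
    return all(s > 0 for s in signs) or all(s < 0 for s in signs)


def convex_independent(points):
    """Definition 3.1.1 for a finite planar set in general position."""
    for p in points:
        others = [q for q in points if q != p]
        if any(in_triangle(p, *tri) for tri in combinations(others, 3)):
            return False
    return True


def convex_quadrilateral(points):
    """Some 4 convex independent points among the given ones, or None."""
    for quad in combinations(points, 4):
        if convex_independent(quad):
            return quad
    return None
```

For a triangle with two interior points, the search returns the two interior points together with the two endpoints of the triangle side that is not crossed by the line through them, as in the proof.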
Next, we prove a general result.
3.1.3 Theorem (Erdős-Szekeres theorem). For every natural number $k$
there exists a number $n(k)$ such that any $n(k)$-point set $X \subset \mathbf{R}^2$ in general
position contains a $k$-point convex independent subset.

First proof (using Ramsey's theorem and Proposition 3.1.2). Color
a 4-tuple $T \subset X$ red if its four points are convex independent and blue
otherwise. If $n$ is sufficiently large, Ramsey's theorem provides a $k$-point
subset $Y \subset X$ such that all 4-tuples from $Y$ have the same color. But for
$k \ge 5$ this color cannot be blue, because any 5 points determine at least
one red 4-tuple. Consequently, $Y$ is convex independent, since every 4 of its
points are (Carathéodory's theorem). $\Box$

Next, we give an inductive proof; it yields an almost tight bound for n(k).
Second proof of the Erdős-Szekeres theorem. In this proof, by a set
in general position we mean a set with no 3 points on a common line and no
2 points having the same x-coordinate. The latter can always be achieved by
rotating the coordinate system.
Let X be a finite point set in the plane in general position. We call X a
cup if X is convex independent and its convex hull is bounded from above by
a single edge (in other words, if the points of X lie on the graph of a convex
function).

[Figure: a cup.]

Similarly, we define a cap, with a single edge bounding the convex hull from
below.

[Figure: a cap.]

A k-cap is a cap with k points, and similarly for an $\ell$-cup.

We define $f(k, \ell)$ as the smallest number N such that any N-point set in
general position contains a k-cup or an $\ell$-cap. By induction on k and $\ell$, we
prove the following formula for $f(k, \ell)$:

$$f(k, \ell) \le \binom{k+\ell-4}{k-2} + 1. \qquad (3.1)$$

Theorem 3.1.3 clearly follows from this, with $n(k) < f(k, k)$. For $k \le 2$
or $\ell \le 2$ the formula holds. Thus, let $k, \ell \ge 3$, and consider a set X in
general position with $N = f(k-1, \ell) + f(k, \ell-1) - 1$ points. We prove that
it contains a k-cup or an $\ell$-cap. This will establish the inequality
$f(k, \ell) \le f(k-1, \ell) + f(k, \ell-1) - 1$, and then (3.1) follows by induction; we leave the
simple manipulation of binomial coefficients to the reader.
Suppose that there is no $\ell$-cap in X. Let $E \subseteq X$ be the set of points
$p \in X$ such that X contains a (k-1)-cup ending with p.
We have $|E| \ge N - f(k-1, \ell) + 1 = f(k, \ell-1)$, because $X \setminus E$ contains no
(k-1)-cup and so $|X \setminus E| < f(k-1, \ell)$.
Either the set E contains a k-cup, and then we are done, or there is an
$(\ell-1)$-cap in E. The first point p of such an $(\ell-1)$-cap is, by the definition of E,
the last point of some (k-1)-cup in X, and in this situation, either the cup
or the cap can be extended by one point:

[Figure: a (k-1)-cup ending at p and an $(\ell-1)$-cap starting at p, in the two cases where the cup or the cap is extended by one point; labels p and r.]

This finishes the inductive step. □
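The manipulation of binomial coefficients left to the reader amounts to Pascal's rule: writing $B(k, \ell) = \binom{k+\ell-4}{k-2} + 1$ for the right-hand side of (3.1), one has $B(k-1, \ell) + B(k, \ell-1) - 1 = B(k, \ell)$. A short numerical check in Python (the names `bound` and `recurrence_holds` are ours):

```python
from math import comb

def bound(k, l):
    """Right-hand side of (3.1): C(k + l - 4, k - 2) + 1, for k, l >= 2."""
    return comb(k + l - 4, k - 2) + 1

def recurrence_holds(k, l):
    """Pascal's rule turns the inductive inequality into an identity (k, l >= 3)."""
    return bound(k - 1, l) + bound(k, l - 1) - 1 == bound(k, l)
```

The base cases are `bound(2, l) == 2` and `bound(k, 2) == 2`, matching the fact that any two points form both a 2-cup and a 2-cap.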

A lower bound for sets without k-cups and $\ell$-caps. Interestingly, the
bound for $f(k, \ell)$ proved above is tight, not only asymptotically but exactly!
This means, in particular, that there are n-point planar sets in general
position where any convex independent subset has at most O(log n) points, which
is somewhat surprising at first sight.
An example of a set $X_{k,\ell}$ of $\binom{k+\ell-4}{k-2}$ points in general position with no
k-cup and no $\ell$-cap can be constructed, again by induction on $k + \ell$. If $k \le 2$
or $\ell \le 2$, then $X_{k,\ell}$ can be taken as a one-point set.

Supposing both $k \ge 3$ and $\ell \ge 3$, the set $X_{k,\ell}$ is obtained from the sets
$L = X_{k-1,\ell}$ and $R = X_{k,\ell-1}$ according to the following picture:

[Figure: the set $L = X_{k-1,\ell}$ placed to the left of the set $R = X_{k,\ell-1}$.]

The set L is placed to the left of R in such a way that all lines determined
by pairs of points in L go below R and all lines determined by pairs of points
of R go above L.
Consider a cup C in the set $X_{k,\ell}$ thus constructed. If $C \cap L = \emptyset$, then
$|C| \le k-1$ by the assumption on R. If $C \cap L \ne \emptyset$, then C has at most 1 point
in R, and since no cup in L has more than k-2 points, we get $|C| \le k-1$ as
well. The argument for caps is symmetric.
We have $|X_{k,\ell}| = |X_{k-1,\ell}| + |X_{k,\ell-1}|$, and the formula for $|X_{k,\ell}|$ follows
by induction; the calculation is almost the same as in the previous proof. □

Determining the exact value of n(k) in the Erdős-Szekeres theorem is
much more challenging. Here are the best known bounds:

$$2^{k-2} + 1 \le n(k) \le \binom{2k-5}{k-2} + 2.$$

The upper bound is a small improvement over the bound $f(k, k)$ derived
above; see Exercise 5. The lower bound results from an inductive construction
slightly more complicated than that of $X_{k,\ell}$.

Bibliography and remarks. A recent survey of the topics discussed
in the present chapter is Morris and Soltan [MS00].
The Erdős-Szekeres theorem was one of the first Ramsey-type results
[ES35], and Erdős and Szekeres independently rediscovered the
general Ramsey's theorem at that occasion. Still another proof, also
using Ramsey's theorem, was noted by Tarsi: Let the points of X be
numbered $x_1, x_2, \ldots, x_n$, and color the triple $\{x_i, x_j, x_k\}$, $i < j < k$,
red if we make a right turn when going from $x_i$ to $x_k$ via $x_j$, and blue
if we make a left turn. It is not difficult to check that a homogeneous
subset, with all triples having the same color, is in convex position.
The original upper bound of $n(k) \le \binom{2k-4}{k-2} + 1$ from [ES35] has been
improved only recently and very slightly; the last improvement to the
bound stated in the text above is due to Tóth (footnote 1) and Valtr [TV98].
The Erdős-Szekeres theorem was generalized to planar convex sets.
The following somewhat misleading term is used: A family of pairwise
disjoint convex sets is in general position if no set is contained in the
convex hull of the union of two other sets of the family. For every k
there exists n such that in any family of n pairwise disjoint convex sets
in the plane in general position, there are k sets in convex position,
meaning that none of them is contained in the convex hull of the union
of the others. This was shown by Bisztriczky and G. Fejes Tóth [BT89]
and, with a different proof and better quantitative bound, by Pach and
Tóth [PT98]. The assumption of general position is necessary.
An interesting problem is the generalization of the Erdős-Szekeres
theorem to $\mathbf{R}^d$, $d \ge 3$. The existence of $n_d(k)$ such that every $n_d(k)$
points in $\mathbf{R}^d$ in general position contain a k-point subset in convex
position is easy to see (Exercise 4), but the order of magnitude is wide
open. The current best upper bound $n_d(k) \le \binom{2k-2d-1}{k-d} + d$ [Kar01]
slightly improves the immediate bound. Füredi [unpublished] conjectured
that $n_3(k) \le e^{O(\sqrt{k})}$. If true, this would be best possible: A
construction of Károlyi and Valtr [KV01] shows that for every fixed
$d \ge 3$, $n_d(k) \ge e^{c_d k^{1/(d-1)}}$ with a suitable $c_d > 0$. The construction
starts with a one-point set $X_0$, and $X_{i+1}$ is obtained from $X_i$ by replacing
each point $x \in X_i$ by the two points $x - (\varepsilon_i^d, \varepsilon_i^{d-1}, \ldots, \varepsilon_i)$
and $x + (\varepsilon_i^d, \varepsilon_i^{d-1}, \ldots, \varepsilon_i)$, with $\varepsilon_i > 0$ sufficiently small, and then
perturbing the resulting set very slightly, so that $X_{i+1}$ is in suitable
general position. We have $|X_i| = 2^i$, and the key lemma asserts that
$\mathrm{mc}(X_{i+1}) \le \mathrm{mc}(X_i) + \mathrm{mc}(\pi(X_i))$, where mc(X) denotes the maximum
size of a convex independent subset of X and $\pi$ is the projection to
the hyperplane $\{x_d = 0\}$.
Another interesting generalization of the Erdős-Szekeres theorem
to $\mathbf{R}^d$ is mentioned in Exercise 5.4.3.
The bounds in the Erdős-Szekeres theorem were also investigated
for special point sets, namely, for the so-called dense sets in the plane.
An n-point $X \subset \mathbf{R}^2$ is called c-dense if the ratio of the maximum and
minimum distances of points in X is at most $c\sqrt{n}$. For every planar
n-point set, this ratio is at least $c_0\sqrt{n}$ for a suitable constant $c_0 > 0$,
as an easy volume argument shows, and so the dense sets are quite
well spread. Improving on slightly weaker results of Alon, Katchalski,
and Pulleyblank [AKP89], Valtr [Val92a] showed, by a probabilistic
argument, that every c-dense n-point set in general position contains
a convex independent subset of at least $c_1 n^{1/3}$ points, for some $c_1 > 0$
depending on c, and he proved that this bound is asymptotically
optimal. Simplified proofs, as well as many other results on dense
sets, can be found in Valtr's thesis [Val94].

1 The reader should be warned that four mathematicians named Tóth are mentioned
throughout the book. For two of them, the surname is actually Fejes Tóth
(László and Gábor), and for the other two it is just Tóth (Géza and Csaba).

Exercises
1. Find a configuration of 8 points in general position in the plane with no
5 convex independent points (thereby showing that $n(5) \ge 9$).
2. Prove that the set $\{(i, j) : i = 1, 2, \ldots, m,\ j = 1, 2, \ldots, m\}$ contains no
convex independent subset with more than $C m^{2/3}$ points (with C some
constant independent of m).
3. Prove that for each k there exists n(k) such that each n(k)-point set in
the plane contains a k-point convex independent subset or k points lying
on a common line.
4. Prove an Erdős-Szekeres theorem in $\mathbf{R}^d$: For every k there exists
$n = n_d(k)$ such that any n points in $\mathbf{R}^d$ in general position contain a k-point
convex independent subset.
5. (A small improvement on the upper bound on n(k)) Let $X \subset \mathbf{R}^2$ be a
planar set in general position with $f(k, \ell) + 1$ points, where f is as in the
second proof of Erdős-Szekeres, and let t be the (unique) topmost point
of X. Prove that X contains a k-cup with respect to t or an $\ell$-cap with
respect to t, where a cup with respect to t is a subset $Y \subseteq X \setminus \{t\}$ such
that $Y \cup \{t\}$ is in convex position, and a cap with respect to t is a subset
$Y \subseteq X \setminus \{t\}$ such that $\{x, y, z, t\}$ is not in convex position for any triple
$\{x, y, z\} \subseteq Y$. Infer that $n(k) \le f(k-1, k) + 1$.
6. Show that the construction of $X_{k,\ell}$ described in the text can be realized
on a polynomial-size grid. That is, if we let $n = |X_{k,\ell}|$, we may suppose
that the coordinates of all points in $X_{k,\ell}$ are integers between 1 and $n^c$
with a suitable constant c. (This was observed by Valtr.)

3.2 Horton Sets


Let X be a set in $\mathbf{R}^d$. A k-point set $Y \subseteq X$ is called a k-hole in X if Y
is convex independent and $\mathrm{conv}(Y) \cap X = Y$. In the plane, Y determines a
convex k-gon with no points of X inside. Erdős raised the question about the
rather natural strengthening of the Erdős-Szekeres theorem: Is it true that
for every k there exists an n(k) such that any n(k)-point set in the plane in
general position has a k-hole?
A construction due to Horton, whose streamlined version we present below,
shows that this is false for $k \ge 7$: There are arbitrarily large sets without
a 7-hole. On the other hand, a positive result holds for $k \le 5$. For k = 6, the
answer is not known, and this "6-hole problem" appears quite challenging.

3.2.1 Proposition (The existence of a 5-hole). Every sufficiently large
planar point set in general position contains a 5-hole.

Proof. By the Erdős-Szekeres theorem, we may assume that there exists a
6-point convex independent subset of our set X. Consider a 6-point convex
independent subset $H \subseteq X$ with the smallest possible $|X \cap \mathrm{conv}(H)|$. Let
$I = \mathrm{conv}(H) \cap (X \setminus H)$ be the points inside the convex hull of H.

• If $I = \emptyset$, we have a 6-hole.
• If there is one point x in I, we consider a diagonal that partitions the
hexagon into two quadrilaterals:

[Figure: a hexagon split by a diagonal into two quadrilaterals.]

The point x lies in one of these quadrilaterals, and the vertices of the
other quadrilateral together with x form a 5-hole.
• If $|I| \ge 2$, we choose an edge xy of conv(I). Let $\gamma$ be an open half-plane
bounded by the line xy and containing no points of I (it is determined
uniquely unless $|I| = 2$).
If $|\gamma \cap H| \ge 3$, we get a 5-hole formed by x, y, and 3 points of $\gamma \cap H$.
For $|\gamma \cap H| \le 2$, we have one of the two cases indicated in the following
picture:

[Figure: the two cases; labels u, v, x, y.]

By replacing u and v by x and y in the left situation, or u by x in the
right situation, we obtain a 6-point convex independent set having fewer
points inside than H, which is a contradiction. □

3.2.2 Theorem (Seven-hole theorem). There exist arbitrarily large finite
sets in the plane in general position without a 7-hole.
The sets constructed in the proof have other interesting properties as well.
Definitions. Let X and Y be finite sets in the plane. We say that X is high
above Y (and that Y is deep below X) if the following hold:
(i) No line determined by two points of $X \cup Y$ is vertical.
(ii) Each line determined by two points of X lies above all the points of Y.
(iii) Each line determined by two points of Y lies below all the points of X.
For a set $X = \{x_1, x_2, \ldots, x_n\}$, with no two points having equal
x-coordinates and with notation chosen so that the x-coordinates of the $x_i$
increase with i, we define the sets $X_0 = \{x_2, x_4, \ldots\}$ (consisting of the points
with even indices) and $X_1 = \{x_1, x_3, \ldots\}$ (consisting of the points with odd
indices).
A finite set $H \subset \mathbf{R}^2$ is a Horton set if $|H| \le 1$, or the following conditions
hold: $|H| > 1$, both $H_0$ and $H_1$ are Horton sets, and $H_1$ lies high above $H_0$
or $H_0$ lies high above $H_1$.

3.2.3 Lemma. For every $n \ge 1$, an n-point Horton set exists.

Proof. We note that one can produce a smaller Horton set from a larger
one by deleting points from the right. We construct $H^{(k)}$, a Horton set of size
$2^k$, by induction.
We define $H^{(0)}$ as the point (0,0). Suppose that we can construct a Horton
set $H^{(k)}$ with $2^k$ points whose x-coordinates are $0, 1, \ldots, 2^k - 1$. The induction
step goes as follows.
Let $A = 2H^{(k)}$ (i.e., $H^{(k)}$ expanded twice), and $B = A + (1, h_k)$, where
$h_k$ is a sufficiently large number. We set $H^{(k+1)} = A \cup B$. It is easily seen
that if $h_k$ is large enough, B lies high above A, and so $H^{(k+1)}$ is Horton as
well. The set $H^{(3)}$ looks like this:

[Figure: the set $H^{(3)}$.]

□
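The inductive construction translates directly into a short program. The following Python sketch is an illustration under our own assumptions: the value of $h_k$ is found by doubling until B lies high above A (the text only requires $h_k$ to be sufficiently large), and the names `horton`, `is_horton`, and `high_above` are ours. The function `is_horton` checks the recursive definition literally.

```python
from itertools import combinations

def orient(p, q, r):
    """Positive if r lies above the line through p and q (p left of q)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])

def high_above(upper, lower):
    """Conditions (i)-(iii) of the definition, for lists sorted by x-coordinate."""
    xs = [x for x, _ in upper + lower]
    if len(set(xs)) != len(xs):        # (i) fails: some determined line is vertical
        return False
    return (all(orient(p, q, r) < 0 for p, q in combinations(upper, 2) for r in lower)
            and all(orient(p, q, r) > 0 for p, q in combinations(lower, 2) for r in upper))

def horton(k):
    """H^(k): 2^k points with x-coordinates 0, 1, ..., 2^k - 1."""
    points = [(0, 0)]
    for _ in range(k):
        a = [(2 * x, 2 * y) for x, y in points]      # A = 2 H^(k)
        h = 1
        while True:                                   # search for a large enough h_k
            b = [(x + 1, y + h) for x, y in a]        # B = A + (1, h_k)
            if high_above(b, a):
                break
            h *= 2
        points = sorted(a + b)
    return points

def is_horton(points):
    """Recursive check of the definition of a Horton set."""
    pts = sorted(points)
    if len(pts) <= 1:
        return True
    h1, h0 = pts[0::2], pts[1::2]     # x_1, x_3, ... and x_2, x_4, ...
    return (is_horton(h0) and is_horton(h1)
            and (high_above(h1, h0) or high_above(h0, h1)))
```

As a sanity check, `is_horton(horton(4))` holds, while points in convex position such as `[(x, x * x) for x in range(8)]` do not form a Horton set.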

Closedness from above and from below. A set X in $\mathbf{R}^2$ is r-closed from
above if for any r-cup in X there exists a point in X lying above the r-cup
(i.e., above the bottom part of its convex hull).

[Figure: a cup closed from above, r = 4.]

Similarly, we define a set r-closed from below using r-caps.

3.2.4 Lemma. Every Horton set is both 4-closed from above and 4-closed
from below.

Proof. We proceed by induction on the size of the Horton set. Let H be a
Horton set, and assume that $H_0$ lies deep below $H_1$ (the other possible case
is analogous). Let $C \subseteq H$ be a 4-cup.
If $C \subseteq H_0$ or $C \subseteq H_1$, then a point closing C from above exists by the
inductive hypothesis. Thus, let $C \cap H_0 \ne \emptyset \ne C \cap H_1$.
The cup C may have at most 2 points in $H_1$ (the upper part): If there
were 3 points, say a, b, c (in left-to-right order), then $H_0$ lies below the lines
ab and bc, and so the remaining point of C, which was supposed to lie in $H_0$,
cannot form a cup with {a, b, c}:

[Figure: $H_1$ above $H_0$.]

This means that C has at least 2 points, a and b, in the lower part $H_0$.
Since the points of $H_0$ and $H_1$ alternate along the x-axis, there is a point
$c \in H_1$ between a and b in the ordering by x-coordinates. This c is above the
segment ab, and so it closes the cup C from above. We argue similarly for a
4-cap. □

3.2.5 Proposition. No Horton set contains a 7-hole.

Proof. (Very similar to the previous one.) For contradiction, suppose there
is a 7-hole X in the considered Horton set H. If $X \subseteq H_0$ or $X \subseteq H_1$, we
use induction. Otherwise, we select the part ($H_0$ or $H_1$) containing the larger
portion of X; this has at least 4 points of X. If this part is, say, $H_0$, and it lies
deep below $H_1$, these 4 points must form a cup in $H_0$, for if some 3 of them
were a cap, no point of $H_1$ could complete them to a convex independent set.
By Lemma 3.2.4, $H_0$ (being a Horton set) contains a point closing the 4-cup
from above. Such a point must be contained in the convex hull of the 7-hole
X, a contradiction. □

Bibliography and remarks. The existence of a 5-hole in every 10-point
planar set in general position was proved by Harborth [Har79].
Horton [Hor83] constructed arbitrarily large sets without a 7-hole; we
followed a presentation of his construction according to Valtr [Val92a].
The question of existence of k-holes can be generalized to point sets
in $\mathbf{R}^d$. Valtr [Val92b] proved that (2d+1)-holes exist in all sufficiently
large sets in general position in $\mathbf{R}^d$, and he constructed arbitrarily
large sets without k-holes for $k \ge 2^{d-1}(P(d-1)+1)$, where $P(d-1)$ is
the product of the first d-1 primes. We outline the construction. Let H
be a finite set in $\mathbf{R}^d$, $d \ge 2$, in general position (no d+1 on a common
hyperplane and no two sharing the value of any coordinate). Let
$H = \{x_1, x_2, \ldots, x_n\}$ be the enumeration of H by increasing first coordinate,
and let $H_{q,r} = \{x_i : i \equiv r \pmod{q}\}$. Let $p_1 = 2, p_2 = 3, \ldots, p_{d-1}$ be
the first d-1 primes, and let us write $p = p_{d-1}$ for brevity. The set H
is called d-Horton if
(i) its projection on the first d-1 coordinates is a (d-1)-Horton set in
$\mathbf{R}^{d-1}$ (where all sets in $\mathbf{R}^1$ are 1-Horton), and
(ii) either $|H| \le 1$ or all the sets $H_{p,r}$ are d-Horton, $r = 0, 1, \ldots, p-1$,
and for every subset $I \subseteq \{0, 1, \ldots, p-1\}$ of at least two indices, there
is a partition $I = J \cup K$, $J \ne \emptyset \ne K$, such that $\bigcup_{r \in J} H_{p,r}$ lies high
above $\bigcup_{r \in K} H_{p,r}$.
Here A lies high above B if every hyperplane determined by d points
of A lies above B (in the direction of the dth coordinate) and vice
versa. Arbitrarily large d-Horton sets can be constructed by induction:
We first construct the (d-1)-dimensional projection, and then
we determine the dth coordinates suitably to meet condition (ii). The
nonexistence of large holes is proved using an appropriate generalization
of r-closedness from above and from below.
Since large sets generally need not contain k-holes, it is natural to
look for other, less special, configurations. Bialostocki, Dierker, and
Voxman [BDV91] proved the existence of k-holes modulo q: For every
q and for all $k \ge q+2$, each sufficiently large set X (in terms of q and
k) in general position contains a k-point convex independent subset
Y such that the number of points of X in the interior of conv(Y)
is divisible by q; see Exercise 6. Károlyi, Pach, and Tóth [KPT01]
obtained a similar result with the weaker condition $k \ge \frac{5}{6} q + O(1)$.
They also showed that every sufficiently large 1-almost convex set in
the plane contains a k-hole, and Valtr [Val01] extended this to k-almost
convex sets, where X is k-almost convex if no triangle with vertices at
points of X contains more than k points of X inside.

Exercises
1. Prove that an n-point Horton set contains no convex independent subset
with more than $4 \log_2 n$ points. [2]
2. Find a configuration of 9 points in the plane in general position with no
5-hole. [2]
3. Prove that every sufficiently large set in general position in $\mathbf{R}^3$ has a
7-hole.
4. Let H be a Horton set and let $k \ge 7$. Prove that if $Y \subseteq H$ is a k-point
subset in convex position, then $|H \cap \mathrm{conv}(Y)| \ge 2^{\lfloor k/4 \rfloor}$. Thus, not only
does H contain no k-holes, but each convex k-gon has even exponentially
many points inside.
This result is due to Nyklová [Nyk00], who proved exact bounds for
Horton sets and observed that the number of points inside each convex
k-gon can be somewhat increased by replacing each point of a Horton set
by a tiny copy of a small Horton set.
5. Call a set $X \subset \mathbf{R}^2$ in general position almost convex if no triangle with
vertices at points of X contains more than 1 point of X in its interior.
Let $X \subset \mathbf{R}^2$ be a finite set in general position such that no triangle with
vertices at vertices of conv(X) contains more than 1 point of X. Prove
that X is almost convex.
6. (a) Let $q \ge 2$ be an integer and let k = mq+2 for an integer $m \ge 1$. Prove
that every sufficiently large set $X \subset \mathbf{R}^2$ in general position contains a
k-point convex independent subset Y such that the number of points of
X in the interior of conv(Y) is divisible by q. Use Ramsey's theorem for
triples.
(b) Extend the result of (a) to all $k \ge q + 2$.
4

Incidence Problems

In this chapter we study a very natural problem of combinatorial geometry:
the maximum possible number of incidences between m points and n lines
in the plane. In addition to its mathematical appeal, this problem and its
relatives are significant in the analysis of several basic geometric algorithms.
In the proofs we encounter number-theoretic arguments, results about graph
drawing, the probabilistic method, forbidden subgraphs, and line arrangements.

4.1 Formulation
Point-line incidences. Consider a set P of m points and a set L of n lines
in the plane. What is the maximum possible number of their incidences, i.e.,
pairs $(p, \ell)$ such that $p \in P$, $\ell \in L$, and p lies on $\ell$? We denote the number
of incidences for specific P and L by I(P, L), and we let I(m, n) be the
maximum of I(P, L) over all choices of an m-element P and an n-element L.
For example, the following picture illustrates that $I(3, 3) \ge 6$,

[Figure: three points and three lines with 6 incidences.]

and it is not hard to see that actually I(3, 3) = 6.
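Counting incidences for a concrete configuration is a one-line computation. Here is a minimal Python sketch, assuming integer data and our own encoding of a line as a triple (a, b, c) standing for the equation ax + by = c; the configuration is the three vertices of a triangle together with its three side lines.

```python
def incidences(points, lines):
    """Number of pairs (p, l) with p on l; a line (a, b, c) means a*x + b*y = c."""
    return sum(a * x + b * y == c for (x, y) in points for (a, b, c) in lines)

triangle_points = [(0, 0), (1, 0), (0, 1)]
triangle_lines = [(0, 1, 0), (1, 0, 0), (1, 1, 1)]   # y = 0, x = 0, x + y = 1
```

Each vertex lies on exactly two of the three side lines, which gives the 6 incidences witnessing $I(3, 3) \ge 6$.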


A trivial upper bound is $I(m, n) \le mn$, but it can never be attained
unless m = 1 or n = 1. In fact, if m has a similar order of magnitude as n then
I(m, n) is asymptotically much smaller than mn. The order of magnitude is
known exactly:

4.1.1 Theorem (Szemerédi-Trotter theorem). For all $m, n \ge 1$, we
have $I(m, n) = O(m^{2/3} n^{2/3} + m + n)$, and this bound is asymptotically tight.

We give two proofs in the sequel, one simpler and one including techniques
useful in more general situations. We will mostly consider only the most
interesting case m = n. The general case needs no new ideas but only a little
more complicated calculation.
Of course, the problem of point-line incidences can be generalized in many
ways. We can consider incidences between points and hyperplanes in higher
dimensions, or between points in the plane and some family of curves, and
so on. A particularly interesting case is that of points and unit circles, which
is closely related to counting unit distances.
Unit distances and distinct distances. Let U(n) denote the maximum
possible number of pairs of points with unit distance in an n-point set in the
plane. For $n \le 3$ we have $U(n) = \binom{n}{2}$ (all distances can be 1), but already
for n = 4 at most 5 of the 6 distances can be 1; i.e., U(4) = 5:

[Figure: four points with 5 unit distances.]

We are interested in the asymptotic behavior of the function U(n) for $n \to \infty$.
This can also be reformulated as an incidence problem. Namely, consider
an n-point set P and draw a unit circle around each point of P, thereby
obtaining a set C of n unit circles. Each pair of points at unit distance
contributes two point-circle incidences, and hence $U(n) \le \frac{1}{2} I_{1\mathrm{circ}}(n, n)$, where
$I_{1\mathrm{circ}}(m, n)$ denotes the maximum possible number of incidences between m
points and n unit circles.
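The correspondence between unit distances and point-circle incidences can be seen on a small example. The Python sketch below uses floating-point coordinates with a tolerance (an assumption of ours, since the coordinates are irrational) and takes as the 4-point configuration two unit equilateral triangles sharing a side; the function names are illustrative.

```python
from itertools import combinations
from math import dist, isclose, sqrt

def unit_pairs(points):
    """Number of unordered pairs at distance 1 (up to floating-point tolerance)."""
    return sum(isclose(dist(p, q), 1.0, abs_tol=1e-9)
               for p, q in combinations(points, 2))

def unit_circle_incidences(points, centers):
    """Incidences between the points and the unit circles around the centers."""
    return sum(isclose(dist(p, c), 1.0, abs_tol=1e-9)
               for p in points for c in centers)

h = sqrt(3) / 2
rhombus = [(0.0, 0.0), (1.0, 0.0), (0.5, h), (0.5, -h)]
```

Here 5 of the 6 pairs are at unit distance, and the 4 unit circles drawn around the points have twice as many incidences with the points.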
Unlike the case of point-line incidences, the correct order of magnitude of
U(n) is not known. An upper bound of $O(n^{4/3})$ can be obtained by modifying
proofs of the Szemerédi-Trotter theorem. But the best known lower bound
is $U(n) \ge n^{1 + c_1/\log\log n}$, for some positive constant $c_1$; this is superlinear in
n but grows more slowly than $n^{1+\varepsilon}$ for every fixed $\varepsilon > 0$.
A related quantity is the minimum possible number of distinct distances
determined by n points in the plane; formally,

$$g(n) = \min_{P \subset \mathbf{R}^2:\ |P| = n} |\{\mathrm{dist}(x, y) : x, y \in P\}|.$$

Clearly, $g(n) \ge \binom{n}{2} / U(n)$ (each single distance occurs at most U(n) times
among the $\binom{n}{2}$ pairs), and so the bound $U(n) = O(n^{4/3})$ mentioned
above gives $g(n) = \Omega(n^{2/3})$. This has been improved several times, and the
current best lower bound is approximately $\Omega(n^{0.863})$. The best known upper
bound is $O(n/\sqrt{\log n})$.
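For point sets with integer coordinates, the number of distinct distances can be counted exactly by comparing squared lengths. The following Python sketch (names ours) does this for a square grid, a natural test case with few distinct distances; it counts distances between distinct points only.

```python
from itertools import combinations

def distinct_distances(points):
    """Number of distinct distances between distinct points, via squared lengths."""
    return len({(px - qx) ** 2 + (py - qy) ** 2
                for (px, py), (qx, qy) in combinations(points, 2)})

def grid(side):
    """The side x side integer grid, with n = side * side points."""
    return [(x, y) for x in range(side) for y in range(side)]
```

For example, the 3 x 3 grid determines only 5 distinct distances, with squared lengths 1, 2, 4, 5, and 8, although it has 36 pairs of points.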
Arrangements of lines. We need to introduce some terminology concerning
line arrangements. Consider a finite set L of lines in the plane. They
divide the plane into convex subsets of various dimensions, as is indicated in
the following picture with 4 lines:

[Figure: an arrangement of 4 lines.]

The intersections of the lines, indicated by black dots, are called the vertices.
By removing all the vertices lying on a line $\ell \in L$, the line is split into
two unbounded rays and several segments, and these parts are the edges.
Finally, by deleting all the lines of L, the plane is divided into open convex
polygons, called the cells. In Chapter 6 we will study arrangements of lines
and hyperplanes further, but here we need only this basic terminology and
(later) the simple fact that an arrangement of n lines in general position has
$\binom{n}{2}$ vertices, $n^2$ edges, and $\binom{n}{2} + n + 1$ cells. For the time being, the reader can
regard this as an exercise, or wait until Chapter 6 for a proof.
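The three counts are consistent with Euler's relation for a planar line arrangement, vertices - edges + cells = 1. A small Python sketch (the function name is ours) tabulates the formulas from the text and lets one check this relation numerically; it is a consistency check, not a proof.

```python
from math import comb

def arrangement_counts(n):
    """(vertices, edges, cells) for n lines in general position, as stated in the text."""
    vertices = comb(n, 2)        # every pair of lines crosses in its own vertex
    edges = n * n                # the n - 1 vertices on a line cut it into n pieces
    cells = comb(n, 2) + n + 1
    return vertices, edges, cells
```

For the picture with 4 lines this gives 6 vertices, 16 edges, and 11 cells.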
Many cells in arrangements. What is the maximum total number of
vertices of m distinct cells in an arrangement of n lines in the plane? Let us
denote this number by K(m, n). A simple construction shows that the maximum
number of incidences I(m, n) is asymptotically bounded from above by
K(m, n); more exactly, we have $I(m, n) \le \frac{1}{2} K(m, 2n)$. To see this, consider
a set P of m points and a set L of n lines realizing I(m, n), and replace each
line $\ell \in L$ by a pair of lines $\ell', \ell''$ parallel to $\ell$ and lying at distance $\varepsilon$ from $\ell$:

[Figure: a line replaced by the two parallel lines $\ell'$ and $\ell''$.]

If $\varepsilon > 0$ is sufficiently small, then a point $p \in P$ incident to k lines in the
original arrangement now lies in a tiny cell with 2k vertices in the modified
arrangement.
It turns out that K(m, n) has the same order of magnitude as I(m, n),
and the upper bound can be obtained by methods similar to those used
for I (m, n). In higher-dimensional problems, even determining the maximum
possible complexity of a single cell can be quite challenging. For example, the
maximum complexity of a single cell in an arrangement of n hyperplanes is
described by the so-called upper bound theorem from the 1970s, which will
be discussed in Chapter 5.

Bibliography and remarks. This chapter is partially based on
a nice presentation of the discussed topics in the book by Pach and
Agarwal [PA95], which we recommend as a source of additional information
concerning history, bibliographic references, and various related
problems. But we also include some newer results and techniques
discovered since the publication of that book.
The following neat problem concerning point-line incidences was
posed by Sylvester [Syl93] in 1893: Prove that it is impossible to arrange
a finite number of points in the plane so that a line through
every two of them passes through a third, unless they all lie on the
same line. This problem remained unsolved until 1933, when it was
asked again by Erdős and solved shortly afterward by Gallai. The solution
shows, in particular, that it is impossible to embed the points of
a finite projective plane $\mathcal{F}$ into $\mathbf{R}^2$ in such a way that points of each
line of $\mathcal{F}$ lie on a straight line in $\mathbf{R}^2$. For example, the well-known
drawing of the Fano plane (the projective plane of order 2, with 3 points
on each line) has to contain a curved line:

[Figure: the Fano plane.]

Recently Pinchasi [Pin02] proved the following conjecture of Bezdek,
resembling Sylvester's problem: For every finite family of at least
5 unit circles in the plane, every two of them intersecting, there exists
an intersection point common to exactly 2 of the circles.
The problems of estimating the maximum number of point-line
incidences, the maximum number of unit distances, and the minimum
number of distinct distances were raised by Erdős [Erd46]. For point-line
incidences, he proved the lower bound $I(m, n) = \Omega(m^{2/3} n^{2/3} + m + n)$
(see Section 4.2) and conjectured it to be the right order of
magnitude. This was first proved by Szemerédi and Trotter [ST83].
Simpler proofs were found later by Clarkson, Edelsbrunner, Guibas,
Sharir, and Welzl [CEG+90], by Székely [Sze97], and by Aronov and
Sharir [AS01a]; they are quite different from one another, and we discuss
them all in this chapter.
Tóth [Tót01a] proved the analogy of the Szemerédi-Trotter theorem
for the complex plane; he used the original Szemerédi-Trotter
technique, since none of the simpler proofs seems to work there.
A beautiful application of techniques of Clarkson et al. [CEG+90]
in geometric measure theory can be found in Wolff [Wol97]. This paper
deals with a variation of the Kakeya problem: It shows that any
Borel set in the plane containing a circle of every radius has Hausdorff
dimension 2.

For unit distances in the plane Erdős [Erd46] established the lower
bound $U(n) = \Omega(n^{1 + c/\log\log n})$ (Section 4.2) and again conjectured it
to be tight, but the best known upper bound remains $O(n^{4/3})$. This
was first shown by Spencer, Szemerédi, and Trotter [SST84], and it
can be re-proved by modifying each of the proofs mentioned above for
point-line incidences. Further improvement of the upper bound probably
needs different, more "algebraic," methods, which would use the
"circularity" in a strong way, not just in the form of simple combinatorial
axioms (such as that two points determine at most two unit
circles).
For the analogous problem of unit distances among n points in $\mathbf{R}^3$,
Erdős [Erd60] proved $\Omega(n^{4/3} \log\log n)$ from below and $O(n^{5/3})$ from
above. The example for the lower bound is the grid $\{1, 2, \ldots, \lfloor n^{1/3} \rfloor\}^3$
appropriately scaled; the bound $\Omega(n^{4/3})$ is entirely straightforward,
and the extra log log n factor needs further number-theoretic considerations.
The upper bound follows by an argument with forbidden
$K_{3,3}$; similar proofs are shown in Section 4.5. The current best bound
is close to $O(n^{3/2})$; more precisely, it is $n^{3/2} 2^{O(\alpha^2(n))}$ [CEG+90]. Here
the function $\alpha(n)$, to be defined in Section 7.2, grows extremely slowly,
more slowly than log n, log log n, log log log n, etc. In dimensions 4 and
higher, the number of unit distances can be $\Omega(n^2)$ (Exercise 2). Here
even the constant at the leading term is known; see [PA95]. Among
other results related to the unit-distance problems and considering
point sets with various restrictions, we mention a neat construction of
Erdős, Hickerson, and Pach [EHP89] showing that, for every $\alpha \in (0, 2)$,
there is an n-point set on the 2-dimensional unit sphere with the distance
$\alpha$ occurring at least $\Omega(n \log^* n)$ times (the special distance $\sqrt{2}$
can even occur $\Omega(n^{4/3})$ times), and the annoying (and still unsolved)
problem of Erdős and Moser, whether the number of unit distances in
an n-point planar set in convex position is always bounded by O(n)
(see [PA95] for partial results and references).
For distinct distances in the plane, the best known upper bound,
due to Erdős, is $O(n/\sqrt{\log n})$. This bound is attained for the $\sqrt{n} \times \sqrt{n}$
square grid. After a series of increases of the lower bound (Moser
[Mos52], Chung [Chu84], Beck [Bec83], Clarkson et al. [CEG+90],
Chung, Szemerédi, and Trotter [CST92], Székely [Sze97], Solymosi and
Tóth [ST01]) the current record is $\Omega(n^{4/(5-1/e)-\varepsilon})$ for every fixed $\varepsilon > 0$
(the exponent is approximately 0.863) by Tardos [Tar01], who improved
a number-theoretic lemma in the Solymosi-Tóth proof. Aronov
and Sharir [AS01b] obtained the lower bound of approximately $n^{0.526}$
for distinct distances in $\mathbf{R}^3$.
Another challenging quantity is the number $I_{\mathrm{circ}}(m, n)$ of incidences
of m points with n arbitrary circles in the plane. The lower
bound for point-line incidences can be converted to an example with
m points, n circles, and $\Omega(m^{2/3} n^{2/3} + m + n)$ incidences, but in the
case of $I_{\mathrm{circ}}(m, n)$, this lower bound is not the best possible for all m
and n: Consider an example of an n-point set with $t = O(n/\sqrt{\log n})$
distinct distances and draw the t circles with these distances as radii
around each point; the resulting $tn = o(n^2)$ circles have $\Omega(n^2)$ incidences
with the n points. The current record in the upper bound
is due to Aronov and Sharir [AS01a], and for m = n it yields
$I_{\mathrm{circ}}(n, n) = O(n^{15/11+\varepsilon}) = O(n^{1.364})$. A little more about their approach
is mentioned in the notes to Section 4.5, including an outline
of a proof of a weaker bound $I_{\mathrm{circ}}(n, n) = O(n^{1.4})$. Two other methods
for obtaining upper bounds are indicated in Exercises 4.4.2 and 4.6.4.
More generally, one can consider $I(P, \Gamma)$, the number of incidences
between an m-point $P \subset \mathbf{R}^2$ and a family $\Gamma$ of n planar curves. Pach
and Sharir [PS98a] proved by Székely's method that if $\Gamma$ is a family
of curves with k degrees of freedom and multiplicity type s, meaning
that for any k points there are at most s curves of $\Gamma$ passing through
all of them and no two curves intersect in more than k points, then
$I(P, \Gamma) = O(m^{k/(2k-1)} n^{1-1/(2k-1)} + m + n)$, with the constant of
proportionality depending on k and s. Earlier [PS92], they proved the
same bound with some additional technical assumptions on the family
$\Gamma$ by the technique of Clarkson et al. [CEG+90]. Most likely this bound
is not tight for $k \ge 3$. Aronov and Sharir [AS01a] improved the bound
slightly for $\Gamma$ a family of graphs of univariate polynomials of degree
at most k. The best known lower bound is mentioned in the notes to
Section 4.2 below.
Point-plane incidences. Considering $n$ points on a line in $\mathbf{R}^3$ and
$m$ planes containing that line, we see that the number of incidences
can be $mn$ without further assumptions on the position of the points
and/or planes. Agarwal and Aronov [AA92] proved the upper bound
$O(m^{3/5}n^{4/5} + m + n)$ for the number of incidences between $m$ planes
and $n$ points in $\mathbf{R}^3$ if no 3 of the points are collinear, slightly improving
on a result of Edelsbrunner, Guibas, and Sharir [EGS90]. In dimension
$d$, the maximum number of incidences of $n$ hyperplanes with $m$ vertices
of their arrangement is $O(m^{2/3}n^{d/3} + n^{d-1})$ [AA92], and this is tight
for $m \ge n^{d-2}$ (for smaller $m$, the trivial $O(mn)$ bound is tight).
The complexity of many cells in an arrangement of lines was first
studied by Canham [Can69], who proved $K(m, n) = O(m^2 + n)$,
using the fact that two cells can have at most 4 lines incident to
both of them (essentially a "forbidden $K_{2,5}$" argument; see Section 4.5).
The tight bound $O(m^{2/3}n^{2/3} + m + n)$ was first achieved
by Clarkson et al. [CEG+90]. Among results for the complexity
of $m$ cells in other types of arrangements we mention the bound
$O(m^{2/3}n^{2/3} + n\alpha(n) + n \log m)$ for segments by Aronov, Edelsbrunner,
Guibas, and Sharir [AEGS92], $O(m^{2/3}n^{2/3}\alpha(n)^{1/3} + n)$ for unit
circles [CEG+90] (improved to $O(m^{2/3}n^{2/3} + n)$ by Agarwal, Aronov,
and Sharir [AAS01]), $O(m^{3/5}n^{4/5}2^{0.4\alpha(n)} + n)$ for arbitrary circles
[CEG+90] (also improved in [AAS01]; see the notes to Section 4.5),
$O(m^{2/3}n + n^2)$ for planes in $\mathbf{R}^3$ by Agarwal and Aronov [AA92] (which
is tight), and $O(m^{1/2}n^{d/2}(\log n)^{(\lfloor d/2 \rfloor - 1)/2})$ for hyperplanes in $\mathbf{R}^d$ by
Aronov, Matoušek, and Sharir [AMS94]. If one counts only facets of $m$
cells in an arrangement of $n$ hyperplanes in $\mathbf{R}^d$, then the tight bound
is $O(m^{2/3}n^{d/3} + n^{d-1})$ [AA92]. A few more references on this topic
can be found in Agarwal and Sharir [AS00a].
The number of similar copies of a configuration. The problem of unit
distances can be rephrased as follows. Let K denote a set consisting
of two points in the plane with unit distance. What is the maximum
number of congruent copies of K that can occur in an n-point set in
the plane? This reformulation opens the way to various interesting
generalizations, where one can vary K, or one can consider homo-
thetic or similar copies of K, and so on. Elekes's survey [Ele01] nicely
describes these problems, their relation to the incidence bounds, and
other connections. Here we sketch some of the main developments.
Beautiful results were obtained by Laczkovich and Ruzsa [LR97],
who investigated the maximum number of similar copies of a given
finite configuration $K$ that can be contained in an $n$-point set in the
plane. Earlier, Elekes and Erdős [EE94] proved that this number is
$\Omega(n^{2-(\log n)^{-c}})$ for all $K$, where $c > 0$ depends on $K$, and it is $\Omega(n^2)$
whenever all the coordinates of the points in $K$ are algebraic numbers.
Building on these results, Laczkovich and Ruzsa proved that the
maximum number of similar copies of $K$ is $\Omega(n^2)$ if and only if the
cross-ratio of every 4 points of $K$ is algebraic, where the cross-ratio
of points $a, b, c, d \in \mathbf{R}^2$ equals $\frac{a-c}{b-c} \cdot \frac{b-d}{a-d}$, with $a, b, c, d$ interpreted as
complex numbers in this formula.
Their proof makes use of very nice results from the additive theory
of numbers, most notably a theorem of Freiman [Fre73] (also see
Ruzsa [Ruz94]): If $A$ is a set of $n$ integers such that $|A + A| \le cn$,
where $A + A = \{a + b : a, b \in A\}$ and $c > 0$ is a constant, then $A$
is contained in a $d$-dimensional generalized arithmetic progression of
size at most $Cn$, with $C$ and $d$ depending on $c$ only. Here a $d$-dimensional
generalized arithmetic progression is a set of integers of the form
$\{z_0 + i_1q_1 + i_2q_2 + \cdots + i_dq_d : i_1 = 0, 1, \ldots, n_1,\ i_2 = 0, 1, \ldots, n_2,\ \ldots,\ i_d = 0, 1, \ldots, n_d\}$
for some integers $z_0$ and $q_1, q_2, \ldots, q_d$. It is easy to see that
$|A + A| \le C_d|A|$ for every $d$-dimensional generalized arithmetic progression,
and Freiman's theorem is a sort of converse statement: If
$|A + A| = O(|A|)$, then $A$ is not too far from a generalized arithmetic
progression. (Freiman's theorem has also been used for incidence-related
problems by Erdős, Füredi, Pach, and Ruzsa [EFPR93], and
Gowers's paper [Gow98] is an impressive application of results of this
type in combinatorial number theory.)
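The easy direction mentioned above (generalized arithmetic progressions have small sumsets) can be checked numerically. The following is a minimal sketch, not part of the original text; the helper names `sumset` and `gap` and the particular parameters are illustrative assumptions.

```python
def sumset(a):
    """Return A + A = {x + y : x, y in A}."""
    return {x + y for x in a for y in a}


def gap(z0, steps, sizes):
    """Generalized arithmetic progression
    {z0 + i_1*q_1 + ... + i_d*q_d : 0 <= i_j <= n_j},
    with steps = [q_1, ..., q_d] and sizes = [n_1, ..., n_d]."""
    points = {z0}
    for q, n_max in zip(steps, sizes):
        points = {p + i * q for p in points for i in range(n_max + 1)}
    return points


n = 50
ap = gap(0, [1], [n - 1])            # 1-dimensional progression, 50 elements
gap2 = gap(0, [1, 1000], [6, 6])     # 2-dimensional progression, 49 elements
powers = {2 ** i for i in range(n)}  # a set with no additive structure

for name, a in [("AP", ap), ("2-dim GAP", gap2), ("powers of 2", powers)]:
    print(name, len(a), len(sumset(a)))
```

The two progressions have sumsets of size linear in $|A|$, while the powers of 2 attain the maximum possible size $\binom{n+1}{2}$.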
Polynomials attaining $O(n)$ values on Cartesian products. Interesting
results related to those of Freiman, as well as to incidence problems,
were obtained in a series of papers by Elekes and his coworkers (they
are described in the already mentioned survey [Ele01]). Perhaps even
more significant than the particular results is the direction of research
opened by them, combining algebraic and combinatorial tools. Let us
begin with a conjecture of Purdy proved by Elekes and Rónyai [ER00]
as a consequence of their theorems. Let $P$ be a set of $n$ distinct points
lying on a line $u \subset \mathbf{R}^2$, let $Q$ be a set of $n$ distinct points lying on
a line $v \subset \mathbf{R}^2$, and let $\mathrm{Dist}(P, Q) = \{\|p - q\| : p \in P, q \in Q\}$. If, for
example, $u$ and $v$ are parallel and if both $P$ and $Q$ are placed with equal
spacing along their lines, then $|\mathrm{Dist}(P, Q)| \le 2n$. Another such case
is $P = \{(\sqrt{i}, 0) : i = 1, 2, \ldots, n\}$ and $Q = \{(0, \sqrt{j}) : j = 1, 2, \ldots, n\}$:
This time $u$ and $v$ are perpendicular, and again $|\mathrm{Dist}(P, Q)| \le 2n$.
According to Purdy's conjecture, these are the only possible positions
of $u$ and $v$ if the number of distances is linear: For every $C > 0$ there
is an $n_0$ such that if $n \ge n_0$ and $|\mathrm{Dist}(P, Q)| \le Cn$, then $u$ and $v$ are
parallel or perpendicular.
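The two extremal examples can be compared with a generic position of the lines in a small experiment. This is only an illustrative sketch: the function name `distinct_distances`, the choice $n = 10$, the 60-degree angle, and the rounding tolerance are assumptions made here. Squared distances are compared (which gives the same count as distances) and rounded to absorb floating-point error.

```python
from math import sqrt, cos, sin, pi


def distinct_distances(P, Q, digits=6):
    """Number of distinct values of |p - q|^2 over p in P, q in Q.
    Values are rounded to `digits` decimals to absorb floating-point error."""
    return len({round((px - qx) ** 2 + (py - qy) ** 2, digits)
                for (px, py) in P for (qx, qy) in Q})


n = 10
idx = range(1, n + 1)

# Parallel lines, equal spacing: squared distances (i - j)^2 + 1.
parallel = distinct_distances([(i, 0.0) for i in idx], [(j, 1.0) for j in idx])

# Perpendicular lines, square-root spacing: squared distances i + j.
perpendicular = distinct_distances([(sqrt(i), 0.0) for i in idx],
                                   [(0.0, sqrt(j)) for j in idx])

# Lines meeting at 60 degrees, equal spacing: squared distances i^2 + j^2 - i*j.
angle = pi / 3
skew = distinct_distances([(i, 0.0) for i in idx],
                          [(j * cos(angle), j * sin(angle)) for j in idx])

print(parallel, perpendicular, skew)
```

For this small $n$ the first two configurations stay within the $2n$ bound, while the 60-degree configuration already exceeds it.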
If we parameterize the line $u$ by a real parameter $x$, and $v$ by $y$, and
denote the cosine of the angle of $u$ and $v$ by $\lambda$, then Purdy's conjecture
can be reformulated in algebraic terms as follows: Whenever $X, Y \subset \mathbf{R}$
are $n$-point sets such that the polynomial $F(x, y) = x^2 + y^2 + 2\lambda xy$
attains at most $Cn$ distinct values on $X \times Y$, i.e., $|\{F(x, y) : x \in X, y \in Y\}| \le Cn$,
then necessarily $\lambda = 0$ or $\lambda = \pm 1$, provided that $n \ge n_0(C)$.
Elekes and Rónyai [ER00] characterized all bivariate polynomials
$F(x, y)$ that attain only $O(n)$ values on Cartesian products $X \times Y$.
For every $C, d$ there exists an $n_0$ such that if $F(x, y)$ is a bivariate
polynomial of degree at most $d$ and $X, Y \subset \mathbf{R}$ are $n$-point sets, $n \ge n_0$,
such that $F(x, y)$ attains at most $Cn$ distinct values on $X \times Y$, then
$F(x, y)$ has one of the two special forms $f(g(x) + h(y))$ or $f(g(x)h(y))$,
where $f, g, h$ are univariate polynomials. In fact, we need not consider
the whole $X \times Y$; it suffices to assume that $F$ attains at most $Cn$ values
on an arbitrary subset of $\delta n^2$ pairs from $X \times Y$ (with $n_0$ depending
on $\delta$, too). A similar result holds for a bivariate rational function
$F(x, y)$, with one more special form to consider, namely,
$F(x, y) = f((g(x) + h(y))/(1 - g(x)h(y)))$.
We indicate a proof only for the special case of the polynomial
$F(x, y) = x^2 + y^2 + 2\lambda xy$ from Purdy's conjecture (following Elekes
[Ele99]); the basic idea of the general case is similar, but several more
tools are needed, especially from elementary algebraic geometry. So let
$Z = F(X, Y)$ be the set of values attained by $F$ on $X \times Y$. For each
$y_i \in Y$, put $f_i(x) = F(x, y_i)$, and define the family
$\Gamma = \{\gamma_{ij} : i, j = 1, 2, \ldots, n,\ i \ne j\}$ of planar curves by
$\gamma_{ij} = \{(f_i(t), f_j(t)) : t \in \mathbf{R}\}$
(this is the key trick). Each $\gamma_{ij}$ contains at least $\frac{n}{2}$ points of $Z \times Z$,
since among the $n$ points $(f_i(x_k), f_j(x_k))$, $x_k \in X$, no 3 can coincide,
because the $f_i$ are quadratic polynomials. Moreover, a straightforward
(although lengthy) calculation using resultants verifies that for
$\lambda \notin \{0, \pm 1\}$, at most 8 distinct curves $\gamma_{ij}$ can pass through any two
given distinct points $a, b \in \mathbf{R}^2$. Consequently, $\Gamma$ contains $\Omega(n^2)$
distinct curves. Using the bound of Pach and Sharir [PS92], [PS98a]
on the number of incidences between points and algebraic curves mentioned
above, with $Z \times Z$ as the points and the $\Omega(n^2)$ distinct
curves of $\Gamma$ as the curves, we obtain that $|Z| = \Omega(n^{5/4})$. So there is
even a significant gap: Either $\lambda \in \{0, \pm 1\}$, and then $F(X, Y)$ can have
only $2n$ distinct elements for suitable $X, Y$, or $\lambda \notin \{0, \pm 1\}$ and then
$|F(X, Y)| = \Omega(n^{5/4})$ for all $X, Y$.
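The final step of this argument can be spelled out as follows. This is a sketch under the assumption that the Pach-Sharir bound is applied with $k = 2$ degrees of freedom (an interpretation suggested by the two-point condition above, not stated explicitly in the text). Here $N$ denotes the number of distinct curves of the family, so that $N$ is of order $n^2$ and at most $n^2$, and $c, C$ are unspecified positive constants.

```latex
\begin{align*}
c\,n^3 \;\le\; \frac{n}{2}\,N
  \;\le\; (\text{number of incidences})
  \;\le\; C\bigl(|Z|^{4/3} N^{2/3} + |Z|^2 + N\bigr)
  \;\le\; C\bigl(|Z|^{4/3} n^{4/3} + |Z|^2 + n^2\bigr).
\end{align*}
```

For large $n$ the term $n^2$ is negligible compared with $n^3$, so either $|Z|^{4/3}n^{4/3}$ is of order at least $n^3$, which gives $|Z|$ of order at least $n^{5/4}$, or $|Z|^2$ is of order at least $n^3$, which gives the even stronger bound of order $n^{3/2}$.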

Perhaps this latter bound can be improved to $\Omega(n^{2-\varepsilon})$ for every
$\varepsilon > 0$ (so there would be an almost-dichotomy: either the number of
values of $F$ can be linear, or it has to be always near-quadratic). On the
other hand, it is known that the polynomial $x^2 + y^2 + xy$ attains only
$O(n^2/\sqrt{\log n})$ distinct values for $x, y$ ranging over $\{1, 2, \ldots, n\}$, and
so the bound need not always be linear or quadratic. It seems likely
that in the general case of the Elekes-Rónyai theorem the number of
values attained by $F$ should be near-quadratic unless $F$ is one of the
special forms.
Further generalizations of the Elekes-Rónyai theorem were obtained
by Elekes and Szabó; see [Ele01].

Exercises
1. Let $I_{1\mathrm{circ}}(m, n)$ be the maximum number of incidences of $m$ points with
$n$ unit circles and let $U(n)$ be the maximum number of unit distances for
an $n$-point set.
(a) Prove that $I_{1\mathrm{circ}}(2n, 2n) = O(I_{1\mathrm{circ}}(n, n))$.
(b) We have seen that $U(n) \le \frac{1}{2} I_{1\mathrm{circ}}(n, n)$. Prove that
$I_{1\mathrm{circ}}(n, n) = O(U(n))$.
2. Show that an $n$-point set in $\mathbf{R}^4$ may determine $\Omega(n^2)$ unit distances.
3. Prove that if $X \subset \mathbf{R}^d$ is a set where every two points have distance 1,
then $|X| \le d+1$.
4. What can be said about the maximum possible number of incidences of
$n$ lines in $\mathbf{R}^3$ with $m$ points?
5. Use the Szemerédi-Trotter theorem to show that $n$ points in the plane
determine at most
(a) $O(n^{7/3})$ triangles of unit area,
(b) $O(n^{7/3})$ triangles with a given fixed angle $\alpha$.
The result in (a) was first proved by Erdős and Purdy [EP71]. As for
(b), Pach and Sharir [PS92] proved the better bound $O(n^2 \log n)$; also
see [PA95].
6. (a) Using the Szemerédi-Trotter theorem, show that the maximum possible
number of distinct lines such that each of them contains at least $k$
points of a given $m$-point set $P$ in the plane is $O(m^2/k^3 + m/k)$.
(b) Prove that such lines have at most $O(m^2/k^2 + m)$ incidences with $P$.
7. (Many points on a line or many lines)
(a) Let $P$ be an $m$-point set in the plane and let $k \le \sqrt{m}$ be an integer
parameter. Prove (using Exercise 6, say) that at most $O(m^2/k)$ pairs of
points of $P$ lie on lines containing at least $k$ and at most $\sqrt{m}$ points of $P$.
(b) Similarly, for $K \ge \sqrt{m}$, the number of pairs lying on lines with at
least $\sqrt{m}$ and at most $K$ points is $O(Km)$.
(c) Prove the following theorem of Beck [Bec83]: There is a constant
$c > 0$ such that for any $n$-point set $P \subset \mathbf{R}^2$, at least $cn^2$ distinct lines are
determined by $P$ or there is a line containing at least $cn$ points of $P$.
(d) Derive that there exists a constant $c > 0$ such that for every $n$-point
set $P$ in the plane that does not lie on a single line there exists a point
$p \in P$ lying on at least $cn$ distinct lines determined by points of $P$.
Part (d) is a weak form of the Dirac-Motzkin conjecture; the full conjecture,
still unsolved, is the same assertion with $c = \frac{1}{2}$.
8. (Many distinct radii)
(a) Assume that $I_{\mathrm{circ}}(m, n) = O(m^{\alpha}n^{\beta} + m + n)$ for some constants $\alpha < 1$
and $\beta < 1$, where $I_{\mathrm{circ}}(m, n)$ is the maximum number of incidences of $m$
points with $n$ circles in the plane. In analogy with Exercise 7, derive
that there is a constant $c > 0$ such that for any $n$-point set $P \subset \mathbf{R}^2$,
there are at least $cn^3$ distinct circles containing at least 3 points of $P$
each or there is a circle or line containing at least $cn$ points of $P$.
(b) Using (a), prove the following result of Elekes (an answer to a question
of Balog): For any $n$-point set $P \subset \mathbf{R}^2$ not lying on a common circle or
line, the circles determined by $P$ (i.e., those containing 3 or more points
of $P$) have $\Omega(n)$ distinct radii.
(c) Find an example of an $n$-point set with only $O(n)$ distinct radii.
9. (Sums and products cannot both be few) Let $A \subset \mathbf{R}$ be a set of $n$ distinct
real numbers and let $S = A + A = \{a + b : a, b \in A\}$ and
$P = A \cdot A = \{ab : a, b \in A\}$.
(a) Check that each of the $n^2$ lines $\{(x, y) \in \mathbf{R}^2 : y = a(x - b)\}$, $a, b \in A$,
contains at least $n$ distinct points of $S \times P$.
(b) Conclude using Exercise 6 that $|S \times P| = \Omega(n^{5/2})$, and consequently,
$\max(|S|, |P|) = \Omega(n^{5/4})$; i.e., the set of sums and the set of products can
never both have almost linear size. (This is a theorem of Elekes [Ele97]
improving previous results on a problem raised by Erdős and Szemerédi.)
10. (a) Find $n$-point sets in the plane that contain $\Omega(n^2)$ similar copies of
the vertex set of an equilateral triangle.
(b) Verify that the following set $P_m$ has $n = O(m^4)$ points and contains
$\Omega(n^2)$ similar copies of the vertex set of a regular pentagon: Identify $\mathbf{R}^2$
with the complex plane $\mathbf{C}$, let $\omega = e^{2\pi i/5}$ denote a primitive 5th root of
unity, and put

\[ P_m = \{i_0 + i_1\omega + i_2\omega^2 + i_3\omega^3 : i_0, i_1, i_2, i_3 \in \mathbf{Z},\ |i_j| \le m\}. \]

The example in (b) is from Elekes and Erdős [EE94], and the set $P_\infty$ is
called a pentagonal pseudolattice. The following picture shows $P_2$:

[Figure: the point set $P_2$.]

4.2 Lower Bounds: Incidences and Unit Distances


4.2.1 Proposition (Many point-line incidences). We have $I(n, n) = \Omega(n^{4/3})$,
and so the upper bound for the maximum number of incidences
of $n$ points and $n$ lines in the plane in the Szemerédi-Trotter theorem is
asymptotically optimal.
It is not easy to come up with good constructions "by hand." Small cases
do not seem to be helpful for discovering a general pattern. Surprisingly, an
asymptotically optimal construction is quite simple. The appropriate lower
bound for $I(m, n)$ with $n \ne m$ is obtained similarly (Exercise 1).
Proof. For simplicity, we suppose that $n = 4k^3$ for a natural number $k$.
For the point set $P$, we choose the $k \times 4k^2$ grid; i.e., we set
$P = \{(i, j) : i = 0, 1, 2, \ldots, k-1,\ j = 0, 1, \ldots, 4k^2-1\}$. The set $L$ consists of all the lines with
equations $y = ax + b$, where $a = 0, 1, \ldots, 2k-1$ and $b = 0, 1, \ldots, 2k^2-1$.
These are $n$ lines, as it should be. For $x \in [0, k)$, we have
$ax + b \le ak + b < 2k^2 + 2k^2 = 4k^2$. Therefore, for each $i = 0, 1, \ldots, k-1$, each line of $L$ contains
a point of $P$ with the $x$-coordinate equal to $i$, and so $I(P, L) \ge k \cdot |L| = \Omega(n^{4/3})$.
□
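The construction in the proof is easy to check by brute force for small $k$. The sketch below is only an illustration; the function name `grid_incidences` is an assumption of this example and does not come from the text.

```python
def grid_incidences(k):
    """Build the k x 4k^2 grid P and the lines y = a*x + b with
    a in {0, ..., 2k-1}, b in {0, ..., 2k^2-1}, and count the incidences."""
    points = {(i, j) for i in range(k) for j in range(4 * k * k)}
    lines = [(a, b) for a in range(2 * k) for b in range(2 * k * k)]
    # A grid point on the line y = a*x + b must have x in {0, ..., k-1}.
    incidences = sum((x, a * x + b) in points
                     for (a, b) in lines for x in range(k))
    return len(points), len(lines), incidences


for k in (2, 3, 4):
    m, n, inc = grid_incidences(k)
    print(k, m, n, inc)
```

In each case the number of points and the number of lines both equal $4k^3$, and every line contributes exactly $k$ incidences.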
Next, we consider unit distances, where the construction is equally simple
but the analysis uses considerable number-theoretic tools.
4.2.2 Theorem (Many unit distances). For all $n \ge 2$, there exist configurations
of $n$ points in the plane determining at least $n^{1+c_1/\log\log n}$ unit
distances, with a positive constant $c_1$.
A configuration with the asymptotically largest known number of unit
distances is a $\sqrt{n} \times \sqrt{n}$ regular grid with a suitably chosen step. Here unit
distances are related to the number of possible representations of an integer
as a sum of two squares. We begin with the following claim:
4.2.3 Lemma. Let $p_1 < p_2 < \cdots < p_r$ be primes of the form $4k+1$, and
put $M = p_1p_2 \cdots p_r$. Then $M$ can be expressed as a sum of two squares of
integers in at least $2^r$ ways.

Proof. As we know from Theorem 2.3.1, each $p_j$ can be written as a sum
of two squares: $p_j = a_j^2 + b_j^2$. In the sequel, we work with the ring $\mathbf{Z}[i]$, the
so-called Gaussian integers, consisting of all complex numbers $u + iv$, where
$u, v \in \mathbf{Z}$. We use the fact that each element of $\mathbf{Z}[i]$ can be uniquely factored
into primes. From algebra, we recall that a prime in the ring $\mathbf{Z}[i]$ is an element
$\gamma \in \mathbf{Z}[i]$ such that whenever $\gamma = \gamma_1\gamma_2$ with $\gamma_1, \gamma_2 \in \mathbf{Z}[i]$, then $|\gamma_1| = 1$ or
$|\gamma_2| = 1$. Both existence and uniqueness of prime factorization follow from
the fact that $\mathbf{Z}[i]$ is a Euclidean ring (see an introductory course on algebra
for an explanation of these notions).
Let us put $\alpha_j = a_j + i b_j$, and let $\bar\alpha_j = a_j - i b_j$ be the complex conjugate
of $\alpha_j$. We have $\alpha_j\bar\alpha_j = (a_j + ib_j)(a_j - ib_j) = a_j^2 + b_j^2 = p_j$. Let us
choose an arbitrary subset $J \subseteq I = \{1, 2, \ldots, r\}$ and define
$A_J + iB_J = (\prod_{j \in J} \alpha_j)(\prod_{j \in I \setminus J} \bar\alpha_j)$.
Then $A_J - iB_J = (\prod_{j \in J} \bar\alpha_j)(\prod_{j \in I \setminus J} \alpha_j)$, and
hence $M = (A_J + iB_J)(A_J - iB_J) = A_J^2 + B_J^2$. This gives one expression of
the number $M$ as a sum of two squares. It remains to prove that for two sets
$J \ne J'$, $A_J + iB_J \ne A_{J'} + iB_{J'}$. To this end, it suffices to show that all the
$\alpha_j$ and $\bar\alpha_j$ are primes in $\mathbf{Z}[i]$. Then the numbers $A_J + iB_J$ and $A_{J'} + iB_{J'}$ are
distinct, since they have distinct prime factorizations. (No $\alpha_j$ or $\bar\alpha_j$ can be
obtained from another one by multiplying it by a unit of the ring $\mathbf{Z}[i]$: The
units are only the elements $1$, $-1$, $i$, and $-i$.)
So suppose that $\alpha_j = \gamma_1\gamma_2$, $\gamma_1, \gamma_2 \in \mathbf{Z}[i]$. We have
$p_j = \alpha_j\bar\alpha_j = \gamma_1\gamma_2\bar\gamma_1\bar\gamma_2 = |\gamma_1|^2|\gamma_2|^2$.
Now, $|\gamma_1|^2$ and $|\gamma_2|^2$ are both integers, and since $p_j$ is
a prime, we get that $|\gamma_1| = 1$ or $|\gamma_2| = 1$. □
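The construction in this proof can be carried out explicitly for small primes. The sketch below is illustrative only: Gaussian integers are represented as integer pairs, and the helper names `two_squares`, `gauss_mul`, and `representations` are assumptions of this example. The decomposition $p = a^2 + b^2$ is found by brute force rather than by the method of Theorem 2.3.1.

```python
from itertools import product
from math import isqrt


def two_squares(p):
    """Return (a, b) with a*a + b*b == p, found by brute force."""
    for a in range(isqrt(p) + 1):
        b = isqrt(p - a * a)
        if a * a + b * b == p:
            return a, b
    raise ValueError(f"{p} is not a sum of two squares")


def gauss_mul(z, w):
    """Product of the Gaussian integers z = (re, im) and w = (re, im)."""
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def representations(primes):
    """All pairs (A_J, B_J): for each subset J, multiply alpha_j for j in J
    and the conjugate of alpha_j for j outside J."""
    reps = set()
    for choice in product((False, True), repeat=len(primes)):
        z = (1, 0)
        for p, in_J in zip(primes, choice):
            a, b = two_squares(p)
            z = gauss_mul(z, (a, b) if in_J else (a, -b))
        reps.add(z)
    return reps


primes = [5, 13, 17]          # primes of the form 4k+1
M = 5 * 13 * 17
reps = representations(primes)
print(sorted(reps))
```

For these three primes the subsets $J$ yield $2^3 = 8$ distinct pairs $(A_J, B_J)$, each satisfying $A_J^2 + B_J^2 = 1105$.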
Next, we need to know that the primes of the form $4k+1$ are sufficiently
dense. First we recall the well-known prime number theorem: If $\pi(n)$ denotes
the number of primes not exceeding $n$, then

\[ \pi(n) = (1 + o(1))\frac{n}{\ln n} \quad \text{as } n \to \infty. \]

Proofs of this fact are quite complicated; on the other hand, it is not so hard
to prove weaker bounds $cn/\log n < \pi(n) < Cn/\log n$ for suitable positive
constants $c, C$.
We consider primes in the arithmetic progression $1, 5, 9, \ldots, 4k+1, \ldots$. A
famous theorem of Dirichlet asserts that every arithmetic progression contains
infinitely many primes unless this is impossible for a trivial reason,
namely, unless all the terms have a nontrivial common divisor. The following
theorem is still stronger:
4.2.4 Theorem. Let $d$ and $a$ be relatively prime natural numbers, and let
$\pi_{d,a}(n)$ be the number of primes of the form $a + kd$ ($k = 0, 1, 2, \ldots$) not
exceeding $n$. We have

\[ \pi_{d,a}(n) = (1 + o(1))\frac{1}{\varphi(d)} \cdot \frac{n}{\ln n}, \]

where $\varphi$ denotes the Euler function: $\varphi(d)$ is the number of integers between 1
and $d$ that are relatively prime to $d$.
For every $d \ge 2$, there are $\varphi(d)$ residue classes modulo $d$ that can possibly
contain primes. The theorem shows that the primes are quite uniformly
distributed among these residue classes.
The proof of the theorem is not simple, and we omit it, but it is very
nice, and we can only recommend to the reader to look it up in a textbook
on number theory.
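The uniform distribution among residue classes can be observed numerically for $d = 4$. This is a small illustrative sketch using a plain sieve of Eratosthenes; the function names `primes_up_to` and `count_in_class` are assumptions of this example, and the experiment of course proves nothing about the asymptotics.

```python
from math import isqrt, log


def primes_up_to(n):
    """Sieve of Eratosthenes: list of all primes not exceeding n (n >= 2)."""
    is_prime = bytearray([1]) * (n + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, isqrt(n) + 1):
        if is_prime[p]:
            # cross out the multiples p*p, p*p + p, ...
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = 0
    return [i for i in range(n + 1) if is_prime[i]]


def count_in_class(n, d, a):
    """pi_{d,a}(n): number of primes p <= n with p congruent to a modulo d."""
    return sum(1 for p in primes_up_to(n) if p % d == a % d)


for n in (100, 10_000, 1_000_000):
    c1, c3 = count_in_class(n, 4, 1), count_in_class(n, 4, 3)
    # Theorem 4.2.4 predicts that both counts are about n / (2 ln n).
    print(n, c1, c3, round(n / (2 * log(n))))
```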
Proof of the lower bound for unit distances (Theorem 4.2.2). Let us
suppose that $n$ is a square. For the set $P$ we choose the points of the $\sqrt{n} \times \sqrt{n}$
grid with step $1/\sqrt{M}$, where $M$ is the product of the first $r-1$ primes of the
form $4k+1$, and $r$ is chosen as the largest number such that $M \le \frac{n}{4}$.
It is easy to see that each point of the grid participates in at least as many
unit distances as there are representations of $M$ as a sum of two squares of
nonnegative integers. Since one representation by a sum of two squares of
nonnegative integers corresponds to at most 4 representations by a sum of
two squares of arbitrary integers (the signs can be chosen in 4 ways), we have
at least $2^{r-1}/16$ unit distances by Lemma 4.2.3.
By the choice of $r$, we have $4p_1p_2 \cdots p_{r-1} \le n < 4p_1p_2 \cdots p_r$, and
hence $2^r \le n$ and $p_r > (\frac{n}{4})^{1/r}$. Further, we obtain, by Theorem 4.2.4,
$r = \pi_{4,1}(p_r) \ge (\frac{1}{2} - o(1))p_r/\log p_r \ge \sqrt{p_r} \ge n^{1/3r}$ for sufficiently large
$n$, and thus $r^{3r} \ge n$. Taking logarithms, we have $3r \log r \ge \log n$, and hence
$r \ge \log n/(3 \log r) \ge \log n/(3 \log\log n)$. The number of unit distances is at
least $n2^{r-4} \ge n^{1+c_1/\log\log n}$, as Theorem 4.2.2 claims. Let us remark that for
sufficiently large $n$ the constant $c_1$ can be made as close to 1 as desired. □
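For a concrete feeling, one can count the unit distances in a small instance of this grid. After scaling by $\sqrt{M}$, a unit distance in the grid with step $1/\sqrt{M}$ is the same as squared distance $M$ in the integer grid. The sketch below is illustrative; the function name `unit_distance_pairs` and the sample parameters (a $20 \times 20$ grid, so $n = 400$ and $M = 5 \cdot 13 = 65 \le n/4$) are choices made here, not taken from the text.

```python
from math import isqrt


def unit_distance_pairs(side, M):
    """Number of unordered pairs of points of the side x side integer grid
    whose squared distance equals M."""
    total = 0
    for dx in range(isqrt(M) + 1):
        dy2 = M - dx * dx
        dy = isqrt(dy2)
        if dy * dy != dy2:
            continue  # M - dx^2 is not a perfect square
        # Count each unordered pair once: offsets (dx, dy) and (dx, -dy)
        # are different directions unless dx == 0 or dy == 0.
        directions = 2 if dx > 0 and dy > 0 else 1
        total += directions * max(side - dx, 0) * max(side - dy, 0)
    return total


print(unit_distance_pairs(20, 5))    # M = 5 (one prime of the form 4k+1)
print(unit_distance_pairs(20, 65))   # M = 5 * 13 (two such primes)
```

With two prime factors of the form $4k+1$ the same grid already carries more unit distances than with one, in line with the role of Lemma 4.2.3 in the proof.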

Bibliography and remarks. Proposition 4.2.1 is due to Erdős
[Erd46]. His example is outlined in Exercise 2 (also see [PA95]); the
analysis requires a bit of number theory. The simpler example in the
text is from Elekes [Ele01]. Its extension provides the best known
lower bound for the number of incidences between $m$ points and
$n \ge m^{(k-1)/2}$ curves with $k$ degrees of freedom: For a parameter $t \le m^{1/k}$,
let $P = \{(i, j) : 0 \le i < t,\ 0 \le j < \frac{m}{t}\}$, and let $\Gamma$ consist of the
graphs of the polynomials $\sum_{\ell=0}^{k-1} a_\ell x^\ell$ with
$a_\ell = 0, 1, \ldots, \lfloor \frac{m}{kt^{\ell+1}} \rfloor$, $\ell = 0, 1, \ldots, k-1$.
Theorem 4.2.2 is due to Erdős [Erd46], and the proof uses ingredients
well known in number theory. The prime number theorem (and
also Theorem 4.2.4) was proved in 1896, by de la Vallée Poussin and
independently by Hadamard (see Narkiewicz [Nar00]).

Exercises
1. By extending the example in the text, prove that for all $m, n$ with $m \le n^2$
and $n \le m^2$, we have $I(m, n) = \Omega(n^{2/3}m^{2/3})$.
2. (Another example for incidences) Suppose that $n = 4t^6$ for an integer
$t \ge 1$ and let $P = \{(i, j) : 0 \le i, j < \sqrt{n}\}$. Let
$S = \{(a, b) : a, b = 1, 2, \ldots, t,\ \gcd(a, b) = 1\}$, where $\gcd(a, b)$ denotes the greatest common
divisor of $a$ and $b$. For each point $p \in P$, consider the lines passing
through $p$ with slope $a/b$, for all pairs $(a, b) \in S$. Let $L$ be the union of
all the lines thus obtained for all points $p \in P$.
(a) Check that $|L| \le n$.
(b) Prove that $|S| \ge ct^2$ for a suitable positive constant $c > 0$, and infer
that $I(P, L) = \Omega(nt^2) = \Omega(n^{4/3})$.

4.3 Point-Line Incidences via Crossing Numbers


Here we present a very simple proof of the Szemerédi-Trotter theorem based
on a result concerning graph drawing. We need the notion of the crossing
number of a graph $G$; this is the minimum possible number of edge crossings
in a drawing of $G$. To make this rigorous, let us first recall a formal definition
of a drawing.
An arc is the image of a continuous injective map $[0, 1] \to \mathbf{R}^2$. A drawing
of a graph G is a mapping that assigns to each vertex of G a point in the plane
(distinct vertices being assigned distinct points) and to each edge of G an
arc connecting the corresponding two (images of) vertices and not incident
to any other vertex. We do not insist that the drawing be planar, so the
arcs are allowed to cross. A crossing is a point common to at least two arcs
but distinct from all vertices. In this section we will actually deal only with
drawings where each edge is represented by a straight segment.
Let $G$ be a graph (or multigraph). The crossing number of a drawing of
$G$ in the plane is the number of crossings in the considered drawing, where a
crossing incident to $k \ge 2$ edges is counted $\binom{k}{2}$ times. So a drawing is planar
if and only if its crossing number is 0. The crossing number of the graph $G$
is the smallest possible crossing number of a drawing of $G$; we denote it by
$\mathrm{cr}(G)$. For example, $\mathrm{cr}(K_5) = 1$.

As is well known, for $n > 2$, a planar graph with $n$ vertices has at most
$3n-6$ edges. This can be rephrased as follows: If the number of edges is
at least $3n-5$ then $\mathrm{cr}(G) > 0$. The following theorem can be viewed as a
generalization of this fact.

4.3.1 Theorem (Crossing number theorem). Let $G = (V, E)$ be a simple
graph (no multiple edges). Then

\[ \mathrm{cr}(G) \ge \frac{1}{64} \cdot \frac{|E|^3}{|V|^2} - |V| \]

(the constant $\frac{1}{64}$ can be improved by a more careful calculation).

The lower bound in this theorem is asymptotically tight; i.e., there exist
graphs with $n$ vertices, $m$ edges, and crossing number $O(m^3/n^2)$; see Exercise 1.
The assumption that the graph is simple cannot be omitted.
For a proof of this theorem, we need a simple lemma:

4.3.2 Lemma. The crossing number of any simple graph $G = (V, E)$ is at
least $|E| - 3|V|$.

Proof. If $|E| > 3|V|$ and some drawing of the graph had fewer than $|E| - 3|V|$
crossings, then we could delete one edge from each crossing and obtain a
planar graph with more than $3|V|$ edges, which is impossible. □

Proof of Theorem 4.3.1. Consider some drawing of a graph $G = (V, E)$
with $n$ vertices, $m$ edges, and crossing number $x$. We may assume $m \ge 4n$,
for otherwise, the claimed bound is negative. Let $p \in (0, 1)$ be a parameter;
later on we set it to a suitable value. We choose a random subset $V' \subseteq V$ by
including each vertex $v \in V$ into $V'$ independently with probability $p$. Let $G'$
be the subgraph of $G$ induced by the subset $V'$. Put $n' = |V'|$, $m' = |E(G')|$,
and let $x'$ be the crossing number of the graph $G'$ in the drawing "inherited"
from the considered drawing of $G$. The expectation of $n'$ is $\mathbf{E}[n'] = np$. The
probability that a given edge appears in $E(G')$ is $p^2$, and hence $\mathbf{E}[m'] = mp^2$,
and similarly we get $\mathbf{E}[x'] = xp^4$. At the same time, by Lemma 4.3.2 we
always have $x' \ge m' - 3n'$, and so this relation holds for the expectations as
well: $\mathbf{E}[x'] \ge \mathbf{E}[m'] - 3\mathbf{E}[n']$. So we have $xp^4 \ge mp^2 - 3np$. Setting $p = \frac{4n}{m}$
(which is at most 1, since we assume $m \ge 4n$), we calculate that

\[ x \ge \frac{1}{64} \cdot \frac{m^3}{n^2}. \]

The crossing number theorem is proved. □
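For the reader's convenience, the concluding calculation can be written out. It uses only the inequality $xp^4 \ge mp^2 - 3np$ and the choice $p = 4n/m$ from the proof; dividing by $p^4$ gives:

```latex
\begin{align*}
x \;\ge\; \frac{m}{p^2} - \frac{3n}{p^3}
  \;=\; \frac{m \cdot m^2}{16\,n^2} - \frac{3n \cdot m^3}{64\,n^3}
  \;=\; \frac{4m^3}{64\,n^2} - \frac{3m^3}{64\,n^2}
  \;=\; \frac{m^3}{64\,n^2}.
\end{align*}
```

This is slightly stronger than the bound $\frac{1}{64}m^3/n^2 - n$ claimed in Theorem 4.3.1; the subtracted term $n$ is needed only to cover the case $m < 4n$.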


Proof of the Szemerédi-Trotter theorem (Theorem 4.1.1). We consider
a set $P$ of $m$ points and a set $L$ of $n$ lines in the plane realizing the maximum
number of incidences $I(m, n)$. We define a certain topological graph
$G = (V, E)$, that is, a graph together with its drawing in the plane. Each
point $p \in P$ becomes a vertex of $G$, and two points $p, q \in P$ are connected
by an edge if they lie on a common line $\ell \in L$ next to one another. So we
have a drawing of $G$ where the edges are straight segments. This is illustrated
below, with $G$ drawn thick:

If a line $\ell \in L$ contains $k \ge 1$ points of $P$, then it contributes $k-1$ edges to
$G$, and hence $I(m, n) = |E| + n$. Since the edges are parts of the $n$ lines, at
most $\binom{n}{2}$ pairs may cross: $\mathrm{cr}(G) \le \binom{n}{2}$. On the other hand, from the crossing
number theorem we get $\mathrm{cr}(G) \ge \frac{1}{64} \cdot |E|^3/m^2 - m$. So
$\frac{1}{64} \cdot |E|^3/m^2 - m \le \mathrm{cr}(G) \le \binom{n}{2}$, and a calculation gives
$|E| = O(n^{2/3}m^{2/3} + m)$. This proves the
Szemerédi-Trotter theorem. □
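The calculation referred to at the end of the proof can be sketched as follows. The explicit constants are one possible choice and are not taken from the text; the second step uses the elementary inequality $(u + v)^{1/3} \le u^{1/3} + v^{1/3}$ for $u, v \ge 0$.

```latex
\begin{align*}
\frac{|E|^3}{64\,m^2} - m \;\le\; \binom{n}{2} \;\le\; \frac{n^2}{2}
  &\;\Longrightarrow\; |E|^3 \;\le\; 32\,m^2n^2 + 64\,m^3 \\
  &\;\Longrightarrow\; |E| \;\le\; (32\,m^2n^2)^{1/3} + (64\,m^3)^{1/3}
     \;<\; 3.2\,m^{2/3}n^{2/3} + 4m .
\end{align*}
```

Together with the relation between $I(m, n)$ and $|E|$ from the proof, this gives $I(m, n) = O(m^{2/3}n^{2/3} + m + n)$.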

The best known upper bound on the number of unit distances, $U(n) = O(n^{4/3})$,
can be proved along similar lines; see Exercise 2.

Bibliography and remarks. The presented proof of the Szemerédi-Trotter
theorem is due to Székely [Sze97].
The crossing number theorem was proved by Ajtai, Chvátal, Newborn,
and Szemerédi [ACNS82] and independently by Leighton [Lei84].
This result belongs to the theory of geometric graphs, which studies
the properties of graphs drawn in the plane (most often with edges
drawn as straight segments). A nice introduction to this area is given
in Pach and Agarwal [PA95], and a newer survey is Pach [Pac99]. In
the rest of this section we mention mainly some of the more recent
results.
Pach and Tóth [PT97] improved the constant $\frac{1}{64}$ in Theorem 4.3.1
to approximately 0.0296, which is already within a factor of 2.01 of the
best known upper bound (obtained by connecting all pairs of points of
distance at most $d$ in a regular $\sqrt{n} \times \sqrt{n}$ grid, for a suitable $d$). The improvement
is achieved by establishing a better version of Lemma 4.3.2,
namely, $\mathrm{cr}(G) \ge 5|E| - 25|V|$ for $|E| > 7|V| - 14$.
Pach, Spencer, and Tóth [PST00] proved that for graphs with certain
forbidden subgraphs, the bound can be improved substantially:
For example, if $G$ has $n$ vertices, $m$ edges, and contains no cycle of
length 4, then $\mathrm{cr}(G) = \Omega(m^4/n^3)$ for $m \ge 400n$, which is asymptotically
tight. Generally, let $\mathcal{G}$ be a class of graphs that is monotone
(closed under deleting edges) and such that any $n$-vertex graph
in $\mathcal{G}$ has at most $O(n^{1+\alpha})$ edges, for some $\alpha \in (0, 1)$. Then
$\mathrm{cr}(G) \ge cm^{2+1/\alpha}/n^{1+1/\alpha}$ for any $G \in \mathcal{G}$ with $n$ vertices and $m \ge Cn \log^2 n$
edges, with suitable constants $C, c > 0$ depending on $\mathcal{G}$. The proof
applies a generally useful lower bound on the crossing number, which
we outline next. Let $\mathrm{bw}(G)$ denote the bisection width of $G$, i.e., the
minimum number of edges connecting $V_1$ and $V_2$, over all partitions
$(V_1, V_2)$ of $V(G)$ with $|V_1|, |V_2| \ge \frac{1}{3}|V(G)|$. Leighton [Lei83] proved
that $\mathrm{cr}(G) = \Omega(\mathrm{bw}(G)^2) - |V(G)|$ for any graph $G$ of maximum degree
bounded by a constant. Pach, Shahrokhi, and Szegedy [PSS96],
and independently Sýkora and Vrťo [SV94], extended this to graphs
with arbitrary degrees:

(4.1)

where $\deg_G(v)$ is the degree of $v$ in $G$. The proof uses the following
version, due to Gazit and Miller [GM90], of the well-known
Lipton-Tarjan separator theorem for planar graphs: For any planar
graph $H$ and any nonnegative weight function $w: V(H) \to [0, \frac{2}{3}]$ with
$\sum_{v \in V(H)} w(v) = 1$, one can delete at most $1.58\sqrt{\sum_{v \in V(H)} \deg_H(v)^2}$
edges in such a way that the total weight of vertices in each component
of the resulting graph is at most $\frac{2}{3}$. To deduce (4.1), consider a drawing
of $G$ with the minimum number of crossings, replace each crossing by a
vertex of degree 4, assign weight 0 to these vertices and weight $\frac{1}{|V(G)|}$
to the original vertices, and apply the separator theorem (see, e.g.,
[PA95] for a more detailed account). Djidjev and Vrťo [DV02] have recently
strengthened (4.1), replacing $\mathrm{bw}(G)$ by the cutwidth of $G$. To
define the cutwidth, we consider an injective mapping $f: V(G) \to \mathbf{R}$.
Each edge corresponds to a closed interval, and we find the maximum
number of these intervals with a common interior point. The cutwidth
is the minimum of this quantity over all $f$.
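The definition of cutwidth can be made concrete by a brute-force computation for very small graphs. The sketch below is illustrative only (the function name `cutwidth` is an assumption of this example). It relies on the observation that only the order of the values $f(v)$ matters, so it suffices to try all orderings of the vertices and to look at the gaps between consecutive vertices; the running time is exponential.

```python
from itertools import permutations


def cutwidth(num_vertices, edges):
    """Cutwidth of the graph on vertices 0, ..., num_vertices-1 by brute force."""
    best = None
    for order in permutations(range(num_vertices)):
        position = {v: i for i, v in enumerate(order)}
        width = 0
        for gap in range(num_vertices - 1):
            # Edges whose interval contains a point strictly between
            # positions gap and gap + 1.
            crossing = sum(
                1 for u, v in edges
                if min(position[u], position[v]) <= gap < max(position[u], position[v])
            )
            width = max(width, crossing)
        if best is None or width < best:
            best = width
    return best


path4 = [(0, 1), (1, 2), (2, 3)]
cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
k4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
print(cutwidth(4, path4), cutwidth(4, cycle4), cutwidth(4, k4))
```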
To derive the result of Pach et al. [PST00] on the crossing number
of graphs with forbidden subgraphs mentioned above from (4.1), we
consider a graph $G \in \mathcal{G}$ with $n$ vertices and $m$ edges. If $\mathrm{cr}(G)$ is
small, then the bisection width is small, so $G$ can be cut into two
parts of almost equal size by removing not too many edges. For each
of these parts, we bisect again, and so on, until parts of some suitable
size $s$ (depending on $n$ and $m$) are reached. By the assumption on $\mathcal{G}$,
each of the resulting parts has $O(s^{1+\alpha})$ edges, and so there are $O(ns^{\alpha})$
edges within the parts. This number of edges plus the number of edges
deleted in the bisections add up to $m$, and this provides an inequality
relating $\mathrm{cr}(G)$ to $n$ and $m$; see [PST00] for the calculations.
The notion of crossing number is a subtle one. Actually, one can
give several natural definitions; a study of various notions and of their
relations was made by Pach and Tóth [PT00]. Besides counting the
crossings, as we did in the definition of cr(G), one can count the
number of (unordered) pairs of edges that cross; the resulting notion
is called the pairwise crossing number in [PT00], and we denote
it by pair-cr(G). We always have $\text{pair-cr}(G) \le \text{cr}(G)$, but since two
edges (arcs) are allowed to cross several times, it is not clear whether
pair-cr(G) = cr(G) for all graphs G, and currently this seems to be a
challenging open problem (see Exercise 4 for a typical false attempt at
a proof). A simple argument shows that $\text{cr}(G) \le 2\,\text{pair-cr}(G)^2$
(Exercise 4(c)). A stronger claim, proved in [PT00], is $\text{cr}(G) \le 2\,\text{odd-cr}(G)^2$,
where odd-cr(G) is the odd-crossing number of G, counting the number
of pairs of edges that cross an odd number of times. An inspiration
for their proof is a theorem of Hanani and Tutte claiming that a graph
G is planar if and only if odd-cr(G) = 0. In a drawing of G, call an
edge e even if there is no edge crossed by e an odd number of times.
Pach and Tóth show, by a somewhat complicated proof, that if we
consider a drawing of G and let $E_0$ be the set of the even edges, then
there is another drawing of G in which the edges of $E_0$ are involved in
no crossings at all. The inequality $\text{cr}(G) \le 2\,\text{odd-cr}(G)^2$ then follows
by an argument similar to that in Exercise 4(c).
Finally, let us remark that if we consider rectilinear drawings
(where each edge is drawn as a straight segment), then the result-
ing rectilinear crossing number can be much larger than any of the
crossing numbers considered above: Graphs are known with cr(G) = 4
and arbitrarily large rectilinear crossing numbers (Bienstock and Dean
[BD93]).

Exercises
1. Show that for any n and m, $5n \le m \le \binom{n}{2}$, there exist graphs with n
vertices, m edges, and crossing number $O(m^3/n^2)$.
2. In a manner similar to the above proof for point-line incidences, prove the
bound $I_{1\text{circ}}(n, n) = O(n^{4/3})$, where $I_{1\text{circ}}(m, n)$ denotes the maximum
possible number of incidences between m points and n unit circles in the
plane (be careful in handling possible multiple edges in the considered
topological graph!).
3. Let K(n, m) denote the maximum total number of edges of m distinct
cells in an arrangement of n lines in the plane. Prove
$K(n, m) = O(n^{2/3} m^{2/3} + n + m)$ using the method of the present section (it may be
convenient to classify edges into top and bottom ones and bound each
type separately).
4. (a) Prove that in a drawing of G with the smallest possible number of
crossings, no two edges cross more than once.
(b) Explain why the result in (a) does not imply that pair-cr(G) = cr(G)
(where pair-cr(G) is the minimum number of pairs of crossing edges in a
drawing of G).
(c) Prove that if G is a graph with pair-cr(G) = k, then $\text{cr}(G) \le \binom{2k}{2}$.

4.4 Distinct Distances via Crossing Numbers


Here we use the methods from the preceding sections to establish a lower
bound on the number of distinct distances determined by an n-point set
in the plane. We do not go for the best known bound, whose proof is too
complicated for our purposes, but in the notes below we indicate how the
improvement is achieved.

4.4.1 Proposition (Distinct distances in $\mathbf{R}^2$). The minimum number
g(n) of distinct distances determined by an n-point set in the plane satisfies
$g(n) = \Omega(n^{4/5})$.

Proof. Fix an n-point set P, and let t be the number of distinct distances
determined by P. This means that for each point $p \in P$, all the other points
are contained in t circles centered at p (the radii correspond to the t distances
appearing in P).

These tn circles obtained for all the n points of P have $n(n-1)$ incidences
with the points of P. The first idea is to bound this number of incidences from
above in terms of n and t, in a way similar to the proof of the Szemerédi-Trotter
theorem in the preceding section, which yields a lower bound for t.
First we delete all circles with at most 2 points on them (the innermost
circle and the second outermost circle in the above picture). We have
destroyed at most 2nt incidences, and so still almost $n^2$ incidences remain (we
may assume that t is much smaller than n, for otherwise, there is nothing
to prove). Now we define a graph G: The vertices are the points of P and
the edges are the arcs of the circles between the points. This graph has n
vertices, almost $n^2$ edges, and there are at most $t^2 n^2$ crossings because every
two circles intersect in at most 2 points.
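The counting at the start of this proof can be checked on a small example. The following Python sketch is only an illustration: the helper name `circles_by_center` and the choice of the 4 x 4 integer grid as P are assumptions made here, not part of the text. Squared distances are used so that all arithmetic is exact.

```python
from itertools import product

def circles_by_center(points):
    """For each point p, group the other points by squared distance from p.

    Each group is the set of points of P lying on one circle centered at p.
    """
    circles = {}
    for p in points:
        groups = {}
        for q in points:
            if q != p:
                d2 = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                groups.setdefault(d2, []).append(q)
        circles[p] = groups
    return circles

# Hypothetical example: P is the 4 x 4 integer grid, so n = 16.
P = list(product(range(4), repeat=2))
n = len(P)
circles = circles_by_center(P)

# t = number of distinct (squared) distances determined by P.
t = len({d2 for groups in circles.values() for d2 in groups})

# Every ordered pair (p, q) with p != q is one point-circle incidence.
incidences = sum(len(pts) for groups in circles.values() for pts in groups.values())

# Circles with at most 2 points carry at most 2 incidences each.
num_circles = sum(len(groups) for groups in circles.values())
destroyed = sum(len(pts) for groups in circles.values()
                for pts in groups.values() if len(pts) <= 2)
```

In this sketch `incidences` equals n(n-1), there are at most tn circles, and `destroyed` is at most 2nt, matching the three counts used in the proof.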
Now if we could apply the crossing number theorem to this graph, we
would get that with n vertices and $n^2$ edges there must be at least
$\Omega(n^6/n^2) = \Omega(n^4)$ crossings, and so $t = \Omega(n)$ would follow. This, of course, is too good
to be true, and indeed we cannot use the crossing number theorem directly
because our graph may have multiple edges: Two points can be connected by
several arcs.

A multigraph can have arbitrarily many edges even if it is planar. But if we
have a bound on the maximum edge multiplicity, we can still infer a lower
bound on the crossing number:
4.4.2 Lemma. Let G = (V, E) be a multigraph with maximum edge multiplicity k. Then
$$\text{cr}(G) = \Omega\!\left(\frac{|E|^3}{k|V|^2}\right) - O(k^2 |V|).$$

We defer the proof to the end of this section.


In the graph G defined above, it appears that the maximum edge multiplicity
can be as high as t. If we used Lemma 4.4.2 with k = t in the manner
indicated above, we would get only the estimate $t = \Omega(n^{2/3})$.
The next idea is to deal with the edges of very high multiplicity separately.
Namely, we observe that if a pair {u, v} of points is connected by k arcs, then
the centers of these arcs lie on the symmetry axis $\ell_{uv}$ of the segment uv
(the perpendicular bisector of uv).
So the line $\ell_{uv}$ has at least k incidences with the points of P. But the
Szemerédi-Trotter theorem tells us that there cannot be too many distinct lines,
each incident to many points of P. Let us make this precise.
By a consequence of the Szemerédi-Trotter theorem stated in
Exercise 4.1.6(b), lines containing at least k points of P each have altogether
no more than $O(n^2/k^2 + n)$ incidences with P.
Let M be the set of pairs {u, v} of vertices of G connected by at least
k edges in G, and let E be the set of edges (arcs) connecting these pairs.
Each edge in E connecting the pair {u, v} contributes one incidence of the
bisecting line $\ell_{uv}$ with a point $p \in P$. On the other hand, one incidence of
such p with some $\ell_{uv}$ can correspond to at most 2t edges of E, because at
most t circles are centered at p, and so $\ell_{uv}$ intersects at most 2t arcs with
center p. So we have $|E| = O(tn^2/k^2 + tn)$.
Let us set k as small as possible but so that $|E| \le \frac{1}{2} n^2$, i.e., $k = C\sqrt{t}$
for a sufficiently large constant C. If we delete all edges of E, the remaining
graph still has $\Omega(n^2)$ edges, but the maximum multiplicity is now below k.
We can finally apply Lemma 4.4.2: With n vertices, $\Omega(n^2)$ edges, and edge
multiplicity at most $k = O(\sqrt{t})$, we have at least $\Omega(n^4/\sqrt{t})$ crossings. This
number must be below $t^2 n^2$, which yields $t = \Omega(n^{4/5})$ as claimed. □
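The exponent bookkeeping in this proof can be summarized by one formula. As an assumption made only for this sketch, take the multiplicity threshold to be k = t^a and ignore constants and the lower-order term O(k^2 n): Lemma 4.4.2 then gives about n^4/t^a crossings, and comparing with the upper bound t^2 n^2 forces t^(2+a) >= n^2. The helper name `distance_exponent` is hypothetical.

```python
from fractions import Fraction

def distance_exponent(a):
    """Exponent e such that n^4 / t^a <= t^2 n^2 forces t >= n^e (constants ignored)."""
    return Fraction(2) / (2 + Fraction(a))

# a = 0 corresponds to pretending the graph is simple (not justified in the proof).
simple_graph = distance_exponent(0)
# a = 1 corresponds to using Lemma 4.4.2 with k = t.
naive = distance_exponent(1)
# a = 1/2 corresponds to k = C * sqrt(t) after deleting the high-multiplicity edges.
balanced = distance_exponent(Fraction(1, 2))
```

The three values are 1, 2/3, and 4/5, which are the exponents appearing in the "too good to be true" estimate, the estimate with k = t, and the proposition itself.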

Proof of Lemma 4.4.2. Consider a fixed drawing of G. We choose a
subgraph G' of G by the following random experiment. In the first stage,
we consider each edge of G independently, and we delete it with probability
$1 - \frac{1}{k}$. In the second stage, we delete all the remaining multiple edges, and
this gives G', which has n vertices, m' edges, and x' crossing pairs of edges.
Consider the probability $p_e$ that a fixed edge $e \in E$ remains in G'. Clearly,
$p_e \le \frac{1}{k}$. On the other hand, if e was one of $k' \le k$ edges connecting the same
pair of vertices, then the probability that e survives the first stage while all
the other edges connecting its two vertices are deleted is

$$\frac{1}{k}\left(1 - \frac{1}{k}\right)^{k'-1} \ge \frac{1}{3k}$$

(since $(1 - 1/k)^{k-1} \ge \frac{1}{3}$). We get $E[m'] \ge |E|/3k$ and $E[x'] \le x/k^2$, where x
is the number of crossings in the fixed drawing of G. Applying
the crossing number theorem for the graph G' and taking expectations, we
have

$$E[x'] \ge \frac{1}{64} \cdot \frac{E[m'^3]}{n^2} - n.$$

By convexity (Jensen's inequality), we have $E[m'^3] \ge (E[m'])^3 = \Omega(|E|^3/k^3)$.
Plugging this plus the bound $E[x'] \le x/k^2$ into the above formula, we get

$$\frac{x}{k^2} = \Omega\!\left(\frac{|E|^3}{k^3 n^2}\right) - O(n),$$

and the lemma follows. □
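The elementary inequality used for the survival probability can be confirmed numerically with exact rational arithmetic. This is a small sketch, not part of the original proof; the function names and the tested range of k are assumptions chosen for illustration.

```python
from fractions import Fraction

def survival_probability(k, k_prime):
    """Probability that a fixed edge is kept in stage one (probability 1/k)
    while the other k_prime - 1 parallel edges are all deleted."""
    keep = Fraction(1, k)
    return keep * (1 - keep) ** (k_prime - 1)

def bound_holds(max_k):
    """Check (1/k)(1 - 1/k)^(k'-1) >= 1/(3k) for all 1 <= k' <= k <= max_k."""
    return all(
        survival_probability(k, kp) >= Fraction(1, 3 * k)
        for k in range(1, max_k + 1)
        for kp in range(1, k + 1)
    )
```

Calling `bound_holds(60)` checks every admissible pair (k, k') with k up to 60; the general statement follows from (1 - 1/k)^(k-1) >= 1/e.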

Bibliography and remarks. The proof presented above is, with
minor modifications, that of Székely [Sze97]. The bound has subsequently
been improved by Solymosi and Tóth [ST01] to $\Omega(n^{6/7})$ and
then by Tardos [Tar01] to (approximately) $\Omega(n^{0.863})$.
The weakest point of the proof shown above seems to be the lower
bound on the number of incidences between the points of P and the
"rich" bisectors $\ell_{uv}$ ({u, v} being the pairs connected by k or more
edges). We counted as if each such incidence could be responsible for as
many as t edges. While this does not look geometrically very plausible,
it seems hard to exclude such a possibility directly. Instead, Solymosi
and Tóth prove a better lower bound for the number of incidences of
P with the rich bisectors differently; they show that if there are many
edges with multiplicity at least k, then each of $\Omega(n)$ suitable points is
incident to many (namely $\Omega(n/t^{2/3})$ in their proof) rich bisectors. We
outline this argument.
We need to modify the definition of the graph G. The new definition
uses an auxiliary parameter r (a constant, with r = 3 in the original
Solymosi-Tóth proof). First, we note that by the theorem of Beck
mentioned in Exercise 4.1.7, there is a subset $P' \subseteq P$ of $\Omega(n)$ points
such that each $p \in P'$ sees the other points of P in $\Omega(n)$ distinct
directions. For each $p \in P'$, we draw the t circles around p. If several
points of P are visible from p in the same direction, we temporarily
delete all of them but one. Then, on each circle, we group the remaining
points into groups by r consecutive points, and on each circle we delete
the at most $r-1$ leftover points fitting in no such group. This still
leaves $\Omega(n)$ r-point groups on the circles centered at p.
Next, we consider one such r-point group and all the $\binom{r}{2}$ bisecting
lines of its points. If at least one of these bisectors, call it $\ell_{uv}$, contains
fewer than k points of P (k being a suitable threshold), then we add
the arc connecting u and v as an edge of G:

[Figure: if this bisector has at most k points of P, then the arc {u, v} is added to G.]

(This is not quite in agreement with our definition of a graph drawing,
since the arc may pass through other vertices of G, but it is easy to
check that if we permit arcs through vertices and modify the definition
of the crossing number appropriately, Lemma 4.4.2 remains valid.) The
groups where every bisector contains at least k points of P (call them
rich groups) do not contribute any edges of G.
Setting $k = \alpha n^2/t^2$ for a small constant $\alpha$, we argue by Lemma 4.4.2
that G has at most $\beta n^2$ edges for a small $\beta = \beta(\alpha) > 0$. It follows
that most of the r-point groups must be rich, and so there is a subset
$P'' \subseteq P'$ of $\Omega(n)$ points, each of them possessing $\Omega(n)$ rich groups
on its circles. It remains to prove that each point $p \in P''$ is incident
to many rich bisectors. We divide the plane around p into angular
sectors such that each sector contains about 3rt points (of the $\Omega(n)$
points in the rich groups belonging to p). Each sector contains at least
t complete rich groups (since there are t circles, and so the sector's
boundaries cut through at most 2t groups), and we claim that it has
to contain many rich bisectors. This leads to the following number-theoretic
problem: we have tr distinct real numbers (corresponding to
the angles of the points in the sector as seen from p), arranged into
t groups by r numbers, and we form all the $\binom{r}{2}$ arithmetic averages
of the pairs in each group (corresponding to the rich bisectors of the
group). This yields $t\binom{r}{2}$ real numbers, and we want to know how many
of them must be distinct.
It is not hard to see that for r = 3, there must be at least $\Omega(t^{1/3})$
distinct numbers, because the three averages (a + b)/2, (a + c)/2, and
(b + c)/2 determine the numbers a, b, c uniquely. It follows, still for
r = 3, that each of the $\Omega(n/t)$ sectors has $\Omega(t^{1/3})$ distinct bisectors,
and so each point in $P''$ has $\Omega(n/t^{2/3})$ incidences with the rich lines.
Applying Szemerédi-Trotter now yields the Solymosi-Tóth bound of
$t = \Omega(n^{6/7})$ distinct distances.
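The uniqueness claim for r = 3 is a two-line computation: from the three pairwise averages one recovers a = (a+b)/2 + (a+c)/2 - (b+c)/2, and similarly for b and c. Consequently, if only M distinct averages occur, at most M^3 different triples are possible, so t distinct groups force M to be at least t^(1/3). The sketch below illustrates the recovery step; the function names and the sample triple are assumptions made for illustration.

```python
from fractions import Fraction

def pairwise_averages(a, b, c):
    """The three averages of the pairs of an integer triple (a, b, c)."""
    return (Fraction(a + b, 2), Fraction(a + c, 2), Fraction(b + c, 2))

def recover(s_ab, s_ac, s_bc):
    """Recover (a, b, c) from the averages (a+b)/2, (a+c)/2, (b+c)/2."""
    a = s_ab + s_ac - s_bc
    b = s_ab + s_bc - s_ac
    c = s_ac + s_bc - s_ab
    return (a, b, c)

# Hypothetical sample triple.
triple = (3, 10, 24)
recovered = recover(*pairwise_averages(*triple))
```

Here `recovered` equals the original triple, which is exactly the statement that the three averages determine a, b, c.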
Tardos [Tar01] considered the number-theoretic problem above for
larger r, and he proved, by a complicated argument, that for r large
but fixed, the number of distinct pairwise averages is $\Omega(t^{1/e - \varepsilon})$, with
$\varepsilon \to 0$ as $r \to \infty$. Plugging this into the proof leads to the current
best bound mentioned above. An example by Ruzsa shows that the
number of distinct pairwise averages can be $O(\sqrt{t})$ for any fixed r,
and it follows that the Solymosi-Tóth method as is cannot provide a
bound better than $\Omega(n^{8/9})$. But surely one can look forward to the
further continuation of the adventure of distinct distances.
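The three exponents 6/7, 0.863, and 8/9 quoted in these remarks fit a single pattern. The following is a back-of-the-envelope reading of the outline above, not a statement taken from the cited papers: if each sector yields about t^c distinct bisectors, each point of P'' has about n t^(c-1) incidences with rich lines, for about n^2 t^(c-1) in total, while the Szemerédi-Trotter consequence with k of order n^2/t^2 allows only about t^4/n^2. Balancing gives t of order n^(4/(5-c)). The function name is hypothetical.

```python
from fractions import Fraction
from math import e

def exponent_from_averages(c):
    """Exponent 4/(5 - c) obtained from n^2 t^(c-1) <= t^4 / n^2 (constants ignored)."""
    return 4 / (5 - c)

solymosi_toth = exponent_from_averages(Fraction(1, 3))  # r = 3
ruzsa_barrier = exponent_from_averages(Fraction(1, 2))  # sqrt(t) averages
tardos_limit = exponent_from_averages(1 / e)            # limiting value as r grows
```

With c = 1/3 this gives 6/7, with c = 1/2 it gives 8/9, and with c = 1/e it gives a floating-point value close to 0.8635.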

Exercises
1. Let $I_{\text{circ}}(m, n)$ be the maximum number of incidences between m points
and n arbitrary circles in the plane. Fill in the details of the following
approach to bounding $I_{\text{circ}}(n, n)$. Let K be a set of n circles, C the set
of their centers, and P a set of n points.
(a) First, assume that the centers of the circles are mutually distinct, i.e.,
|C| = |K|. Proceed as in the proof in the text: Remove circles with at
most 2 incidences, and let the others define a drawing of a multigraph G
with vertex set P and arcs of the circles as edges. Handle the edges with
multiplicity k or larger via Szemerédi-Trotter, using the incidences of the
bisectors with the set C, and those with multiplicity < k by Lemma 4.4.2.
Balance k suitably. What bound is obtained for the total number of
incidences?
(b) Extend the argument to handle concentric circles too.
2. This exercise provides another bound for $I_{\text{circ}}(n, n)$, the maximum possible
number of incidences between n arbitrary circles and n points in the
plane. Let K be the set of circles and P the set of points. Let $P_i$ be the
points with at least $d_i = 2^i$ and fewer than $2^{i+1}$ incidences; we will argue
for each $P_i$ separately.
Define the multigraph G on $P_i$ as usual, with arcs of circles of K
connecting neighboring points of $P_i$ (the circles with at most 2 incidences
with $P_i$ are deleted). Let E be the set of edges of G. For a point $u \in P_i$,
let N(u) be the set of its neighboring points, and for a $v \in N(u)$, let
$\mu(u, v)$ be the number of edges connecting u and v. For an edge e, define
its partner edge as the edge following after e clockwise around its circle.
(a) Show that for each $u \in P_i$, $|\{v \in N(u) : \mu(u, v) \ge 4\sqrt{d_i}\}| < \sqrt{d_i}/2$.
(b) Let $E_h \subseteq E$ be the edges of multiplicity at least $4\sqrt{d_i}$. Argue that
for at least a constant fraction of the edges in $E_h$, their partner edges do not belong to
$E_h$, and hence $|E \setminus E_h| = \Omega(|E|)$.
(c) Delete the edges of $E_h$ from the graph, and apply Lemma 4.4.2 to
bound $|E \setminus E_h|$. What overall bound does all this give for $I_{\text{circ}}(n, n)$?
A similar proof appears in Pach and Sharir [PS98a] (for the more general
case of curves mentioned in the notes to Section 4.1).

4.5 Point-Line Incidences via Cuttings


Here we explain another proof of the upper bound $I(n, n) = O(n^{4/3})$ for
point-line incidences. The technique is quite different. It leads to an efficient
algorithm and seems more generally applicable than the one with the crossing
number theorem.

4.5.1 Lemma (A worse but useful bound).

$$I(m, n) = O(n\sqrt{m} + m), \qquad (4.2)$$

$$I(m, n) = O(m\sqrt{n} + n). \qquad (4.3)$$

Proof. There are at most $\binom{n}{2}$ crossing pairs of lines in total. On the other
hand, a point $p_i \in P$ with $d_i$ incidences "consumes" $\binom{d_i}{2}$ crossing pairs (their
intersections all lie at $p_i$). Therefore, $\sum_{i=1}^{m} \binom{d_i}{2} \le \binom{n}{2}$.
We want to bound $\sum_{i=1}^{m} d_i$ from above. Since points with no incidences
can be deleted from P in advance, we may assume $d_i \ge 1$ for all i, and then
we have $\binom{d_i}{2} \ge (d_i - 1)^2/2$. By the Cauchy-Schwarz inequality,

$$\sum_{i=1}^{m} (d_i - 1) \le \sqrt{m \sum_{i=1}^{m} (d_i - 1)^2} \le \sqrt{2m\binom{n}{2}} \le n\sqrt{m},$$

and hence $\sum d_i = O(n\sqrt{m} + m)$.


The other inequality in the lemma can be proved similarly by looking at
pairs of points on each line. Alternatively, the equality $I(n, m) = I(m, n)$ for
all m, n follows using the geometric duality introduced in Section 5.1. □
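The two inequalities in this proof can be checked on a concrete configuration. The sketch below is an illustration under assumptions chosen here: P is the 4 x 4 integer grid and L is the set of all lines spanned by pairs of grid points; the helper `line_through` and the variable names are hypothetical.

```python
from itertools import combinations, product
from math import comb, gcd

def line_through(p, q):
    """Normalized integer coefficients (a, b, c) of the line ax + by = c through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    g = gcd(a, b)
    a, b = a // g, b // g
    # Fix the sign so that each line has exactly one representation.
    if a < 0 or (a == 0 and b < 0):
        a, b = -a, -b
    return (a, b, a * x1 + b * y1)

P = list(product(range(4), repeat=2))                   # m points
L = {line_through(p, q) for p, q in combinations(P, 2)}  # n lines
m, n = len(P), len(L)

# d[i] = number of lines of L passing through the i-th point.
d = [sum(1 for (a, b, c) in L if a * x + b * y == c) for (x, y) in P]

consumed_pairs = sum(comb(di, 2) for di in d)   # crossing pairs "consumed" at points of P
total_incidences = sum(d)
```

For this configuration `consumed_pairs` is at most `comb(n, 2)`, and `total_incidences - m` is at most n times the square root of m, as the Cauchy-Schwarz step predicts.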

Forbidden subgraph arguments. For integers $r, s \ge 1$, let $K_{r,s}$ denote
the complete bipartite graph on r + s vertices; the picture shows $K_{3,4}$:
The above proof can be expressed using graphs with forbidden $K_{2,2}$ as a
subgraph and thus put into the context of extremal graph theory.
A typical question in extremal graph theory is the maximum possible
number of edges of a (simple) graph on n vertices that does not contain a
given forbidden subgraph, such as $K_{2,2}$. Here the subgraph is understood in
a noninduced sense: For example, the complete graph $K_4$ does contain $K_{2,2}$
as a subgraph. More generally, one can forbid all subgraphs from a finite or
infinite family $\mathcal{F}$ of graphs, or consider "containment" relations other than
being a subgraph, such as "being a minor."
If the forbidden subgraph H is not bipartite, then, for example, the
complete bipartite graph $K_{n,n}$ has 2n vertices, $n^2$ edges, and no subgraph
isomorphic to H. This shows that forbidding a nonbipartite H does not reduce
the maximum number of edges too significantly, and the order of magnitude
remains quadratic.
On the other hand, forbidding $K_{r,s}$ with some fixed r and s decreases
the exponent of n, and forbidden bipartite subgraphs are the key to many
estimates in incidence problems and elsewhere.
4.5.2 Theorem (Kővári-Sós-Turán theorem). Let $r \le s$ be fixed natural
numbers. Then any graph on n vertices containing no $K_{r,s}$ as a subgraph
has at most $O(n^{2-1/r})$ edges.
If G is a bipartite graph with color classes of sizes m and n containing no
subgraph $K_{r,s}$ with the r vertices in the class of size m and the s vertices in
the class of size n, then

$$|E(G)| = O\bigl(\min(mn^{1-1/r} + n,\; m^{1-1/s} n + m)\bigr).$$

(In both parts, the constant of proportionality depends on r and s.)


Note that in the second part of the theorem, the situation is not symmetric:
By forbidding the "reverse" placement of $K_{r,s}$, we get a different bound
in general.
The upper bound in the theorem is suspected to be tight, but a matching
lower bound is known only for some special values of r and s, in particular
for $r \le 3$ (and all $s \ge r$).
To see the relevance of forbidden $K_{2,2}$ to the point-line incidences, we
consider a set P of points and a set L of lines and we define a bipartite
graph with vertex set $P \cup L$ and with edges corresponding to incidences.
An edge $\{p, \ell\}$ means that the point p lies on the line $\ell$. So the number of
incidences equals the number of edges. Since two points determine a line,
this graph contains no $K_{2,2}$ as a subgraph: Its presence would mean that
two distinct lines both contain the same two distinct points. The
Kővári-Sós-Turán theorem thus immediately implies Lemma 4.5.1, and the above
proof of this lemma is the usual proof of that theorem, for the special case
r = s = 2, rephrased in terms of points and lines.
As was noted above, for arbitrary bipartite graphs with forbidden $K_{2,2}$,
not necessarily being incidence graphs of points and lines in the plane, the
bound in the Kővári-Sós-Turán theorem cannot be improved. So, in order
to do better for point-line incidences, one has to use some more geometry
than just the excluded $K_{2,2}$. In fact, this was one of the motivations of the
problem of point-line incidences: In a finite projective plane of order q, we
have $n = q^2 + q + 1$ points, n lines, and $(q+1)n \approx n^{3/2}$ incidences, and so the
Szemerédi-Trotter theorem strongly distinguishes the Euclidean plane from
finite projective planes in a combinatorial sense.
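The counts for a finite projective plane can be verified directly for a small prime order. The sketch below builds the plane over the integers mod q using normalized homogeneous coordinates; it assumes q is prime (prime powers would need genuine finite-field arithmetic), and the function names are hypothetical.

```python
from itertools import combinations

def projective_points(q):
    """Normalized homogeneous coordinates of the points of the projective
    plane over the integers mod q (q is assumed to be prime)."""
    pts = [(1, x, y) for x in range(q) for y in range(q)]
    pts += [(0, 1, y) for y in range(q)]
    pts.append((0, 0, 1))
    return pts

def incident(point, line, q):
    """A point lies on a line when the dot product of their coordinates is 0 mod q."""
    return sum(a * b for a, b in zip(point, line)) % q == 0

q = 3
points = projective_points(q)
lines = projective_points(q)   # lines are described by the same coordinate triples
n = len(points)

incidences = sum(incident(p, line, q) for p in points for line in lines)

# No K_{2,2}: two distinct points never share two lines.
max_common_lines = max(
    sum(incident(p1, line, q) and incident(p2, line, q) for line in lines)
    for p1, p2 in combinations(points, 2)
)
```

For q = 3 this gives n = 13 points and lines, (q+1)n = 52 incidences, and `max_common_lines` equal to 1, so the incidence graph is indeed free of K_{2,2} while having about n^(3/2) edges.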
Proof of the Szemerédi-Trotter theorem (Theorem 4.1.1) for m = n.
The bound from Lemma 4.5.1 is weaker than the tight Szemerédi-Trotter
bound, but it is tight if $n^2 \le m$ or $m^2 \le n$. The idea of the present proof
is to convert the "balanced" case (n points and n lines) into a collection of
"unbalanced" subproblems, for which Lemma 4.5.1 is optimal. We apply the
following important result:
4.5.3 Lemma (Cutting lemma). Let L be a set of n lines in the plane,
and let r be a parameter, 1 < r < n. Then the plane can be subdivided
into t generalized triangles (this means intersections of three half-planes)
$\Delta_1, \Delta_2, \ldots, \Delta_t$ in such a way that the interior of each $\Delta_i$ is intersected by at
most $\frac{n}{r}$ lines of L, and we have $t \le Cr^2$ for a certain constant C independent
of n and r.
Such a collection $\Delta_1, \ldots, \Delta_t$ may look like this, for example:

The lines of L are not shown.


In order to express ourselves more economically, we introduce the following
terminology. A cutting is a subdivision of the plane into finitely many
generalized triangles. (We sometimes omit the adjective "generalized" in the
sequel.) A given cutting is a $\frac{1}{r}$-cutting for a set L of n lines if the interior of
each triangle of the cutting is intersected by at most $\frac{n}{r}$ lines of L.
Proofs of the cutting lemma will be discussed later, and now we continue
the proof of the Szemeredi-Trotter theorem.
Let P be the considered n-point set, L the set of n lines, and I(P, L)
the number of their incidences. We fix a "magic" value $r = n^{1/3}$, and we
divide the plane into $t = O(r^2) = O(n^{2/3})$ generalized triangles $\Delta_1, \ldots, \Delta_t$
so that the interior of each $\Delta_i$ is intersected by at most $n/r = n^{2/3}$ lines of
L, according to the cutting lemma.
Let $P_i$ denote the points of P lying inside $\Delta_i$ or on its boundary but not at
the vertices of $\Delta_i$, and let $L_i$ be the set of lines of L intersecting the interior
of $\Delta_i$. The pairs $(L_i, P_i)$ define the desired "unbalanced" subproblems. We
have $|L_i| \le n^{2/3}$, and while the size of the $P_i$ may vary, the average $|P_i|$ is
$\frac{n}{t} \approx n^{1/3}$, which is about the square root of the size of $L_i$.
We have to be a little careful, since not all incidences of L and P are
necessarily included among the incidences of some $L_i$ and $P_i$. One exceptional
case is a point $p \in P$ not appearing in any of the $P_i$.

Such a point has to be the vertex of some $\Delta_i$, and so there are no more than
3t such exceptional points. These points have at most $I(n, 3t)$ incidences with
the lines of L. Another exceptional case is a line of L containing a side of $\Delta_i$
but not intersecting its interior and therefore not included in $L_i$, although it
may be incident with some points on the boundary of $\Delta_i$.

There are at most 3t such exceptional lines, and they have at most $I(3t, n)$
incidences with the points of P. So we have

$$I(L, P) \le I(n, 3t) + I(3t, n) + \sum_{i=1}^{t} I(L_i, P_i).$$
By Lemma 4.5.1, $I(n, 3t)$ and $I(3t, n)$ are both bounded by $O(t\sqrt{n} + n) =
O(n^{7/6}) \ll n^{4/3}$, and it remains to estimate the main term. We have $|L_i| \le
n^{2/3}$ and $\sum_{i=1}^{t} |P_i| \le 2n$, since each point of P goes into at most two $P_i$.
Using the bound (4.2) for each $I(L_i, P_i)$ we obtain
$$\sum_{i=1}^{t} I(L_i, P_i) \le \sum_{i=1}^{t} I(n^{2/3}, |P_i|) = \sum_{i=1}^{t} O\bigl(|P_i|\, n^{1/3} + n^{2/3}\bigr) = O(n^{4/3}).$$
This finally shows that $I(n, n) = O(n^{4/3})$. □
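The "magic" value r = n^(1/3) can be understood by tracking exponents only. In the sketch below (an illustration under assumptions, with a hypothetical function name) we put r = n^rho, so that each subproblem has at most n^(1-rho) lines and there are t = n^(2 rho) triangles; the main term is the sum of O(|P_i| sqrt(n/r) + n/r) with the |P_i| summing to at most 2n, and the exceptional term is O(t sqrt(n) + n). Constants are ignored throughout.

```python
from fractions import Fraction

def cutting_exponents(rho):
    """Exponents of n in the main and exceptional terms when r = n^rho.

    main:        n * n^((1 - rho)/2)  +  n^(2 rho) * n^(1 - rho)
    exceptional: n^(2 rho) * n^(1/2)  +  n
    """
    rho = Fraction(rho)
    main = max(1 + (1 - rho) / 2, 2 * rho + (1 - rho))
    exceptional = max(2 * rho + Fraction(1, 2), Fraction(1))
    return main, exceptional

# Search a small grid of candidate values of rho for the best overall exponent.
candidates = [Fraction(i, 12) for i in range(0, 13)]
best = min(candidates, key=lambda rho: max(cutting_exponents(rho)))
```

On this grid the minimum is attained at rho = 1/3, where the main term has exponent 4/3 and the exceptional term has exponent 7/6, in agreement with the calculation in the proof.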


Bibliography and remarks. The bound in Lemma 4.5.1 using
excluded $K_{2,2}$ is due to Erdős [Erd46].
Determining the maximum possible number of edges in a $K_{r,s}$-free
bipartite graph with given sizes of the two color classes is known
as the Zarankiewicz problem. The general upper bound given in the
text was shown by Kővári, Sós, and Turán [KST54]. For a long time,
matching lower bounds (constructions) were known only for $r \le 3$
and all $s \ge r$ (in these cases, even the constant in the leading term
is known exactly; see Füredi [Für96] for some of these results and
references). In particular, $K_{2,2}$-free graphs on n vertices with $\Omega(n^{3/2})$
edges are provided by incidence graphs of finite projective planes, and
$K_{3,3}$-free graphs on n vertices with $\Omega(n^{5/3})$ edges were obtained by
Brown [Bro66]. His construction is the "distance-k graph" in the
3-dimensional affine space over finite fields of order $q \equiv -1 \pmod 4$, for
a suitable k = k(q). Recently, Kollár, Rónyai, and Szabó [KRS96]
constructed asymptotically optimal $K_{r,s}$-free graphs for s very large
compared to r, namely $s \ge r! + 1$, using results of algebraic geometry.
This was slightly improved by Alon, Rónyai, and Szabó [ARS99] to
$s \ge (r-1)! + 1$. They also obtained an alternative to Brown's construction
of $K_{3,3}$-free graphs with a better constant, and asymptotically tight
lower bounds for some "asymmetric" instances of the Zarankiewicz
problem, where one wants a $K_{r,s}$-free bipartite graph with color classes
of sizes n and m (with the "orientation" of the forbidden $K_{r,s}$ fixed).
The approach to incidence problems using cuttings first appeared
in a seminal paper of Clarkson, Edelsbrunner, Guibas, Sharir, and
Welzl [CEG+90], based on probabilistic methods developed in
computational geometry ([Cla87], [HW87], and [CS89] are among the most
influential papers in this development). Clarkson et al. did not use
cuttings in our sense but certain "cuttings on the average." Namely,
if $n_i$ is the number of lines intersecting the interior of $\Delta_i$, then their
cuttings have $t = O(r^2)$ triangles and satisfy $\sum_{i=1}^{t} n_i^c \le C(c) \cdot r^2 (n/r)^c$,
where $c \ge 1$ is an integer constant, which can be selected as needed
for each particular application, and C(c) is a constant depending on
c. This means that the cth degree average of the $n_i$ is, up to a constant,
the same as if all the $n_i$ were $O(n/r)$. Technically, these "cuttings
on the average" can replace the optimal $\frac{1}{r}$-cuttings in most applications.
Clarkson et al. [CEG+90] proved numerous results on various
incidence problems and many-cells problems by this method; see the
notes to Section 4.1.
The cutting lemma was first proved by Chazelle and Friedman
[CF90] and, independently, by Matoušek [Mat90a]. The former proof
yields an optimal cutting lemma in every fixed dimension and will be
discussed in Section 6.5, while the latter proof applies only to planar
cutting and is presented in Section 4.7. A third, substantially different,
proof was discovered by Chazelle [Cha93a].
Yet another proof of the Szemerédi-Trotter theorem was recently
found by Aronov and Sharir (it is a simplification of the techniques
in [AS01a]). It is based on the case d = 2 of the following partition
theorem of Matoušek [Mat92]: For every n-point set $X \subset \mathbf{R}^d$, d fixed,
and every r, $1 \le r \le n$, there exists a partition $X = X_1 \cup X_2 \cup \cdots \cup X_t$,
$t = O(r)$, such that $\frac{n}{r} \le |X_i| \le \frac{2n}{r}$ for all i and no hyperplane h crosses
more than $O(r^{1-1/d})$ of the sets $X_i$. Here h crossing $X_i$ means that $X_i$
is not completely contained in one of the open half-spaces defined by
h or in h itself.¹ This result is proved using the d-dimensional cutting
lemma (see Section 4.6). The bound $O(r^{1-1/d})$ is asymptotically the
best possible in general.
To use this result for bounding I(L, P), where L is a set of n lines
and P a set of n points in the plane, we let $X = \mathcal{D}_0(L)$ be the set of
points dual to the lines of L (see Section 5.1). We apply the partition
theorem to X with $r = n^{2/3}$ and dualize back, which yields a partition
$L = L_1 \cup L_2 \cup \cdots \cup L_t$, $t = O(r)$, with $|L_i| \approx \frac{n}{r} = n^{1/3}$. The crossing
condition implies that no point p is incident to lines from more than
$O(\sqrt{r})$ of the $L_i$, not counting the pathological $L_i$ where p is common
to all the lines of $L_i$.
We consider the incidences of a point $p \in P$ with the lines of $L_i$.
The i where p lies on at most one line of $L_i$ contribute at most $O(\sqrt{r})$
incidences, which gives a total of $O(n\sqrt{r}) = O(n^{4/3})$ for all $p \in P$.
On the other hand, if p lies on at least two lines of $L_i$ then it is a
vertex of the arrangement of $L_i$. As is easy to show, the number of
incidences of k lines with the vertices of their arrangement is $O(k^2)$
(Exercise 6.1.6), and so the total contribution from these cases is
$O(\sum |L_i|^2) = O(n^2/r) = O(n^{4/3})$. This proves the balanced case of
Szemerédi-Trotter, and the unbalanced case works in the same way
with an appropriate choice of r. Unlike the previous proofs, this one
does not directly apply with pseudolines instead of lines.
Improved point-circle incidences. A similar method also proves that
$I_{\text{circ}}(n, n) = O(n^{1.4})$ (see Exercise 4.4.2 for another proof). Circles
are dualized to points and points to surfaces of cones in $\mathbf{R}^3$, and the
appropriate partition theorem holds as well, with no surface of a cone
crossing more than $O(r^{2/3})$ of the subsets $X_i$.
Aronov and Sharir [AS01a] improved the bound to $I_{\text{circ}}(m, n) =
O(m^{2/3} n^{2/3} + m)$ for large m, namely $m \ge n^{(5-3\varepsilon)/(4-9\varepsilon)}$, and to
$I_{\text{circ}}(m, n) = O(m^{(6+3\varepsilon)/11} n^{(9-\varepsilon)/11} + n)$ for the smaller m (here, as
usual, $\varepsilon > 0$ can be chosen arbitrarily small, influencing the constants
¹ A slightly stronger result is proved in [Mat92]: For every $X_i$ we can choose
a relatively open simplex $\sigma_i \supseteq X_i$, and no h crosses more than $O(r^{1-1/d})$ of
the $\sigma_i$.
of proportionality). Agarwal et al. [AAS01] obtained almost the same
bounds for the maximum complexity of m cells in an arrangement of
n circles.
A key ingredient in the Aronov-Sharir proof is a set of results on the
following question of independent interest. Given a family of n curves
in the plane, into how many pieces ("pseudosegments") must we cut
them, in the worst case, so that no two pieces intersect more than once?
This problem, first studied by Tamaki and Tokuyama [TT98], will be
briefly discussed in the notes to Section 11.1. For the curves being
circles, Aronov and Sharir [AS01a] obtained the estimate $O(n^{3/2+\varepsilon})$,
improving on several previous results.
To bound the number I(P, C) of incidences of an m-point set P and
a set C of n circles, we delete the circles containing at most 2 points, we
cut the circles into $O(n^{3/2+\varepsilon})$ pieces as above, and we define a graph
with vertex set P and with edges being the circle arcs that connect
consecutive points along the pieces. The number of edges is at least
$I(P, C) - O(n^{3/2+\varepsilon})$. The crossing number theorem applies (since the
graph is simple) and yields $I(P, C) = O(m^{2/3} n^{2/3} + n^{3/2+\varepsilon})$, which is
tight for m about $n^{5/4}$ and larger. For smaller m, Aronov and Sharir
use the method with partition in the dual space outlined above to
divide the original problem into smaller subproblems, and for these
they use the bound just mentioned.

Exercises
1. Let $I_{1\text{circ}}(m, n)$ be the maximum number of incidences between m points
and n unit circles in the plane. Prove that $I_{1\text{circ}}(m, n) = O(m\sqrt{n} + n)$ by
the method of Lemma 4.5.1.
2. Let $I_{\text{circ}}(m, n)$ be the maximum possible number of incidences between
m points and n arbitrary circles in the plane. Prove that $I_{\text{circ}}(m, n) =
O(n\sqrt{m} + m)$ and $I_{\text{circ}}(m, n) = O(mn^{2/3} + n)$.

4.6 A Weaker Cutting Lemma


Here we prove a version of the cutting lemma (Lemma 4.5.3) with a slightly
worse bound on the number of the $\Delta_i$. The proof uses the probabilistic
method and the argument is very simple and general. We will improve on
it later and obtain tight bounds in a more general setting in Section 6.5. In
Section 4.7 below we give another, self-contained, elementary geometric proof
of the planar cutting lemma.
Here we are going to prove that every set of n lines admits a $\frac{1}{r}$-cutting
consisting of $O(r^2 \log^2 n)$ triangles. But first let us see why at least $\Omega(r^2)$
triangles are necessary.
A lower bound. Consider n lines in general position. Their arrangement
has, as we know, $\binom{n}{2} + n + 1 \ge n^2/2$ cells. On the other hand, considering
a triangle $\Delta_i$ whose interior is intersected by $k \le \frac{n}{r}$ lines ($k \ge 1$), we see
that $\Delta_i$ is divided into at most $\binom{k}{2} + k + 1 \le 2k^2$ cells. Since each cell of the
arrangement has to show up in the interior of at least one triangle $\Delta_i$, the
number of triangles is at least $n^2/4k^2 = \Omega(r^2)$. Hence the cutting lemma is
asymptotically optimal for $r \to \infty$.
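The two elementary inequalities behind this lower bound, and the resulting count, are easy to check numerically. The following sketch uses hypothetical function names and assumes, for simplicity, that r divides n so that k = n/r is an integer.

```python
from math import comb

def cells(n):
    """Number of cells in an arrangement of n lines in general position."""
    return comb(n, 2) + n + 1

def min_triangles(n, r):
    """Lower bound on the number of triangles in a (1/r)-cutting of n lines in
    general position: each triangle meets at most cells(k) cells, k = n/r."""
    k = n // r                         # assumes r divides n
    return -(-cells(n) // cells(k))    # ceiling of cells(n) / cells(k)

# The two inequalities used in the text, checked on a range of values.
first_ok = all(2 * cells(n) >= n * n for n in range(1, 200))
second_ok = all(cells(k) <= 2 * k * k for k in range(1, 200))
```

For example, with n = 1000 and r = 10 the function `min_triangles` returns a value that is at least r^2/4 = 25, as the bound n^2/4k^2 requires.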
Proof of a weaker version of the cutting lemma (Lemma 4.5.3). We
select a random sample $S \subseteq L$ of the given lines. We make s independent
random draws, drawing a random line from L each time. These are draws
with replacement: One line can be selected several times, and so S may have
fewer than s lines.
Consider the arrangement of S. Partition the cells that are not (generalized)
triangles by adding some suitable diagonals, as illustrated below:

[Figure: lines of S, lines of L \ S, and an added diagonal.]

This creates (generalized) triangles $\Delta_1, \Delta_2, \ldots, \Delta_t$ with $t = O(s^2)$ (since we
have a drawing of a planar graph with $\binom{s}{2} + 1$ vertices; also see Exercise 2).

4.6.1 Lemma. For $s = 6r \ln n$, the following holds with a positive probability:
The $\Delta_i$ form a $\frac{1}{r}$-cutting for L; that is, the interior of no $\Delta_i$ is intersected
by more than $\frac{n}{r}$ lines of L.

This implies the promised weaker version of the cutting lemma: Since the
probability of the sample S being good is positive, there exists at least one
good S that yields the desired collection of triangles.
Proof of Lemma 4.6.1. Let us say that a triangle T is dangerous if its
interior is intersected by at least $k = \frac{n}{r}$ lines of L. We fix some arbitrary
dangerous triangle T. What is the probability that no line of the sample S
intersects the interior of T? We select a random line s times. The probability
that we never hit one of the k lines intersecting the interior of T is at most
72 Chapter 4: Incidence Problems

(1 - k/n)s. Using the well-known inequality l+x ::; eX, we can bound this
probability by e- ks / n = e- 61nn = n- 6 •
Call a triangle T interesting (for L) if it can appear in a triangulation for
some sample S S;; L. Any interesting triangle has vertices at some three ver-
tices of the arrangement of L, and hence there are fewer than n 6 interesting
triangles. 2 Therefore, with a positive probability, a random sample S inter-
sects the interiors of all the dangerous interesting triangles simultaneously.
In particular, none of the triangles ~i appearing in the triangulation of such
a sample S can be dangerous. This proves Lemma 4.6.1. D
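The estimate in the proof can be checked numerically for specific n and r. This is only a sketch under stated assumptions: the function names are invented for illustration, the sample size is rounded up to an integer, lines are represented by arbitrary hashable labels, and k is rounded up to the smallest integer that is at least n/r.

```python
import math
import random


def sample_size(n: int, r: float) -> int:
    """Sample size s = 6 r ln n from Lemma 4.6.1, rounded up."""
    return math.ceil(6 * r * math.log(n))


def draw_sample(lines, s, rng):
    """s independent draws with replacement; duplicates collapse in the set."""
    return set(rng.choices(lines, k=s))


def miss_probability_bound(n: int, r: float) -> float:
    """Upper bound (1 - k/n)^s on missing a fixed dangerous triangle."""
    k = math.ceil(n / r)
    s = sample_size(n, r)
    return (1 - k / n) ** s
```

With n = 1000 and r = 10 the bound comes out below n^{-6}, in line with the inequality used in the proof, and the sample may indeed contain fewer than s distinct lines.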

More sophisticated probabilistic reasoning shows that it is sufficient to
choose $s = \mathrm{const} \cdot r \log r$ in Lemma 4.6.1, instead of $\mathrm{const} \cdot r \log n$, and still,
with a positive probability no interesting dangerous triangle is missed by S
(see Section 6.5 and also Exercise 10.3.4). This improvement is important for
r small, say constant: It shows that the number of triangles in a $\frac{1}{r}$-cutting
can be bounded independent of n.
To prove the asymptotically tight bound $O(r^2)$ by a random sampling
argument seems considerably more complicated and we will discuss this in
Section 6.5.

Bibliography and remarks. The ideas in the above proof of the
weaker cutting lemma can be traced back at least to early papers of
Clarkson (such as [Cla87]) on random sampling in computational
geometry. The presented proof was constructed ex post facto for didactic
purposes; the cutting lemma was first proved, as far as I know, in a
stronger form (with $\log r$ instead of $\log n$).

Exercises
1. Calculate the exact expected size of S, a sample drawn from n elements
by s independent random draws with replacement.
2. Calculate the number of (generalized) triangles arising by triangulating
an arrangement of n lines in the plane in general position. (First, specify
how exactly the unbounded cells are triangulated.)
3. (A cutting lemma for circles) Consider a set K of n circles in the plane.
Select a sample $S \subseteq K$ by s independent random draws with replacement.
Consider the arrangement of S, and construct its vertical decomposition;
that is, from each vertex extend vertical segments upwards and down-
wards until they hit a circle of S (or all the way to infinity). Similarly
extend vertical segments from the leftmost and rightmost points of each
circle.

² The unbounded triangles have only 1 or 2 vertices, but they are completely
determined by their two unbounded rays, and so their number is at most $n^2$.

(a) Show that this partitions the plane into $O(s^2)$ "circular trapezoids"
(shapes bounded by at most two vertical segments and at most two
circular arcs). [2]
(b) Show that for $s = Cr \ln n$ with a sufficiently large constant C, there
is a positive probability that the sample S intersects all the dangerous
interesting circular trapezoids, where "dangerous" and "interesting" are
defined analogously to the definitions in the proof of the weaker version
of the cutting lemma.
4. Using Exercises 3 and 4.5.1, show that the number of unit distances
determined by n points in the plane is $O(n^{4/3} \log^{2/3} n)$.
5. Using Exercises 3 and 4.5.2, show that $I_{\mathrm{circ}}(n, n) = O(n^{1.4} \log^c n)$ (for
some constant c), where $I_{\mathrm{circ}}(m, n)$ is the maximum possible number of
incidences between m points and n arbitrary circles in the plane.

4.7 The Cutting Lemma: A Tight Bound


Here we prove the cutting lemma in full strength. The proof is simple and
elementary, but it does not seem to generalize to higher-dimensional situa-
tions.
For simplicity, we suppose that the given set L of n lines is in general
position. (If not, perturb it slightly to get general position, construct the
$\frac{1}{r}$-cutting, and perturb back; this gives a $\frac{1}{r}$-cutting for the original collection of
lines; we omit the details.) First we need some definitions and observations
concerning levels.
Levels and their simplifications. Let L be a fixed finite set of lines in
the plane; we assume that no line of L is vertical. The level of a point $x \in \mathbf{R}^2$
is defined as the number of lines of L lying strictly below x.
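The definition of the level translates directly into a short computation. The sketch below is illustrative only: it assumes each nonvertical line is stored as a pair (m, b) meaning y = m x + b, a representation chosen here and not taken from the text.

```python
def level(point, lines):
    """Number of lines lying strictly below the point.

    point: pair (x, y); lines: iterable of (m, b) encoding y = m*x + b.
    A line passing through the point is not counted.
    """
    x, y = point
    return sum(1 for m, b in lines if m * x + b < y)
```

For the three lines y = 0, y = x, and y = -x, the point (0, 1) has level 3, while the point (2, 0), which lies on the first line, has level 1.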
We note that the level of all points of an (open) cell of the arrangement of
L is the same, and similarly for a (relatively open) edge. On the other hand,
the level of an edge can differ from the levels of its endpoints, for example.
We define the level k of the arrangement of L, where $0 \le k < n$, as the set
$E_k$ of all edges of the arrangement of L having level exactly k. These edges
plus their endpoints form an x-monotone polygonal line, where x-monotone
means that each vertical line intersects it at exactly one point.
It is easy to see that the level k makes a turn at each endpoint of its
edges; it looks somewhat like this:

The level k is drawn thick, while the thin segments are pieces of lines of L
and they do not belong to the level k.
Let $e_0, e_1, \ldots, e_t$ be the edges of $E_k$ numbered from left to right; $e_0$ and
$e_t$ are the unbounded rays. Let us fix a point $p_i$ in the interior of $e_i$. For an
integer parameter $q \ge 2$, we define the q-simplification of the level k as the
monotone polygonal line containing the left part of $e_0$ up to the point $p_0$, the
segments $p_0p_q, p_qp_{2q}, \ldots, p_{\lfloor (t-1)/q \rfloor q}\,p_t$, and the part of $e_t$ to the right of $p_t$.
Thus, the q-simplification has at most $\frac{t}{q}+2$ edges. Here is an illustration for
t = 9, q = 4:

(We could have defined the q-simplification by connecting every qth vertex
of the level, but the present way makes some future considerations a little
simpler.)
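To see which of the points $p_0, \ldots, p_t$ a q-simplification actually uses, one can list their indices. The helper below is a hypothetical sketch (its name and the requirement t >= 1 are assumptions made here); it only reproduces the index pattern 0, q, 2q, ..., floor((t-1)/q) q, t from the definition.

```python
def simplification_indices(t: int, q: int) -> list:
    """Indices i of the points p_i joined by the q-simplification.

    The breakpoints are 0, q, 2q, ..., floor((t-1)/q)*q, followed by t.
    """
    if q < 2 or t < 1:
        raise ValueError("this sketch assumes q >= 2 and t >= 1")
    last_multiple = ((t - 1) // q) * q
    return list(range(0, last_multiple + 1, q)) + [t]
```

For the illustrated case t = 9, q = 4 this yields the indices 0, 4, 8, 9, that is, the segments $p_0p_4$, $p_4p_8$, and $p_8p_9$ between the two unbounded rays.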
4.7.1 Lemma.
(i) The portion $\Pi$ of the level k (considered as a polygonal line) between the
points $p_j$ and $p_{j+q}$ is intersected by at most q+1 lines of L.
(ii) The segment $p_jp_{j+q}$ is intersected by at most q+1 lines of L.
(iii) The q-simplification of the level k is contained in the strip between the
levels $k - \lceil q/2 \rceil$ and $k + \lceil q/2 \rceil$.

Proof. Part (i) is obvious: Each line of L intersecting $\Pi$ contains one of
the edges $e_j, e_{j+1}, \ldots, e_{j+q}$. As for (ii), $\Pi$ is connected, and hence all lines
intersecting its convex hull must intersect $\Pi$ itself as well. The segment $p_jp_{j+q}$
is contained in $\mathrm{conv}(\Pi)$.
Concerning (iii), imagine walking along some segment $p_jp_{j+q}$ of the
q-simplification. We start at an endpoint, which has level k. Our current level
may change only as we cross lines of L. Moreover, having traversed the whole
segment we must be back to level k. Thus, to get from level k to k + i and
back to k we need to cross at least 2i lines on the way. From this and (ii),
$2i \le q+1$, and hence $i \le \lfloor (q+1)/2 \rfloor = \lceil q/2 \rceil$. □

Proof of the cutting lemma for lines in general position. Let r be the
given parameter. If $r = \Omega(n)$, then it suffices to produce a 0-cutting of size
$O(n^2)$ by simply triangulating the arrangement of L. Hence we may assume
that r is much smaller than n.
Set $q = \lceil n/10r \rceil$. Divide the levels $E_0, E_1, \ldots, E_{n-1}$ into q groups: The
ith group contains all $E_j$ with j congruent to i modulo q ($i = 0, 1, \ldots, q-1$).
Since the total number of edges in the arrangement is $n^2$, there is an i such
that the ith group contains at most $n^2/q$ edges. We fix one such i; from now
on, we consider only the levels $i, q+i, 2q+i, \ldots$, and we construct the desired
$\frac{1}{r}$-cutting from them.
Let $P_j$ be the q-simplification of the level $jq+i$. If $E_{jq+i}$ has $m_j$ edges,
then $P_j$ has at most $m_j/q + 3$ edges, and the total number of edges of the $P_j$,
$j = 0, 1, \ldots, \lfloor (n-1)/q \rfloor$, can be estimated by $n^2/q^2 + 3(n/q+1) = O(n^2/q^2)$.
We note that the polygonal chains $P_j$ never intersect properly: If they did, a
vertex of some $P_j$, which has level $qj + i$, would be above $P_{j+1}$, and this is
ruled out by Lemma 4.7.1(iii).
We form the vertical decomposition for the $P_j$; that is, we extend vertical
segments from each vertex of $P_j$ upwards and downwards until they hit $P_{j-1}$
and $P_{j+1}$:

This subdivides the plane into $O(n^2/q^2) = O(r^2)$ trapezoids.


We claim that each such trapezoid is intersected by at most $\frac{n}{r}$ lines of L.
We look at a trapezoid in the strip between $P_j$ and $P_{j+1}$. By Lemma 4.7.1(iii),
it lies between the levels $qj + i - \lceil q/2 \rceil$ and $q(j+1) + i + \lceil q/2 \rceil$, and therefore,
each of its vertical sides is intersected by no more than 3q lines. The bottom
side is a part of an edge of $P_j$, and consequently, it is intersected by no
more than q+1 lines; similarly for the top side. Hence the number of lines
intersecting the considered trapezoid is certainly at most $10q \le \frac{n}{r}$. (A more
careful analysis shows that one trapezoid is in fact intersected by at most
$2q + O(1)$ lines; see Exercise 1.)
Finally, a $\frac{1}{r}$-cutting can be obtained by subdividing each trapezoid into
two triangles by a diagonal. But let us remark that for applications of
$\frac{1}{r}$-cuttings, trapezoids are usually as good as triangles. □

Bibliography and remarks. The basic ideas of the presented proof
are from [Mat90a], and the presentation basically follows [Mat98].
The latter paper provides some estimates for the number of triangles
or trapezoids in a $\frac{1}{r}$-cutting, as $r \to \infty$: For example, at least
$2.54(1 - o(1))r^2$ trapezoids are sometimes necessary, and $8(1 + o(1))r^2$
trapezoids always suffice. The notion of levels and their simplifications,
as well as Lemma 4.7.1, are due to Edelsbrunner and Welzl [EW86].

Exercises
1. (a) Verify that each trapezoid arising in the described construction is
intersected by at most $2.5q + O(1)$ lines. Setting q appropriately, show that
the plane can be subdivided into $12.5r^2 + O(r)$ trapezoids, each intersected
by at most $\frac{n}{r}$ lines, assuming $1 \ll r \ll n$.
(b) Improve the bounds from (a) to $2q + O(1)$ and $8r^2 + O(r)$, respectively.
5

Convex Polytopes

Convex polytopes are convex hulls of finite point sets in $\mathbf{R}^d$. They constitute
the most important class of convex sets with an enormous number of appli-
cations and connections.
Three-dimensional convex polytopes, especially the regular ones, have
fascinated people since antiquity. Their investigation was one of
the main sources of the theory of planar graphs, and thanks to this well-
developed theory they are quite well understood. But convex polytopes in
dimension 4 and higher are considerably more challenging, and a surprisingly
deep theory, mainly of algebraic nature, was developed in attempts to under-
stand their structure.
A strong motivation for the study of convex polytopes comes from prac-
tically significant areas such as combinatorial optimization, linear program-
ming, and computational geometry. Let us look at a simple example illus-
trating how polytopes can be associated with combinatorial objects. The
3-dimensional polytope in the picture
[Figure: the 3-dimensional permutahedron; the visible vertices carry labels such as 2341, 1342, 4213, 3214.]
is called the permutahedron. Although it is 3-dimensional, it is most
naturally defined as a subset of $\mathbf{R}^4$, namely, the convex hull of the 24 vectors
obtained by permuting the coordinates of the vector (1,2,3,4) in all possible
ways. In the picture, the (visible) vertices are labeled by the corresponding
permutations. Similarly, the d-dimensional permutahedron is the convex
hull of the $(d+1)!$ vectors in $\mathbf{R}^{d+1}$ arising by permuting the coordinates
of $(1, 2, \ldots, d+1)$. One can observe that the edges of the polytope connect
exactly pairs of permutations differing by a transposition of two adjacent
numbers, and a closer examination reveals other connections between the
structure of the permutahedron and properties of permutations.
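The vertex set of the permutahedron and the pairs of permutations described above are easy to enumerate. The sketch below is illustrative (function names are invented here); it takes the stated description of the edges as given and merely lists the pairs that differ by exchanging two adjacent numbers k and k+1, without verifying the geometry.

```python
from itertools import permutations


def permutahedron_vertices(d: int):
    """All (d+1)! permutations of (1, 2, ..., d+1), as coordinate tuples."""
    return list(permutations(range(1, d + 2)))


def swap_adjacent_values(v, k):
    """Exchange the values k and k+1 wherever they occur in v."""
    return tuple(k + 1 if x == k else k if x == k + 1 else x for x in v)


def permutahedron_edges(d: int):
    """Unordered pairs of permutations differing by swapping values k, k+1."""
    edges = set()
    for v in permutahedron_vertices(d):
        for k in range(1, d + 1):
            edges.add(frozenset((v, swap_adjacent_values(v, k))))
    return edges
```

For d = 3 this gives 24 vertices, all with coordinate sum 10 (so they lie in a common hyperplane of $\mathbf{R}^4$), and 36 such pairs, each at squared distance 2.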
There are many other, more sophisticated, examples of convex polytopes
assigned to combinatorial and geometric objects such as graphs, partially or-
dered sets, classes of metric spaces, or triangulations of a given point set. In
many cases, such convex polytopes are a key tool for proving hard theorems
about the original objects or for obtaining efficient algorithms. Two impres-
sive examples are discussed in Chapter 12, and several others are scattered
in other chapters.
The present chapter should convey some initial working knowledge of
convex polytopes for a nonpolytopist. It is just a small sample of an extensive
theory. A more comprehensive modern introduction is the book by Ziegler
[Zie94].

5.1 Geometric Duality


First we discuss geometric duality, a simple technical tool indispensable in
the study of convex polytopes and handy in many other situations. We begin
with a simple motivating question.
How can we visualize the set of all lines intersecting a convex pentagon
as in the picture?

A suitable way is provided by line-point duality.


5.1.1 Definition (Duality transform). The (geometric) duality transform
is a mapping denoted by $\mathcal{D}_0$. To a point $a \in \mathbf{R}^d \setminus \{0\}$ it assigns the hyperplane
$\mathcal{D}_0(a) = \{x \in \mathbf{R}^d : \langle a, x \rangle = 1\}$,
and to a hyperplane h not passing through the origin, which can be uniquely
written in the form $h = \{x \in \mathbf{R}^d : \langle a, x \rangle = 1\}$, it assigns the point
$\mathcal{D}_0(h) = a \in \mathbf{R}^d \setminus \{0\}$.

Here is the geometric meaning of the duality transform. If a is a point
at distance $\delta$ from 0, then $\mathcal{D}_0(a)$ is the hyperplane perpendicular to the line
0a and intersecting that line at distance $\frac{1}{\delta}$ from 0, in the direction from 0
towards a.
[Figure: a point a and its dual hyperplane $\mathcal{D}_0(a)$.]
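The geometric meaning can be confirmed by a direct computation: the point of the hyperplane {x : <a, x> = 1} closest to the origin is a divided by its squared length. The following is a small sketch with an invented function name, using plain tuples for points.

```python
import math


def foot_of_dual_hyperplane(a):
    """Point of the hyperplane {x : <a, x> = 1} nearest to the origin.

    It equals a / |a|^2, so it lies on the ray from 0 towards a.
    """
    norm_sq = sum(c * c for c in a)
    if norm_sq == 0:
        raise ValueError("the origin has no dual hyperplane")
    return tuple(c / norm_sq for c in a)
```

For a = (3, 4), at distance 5 from the origin, the foot is (0.12, 0.16), which is at distance 1/5 from the origin, as the description above predicts.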
A nice interpretation of duality is obtained by working in $\mathbf{R}^{d+1}$ and
identifying the "primal" $\mathbf{R}^d$ with the hyperplane $\pi = \{x \in \mathbf{R}^{d+1} : x_{d+1} = 1\}$
and the "dual" $\mathbf{R}^d$ with the hyperplane $\rho = \{x \in \mathbf{R}^{d+1} : x_{d+1} = -1\}$. The
hyperplane dual to a point $a \in \pi$ is produced as follows: We construct the
hyperplane in $\mathbf{R}^{d+1}$ perpendicular to 0a and containing 0, and we intersect
it with $\rho$. Here is an illustration for d = 2:

[Figure: illustration for d = 2, showing the hyperplane $\pi$.]

In this way, the duality $\mathcal{D}_0$ can be naturally extended to k-flats in $\mathbf{R}^d$, whose
duals are $(d-k-1)$-flats. Namely, given a k-flat $f \subset \pi$, we consider the
$(k+1)$-flat F through 0 and f, we construct the orthogonal complement of F, and
we intersect it with $\rho$, obtaining $\mathcal{D}_0(f)$.
Let us consider the pentagon drawn above and place it so that the origin
lies in the interior. Let $v_i = \mathcal{D}_0(\ell_i)$, where $\ell_i$ is the line containing the side
$a_ia_{i+1}$. Then the points dual to the lines intersecting the pentagon $a_1a_2 \ldots a_5$
fill exactly the exterior of the convex pentagon $v_1v_2 \ldots v_5$:

This follows easily from the properties of duality listed below (of course, there
is nothing special about a pentagon here). Thus, the considered set of lines
can be nicely described in the dual plane. A similar passage from lines to
points or back is useful in many geometric or computational problems.
Properties of the duality transform. Let p be a point of $\mathbf{R}^d$ distinct
from the origin and let h be a hyperplane in $\mathbf{R}^d$ not containing the origin.
Let $h^-$ stand for the closed half-space bounded by h and containing the
origin, while $h^+$ denotes the other closed half-space bounded by h. That is,
if $h = \{x \in \mathbf{R}^d : \langle a, x \rangle = 1\}$, then $h^- = \{x \in \mathbf{R}^d : \langle a, x \rangle \le 1\}$.

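5.1.2 Lemma (Duality preserves incidences).
(i) $p \in h$ if and only if $\mathcal{D}_0(h) \in \mathcal{D}_0(p)$.
(ii) $p \in h^-$ if and only if $\mathcal{D}_0(h) \in \mathcal{D}_0(p)^-$.

Proof. (i) Let $h = \{x \in \mathbf{R}^d : \langle a, x \rangle = 1\}$. Then $p \in h$ means $\langle a, p \rangle = 1$.
Now, $\mathcal{D}_0(h)$ is the point a, and $\mathcal{D}_0(p)$ is the hyperplane $\{y \in \mathbf{R}^d : \langle y, p \rangle = 1\}$,
and hence $\mathcal{D}_0(h) = a \in \mathcal{D}_0(p)$ also means just $\langle a, p \rangle = 1$. Part (ii) is proved
similarly. □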

5.1.3 Definition (Dual set). For a set $X \subseteq \mathbf{R}^d$, we define the set dual to
X, denoted by $X^*$, as follows:

$X^* = \{y \in \mathbf{R}^d : \langle x, y \rangle \le 1 \text{ for all } x \in X\}$.

Another common name used for the duality is polarity; the dual set would
then be called the polar set. Sometimes it is denoted by $X^\circ$.
Geometrically, $X^*$ is the intersection of all half-spaces of the form $\mathcal{D}_0(x)^-$
with $x \in X$. Or in other words, $X^*$ consists of the origin plus all points y
such that $X \subseteq \mathcal{D}_0(y)^-$. For example, if X is the pentagon $a_1a_2 \ldots a_5$ drawn
above, then $X^*$ is the pentagon $v_1v_2 \ldots v_5$.
For any set X, the set $X^*$ is obviously closed and convex and contains the
origin. Using the separation theorem (Theorem 1.2.4), it is easily shown that
for any set $X \subseteq \mathbf{R}^d$, the set $(X^*)^*$ is the closure of $\mathrm{conv}(X \cup \{0\})$. In particular,
for a closed convex set containing the origin we have $(X^*)^* = X$ (Exercise 3).
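For a finite set X, membership in the dual set is a finite list of inequalities, one per point of X. The sketch below (an illustration with an invented function name, points given as tuples) tests the defining condition directly.

```python
def in_dual_set(y, X):
    """True if <x, y> <= 1 for every x in the finite set X."""
    return all(sum(xi * yi for xi, yi in zip(x, y)) <= 1 for x in X)
```

As a usage example, take X to be the three points (1, 0), (0, 1), (-1, -1). The origin always passes the test, the point (1, 1) passes with equality in two of the inequalities, and (2, 0) fails because its inner product with (1, 0) is 2.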
For a hyperplane h, the dual set $h^*$ is different from the point $\mathcal{D}_0(h)$.¹
For readers familiar with the duality of planar graphs, let us remark that
it is closely related to the geometric duality applied to convex polytopes in
R3. For example, the next drawing illustrates a planar graph and its dual
graph (dashed):

¹ In the literature, however, the "star" notation is sometimes also used for the dual
point or hyperplane, so for a point p, the hyperplane $\mathcal{D}_0(p)$ would be denoted by
$p^*$, and similarly, $h^*$ may stand for $\mathcal{D}_0(h)$.

Later we will see that these are graphs of the 3-dimensional cube and of
the regular octahedron, which are polytopes dual to each other in the sense
defined above. A similar relation holds for all 3-dimensional polytopes and
their graphs.
Other variants of duality. The duality transform $\mathcal{D}_0$ defined above is just
one of a class of geometric transforms with similar properties. For some
purposes, other such transforms (dualities) are more convenient. A particularly
important duality, denoted by $\mathcal{D}$, corresponds to moving the origin to the
"minus infinity" of the $x_d$-axis (the $x_d$-axis is considered vertical). A formal
definition is as follows.
5.1.4 Definition (Another duality). A nonvertical hyperplane h can be
uniquely written in the form $h = \{x \in \mathbf{R}^d : x_d = a_1x_1 + \cdots + a_{d-1}x_{d-1} - a_d\}$.
We set $\mathcal{D}(h) = (a_1, \ldots, a_{d-1}, a_d)$. Conversely, the point $a = (a_1, \ldots, a_{d-1}, a_d)$
maps back to h.
The property (i) of Lemma 5.1.2 holds for this $\mathcal{D}$, and an analogue of (ii)
is:
(ii') A point p lies above a hyperplane h if and only if the point $\mathcal{D}(h)$ lies
above the hyperplane $\mathcal{D}(p)$.
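Property (ii') can be checked on examples. In the sketch below (illustrative, with an invented function name) both points and nonvertical hyperplanes are stored as coefficient tuples of length d, where a hyperplane tuple (a_1, ..., a_d) encodes x_d = a_1 x_1 + ... + a_{d-1} x_{d-1} - a_d. Under this encoding the dual of a hyperplane is the same tuple read as a point, and vice versa.

```python
def is_above(point, hyperplane):
    """True if the point lies strictly above the nonvertical hyperplane.

    hyperplane = (a_1, ..., a_d) encodes x_d = a_1*x_1 + ... + a_{d-1}*x_{d-1} - a_d.
    """
    *head, last = point
    *slopes, offset = hyperplane
    return last > sum(s * x for s, x in zip(slopes, head)) - offset
```

With p = (1, 5) and h = (2, 1), that is, the line y = 2x - 1, both is_above(p, h) and is_above(h, p) hold; the second call reads h as the dual point and p as the dual line y = x - 5.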

Exercises
1. Let $C = \{x \in \mathbf{R}^d : |x_1| + \cdots + |x_d| \le 1\}$. Show that $C^*$ is the
d-dimensional cube $\{x \in \mathbf{R}^d : \max |x_i| \le 1\}$. Picture both bodies for d = 3.
2. Prove the assertion made in the text about the lines intersecting a convex
pentagon.
3. Show that for any $X \subseteq \mathbf{R}^d$, $(X^*)^*$ equals the closure of $\mathrm{conv}(X \cup \{0\})$,
where $X^*$ stands for the dual set to X.
4. Let $C \subseteq \mathbf{R}^d$ be a convex set. Prove that $C^*$ is bounded if and only if 0
lies in the interior of C.
5. Show that $C = C^*$ if and only if C is the unit ball centered at the origin.
6. (a) Let $C = \mathrm{conv}(X) \subseteq \mathbf{R}^d$. Prove that $C^* = \bigcap_{x \in X} \mathcal{D}_0(x)^-$.
(b) Show that if $C = \bigcap_{h \in H} h^-$, where H is a collection of hyperplanes not
passing through 0, and if C is bounded, then $C^* = \mathrm{conv}\{\mathcal{D}_0(h) : h \in H\}$.
(c) What is the right analogue of (b) if C is unbounded?
7. What is the dual set $h^*$ for a hyperplane h, and what about $h^{**}$?
8. Verify the geometric interpretation of the duality $\mathcal{D}_0$ outlined in the text
(using the embeddings of $\mathbf{R}^d$ into $\mathbf{R}^{d+1}$).
9. (a) Let s be a segment in the plane. Describe the set of all points dual
to lines intersecting s. [2]
(b) Consider $n \ge 3$ segments in the plane, such that none of them contains
0 but they all lie on lines passing through 0. Show that if every 3 among
such segments can be intersected by a single line, then all the segments
can be simultaneously intersected by a line.
(c) Show that the assumption in (b) that the extensions of the segments
pass through 0 is essential: For each $n \ge 1$, construct n+1 pairwise
disjoint segments in the plane that cannot be simultaneously intersected
by a line but every n of them can (such an example was first found by
Hadwiger and Debrunner).

5.2 H-Polytopes and V-Polytopes


A convex polytope in the plane is a convex polygon. Famous examples of
convex polytopes in $\mathbf{R}^3$ are the Platonic solids: regular tetrahedron, cube,
regular octahedron, regular dodecahedron, and regular icosahedron. A convex
polytope in $\mathbf{R}^3$ is a convex set bounded by finitely many convex polygons.
Such a set can be regarded as a convex hull of a finite point set, or as an
intersection of finitely many half-spaces. We thus define two types of convex
polytopes, based on these two views.
5.2.1 Definition (H-polytope and V-polytope). An H-polyhedron is
an intersection of finitely many closed half-spaces in some $\mathbf{R}^d$. An
H-polytope is a bounded H-polyhedron.
A V-polytope is the convex hull of a finite point set in $\mathbf{R}^d$.
A basic theorem about convex polytopes claims that from the mathematical
point of view, H-polytopes and V-polytopes are equivalent.
5.2.2 Theorem. Each V-polytope is an H-polytope. Each H-polytope is a
V-polytope.
This is one of the theorems that may look "obvious" and whose proof
needs no particularly clever idea but does require some work. In the present
case, we do not intend to avoid it. Actually, we have quite a neat proof in
store, but we postpone it to the end of this section.
Although H-polytopes and V-polytopes are mathematically equivalent,
there is an enormous difference between them from the computational point
of view. That is, it matters a lot whether a convex polytope is given to
us as a convex hull of a finite set or as an intersection of half-spaces. For
example, given a set of n points specifying a V-polytope, how do we find
its representation as an H-polytope? It is not hard to come up with some
algorithm, but the problem is to find an efficient algorithm that would allow

one to handle large real-world problems. This algorithmic question is not yet
satisfactorily solved. Moreover, in some cases the number of required half-
spaces may be astronomically large compared to the number n of points, as
we will see later in this chapter.
As another illustration of the computational difference between V-po-
lytopes and H-polytopes, we consider the maximization of a given linear
function over a given polytope. For V-polytopes it is a trivial problem, since
it suffices to substitute all points of V into the given linear function and select
the maximum of the resulting values. But maximizing a linear function over
the intersection of a collection of half-spaces is the basic problem of linear
programming, and it is certainly nontrivial.
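The trivial procedure for V-polytopes is just a maximum over the given point set. A minimal sketch follows (the function name is an assumption made here; points and the linear function are coefficient tuples).

```python
def maximize_over_v_polytope(c, V):
    """Return (best_point, best_value) maximizing <c, v> over the finite set V.

    A linear function on conv(V) attains its maximum at a point of V,
    so it suffices to evaluate it on V.
    """
    def value(v):
        return sum(ci * vi for ci, vi in zip(c, v))

    best = max(V, key=value)
    return best, value(best)
```

For instance, over the 8 vertices of the cube with coordinates in {-1, 1} and c = (1, -2, 3), the maximum 6 is attained at (1, -1, 1). No comparably direct recipe applies when the polytope is given by half-spaces.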
Terminology. The usual terminology does not distinguish V-polytopes and
H-polytopes. A convex polytope means a point set in $\mathbf{R}^d$ that is a V-polytope
(and thus also an H-polytope). An arbitrary, possibly unbounded,
H-polyhedron is called a convex polyhedron. All polytopes and polyhedra considered
in this chapter are convex, and so the adjective "convex" is often omitted.
The dimension of a convex polyhedron P is the dimension of its affine hull.
It is the smallest dimension of a Euclidean space containing a congruent copy
of P.
Basic examples. One of the easiest classes of polytopes is that of cubes.
The d-dimensional cube as a point set is the Cartesian product $[-1, 1]^d$.

[Figure: cubes for d = 1, d = 2, d = 3.]

As a V-polytope, the d-dimensional cube is the convex hull of the set $\{-1, 1\}^d$
($2^d$ points), and as an H-polytope, it can be described by the inequalities
$-1 \le x_i \le 1$, $i = 1, 2, \ldots, d$, i.e., by 2d half-spaces. We note that it is also
the unit ball of the maximum norm $\|x\|_\infty = \max_i |x_i|$.
Another important example is the class of crosspolytopes (or generalized
octahedra). The d-dimensional crosspolytope is the convex hull of the
"coordinate cross," i.e., $\mathrm{conv}\{e_1, -e_1, e_2, -e_2, \ldots, e_d, -e_d\}$, where $e_1, \ldots, e_d$ are
the vectors of the standard orthonormal basis.

[Figure: crosspolytopes for d = 1, d = 2, d = 3.]

It is also the unit ball of the $\ell_1$-norm $\|x\|_1 = \sum_{i=1}^{d} |x_i|$. As an H-polytope,
it can be expressed by the $2^d$ half-spaces of the form $\langle \sigma, x \rangle \le 1$, where $\sigma$ runs
through all vectors in $\{-1, 1\}^d$.
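The two descriptions of the cube and of the crosspolytope can be written down explicitly. The sketch below uses invented function names; the membership tests implement the half-space descriptions given above, and the vertex lists implement the convex-hull descriptions.

```python
from itertools import product


def cube_vertices(d):
    """The 2^d points of {-1, 1}^d."""
    return list(product((-1, 1), repeat=d))


def crosspolytope_vertices(d):
    """The 2d points e_1, -e_1, ..., e_d, -e_d."""
    verts = []
    for i in range(d):
        for sign in (1, -1):
            verts.append(tuple(sign if j == i else 0 for j in range(d)))
    return verts


def in_cube(x):
    """H-description of the cube: -1 <= x_i <= 1 for all i (2d half-spaces)."""
    return all(-1 <= xi <= 1 for xi in x)


def in_crosspolytope(x):
    """H-description of the crosspolytope: <sigma, x> <= 1 for all sign vectors."""
    return all(
        sum(s * xi for s, xi in zip(sigma, x)) <= 1
        for sigma in product((-1, 1), repeat=len(x))
    )
```

Note the contrast in sizes: the cube has $2^d$ vertices but only 2d half-spaces, while the crosspolytope has 2d vertices but $2^d$ half-spaces.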
The polytopes with the smallest possible number of vertices (for a given
dimension) are called simplices.

5.2.3 Definition (Simplex). A simplex is the convex hull of an affinely
independent point set in some $\mathbf{R}^d$.
A d-dimensional simplex in $\mathbf{R}^d$ can also be represented as an intersection
of d+1 half-spaces, as is not difficult to check.
A regular d-dimensional simplex is the convex hull of d+1 points with all
pairs of points having equal distances.

[Figure: regular simplices for d = 0, d = 1, d = 2, d = 3.]

Unlike cubes and crosspolytopes, d-dimensional regular simplices do not have
a very nice coordinate representation in $\mathbf{R}^d$. The simplest and most useful
representation lives one dimension higher: The convex hull of the d+1 vectors
$e_1, \ldots, e_{d+1}$ of the standard orthonormal basis in $\mathbf{R}^{d+1}$ is a d-dimensional
regular simplex with side length $\sqrt{2}$.

[Figure: the convex hull of $e_1, e_2, e_3$ in $\mathbf{R}^3$, with labeled vertices (0, 0, 1) and (1, 0, 0).]
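That the standard basis vectors span a regular simplex with side length $\sqrt{2}$ is a one-line distance computation. The following sketch (function names are illustrative) checks it for a given d.

```python
import math
from itertools import combinations


def standard_simplex_vertices(d):
    """The d+1 standard basis vectors of R^(d+1)."""
    return [tuple(1 if i == j else 0 for j in range(d + 1)) for i in range(d + 1)]


def pairwise_distances(points):
    """Euclidean distances between all unordered pairs of points."""
    return [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
        for p, q in combinations(points, 2)
    ]
```

Any two distinct basis vectors differ in exactly two coordinates, each by 1, which gives the distance $\sqrt{2}$ for every pair.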

Proof of Theorem 5.2.2 (equivalence of H-polytopes and V-polytopes).
We first show that any H-polytope is also a V-polytope. We proceed
by induction on d. The case d = 1 being trivial, we suppose that $d \ge 2$.
So let $\Gamma$ be a finite collection of closed half-spaces in $\mathbf{R}^d$ such that $P = \bigcap \Gamma$
is nonempty and bounded. For each $\gamma \in \Gamma$, let $F_\gamma = P \cap \partial\gamma$ be the intersection
of P with the bounding hyperplane of $\gamma$. Each nonempty $F_\gamma$ is an H-polytope
of dimension at most d-1 (correct?), and so it is the convex hull of a finite
set $V_\gamma \subset F_\gamma$ by the inductive hypothesis.
We claim that $P = \mathrm{conv}(V)$, where $V = \bigcup_{\gamma \in \Gamma} V_\gamma$. Let $x \in P$ and let $\ell$
be a line passing through x. The intersection $\ell \cap P$ is a segment; let y and z
be its endpoints. There are $\alpha, \beta \in \Gamma$ such that $y \in F_\alpha$ and $z \in F_\beta$ (if y were
not on the boundary of any $\gamma \in \Gamma$, we could continue along $\ell$ a little further
within P).

We have $y \in \mathrm{conv}(V_\alpha)$ and $z \in \mathrm{conv}(V_\beta)$, and thus $x \in \mathrm{conv}(V_\alpha \cup V_\beta) \subseteq \mathrm{conv}(V)$.
We have proved that any H-polytope is a V-polytope, and it remains to
show that a V-polytope can be expressed as the intersection of finitely many
half-spaces. This follows easily by duality (and implicitly uses the separation
theorem).
Let $P = \mathrm{conv}(V)$ with V finite, and assume that 0 is an interior point
of P. By Exercise 5.1.6(a), the dual body $P^*$ equals $\bigcap_{v \in V} \mathcal{D}_0(v)^-$, and by
Exercise 5.1.4 it is bounded. By what we have already proved, $P^*$ is a
V-polytope, and by Exercise 5.1.6(a) again, $P = (P^*)^*$ is the intersection of
finitely many half-spaces. □

Bibliography and remarks. The theory of convex polytopes is
a well-developed area covered in numerous books and surveys, such
as the already recommended recent monograph [Zie94] (with addenda
and updates on the web page of its author), the very influential book
by Grünbaum [Grü67], the chapters on polytopes in the handbooks
of discrete and computational geometry [GO97], of convex geometry
[GW93], and of combinatorics [GGL95], or the books McMullen and
Shephard [MS71] and Brøndsted [Brø83], concentrating on questions
about the numbers of faces. Recent progress in combinatorial and
computational polytope theory is reflected in the collection [KZ00]. For
analyzing examples, one should be aware of (free) software systems
for manipulating convex polytopes, such as polymake by Gawrilow
and Joswig [GJ00].
Interesting discoveries about 3-dimensional convex polytopes were
already made in ancient Greece. The treatise by Schläfli [Sch01]
written in 1850-52 is usually mentioned as the beginning of modern theory,
and several books were published around the turn of the century. We
refer to Grünbaum [Grü67], Schrijver [Sch86], and to the other sources
mentioned above for historical accounts.
The permutahedron mentioned in the introduction to this chapter
was considered by Schoute [Sch11], and it arises by at least two other
quite different and natural constructions (see [Zie94]).
There are several ways of proving the equivalence of H-polytopes
and V-polytopes. Ours is inspired by a proof by Edmonds, as presented
in Fukuda's lecture notes (ETH Zurich). A classical algorithmic proof
is provided by the Fourier-Motzkin elimination procedure, which
proceeds by projections on coordinate hyperplanes; see [Zie94] for a
detailed exposition. The double-description method is a similar algorithm
formulated in the dual setting, and it is still one of the most efficient
known computational methods. We will say a little more about the
algorithmic problem of expressing the convex hull of a finite set as the
intersection of half-spaces in the notes to Section 5.5.
One may ask, What is a "vertex description" of an unbounded
H-polyhedron? Of course, it is not the convex hull of a finite set, but it
can be expressed as the Minkowski sum P + C, where P is a V-polytope
and C is a convex cone described as the convex hull of finitely
many rays emanating from 0.

Exercises
1. Verify that a d-dimensional simplex in $\mathbf{R}^d$ can be expressed as the
intersection of d+1 half-spaces.
2. (a) Show that every convex polytope in $\mathbf{R}^d$ is an orthogonal projection
of a simplex of a sufficiently large dimension onto the space $\mathbf{R}^d$ (which
we consider embedded as a d-flat in some $\mathbf{R}^n$). [2]
(b) Prove that every convex polytope P symmetric about 0 (i.e., with
P = -P) is the affine image of a crosspolytope of a sufficiently large
dimension. [2]

5.3 Faces of a Convex Polytope


The surface of the 3-dimensional cube consists of 8 "corner" points called
vertices, 12 edges, and 6 squares called facets. According to the perhaps more
usual terminology in 3-dimensional geometry, the facets would be called faces.
But in the theory of convex polytopes, the word face has a slightly different
meaning, defined below. For the cube, not only the squares but also the
vertices and the edges are all called faces of the cube.
5.3.1 Definition (Face). A face of a convex polytope P is defined as
• either P itself, or
• a subset of P of the form P n h, where h is a hyperplane such that P is
fully contained in one of the closed half-spaces determined by h.
Here is an example of a 3-dimensional polytope, the regular octahedron,
with its graph:

For polytopes in $\mathbf{R}^3$, the graph is always planar: Project the polytope from its
interior point onto a circumscribed sphere, and then make a "cartographic
map" of this sphere, say by stereographic projection. Moreover, it can be
shown that the graph is vertex 3-connected. (A graph G is called vertex
k-connected if $|V(G)| \ge k+1$ and deleting any at most k-1 vertices leaves G
connected.) Nicely enough, these properties characterize graphs of convex
3-polytopes:
5.3.3 Theorem (Steinitz theorem). A finite graph is isomorphic to the
graph of a 3-dimensional convex polytope if and only if it is planar and vertex
3-connected.
We omit a proof of the considerably harder "if" part (exhibiting a poly-
tope for every vertex 3-connected planar graph); all known proofs are quite
complicated.
Graphs of higher-dimensional polytopes probably have no nice description
comparable to the 3-dimensional case, and it is likely that the problem of
deciding whether a given graph is isomorphic to a graph of a 4-dimensional
convex polytope is NP-hard. It is known that the graph of every d-dimen-
sional polytope is vertex d-connected (Balinski's theorem), but this is only a
necessary condition.
Examples. A d-dimensional simplex has been defined as the convex hull of
a (d+1)-point affinely independent set V. It is easy to see that each subset of
V determines a face of the simplex. Thus, there are $\binom{d+1}{k+1}$ faces of dimension
k, $k = -1, 0, \ldots, d$, and $2^{d+1}$ faces in total.
The d-dimensional crosspolytope has $V = \{e_1, -e_1, \ldots, e_d, -e_d\}$ as the
vertex set. A proper subset $F \subset V$ determines a face if and only if there is
no i such that both $e_i \in F$ and $-e_i \in F$ (Exercise 2). It follows that there
are $3^d + 1$ faces, including the empty one and the whole crosspolytope.
The nonempty faces of the d-dimensional cube $[-1, 1]^d$ correspond to
vectors $v \in \{-1, 1, 0\}^d$. The face corresponding to such v has the vertex
set $\{u \in \{-1, 1\}^d : u_i = v_i \text{ for all } i \text{ with } v_i \ne 0\}$. Geometrically, the vector v
is the center of gravity of its face.
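The face counts for the three families can be tabulated by dimension. The sketch below is illustrative and rests on two assumptions spelled out here rather than in the text: a k-face of the crosspolytope with k < d is counted by choosing k+1 coordinate indices and one sign for each (which follows from the "no antipodal pair" criterion), and the face of the cube belonging to a vector v has dimension equal to the number of zero entries of v.

```python
from collections import Counter
from itertools import product
from math import comb


def simplex_faces(d):
    """Number of k-faces of the d-simplex for k = -1, ..., d."""
    return {k: comb(d + 1, k + 1) for k in range(-1, d + 1)}


def crosspolytope_faces(d):
    """Number of k-faces of the d-crosspolytope for k = -1, ..., d."""
    faces = {k: comb(d, k + 1) * 2 ** (k + 1) for k in range(-1, d)}
    faces[d] = 1  # the whole crosspolytope
    return faces


def cube_faces(d):
    """Number of k-faces of the d-cube, read off from vectors in {-1, 0, 1}^d."""
    faces = Counter(v.count(0) for v in product((-1, 0, 1), repeat=d))
    faces[-1] = 1  # the empty face
    return dict(faces)
```

For d = 3 the totals are 16 for the simplex and 28 for both the crosspolytope and the cube; the cube breaks down into the empty face, 8 vertices, 12 edges, 6 facets, and the cube itself.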
The face lattice. Let $\mathcal{F}(P)$ be the set of all faces of a (bounded) convex
polytope P (including the empty face $\emptyset$ of dimension -1). We consider the
partial ordering of $\mathcal{F}(P)$ by inclusion.
5.3 Faces of a Convex Polytope 87

We observe that each face of P is a convex polytope. This is because P is
the intersection of finitely many half-spaces and h is the intersection of two
half-spaces, so the face is an H-polyhedron, and moreover, it is bounded.
If P is a polytope of dimension d, then its faces have dimensions -1, 0,
1, ... , d, where -1 is, by definition, the dimension of the empty set. A face
of dimension j is also called a j-face.
Names of faces. The 0-faces are called vertices, the 1-faces are called
edges, and the (d-1)-faces of a d-dimensional polytope are called facets. The
(d-2)-faces of a d-dimensional polytope are ridges; in the familiar 3-dimen-
sional situation, edges = ridges. For example, the 3-dimensional cube has 28
faces in total: the empty face, 8 vertices, 12 edges, 6 facets, and the whole
cube.
The following proposition shows that each V -polytope is the convex hull
of its vertices, and that the faces can be described combinatorially: They are
the convex hulls of certain subsets of vertices. This includes some intuitive
facts such as that each edge connects two vertices.
A helpful notion is that of an extremal point of a set: For a set $X \subseteq \mathbf{R}^d$,
a point $x \in X$ is extremal if $x \notin \mathrm{conv}(X \setminus \{x\})$.
5.3.2 Proposition. Let $P \subset \mathbf{R}^d$ be a (bounded) convex polytope.
(i) ("Vertices are extremal") The extremal points of P are exactly its ver-
tices, and P is the convex hull of its vertices.
(ii) ("Face of a face is a face") Let F be a face of P. The vertices of F are
exactly those vertices of P that lie in F. More generally, the faces of F
are exactly those faces of P that are contained in F.
The proof is not essential for our further considerations, and it is given at
the end of this section (but Exercise 9 below illustrates that things are not
quite as simple as it might perhaps seem). The proposition has an appropriate
analogue for polyhedra, but in order to avoid technicalities, we treat the
bounded case only.
Graphs of polytopes. Each 1-dimensional face, or edge, of a convex poly-
tope has exactly two vertices. We can thus define the graph G(P) of a polytope
P in the natural way: The vertices of the polytope are vertices of the graph,
and two vertices are connected by an edge in the graph if they are vertices of
the same edge of P. (The terms "vertices" and "edges" for graphs actually
come from the corresponding notions for 3-dimensional convex polytopes.)
5.3.4 Definition (Combinatorial equivalence). Two convex polytopes
P and Q are called combinatorially equivalent if $\mathcal{F}(P)$ and $\mathcal{F}(Q)$ are isomor-
phic as partially ordered sets.
We are going to state some properties of the partially ordered set $\mathcal{F}(P)$
without proofs. These are not difficult and can be found in [Zie94].
It turns out that $\mathcal{F}(P)$ is a lattice (a partially ordered set satisfying
additional axioms). We recall that this means the following two conditions:
• Meets condition: For any two faces $F, G \in \mathcal{F}(P)$, there exists a face
$M \in \mathcal{F}(P)$, called the meet of F and G, that is contained in both F and
G and contains all other faces contained in both F and G.
• Joins condition: For any two faces $F, G \in \mathcal{F}(P)$, there exists a face
$J \in \mathcal{F}(P)$, called the join of F and G, that contains both F and G and
is contained in all other faces containing both F and G.
The meet of two faces is their geometric intersection $F \cap G$.
For verifying the joins and meets conditions, it may be helpful to know
that for a finite partially ordered set possessing the minimum element and the
maximum element, the meets condition is equivalent to the joins condition,
and so it is enough to check only one of the conditions.
Here is the face lattice of a 3-dimensional pyramid:
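[Figure: a 4-sided pyramid with vertices numbered 1-5, shown next to the diagram of its face lattice, from the empty face $\emptyset$ at the bottom to P at the top.]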
The vertices are numbered 1-5, and the faces are labeled by the vertex sets.
The face lattice is graded, meaning that every maximal chain has the same
length (the rank of a face F is dim(F)+1). Quite obviously, it is atomic: Every
face is the join of its vertices. A little less obviously, it is coatomic; that is,
every face is the meet (intersection) of the facets containing it. An important
consequence is that the combinatorial type of a polytope is determined by the
vertex-facet incidences. More precisely, if we know the dimension and all
subsets of vertices that are vertex sets of facets (but without knowing the
coordinates of the vertices, of course), we can uniquely reconstruct the whole
face lattice in a simple and purely combinatorial way.
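The reconstruction from vertex-facet incidences can be sketched as follows: since the lattice is coatomic, every face other than P itself is an intersection of facets, so it suffices to close the facet vertex sets under intersection. The function names and the labeling of the pyramid (base 1, 2, 3, 4 in cyclic order, apex 5) are assumptions made for this example; the sketch also assumes that the input really comes from a convex polytope.

```python
def face_lattice(vertices, facets):
    """All faces of a polytope, as frozensets of vertices, computed from
    the vertex sets of its facets by closing under intersection."""
    whole = frozenset(vertices)
    facets = [frozenset(f) for f in facets]
    faces = {whole}
    frontier = {whole}
    while frontier:
        new = set()
        for f in frontier:
            for g in facets:
                h = f & g
                if h not in faces:
                    new.add(h)
        faces |= new
        frontier = new
    return faces


def dimensions(faces):
    """Dimension of each face, read off from the grading of the lattice:
    the empty face gets -1, and every other face is one level above the
    highest face properly contained in it."""
    dim = {}
    for f in sorted(faces, key=len):
        below = [dim[g] for g in dim if g < f]  # g < f: proper subset
        dim[f] = max(below, default=-2) + 1
    return dim


# The 4-sided pyramid: base 1-2-3-4 and apex 5 (assumed labeling).
pyramid_facets = [(1, 2, 3, 4), (1, 2, 5), (2, 3, 5), (3, 4, 5), (1, 4, 5)]
pyramid = face_lattice(range(1, 6), pyramid_facets)
pyramid_dims = dimensions(pyramid)
```

For the pyramid this yields 20 faces: the empty face, 5 vertices, 8 edges, 5 facets, and the pyramid itself.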
Face lattices of convex polytopes have several other nice properties, but no
full algebraic characterization is known, and the problem of deciding whether
90 Chapter 5: Convex Polytopes

a given lattice is a face lattice is algorithmically difficult (even for 4-dimen-
sional polytopes).
The face lattice can be a suitable representation of a convex polytope in
a computer. Each j-face is connected by pointers to its (j-1)-faces and to
the (j+1)-faces containing it. On the other hand, it is a somewhat redundant
representation: Recall that the vertex-facet incidences already contain the
full information, and for some applications, even less data may be sufficient,
say the graph of the polytope.
The dual polytope. Let P be a convex polytope containing the origin in
its interior. Then the dual set P* is also a polytope; we have verified this in
the proof of Theorem 5.2.2.
5.3.5 Proposition. For each j = -1,0, ... , d, the j-faces of P are in a
bijective correspondence with the (d-j-1)-faces of P*. This correspondence
also reverses inclusion; in particular, the face lattice of P* arises by turning
the face lattice of P upside down.
Again we refer to the reader's diligence or to [Zie94] for a proof. Let us
examine a few examples instead.
Among the five regular Platonic solids, the cube and the octahedron are
dual to each other, the dodecahedron and the icosahedron are also dual, and
the tetrahedron is dual to itself. More generally, if we have a 3-dimensional
convex polytope and G is its graph, then the graph of the dual polytope
is the dual graph to G, in the usual graph-theoretic sense. The dual of a
d-simplex is a d-simplex, and the d-dimensional cube and the d-dimensional
crosspolytope are dual to each other.
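The duality between the cube and the crosspolytope can be checked numerically at the level of face counts: by Proposition 5.3.5, the number of j-faces of one must equal the number of (d-j-1)-faces of the other. The sketch below counts faces using the combinatorial descriptions from the Examples paragraph; the function names are hypothetical, and matching counts are of course only a necessary consequence of duality, not a proof of it.

```python
from itertools import product


def cube_f_vector(d):
    """f[j] = number of j-faces of the d-cube, j = 0, ..., d.
    Faces correspond to vectors in {-1, 0, 1}^d, and the dimension of a
    face is the number of zero coordinates."""
    f = [0] * (d + 1)
    for v in product((-1, 0, 1), repeat=d):
        f[v.count(0)] += 1
    return f


def crosspolytope_f_vector(d):
    """f[j] = number of j-faces of the d-crosspolytope, j = 0, ..., d.
    A proper nonempty face chooses k >= 1 of the vertices +-e_i with no
    antipodal pair; it is a simplex of dimension k-1."""
    f = [0] * (d + 1)
    for s in product((-1, 0, 1), repeat=d):
        k = d - s.count(0)
        if k >= 1:
            f[k - 1] += 1
    f[d] = 1  # the whole crosspolytope
    return f
```

For d = 3 this gives (8, 12, 6, 1) for the cube and (6, 12, 8, 1) for the octahedron, i.e., the proper part of the f-vector is reversed.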

We conclude with two notions of polytopes "in general position."


5.3.6 Definition (Simple and simplicial polytopes). A polytope P is
called simplicial if each of its facets is a simplex (this happens, in particular, if
the vertices of P are in general position, but general position is not necessary).
A d-dimensional polytope P is called simple if each of its vertices is contained
in exactly d facets.
The faces of a simplex are again simplices, and so each proper face of a sim-
plicial polytope is a simplex. Among the five Platonic solids, the tetrahedron,
the octahedron, and the icosahedron are simplicial; and the tetrahedron, the
cube, and the dodecahedron are simple. Crosspolytopes are simplicial, and
cubes are simple. An example of a polytope that is neither simplicial nor
simple is the 4-sided pyramid used in the illustration of the face lattice.
The dual of a simple polytope is simplicial, and vice versa. For a simple
d-dimensional polytope, a small neighborhood of each vertex looks combina-
torially like a neighborhood of a vertex of the d-dimensional cube. Thus, for
each vertex v of a d-dimensional simple polytope, there are d edges emanat-
ing from v, and each k-tuple of these edges uniquely determines one k-face
incident to v. Consequently, v belongs to $\binom{d}{k}$ k-faces, k = 0, 1, ..., d.
Proof of Proposition 5.3.2. In (i) ("vertices are extremal"), we assume
that P is the convex hull of a finite point set. Among all such sets, we fix one
that is inclusion-minimal and call it $V_0$. Let $V_v$ be the vertex set of P, and
let $V_e$ be the set of all extremal points of P. We prove that $V_0 = V_v = V_e$,
which gives (i). We have $V_e \subseteq V_0$ by the definition of an extremal point.
Next, we show that $V_v \subseteq V_e$. If $v \in V_v$ is a vertex of P, then there is a
hyperplane h with $P \cap h = \{v\}$, and all of $P \setminus \{v\}$ lies in one of the open
half-spaces defined by h. Hence $P \setminus \{v\}$ is convex, which means that v is an
extremal point of P, and so $V_v \subseteq V_e$.
Finally we verify $V_0 \subseteq V_v$. Let $v \in V_0$; by the inclusion-minimality of $V_0$,
we get that $v \notin C = \mathrm{conv}(V_0 \setminus \{v\})$. Since C and {v} are disjoint compact
convex sets, they can be strictly separated by a hyperplane h. Let $h_v$ be the
hyperplane parallel to h and containing v; this $h_v$ has all points of $V_0 \setminus \{v\}$
on one side.

We want to show that $P \cap h_v = \{v\}$ (then v is a vertex of P, and we are
done). The set $P \setminus h_v = \mathrm{conv}(V_0) \setminus h_v$, being the intersection of a convex set
with an open half-space, is convex. Any segment vx, where $x \in P \setminus h_v$, shares
only the point v with the hyperplane $h_v$, and so $(P \setminus h_v) \cup \{v\}$ is convex as
well. Since this set contains $V_0$ and is convex, it contains $P = \mathrm{conv}(V_0)$, and
so $P \cap h_v = \{v\}$ indeed.
As for (ii) ("face of a face is a face"), it is clear that a face G of P contained
in F is a face of F too (use the same witnessing hyperplane). For the reverse
direction, we begin with the case of vertices. By a consideration similar to
that at the end of the proof of (i), we see that $F = \mathrm{conv}(V) \cap h = \mathrm{conv}(V \cap h)$.
Hence all the extremal points of F, which by (i) are exactly the vertices of
F, are in V.
Finally, let F be a face of P defined by a hyperplane h, and let $G \subset F$ be
a face of F defined by a hyperplane g within h; that is, g is a (d-2)-dimen-
sional affine subspace of h with $G = g \cap F$ and with all of F on one side. Let
$\gamma$ be the closed half-space bounded by h with $P \subset \gamma$. We start rotating the
boundary h of $\gamma$ around g in the direction such that the rotated half-space
$\gamma'$ still contains F.
If we rotate by a sufficiently small amount, then all the vertices of P not
lying in F are still in the interior of $\gamma'$. At the same time, the interior of $\gamma'$
contains all the vertices of F not lying in G, while all the vertices of G remain
on the boundary $h'$ of $\gamma'$. So $h'$ defines a face of P (since all of P is on one
side), and this face has the same vertex set as G, and so it equals G by the
first part of (ii) proved above. □

Bibliography and remarks. Most of the material in this section
is quite old, and we restrict ourselves to a few comments and remarks
on recent developments.
Graphs of polytopes. The Steinitz theorem was published in [Ste22]. A
proof (of the harder implication) can be found in [Zie94]. In this type
of proof, one starts with the planar graph $K_4$, which is obviously re-
alizable as a graph of a 3-dimensional polytope, and creates the given
3-connected planar graph by a sequence of suitable elementary opera-
tions, the so-called ΔY transformations, which are shown to preserve
the realizability. Another type of proof first finds a suitable straight
edge planar drawing of the given graph G and then shows that the
vertices of such a drawing can be lifted to $\mathbf{R}^3$ to form the appropriate
polytope. The drawings needed here are "rubber band" drawings: Pin
down the vertices of an outer face and think of the edges as rubber
bands of various strengths, which left alone would contract to points.
Then the equilibrium position, where the forces at every inner vertex
add up to 0, specifies the drawing (see, e.g., Richter-Gebert [RG97]
for a presentation). These ideas go back to Maxwell; the result about
the equilibrium position specifying straight edge drawing for every
3-connected planar graph was proved by Tutte [Tut60]. Very interest-
ing related results about graphs with higher connectivity are due to
Linial, Lovász, and Wigderson [LW88]. Another way of obtaining suit-
able drawings is via Koebe's representation theorem (see, e.g., [PA95]
for an exposition): Every planar graph G can be represented by touch-
ing circles; that is, every vertex v E V(G) can be assigned a circular
disk in the plane in such a way that the disks have pairwise disjoint
interiors and two of them touch if and only if their two vertices are
connected by an edge.
On the other hand, Koebe's theorem follows easily from a stronger
version of the Steinitz theorem due to Andreev: Every 3-connected
planar graph has a cage representation, i.e., as the graph of a 3-di-
mensional convex polytope P whose edges are all tangent to the unit
sphere (each vertex of P can see a cap of the unit sphere, and a suitable
stereographic projection of these caps yields the disks as in Koebe's
theorem). These beautiful results, as well as several others along these
lines, would certainly deserve to be included in a book like this, but
they are omitted here for reasons of space and time.
A result of Blind and Mani-Levitska, with a beautiful simple new
proof by Kalai [Kal88], shows that a simple polytope is determined by
its dimension and its graph; that is, if two d-dimensional simple poly-
topes P and Q have isomorphic graphs, then they are combinatorially
equivalent.
One of the most challenging problems about graphs of convex poly-
topes is the Hirsch conjecture. In its basic form, it states that the
graph of any d-dimensional polytope with n facets has diameter at
most n-d; i.e., every two vertices can be connected by a path of at
most n-d edges. This conjecture is implied by its special case with
n = 2d, the so-called d-step conjecture. There are several variants of
the Hirsch conjecture. Some of them are known to be false, such as
the Hirsch conjecture for d-dimensional polyhedra with n facets; their
graph can have diameter at least $n-d+\lfloor d/5 \rfloor$. But even here the con-
jecture fails just by a little, while the crucial and wide open question
is whether the diameter of the graph can be bounded by a fixed poly-
nomial in d and n.
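As a small sanity check of the quantities involved, the d-dimensional cube has n = 2d facets, and the diameter of its graph is exactly d = n-d, so the bound n-d in the Hirsch conjecture is attained there. The following sketch computes the diameter by breadth-first search; `cube_graph_diameter` is a hypothetical helper, and the cube vertices are encoded as 0/1 vectors for convenience.

```python
from collections import deque
from itertools import product


def cube_graph_diameter(d):
    """Diameter of the graph of the d-cube, by BFS from every vertex."""
    def neighbors(v):
        # Flip one coordinate at a time.
        for i in range(d):
            yield v[:i] + (1 - v[i],) + v[i + 1:]

    diameter = 0
    for source in product((0, 1), repeat=d):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            for w in neighbors(v):
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        diameter = max(diameter, max(dist.values()))
    return diameter
```

For each d tried, the result equals 2d - d = d, the distance between two antipodal vertices.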
The Hirsch conjecture is motivated by linear programming (and it
was published in Dantzig's book [Dan63]), since the running time of
all variants of the simplex algorithm is bounded from below by the
number of edges that must be traversed in order to get from the start-
ing vertex of the polyhedron of admissible solutions to the optimum
vertex.
The best upper bound is due to Kalai. He published several papers
on this subject, successively improving and simplifying his arguments,
and this sequence is concluded with [Kal92]. He proves the following:
Let P be a convex polyhedron in $\mathbf{R}^d$ with n facets. Assume that no
edge of P is horizontal and that P has a (unique) topmost vertex w.
Then from every vertex v of P there is a path to w consisting of at most
$f(d, n) \leq 2n\binom{d+\lfloor \log_2 n \rfloor - 1}{\lfloor \log_2 n \rfloor} \leq 2n^{\log_2 d+1}$ edges and going upward all the
time. The proof is quite short and uses only very simple properties of
polytopes (also see [Zie94] or [Kal97]).
Kalai [Kal92] also discovered a randomized variant of the simplex
algorithm for linear programming for which the expected number of
pivot steps, for every linear program with n constraints in $\mathbf{R}^d$, is
bounded by a subexponential function of n and d, namely by $n^{O(\sqrt{d})}$.
All the previous worst-case bounds were exponential. Interestingly, es-
sentially the same algorithm (in a dual setting) was found by Sharir
and Welzl and a little later analyzed in [MSW96], independent of
Kalai's work and at almost the same time, but coming from a quite
different direction. The Sharir-Welzl algorithm is formulated in an
abstract framework, and it can be used for many other optimization
problems besides linear programming.
Realizations of polytopes. By a realization of a d-dimensional polytope
P we mean any polytope $Q \subseteq \mathbf{R}^d$ that is combinatorially equivalent
to P. The proof of Steinitz's theorem shows that every 3-dimension-
al polytope has a realization whose vertices have integer coordinates.
For 3-polytopes with n vertices, Richter-Gebert [RG97] proved that
the vertex coordinates can be chosen as positive integers no larger than
$2^{18n^2}$, and if the polytope has at least one triangular facet, the upper
bound becomes $43^n$ (a previous, slightly worse, estimate was given by
Onn and Sturmfels [OS94]). No nontrivial lower bounds seem to be
known. Let us remark that for straight edge drawings of planar graphs,
the vertices of every n-vertex graph can be placed on a grid with
side O(n). This was first proved by de Fraysseix, Pach, and Pollack
[dFPP90] with the (2n-4) × (n-2) grid, and re-proved by Schnyder
[Sch90] by a different method, with the (n-1) × (n-1) grid; see also
Kant [Kan96] for more recent results in this direction.
For higher-dimensional polytopes, the situation is strikingly differ-
ent. Although all simple polytopes and all simplicial polytopes can be
realized with integer vertex coordinates, there are 4-dimensional poly-
topes for which every realization requires irrational coordinates (we
will see an 8-dimensional example in Section 5.6). There are also 4-di-
mensional n-vertex polytopes for which every realization with integer
coordinates uses doubly exponential coordinates, of order $2^{2^{\Omega(n)}}$. There
are numerous other results indicating that the polytopes of dimension
4 and higher are complicated. For example, the problem of deciding
whether a given finite lattice is isomorphic to the face lattice of a
4-dimensional polytope is algorithmically difficult; it is polynomially
equivalent to the problem of deciding whether a system of polynomial
inequalities with integer coefficients in n variables has a solution. This
latter problem is known to be NP-hard, but most likely it is even
harder; the best known algorithm needs exponential time and poly-
nomial space. An overview of such results, and references to previous
work on which they are built, can be found in Richter-Gebert [RG99],
and detailed proofs in [RG97]. Section 6.2 contains a few more remarks
on realizability (see, in particular, Exercise 6.2.3).
Exercises
1. Verify that if $V \subset \mathbf{R}^d$ is affinely independent, then each subset $F \subseteq V$
determines a face of the simplex conv(V). I2J
2. Verify the description of the faces of the cube and of the crosspolytope
given in the text. [II
3. Consider the (n-1)-dimensional permutahedron as defined in the intro-
duction to this chapter.
(a) Verify that it really has n! vertices corresponding to the permutations
of {1, 2, ..., n}. 12J
(b) Describe all faces of the permutahedron combinatorially (what sets
of permutations are vertex sets of faces?). [II
(c) Determine the dimensions of the faces found in (b). In particular, show
that the facets correspond to ordered partitions (A, B) of {1, 2, ..., n},
$A, B \neq \emptyset$, and count them. [II
4. Let $P = \mathrm{conv}\{\pm e_i \pm e_j : i, j = 1, 2, 3, 4,\ i \neq j\} \subset \mathbf{R}^4$, where $e_1, \ldots, e_4$ is
the standard basis (this P is called the 24-cell). Describe the face lattice
of P and prove that P is combinatorially equivalent to P* (in fact, P can
be obtained from P* by an isometry and scaling). [II
5. Using Proposition 5.3.2, prove the following:
(a) If F is a face of a convex polytope P, then F is the intersection of P
with the affine hull of F. I2J
(b) If F and G are faces of a convex polytope P, then $F \cap G$ is a face,
too. CD
6. Let P be a convex polytope in $\mathbf{R}^3$ containing the origin as an interior
point, and let F be a j-face of P, j = 0,1,2.
(a) Give a precise definition of the face F' of the dual polytope P* cor-
responding to F (i.e., describe F' as a subset of $\mathbf{R}^3$). I2J
(b) Verify that F' is indeed a face of P*. I2J
7. Let $V \subset \mathbf{R}^d$ be the vertex set of a convex polytope and let $U \subset V$. Prove
that U is the vertex set of a face of conv(V) if and only if the affine hull
of U is disjoint from conv(V \ U). [II
8. Prove that the graph of any 3-dimensional convex polytope is 3-connected;
i.e., removing any 2 vertices leaves the graph connected. [II
9. Let C be a convex set. Call a point x E C exposed if there is a hyperplane
h with $C \cap h = \{x\}$ and all the rest of C on one side. For convex polytopes,
exposed points are exactly the vertices, and we have shown that any
extremal point is also exposed. Find an example of a compact convex set
$C \subset \mathbf{R}^2$ with an extremal point that is not exposed. [II
10. (On extremal points) For a set $X \subseteq \mathbf{R}^d$, let
$\mathrm{ex}(X) = \{x \in X : x \notin \mathrm{conv}(X \setminus \{x\})\}$ denote the set of extremal points of X.
(a) Find a convex set $C \subseteq \mathbf{R}^d$ with $C \neq \mathrm{conv}(\mathrm{ex}(C))$. CD
(b) Find a compact convex $C \subseteq \mathbf{R}^3$ for which ex(C) is not closed. [!]
(c) By modifying the proof of Theorem 5.2.2, prove that C = conv(ex(C))
for every compact convex $C \subset \mathbf{R}^d$ (this is a finite-dimensional version of
the well-known Krein-Milman theorem). ~

5.4 Many Faces: The Cyclic Polytopes


A convex polytope P can be given to us by the list of vertices. How difficult
is it to recover the full face lattice, or, more modestly, a representation of P
as an intersection of half-spaces? The first question to ask is how large the
face lattice or the collection of half-spaces can be, compared to the number
of vertices. That is, what is the maximum total number of faces, or the
maximum number of facets, of a convex polytope in $\mathbf{R}^d$ with n vertices? The
dual question is, of course, the maximum number of faces or vertices of a
bounded intersection of n half-spaces in $\mathbf{R}^d$.
Let $f_j = f_j(P)$ denote the number of j-faces of a polytope P. The vector
$(f_0, f_1, \ldots, f_d)$ is called the f-vector of P. We thus assume $f_0 = n$ and we
are interested in estimating the maximum value of $f_{d-1}$ and of $\sum_{k=0}^{d} f_k$.
In dimensions 2 and 3, the situation is simple and favorable. For d = 2, our
polytope is a convex polygon with n vertices and n edges, and so $f_0 = f_1 = n$,
$f_2 = 1$. The f-vector is even determined uniquely.
A 3-dimensional polytope can be regarded as a drawing of a planar graph,
in our case with n vertices. By well-known results for planar graphs, we have
$f_1 \leq 3n-6$ and $f_2 \leq 2n-4$. Equalities hold if and only if the polytope is
simplicial (all facets are triangles).
In both cases the total number of faces is linear in n. But as the dimension
grows, polytopes become much more complicated. First of all, even the total
number of faces of the most innocent convex polytope, the d-dimensional
simplex, is exponential in d. But here we consider d fixed and relatively
small, and we investigate the dependence on the number of vertices n.
Still, as we will see, for every $n \geq 5$ there is a 4-dimensional convex
polytope with n vertices and with every two vertices connected by an edge,
i.e., with $\binom{n}{2}$ edges! This looks counterintuitive, but our intuition is based
on the 3-dimensional case. In any fixed dimension d, the number of facets
can be of order $n^{\lfloor d/2 \rfloor}$, which is rather disappointing for someone wishing to
handle convex polytopes efficiently. On the other hand, complete desperation
is perhaps not appropriate: Certainly not all polytopes exhibit this very bad
behavior. For example, it is known that if we choose n points uniformly at
random in the unit ball $B^d$, then the expected number of faces of their convex
hull is only o( n), for every fixed d.
It turns out that the number of faces for a given dimension and number of
vertices is the largest possible for so-called cyclic polytopes, to be introduced
next. First we define a very useful curve in $\mathbf{R}^d$.
5.4.1 Definition (Moment curve). The curve
$\gamma = \{(t, t^2, \ldots, t^d) : t \in \mathbf{R}\}$
in $\mathbf{R}^d$ is called the moment curve.

5.4.2 Lemma. Any hyperplane h intersects the moment curve $\gamma$ in at most
d points. If there are d intersections, then h cannot be tangent to $\gamma$, and thus
at each intersection, $\gamma$ passes from one side of h to the other.

Proof. A hyperplane h can be expressed by the equation $\langle a, x \rangle = b$, or
in coordinates $a_1 x_1 + a_2 x_2 + \cdots + a_d x_d = b$. A point of $\gamma$ has the form
$(t, t^2, \ldots, t^d)$, and if it lies in h, we obtain $a_1 t + a_2 t^2 + \cdots + a_d t^d - b = 0$. This
means that t is a root of a nonzero polynomial $p_h(t)$ of degree at most d,
and hence the number of intersections of h with $\gamma$ is at most d. If there are d
distinct roots, then they must be all simple. At a simple root, the polynomial
$p_h(t)$ changes sign, and this means that the curve $\gamma$ passes from one side of
h to the other. □

As a corollary, we see that every d points of the moment curve are affinely
independent, for otherwise, we could pass a hyperplane through them plus
one more point of $\gamma$. So the moment curve readily supplies explicit examples
of point sets in general position.
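The general position of points on the moment curve can also be seen through a determinant. By Lemma 5.4.2 no hyperplane contains d+1 points of the curve, so any d+1 of its points are affinely independent (and hence so are any d of them); equivalently, the matrix with rows $(1, t_i, t_i^2, \ldots, t_i^d)$ is a Vandermonde matrix with nonzero determinant $\prod_{i<j}(t_j - t_i)$. The sketch below checks this with exact integer arithmetic; the helper names are assumptions made for the illustration.

```python
from itertools import combinations


def det(m):
    """Determinant of a square integer matrix by cofactor expansion
    along the first row (adequate for the small sizes used here)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        if a:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * a * det(minor)
    return total


def moment_point(t, d):
    """The point (t, t^2, ..., t^d) of the moment curve in R^d."""
    return [t ** i for i in range(1, d + 1)]


def affinely_independent(points):
    """For d+1 points in R^d: they are affinely independent if and only
    if the (d+1) x (d+1) matrix with rows (1, p) is nonsingular."""
    return det([[1] + list(p) for p in points]) != 0


d = 3
curve_points = [moment_point(t, d) for t in range(-3, 4)]
all_independent = all(affinely_independent(q)
                      for q in combinations(curve_points, d + 1))
```

Integer parameters are used only to keep the arithmetic exact; any distinct real parameters would do.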

5.4.3 Definition (Cyclic polytope). The convex hull of finitely many
points on the moment curve is called a cyclic polytope.
How many facets does a cyclic polytope have? Each facet is determined
by a d-tuple of vertices, and distinct d-tuples determine distinct facets. Here
is a criterion telling us exactly which d-tuples determine facets.

5.4.4 Proposition (Gale's evenness criterion). Let V be the vertex set
of a cyclic polytope P considered with the linear ordering $\leq$ along the mo-
ment curve (larger vertices have larger values of the parameter t). Let
$F = \{v_1, v_2, \ldots, v_d\} \subseteq V$ be a d-tuple of vertices of P, where $v_1 < v_2 < \cdots < v_d$.
Then F determines a facet of P if and only if for any two vertices $u, v \in V \setminus F$,
the number of vertices $v_i \in F$ with $u < v_i < v$ is even.

Proof. Let $h_F$ be the hyperplane affinely spanned by F. Then F determines
a facet if and only if all the points of $V \setminus F$ lie on the same side of $h_F$.
Since the moment curve $\gamma$ intersects $h_F$ in exactly d points, namely at
the points of F, it is partitioned into d+1 pieces, say $\gamma_0, \ldots, \gamma_d$, each lying
completely in one of the half-spaces, as is indicated in the drawing:

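[Figure: the moment curve $\gamma$ crossing the hyperplane $h_F$ at the points of F, with consecutive pieces $\gamma_0, \gamma_1, \ldots$ lying on alternating sides of $h_F$.]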
Hence, if the vertices of $V \setminus F$ are all contained in the odd-numbered pieces
$\gamma_1, \gamma_3, \ldots$, as in the picture, or if they are all contained in the even-numbered
pieces $\gamma_0, \gamma_2, \ldots$, then F determines a facet. This condition is equivalent to
Gale's criterion. □
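Gale's criterion can be compared with the geometric definition on small instances. In the sketch below, a vertex is identified with its parameter t on the moment curve; `is_facet_geometric` tests whether all remaining vertices lie on one side of the hyperplane through F (via the sign of a determinant), and `is_facet_gale` tests the evenness condition. Both function names, and the choice of integer parameters 0, 1, ..., n-1, are assumptions made for this illustration.

```python
from itertools import combinations


def det(m):
    """Determinant of a square integer matrix by cofactor expansion."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        if a:
            minor = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * a * det(minor)
    return total


def row(t, d):
    """The moment-curve point with parameter t, with a leading 1."""
    return [1] + [t ** i for i in range(1, d + 1)]


def is_facet_geometric(F, params, d):
    """F spans a facet iff every other vertex lies on the same side of
    the hyperplane h_F; the side is the sign of a determinant, which is
    never zero for distinct parameters."""
    base = [row(t, d) for t in F]
    sides = {det(base + [row(t, d)]) > 0 for t in params if t not in F}
    return len(sides) == 1


def is_facet_gale(F, params):
    """Gale's evenness criterion on the sorted parameter list."""
    outside = [t for t in sorted(params) if t not in F]
    return all(sum(1 for w in F if u < w < v) % 2 == 0
               for u, v in combinations(outside, 2))


def facets(params, d, test):
    return {F for F in combinations(sorted(params), d) if test(F)}
```

For example, with `params = range(6)` and d = 3, both tests select the same 8 triples, matching the count 2n-4 for a simplicial 3-polytope with n = 6 vertices.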
Now we can count the facets.
5.4.5 Theorem. The number of facets of a d-dimensional cyclic polytope
with n vertices ($n \geq d+1$) is

$\binom{n-\lfloor d/2 \rfloor}{\lfloor d/2 \rfloor} + \binom{n-\lfloor d/2 \rfloor-1}{\lfloor d/2 \rfloor-1}$ for d even, and

$2\binom{n-\lfloor d/2 \rfloor-1}{\lfloor d/2 \rfloor}$ for d odd.

For fixed d, this has the order of magnitude $n^{\lfloor d/2 \rfloor}$.

Proof. The number of facets equals the number of ways of placing d black
circles and n - d white circles in a row in such a way that we have an even
number of black circles between each two white circles.
Let us say that an arrangement of black and white circles is paired if any
contiguous segment of black circles has an even length (the arrangements
permitted by Gale's criterion need not be paired because of the initial and
final segments). The number of paired arrangements of 2k black circles and
n - 2k white circles is $\binom{n-k}{k}$, since by deleting every second black circle we
get a one-to-one correspondence with selections of the positions of k black
circles among n - k possible positions.
Let us return to the original problem, and first consider an odd d = 2k+ 1.
In a valid arrangement of circles, we must have an odd number of consecutive
black circles at the beginning or at the end (but not both). In the former case,
we delete the initial black circle, and we get a paired arrangement of 2k black
and n-1-2k white circles. In the latter case, we similarly delete the black
circle at the end and again get a paired arrangement as in the first case. This
establishes the formula in the theorem for odd d.
For even d = 2k, the number of initial consecutive black circles is ei-
ther odd or even. In the even case, we have a paired arrangement, which
contributes $\binom{n-k}{k}$ possibilities. In the odd case, we also have an odd num-
ber of consecutive black circles at the end, and so by deleting the first and
last black circles we obtain a paired arrangement of 2(k-1) black circles and
n-2k white circles. This contributes $\binom{n-k-1}{k-1}$ possibilities. □
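The counting argument can be double-checked by brute force: enumerate all placements of d black circles among n positions, keep those allowed by Gale's criterion, and compare with the closed formula of Theorem 5.4.5. This is only a sketch for small n and d; the function names are hypothetical.

```python
from itertools import combinations
from math import comb


def gale_facet_count(n, d):
    """Count d-subsets of {0, ..., n-1} satisfying Gale's criterion.

    It suffices to look at consecutive outside elements u < v: the count
    between any two outside elements is a sum of such consecutive counts,
    so all of them are even exactly when the consecutive ones are."""
    count = 0
    for F in combinations(range(n), d):
        chosen = set(F)
        outside = [i for i in range(n) if i not in chosen]
        if all(sum(1 for w in F if u < w < v) % 2 == 0
               for u, v in zip(outside, outside[1:])):
            count += 1
    return count


def theorem_facet_count(n, d):
    """The closed formula of Theorem 5.4.5 (for d >= 2, n >= d+1)."""
    k = d // 2
    if d % 2 == 0:
        return comb(n - k, k) + comb(n - k - 1, k - 1)
    return 2 * comb(n - k - 1, k)
```

For instance, both functions return n for d = 2 (a convex n-gon has n edges) and d+1 for n = d+1 (a simplex).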

Bibliography and remarks. The convex hull of the moment curve
was studied by Carathéodory [Car07]. In the 1950s, Gale con-
structed neighborly polytopes by induction. Cyclic polytopes and the
evenness criterion appear in Gale [Gal63]. The moment curve is an
important object in many other branches besides the theory of convex
polytopes. For example, in elementary algebraic topology it is used
for proving that every (at most countable) d-dimensional simplicial
complex has a geometric realization in $\mathbf{R}^{2d+1}$.
Convex hulls of random sets. Bárány [Bar89] proved that if n points
are chosen uniformly and independently at random from a fixed d-
dimensional convex polytope K (for example, the unit cube), then
the number of k-dimensional faces of their convex hull has the order
$(\log n)^{d-1}$ for every fixed d and k, $0 \leq k \leq d-1$ (the constant of pro-
portionality depending on d, k, and K). If K is a smooth convex body
(such as the unit ball), then the order of magnitude is $n^{(d-1)/(d+1)}$,
again with d, k, and K fixed. For more references and wider context
see, e.g., Weil and Wieacker [WW93].

Exercises
1. (a) Show that if V is a finite subset of the moment curve, then all the
points of V are extreme in conv(V); that is, they are vertices of the
corresponding cyclic polytope. ~
(b) Show that any two cyclic polytopes in $\mathbf{R}^d$ with n vertices are com-
binatorially the same: They have isomorphic face lattices. Thus, we can
speak of the cyclic polytope. 0
2. (Another curve like $\gamma$) Let $\beta \subset \mathbf{R}^d$ be the curve
$\{(\frac{1}{t+1}, \frac{1}{t+2}, \ldots, \frac{1}{t+d}) : t \in \mathbf{R},\ t > 0\}$. Show that any hyperplane intersects $\beta$ in at most d points
(and if there are d intersections, then there is no tangency), and conclude
that any n distinct points on $\beta$ form the vertex set of a polytope com-
binatorially isomorphic to the cyclic polytope. [IJ (Let us remark that
many other curves have these properties as well; the moment curve is
just the most convenient example.)
3. (Universality of the cyclic polytope)
(a) Let $x_1, \ldots, x_n$ be points in $\mathbf{R}^d$. Let $y_i$ denote the vector arising by
appending 1 as the (d+1)st component of $x_i$. Show that if the determi-
nants of all matrices with columns $y_{i_1}, \ldots, y_{i_{d+1}}$, for all choices of indices
$i_1 < i_2 < \cdots < i_{d+1}$, have the same nonzero sign, then $x_1, \ldots, x_n$ form
the vertex set of a convex polytope combinatorially equivalent to the n-
vertex cyclic polytope in $\mathbf{R}^d$. [IJ
(b) Show that for any integers n and d there exists N such that among any
N points in $\mathbf{R}^d$ in general position, one can choose n points forming the
vertex set of a convex polytope combinatorially equivalent to the n-vertex
cyclic polytope. 0 (This can be seen as a d-dimensional generalization of
the Erdős-Szekeres theorem.)
4. Prove that if n is sufficiently large in terms of d, then for every set of
n points in $\mathbf{R}^d$ in general position, one can choose d+1 simplices of di-
mension d with vertices at some of these points such that any hyperplane
avoids at least one of these simplices. Use Exercise 3. ~
This exercise is a special case of a problem raised by Lovász, and it was
communicated to me by Bárány. A detailed solution can be found in
[BVS+99].
5. Show that for cyclic polytopes in dimensions 4 and higher, every pair
of vertices is connected by an edge. For dimension 4 and two arbitrary
vertices, write out explicitly the equation of a hyperplane intersecting the
cyclic polytope exactly in this edge. 0
6. Determine the f-vector of a cyclic polytope with n vertices in dimensions
4,5, and 6. 0

5.5 The Upper Bound Theorem


The upper bound theorem, one of the earlier major achievements of the theory
of convex polytopes, claims that the cyclic polytope has the largest possible
number of faces.
5.5.1 Theorem (Upper bound theorem). Among all d-dimensional con-
vex polytopes with n vertices, the cyclic polytope maximizes the number of
faces of each dimension.
In this section we prove only an approximate result, which gives the cor-
rect order of magnitude for the maximum number of facets.
5.5.2 Proposition (Asymptotic upper bound theorem). A d-dimen-
sional convex polytope with n vertices has at most $2\binom{n}{\lfloor d/2 \rfloor}$ facets and no
more than $2^{d+1}\binom{n}{\lfloor d/2 \rfloor}$ faces in total. For d fixed, both quantities thus have
the order of magnitude $n^{\lfloor d/2 \rfloor}$.
First we establish this proposition for simplicial polytopes, in the following
form.
5.5.3 Proposition. Let P be a d-dimensional simplicial polytope. Then
(a) $f_0(P) + f_1(P) + \cdots + f_d(P) \le 2^d f_{d-1}(P)$, and
(b) $f_{d-1}(P) \le 2 f_{\lfloor d/2\rfloor - 1}(P)$.
This implies Proposition 5.5.2 for simplicial polytopes, since the number
of $(\lfloor d/2\rfloor - 1)$-faces is certainly no bigger than $\binom{n}{\lfloor d/2\rfloor}$, the number of all
$\lfloor d/2\rfloor$-tuples of vertices.
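As a numerical illustration of the two propositions above, here is a minimal sketch that evaluates the bounds of Proposition 5.5.2 and checks the inequalities of Proposition 5.5.3 on a given f-vector. The function names are hypothetical, the input convention (f_0, ..., f_{d-1}), with f_d = 1 added by the code, is an assumption of this sketch, and d >= 2 is assumed.

```python
from math import comb


def asymptotic_bounds(n, d):
    """Return (bound on facets, bound on all faces) from Proposition 5.5.2."""
    c = comb(n, d // 2)  # number of floor(d/2)-tuples of vertices
    return 2 * c, 2 ** (d + 1) * c


def satisfies_prop_553(f):
    """Check (a) and (b) of Proposition 5.5.3.

    f = (f_0, ..., f_{d-1}) for a simplicial d-polytope with d >= 2;
    the polytope itself contributes f_d = 1.
    """
    d = len(f)
    part_a = sum(f) + 1 <= 2 ** d * f[d - 1]
    part_b = f[d - 1] <= 2 * f[d // 2 - 1]
    return part_a and part_b
```

For the octahedron (d = 3, n = 6, f-vector (6, 12, 8)), the 8 facets stay below the bound 2 * 6 = 12 returned by `asymptotic_bounds(6, 3)`.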
Proof of Proposition 5.5.3. We pass to the dual polytope P*, which
is simple. Now we need to prove $\sum_{k=0}^{d} f_k(P^*) \le 2^d f_0(P^*)$ and
$f_0(P^*) \le 2 f_{\lceil d/2\rceil}(P^*)$.
Each face of P* has at least one vertex, and every vertex of a simple
d-polytope is incident to $2^d$ faces, which gives the first inequality.

We now bound the number of vertices in terms of the number of $\lceil d/2\rceil$-faces.
This is the heart of the proof, and it shows where the mysterious
exponent $\lfloor d/2\rfloor$ comes from.
Let us rotate the polytope P* so that no two vertices share the $x_d$-coordinate
(i.e., no two vertices have the same vertical level).
Consider a vertex v with the d edges emanating from it. By the pigeonhole
principle, there are at least $\lceil d/2\rceil$ edges directed upwards or at least $\lceil d/2\rceil$
edges directed downwards. In the former case, every $\lceil d/2\rceil$-tuple of edges going
up determines a $\lceil d/2\rceil$-face for which v is the lowest vertex. In the latter case,
every $\lceil d/2\rceil$-tuple of edges going down determines a $\lceil d/2\rceil$-face for which v
is the highest vertex. Here is an illustration, unfortunately for the not too
interesting 3-dimensional case, showing a situation with 2 edges going up and
the corresponding 2-dimensional face having v as the lowest vertex:

[Figure: a vertex v with 2 edges going up and the 2-dimensional face they determine]

We have exhibited at least one $\lceil d/2\rceil$-face for which v is the lowest vertex or
the highest vertex. Since the lowest vertex and the highest vertex are unique
for each face, the number of vertices is no more than twice the number of
$\lceil d/2\rceil$-faces. □

Warning. For simple polytopes, the total combinatorial complexity is
proportional to the number of vertices, and for simplicial polytopes it is
proportional to the number of facets (considering the dimension fixed, that is).
For polytopes that are neither simple nor simplicial, the number of faces of
intermediate dimensions can have larger order of magnitude than both the
number of facets and the number of vertices; see Exercise 1.
Nonsimplicial polytopes. To prove the asymptotic upper bound theorem,
it remains to deal with nonsimplicial polytopes. This is done by a perturba-
tion argument, similar to numerous other results where general position is
convenient for the proof but where we want to show that the result holds
in degenerate cases as well. In most instances in this book, the details of
perturbation arguments are omitted, but here we make an exception, since
the proof seems somewhat nontrivial.
5.5.4 Lemma. For any d-dimensional convex polytope P there exists a
d-dimensional simplicial polytope Q with $f_0(P) = f_0(Q)$ and $f_k(Q) \ge f_k(P)$
for all k = 1, 2, ..., d.

Proof. The basic idea is very simple: Move (perturb) every vertex of P by a
very small amount, in such a way that the vertices are in general position, and
show that each k-face of P gives rise to at least one k-face of the perturbed
polytope. There are several ways of doing this proof.

We process the vertices one by one. Let V be the vertex set of P and
let $v \in V$. The operation of $\varepsilon$-pushing v is as follows: We choose a point v'
lying in the interior of P, at distance at most $\varepsilon$ from v, and on no hyperplane
determined by the points of V, and we set $V' = (V \setminus \{v\}) \cup \{v'\}$. If we
successively $\varepsilon_v$-push each vertex v of the polytope, the resulting vertex set is
in general position and we have a simplicial polytope.
It remains to show that for any polytope P with vertex set V and any
$v \in V$, there is an $\varepsilon > 0$ such that $\varepsilon$-pushing v does not decrease the number
of faces.
Let $U \subset V$ be the vertex set of a k-face of P, $0 \le k \le d-1$, and let V'
arise from V by $\varepsilon$-pushing v. If $v \notin U$, then no doubt, U determines a face of
conv(V'), and so we assume that $v \in U$. First suppose that v lies in the affine
hull of $U \setminus \{v\}$; we claim that then $U \setminus \{v\}$ determines a k-face of conv(V').
This follows easily from the criterion in Exercise 5.3.7: A subset $U \subset V$ is the
vertex set of a face of conv(V) if and only if the affine hull of U is disjoint
from $\mathrm{conv}(V \setminus U)$. We leave a detailed argument to the reader (one must use
the fact that v is pushed inside).
If v lies outside of the affine hull of $U \setminus \{v\}$, then we want to show that
$U' = (U \setminus \{v\}) \cup \{v'\}$ determines a k-face of conv(V'). The affine hull of U
is disjoint from the compact set $\mathrm{conv}(V \setminus U)$. If we move v continuously by
a sufficiently small amount, the affine hull of U moves continuously, and so
there is an $\varepsilon > 0$ such that if we move v within $\varepsilon$ from its original position,
the considered affine hull and $\mathrm{conv}(V \setminus U)$ remain disjoint. □

The h-vector and such. Here we introduce some notions extremely useful
for deeper study of the f-vectors of convex polytopes. In particular, they are
crucial in proofs of the (exact) upper bound theorem.
Let us go back to the setting of the proof of Proposition 5.5.3. There we
considered a simple polytope that used to be called P* but now, for simplicity,
let us call it P. It is positioned in $\mathbf{R}^d$ in such a way that no edge is horizontal,
and so for each vertex v, there are some $i_v$ edges going upwards and $d - i_v$
edges going downwards.
The central definition is this: The h-vector of P is $(h_0, h_1, \ldots, h_d)$, where
$h_i$ is the number of vertices v with exactly i edges going upwards. So, for
example, we have $h_0 = h_d = 1$.
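The definition can be tried out directly on a small simple polytope. The following sketch counts upward edges at each vertex of the d-dimensional cube $[0,1]^d$, whose vertices are the 0/1 vectors and whose edges join vectors differing in one coordinate. The function name and the default choice of the height function (weights 1, 2, 4, ...) are assumptions made for this illustration only.

```python
from itertools import product


def cube_h_vector(d, weights=None):
    """h-vector of the d-cube with respect to the height sum(w_k * x_k)."""
    if weights is None:
        weights = [2 ** k for k in range(d)]  # hypothetical generic direction

    def height(vertex):
        return sum(w * x for w, x in zip(weights, vertex))

    h = [0] * (d + 1)
    for v in product((0, 1), repeat=d):
        up = 0
        for k in range(d):
            u = list(v)
            u[k] = 1 - u[k]  # the neighbor of v along coordinate k
            if height(u) == height(v):
                raise ValueError("horizontal edge: choose other weights")
            if height(u) > height(v):
                up += 1
        h[up] += 1  # v has exactly `up` edges going upwards
    return h
```

For d = 3 this gives one lowest vertex with 3 upward edges, one highest vertex with none, and three vertices of each intermediate type.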
Next, we relate the h-vector to the f-vector. Each vertex v is the lowest
vertex for exactly $\binom{i_v}{k}$ faces of dimension k, and each k-face has exactly one
lowest vertex, and so

$f_k = \sum_{i=0}^{d} \binom{i}{k} h_i$        (5.1)

(for i < k we have $\binom{i}{k} = 0$). So the h-vector determines the f-vector. Less
obviously, the h-vector can be uniquely reconstructed from the f-vector! A
quick way of seeing this is via generating functions: If f(x) is the polynomial
$\sum_{k=0}^{d} f_k x^k$ and $h(x) = \sum_{i=0}^{d} h_i x^i$, then (5.1) translates to $f(x) = h(x+1)$,

and therefore $h(x) = f(x-1)$. Explicitly, we have

$h_i = \sum_{k=0}^{d} (-1)^{i+k} \binom{k}{i} f_k$.        (5.2)
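The passage between the two vectors is easy to check by machine. Below is a small sketch of both directions for a simple d-polytope: `f_from_h` evaluates the sum in (5.1), and `h_from_f` expands $h(x) = f(x-1)$ coefficient by coefficient. The function names are hypothetical; both vectors are indexed 0, ..., d, so the f-vector includes $f_d = 1$.

```python
from math import comb


def f_from_h(h):
    """f_k = sum_i C(i, k) h_i, as in (5.1); math.comb returns 0 for i < k."""
    d = len(h) - 1
    return [sum(comb(i, k) * h[i] for i in range(d + 1)) for k in range(d + 1)]


def h_from_f(f):
    """Coefficients of h(x) = f(x - 1): h_i = sum_k (-1)^(k-i) C(k, i) f_k."""
    d = len(f) - 1
    return [
        sum((-1) ** (k - i) * comb(k, i) * f[k] for k in range(i, d + 1))
        for i in range(d + 1)
    ]
```

For the 3-dimensional cube, with 8 vertices, 12 edges, 6 facets, and the cube itself, the two functions convert between (8, 12, 6, 1) and (1, 3, 3, 1).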

We have defined the h-vector using one particular choice of the vertical
direction, but now we know that it is determined by the f-vector and thus
independent of the chosen direction. By turning P upside down, we see that

$h_i = h_{d-i}$ for all i = 0, 1, ..., d.

These equalities are known as the Dehn-Sommerville relations. They include
the usual Euler formula $f_0 + f_2 = f_1 + 2$ for 3-dimensional polytopes.
Let us stress once again that all we have said about h-vectors concerns
only simple polytopes. For a simplicial polytope P, the h-vector can now be
defined as the h-vector of the dual simple polytope P*. Explicitly,

$h_j = \sum_{k=0}^{j} (-1)^{j-k} \binom{d-k}{d-j} f_{k-1}$,

where we put $f_{-1} = 1$.
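For a simplicial polytope the h-vector can thus be computed from the face numbers alone, via $h_j = \sum_{k=0}^{j} (-1)^{j-k} \binom{d-k}{d-j} f_{k-1}$ with the convention $f_{-1} = 1$. Here is a minimal sketch; the function name and the input convention (f_0, ..., f_{d-1}) are assumptions of this example.

```python
from math import comb


def simplicial_h_vector(f):
    """h-vector of a simplicial d-polytope from f = (f_0, ..., f_{d-1})."""
    d = len(f)
    shifted = [1] + list(f)  # shifted[k] = f_{k-1}, with f_{-1} = 1
    return [
        sum((-1) ** (j - k) * comb(d - k, d - j) * shifted[k] for k in range(j + 1))
        for j in range(d + 1)
    ]
```

For the octahedron, with f-vector (6, 12, 8), this returns (1, 3, 3, 1), which is symmetric as the Dehn-Sommerville relations require; for the tetrahedron it returns (1, 1, 1, 1).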
The upper bound theorem has the following neat reformulation in terms
of h-vectors: For any d-dimensional simplicial polytope with fo = n vertices,
we have
$h_i \le \binom{n-d+i-1}{i}$,  $i = 0, 1, \ldots, \lfloor d/2\rfloor$.        (5.3)
Proving the upper bound theorem is not one of our main topics, but an
outline of a proof can be found in this book. It starts in the next section
and finishes in Exercise 11.3.6, and it is not among the most direct possible
proofs. Deriving the upper bound theorem from (5.3) is a pure and direct
calculation, verifying that the h-vector of the cyclic polytope satisfies (5.3)
with equality. We omit this part.
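To get a feeling for (5.3), one can compute the largest h-vector it allows and read off the resulting bound on the number of facets. The sketch below fills in $h_i$ for $i \le \lfloor d/2\rfloor$ from (5.3) with equality and the remaining entries from $h_i = h_{d-i}$. It uses the fact that the number of facets of a simplicial polytope equals $h_0 + h_1 + \cdots + h_d$, since every vertex of the dual simple polytope is counted in exactly one $h_i$. The function names are hypothetical.

```python
from math import comb


def max_h_vector(n, d):
    """Largest h-vector permitted by (5.3) and the Dehn-Sommerville relations."""
    h = [0] * (d + 1)
    for i in range(d // 2 + 1):
        h[i] = comb(n - d + i - 1, i)
        h[d - i] = h[i]  # symmetry h_i = h_{d-i}
    return h


def max_facets(n, d):
    """Resulting upper bound on the number of facets (sum of the h_i)."""
    return sum(max_h_vector(n, d))
```

For d = 3 this evaluates to 2n - 4, the familiar maximum number of triangles of a simplicial 3-polytope with n vertices.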

Bibliography and remarks. The upper bound theorem was conjectured
by Motzkin in 1957 and proved by McMullen [McM70]. Many
partial results have been obtained in the meantime. Perhaps most notably,
Klee [Kle64] found a simple proof for polytopes with not too few
vertices (at least about $d^2$ vertices in dimension d). That proof applies
to simplicial complexes much more general than the boundary complexes
of simplicial polytopes: It works for Eulerian pseudomanifolds
and, in particular, for all simplicial spheres, i.e., simplicial complexes
homeomorphic to $S^{d-1}$. Presentations of McMullen's proof and Klee's
proof can be found in Ziegler's book [Zie94]. A nice variation was de-
scribed by Alon and Kalai [AK85]. Another proof, based on linear
programming duality and results on hyperplane arrangements, was
given by Clarkson [Cla93]. An elegant presentation of similar ideas,

using the Gale transform discussed below in Section 5.6, can be found
in Welzl [Wel01] and in Exercises 11.3.5 and 11.3.6. Our exposition of
the asymptotic upper bound theorem is based on Seidel [Sei95].
The ordering of the vertices of a simple polytope P by their height
in the definition of the h-vector corresponds to a linear ordering of the
facets of P*. This ordering of the facets is a shelling. Shelling, even
in the strictly peaceful mathematical sense, is quite important, also
beyond the realm of convex polytopes. Let K be a finite cell complex
whose cells are convex polytopes (such as the boundary complex of a
convex polytope), and suppose that all maximal cells have the same
dimension k. Such K is called shellable if k = 0 or $k \ge 1$ and K has
a shelling. A shelling of K is an enumeration $F_1, F_2, \ldots, F_n$ of the
facets (maximum-dimension cells) of K such that (i) the boundary
complex of $F_1$ is shellable, and (ii) for every i > 1, there is a shelling
of the complex $F_i \cap \bigcup_{j=1}^{i-1} F_j$ that can be extended to a shelling of the
boundary complex of $F_i$. The boundary complex of a convex polytope
is homeomorphic to a sphere, and a shelling builds the sphere in such
a way that each new cell is glued by contractible part of its boundary
to the previously built part, except for the last cell, which closes the
remaining hole.
McMullen's proof of the upper bound theorem does not generalize
to simplicial spheres (i.e., finite simplicial complexes homeomorphic
to spheres), for example because they need not be shellable, counter-
intuitive as this may look. The upper bound theorem for them was
proved by Stanley [Sta75] using much heavier algebraic and algebraic-
topological tools.
An interesting extension of the upper bound theorem was found
by Kalai [Kal91]. Let P be a simplicial d-dimensional polytope. All
proper faces of P are simplices, and so the boundary is a simplicial
complex. Let K be any subcomplex of the boundary (a subset of the
proper faces of P such that if $F \in K$, then all faces of F also lie in
K). The strong upper bound theorem, as Kalai's result is called, asserts
that if K has at least as many (d-1)-faces as the d-dimensional cyclic
polytope on n vertices, then K has at least as many k-faces as that
cyclic polytope, for all k = 0, 1, ... ,d-1. (Note that we do not assume
that P has n vertices!) The proof uses methods developed for the
proof of the g-theorem mentioned below as well as Kalai's technique
of algebraic shifting.
Another major achievement concerning the f-vectors of polytopes
is the so-called g-theorem. The inventive name g-vector of a d-dimensional
simple polytope refers to the vector $(g_0, g_1, \ldots, g_{\lfloor d/2\rfloor})$, where
$g_0 = h_0$ and $g_i = h_i - h_{i-1}$, $i = 1, 2, \ldots, \lfloor d/2\rfloor$. The g-theorem
characterizes all possible integer vectors that can appear as the g-vector
of a d-dimensional simple (or simplicial) polytope. Since the g-vector

uniquely determines the f-vector, we have a complete characterization
of f-vectors of simple polytopes. In particular, the g-theorem
guarantees that all the components of the g-vector are always nonnegative
(this fact is known as the generalized lower bound theorem),
and therefore the h-vector is unimodal: We have
$h_0 \le h_1 \le \cdots \le h_{\lfloor d/2\rfloor} = h_{\lceil d/2\rceil} \ge \cdots \ge h_d$. (On the other hand, the f-vector of a
simple polytope need not be unimodal; more exactly, it is unimodal
in dimensions up to 19, and there are 20-dimensional nonunimodal
examples.) We again refer to [Zie94] for a full statement of the g-
theorem. The proof has two independent parts; one of them, due to
Billera and Lee [BL81], constructs suitable polytopes, and the other
part, first proved by Stanley [Sta80], shows certain inequalities for all
simple polytopes. For studying the most elementary proof of the sec-
ond part currently available, one can start with McMullen [McM96]
and continue with [McM93].
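The definition of the g-vector translates into a few lines of code. The following sketch computes it from an h-vector and checks nonnegativity, which by the symmetry $h_i = h_{d-i}$ is exactly the unimodality of the h-vector discussed above. The function names are hypothetical, and the code only evaluates the definition; it does not test the further conditions of the g-theorem.

```python
def g_vector(h):
    """g_0 = h_0 and g_i = h_i - h_{i-1} for i = 1, ..., floor(d/2)."""
    d = len(h) - 1
    return [h[0]] + [h[i] - h[i - 1] for i in range(1, d // 2 + 1)]


def has_nonnegative_g(h):
    """True if all components of the g-vector are nonnegative."""
    return all(g >= 0 for g in g_vector(h))
```

For example, the h-vector (1, 3, 3, 1) of the octahedron has g-vector (1, 2).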
For nonsimple (and nonsimplicial) polytopes, a characterization
of possible f-vectors remains elusive. It seems, anyway, that the flag
vector might be a more appropriate parameter for nonsimple polytopes.
The flag vector counts, for every k = 1, 2, ..., d and for every
$i_1 < i_2 < \cdots < i_k$, the number of chains $F_1 \subset F_2 \subset \cdots \subset F_k$, where
$F_1, \ldots, F_k$ are faces with $\dim(F_j) = i_j$ (such a chain is called a flag).
No analogue of the upper bound theorem is known for centrally
symmetric polytopes. A few results concerning their face counts, ob-
tained by methods quite different from the ones for arbitrary poly-
topes, will be mentioned in Section 14.5.
The proof of Lemma 5.5.4 by pushing vertices inside is similar to
an argument in Klee [Kle64], but he proves more and presents the
proof in more detail.
Convex hull computation. What does it mean to compute the convex
hull of a given n-point set $V \subset \mathbf{R}^d$? One possible answer, briefly
touched upon in the notes to Section 5.2, is to express conv(V) as
the intersection of half-spaces and to compute the vertex sets of all
facets. (As we know, the face lattice can be reconstructed from this
information purely combinatorially; see Kaibel and Pfetsch [KP01]
for an efficient algorithm.) Of course, for some applications it may
be sufficient to know much less about the convex hull, say only the
graph of the polytope or only the list of its vertices, but here we will
discuss only algorithms for computing all the vertex-facet incidences
or the whole face lattice. For a more detailed overview of convex hull
algorithms see, e.g., Seidel [Sei97].
For the dimension d considered fixed, there is a quite simple and
practical randomized algorithm. It computes the convex hull of n
points in $\mathbf{R}^d$ in expected time $O(n^{\lfloor d/2\rfloor} + n \log n)$ (Seidel [Sei91],
simplifying Clarkson and Shor [CS89]), and also a very complicated

but deterministic algorithm with the same asymptotic running time
(Chazelle [Cha93b]; somewhat simplified in Brönnimann, Chazelle,
and Matoušek [BCM99]). This is worst-case optimal, since an n-vertex
polytope may have about $n^{\lfloor d/2\rfloor}$ facets. There are also output-sensitive
algorithms, whose running time depends on the total number f of faces
of the resulting polytope. Recent results in this direction, including an
algorithm that computes the convex hull of n points in general position
in $\mathbf{R}^d$ (d fixed) in time $O(n \log f + (nf)^{1-1/(\lfloor d/2\rfloor+1)} (\log n)^{c(d)})$,
can be found in Chan [Cha00b].
Still, none of the known algorithms is theoretically fully satisfac-
tory, and practical computation of convex hulls even in moderate di-
mensions, say 10 or 20, can be quite challenging. Some of the algo-
rithms are too complicated and with too large constants hidden in the
asymptotic notation to be of practical value. Algorithms requiring gen-
eral position of the points are problematic for highly degenerate point
configurations (which appear in many applications), since small per-
turbations used to achieve general position often increase the number
of faces tremendously. Some of the randomized algorithms compute
intermediate polytopes that can have many more faces than the fi-
nal result. Often we are interested just in the vertex-facet incidences,
but many algorithms compute all faces, whose number can be much
larger, or even a triangulation of every face, which may again increase
the complexity. Such problems of existing algorithms are discussed in
Avis, Bremner, and Seidel [ABS97].
For actual computations, simple and theoretically suboptimal al-
gorithms are often preferable. One of them is the double-description
method mentioned earlier, and another algorithm that seems to be-
have well in many difficult instances is the reverse search of Avis and
Fukuda [AF92]. It enumerates the vertices of the intersection of a given
set H of half-spaces one by one, using quite small storage. Conceptually,
one thinks of optimizing a generic linear function over $\bigcap H$ by a
simplex algorithm with Bland's rule. This defines a spanning tree in
the graph of the polytope, and this tree is searched depth-first starting
from the optimum vertex, essentially by running the simplex algorithm
"backwards." The main problem of this algorithm is with degenerate
vertices of high degree, which may correspond to an enormous number
of bases in the simplex algorithm.
Also, it sometimes helps if one knows some special properties of
the convex hull in a particular problem, say many symmetries. For ex-
ample, very extensive computations of convex hulls were performed by
Deza, Fukuda, Pasechnik, and Sato [DFPS00], who studied the metric
polytope. Before we define this interesting polytope, let us first introduce
the metric cone $M_n$. This is a set in $\mathbf{R}^{\binom{n}{2}}$ representing all metrics
on {1, 2, ..., n}, where the coordinate $x_{\{i,j\}}$ specifies the distance of

i to j, $1 \le i < j \le n$. So $M_n$ is defined by the triangle inequalities
$x_{\{i,j\}} + x_{\{j,k\}} \ge x_{\{i,k\}}$, where i, j, k are three distinct indices. The
metric polytope $m_n$ is the subset of $M_n$ defined by the additional
inequalities saying that the perimeter of each triangle is at most 2,
namely $x_{\{i,j\}} + x_{\{j,k\}} + x_{\{i,k\}} \le 2$. Deza et al. were able to enumerate
all the approximately $1.5 \cdot 10^9$ vertices of the 28-dimensional polytope
$m_8$; this may give some idea of the extent of these computational problems.
Without using many symmetries of $m_n$, a polytope of this size
would currently be out of reach. Such computations might provide in-
sight into various conjectures concerning the metric polytope, which
are important for combinatorial optimization problems (see, e.g., Deza
and Laurent [DL97] for background).
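The defining inequalities of the metric cone and the metric polytope are simple to test for a given point. Here is a minimal membership sketch; the representation of a point as a dictionary keyed by two-element `frozenset`s, and all function names, are assumptions of this example and are unrelated to the computations of Deza et al.

```python
from itertools import combinations


def triangles(n):
    """All 3-element subsets {i, j, k} of {1, ..., n}."""
    return combinations(range(1, n + 1), 3)


def side_lengths(x, i, j, k):
    return x[frozenset((i, j))], x[frozenset((j, k))], x[frozenset((i, k))]


def in_metric_cone(x, n):
    """Check all triangle inequalities (each side at most the sum of the others)."""
    for i, j, k in triangles(n):
        a, b, c = side_lengths(x, i, j, k)
        if a + b < c or b + c < a or a + c < b:
            return False
    return True


def in_metric_polytope(x, n):
    """Metric cone plus the condition that every triangle has perimeter <= 2."""
    return in_metric_cone(x, n) and all(
        sum(side_lengths(x, i, j, k)) <= 2 for i, j, k in triangles(n)
    )


def constant_metric(n, value):
    """Hypothetical helper: all pairwise distances equal to `value`."""
    return {frozenset(p): value for p in combinations(range(1, n + 1), 2)}
```

With all distances equal to 1 the point lies in the metric cone but not in the metric polytope, since every triangle has perimeter 3.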

Exercises
1. (a) Let P be a k-dimensional convex polytope in $\mathbf{R}^k$, and Q an $\ell$-dimensional
convex polytope in $\mathbf{R}^\ell$. Show that the Cartesian product $P \times Q \subset \mathbf{R}^{k+\ell}$
is a convex polytope of dimension $k + \ell$. 12]
(b) If F is an i-face of P, and G is a j-face of Q, $i, j \ge 0$, then $F \times G$ is
an (i + j)-face of $P \times Q$. Moreover, this yields all the nonempty faces of
$P \times Q$. 0
(c) Using the product of suitable polytopes, find an example of a "fat-lattice"
polytope, i.e., a polytope for which the total number of faces has
a larger order of magnitude than the number of vertices plus the number
of facets together (the dimension should be a constant). 0
(d) Show that the following yields a 5-dimensional fat-lattice polytope:
The convex hull of two regular n-gons whose affine hulls are skew 2-flats
in $\mathbf{R}^5$. 0
For recent results on fat-lattice polytopes see Eppstein, Kuperberg, and
Ziegler [EKZ01].

5.6 The Gale Transform


On a very general level, the Gale transform resembles the duality transform
defined in Section 5.1. Both convert a (finite) geometric configuration into
another geometric configuration, and they may help uncover some properties
of the original configuration by making them more apparent, or easier to
visualize, in the new configuration. The Gale transform is more complicated
to explain and probably more difficult to get used to, but it seems worth the
effort. It was invented for studying high-dimensional convex polytopes, and
recently it has been used for solving problems about point configurations by
relating them to advanced theorems on convex polytopes. It is also closely
related to the duality of linear programming (see Section 10.1), but we will
not elaborate on this connection here.

The Gale transform assigns to a sequence $a = (a_1, a_2, \ldots, a_n)$ of $n \ge d+1$
points in $\mathbf{R}^d$ another sequence $\bar g = (\bar g_1, \bar g_2, \ldots, \bar g_n)$ of n points. The points
$\bar g_1, \bar g_2, \ldots, \bar g_n$ live in a different dimension, namely in $\mathbf{R}^{n-d-1}$. For example,
n points in the plane are transformed to n points in $\mathbf{R}^{n-3}$ and vice versa.
In the literature one finds many results about k-dimensional polytopes with
k+3 or k+4 vertices; this is because their vertex sets have a low-dimensional
Gale transform.
Let us stress that the Gale transform operates on sequences, not individual
points: We cannot say what $\bar g_1$ is without knowing all of $a_1, a_2, \ldots, a_n$. We
also require that the affine hull of the $a_i$ be the whole $\mathbf{R}^d$; otherwise, the
Gale transform is not defined. (On the other hand, we do not need any sort
of general position, and some of the $a_i$ may even coincide.)
The reader might wonder why the points of the Gale transform are written
with bars. This is to indicate that they should be interpreted as vectors
in a vector space, rather than as points in an affine space. As we will see,
"affine" properties of the sequence a, such as affine dependencies, correspond
to "linear" properties of the Gale transform, such as linear dependencies.
In order to obtain the Gale transform of a, we first convert the $a_i$ into
(d+1)-dimensional vectors: $\bar a_i \in \mathbf{R}^{d+1}$ is obtained from $a_i$ by appending a
(d+1)st coordinate equal to 1. This is the embedding $\mathbf{R}^d \to \mathbf{R}^{d+1}$ often used
for relating affine notions in $\mathbf{R}^d$ to linear notions in $\mathbf{R}^{d+1}$; see Section 1.1.
Let A be the $(d+1) \times n$ matrix with $\bar a_i$ as the ith column. Since we assume that
there are d+1 affinely independent points in a, the matrix A has rank d+1,
and so the vector space V generated by the rows of A is a (d+1)-dimensional
subspace of $\mathbf{R}^n$. We let $V^\perp$ be the orthogonal complement of V in $\mathbf{R}^n$; that is,
$V^\perp = \{w \in \mathbf{R}^n : \langle v, w \rangle = 0 \text{ for all } v \in V\}$. We have $\dim(V^\perp) = n-d-1$. Let
us choose some basis $(b_1, b_2, \ldots, b_{n-d-1})$ of $V^\perp$, and let B be the $(n-d-1) \times n$
matrix with $b_j$ as the jth row. Finally, we let $\bar g_i \in \mathbf{R}^{n-d-1}$ be the ith column
of B. The sequence $\bar g = (\bar g_1, \bar g_2, \ldots, \bar g_n)$ is the Gale transform of a. Here is a
pictorial summary:
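[Figure: the point sequence $a_1, \ldots, a_n$ written as the columns of a matrix with d rows; appending a row of 1's gives a matrix with d+1 rows; the n-d-1 rows of a basis of the orthogonal complement form a matrix whose columns are the Gale transform]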
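The construction just described can be carried out mechanically. The sketch below does so in exact rational arithmetic: it builds the coordinate rows and the all-ones row of the matrix A, computes a basis of the orthogonal complement of its row space by Gaussian elimination, and reads off the columns. The function names are hypothetical, and the particular basis produced is only one of many admissible choices.

```python
from fractions import Fraction


def null_space(rows):
    """Basis of {w : M w = 0} for the matrix M given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0])
    pivots = []  # pivots[r] is the pivot column of reduced row r
    r = 0
    for c in range(n_cols):
        if r == len(m):
            break
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: it is a free column
        m[r], m[p] = m[p], m[r]
        pivot = m[r][c]
        m[r] = [x / pivot for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        w = [Fraction(0)] * n_cols
        w[free] = Fraction(1)
        for row_index, pc in enumerate(pivots):
            w[pc] = -m[row_index][free]
        basis.append(w)
    return basis


def gale_transform(points):
    """One Gale transform of a point sequence in R^d, as a list of n vectors."""
    n, d = len(points), len(points[0])
    a_matrix = [[p[k] for p in points] for k in range(d)] + [[1] * n]
    b_rows = null_space(a_matrix)
    if len(b_rows) != n - d - 1:
        raise ValueError("the points do not affinely span the whole space")
    return [tuple(b[i] for b in b_rows) for i in range(n)]
```

For the four vertices of the unit square, listed in cyclic order, the result is one-dimensional and proportional to (1, -1, 1, -1), reflecting the affine dependence $a_1 - a_2 + a_3 - a_4 = 0$.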

5.6.1 Observation.
(i) (The Gale transform is determined up to linear isomorphism) In the
construction of $\bar g$, we can choose an arbitrary basis of $V^\perp$. Choosing a
different basis corresponds to multiplying the matrix B from the left by a
nonsingular $(n-d-1) \times (n-d-1)$ matrix T (Exercise 1), and this means
transforming $(\bar g_1, \ldots, \bar g_n)$ by a linear isomorphism of $\mathbf{R}^{n-d-1}$.

(ii) A sequence $\bar g$ in $\mathbf{R}^{n-d-1}$ is the Gale transform of some a if and only if
it spans $\mathbf{R}^{n-d-1}$ and has 0 as the center of gravity: $\sum_{i=1}^{n} \bar g_i = 0$.
(iii) Let us consider a sequence $\bar g$ in $\mathbf{R}^{n-d-1}$ satisfying the condition in (ii).
If we interpret it as a point sequence (breaking the convention that the
result of the Gale transform should be thought of as a sequence of vectors),
apply the Gale transform to it, again consider the result as a point
sequence, and apply the Gale transform the second time, we recover the
original $\bar g$, up to linear isomorphism (Exercise 5).

Two ways of probing a configuration. We would like to set up a dictionary
for translating between geometric properties of a sequence a and those
of its Gale transform. First we discuss how some familiar geometric properties
of a configuration of points or vectors are reflected in the values of affine
or linear functions on the configuration, and how they manifest themselves
in affine or linear dependencies. For a sequence $\bar a = (\bar a_1, \ldots, \bar a_n)$ of vectors in
$\mathbf{R}^{d+1}$, we define two vector subspaces of $\mathbf{R}^n$:

$\mathrm{LinVal}(\bar a) = \{(f(\bar a_1), f(\bar a_2), \ldots, f(\bar a_n)) : f\colon \mathbf{R}^{d+1} \to \mathbf{R} \text{ is a linear function}\}$,
$\mathrm{LinDep}(\bar a) = \{\alpha \in \mathbf{R}^n : \alpha_1 \bar a_1 + \alpha_2 \bar a_2 + \cdots + \alpha_n \bar a_n = 0\}$.

For a point sequence $a = (a_1, \ldots, a_n)$, we then let $\mathrm{AffVal}(a) = \mathrm{LinVal}(\bar a)$ and
$\mathrm{AffDep}(a) = \mathrm{LinDep}(\bar a)$, where $\bar a$ is obtained from a as above, by appending
1's. Another description is

$\mathrm{AffVal}(a) = \{(f(a_1), f(a_2), \ldots, f(a_n)) : f\colon \mathbf{R}^d \to \mathbf{R} \text{ is an affine function}\}$,
$\mathrm{AffDep}(a) = \{\alpha \in \mathbf{R}^n : \alpha_1 a_1 + \cdots + \alpha_n a_n = 0,\ \alpha_1 + \cdots + \alpha_n = 0\}$.

The knowledge of $\mathrm{LinVal}(\bar a)$ tells us a lot about $\bar a$, and we only have to
learn to decode the information. As usual, we assume that $\bar a$ linearly spans
all of $\mathbf{R}^{d+1}$.
Each nonzero linear function $f\colon \mathbf{R}^{d+1} \to \mathbf{R}$ determines the linear hyperplane
$h_f = \{x \in \mathbf{R}^{d+1} : f(x) = 0\}$ (by a linear hyperplane we mean a
hyperplane passing through 0). This $h_f$ is oriented (one of its half-spaces is
positive and the other negative), and the sign of $f(\bar a_i)$ determines whether $\bar a_i$
lies on $h_f$, on its positive side, or on its negative side.

[Figure: an oriented linear hyperplane $h_f$ with its positive side f(x) > 0 and its negative side f(x) < 0]

We begin our decoding of the properties of $\bar a$ with the property "spanning
a linear hyperplane." That is, we choose our favorite index set $I \subseteq$

$\{1, 2, \ldots, n\}$, and we ask whether the points of the subsequence $\bar a_I = (\bar a_i : i \in I)$
span a linear hyperplane. First, we observe that they lie in a common linear
hyperplane if and only if there is a nonzero $\varphi \in \mathrm{LinVal}(\bar a)$ such that $\varphi_i = 0$ for
all $i \in I$. It could still happen that all of $\bar a_I$ lies in a lower-dimensional linear
subspace. Using the assumption that $\bar a$ spans $\mathbf{R}^{d+1}$, it is not difficult to see
that $\bar a_I$ spans a linear hyperplane if and only if all nonzero $\varphi \in \mathrm{LinVal}(\bar a)$ that vanish
on $\bar a_I$ have identical zero sets; that is, the set $\{i : \varphi_i = 0\}$ is the same for all
such $\varphi$. If we know that $\bar a_I$ spans a linear hyperplane, we can also see how
the other vectors in $\bar a$ are distributed with respect to this linear hyperplane.
Analogously, knowing AffVal(a), we can determine which subsequences of
a span (affine) hyperplanes and how the other points are partitioned by these
hyperplanes. For example, we can tell whether there are some d+ 1 points on
a common hyperplane, and so we know whether a is in general position. As a
more complicated example, let P = conv(a). We can read off from AffVal(a)
which of the ai are the vertices of P, and also the whole face lattice of P
(Exercise 6).
Similar information can be inferred from AffDep(a) (exactly the same
information, in fact, since $\mathrm{AffDep}(a) = \mathrm{AffVal}(a)^\perp$; see Exercise 7). For
an $\alpha \in \mathrm{AffDep}(a)$ let $I_+(\alpha) = \{i \in \{1, 2, \ldots, n\} : \alpha_i > 0\}$ and
$I_-(\alpha) = \{i \in \{1, 2, \ldots, n\} : \alpha_i < 0\}$. As we learned in the proof of Radon's lemma
(Lemma 1.3.1), $I_+ = I_+(\alpha)$ and $I_- = I_-(\alpha)$ correspond to Radon partitions
of a. Namely, $\sum_{i \in I_+} \alpha_i a_i = \sum_{i \in I_-} (-\alpha_i) a_i$, and dividing by
$\sum_{i \in I_+} \alpha_i = \sum_{i \in I_-} (-\alpha_i)$, we have convex combinations on both sides, and so
$\mathrm{conv}(a_{I_+}) \cap \mathrm{conv}(a_{I_-}) \ne \emptyset$. Conversely, if $I_1$ and $I_2$ are disjoint index sets with
$\mathrm{conv}(a_{I_1}) \cap \mathrm{conv}(a_{I_2}) \ne \emptyset$, then there is a nonzero $\alpha \in \mathrm{AffDep}(a)$ with $I_+(\alpha) \subseteq I_1$ and
$I_-(\alpha) \subseteq I_2$. For example, $a_i$ is a vertex of conv(a) if and only if there is no
$\alpha \in \mathrm{AffDep}(a)$ with $I_+(\alpha) = \{i\}$.
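The passage from an affine dependence to a Radon partition is a short computation. The following sketch takes a point sequence and a nonzero affine dependence, and returns the index sets of positive and negative coefficients together with the common point of the two convex hulls obtained by the division step described above. The function name is hypothetical, and indices are 0-based in the code.

```python
from fractions import Fraction


def radon_partition(points, alpha):
    """Return (I_plus, I_minus, common point) for a nonzero affine dependence."""
    d = len(points[0])
    if sum(alpha) != 0 or all(a == 0 for a in alpha):
        raise ValueError("alpha must be nonzero with coefficients summing to 0")
    for k in range(d):
        if sum(a * p[k] for a, p in zip(alpha, points)) != 0:
            raise ValueError("alpha is not an affine dependence of the points")
    plus = [i for i, a in enumerate(alpha) if a > 0]
    minus = [i for i, a in enumerate(alpha) if a < 0]
    total = sum(Fraction(alpha[i]) for i in plus)
    # Dividing the positive part by its total weight gives a convex combination.
    common = tuple(
        sum(Fraction(alpha[i]) * points[i][k] for i in plus) / total
        for k in range(d)
    )
    return plus, minus, common
```

For the vertices of the unit square in cyclic order and the dependence (1, -1, 1, -1), the two classes are the two diagonals, and the common point is the center (1/2, 1/2).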
For a sequence $\bar a$ of vectors, linear dependencies correspond to expressing
0 as a convex combination. Namely, for disjoint index sets $I_1$ and $I_2$, we
have $0 \in \mathrm{conv}(\{\bar a_i : i \in I_1\} \cup \{-\bar a_i : i \in I_2\})$ if and only if there is a nonzero
$\alpha \in \mathrm{LinDep}(\bar a)$ with $I_+(\alpha) \subseteq I_1$ and $I_-(\alpha) \subseteq I_2$.
Together with these geometric interpretations of $\mathrm{LinVal}(\bar a)$, AffVal(a),
$\mathrm{LinDep}(\bar a)$, and AffDep(a), the following lemma (whose proof is left to
Exercise 8) allows us to translate properties of point configurations to those of
their Gale transforms.
5.6.2 Lemma. Let a be a sequence of n points in $\mathbf{R}^d$ whose points affinely
span $\mathbf{R}^d$, and let $\bar g$ be its Gale transform. Then $\mathrm{LinVal}(\bar g) = \mathrm{AffDep}(a)$ and
$\mathrm{LinDep}(\bar g) = \mathrm{AffVal}(a)$. □

So a Radon partition of a corresponds to a partition of $\bar g$ by a linear
hyperplane, and a partition of a by a hyperplane translates to a linear
dependence (i.e., a "linear Radon partition") of $\bar g$.
Let us list several interesting connections, again leaving the simple but
instructive proofs to the reader.

5.6.3 Corollary (Dictionary of the Gale transform).


(i) (Lying in a common hyperplane) For every (d+1)-point index set
$I \subseteq \{1, 2, \ldots, n\}$, the points $a_i$ with $i \in I$ lie in a common hyperplane if and
only if all the vectors $\bar g_j$ with $j \notin I$ lie in a common linear hyperplane.
(ii) (General position) In particular, the points of a are in general position
(no d+1 on a common hyperplane) if and only if every n-d-1 vectors
among $\bar g_1, \ldots, \bar g_n$ span $\mathbf{R}^{n-d-1}$ (which is a natural condition of general
position for vectors).
(iii) (Faces of the convex hull) The points $a_i$ with $i \in I$ are contained in a
common facet of P = conv(a) if and only if $0 \in \mathrm{conv}\{\bar g_j : j \notin I\}$. In particular,
if P is a simplicial polytope, then its k-faces exactly correspond
to complements of the (n-k-1)-element subsets of $\bar g$ containing 0 in the
convex hull.
(iv) (Convex independence) The $a_i$ form a convex independent set if and only
if there is no oriented linear hyperplane with exactly one of the $\bar g_j$ on the
positive side.
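Part (iii) of the dictionary is easiest to try out when n = d+2, so that the Gale transform is a sequence of real numbers and 0 lies in the convex hull of some of them exactly when they are not all positive and not all negative. The sketch below is restricted to this one-dimensional case; the function names are hypothetical, and indices are 0-based.

```python
from itertools import combinations


def in_common_facet(gale_1d, index_set):
    """Test (iii) for a one-dimensional Gale transform given as numbers."""
    rest = [g for j, g in enumerate(gale_1d) if j not in index_set]
    # 0 is in the convex hull of a nonempty set of reals iff min <= 0 <= max.
    return bool(rest) and min(rest) <= 0 <= max(rest)


def facet_pairs(gale_1d):
    """All pairs of indices whose points lie in a common facet."""
    n = len(gale_1d)
    return [
        pair for pair in combinations(range(n), 2) if in_common_facet(gale_1d, pair)
    ]
```

With the Gale transform (1, -1, 1, -1) of the unit square, `facet_pairs` returns the four sides of the square and rejects the two diagonals.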

Here is, finally, a picture of a 3-dimensional convex polytope with 6 vertices
and the (planar) Gale transform of its vertex set:

[Figure: a 3-dimensional convex polytope with vertices $a_1, \ldots, a_6$ and the planar Gale transform $\bar g_1, \ldots, \bar g_6$ of its vertex set]

For example, the facet $a_1 a_2 a_5 a_6$ is reflected by the complementary pair $\bar g_3, \bar g_4$
of parallel oppositely oriented vectors, and so on.
Signs suffice. As was noted above, in order to find out whether some
$a_i$ is a vertex of conv(a), we ask whether there is an $\alpha \in \mathrm{AffDep}(a)$ with
$I_+(\alpha) = \{i\}$. Only the signs of the vectors in AffDep(a) are important here,
and this is the case with all the combinatorial-geometric information about
point sequences or vector sequences in Corollary 5.6.3. For such purposes,
the knowledge of $\mathrm{sgn}(\mathrm{AffDep}(a)) = \{(\mathrm{sgn}(\alpha_1), \ldots, \mathrm{sgn}(\alpha_n)) : \alpha \in \mathrm{AffDep}(a)\}$
is as good as the knowledge of AffDep(a).
We can thus declare two sequences a and b combinatorially isomorphic if
sgn(AffDep(a)) = sgn(AffDep(b)) and sgn(AffVal(a)) = sgn(AffVal(b)).² We
will hear a little more about this notion of combinatorial isomorphism in
Section 9.3 when we discuss order types, and also in the notes to Section 6.2
in connection with oriented matroids.

2 It is nontrivial but true that either of these equalities implies the other one.

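Here we need only one very special case: If $\bar g = (\bar g_1, \ldots, \bar g_n)$ is a sequence
of vectors, $t_1, \ldots, t_n > 0$ are positive real numbers, and $\bar g' = (t_1 \bar g_1, \ldots, t_n \bar g_n)$,
then clearly,

$\mathrm{sgn}(\mathrm{LinVal}(\bar g)) = \mathrm{sgn}(\mathrm{LinVal}(\bar g'))$ and $\mathrm{sgn}(\mathrm{LinDep}(\bar g)) = \mathrm{sgn}(\mathrm{LinDep}(\bar g'))$,

and so $\bar g$ and $\bar g'$ are combinatorially isomorphic vector configurations.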
Affine Gale diagrams. We have seen a certain asymmetry of the Gale
transform: While the sequence a is interpreted affinely, as a point sequence,
its Gale transform needs to be interpreted linearly, as a sequence of vectors
(with 0 playing a special role). Could one reduce the dimension of $\bar g$ by 1 and
pass to an "affine version" of the Gale transform? This is indeed possible, but
one has to distinguish "positive" and "negative" points in the affine version.
Let $\bar g$ be the Gale transform of some a, $\bar g_1, \ldots, \bar g_n \in \mathbf{R}^{n-d-1}$. Let us
assume for simplicity that all the $\bar g_i$ are nonzero. We choose a hyperplane h
not parallel to any of the $\bar g_i$ and not passing through 0, and we project the
$\bar g_i$ centrally from 0 into h, obtaining points $g_1, \ldots, g_n \in h \cong \mathbf{R}^{n-d-2}$. If $g_i$
lies on the same side of 0 as $\bar g_i$, i.e., if $g_i = t_i \bar g_i$ with $t_i > 0$, we set $\sigma_i = +1$,
and call $g_i$ a positive point. For $g_i$ lying on the other side of 0 than $\bar g_i$ we
let $\sigma_i = -1$, and we call $g_i$ a negative point. Here is an example with the
2-dimensional Gale transform from the previous drawing:

[Figure: the planar Gale transform from the previous drawing projected centrally onto a line, giving an affine Gale diagram]

The positive $g_i$ are marked by full circles, the negative ones by empty circles,
and we have borrowed the (incomplete) yin-yang symbol for marking the
positions shared by one positive and one negative point. This sequence g of
positive and negative points in $\mathbf{R}^{n-d-2}$, or more formally the pair $(g, \sigma)$,
is called an affine Gale diagram of a. It conveys the same combinatorial
information as $\bar g$, although we cannot reconstruct a from it up to linear
isomorphism, as was the case with $\bar g$. (For this reason, we speak of Gale
diagram rather than Gale transform.) One has to get used to interpreting
the positive and negative points properly. If we put

$\mathrm{AffVal}(g, \sigma) = \{(\sigma_1 f(g_1), \ldots, \sigma_n f(g_n)) : f\colon \mathbf{R}^{n-d-2} \to \mathbf{R} \text{ affine}\}$,
$\mathrm{AffDep}(g, \sigma) = \{\alpha \in \mathbf{R}^n : \sum_{i=1}^{n} \sigma_i \alpha_i g_i = 0,\ \sum_{i=1}^{n} \sigma_i \alpha_i = 0\}$,

then, as is easily checked,
then, as is easily checked,
5.6 The Gale Transform 113

sgn(AffDep(g, a)) = sgn(LinDep(g)) and sgn(AfNal(g, a)) = sgn(LinVal(g)).


Here is a reinterpretation of Corollary 5.6.3 in terms of the affine Gale dia-
gram.

5.6.4 Proposition (Dictionary of affine Gale diagrams). Let $a$ be a
sequence of $n$ points in $R^d$, let $\bar g$ be the Gale transform of $a$, and assume that
all the $\bar g_i$ are nonzero. Let $(g,\sigma)$ be an affine Gale diagram of $a$ in $R^{n-d-2}$.
(i) A subsequence $a_I$ lies in a common facet of conv($a$) if and only if
$\mathrm{conv}(\{g_j : j \notin I, \sigma_j = 1\}) \cap \mathrm{conv}(\{g_j : j \notin I, \sigma_j = -1\}) \ne \emptyset$.
(ii) The points of $a$ are in convex position if and only if for every oriented
hyperplane in $R^{n-d-2}$, the number of positive points of $g$ on its positive
side plus the number of negative points of $g$ on its negative side is at
least 2. □

So far we have assumed that $\bar g_i \ne 0$ for all $i$. This need not hold in general,
and points $\bar g_i = 0$ need a special treatment in the affine Gale diagram: They
are called the special points, and for a full specification of the affine Gale
diagram, we draw the positive and negative points and give the number
of special points. It is easy to find out how the presence of special points
influences the conditions in the previous proposition.
A nonrational polytope. Configurations of $k+4$ points in $R^k$ have planar
affine Gale diagrams. This leads to many interesting constructions of
$k$-dimensional convex polytopes with $k+4$ vertices. Here we give just one example: an
8-dimensional polytope with 12 vertices that cannot be realized with rational
coordinates; that is, no polytope with isomorphic face lattice has all vertex
coordinates rational. First one has to become convinced that if 9 distinct
points are placed in $R^2$ so that they are not all collinear and there are collinear
triples and 4-tuples as is marked by segments in the left drawing below,

then not all coordinates of the points can be rational. We omit the proof,
which has little to do with the Gale transform or convex polytopes.
Next, we declare some points negative, some positive, and some both
positive and negative, as in the right drawing, obtaining 12 points. These
points have a chance of being an affine Gale diagram of the vertex set of
an 8-dimensional convex polytope, since condition (ii) in Proposition 5.6.4
is satisfied. How do we construct such a polytope? For $g_i = (x_i, y_i)$, we put
$\bar g_i = (t_ix_i, t_iy_i, t_i) \in R^3$, choosing $t_i > 0$ for positive $g_i$ and $t_i < 0$ for negative
$g_i$, in such a way that $\sum_{i=1}^{12} \bar g_i = 0$. Then the Gale transform of $\bar g$ is the vertex
set of the desired convex polytope $P$ (see Observation 5.6.1(ii) and (iii)).
Let $P'$ be some convex polytope with an isomorphic face lattice and let
$(g',\sigma')$ be an affine Gale diagram of its vertex set $a'$. We have, for example,
$g'_7 = g'_{10}$ because $\{a'_i : i \ne 7, 10\}$ form a facet of $P'$, and similarly for
the other point coincidences. The triple g~, g~2' g~ (where g~ is positive) is
collinear, because $\{a'_i : i \ne 1, 8, 12\}$ is a facet. In this way, we see that the
point coincidences and collinearities are preserved, and so no affine Gale
diagram of $P'$ can have all coordinates rational. At the same time, by checking
the definition, we see that a point sequence with rational coordinates has at
least one affine Gale diagram with rational coordinates. Thus, $P$ cannot be
realized with rational coordinates.

Bibliography and remarks. Gale diagrams and the Gale transform
emerged from the work of Gale [Gal56] and were further developed
by Perles, as is documented in [Grü67] (also see, e.g., [MS71]). Our
exposition essentially follows Ziegler's book [Zie94] (his treatment is
combined with an introduction to oriented matroids). We aim at con-
creteness, and so, for example, the Gale transform is defined using the
orthogonal complement, although it might be mathematically more
elegant to work with the annihilator in the dual space (Rn)*, and so
on. The construction of an irrational 8-polytope is due to Perles.
In Section 11.3 (Exercise 6) we mention an interpretation of the
h-vector of a simplicial convex polytope via the Gale transform. Using
this correspondence, Wagner and Welzl [WW01] found an interesting
continuous analogue of the upper bound theorem, which speaks about
probability distributions in $R^d$. For other recent applications of a
similar correspondence see the notes to Section 11.3.

Exercises
1. Let $E$ be a $k \times n$ matrix of rank $k \le n$. Check that for any $k \times n$ matrix $E'$
whose rows generate the same vector space as the rows of $E$, there exists
a nonsingular $k \times k$ matrix $T$ with $E' = TE$. Infer that if $\bar g = (\bar g_1, \ldots, \bar g_n)$
is a Gale transform of $a$, then any other Gale transform of $a$ has the form
$(T\bar g_1, T\bar g_2, \ldots, T\bar g_n)$ for a nonsingular square matrix $T$. [I]
2. Let $a$ be a sequence of $d+1$ affinely independent points in $R^d$. What is
the Gale transform of $a$, and what are AffVal($a$) and AffDep($a$)? IT]
3. Let 9 be a Gale transform of the vertex set of a convex polytope PeRd,
and let Ii be obtained from 9 by appending the zero vector. Check that
Ii is again a Gale transform of a convex independent set. What is the
relation of this set to P? [I]
4. Using affine Gale diagrams, count the number of classes of combinatorial
equivalence of $d$-dimensional convex polytopes with $d+2$ vertices. How
many of them are simple, and how many simplicial? 0
5. Verify the characterization in Observation 5.6.1(ii) of sequences $\bar g$ in
$R^{n-d-1}$ that are Gale transforms of some $a$, and check that if the Gale
transform is applied twice to such $\bar g$, we obtain $\bar g$ up to linear
isomorphism. 0
6. Let $a = (a_1, \ldots, a_n)$ be a point sequence in $R^d$ whose affine hull is all of
$R^d$, and let $P = \mathrm{conv}\{a_1, \ldots, a_n\}$.
Given AffVal($a$), explain how we can determine which of the $a_i$ are the
vertices of $P$ and how we reconstruct the face lattice of $P$. 0
7. Let $a$ be a sequence of $n$ vectors in $R^{d+1}$ that spans $R^{d+1}$.
(a) Find dim LinVal($a$) and dim LinDep($a$). 0
(b) Check that LinVal($a$) is the orthogonal complement of LinDep($a$). 0
8. Prove Lemma 5.6.2. 0
9. Verify Corollary 5.6.3. 0

5.7 Voronoi Diagrams


Consider a finite set $P \subset R^d$. For each point $p \in P$, we define a region reg($p$),
which is the "sphere of influence" of the point $p$: It consists of the points
$x \in R^d$ for which $p$ is the closest point among the points of $P$. Formally,

$\mathrm{reg}(p) = \{x \in R^d : \mathrm{dist}(x,p) \le \mathrm{dist}(x,q) \text{ for all } q \in P\}$,

where dist($x,y$) denotes the Euclidean distance of the points $x$ and $y$. The
Voronoi diagram of $P$ is the set of all regions reg($p$) for $p \in P$. (More precisely,
it is the cell complex induced by these regions; that is, every intersection of
a subset of the regions is a face of the Voronoi diagram.) Here is an example of
the Voronoi diagram of a point set in the plane:



(Of course, the Voronoi diagram is clipped by a rectangle so that it fits into a
finite page.) The points of P are traditionally called the sites in the context
of Voronoi diagrams.
5.7.1 Observation. Each region reg($p$) is a convex polyhedron with at most
$|P|-1$ facets.

Indeed,

$\mathrm{reg}(p) = \bigcap_{q \in P \setminus \{p\}} \{x : \mathrm{dist}(x,p) \le \mathrm{dist}(x,q)\}$

is an intersection of $|P|-1$ half-spaces. □
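The half-space description in Observation 5.7.1 can be made explicit: squaring both sides, dist(x,p) ≤ dist(x,q) is equivalent to the linear inequality 2(q−p)·x ≤ |q|² − |p|². The following is a minimal Python sketch of this rewriting; the function names `region_halfspaces` and `in_region` are illustrative choices, not notation from the text.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def region_halfspaces(p: Point, sites: Sequence[Point]) -> List[Tuple[Point, float]]:
    """Return pairs (a, b); reg(p) is the set of x with a.x <= b for every pair."""
    halfspaces = []
    for q in sites:
        if q == p:
            continue  # the site p itself contributes no constraint
        # dist(x, p) <= dist(x, q)  <=>  2 (q - p) . x <= |q|^2 - |p|^2
        a = tuple(2.0 * (qi - pi) for pi, qi in zip(p, q))
        b = sum(qi * qi for qi in q) - sum(pi * pi for pi in p)
        halfspaces.append((a, b))
    return halfspaces


def in_region(x: Point, p: Point, sites: Sequence[Point]) -> bool:
    """Test whether x lies in reg(p) by checking all |P| - 1 half-spaces."""
    return all(
        sum(ai * xi for ai, xi in zip(a, x)) <= b
        for a, b in region_halfspaces(p, sites)
    )
```

For three sites the region of each site is cut out by two half-spaces, in agreement with the bound $|P|-1$.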


For d = 2, a Voronoi diagram of n points is a subdivision of the plane
into n convex polygons (some of them are unbounded). It can be regarded as
a drawing of a planar graph (with one vertex at the infinity, say), and hence
it has a linear combinatorial complexity: n regions, O(n) vertices, and O(n)
edges.
In the literature the Voronoi diagram also appears under various other
names, such as the Dirichlet tessellation.
Examples of applications. Voronoi diagrams have been reinvented and
used in various branches of science. Sometimes the connections are surprising.
For instance, in archaeology, Voronoi diagrams help study cultural influences.
Here we mention a few applications, mostly algorithmic.
• ("Post office problem" or nearest neighbor searching) Given a point set
P in the plane, we want to construct a data structure that finds the point
of P nearest to a given query point x as quickly as possible. This prob-
lem arises directly in some practical situations or, more significantly, as
a subroutine in more complicated problems. The query can be answered
by determining the region of the Voronoi diagram of P containing x. For
this problem (point location in a subdivision of the plane), efficient data
structures are known; see, e.g., the book [dBvKOS97] or other
introductory texts on computational geometry.
• (Robot motion planning) Consider a disk-shaped robot in the plane. It
should pass among a set P of point obstacles, getting from a given start
position to a given target position and touching none of the obstacles.

If such a passage is possible at all, the robot can always walk along
the edges of the Voronoi diagram of P, except for the initial and final
segments of the tour. This allows one to reduce the robot motion problem
to a graph search problem: We define a subgraph of the Voronoi diagram
consisting of the edges that are passable for the robot.
• (A nice triangulation: the Delaunay triangulation) Let P C R2 be a finite
point set. In many applications one needs to construct a triangulation of
P (that is, to subdivide conv(P) into triangles with vertices at the points
of P) in such a way that the triangles are not too skinny. Of course, for
some sets, some skinny triangles are necessary, but we want to avoid
them as much as possible. One particular triangulation that is usually
very good, and provably optimal with respect to several natural criteria,
is obtained as the dual graph to the Voronoi diagram of P. Two points
of P are connected by an edge if and only if their Voronoi regions share
an edge.

If no 4 points of $P$ lie on a common circle then this indeed defines a
triangulation, called the Delaunay triangulation³ of $P$; see Exercise 5.
The definition extends to point sets in $R^d$ in a straightforward manner.
• (Interpolation) Suppose that $f\colon R^2 \to R$ is some smooth function whose
values are known to us only at the points of a finite set $P \subset R^2$. We
would like to interpolate f over the whole polygon conv(P). Of course,
we cannot really tell what f looks like outside P, but still we want a
reasonable interpolation rule that provides a nice smooth function with
the given values at P. Multidimensional interpolation is an extensive
semiempirical discipline, which we do not seriously consider here; we
explain only one elegant method based on Voronoi diagrams. To compute
the interpolated value at a point $x \in$ conv($P$), we construct the Voronoi
diagram of $P$, and we overlay it with the Voronoi diagram of $P \cup \{x\}$.

3 Being a transcription from Russian, the spelling of Delaunay's name varies in
the literature. For example, in crystallography literature he is usually spelled
"Delone."
The region of the new point $x$ cuts off portions of the regions of some of
the old points. Let $w_p$ be the area of the part of reg($p$) in the Voronoi
diagram of $P$ that belongs to reg($x$) after inserting $x$. The interpolated
value $f(x)$ is

$f(x) = \sum_{p \in P} \frac{w_p}{\sum_{q \in P} w_q} f(p)$.

An analogous method can be used in higher dimensions, too.

Relation of Voronoi diagrams to convex polyhedra. We now show that
Voronoi diagrams in $R^d$ correspond to certain convex polyhedra in $R^{d+1}$.
First we define the unit paraboloid in $R^{d+1}$:

$U = \{x \in R^{d+1} : x_{d+1} = x_1^2 + x_2^2 + \cdots + x_d^2\}$.

For $d = 1$, $U$ is a parabola in the plane.

In the sequel, let us imagine the space $R^d$ as the hyperplane $x_{d+1} = 0$ in
$R^{d+1}$. For a point $p = (p_1, \ldots, p_d) \in R^d$, let $e(p)$ denote the hyperplane in
$R^{d+1}$ with equation

$x_{d+1} = 2p_1x_1 + 2p_2x_2 + \cdots + 2p_dx_d - p_1^2 - p_2^2 - \cdots - p_d^2$.

Geometrically, $e(p)$ is the hyperplane tangent to the paraboloid $U$ at the point
$u(p) = (p_1, p_2, \ldots, p_d, p_1^2 + \cdots + p_d^2)$ lying vertically above $p$. It is perhaps
easier to remember this geometric definition of $e(p)$ and derive its equation
by differentiation when needed. On the other hand, in the forthcoming proof
we start out from the equation of $e(p)$, and as a by-product, we will see that
$e(p)$ is the tangent to $U$ at $u(p)$ as claimed.
5.7.2 Proposition. Let $p, x \in R^d$ be points and let $u(x)$ be the point of $U$
vertically above $x$. Then $u(x)$ lies above the hyperplane $e(p)$ or on it, and the
vertical distance of $u(x)$ to $e(p)$ is $\delta^2$, where $\delta = \mathrm{dist}(x,p)$.

[Figure: the hyperplane $e(p)$ and the hyperplane $x_{d+1} = 0$]

Proof. We just substitute into the equations of $U$ and of $e(p)$. The
$x_{d+1}$-coordinate of $u(x)$ is $x_1^2 + \cdots + x_d^2$, while the $x_{d+1}$-coordinate of the point
of $e(p)$ above $x$ is $2p_1x_1 + \cdots + 2p_dx_d - p_1^2 - \cdots - p_d^2$. The difference is
$(x_1-p_1)^2 + \cdots + (x_d-p_d)^2 = \delta^2$. □
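The substitution in this proof is easy to check numerically. The sketch below is a small Python illustration, with hypothetical helper names `paraboloid_height`, `tangent_height`, and `vertical_gap`; it evaluates the height of $u(x)$ and of $e(p)$ above a point $x$ and compares their difference with the squared distance.

```python
def paraboloid_height(x):
    """Last coordinate of u(x), the point of the unit paraboloid above x."""
    return sum(xi * xi for xi in x)


def tangent_height(p, x):
    """Last coordinate of the point of the hyperplane e(p) lying above x."""
    return sum(2 * pi * xi for pi, xi in zip(p, x)) - sum(pi * pi for pi in p)


def vertical_gap(p, x):
    """Vertical distance from e(p) up to u(x); Proposition 5.7.2 says it is dist(x, p)^2."""
    return paraboloid_height(x) - tangent_height(p, x)


def squared_distance(p, x):
    return sum((xi - pi) ** 2 for pi, xi in zip(p, x))
```

With integer inputs the comparison is exact; for instance, for p = (1, 2) and x = (4, 6) both `vertical_gap` and `squared_distance` give 25, and the gap vanishes for x = p, where $e(p)$ touches $U$.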

Let $\mathcal{E}(p)$ denote the half-space lying above the hyperplane $e(p)$. Consider
an $n$-point set $P \subset R^d$. By Proposition 5.7.2, $x \in$ reg($p$) holds if and only
if $e(p)$ is vertically closest to $U$ at $x$ among all $e(q)$, $q \in P$. Here is what we
have derived:
5.7.3 Corollary. The Voronoi diagram of $P$ is the vertical projection of the
facets of the polyhedron $\bigcap_{p \in P} \mathcal{E}(p)$ onto the hyperplane $x_{d+1} = 0$. □
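In computational terms, the corollary says that the site nearest to $x$ is the one whose hyperplane is highest above $x$, because the height of $e(q)$ above $x$ equals $|x|^2 - \mathrm{dist}(x,q)^2$. A minimal Python sketch comparing the two ways of locating the region of a query point follows; the function names are illustrative assumptions, and sites are referred to by their index in a list.

```python
def tangent_height(p, x):
    """Height above x of the hyperplane e(p) tangent to the unit paraboloid at u(p)."""
    return sum(2 * pi * xi for pi, xi in zip(p, x)) - sum(pi * pi for pi in p)


def nearest_site_by_distance(x, sites):
    """Index of the site minimizing the squared Euclidean distance to x."""
    return min(
        range(len(sites)),
        key=lambda i: sum((xi - si) ** 2 for xi, si in zip(x, sites[i])),
    )


def nearest_site_by_envelope(x, sites):
    """Index of the site whose hyperplane e(site) is highest above x."""
    return max(range(len(sites)), key=lambda i: tangent_height(sites[i], x))
```

Both functions return the same index whenever the nearest site is unique; on a boundary between regions they may break ties differently only through floating-point rounding.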

Here is an illustration for a planar Voronoi diagram:

5.7.4 Corollary. The maximum total number of faces of all regions of the
Voronoi diagram of an $n$-point set in $R^d$ is $O(n^{\lceil d/2 \rceil})$.

Proof. We know that the combinatorial complexity of the Voronoi diagram
equals the combinatorial complexity of an H-polyhedron with at most $n$
facets in $R^{d+1}$. By intersecting this H-polyhedron with a large simplex we
can obtain a bounded polytope with at most $n+d+2$ facets, and we have not
decreased the number of faces compared to the original H-polyhedron. Then
the dual version of the asymptotic upper bound theorem (Theorem 5.5.2)
implies that the total number of faces is $O(n^{\lceil d/2 \rceil})$, since $\lfloor (d+1)/2 \rfloor = \lceil d/2 \rceil$. □
The convex polyhedra in $R^{d+1}$ obtained from Voronoi diagrams in $R^d$
by the above construction are rather special, and so a lower bound for the
combinatorial complexity of convex polytopes cannot be automatically
transferred to Voronoi diagrams. But it turns out that the number of vertices of a
Voronoi diagram on $n$ points in $R^d$ can really be of order $n^{\lceil d/2 \rceil}$ (Exercise 2).
Let us remark that the trick used for transforming Voronoi diagrams
to convex polyhedra is an example of a more general technique, called
linearization or Veronese mapping, which will be discussed a little more in
Section 10.3. This method sometimes allows us to convert a problem about
algebraic curves or surfaces of bounded degree to a problem about $k$-flats in
a suitable higher-dimensional space.
The farthest-point Voronoi diagram. The projection of the H-polyhedron
$\bigcap_{p \in P} \mathcal{E}(p)^{op}$, where $\gamma^{op}$ denotes the half-space opposite to $\gamma$, forms
the farthest-neighbor Voronoi diagram, in which each point $p \in P$ is assigned
the regions of points for which it is the farthest point. It can be shown that
all nonempty regions of this diagram are unbounded and they correspond
precisely to the points appearing on the surface of conv($P$).

Bibliography and remarks. The concept of Voronoi diagrams
independently emerged in various fields of science, for example as the
medial axis transform in biology and physiology, the Wigner-Seitz
zones in chemistry and physics, the domains of action in
crystallography, and the Thiessen polygons in meteorology and geography.
Apparently, the earliest documented reference to Voronoi diagrams is a
picture in the famous Principia Philosophiae by Descartes from 1644
(that picture actually seems to show a power diagram, a generalization
of the Voronoi diagram to sites with different strengths of influence).
Mathematically, Voronoi diagrams were first introduced by Dirichlet
[Dir50] and by Voronoi [Vor08] for the investigation of quadratic forms.
For more information on the interesting history and a surprising
variety of applications we refer to several surveys: Aurenhammer and
Klein [AK00], Aurenhammer [Aur91], and the book by Okabe, Boots,
and Sugihara [OBS92]. Every computational geometry textbook also
has at least a chapter devoted to Voronoi diagrams, and most papers
on this subject appear in computational geometry.
The Delaunay triangulation (or, more correctly, the Delaunay
tessellation, since it need not be a triangulation in general) was first
considered by Voronoi as the dual to the Voronoi diagram, and later
by Delaunay [Del34] with the definition given in Exercise 5(b) below.
The Delaunay triangulation of a planar point set $P$ optimizes
several quality measures among all triangulations of $P$: It maximizes the
minimum angle occurring in any triangle, minimizes the maximum
circumradius of the triangles, maximizes the sum of inradii, and so
on (see [AK00] for references). Such optimality properties can usually
be proved by local flipping. We consider an arbitrary triangulation $\mathcal{T}$
of a given finite $P \subset R^2$ (say with no 4 cocircular points). If there
is a 4-point $Q \subseteq P$ such that conv($Q$) is a quadrilateral triangulated
by two triangles of $\mathcal{T}$ but in such a way that these two triangles are
not the Delaunay triangulation of $Q$, then the diagonal of $Q$ can be
flipped:

[Figure: flipping the diagonal of a quadrilateral to make it locally Delaunay]

It can be shown that every sequence of such local flips is finite and
finishes with the Delaunay triangulation of $P$ (Exercise 7). This
procedure has an analogue in higher dimensions, where it gives a simple
and practically successful algorithm for computing Delaunay
triangulations (and Voronoi diagrams); see, e.g., Edelsbrunner and Shah
[ES96].
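The local test behind such a flip is usually implemented as an in-circle predicate: lifting the points to the unit paraboloid turns "d lies inside the circle through a, b, c" into the sign of a 3×3 determinant. The following Python sketch shows one common way to write it; the names `orientation`, `in_circle`, and `should_flip` are illustrative and not taken from the text, and exact arithmetic is only guaranteed for integer or otherwise exactly representable coordinates.

```python
def orientation(a, b, c):
    """Positive if a, b, c are in counterclockwise order, negative if clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circle(a, b, c, d):
    """True if d lies strictly inside the circle through a, b, c (any orientation)."""
    rows = []
    for p in (a, b, c):
        dx, dy = p[0] - d[0], p[1] - d[1]
        # Translate so that d is the origin and lift to the paraboloid.
        rows.append((dx, dy, dx * dx + dy * dy))
    (ax, ay, al), (bx, by, bl), (cx, cy, cl) = rows
    det = (al * (bx * cy - cx * by)
           - bl * (ax * cy - cx * ay)
           + cl * (ax * by - bx * ay))
    # The determinant is positive for an inside point when a, b, c are
    # counterclockwise; multiplying by the orientation removes that assumption.
    return det * orientation(a, b, c) > 0


def should_flip(a, b, c, d):
    """Quadrilateral a, b, c, d split by the diagonal ac into triangles abc and acd:
    the diagonal is not locally Delaunay if d is inside the circumcircle of abc."""
    return in_circle(a, b, c, d)
```

For the quadrilateral with vertices (-2, 0), (0, -1), (2, 0), (0, 1), the long diagonal fails the test and is flipped, while the short diagonal obtained after the flip passes it.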
Generalizations of Voronoi diagrams. The example in the text with
robot motion planning, as well as other applications, motivates var-
ious notions of generalized Voronoi diagrams. First, instead of the
Euclidean distance, one can take various other distance functions, say
the $\ell_p$-metrics. Second, instead of the spheres of influence of points,
we can consider the spheres of influence of other sites, such as dis-
joint polygons (this is what we get if we have a circular robot moving
amidst polygonal obstacles). We do not attempt to survey the numerous
results concerning such generalizations, again referring to [AK00].
Results on the combinatorial complexity of Voronoi diagrams under
non-Euclidean metrics and/or for nonpoint sites will be mentioned in
the notes to Section 7.7.
In another, very general, approach to Voronoi diagrams, one takes
the Voronoi diagram induced by two objects as a primitive notion. So
for every two objects we are given a partition of space into two regions
separated by a bisector, and Voronoi diagrams for more than two ob-
jects are built using the 2-partitions for all pairs. If one postulates a
few geometric properties of the bisectors, one gets a reasonable theory
of Voronoi diagrams (the so-called abstract Voronoi diagrams), includ-
ing efficient algorithms. So, for example, we do not even need a notion
of distance at this level of generality. Abstract Voronoi diagrams (in
the plane) were suggested by Klein [Kle89].
A geometrically significant generalization of the Euclidean Voronoi
diagram is the power diagram: Each point $p \in P$ is assigned a real
weight $w(p)$, and $\mathrm{reg}(p) = \{x \in R^d : \|x-p\|^2 - w(p) \le \|x-q\|^2 - w(q) \text{ for all } q \in P\}$.
While Voronoi diagrams in $R^d$ are projections
of certain convex polyhedra in $R^{d+1}$, the projection into $R^d$ of every
intersection of finitely many nonvertical upper half-spaces in $R^{d+1}$ is
a power diagram. Moreover, a hyperplane section of a power diagram
is again a power diagram. Several other generalized Voronoi diagrams
in $R^d$ (for example, with multiplicative weights of the sites) can be
obtained by intersecting a suitable power diagram in $R^{d+1}$ with a
simple surface and projecting into $R^d$, which yields fast algorithms;
see Aurenhammer and Imai [AI88].
Another generalization are higher-order Voronoi diagrams. The
$k$th-order Voronoi diagram of a finite point set $P$ assigns to each
$k$-point $T \subseteq P$ the region reg($T$) consisting of all $x \in R^d$ for which the
points of $T$ are the $k$ nearest neighbors of $x$ in $P$. The usual Voronoi
diagram arises for $k = 1$, and the farthest-point Voronoi diagram for
$k = |P|-1$. The $k$th-order Voronoi diagram of $P \subset R^d$ is the
projection of the $k$th level facets in the arrangement of the hyperplanes $e(p)$,
$p \in P$ (see Chapter 6 for these notions). Lee [Lee82] proved that the
$k$th-order Voronoi diagram of $n$ points in the plane has
combinatorial complexity $O(k(n-k))$; this is better than the maximum possible
complexity of level $k$ in an arrangement of $n$ arbitrary planes in $R^3$.
Applications of Voronoi diagrams are too numerous to be listed here,
and we add only a few remarks to those already mentioned in the
text. Using point location in Voronoi diagrams as in the post office
problem, several basic computational problems in the plane can be
solved efficiently, such as finding the closest pair in a point set or the
largest disk contained in a given polygon and not containing any of
the given points.
Besides providing good triangulations, the Delaunay triangulation
contains several other interesting graphs as subgraphs, such as a min-
imum spanning tree of a given point set (Exercise 6). In the plane,
this leads to an $O(n \log n)$ algorithm for the minimum spanning tree.
In $R^3$, subcomplexes of the Delaunay triangulation, the so-called
$\alpha$-complexes, have been successfully used in molecular modeling (see,
e.g., Edelsbrunner [Ede98]); they allow one to quickly answer
questions such as, "how many tunnels and voids are there in the given
molecule?"
Robot motion planning using Voronoi diagrams (or, more gener-
ally, the retraction approach, where the whole free space for the robot
is replaced by some suitable low-dimensional skeleton) was first
considered by Ó'Dúnlaing and Yap [OY85]. Algorithmic motion planning
is an extensive discipline with innumerable variants of the problem.
For a brief introduction from the computational-geometric point of
view see, e.g., [dBvKOS97]; among several monographs we mention
Laumond and Overmars [LO96] and Latombe [Lat91].
The spatial interpolation of functions using Voronoi diagrams was
considered by Sibson [Sib81].

Exercises
1. Prove that the region reg($p$) of a point $p$ in the Voronoi diagram of a
finite point set $P \subset R^d$ is unbounded if and only if $p$ lies on the surface
of conv($P$). [!]
2. (a) Show that the Voronoi diagram of the $2n$-point set
$\{(\frac{i}{n}, 0, 0) : i = 1, 2, \ldots, n\} \cup \{(0, 1, \frac{j}{n}) : j = 1, 2, \ldots, n\}$ in $R^3$ has $\Omega(n^2)$ vertices. 0
(b) Let $d = 2k+1$ be odd, let $e_1, \ldots, e_d$ be vectors of the standard
orthonormal basis in $R^d$, and let $e_0$ stand for the zero vector. For
$i = 0, 1, \ldots, k$ and $j = 1, 2, \ldots, n$, let $p_{i,j} = e_{2i} + \frac{j}{n} e_{2i+1}$. Prove that
for every choice of $j_0, j_1, \ldots, j_k \in \{1, 2, \ldots, n\}$, there is a point in $R^d$ for
which the nearest points among the $p_{i,j}$ are exactly $p_{0,j_0}, p_{1,j_1}, \ldots, p_{k,j_k}$.
Conclude that the Voronoi diagram of the $p_{i,j}$ has combinatorial
complexity $\Omega(n^{k+1}) = \Omega(n^{\lceil d/2 \rceil})$. 0
3. (Voronoi diagram of flats) Let $\varepsilon_1, \ldots, \varepsilon_{d-1}$ be small distinct positive
numbers and for $i = 1, 2, \ldots, d-1$ and $j = 1, 2, \ldots, n$, let $F_{i,j}$ be the
$(d-2)$-flat $\{x \in R^d : x_i = j, x_d = \varepsilon_i\}$. For every choice of $j_1, j_2, \ldots, j_{d-1} \in
\{1, 2, \ldots, n\}$, find a point in $R^d$ for which the nearest sites (under the
Euclidean distance) among the $F_{i,j}$ are exactly $F_{1,j_1}, F_{2,j_2}, \ldots, F_{d-1,j_{d-1}}$.
Conclude that the Voronoi diagram of the $F_{i,j}$ has combinatorial
complexity $\Omega(n^{d-1})$. 0
This example is from Aronov [Aro00].
4. For a finite point set in the plane, define the farthest-point Voronoi dia-
gram as indicated in the text, verify the claimed correspondence with a
convex polyhedron in R3, and prove that all nonempty regions are un-
bounded. 0
5. (Delaunay triangulation) Let $P$ be a finite point set in the plane with no
3 points collinear and no 4 points cocircular.
(a) Prove that the dual graph of the Voronoi diagram of $P$, where two
points $p, q \in P$ are connected by a straight edge if and only if the
boundaries of reg($p$) and reg($q$) share a segment, is a plane graph where the
outer face is the complement of conv($P$) and every inner face is a
triangle. 0
(b) Define a graph on $P$ as follows: Two points $p$ and $q$ are connected
by an edge if and only if there exists a circular disk with both $p$ and $q$
on the boundary and with no point of $P$ in its interior. Prove that this
graph is the same as in (a), and so we have an alternative definition of
the Delaunay triangulation. 0
6. (Delaunay triangulation and minimum spanning tree) Let $P \subset R^2$ be a
finite point set with no 3 points collinear and no 4 cocircular. Let $T$ be a
spanning tree of minimum total edge length in the complete graph with
the vertex set $P$, where the length of an edge is just its Euclidean length.
Prove that all edges of $T$ are also edges of the Delaunay triangulation of
$P$. 0
7. (Delaunay triangulation by local flipping) Let $P \subset R^2$ be an $n$-point set
with no 3 points collinear and no 4 cocircular. Let $\mathcal{T}$ be an arbitrary
triangulation of conv($P$). Suppose that triangulations $\mathcal{T}_1, \mathcal{T}_2, \ldots$ are
obtained from $\mathcal{T}$ by successive local flips as described in the notes above (in
each step, we select a convex quadrilateral in the current triangulation
partitioned into two triangles in a way that is not the Delaunay
triangulation of the four vertices and we flip the diagonal of the quadrilateral).
(a) Prove that the sequence of triangulations is always finite (and give
as good an estimate for its maximum length as you can). [!]
(b) Show that if no local flipping is possible, then the current
triangulation is the Delaunay triangulation of $P$. 0
8. Consider a finite set of disjoint segments in the plane. What types of
curves may bound the regions in their Voronoi diagram? The region of a
given segment is the set of points for which this segment is a closest one.
o
9. Let $A$ and $B$ be two finite point sets in the plane. Choose $a_0 \in A$
arbitrarily. Having defined $a_0, \ldots, a_i$ and $b_1, \ldots, b_i$, define $b_{i+1}$ as a point
of $B \setminus \{b_1, \ldots, b_i\}$ nearest to $a_i$, and $a_{i+1}$ as a point of $A \setminus \{a_0, \ldots, a_i\}$
nearest to $b_{i+1}$. Continue until one of the sets becomes empty. Prove that
at least one of the pairs $(a_i, b_{i+1})$, $(b_{i+1}, a_{i+1})$, $i = 0, 1, 2, \ldots$, realizes the
shortest distance between a point of $A$ and a point of $B$. (This was used
by Eppstein [Epp95] in some dynamical geometric algorithms.) [!]
10. (a) Let $C$ be any circle in the plane $x_3 = 0$ (in $R^3$). Show that there exists
a half-space $h$ such that $C$ is the vertical projection of the set $h \cap U$ onto
$x_3 = 0$, where $U = \{x \in R^3 : x_3 = x_1^2 + x_2^2\}$ is the unit paraboloid. CD
(b) Consider $n$ arbitrary circular disks $K_1, \ldots, K_n$ in the plane. Show that
there exist only $O(n)$ intersections of their boundaries that lie inside no
other $K_i$ (this means that the boundary of the union of the $K_i$ consists
of $O(n)$ circular arcs). [!]
11. Define a "spherical polytope" as an intersection of $n$ balls in $R^3$ (such
an object has facets, edges, and vertices similar to an ordinary convex
polytope).
(a) Show that any such spherical polytope in $R^3$ has $O(n^2)$ faces. You
may assume that the spheres are in general position. 0
(b) Find an example of an intersection of n balls having quadratically
many vertices. [!]
(c) Show that the intersection of n unit balls has O(n) complexity only.
o
6

Number of Faces in
Arrangements
Arrangements of lines in the plane and their higher-dimensional
generalization, arrangements of hyperplanes in $R^d$, are a basic geometric structure
whose significance is comparable to that of convex polytopes. In fact,
arrangements and convex polytopes are quite closely related: A cell in a
hyperplane arrangement is a convex polyhedron, and conversely, each hyperplane
arrangement in $R^d$ corresponds canonically to a convex polytope in $R^{d+1}$
of a special type, the so-called zonotope. But as is often the case with
ferent representations of the same mathematical structure, convex polytopes
and arrangements of hyperplanes emphasize different aspects of the structure
and lead to different questions.
Whenever we have a problem involving a finite point set in $R^d$ and
partitions of the set by hyperplanes, we can use geometric duality, and we obtain
a problem concerning a hyperplane arrangement. Arrangements appear in
many other contexts as well; for example, some models of molecules give rise
to arrangements of spheres in $R^3$, and automatic planning of the motion of
a robot among obstacles involves, implicitly or explicitly, arrangements of
surfaces in higher-dimensional spaces.
Arrangements of hyperplanes have been investigated for a long time from
various points of view. In several classical areas of mathematics one is mainly
interested in topological and algebraic properties of the whole arrangement.
Hyperplane arrangements are related to such marvelous objects as Lie alge-
bras, root systems, and Coxeter groups. In the theory of oriented matroids
one studies the systems of sign vectors associated to hyperplane arrangements
in an abstract axiomatic setting.
We are going to concentrate on estimating the combinatorial complexity
(number of faces) in arrangements and neglect all the other directions.
126 Chapter 6: Number of Faces in Arrangements

General probabilistic techniques for bounding the complexity of
geometric configurations constitute the second main theme of this chapter. These
methods have been successful in attacking many more problems than can
even be mentioned in this book. We begin with a simple but powerful
sampling argument in Section 6.3 (somewhat resembling the proof of the crossing
number theorem), add more tricks in Section 6.4, and finish with quite a
sophisticated method, demonstrated on a construction of optimal $\frac{1}{r}$-cuttings,
in Section 6.5.

6.1 Arrangements of Hyperplanes


We recall from Section 4.1 that for a finite set $H$ of lines in the plane, the
arrangement of $H$ is a partition of the plane into relatively open convex
subsets, the faces of the arrangement. In this particular case, the faces are
the vertices (0-faces), the edges (1-faces), and the cells (2-faces).¹
An arrangement of a finite set $H$ of hyperplanes in $R^d$ is again a partition
of $R^d$ into relatively open convex faces. Their dimensions are 0 through $d$. As
in the plane, the 0-faces are called vertices, the 1-faces edges, and the $d$-faces
cells. Sometimes the $(d-1)$-faces are referred to as facets.
The cells are the connected components of $R^d \setminus \bigcup H$. To obtain the facets,
we consider the $(d-1)$-dimensional arrangements induced in the hyperplanes
of $H$ by their intersections with the other hyperplanes. That is, for each
$h \in H$ we take the connected components of $h \setminus \bigcup_{h' \in H : h' \ne h} h'$. To obtain
$k$-faces, we consider every possible $k$-flat $L$ defined as the intersection of some
$d-k$ hyperplanes of $H$. The $k$-faces of the arrangement lying within $L$ are
the connected components of $L \setminus \bigcup (H \setminus H_L)$, where $H_L = \{h \in H : L \subseteq h\}$.
Remark on sign vectors. A face of the arrangement of $H$ can be described
by its sign vector. First we need to fix the orientation of each hyperplane
$h \in H$. Each $h \in H$ partitions $R^d$ into three regions: $h$ itself and the two
open half-spaces determined by it. We choose one of these open half-spaces as
positive and denote it by $h^{\oplus}$, and we let the other one be negative, denoted
by $h^{\ominus}$.
Let $F$ be a face of the arrangement of $H$. We define the sign vector of
$F$ (with respect to the chosen orientations of the hyperplanes) as
$\sigma(F) = (\sigma_h : h \in H)$, where

$\sigma_h = \begin{cases} +1 & \text{if } F \subseteq h^{\oplus}, \\ 0 & \text{if } F \subseteq h, \\ -1 & \text{if } F \subseteq h^{\ominus}. \end{cases}$

The sign vector determines the face $F$, since we have $F = \bigcap_{h \in H} h^{\sigma_h}$, where
$h^0 = h$, $h^1 = h^{\oplus}$, and $h^{-1} = h^{\ominus}$. The following drawing shows the sign
1 This terminology is not unified in the literature. What we call faces are sometimes
referred to as cells (0-cells, 1-cells, and 2-cells).
vectors of the marked faces in a line arrangement. Only the signs are shown,
and the positive half-planes lie above their lines.

00+-

Of course, not all possible sign vectors correspond to nonempty faces. For n
lines, there are $3^n$ sign vectors but only $O(n^2)$ faces, as we will derive below.
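
The gap between the $3^n$ conceivable sign vectors and those that actually occur can be seen on a tiny example. The following Python sketch is only an illustration under stated assumptions: the three lines, the representation of a line by the pair (a, b) in y = ax + b, and the helper names `sign` and `sign_vector` are all hypothetical choices, and the positive half-plane is taken above each line, as in the drawing. The sample grid is chosen so that it happens to meet every face of this particular arrangement (3 vertices, 9 edges, 7 cells); a grid is not a reliable way to enumerate faces in general.

```python
from fractions import Fraction
from itertools import product


def sign(t):
    """Return -1, 0, or +1 according to the sign of t."""
    return (t > 0) - (t < 0)


def sign_vector(point, lines):
    """Sign vector of a point with respect to lines y = a*x + b.

    The positive open half-plane of each line is the one above it.
    """
    x, y = point
    return tuple(sign(y - (a * x + b)) for a, b in lines)


# Hypothetical example: the lines y = x, y = -x, and y = 1 (a simple arrangement).
lines = [(1, 0), (-1, 0), (0, 1)]

# Exact sample points with coordinates k/2 for -6 <= k <= 6.
grid = [Fraction(k, 2) for k in range(-6, 7)]
realized = {sign_vector(p, lines) for p in product(grid, repeat=2)}

print(sign_vector((0, 0), lines))  # the vertex where y = x meets y = -x
print(len(realized), "of", 3 ** len(lines), "sign vectors are realized")
```

For these three lines, 19 of the 27 sign vectors are realized, one for each face.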
Counting the cells in a hyperplane arrangement. We want to count
the maximum number of faces in an arrangement of n hyperplanes in $\mathbb{R}^d$. As
we will see, this is much simpler than the similar task for convex polytopes!
If a set H of hyperplanes is in general position, which means that the
intersection of every k hyperplanes is (d-k)-dimensional, $k = 2, 3, \ldots, d+1$,
the arrangement of H is called simple. For $|H| \ge d+1$ it suffices to require that
every d hyperplanes intersect at a single point and no d+1 have a common
point.
Every d-tuple of hyperplanes in a simple arrangement determines exactly
one vertex, and so a simple arrangement of n hyperplanes has exactly $\binom{n}{d}$
vertices. We now calculate the number of cells; it turns out that the order of
magnitude is also $n^d$ for d fixed.

6.1.1 Proposition. The number of cells (d-faces) in a simple arrangement
of n hyperplanes in $\mathbb{R}^d$ equals

$$\Phi_d(n) = \binom{n}{0} + \binom{n}{1} + \cdots + \binom{n}{d}. \tag{6.1}$$

First proof. We proceed by induction on the dimension d and the number
of hyperplanes n. For d = 1 we have a line and n points in it. These divide the
line into n+1 one-dimensional pieces, and formula (6.1) holds. (The formula
is also correct for n = 0 and all $d \ge 1$, since the whole space, with no
hyperplanes, is a single cell.)
Now suppose that we are in dimension d, we have n-1 hyperplanes, and
we insert another one. Since we assume general position, the n-1 previous
hyperplanes divide the newly inserted hyperplane h into $\Phi_{d-1}(n-1)$ cells by
the inductive hypothesis. Each such (d-1)-dimensional cell within h partitions
one d-dimensional cell into exactly two new cells. The total increase in
the number of cells caused by inserting h is thus $\Phi_{d-1}(n-1)$, and so

$$\Phi_d(n) = \Phi_d(n-1) + \Phi_{d-1}(n-1).$$

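Together with the initial conditions (for d = 1 and for n = 0), this recurrence
determines all values of $\Phi$, and so it remains to check that formula (6.1)
satisfies the recurrence. We have

$$\begin{aligned}
\Phi_d(n-1) + \Phi_{d-1}(n-1) &= \binom{n-1}{0} + \left[\binom{n-1}{1} + \binom{n-1}{0}\right] + \left[\binom{n-1}{2} + \binom{n-1}{1}\right] + \cdots + \left[\binom{n-1}{d} + \binom{n-1}{d-1}\right] \\
&= \binom{n-1}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{d} = \Phi_d(n).
\end{aligned}$$
□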
Second proof. This proof looks simpler, but a complete rigorous presentation
is perhaps somewhat more demanding.
We proceed by induction on d, the case d = 0 being trivial. Let H be a set
of n hyperplanes in $\mathbb{R}^d$ in general position; in particular, we assume that no
hyperplane of H is horizontal and no two vertices of the arrangement have
the same vertical level ($x_d$-coordinate).
Let g be an auxiliary horizontal hyperplane lying below all the vertices.
A cell of the arrangement of H either is bounded from below, and in this
case it has a unique lowest vertex, or is not bounded from below, and then it
intersects g. The number of cells of the former type is the same as the number
of vertices, which is $\binom{n}{d}$. The cells of the latter type correspond to the cells
in the (d-1)-dimensional arrangement induced within g by the hyperplanes
of H, and their number is thus $\Phi_{d-1}(n)$. □
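
Both proofs can be checked numerically for small parameters. The following Python sketch is only a sanity check, not a proof; the function names `phi_formula` and `phi_recurrence` are our own. It compares formula (6.1) with the recurrence $\Phi_d(n) = \Phi_d(n-1) + \Phi_{d-1}(n-1)$ from the first proof and with the relation $\Phi_d(n) = \binom{n}{d} + \Phi_{d-1}(n)$ from the second proof.

```python
from functools import lru_cache
from math import comb


def phi_formula(d, n):
    """Formula (6.1): binom(n, 0) + binom(n, 1) + ... + binom(n, d)."""
    return sum(comb(n, i) for i in range(d + 1))


@lru_cache(maxsize=None)
def phi_recurrence(d, n):
    """Cell count computed from the recurrence of the first proof."""
    if n == 0:   # no hyperplanes: the whole space is a single cell
        return 1
    if d == 1:   # n points cut a line into n + 1 pieces
        return n + 1
    return phi_recurrence(d, n - 1) + phi_recurrence(d - 1, n - 1)


for d in range(1, 7):
    for n in range(0, 15):
        # First proof: the recurrence and the closed formula agree.
        assert phi_recurrence(d, n) == phi_formula(d, n)
        # Second proof: cells with a lowest vertex plus cells unbounded from below.
        assert phi_formula(d, n) == comb(n, d) + phi_formula(d - 1, n)

print(phi_formula(2, 3))  # cells of a simple arrangement of 3 lines in the plane
```

For instance, three lines in general position in the plane give 1 + 3 + 3 = 7 cells.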

What is the number of faces of the intermediate dimensions $1, 2, \ldots, d-1$
in a simple arrangement of n hyperplanes? This is not difficult to calculate
using Proposition 6.1.1 (Exercise 1); the main conclusion is that the total
number of faces is $O(n^d)$ for a fixed d.
What about nonsimple arrangements? It turns out that a simple arrange-
ment of n hyperplanes maximizes the number of faces of each dimension
among arrangements of n hyperplanes. This can be verified by a perturbation
argument, which is considerably simpler than the one for convex polytopes
(Lemma 5.5.4), and which we omit.

Bibliography and remarks. The paper of Steiner [Ste26] from 1826
gives formulas for the number of faces in arrangements of lines, circles,
planes, and spheres. Of course, his results have been extended in many
ways since then (see, e.g., Zaslavsky [Zas75]). An early monograph on
arrangements is Grünbaum [Grü72].
The questions considered in the subsequent sections, such as the
combinatorial complexity of certain parts of arrangements, have been
studied mainly in the last twenty years or so. A recent survey discussing
a large part of the material of this chapter and providing many
more facts and references is Agarwal and Sharir [AS00a].

The algebraic and topological investigation of hyperplane arrangements
(both in real and complex spaces) is reflected in the book Orlik
and Terao [OT91]. Let us remark that in these areas, one usually
considers central arrangements of hyperplanes, where all the hyperplanes
pass through the origin (and so they are linear subspaces of
the underlying vector space). If such a central arrangement in $\mathbb{R}^d$ is
intersected with a generic hyperplane not passing through the origin,
one obtains a (d-1)-dimensional "affine" arrangement such as those
considered by us. The correspondence is bijective, and so these two
views of arrangements are not very different, but for many results, the
formulation with central arrangements is more elegant.
The correspondence of arrangements to zonotopes is thoroughly
explained in Ziegler [Zie94].

Exercises
1. (a) Count the number of faces of dimensions 1 and 2 for a simple
arrangement of n planes in $\mathbb{R}^3$. [2]
(b) Express the number of k-faces in a simple arrangement of n hyperplanes
in $\mathbb{R}^d$. [2]
2. Prove that the number of unbounded cells in an arrangement of n hyperplanes
in $\mathbb{R}^d$ is $O(n^{d-1})$ (for a fixed d). [2]
3. (a) Check that an arrangement of d or fewer hyperplanes in $\mathbb{R}^d$ has no
bounded cell. [2]
(b) Prove that an arrangement of d+1 hyperplanes in general position in
$\mathbb{R}^d$ has exactly one bounded cell. [II
4. How many d-dimensional cells are there in the arrangement of the $\binom{d}{2}$
hyperplanes in $\mathbb{R}^d$ with equations $\{x_i = x_j\}$, where $1 \le i < j \le d$? [II
5. How many d-dimensional cells are there in the arrangement of the hyperplanes
in $\mathbb{R}^d$ with the equations $\{x_i - x_j = 0\}$, $\{x_i - x_j = 1\}$, and
$\{x_i - x_j = -1\}$, where $1 \le i < j \le d$? IT]
6. (Flags in arrangements)
(a) Let H be a set of n lines in the plane, and let V be the set of vertices
of their arrangement. Prove that the number of pairs (v, h) with $v \in V$,
$h \in H$, and $v \in h$, i.e., the number of incidences $I(V, H)$, is bounded by
$O(n^2)$. (Note that this is trivially true for simple arrangements.) [2]
(b) Prove that the maximum number of d-tuples $(F_0, F_1, \ldots, F_d)$ in an
arrangement of n hyperplanes in $\mathbb{R}^d$, where $F_i$ is an i-dimensional face
and $F_{i-1}$ is contained in the closure of $F_i$, is $O(n^d)$ (d fixed). Such
d-tuples are sometimes called flags of the arrangement. [II
7. Let $P = \{p_1, \ldots, p_n\}$ be a point set in the plane. Let us say that points
x, y have the same view of P if the points of P are visible in the same
cyclic order from them. If rotating light rays emanate from x and from y,
the points of P are lit in the same order by these rays. We assume that

neither x nor y is in P and that neither of them can see two points of P
in occlusion.
(a) Show that the maximum possible number of points with mutually
distinct views of P is $O(n^4)$. ~
(b) Show that the bound in (a) cannot be improved in general. [!]

6.2 Arrangements of Other Geometric Objects


Arrangements can be defined not only for hyperplanes but also for other
geometric objects. For example, what is the arrangement of a finite set H of
segments in the plane? As in the case of lines, it is a decomposition of the
plane into faces of dimension 0,1,2: the vertices, the edges, and the cells,
respectively. The vertices are the intersections of the segments, the edges are
the portions of the segments after removing the vertices, and the cells
(2-faces) are the connected components of $\mathbb{R}^2 \setminus \bigcup H$. (Note that the endpoints
of the segments are not included among the vertices.) While the cells of line
arrangements are convex polygons, those in arrangements of segments can be
complicated regions, even with holes:

It is almost obvious that the total number of faces of the arrangement of n
segments is at most $O(n^2)$. What is the maximum number of edges on the
boundary of a single cell in such an arrangement? This seemingly innocuous
question is surprisingly difficult, and most of Chapter 7 revolves around it.
Let us now present the definition of the arrangement for arbitrary sets
$A_1, A_2, \ldots, A_n \subseteq \mathbb{R}^d$. The arrangement is a subdivision of space into
connected pieces again called the faces. Each face is an inclusion-maximal
connected set that "crosses no boundary." More precisely, first we define an
equivalence relation $\approx$ on $\mathbb{R}^d$: We put $x \approx y$ whenever x and y lie in the
same subcollection of the $A_i$, that is, whenever $\{i : x \in A_i\} = \{i : y \in A_i\}$.
So for each $I \subseteq \{1, 2, \ldots, n\}$, we have one possible equivalence class, namely
$\{x \in \mathbb{R}^d : x \in A_i \Leftrightarrow i \in I\}$ (this is like a field in the Venn diagram of the $A_i$).
But in typical geometric situations, most of the classes are empty. The faces
of the arrangement of the $A_i$ are the connected components of the equivalence
classes. The reader is invited to check that for both hyperplane arrangements
and arrangements of segments this definition coincides with the earlier ones.
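
To see the general definition at work in the simplest possible setting, one can take d = 1 and let the sets be closed intervals on the real line. The following Python sketch is an illustration only; the restriction to closed intervals and the name `faces_1d` are our own assumptions. It cuts the line at the interval endpoints into elementary pieces (open gaps and single endpoints), computes for each piece the index set $\{i : x \in A_i\}$, and merges neighboring pieces with the same index set, which yields exactly the connected components of the equivalence classes.

```python
from fractions import Fraction


def faces_1d(intervals):
    """Faces of the arrangement of closed intervals [a, b] on the real line.

    Returns the faces from left to right; each face is reported as the
    set of indices i such that the face lies in the i-th interval.
    """
    endpoints = sorted({Fraction(e) for interval in intervals for e in interval})
    # One sample point per elementary piece: the unbounded piece on the left,
    # then alternately an endpoint and the open gap to its right.
    samples = [endpoints[0] - 1]
    for k, e in enumerate(endpoints):
        right = endpoints[k + 1] if k + 1 < len(endpoints) else e + 2
        samples += [e, (e + right) / 2]

    def signature(x):
        return frozenset(i for i, (a, b) in enumerate(intervals) if a <= x <= b)

    faces = []
    for x in samples:
        s = signature(x)
        # Neighboring pieces with equal signature form one connected face.
        if not faces or faces[-1] != s:
            faces.append(s)
    return faces


# Hypothetical example: A_0 = [0, 2] and A_1 = [1, 3].
for face in faces_1d([(0, 2), (1, 3)]):
    print(sorted(face))
```

In this example the equivalence class belonging to the empty index set has two connected components (the two unbounded rays), so the arrangement has five faces although only four equivalence classes are nonempty.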
Arrangements of algebraic surfaces. Quite often one needs to consider
arrangements of the zero sets of polynomials. Let $p_1(x_1, x_2, \ldots, x_d), \ldots,$
$p_n(x_1, x_2, \ldots, x_d)$ be polynomials with real coefficients in d variables, and let
$Z_i = \{x \in \mathbb{R}^d : p_i(x) = 0\}$ be the zero set of $p_i$. Let D denote the maximum
of the degrees of the $p_i$; when speaking of the arrangement of $Z_1, \ldots, Z_n$,
one usually assumes that D is bounded by some (small) constant. Without
a bound on D, even a single $Z_i$ can have arbitrarily many connected components.
In many cases, the $Z_i$ are algebraic surfaces, such as ellipsoids, paraboloids,
etc., but since we are in the real domain, sometimes they need not look like
surfaces at all. For example, the zero set of the polynomial $p(x_1, x_2) = x_1^2 + x_2^2$
consists of the single point (0,0). Although it is sometimes convenient to think
of the $Z_i$ as surfaces, the results stated below apply to zero sets of arbitrary
polynomials of bounded degree.
It is known that if both d and D are considered as constants, the maximum
number of faces in the arrangement of $Z_1, Z_2, \ldots, Z_n$ as above is at most
$O(n^d)$. This is one of the most useful results about arrangements, with many
surprising applications (a few are outlined below and in the exercises). In
the literature one often finds a (formally weaker) version dealing with sign
patterns of the polynomials $p_i$. A vector $\sigma \in \{-1, 0, +1\}^n$ is called a sign
pattern of $p_1, p_2, \ldots, p_n$ if there exists an $x \in \mathbb{R}^d$ such that the sign of $p_i(x)$
is $\sigma_i$, for all $i = 1, 2, \ldots, n$. Trivially, the number of sign patterns for any n
polynomials is at most $3^n$. For d = 1, it is easy to see that the actual number
of sign patterns is much smaller, namely at most $2nD + 1$ (Exercise 1). It is
not so easy to prove, but still true, that there are at most $C(d, D) \cdot n^d$ sign
patterns in dimension d. This result is generally called the Milnor-Thom
theorem (and it was apparently first proved by Oleinik and Petrovskii, which
fits the usual pattern in the history of mathematics). Here is a more precise
(and more recent) version of this result, where the dependence on D and d
is specified quite precisely.

6.2.1 Theorem (Number of sign patterns). Let $p_1, p_2, \ldots, p_n$ be
d-variate real polynomials of degree at most D. The number of faces in the
arrangement of their zero sets $Z_1, Z_2, \ldots, Z_n \subseteq \mathbb{R}^d$, and consequently the
number of sign patterns of $p_1, \ldots, p_n$ as well, is at most
$2(2D)^d \sum_{i=0}^{d} 2^i \binom{4n+1}{i}$. For $n \ge d \ge 2$, this expression is bounded by

$$\left(\frac{50Dn}{d}\right)^d.$$
Proofs of these results are not included here because they would require
at least one more chapter. They belong to the field of real algebraic geometry.
The classical, deep, and extremely extensive field of algebraic geometry mostly
studies algebraic varieties over algebraically closed fields, such as the complex
numbers (and the questions of combinatorial complexity in our sense are
not among its main interests). Real algebraic geometry investigates algebraic
varieties and related concepts over the real numbers or other real-closed fields;
the presence of ordering and the missing roots of polynomials makes its flavor
distinctly different.
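
The one-dimensional bound $2nD + 1$ mentioned before Theorem 6.2.1 is easy to observe experimentally. The following Python sketch makes a simplifying assumption that is not part of the text: each polynomial is a monic product of linear factors and is given by the list of its (integer) real roots, so that all signs can be computed exactly. The names `evaluate` and `sign_patterns` are hypothetical. Since signs can change only at roots, it suffices to evaluate at every root and at one point in every gap between consecutive roots.

```python
from fractions import Fraction


def sign(t):
    """Return -1, 0, or +1 according to the sign of t."""
    return (t > 0) - (t < 0)


def evaluate(roots, x):
    """Value at x of the monic polynomial with the given real roots."""
    value = Fraction(1)
    for r in roots:
        value *= x - r
    return value


def sign_patterns(polys):
    """All sign patterns of univariate polynomials given by their root lists."""
    roots = sorted({Fraction(r) for poly in polys for r in poly})
    # Sample every root and one point in each open gap between roots.
    samples = [roots[0] - 1]
    for k, r in enumerate(roots):
        right = roots[k + 1] if k + 1 < len(roots) else r + 2
        samples += [r, (r + right) / 2]
    return {tuple(sign(evaluate(poly, x)) for poly in polys) for x in samples}


# Hypothetical example: x(x-2), (x-1), and (x-1)(x-3)^2, so n = 3 and D = 3.
polys = [[0, 2], [1], [1, 3, 3]]
patterns = sign_patterns(polys)
n, D = len(polys), max(len(poly) for poly in polys)
print(len(patterns), "patterns; bound 2nD + 1 =", 2 * n * D + 1)
```

Here 8 sign patterns occur, well below both the bound $2nD + 1 = 19$ and the trivial bound $3^n = 27$.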

Arrangements of pseudolines. An arrangement of pseudolines is a natural
generalization of an arrangement of lines. Lines are replaced by curves,
but we insist that these curves behave, in a suitable sense, like lines: For
example, no two of them intersect more than once. This kind of generalization
is quite different from, say, arrangements of planar algebraic curves, and so it
perhaps does not quite belong to the present section. But besides mentioning
pseudoline arrangements as a useful and interesting concept, we also need
them for a (typical) example of application of Theorem 6.2.1, and so we kill
two birds with one stone by discussing them here.
An (affine) arrangement of pseudolines can be defined as the arrangement
of a finite collection of curves in the plane that satisfy the following conditions:
(i) Each curve is x-monotone and unbounded in both directions; in other
words, it intersects each vertical line in exactly one point.
(ii) Every two of the curves intersect in exactly one point and they cross
at the intersection. (We do not permit "parallel" pseudolines, for they
would complicate the definition unnecessarily.) 2
The curves are called pseudolines, but while "being a line" is an absolute no-
tion, "being a pseudoline" makes sense only with respect to a given collection
of curves.
Here is an example of a (simple) arrangement of 5 pseudolines:

Much of what we have proved for arrangements of lines is true for arrange-
ments of pseudolines as well. This holds for the maximum number of vertices,
edges, and cells, but also for more sophisticated results like the
Szemerédi-Trotter theorem on the maximum number of incidences of m points and n
lines; these results have proofs that do not use any properties of straight lines
not shared by pseudolines.
One might be tempted to say that pseudolines are curves that behave
topologically like lines, but as we will see below, in at least one sense this is
profoundly wrong. The correct statement is that every two of them behave
topologically like two lines, but arrangements of pseudolines are more general
than arrangements of lines.

2 This "affine" definition is a little artificial, and we use it only because we do
not want to assume the reader's familiarity with the topology of the projective
plane. In the literature one usually considers arrangements of pseudolines in
the projective plane, where the definition is very natural: Each pseudoline is a
closed curve whose removal does not disconnect the projective plane, and every
two pseudolines intersect exactly once (which already implies that they cross at
the intersection point). Moreover, one often adds the condition that the curves
do not form a single pencil; i.e., not all of them have a common point, since
otherwise, one would have to exclude the case of a pencil in the formulation of
many theorems. But here we are not going to study pseudoline arrangements in
any depth.

We should first point out that there is no problem with the "local" structure
of the pseudolines, since each pseudoline arrangement can be redrawn
equivalently (in a sense defined precisely below) by polygonal lines, as a wiring
diagram:

The difference between pseudoline arrangements and line arrangements is of
a more global nature.
The arrangement of 5 pseudolines drawn above can be realized by straight
lines:
[Figure: the arrangement realized by straight lines, labeled 1 through 5.]

What is the meaning of "realization by straight lines"? To this end, we need
a suitable notion of equivalence of two arrangements of pseudolines. There
are several technically different possibilities; we again use an "affine" notion,
one that is very simple to state but not the most common. Let H be a
collection of n pseudolines. We number the pseudolines $1, 2, \ldots, n$ in the order
in which they appear on the left of the arrangement, say from the bottom
to the top. For each i, we write down the numbers of the other pseudolines
in the order they are encountered along the pseudoline i from left to right.
For a simple arrangement we obtain a permutation $\pi_i$ of $\{1, 2, \ldots, n\} \setminus \{i\}$
for each i. For the arrangement in the pictures, we have $\pi_1 = (2,3,5,4)$,
$\pi_2 = (1,5,4,3)$, $\pi_3 = (1,5,4,2)$, $\pi_4 = (5,1,3,2)$, and $\pi_5 = (4,1,3,2)$. For
a nonsimple arrangement, some of the $\pi_i$ are linear quasiorderings, meaning
that several consecutive numbers can be chunked together. We call two
arrangements affinely isomorphic if they yield the same $\pi_1, \ldots, \pi_n$, i.e., if each
pseudoline meets the others in the same (quasi)order as the corresponding
pseudoline in the other arrangement. Two affinely isomorphic pseudoline
arrangements can be converted one to another by a suitable homeomorphism
of the plane. 3
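
For an arrangement of straight lines, the permutations describing the affine isomorphism type are straightforward to compute. The following Python sketch is an illustration under our own conventions: a line is the pair (a, b) in y = ax + b, the slopes are strictly decreasing so that line 1 is the lowest one far on the left, and the arrangement is assumed to be simple. The function name `local_sequences` and the example lines are hypothetical.

```python
from fractions import Fraction


def local_sequences(lines):
    """Permutations pi_i for a simple arrangement of lines y = a*x + b.

    The lines must be listed with strictly decreasing slopes, which is the
    bottom-to-top order on the far left.  Returns {i: pi_i}, 1-based.
    """
    n = len(lines)
    assert all(lines[k][0] > lines[k + 1][0] for k in range(n - 1))
    sequences = {}
    for i, (ai, bi) in enumerate(lines, start=1):

        def crossing_x(j):
            aj, bj = lines[j - 1]
            return Fraction(bi - bj, aj - ai)  # x-coordinate of line i meeting line j

        others = [j for j in range(1, n + 1) if j != i]
        xs = [crossing_x(j) for j in others]
        assert len(set(xs)) == len(xs), "arrangement is not simple"
        sequences[i] = tuple(sorted(others, key=crossing_x))
    return sequences


# Hypothetical example: y = 2x, y = x + 1, y = 5, y = -x.
lines = [(2, 0), (1, 1), (0, 5), (-1, 0)]
for i, pi in local_sequences(lines).items():
    print(i, pi)
```

Two straight-line arrangements given in this form are affinely isomorphic exactly when `local_sequences` returns the same dictionary for both.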
An arrangement of pseudolines is stretchable if it is affinely isomorphic to
an arrangement of straight lines. 4 It turns out that all arrangements of 8 or
fewer pseudolines are stretchable, but there exists a nonstretchable arrangement
of 9 pseudolines:

The proof of nonstretchability is based on the Pappus theorem in projective
geometry, which states that if 8 straight lines intersect as in the drawing, then
the points p, q, and r are collinear. By modifying this arrangement suitably,
one can obtain a simple nonstretchable arrangement of 9 pseudolines as well.
Next, we show that most of the simple pseudoline arrangements are
nonstretchable. The following construction shows that the number of isomorphism
classes of simple arrangements of n pseudolines is at least $2^{\Omega(n^2)}$:

[Figure: a regular grid formed by the lines $h_1, \ldots, h_m$ and $g_1, \ldots, g_m$, with the pseudolines $p_i$ passing through its middle.]

We have $m \approx \frac{n}{3}$, and the lines $h_1, \ldots, h_m$ and $g_1, \ldots, g_m$ form a regular grid.
Each of the about $\frac{n}{3}$ pseudolines $p_i$ in the middle passes near $\Omega(n)$ vertices of
this grid, and for each such vertex it has a choice of going below it or above.
This gives $2^{\Omega(n^2)}$ possibilities in total.

3 The more usual notion of isomorphism of pseudoline arrangements is defined for
arrangements in the projective plane. The arrangement of H is isomorphic to the
arrangement of H' if there exists a homeomorphism $\varphi$ of the projective plane
onto itself such that each pseudoline $h \in H$ is mapped to a pseudoline $\varphi(h) \in H'$.
For affinely isomorphic arrangements in the affine plane, the corresponding
arrangements in the projective plane are isomorphic, but the isomorphism in the
projective plane also allows for mirror reflection and for "relocating the infinity."
Combinatorially, the isomorphism in the projective plane can be described using
the (quasi)orderings $\pi_1, \ldots, \pi_n$ as well. Here the $\pi_i$ have to agree only up to
a possible reversal and cyclic shift for each i, and also the numbering of the
pseudolines by $1, 2, \ldots, n$ is not canonical.
We also remark that two arrangements of lines are isomorphic if and only if
the dual point configurations have the same order type, up to a mirror reflection
of the whole configuration (order types are discussed in Section 9.3).
4 For isomorphism in the projective plane, one gets an equivalent notion of
stretchability.

Now we use Theorem 6.2.1 to estimate the number of nonisomorphic simple
arrangements of n straight lines. Let the lines be $\ell_1, \ldots, \ell_n$, where $\ell_i$
has the equation $y = a_i x + b_i$ and $a_1 > a_2 > \cdots > a_n$. The x-coordinate
of the intersection $\ell_i \cap \ell_j$ is $\frac{b_i - b_j}{a_j - a_i}$. To determine the ordering $\pi_i$ of the
intersections along $\ell_i$, it suffices to know the ordering of the x-coordinates of
these intersections, and this can be inferred from the signs of the polynomials
$p_{ijk}(a_i, b_i, a_j, b_j, a_k, b_k) = (b_i - b_j)(a_k - a_i) - (b_i - b_k)(a_j - a_i)$. So the
number of nonisomorphic arrangements of n lines is no larger than the number
of possible sign patterns of the $O(n^3)$ polynomials $p_{ijk}$ in the 2n variables
$a_1, b_1, \ldots, a_n, b_n$, and Theorem 6.2.1 yields the upper bound of $2^{O(n \log n)}$. For
large n, this is a negligible fraction of the total number of simple pseudoline
arrangements. (Similar considerations apply to nonsimple arrangements as
well.)
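
The role of the polynomials in this argument can be made concrete: the difference of the x-coordinates of two intersection points on the same line equals the value of the corresponding polynomial divided by the product $(a_j - a_i)(a_k - a_i)$, whose sign is known from the ordering of the slopes. The following Python sketch checks this identity exactly on a small example; the pairs (a, b) for the lines y = ax + b, the function names `crossing_x` and `p`, and the four example lines are our own assumptions.

```python
from fractions import Fraction
from itertools import permutations


def crossing_x(li, lj):
    """x-coordinate of the intersection of two nonparallel lines y = a*x + b."""
    (ai, bi), (aj, bj) = li, lj
    return Fraction(bi - bj, aj - ai)


def p(li, lj, lk):
    """The polynomial (b_i - b_j)(a_k - a_i) - (b_i - b_k)(a_j - a_i)."""
    (ai, bi), (aj, bj), (ak, bk) = li, lj, lk
    return (bi - bj) * (ak - ai) - (bi - bk) * (aj - ai)


# Hypothetical example with decreasing slopes: y = 2x, y = x + 1, y = 5, y = -x.
lines = [(2, 0), (1, 1), (0, 5), (-1, 0)]

for li, lj, lk in permutations(lines, 3):
    denominator = (lj[0] - li[0]) * (lk[0] - li[0])
    difference = crossing_x(li, lj) - crossing_x(li, lk)
    # The sign of the polynomial, together with the slope order, decides
    # which of the two intersections comes first along line i.
    assert difference == Fraction(p(li, lj, lk), denominator)

print(p(lines[0], lines[1], lines[2]))
```

Thus the sign pattern of all these polynomials at the point $(a_1, b_1, \ldots, a_n, b_n)$ determines every ordering along every line.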
The problem of deciding the stretchability of a given pseudoline arrange-
ment has been shown to be algorithmically difficult (at least NP-hard). One
can easily encounter this problem when thinking about line arrangements and
drawing pictures: What we draw by hand are really pseudolines, not lines,
and even with the help of a ruler it may be almost impossible to decide ex-
perimentally whether a given arrangement can really be drawn with straight
lines. But there are computational methods that can decide stretchability in
reasonable time at least for moderate numbers of lines.

Bibliography and remarks. A comprehensive account of real
algebraic geometry is Bochnak, Coste, and Roy [BCR98]. Among the
many available introductions to the "classical" algebraic geometry we
mention the lively book Cox, Little, and O'Shea [CLO92].
The original bounds on the number of sign patterns, less precise
than Theorem 6.2.1 but still implying the $O(n^d)$ bound for fixed d,
were given independently by Oleinik and Petrovskii [OP49], Milnor
[Mil64], and Thom [Tho65]. Warren [War68] proved that the number
of d-dimensional cells in the arrangement as in Theorem 6.2.1, and
consequently the number of sign patterns consisting of ±1's only, is
at most $2(2D)^d \sum_{i=0}^{d} 2^i \binom{n}{i}$. The extension to faces of all dimensions,
and to sign patterns including 0's, was obtained by Pollack and Roy
[PR93].
Sometimes we have polynomials in many variables, but we are
interested only in sign patterns attained at points that satisfy some
additional algebraic conditions. Such a situation is covered by a
result of Basu, Pollack, and Roy [BPR96]: The number of sign patterns
attained by n polynomials of degree at most D on a k-dimensional
algebraic variety $V \subseteq \mathbb{R}^d$, where V can be defined by polynomials of
degree at most D, is at most $\binom{n}{k} O(D)^d$.

While bounding the number of sign patterns of multivariate
polynomials appears complicated, there is a beautiful short proof of an
almost tight bound on the number of zero patterns, due to Rónyai,
Babai, and Ganapathy [RBG01], which we now sketch (in the simplest
form, giving a slightly suboptimal result). A vector $\zeta \in \{0, 1\}^n$ is
a zero pattern of d-variate polynomials $p_1, \ldots, p_n$ with coefficients in a
field F if there exists an $x = x(\zeta) \in F^d$ with $p_i(x) = 0$ exactly for the i
with $\zeta_i = 0$. We show that if all the $p_i$ have degree at most D, then the
number of zero patterns cannot exceed $\binom{Dn+d}{d}$. For each zero pattern
$\zeta$, let $q_\zeta$ be the polynomial $\prod_{i:\, \zeta_i = 1} p_i$. We have $\deg q_\zeta \le Dn$. Let us
consider the $q_\zeta$ as elements of the vector space L of all d-variate
polynomials over F of degree at most Dn. Using the basis of L consisting
of all monomials of degree at most Dn, we obtain $\dim L \le \binom{Dn+d}{d}$. It
remains to verify that the $q_\zeta$ are linearly independent (assuming that
no $p_i$ is identically 0). Suppose that $\sum_\zeta \alpha_\zeta q_\zeta = 0$ with $\alpha_\zeta \in F$ not all
0. Choose a zero pattern $\xi$ with $\alpha_\xi \ne 0$ and with the largest possible
number of 0's, and substitute $x(\xi)$ into $\sum_\zeta \alpha_\zeta q_\zeta$. This yields $\alpha_\xi = 0$,
a contradiction.
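
Since the zero-pattern bound holds over an arbitrary field, it can be tested by brute force over a small finite field. The following Python sketch does this over the integers modulo a prime q; the representation of a polynomial as a dictionary from exponent tuples to coefficients, the function names, and the two examples are our own assumptions. In the first example (d = 1, six linear polynomials over the field with 7 elements) the bound $\binom{Dn+d}{d} = 7$ is attained.

```python
from itertools import product
from math import comb


def evaluate(poly, x, q):
    """Evaluate a polynomial {exponent tuple: coefficient} at x modulo q."""
    total = 0
    for exponents, coefficient in poly.items():
        term = coefficient
        for xi, e in zip(x, exponents):
            term *= pow(xi, e, q)
        total += term
    return total % q


def zero_patterns(polys, d, q):
    """All zero patterns over the field with q elements (q prime)."""
    return {
        tuple(0 if evaluate(poly, x, q) == 0 else 1 for poly in polys)
        for x in product(range(q), repeat=d)
    }


q = 7

# d = 1: the six linear polynomials x - a for a = 0, ..., 5 (so D = 1, n = 6).
univariate = [{(1,): 1, (0,): -a} for a in range(6)]
found_1 = zero_patterns(univariate, 1, q)
print(len(found_1), "patterns, bound", comb(1 * 6 + 1, 1))

# d = 2: five lines in the plane over the same field (D = 1, n = 5).
bivariate = [
    {(1, 0): 1},                          # x
    {(0, 1): 1},                          # y
    {(1, 0): 1, (0, 1): 1, (0, 0): -1},   # x + y - 1
    {(1, 0): 1, (0, 1): -1},              # x - y
    {(1, 0): 1, (0, 1): 2, (0, 0): -3},   # x + 2y - 3
]
found_2 = zero_patterns(bivariate, 2, q)
print(len(found_2), "patterns, bound", comb(1 * 5 + 2, 2))
```

In both examples the bound is smaller than the trivial bound $2^n$.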
Pseudoline arrangements. The founding paper is Levi [Lev26], where,
among others, the nonstretchable arrangement of 9 lines drawn above
was presented. A concise survey was written by Goodman [Goo97].
Pseudoline arrangements, besides being very natural, have also
turned out to be a fruitful generalization of line arrangements. Some
problems concerning line arrangements or point configurations were
first solved only in the more general setting of pseudoline arrange-
ments, and certain algorithms for line arrangements, the so-called
topological sweep methods, use an auxiliary pseudoline to speed up
the computation; see [Goo97].
Infinite families of pseudolines have been considered as well, and
even topological planes, which are analogues of the projective plane
but made of pseudolines. It is known that every finite configuration
of pseudolines can be extended to a topological plane, and there are
uncountably many distinct topological planes; see Goodman, Pollack,
Wenger, and Zamfirescu [GPWZ94].
Oriented matroids. The possibility of representing each pseudoline
arrangement by a wiring diagram makes it clear that a pseudoline ar-
rangement can also be considered as a purely combinatorial object.
The appropriate combinatorial counterpart of a pseudoline arrange-
ment is called an oriented matroid of rank 3. More generally, similar to
arrangements of pseudolines, one can define arrangements of
pseudohyperplanes in $\mathbb{R}^d$, and these are combinatorially captured by oriented
matroids of rank d+1. Here the rank is one higher than the space
dimension, because an oriented matroid of rank d is usually viewed as a
combinatorial abstraction of a central arrangement of hyperplanes in
$\mathbb{R}^d$ (with all hyperplanes passing through 0).
There are several different but equivalent definitions of an oriented
matroid. We present a definition in the so-called covector form. An
oriented matroid is a set $V^* \subseteq \{-1, 0, 1\}^n$ that is symmetric ($v \in V^*$
implies $-v \in V^*$), contains the zero vector, and satisfies the following
two more complicated conditions:
• (Closed under composition) If $u, v \in V^*$, then $u \circ v \in V^*$, where
$(u \circ v)_i = u_i$ if $u_i \ne 0$ and $(u \circ v)_i = v_i$ if $u_i = 0$.
• (Admits elimination) If $u, v \in V^*$ and $j \in S(u, v) = \{i : u_i = -v_i \ne 0\}$,
then there exists $w \in V^*$ such that $w_j = 0$ and $w_i = (u \circ v)_i$ for
all $i \notin S(u, v)$.
The rank of an oriented matroid $V^*$ is the largest r such that there is
an increasing chain $v_1 \prec v_2 \prec \cdots \prec v_r$, $v_i \in V^*$, where $u \preceq v$ means
$u_i \preceq v_i$ for all i and where $0 \prec 1$ and $0 \prec -1$. At first sight, all this
may look quite mysterious, but it becomes much clearer if one thinks
of a basic example, where $V^*$ is the set of sign vectors of all faces of a
central arrangement of hyperplanes in $\mathbb{R}^d$.
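
The basic example can be explored by brute force. The following Python sketch collects the sign vectors of a small central arrangement and tests the four conditions of the covector definition directly. It rests on assumptions of our own: the hyperplanes are given by integer normal vectors, the covectors are found by evaluating signs at integer points of a small box (which suffices for the example below but is not guaranteed to reach every face in general), and all function names are hypothetical.

```python
from itertools import product


def sign(t):
    """Return -1, 0, or +1 according to the sign of t."""
    return (t > 0) - (t < 0)


def covectors(normals, radius=3):
    """Sign vectors of the hyperplanes <a, x> = 0 at integer points of a box."""
    d = len(normals[0])
    points = product(range(-radius, radius + 1), repeat=d)
    return {
        tuple(sign(sum(a * x for a, x in zip(normal, pt))) for normal in normals)
        for pt in points
    }


def compose(u, v):
    """The composition u o v: take u_i where it is nonzero, otherwise v_i."""
    return tuple(ui if ui != 0 else vi for ui, vi in zip(u, v))


def separating_set(u, v):
    """S(u, v): the indices where u and v have opposite nonzero signs."""
    return {i for i, (ui, vi) in enumerate(zip(u, v)) if ui != 0 and ui == -vi}


def is_oriented_matroid(V):
    """Check the covector axioms for a set V of sign vectors by brute force."""
    n = len(next(iter(V)))
    if (0,) * n not in V:
        return False
    if any(tuple(-x for x in v) not in V for v in V):
        return False
    for u in V:
        for v in V:
            uv = compose(u, v)
            if uv not in V:
                return False
            S = separating_set(u, v)
            outside = [i for i in range(n) if i not in S]
            for j in S:
                if not any(
                    w[j] == 0 and all(w[i] == uv[i] for i in outside) for w in V
                ):
                    return False
    return True


# Hypothetical example: the lines x = 0, y = 0, and x + y = 0 in the plane.
V = covectors([(1, 0), (0, 1), (1, 1)])
print(len(V), is_oriented_matroid(V))
```

The three lines through the origin have 13 faces (the origin, 6 rays, and 6 open sectors), and their sign vectors satisfy all the conditions. A set such as {(0,0,0), (1,1,1), (-1,-1,-1), (1,-1,0), (-1,1,0)} fails, since it is not closed under composition.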
It turns out that every oriented matroid of rank 3 corresponds to
an arrangement of pseudolines. More generally, Lawrence's representation
theorem asserts that every oriented matroid of rank d comes
from some central arrangement of pseudohyperplanes in $\mathbb{R}^d$, and so
the purely combinatorial notion of oriented matroid corresponds,
essentially uniquely, to the topological notion of a (central) arrangement
of pseudohyperplanes. 5
Oriented matroids are also naturally obtained from configurations
of points or vectors. In the notation of Section 5.6 (Gale transform), if
a is a sequence of n vectors in $\mathbb{R}^r$, then both the sets sgn(LinVal(a))
and sgn(LinDep(a)) are oriented matroids in the sense of the above
definition. The first one has rank r, and the second, rank n-r.
We are not going to say much more about oriented matroids,
referring to Ziegler [Zie94] for a quick introduction and to Björner, Las
Vergnas, Sturmfels, White, and Ziegler [BVS+99] for a comprehensive
account.
Stretchability. The following results illustrate the surprising difficulty
of the stretchability problem for pseudoline arrangements. They are
analogous to the statements about realizability of 4-dimensional con-
vex polytopes mentioned in Section 5.3, and they were actually found
much earlier.

5 The correspondence need not really be one-to-one. For example, the oriented
matroids of two projectively isomorphic pseudoline arrangements agree only up
to reorientation.

Certain (simple) stretchable arrangements of n pseudolines require
coefficients with $2^{\Omega(n)}$ digits in the equations of the lines, in every
straight-line realization (Goodman, Pollack, and Sturmfels [GPS90]).
Deciding the stretchability of a given pseudoline arrangement is
NP-hard (Shor [Sho91] has a relatively simple proof), and in fact, it is
polynomially equivalent to the problem of solvability of a system of
polynomial inequalities with integer coefficients. This follows from
results of Mnëv, published in Russian in 1985 (proofs were only sketched;
see [Mne89] for an English version). This work went unnoticed in the
West for some time, and so some of the results were rediscovered by
other authors.
Although detailed proofs of such theorems are technically demand-
ing, the principle is rather simple. Given two real numbers, suitably
represented by geometric quantities, one can produce their sum and
their product by classical geometric constructions by ruler. (Since ruler
constructions are invariant under projective transformations, the num-
bers are represented as cross-ratios.) By composing such constructions,
one can express the solvability of $p(x_1, \ldots, x_n) = 0$, for a given n-variate
polynomial p with integer coefficients, by the stretchability of a
suitable arrangement in the projective plane. Dealing with inequalities
and passing to simple arrangements is somewhat more complicated,
but the idea is similar.
Practical algorithms for deciding stretchability have been studied
extensively by Bokowski and Sturmfels [BS89] and by Richter-Gebert
(see, e.g., [RG99]).
Mnëv [Mne89] was mainly interested in the realization spaces of
arrangements. Let H be a fixed stretchable arrangement. Each straight-line
arrangement H' affinely isomorphic to H can be represented by
a point in $\mathbb{R}^{2n}$, with the 2n coordinates specifying the coefficients in
the equations of the lines of H'. Considering all possible H' for a given
H, we obtain a subset of $\mathbb{R}^{2n}$. For some time it was conjectured that
this set, the realization space of H, has to be path-connected, which
would mean that one straight-line realization could be converted to
any other by a continuous motion while retaining the affine isomorphism
type. 6 Not only is this false, but the realization space can have
arbitrarily many components. In a suitable sense, it can even have
arbitrary topological type. Whenever $A \subseteq \mathbb{R}^n$ is a set definable by
a formula involving finitely many polynomial inequalities with integer
coefficients, Boolean connectives, and quantifiers, there is a line
arrangement whose realization space S is homotopy equivalent to A
(Mnëv's main result actually talks about the stronger notion of stable
equivalence of S and A; see, e.g., [Goo97] or [BVS+99]). Similar
theorems were proved by Richter-Gebert for the realization spaces of
4-dimensional polytopes [RG99], [RG97].

6 In fact, these questions have been studied mainly for the isomorphism of
arrangements in the projective plane. There one has to be a little careful, since a mirror
reflection can easily make the realization space disconnected, and so the mirror
reflection (or the whole action of the general linear group) is factored out first.

These results for arrangements and polytopes can be regarded as
instances of a vague but probably quite general principle: "Almost
none of the combinatorially imaginable geometric configurations are
geometrically realizable, and it is difficult to decide which ones are."
Of course, there are exceptions, such as the graphs of 3-dimensional
convex polytopes.
Encoding pseudoline arrangements. The lower bound $2^{\Omega(n^2)}$ for the
number of isomorphism classes of pseudoline arrangements is asymptotically
tight. Felsner [Fel97] found a nice encoding of such an arrangement
by an $n \times n$ matrix of 0's and 1's, from which the isomorphism
type can be reconstructed: The entry $(i, j)$ of the matrix is 1 iff the $j$th
leftmost crossing along the pseudoline number $i$ is with a pseudoline
whose number $k$ is larger than $i$.
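The encoding rule just described can be tried out on straight lines, which are a special case of pseudolines. The following Python sketch is only an illustration under stated assumptions: the function name `felsner_rows` is hypothetical, the lines are given as integer pairs (slope, intercept) with distinct slopes and no three lines through a common point, and the lines are numbered by increasing slope (so that line 0 is the topmost one far to the left), a convention chosen here and not taken from the text. Each line has only $n-1$ crossings, so each row produced below has $n-1$ entries; no padding to $n$ columns is attempted.

```python
from fractions import Fraction

def felsner_rows(lines):
    """Rows of 0/1 entries following the rule quoted above.

    lines: list of integer pairs (slope, intercept), line y = slope*x + intercept.
    Assumed (not checked): distinct slopes, no three lines through one point.
    """
    # Assumed numbering: increasing slope, i.e. top-to-bottom order far to the left.
    order = sorted(lines)
    rows = []
    for i, (a_i, b_i) in enumerate(order):
        crossings = []
        for k, (a_k, b_k) in enumerate(order):
            if k == i:
                continue
            # exact x-coordinate of the crossing of lines i and k
            x = Fraction(b_k - b_i, a_i - a_k)
            crossings.append((x, k))
        crossings.sort()  # left-to-right order along line i
        rows.append([1 if k > i else 0 for _, k in crossings])
    return rows

if __name__ == "__main__":
    for row in felsner_rows([(0, 0), (1, 1), (2, 3)]):
        print(row)
```

For the three lines in the usage example, the line with the smallest number meets only lines with larger numbers, so its row consists of 1's, and the last line gets a row of 0's.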

Exercises
1. Let $p_1(x), \ldots, p_n(x)$ be univariate real polynomials of degree at most $D$.
Check that the number of sign patterns of the $p_i$ is at most $2nD+1$.
2. (Intersection graphs) Let S be a set of n line segments in the plane. The
intersection graph of S is the graph on n vertices, which correspond to
the segments of S, with two vertices connected by an edge if and only if
the corresponding two segments intersect.
(a) Prove that the graph obtained from $K_5$ by subdividing each edge
exactly once is not the intersection graph of segments in the plane (and
not even the intersection graph of any arcwise connected sets in the
plane).
(b) Use Theorem 6.2.1 to prove that most graphs are not intersection
graphs of segments: While the total number of graphs on $n$ given vertices
is $2^{\binom{n}{2}} = 2^{n^2/2+O(n)}$, only $2^{O(n \log n)}$ of them are intersection graphs of
segments (be careful about collinear segments!).
(c) Show that the number of (isomorphism classes of) intersection graphs
of planar arcwise connected sets, and even of planar convex sets, on $n$
vertices cannot be bounded by $2^{O(n \log n)}$. (The right order of magnitude
does not seem to be known for either of these classes of intersection
graphs.)
3. (Number of combinatorially distinct simplicial convex polytopes) Use
Theorem 6.2.1 to prove that for every dimension $d \ge 3$ there exists $C_d > 0$
such that the number of combinatorial types of simplicial polytopes in
$\mathbf{R}^d$ with $n$ vertices is at most $2^{C_d n \log n}$. (The combinatorial equivalence
means isomorphic face lattices; see Definition 5.3.4.)
Such a result was proved by Alon [Alo86b] and by Goodman and Pollack
[GP86].
4. (Sign patterns of matrices and rank) Let $A$ be a real $n \times n$ matrix. The
sign matrix $\sigma(A)$ is the $n \times n$ matrix with entries in $\{-1,0,+1\}$ given
by the signs of the corresponding entries in $A$.
(a) Check that $A$ has rank at most $q$ if and only if there exist $n \times q$
matrices $U$ and $V$ with $A = UV^T$.
(b) Estimate the number of distinct sign matrices of rank $q$ using Theorem
6.2.1, and conclude that there exists an $n \times n$ matrix $S$ containing
only entries $+1$ and $-1$ such that any real matrix $A$ with $\sigma(A) = S$ has
rank at least $cn$, with a suitable constant $c > 0$.
The result in (b) is from Alon, Frankl, and Rödl [AFR85] (for another
application see [Mat96b]).
5. (Extendible pseudosegments) A family of pseudosegments is a finite
collection $S = \{s_1, s_2, \ldots, s_n\}$ of curves in the plane such that each $s_i$ is
$x$-monotone and its vertical projection on the $x$-axis is a closed interval,
every two curves in the family intersect at most once, and whenever they
intersect they cross (tangential contacts are not allowed). Such an $S$ is
called extendible if there is a family $L = \{\ell_1, \ldots, \ell_n\}$ of pseudolines such
that $s_i \subseteq \ell_i$, $i = 1, 2, \ldots, n$.
(a) Find an example of a nonextendible family of 3 pseudosegments.
(b) Define an oriented graph $G$ with vertex set $S$ and with an edge from
$s_i$ to $s_j$ if $s_i \cap s_j \neq \emptyset$ and $s_i$ is below $s_j$ on the left of their intersection.
Check that if $S$ is extendible, then $G$ is acyclic.
(c) Prove that, conversely, if $G$ is acyclic, then $S$ is extendible. Extend
the pseudosegments one by one, maintaining the acyclicity of $G$.
(d) Let $I_i$ be the projection of $s_i$ on the $x$-axis. Show that if for every
$i < j$, $I_i \cap I_j = \emptyset$ or $I_i \subseteq I_j$ or $I_j \subseteq I_i$, then $G$ is acyclic, and hence $S$ is
extendible.
(e) Given a family of closed intervals $I_1, \ldots, I_n \subseteq \mathbf{R}$, show that each
interval in the family can be partitioned into at most $O(\log n)$ subintervals
in such a way that the resulting family of subintervals has the property
as in (d). This implies that an arbitrary family of $n$ pseudosegments can
be cut into a family of $O(n \log n)$ extendible pseudosegments.
These notions and results are from Chan [Cha00a].

6.3 Number of Vertices of Level at Most k


In this section and the next one we investigate the maximum number of faces
in certain naturally defined portions of hyperplane arrangements. We con-
sider only simple arrangements, and we omit the (usually routine) perturba-
tion arguments showing that simple arrangements maximize the investigated
quantity.
Let $H$ be a finite set of hyperplanes in $\mathbf{R}^d$, and assume that none of them
is vertical, i.e., parallel to the $x_d$-axis. The level of a point $x \in \mathbf{R}^d$ is the
number of hyperplanes of $H$ lying strictly below $x$ (the hyperplanes passing
through $x$, if any, are not counted). This extends the definition for lines from
Section 4.7.
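The definition of the level translates directly into a few lines of code. The sketch below is an illustration only; the function name `level` and the representation of a nonvertical hyperplane as a pair `(a, b)`, meaning $x_d = a_1x_1 + \cdots + a_{d-1}x_{d-1} + b$, are assumptions made for this example.

```python
def level(point, hyperplanes):
    """Number of hyperplanes lying strictly below `point`.

    point: tuple (x_1, ..., x_d).
    hyperplanes: list of pairs (a, b) with a = (a_1, ..., a_{d-1}); the pair
    stands for the nonvertical hyperplane x_d = a_1*x_1 + ... + a_{d-1}*x_{d-1} + b.
    """
    *head, last = point
    count = 0
    for a, b in hyperplanes:
        height = sum(ai * xi for ai, xi in zip(a, head)) + b
        if height < last:  # hyperplanes passing through the point are not counted
            count += 1
    return count

if __name__ == "__main__":
    # three lines in the plane: y = 0, y = x, y = -x + 4
    H = [((0,), 0), ((1,), 0), ((-1,), 4)]
    print(level((1, 2), H), level((1, 1), H))
```

In the usage example the point (1, 1) lies on the line $y = x$, so that line is not counted.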
We are interested in the maximum possible number of vertices of level
at most k in a simple arrangement of n hyperplanes. The following drawing
shows the region of all points of level at most 2 in an arrangement of lines;
we want to count the vertices lying in the region or on its boundary.

The vertices of level 0 are the vertices of the cell lying below all the
hyperplanes, and since this cell is the intersection of at most $n$ half-spaces,
it has at most $O(n^{\lfloor d/2 \rfloor})$ vertices, by the asymptotic upper bound theorem
(Theorem 5.5.2). From this result we derive a bound on the maximum number
of vertices of level at most $k$. The elegant probabilistic technique used in the
proof is generally applicable and probably more important than the particular
result itself.
6.3.1 Theorem (Clarkson's theorem on levels). The total number of
vertices of level at most $k$ in an arrangement of $n$ hyperplanes in $\mathbf{R}^d$ is at
most
$$O\bigl(n^{\lfloor d/2 \rfloor}(k+1)^{\lceil d/2 \rceil}\bigr),$$
with the constant of proportionality depending on $d$.
We are going to prove the theorem for simple arrangements only. The
general case can be derived from the result for simple arrangements by a
standard perturbation argument. But let us stress that the simplicity of the
arrangement is essential for the forthcoming proof.
For all $k$ ($0 \le k \le n-d$), the bound is tight in the worst case. To see this
for $k \ge 1$, consider a set of $\frac{n}{k}$ hyperplanes such that the lower unbounded cell
in their arrangement is a convex polyhedron with $\Omega((\frac{n}{k})^{\lfloor d/2 \rfloor})$ vertices, and
replace each of the hyperplanes by $k$ very close parallel hyperplanes. Then
each vertex of level 0 in the original arrangement gives rise to $\Omega(k^d)$ vertices
of level at most $k$ in the new arrangement.
A much more challenging problem is to estimate the maximum possible
number of vertices of level exactly $k$. This will be discussed in Chapter 11.
One of the main motivations that led to Clarkson's theorem on levels was
an algorithmic problem. Given an $n$-point set $P \subset \mathbf{R}^d$, we want to construct
a data structure for fast answering of queries of the following type: For a
query point $x \in \mathbf{R}^d$ and an integer $t$, report the $t$ points of $P$ that lie nearest
to $x$.
Clarkson's theorem on levels is needed for bounding the maximum amount
of memory used by a certain efficient algorithm. The connection is not entirely
simple. It uses the lifting transform described in Section 5.7, relating the
algorithmic problem in $\mathbf{R}^d$ to the complexity of levels in $\mathbf{R}^{d+1}$, and we do
not discuss it here.
Proof of Theorem 6.3.1 for d = 2. First we demonstrate this special
case, for which the calculations are somewhat simpler.
Let $H$ be a set of $n$ lines in general position in the plane. Let $p$ denote a
certain suitable number in the interval $(0,1)$ whose value will be determined
at the end of the proof. Let us imagine the following random experiment. We
choose a subset $R \subseteq H$ at random, by including each line $h \in H$ into $R$ with
probability $p$, the choices being independent for distinct lines $h$.
Let us consider the arrangement of $R$, temporarily discarding all the other
lines, and let $f(R)$ denote the number of vertices of level 0 in the arrangement
of $R$. Since $R$ is random, $f$ is a random variable. We estimate the expectation
of $f$, denoted by $\mathbf{E}[f]$, in two ways.
First, we have $f(R) \le |R|$ for any specific set $R$, and hence $\mathbf{E}[f] \le
\mathbf{E}[|R|] = pn$.
Now we estimate $\mathbf{E}[f]$ differently: We bound it from below using the
number of vertices of the arrangement of $H$ of level at most $k$. For each
vertex $v$ of the arrangement of $H$, we define an event $A_v$ meaning "$v$ becomes
one of the vertices of level 0 in the arrangement of $R$." That is, $A_v$ occurs
if $v$ contributes 1 to the value of $f$. The event $A_v$ occurs if and only if the
following two conditions are satisfied:
• Both lines determining the vertex $v$ lie in $R$.
• None of the lines of $H$ lying below $v$ falls into $R$.

[Figure: a vertex $v$; the two lines through $v$ must be in $R$, and the lines passing below $v$ must not be in $R$.]

We deduce that $\mathrm{Prob}[A_v] = p^2(1-p)^{\ell(v)}$, where $\ell(v)$ denotes the level of the
vertex $v$.
Let $V$ be the set of all vertices of the arrangement of $H$, and let $V_{\le k} \subseteq V$
be the set of vertices of level at most $k$, whose cardinality we want to estimate.
We have
$$\mathbf{E}[f] = \sum_{v \in V} \mathrm{Prob}[A_v] \ge \sum_{v \in V_{\le k}} \mathrm{Prob}[A_v] = \sum_{v \in V_{\le k}} p^2(1-p)^{\ell(v)} \ge |V_{\le k}| \cdot p^2(1-p)^k.$$
Altogether we have derived $np \ge \mathbf{E}[f] \ge |V_{\le k}| \cdot p^2(1-p)^k$, and so
$$|V_{\le k}| \le \frac{n}{p(1-p)^k}.$$
Let us now choose the number $p$ so as to minimize the right-hand side. A
convenient value is $p = \frac{1}{k+1}$; it does not yield the exact minimum, but it
comes close. We have $\bigl(1 - \frac{1}{k+1}\bigr)^k \ge e^{-1} > \frac{1}{3}$ for all $k \ge 1$. This leads to
$$|V_{\le k}| \le 3(k+1)n. \qquad \Box$$
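As a sanity check of the planar bound $|V_{\le k}| \le 3(k+1)n$, one can count the vertices of level at most $k$ by brute force in a random arrangement. The following sketch is not part of the proof; the function name `vertices_up_to_level`, the seed, and the choice of random lines with floating-point coefficients (which are in general position with probability 1, ignoring rounding) are assumptions of this illustration.

```python
import random

def vertices_up_to_level(lines, k):
    """Count vertices of level <= k; each line is a pair (a, b) for y = a*x + b."""
    n = len(lines)
    count = 0
    for i in range(n):
        a1, b1 = lines[i]
        for j in range(i + 1, n):
            a2, b2 = lines[j]
            x = (b2 - b1) / (a1 - a2)  # crossing of lines i and j
            y = a1 * x + b1
            # level of the vertex: lines strictly below it (the two defining
            # lines pass through the vertex and are skipped)
            below = sum(1 for m, (a, b) in enumerate(lines)
                        if m != i and m != j and a * x + b < y)
            if below <= k:
                count += 1
    return count

random.seed(1)
n = 40
lines = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]

if __name__ == "__main__":
    for k in (0, 1, 2, 5, 10):
        print(k, vertices_up_to_level(lines, k), 3 * (k + 1) * n)
```

The printed pairs compare the actual count with the bound from the proof; the bound is far from sharp for a typical random arrangement, which is consistent with the remark that the proof does not give the exact constant.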

Proof for an arbitrary dimension. The idea of the proof is the same
as above. As for the technical realization, there are at least two possible
routes. The first is to retain the same probability distribution for selecting
the sample R (picking each hyperplane of the given set H independently with
probability p); in this case, most of the proof remains as before, but we need
a lemma showing that $\mathbf{E}[|R|^{\lfloor d/2 \rfloor}] = O((pn)^{\lfloor d/2 \rfloor})$. This is not difficult to
prove, either from a Chernoff-type inequality or by elementary calculations
(see Exercises 6.5.2 and 6.5.3).
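For a feeling of what such a moment bound says, the moments of $|R|$ can be computed exactly, since $|R|$ has the binomial distribution with parameters $n$ and $p$. The sketch below is an illustration only; the function name `binomial_moment` is hypothetical. For the second moment the identity $\mathbf{E}[|R|^2] = pn(1-p) + (pn)^2$ gives $\mathbf{E}[|R|^2] \le 2(pn)^2$ whenever $pn \ge 1$, which is the case $\lfloor d/2 \rfloor = 2$ of the bound under that extra assumption.

```python
from fractions import Fraction
from math import comb

def binomial_moment(n, p, m):
    """Exact m-th moment E[X^m] of X ~ Binomial(n, p); p should be a Fraction."""
    return sum(Fraction(j) ** m * comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(n + 1))

if __name__ == "__main__":
    n, p = 30, Fraction(1, 5)
    for m in (1, 2, 3):
        # ratio of the exact moment to (pn)^m
        print(m, float(binomial_moment(n, p, m) / (p * n) ** m))
```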
The second possibility, which we use here, is to change the probability
distribution. Namely, we define an integer parameter $r$ and choose a random
$r$-element subset $R \subseteq H$, with all the $\binom{n}{r}$ subsets being equally probable.
With this new way of choosing $R$, we proceed as in the proof for $d = 2$.
We define $f(R)$ as the number of vertices of level 0 in the arrangement of $R$
and estimate $\mathbf{E}[f]$ in two ways. On the one hand, we have $f(R) = O(r^{\lfloor d/2 \rfloor})$
for all $R$, and so
$$\mathbf{E}[f] = O(r^{\lfloor d/2 \rfloor}).$$
The notation $V$ for the set of all vertices of the arrangement of $H$, $V_{\le k}$
for the vertices of level at most $k$, and $A_v$ for the event "$v$ is a vertex of level
0 in the arrangement of $R$," is as in the previous proof. The conditions for
$A_v$ are
• All the $d$ hyperplanes defining the vertex $v$ fall into $R$.
• None of the hyperplanes of $H$ lying below $v$ fall into $R$.
So if $\ell = \ell(v)$ is the level of $v$, then
$$\mathrm{Prob}[A_v] = \frac{\binom{n-d-\ell}{r-d}}{\binom{n}{r}}.$$
For brevity, we denote this quantity by $P(\ell)$. We note that it is a decreasing
function of $\ell$. Therefore,
$$\mathbf{E}[f] = \sum_{v \in V} \mathrm{Prob}[A_v] \ge \sum_{v \in V_{\le k}} P(\ell(v)) \ge |V_{\le k}| \cdot P(k).$$

Combining with $\mathbf{E}[f] = O(r^{\lfloor d/2 \rfloor})$ derived earlier, we obtain
$$|V_{\le k}| \le \frac{O(r^{\lfloor d/2 \rfloor})}{P(k)}. \qquad (6.2)$$
An appropriate value for the parameter $r$ is $r = \lfloor \frac{n}{k+1} \rfloor$. (This is not
surprising, since in the previous proof, the size of $R$ was concentrated around
$pn = \frac{n}{k+1}$.) Then we have the following estimate:

6.3.2 Lemma. Suppose that $1 \le k \le \frac{n}{2d} - 1$, which implies $2d \le r \le \frac{n}{2}$.
Then
$$P(k) \ge c_d (k+1)^{-d}$$
for a suitable $c_d > 0$ depending only on $d$.
We postpone the proof of the lemma a little and finish the proof of Theorem
6.3.1. We want to substitute the bound from the lemma into (6.2). In
order to meet the assumptions of the lemma, we must restrict the range of $k$
somewhat. But if, say, $k \ge \frac{n}{2d}$, then the bound claimed by the theorem is of
order $n^d$ and thus trivial, and for $k = 0$ we already know that the theorem
holds. So we may assume $1 \le k \le \frac{n}{2d} - 1$, and we have
$$|V_{\le k}| \le \frac{O(r^{\lfloor d/2 \rfloor})}{P(k)} \le O\Bigl(\bigl(\tfrac{n}{k+1}\bigr)^{\lfloor d/2 \rfloor}(k+1)^d\Bigr) = O\bigl(n^{\lfloor d/2 \rfloor}(k+1)^{\lceil d/2 \rceil}\bigr).$$
This establishes the theorem. $\Box$

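Proof of Lemma 6.3.2.
$$\begin{aligned}
P(k) &= \frac{\binom{n-d-k}{r-d}}{\binom{n}{r}} \\
&= \frac{(n-d-k)(n-d-k-1)\cdots(n-k-r+1)}{n(n-1)\cdots(n-r+1)} \cdot r(r-1)\cdots(r-d+1) \\
&= \frac{r(r-1)\cdots(r-d+1)}{n(n-1)\cdots(n-d+1)} \cdot \frac{n-d-k}{n-d} \cdot \frac{n-d-k-1}{n-d-1} \cdots \frac{n-k-r+1}{n-r+1} \\
&\ge \Bigl(\frac{r}{2n}\Bigr)^d \Bigl(1 - \frac{k}{n-d}\Bigr)\Bigl(1 - \frac{k}{n-d-1}\Bigr) \cdots \Bigl(1 - \frac{k}{n-r+1}\Bigr) \\
&\ge \Bigl(\frac{r}{2n}\Bigr)^d \Bigl(1 - \frac{k}{n-r+1}\Bigr)^r.
\end{aligned}$$
Now, $\frac{r}{n} \ge \bigl(\frac{n}{k+1} - 1\bigr)/n \ge \frac{1}{2(k+1)}$ (since $k \le \frac{n}{4}$, say) and $1 - \frac{k}{n-r+1} \ge 1 - \frac{2k}{n}$
(a somewhat finer calculation actually gives $1 - \frac{k+1}{n}$ here). Since $k \le \frac{n}{4}$, we
can use the inequality $1-x \ge e^{-2x}$ valid for $x \in [0, \frac{1}{2}]$, and we arrive at
$$P(k) \ge \Bigl(\frac{1}{4(k+1)}\Bigr)^d \Bigl(1 - \frac{2k}{n}\Bigr)^r \ge \Bigl(\frac{1}{4(k+1)}\Bigr)^d e^{-4kr/n} \ge c_d (k+1)^{-d},$$
where the last step uses $kr/n \le \frac{k}{k+1} < 1$.
Lemma 6.3.2 is proved. $\Box$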
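The quantity $P(k) = \binom{n-d-k}{r-d}/\binom{n}{r}$ is easy to evaluate exactly, so the lemma can be checked numerically for small parameters. The sketch below is an illustration only: the function names are hypothetical, and the explicit constant $c_d = 4^{-d}e^{-4}$ is an assumption read off from the chain of inequalities in the proof of the lemma, not a value stated in the text.

```python
from fractions import Fraction
from math import comb, exp

def P(n, d, r, level):
    """Probability that a fixed vertex of the given level is a level-0 vertex of R."""
    return Fraction(comb(n - d - level, r - d), comb(n, r))

def lemma_holds(n, d):
    """Check P(k) >= c_d (k+1)^(-d) with the assumed constant c_d = 4^(-d) e^(-4)."""
    for k in range(1, n // (2 * d)):        # 1 <= k <= n/(2d) - 1
        r = n // (k + 1)                    # r = floor(n / (k+1))
        if not (2 * d <= r <= n / 2):
            return False
        values = [P(n, d, r, lev) for lev in range(k + 1)]
        if any(a < b for a, b in zip(values, values[1:])):
            return False                    # P should not increase with the level
        if float(values[-1]) < exp(-4) / (4 * (k + 1)) ** d:
            return False
    return True

if __name__ == "__main__":
    print(lemma_holds(60, 2), lemma_holds(120, 3))
    print(P(10, 2, 5, 0))
```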


Levels in arrangements. Besides vertices, we can consider all faces of level
at most $k$, where the level of a face is the (common) level of all of its points.
Using Theorem 6.3.1, it is not hard to prove that the number of all faces of
level at most $k$ in an arrangement of $n$ hyperplanes is $O(n^{\lfloor d/2 \rfloor}(k+1)^{\lceil d/2 \rceil})$.
In the literature one often speaks about the level $k$ in an arrangement
of hyperplanes, meaning the boundary of the region of all points of level at
most $k$. This is a polyhedral surface and each vertical line intersects it in
exactly one point. It is a subcomplex of the arrangement; note that it may
also contain faces of level different from $k$. In Section 4.7 we considered such
levels in arrangements of lines.

Bibliography and remarks. Clarkson's theorem on levels was first
proved in Clarkson [Cla88a] (see Clarkson and Shor [CS89] for the
journal version). The elegant proof technique has many other applications,
and we will meet it several more times, combined with additional
tricks into sophisticated arguments. The theorem can be formulated
in an abstract framework outlined in the notes to Section 6.5. New
variations on the basic method were noted by Sharir [Sha01] (see
Exercises 4 and 5).
In the planar case, the $O(nk)$ bound on the complexity of levels 0
through $k$ was known before Clarkson's paper, apparently first proved
by Goodman and Pollack [GP84]. Alon and Győri [AG86] determined
the exact constant of proportionality (which Clarkson's proof in the
present form cannot provide). Welzl [Wel01] proved an exact upper
bound in $\mathbf{R}^3$; see the notes to Section 11.3 for a little more about his
method. Several other related references can be found, e.g., in Agarwal
and Sharir [AS00a].

Exercises
1. Show that for $n$ hyperplanes in $\mathbf{R}^d$ in general position, the total number
of vertices of levels $k, k+1, \ldots, n-d$ is at most $O(n^{\lfloor d/2 \rfloor}(n-k)^{\lceil d/2 \rceil})$.
2. (a) Consider $n$ lines in the plane in general position (their arrangement
is simple). Call a vertex $v$ of their arrangement an extreme if one of its
defining lines has a positive slope and the other one has a negative slope.
Prove that there are at most $O((k+1)^2)$ extremes of level at most $k$.
Imitate the proof of Clarkson's theorem on levels.
(b) Show that the bound in (a) cannot be improved in general.
3. Let $K_1, \ldots, K_n$ be circular disks in the plane. Show that the number of
intersections of their boundary circles that are contained in at most $k$
disks is bounded by $O(nk)$. Use the result of Exercise 5.7.10 and assume
general position if convenient.
4. Let $L$ be a set of $n$ nonvertical lines in the plane in general position.
(a) Let $W$ be an arbitrary subset of vertices of the arrangement of $L$,
and let $X_W$ be the number of pairs $(v, \ell)$, where $v \in W$, $\ell \in L$, and
$\ell$ goes (strictly) below $v$. For every real number $p \in (0,1)$, prove that
$X_W \ge p^{-1}|W| - p^{-2}n$.
(b) Let $W$ be a set of vertices in the arrangement of $L$ such that no line
of $L$ lies strictly below more than $k$ vertices of $W$, where $k \ge 1$. Use (a)
to prove $|W| = O(n\sqrt{k})$.
(c) Check that the bound in (b) is tight for all k ~ ~. 0
This exercise and the next one are from Sharir [Sha01].
5. Let $P$ be an $n$-point set in the plane in general position (no 4 points on
a common circle). Let $C$ be a set of circles such that each circle in $C$
passes through 3 points of $P$ and contains no more than $k$ points of $P$
in its interior. Prove that $|C| \le O(nk^{2/3})$, by an approach analogous to
that of Exercise 4.

6.4 The Zone Theorem


Let $H$ be a set of $n$ hyperplanes in $\mathbf{R}^d$, and let $g$ be a hyperplane that may
or may not lie in $H$. The zone of $g$ is the set of the faces of the arrangement
of $H$ that can see $g$. Here we imagine that the hyperplanes of $H$ are opaque,
and so we say that a face $F$ can see the hyperplane $g$ if there are points
$x \in F$ and $y \in g$ such that the open segment $xy$ is not intersected by any
hyperplane of $H$ (the face $F$ is considered relatively open). Let us note that
it does not matter which point $x \in F$ we choose: Either all of them can see
$g$ or none can. The picture shows the zone in a line arrangement:

The following result bounds the maximum complexity of the zone. In the
proof we will meet another interesting random sampling technique.

6.4.1 Theorem (Zone theorem). The number of faces in the zone of any
hyperplane in an arrangement of $n$ hyperplanes in $\mathbf{R}^d$ is $O(n^{d-1})$, with the
constant of proportionality depending on $d$.

We prove the result only for simple arrangements; the general case follows,
as usual, by a perturbation argument. Let us also assume that $g \notin H$ and that
$H \cup \{g\}$ is in general position.
It is clear that the zone has $O(n^{d-1})$ cells, because each $(d-1)$-dimensional
cell of the $(d-1)$-dimensional arrangement within $g$ intersects only
one $d$-dimensional cell of the zone. On the other hand, this information is
not sufficient to conclude that the total number of vertices of these cells
is $O(n^{d-1})$: For example, as we know from Chapter 4, $n$ arbitrarily chosen
cells in an arrangement of $n$ lines in the plane can together have as many as
$\Omega(n^{4/3})$ vertices.
Proof. We proceed by induction on the dimension d. The base case is d = 2;
it requires a separate treatment and does not follow from the trivial case
d = 1 by the inductive argument shown below.
The case $d = 2$. (For another proof see Exercise 7.1.5.) Let $H$ be a set of $n$
lines in the plane in general position. We consider the zone of a line $g$. Since
a convex polygon has the same number of vertices and edges, it suffices to
bound the total number of 1-faces (edges) visible from the line $g$.
Imagine $g$ drawn horizontally. We count the number of visible edges lying
above $g$. Among those, at most $n$ intersect the line $g$, since each line of $H$
gives rise to at most one such edge. The others are disjoint from $g$.
Consider an edge $uv$ disjoint from $g$ and visible from a point of $g$. Let
$h \in H$ be the line containing $uv$, and let $a$ be the intersection of $h$ with $g$:

[Figure: the horizontal line $g$ with the points $a$ and $b$ marked on it.]

Let the notation be chosen in such a way that $u$ is closer to $a$ than $v$, and
let $\ell \in H$ be the second line (besides $h$) defining the vertex $u$. Let $b$ denote
the intersection $\ell \cap g$. Let us call the edge $uv$ a right edge of the line $\ell$ if the
point $b$ lies to the right of $a$, and a left edge of the line $\ell$ if $b$ lies to the left
of $a$.
We show that for each line $\ell$ there exists at most one right edge. If it were
not the case, there would exist two edges, $uv$ and $xy$, where $u$ lies lower than
$x$, which would both be right edges of $\ell$, as in the above drawing. The edge
$xy$ should see some point of the line $g$, but the part of $g$ lying to the right of
$a$ is obscured by the line $h$, and the part left of $a$ is obscured by the line $\ell$.
This contradiction shows that the total number of right edges is at most $n$.
Symmetrically, we see that the number of left edges in the zone is at
most $n$. The same bounds are obtained for edges of the zone lying below $g$.
Altogether we have at most $O(n)$ edges in the zone, and the 2-dimensional
case of the zone theorem is proved.
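The planar argument gives at most $3n$ zone edges above $g$ and $3n$ below, so at most $6n$ in total. The following brute-force sketch counts the zone edges of the $x$-axis and can be used to compare against this count. It is an illustration under assumptions of its own: the function name `zone_edge_count` is hypothetical, $g$ is taken to be the $x$-axis, the lines have distinct nonzero integer slopes, and an edge is tested by checking whether one of its two adjacent cells contains a point of $g$ (a one-dimensional interval intersection), which is one possible way to decide visibility and not the method of the proof.

```python
from fractions import Fraction
import random

def zone_edge_count(lines):
    """Number of arrangement edges on the boundary of a cell that meets the x-axis.

    lines: list of integer pairs (a, b), a != 0, distinct slopes, line y = a*x + b.
    """
    roots = [Fraction(-b, a) for a, b in lines]   # where each line crosses g
    count = 0
    for i, (a, b) in enumerate(lines):
        xs = sorted({Fraction(b2 - b, a - a2)
                     for j, (a2, b2) in enumerate(lines) if j != i})
        if xs:   # one sample point in the relative interior of each edge of line i
            samples = ([xs[0] - 1]
                       + [(p + q) / 2 for p, q in zip(xs, xs[1:])]
                       + [xs[-1] + 1])
        else:
            samples = [Fraction(0)]
        for x in samples:
            y = a * x + b
            visible = False
            for side in (1, -1):                  # the two cells next to the edge
                lo, hi = None, None               # admissible open interval on g
                for j, (a2, b2) in enumerate(lines):
                    s = side if j == i else (1 if y > a2 * x + b2 else -1)
                    # for t > roots[j], the point (t, 0) has sign(-a2) w.r.t. line j
                    if (s > 0) == (a2 < 0):
                        lo = roots[j] if lo is None else max(lo, roots[j])
                    else:
                        hi = roots[j] if hi is None else min(hi, roots[j])
                if lo is None or hi is None or lo < hi:
                    visible = True
            count += visible
    return count

if __name__ == "__main__":
    rng = random.Random(7)
    n = 30
    slopes = rng.sample([s for s in range(-50, 51) if s != 0], n)
    lines = [(s, rng.randint(-10**6, 10**6)) for s in slopes]
    print(zone_edge_count(lines), 6 * n)
```

For two crossing lines all four rays bound a cell that meets the $x$-axis, so the count is 4.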

The case $d > 2$. Here we make the inductive step from $d-1$ to $d$. We assume
that the total number of faces of a zone in $\mathbf{R}^{d-1}$ is $O(n^{d-2})$, and we want to
bound the total number of zone faces in $\mathbf{R}^d$.
The first idea is to proceed by induction on n, bounding the maximum
possible number of new faces created by adding a new hyperplane to $n-1$
given ones. However, it is easy to find examples showing that the number
of faces can increase roughly by $n^{d-1}$, and so this straightforward approach
fails.
In the actual proof, we use a clever averaging argument. First, we demon-
strate the method for the slightly simpler case of counting only the facets
(i.e., (d-1)-faces) of the zone.
Let $f(n)$ denote the maximum possible number of $(d-1)$-faces in the zone
in an arrangement of $n$ hyperplanes in $\mathbf{R}^d$ (the dimension $d$ is not shown in
the notation in order to keep it simple). Let $H$ be an arrangement and $g$ a
base hyperplane such that $f(n)$ is attained for them.
We consider the following random experiment. Color a randomly chosen
hyperplane $h \in H$ red and the other hyperplanes of $H$ blue. We investigate
the expected number of blue facets of the zone, where a facet is blue if it lies
in a blue hyperplane.
On the one hand, any facet has probability $\frac{n-1}{n}$ of becoming blue, and
hence the expected number of blue facets is $\frac{n-1}{n} f(n)$.
We bound the expected number of blue facets in a different way. First,
we consider the arrangement of blue hyperplanes only; it has at most $f(n-1)$
blue facets in the zone by the inductive hypothesis. Next, we add the red
hyperplane, and we look by how much the number of blue facets in the zone
can increase.
A new blue facet can arise by adding the red hyperplane only if the red
hyperplane slices some existing blue facet $F$ into two parts $F_1$ and $F_2$, as is
indicated in the picture:

[Figure: the red hyperplane $h$ slicing a blue facet $F$ into $F_1$ and $F_2$, with $g \cap h$ marked.]

This increases the number of blue facets in the zone only if both $F_1$ and $F_2$ are
visible from $g$. In such a case we look at the situation within the hyperplane
$h$; we claim that $F \cap h$ is visible from $g \cap h$.
Let $C$ be a cell of the zone in the arrangement of the blue hyperplanes
having $F$ on the boundary. We want to exhibit a segment connecting $F \cap h$
to $g \cap h$ within $C$. If $x_1 \in F_1$ sees a point $y_1 \in g$ and $x_2 \in F_2$ sees $y_2 \in g$,
then the whole interior of the tetrahedron $x_1x_2y_1y_2$ is contained in $C$. The
intersection of this tetrahedron with the hyperplane $h$ contains a segment
witnessing the visibility of $g \cap h$ from $F \cap h$.
If we intersect all the blue hyperplanes and the hyperplane $g$ with the
red hyperplane $h$, we get a $(d-1)$-dimensional arrangement, in which $F \cap h$
is a facet in the zone of the $(d-2)$-dimensional hyperplane $g \cap h$. By the
inductive hypothesis, this zone has $O(n^{d-2})$ facets. Hence, adding $h$ increases
the number of blue facets of the zone by $O(n^{d-2})$, and so the total number
of blue facets after $h$ has been added is never more than $f(n-1) + O(n^{d-2})$.
We have derived the following inequality:
$$\frac{n-1}{n} f(n) \le f(n-1) + O(n^{d-2}).$$
It implies $f(n) = O(n^{d-1})$, as we will demonstrate later for a slightly more
general recurrence.
The previous considerations can be generalized for $(d-k)$-faces, where
$1 \le k \le d-2$. Let $f_j(n)$ denote the maximum possible number of $j$-faces
in the zone for $n$ hyperplanes in dimension $d$. Let $H$ be a collection of $n$
hyperplanes where $f_{d-k}(n)$ is attained.
As before, we color one randomly chosen hyperplane $h \in H$ red and the
others blue. A $(d-k)$-face is blue if its relative interior is disjoint from the red
hyperplane. Then the probability of a fixed $(d-k)$-face being blue is $\frac{n-k}{n}$, and
the expected number of blue $(d-k)$-faces in the zone is at most $\frac{n-k}{n} f_{d-k}(n)$.
On the other hand, we find that by adding the red hyperplane, the number
of blue $(d-k)$-faces can increase by at most $O(n^{d-2})$, by the inductive
hypothesis and by an argument similar to the case of facets. This yields the
recurrence
$$\frac{n-k}{n} f_{d-k}(n) \le f_{d-k}(n-1) + O(n^{d-2}).$$
We use the substitution $\varphi(n) = \frac{f_{d-k}(n)}{n(n-1)\cdots(n-k+1)}$, which transforms our
recurrence to $\varphi(n) \le \varphi(n-1) + O(n^{d-k-2})$. We assume $k < d-1$ (so the
considered faces must not be edges or vertices). Then the last recurrence yields
$\varphi(n) = O(n^{d-k-1})$, and hence $f_{d-k}(n) = O(n^{d-1})$.
For the case $k = d-1$ (edges), we would get only the bound $f_1(n) =
O(n^{d-1} \log n)$ by this method. So the number of edges and vertices must be
bounded by a separate argument, and we also have to argue separately for
the planar case.
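The behavior of the recurrence $\frac{n-k}{n} f(n) \le f(n-1) + O(n^{d-2})$ can be observed numerically by iterating it with equality. The sketch below is an illustration only; the function name `iterate_recurrence`, the constant `C` standing in for the $O(\cdot)$ term, and the starting value $f(k) = 0$ are assumptions made for this experiment. For $k < d-1$ the ratio $f(n)/n^{d-1}$ stays bounded, while for $k = d-1$ it grows like $\log n$, in line with the remark about edges.

```python
def iterate_recurrence(d, k, n_max, C=1.0):
    """Iterate (n-k)/n * f(n) = f(n-1) + C*n**(d-2), starting from f(k) = 0.

    Returns a dict n -> f(n) for n = k+1, ..., n_max.
    """
    f = 0.0
    values = {}
    for n in range(k + 1, n_max + 1):
        f = n / (n - k) * (f + C * n ** (d - 2))
        values[n] = f
    return values

if __name__ == "__main__":
    d = 4
    for k in (1, 2, 3):   # k = 3 = d-1 is the case of edges
        vals = iterate_recurrence(d, k, 10000)
        print(k, [round(vals[n] / n ** (d - 1), 3) for n in (100, 1000, 10000)])
```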

We are going to show that the number of vertices of the zone is at most
proportional to the number of the 2-faces of the zone. Every vertex is con-
tained in some 3-face of the zone. Within each such 3-face, the number of
vertices is at most 3 times the number of 2-faces, because the 3-face is a 3-
dimensional convex polyhedron. Since our arrangement is simple, each 2-face
is contained in a bounded number of 3-faces. It follows that the total number
of vertices is at most proportional to $f_2(n) = O(n^{d-1})$. The analogous bound
for edges follows immediately from the bound for vertices. $\Box$

Zones in other arrangements. The maximum complexity of a zone can be
investigated for objects other than hyperplanes. We can consider two classes
$Z$ and $A$ of geometric objects in $\mathbf{R}^d$ and ask for the maximum complexity of
the zone of a $\zeta \in Z$ in the arrangement of $n$ objects $a_1, a_2, \ldots, a_n \in A$. This
leads to a wide variety of problems. For some of them, interesting results have
been obtained by extending the technique shown above.
Most notably, if $\zeta$ is a $k$-flat in $\mathbf{R}^d$, $0 \le k \le d$, or more generally, a
$k$-dimensional algebraic variety in $\mathbf{R}^d$ of degree bounded by a constant, then the
zone of $\zeta$ in an arrangement of $n$ hyperplanes has complexity at most
$$O\bigl(n^{\lfloor (d+k)/2 \rfloor}(\log n)^{\beta}\bigr),$$
where $\beta = 1$ for $d + k$ odd and $\beta = 0$ for $d + k$ even. (The logarithmic factor
seems likely to be superfluous in this bound; perhaps a more sophisticated
proof could eliminate it.) With $\zeta$ being a $k$-flat, this result can be viewed as
an interpolation between the asymptotic upper bound theorem and the zone
theorem: For $k = 0$, with $\zeta$ being a single point, we consider the complexity
of a single cell, while for $k = d-1$, we have the zone of a hyperplane. The key
ideas of the proof are outlined in the notes below; for a full proof we refer to
the literature.
A simple trick relates the zone problem to another question, the maxi-
mum complexity of a single cell in an arrangement. For example, what is the
complexity of the zone of a segment $\zeta$ in an arrangement of $n$ line segments?
On the one hand, $\zeta$ can be chosen as a single point, and so the maximum
zone complexity is at least the maximum possible complexity of a cell in an
arrangement of $n$ segments. On the other hand, the complexity of the zone
of $\zeta$ is no more than the maximum cell complexity in an arrangement of $2n$
segments, since we can split each segment by making a tiny hole near the
intersection with $\zeta$:

[Figure: the segment $\zeta$ and the segments of the arrangement, each split by a tiny hole near its intersection with $\zeta$.]

A similar reduction works for the zone of a triangle in an arrangement of
triangles in $\mathbf{R}^3$ and in many other cases. Results presented in Section 7.6 will
show that under quite general assumptions, the zone complexity in dimension
$d$ is no more than $O(n^{d-1+\varepsilon})$, for an arbitrarily small (but fixed) $\varepsilon > 0$.

Bibliography and remarks. The two-dimensional zone theorem


was established by Chazelle, Guibas, and Lee [CGL85], with the proof
shown above, and independently by Edelsbrunner, O'Rourke, and Sei-
del [EOS86] by a different method. The first correct proof of the gen-
eral d-dimensional case, essentially the one presented here, is due to
Edelsbrunner, Seidel, and Sharir [ESS93]. The main ingredients of the
technique were previously developed by Sharir and his coauthors in
several papers.
Bern, Eppstein, Plassman, and Yao [BEPY91] determined the best
constant in the planar zone theorem: The zone of a line in an arrangement
of $n$ lines has at most $5.5n$ edges. They also showed that the
zone of a convex $k$-gon has complexity $O(n + k^2)$.
The extension of the zone theorem to the zone of a $k$-dimensional
algebraic variety in a hyperplane arrangement, as mentioned in the
text, was proved by Aronov, Pellegrini, and Sharir [APS93]. They also
obtained the same bound with $\zeta$ being the relative boundary of a
$(k+1)$-dimensional convex set in $\mathbf{R}^d$.
The problem with the zone of a curved surface that did not exist
for the zone of a hyperplane is that a face $F$ of the zone of $\zeta$ can be
split by a newly inserted hyperplane $h$ into two subfaces $F_1$ and $F_2$,
both of them lying in the zone, without $h \cap F$ being in the zone of
$\zeta \cap h$, as is illustrated below:

It turns out that each face $F$ split by $h$ in this way is adjacent to a
facet in $h$ that can be seen from $\zeta$ from both sides; such a facet is called
a popular facet of the zone. In order to set up a suitable recurrence
for the number of faces in the zone, one needs to bound the total
complexity of all popular facets. This is again done by a technique
similar to the proof of the zone theorem in the text. The concept of
popular facet needs to be generalized to a popular $j$-face, which is a
$j$-dimensional face $F$ that can be seen from $\zeta$ in all the $2^{d-j}$ "sectors"
determined by the $d-j$ hyperplanes defining $F$. The key observation
is that if a blue popular $j$-face is split into two new popular $j$-faces
by the new red hyperplane, then this can be charged to a popular
$(j-1)$-face within $h$, as the following picture illustrates for $j = 1$:

This is used to set up recurrences for the numbers of popular j-faces.

Exercises
1. (Sum of squares of cell complexities)
(a) Let $\mathcal{C}$ be the set of all cells of an arrangement of a set $H$ of $n$
hyperplanes in $\mathbf{R}^d$. For $d = 2, 3$, prove that $\sum_{C \in \mathcal{C}} f_0(C)^2 = O(n^d)$, where $f_0(C)$
is the number of vertices of the cell $C$.
(b) Use the technique explained in this section to prove $\sum_{C \in \mathcal{C}} f_0(C)^2 =
O(n^d (\log n)^{\lfloor d/2 \rfloor - 1})$ for every fixed $d \ge 3$ (or a similar bound with a
larger constant in the exponent of $\log n$ if it helps).
The result in (b) is from Aronov, Matoušek, and Sharir [AMS94].
2. Define the $(\le k)$-zone of a hyperplane $g$ in an arrangement of hyperplanes
as the collection of all faces for which some point $x$ of their relative interior
can be connected to some point $y \in g$ so that the interior of the segment
$xy$ intersects at most $k$ hyperplanes.
(a) By the technique of Section 6.3 (Clarkson's theorem on levels), show
that the number of vertices of the $(\le k)$-zone is $O(n^{d-1}k)$.
(b) Show that the bound in (a) cannot be improved in general.
3. In this exercise we aim at bounding $K(n, n)$, the maximum total number
of edges of $n$ distinct cells in an arrangement of $n$ lines in the plane,
using the cutting lemma as in Section 4.5 (this proof is due to Clarkson,
Edelsbrunner, Guibas, Sharir, and Welzl [CEG+90]). Let $L$ be a set of $n$
lines in general position.
(a) Prove the bound $K(n, m) = O(n\sqrt{m} + m)$.
(b) Prove $K(n, n) = O(n^{4/3})$ using the cutting lemma.
4. Consider a set $H$ of $n$ planes in $\mathbf{R}^3$ in general position and a sphere $S$
(the surface of a ball).
(a) Show that $S$ intersects at most $O(n^2)$ cells of the arrangement of $H$.
(b) Using (a) and Exercise 1, prove that the zone of $S$ in the arrangement
of $H$ has at most $O(n^{5/2})$ vertices. (This is just an upper bound; the
correct order of magnitude is about $n^2$.)

6.5 The Cutting Lemma Revisited


Here we present the most advanced version of the random sampling
technique. It combines the approach to the weak cutting lemma (Section 4.6)
with ingredients from the proof of Clarkson's theorem on levels and
additional ideas.
We are going to re-prove the cutting lemma 4.5.3: For every set $H$ of $n$
lines in the plane and every $r > 1$ there exists a $\frac{1}{r}$-cutting for $H$ of size $O(r^2)$,
i.e., a subdivision of the plane into $O(r^2)$ generalized triangles $\Delta_1, \ldots, \Delta_t$
such that the interior of each $\Delta_i$ is intersected by at most $\frac{n}{r}$ lines of $H$. The
proof uses random sampling, and unlike the elementary proof in Section 4.7,
it can be generalized to higher dimensions without much trouble. We first give
a complete proof for the planar case and then we discuss the generalizations.
Throughout this section we assume that $H$ is in general position. A
perturbation argument mentioned in Section 4.7 can be used to derive the cutting
lemma for an arbitrary $H$.
The first idea is as in the proof of a weaker cutting lemma by random
sampling in Section 4.6: We pick a random sample S of a suitable size and
triangulate its arrangement.
The subsequent calculations become simpler and more elegant if we choose
S by independent Bernoulli trials. That is, instead of picking s random lines
with repetitions as in Section 4.6, we fix a probability $p = \frac{s}{n}$ and we include
each line $h \in H$ into S with probability p, the decisions being mutually
independent (this is as in the proof of the planar case of Clarkson's theorem
on levels). These two ways of random sampling (by s random draws with
repetitions and by independent trials with success probability $\frac{s}{n}$) can usually
be thought of as nearly the same; although the actual calculations differ
significantly, their results tend to be similar.
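The two sampling schemes can be put side by side in a minimal Python sketch. The function names and the use of integers as stand-ins for the lines of H are assumptions made only for illustration.

```python
import random

def sample_with_repetition(lines, s, rng):
    """s independent uniform draws, repetitions allowed (the set may have fewer than s elements)."""
    return {rng.choice(lines) for _ in range(s)}

def sample_bernoulli(lines, s, rng):
    """Include each line independently with probability p = s/n."""
    p = s / len(lines)
    return {h for h in lines if rng.random() < p}

rng = random.Random(0)          # fixed seed so that a run is reproducible
H = list(range(1000))           # hypothetical stand-ins for n = 1000 lines
S1 = sample_with_repetition(H, 100, rng)
S2 = sample_bernoulli(H, 100, rng)
print(len(S1), len(S2))         # both sizes are typically close to s = 100
```

In the first scheme the sample size is at most s; in the second it is a binomial random variable with mean s.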
Sampling and triangulation alone do not work. Considerations similar
to those in Section 4.6 show that with probability close to 1, none of the
triangles in the triangulation for the random sample S as above is intersected
by more than $C\frac{n}{s}\log n$ lines of H, for a suitable constant C. Later we will
see that a similar statement is true with $C\frac{n}{s}\log s$ instead of $C\frac{n}{s}\log n$. But
it is not generally true with $C\frac{n}{s}$, for any C independent of s and n. So the
most direct road to an optimal $\frac{1}{r}$-cutting, namely choosing const $\cdot$ r random
lines and triangulating their arrangement, is impassable.
To see this, consider a 1-dimensional situation, where $H = \{h_1, \ldots, h_n\}$
is a set of n points in R (or if you prefer, look at the part of a 2-dimensional
arrangement along one of the lines). For simplicity, let us set $s = \frac{n}{2}$; then
$p = \frac{1}{2}$, and we can imagine that we toss a fair coin n times and we include $h_i$
into S if the ith toss is heads. The picture illustrates the result of 30 tosses,
with black dots indicating heads:
[Figure: a row of 30 circles, black for heads and empty for tails]
We are interested in the length of the longest consecutive run of tails (empty
circles). For k fixed, it is very likely that k consecutive tails show up in a
sequence of n tosses for n sufficiently large. Indeed, if we divide the tosses
into blocks of length k (suppose for simplicity that n is divisible by k),
[Figure: the row of tosses divided into consecutive blocks of length k]
then in each block, we have probability $2^{-k}$ of receiving all tails. The blocks
are mutually independent, and so the probability of not obtaining all tails
in any of the $\frac{n}{k}$ blocks is $(1 - 2^{-k})^{n/k}$. For k fixed and $n \to \infty$ this goes
to 0, and a more careful calculation shows that for $k = \lfloor \frac{1}{2}\log_2 n \rfloor$ we have
exponentially small probability of not receiving any block of k consecutive
tails (Exercise 1). So a sequence of n tosses is very likely to contain about log n
consecutive tails. (Sequences produced by humans that are intended to look
random usually do not have this property; they tend to be "too uniform.")
Similarly, for a smaller s, if we make a circle black with probability $\frac{s}{n}$, then
the longest run typically has about $\frac{n}{s}\log s$ consecutive empty circles.
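The block argument is easy to try out numerically. The following Python sketch evaluates the block probability $(1 - 2^{-k})^{n/k}$ and measures the longest run of tails in one simulated sequence. The function names are hypothetical, and n is assumed to be divisible by k (otherwise the incomplete last block is ignored).

```python
import random

def no_all_tails_block_probability(n, k):
    """(1 - 2^-k)^(n/k): probability that none of the n/k blocks of length k is all tails."""
    return (1.0 - 2.0 ** (-k)) ** (n // k)

def longest_tail_run(tosses):
    """Longest run of consecutive tails; tosses is a sequence of booleans, True meaning heads."""
    best = current = 0
    for heads in tosses:
        current = 0 if heads else current + 1
        best = max(best, current)
    return best

n = 2 ** 20
k = 10                                    # floor((1/2) * log2(n)) for this n
print(no_all_tails_block_probability(n, k))

rng = random.Random(0)
tosses = [rng.random() < 0.5 for _ in range(n)]
print(longest_tail_run(tosses))           # typically of order log2(n)
```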
Of course, in the one-dimensional situation one can define much more
uniform samples, say by making every $\frac{n}{s}$th circle black. But it is not clear
how one could produce such "more uniform" samples for lines in the plane
or for hyperplanes in $\mathbf{R}^d$.
The strategy: a two-level decomposition. Instead of trying to select
better samples we construct a $\frac{1}{r}$-cutting for H in two stages. First we take a
sample S with probability $p = \frac{r}{n}$ and triangulate the arrangement, obtaining
a collection T of triangles. (The expected number of triangles is $O(r^2)$, as we
will verify later.) Typically, T is not yet a $\frac{1}{r}$-cutting. Let $I(\Delta)$ denote the set
of lines of H intersecting the interior of a triangle $\Delta \in T$ and let $n_\Delta = |I(\Delta)|$.
We define the excess of a triangle $\Delta \in T$ as $t_\Delta = n_\Delta \cdot \frac{r}{n}$.
If $t_\Delta \le 1$, then $n_\Delta \le \frac{n}{r}$ and $\Delta$ is a good citizen: It can be included into
the final $\frac{1}{r}$-cutting as is. On the other hand, if $t_\Delta > 1$, then $\Delta$ needs further
treatment: We subdivide it into a collection of finer triangles such that each
of them is intersected by at most $\frac{n}{r}$ lines of H. We do it in a seemingly
naive way: We consider the whole arrangement of $I(\Delta)$, temporarily ignoring
$\Delta$, and we construct a $\frac{1}{t_\Delta}$-cutting for it. Then we intersect the triangles of
this $\frac{1}{t_\Delta}$-cutting with $\Delta$, which can produce triangles but also quadrilaterals,
pentagons, and hexagons. Each of these convex polygons is further subdivided
into triangles, as is illustrated below:

[Figure, three panels: $I(\Delta)$; a $\frac{1}{t_\Delta}$-cutting; restrict to $\Delta$ and triangulate]

Note that each triangle in the $\frac{1}{t_\Delta}$-cutting is intersected by at most $\frac{n_\Delta}{t_\Delta} = \frac{n}{r}$
lines of $I(\Delta)$. Therefore, the triangles obtained within $\Delta$ are valid triangles
of a $\frac{1}{r}$-cutting for H. The final $\frac{1}{r}$-cutting for H is constructed by subdividing
each $\Delta \in T$ with excess greater than 1 in the indicated manner and taking
all the resulting triangles together.
How do we make the required $\frac{1}{t_\Delta}$-cuttings for the $I(\Delta)$? We do not yet
have any suitable way of doing this unless we use the cutting lemma itself,
which we do not want, of course. Fortunately, as a by-product of the subsequent
considerations, we obtain a method for directly constructing slightly
suboptimal cuttings:

6.5.1 Lemma (A suboptimal cutting lemma). For every finite collection
of lines and any u > 1, there exists a $\frac{1}{u}$-cutting consisting of at most
$K(u\log(u+1))^2$ triangles, where K is a suitable constant.
If we employ this lemma for producing the $\frac{1}{t_\Delta}$-cuttings, we can estimate
the number of triangles in the resulting $\frac{1}{r}$-cutting in terms of the excesses of
the triangles in T: The total number of triangles is bounded by

\[ \sum_{\Delta \in T} \max\{1,\; 4K(t_\Delta \log(t_\Delta + 1))^2\}. \tag{6.3} \]
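To make the bookkeeping behind (6.3) concrete, here is a small Python sketch that computes the excess of a triangle and evaluates the sum in (6.3) for a given list of excesses. The function names are hypothetical, K is passed in as a parameter because the text only says it is a suitable constant, and the natural logarithm is an assumption (a different base only changes K).

```python
import math

def excess(n_delta, n, r):
    """Excess t = n_delta * r / n of a triangle whose interior meets n_delta of the n lines."""
    return n_delta * r / n

def refined_size_bound(excesses, K):
    """Evaluate the sum over triangles of max{1, 4K (t log(t + 1))^2}, as in (6.3)."""
    total = 0.0
    for t in excesses:
        total += max(1.0, 4.0 * K * (t * math.log(t + 1.0)) ** 2)
    return total

# Hypothetical data: n = 1000 lines, r = 10, three triangles meeting 50, 100, 400 lines.
ts = [excess(m, 1000, 10) for m in (50, 100, 400)]
print(ts, refined_size_bound(ts, K=1.0))
```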

The key insight for the proof of the cutting lemma is that although we
typically do have triangles $\Delta \in T$ with excess as large as about log r, they
are very few. More precisely, we show that under suitable assumptions, the
expected number of triangles in T with excess t or larger decreases exponentially
as a function of t. This will take care of both estimating (6.3) by $O(r^2)$
and establishing Lemma 6.5.1.
Good and bad triangulations. Our collection T of triangles is obtained
by triangulating the cells in the arrangement of the random sample S. Now
is the time to specify how exactly the cells are triangulated, since not every
triangulation works. To see this, consider a set H of n lines, each of them
touching the unit circle, and let S be a random sample, again for simplicity
with probability $p = \frac{1}{2}$. We have learned that such a sample is very likely to
leave a gap of about log n unselected lines (as we go along the unit circle).
If we maliciously triangulate the central cell in the arrangement of S by the
diagonals from the vertex near such a large gap,
all these about $\frac{n}{2}$ triangles have excess about log n; this is way too large for
our purposes.
The triangulation thus cannot be quite arbitrary. For the subsequent
proof, it has to satisfy simple axioms. In the planar case, it is actually tech-
nically easier not to triangulate but to construct the vertical decomposition
of the arrangement of S. We erect vertical segments upwards and downwards
from each vertex in the arrangement of S and extend them until they meet
another line (or all the way to infinity):

So far we have been speaking of triangles, and now we have trapezoids, but
the difference is immaterial, since we can always split each trapezoid into
two triangles if we wish. Let T(S) denote the set of (generalized) trapezoids
in the vertical decomposition of S. As before, $I(\Delta)$ is the set of lines of H
intersecting the interior of a trapezoid $\Delta$.
6.5.2 Proposition (Trapezoids with large excess are rare). Let H be
a fixed set of n lines in general position, let $p = \frac{r}{n}$, where $1 \le r \le \frac{n}{2}$, let S be
a random sample drawn from H by independent Bernoulli trials with success
probability p, and let $t \ge 0$ be a real parameter. Let $T(S)_{\ge t}$ denote the set
of trapezoids $\Delta \in T(S)$ with excess at least t, i.e., with $|I(\Delta)| \ge t\frac{n}{r}$. Then
the expected number of trapezoids in $T(S)_{\ge t}$ is bounded as follows:

\[ \mathbf{E}[|T(S)_{\ge t}|] \le C \cdot 2^{-t} r^2 \]

for a suitable absolute constant C.


First let us see how this result can be applied.
Proof of the suboptimal cutting lemma 6.5.1. To obtain a $\frac{1}{u}$-cutting
for H, we set $r = Au\log(u+1)$ for a sufficiently large constant A and choose
a sample S as in Proposition 6.5.2.
By that proposition with t = 0, we have $\mathbf{E}[|T(S)|] \le Br^2$ for a suitable
constant B. By the same proposition with $t = A\log(u+1)$, we have
$\mathbf{E}[|T(S)_{\ge t}|] \le \frac{1}{3}$ if A is sufficiently large. By linearity of expectation, we obtain

\[ \mathbf{E}\Big[\frac{1}{3Br^2}\,|T(S)| + |T(S)_{\ge t}|\Big] \le \frac{2}{3}. \]

So there exists a sample S with both $|T(S)| \le 2Br^2$ and $|T(S)_{\ge t}| = 0$. This
means that we have a $\frac{1}{u}$-cutting into $O(r^2) = O((u\log(u+1))^2)$ trapezoids.
□
For an alternative proof of Lemma 6.5.1 see Exercise 10.3.4.
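Proof of the cutting lemma (Lemma 4.5.3). Most of the proof has
already been described. To produce a $\frac{1}{r}$-cutting, we pick a random sample S
with probability $p = \frac{r}{n}$, we let T = T(S) be its vertical decomposition, and
we refine each trapezoid $\Delta \in T$ with excess $t_\Delta > 1$ using an auxiliary
$\frac{1}{t_\Delta}$-cutting. The size of the resulting $\frac{1}{r}$-cutting is bounded by (6.3). So it suffices
to estimate the expected value of that expression using Proposition 6.5.2:

\[
\begin{aligned}
\mathbf{E}\Big[\sum_{\Delta \in T(S)} \max\{1,\, 4K(t_\Delta \log(t_\Delta+1))^2\}\Big]
&\le \mathbf{E}\Big[\sum_{\Delta \in T(S)} \max\{1,\, 4Kt_\Delta^4\}\Big] \quad (\text{as } \log(t_\Delta+1) \le t_\Delta) \\
&\le \mathbf{E}\Big[\,|T(S)| + \sum_{i=0}^{\infty}\ \sum_{\Delta \in T(S),\ 2^i \le t_\Delta < 2^{i+1}} 4Kt_\Delta^4\Big] \\
&\le O(r^2) + \sum_{i=0}^{\infty} 4K \cdot 2^{4(i+1)} \cdot \mathbf{E}[|T(S)_{\ge 2^i}|] \\
&\le O(r^2) + \sum_{i=0}^{\infty} 4K \cdot 2^{4(i+1)} \cdot C \cdot 2^{-2^i} r^2 = O(r^2).
\end{aligned}
\]

The cutting lemma is proved. □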


Note that it was not important that the suboptimal cutting lemma is near-optimal:
Any bound subexponential in u for the size of a $\frac{1}{u}$-cutting would do.
In particular, for any fixed $c \ge 1$, the expected cth-degree average of the
excess is only a constant.
For the proof of Proposition 6.5.2, we need several definitions and some
simple properties of the vertical decomposition. Let H be a fixed set of lines in
general position, and let $\mathrm{Reg} = \bigcup_{S \subseteq H} T(S)$ be the set of all trapezoids that
can ever appear in the vertical decomposition for some $S \subseteq H$ (including
$S = \emptyset$). For a trapezoid $\Delta \in \mathrm{Reg}$, let $D(\Delta)$ be the set of the lines of H
incident to at least one vertex of $\Delta$. By the general-position assumption, we
have $|D(\Delta)| \le 4$ for all $\Delta$. The various possible cases, up to symmetry, are
drawn below; the picture shows the lines of $D(\Delta)$ with $\Delta$ marked in gray:
[Figure: the possible cases for the lines of $D(\Delta)$, with $\Delta$ marked in gray]


The set $D(\Delta)$ is called the defining set of $\Delta$. Note that the same defining set
can belong to several trapezoids.
Now we list the properties required for the proof; some of them are obvious
or have already been noted.
(C0) We have $|D(\Delta)| \le 4$ for all $\Delta \in \mathrm{Reg}$. Moreover, any set $S_0 \subseteq H$ is the
defining set for at most a constant number of $\Delta \in \mathrm{Reg}$ (certainly no more
than the maximum of $|T(S_0)|$ for $|S_0| \le 4$).
(C1) For any $\Delta \in T(S)$, we have $D(\Delta) \subseteq S$ (the defining set must be present)
and $S \cap I(\Delta) = \emptyset$ (no intersecting line may be present).
(C2) For any $\Delta \in \mathrm{Reg}$ and any $S \subseteq H$ such that $D(\Delta) \subseteq S$ and $I(\Delta) \cap S = \emptyset$,
we have $\Delta \in T(S)$.
(C3) For every $S \subseteq H$, we have $|T(S)| = O(|S|^2 + 1)$. To see this, think of
adding the vertical segments to the arrangement of S one by one. Each
of them splits an existing region in two.
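The counting argument behind (C3) can be written out as a short Python sketch. It assumes lines in general position, none of them vertical and no two arrangement vertices on a common vertical line; it also uses the standard count of $1 + n + \binom{n}{2}$ cells for n such lines, which is not stated in this section. The function names are hypothetical.

```python
from math import comb

def arrangement_cells(n):
    """Number of cells of an arrangement of n lines in general position."""
    return 1 + n + comb(n, 2)

def vertical_decomposition_size(n):
    """Trapezoids in the vertical decomposition: each of the C(n, 2) vertices
    emits one upward and one downward segment, and each segment splits one region."""
    return arrangement_cells(n) + 2 * comb(n, 2)

for n in range(5):
    print(n, arrangement_cells(n), vertical_decomposition_size(n))
```

Under these assumptions the count is quadratic in n, in agreement with (C3).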
The most interesting condition is (C2), which says that the vertical decomposition
is defined "locally." It implies, in particular, that $\Delta$ is one of the
trapezoids in the vertical decomposition of its defining set. More generally, it
says that $\Delta \in \mathrm{Reg}$ is present in T(S) whenever it is not excluded for simple
local reasons (which can be checked by looking only at $\Delta$). Checking (C2)
in our situation is easy, and we leave it to the reader. Also note that it is
(C2) that is generally violated for the mischievous triangulation considered
earlier.
Proof of Proposition 6.5.2. First we prove that if $S \subseteq H$ is a random
sample drawn with probability $p = \frac{r}{n}$, 0 < r < n, then

\[ \mathbf{E}[|T(S)|] = O(r^2 + 1). \tag{6.4} \]

This takes care of the case $t \le 1$ in the proposition. By (C3), we have
$|T(S)| = O(|S|^2 + 1)$ for every fixed S, and so it suffices to show that
$\mathbf{E}[|S|^2] = O(r^2 + 1)$. Now, |S| is the sum of independent random variables, each of
them attaining value 1 with probability p and value 0 with probability 1 - p,
and it is easy to check that $\mathbf{E}[|S|^2] \le r^2 + r$ (Exercise 2(a)).
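The inequality for the second moment can be checked numerically. The Python sketch below sums the binomial probability mass function exactly, using the fact that the sample size is binomially distributed with parameters n and p = r/n; the function name and the sample values of n and r are assumptions for illustration.

```python
from math import comb

def second_moment_binomial(n, p):
    """E[|S|^2] for |S| ~ Binomial(n, p), summed directly from the probability mass function."""
    return sum(k * k * comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1))

n, r = 200, 10
p = r / n
m2 = second_moment_binomial(n, p)
print(m2, r * r + r)   # the first value should not exceed the second
```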
Next, we assume $t \ge 1$. Let $S \subseteq H$ be a random sample drawn with
probability p. We observe that the conditions (C1) and (C2) allow us to
express the probability $p(\Delta)$ that a certain trapezoid $\Delta \in \mathrm{Reg}$ appears in the
vertical decomposition T(S): Since $\Delta$ appears if and only if all lines of $D(\Delta)$
are selected into S and none of $I(\Delta)$ is selected, we have

\[ p(\Delta) = p^{|D(\Delta)|}(1-p)^{|I(\Delta)|}. \]

(An analogous formula appeared in the proof of the planar Clarkson's theorem
on levels, and one can say that the technique of that proof is developed
one step further in the present proof.) If we write
$\mathrm{Reg}_{\ge t} = \{\Delta \in \mathrm{Reg}: |I(\Delta)| \ge t\frac{n}{r}\}$ for the set of all potential trapezoids with excess at least
t, the expected number of trapezoids in $T(S)_{\ge t}$ can be written as

\[ \mathbf{E}[|T(S)_{\ge t}|] = \sum_{\Delta \in \mathrm{Reg}_{\ge t}} p(\Delta). \tag{6.5} \]
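It seems difficult to estimate this sum directly; the trick is to compare it with
a similar sum obtained for the expected number of trapezoids for another
sample.
We define another probability $\tilde{p} = \frac{p}{t}$, and we let $\tilde{S}$ be a sample drawn
from H by Bernoulli trials with success probability $\tilde{p}$. On the one hand,
we have $\mathbf{E}[|T(\tilde{S})|] = O(r^2/t^2 + 1)$ by (6.4). On the other hand, setting
$\tilde{p}(\Delta) = \tilde{p}^{|D(\Delta)|}(1 - \tilde{p})^{|I(\Delta)|}$ we have, in analogy to (6.5),

\[ \mathbf{E}[|T(\tilde{S})|] = \sum_{\Delta \in \mathrm{Reg}} \tilde{p}(\Delta) \ge \sum_{\Delta \in \mathrm{Reg}_{\ge t}} \tilde{p}(\Delta) = \sum_{\Delta \in \mathrm{Reg}_{\ge t}} p(\Delta) \cdot \frac{\tilde{p}(\Delta)}{p(\Delta)} \ge R \cdot \mathbf{E}[|T(S)_{\ge t}|], \tag{6.6} \]

where

\[ R = \min\Big\{\frac{\tilde{p}(\Delta)}{p(\Delta)} : \Delta \in \mathrm{Reg}_{\ge t}\Big\}. \]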
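Now R can be bounded from below. For every $\Delta \in \mathrm{Reg}_{\ge t}$, we have
$|I(\Delta)| \ge t\frac{n}{r}$ and $|D(\Delta)| \le 4$, and so

\[ \frac{\tilde{p}(\Delta)}{p(\Delta)} = \Big(\frac{\tilde{p}}{p}\Big)^{|D(\Delta)|} \Big(\frac{1-\tilde{p}}{1-p}\Big)^{|I(\Delta)|} \ge t^{-4} \Big(\frac{1-\tilde{p}}{1-p}\Big)^{tn/r}. \]

We use $1 - p \le e^{-p}$ (this holds for all real p) and $1 - \tilde{p} \ge e^{-2\tilde{p}}$ (this is true for
all $\tilde{p} \in [0, \frac{1}{2}]$, and we have $\tilde{p} \le p \le \frac{1}{2}$). Therefore $R \ge t^{-4}e^{t-2}$. Substituting
into (6.6), we finally derive

\[ \mathbf{E}[|T(S)_{\ge t}|] \le \frac{1}{R} \cdot \mathbf{E}[|T(\tilde{S})|] \le t^4 e^{-(t-2)} \cdot O\Big(\frac{r^2}{t^2} + 1\Big) \le C \cdot 2^{-t} r^2 \]

for a sufficiently large constant C (the proposition assumes $r \ge 1$).
Proposition 6.5.2 is proved. □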
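The lower bound on R can be sanity-checked numerically. The following Python sketch compares the quantity $t^{-4}\big((1-\tilde{p})/(1-p)\big)^{tn/r}$, with $p = r/n$ and $\tilde{p} = p/t$, against the claimed bound $t^{-4}e^{t-2}$. The function name and the sample parameters are assumptions, and the check is only an illustration for a few values, not a proof.

```python
import math

def ratio_lower_bound(n, r, t):
    """Return (exact, claimed): t^-4 ((1 - p/t)/(1 - p))^(t n / r) and t^-4 e^(t - 2)."""
    p = r / n
    p_tilde = p / t
    exact = t ** -4 * ((1 - p_tilde) / (1 - p)) ** (t * n / r)
    claimed = t ** -4 * math.exp(t - 2)
    return exact, claimed

for n, r, t in [(1000, 10, 1), (1000, 10, 5), (100, 50, 3)]:
    exact, claimed = ratio_lower_bound(n, r, t)
    print(n, r, t, exact, claimed)   # exact should be at least claimed when p <= 1/2 and t >= 1
```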

The following can be proved by the same technique:


6.5.3 Theorem (Cutting lemma for arbitrary dimension). Let $d \ge 1$
be a fixed integer, let H be a set of n hyperplanes in $\mathbf{R}^d$, and let r be a
parameter, $1 < r \le n$. Then there exists a $\frac{1}{r}$-cutting for H of size $O(r^d)$; that
is, a subdivision of $\mathbf{R}^d$ into $O(r^d)$ generalized simplices such that the interior
of each simplex is intersected by at most $\frac{n}{r}$ hyperplanes of H.

The only new part of the proof is the construction of a suitable triangulation
scheme that plays the role of T(S). A vertical decomposition does
not work. More precisely, it is not known whether the vertical decomposition
of an arrangement of n hyperplanes in $\mathbf{R}^d$ always has at most $O(n^d)$ cells
(prisms); this would be needed as the analogue of condition (C3). Instead
one can use the bottom-vertex triangulation, which we define next.
First we specify the bottom-vertex triangulation of a k-dimensional convex
polytope $P \subset \mathbf{R}^d$, $1 \le k \le d$, by induction on k. For k = 1, P is a line
segment, and the triangulation consists of P itself. For k > 1, we let v be the
vertex of P with the smallest last coordinate (the "bottom vertex"); ties can
be broken by lexicographic ordering of the coordinate vectors. We triangulate
all proper faces of P inductively, and we add the simplices obtained by
erecting the cone with apex v over all simplices in the triangulations of the
faces not containing v.
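For k = 2 the inductive definition reduces to a fan of triangles from the bottom vertex, which the following Python sketch illustrates. It handles only convex polygons given by their vertices in cyclic order; the function name and the particular tie-breaking key (smallest y, then smallest x) are assumptions, one possible reading of the lexicographic rule.

```python
def bottom_vertex_triangulation_2d(polygon):
    """Triangulate a convex polygon, given as a cyclic list of (x, y) vertices,
    by coning from the bottom vertex over all edges that do not contain it."""
    m = len(polygon)
    # Bottom vertex: smallest last coordinate; ties broken by the first coordinate.
    v = min(range(m), key=lambda i: (polygon[i][1], polygon[i][0]))
    triangles = []
    for i in range(m):
        j = (i + 1) % m
        if v in (i, j):
            continue  # this edge contains the apex v, so it spans no triangle
        triangles.append((polygon[v], polygon[i], polygon[j]))
    return triangles

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(bottom_vertex_triangulation_2d(square))
```

A convex polygon with m vertices is split into m - 2 triangles this way, in line with the claim that the number of simplices is proportional to the number of vertices.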

[Figure: example of a bottom-vertex triangulation, d = 3]

It is not difficult to check that this yields a triangulation of P (even a simplicial
complex, although this is not needed in the present proof), and that if P
is a simple polytope, then the total number of simplices in this triangulation
is at most proportional to the number of vertices of P (with the constant of
proportionality depending on d); see Exercise 4.
All the bounded cells of the arrangement of S are triangulated in this way.
Some care is needed for the unbounded cells, and several ways are available.
One of the simplest is to intersect the arrangement with a sufficiently large
box containing all the vertices and construct the $\frac{1}{r}$-cutting only inside that
box. This is sufficient for most applications of $\frac{1}{r}$-cuttings. Alternatively (and
almost equivalently), we can consider the whole arrangement in the projective
d-space instead of $\mathbf{R}^d$. We omit a detailed discussion of this aspect.
In this way we obtain a triangulation T(S) for every subset S of the given
set of hyperplanes. The analogue of (C3) is $|T(S)| = O(|S|^d + 1)$, which follows
(assuming H in general position) because the number of simplices in each cell
is proportional to the number of its vertices, and the total number of vertices
is $O(|S|^d)$.
The set $I(\Delta)$ consists of all hyperplanes intersecting the interior of a simplex
$\Delta$, and $D(\Delta)$ consists of all the hyperplanes incident to at least one vertex
of $\Delta$. We again need to assume that our hyperplanes are in general position.
Then, obviously, $|D(\Delta)| \le d(d+1)$, and a more careful argument shows
that $|D(\Delta)| \le \frac{d(d+3)}{2}$. The important thing is that an analogue of (C0)
holds, namely, that both $|D(\Delta)|$ and the number of $\Delta$ with a given $D(\Delta)$ are
bounded by constants.
The condition (C1) holds trivially. The "locality" condition (C2) does
need some work, although it is not too difficult, and we refer to Chazelle and
Friedman [CF90] for a detailed argument.
With (C0)-(C3) in place, the whole proof proceeds exactly as in the planar
case. To get the analogue of (6.4), namely $\mathbf{E}[|T(S)|] = O(r^d + 1)$, we need
the fact that $\mathbf{E}[|S|^d] = O(r^d)$ (this is what we avoided in the proof of the
higher-dimensional Clarkson's theorem on levels by passing to another way
of sampling); see Exercise 2(b) or 3.
Further generalizations. An analogue of Proposition 6.5.2 can be derived
from conditions (C0)-(C3) in a general abstract framework. It provides optimal
$\frac{1}{r}$-cuttings not only for arrangements of hyperplanes but also in other
situations, whenever one can define a suitable decomposition scheme satisfying
(C0)-(C3) and bound the maximum number of cells in the decomposition
(the latter is a challenging open problem for arrangements of bounded-degree
algebraic surfaces). The significance of Proposition 6.5.2 reaches beyond the
construction of cuttings; its variations have been used extensively, mainly in
the analysis of geometric algorithms. We are going to encounter a combinatorial
application in Chapter 11.

Bibliography and remarks. The proof of the cutting lemma as in
this section (with a different way of sampling) is due to Chazelle and
Friedman [CF90]. Analogues of Proposition 6.5.2, or more precisely the
consequence stating that the expectation of the cth-degree average of
the excess is bounded by a constant, were first proved and applied
by Clarkson [Cla88a] (see Clarkson and Shor [CS89] for the journal
version). Since then, they have become one of the indispensable tools in the
analysis of randomized geometric algorithms, as is illustrated by the
book by Mulmuley [Mul93a], for example, as well as by many newer
papers.
The bottom-vertex triangulation (also called the canonical triangulation
in some papers) was defined in Clarkson [Cla88b].
Proposition 6.5.2 can be formulated and proved in an abstract
framework, where H and Reg are some finite sets and $T: 2^H \to 2^{\mathrm{Reg}}$,
$I: \mathrm{Reg} \to 2^H$, and $D: \mathrm{Reg} \to 2^H$ are mappings satisfying (C0) (with
some constants), (C1), (C2), and an analogue of (C3) that bounds the
expected size of T(S) for a random $S \subseteq H$ by a suitable function of
r, typically by $O(r^k)$ for some real constant $k \ge 1$. The conclusion
is $\mathbf{E}[|T(S)_{\ge t}|] = O(2^{-t}r^k)$. Very similar abstract frameworks are discussed
in Mulmuley [Mul93a] and in De Berg, Van Kreveld, Overmars,
and Schwarzkopf [dBvKOS97].
The axiom (C2) can be weakened to the following:
(C2') If $\Delta \in T(S)$ and $S' \subseteq S$ satisfies $D(\Delta) \subseteq S'$, then $\Delta \in T(S')$.
That is, $\Delta$ cannot be destroyed by deleting elements of S unless we
delete an element of $D(\Delta)$.
A typical situation where (C2') holds although (C2) fails is that
in which H is a set of lines in the plane and T(S) are the trapezoids
in the vertical decomposition of the cell in the arrangement of S that
contains some fixed point, say 0. Then $\Delta$ can be made to disappear
by adding a line to S even if that line does not intersect $\Delta$, as is
illustrated below:

This weaker axiom was first used instead of (C2) by Chazelle, Edelsbrunner,
Guibas, Sharir, and Snoeyink [CEG+93]. For a proof of a
counterpart of Proposition 6.5.2 under (C2') see Agarwal, Matousek,
and Schwarzkopf [AMS98].
Yet another proof of the cutting lemma in arbitrary dimension was
invented by Chazelle [Cha93a]. An outline of the argument can also
be found in Chazelle's book [Cha00c] or in the chapter by Matousek
in [SU00].
Both the proofs of the higher-dimensional cutting lemma depend
crucially on the fact that the arrangement of n hyperplanes in $\mathbf{R}^d$, d
fixed, can be triangulated using $O(n^d)$ simplices. As was explained in
Section 6.2, the arrangement of n bounded-degree algebraic surfaces
in $\mathbf{R}^d$ has $O(n^d)$ faces in total, but the faces can be arbitrarily complicated.
A challenging open problem is whether each face can be further
decomposed into "simple" pieces (each of them defined by a constant-bounded
number of bounded-degree algebraic inequalities) such that
the total number of pieces for the whole arrangement is $O(n^d)$ or not
much larger. This is easy for d = 2 (the vertical decomposition will
do), but dimension 3 is already quite challenging. Chazelle, Edelsbrunner,
Guibas, and Sharir [CEGS89] found a general argument that
provides an $O(n^{2d-2})$ bound in dimension d using a suitable vertical
decomposition. By proving a near-optimal bound in the 3-dimensional
case and using it as a basis of the induction, they obtained the bound
of $O(n^{2d-3}\beta(n))$, where $\beta$ is a very slowly growing function (much
smaller than $\log^* n$). Recently Koltun [Kol01] established a near-tight
bound in the 4-dimensional situation, which pushed the general bound
to $O(n^{2d-4+\varepsilon})$ for every fixed $d \ge 4$.
This decomposition problem is the main obstacle to proving an
optimal or near-optimal cutting lemma for arrangements of algebraic
surfaces. For some special cases, say for an arrangement of spheres in
$\mathbf{R}^d$, optimal decompositions are known and an optimal cutting lemma
can be obtained. In general, if one can prove a bound of $O(n^a)$ for
the number of pieces in the decomposition, then the techniques of
Chapter 10 yield $\frac{1}{r}$-cuttings of size $O(r^a \log^a r)$, and if, moreover, the
locality condition (C2) can be guaranteed, then the method of the
present section leads to $\frac{1}{r}$-cuttings of size $O(r^a)$.

Exercises
1. Estimate the largest k = k(n) such that in a row of n tosses of a fair coin
we obtain k consecutive tails with probability at least $\frac{1}{2}$. In particular,
using the trick with blocks in the text, check that for $k = \lfloor \frac{1}{2}\log_2 n \rfloor$, the
probability of not getting all tails in any of the blocks is exponentially
small (as a function of n). IT]
2. Let $X = X_1 + X_2 + \cdots + X_n$, where the $X_i$ are independent random
variables, each attaining the value 1 with probability p and the value 0
with probability 1 - p.
(a) Calculate $\mathbf{E}[X^2]$. IT]
(b) Prove that for every integer $d \ge 1$ there exists $c_d$ such that
$\mathbf{E}[X^d] \le (np + c_d)^d$. (You can use a Chernoff-type inequality, or prove by induction
that $\mathbf{E}[(X + a)^d] \le (np + d + a)^d$ for all nonnegative integers n, d, and
a.) @J
(c) Use (b) to prove that $\mathbf{E}[X^\alpha] \le (np + c_\alpha)^\alpha$ also holds for nonintegral
$\alpha \ge 1$. @J
3. Let $X = X_1 + X_2 + \cdots + X_n$ be as in the previous exercise. Show that
$\mathbf{E}\big[\binom{X}{d}\big] = p^d\binom{n}{d}$ (where $d \ge 0$ is an integer) and conclude that
$\mathbf{E}[X^d] \le c_d(np)^d$ for $np \ge d$ and a suitable $c_d > 0$. @J
4. Let P be a d-dimensional simple convex polytope. Prove that the bottom-vertex
triangulation of P has at most $C_d f_0(P)$ simplices, where $C_d$ depends
only on d and $f_0(P)$ denotes the number of vertices of P. ~
7

Lower Envelopes

This is a continuation of the chapter on arrangements. We again study the
number of vertices in a certain part of the arrangement: the lower envelope.
Already for segments in the plane, this problem has an unexpectedly subtle
and difficult answer. The closely related combinatorial notion of Davenport-
Schinzel sequences has proved to be a useful general tool, since the surprising
phenomena encountered in the analysis of the lower envelope of segments are
by no means rare in combinatorics and discrete geometry.
The chapter has two rather independent parts. After a common introduc-
tion in Section 7.1, lower envelopes in the plane are discussed in Sections 7.2
through 7.4 using Davenport-Schinzel sequences. Sections 7.5 and 7.6 gently
introduce the reader to geometric methods for analyzing higher-dimensional
lower envelopes, finishing with a quick overview of known results in Sec-
tion 7.7.

7.1 Segments and Davenport-Schinzel Sequences


The following question is extremely natural: What is the maximum possible
combinatorial complexity of a single cell in an arrangement of n segments?
(The arrangement of segments was defined in Section 6.2.)
The complexity of a cell can be measured as the number of vertices and
edges on its boundary. It is immediate that the number of edges is at most
proportional to the number of vertices plus 2n, the total number of endpoints
of the segments, and so it suffices to count the vertices.
Here we mainly consider a slightly simpler question: the maximum complexity
of the lower envelope of n segments. Informally, the lower envelope of
an arrangement is the part that can be seen by an observer sitting at $(0, -\infty)$
and looking upward. In the picture below, the lower envelope of 4 segments
is drawn thick:

If we think of the segments as graphs of (partially defined) functions, the
lower envelope is the graph of the pointwise minimum. It consists of pieces
of the segments, and we are interested in the maximum possible number of
these pieces (in the drawing, we have 7 pieces). Let us denote this maximum
by $\sigma(n)$.
Davenport-Schinzel sequences. A tight upper bound for $\sigma(n)$ has been
obtained via a combinatorial abstraction of lower envelopes, the so-called
Davenport-Schinzel sequences. These are closely related to segments, but
the most natural way of introducing them is starting from curves. Let us
consider a finite set of curves in the plane, such as in the following picture:

We suppose that each curve is a graph of a continuous function $\mathbf{R} \to \mathbf{R}$; in
other words, each vertical line intersects it exactly once. Most significantly,
we assume that every two of the curves intersect in at most s points for some
constant s. This condition holds, for example, if the curves are the graphs of
polynomials of degree at most s.
Let us number the curves 1 through n, and let us write down the sequence
of the numbers of the curves along the lower envelope from left to right:

[Figure: the lower envelope, with its pieces labeled by curve numbers 1 2 3 1 2]

We obtain a sequence $a_1 a_2 a_3 \ldots a_\ell$ with the following properties:

(i) For all i, $a_i \in \{1, 2, \ldots, n\}$.
(ii) No two adjacent terms coincide; i.e., $a_i \ne a_{i+1}$.
(iii) There is no (not necessarily contiguous) subsequence of the form

\[ \underbrace{\ldots a \ldots b \ldots a \ldots b \ldots\ldots a \ldots b \ldots}_{s+2 \text{ letters } a \text{ and } b} \]

where $a \ne b$. In other words, there are no indices $i_1 < i_2 < i_3 < \cdots < i_{s+2}$
with $a_{i_1} \ne a_{i_2}$, $a_{i_1} = a_{i_3} = a_{i_5} = \cdots$, and $a_{i_2} = a_{i_4} = a_{i_6} = \cdots$.
Only (iii) needs a little thought: It suffices to note that between an occurrence
of a curve a and an occurrence of a curve b on the lower envelope, a and b
have to intersect.
Any finite sequence satisfying (i)-(iii) is called a Davenport-Schinzel se-
quence of order s over the symbols 1,2, ... , n. It is not important that the
terms of the sequence are the numbers 1,2, ... , n; often it is convenient to
use some other set of n distinct symbols.
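Conditions (ii) and (iii) are easy to test mechanically. The following Python sketch uses the observation that the longest alternation of two symbols a and b inside a sequence equals the number of runs in the projection of the sequence onto {a, b}; condition (iii) then says that this number is at most s + 1 for every pair. The function name is hypothetical, and condition (i) is taken for granted (any hashable symbols are accepted).

```python
from itertools import combinations

def is_davenport_schinzel(seq, s):
    """Check conditions (ii) and (iii) for a sequence of hashable symbols."""
    # (ii): no two adjacent terms coincide.
    if any(x == y for x, y in zip(seq, seq[1:])):
        return False
    # (iii): for each pair a != b, count the runs in the projection onto {a, b};
    # s + 2 or more runs would give a forbidden alternation of length s + 2.
    for a, b in combinations(set(seq), 2):
        runs = 0
        previous = None
        for x in seq:
            if x in (a, b) and x != previous:
                runs += 1
                previous = x
        if runs >= s + 2:
            return False
    return True

print(is_davenport_schinzel([1, 2, 3, 1, 2], 3))   # no alternation of length 5 here
print(is_davenport_schinzel([1, 2, 1, 2, 1], 3))   # contains ababa
```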
Let us remark that every Davenport-Schinzel sequence of order s over n
symbols corresponds to the lower envelope of a suitable set of n curves with at
most s intersections for each pair of curves (Exercise 1). On the other hand,
very little is known about the realizability of Davenport-Schinzel sequences
by graphs of polynomials of degree s, say.
We will mostly consider Davenport-Schinzel sequences of order 3. This is
the simplest nontrivial case and also the one closely related to lower envelopes
of segments. Every two segments intersect at most once, and so it might
seem that their lower envelope gives rise to a Davenport-Schinzel sequence
of order 1, but this is not the case! The segments are graphs of partially defined
functions, while the discussion above concerns graphs of functions defined on
all of R. We can convert each segment into a graph of an everywhere-defined
function by appending very steep rays to both endpoints:

All the left rays are parallel, and all the right ones are parallel. Then every two
of these curves have at most 3 intersections, and so if the considered segments
are numbered 1 through n and we write the sequence of their numbers along
the lower envelope, we get a Davenport-Schinzel sequence of order 3 (no
ababa).
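The sequence of segment numbers along the lower envelope can be computed by brute force, as in the following Python sketch. It assumes non-vertical segments given as pairs of endpoints with increasing x-coordinates, uses floating-point arithmetic, and tests one sample point between every two consecutive breakpoints; the function name is hypothetical and no attempt is made at efficiency or at handling degenerate inputs.

```python
def lower_envelope_sequence(segments):
    """segments: list of ((x1, y1), (x2, y2)) with x1 < x2.
    Returns the indices of the segments along the lower envelope, left to right."""
    def y_at(seg, x):
        (x1, y1), (x2, y2) = seg
        return y1 + (y2 - y1) * (x - x1) / (x2 - x1)

    # Breakpoints: all endpoint x-coordinates and all crossings of supporting lines.
    xs = set()
    for (x1, _), (x2, _) in segments:
        xs.update((x1, x2))
    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            (a1, b1), (a2, b2) = segments[i]
            (c1, d1), (c2, d2) = segments[j]
            m1 = (b2 - b1) / (a2 - a1)
            m2 = (d2 - d1) / (c2 - c1)
            if m1 != m2:
                xs.add(((d1 - m2 * c1) - (b1 - m1 * a1)) / (m1 - m2))
    xs = sorted(xs)

    sequence = []
    for left, right in zip(xs, xs[1:]):
        mid = (left + right) / 2
        present = [i for i, ((x1, _), (x2, _)) in enumerate(segments) if x1 <= mid <= x2]
        if not present:
            continue  # no segment lies above this interval
        lowest = min(present, key=lambda i: y_at(segments[i], mid))
        if not sequence or sequence[-1] != lowest:
            sequence.append(lowest)
    return sequence

# A long segment with a short one below it: the long one appears twice.
print(lower_envelope_sequence([((0, 1), (10, 1)), ((4, 0), (6, 0))]))
```

The example shows why order 1 is not enough for segments: two segments that do not even intersect already produce the pattern aba.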
Let $\lambda_s(n)$ denote the maximum possible length of a Davenport-Schinzel
sequence of order s over n symbols. Some work is needed to see that $\lambda_s(n)$ is
finite for all s and n; the reader is invited to try this. The bound $\lambda_1(n) = n$ is
trivial, and $\lambda_2(n) = 2n-1$ is a simple exercise. Determining the asymptotics
of $\lambda_3(n)$ is a hard problem; it was posed in 1965 and solved in the mid-1980s.
We will describe the solution later, but here we start more modestly: with a
reasonable upper bound on $\lambda_3(n)$.
7.1.1 Proposition. We have $\sigma(n) \le \lambda_3(n) \le 2n\ln n + 3n$.

Proof. Let w be a Davenport-Schinzel sequence of order 3 over n symbols.
If the length of w is $\ell$, then there is a symbol a occurring at most $\frac{\ell}{n}$ times
in w. Let us remove all occurrences of such a from w. The resulting sequence
can contain some pairs of adjacent equal symbols. But we claim that there
can be at most 2 such pairs, coming from the first and last occurrences of a.
Indeed, if some a which is neither the first nor the last a in w were surrounded
by some b from both sides, we would have the situation ... a ... bab ... a ...
with the forbidden pattern ababa. So by deleting all the a and at most 2
more symbols, we obtain a Davenport-Schinzel sequence of order 3 over n-1
symbols.
We arrive at the recurrence

\[ \lambda_3(n) \le \lambda_3(n-1) + \frac{\lambda_3(n)}{n} + 2, \]

which can be rewritten to

\[ \frac{\lambda_3(n)}{n} \le \frac{\lambda_3(n-1)}{n-1} + \frac{2}{n-1} \]

(we saw such a recurrence in the proof of the zone theorem). Together with
$\lambda_3(1) = 1$ this yields

\[ \frac{\lambda_3(n)}{n} \le 1 + 2\Big(1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n-1}\Big), \]

and so $\lambda_3(n) \le 2n\ln n + 3n$ as claimed. □
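The last step of the proof, passing from the harmonic sum to the closed form, can be checked numerically. The Python sketch below compares the bound obtained by unrolling the recurrence, $n\,(1 + 2(1 + \frac{1}{2} + \cdots + \frac{1}{n-1}))$, with $2n\ln n + 3n$; the function names are hypothetical, and the comparison over a finite range is an illustration rather than a proof.

```python
import math

def recurrence_bound(n):
    """n * (1 + 2 * H_{n-1}), the upper bound obtained by unrolling the recurrence."""
    harmonic = sum(1.0 / k for k in range(1, n))
    return n * (1.0 + 2.0 * harmonic)

def closed_form_bound(n):
    """The bound 2 n ln n + 3 n from the proposition."""
    return 2.0 * n * math.log(n) + 3.0 * n

for n in (1, 2, 10, 100, 1000):
    print(n, recurrence_bound(n), closed_form_bound(n))
```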
Bibliography and remarks. A detailed account of the history of
Davenport-Schinzel sequences and of the analysis of lower envelopes,
with references up until 1995, can be found in the book of Sharir
and Agarwal [SA95]. Somewhat more recent results are included in
their surveys [AS00b] and [AS00a]. We sketch this development and
mention some newer results in the notes to Section 7.3.

Exercises
1. Let w be a Davenport-Schinzel sequence of order s over the symbols
1,2, ... , n. Construct a family of planar curves hI, h2, ... ,hn, each of
them intersecting every vertical line exactly once and each two intersect-
ing in at most s points, such that the sequence of the numbers of the
curves along the lower envelope is exactly w. ~
2. Prove that A2(n) = 2n-1 (the forbidden pattern is abab). I2l
3. Prove that for every nand s, As(n) ~ 1 + (s+l)G).12l
4. Show that the lower envelope of n rays in the plane has O(n) complexity.
8]
5. (Planar zone theorem via Davenport-Schinzel sequences) Prove the zone
theorem (Theorem 6.4.1) for d = 2 using the fact that $\lambda_2(n) = O(n)$.
Consider only the part above the line g, and assign one symbol to each
side of each line. 8]
6. Let $g_1, g_2, \ldots, g_m \subset \mathbf{R}^2$ be graphs of piecewise linear functions $\mathbf{R} \to \mathbf{R}$
that together consist of n segments and rays. Prove that the lower
envelope of $g_1, g_2, \ldots, g_m$ has complexity $O(\frac{n}{m}\lambda_3(2m))$; in particular, if
$m = O(1)$, then the complexity is linear. 8]
7. Let $P_1, P_2, \ldots, P_m$ be convex polygons (not necessarily disjoint!) in the
plane with n vertices in total such that no vertex is common to two or
more $P_i$ and the vertices form a point set in general position. Prove that
the number of lines that intersect all the $P_i$ and are tangent to at least
two of them is at most $O(\lambda_3(n))$. [II
8. (Dynamic lower envelope of lines) Let $\ell_1, \ell_2, \ldots, \ell_n$ be lines in the plane in
general position (in particular, none of them is vertical). At each moment
t of time, only a certain subset $L_t$ of the lines is present: $\ell_i$ is inserted
at time $s_i$ and it is removed at time $t_i > s_i$. We are interested in the
maximum possible total number f(n) of vertices of the arrangement of
the $\ell_i$ that appear as vertices of the lower envelope of $L_t$ for at least one
$t \in \mathbf{R}$.
(a) Show that $f(n) = \Omega(\sigma(n))$, where $\sigma(n)$ is the maximum complexity
of the lower envelope of n segments. [II
(b) Prove that $f(n) = O(n \log n)$. (Familiarity with data structures like
segment trees or interval trees may be helpful.) 0
These results are from Tamir [Tam88], and improving the lower bound
or the upper bound is a nice open problem.

7.2 Segments: Superlinear Complexity of the Lower Envelope
In Proposition 7.1.1 we have shown that the lower envelope of n segments
has complexity at most $O(n \log n)$, but it turns out that the true complexity
is still lower. With this information, the next reasonable guess would be that
perhaps the complexity is linear in n. The truth is much subtler, though: On
the one hand, the complexity behaves like a linear function for all practical
purposes, but on the other hand, it cannot be bounded by any linear function:
It outgrows the function $n \mapsto Cn$ for every fixed C. We present an ingenious
construction witnessing this.

7.2.1 Theorem. The function $\sigma(n)$, the maximum combinatorial complexity
of the lower envelope of n segments in the plane, is superlinear. That is,
for every C there exists an $n_0$ such that $\sigma(n_0) \ge Cn_0$. Consequently, $\lambda_3(n)$,
the maximum length of a Davenport-Schinzel sequence of order 3, is superlinear,
too.

Proof. For all integers $k, m \ge 1$ we construct a set $S_k(m)$ of segments
in the plane. Let $n_k(m) = |S_k(m)|$ be the number of segments and let $e_k(m)$
denote the number of arrangement vertices and segment endpoints on the
lower envelope of $S_k(m)$. We prove that $e_k(m) \ge \frac{1}{2} k \cdot n_k(m)$. In particular,
for m = 1 and $k \to \infty$, this shows that the complexity of the lower envelope
is nonlinear in the number of segments.
If we really need only the case m = 1, then what is the parameter m
good for? The answer is that we proceed by double induction, on both k
and m, and in order to specify $S_k(1)$, for example, we need $S_{k-1}(2)$. Results
of mathematical logic, which are beyond the scope of this book, show that
double induction is in some sense unavoidable: The "usual" induction on a
single variable is too crude to distinguish $\sigma(n)$ from a linear function.
The segments in Sk(m) are usually not in general position, but they are
aggregated in fans by m segments. A fan of m segments is illustrated below
for m = 4:

All the segments of a fan have a common left endpoint and positive slopes, and
the length of the segments increases with the slope. Other than forming the
fans, the segments are in general position in an obvious sense. For example,
no endpoint of a segment lies inside another segment, the endpoints do not
coincide unless the segments are in a common fan, and so on.
Let $f_k(m)$ denote the number of fans forming $S_k(m)$; we have $n_k(m) =
m \cdot f_k(m)$.
First we describe the construction of Sk(m) roughly, and later we make
precise some finer aspects. As was already mentioned, we proceed by induc-
tion on k and m. One of the invariants of the construction is that the left
endpoints of all the fans of Sk(m) always show up on the lower envelope.
First we specify the boundary cases with k = 1 or m = 1. For k = 1,
$S_1(m)$ is simply a single fan with m segments. For m = 1, $S_k(1)$ is obtained
from $S_{k-1}(2)$ by the following transformation of each fan (each fan has 2
segments):

The lower segment in each fan is translated by the same tiny amount to the
left.
Now we describe the construction of $S_k(m)$ for general $k, m \ge 2$. First we
construct Sk(m-1) inductively. We shrink this Sk(m-1) both vertically and
horizontally by a suitable affine transform; the vertical shrinking is much
more intensive than the horizontal one, so that all segments become very
short and almost horizontal. Let S' be the transformed Sk(m-1). We will use
many translated copies of S' as "microscopic" ingredients in the construction
of Sk(m).
The "master plan" of the construction is obtained from $S_{k-1}(M)$, where
$M = f_k(m-1)$ is the number of fans in S'. Namely, we first shrink $S_{k-1}(M)$
vertically so that all segments become nearly horizontal, and then we apply
the affine transform $(x, y) \mapsto (x, x + y)$ so that the slopes of all the segments
are just a little over 1. Let S* denote the resulting set.
For each fan F in the master construction S*, we make a copy $S'_F$ of
the microscopic construction S' and place it so that its leftmost endpoint
coincides with the left endpoint of F. Let the segments of F be $s_1, \ldots, s_M$,
numbered by increasing slopes, and let $\ell_1, \ldots, \ell_M$ be the left endpoints of
the fans in $S'_F$, numbered from left to right. The fan F is gigantic compared
to $S'_F$. Now we take F apart: We translate each $s_i$ so that its left endpoint
goes to $\ell_i$. The following drawing shows this schematically, since we have no
chance to make a realistic drawing of $S_k(m-1)$. Only a very small part of F
near its left endpoint is shown.
[Figure: the translated segments $s_1, s_2, s_3, s_4$ of the fan F near its left endpoint]

This construction yields $S_k(m)$. It correctly produces fans of size m, by appending
one top (and long) segment to each fan in every $S'_F$. If S' was taken
sufficiently tiny, then all the vertices of the lower envelope of S* are preserved,
as well as those in each $S'_F$. Crucially, we need to make sure that the
above transformation of each fan F in S* yields M-1 new vertices on the
envelope, as is indicated below:

The new vertices lie on the right of $S'_F$ but, in the scale of the master construction
S*, very close to the former left endpoint of F, and so they indeed
appear on the lower envelope.
This is where we need to make the whole construction more precise,
namely, to say more about the structure of the fans in Sk(m). Let us call
a fan r-escalating if the ratio of the slopes of every two successive segments
in the fan is at least r. It is not difficult to check that for any given r > 1,
the construction of Sk(m) described above can be arranged so that all fans
in the resulting set are r-escalating.
Then, in order to guarantee that the M -1 new vertices per fan arise in
the general inductive step described above, we make sure that the fans in the
master construction S* are affine transforms of r-escalating fans for a suitable
very large r. More precisely, let Q be a given number and let r = r(k, Q) be
sufficiently large and $\delta = \delta(k, Q) > 0$ sufficiently small. Let F arise from an
r-escalating fan by the affine transformation described above (which makes
all slopes a little bigger than 1), and assume that the shortest segment has
length 1, say. Suppose that we translate the left endpoint of $s_i$, the segment
with the ith smallest slope in F, by $\delta_1 + \delta_2 + \cdots + \delta_i$ almost horizontally to
the right, where $\delta \le \delta_i \le Q\delta$. Then it is not difficult to see, or calculate, that
the lower envelope of the translated segments of F looks combinatorially like
that in the last picture and has M-1 new vertices. The reader who is not
satisfied with this informal argument can find real and detailed calculations
in the book [SA95].
We want to prove that the complexity of the lower envelope of $S_k(m)$ is
at least $\frac{1}{2} km$ times the number of fans; in our notation,
\[ e_k(m) \ge \tfrac{1}{2} km \cdot f_k(m). \]
This is simple to do by induction, although the numbers involved are frighteningly
large. For k = 1, we have $f_1(m) = 1$ and $e_1(m) = m+1$, so we are fine.
For m = 1, we obtain $f_k(1) = 2f_{k-1}(2)$ and $e_k(1) = e_{k-1}(2) + 2f_{k-1}(2) \ge
\frac{1}{2}(k-1) \cdot 2 \cdot f_{k-1}(2) + 2f_{k-1}(2) = (k+1) \cdot f_{k-1}(2) > \frac{1}{2} k \cdot f_k(1)$.
In the construction of $S_k(m)$ for $k, m \ge 2$, each of the $f_{k-1}(M)$ fans of
the master construction S* produces $M = f_k(m-1)$ fans, and so
\[ f_k(m) = M \cdot f_{k-1}(M). \]
For the envelope complexity we get a contribution of $e_{k-1}(M)$ from S*,
$e_k(m-1)$ from each copy of S', and M-1 new vertices for each copy of S'.
Putting this together and using the inductive assumption to eliminate the
function e, we have

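\[
\begin{aligned}
e_k(m) &\ge e_{k-1}(M) + f_{k-1}(M)\left[e_k(m-1) + M - 1\right] \\
&\ge f_{k-1}(M) \cdot \left[\tfrac{1}{2}(k-1)M + \tfrac{1}{2}k(m-1)M + M - 1\right] \\
&\ge f_{k-1}(M) \cdot \left[\tfrac{1}{2}kM + \tfrac{1}{2}k(m-1)M\right] \\
&= \tfrac{1}{2}km \cdot M \cdot f_{k-1}(M) = \tfrac{1}{2}km \cdot f_k(m).
\end{aligned}
\]

Theorem 7.2.1 is proved. □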
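The bookkeeping of the induction can be replayed numerically for the few parameter values that stay small. The sketch below is an illustration under assumptions: it treats the counting relations of the proof as defining recurrences (equality in place of the inequality for the envelope count), and the names `fans` and `envelope_lower_bound` are hypothetical. It checks only the arithmetic of the induction, not the geometry of the construction.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fans(k, m):
    """Number of fans f_k(m) in S_k(m), following the recurrences in the proof."""
    if k == 1:
        return 1
    if m == 1:
        return 2 * fans(k - 1, 2)
    M = fans(k, m - 1)
    return M * fans(k - 1, M)

@lru_cache(maxsize=None)
def envelope_lower_bound(k, m):
    """Lower bound on e_k(m) obtained by adding up the counted contributions."""
    if k == 1:
        return m + 1
    if m == 1:
        return envelope_lower_bound(k - 1, 2) + 2 * fans(k - 1, 2)
    M = fans(k, m - 1)
    return (envelope_lower_bound(k - 1, M)
            + fans(k - 1, M) * (envelope_lower_bound(k, m - 1) + M - 1))

if __name__ == "__main__":
    # Only tiny parameters are feasible: the values (and the recursion depth) explode quickly.
    for k in range(1, 4):
        for m in range(1, 5):
            f, e = fans(k, m), envelope_lower_bound(k, m)
            print(f"k={k} m={m} fans={f} segments={m * f} envelope>={e}")
```

The ratio of the envelope bound to the number of segments m · f_k(m) is at least k/2 in every printed row, which is the superlinearity statement in miniature.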

Note how the properties of the construction Sk(m) contradict the intuition
gained from small pictures: Most of the segments appear many times on
the lower envelope, and between two successive segment endpoints on the
envelope there is typically a concave arc with quite a large number of vertices.

Bibliography and remarks. An example of n segments with superlinear
complexity of the lower envelope was first obtained by Wiernik
and Sharir [WS88], based on an abstract combinatorial construction
of Davenport-Schinzel sequences of order 3 due to Hart and Sharir
[HS86]. The simpler construction shown in this section was found by
Shor (in an unpublished manuscript; a detailed presentation is given
in [SA95]).

Exercises
1. Construct Davenport-Schinzel sequences of order 3 of superlinear length
directly. That is, rephrase the construction explained in this section in
terms of Davenport-Schinzel sequences instead of segments. 0

7.3 More on Davenport-Schinzel Sequences


Here we come back to the asymptotics of the Davenport-Schinzel sequences.
We have already proved that $\lambda_3(n)/n$ is unbounded. It even turns out that
the construction in the proof of Theorem 7.2.1 yields an asymptotically tight
lower bound for $\lambda_3(n)$, which is of order $n\alpha(n)$. Of course, we should explain
what $\alpha(n)$ is.
In order to define the extremely slowly growing function $\alpha$, we first introduce
a hierarchy of very fast growing functions $A_1, A_2, \ldots$. We put

\[ A_1(n) = 2n, \]
\[ A_k(n) = A_{k-1} \circ A_{k-1} \circ \cdots \circ A_{k-1}(1) \quad \text{(n-fold composition)}, \quad k = 2, 3, \ldots. \]

Only the first few of these functions can be described in usual terms: We have
$A_2(n) = 2^n$ and $A_3(n) = 2^{2^{\cdot^{\cdot^{2}}}}$ with n twos in the exponential tower. The
Ackermann function¹ A(n) is defined by diagonalizing this hierarchy:

\[ A(n) = A_n(n). \]

And $\alpha$ is the inverse function to A:

\[ \alpha(n) = \min\{k \ge 1 : A(k) \ge n\}. \]

Since A(4) is a tower of 2's of height $2^{16}$, encountering a number n with
$\alpha(n) > 4$ in any physical sense is extremely unlikely.
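The hierarchy and its inverse can be evaluated directly as long as no value beyond the argument n is ever materialized. The sketch below is one way to do this under assumptions: `hierarchy` computes min(A_k(n), cap) with saturation at a cap, which is legitimate because every A_k is increasing and satisfies A_k(n) ≥ 2n; both function names are illustrative.

```python
def hierarchy(k, n, cap):
    """Return min(A_k(n), cap), where A_1(n) = 2n and A_k is the n-fold
    composition of A_{k-1} applied to 1. Saturating at cap avoids huge numbers."""
    if n >= cap:
        return cap          # A_k(n) >= n, so the result is already at least cap
    if k == 1:
        return min(2 * n, cap)
    value, i = 1, 0
    while i < n:
        value = hierarchy(k - 1, value, cap)
        if value >= cap:
            return cap
        i += 1
    return value

def alpha(n):
    """Inverse Ackermann function: the least k >= 1 with A(k) = A_k(k) >= n (for n >= 1)."""
    k = 1
    while hierarchy(k, k, n) < n:
        k += 1
    return k

if __name__ == "__main__":
    print([hierarchy(2, n, 10**9) for n in range(1, 6)])   # powers of two
    print([hierarchy(3, n, 10**9) for n in range(1, 5)])   # tower function
    print([alpha(n) for n in (1, 2, 3, 4, 5, 16, 17, 10**80)])
```

With this definition A(1) = 2, A(2) = 4, and A(3) = 16, so every n from 17 up to the astronomically large A(4) has α(n) = 4.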

1 Several versions of the Ackermann function can be found in the literature, dif-
fering in minor details but with similar properties and orders of magnitude.

The Ackermann function was invented as an example of a function growing
faster than any primitive recursive function. For people familiar with
some of the usual programming languages, the following semiformal expla-
nation can be given: No function as large as A(n) can be evaluated by a
program containing only FOR loops, where the number of repetitions of each
loop in the program has been computed before the loop begins. For a long
time, it was thought that A(n) was a curiosity irrelevant to "natural" math-
ematical problems. Then theoretical computer scientists discovered it in the
analysis of an extremely simple algorithm that manipulates rooted trees, and
subsequently it was found in the backyard of elementary geometry, namely
in the asymptotics of the Davenport-Schinzel sequences.
As was already remarked above, a not too difficult analysis of the construction
in Theorem 7.2.1 shows that $\lambda_3(n) = \Omega(n\alpha(n))$. This is the correct
order of magnitude, and we will (almost) present the matching upper bound
in the next section. Even the constants in the asymptotics of $\lambda_3(n)$ are known
with surprising precision. Namely, we have

\[ \tfrac{1}{2} n\alpha(n) - 2n \le \lambda_3(n) \le 2n\alpha(n) + O\left(\sqrt{n\alpha(n)}\right), \]

and so the gap in the main term is only a factor of 4, in spite of the complexity
of the whole problem!
Higher-order Davenport-Schinzel sequences and their generalizations.
The asymptotics of the functions $\lambda_s(n)$ for fixed s > 3, which correspond
to forbidden patterns ababa... with s+2 letters, is known quite well,
although not entirely precisely. In particular, $\lambda_4(n)$ is of the (strange) order
$n \cdot 2^{\alpha(n)}$, and for a general fixed s, we have

\[ n \cdot 2^{p_s(\alpha(n))} \le \lambda_s(n) \le n \cdot 2^{q_s(\alpha(n))}, \]

where $p_s(x)$ is a polynomial of degree $\lfloor (s-2)/2 \rfloor$ (with a positive leading
coefficient) and $q_s(x)$ is a polynomial of the same degree, for s odd multiplied by
$\log x$. The proofs are similar in spirit to those shown for s = 3 but technically
much more complicated. On the other hand, proving something like
$\lambda_s(n) = O(n \log^* n)$ for every fixed s is not very difficult with the tricks from
the proof of Proposition 7.4.2 below (see Exercise 7.4.1).
The Davenport-Schinzel sequences have the simple alternating forbidden
pattern ababa . ... More generally, one can consider sequences with an arbi-
trary fixed forbidden pattern v, such as abcdabcdabcd, where a, b, c, d must be
distinct symbols. Of course, here it is not sufficient to require that every two
successive symbols in the sequence be distinct, since then the whole sequence
could be 121212 ... of arbitrary length. To get a meaningful problem, one can
assume that if the forbidden pattern v has k distinct letters (k = 4 in our
example), then each k consecutive letters in the considered sequence avoiding
v must be distinct. Let Ex( v, n) denote the maximum possible length of such
a sequence over n symbols. It is known that for every fixed v, we have
\[ \mathrm{Ex}(v, n) \le O\left(n \cdot 2^{\alpha(n)^c}\right) \]

for a suitable exponent c = c(v). In particular, the length of such sequences
is nearly linear in n. Moreover, many classes of patterns v are known with
Ex(v, n) = O(n), although a complete characterization of such patterns is
still elusive. For example, for patterns v consisting only of two letters a and
b, Ex( v, n) is linear in n if and only if v contains no subsequence ababa (not
necessarily contiguous). These results have already found nice applications
in combinatorial geometry and in enumerative combinatorics.

Bibliography and remarks. Davenport and Schinzel [DS65] defined
the sequences now associated with their names in 1965, motivated
by a geometric problem from control theory leading to lower
envelopes of a collection of planar curves. They established some simple
upper bounds on $\lambda_s(n)$. The next major progress was made by
Szemerédi [Sze74], who proved that $\lambda_s(n) \le C_s n \log^* n$ for a suitable
$C_s$, where $\log^* n$ is the inverse of the tower function $A_3(n)$. Over ten
more years passed until the breakthrough of Hart and Sharir [HS86],
who showed that $\lambda_3(n)$ is of order $n\alpha(n)$. A recollection of Sharir
about their discovery, after several months of trying to prove a linear
upper bound and then learning about Szemerédi's paper, deserves
to be reproduced (probably imprecisely but with Micha Sharir's kind
consent): "We decided that if Szemerédi didn't manage to prove that
$\lambda_3(n)$ is linear then it is probably not linear. We were aware of only
one result with a nonlinear lower bound not exceeding $O(n \log^* n)$, and
this was Tarjan's bound of $\Theta(n\alpha(n))$ for path compressions. In desperation,
we tried to relate it to our problem, and a miracle happened:
The construction Tarjan used for his lower bound could be massaged
a little so as to yield a similar lower bound for $\lambda_3(n)$."
The path compression alluded to is an operation on a rooted tree.
Let T be a tree with root r and let p be a leaf-to-root path of length
at least 2 in T. The compression of p makes all the vertices on p,
except for r, sons of r, while all the other father-to-son relations in T
remain unchanged. Tarjan [Tar75] proved, as a part of an analysis of a
simple algorithm for the so-called UNION-FIND problem, that if T is
a suitably balanced rooted tree with n nodes, then the total length of
all paths in any sequence of successive path compressions performed
on T is no more than $O(n\alpha(n))$, and this is asymptotically tight in
the worst case. Hart and Sharir put Davenport-Schinzel sequences of
order 3 into correspondence with generalized path compressions (where
only some nodes on the considered path become sons of the root, while
the others retain the same father) and analyzed them in the spirit of
Tarjan's proofs. Later the proofs were simplified and rephrased by
Sharir to work directly with Davenport-Schinzel sequences.
The constant $\frac{1}{2}$ in the lower bound on $\lambda_3(n)$ is by Wiernik and
Sharir [WS88], and the 2 in the upper bound is due to Klazar
[Kla99] (he gives a self-contained proof somewhat different from that
in [SA95]).
The most precise known bounds for $\lambda_s(n)$ with $s \ge 4$ were obtained
by Agarwal, Sharir, and Shor [ASS89], as a slight improvement over
earlier results of Sharir.
Davenport-Schinzel sequences are encountered in many geometric
and nongeometric situations. Even the straightforward bound
$\lambda_2(n) = 2n-1$ is often useful for simplifying proofs, and the asymptotics
of the higher-order sequences allow one to prove bounds involving
the function $\alpha(n)$ without too much work, although such bounds
are difficult to derive from scratch. Numerous applications, mostly
geometric, are listed in [SA95].
Single cell. Pollack, Sharir, and Sifrony [PSS88] proved that the complexity
of a single cell in an arrangement of n segments in the plane is
at most $O(n\alpha(n))$, by a reduction to Davenport-Schinzel sequences of
order 3 (see Exercise 1). A similar argument shows that a single cell
in an arrangement of n curves, with every two curves intersecting at
most s times, has complexity $O(\lambda_{s+2}(n))$ (see [SA95]).
Generalized Davenport-Schinzel sequences were first considered by
Adamec, Klazar, and Valtr [AKV92]. The near-linear upper bound
$\mathrm{Ex}(v, n) = O(n \cdot 2^{\alpha(n)^c})$ mentioned in the text is from Klazar [Kla92].
The most general results about sequences u with Ex(u, n) = O(n)
were obtained by Klazar and Valtr [KV94]. A recent survey, includ-
ing applications of the generalized Davenport-Schinzel sequences, was
written by Valtr [Val99a].
We mention two applications. The first one concerns Ramsey-type
questions for geometric graphs (already considered in the notes to Sec-
tion 4.3). We consider an n-vertex graph G drawn in the plane whose
edges are straight segments, and we ask, what is the maximum possible
number of edges of G so that the drawing does not contain a certain
geometric configuration? Here we are interested in the following two
types of configurations: k pairwise crossing edges

3 pairwise crossing edges

and k pairwise parallel edges, where two edges are called parallel if
they do not cross and their four vertices are in convex position:

A graph with no two crossing edges is planar and thus has O(n) edges.
It seems to be generally believed that forbidding k pairwise crossing
edges forces O(n) edges for every fixed k. This has been proved
for k = 3 by Agarwal, Aronov, Pach, Pollack, and Sharir [AAP+97],
and for all $k \ge 4$, the best known bound is $O(n \log n)$ due to Valtr
(see [Val99a]). For k forbidden pairwise parallel edges, he derived an
O(n) bound for every fixed k using generalized Davenport-Schinzel
sequences, and the $O(n \log n)$ bound for k pairwise crossing edges follows
by a neat simple reduction. In this connection, let us mention
a nice open question: What is the smallest n = n(k) such that any
straight-edge drawing of the complete graph $K_n$ always contains k
pairwise crossing edges? The best known bound is $O(k^2)$ [AEG+94],
but perhaps the truth is O(k) or close to it.
The second application of generalized Davenport-Schinzel sequences
concerns a conjecture of Stanley and Wilf. Let $\sigma$ be a fixed permutation
of {1, 2, ..., k}. We say that a permutation $\pi$ of {1, 2, ..., n}
contains $\sigma$ if there are indices $i_1 < i_2 < \cdots < i_k$ such that $\sigma(u) < \sigma(v)$
if and only if $\pi(i_u) < \pi(i_v)$, $1 \le u < v \le k$. Let $N(\sigma, n)$ denote
the number of permutations of {1, 2, ..., n} that do not contain
$\sigma$. The Stanley-Wilf conjecture states that for every k and $\sigma$
there exists C such that $N(\sigma, n) \le C^n$ for all n. Using generalized
Davenport-Schinzel sequences, Alon and Friedgut [AF00] proved that
$\log N(\sigma, n) \le n\beta(n)$ for every fixed $\sigma$, where $\beta(n)$ denotes a very
slowly growing function, and established the Stanley-Wilf conjecture
for a wide class of $\sigma$ (previously, much fewer cases had been known).
Klazar [Kla00] observed that the Stanley-Wilf conjecture is implied by
a conjecture of Füredi and Hajnal [FH92] about the maximum number
of 1's in an n×n matrix of 0's and 1's that does not contain a k×k
submatrix having 1's in positions specified by a given fixed k×k permutation
matrix. Füredi and Hajnal conjectured that at most O(n)
1's are possible. The analogous questions for other types of forbidden
patterns of 1's in 0/1 matrices are also very interesting and very far
from being understood; this is another direction of generalizing the
Davenport-Schinzel sequences.
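The containment relation for permutations is easy to test by brute force on small cases. The sketch below is an illustration only (exponential time, hypothetical function names): it checks whether a permutation contains a pattern in the sense just defined and counts the pattern-avoiding permutations for small n. For every pattern of length 3 the counts are the Catalan numbers, a classical fact consistent with exponential growth.

```python
from itertools import combinations, permutations

def contains(pi, sigma):
    """True if some subsequence of pi has the same relative order as sigma."""
    k = len(sigma)
    for idx in combinations(range(len(pi)), k):
        sub = [pi[i] for i in idx]
        if all((sigma[u] < sigma[v]) == (sub[u] < sub[v])
               for u in range(k) for v in range(u + 1, k)):
            return True
    return False

def count_avoiders(sigma, n):
    """Number of permutations of {1, ..., n} that do not contain sigma."""
    return sum(1 for pi in permutations(range(1, n + 1))
               if not contains(pi, sigma))

if __name__ == "__main__":
    print([count_avoiders((1, 2, 3), n) for n in range(1, 7)])
    print([count_avoiders((1, 3, 2), n) for n in range(1, 7)])
```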

Exercises
1. Let C be a cell in an arrangement of n segments in the plane (assume
general position if convenient).
(a) Number the segments 1 through n and write down the sequence of
the segment numbers along the boundary of C, starting from an arbitrarily
chosen vertex of the boundary (decide what to do if the boundary
has several connected components!). Check that there is no ababab subsequence,
and hence that the combinatorial complexity of C is no more
than $O(\lambda_4(n))$. 0
(b) Find an example where an ababa subsequence does appear in the
sequence constructed in (a). [I]
(c) Improve the argument by splitting the segments suitably, and show
that the boundary of C has complexity $O(n\alpha(n))$. 0
2. We say that an n×n matrix A with entries 0 and 1 is good if it contains
no $\left(\begin{smallmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{smallmatrix}\right)$; that is, if there are no indices $i_1 < i_2$ and $j_1 < j_2 <
j_3 < j_4$ with $a_{i_1 j_1} = a_{i_2 j_2} = a_{i_1 j_3} = a_{i_2 j_4} = 1$.
(a) Prove that a good A has at most $\lambda_s(n) + O(n)$ ones for a suitable
constant s. [I]
(b) Show that one can take s = 3 in (a). 0

7.4 Towards the Tight Upper Bound for Segments


As we saw in Proposition 7.1.1, it is not very difficult to prove that the
maximum length of a Davenport-Schinzel sequence of order 3 over n symbols
satisfies $\lambda_3(n) = O(n \log n)$. Getting anywhere significantly below this bound
seems much harder, and the tight bound requires double induction. But there
is only one obvious parameter in the problem, namely the number n, and
introducing the second variable for the induction is one of the keys to the
proof.
Let $w = a_1 a_2 \ldots a_\ell$ be a sequence. A nonrepetitive segment in w is a
contiguous subsequence $u = a_i a_{i+1} \ldots a_{i+k}$ consisting of k+1 distinct symbols.
A sequence w is m-decomposable if it can be partitioned into at most m
nonrepetitive segments (the partition need not be unique). Here is the main
definition for the inductive proof: Let $\psi(m, n)$ denote the maximum possible
length of an m-decomposable Davenport-Schinzel sequence of order 3 over n
symbols. First we relate $\psi(m, n)$ to $\lambda_3(n)$.
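Whether a given sequence is m-decomposable can be decided by a greedy scan: keep extending the current segment until a symbol would repeat, then start a new one. A standard exchange argument shows that this uses the fewest possible segments. The sketch below assumes this greedy strategy; the function names are illustrative.

```python
def greedy_nonrepetitive_partition(w):
    """Split w into nonrepetitive segments, cutting only when a symbol would repeat."""
    segments, current, seen = [], [], set()
    for x in w:
        if x in seen:                 # x would repeat inside the current segment
            segments.append(current)
            current, seen = [], set()
        current.append(x)
        seen.add(x)
    if current:
        segments.append(current)
    return segments

def is_m_decomposable(w, m):
    """True if w can be partitioned into at most m nonrepetitive segments."""
    return len(greedy_nonrepetitive_partition(w)) <= m

if __name__ == "__main__":
    w = "abacacbc"                    # an order-3 sequence over 3 symbols
    print(["".join(seg) for seg in greedy_nonrepetitive_partition(w)])
    print(is_m_decomposable(w, 4), is_m_decomposable(w, 3))
```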
7.4.1 Lemma. Every Davenport-Schinzel sequence of order 3 over n symbols
is 2n-decomposable, and consequently,
\[ \lambda_3(n) \le \psi(2n, n). \]
Proof. Let w be the given Davenport-Schinzel sequence. We define a linear
ordering $\prec$ on the symbols occurring in w: We set $a \prec b$ if the first occurrence
of the symbol a in w precedes the first occurrence of the symbol b. We partition
w into maximal strictly decreasing segments according to the ordering
$\prec$. Here is an example of such a partitioning (the sequence is chosen so that
the usual ordering of the digits coincides with $\prec$): 11213214211516543. Clearly,
each strictly decreasing segment is a nonrepetitive segment as well, and so it
suffices to show that the number of the maximal strictly decreasing segments
is at most 2n (the tight bound is actually 2n-1).
Let $u_j$ and $u_{j+1}$ be two consecutive maximal strictly decreasing segments,
let a be the last symbol of $u_j$, let i be its position in w, and let b be the first
symbol of $u_{j+1}$ (at the (i+1)st position). We claim that the ith position is
the last occurrence of a or the (i+1)st position is the first occurrence of b.
This will imply that we have at most 2n segments $u_j$, because each of the n
symbols has (at most) one first and one last occurrence.
Supposing that the claim is not valid, we find the forbidden subsequence
ababa. We have $a \prec b$, for otherwise the (i+1)st position could be appended
to $u_j$, contradicting the maximality. The b at position i+1 is not the first b,
and so there is some b before the ith position. There must be another a even
before that b, for otherwise we would have $b \prec a$. Finally, there is an a after
the position i+1, and altogether we have the desired ababa. □
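The partition used in this proof is simple to compute. The sketch below, with an illustrative function name, ranks the symbols by first occurrence and cuts the sequence into maximal strictly decreasing segments with respect to that ranking. The example sequence abacacbc is an assumption chosen for illustration; it avoids ababa and has no adjacent repetitions.

```python
def decreasing_partition(w):
    """Partition w into maximal segments that strictly decrease in the
    first-occurrence order of the symbols."""
    rank = {}
    for x in w:
        rank.setdefault(x, len(rank))      # rank = order of first occurrence
    segments = []
    for x in w:
        if segments and rank[x] < rank[segments[-1][-1]]:
            segments[-1].append(x)         # still strictly decreasing
        else:
            segments.append([x])           # start a new maximal segment
    return segments

if __name__ == "__main__":
    w = "abacacbc"
    parts = decreasing_partition(w)
    n = len(set(w))
    print(["".join(p) for p in parts], "segments:", len(parts), "<= 2n =", 2 * n)
```

For this example the partition has 5 = 2n-1 segments, matching the tight bound mentioned in the proof.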

Next, we derive a powerful recurrence for $\psi(m, n)$. It is perhaps best
to understand the proof first, and the complicated-looking statement then
becomes quite natural.
7.4.2 Proposition. Let $m, n \ge 1$ and $p \le m$ be integers, and let $m =
m_1 + m_2 + \cdots + m_p$ be a partition of m into p nonnegative addends. Then
there is a partition $n = n_1 + n_2 + \cdots + n_p + n^*$ such that
\[ \psi(m, n) \le 4m + 4n^* + \psi(p, n^*) + \sum_{k=1}^{p} \psi(m_k, n_k). \]
Proof. Let w be an m-decomposable Davenport-Schinzel sequence of order 3
over n symbols attaining $\psi(m, n)$. Let $w = u_1 u_2 \ldots u_m$ be a partition of w
into nonrepetitive segments. Let $w_1 = u_1 u_2 \ldots u_{m_1}$ consist of the first $m_1$
nonrepetitive segments, $w_2 = u_{m_1+1} \ldots u_{m_1+m_2}$ of the next $m_2$ segments,
and so on until $w_p$. We call $w_1, w_2, \ldots, w_p$ the parts of w.
We divide the symbols in w into two classes: A symbol a is local if it
occurs in (at most) one of the parts $w_k$, and it is nonlocal if it appears in at
least two distinct parts. We let $n^*$ be the number of distinct nonlocal symbols
and $n_k$ the number of distinct local symbols occurring in $w_k$.
If we delete all the nonlocal symbols from $w_k$, we obtain an $m_k$-decomposable
sequence over $n_k$ symbols with no ababa. However, this sequence can
still contain consecutive repetitions of some symbols, which is forbidden for
a Davenport-Schinzel sequence. So we delete all symbols in each repetition
but the first one; for example, 122232244 becomes 12324. We note that consecutive
repetitions can occur only at the boundaries of the nonrepetitive
segments $u_j$, and so at most $m_k - 1$ local symbols have been deleted from $w_k$.
The remaining sequence is already a Davenport-Schinzel sequence, and so
the total number of positions of w occupied by the local symbols is at most
\[ \sum_{k=1}^{p} \left[m_k - 1 + \psi(m_k, n_k)\right] \le m + \sum_{k=1}^{p} \psi(m_k, n_k). \]
Next, we need to deal with the nonlocal symbols. Let us say that a nonlocal
symbol a is a middle symbol in a part $w_k$ if it occurs both before $w_k$
and after $w_k$; otherwise, it is a nonmiddle symbol in $w_k$. We estimate the
contributions of middle and nonmiddle symbols separately.
First we consider each part $w_k$ in turn, and we delete all local symbols and
all nonmiddle symbols from it. Then we look at the sequence that remains
from w after these deletions, and we delete all symbols but one from each
contiguous repetition. As in the case of the local symbols, we have deleted
at most m middle symbols. Clearly, the resulting sequence is a Davenport-Schinzel
sequence of order 3 over $n^*$ symbols, and we claim that it is p-decomposable
(this is perhaps the most surprising part of the proof). Indeed,
if we consider what remained from some $w_k$, we see that this sequence cannot
contain a subsequence bab, because some a's precede and follow $w_k$ and we
would get the forbidden ababa. Therefore, the surviving symbols of $w_k$ form
a nonrepetitive segment. Hence the total contribution of the middle symbols
to the length of w is at most $m + \psi(p, n^*)$.
The nonmiddle symbols in a given $w_k$ can conveniently be divided into
starting and ending symbols (with the obvious meaning). We concentrate on
the total contribution of the starting symbols; the case of the ending symbols
is symmetric. Let $n_k^*$ be the number of distinct starting symbols in $w_k$; we have
$\sum_{k=1}^{p} n_k^* \le n^*$, since a symbol is starting in at most one part. Let us erase
from $w_k$ all but the starting symbols, and then we also remove all contiguous
repetitions in each $w_k$, as in the two previous cases. The remaining starting
symbols contain no subsequence abab, since we know that there is some a
following $w_k$. Thus, what is left of $w_k$ is a Davenport-Schinzel sequence of
order 2 over $n_k^*$ symbols, and as such it has length at most $2n_k^* - 1$. Therefore,
the total number of starting symbols in all of w is no more than
\[ \sum_{k=1}^{p} \left(m_k - 1 + 2n_k^* - 1\right) < m + 2n^*. \]

Summing up the contributions of local symbols, middle symbols, starting
symbols, and ending symbols, we arrive at the bound claimed in the proposition.
Here is a graphic summary of the proof:

symbols of w
  local:         m for repetitions + $\sum_k \psi(m_k, n_k)$
  nonlocal
    middle:      m for repetitions + $\psi(p, n^*)$  (no aba in $w_k$)
    non-middle
      starting:  m for repetitions + $\sum_k \lambda_2(n_k^*)$  (no abab in $w_k$)
      ending:    same as starting

□
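The four classes of symbols used in the proof can be computed mechanically once the parts are given. The following sketch uses an illustrative function name and a toy input that is not claimed to be a Davenport-Schinzel sequence; it records the first and last part in which each symbol occurs and labels every symbol of every part as local, middle, starting, or ending.

```python
def classify_symbols(parts):
    """For each part w_k, map every symbol occurring in it to one of
    'local', 'middle', 'starting', 'ending' as defined in the proof."""
    first, last = {}, {}
    for k, part in enumerate(parts):
        for x in part:
            first.setdefault(x, k)    # index of the first part containing x
            last[x] = k               # index of the last part containing x
    result = []
    for k, part in enumerate(parts):
        kinds = {}
        for x in set(part):
            if first[x] == last[x]:
                kinds[x] = "local"            # occurs in one part only
            elif first[x] < k < last[x]:
                kinds[x] = "middle"           # occurs both before and after w_k
            elif first[x] == k:
                kinds[x] = "starting"         # nonlocal, does not occur before w_k
            else:
                kinds[x] = "ending"           # nonlocal, does not occur after w_k
        result.append(kinds)
    return result

if __name__ == "__main__":
    for k, kinds in enumerate(classify_symbols(["ab", "cad", "ca"]), start=1):
        print(f"w_{k}:", dict(sorted(kinds.items())))
```

In the toy input the nonlocal symbols are a and c, so n* = 2, while b and d are local to the first and second part.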
How to prove good bounds from the recurrence. The recurrence just
proved can be used to show that $\psi(m, n) = O((m+n)\alpha(m))$, and Lemma 7.4.1
then yields the desired conclusion $\lambda_3(n) = O(n\alpha(n))$. We do not give the full
calculation; we only indicate how the recurrence can be used to prove better
and better bounds starting from the obvious estimate $\psi(m, n) \le mn$.
First we prove that $\psi(m, n) \le 4m \log_2 m + 6n$, for m a power of 2. From
our recurrence with p = 2 and $m_1 = m_2 = \frac{m}{2}$, we obtain
\[ \psi(m, n) \le 4m + 4n^* + \psi(2, n^*) + \psi\left(\tfrac{m}{2}, n_1\right) + \psi\left(\tfrac{m}{2}, n_2\right). \]
Proceeding by induction on $\log_2 m$ and using $\psi(2, n) = 2n$, we estimate the
last expression by $4m + 4n^* + 2n^* + 2m(\log_2 m - 1) + 6n_1 + 2m(\log_2 m - 1) +
6n_2 = 4m \log_2 m + 6n$ as required.
Next, we assume that $m = A_3(r)$ (the tower function) for an integer r
and prove $\psi(m, n) \le 8rm + 10n$ by induction on r. This time we choose
$p = \frac{m}{\log_2 m}$ and $m_k = \frac{m}{p} = \log_2 m = A_3(r-1)$. For estimating $\psi(p, n^*)$ we use
the bound derived earlier. This gives
\[
\begin{aligned}
\psi(m, n) &\le 4m + 4n^* + 4p \log_2 p + 6n^* + \sum_{k=1}^{p} \psi(m_k, n_k) \\
&\le 4m + 4n^* + 4m + 6n^* + 8(r-1)m + 10(n - n^*) = 8rm + 10n.
\end{aligned}
\]
So, by now we already know that $\lambda_3(n) = O(n \log^* n)$, where $\log^* n$ is the
inverse to the tower function $A_3(n)$. This bound is as good as linear for
practical purposes.
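The arithmetic of the two inductive steps can be double-checked numerically. The sketch below is only a sanity check of the algebra, under the assumption that the inductive hypotheses hold as stated; the function names and the sample values of the symbol counts are illustrative choices.

```python
def tower(r):
    """Tower function A_3(r): a tower of r twos."""
    value = 1
    for _ in range(r):
        value = 2 ** value
    return value

def halving_step(r, n1, n2, n_star):
    """Step with p = 2 and m = 2**r: returns (estimate, claimed bound 4 m log2 m + 6 n)."""
    m, half = 2 ** r, 2 ** (r - 1)
    n = n1 + n2 + n_star
    # psi(2, n*) = 2 n*, and the inductive bound is used for psi(m/2, n_i).
    estimate = (4 * m + 4 * n_star + 2 * n_star
                + (4 * half * (r - 1) + 6 * n1)
                + (4 * half * (r - 1) + 6 * n2))
    return estimate, 4 * m * r + 6 * n

def tower_step(r, n_locals, n_star):
    """Step with m = A_3(r), m_k = log2 m, p = m / log2 m:
    returns (estimate, claimed bound 8 r m + 10 n)."""
    m, mk = tower(r), tower(r - 1)
    p = m // mk
    assert len(n_locals) == p
    n = sum(n_locals) + n_star
    log2_p = p.bit_length() - 1            # p is a power of two
    estimate = (4 * m + 4 * n_star
                + (4 * p * log2_p + 6 * n_star)                       # psi(p, n*)
                + sum(8 * (r - 1) * mk + 10 * nk for nk in n_locals))  # induction on r
    return estimate, 8 * r * m + 10 * n

if __name__ == "__main__":
    print(halving_step(5, 7, 11, 3))
    for r in (2, 3, 4):
        p = tower(r) // tower(r - 1)
        print(r, tower_step(r, [5] * p, 9))
```

In the halving step the estimate equals the claimed bound exactly; in the tower step it stays below the claimed bound because $4p \log_2 p \le 4m$.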
In general, one proves that for $m = A_k(r)$,
\[ \psi(m, n) \le (4k - 4)rm + (4k - 2)n, \]
by double induction on k and r. The inductive assumption for k-1 is always
used to bound the term $\psi(p, n^*)$. We omit the rest of the calculation.

Bibliography and remarks. In this section we draw mostly from
[SA95], with some changes in terminology.

Exercises
1. For integers $s > t \ge 1$, let $\psi_s^t(m, n)$ denote the maximum length of a
Davenport-Schinzel sequence of order s (no subsequence abab... with
s+2 letters) over n symbols that can be partitioned into m contiguous
segments, each of them a Davenport-Schinzel sequence of order t. In
particular, $\psi_s(m, n) = \psi_s^1(m, n)$ is the maximum length of a Davenport-Schinzel
sequence of order s over n symbols that consists of m nonrepetitive
segments.
(a) Prove that $\lambda_s(n) \le \psi_s^{s-1}(n, n)$. 0
(b) Prove that
182 Chapter 7: Lower Envelopes

o
(c) Let $w$ be a sequence witnessing $\psi_s(m,n)$ and let $m = m_1 + m_2
+ \cdots + m_p$ be some partition of $m$. Divide $w$ into $p$ parts as in
the proof of Proposition 7.4.2, the $k$th part consisting of $m_k$
nonrepetitive segments. With the terminology and notation of that proof,
check that the local symbols contribute at most $m + \sum_{k=1}^{p}
\psi_s(m_k,n_k)$ to the length of $w$, the middle symbols at most $m +
\psi_s^{s-2}(p,n^*)$, and the starting symbols no more than $m +
\psi_{s-1}(m,n^*)$.
(d) Prove by induction that $\psi_s(n,m) \le C_s \cdot (m+n)
\log^{s-2}(m+1)$ and $\lambda_s(n) \le C'_s\, n \log^{s-2}(n+1)$, for all
$s \ge 2$ and suitable $C_s$ and $C'_s$ depending only on $s$ (set $p = 2$
in (c)).

7.5 Up to Higher Dimension: Triangles in Space


As we have seen, lower envelopes in the plane can be handled by means of
a simple combinatorial abstraction, the Davenport-Schinzel sequences. Un-
fortunately, so far, no reasonable combinatorial model has been found for
higher-dimensional lower envelopes. The known upper bounds are usually
much cruder than those in the plane, but their proofs are quite complex and
technical. We start with almost the simplest possible case: triangles in
$\mathbf{R}^3$. Here is an example of the lower envelope of triangles
viewed from below:

[Figure: a lower envelope of triangles viewed from below]

It is actually the vertical projection of the lower envelope on a
horizontal plane lying below all the triangles. The projection consists of
polygons, both convex
and nonconvex, and the combinatorial complexity of the lower envelope is the
total number of these polygons plus the number of their edges and vertices.
Simple arguments, say using the Euler relation for planar graphs, show that
if we do not care about constant factors, it suffices to consider the vertices of
the polygons.
It turns out that the worst-case complexity of the lower envelope is of
order $n^2\alpha(n)$. Here we prove a simpler, suboptimal bound:

7.5.1 Proposition. The combinatorial complexity of the lower envelope of
$n$ triangles in $\mathbf{R}^3$ is at most $O(n\sigma(n)\log n) =
O(n^2\alpha(n)\log n)$, where $\sigma(n)$ stands for the maximum
complexity of the lower envelope of $n$ segments in the plane.
It is convenient, although not really essential, to work with triangles in
general position. As usual, a perturbation argument shows that this is where
the maximum complexity of the lower envelope is attained. The precise gen-
eral position requirements can be found by inspecting the forthcoming proof,
and we leave this to the reader.
Walls and boundary vertices. Let $H$ be a set of $n$ triangles in
$\mathbf{R}^3$ in general position. We need to bound the total number of
vertices in the projection of the lower envelope. The vertices are of two
types: those that lie on the vertical projection of an edge of some of the
triangles (boundary vertices), and those obtained from intersections of 3
triangles (inner vertices). In the above picture there are many boundary
vertices but only two inner vertices. Yet the boundary vertices are rather
easy to deal with, while the inner vertices present the real challenge.
We claim that the total number of boundary vertices is at most
$O(n\sigma(n))$. To see this, let $a$ be an edge of a triangle $h \in H$
and let $\pi_a$ be the "vertical wall" through $a$, i.e., the union of all
vertical lines that intersect $a$. Each triangle of $H$ intersects $\pi_a$
in a (possibly empty) segment. The following drawing shows the triangle
$h$, the wall $\pi_a$, and the segments within it:

[Figure: the triangle h, the wall through its edge a, and the segments within the wall]

Essentially, the boundary vertices lying on the vertical projection of $a$
correspond to breakpoints of the lower envelope of these segments within
$\pi_a$. Only the segment $a$ needs special treatment, since on the one
hand, its intersections with other segments can give rise to boundary
vertices, but on the other hand, it does not obscure things lying above
it. To take care of this, we can consider two lower envelopes, one for the
arrangement including $a$ and another without $a$. So each edge $a$
contributes at most $2\sigma(n)$ boundary vertices, and the total number
of boundary vertices is $O(n\sigma(n))$.
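The counting inside one wall can be made concrete. The following brute-force Python sketch is an illustration only (it is not an efficient algorithm, and the function names are ours). It treats the wall as a plane with coordinates $(x, y)$, takes non-vertical segments given as `(x1, y1, x2, y2)` with `x1 < x2`, and lists the indices of the segments appearing along the lower envelope from left to right, with `None` where no segment lies above the current position. The number of breakpoints is the length of this list minus 1.

```python
from itertools import combinations
from typing import List, Optional, Tuple

Segment = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with x1 < x2


def height(seg: Segment, x: float) -> Optional[float]:
    """y-coordinate of the segment above x, or None if x is outside its x-range."""
    x1, y1, x2, y2 = seg
    if not (x1 <= x <= x2):
        return None
    return y1 + (y2 - y1) * (x - x1) / (x2 - x1)


def crossing_x(a: Segment, b: Segment) -> Optional[float]:
    """x-coordinate of the crossing of a and b, if their supporting lines
    meet strictly inside the common x-range of the two segments."""
    slope_a = (a[3] - a[1]) / (a[2] - a[0])
    slope_b = (b[3] - b[1]) / (b[2] - b[0])
    if slope_a == slope_b:
        return None
    icpt_a = a[1] - slope_a * a[0]
    icpt_b = b[1] - slope_b * b[0]
    x = (icpt_b - icpt_a) / (slope_a - slope_b)
    lo, hi = max(a[0], b[0]), min(a[2], b[2])
    return x if lo < x < hi else None


def envelope_sequence(segments: List[Segment]) -> List[Optional[int]]:
    """Indices of the lowest segment over consecutive x-intervals, runs merged."""
    events = {s[0] for s in segments} | {s[2] for s in segments}
    for a, b in combinations(segments, 2):
        x = crossing_x(a, b)
        if x is not None:
            events.add(x)
    xs = sorted(events)
    sequence: List[Optional[int]] = []
    for left, right in zip(xs, xs[1:]):
        mid = (left + right) / 2  # the lowest segment is constant between events
        best, best_h = None, None
        for idx, seg in enumerate(segments):
            h = height(seg, mid)
            if h is not None and (best_h is None or h < best_h):
                best, best_h = idx, h
        if not sequence or sequence[-1] != best:
            sequence.append(best)
    return sequence
```

For a long horizontal segment with a short one hanging below its middle, the sequence is `[0, 1, 0]`, the pattern $aba$ that cannot occur for lines but does occur for segments.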
Levels. Each inner vertex of the projected lower envelope corresponds to a
vertex of the arrangement of $H$ lying on the lower envelope, i.e., of
level 0 (recall that according to our definition of arrangement, the
vertices are intersections of 3 triangles). The level of a vertex $v$ is
defined in the usual way: It is the number of triangles of $H$ that
intersect the open ray emanating from $v$ vertically downwards. Let
$f_k(H)$ denote the number of vertices of level $k$, $k = 0, 1, \ldots$.
Further, let $f_k(n)$ be the maximum of $f_k(H)$ over all sets $H$ of $n$
triangles (in general position). So our goal is to estimate $f_0(n)$.
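The definition of the level translates directly into a computation. The Python sketch below is an illustration with names of our own choosing; it assumes non-vertical triangles given by three vertices $(x, y, z)$ and counts the triangles met by the open downward ray from a point, using barycentric coordinates in the $xy$-projection.

```python
from typing import Optional, Sequence, Tuple

Point3 = Tuple[float, float, float]
Triangle = Tuple[Point3, Point3, Point3]


def height_above(tri: Triangle, x: float, y: float) -> Optional[float]:
    """z-coordinate of the triangle over (x, y), or None if (x, y) lies outside
    its vertical projection (or the projection is degenerate)."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        return None
    l1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    l2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    l3 = 1 - l1 - l2
    if min(l1, l2, l3) < 0:
        return None
    return l1 * z1 + l2 * z2 + l3 * z3


def level(v: Point3, triangles: Sequence[Triangle]) -> int:
    """Number of triangles hit by the open ray from v vertically downwards."""
    x, y, z = v
    count = 0
    for tri in triangles:
        h = height_above(tri, x, y)
        if h is not None and h < z:  # strictly below v: the ray is open
            count += 1
    return count
```

A point on the lower envelope has level 0 under this definition, because the triangles passing through the point itself are not counted.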
The first part of the proof of Proposition 7.5.1 employs a probabilistic
argument, very similar to the one in the proof of the zone theorem
(Theorem 6.4.1), to relate $f_0(H)$ and $f_1(H)$ to $f_0(n-1)$.
7.5.2 Lemma. For every set $H$ of $n$ triangles in general position, we
have

$$\frac{n-3}{n}\, f_0(H) \le f_0(n-1) - \frac{1}{n}\, f_1(H).$$

Proof. We pick one triangle $h \in H$ at random and estimate
$\mathbf{E}[f_0(H \setminus \{h\})]$, the expected number of vertices of
the lower envelope after removing $h$. Every vertex of the lower envelope
of $H$ is determined by 3 triangles, and so its chances of surviving the
removal of $h$ are $\frac{n-3}{n}$. For a vertex $v$ of level 1, the
probability of its appearing on the lower envelope is $\frac{1}{n}$, since
we must remove the single triangle lying below $v$. Therefore,

$$\mathbf{E}[f_0(H \setminus \{h\})] = \frac{n-3}{n}\, f_0(H) + \frac{1}{n}\, f_1(H).$$

The lemma follows by using $f_0(H \setminus \{h\}) \le f_0(n-1)$. □

Before proceeding, let us inspect the inequality in the lemma just proved.
Let $H$ be a set of $n$ triangles with $f_0(H) = f_0(n)$. If we ignored
the term $\frac{1}{n} f_1(H)$, we would obtain the recurrence
$\frac{n-3}{n} f_0(n) \le f_0(n-1)$. This yields only the trivial estimate
$f_0(n) = O(n^3)$, which is not surprising, since we have used practically
no geometric information about the triangles. In order to do better, we
now want to show that $f_1(H)$ is almost as big as $f_0(H)$, in which case
the term $\frac{1}{n} f_1(H)$ decreases the right-hand side significantly.
Namely, we prove that

$$f_1(H) \ge f_0(H) - O(n\sigma(n)). \tag{7.1}$$

Substituting this into the inequality in Lemma 7.5.2, we arrive at

$$\frac{n-2}{n}\, f_0(n) \le f_0(n-1) + O(\sigma(n)).$$

We practiced this kind of recurrence in Section 6.4: The substitution
$\varphi(n) = \frac{f_0(n)}{n(n-1)}$ quickly yields $f_0(n) =
O(n\sigma(n)\log n)$. So in order to prove Proposition 7.5.1, it remains
to derive (7.1), and this is the geometric heart of the proof.
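The effect of the substitution can be observed numerically. The following Python sketch is a sanity check under stated assumptions, not a proof: it replaces the $O(\sigma(n))$ term by $c\,\sigma(n)$ with $c = 1$, uses the stand-in $\sigma(n) = n$ (the bound available from Section 7.4 is $\sigma(n) = O(n\alpha(n))$), starts from $f(2) = 0$, and iterates the recurrence with equality. Dividing by $(n-1)(n-2)$ turns the recurrence into $\varphi(n) \le \varphi(n-1) + \frac{c\,\sigma(n)}{(n-1)(n-2)}$, which telescopes. All names in the code are ours.

```python
import math


def worst_case(n_max, sigma, c=1.0):
    """Iterate f(n) = n/(n-2) * (f(n-1) + c*sigma(n)) for n >= 3, with f(2) = 0.

    This is the largest sequence allowed by (n-2)/n * f(n) <= f(n-1) + c*sigma(n).
    """
    f = [0.0] * (n_max + 1)
    for n in range(3, n_max + 1):
        f[n] = n / (n - 2) * (f[n - 1] + c * sigma(n))
    return f


def phi(f, n):
    """The substitution phi(n) = f(n) / (n (n - 1))."""
    return f[n] / (n * (n - 1))


def sigma(n):
    # Stand-in assumption for illustration only: sigma(n) = n.
    return float(n)


f = worst_case(1000, sigma)
# Ratio of f(n) to n * sigma(n) * log(n); it should stay bounded.
ratios = [f[n] / (n * sigma(n) * math.log(n)) for n in range(3, 1001)]
```

In this experiment the ratio $f(n)/(n\sigma(n)\ln n)$ stays below 2 over the whole range, in line with $f_0(n) = O(n\sigma(n)\log n)$.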
Making someone pay for the level-0 vertices. We are going to relate the
number of level-0 vertices to the number of level-1 vertices by a local
charging scheme: From each vertex $v$ of level 0, we walk around a little
and find suitable vertices of level 1 to pay for $v$, as follows.
The level-0 vertex $v$ is incident to 6 edges, 3 of them having level 0
and 3 level 1:

[Figure: the three triangles near v, with the lower envelope at the bottom]

The picture shows only a small square piece from each of the triangles
incident to $v$. The lower envelope is on the bottom, and the edges of
level 1 emanating from $v$ are marked by arrows. Let $e$ be one of the
level-1 edges going from $v$ away from the lower envelope. We follow it
until one of the following events occurs:
(i) We reach the intersection $v'$ of $e$ with a vertical wall $\pi_a$
erected from an edge $a$ of some triangle. This $v'$ pays 1 unit to $v$.
(ii) We reach the intersection $v'$ of $e$ with another triangle; i.e.,
$v'$ is a vertex of the arrangement of $H$. This $v'$ pays $\frac{1}{3}$
of a unit to $v$.
This is done for all 3 level-1 edges emanating from $v$ and for all
vertices $v$ of level 0. Clearly, every $v$ receives at least 1 unit in
total. It remains to discuss what kind of vertices the $v'$ are and to
estimate the total charge paid by them.
Since there is no other vertex on $e$ between $v$ and $v'$, a particular
$v'$ can be reached from at most 2 distinct $v$ in case (i) and from at
most 3 distinct $v$ in case (ii). So a $v'$ is charged at most 2 according
to case (i) or at most 1 according to case (ii) (because of the general
position of $H$, these cases are never combined, since no intersection of
3 triangles lies in any of the vertical walls $\pi_a$).
Next, we observe that in case (i), $v'$ has level at most 2, and in case
(ii), it has level exactly 1. This is best seen by considering the
situation within the vertical plane containing the edge $e$. As we move
along $e$, just after leaving $v$ we are at level 1, with exactly one
triangle $h$ below, as is illustrated next:

[Figure: the vertical plane containing e, in case (i) and in case (ii)]

The level does not change unless we enter a vertical wall $\pi_a$ or
another triangle $h' \in H$. If we first enter some $\pi_a$, then case (i)
occurs with $v' = e \cap \pi_a$, and the level cannot change by more than
1 by entering $\pi_a$. If we first reach a triangle $h'$, we have case
(ii) with $v' = e \cap h'$, and $v'$ has level 1.
Each $v'$ reached in case (i) is a vertex in the arrangement of segments
within one of the walls $\pi_a$, and it has level at most 2 there. It is
easy to show by the technique of the proof of Clarkson's theorem on levels
(Theorem 6.3.1) that the number of vertices of level at most 2 in an
arrangement of $n$ segments is $O(\sigma(n))$ (Exercise 2). Since we have
$3n$ walls $\pi_a$, the total amount paid according to case (i) is
$O(n\sigma(n))$.
As for case (ii), all the $v'$ are at level 1, and each pays at most 1, so
the total charge is at most $f_1(H)$.
Therefore, $f_0(H) \le f_1(H) + O(n\sigma(n))$, which establishes (7.1)
and concludes the proof of Proposition 7.5.1. □

Bibliography and remarks. The sharp bound of $O(n^2\alpha(n))$ for the
lower envelope of $n$ triangles in $\mathbf{R}^3$ was first proved by Pach
and Sharir [PS89] using a divide-and-conquer argument. A tight bound of
$O(n^{d-1}\alpha(n))$ for $(d-1)$-dimensional simplices in $\mathbf{R}^d$
was established a little later by Edelsbrunner [Ede89]. Tagansky [Tag96]
found a considerably simpler argument and also proved some new results. We
used his method in the proof of Proposition 7.5.1, but since we omitted a
subtler analysis of the charging scheme, we obtained a suboptimal bound.
To improve the bound to $O(n^2\alpha(n))$, the charging scheme is
modified a little: The $v'$ reached in case (i) pays $\frac{1}{2}$ instead
of 1, and the $v'$ reached in case (ii) pays $\frac{1}{k}$ if it was
reached from $k \le 3$ distinct $v$. Then it can be shown, with some work,
that every vertex of the lower envelope receives a charge of at least
$\frac{4}{3}$ (and not only 1); see [Tag96]. Hence $f_1(H) \ge
\frac{4}{3} f_0(H) - O(n\sigma(n))$, and the resulting recurrence becomes
$\frac{n-5/3}{n} f_0(n) \le f_0(n-1) + O(\sigma(n))$. It implies $f_0(n) =
O(n\sigma(n))$; proving this is somewhat complicated, since the simple
substitution trick does not work here.

Exercises
1. Given a construction of a set of $n$ segments in the plane with lower
envelope of complexity $\sigma(n)$, show that the lower envelope of $n$
triangles in $\mathbf{R}^3$ can have complexity $\Omega(n\sigma(n))$.
2. Show that the number of vertices of level at most $k$ in the
arrangement of $n$ segments (in general position) in the plane is at most
$O(k^2\sigma(\lfloor \frac{n}{k+1} \rfloor))$. The proof of the general
case of Clarkson's theorem on levels (Theorem 6.3.1) applies almost
verbatim.

7.6 Curves in the Plane


In the proof for triangles shown in the previous section, if we leave a
vertex on the lower envelope along an edge of level 1, we cannot come back
to the lower envelope before one of the events (i) or (ii) occurs. Once we
start considering lower envelopes of curved surfaces, such as graphs of
polynomials of degree $s$ for some fixed $s$, this is no longer true: The
edge can immediately go back to another vertex on the lower envelope. Then
we would be trying to charge one vertex of the lower envelope to another.
This can be done, but one must define an "order" for each vertex, and
charge envelope vertices of order $i$ only to vertices of order smaller
than $i$ or to vertices of significantly higher levels.
We show this for the case of curves in the plane. This example is artifi-
cial, since using Davenport-Schinzel sequences leads to much sharper bounds.
But we can thus demonstrate the ideas of the higher-dimensional proof, while
avoiding many technicalities. We remark that this proof is not really an up-
grade of the one for triangles: Here we aim at a much cruder bound, and so
some of the subtleties in the proof for triangles can be neglected.
We consider $n$ planar curves as discussed in Section 7.1: They are graphs
of continuous functions $\mathbf{R} \to \mathbf{R}$, and every two
intersect at most $s$ times. Moreover, we assume for convenience that the
curves cross at each intersection and no 3 curves have a common point.
7.6.1 Proposition. The maximum possible number of vertices on the lower
envelope of a set $H$ of $n$ curves as above is at most
$O(n^{1+\varepsilon})$ for every fixed $\varepsilon > 0$. That is, for
every $s$ and every $\varepsilon > 0$ there exists $C$ such that the bound
is at most $Cn^{1+\varepsilon}$ for all $n$.

Proof. Let $v$ be a vertex of the arrangement of $H$. We say that $v$ has
order $i$ if it is the $i$th leftmost intersection of the two curves
defining it. So the order is an integer between 1 and $s$.
Let $f^{(i)}_{\le k}(H)$ denote the number of vertices of order $i$ and
level at most $k$ in the arrangement of $H$. Let $f^{(i)}_{\le k}(n)$ be
the maximum of this quantity over all $n$-element sets $H$ of curves as in
the proposition. Further, we write $f_{\le k}(H) = \sum_{i=1}^{s}
f^{(i)}_{\le k}(H)$ for the total number of vertices of level at most $k$.
For $k = 0$ we write just $f$ instead of $f_{\le 0}$ and similarly for
$f^{(i)}$.
Let $v$ be a vertex of order $i$ on the lower envelope. We define a
charging scheme; that is, we describe who is going to pay for $v$. We
start walking from $v$ to the left along the curve $h$ that passes through
$v$ and is not on the lower envelope to the left of $v$. If $k_i$ vertices
are encountered, without returning to the lower envelope or escaping to
$-\infty$, then we charge each of these $k_i$ vertices $\frac{1}{k_i}$
units. Here $k_1, k_2, \ldots, k_s$ are integer parameters whose values
will be fixed later, but one can think of them as very large constants.
If we end up at $-\infty$ before encountering $k_i$ vertices, we charge 1
to the curve $h$ itself. Finally, if we are back at the lower envelope
without having passed at least $k_i$ vertices, then, crucially, we must
have crossed the second curve $h'$ defining the vertex $v$ again, at a
vertex $v'$ of order $i-1$, and this $v'$ pays 1 for $v$. A picture
illustrates these three cases of charging:

[Figure: the three cases of charging]

We see that $v$ can charge a curve or a vertex of a smaller order
significantly, or it can charge many vertices of arbitrary orders, but
each of them just a little.
We do this charging for all vertices $v$ of order $i$ on the lower
envelope. A given vertex $v'$ of the arrangement can be charged only if it
has level at most $k_i$, and it can be charged at most twice: The vertices
of the lower envelope that might possibly charge $v'$ can be found by
following the two curves passing through $v'$ to the right. So if $v'$ has
order different from $i-1$, then it pays at most $\frac{2}{k_i}$, and if
it has order $i-1$, then it can be charged 1 extra. Finally, each curve
pays at most 1. Since at least 1 unit was paid for each vertex of order
$i$ on the lower envelope, we obtain

$$f^{(i)}(n) \le n + \frac{2}{k_i}\, f_{\le k_i}(n) + f^{(i-1)}_{\le k_i}(n). \tag{7.2}$$

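Next, we want to convert this into a recurrence involving only $f$ and
$f^{(i)}$. To this end, we estimate $f^{(i)}_{\le k}$ by following the
proof of Clarkson's theorem on levels almost literally (as for the case of
segments in Exercise 7.5.2). We obtain

$$f^{(i)}_{\le k}(n) = O\bigl(k^2 f^{(i)}(\lfloor \tfrac{n}{k} \rfloor)\bigr).$$

By substituting this bound (and its analogue for $f_{\le k}$) into the
right-hand side of (7.2), we arrive at the system of inequalities

$$f^{(i)}(n) \le n + C k_i\, f(\lfloor \tfrac{n}{k_i} \rfloor) + C k_i^2\, f^{(i-1)}(\lfloor \tfrac{n}{k_i} \rfloor), \quad i = 1, 2, \ldots, s, \tag{7.3}$$

where $C$ is a suitable constant and where we put $f^{(0)} = 0$. We also
have $f \le f^{(1)} + \cdots + f^{(s)}$.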
It remains to derive the bound $f(n) = O(n^{1+\varepsilon})$ from this
recurrence, which is not really difficult but still somewhat interesting.
It is essential that $f(\lfloor \frac{n}{k_i} \rfloor)$ appears only with
the coefficient $k_i$ on the right-hand side, in contrast to
$f^{(i-1)}(\lfloor \frac{n}{k_i} \rfloor)$, which has coefficient $k_i^2$.
Let $\varepsilon > 0$ be small but fixed. Let us see what happens if we
try to prove the bounds $f^{(i)}(n) \le A_i n^{1+\varepsilon}$ and $f(n)
\le A n^{1+\varepsilon}$ by induction on $n$ using (7.3), where the $A_i$
are suitable (large) constants and $A = \sum_{i=1}^{s} A_i$. The term $n$
on the right-hand side of (7.3) is small compared to $n^{1+\varepsilon}$,
and so we ignore it for the moment. We also neglect the floor functions.
By substituting the inductive hypothesis $f^{(i)}(\lfloor \frac{n}{k_i}
\rfloor) \le A_i (\frac{n}{k_i})^{1+\varepsilon}$ into the right-hand side
of (7.3), we obtain roughly

$$n^{1+\varepsilon}\bigl(CAk_i^{-\varepsilon} + CA_{i-1}k_i^{1-\varepsilon}\bigr) \le n^{1+\varepsilon}\bigl(CAk_i^{-\varepsilon} + CA_{i-1}k_i\bigr).$$
For the induction to work, $A_i$ must be larger than the expression in
parentheses. To make $A_i$ bigger than the second term in parentheses, we
can set $A_i = 3Ck_iA_{i-1}$, say (the constant 3 is chosen to leave
enough room for the other terms). Then $A_i = A_1 C_1^{i-1} k_2 k_3 \cdots
k_i$, with $C_1 = 3C$. These $A_i$ grow fast, and so $A \approx A_s$. Then
the requirement that $A_i$ be larger than the first term in parentheses
yields, after a little simplification,

$$k_i^{\varepsilon} > C_1^{s-i+1}\, k_{i+1} k_{i+2} \cdots k_s.$$

Therefore, the $k_i$ should decrease very fast with $i$. We can set $k_s =
C_1^{1/\varepsilon}$ and $k_i = (C_1^{s-i+1} k_{i+1} k_{i+2} \cdots
k_s)^{1/\varepsilon}$. Now setting $A_1$, which is still a free parameter,
sufficiently (enormously) large, we can make sure that the desired bounds
$f^{(i)}(n) \le A_i n^{1+\varepsilon}$ hold at least up to $n = k_1$, so
that we can really use the recurrence (7.3) in the induction with the
$k_i$ defined above. These considerations indicate that the induction
works; to be completely sure, one should perform it once more in detail.
But we leave this to the reader's diligence and declare Proposition 7.6.1
proved. □
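To get a feeling for how fast the parameters $k_i$ decrease with $i$, the choice $k_s = C_1^{1/\varepsilon}$, $k_i = (C_1^{s-i+1} k_{i+1} \cdots k_s)^{1/\varepsilon}$ can be evaluated for sample values. The Python sketch below is purely illustrative: the constant $C_1 = 3C$ is not specified in the text, so the value passed as `c1` is a hypothetical placeholder, and the function name `log_k` is ours. It works with natural logarithms because the $k_i$ themselves quickly become astronomically large.

```python
import math


def log_k(s: int, eps: float, c1: float) -> list:
    """Natural logarithms of k_1, ..., k_s (index 0 is unused).

    Implements log k_i = ((s - i + 1) * log c1 + sum_{j > i} log k_j) / eps,
    which for i = s reduces to log k_s = log(c1) / eps.
    """
    logs = [0.0] * (s + 1)
    tail = 0.0  # log(k_{i+1} * ... * k_s)
    for i in range(s, 0, -1):
        logs[i] = ((s - i + 1) * math.log(c1) + tail) / eps
        tail += logs[i]
    return logs
```

For instance, with $s = 2$, $\varepsilon = \frac12$ and the placeholder $C_1 = 4$ one gets $k_2 = 16$ and $k_1 = 16^4 = 65536$; smaller $\varepsilon$ or larger $s$ makes the growth far more dramatic.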

Bibliography and remarks. The method shown in this section first appeared
in Halperin and Sharir [HS94], who considered lower envelopes of curved
objects in $\mathbf{R}^3$.

7.7 Algebraic Surface Patches


Here we state, without proofs, general bounds on the complexity of higher-
dimensional lower envelopes. We also discuss a far-reaching generalization: an
analogous bound for the complexity of a cell in a d-dimensional arrangement.
Roughly speaking, the lower envelope of any $n$ "well-behaved" pieces of
$(d-1)$-dimensional surfaces in $\mathbf{R}^d$ has complexity close to
$n^{d-1}$. While for planar curves it is simple to say what
"well-behaved" means, the situation
is more problematic in higher dimensions. The known proofs are geometric,
and listing as axioms all the geometric properties of "well-behaved pieces of
surfaces" actually used in them seems too cumbersome to be useful. Thus, the
most general known results, and even conjectures, are formulated for families
of algebraic surface patches, although it is clear that the proofs apply in more
general settings.
First we recall the definition of a semialgebraic set. This is a set in
$\mathbf{R}^d$ definable by a Boolean combination of polynomial
inequalities. More formally, a set $A \subseteq \mathbf{R}^d$ is called
semialgebraic if there are polynomials $p_1, p_2, \ldots, p_r \in
\mathbf{R}[x_1, \ldots, x_d]$ (i.e., polynomials in $d$ variables with
real coefficients) and a Boolean formula $\Phi(X_1, X_2, \ldots, X_r)$
(such as $X_1 \,\&\, (X_2 \vee X_3)$), where $X_1, \ldots, X_r$ are
variables attaining values "true" or "false", such that

$$A = \{x \in \mathbf{R}^d:\ \Phi(p_1(x) \ge 0,\, p_2(x) \ge 0,\, \ldots,\, p_r(x) \ge 0)\}.$$

Note that the formula $\Phi$ may involve negations, and so the sets $\{x
\in \mathbf{R}^d:\ p_1(x) > 0\}$ and $\{x \in \mathbf{R}^d:\ p_1(x) =
0\}$ are semialgebraic, for example.
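The definition can be mirrored almost literally in code. The following Python sketch is an illustration only; `make_semialgebraic` and the sample polynomials are our own names and choices. A set is represented by a list of polynomial functions together with a Boolean formula applied to the truth values of the inequalities $p_j(x) \ge 0$, and the result is a membership test.

```python
from typing import Callable, Sequence

Point = Sequence[float]
Polynomial = Callable[[Point], float]


def make_semialgebraic(polys: Sequence[Polynomial],
                       formula: Callable[..., bool]) -> Callable[[Point], bool]:
    """Membership test for {x : formula(p_1(x) >= 0, ..., p_r(x) >= 0)}."""
    def contains(x: Point) -> bool:
        return bool(formula(*[p(x) >= 0 for p in polys]))
    return contains


def p1(x: Point) -> float:
    return 1 - x[0] ** 2 - x[1] ** 2  # >= 0 exactly on the closed unit disk


def p2(x: Point) -> float:
    return x[0]  # >= 0 on the closed right half-plane


def p3(x: Point) -> float:
    return x[1]  # >= 0 on the closed upper half-plane


# The sample formula X1 & (X2 v X3): the unit disk minus the open third quadrant.
A = make_semialgebraic([p1, p2, p3], lambda X1, X2, X3: X1 and (X2 or X3))

# Negation gives strict inequalities: p1(x) > 0 is "not (-p1(x) >= 0)".
open_disk = make_semialgebraic([lambda x: -p1(x)], lambda X1: not X1)

# Equality is a conjunction: p1(x) = 0 is "p1(x) >= 0 and -p1(x) >= 0".
circle = make_semialgebraic([p1, lambda x: -p1(x)], lambda X1, X2: X1 and X2)
```

The last two sets correspond to the remark about negations: strict inequalities and equalities need no separate treatment once the formula may negate its inputs.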
One might want to allow for quantifiers, that is, to admit sets like
$\{(x_1,x_2) \in \mathbf{R}^2:\ \exists y_1 \forall y_2\;
p(x_1,x_2,y_1,y_2) \ge 0\}$ for a 4-variate polynomial $p$. As is useful
to know, but not very easy to prove (and we do not attempt it here), each
such set is semialgebraic, too: According to a famous theorem of Tarski,
it can be defined by a quantifier-free formula.
Let $D$ be the maximum of the degrees of the polynomials $p_1, \ldots,
p_r$ appearing in the definition of a semialgebraic set $A$. Let us call
the number $\max(d, r, D)$ the description complexity² of $A$. The results
about lower envelopes concern semialgebraic sets whose description
complexity is bounded by a constant.
An algebraic surface patch is a special case of a semialgebraic set: It
can be defined as the intersection of the zero set of some polynomial $q
\in \mathbf{R}[x_1, \ldots, x_d]$ with a closed semialgebraic set $B$.
Intuitively, $q(x) = 0$ defines a "surface" in $\mathbf{R}^d$, and $B$
cuts off a closed patch from that surface. Note that $B$ can be all of
$\mathbf{R}^d$, and so the forthcoming results apply, among others, to
graphs of polynomials or, more generally, to surfaces defined by a single
polynomial equation.
Let us remark that in the papers dealing with algebraic surface patches,
the definition is often more restrictive, and certainly the proofs make
several extra assumptions. Most significantly, they usually suppose that
the patches are smooth and they intersect transversally; that is, near
each point common to the relative interior of $k$ patches, these $k$
patches look locally like $k$ hyperplanes in general position, $1 \le k
\le d$. These conditions follow from a suitable general position
assumption, namely, that the coefficients of all the polynomials appearing
in the descriptions of all the patches are algebraically independent
numbers.³ This can be achieved by a perturbation, but a rigorous argument,
showing that a sufficiently small perturbation cannot decrease the
complexity of the lower envelope too much, is not entirely easy.
The algebraic surface patches are also typically required to be
$x_d$-monotone (every vertical line intersects them at most once). This
can be guaranteed
by partitioning each of the original patches into smaller pieces, slicing them
along the locus of points with vertical tangent hyperplanes (and eliminating
the vertical pieces).
After these preliminaries, we can state the main theorem.

7.7.1 Theorem. For all integers $b$ and $d \ge 2$ and every $\varepsilon >
0$, there exists $C = C(d, b, \varepsilon)$ such that the following holds.
Whenever $\gamma_1, \gamma_2, \ldots, \gamma_n$ are algebraic surface
patches in $\mathbf{R}^d$, each of description complexity at most $b$, the
lower envelope of the arrangement of $\gamma_1, \gamma_2, \ldots,
\gamma_n$ has combinatorial complexity at most $Cn^{d-1+\varepsilon}$.

2 This terminology is not standard.

3 Real numbers $a_1, a_2, \ldots, a_m$ are algebraically independent if
there is no nonzero polynomial $p$ with integer coefficients such that
$p(a_1, a_2, \ldots, a_m) = 0$.
How is the combinatorial complexity of the lower envelope defined in this
general case, by the way? For each $\gamma_i$, we define $M_i \subseteq
\mathbf{R}^{d-1}$ as the region where $\gamma_i$ is on the bottom of the
arrangement; formally, $M_i$ consists of all $(x_1, x_2, \ldots, x_{d-1})
\in \mathbf{R}^{d-1}$ such that the lowest intersection of the vertical
line $\{(x_1, x_2, \ldots, x_{d-1}, t):\ t \in \mathbf{R}\}$ with
$\bigcup_{j=1}^{n} \gamma_j$ lies in $\gamma_i$. The arrangement of the
$M_i$ is often called the minimization diagram of the $\gamma_i$, and the
number of its faces is the complexity of the lower envelope.
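A sampled version of the minimization diagram is easy to compute in the simplest setting. The Python sketch below is a rough illustration under simplifying assumptions: $d = 2$, the patches are graphs of functions defined on all of $\mathbf{R}$, and the diagram is only probed at finitely many sample points, so narrow regions between samples can be missed. The function names are ours. The example functions are distances to three sites on the line, so the regions $M_i$ are the one-dimensional Voronoi cells of the sites.

```python
from typing import Callable, List, Sequence


def minimization_labels(functions: Sequence[Callable[[float], float]],
                        xs: Sequence[float]) -> List[int]:
    """For each sample x, the index i of a function attaining min_i f_i(x)."""
    return [min(range(len(functions)), key=lambda i: functions[i](x)) for x in xs]


def merge_runs(labels: Sequence[int]) -> List[int]:
    """Collapse consecutive equal labels: the sequence of regions met from left to right."""
    runs: List[int] = []
    for label in labels:
        if not runs or runs[-1] != label:
            runs.append(label)
    return runs


sites = [0.0, 4.0, 10.0]
functions = [lambda x, s=s: abs(x - s) for s in sites]  # f_i(x) = |x - s_i|
samples = [-1.0, 1.0, 3.0, 5.0, 8.0, 12.0]
labels = minimization_labels(functions, samples)
```

Here `labels` records, for each sample, which graph is lowest above it, and `merge_runs(labels)` lists the regions in left-to-right order.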
The proof of Theorem 7.7.1 is quite similar to the one shown in the
preceding section. Each lower-envelope vertex is charged either to a
vertex of lower order (the intersection of the same $d$ patches but lying
more to the left), or to some $k_i$ vertices, or to a vertex within the
vertical wall erected from the boundary of some patch (all the charged
vertices lying at level at most $k_i$). The number of vertices of the last
type is estimated by using the $(d-1)$-dimensional case of Theorem 7.7.1
(so the whole proof goes by induction on the dimension). To this end, one
needs to show that the situation within the $(d-1)$-dimensional vertical
wall, which in general is curved, can be mapped to a situation with
algebraic surface patches in $\mathbf{R}^{d-1}$. Here the fact that we are
dealing with semialgebraic sets is used most heavily.
Theorem 7.7.1 is a powerful result and it provides nontrivial upper bounds
on the complexity of various geometric configurations. Sometimes the bound
can be improved by a problem-specific proof, but the general lower-envelope
result often quickly yields roughly the correct order of magnitude. For
examples see Exercise 1 and [SA95] or [AS00a].
Single cell. Bounding the maximum complexity of a single cell in an ar-
rangement is usually considerably more demanding than the lower envelope
question, mainly because a cell can have a complicated topology: It can have
holes, tunnels, and so on (cells in hyperplane arrangements, no more com-
plicated than the lower envelope, are an honorable exception). The following
theorem provides a bound analogous to that of Theorem 7.7.1. It was proved
by similar methods but with several new ideas, especially for the topological
complexity of the cell.
7.7.2 Theorem. For all integers $b$ and $d \ge 2$ and every $\varepsilon >
0$, there exist $C_0 = C_0(d, b)$ and $C = C(d, b, \varepsilon)$ such that
the following holds. Let $K$ be a cell in the arrangement of $n$ algebraic
surface patches in $\mathbf{R}^d$ in general position, each of description
complexity at most $b$. Then the combinatorial complexity of $K$ (the
number of faces in its closure) is at most $Cn^{d-1+\varepsilon}$, and its
topological complexity (the sum of the Betti numbers) is no more than
$C_0 n^{d-1}$.
The general position assumption can probably be removed, but I am aware
of no explicit reference, except for the special case $d = 3$.

Bibliography and remarks. For a thorough discussion of semialgebraic sets
and quantifier elimination we refer to books on real algebraic geometry,
such as Bochnak, Coste, and Roy [BCR98].
An old conjecture of Sharir asserts that the combinatorial complexity of
the lower envelope in the situation of Theorem 7.7.1 is at most
$O(n^{d-2}\lambda_s(n))$ for a suitable $s$ depending on the description
complexity of the patches. The best known lower bound is
$\Omega(n^{d-1}\alpha(n))$, which applies even for simplices.
The decisive advance towards proving Theorem 7.7.1 was made by Halperin
and Sharir [HS94], who established the 3-dimensional case. The general
case was proved, as a culmination of a long development, by Sharir
[Sha94]. A discussion of the general position assumption and the
perturbation argument can also be found there. Interestingly, it is not
proved that the maximum complexity is attained in general position;
rather, it is argued that the expected complexity after an appropriate
random perturbation is always at least a fixed fraction of the original
complexity minus $O(n^{d-1+\varepsilon})$.
Some applications lead to the following variation of the lower envelope
problem: We have two collections $\mathcal{F}$ and $\mathcal{G}$ of
algebraic surface patches in $\mathbf{R}^d$, we project the lower
envelopes of both $\mathcal{F}$ and $\mathcal{G}$ into $\mathbf{R}^{d-1}$,
and we are interested in the complexity of the superimposed projections
(where, for $d = 3$, a vertex of the superimposed projections can arise,
for example, as the intersection of an edge coming from $\mathcal{F}$ with
an edge obtained from $\mathcal{G}$). In $\mathbf{R}^3$, it is known that
this complexity is $O(n^{2+\varepsilon})$, where $n = |\mathcal{F}| +
|\mathcal{G}|$ (Agarwal, Sharir, and Schwarzkopf [ASS96]); this is similar
to the bound for the lower envelopes themselves. The problem remains open
in dimensions 4 and higher.
The combinatorial complexity of a Voronoi diagram can also be viewed as a
lower-envelope problem. Namely, let $s_1, s_2, \ldots, s_n$ be objects in
$\mathbf{R}^d$ (points, lines, segments, polytopes), and let $\rho$ be a
metric on $\mathbf{R}^d$. Each $s_i$ defines the function $f_i\colon
\mathbf{R}^d \to \mathbf{R}$ by $f_i(x) = \rho(x, s_i)$, and the Voronoi
diagram of the $s_i$ is exactly the minimization diagram of the graphs of
the $f_i$ (i.e., the projection of their lower envelope). If the $f_i$ are
algebraic of bounded degree (or can be converted to such functions by a
monotone transform of the range), the general lower envelope bound implies
that the complexity of the Voronoi diagram in $\mathbf{R}^d$ is no more
than $O(n^{d+\varepsilon})$. This result is nontrivial, but it is widely
believed that it should be possible to improve it by a factor of $n$ (and
even more in some special cases). Several nice partial results are known,
mostly obtained by methods similar to those for lower envelopes. Most
notably, Chew, Kedem, Sharir, Tagansky, and Welzl [CKS+98] proved that if
the $s_i$ are lines in $\mathbf{R}^3$ and the metric $\rho$ is given by a
norm whose unit ball is a convex polytope with a constant-bounded number
of vertices (this includes the $\ell_1$ and $\ell_\infty$ metrics, but not
the Euclidean metric), then the Voronoi diagram has complexity
$O(n^2\alpha(n)\log n)$. On the other hand, Aronov [Aro00] constructed,
for every $p \in [1, \infty]$, a set of $n$ $(d-2)$-flats in
$\mathbf{R}^d$ whose Voronoi diagram under the $\ell_p$ metric has
complexity $\Omega(n^{d-1})$ (Exercise 5.7.3).
Single cell. For a single cell in the arrangement of $n$ simplices in
$\mathbf{R}^d$, Aronov and Sharir [AS94] obtained the complexity bound
$O(n^{d-1}\log n)$. Halperin and Sharir [HS95] managed to prove Theorem
7.7.2 in dimension 3. The effort was crowned by Basu [Bas98], who showed
by an argument inspired by Morse theory that the topological complexity of
a single cell in $\mathbf{R}^d$, assuming general position, is
$O(n^{d-1})$; the Halperin-Sharir technique then implies the
$O(n^{d-1+\varepsilon})$ bound on the combinatorial complexity.
The research of Sharir and his colleagues in this problem (and many other
problems discussed in this chapter) has been motivated by questions about
automatic motion planning for a robot. For example, let us consider a
square-shaped robot in the plane moving among $n$ pairwise disjoint
segment obstacles. The placement of the robot can be specified by three
coordinates: the position $(x, y)$ of the center and the angle $\alpha$ of
rotation. Each obstacle excludes some placements of the robot. With
suitable choice of coordinates, say $(x, y, \tan\frac{\alpha}{2})$, the
region of excluded placements is bounded by a few algebraic surface
patches. Hence all possible placements of the robot reachable from a given
position by a continuous obstacle-avoiding movement correspond to a single
cell in the arrangement of $O(n)$ algebraic surface patches in
$\mathbf{R}^3$. Consequently, the set of reachable placements has
combinatorial complexity at most $O(n^{2+\varepsilon})$. A similar
reduction works for more general shapes of the robot and of the obstacles
(the robot may even have movable parts), as long as the robot and each of
the obstacles can be described by a bounded number of algebraic surface
patches. Unfortunately, even in quite simple settings, the combinatorial
complexity of the reachable region can be very large. For example, a cube
robot in $\mathbf{R}^3$ has 6 degrees of freedom, and so its placements
correspond to points in $\mathbf{R}^6$. Exact motion planning algorithms
thus become rather impractical, and faster approximate algorithms are
typically used.
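The reason a coordinate such as $\tan\frac{\alpha}{2}$ is convenient is the classical half-angle substitution: with $t = \tan\frac{\alpha}{2}$ we have $\cos\alpha = \frac{1-t^2}{1+t^2}$ and $\sin\alpha = \frac{2t}{1+t^2}$, so the positions of the robot's corners become rational functions of $(x, y, t)$ and the constraints become polynomial after clearing denominators. The Python sketch below merely checks these identities numerically; the function names and the half-side parameter are our own illustrative choices.

```python
import math
from typing import Tuple


def rotation_from_t(t: float) -> Tuple[float, float]:
    """(cos(alpha), sin(alpha)) as rational functions of t = tan(alpha / 2)."""
    denom = 1 + t * t
    return (1 - t * t) / denom, 2 * t / denom


def corner(x: float, y: float, t: float, half_side: float = 1.0) -> Tuple[float, float]:
    """One corner of a square robot centered at (x, y) and rotated by alpha,
    expressed through t only (no trigonometric functions)."""
    c, s = rotation_from_t(t)
    cx, cy = half_side, half_side  # the corner before rotation, relative to the center
    return x + c * cx - s * cy, y + s * cx + c * cy
```

The substitution covers all angles $\alpha \in (-\pi, \pi)$; the single placement $\alpha = \pi$ corresponds to $t = \infty$ and would need separate handling.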
The complexity of unions. This is another type of problem that often
occurs in the analysis of geometric algorithms. Let $A_1, A_2, \ldots,
A_n$ be sets in the plane, each of them bounded by a closed Jordan curve,
and suppose that the boundaries of every $A_i$ and $A_j$ intersect in at
most $s$ points. For $s = 2$, the $A_i$ are called pseudodisks, and the
primary example is circular disks.

[Figure: an example of pseudodisks and an example of sets that are not pseudodisks]

For this case Kedem, Livne, Pach, and Sharir [KLPS86] proved that the
complexity of $\bigcup_{i=1}^{n} A_i$ is $O(n)$, where the complexity is
measured as the sum of the complexities of the "exterior" cells of the
arrangement, i.e., the cells that are not contained in any of the $A_i$.
For $s \ge 4$, long and skinny sets can form a grid pattern and have union
complexity about $n^2$, but linear or near-linear bounds were proved under
additional assumptions. One type of such additional assumption is metric,
namely, that the objects are "fat." A rather complicated proof of Efrat
and Sharir [ES00] shows that if each $A_i$ is convex, the ratio of the
circumradius and inradius is bounded by some constant $K$, and every two
boundaries intersect at most $s$ times, then the union complexity is at
most $O(n^{1+\varepsilon})$ for any $\varepsilon > 0$, with the constant
of proportionality depending on $s$, $K$, $\varepsilon$. Earlier,
Matoušek, Pach, Sharir, Sifrony, and Welzl [MPS+94] gave a simpler and
more precise bound of $O(n \log\log n)$ for fat triangles. Pach, Safruti,
and Sharir [PSS01] showed that the union of $n$ fat wedges in
$\mathbf{R}^3$ (intersections of two half-spaces with angle at least some
$\alpha_0 > 0$), as well as the union of $n$ cubes in $\mathbf{R}^3$, has
complexity $O(n^{2+\varepsilon})$. Various extensions of these results to
nonconvex objects or to higher dimensions seem easy to conjecture but
quite hard to prove.
Several results are known where one assumes that the $A_i$ have
special shapes or bounded complexity. Aronov, Sharir, and Tagansky
[AST97] proved that the complexity of the union of $k$ convex polygons
in the plane with $n$ vertices in total is $O(k^2 + n\alpha(k))$ and that the union
of $k$ convex polytopes in $\mathbf{R}^3$ with $n$ vertices in total has complexity
$O(k^3 + kn \log k)$. Boissonnat, Sharir, Tagansky, and Yvinec [BSTY98]
showed that the union of $n$ axis-parallel cubes in $\mathbf{R}^d$ has $O(n^{\lceil d/2 \rceil})$
complexity, and $O(n^{\lfloor d/2 \rfloor})$ complexity if the cubes all have the same
size; both these bounds are tight.
Agarwal and Sharir [AS00c] proved that the union of $n$ infinite
cylinders of equal radius in $\mathbf{R}^3$ has complexity $O(n^{2+\varepsilon})$ (here $\Omega(n^2)$
is a lower bound), and more generally, if $A_1, \ldots, A_n$ are pairwise disjoint
triangles in $\mathbf{R}^3$ and $B$ is a ball, then $\bigcup_i (A_i + B)$ has complexity
$O(n^{2+\varepsilon})$, where $A_i + B = \{a + b : a \in A_i,\ b \in B\}$ is the Minkowski
sum. The proof relies on the result mentioned above about two superimposed
lower envelopes.

Exercises
1. Let $p_1, \ldots, p_n$ be points in the plane. At time $t = 0$, each $p_i$ starts moving
along a straight line with a fixed velocity $v_i$. Use Theorem 7.7.1 to prove
that the convex hull of the $n$ moving points changes its combinatorial
structure at most $O(n^{2+\varepsilon})$ times during the time interval $[0, \infty)$.
The tight bound is $O(n^2)$; it was proved, together with many other related
results, by Agarwal, Guibas, Hershberger, and Veach [AGHV01].

8 Intersection Patterns of Convex Sets

In Chapter 1 we covered three simple but basic theorems in the theory of
convexity: Helly's, Radon's, and Caratheodory's. For each of them we present
one closely related but more difficult theorem in the current chapter. These
more advanced relatives are selected, among the vast number of variations
on the Helly-Radon-Caratheodory theme, because of their wide applicability
and also because of nice techniques and tricks appearing in their proofs.
The development started in this chapter continues in Chapters 9 and 10.
One of the culminations of this route is the (p, q)-theorem of Alon and Kleit-
man, which we will prove in Section 10.5. The proof ingeniously combines
many of the tools covered in these three chapters and illustrates their power.
Readers who do not like higher dimensions may want to consider dimen-
sions 2 and 3 only. Even with this restriction, the results are still interesting
and nontrivial.

8.1 The Fractional Helly Theorem


Helly's theorem says that if every at most $d+1$ sets of a finite family of
convex sets in $\mathbf{R}^d$ intersect, then all the sets of the family intersect. What
if not necessarily all, but a large fraction of $(d+1)$-tuples of sets, intersect?
The following theorem states that then a large fraction of the sets must have
a point in common.
8.1.1 Theorem (Fractional Helly theorem). For every dimension $d \ge 1$
and every $\alpha > 0$ there exists a $\beta = \beta(d, \alpha) > 0$ with the following property.
Let $F_1, \ldots, F_n$ be convex sets in $\mathbf{R}^d$, $n \ge d+1$, and suppose that for at least
$\alpha\binom{n}{d+1}$ of the $(d+1)$-point index sets $I \subseteq \{1, 2, \ldots, n\}$, we have $\bigcap_{i \in I} F_i \ne \emptyset$.
Then there exists a point contained in at least $\beta n$ sets among the $F_i$.

Although simple, this is a key result, and many of the subsequent devel-
opments rely on it.
The best possible value of $\beta$ is $\beta = 1 - (1-\alpha)^{1/(d+1)}$. We prove the weaker
estimate $\beta \ge \frac{\alpha}{d+1}$.
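To make the statement concrete, here is a small sketch for the simplest case $d = 1$, where the convex sets are closed intervals and the $(d+1)$-tuples are pairs. The function name `fractional_helly_1d` and the sample intervals are our own illustrative choices, not part of the text. The sketch computes the fraction $\alpha$ of intersecting pairs and the largest number of intervals sharing a point, and compares the latter with both the weak bound $\frac{\alpha}{2} n$ and the sharp bound $(1 - \sqrt{1-\alpha})\, n$.

```python
from itertools import combinations
from math import sqrt


def fractional_helly_1d(intervals):
    """Return (alpha, depth) for a list of closed intervals (lo, hi).

    alpha is the fraction of pairs of intervals that intersect;
    depth is the largest number of intervals containing a common point.
    """
    pairs = list(combinations(intervals, 2))
    # Two closed intervals intersect iff each one starts before the other ends.
    hits = sum(1 for (a, b), (c, d) in pairs if a <= d and c <= b)
    alpha = hits / len(pairs)
    # The maximum depth is always attained at some left endpoint.
    depth = max(
        sum(1 for lo, hi in intervals if lo <= x <= hi)
        for x, _ in intervals
    )
    return alpha, depth


# Hypothetical sample: four mutually overlapping intervals and two isolated ones.
intervals = [(0, 4), (1, 5), (2, 6), (3, 7), (10, 11), (12, 13)]
alpha, depth = fractional_helly_1d(intervals)
n = len(intervals)
weak_bound = alpha / 2 * n                   # beta = alpha / (d+1) with d = 1
sharp_bound = (1 - sqrt(1 - alpha)) * n      # beta = 1 - (1 - alpha)^(1/2)
print(alpha, depth, weak_bound, sharp_bound)
```

In this sample the deepest point lies well above both bounds; the bounds become informative only for families where the intersecting pairs are spread out more evenly.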
Proof. For a subset $I \subseteq \{1, 2, \ldots, n\}$, let us write $F_I$ for the intersection
$\bigcap_{i \in I} F_i$.
First we observe that it is enough to prove the theorem for the $F_i$ closed
and bounded (and even convex polytopes). Indeed, given some arbitrary
$F_1, \ldots, F_n$, we choose a point $p_I \in F_I$ for every $(d+1)$-tuple $I$ with $F_I \ne \emptyset$
and we define $F_i' = \mathrm{conv}\{p_I : F_I \ne \emptyset,\ i \in I\}$, which is a polytope contained in
$F_i$. If the theorem holds for these $F_i'$, then it also holds for the original $F_i$.
In the rest of the proof we thus assume that the $F_i$, and hence also all the
nonempty $F_I$, are compact.
Let $\le_{\mathrm{lex}}$ denote the lexicographic ordering of the points of $\mathbf{R}^d$ by their
coordinate vectors. It is easy to show that any compact subset of $\mathbf{R}^d$ has a
unique lexicographically minimum point (Exercise 1). We need the following
consequence of Helly's theorem.
8.1.2 Lemma. Let $I \subseteq \{1, 2, \ldots, n\}$ be an index set with $F_I \ne \emptyset$, and let $v$
be the (unique) lexicographically minimum point of $F_I$. Then there exists an
at most $d$-element subset $J \subseteq I$ such that $v$ is the lexicographically minimum
point of $F_J$ as well.
In other words, the minimum of the intersection $F_I$ is always enforced by
some at most $d$ "constraints" $F_i$, as is illustrated in the following drawing
(note that the two constraints determining the minimum are not determined
uniquely in the picture):

Proof. Let $C = \{x \in \mathbf{R}^d : x <_{\mathrm{lex}} v\}$. It is easy to check that $C$ is
convex. Since $v$ is the lexicographic minimum of $F_I$, we have $C \cap F_I = \emptyset$.
So the family of convex sets consisting of $C$ plus the sets $F_i$ with
$i \in I$ has an empty intersection. By Helly's theorem there are at most
$d+1$ sets in this family whose intersection is empty as well. The set
$C$ must be one of them, since all the others contain $v$. The remaining
at most $d$ sets yield the desired index set $J$. □

Let us remark that instead of taking the lexicographically minimum point,
one can consider a point minimizing a generic linear function. That formulation
is perhaps more intuitive, but it appears slightly more complicated for
rigorous presentation.

We can now finish the proof of the fractional Helly theorem. For each of
the $\alpha\binom{n}{d+1}$ index sets $I$ of cardinality $d+1$ with $F_I \ne \emptyset$, we fix a $d$-element
set $J = J(I) \subset I$ such that $F_J$ has the same lexicographic minimum as $F_I$.
The theorem follows by double counting. Since the number of distinct
$d$-tuples $J$ is at most $\binom{n}{d}$, one of them, call it $J_0$, appears as $J(I)$ for at least
$\alpha\binom{n}{d+1}/\binom{n}{d} = \alpha\frac{n-d}{d+1}$ distinct $I$. Each such $I$ has the form $J_0 \cup \{i\}$ for some
$i \in \{1, 2, \ldots, n\}$. The lexicographic minimum of $F_{J_0}$ is contained in at least
$d + \alpha\frac{n-d}{d+1} > \alpha\frac{n}{d+1}$ sets among the $F_i$. Hence we may set $\beta = \frac{\alpha}{d+1}$. □

Bibliography and remarks. The fractional Helly theorem is due
to Katchalski and Liu [KL79]. The quantitatively sharp version with
$\beta = 1 - (1-\alpha)^{1/(d+1)}$ was proved by Kalai [Kal84] (and the main result
needed for it was proved independently by Eckhoff [Eck85], too). Actually,
there is an exact result: If the maximum size of an intersecting
subfamily in a family of $n$ convex sets in $\mathbf{R}^d$ is $m$, then the smallest
possible number of intersecting $(d+1)$-tuples is attained for the family
consisting of $n - m + d$ hyperplanes in general position and $m - d$
copies of $\mathbf{R}^d$. But there are many other essentially different examples
attaining the same bound.
These assertions are consequences of considerably more general re-
sults about the possible intersection patterns of convex sets in Rd.
For explaining some of them it is convenient to use the language of
simplicial complexes. Let $\mathcal{F} = \{F_1, F_2, \ldots, F_n\}$ be a family of convex
sets in $\mathbf{R}^d$. The nerve $N(\mathcal{F})$ of $\mathcal{F}$ is the simplicial complex with
vertex set $\{1, 2, \ldots, n\}$ whose simplices are all $I \subseteq \{1, 2, \ldots, n\}$ such
that $\bigcap_{i \in I} F_i \ne \emptyset$. A simplicial complex obtainable as $N(\mathcal{F})$ for some
family of convex sets in $\mathbf{R}^d$ is called $d$-representable. A characterization
of $d$-representable simplicial complexes for a given $d$ is most
likely out of reach. There are several useful necessary conditions for
d-representability. One certainly worth mentioning is d-collapsibility,
which means that a given simplicial complex K can be reduced to the
void complex by a sequence of elementary d-collapsings, where an ele-
mentary d-collapsing consists in deleting a face S E K of dimension at
most d-1 that lies in a unique maximal face of K and all the faces of K
containing S. The proof of the d-collapsibility of every d-representable
complex (Wegner [Weg75]) uses an idea quite similar to the proof of
the fractional Helly theorem.
While no characterization of $d$-representable complexes is known,
the possible $f$-vectors of such complexes (where $f_i$ is the number of
$i$-dimensional simplices, which correspond to $(i+1)$-wise intersections
here) are fully characterized by a conjecture of Eckhoff, which was
proved by Kalai [Kal84], [Kal86] by an impressive combination of several
methods. The same characterization applies to $d$-collapsible complexes
as well (and even to the more general $d$-Leray complexes; these
are the complexes where the homology of dimension $d$ and larger vanishes
for all induced subcomplexes). We do not formulate it but mention
one of its consequences, the upper bound theorem for families of
convex sets: If $f_r(N(\mathcal{F})) = 0$ for a family $\mathcal{F}$ of $n$ convex sets in $\mathbf{R}^d$ and
some $r$, $d \le r \le n$, then $f_k(N(\mathcal{F})) \le \sum_{j=0}^{d} \binom{r-d}{k-j+1}\binom{n-r+d}{j}$; equality
holds, e.g., in the case mentioned above (several copies of $\mathbf{R}^d$ and
hyperplanes in general position).

Exercises
1. Show that any compact set in $\mathbf{R}^d$ has a unique point with the
lexicographically smallest coordinate vector.
2. Prove the following colored Helly theorem: Let $\mathcal{C}_1, \ldots, \mathcal{C}_{d+1}$ be finite
families of convex sets in $\mathbf{R}^d$ such that for any choice of sets $C_1 \in \mathcal{C}_1, \ldots,$
$C_{d+1} \in \mathcal{C}_{d+1}$, the intersection $C_1 \cap \cdots \cap C_{d+1}$ is nonempty. Then for
some $i$, all the sets of $\mathcal{C}_i$ have a nonempty intersection. Apply a method
similar to the proof of the fractional Helly theorem; i.e., consider the
lexicographic minima of the intersections of suitable collections of the sets.
The result is due to Lovasz ([Lov74]; also see [Bar82]).
3. Let $F_1, F_2, \ldots, F_n$ be convex sets in $\mathbf{R}^d$. Prove that there exist convex
polytopes $P_1, P_2, \ldots, P_n$ such that $\dim(\bigcap_{i \in I} F_i) = \dim(\bigcap_{i \in I} P_i)$ for every
$I \subseteq \{1, 2, \ldots, n\}$ (where $\dim(\emptyset) = -1$).

8.2 The Colorful Caratheodory Theorem


Caratheodory's theorem asserts that if a point $x$ is in the convex hull of a set
$X \subseteq \mathbf{R}^d$, then it is in the convex hull of some at most $d+1$ points of $X$. Here
we present a "colored version" of this statement. In the plane, it shows the
following: Given a red triangle, a blue triangle, and a white triangle, each of
them containing the origin, there is a vertex r of the red triangle, a vertex b of
the blue triangle, and a vertex w of the white triangle such that the tricolored
triangle rbw also contains the origin. (In the following pictures, the colors of
points are distinguished by different shapes of the point markers.)

The d-dimensional statement follows.



8.2.1 Theorem (Colorful Caratheodory theorem). Consider $d+1$ finite
point sets $M_1, \ldots, M_{d+1}$ in $\mathbf{R}^d$ such that the convex hull of each
$M_i$ contains the point $0$ (the origin). Then there exists a $(d+1)$-point set
$S \subseteq M_1 \cup \cdots \cup M_{d+1}$ with $|M_i \cap S| = 1$ for each $i$ and such that $0 \in \mathrm{conv}(S)$.
(If we imagine that the points of $M_i$ have "color" $i$, then we look for a
"rainbow" $(d+1)$-point set $S$ with $0 \in \mathrm{conv}(S)$, where "rainbow" = "containing all
colors.")

Proof. Call the convex hull of a (d+1)-point rainbow set a rainbow simplex.
We proceed by contradiction: We suppose that no rainbow simplex contains 0,
and we choose a (d+1)-point rainbow set S such that the distance of conv(S)
to $0$ is the smallest possible. Let $x$ be the point of $\mathrm{conv}(S)$ closest to $0$.
Consider the hyperplane $h$ containing $x$ and perpendicular to the segment
$0x$, as in the picture:

Then all of $S$ lies in the closed half-space $h^-$ bounded by $h$ and not containing
$0$. We have $\mathrm{conv}(S) \cap h = \mathrm{conv}(S \cap h)$, and by Caratheodory's theorem,
there exists an at most $d$-point subset $T \subseteq S \cap h$ such that $x \in \mathrm{conv}(T)$.
Let $i$ be a color not occurring in $T$ (i.e., $M_i \cap T = \emptyset$). If all the points
of $M_i$ lay in the half-space $h^-$, then $0$ would not be in $\mathrm{conv}(M_i)$, contrary to
our assumption. Thus, there exists a point $y \in M_i$ lying in the complement of $h^-$
(strictly, i.e., $y \notin h$).
Let us form a new rainbow set $S'$ from $S$ by replacing the (unique) point
of $M_i \cap S$ by $y$. We have $T \subset S'$, and so $x \in \mathrm{conv}(S')$. Hence the segment
$xy$ is contained in $\mathrm{conv}(S')$, and we see that $\mathrm{conv}(S')$ lies closer to $0$ than
$\mathrm{conv}(S)$, a contradiction. The colorful Caratheodory theorem is proved. □

This proof suggests an algorithm for finding the rainbow simplex as in
the theorem. Namely, start with an arbitrary rainbow simplex, and if it does
not contain $0$, switch one vertex as in the proof. It is not known whether the
number of steps of this algorithm can be bounded by a polynomial function
of the dimension and of the total number of points in the $M_i$. It would be
very interesting to construct configurations where the number of steps is very
large or to prove that it cannot be too large.
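The vertex-switching procedure can be sketched in the plane ($d = 2$) as follows. This is only an illustration under simplifying assumptions: the three color classes are lists of distinct points in sufficiently general position (in particular, no triangle that we meet is degenerate with all its vertices on a line through the origin), each class contains the origin in its convex hull, and floating-point arithmetic is accurate enough for the input. All function names are ours.

```python
def dot(p, q):
    return p[0] * q[0] + p[1] * q[1]


def cross(p, q):
    return p[0] * q[1] - p[1] * q[0]


def contains_origin(tri):
    """True if the origin lies in the closed triangle (general position assumed)."""
    a, b, c = tri
    s = [cross(a, b), cross(b, c), cross(c, a)]
    return not (min(s) < 0 < max(s))


def closest_point(tri):
    """Closest point x of the triangle boundary to the origin, together with
    the indices of the (one or two) vertices whose convex hull contains x."""
    best = None
    for i, j in ((0, 1), (1, 2), (2, 0)):
        p, q = tri[i], tri[j]
        d = (q[0] - p[0], q[1] - p[1])
        t = -dot(p, d) / dot(d, d)      # parameter of the foot of the perpendicular
        if t <= 0:
            x, supp = p, (i,)
        elif t >= 1:
            x, supp = q, (j,)
        else:
            x, supp = (p[0] + t * d[0], p[1] + t * d[1]), (i, j)
        if best is None or dot(x, x) < dot(best[0], best[0]):
            best = (x, supp)
    return best


def rainbow_triangle(colors, max_steps=1000):
    """Switch one vertex at a time until the rainbow triangle contains the origin."""
    choice = [0, 0, 0]                  # start with an arbitrary rainbow triangle
    for _ in range(max_steps):
        tri = [colors[i][choice[i]] for i in range(3)]
        if contains_origin(tri):
            return tri
        x, supp = closest_point(tri)    # the set T of the proof is tri[k], k in supp
        i = next(k for k in range(3) if k not in supp)   # a color not occurring in T
        # Pick y in M_i strictly on the origin's side of h, i.e. <y, x> < <x, x>.
        j = min(range(len(colors[i])), key=lambda m: dot(colors[i][m], x))
        if dot(colors[i][j], x) >= dot(x, x):
            raise ValueError("0 is not in the convex hull of color class %d" % i)
        choice[i] = j
    raise RuntimeError("step limit reached")


# Hypothetical input: three triangles, each containing the origin.
colors = [
    [(1, 0), (-1, 1), (-1, -1)],
    [(0, 2), (-2, -1), (2, -1)],
    [(3, 1), (-3, 1), (0, -3)],
]
print(rainbow_triangle(colors))
```

Among the admissible points $y$ the sketch takes the one lying deepest beyond $h$; the proof allows any admissible choice, and nothing is claimed here about which rule needs fewer steps.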

Bibliography and remarks. The colorful Caratheodory theorem
is due to Barany [Bar82]. Its algorithmic aspects were investigated by
Barany and Onn [BO97].

Exercises
1. Let $S$ and $T$ be $(d+1)$-point sets in $\mathbf{R}^d$, each containing $0$ in the convex
hull. Prove that there exists a finite sequence $S_0 = S, S_1, S_2, \ldots, S_m = T$
of $(d+1)$-point sets with $S_i \subseteq S \cup T$ and $0 \in \mathrm{conv}(S_i)$ for all $i$, such
that $S_{i+1}$ is obtained from $S_i$ by deleting one point and adding another.
Assume general position of $S \cup T$ if convenient. Warning: it is better not to
try to find a $(d+1)$-term sequence.

8.3 Tverberg's Theorem


Radon's lemma states that any set of d+2 points in Rd has two disjoint
subsets whose convex hulls intersect. Tverberg's theorem is a generalization of
this statement, where we want not only two disjoint subsets with intersecting
convex hulls but r of them.
It is not too difficult to show that if we have very many points, then such $r$
subsets can be found. For easier formulations, let $T(d, r)$ denote the smallest
integer $T$ such that for any set $A$ of $T$ points in $\mathbf{R}^d$ there exist pairwise
disjoint subsets $A_1, A_2, \ldots, A_r \subset A$ with $\bigcap_{i=1}^{r} \mathrm{conv}(A_i) \ne \emptyset$. Radon's lemma
asserts that $T(d, 2) = d+2$.
It is not hard to see that $T(d, r_1 r_2) \le T(d, r_1)\,T(d, r_2)$ (Exercise 1). Together
with Radon's lemma this observation shows that $T(d, r)$ is finite for
all $r$, but it does not give a very good bound.
Here is another, more sophisticated, argument, leading to the (still suboptimal)
bound $T(d, r) \le n = (r-1)(d+1)^2 + 1$. Let $A$ be an $n$-point set in $\mathbf{R}^d$
and let us set $s = n - (r-1)(d+1)$. A simple counting shows that every $d+1$
subsets of $A$ of size $s$ all have a point of $A$ in common (together they omit
at most $(d+1) \cdot (r-1)(d+1) = n-1$ points of $A$). Therefore, by Helly's
theorem, the convex hulls of all $s$-tuples have a common point $x$ (typically
not in $A$ anymore). By Caratheodory's theorem, $x$ is contained in the convex
hull of some $(d+1)$-point set $A_1 \subseteq A$. Since $A \setminus A_1$ has at least $s$ points, $x$
is still contained in $\mathrm{conv}(A \setminus A_1)$, and thus also in the convex hull of some
$(d+1)$-point $A_2 \subseteq A \setminus A_1$, etc. We can continue in this manner and select the
desired $r$ disjoint sets $A_1, \ldots, A_r$, all of them containing $x$ in their convex
hulls.
It is not difficult to see that $T(d, r)$ cannot be smaller than $(r-1)(d+1) + 1$
(Exercise 2). Tverberg's theorem asserts that this smallest conceivable value
is always sufficient.
8.3.1 Theorem (Tverberg's theorem). Let $d$ and $r$ be given natural
numbers. For any set $A \subset \mathbf{R}^d$ of at least $(d+1)(r-1) + 1$ points there exist $r$
pairwise disjoint subsets $A_1, A_2, \ldots, A_r \subseteq A$ such that $\bigcap_{i=1}^{r} \mathrm{conv}(A_i) \ne \emptyset$.
The sets $A_1, A_2, \ldots, A_r$ as in the theorem are called a Tverberg partition
of $A$ (we may assume that they form a partition of $A$), and a point in the
intersection of their convex hulls is called a Tverberg point. The following
illustration shows what such partitions can look like for $d = 2$ and $r = 3$;
both the drawings use the same 7-point set $A$:

(Are these all Tverberg partitions for this set, or are there more?)
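For intuition it may help to look at the simplest case $d = 1$, where Tverberg's theorem is elementary: among $2(r-1)+1$ points on the line, the median is a Tverberg point, and one can pair the $i$th smallest point with the $i$th largest. The following sketch (the function name is our own) builds this partition explicitly; it is not an algorithm for higher dimensions.

```python
def tverberg_partition_line(points, r):
    """Tverberg partition of 2(r-1)+1 real numbers into r parts (case d = 1).

    Returns the median (a Tverberg point) and the list of parts:
    the median alone, plus the pairs (i-th smallest, i-th largest).
    """
    pts = sorted(points)
    if len(pts) != 2 * (r - 1) + 1:
        raise ValueError("expected exactly 2(r-1)+1 points")
    median = pts[r - 1]
    parts = [[median]]
    for i in range(r - 1):
        # The segment [pts[i], pts[-1-i]] contains the median.
        parts.append([pts[i], pts[-1 - i]])
    return median, parts


median, parts = tverberg_partition_line([5, 1, 9, 3, 7], r=3)
print(median, parts)
```

Each pair spans a segment containing the median, so all $r$ convex hulls share that point.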
As in the colorful Caratheodory theorem, a very interesting open problem
is the existence of an efficient algorithm for finding a Tverberg partition of
a given set. There is a polynomial-time algorithm if the dimension is fixed,
but some NP-hardness results for closely related problems indicate that if
the dimension is a part of input then the problem might be algorithmically
difficult.
Several proofs of Tverberg's theorem are known. The one demonstrated
below is maybe not the simplest, but it shows an interesting "lifting" tech-
nique. We deduce the theorem by applying the colorful Caratheodory theorem
to a suitable point configuration in a higher-dimensional space.
Proof of Tverberg's theorem. We begin with a reformulation of Tverberg's
theorem that is technically easier to handle. For a set $X \subseteq \mathbf{R}^d$, the
convex cone generated by $X$ is defined as the set of all linear combinations of
points of $X$ with nonnegative coefficients; that is, we set

$$\mathrm{cone}(X) = \Bigl\{ \sum_{i=1}^{n} \alpha_i x_i : x_1, \ldots, x_n \in X,\ \alpha_1, \ldots, \alpha_n \in \mathbf{R},\ \alpha_i \ge 0 \Bigr\}.$$

Geometrically, $\mathrm{cone}(X)$ is the union of all rays starting at the origin and
passing through a point of $\mathrm{conv}(X)$. The following statement is equivalent to
Tverberg's theorem:
8.3.2 Proposition (Tverberg's theorem: cone version). Let $A$ be a set
of $(d+1)(r-1) + 1$ points in $\mathbf{R}^{d+1}$ such that $0 \notin \mathrm{conv}(A)$. Then there exist $r$
pairwise disjoint subsets $A_1, A_2, \ldots, A_r \subseteq A$ such that $\bigcap_{i=1}^{r} \mathrm{cone}(A_i) \ne \{0\}$.
Let us verify that this proposition implies Tverberg's theorem. Embed
$\mathbf{R}^d$ into $\mathbf{R}^{d+1}$ as the hyperplane $x_{d+1} = 1$ (as in Section 1.1). A set $A \subset
\mathbf{R}^d$ thus becomes a subset of $\mathbf{R}^{d+1}$; moreover, its convex hull lies in the
$x_{d+1} = 1$ hyperplane, and thus it does not contain $0$. By Proposition 8.3.2, the
set $A$ can be partitioned into groups $A_1, \ldots, A_r$ with $\bigcap_{i=1}^{r} \mathrm{cone}(A_i) \ne \{0\}$.
The intersection of these cones thus contains a ray originating at $0$. It is
easily checked that such a ray intersects the hyperplane $x_{d+1} = 1$ and that
the intersection point is a Tverberg point for $A$. Hence it suffices to prove
Proposition 8.3.2.

Proof of Proposition 8.3.2. Let us put $N = (d+1)(r-1)$; thus, $A$ has $N+1$
points. First we define linear maps $\varphi_j\colon \mathbf{R}^{d+1} \to \mathbf{R}^N$, $j = 1, 2, \ldots, r$. We group
the coordinates in the image space $\mathbf{R}^N$ into $r-1$ blocks by $d+1$ coordinates
each. For $j = 1, 2, \ldots, r-1$, $\varphi_j(x)$ is the vector having the coordinates of $x$
in the $j$th block and zeros in the other blocks; symbolically,

$$\varphi_j(x) = (\underbrace{0 \mid 0 \mid \cdots \mid 0}_{(j-1)\times} \mid x \mid 0 \mid \cdots \mid 0).$$

The last mapping, $\varphi_r$, has $-x$ in each block: $\varphi_r(x) = (-x \mid -x \mid \cdots \mid -x)$.
These maps have the following property: For any $r$ vectors $u_1, \ldots, u_r \in
\mathbf{R}^{d+1}$,

$$\sum_{j=1}^{r} \varphi_j(u_j) = 0 \ \text{ holds if and only if } \ u_1 = u_2 = \cdots = u_r. \tag{8.1}$$

Indeed, this can be easily seen by expressing

$$\sum_{j=1}^{r} \varphi_j(u_j) = (u_1 - u_r \mid u_2 - u_r \mid \cdots \mid u_{r-1} - u_r).$$
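The maps $\varphi_j$ and property (8.1) are easy to experiment with. The following sketch represents vectors as Python lists or tuples; the helper names `phi` and `lifted_sum` are ours. It places $x$ into the $j$th block for $j < r$ and $-x$ into every block for $j = r$, so that the sum of the images consists of the blocks $u_j - u_r$.

```python
def phi(j, x, r):
    """phi_j(x) as a list of length N = (d+1)(r-1), for j = 1, ..., r (1-based)."""
    m = len(x)                              # m = d + 1
    if j == r:
        # The last map puts -x into each of the r-1 blocks.
        return [-c for _ in range(r - 1) for c in x]
    out = [0] * (m * (r - 1))
    out[(j - 1) * m:j * m] = x              # x goes into the j-th block
    return out


def lifted_sum(us):
    """Sum of phi_j(u_j) over j = 1, ..., r, where r = len(us)."""
    r = len(us)
    total = [0] * (len(us[0]) * (r - 1))
    for j, u in enumerate(us, start=1):
        total = [a + b for a, b in zip(total, phi(j, u, r))]
    return total


# Equal vectors give the zero vector; unequal vectors do not.
print(lifted_sum([(1, 2), (1, 2), (1, 2)]))
print(lifted_sum([(1, 2), (3, 4), (0, 1)]))
```

With integer inputs the computation is exact, so the vanishing of the sum can be tested by plain equality.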

Next, let $A = \{a_1, \ldots, a_{N+1}\} \subset \mathbf{R}^{d+1}$ be a set with $0 \notin \mathrm{conv}(A)$. We
consider the set $M = \varphi_1(A) \cup \varphi_2(A) \cup \cdots \cup \varphi_r(A)$ in $\mathbf{R}^N$ consisting of $r$ copies of
$A$. The first $r-1$ copies are placed into mutually orthogonal coordinate subspaces
of $\mathbf{R}^N$. The last copy of each $a_i$ sums up to $0$ with the other $r-1$ copies
of $a_i$. Then we color the points of $M$ by $N+1$ colors; all copies of the same
$a_i$ get the color $i$. In other words, we set $M_i = \{\varphi_1(a_i), \varphi_2(a_i), \ldots, \varphi_r(a_i)\}$.
As we have noted, the points in each $M_i$ sum up to $0$, which means that
$0 \in \mathrm{conv}(M_i)$, and thus the assumptions of the colorful Caratheodory theorem
hold for $M_1, \ldots, M_{N+1}$.
Let $S \subseteq M$ be a rainbow set (containing one point of each $M_i$) with
$0 \in \mathrm{conv}(S)$. For each $i$, let $f(i)$ be the index of the point of $M_i$ contained
in $S$; that is, we have $S = \{\varphi_{f(1)}(a_1), \varphi_{f(2)}(a_2), \ldots, \varphi_{f(N+1)}(a_{N+1})\}$. Then
$0 \in \mathrm{conv}(S)$ means that

$$\sum_{i=1}^{N+1} \alpha_i \varphi_{f(i)}(a_i) = 0$$

for some nonnegative real numbers $\alpha_1, \ldots, \alpha_{N+1}$ summing to 1. Let $I_j$ be the
set of indices $i$ with $f(i) = j$, and set $A_j = \{a_i : i \in I_j\}$. The above sum can
be rearranged:

$$\sum_{i=1}^{N+1} \alpha_i \varphi_{f(i)}(a_i) = \sum_{j=1}^{r} \sum_{i \in I_j} \alpha_i \varphi_j(a_i) = \sum_{j=1}^{r} \varphi_j\Bigl(\sum_{i \in I_j} \alpha_i a_i\Bigr)$$

(the last equality follows from the linearity of each $\varphi_j$). Write $u_j = \sum_{i \in I_j} \alpha_i a_i$.
This is a linear combination of points of $A_j$ with nonnegative coefficients, and
hence $u_j \in \mathrm{cone}(A_j)$. Above we have derived $\sum_{j=1}^{r} \varphi_j(u_j) = 0$, and so by
(8.1) we get $u_1 = u_2 = \cdots = u_r$. Hence the common value of all the $u_j$
belongs to $\bigcap_{j=1}^{r} \mathrm{cone}(A_j)$.
It remains to check that $u_j \ne 0$. Since we assume $0 \notin \mathrm{conv}(A)$, the only
nonnegative linear combination of points of $A$ equal to $0$ is the trivial one,
with all coefficients $0$. On the other hand, since not all the $\alpha_i$ are $0$, at least
one $u_j$ is expressed as a nontrivial linear combination of points of $A$. This
proves Proposition 8.3.2 and Tverberg's theorem as well. □

The colored Tverberg theorem. If we have 9 points in the plane, 3 of
them red, 3 blue, and 3 white, it turns out that we can always partition them
into 3 triples in such a way that each triple has one red, one blue, and one
white point, and the 3 triangles determined by the triples have a nonempty
intersection.

The colored Tverberg theorem is a generalization of this statement for arbitrary
$d$ and $r$. We will need it in Section 9.2, for a result about many
simplices with a common point. In that application, the colored version is
essential (and Tverberg's theorem alone is not sufficient).

8.3.3 Theorem (Colored Tverberg theorem). For any integers $r, d \ge 2$
there exists an integer $t$ such that given any $t(d+1)$-point set $Y \subset \mathbf{R}^d$ partitioned
into $d+1$ color classes $Y_1, \ldots, Y_{d+1}$ with $t$ points each, there exist
$r$ pairwise disjoint sets $A_1, \ldots, A_r$ such that each $A_i$ contains exactly
one point of each $Y_j$, $j = 1, 2, \ldots, d+1$ (that is, the $A_i$ are rainbow), and
$\bigcap_{i=1}^{r} \mathrm{conv}(A_i) \ne \emptyset$.
Let $T_{\mathrm{col}}(d, r)$ denote the smallest $t$ for which the conclusion of the theorem
holds. It is known that $T_{\mathrm{col}}(2, r) = r$ for all $r$. It is possible that $T_{\mathrm{col}}(d, r) = r$
for all $d$ and $r$, but only weaker bounds have been proved. The strongest
known result guarantees that $T_{\mathrm{col}}(d, r) \le 2r-1$ whenever $r$ is a prime power.
Recall that in Tverberg's theorem, if we need only the existence of $T(d, r)$,
rather than the precise value, several simple arguments are available. In contrast,
for the colored version, even if we want only the existence of $T_{\mathrm{col}}(d, r)$,
there is essentially only one type of proof, which is not easy and which uses
topological methods. Since such methods are not considered in this book, we
have to omit a proof of the colored Tverberg theorem.

Bibliography and remarks. Tverberg's theorem was conjectured
by Birch and proved by Tverberg (really!) [Tve66]. His original proof is
technically complicated, but the idea is simple: Start with some point
configuration for which the theorem is valid and convert it to a given
configuration by moving one point at a time. During the movement,
the current partition may stop working at some point, and it must be
shown that it can be replaced by another suitable partition by a local
change.
Later on, Tverberg found a simpler proof [Tve81]. For the proof
presented in the text above, the main idea is due to Sarkaria [Sar92],
and our presentation is based on a simplification by Onn (see [BO97]).
Another proof, also due to Tverberg and inspired by the proof of the
colorful Caratheodory theorem, was published in a paper by Tverberg
and Vrecica [TV93]. Here is an outline.
Let $\pi = (A_1, A_2, \ldots, A_r)$ be a partition of $(d+1)(r-1)+1$ given
points into $r$ disjoint nonempty subsets. Consider a ball intersecting
all the sets $\mathrm{conv}(A_j)$, $j = 1, 2, \ldots, r$, whose radius $\rho = \rho(\pi)$ is
the smallest possible. By a suitable general position assumption, it
can be assured that the smallest ball is always unique for any partition.
(Alternatively, among all balls of the smallest possible radius,
one can take the one with the lexicographically smallest center, which
again guarantees uniqueness.) If $\rho(\pi) = 0$, then $\pi$ is a Tverberg partition.
Supposing that $\rho(\pi) > 0$, it can be shown that $\pi$ can be locally
changed (by reassigning one point from one class to another) to another
partition $\pi'$ with $\rho(\pi') < \rho(\pi)$. Another proof, based on a similar
idea, was found by Roudneff [Rou01a]. Instead of $\rho(\pi)$, he considers
$w(\pi) = \min_{x \in \mathbf{R}^d} w(\pi, x)$, where $w(\pi, x) = \sum_{i=1}^{r} \mathrm{dist}(x, \mathrm{conv}(A_i))^2$.
He actually proves a "cone version" of Tverberg's theorem (but different
from our cone version and stronger).
Several extensions of Tverberg's theorem are known or conjectured.
Here we mention only two conjectures related to the dimension of the
set of Tverberg points. For $X \subset \mathbf{R}^d$, let $T_r(X)$ denote the set of all
Tverberg points for $r$-partitions of $X$ (the points of $T_r(X)$ are usually
called $r$-divisible). Reay [Rea68] conjectured that if $X$ is in general
position and has $k$ more points than is generally necessary for the
existence of a Tverberg $r$-partition, i.e., $|X| = (d+1)(r-1) + 1 + k$,
then $\dim T_r(X) \ge k$. This holds under various strong general position
assumptions, and special cases for small $k$ have also been established
(see Roudneff [Rou01a], [Rou01b]). Kalai asked the following sophisticated
question in 1974: Does $\sum_{r=1}^{|X|} \dim T_r(X) \ge 0$ hold for every
finite $X \subset \mathbf{R}^d$? Here $\dim \emptyset = -1$, and so the nonexistence of Tverberg
$r$-partitions for large $r$ must be compensated by sufficiently large dimensions
of $T_r(X)$ for small $r$. Together with other interesting aspects
of Tverberg's theorem, this is briefly discussed in Kalai's lively survey
[Kal01]. There he also notes that edge 3-colorability of a 3-regular
graph can be reformulated as the existence of a Tverberg 3-partition
of a suitable high-dimensional point set. This implies that deciding
whether $T_3(X) = \emptyset$ for a $(2d+3)$-point $X \subset \mathbf{R}^d$ is NP-complete.
It is interesting to note that Tverberg's theorem implies the centerpoint
theorem (Theorem 1.4.2). More generally, if $x$ is an $r$-divisible
point of a finite $X \subset \mathbf{R}^d$, then each closed half-space containing $x$
contains at least $r$ points of $X$ (at least one from each of the $r$ parts);
in particular, if $|X| = n$ and $r = \lceil \frac{n}{d+1} \rceil$, we get that every $r$-divisible
point is a centerpoint. On the other hand, as an example of Avis
[Avi93] in $\mathbf{R}^3$ shows, a point $x$ such that each closed half-space $h$ containing
$x$ satisfies $|h \cap X| \ge r$ need not be $r$-divisible in general; these
two properties are equivalent only in the plane.
A conjecture of Sierksma asserts that the number of Tverberg partitions
for a set of $(r-1)(d+1)+1$ points in $\mathbf{R}^d$ in general position is at
least $((r-1)!)^d$. A lower bound of $\frac{1}{(r-1)!}\left(\frac{r}{2}\right)^{(r-1)(d+1)/2}$, provided that
$r \ge 3$ is a prime number, was proved by Vucic and Zivaljevic [VZ93]
by an ingenious topological argument.
The colored Tverberg theorem was conjectured by Barany, Füredi,
and Lovasz [BFL90], who also proved the planar case. The general
case was established by Zivaljevic and Vrecica [ZV92]; simplified proofs
were given later by Björner, Lovasz, Zivaljevic, and Vrecica [BLZV94]
and by Matousek [Mat96a] (using a method of Sarkaria). As was mentioned
in the text, all these proofs are topological. They show that
$T_{\mathrm{col}}(d, r) \le 2r-1$ for $r$ a prime. Recently, this was extended to all
prime powers $r$ by Zivaljevic [Ziv98] (a similar approach in a different
problem was used earlier by Ozaydin, by Sarkaria, and by Volovikov).
Barany and Larman [BL92] proved that $T_{\mathrm{col}}(2, r) = r$ for all $r$.
We outline a beautiful topological proof, due to Lovasz (reproduced
in [BL92]), showing that $T_{\mathrm{col}}(d, 2) = 2$ for all $d$. Let $X$ be the surface of
the $(d+1)$-dimensional crosspolytope. We recall that the crosspolytope
is the convex hull of $V = \{e_1, -e_1, e_2, -e_2, \ldots, e_{d+1}, -e_{d+1}\}$, where
$e_1, e_2, \ldots, e_{d+1}$ is the standard orthonormal basis in $\mathbf{R}^{d+1}$. Note that
$X$ consists of $2^{d+1}$ simplices of dimension $d$, each of them the convex
hull of $d+1$ points of $V$. Let $Y_i = \{u_i, v_i\} \subset \mathbf{R}^d$, $i = 1, 2, \ldots, d+1$, be
the given two-point color classes. Define the mapping $f\colon V \to \mathbf{R}^d$ by
setting $f(e_i) = u_i$, $f(-e_i) = v_i$. This mapping has a unique extension
$\bar{f}\colon X \to \mathbf{R}^d$ such that $\bar{f}$ is affine on each of the $d$-dimensional simplices
mentioned above. This $\bar{f}$ is a continuous mapping of $X \to \mathbf{R}^d$. Since
$X$ is homeomorphic to the $d$-dimensional sphere $S^d$, the Borsuk-Ulam
theorem guarantees that there is an $x \in X$ such that $\bar{f}(x) = \bar{f}(-x)$. If
$V_1 \subset V$ is the vertex set of a $d$-dimensional simplex containing $x$, then
$V_1 \cap (-V_1) = \emptyset$, $-x \in \mathrm{conv}(-V_1)$, and as is easy to check, $S_1 = f(V_1)$
and $S_2 = f(-V_1)$ are vertex sets of intersecting rainbow simplices
($\bar{f}(x) = \bar{f}(-x)$ is a common point).

Exercises
1. Prove (directly, without using Tverberg's theorem) that for any integers
$d, r_1, r_2 \ge 2$, we have $T(d, r_1 r_2) \le T(d, r_1)\,T(d, r_2)$.
2. For each $r \ge 2$ and $d \ge 2$, find $(d+1)(r-1)$ points in $\mathbf{R}^d$ with no Tverberg
$r$-partition.
3. Prove that Tverberg's theorem implies Proposition 8.3.2. Why is the
assumption $0 \notin \mathrm{conv}(A)$ necessary in Proposition 8.3.2?
4. (a) Derive the following Radon-type theorem (use Radon's lemma): For
every $d \ge 1$ there exists $\ell = \ell(d)$ such that every $\ell$ points in $\mathbf{R}^d$ in general
position can be partitioned into two disjoint subsets $A$, $B$ such that not
only $\mathrm{conv}(A) \cap \mathrm{conv}(B) \ne \emptyset$, but this property is preserved by deleting
any single point; that is, $\mathrm{conv}(A \setminus \{a\}) \cap \mathrm{conv}(B) \ne \emptyset$ for each $a \in A$ and
$\mathrm{conv}(A) \cap \mathrm{conv}(B \setminus \{b\}) \ne \emptyset$ for each $b \in B$.
(b) Show that $\ell(2) \ge 7$.
Remark. The best known value of $\ell(d)$ is $2d+3$; this was established by
Larman [Lar72], and his proof is difficult. The original question is, What
is the largest $n = n(k)$ such that every $n$ points in $\mathbf{R}^k$ in general position
can be brought to a convex position by some projective transform? Both
formulations are related via the Gale transform.
5. Show that for any $d, r \ge 1$ there is an $(N+1)$-point set in $\mathbf{R}^d$ in general
position, $N = (d+1)(r-1)$, having no more than $((r-1)!)^d$ Tverberg
partitions.
6. Why does Tverberg's theorem imply the centerpoint theorem (Theorem
1.4.2)?

9 Geometric Selection Theorems

As in Chapter 3, the common theme of this chapter is geometric Ramsey
theory. Given $n$ points, or other geometric objects, where $n$ is large, we want
to select a not too small subset forming a configuration that is "regular" in
some sense.
As was the case for the Erdős-Szekeres theorem, it is not difficult to prove
the existence of a "regular" configuration via Ramsey's theorem in some of
the subsequent results, but the size of that configuration is very small. The
proofs we are going to present give much better bounds. In many cases we
obtain "positive-fraction theorems": The regular configuration has size at
least $cn$, where $n$ is the number of the given objects and $c$ is a positive
constant independent of $n$.
In the proofs we encounter important purely combinatorial results: a weak
version of the Szemerédi regularity lemma and a theorem of Erdős and Simonovits
on the number of complete $k$-partite subhypergraphs in dense
$k$-uniform hypergraphs. We also apply tools from Chapter 8, such as Tverberg's
theorem.

9.1 A Point in Many Simplices: The First Selection Lemma

Consider $n$ points in the plane in general position, and draw all the $\binom{n}{3}$
triangles with vertices at the given points. Then there exists a point of the
plane common to at least $\frac{2}{9}\binom{n}{3}$ of these triangles. Here $\frac{2}{9}$ is the optimal
constant; the proof below, which establishes a similar statement in arbitrary
dimension, gives a considerably smaller constant.

For easier formulations we introduce the following terminology: If $X \subset \mathbf{R}^d$
is a finite set, an $X$-simplex is the convex hull of some $(d+1)$-tuple of points
of $X$. We make the convention that $X$-simplices are in bijective correspondence
with their vertex sets. This means that two $X$-simplices determined by
two distinct $(d+1)$-point subsets of $X$ are considered different even if they
coincide as subsets of $\mathbf{R}^d$. Thus, the $X$-simplices form a multiset in general.
This concerns only sets $X$ in degenerate positions; if $X$ is in general position,
then distinct $(d+1)$-point sets have distinct convex hulls.
9.1.1 Theorem (First selection lemma). Let X be an n-point set in $\mathbf{R}^d$.
Then there exists a point $a \in \mathbf{R}^d$ (not necessarily belonging to X) contained
in at least $c_d \binom{n}{d+1}$ X-simplices, where $c_d > 0$ is a constant depending only
on the dimension d.
The best possible value of $c_d$ is not known, except for the planar case. The
first proof below shows that for n very large, we may take $c_d \approx (d+1)^{-(d+1)}$.
The first proof: from Tverberg and colorful Carathéodory. We may
suppose that n is sufficiently large ($n \ge n_0$ for a given constant $n_0$), for
otherwise, we can set $c_d$ to be sufficiently small and choose a point contained
in a single X-simplex.
Put $r = \lceil n/(d+1) \rceil$. By Tverberg's theorem (Theorem 8.3.1), there exist
r pairwise disjoint sets $M_1, \dots, M_r \subseteq X$ whose convex hulls all have a point
in common; call this point a. (A typical $M_i$ has d+1 points, but some of them
may be smaller.)

We want to show that the point a is contained in many X-simplices (so far we
have const·n and we need const·$n^{d+1}$).
Let $J = \{j_0, \dots, j_d\} \subseteq \{1, 2, \dots, r\}$ be a set of d+1 indices. We apply the
colorful Carathéodory theorem (Theorem 8.2.1) for the (d+1) "color" sets
$M_{j_0}, \dots, M_{j_d}$, which all contain a in their convex hull. This yields a rainbow
X-simplex $S_J$ containing a and having one vertex from each of the $M_{j_i}$, as
illustrated below:

If $J' \ne J$ are two (d+1)-tuples of indices, then $S_J \ne S_{J'}$. Hence the
number of X-simplices containing the point a is at least

$$\binom{r}{d+1} = \binom{\lceil n/(d+1) \rceil}{d+1} \ge \frac{1}{(d+1)^{d+1}} \cdot \frac{n(n-(d+1)) \cdots (n-d(d+1))}{(d+1)!}.$$

For n sufficiently large, say $n \ge 2d(d+1)$, this is at least $(d+1)^{-(d+1)} 2^{-d} \binom{n}{d+1}$.
□
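To make the counted quantity concrete, here is a minimal brute-force sketch in Python for the planar case d = 2: it counts the X-triangles that contain a given point. The function names and the hexagon data are illustrative assumptions, not part of the text, and the sketch assumes X is in general position.

```python
from itertools import combinations
from math import comb

def orient(p, q, r):
    """Sign of det(q - p, r - p): +1 for a left turn, -1 for a right turn, 0 if collinear."""
    det = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (det > 0) - (det < 0)

def triangle_contains(triangle, a):
    """True if the point a lies in the closed triangle (boundary included)."""
    p, q, r = triangle
    signs = {orient(p, q, a), orient(q, r, a), orient(r, p, a)}
    # a is outside exactly when it is strictly left of one edge and strictly right of another
    return not (1 in signs and -1 in signs)

def count_triangles_containing(points, a):
    """Number of X-triangles (one per 3-point subset of points) that contain a."""
    return sum(triangle_contains(t, a) for t in combinations(points, 3))

# Hypothetical example data: a centrally symmetric convex hexagon with integer coordinates.
hexagon = [(2, 0), (1, 2), (-1, 2), (-2, 0), (-1, -2), (1, -2)]
hits = count_triangles_containing(hexagon, (0, 0))
bound = 2 * comb(len(hexagon), 3) / 9   # the planar bound (2/9) * C(n, 3)
print(hits, ">=", bound)
```

The count is over 3-point subsets, in line with the convention that X-simplices correspond bijectively to their vertex sets. The running time is cubic in n, so this is only meant for experimenting with small point sets.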
The second proof: from fractional Helly. Let $\mathcal{F}$ denote the family of
all X-simplices. Put $N = |\mathcal{F}| = \binom{n}{d+1}$. We want to apply the fractional Helly
theorem (Theorem 8.1.1) to $\mathcal{F}$. Call a (d+1)-tuple of sets of $\mathcal{F}$ good if its
d+1 sets have a common point. To prove the first selection lemma, it suffices
to show that there are at least $\alpha \binom{N}{d+1}$ good (d+1)-tuples for some $\alpha > 0$
independent of n, since then the fractional Helly theorem provides a point
common to at least $\beta N$ members of $\mathcal{F}$.
Set $t = (d+1)^2$ and consider a t-point set $Y \subset X$. Using Tverberg's
theorem, we find that Y can be partitioned into d+1 pairwise disjoint
sets, of size d+1 each, whose convex hulls have a common point. (Tverberg's
theorem does not guarantee that the parts have size d+1, but if they
don't, we can move points from the larger parts to the smaller ones, using
Carathéodory's theorem.) Therefore, each t-point $Y \subset X$ provides at
least one good (d+1)-tuple of members of $\mathcal{F}$. Moreover, the members of this
good (d+1)-tuple are pairwise vertex-disjoint, and therefore the (d+1)-tuple
uniquely determines Y. It follows that the number of good (d+1)-tuples is at
least $\binom{n}{t} = \Omega(n^{(d+1)^2}) \ge \alpha \binom{N}{d+1}$. □

In the first proof we have used Tverberg's theorem for a large point set,
while in the second proof we applied it only to configurations of bounded size.
For the latter application, if we do not care about the constant of propor-
tionality in the first selection lemma, a weaker version of Tverberg's theorem
suffices, namely the finiteness of T(d, d+1), which can be proved by quite
simple arguments, as we have seen.
The relation of Tverberg's theorem to the first selection lemma in the
second proof somewhat resembles the derivation of macroscopic properties
in physics (pressure, temperature, etc.) from microscopic properties (laws of
motion of molecules, say). From the information about small (microscopic)
configurations we obtained a global (macroscopic) result, saying that a significant
portion of the X-simplices have a common point.
A point in the interior of many X-simplices. In applications of the
first selection lemma (or its relatives) we often need to know that there is a
point contained in the interior of many of the X-simplices. To assert anything
like that, we have to assume some kind of nondegenerate position of X. The
following lemma helps in most cases.

9.1.2 Lemma. Let $X \subset \mathbf{R}^d$ be a set of $n \ge d+1$ points in general position,
meaning that no d+1 points of X lie on a common hyperplane, and let $\mathcal{H}$ be
the set of the $\binom{n}{d}$ hyperplanes determined by the points of X. Then no point
$a \in \mathbf{R}^d$ is contained in more than $d n^{d-1}$ hyperplanes of $\mathcal{H}$. Consequently, at
most $O(n^d)$ X-simplices have a on their boundary.

Proof. For each d-tuple S whose hyperplane contains a, we choose an
inclusion-minimal set $K(S) \subseteq S$ whose affine hull contains a. We claim that
if $|K(S_1)| = |K(S_2)| = k$, then either $K(S_1) = K(S_2)$ or $K(S_1)$ and $K(S_2)$
share at most k-2 points.
Indeed, if $K(S_1) = \{x_1, \dots, x_{k-1}, x_k\}$ and $K(S_2) = \{x_1, \dots, x_{k-1}, y_k\}$,
$x_k \ne y_k$, then the affine hulls of $K(S_1)$ and $K(S_2)$ are distinct, for otherwise,
we would have k+1 points in a common (k-1)-flat, contradicting the general
position of X. But then the affine hulls intersect in the (k-2)-flat generated
by $x_1, \dots, x_{k-1}$ and containing a, and $K(S_1)$ and $K(S_2)$ are not
inclusion-minimal.
Therefore, the first k-1 points of K(S) determine the last one uniquely,
and the number of distinct sets of the form K(S) of cardinality k is at most
$n^{k-1}$. The number of hyperplanes determined by X and containing a given
k-point set $K \subset X$ is at most $n^{d-k}$, and the lemma follows by summing
over k. □

Bibliography and remarks. The planar version of the first selection
lemma, with the best possible constant $\frac{2}{9}$, was proved by Boros
and Füredi [BF84]. A generalization to an arbitrary dimension, with
the first of the two proofs given above, was found by Bárány [Bar82].
The idea of the proof of Lemma 9.1.2 was communicated to me by
János Pach.
Boros and Füredi [BF84] actually showed that any centerpoint of
X works; that is, it is contained in at least $\frac{2}{9}\binom{n}{3}$ X-triangles. Wagner
and Welzl (private communication) observed that a centerpoint
works in every fixed dimension, being common to at least $c_d \binom{n}{d+1}$
X-simplices. This follows from known results on the face numbers of
convex polytopes using the Gale transform, and it provides yet another
proof of the first selection lemma, yielding a slightly better value of
the constant $c_d$ than that provided by Bárány's proof. Moreover, for
a centrally symmetric point set X this method implies that the origin
is contained in the largest possible number of X-simplices.
As for lower bounds, it is known that no n-point $X \subset \mathbf{R}^d$ in general
position has a point common to more than $\frac{1}{2^d} \binom{n}{d+1}$ X-simplices
[Bar82]. It seems that suitable sets might provide stronger lower
bounds, but no results in this direction are known.

9.2 The Second Selection Lemma


In this section we continue using the term X-simplex in the sense of
Section 9.1; that is, an X-simplex is the convex hull of a (d+1)-point subset
of X. In that section we saw that if X is a set in $\mathbf{R}^d$ and we consider all the
X-simplices, then at least a fixed fraction of them have a point in common.
What if we do not have all, but many X-simplices, some $\alpha$-fraction of all?
It turns out that still many of them must have a point in common, as stated
in the second selection lemma below.

9.2.1 Theorem (Second selection lemma). Let X be an n-point set
in $\mathbf{R}^d$ and let $\mathcal{F}$ be a family of $\alpha \binom{n}{d+1}$ X-simplices, where $\alpha \in (0,1]$ is a
parameter. Then there exists a point contained in at least

$$c\,\alpha^{s_d} \binom{n}{d+1}$$

X-simplices of $\mathcal{F}$, where $c = c(d) > 0$ and $s_d$ are constants.


This result is already interesting for $\alpha$ fixed. But for the application that
motivated the discovery of the second selection lemma, namely, trying to
bound the number of k-sets (see Chapter 11), the dependence of the bound
on $\alpha$ is important, and it would be nice to determine the best possible values
of the exponent $s_d$.
For d = 1 it is not too difficult to obtain an asymptotically sharp bound
(see Exercise 1). For d = 2 the best known bound (probably still not
sharp) is as follows: If $|\mathcal{F}| = n^{3-\nu}$, then there is a point contained in at
least $\Omega(n^{3-3\nu} / \log^5 n)$ X-triangles of $\mathcal{F}$. In the parameterization as in
Theorem 9.2.1, this means that $s_2$ can be taken arbitrarily close to 3, provided
that $\alpha$ is sufficiently small, say $\alpha \le n^{-\delta}$ for some $\delta > 0$. For higher
dimensions, the best known proof gives $s_d \approx (4d+1)^{d+1}$.
Hypergraphs. It is convenient to formulate some of the subsequent con-
siderations in the language of hypergraphs. Hypergraphs are a generalization
of graphs where edges can have more than 2 points (from another point of
view, a hypergraph is synonymous with a set system). A hypergraph is a pair
H = (V, E), where V is the vertex set and $E \subseteq 2^V$ is a system of subsets of
V, the edge set. A k-uniform hypergraph has all edges of size k (so a graph is
a 2-uniform hypergraph). A k-partite hypergraph is one where the vertex set
can be partitioned into k subsets $V_1, V_2, \dots, V_k$, the classes, so that each edge
contains at most one point from each $V_i$. The notions of subhypergraph and
isomorphism are defined analogously to those for graphs. A subhypergraph
is obtained by deleting some vertices and some edges (all edges containing
the deleted vertices, but possibly more). An isomorphism is a bijection of the
vertex sets that maps edges to edges in both directions (a renaming of the
vertices).
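The definitions above translate directly into a small data structure. The following Python sketch is an illustration under assumptions: the representation (edges as frozensets) and all function names are choices made here, not notation from the text.

```python
from itertools import product

def complete_multipartite(classes):
    """Edges of the complete k-partite k-uniform hypergraph: one vertex from each class."""
    return {frozenset(choice) for choice in product(*classes)}

def is_k_uniform(edges, k):
    """Every edge has exactly k vertices."""
    return all(len(e) == k for e in edges)

def is_k_partite(edges, classes):
    """Each edge contains at most one vertex from each class."""
    return all(len(e & set(c)) <= 1 for e in edges for c in classes)

def delete_vertices(vertices, edges, removed):
    """Subhypergraph obtained by deleting some vertices and every edge containing one of them."""
    removed = set(removed)
    return set(vertices) - removed, {e for e in edges if not (e & removed)}

# Hypothetical example: three classes of two vertices each.
classes = [{"a1", "a2"}, {"b1", "b2"}, {"c1", "c2"}]
edges = complete_multipartite(classes)
vertices = set().union(*classes)
print(len(edges), is_k_uniform(edges, 3), is_k_partite(edges, classes))
```

With three classes of size two, `complete_multipartite` produces the 2·2·2 = 8 triples that pick one vertex per class. A general subhypergraph may drop further edges as well; `delete_vertices` removes only the forced ones.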
Proof of the second selection lemma. The proof is somewhat similar
to the second proof of the first selection lemma (Theorem 9.1.1). We again
use the fractional Helly theorem. We need to show that many (d+1)-tuples
of X -simplices of F are good (have nonempty intersections).

We can view $\mathcal{F}$ as a (d+1)-uniform hypergraph. That is, we regard X as
the vertex set and each X-simplex corresponds to an edge, i.e., a subset of X
of size d+1. This hypergraph captures the "combinatorial type" of the family
$\mathcal{F}$, and a specific placement of the points of X in $\mathbf{R}^d$ then gives a concrete
"geometric realization" of $\mathcal{F}$.
First, let us concentrate on the simpler task of exhibiting at least one good
(d+1)-tuple; even this seems quite nontrivial. Why cannot we proceed as in
the second proof of the first selection lemma? Let us give a concrete example
with d = 2. Following that proof, we would consider 9 points in $\mathbf{R}^2$, and
Tverberg's theorem would provide a partition into triples with intersecting
convex hulls:

But it can easily happen that one of these triples, say {a, b, c}, is not an edge
of our hypergraph. Tverberg's theorem gives us no additional information on
which triples appear in the partition, and so this argument would guarantee
a good triple only if all the triples on the considered 9 points were contained
in $\mathcal{F}$. Unfortunately, a 3-uniform hypergraph on n vertices can contain more
than half of all possible $\binom{n}{3}$ triples without containing all triples on some 9
points (even on 4 points). This is a "higher-dimensional" version of the fact
that the complete bipartite graph on $\frac{n}{2} + \frac{n}{2}$ vertices has about $\frac{1}{4} n^2$ edges
without containing a triangle.
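One way to see the claim about 3-uniform hypergraphs on a small case is the classical construction attributed to Turán, which is brought in here as an assumption (the text does not say which example it has in mind). Split the vertices into three parts and take all triples with one vertex in each part, together with all triples having two vertices in part i and one in part i+1 (mod 3). The Python sketch below builds it on 9 vertices and checks both properties by brute force; the function names are illustrative.

```python
from itertools import combinations
from math import comb

def turan_construction(m):
    """3-uniform hypergraph on 3*m vertices; vertex v belongs to part v % 3."""
    n = 3 * m
    edges = set()
    for triple in combinations(range(n), 3):
        parts = sorted(v % 3 for v in triple)
        if parts == [0, 1, 2]:
            edges.add(frozenset(triple))      # one vertex in each part
        elif parts in ([0, 0, 1], [1, 1, 2], [0, 2, 2]):
            edges.add(frozenset(triple))      # two in part i, one in part i+1 (mod 3)
    return n, edges

def has_complete_4(n, edges):
    """True if some 4 vertices span all 4 of their triples."""
    return any(
        all(frozenset(t) in edges for t in combinations(quad, 3))
        for quad in combinations(range(n), 4)
    )

n, edges = turan_construction(3)
print(len(edges), "of", comb(n, 3), "triples; complete on 4 points:", has_complete_4(n, edges))
```

For larger m the edge density of this construction tends to 5/9, which is still above one half.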
Hypergraphs with many edges need not contain complete hypergraphs,
but they have to contain complete multipartite hypergraphs. For example, a
graph on n vertices with significantly more than $n^{3/2}$ edges contains $K_{2,2}$,
the complete bipartite graph on 2 + 2 vertices (see Section 4.5). Concerning
hypergraphs, let $K^{d+1}(t)$ denote the complete (d+1)-partite (d+1)-uniform
hypergraph with t vertices in each of its d+1 vertex classes. The illustration
shows a $K^3(4)$; only three edges are drawn as a sample, although of course,
all triples connecting vertices at different levels are present.

If t is a constant and we have a (d+1)-uniform hypergraph on n vertices
with sufficiently many edges, then it has to contain a copy of $K^{d+1}(t)$ as a
subhypergraph. We do not formulate this result precisely, since we will need
a stronger one later.

In geometric language, given a family $\mathcal{F}$ of sufficiently many X-simplices,
we can color some t points of X red, some other t points blue, ..., t points
by color (d+1) in such a way that all the rainbow X-simplices on the (d+1)t
colored points are present in $\mathcal{F}$. And in such a situation, if t is a sufficiently
large constant, the colored Tverberg theorem (Theorem 8.3.3) with r = d+1
claims that we can find a (d+1)-tuple of vertex-disjoint rainbow X-simplices
whose convex hulls intersect, and so there is a good (d+1)-tuple! In fact, these
are the considerations that led to the formulation of the colored Tverberg
theorem.
For the fractional Helly theorem, we need not only one but many good
(d+1)-tuples. We use an appropriate stronger hypergraph result, saying that
if a hypergraph has enough edges, then it contains many copies of $K^{d+1}(t)$:

9.2.2 Theorem (The Erdős-Simonovits theorem). Let d and t be positive
integers. Let $\mathcal{H}$ be a (d+1)-uniform hypergraph on n vertices and with
$\alpha \binom{n}{d+1}$ edges, where $\alpha \ge C n^{-1/t^d}$ for a certain sufficiently large constant C.
Then $\mathcal{H}$ contains at least

$$c\,\alpha^{t^{d+1}} n^{(d+1)t}$$

copies of $K^{d+1}(t)$, where $c = c(d, t) > 0$ is a constant.

For completeness, a proof is given at the end of this section.
Note that in particular, the theorem implies that a (d+1)-uniform
hypergraph having at least a constant fraction of all possible edges contains at
least a constant fraction of all possible copies of $K^{d+1}(t)$.
We can now finish the proof of the second selection lemma by double
counting. The given family $\mathcal{F}$, viewed as a (d+1)-uniform hypergraph, has
$\alpha \binom{n}{d+1}$ edges, and thus it contains at least $c \alpha^{t^{d+1}} n^{(d+1)t}$ copies of $K^{d+1}(t)$
by Theorem 9.2.2. As was explained above, each such copy contributes at
least one good (d+1)-tuple of vertex-disjoint X-simplices of $\mathcal{F}$. On the other
hand, d+1 vertex-disjoint X-simplices have together $(d+1)^2$ vertices, and
hence their vertex set can be extended to a vertex set of some $K^{d+1}(t)$ (which
has t(d+1) vertices) in at most $n^{t(d+1)-(d+1)^2} = n^{(t-d-1)(d+1)}$ ways. This is
the maximum number of copies of $K^{d+1}(t)$ that can give rise to the same
good (d+1)-tuple. Hence there are at least $c \alpha^{t^{d+1}} n^{(d+1)^2}$ good (d+1)-tuples
of X-simplices of $\mathcal{F}$. By the fractional Helly theorem, at least $c' \alpha^{t^{d+1}} n^{d+1}$
X-simplices of $\mathcal{F}$ share a common point, with $c' = c'(d) > 0$. This proves the
second selection lemma, with the exponent $s_d \le (4d+1)^{d+1}$. □
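The exponent bookkeeping in the double-counting step can be written out as a single line. This is only a restatement of the division described above (copies of $K^{d+1}(t)$ divided by the maximum number of copies per good tuple), with the same constants c and t:

```latex
\[
\#\{\text{good } (d+1)\text{-tuples}\}
\;\ge\; \frac{c\,\alpha^{t^{d+1}}\, n^{(d+1)t}}{n^{(t-d-1)(d+1)}}
\;=\; c\,\alpha^{t^{d+1}}\, n^{(d+1)t-(t-d-1)(d+1)}
\;=\; c\,\alpha^{t^{d+1}}\, n^{(d+1)^2},
\]
since $(d+1)t-(t-d-1)(d+1) = (d+1)\bigl(t-(t-d-1)\bigr) = (d+1)^2$.
```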

Proof of the Erdős-Simonovits theorem (Theorem 9.2.2). By induction
on k, we are going to show that a k-uniform hypergraph on n vertices
and with m edges contains at least $f_k(n, m)$ copies of $K^k(t)$, where

$$f_k(n, m) = c_k n^{tk} \left( \frac{m}{n^k} \right)^{t^k} - C_k n^{t(k-1)},$$

with $c_k > 0$ and $C_k$ suitable constants depending on k and also on t (t is
not shown in the notation, since it remains fixed). This claim with k = d+1
implies the Erdős-Simonovits theorem.
For k = 1, the claim holds.
So let k > 1 and let $\mathcal{H}$ be k-uniform with vertex set V, |V| = n, and edge
set E, |E| = m. For a vertex $v \in V$, define a (k-1)-uniform hypergraph $\mathcal{H}_v$
on V, whose edges are all edges of $\mathcal{H}$ that contain v, but with v deleted; that
is, $\mathcal{H}_v = (V, \{e \setminus \{v\} : e \in E,\ v \in e\})$. Further, let $\mathcal{H}'$ be the (k-1)-uniform
hypergraph whose edge set is the union of the edge sets of all the $\mathcal{H}_v$.
Let $\mathcal{K}$ denote the set of all copies of the complete (k-1)-partite hypergraph
$K^{k-1}(t)$ in $\mathcal{H}'$. The key notion in the proof is that of an extending
vertex for a copy $K \in \mathcal{K}$: A vertex $v \in V$ is extending for a $K \in \mathcal{K}$ if K is
contained in $\mathcal{H}_v$, or in other words, if for each edge e of K, $e \cup \{v\}$ is an edge
in $\mathcal{H}$. The picture below shows a $K^2(2)$ and an extending vertex for it (in a
3-uniform hypergraph).

The idea is to count the number of all pairs (K, v), where $K \in \mathcal{K}$ and v is an
extending vertex of K, in two ways.
On the one hand, if a fixed copy $K \in \mathcal{K}$ has $q_K$ extending vertices, then
it contributes $\binom{q_K}{t}$ distinct copies of $K^k(t)$ in $\mathcal{H}$. We note that one copy of
$K^k(t)$ comes from at most O(1) distinct $K \in \mathcal{K}$ in this way, and therefore it
suffices to bound $\sum_{K \in \mathcal{K}} \binom{q_K}{t}$ from below.
On the other hand, for a fixed vertex v, the hypergraph $\mathcal{H}_v$ contains at
least $f_{k-1}(n, m_v)$ copies $K \in \mathcal{K}$ by the inductive assumption, where $m_v$ is
the number of edges of $\mathcal{H}_v$. Hence

$$\sum_{K \in \mathcal{K}} q_K \ge \sum_{v \in V} f_{k-1}(n, m_v).$$

Using $\sum_{v \in V} m_v = km$, the convexity of $f_{k-1}$ in the second variable, and
Jensen's inequality (see page xvi), we obtain

$$\sum_{K \in \mathcal{K}} q_K \ge n f_{k-1}(n, km/n). \qquad (9.1)$$

To conclude the proof, we define a convex function extending the binomial
coefficient $\binom{x}{t}$ to the domain $\mathbf{R}$:

$$g(x) = \begin{cases} 0 & \text{for } x \le t-1, \\[2pt] \dfrac{x(x-1)\cdots(x-t+1)}{t!} & \text{for } x > t-1. \end{cases}$$

We want to bound $\sum_{K \in \mathcal{K}} g(q_K)$ from below, and we have the bound (9.1) for
$\sum_{K \in \mathcal{K}} q_K$. Using the bound $|\mathcal{K}| \le n^{t(k-1)}$ (clear, since $K^{k-1}(t)$ has t(k-1)
vertices) and Jensen's inequality, we derive that the number of copies of $K^k(t)$
in $\mathcal{H}$ is at least

$$c\, n^{t(k-1)}\, g\!\left( \frac{n f_{k-1}(n, km/n)}{n^{t(k-1)}} \right).$$

A calculation finishes the induction step; we omit the details. □
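The role of the function g in the last step can be checked numerically. The following Python sketch is an illustration only (the name `g` with an explicit parameter `t` and the sample values are assumptions): it evaluates the convex extension of the binomial coefficient, compares it with exact binomial coefficients at integers, and tries Jensen's inequality on a small list of values.

```python
from math import comb, factorial

def g(x, t):
    """Convex extension of C(x, t) to real x: 0 for x <= t-1, else x(x-1)...(x-t+1)/t!."""
    if x <= t - 1:
        return 0.0
    prod = 1.0
    for i in range(t):
        prod *= x - i
    return prod / factorial(t)

t = 3
# Hypothetical numbers of extending vertices q_K for five copies K.
q = [0, 1, 5, 7, 3]
mean = sum(q) / len(q)
lhs = sum(g(x, t) for x in q)      # equals the sum of C(q_K, t) because the q_K are integers
rhs = len(q) * g(mean, t)          # Jensen: sum of g(q_K) >= |K| * g(average of q_K)
print(lhs, ">=", rhs)
```

The cutoff at x = t-1 is what makes g convex on all of R: the polynomial x(x-1)...(x-t+1) alone changes sign between 0 and t-1.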
Bibliography and remarks. The second selection lemma was
conjectured, and proved in the planar case, by Bárány, Füredi, and
Lovász [BFL90]. The missing part for higher dimensions was the colored
Tverberg theorem (discussed in Section 8.3). A proof for the
planar case by a different technique, with considerably better quantitative
bounds than can be obtained by the method shown above, was
given by Aronov, Chazelle, Edelsbrunner, Guibas, Sharir, and Wenger
[ACE+91] (the bounds were mentioned in the text). The full proof of
the second selection lemma for arbitrary dimension appears in Alon,
Bárány, Füredi, and Kleitman [ABFK92].
Several other "selection lemmas," sometimes involving geometric
objects other than simplices, were proved by Chazelle, Edelsbrunner,
Guibas, Hershberger, Seidel, and Sharir [CEG+94].
Theorem 9.2.2 is from Erdős and Simonovits [ES83].

Exercises
1. (a) Prove a one-dimensional selection lemma: Given an n-point set $X \subset \mathbf{R}$
and a family $\mathcal{F}$ of $\alpha \binom{n}{2}$ X-intervals, there exists a point common
to $\Omega(\alpha^2 \binom{n}{2})$ intervals of $\mathcal{F}$. What is the best value of the constant of
proportionality you can get? IT]
(b) Show that this result is sharp (up to the value of the multiplicative
constant) in the full range of $\alpha$. III
2. (a) Show that the exponent $s_2$ in the second selection lemma in the plane
cannot be smaller than 2. III
(b) Show that $s_3 \ge 2$. 8J Can you also show that $s_d \ge 2$?
(c) Show that the proof method via the fractional Helly theorem cannot
give a better value of $s_2$ than 3 in Theorem 9.2.1. That is, construct an
n-point set and $\alpha \binom{n}{3}$ triangles on it in such a way that no more than
$O(\alpha^5 n^9)$ triples of these triangles have a point in common. III

9.3 Order Types and the Same-Type Lemma


The order type of a set. There are infinitely many 4-point sets in the
plane in general position, but there are only two "combinatorially distinct"
types of such sets:
[Figure: four points in convex position, and a triangle with the fourth point in its interior.]
What is an appropriate equivalence relation that would capture the intuitive
notion of two finite point sets in $\mathbf{R}^d$ being "combinatorially the same"? We
have already encountered one suitable notion of combinatorial isomorphism
in Section 5.6. Here we describe an equivalent but perhaps more intuitive
approach based on the order type of a configuration. First we explain this
notion for planar configurations in general position, where it is quite simple.
Let $p = (p_1, p_2, \dots, p_n)$ and $q = (q_1, q_2, \dots, q_n)$ be two sequences of points
in $\mathbf{R}^2$, both in general position (no 2 points coincide and no 3 are collinear).
Then p and q have the same order type if for any indices i < j < k we turn
in the same direction (right or left) when going from $p_i$ to $p_k$ via $p_j$ and when
going from $q_i$ to $q_k$ via $q_j$:

[Figure: a left turn and a right turn.]

We say that both the triples $(p_i, p_j, p_k)$ and $(q_i, q_j, q_k)$ have the same
orientation.
If the point sequences p and q are in $\mathbf{R}^d$, we require that every (d+1)-element
subsequence of p have the same orientation as the corresponding
subsequence of q. The notion of orientation is best explained for d-tuples of
vectors in $\mathbf{R}^d$. If $v_1, \dots, v_d$ are vectors in $\mathbf{R}^d$, there is a unique linear mapping
sending the vector $e_i$ of the standard basis of $\mathbf{R}^d$ to $v_i$, i = 1, 2, ..., d. The
matrix A of this mapping has the vectors $v_1, \dots, v_d$ as the columns. The
orientation of $(v_1, \dots, v_d)$ is defined as the sign of det(A); so it can be +1
(positive orientation), -1 (negative orientation), or 0 (the vectors are linearly
dependent and lie in a (d-1)-dimensional linear subspace). For a (d+1)-tuple
of points $(p_1, p_2, \dots, p_{d+1})$, we define the orientation to be the orientation of
the d vectors $p_2 - p_1, p_3 - p_1, \dots, p_{d+1} - p_1$. Geometrically, the orientation of
a 4-tuple $(p_1, p_2, p_3, p_4)$ tells us on which side of the plane $p_1 p_2 p_3$ the point
$p_4$ lies (if $p_1, p_2, p_3, p_4$ are affinely independent).
Returning to the order type, let $p = (p_1, p_2, \dots, p_n)$ be a point sequence
in $\mathbf{R}^d$. The order type of p (also called the chirotope of p) is defined as the
mapping assigning to each (d+1)-tuple $(i_1, i_2, \dots, i_{d+1})$ of indices,
$1 \le i_1 < i_2 < \cdots < i_{d+1} \le n$, the orientation of the (d+1)-tuple $(p_{i_1}, p_{i_2}, \dots, p_{i_{d+1}})$.
Thus, the order type of p can be described by a sequence of +1's, -1's, and
0's with $\binom{n}{d+1}$ terms.
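The definition of orientation and order type is directly computable. Below is a minimal Python sketch, assuming points are given as tuples of integers (so that the determinant is exact); the function names are illustrative, and indices are 0-based rather than 1-based as in the text.

```python
from itertools import combinations

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for small d)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def orientation(points):
    """Orientation of a (d+1)-tuple of points in R^d: sign of det of the vectors p_i - p_1."""
    p1, rest = points[0], points[1:]
    d = len(p1)
    # The vectors are stored as rows; transposing does not change the determinant.
    rows = [[q[c] - p1[c] for c in range(d)] for q in rest]
    value = det(rows)
    return (value > 0) - (value < 0)

def order_type(p):
    """Map each increasing (d+1)-tuple of indices to the orientation of the chosen points."""
    d = len(p[0])
    return {idx: orientation([p[i] for i in idx])
            for idx in combinations(range(len(p)), d + 1)}

# Hypothetical examples: a square listed counterclockwise, and a triangle plus an interior point.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
inner = [(0, 0), (4, 0), (0, 4), (1, 1)]
print(order_type(square))
print(order_type(inner))
```

The two printed mappings differ, which matches the two combinatorially distinct types of 4-point planar sets in general position. Note that the order type belongs to a sequence, so relabeling the points can change the signs.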
The order type makes good sense only for point sequences in $\mathbf{R}^d$
containing some d+1 affinely independent points. Then one can read off various
properties of the sequence from its order type, such as general position, con-
vex position, and so on; see Exercise 1.

In this section we prove a powerful Ramsey-type result concerning order
types, called the same-type lemma.
Same-type transversals. Let $(Y_1, Y_2, \dots, Y_m)$ be an m-tuple of finite sets
in $\mathbf{R}^d$. By a transversal of this m-tuple we mean any m-tuple $(y_1, y_2, \dots, y_m)$
such that $y_i \in Y_i$ for all i. We say that $(Y_1, Y_2, \dots, Y_m)$ has same-type
transversals if all of its transversals have the same order type. Here is an
example of 4 planar sets with same-type transversals:

[Figure: four planar sets $Y_1$, $Y_2$, $Y_3$, $Y_4$ with same-type transversals.]
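For small planar sets, the property of having same-type transversals can be tested by brute force straight from the definition. The Python sketch below is an illustration under assumptions (integer coordinates, illustrative function names, hypothetical example sets); it enumerates all transversals and compares their order types.

```python
from itertools import combinations, product

def orient2(p, q, r):
    """Orientation of a planar triple: +1 left turn, -1 right turn, 0 collinear."""
    det = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (det > 0) - (det < 0)

def planar_order_type(seq):
    """Orientations of all triples with increasing indices, in a fixed order."""
    return tuple(orient2(seq[i], seq[j], seq[k])
                 for i, j, k in combinations(range(len(seq)), 3))

def has_same_type_transversals(sets):
    """True if every choice of one point per set gives the same order type."""
    types = {planar_order_type(t) for t in product(*sets)}
    return len(types) == 1

# Hypothetical example: three small clusters placed far apart.
clusters = [[(0, 0), (1, 0)], [(100, 0), (100, 1)], [(50, 100), (51, 100)]]
# A counterexample: the third set straddles the line through the first two.
straddling = [[(0, 0)], [(2, 0)], [(1, 1), (1, -1)]]
print(has_same_type_transversals(clusters), has_same_type_transversals(straddling))
```

The number of transversals is the product of the set sizes, so this check is practical only for small inputs.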

If $(X_1, X_2, \dots, X_m)$ are very large finite sets such that $X_1 \cup \cdots \cup X_m$
is in general position,¹ we can find not too small subsets $Y_1 \subseteq X_1, \dots,
Y_m \subseteq X_m$ such that $(Y_1, \dots, Y_m)$ has same-type transversals. To see this,
color each transversal of $(X_1, X_2, \dots, X_m)$ by its order type. Since the number
of possible order types of an m-point set in general position cannot exceed
$r = 2^{\binom{m}{d+1}}$, we have a coloring of the edges of the complete m-partite
hypergraph on $(X_1, \dots, X_m)$ by r colors. By the Erdős-Simonovits theorem
(Theorem 9.2.2), there are sets $Y_i \subseteq X_i$, not too small, such that all edges
induced by $Y_1 \cup \cdots \cup Y_m$ have the same color, i.e., $(Y_1, \dots, Y_m)$ has same-type
transversals.
As is the case for many other geometric applications of Ramsey-type theo-
rems, this result can be quantitatively improved tremendously by a geometric
argument: For m and d fixed, the size of the sets Yi can be made a constant
fraction of IXil.

9.3.1 Theorem (Same-type lemma). For any integers $d, m \ge 1$, there
exists c = c(d, m) > 0 such that the following holds. Let $X_1, X_2, \dots, X_m$ be
finite sets in $\mathbf{R}^d$ such that $X_1 \cup \cdots \cup X_m$ is in general position. Then there are
$Y_1 \subseteq X_1, \dots, Y_m \subseteq X_m$ such that the m-tuple $(Y_1, Y_2, \dots, Y_m)$ has same-type
transversals and $|Y_i| \ge c|X_i|$ for all i = 1, 2, ..., m.

Proof. First we observe that it is sufficient to prove the same-type lemma
for m = d+1. For larger m, we begin with $(X_1, X_2, \dots, X_m)$ as the current
m-tuple of sets. Then we go through all (d+1)-tuples $(i_1, i_2, \dots, i_{d+1})$ of indices,
and if $(Z_1, \dots, Z_m)$ is the current m-tuple of sets, we apply the same-type
lemma to the (d+1)-tuple $(Z_{i_1}, \dots, Z_{i_{d+1}})$. These sets are replaced by smaller
sets $(Z'_{i_1}, \dots, Z'_{i_{d+1}})$ such that this (d+1)-tuple has same-type transversals.
After this step is executed for all (d+1)-tuples of indices, the resulting current
m-tuple of sets has same-type transversals.
This method gives the rather small lower bound

$$c(d, m) \ge c(d, d+1)^{\binom{m-1}{d}}.$$

¹ This is a shorthand for saying that $X_i \cap X_j = \emptyset$ for all $i \ne j$ and $X_1 \cup \cdots \cup X_m$
is in general position.

To handle the crucial case m = d+1, we will use the following criterion
for a (d+1)-tuple of sets having same-type transversals.
9.3.2 Lemma. Let $C_1, C_2, \dots, C_{d+1} \subseteq \mathbf{R}^d$ be convex sets. The following two
conditions are equivalent:
(i) There is no hyperplane simultaneously intersecting all of $C_1, C_2, \dots, C_{d+1}$.
(ii) For each nonempty index set $I \subset \{1, 2, \dots, d+1\}$, the sets $\bigcup_{i \in I} C_i$ and
$\bigcup_{j \notin I} C_j$ can be strictly separated by a hyperplane.

Moreover, if $X_1, X_2, \dots, X_{d+1} \subset \mathbf{R}^d$ are finite sets such that the sets
$C_i = \mathrm{conv}(X_i)$ have property (i) (and (ii)), then $(X_1, \dots, X_{d+1})$ has same-type
transversals.
In particular, planar convex sets $C_1, C_2, C_3$ have no line transversal if and
only if each of them can be separated by a line from the other two. The proof
of this neat result is left to Exercise 3. We will not need the assertion that
(i) implies (ii).
Same-type lemma for d+1 sets. To prove the same-type lemma for the
case m = d+1, it now suffices to choose the sets $Y_i \subseteq X_i$ in such a way
that their convex hulls are separated in the sense of (ii) in Lemma 9.3.2.
This can be done by an iterative application of the ham-sandwich theorem
(Theorem 1.4.3).
Suppose that for some nonempty index set $I \subset \{1, 2, \dots, d+1\}$, the sets
$\mathrm{conv}(\bigcup_{i \in I} X_i)$ and $\mathrm{conv}(\bigcup_{j \notin I} X_j)$ cannot be separated by a hyperplane. For
notational convenience, we assume that $d+1 \in I$. Let h be a hyperplane
simultaneously bisecting $X_1, X_2, \dots, X_d$, whose existence is guaranteed by
the ham-sandwich theorem. Let $\gamma$ be a closed half-space bounded by h and
containing at least half of the points of $X_{d+1}$. For all $i \in I$, including i = d+1,
we discard the points of $X_i$ not lying in $\gamma$, and for $j \notin I$ we throw away the
points of $X_j$ that lie in the interior of $\gamma$ (note that points on h are never
discarded); see Figure 9.1.
[Figure 9.1 panels: initial sets; I = {3}; I = {2,3}; I = {1,3}; result.]
Figure 9.1. Proof of the same-type lemma for d = 2, m = 3.

We claim that the union of the resulting sets with indices in I is now strictly
separated from the union of the remaining sets. If h contains no points of the
sets, then it is a separating hyperplane. Otherwise, let the points contained
in h be $a_1, \dots, a_t$; we have $t \le d$ by the general position assumption. For
each $a_j$, choose a point $a'_j$ very near to $a_j$. If $a_j$ lies in some $X_i$ with $i \in I$,
then $a'_j$ is chosen in the complement of $\gamma$, and otherwise, it is chosen in the
interior of $\gamma$. We let h' be a hyperplane passing through $a'_1, \dots, a'_t$ and lying
very close to h. Then h' is the desired separating hyperplane, provided that
the $a'_j$ are sufficiently close to the corresponding $a_j$, as in the picture below:

[Figure: the hyperplane h and the nearby perturbed hyperplane h'.]
Thus, we have "killed" the index set I, at the price of halving the sizes
of the current sets; more precisely, the size of a set $X_i$ is reduced from $|X_i|$
to $\lceil |X_i|/2 \rceil$ (or larger). We can continue with the other index sets in the
same manner. After no more than $2^d - 1$ halvings, we obtain sets satisfying
the separation condition and thus having same-type transversals. The same-type
lemma is proved. The lower bound for c(d, d+1) is doubly exponential,
roughly $2^{-2^d}$. □
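The count of halvings can be made explicit. The following short derivation is a sketch of the arithmetic behind the stated bound; it assumes, as in the proof, that an index set I and its complement need to be treated only once and that each step keeps at least half of every set.

```latex
\[
\#\{\text{index sets to treat}\}
= \frac{2^{d+1}-2}{2} = 2^{d}-1,
\qquad
|Y_i| \;\ge\; \frac{|X_i|}{2^{\,2^{d}-1}}
\quad\Longrightarrow\quad
c(d,d+1) \;\ge\; 2^{-(2^{d}-1)} .
\]
```

Here $2^{d+1}-2$ is the number of nonempty proper subsets of $\{1, \dots, d+1\}$, and dividing by 2 identifies each subset with its complement.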

A simple application. We recall that by the Erdős-Szekeres theorem, for
any natural number k there is a natural number n = n(k) such that any
n-point set in the plane in general position contains a subset of k points
in convex position (forming the vertices of a convex k-gon). The same-type
lemma immediately gives the following result:

9.3.3 Theorem (Positive-fraction Erdős-Szekeres theorem). For every
integer $k \ge 4$ there is a constant $c_k > 0$ such that every sufficiently large
finite set $X \subset \mathbf{R}^2$ in general position contains k disjoint subsets $Y_1, \dots, Y_k$,
of size at least $c_k |X|$ each, such that each transversal of $(Y_1, \dots, Y_k)$ is in
convex position.

Proof. Let n = n(k) be the number as in the Erdős-Szekeres theorem. We
partition X into n sets $X_1, \dots, X_n$ of almost equal sizes, and we apply the
same-type lemma to them, obtaining sets $Y_1, \dots, Y_n$, $Y_i \subseteq X_i$, with same-type
transversals. Let $(y_1, \dots, y_n)$ be a transversal of $(Y_1, \dots, Y_n)$. By the
Erdős-Szekeres theorem, there are $i_1 < i_2 < \cdots < i_k$ such that $y_{i_1}, \dots, y_{i_k}$
are in convex position. Then $Y_{i_1}, \dots, Y_{i_k}$ are as required in the theorem. □

Bibliography and remarks. For more information on order types,
the reader can consult the survey by Goodman and Pollack [GP93].
The same-type lemma is from Bárány and Valtr [BV98], and a very
similar idea was used by Pach [Pac98]. Bárány and Valtr proved the
positive-fraction Erdős-Szekeres theorem (the case k = 4 was established
earlier by Nielsen), and they gave several more applications of
the same-type lemma, such as a positive-fraction Radon lemma and a
positive-fraction Tverberg theorem.
Another, simple proof of the positive-fraction Erdős-Szekeres theorem
was found by Pach and Solymosi [PS98b]; see Exercise 4 for an
outline.
The equivalence of (i) and (ii) in Lemma 9.3.2 is from Goodman,
Pollack, and Wenger [GPW96].
A nice strengthening of the same-type lemma was proved by Pór
[Par02]: Instead of just selecting a $Y_i$ from each $X_i$, the $X_i$ can be
completely partitioned into such $Y_i$. That is, for every d and m there
exists n = n(d, m) such that whenever $X_1, X_2, \dots, X_m \subset \mathbf{R}^d$ are finite
sets with $|X_1| = |X_2| = \cdots = |X_m|$ and with $\bigcup X_i$ in general position,
there are partitions $X_i = Y_{i1} \cup Y_{i2} \cup \cdots \cup Y_{in}$, i = 1, 2, ..., m, such that
for each j = 1, 2, ..., n, the sets $Y_{1j}, Y_{2j}, \dots, Y_{mj}$ have the same size
and same-type transversals. Schematically:

[Schematic: the rows are the sets $X_1$, $X_2$, $X_3$, $X_4$, each cut into n parts; the columns are indexed by j = 1, 2, 3, ..., n.]

(the sets in each column have same-type transversals). For the proof,
one first observes that it suffices to prove the existence of n(d, d+1);
the larger m follow as in the proof of the same-type lemma, by refining
the partitions for every (d+1)-tuple of the indices i. The key
step is showing $n(d, d+1) \le 2n(d-1, d+1)$. The $X_i$ are projected on
a generic hyperplane h and the appropriate partitions are found for
the projections by induction. Let $X'_i \subset h$ be the projection of $X_i$, let
$Y'_1, \dots, Y'_{d+1}$ be one of the "columns" in the partitions of the $X'_i$ (we
omit the index j for simpler notation), let $k = |Y'_i|$, and let $Y_i \subseteq X_i$ be
the preimage of $Y'_i$. As far as separation by hyperplanes is concerned,
the $Y'_i$ behave like d+1 points in general position in $\mathbf{R}^{d-1}$, and so there
is only one inseparable (Radon) partition (see Exercise 1.3.9), i.e., an
$I \subset \{1, 2, \dots, d+1\}$ (unique up to complementation) such that $\bigcup_{i \in I} Y'_i$
cannot be separated from $\bigcup_{i \notin I} Y'_i$. By an argument resembling proofs
of the ham-sandwich theorem, it can be shown that there is a half-space
$\gamma$ in $\mathbf{R}^d$ and a number $k_1$ such that $|\gamma \cap Y_i| = k_1$ for $i \in I$ and
$|\gamma \cap Y_i| = k - k_1$ for $i \notin I$. Letting $Z_i = Y_i \cap \gamma$ for $i \in I$ and $Z_i = Y_i \setminus \gamma$
for $i \notin I$ and $T_i = Y_i \setminus Z_i$, one obtains that $(Z_1, \dots, Z_{d+1})$ satisfy
condition (ii) in Lemma 9.3.2, and so they have same-type transversals,
and similarly for the $T_i$. A 2-dimensional picture illustrates the
construction:

[Figure: the case I = {1, 3}, showing the projections $Y'_1$, $Y'_2$, $Y'_3$.]

The problem of estimating n(d, m) (the proof produces a doubly
exponential bound) is interesting even for d = 1, and there Pór showed,
by ingenious arguments, that $n(1, m) = \Theta(m^2)$.

Exercises
1. Let $p = (p_1, p_2, \ldots, p_n)$ be a sequence of points in $\mathbf{R}^d$ containing $d+1$
affinely independent points. Explain how we can decide the following
questions, knowing the order type of $p$ and nothing else about it:
(a) Is it true that for every $k$ points among the $p_i$, $k = 2, 3, \ldots, d+1$, the
affine hull has the maximum dimension $k-1$?
(b) Does $p_{d+2}$ lie in $\mathrm{conv}(\{p_1, \ldots, p_{d+1}\})$?
(c) Are the points $p_1, \ldots, p_n$ convex independent (i.e., is each of them a
vertex of their convex hull)?
2. Let $p = (p_1, p_2, \ldots, p_n)$ be a sequence of points in $\mathbf{R}^d$ whose affine hull
is the whole of $\mathbf{R}^d$. Explain how we can determine the order type of $p$,
up to a global change of all signs, from the knowledge of $\mathrm{sgn}(\mathrm{AffVal}(p))$
(the signs of affine functions on the $p_i$; see Section 5.6).
(Conversely, $\mathrm{sgn}(\mathrm{AffVal}(p))$ can be reconstructed from the order type,
but the proof is more complicated; see, e.g., [BVS+99].)
3. (a) Prove that in the setting of Lemma 9.3.2, if the convex hulls of the
$X_i$ have property (i), then $(X_1, \ldots, X_{d+1})$ has same-type transversals.
Proceed by contradiction.
(b) Prove that property (ii) (separation) implies property (i) (no hyperplane
transversal). Proceed by contradiction and use Radon's lemma.
(c) Prove that (i) implies (ii).
4. Let $k \ge 3$ be a fixed integer.
(a) Show that for $n$ sufficiently large, any $n$-point set $X$ in general position
in the plane contains at least $cn^{2k}$ convex independent subsets of size $2k$,
for a suitable $c = c(k) > 0$.
(b) Let $S = \{p_1, p_2, \ldots, p_{2k}\}$ be a convex independent subset of $X$,
where the points are numbered along the circumference of the convex
hull in a clockwise order, say. The holder of $S$ is the set $H(S) =
\{p_1, p_3, \ldots, p_{2k-1}\}$. Show that there is a set $H$ that is the holder of at
least $\Omega(n^k)$ sets $S$.
(c) Derive that each of the indicated triangular regions of such an $H$
contains $\Omega(n)$ points of $X$:

Infer the positive-fraction Erdős-Szekeres theorem in the plane.
(d) Show that the positive-fraction Erdős-Szekeres theorem in higher
dimensions is implied by the planar version.
5. (A Ramsey-type theorem for segments)
(a) Let $L$ be a set of $n$ lines and $P$ a set of $n$ points in the plane, both
in general position and with no point of $P$ lying on any line of $L$. Prove
that we can select subsets $L' \subseteq L$, $|L'| \ge \alpha n$, and $P' \subseteq P$, $|P'| \ge \alpha n$,
such that $P'$ lies in a single cell of the arrangement of $L'$ (where $\alpha > 0$
is a suitable absolute constant). You can use the same-type lemma for
$m = 3$ (or an elementary argument).
(b) Given a set $S$ of $n$ segments and a set $L$ of $n$ lines in the plane, both
in general position and with no endpoint of a segment lying on any of
the lines, show that there exist $S' \subseteq S$ and $L' \subseteq L$, $|S'|, |L'| \ge \beta n$, with
a suitable constant $\beta > 0$, such that either each segment of $S'$ intersects
each line of $L'$ or all segments of $S'$ are disjoint from all lines of $L'$.
(c) Given a set $R$ of $n$ red segments and a set $B$ of $n$ blue segments
in the plane, with $R \cup B$ in general position, prove that there are subsets
$R' \subseteq R$, $|R'| \ge \gamma n$, and $B' \subseteq B$, $|B'| \ge \gamma n$, such that either each segment
of $R'$ intersects each segment of $B'$ or each segment of $R'$ is disjoint from
each segment of $B'$ ($\gamma > 0$ is another absolute constant). [2]
The result in (c) is due to Pach and Solymosi [PS01].

9.4 A Hypergraph Regularity Lemma


Here we consider a fine tool from the theory of hypergraphs, which we will
need for yet another version of the selection lemma in the subsequent section.
It is a result inspired by the famous Szemerédi regularity lemma for graphs.
Very roughly speaking, the Szemerédi regularity lemma says that for given
$\varepsilon > 0$, the vertex set of any sufficiently large graph $G$ can be partitioned
into some number, not too small and not too large, of parts in such a way
that the bipartite graphs between "most" pairs of the parts look like random
bipartite graphs, up to an "error" bounded by $\varepsilon$. An exact formulation is
rather complicated and is given in the notes below. The result discussed here
is a hypergraph analogue of a weak version of the Szemerédi regularity lemma.
It is easier to prove than the Szemerédi regularity lemma.
Let $\mathcal{H} = (X, E)$ be a $k$-partite hypergraph whose vertex set is the union
of $k$ pairwise disjoint $n$-element sets $X_1, X_2, \ldots, X_k$, and whose edges are
$k$-tuples containing precisely one element from each $X_i$. For subsets $Y_i \subseteq X_i$,
$i = 1, 2, \ldots, k$, let $e(Y_1, \ldots, Y_k)$ denote the number of edges of $\mathcal{H}$ contained
in $Y_1 \cup \cdots \cup Y_k$. In this notation, the total number of edges of $\mathcal{H}$ is equal to
$e(X_1, \ldots, X_k)$. Further, let
\[ \rho(Y_1, \ldots, Y_k) = \frac{e(Y_1, \ldots, Y_k)}{|Y_1| \cdot |Y_2| \cdots |Y_k|} \]
denote the density of the subhypergraph induced by the $Y_i$.
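
As a small illustration of this notation, the following Python sketch counts $e(Y_1, \ldots, Y_k)$ and computes the density, taken here to be the number of edges divided by $|Y_1| \cdots |Y_k|$. The representation is an assumption made for the example only: an edge is a tuple whose $i$-th entry is a vertex of $X_i$, so equal labels in different positions denote different vertices. The names `edge_count` and `density` are hypothetical.

```python
from itertools import product

def edge_count(edges, parts):
    """e(Y_1, ..., Y_k): edges whose i-th vertex lies in parts[i] for every i."""
    part_sets = [set(p) for p in parts]
    return sum(
        all(v in part_sets[i] for i, v in enumerate(edge))
        for edge in edges
    )

def density(edges, parts):
    """rho(Y_1, ..., Y_k) = e(Y_1, ..., Y_k) / (|Y_1| * ... * |Y_k|)."""
    total = 1
    for p in parts:
        total *= len(p)
    return edge_count(edges, parts) / total

# Toy 3-partite hypergraph with X_1 = X_2 = X_3 = {0, 1} (as labels):
# the edges are the triples with an even coordinate sum.
X = [(0, 1), (0, 1), (0, 1)]
edges = [t for t in product(*X) if sum(t) % 2 == 0]

full_density = density(edges, X)                      # density of the whole hypergraph
corner_density = density(edges, [(0,), (0,), (0,)])   # one-point parts
```

In this toy example a sub-configuration can be denser than the whole hypergraph, which is the kind of effect the selection of the $Y_i$ in the theorem below exploits.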


9.4.1 Theorem (Weak regularity lemma for hypergraphs). Let $\mathcal{H}$ be
a $k$-partite hypergraph as above, and suppose that $\rho(\mathcal{H}) \ge \beta$ for some $\beta > 0$.
Let $0 < \varepsilon < \frac{1}{2}$. Suppose that $n$ is sufficiently large in terms of $k$, $\beta$, and $\varepsilon$.
Then there exist subsets $Y_i \subseteq X_i$ of equal size $|Y_i| = s \ge \beta^{1/\varepsilon^k} n$, $i =
1, 2, \ldots, k$, such that
(i) (High density) $\rho(Y_1, \ldots, Y_k) \ge \beta$, and
(ii) (Edges on all large subsets) $e(Z_1, \ldots, Z_k) > 0$ for any $Z_i \subseteq Y_i$ with
$|Z_i| \ge \varepsilon s$, $i = 1, 2, \ldots, k$.
The following scheme illustrates the situation (but of course, the vertices
of the $Y_i$ and $Z_i$ need not be contiguous).
Proof. Intuitively, the sets $Y_i$ should be selected in such a way that the
subhypergraph induced by them is as dense as possible. We then want to
show that if there were $Z_1, \ldots, Z_k$ of size at least $\varepsilon s$ with no edges on them,
we could replace the $Y_i$ by sets with a still larger density. But if we looked at
the usual density $\rho(Y_1, \ldots, Y_k)$, we would typically get too small sets $Y_i$. The
trick is to look at a modified density parameter that slightly favors larger
sets. Thus, we define the magical density $\mu(Y_1, \ldots, Y_k)$ by
\[ \mu(Y_1, \ldots, Y_k) = |Y_1|^{\varepsilon^k} \rho(Y_1, \ldots, Y_k). \]
We choose $Y_1, \ldots, Y_k$, $Y_i \subseteq X_i$, as sets of equal size that have the maximum
possible magical density $\mu(Y_1, \ldots, Y_k)$. We denote the common size $|Y_1| =
\cdots = |Y_k|$ by $s$.
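First we derive the condition (i) in the theorem for this choice of the $Y_i$.
For the magical density $\mu$ we have
\[ \frac{e(Y_1, \ldots, Y_k)}{s^{k - \varepsilon^k}} = \mu(Y_1, \ldots, Y_k) \ge \mu(X_1, \ldots, X_k) \ge \beta n^{\varepsilon^k} \ge \beta s^{\varepsilon^k}, \]
and so $e(Y_1, \ldots, Y_k) \ge \beta s^k$, which verifies (i). Since obviously $e(Y_1, \ldots, Y_k) \le
s^k$, we have $\mu(Y_1, \ldots, Y_k) \le s^{\varepsilon^k}$. Combining with $\mu(Y_1, \ldots, Y_k) \ge \beta n^{\varepsilon^k}$
derived above, we also obtain that $s \ge \beta^{1/\varepsilon^k} n$.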
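
The selection rule can be tried out by brute force on a tiny example. The sketch below assumes the magical density has the form $|Y_1|^{\varepsilon^k} \rho(Y_1, \ldots, Y_k)$, which is the form consistent with the bounds $\mu \le s^{\varepsilon^k}$ and $\mu \ge \beta n^{\varepsilon^k}$ used in the proof, and it searches all $k$-tuples of equal-size subsets. The function names and the toy bipartite graph are hypothetical, and the theorem itself needs $n$ large, so this only illustrates the definitions and the inequality behind condition (i).

```python
from itertools import combinations, product

def density(edges, parts):
    """Number of edges inside the parts divided by the product of part sizes."""
    size = 1
    for p in parts:
        size *= len(p)
    hits = sum(all(v in p for v, p in zip(edge, parts)) for edge in edges)
    return hits / size

def magical_density(edges, parts, eps):
    """Assumed form: |Y_1|^(eps^k) * rho(Y_1, ..., Y_k), parts of equal size."""
    k = len(parts)
    return len(parts[0]) ** (eps ** k) * density(edges, parts)

def best_equal_size_parts(edges, X, eps):
    """Exhaustive search over all k-tuples of nonempty subsets of equal size."""
    n = len(X[0])
    best, best_value = None, -1.0
    for s in range(1, n + 1):
        for parts in product(*(combinations(Xi, s) for Xi in X)):
            value = magical_density(edges, parts, eps)
            if value > best_value:
                best, best_value = parts, value
    return best, best_value

# Toy case k = 2: a bipartite graph on {0,1,2,3} x {0,1,2,3} with edges a <= b.
X = [tuple(range(4)), tuple(range(4))]
edges = [(a, b) for a in X[0] for b in X[1] if a <= b]
best, value = best_equal_size_parts(edges, X, eps=0.25)
```

Because the full sets $X_1, \ldots, X_k$ are among the candidates and the chosen sets are no larger, the maximizer always has density at least that of the whole hypergraph, which is exactly the argument for (i).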
It remains to prove (ii). Since $\varepsilon s$ is a large number by the assumptions,
rounding it up to an integer does not matter in the subsequent calculations
(as can be checked by a simple but somewhat tedious analysis). In order
to simplify matters, we will thus assume that $\varepsilon s$ is an integer, and we let
$Z_1 \subseteq Y_1, \ldots, Z_k \subseteq Y_k$ be $\varepsilon s$-element sets. We want to prove $e(Z_1, \ldots, Z_k) > 0$.
We have
\[ \begin{aligned}
e(Z_1, \ldots, Z_k) = {} & e(Y_1, \ldots, Y_k) \\
& - e(Y_1 \setminus Z_1, Y_2, Y_3, \ldots, Y_k) \\
& - e(Z_1, Y_2 \setminus Z_2, Y_3, \ldots, Y_k) \\
& - e(Z_1, Z_2, Y_3 \setminus Z_3, \ldots, Y_k) \\
& - \cdots - e(Z_1, \ldots, Z_{k-1}, Y_k \setminus Z_k).
\end{aligned} \tag{9.2} \]

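We want to show that the negative terms are not too large, using the
assumption that the magical density of $Y_1, \ldots, Y_k$ is maximum. The problem
is that $Y_1, \ldots, Y_k$ maximize the magical density only among the sets of equal
size, while we have sets of different sizes in the terms. To get back to sets of
equal size, we use the following observation. If, say, $R_1$ is a randomly chosen
subset of $Y_1$ of some given size $r$, we have
\[ E[\rho(R_1, Y_2, \ldots, Y_k)] = \rho(Y_1, \ldots, Y_k), \]
where $E[\cdot]$ denotes the expectation with respect to the random choice of an
$r$-element $R_1 \subseteq Y_1$. This preservation of density by choosing a random subset is
quite intuitive, and it is not difficult to verify it by counting (Exercise 1). For
estimating the term $e(Y_1 \setminus Z_1, Y_2, \ldots, Y_k)$, we use random subsets $R_2, \ldots, R_k$
of size $(1-\varepsilon)s$ of $Y_2, \ldots, Y_k$, respectively. Thus,
\[ e(Y_1 \setminus Z_1, Y_2, \ldots, Y_k) = (1-\varepsilon) s^k \rho(Y_1 \setminus Z_1, Y_2, \ldots, Y_k)
   = (1-\varepsilon) s^k E[\rho(Y_1 \setminus Z_1, R_2, \ldots, R_k)]. \]
Now for any choice of $R_2, \ldots, R_k$, we have (with $\mu$ denoting the magical density)
\[ \begin{aligned}
\rho(Y_1 \setminus Z_1, R_2, \ldots, R_k) &= ((1-\varepsilon)s)^{-\varepsilon^k} \mu(Y_1 \setminus Z_1, R_2, \ldots, R_k) \\
&\le ((1-\varepsilon)s)^{-\varepsilon^k} \mu(Y_1, Y_2, \ldots, Y_k) \\
&= (1-\varepsilon)^{-\varepsilon^k} \rho(Y_1, \ldots, Y_k).
\end{aligned} \]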

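Therefore,
\[ e(Y_1 \setminus Z_1, Y_2, \ldots, Y_k) \le (1-\varepsilon)^{1-\varepsilon^k} e(Y_1, \ldots, Y_k). \]
To estimate the term $e(Z_1, Z_2, \ldots, Z_{i-1}, Y_i \setminus Z_i, Y_{i+1}, \ldots, Y_k)$, we use random
subsets $R_i \subset Y_i \setminus Z_i$ and $R_{i+1} \subset Y_{i+1}, \ldots, R_k \subset Y_k$, this time all of size $\varepsilon s$.
A similar calculation as before yields
\[ e(Z_1, Z_2, \ldots, Z_{i-1}, Y_i \setminus Z_i, Y_{i+1}, \ldots, Y_k) \le \varepsilon^{i-1-\varepsilon^k} (1-\varepsilon) e(Y_1, \ldots, Y_k). \]
(This estimate is also valid for $i = 1$, but it is worse than the one derived
above and it would not suffice in the subsequent calculation.) From (9.2) we
obtain that $e(Z_1, \ldots, Z_k)$ is at least $e(Y_1, \ldots, Y_k)$ multiplied by the factor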
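\[ \begin{aligned}
1 - (1-\varepsilon) - (1-\varepsilon)\varepsilon^{-\varepsilon^k} \sum_{i=2}^{k} \varepsilon^{i-1}
&= \varepsilon - \varepsilon^{1-\varepsilon^k} (1 - \varepsilon^{k-1}) \\
&= \varepsilon \bigl(1 + \varepsilon^{-\varepsilon^k} (\varepsilon^{k-1} - 1)\bigr) \\
&= \varepsilon \bigl(1 + e^{\varepsilon^k \ln(1/\varepsilon)} (\varepsilon^{k-1} - 1)\bigr) \\
&\ge \varepsilon \bigl(1 + (1 + 2\varepsilon^k \ln\tfrac{1}{\varepsilon}) (\varepsilon^{k-1} - 1)\bigr) \\
&= \varepsilon^{k+1} \bigl(\tfrac{1}{\varepsilon} - 2\ln\tfrac{1}{\varepsilon} + 2\varepsilon^{k-1} \ln\tfrac{1}{\varepsilon}\bigr) \\
&\ge \varepsilon^{k+1} \bigl(\tfrac{1}{\varepsilon} - 2\ln\tfrac{1}{\varepsilon}\bigr) \\
&> 0
\end{aligned} \]
(the inequality in the fourth line uses $e^x \le 1 + 2x$ for $0 \le x \le 1$ and the fact that $\varepsilon^{k-1} - 1 < 0$).
Theorem 9.4.1 is proved. □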
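
The density-preservation identity used in the proof (the subject of Exercise 1 below) can be checked exactly on a small example by averaging over all $r$-element subsets $R_1 \subseteq Y_1$. The following Python sketch is only a sanity check under assumed conventions, not a proof: edges are tuples with one vertex per part, and the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def density(edges, parts):
    """Exact density: edges inside the parts over the product of part sizes."""
    size = 1
    for p in parts:
        size *= len(p)
    hits = sum(all(v in p for v, p in zip(edge, parts)) for edge in edges)
    return Fraction(hits, size)

def expected_density_over_first_part(edges, parts, r):
    """Average of rho(R_1, Y_2, ..., Y_k) over all r-element R_1 in Y_1."""
    subsets = list(combinations(parts[0], r))
    total = sum(density(edges, (R1,) + tuple(parts[1:])) for R1 in subsets)
    return total / len(subsets)

# Toy example with k = 2, |Y_1| = 4, |Y_2| = 3 and five edges.
Y = ((0, 1, 2, 3), (0, 1, 2))
edges = [(0, 0), (0, 1), (1, 2), (2, 2), (3, 0)]
averages = [expected_density_over_first_part(edges, Y, r) for r in range(1, 5)]
```

Every entry of `averages` equals the density of the full pair $(Y_1, Y_2)$, since each edge survives in $R_1$ with probability $r/|Y_1|$ while the number of possible tuples shrinks by the same factor.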
Bibliography and remarks. The Szemerédi regularity lemma is
from [Sze78], and in its full glory it goes as follows: For every $\varepsilon > 0$ and
for every $k_0$, there exist $K$ and $n_0$ such that every graph $G$ on $n \ge n_0$
vertices has a partition $(V_0, V_1, \ldots, V_k)$ of the vertex set into $k+1$ parts,
$k_0 \le k \le K$, where $|V_0| \le \varepsilon n$, $|V_1| = |V_2| = \cdots = |V_k| = m$, and all but
at most $\varepsilon k^2$ of the $\binom{k}{2}$ pairs $\{V_i, V_j\}$ are $\varepsilon$-regular, which means that
for every $A \subseteq V_i$ and $B \subseteq V_j$ with $|A|, |B| \ge \varepsilon m$ we have $|\rho(A, B) -
\rho(V_i, V_j)| \le \varepsilon$. Understanding the idea of the proof is easier than
understanding the statement. The regularity lemma is an extremely
powerful tool in modern combinatorics. A survey of applications and
variations can be found in Komlós and Simonovits [KS96].
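
To make the notion of an $\varepsilon$-regular pair concrete, here is a brute-force Python check of the condition $|\rho(A, B) - \rho(V_i, V_j)| \le \varepsilon$ over all sufficiently large $A \subseteq V_i$, $B \subseteq V_j$. It is a sketch for tiny examples only (the search is exponential), it assumes $|V_i| = |V_j| = m$, and the function names and vertex labels are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def pair_density(edges, A, B):
    """rho(A, B): fraction of the |A||B| pairs (a, b) that are edges."""
    hits = sum((a, b) in edges for a in A for b in B)
    return Fraction(hits, len(A) * len(B))

def large_subsets(V, min_size):
    """All nonempty subsets of V with at least min_size elements."""
    for size in range(1, len(V) + 1):
        if size >= min_size:
            yield from combinations(V, size)

def is_regular_pair(edges, Vi, Vj, eps):
    """Exhaustively test eps-regularity of (Vi, Vj), assuming |Vi| = |Vj| = m."""
    m = len(Vi)
    base = pair_density(edges, Vi, Vj)
    return all(
        abs(pair_density(edges, A, B) - base) <= eps
        for A in large_subsets(Vi, eps * m)
        for B in large_subsets(Vj, eps * m)
    )

left, right = ("a0", "a1", "a2"), ("b0", "b1", "b2")
complete = {(a, b) for a in left for b in right}           # all 9 pairs
matching = {("a0", "b0"), ("a1", "b1"), ("a2", "b2")}      # a perfect matching
eps = Fraction(1, 3)
complete_is_regular = is_regular_pair(complete, left, right, eps)
matching_is_regular = is_regular_pair(matching, left, right, eps)
```

The complete bipartite graph passes trivially, while the perfect matching fails: single vertices are "large" for this $\varepsilon$ and $m$, and a matched pair has density 1 against an overall density of $\frac{1}{3}$.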
Our presentation of Theorem 9.4.1 essentially follows Pach [Pac98],
whose treatment is an adaptation of an approach of Komlós and Sós.
One can formulate various hypergraph analogues of the Szemerédi
regularity lemma in its full strength. For instance, for a 3-uniform
hypergraph, one can define a triple $V_1, V_2, V_3$ of disjoint subsets of
vertices to be $\varepsilon$-regular if $|\rho(A_1, A_2, A_3) - \rho(V_1, V_2, V_3)| \le \varepsilon$ for every
$A_i \subseteq V_i$ with $|A_i| \ge \varepsilon |V_i|$, and formulate a statement about a partition
of the vertex set of every 3-uniform hypergraph in which almost
all triples of classes are $\varepsilon$-regular. Such a result indeed holds, but this
formulation has significant shortcomings. For example, the Szemerédi
regularity lemma allows approximate counting of small subgraphs in
the given graph (see Exercise 3 for a simple example), which is the
key to many applications, but the notion of $\varepsilon$-regularity for triple systems
just given does not work in this way (Exercise 4). A technically
quite complicated but powerful regularity lemma for 3-uniform hypergraphs
that does admit counting of small subhypergraphs was proved
by Frankl and Rödl [FR01]. The first insight is that for triple systems,
one should not partition only vertices but also pairs of vertices.
Let us mention a related innocent-looking problem of geometric
flavor. For a point $c \in S = \{1, 2, \ldots, n\}^d$, we define a jack with center
$c$ as the set of all points of $S$ that differ from $c$ in at most 1 coordinate.
The problem, formulated by Székely, asks for the maximum possible
cardinality of a system of jacks in $S$ such that no two jacks share a line
(i.e., every two centers differ in at least 2 coordinates) and no point
is covered by $d$ jacks. It is easily seen that no more than $n^{d-1}$ jacks
can be taken, and the problem is to prove an $o(n^{d-1})$ bound for every
fixed $d$. The results of Frankl and Rödl [FR01] imply this bound for
$d = 4$, and recently Rödl and Skokan announced a positive solution
for $d = 5$ as well; these results are based on sophisticated hypergraph
regularity lemmas. A positive answer would imply the famous theorem
of Szemerédi on arithmetic progressions (see, e.g., Gowers [Gow98] for
recent work and references) and would probably provide a "purely
combinatorial" proof.

Exercises
1. Verify the equality $E[\rho(R_1, Y_2, \ldots, Y_k)] = \rho(Y_1, \ldots, Y_k)$, where the
expectation is with respect to a random choice of an $r$-element $R_1 \subseteq Y_1$.
Also derive the other similar equalities used in the proof in the text. [2]
2. (Density Ramsey-type result for segments)
(a) Let $\varepsilon > 0$ be a given positive constant. Using Exercise 9.3.5(c) and
the weak regularity lemma, prove that there exists $\beta = \beta(\varepsilon) > 0$ such
that whenever $R$ and $B$ are sets of segments in the plane with $R \cup B$ in
general position and such that the number of pairs $(r, b)$ with $r \in R$,
$b \in B$, and $r \cap b \neq \emptyset$ is at least $\varepsilon n^2$, then there are subsets $R' \subseteq R$ and
$B' \subseteq B$ such that $|R'| \ge \beta n$, $|B'| \ge \beta n$, and each $r \in R'$ intersects each
$b \in B'$.
(b) Prove the analogue of (a) for noncrossing pairs. Assuming at least $\varepsilon n^2$
pairs $(r, b)$ with $r \cap b = \emptyset$, select $R'$ and $B'$ of size $\beta n$ such that $r \cap b = \emptyset$
for each $r \in R'$ and $b \in B'$.
These results are from Pach and Solymosi [PS01].
3. (a) Let $G = (V, E)$ be a graph, and let $V$ be partitioned into classes
$V_1, V_2, V_3$ of size $m$ each. Suppose that there are no edges with both
vertices in the same $V_i$, that $|\rho(V_i, V_j) - \frac{1}{2}| \le \varepsilon$ for all $i < j$, and that
each pair $(V_i, V_j)$ is $\varepsilon$-regular (this means that $|\rho(A, B) - \rho(V_i, V_j)| \le \varepsilon$
for any $A \subseteq V_i$ and $B \subseteq V_j$ with $|A|, |B| \ge \varepsilon m$). Prove that the number
of triangles in $G$ is $(\frac{1}{8} + o(1)) m^3$, where the $o(1)$ notation refers to $\varepsilon \to 0$
(while $m$ is considered arbitrary but sufficiently large in terms of $\varepsilon$).
(b) Generalize (a) to counting the number of copies of $K_4$, where $G$ has
4 classes $V_1, \ldots, V_4$ of equal size (if all the densities are about $\frac{1}{2}$, then the
number should be $(2^{-6} + o(1)) m^4$).
4. For every $\varepsilon > 0$ and for arbitrarily large $m$, construct a 3-uniform
4-partite hypergraph with vertex classes $V_1, \ldots, V_4$, each of size $m$, that
contains no $K_4^{(3)}$ (the system of all triples on 4 vertices), but where
$|\rho(V_i, V_j, V_k) - \frac{1}{2}| \le \varepsilon$ for all $i < j < k$ and each triple $(V_i, V_j, V_k)$ is
$\varepsilon$-regular. The latter condition means $|\rho(A_i, A_j, A_k) - \rho(V_i, V_j, V_k)| \le \varepsilon$
for every $A_i \subseteq V_i$, $A_j \subseteq V_j$, $A_k \subseteq V_k$ of size at least $\varepsilon m$.

9.5 A Positive-Fraction Selection Lemma


Here we discuss a stronger version of the first selection lemma
(Theorem 9.1.1). Recall that for any $n$-point set $X \subset \mathbf{R}^d$, the first selection lemma
provides a "heavily covered" point, that is, a point contained in at least a
fixed fraction of the $\binom{n}{d+1}$ simplices with vertices in points of $X$. The
theorem below shows that we can even get a large collection of simplices with
a quite special structure. For example, in the plane, given $n$ red points, $n$
white points, and $n$ blue points, we can select $\frac{n}{12}$ red, $\frac{n}{12}$ white, and $\frac{n}{12}$ blue
points in such a way that all the red-white-blue triangles for the resulting
sets have a point in common. Here is the $d$-dimensional generalization.
9.5.1 Theorem (Positive-fraction selection lemma). For all natural
numbers $d$, there exists $c = c(d) > 0$ with the following property. Let
$X_1, X_2, \ldots, X_{d+1} \subset \mathbf{R}^d$ be finite sets of equal size, with $X_1 \cup X_2 \cup \cdots \cup X_{d+1}$
in general position. Then there is a point $a \in \mathbf{R}^d$ and subsets $Z_1 \subseteq X_1, \ldots,
Z_{d+1} \subseteq X_{d+1}$, with $|Z_i| \ge c|X_i|$, such that the convex hull of every
transversal of $(Z_1, \ldots, Z_{d+1})$ contains $a$.
As was remarked above, for $d = 2$, one can take $c = \frac{1}{12}$. There is an
elementary and not too difficult proof (which the reader is invited to discover).
In higher dimensions, the only known proof uses the weak regularity lemma
for hypergraphs.
Proof. Let $X = X_1 \cup \cdots \cup X_{d+1}$. We may suppose that all the $X_i$ are
large (for otherwise, one-point $Z_i$ will do). Let $\mathcal{F}_0$ be the set of all "rainbow"
$X$-simplices, i.e., of all transversals of $(X_1, \ldots, X_{d+1})$, where the transversals
are formally considered as sets for the moment. The size of $\mathcal{F}_0$ is, for $d$ fixed,
at least a constant fraction of $\binom{|X|}{d+1}$ (here we use the assumption that the $X_i$
are of equal size). Therefore, by the second selection lemma (Theorem 9.2.1),
there is a subset $\mathcal{F}_1 \subseteq \mathcal{F}_0$ of at least $\beta n^{d+1}$ $X$-simplices containing a common
point $a$, where $\beta = \beta(d) > 0$. (Note that we do not need the full power of the
second selection lemma here, since we deal with the complete $(d+1)$-partite
hypergraph.)
For the subsequent argument we need the common point $a$ to lie in the
interior of many of the $X$-simplices. One way of ensuring this would be
to assume a suitable strongly general position of $X$ and use a perturbation
argument for arbitrary $X$. Another, perhaps simpler, way is to apply
Lemma 9.1.2, which guarantees that $a$ lies on the boundary of at most $O(n^d)$
of the $X$-simplices of $\mathcal{F}_1$. So we let $\mathcal{F}_2 \subseteq \mathcal{F}_1$ be the $X$-simplices containing $a$
in the interior, and for a sufficiently large $n$ we still have $|\mathcal{F}_2| \ge \beta' n^{d+1}$.
Next, we consider the $(d+1)$-partite hypergraph $\mathcal{H}$ with vertex set $X$ and
edge set $\mathcal{F}_2$. We let $\varepsilon = c(d, d+2)$, where $c(d, m)$ is as in the same-type
lemma, and we apply the weak regularity lemma (Theorem 9.4.1) to $\mathcal{H}$. This
yields sets $Y_1 \subseteq X_1, \ldots, Y_{d+1} \subseteq X_{d+1}$, whose size is at least a fixed fraction of
the size of the $X_i$, and such that any subsets $Z_1 \subseteq Y_1, \ldots, Z_{d+1} \subseteq Y_{d+1}$ of size
at least $\varepsilon |Y_i|$ induce an edge; this means that there is a rainbow $X$-simplex
with vertices in the $Z_i$ and containing the point $a$.
The argument is finished by applying the same-type lemma with the $d+2$
sets $Y_1, Y_2, \ldots, Y_{d+1}$ and $Y_{d+2} = \{a\}$. We obtain sets $Z_1 \subseteq Y_1, \ldots, Z_{d+1} \subseteq
Y_{d+1}$ and $Z_{d+2} = \{a\}$ with same-type transversals, and with $|Z_i| \ge \varepsilon |Y_i|$
for $i = 1, 2, \ldots, d+1$. (Indeed, the same-type lemma guarantees that at least
one point is selected even from a 1-point set.) Now either all transversals
of $(Z_1, \ldots, Z_{d+1})$ contain the point $a$ in their convex hull or none does (use
Exercise 9.3.1(d)). But the latter possibility is excluded by the choice of the
$Y_i$ (by the weak regularity lemma). The positive-fraction selection lemma is
proved. □
It is amazing how many quite heavy tools are used in this proof. It would
be nice to find a more direct argument.

Bibliography and remarks. The planar case of Theorem 9.5.1 was
proved by Bárány, Füredi, and Lovász [BFL90] (with $c(2) \ge \frac{1}{12}$), and
the result for arbitrary dimension is due to Pach [Pac98].
10

Transversals and Epsilon Nets

Here we are going to consider problems of the following type: We have a
family $\mathcal{F}$ of geometric shapes satisfying certain conditions, and we would like
to conclude that $\mathcal{F}$ can be "pierced" by not too many points, meaning that
we can choose a bounded number of points such that each set of $\mathcal{F}$ contains at
least one of them. Such questions are sometimes called Gallai-type problems,
because of the following nice problem raised by Gallai: Let $\mathcal{F}$ be a finite family
of closed disks in the plane such that every two disks in $\mathcal{F}$ intersect. What
is the smallest number of points needed to pierce $\mathcal{F}$? For this problem, the
exact answer is known: 4 points always suffice and are sometimes necessary.
We will not cover this particular (quite difficult) result; rather, we con-
sider general methods for proving that the number of piercing points can be
bounded. These methods yield numerous results where no other proofs are
available. On the other hand, the resulting estimates are usually quite large,
and in some simpler cases (such as Gallai's problem mentioned above), spe-
cialized geometric arguments provide much better bounds.
Some of the tools introduced in this chapter are widely applicable and
sometimes more significant than the particular geometric results. Such im-
portant tools include the transversal and matching numbers of set systems,
their fractional versions (connected via the duality of linear programming),
the Vapnik-Chervonenkis dimension and ways of estimating it, and epsilon
nets.

10.1 General Preliminaries: Transversals and Matchings

Let $\mathcal{F}$ be a system of sets on a ground set $X$; both $\mathcal{F}$ and $X$ may generally
be infinite. A subset $T \subseteq X$ is called a transversal of $\mathcal{F}$ if it intersects all the
sets of $\mathcal{F}$.
The transversal number of $\mathcal{F}$, denoted by $\tau(\mathcal{F})$, is the smallest possible
cardinality of a transversal of $\mathcal{F}$.
Many combinatorial and geometric problems, some of them considered in
this chapter, can be rephrased as questions about the transversal number of
suitable set systems.
Another important parameter of a set system $\mathcal{F}$ is the packing number
(or matching number) of $\mathcal{F}$, usually denoted by $\nu(\mathcal{F})$. This is the maximum
cardinality of a system of pairwise disjoint sets in $\mathcal{F}$:
\[ \nu(\mathcal{F}) = \sup\{|\mathcal{M}| : \mathcal{M} \subseteq \mathcal{F},\ M_1 \cap M_2 = \emptyset \text{ for all } M_1, M_2 \in \mathcal{M},\ M_1 \neq M_2\}. \]
A subsystem $\mathcal{M} \subseteq \mathcal{F}$ of pairwise disjoint sets is called a packing (or a
matching; this refers to graph-theoretic matching, which is a system of pairwise
disjoint edges).
Any transversal is at least as large as any packing, and so always
\[ \nu(\mathcal{F}) \le \tau(\mathcal{F}). \]
In the reverse direction, very little can be said in general, since $\tau(\mathcal{F})$ can be
arbitrarily large even if $\nu(\mathcal{F}) = 1$. As a simple geometric example, we can
take the plane as the ground set $X$ and let the sets of $\mathcal{F}$ be $n$ lines in general
position. Then $\nu(\mathcal{F}) = 1$, since every two lines intersect, but $\tau(\mathcal{F}) \ge \frac{1}{2} n$,
because no point is contained in more than two of the lines.
Fractional packing and transversal numbers. Now we introduce another
parameter of a set system, which always lies between $\nu$ and $\tau$ and which
has proved extremely useful in arguments estimating $\tau$ or $\nu$. First we restrict
ourselves to set systems on finite ground sets.
Let $\mathcal{F}$ be a system of subsets of a finite set $X$. A fractional transversal for
$\mathcal{F}$ is a function $\varphi: X \to [0, 1]$ such that for each $S \in \mathcal{F}$, we have $\sum_{x \in S} \varphi(x) \ge 1$.
The size of a fractional transversal $\varphi$ is $\sum_{x \in X} \varphi(x)$, and the fractional
transversal number $\tau^*(\mathcal{F})$ is the infimum of the sizes of fractional transversals.
So in a fractional transversal, we can take one-third of one point, one-fifth
of another, etc., but we must put total weight of at least one full point into
every set.
Similarly, a fractional packing for $\mathcal{F}$ is a function $\psi: \mathcal{F} \to [0, 1]$ such that
for each $x \in X$, we have $\sum_{S \in \mathcal{F}:\, x \in S} \psi(S) \le 1$. So sets receive weights and the
total weight of sets containing any given point must not exceed 1. The size
of a fractional packing $\psi$ is $\sum_{S \in \mathcal{F}} \psi(S)$, and the fractional packing number
$\nu^*(\mathcal{F})$ is the supremum of the sizes of all fractional packings for $\mathcal{F}$.
It is instructive to consider the "triangle" system of 3 sets on 3 points
(the three 2-element subsets of a 3-point set), and check that $\nu = 1$,
$\tau = 2$, and $\nu^* = \tau^* = \frac{3}{2}$.
Any packing $\mathcal{M}$ yields a fractional packing (by assigning weight 1 to the
sets in $\mathcal{M}$ and 0 to others), and so $\nu \le \nu^*$. Similarly, $\tau^* \le \tau$.
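
For very small set systems the integral parameters can be found by exhaustive search. The Python sketch below does this for the "triangle" system, taken here to be the three 2-element subsets of $\{1, 2, 3\}$; the function names are hypothetical and the search is exponential, so it is meant only as a check of the definitions.

```python
from itertools import combinations

def transversal_number(ground, sets):
    """tau: size of a smallest subset of `ground` meeting every set."""
    for size in range(len(ground) + 1):
        for T in combinations(ground, size):
            chosen = set(T)
            if all(chosen & S for S in sets):
                return size
    raise ValueError("some set contains no point of the ground set")

def packing_number(sets):
    """nu: largest number of pairwise disjoint sets in the system."""
    for size in range(len(sets), 0, -1):
        for M in combinations(sets, size):
            if all(not (A & B) for A, B in combinations(M, 2)):
                return size
    return 0

ground = [1, 2, 3]
triangle = [frozenset({1, 2}), frozenset({2, 3}), frozenset({1, 3})]
tau = transversal_number(ground, triangle)
nu = packing_number(triangle)
```

Here `nu` is 1 because any two of the sets share a point, and `tau` is 2 because no single point lies in all three sets.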
We promised one parameter but introduced two: T* and v*. But they
happen to be the same.
10.1.1 Theorem. For every set system $\mathcal{F}$ on a finite ground set, we have
$\nu^*(\mathcal{F}) = \tau^*(\mathcal{F})$. Moreover, the common value is a rational number, and
there exist an optimal fractional transversal and an optimal fractional packing
attaining only rational values.
This is not a trivial result; the proof is a nice application of the duality
of linear programming. Here is the version of the linear programming duality
we need.
10.1.2 Proposition (Duality of linear programming). Let $A$ be an
$m \times n$ real matrix, $b \in \mathbf{R}^m$ a (column) vector, and $c \in \mathbf{R}^n$ a (column) vector.
Let
\[ P = \{x \in \mathbf{R}^n : x \ge 0,\ Ax \ge b\} \]
and
\[ D = \{y \in \mathbf{R}^m : y \ge 0,\ y^T A \le c^T\} \]
(the inequalities between vectors should hold in every component). If both
$P \neq \emptyset$ and $D \neq \emptyset$, then
\[ \min\{c^T x : x \in P\} = \max\{y^T b : y \in D\}; \]
in particular, both the minimum and the maximum are well-defined and
attained.
This result can be quickly proved by piecing together a larger matrix from
$A$, $b$, and $c$ and applying a suitable version of the Farkas lemma (Lemma 1.2.5)
to it (Exercise 6). It can also be derived directly from the separation theorem.
Let us remark that there are several versions of the linear programming
duality (differing, for example, in including or omitting the requirement $x \ge
0$, or replacing $Ax \ge b$ by $Ax = b$, or exchanging minima and maxima), and
they are easy to mix up.
Proof of Theorem 10.1.1. Set $n = |X|$ and $m = |\mathcal{F}|$, and let $A$ be the
$m \times n$ incidence matrix of the set system $\mathcal{F}$: Rows correspond to sets, columns
to points, and the entry corresponding to a point $p$ and a set $S$ is 1 if $p \in S$
and 0 if $p \notin S$. It is easy to check that $\nu^*(\mathcal{F})$ and $\tau^*(\mathcal{F})$ are solutions to the
following optimization problems:
\[ \tau^*(\mathcal{F}) = \min\{\mathbf{1}_n^T x : x \ge 0,\ Ax \ge \mathbf{1}_m\}, \]
\[ \nu^*(\mathcal{F}) = \max\{y^T \mathbf{1}_m : y \ge 0,\ y^T A \le \mathbf{1}_n^T\}, \]
where $\mathbf{1}_n \in \mathbf{R}^n$ denotes the (column) vector of all 1's of length $n$. Indeed, the
vectors $x \in \mathbf{R}^n$ satisfying $x \ge 0$ and $Ax \ge \mathbf{1}_m$ correspond precisely to the
fractional transversals of $\mathcal{F}$, and similarly, the $y \in \mathbf{R}^m$ with $y \ge 0$ and $y^T A \le
\mathbf{1}_n^T$ correspond to the fractional packings. There is at least one fractional
transversal, e.g., $x = \mathbf{1}_n$, and at least one fractional packing, namely, $y = 0$,
and so Proposition 10.1.2 applies and shows that $\nu^*(\mathcal{F}) = \tau^*(\mathcal{F})$.
At the same time, $\tau^*(\mathcal{F})$ is the minimum of the linear function $x \mapsto \mathbf{1}_n^T x$
over a polyhedron, and such a minimum, since it is finite, is attained at a
vertex. The inequalities describing the polyhedron have rational coefficients,
and so all vertices are rational points. □
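
The two linear programs can be made concrete on the triangle system (three points, three 2-element sets). The Python sketch below builds the incidence matrix and verifies, in exact arithmetic, that weight $\frac{1}{2}$ on every point is a fractional transversal and weight $\frac{1}{2}$ on every set is a fractional packing. Since every fractional packing has size at most that of every fractional transversal, the equal sizes certify $\nu^* = \tau^* = \frac{3}{2}$ for this system. No LP solver is used; the helper names are hypothetical and the certificates are supplied by hand.

```python
from fractions import Fraction

def incidence_matrix(points, sets):
    """Rows correspond to sets, columns to points; entry 1 iff the point is in the set."""
    return [[1 if p in S else 0 for p in points] for S in sets]

def is_fractional_transversal(A, x):
    """x >= 0 and A x >= 1 in every row (every set receives total weight >= 1)."""
    return all(v >= 0 for v in x) and all(
        sum(a * v for a, v in zip(row, x)) >= 1 for row in A
    )

def is_fractional_packing(A, y):
    """y >= 0 and y^T A <= 1 in every column (every point carries weight <= 1)."""
    columns = list(zip(*A))
    return all(w >= 0 for w in y) and all(
        sum(a * w for a, w in zip(col, y)) <= 1 for col in columns
    )

points = [1, 2, 3]
triangle = [{1, 2}, {2, 3}, {1, 3}]
A = incidence_matrix(points, triangle)

half = Fraction(1, 2)
x = [half, half, half]   # weight on points
y = [half, half, half]   # weight on sets
certified_value = sum(x) if sum(x) == sum(y) else None
```

The same pattern (exhibit a feasible solution of each program with equal objective values) is a convenient way to confirm a conjectured value of $\tau^*$ by hand.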

Remark about infinite set systems. Set systems encountered in geometry
are usually infinite. In almost all the considerations concerning
transversals, the problem can be reduced to a problem about finite sets, usually by
a simple ad hoc argument. Nevertheless, we include here a few remarks that
can aid a simple consistent treatment of the infinite case. However, they will
not be used in the sequel in any essential way.
There is no problem with the definitions of $\nu$ and $\tau$ in the infinite case, but
one has to be a little careful with the definition of $\nu^*$ and $\tau^*$ to preserve the
equality $\nu^* = \tau^*$. Everything is still fine if we have finitely many sets on an
infinite ground set: The infinite ground set can be factored into finitely many
equivalence classes, where two points are equivalent if they belong to the
same subcollection of the sets. One can choose one point in each equivalence
class and work with a finite system.
For infinitely many sets, some sort of compactness condition is certainly
needed. For example, the system of intervals $\{[i, \infty) : i = 1, 2, \ldots\}$ has,
according to any reasonable definition, $\nu^* = 1$ but $\tau^* = \infty$.
If we let $\mathcal{F}$ be a family of closed sets in a compact metric space $X$ (compact
Hausdorff space actually suffices), we can define $\nu^*(\mathcal{F})$ as $\sup_\psi \sum_{S \in \mathcal{F}} \psi(S)$,
where the supremum is over all $\psi: \mathcal{F} \to [0, 1]$ attaining only finitely many
nonzero values and such that $\sum_{S \in \mathcal{F}:\, x \in S} \psi(S) \le 1$ for each $x \in X$.
For the definition of $\tau^*$, the first attempt might be to consider all functions
$\varphi: X \to [0, 1]$ attaining only finitely many nonzero values and summing up to
at least 1 over every set. But this does not work very well: For example, if we
let $\mathcal{F}$ be the system of all compact subsets of $[0, 1]$ of Lebesgue measure $\frac{1}{2}$,
say, then $\nu^* \le 2$ but $\tau^*$ would be infinite, since any finite subset is avoided
by some member of $\mathcal{F}$. It is better to define a fractional transversal of $\mathcal{F}$ as
a Borel measure $\mu$ on $X$ such that $\mu(S) \ge 1$ for all $S \in \mathcal{F}$, and $\tau^*(\mathcal{F})$ as
the infimum of $\mu(X)$ over all such $\mu$. With this definition, the validity of the
first part of Theorem 10.1.1 is preserved; i.e., $\nu^*(\mathcal{F}) = \tau^*(\mathcal{F})$ for all systems $\mathcal{F}$
of closed sets in a compact $X$. The proof uses a little of functional analysis,
and we omit it; it can be found in [KM97a]. The rationality of $\nu^*$ and $\tau^*$ no
longer holds in the infinite case.

Bibliography and remarks. Gallai's problem about pairwise
intersecting disks mentioned at the beginning of this chapter was first
solved by Danzer in 1956, but he did not publish the solution. For
another solution and a historical account see Danzer [Dan86].
Attempting to summarize the contemporary knowledge about the
transversal number and the packing number in combinatorics would
mean taking a much larger bite than can be swallowed, so we restrict
ourselves to a few sketchy remarks. An excellent source for many
combinatorial results is Lovász's problem collection [Lov93].
A quite old result relating $\nu$ and $\tau$ is the famous König's
edge-covering theorem from 1912, asserting that $\nu(\mathcal{F}) = \tau(\mathcal{F})$ if $\mathcal{F}$ is the
system of edges of a bipartite graph (this is also easily seen to be
equivalent to Hall's marriage theorem, proved by Frobenius in 1917;
see Lovász and Plummer [LP86] for the history). On the other hand,
an appropriate generalization to systems of triples, namely, $\tau \le 2\nu$
for any tripartite 3-uniform hypergraph, is a celebrated recent result
of Aharoni [Aha01] (based on Aharoni and Haxell [AH00]), while the
generalization $\tau \le (k-1)\nu$ for $k$-partite $k$-uniform hypergraphs, known
as Ryser's conjecture, remains unproved for $k \ge 4$.
While computing $\nu$ or $\tau$ for a given $\mathcal{F}$ is well known to be NP-hard,
$\tau^*$ can be computed in time polynomial in $|X| + |\mathcal{F}|$ by linear
programming (this is another reason for the usefulness of the
fractional parameter). The problem of approximating $\tau$ is practically very
important and has received considerable attention. More often it is
considered in the dual form, as the set cover problem: Given $\mathcal{F}$ with
$\bigcup \mathcal{F} = X$, find the smallest subcollection $\mathcal{F}' \subseteq \mathcal{F}$ that still covers $X$.
The size of such $\mathcal{F}'$ is the transversal number of the set system dual
to $(X, \mathcal{F})$, where each set $S \in \mathcal{F}$ is assigned a point $y_S$ and each point
$x \in X$ gives rise to the set $\{y_S : x \in S\}$.
For the set cover problem, it was shown by Chvátal and independently
by Lovász that the greedy algorithm (always take a set covering
the maximum possible number of yet uncovered points) achieves a
solution whose size is at most $(1 + \ln |X|)$ times the size of the
optimal one.$^1$ Lovász actually observed that the proof implies, for any
finite set system $\mathcal{F}$,
\[ \tau(\mathcal{F}) \le \tau^*(\mathcal{F}) \cdot (1 + \ln \Delta(\mathcal{F})), \]
where $\Delta(\mathcal{F})$ is the maximum degree of $\mathcal{F}$, i.e., the maximum number of
sets with a common point (Exercise 4). The weaker bound with $\Delta(\mathcal{F})$
replaced by $|\mathcal{F}|$ is easy to prove by a probabilistic argument (Exercise 3).
It shows that in order to have a large gap between $\tau^*$ and $\tau$, the set
system must have very many sets.
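
The greedy rule, in the transversal form analyzed in Exercise 4 below, is easy to state in code. The following Python sketch repeatedly picks a point lying in the largest number of not yet hit sets; ties are broken by taking the smallest point, which is an arbitrary choice made here for determinism. The function name and the example (the five edges of a 5-cycle) are hypothetical, and all sets are assumed nonempty with mutually comparable points.

```python
def greedy_transversal(sets):
    """Greedily build a transversal: always take a point in the most uncovered sets."""
    uncovered = [set(S) for S in sets]
    chosen = []
    while uncovered:
        counts = {}
        for S in uncovered:
            for p in S:
                counts[p] = counts.get(p, 0) + 1
        # max() returns the first maximizer, so sorting breaks ties by smallest point
        best = max(sorted(counts), key=counts.get)
        chosen.append(best)
        uncovered = [S for S in uncovered if best not in S]
    return chosen

# Edges of a 5-cycle: every point has degree 2 and the transversal number is 3.
cycle_edges = [{1, 2}, {2, 3}, {3, 4}, {4, 5}, {1, 5}]
greedy = greedy_transversal(cycle_edges)
```

On this example the greedy transversal has 3 points, which happens to be optimal; in general the greedy solution can exceed the optimum, and the bounds quoted above limit by how much.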

Exercises
1. (a) Find examples of set systems with $\tau^*$ bounded by a constant and $\tau$
arbitrarily large.
(b) Find examples of set systems with $\nu$ bounded by a constant and $\nu^*$
arbitrarily large.
2. Let $\mathcal{F}$ be a system of finitely many closed intervals on the real line. Prove
that $\nu(\mathcal{F}) = \tau(\mathcal{F})$.
3. Prove that
\[ \tau(\mathcal{F}) \le \tau^*(\mathcal{F}) \cdot \ln(|\mathcal{F}| + 1) \]
for all (finite) set systems $\mathcal{F}$. Choose a transversal as a random sample.
4. (Analysis of the greedy algorithm for transversal) Let $\mathcal{F}$ be a finite set
system. We choose points $x_1, x_2, \ldots, x_t$ of a transversal one by one: $x_i$ is
taken as a point contained in the maximum possible number of uncovered
sets (i.e., sets of $\mathcal{F}$ containing none of $x_1, \ldots, x_{i-1}$).
(a) Prove that the size $t$ of the resulting transversal satisfies

where $d = \Delta(\mathcal{F})$ is the maximum degree of $\mathcal{F}$ and $\nu_k(\mathcal{F})$ is the maximum
size of a simple $k$-packing in $\mathcal{F}$. A subsystem $\mathcal{M} \subseteq \mathcal{F}$ is a simple $k$-packing
if $\Delta(\mathcal{M}) \le k$ (so $\nu_1(\mathcal{F}) = \nu(\mathcal{F})$).
(b) Conclude that $\tau(\mathcal{F}) \le t \le \tau^*(\mathcal{F}) \cdot \sum_{k=1}^{d} \frac{1}{k}$. [2]
5. König's edge-covering theorem asserts that if $E$ is the set of edges of a
bipartite graph, then $\nu(E) = \tau(E)$. Hall's marriage theorem states that if
$G$ is a bipartite graph with color classes $A$ and $B$ such that every subset
$S \subseteq A$ has at least $|S|$ neighbors in $B$, then there is a matching in $G$
containing all vertices of $A$.
(a) Derive König's edge-covering theorem from Hall's marriage theorem.
(b) Derive Hall's marriage theorem from König's edge-covering theorem.
6. Let $A$, $b$, $c$, $P$, and $D$ be as in Proposition 10.1.2.
(a) Check that $c^T x \ge y^T b$ for all $x \in P$ and all $y \in D$.
(b) Prove that if $P \neq \emptyset$ and $D \neq \emptyset$, then the system $Ax \ge b$,
$y^T A \le c^T$, $c^T x \le y^T b$ has a nonnegative solution $x, y$ (which implies
Proposition 10.1.2). Apply the version of the Farkas lemma as in
Exercise 1.2.7(b).

$^1$ As a part of a very exciting development in complexity theory, it was
recently proved that no polynomial-time algorithm can do better in general unless
P = NP; see, e.g., [Hoc96] for proofs and references.

10.2 Epsilon Nets and VC-Dimension


Large sets should be easier to hit by a transversal than small ones. The notion
of $\varepsilon$-net and the related theory elaborate on this intuition. We begin with a
special case, where the ground set is finite and the size of a set is simply
measured as the cardinality.

10.2.1 Definition (Epsilon net, a special case). Let $(X, \mathcal{F})$ be a set
system with $X$ finite and let $\varepsilon \in [0,1]$ be a real number. A set $N \subseteq X$ (not
necessarily one of the sets of $\mathcal{F}$) is called an $\varepsilon$-net for $(X, \mathcal{F})$ if $N \cap S \ne \emptyset$ for
all $S \in \mathcal{F}$ with $|S| \ge \varepsilon |X|$.
So an $\varepsilon$-net is a transversal for all sets of size at least $\varepsilon |X|$. Sometimes it is
convenient to write $\frac{1}{r}$ instead of $\varepsilon$, with $r \ge 1$ a real parameter. A beautiful
result (Theorem 10.2.4 below) describes a simple combinatorial condition
on the structure of $\mathcal{F}$ that guarantees the existence of $\frac{1}{r}$-nets of size only
$O(r \log r)$ for all $r \ge 2$.
If we want to deal with infinite sets, measuring the size as the number
of points is no longer appropriate. For example, a "large" subset of the unit
square could naturally be defined as one with large Lebesgue measure. So
in general we consider an arbitrary probability measure $\mu$ on the ground
set. In concrete situations we will most often encounter $\mu$ concentrated on
finitely many points. This means that there is a finite set $Y \subseteq X$ and a
positive function $w\colon Y \to (0,1]$ with $\sum_{y \in Y} w(y) = 1$, and $\mu$ is given by
$\mu(A) = \sum_{y \in A \cap Y} w(y)$. In particular, if the weights of all points $y \in Y$ are
the same, i.e., $\frac{1}{|Y|}$, we speak of the uniform measure on $Y$. Another common
example of $\mu$ is a suitable multiple of the Lebesgue measure restricted to
some geometric figure.

10.2.2 Definition (Epsilon net). Let $X$ be a set, let $\mu$ be a probability
measure on $X$, let $\mathcal{F}$ be a system of $\mu$-measurable subsets of $X$, and let
$\varepsilon \in [0,1]$ be a real number. A subset $N \subseteq X$ is called an $\varepsilon$-net for $(X, \mathcal{F})$
with respect to $\mu$ if $N \cap S \ne \emptyset$ for all $S \in \mathcal{F}$ with $\mu(S) \ge \varepsilon$.
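The definition can be checked mechanically when the measure is concentrated on finitely many points. The following is a minimal sketch under that assumption; the function name `is_epsilon_net` and the representation of the measure as a dictionary of weights are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def is_epsilon_net(net, sets, weight, eps):
    """Return True if `net` meets every set of measure at least eps.

    `weight` maps each point y of the finite support Y to its weight w(y);
    the measure of a set S is the sum of w(y) over the points y of S in Y.
    """
    net = set(net)
    for s in sets:
        measure = sum(weight.get(y, 0) for y in s)
        if measure >= eps and not (net & set(s)):
            return False
    return True

# Uniform measure on X = {0, ..., 9}; the sets are all intervals {i, ..., j}.
X = range(10)
weight = {x: Fraction(1, 10) for x in X}
intervals = [set(range(i, j + 1)) for i in X for j in X if i <= j]

print(is_epsilon_net({2, 5, 8}, intervals, weight, Fraction(3, 10)))  # True
print(is_epsilon_net({2, 8}, intervals, weight, Fraction(3, 10)))     # False
```

Exact fractions are used for the weights so that the comparison with $\varepsilon$ is not affected by floating-point rounding.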
VC-dimension. In order to describe the result promised above, about the
existence of small $\varepsilon$-nets, we need to introduce a parameter of a set system
called the Vapnik-Chervonenkis dimension, or VC-dimension for short. Its
applications are much wider than for the existence of $\varepsilon$-nets.
Let $\mathcal{F}$ be a set system on $X$ and let $Y \subseteq X$. We define the restriction of
$\mathcal{F}$ on $Y$ (also called the trace of $\mathcal{F}$ on $Y$) as

$$\mathcal{F}|_Y = \{S \cap Y : S \in \mathcal{F}\}.$$

It may happen that several distinct sets in $\mathcal{F}$ have the same intersection with
$Y$; in such a case, the intersection is still present only once in $\mathcal{F}|_Y$.
10.2.3 Definition (VC-dimension). Let $\mathcal{F}$ be a set system on a set $X$.
Let us say that a subset $A \subseteq X$ is shattered by $\mathcal{F}$ if each of the subsets of $A$
can be obtained as the intersection of some $S \in \mathcal{F}$ with $A$, i.e., if $\mathcal{F}|_A = 2^A$.
We define the VC-dimension of $\mathcal{F}$, denoted by $\dim(\mathcal{F})$, as the supremum of
the sizes of all finite shattered subsets of $X$. If arbitrarily large subsets can
be shattered, the VC-dimension is $\infty$.
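For a finite set system, the trace, shattering, and the VC-dimension can be computed by brute force directly from these definitions. The sketch below assumes a small ground set and a nonempty system (the search is exponential); the function names are illustrative.

```python
from itertools import combinations

def trace(sets, subset):
    """Restriction of the system to `subset`: the distinct intersections."""
    y = frozenset(subset)
    return {frozenset(s) & y for s in sets}

def is_shattered(sets, subset):
    """A is shattered iff the trace on A consists of all 2^|A| subsets of A."""
    return len(trace(sets, subset)) == 2 ** len(subset)

def vc_dimension(ground, sets):
    """Largest size of a shattered subset of a finite ground set."""
    ground = list(ground)
    dim = 0
    for size in range(1, len(ground) + 1):
        if any(is_shattered(sets, a) for a in combinations(ground, size)):
            dim = size
        else:
            break  # every subset of a shattered set is shattered, so we may stop
    return dim

# All intervals {i, ..., j} on the ground set {0, ..., 5}.
intervals = [set(range(i, j + 1)) for i in range(6) for j in range(i, 6)]
print(vc_dimension(range(6), intervals))  # 2
```

Intervals shatter two points but never three, since for $a < b < c$ no interval contains $a$ and $c$ while avoiding $b$.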
Let us consider two examples. First, let $\mathcal{H}$ be the system of all closed
half-planes in the plane. We claim that $\dim(\mathcal{H}) = 3$. If we have 3 points in
general position, each of their subsets can be cut off by a half-plane, and so
such a 3-point set is shattered. Next, let us check that no 4-point set can be
shattered. Up to possible degeneracies, there are only two essentially different
positions of 4 points in the plane:

[Figure: the two essentially different positions of 4 points in the plane, each with some points drawn black and the others white.]

In both these cases, if the black points are contained in a half-plane, then
a white point also lies in that half-plane, and so the 4 points are not
shattered. This is a rather ad hoc argument, and later we will introduce tools
for bounding the VC-dimension in geometric situations. We will see that
bounded VC-dimension is rather common for families of simple geometric
objects in Euclidean spaces.
A rather different example is the system $\mathcal{K}_2$ of all convex sets in the plane.
Here the VC-dimension is infinite, since any finite convex independent set $A$
is shattered: Each $B \subseteq A$ can be expressed as the intersection of $A$ with a
convex set, namely, $B = A \cap \operatorname{conv}(B)$.
We can now formulate the promised result about small $\varepsilon$-nets.

10.2.4 Theorem (Epsilon net theorem). If $X$ is a set with a probability
measure $\mu$, $\mathcal{F}$ is a system of $\mu$-measurable subsets of $X$ with $\dim(\mathcal{F}) \le d$,
$d \ge 2$, and $r \ge 2$ is a parameter, then there exists a $\frac{1}{r}$-net for $(X, \mathcal{F})$ with
respect to $\mu$ of size at most $C d r \ln r$, where $C$ is an absolute constant.
The proof below gives the estimate $C \le 20$, but a more accurate calculation
shows that $C$ can be taken arbitrarily close to 1 for sufficiently large $r$.
More precisely, for any $d \ge 2$ there exists an $r_0 > 1$ such that for all $r > r_0$,
each set system of VC-dimension $d$ admits a $\frac{1}{r}$-net of size at most $d r \ln r$.
Moreover, this bound is tight in the worst case up to smaller-order terms.
For the proof (and also later on) we need a fundamental lemma bounding
the number of distinct sets in a system of given VC-dimension. First we define
the shatter function of a set system $\mathcal{F}$ by

$$\pi_{\mathcal{F}}(m) = \max_{Y \subseteq X,\ |Y| = m} |\mathcal{F}|_Y|.$$

In words, $\pi_{\mathcal{F}}(m)$ is the maximum possible number of distinct intersections of
the sets of $\mathcal{F}$ with an $m$-point subset of $X$.
10.2.5 Lemma (Shatter function lemma). For any set system $\mathcal{F}$ of
VC-dimension at most $d$, we have $\pi_{\mathcal{F}}(m) \le \Phi_d(m)$ for all $m$, where
$\Phi_d(m) = \binom{m}{0} + \binom{m}{1} + \cdots + \binom{m}{d}$.
Thus, the shatter function for any set system is either $2^m$ for all $m$ (the
case of infinite VC-dimension) or it is bounded by a fixed polynomial.
For $d$ fixed and $m \to \infty$, $\Phi_d(m)$ can be simply estimated by $O(m^d)$. For
more precise calculations, where we are interested in the dependence on $d$,
we can use the estimate $\Phi_d(m) \le \left(\frac{em}{d}\right)^d$, where $e$ is the base of natural
logarithms. This is valid for all $m \ge d \ge 1$.
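To get a feeling for the sizes involved, $\Phi_d(m)$ and the estimate $(em/d)^d$ can be tabulated directly. This is a small illustrative sketch; the helper name `phi` is an assumption, and the estimate is only compared for pairs with $m \ge d$.

```python
from math import comb, e

def phi(d, m):
    """Phi_d(m) = C(m,0) + C(m,1) + ... + C(m,d)."""
    return sum(comb(m, i) for i in range(d + 1))

# Compare Phi_d(m) with the estimate (e*m/d)^d for a few pairs with m >= d.
for d, m in [(2, 5), (3, 10), (3, 100), (10, 1000)]:
    print(d, m, phi(d, m), (e * m / d) ** d)
```

For $m \le d$ the sum contains all binomial coefficients $\binom{m}{i}$, so `phi(d, m)` equals $2^m$, in agreement with the fact that every subset of an $m$-point set may occur.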
Proof of Lemma 10.2.5. Since VC-dimension does not increase by passing
to a subsystem, it suffices to show that any set system of VC-dimension
at most $d$ on an $n$-point set has no more than $\Phi_d(n)$ sets. We proceed by
induction on $d$, and for a fixed $d$ we use induction on $n$.
Consider a set system $(X, \mathcal{F})$ with $|X| = n$ and $\dim(\mathcal{F}) = d$, and fix some
$x \in X$. In the induction step, we would like to remove $x$ and pass to the
set system $\mathcal{F}_1 = \mathcal{F}|_{X \setminus \{x\}}$ on $n-1$ points. This $\mathcal{F}_1$ has VC-dimension at
most $d$, and hence $|\mathcal{F}_1| \le \Phi_d(n-1)$ by the inductive hypothesis. How many
more sets can $\mathcal{F}$ have compared to $\mathcal{F}_1$? The only way that the number of sets
decreases by removing $x$ is when two sets $S, S' \in \mathcal{F}$ give rise to the same set
in $\mathcal{F}_1$, which means that $S' = S \cup \{x\}$, $x \notin S$, or the other way round. This
suggests that we define an auxiliary set system $\mathcal{F}_2$ consisting of all sets in $\mathcal{F}_1$
that correspond to such pairs $S, S' \in \mathcal{F}$: $\mathcal{F}_2 = \{S \in \mathcal{F} : x \notin S,\ S \cup \{x\} \in \mathcal{F}\}$.
By the above discussion, we have $|\mathcal{F}| = |\mathcal{F}_1| + |\mathcal{F}_2|$. Crucially, we observe
that $\dim(\mathcal{F}_2) \le d-1$, since if $A \subseteq X \setminus \{x\}$ is shattered by $\mathcal{F}_2$, then $A \cup \{x\}$ is
shattered by $\mathcal{F}$. Therefore, $|\mathcal{F}_2| \le \Phi_{d-1}(n-1)$. The resulting recurrence has
already been solved in the first proof of Proposition 6.1.1. □
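For convenience, here is a sketch of how the recurrence resolves, using Pascal's identity $\binom{n-1}{i} + \binom{n-1}{i-1} = \binom{n}{i}$. It assumes the base cases $d = 0$ and $n = 0$ have been checked separately (in both, the system has at most one set, matching $\Phi_0(n) = \Phi_d(0) = 1$).

```latex
\begin{align*}
|\mathcal{F}| = |\mathcal{F}_1| + |\mathcal{F}_2|
  &\le \Phi_d(n-1) + \Phi_{d-1}(n-1) \\
  &= \sum_{i=0}^{d} \binom{n-1}{i} + \sum_{i=1}^{d} \binom{n-1}{i-1} \\
  &= \binom{n-1}{0} + \sum_{i=1}^{d} \left[ \binom{n-1}{i} + \binom{n-1}{i-1} \right] \\
  &= \binom{n}{0} + \sum_{i=1}^{d} \binom{n}{i} = \Phi_d(n).
\end{align*}
```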

The rest of the proof of the epsilon net theorem is a clever probabilistic
argument; one might be tempted to believe that it works by some magic.
First we need a technical lemma concerning the binomial distribution.

10.2.6 Lemma. Let $X = X_1 + X_2 + \cdots + X_n$, where the $X_i$ are independent
random variables, $X_i$ attaining the value 1 with probability $p$ and the value
0 with probability $1-p$. Then $\mathrm{Prob}\left[X \ge \frac{1}{2} np\right] \ge \frac{1}{2}$, provided that $np \ge 8$.
Proof. This is a routine consequence of Chernoff-type tail estimates for the
binomial distribution, and in fact, considerably stronger estimates hold. The
simple result we need can be quickly derived from Chebyshev's inequality
for $X$, stating that $\mathrm{Prob}\left[|X - \mathrm{E}[X]| \ge t\right] \le \mathrm{Var}[X]/t^2$, $t > 0$. Here $\mathrm{E}[X] = np$
and $\mathrm{Var}[X] = \sum_{i=1}^{n} \mathrm{Var}[X_i] \le np$. So

$$\mathrm{Prob}\left[X < \tfrac{1}{2} np\right] \le \mathrm{Prob}\left[|X - \mathrm{E}[X]| \ge \tfrac{1}{2} np\right] \le \frac{4}{np} \le \frac{1}{2}.$$

□
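The lemma can be sanity-checked numerically by computing the binomial tail exactly. The sketch below is only an illustration for a few parameter choices with $np = 8$; the function name is an assumption, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from math import comb, ceil

def prob_at_least_half_mean(n, p):
    """Exact Prob[X >= np/2] for X distributed as Binomial(n, p)."""
    threshold = ceil(n * p / 2)
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j)
               for j in range(threshold, n + 1))

# Three cases with np = 8, the boundary of the lemma's assumption.
for n, p in [(16, Fraction(1, 2)), (80, Fraction(1, 10)), (800, Fraction(1, 100))]:
    print(n, p, float(prob_at_least_half_mean(n, p)))
```

In all three cases the probability comes out well above $\frac{1}{2}$, which reflects the remark in the proof that considerably stronger estimates hold.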

Proof of the epsilon net theorem. Let us put $s = C d r \ln r$ (assuming
without harm that it is an integer), and let $N$ be a random sample picked by
$s$ independent random draws, where each element is drawn from $X$ according
to the probability distribution $\mu$. (So the same element can be drawn several
times; this does not really matter much, and this way of random sampling is
chosen to make calculations simpler.) The goal is to show that $N$ is a $\frac{1}{r}$-net
with a positive probability.
To simplify formulations, let us assume that all $S \in \mathcal{F}$ satisfy $\mu(S) \ge \frac{1}{r}$;
this is no loss of generality, since the smaller sets do not play any role. The
probability that the random sample $N$ misses any given set $S \in \mathcal{F}$ is at most
$\left(1 - \frac{1}{r}\right)^s \le e^{-s/r}$, and so if $s$ were at least $r \ln(|\mathcal{F}|+1)$, say, the conclusion
would follow immediately. But $r$ is typically much smaller than $|\mathcal{F}|$ (it can
be a constant, say), and so we need to do something more sophisticated.
Let $E_0$ be the event that the random sample $N$ fails to be a $\frac{1}{r}$-net, i.e.,
misses some $S \in \mathcal{F}$. We bound $\mathrm{Prob}[E_0]$ from above using the following
thought experiment.
By $s$ more independent random draws we pick another random sample
$M$.² We put $k = \frac{s}{2r}$, again assuming that it is an integer, and we let $E_1$ be
the following event:
There exists an $S \in \mathcal{F}$ with $N \cap S = \emptyset$ and $|M \cap S| \ge k$.

2 This double sampling resembles the proof of Proposition 6.5.2, and indeed these
proofs have a lot in common, although they work in different settings.
Here an explanation concerning repeated elements is needed. Formally, we
regard $N$ and $M$ as sequences of elements of $X$, with possible repetitions, so
$N = (x_1, x_2, \ldots, x_s)$, $M = (y_1, y_2, \ldots, y_s)$. The notation $|M \cap S|$ then really
means $|\{i \in \{1, 2, \ldots, s\} : y_i \in S\}|$, and so an element repeated in $M$ and lying
in $S$ is counted the appropriate number of times.
Clearly, $\mathrm{Prob}[E_1] \le \mathrm{Prob}[E_0]$, since $E_1$ requires $E_0$ plus something more.
We are going to show that $\mathrm{Prob}[E_1] \ge \frac{1}{2} \mathrm{Prob}[E_0]$. Let us investigate the
conditional probability $\mathrm{Prob}[E_1 \mid N]$, that is, the probability of $E_1$ when
$N$ is fixed and $M$ is random. If $N$ is a $\frac{1}{r}$-net, then $E_1$ cannot occur, and
$\mathrm{Prob}[E_0 \mid N] = \mathrm{Prob}[E_1 \mid N] = 0$.
So suppose that there exists an $S \in \mathcal{F}$ with $N \cap S = \emptyset$. There may
be many such $S$, but let us fix one of them and denote it by $S_N$. We have
$\mathrm{Prob}[E_1 \mid N] \ge \mathrm{Prob}[|M \cap S_N| \ge k]$. The quantity $|M \cap S_N|$ behaves like
the random variable $X$ in Lemma 10.2.6 with $n = s$ and $p = \frac{1}{r}$ (here
$np = \frac{s}{r} = C d \ln r \ge 8$ once $C$ is large enough), and so
$\mathrm{Prob}[|M \cap S_N| \ge k] \ge \frac{1}{2}$. Hence $\mathrm{Prob}[E_0 \mid N] \le 2\, \mathrm{Prob}[E_1 \mid N]$ for all $N$,
and thus $\mathrm{Prob}[E_0] \le 2\, \mathrm{Prob}[E_1]$.
Next, we are going to bound $\mathrm{Prob}[E_1]$ differently. Instead of choosing
$N$ and $M$ at random directly as above, we first make a sequence
$A = (z_1, z_2, \ldots, z_{2s})$ of $2s$ independent random draws from $X$. Then, in the
second stage, we randomly choose $s$ positions in $A$ and put the elements at
these positions into $N$, and the remaining elements into $M$ (so there are $\binom{2s}{s}$
possibilities for $A$ fixed). The resulting distribution of $N$ and $M$ is the same
as above. We now prove that for every fixed $A$, the conditional probability
$\mathrm{Prob}[E_1 \mid A]$ is small. This implies that $\mathrm{Prob}[E_1]$ is small, and therefore
$\mathrm{Prob}[E_0]$ is small as well.
So let $A$ be fixed. First let $S \in \mathcal{F}$ be a fixed set and consider the
conditional probability $P_S = \mathrm{Prob}[N \cap S = \emptyset,\ |M \cap S| \ge k \mid A]$. If $|A \cap S| < k$,
then $P_S = 0$. Otherwise, we bound $P_S \le \mathrm{Prob}[N \cap S = \emptyset \mid A]$. The latter is
the probability that a random sample of $s$ positions out of $2s$ in $A$ avoids the
at least $k$ positions occupied by elements of $S$. This is at most

$$\frac{\binom{2s-k}{s}}{\binom{2s}{s}} \le \left(1 - \frac{k}{2s}\right)^{s} \le e^{-(k/2s)s} = e^{-k/2} = e^{-(C d \ln r)/4} = r^{-Cd/4}.$$

This was an estimate of $P_S$ for a fixed $S \in \mathcal{F}$. Now, finally, we use the
assumption about the VC-dimension of $\mathcal{F}$, via the shatter function lemma:
The sets of $\mathcal{F}$ have at most $\Phi_d(2s)$ distinct intersections with $A$. Since the
event "$N \cap S = \emptyset$ and $|M \cap S| \ge k$" depends only on $A \cap S$, it suffices to
consider at most $\Phi_d(2s)$ distinct sets $S$, and so for every fixed $A$,

$$\mathrm{Prob}[E_1 \mid A] \le \Phi_d(2s) \cdot r^{-Cd/4} \le \left(\frac{2es}{d}\right)^{d} r^{-Cd/4} = \left(2eC\, r \ln r \cdot r^{-C/4}\right)^{d} < \frac{1}{2}$$

if $d, r \ge 2$ and $C$ is sufficiently large. So $\mathrm{Prob}[E_0] \le 2\, \mathrm{Prob}[E_1] < 1$, which
proves Theorem 10.2.4. □
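The random sampling at the heart of the proof is easy to try out. The sketch below is an experiment, not a verification of the theorem: it assumes the ground set $\{0, \ldots, n-1\}$ with the uniform measure and the system of all intervals (VC-dimension 2), and the constants `C` tried are arbitrary choices, since the proof gives no specific small value. All function names are illustrative.

```python
import math
import random

def draw_sample(n, r, d=2, C=1.0, seed=0):
    """Draw s = ceil(C*d*r*ln r) points of {0,...,n-1} uniformly, with repetition."""
    rng = random.Random(seed)
    s = math.ceil(C * d * r * math.log(r))
    return [rng.randrange(n) for _ in range(s)]

def is_net_for_intervals(sample, n, r):
    """True iff every interval {i,...,j} with at least n/r elements contains a sample point."""
    bounds = [-1] + sorted(set(sample)) + [n]
    # Longest run of consecutive ground-set elements missed by the sample.
    longest_gap = max(b - a - 1 for a, b in zip(bounds, bounds[1:]))
    return longest_gap * r < n

n, r, trials = 10_000, 20, 200
for C in (0.5, 1.0, 2.0):
    good = sum(is_net_for_intervals(draw_sample(n, r, C=C, seed=t), n, r)
               for t in range(trials))
    print(f"C = {C}: {good}/{trials} samples were 1/r-nets")
```

For intervals the net condition reduces to a bound on the longest gap between consecutive sample points, which is why no enumeration of the sets is needed here.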
The epsilon net theorem implies that for set systems of small VC-dimension,
the gap between the fractional transversal number and the transversal
number cannot be too large.
10.2.7 Corollary. Let $\mathcal{F}$ be a finite set system on a ground set $X$ with
$\dim(\mathcal{F}) \le d$. Then we have

$$\tau(\mathcal{F}) \le C d\, \tau^*(\mathcal{F}) \ln \tau^*(\mathcal{F}),$$

where $C$ is as in the epsilon net theorem.

Proof. Let $r = \tau^*(\mathcal{F})$. Since $\mathcal{F}$ is finite, we may assume that an optimal
fractional transversal $\varphi\colon X \to [0,1]$ is concentrated on a finite set $Y$. This $\varphi$,
after rescaling, defines a probability measure $\mu$ on $X$, by letting
$\mu(\{y\}) = \frac{1}{r} \varphi(y)$, $y \in Y$. Each $S \in \mathcal{F}$ has $\mu(S) \ge \frac{1}{r}$ by the definition of fractional
transversal, and so a $\frac{1}{r}$-net for $(X, \mathcal{F})$ with respect to $\mu$ is a transversal. By
the epsilon net theorem, there exists a transversal of size at most $C d r \ln r$.
□

We mention a concrete application of the corollary in the next section,
where we collect examples of set systems of bounded VC-dimension.

Bibliography and remarks. The notion of VC-dimension originated
in statistics. It was introduced by Vapnik and Chervonenkis
[VC71]. Under different names, it has also appeared in other papers
(Sauer [Sau72] and Shelah [She72]), but the work [VC71] was probably
the most influential for subsequent developments. The name VC-dimension
and some other, by now more or less standard, terminology
were introduced by Haussler and Welzl [HW87]. VC-dimension and the
related theory play an important role in several mathematical fields,
such as statistics (the theory of empirical processes), computational
learning theory, computational geometry, discrete geometry, combinatorics
of hypergraphs, and discrepancy theory.
The shatter function lemma was independently discovered in the
three already mentioned papers [VC71], [Sau72], [She72].
The shatter function, together with the dual shatter function (defined
as the shatter function of the dual set system) was introduced
and applied by Welzl [Wel88]. Implicitly, these notions were used much
earlier, and they appear in the literature under various names, such
as growth functions.
The notion of $\varepsilon$-net and the epsilon net theorem (with $X$ finite
and $\mu$ uniform) are due to Haussler and Welzl [HW87]. Their proof
is essentially the one shown in the text, and it closely follows an earlier
proof by Vapnik and Chervonenkis [VC71] concerning the related
notion of $\varepsilon$-approximations. In the same setting as in the definition of
$\varepsilon$-nets, a set $A \subseteq X$ is an $\varepsilon$-approximation for $(X, \mathcal{F})$ with respect to
$\mu$ if for all $S \in \mathcal{F}$,

$$\left| \mu(S) - \frac{|A \cap S|}{|A|} \right| \le \varepsilon.$$

So while an $\varepsilon$-net intersects each large set at least once, an $\varepsilon$-approximation
provides a "proportional representation" up to the error
of $\varepsilon$. Vapnik and Chervonenkis [VC71] proved the existence of
$\frac{1}{r}$-approximations of size $O(d r^2 \log r)$ for all set systems of VC-dimension $d$.
Komlós, Pach, and Woginger [KPW92] improved the dependence
on $d$ in the Haussler-Welzl bound on the size of $\varepsilon$-nets. The improvement
is achieved by choosing the second sample $M$ of size $t$ somewhat
larger than $s$ and doing the calculations more carefully. They also
proved an almost matching lower bound using suitable random set
systems. The proofs can be found in [PA95] as well.
The proof in the Vapnik-Chervonenkis style, while short and
clever, does not seem to convey very well the reasons for the existence
of small $\varepsilon$-nets. Somewhat longer but more intuitive proofs have been
found in the investigation of deterministic algorithms for constructing
$\varepsilon$-approximations and $\varepsilon$-nets; one such proof is given in [Mat99a], for
instance.

Exercises
1. Show that for any integer $d$ there exists a convex set $C$ in the plane such
that the family of all isometric copies of $C$ has VC-dimension at least $d$.
2. Show that the shatter function lemma is tight. That is, for all $d$ and $n$
construct a system of VC-dimension $d$ on $n$ points with $\Phi_d(n)$ sets.

10.3 Bounding the VC-Dimension and Applications


The VC-dimension can be determined without great difficulty in several simple
cases, such as for half-spaces or balls in $\mathbf{R}^d$, but for only slightly more
complicated families its computation becomes challenging. On the other hand, a
few simple steps explained below show that the VC-dimension is bounded for
any family whose sets can be defined by a formula consisting of polynomial
equations and inequalities combined by Boolean connectives (conjunctions,
disjunctions, etc.) and involving a bounded number of real parameters. This
includes families like all ellipsoids in $\mathbf{R}^d$, all boxes in $\mathbf{R}^d$, arbitrary
intersections of pairs of circular disks in the plane, and so on. On the other hand,
arbitrary convex polygons are not covered (since a general convex polygon
cannot be described by a bounded number of real parameters) and indeed,
this family has infinite VC-dimension.
We begin by determining the VC-dimension for half-spaces.

10.3.1 Lemma. The VC-dimension of the system of all (closed) half-spaces
in $\mathbf{R}^d$ equals $d+1$.

Proof. Obviously, any set of $d+1$ affinely independent points can be
shattered. On the other hand, no $d+2$ points can be shattered by Radon's lemma:
they can be split into two parts with intersecting convex hulls, and no
half-space can contain one part while avoiding the other.
□
Next, we turn to the family $\mathcal{P}_{d,D}$ of all sets in $\mathbf{R}^d$ definable by a single
polynomial inequality of degree at most $D$.
10.3.2 Proposition. Let $\mathbf{R}[x_1, x_2, \ldots, x_d]_{\le D}$ denote the set of all real
polynomials in $d$ variables of degree at most $D$, and let

$$\mathcal{P}_{d,D} = \left\{ \{x \in \mathbf{R}^d : p(x) \ge 0\} : p \in \mathbf{R}[x_1, x_2, \ldots, x_d]_{\le D} \right\}.$$

Then $\dim(\mathcal{P}_{d,D}) \le \binom{d+D}{d}$.

Proof. The following simple but powerful trick is known as the Veronese
mapping in algebraic geometry (or as linearization; it is also related to the
reduction of Voronoi diagrams to convex polytopes in Section 5.7). Let $M$
be the set of all possible nonconstant monomials of degree at most $D$ in
$x_1, \ldots, x_d$. For example, for $D = d = 2$, we have $M = \{x_1, x_2, x_1 x_2, x_1^2, x_2^2\}$.
Let $m = |M|$ and let the coordinates in $\mathbf{R}^m$ be indexed by the monomials
in $M$. Define the map $\varphi\colon \mathbf{R}^d \to \mathbf{R}^m$ by $\varphi(x)_\mu = \mu(x)$, where the monomial $\mu$
serves as a formal symbol (index) on the left-hand side, while on the right-hand
side we have the number obtained by evaluating $\mu$ at the point $x \in \mathbf{R}^d$.
For example, for $d = D = 2$, the map is

$$\varphi\colon (x_1, x_2) \in \mathbf{R}^2 \mapsto (x_1, x_2, x_1 x_2, x_1^2, x_2^2) \in \mathbf{R}^5.$$

We claim that if $A \subset \mathbf{R}^d$ is shattered by $\mathcal{P}_{d,D}$, then $\varphi(A)$ is shattered by
half-spaces in $\mathbf{R}^m$. To see this, let $B \subseteq A$, and let $p \in \mathbf{R}[x_1, \ldots, x_d]_{\le D}$ be a polynomial
that is nonnegative at the points of $B$ and negative at $A \setminus B$. We let $a_\mu$
be the coefficient of $\mu$ in $p$ and $a_0$ the constant term of $p$, and we define
the half-space $h_p$ in $\mathbf{R}^m$ as $\{y \in \mathbf{R}^m : a_0 + \sum_{\mu \in M} a_\mu y_\mu \ge 0\}$. For example,
if $p(x_1, x_2) = 7 + 3x_2 - x_1 x_2 + x_1^2$, the corresponding half-space is
$h_p = \{y \in \mathbf{R}^5 : 7 + 3y_2 - y_3 + y_4 \ge 0\}$. Then we get $h_p \cap \varphi(A) = \varphi(B)$.
Since, finally, $\varphi$ is injective, we obtain a set of size $|A|$ in $\mathbf{R}^m$ shattered by
half-spaces. By Lemma 10.3.1, we have $\dim(\mathcal{P}_{d,D}) \le |M| + 1 = \binom{D+d}{d}$. □
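The linearization can be made concrete in a few lines. The following sketch represents a monomial as a sorted tuple of variable indices (so `(0, 1)` stands for $x_1 x_2$ and `()` for the constant term) and a polynomial as a dictionary from monomials to coefficients; these representations and all function names are illustrative assumptions. Note that the coordinates come out in the order $x_1, x_2, x_1^2, x_1 x_2, x_2^2$, which differs from the order used in the text.

```python
from itertools import combinations_with_replacement
from math import comb, prod

def monomials(d, D):
    """Nonconstant monomials of degree at most D in d variables."""
    return [m for deg in range(1, D + 1)
            for m in combinations_with_replacement(range(d), deg)]

def veronese(x, D):
    """Image of the point x: one coordinate per nonconstant monomial."""
    return [prod(x[i] for i in m) for m in monomials(len(x), D)]

def evaluate(poly, x):
    """Value p(x) of a polynomial given as {monomial: coefficient}."""
    return sum(c * prod(x[i] for i in m) for m, c in poly.items())

def halfspace_value(poly, y, d, D):
    """Value of the linear form a_0 + sum of a_mu * y_mu at a point y of R^m."""
    index = {m: j for j, m in enumerate(monomials(d, D))}
    return poly.get((), 0) + sum(c * y[index[m]] for m, c in poly.items() if m)

# p(x1, x2) = 7 + 3*x2 - x1*x2 + x1^2, the example polynomial from the text.
p = {(): 7, (1,): 3, (0, 1): -1, (0, 0): 1}
x = (2, 3)
print(veronese(x, 2))
print(evaluate(p, x), halfspace_value(p, veronese(x, 2), 2, 2))
print(len(monomials(2, 2)) + 1, comb(2 + 2, 2))
```

The two values printed on the second line agree, which is exactly the statement that $x$ lies in the set defined by $p$ if and only if $\varphi(x)$ lies in the half-space $h_p$.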

Geometrically, the Veronese map embeds $\mathbf{R}^d$ into $\mathbf{R}^m$ as a curved
manifold in such a way that any subset of $\mathbf{R}^d$ definable by a single polynomial
inequality of degree at most $D$ can be cut off by a half-space in $\mathbf{R}^m$. Except
for a few simple cases, this is hard to visualize, but the formulas work in a
really simple way.
By Proposition 10.3.2, any subfamily of some $\mathcal{P}_{d,D}$ has bounded
VC-dimension; this applies, e.g., to balls in $\mathbf{R}^d$ ($D = 2$) and ellipsoids in $\mathbf{R}^d$ ($D = 2$
as well). For concrete families, the bound from Proposition 10.3.2 is often very
weak. First, if we deal only with special polynomials involving fewer than
$\binom{D+d}{d}$ monomials, then we can use an embedding into $\mathbf{R}^m$ with a smaller $m$.
We also do not have to use only coordinates corresponding to monomials
in the embedding. For example, for the family of all balls in $\mathbf{R}^d$, a suitable
embedding is $\varphi\colon \mathbf{R}^d \to \mathbf{R}^{d+1}$ given by
$(x_1, \ldots, x_d) \mapsto (x_1, x_2, \ldots, x_d, x_1^2 + x_2^2 + \cdots + x_d^2)$.
It is closely related to the "lifting" transforming Voronoi diagrams
in $\mathbf{R}^d$ to convex polytopes in $\mathbf{R}^{d+1}$ discussed in Section 5.7. Estimates for the
VC-dimension can also be obtained from Theorem 6.2.1 about the number
of sign patterns of polynomials or from similar results.
Combinations of polynomial inequalities. Families like all rectangular
boxes in $\mathbf{R}^d$ or lunes (differences of two disks in the plane) can be handled
using the following result.
10.3.3 Proposition. Let $F(X_1, X_2, \ldots, X_k)$ be a fixed set-theoretic expression
(using the operations of union, intersection, and difference) with variables
$X_1, \ldots, X_k$ standing for sets; for instance,

Let $\mathcal{S}$ be a set system on a ground set $X$ with $\dim(\mathcal{S}) = d < \infty$. Let
$$\mathcal{T} = \{F(S_1, \ldots, S_k) : S_1, \ldots, S_k \in \mathcal{S}\}.$$
Then $\dim(\mathcal{T}) = O(k d \ln k)$.

Proof. The trick is to look at the shatter functions. Let $A \subseteq X$ be an
$m$-point set. It is easy to verify by induction on the structure of $F$ that
for any $S_1, S_2, \ldots, S_k$, we have $F(S_1, \ldots, S_k) \cap A = F(S_1 \cap A, \ldots, S_k \cap A)$.
In particular, $F(S_1, \ldots, S_k) \cap A$ depends only on the intersections of the $S_i$
with $A$. Therefore, $\pi_{\mathcal{T}}(m) \le \pi_{\mathcal{S}}(m)^k$. By the shatter function lemma, we have
$\pi_{\mathcal{S}}(m) \le \Phi_d(m)$. If $A$ is shattered by $\mathcal{T}$, then $\pi_{\mathcal{T}}(m) = 2^m$. From this we have
the inequality $2^m \le \Phi_d(m)^k$. Calculation using the estimate $\Phi_d(m) \le \left(\frac{em}{d}\right)^d$
leads to the claimed bound. □
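The final calculation can also be carried out numerically: for given $d$ and $k$, one can simply scan for the point where the inequality $2^m \le \Phi_d(m)^k$ first fails. The sketch below does this with exact integer arithmetic; the function names are illustrative, and the printed ratio to $k d \ln k$ is only meant to show the order of magnitude.

```python
from math import comb, log

def phi(d, m):
    """Phi_d(m) = C(m,0) + C(m,1) + ... + C(m,d)."""
    return sum(comb(m, i) for i in range(d + 1))

def shattered_size_bound(d, k):
    """Last m before the inequality 2^m <= Phi_d(m)^k first fails."""
    m = 0
    while 2 ** (m + 1) <= phi(d, m + 1) ** k:
        m += 1
    return m

for d, k in [(3, 1), (2, 2), (3, 4), (5, 10)]:
    bound = shattered_size_bound(d, k)
    ratio = bound / (k * d * log(k)) if k > 1 else None
    print(d, k, bound, ratio)
```

For $k = 1$ the scan stops at $m = d$, as it should, since $\Phi_d(m) < 2^m$ as soon as $m > d$.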

Propositions 10.3.3 and 10.3.2 together show that families of geometric
shapes definable by formulas of bounded size involving polynomial equations
and inequalities have bounded VC-dimension. (In the terminology introduced
in Section 7.7, families of semialgebraic sets of bounded description complexity
have bounded VC-dimension.) In the subsequent example we will encounter
a family of a quite different nature with bounded VC-dimension. First
we present a general observation.
VC-dimension of the dual set system. Let $(X, \mathcal{F})$ be a set system.
The dual set system to $(X, \mathcal{F})$ is defined as follows: The ground set is
$Y = \{y_S : S \in \mathcal{F}\}$, where the $y_S$ are pairwise distinct points, and for each $x \in X$
we have the set $\{y_S : S \in \mathcal{F},\ x \in S\}$ (the same set may be obtained for several
different $x$, but this does not matter for the VC-dimension).
10.3.4 Lemma. Let $(X, \mathcal{F})$ be a set system and let $(Y, \mathcal{G})$ be the dual set
system. Then $\dim(\mathcal{G}) < 2^{\dim(\mathcal{F})+1}$.

Proof. We show that if $\dim(\mathcal{G}) \ge 2^d$, then $\dim(\mathcal{F}) \ge d$. Let $A$ be the incidence
matrix of $(X, \mathcal{F})$, with columns corresponding to points of $X$ and rows
corresponding to sets of $\mathcal{F}$. Then the transposed matrix $A^T$ is the incidence
matrix of $(Y, \mathcal{G})$. If $Y$ contains a shattered set of size $2^d$, then $A$ has a $2^d \times 2^{2^d}$
submatrix $M$ with all the possible 0/1 vectors of length $2^d$ as columns. We
claim that $M$ contains as a submatrix the $2^d \times d$ matrix $M_1$ with all possible
0/1 vectors of length $d$ as rows. This is simply because the $d$ columns
of $M_1$ are pairwise distinct and they all occur as columns of $M$. This $M_1$
corresponds to a shattered subset of size $d$ in $(X, \mathcal{F})$. Here is an example for
$d = 2$:

$$M = \begin{pmatrix}
0 & 0 & 0 & \mathbf{0} & 0 & \mathbf{0} & 0 & 0 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
0 & 0 & 0 & \mathbf{0} & 1 & \mathbf{1} & 1 & 1 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & \mathbf{1} & 0 & \mathbf{0} & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 \\
0 & 1 & 0 & \mathbf{1} & 0 & \mathbf{1} & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1 & 0 & 1
\end{pmatrix};$$

the submatrix $M_1$ is marked bold. □
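The selection of $M_1$ inside $M$ can be spelled out as a short computation. The sketch below builds a matrix whose columns are all 0/1 vectors of a given length, listed in lexicographic order (an arbitrary choice made here), and locates $d$ columns whose rows run through all 0/1 vectors of length $d$. The function names are illustrative.

```python
from itertools import product

def all_columns_matrix(n_rows):
    """Matrix with n_rows rows whose columns are all 0/1 vectors of that length."""
    columns = list(product((0, 1), repeat=n_rows))
    return [[col[i] for col in columns] for i in range(n_rows)]

def find_m1(M, d):
    """Indices of d columns of M whose rows are all 0/1 vectors of length d."""
    rows = list(product((0, 1), repeat=d))                   # desired rows of M_1
    wanted = [tuple(r[j] for r in rows) for j in range(d)]   # its d columns
    columns = list(zip(*M))
    return [columns.index(w) for w in wanted]

d = 2
M = all_columns_matrix(2 ** d)       # a 4 x 16 matrix
picked = find_m1(M, d)
M1 = [[row[j] for j in picked] for row in M]
print(picked)
print(M1)
```

With 0-based indexing the selected columns are 3 and 5, and the rows of `M1` are the four vectors 00, 01, 10, 11.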

An art gallery problem. An art gallery, for the purposes of this section, is
a compact set $X$ in the plane, such as the one drawn in the following picture:

The set $X$ is the lightly shaded area, while the black regions are walls that
are not part of $X$. We want to choose a small set $G \subset X$ of guards that
together can see all points of $X$, where a point $x \in X$ sees a point $y \in X$ if
the segment $xy$ is fully contained in $X$. The visibility region $V(x)$ of a point
$x \in X$ is the set of all points $y \in X$ seen by $x$, as is illustrated below:

It is easy to construct galleries that require arbitrarily many guards; it
suffices to include many small niches so that each of them needs an individual
guard. To forbid this cheap way of making a gallery difficult to guard, we
consider only galleries where each point can be seen from a reasonably large
part of the gallery. That is, we suppose that the gallery $X$ has Lebesgue
measure 1 and that $\mu(V(x)) \ge \varepsilon$ for every $x \in X$, where $\varepsilon > 0$ is a parameter
and $\mu$ is the Lebesgue measure restricted to $X$. Can every such
gallery be guarded by a number of guards that depends only on $\varepsilon$?
The answer to this question is still no, although an example is not entirely
easy to construct. The problem is with galleries with many "holes," i.e., many
connected components of the complement (corresponding to pillars in a
real-world gallery, say). But if we forbid holes, then the answer becomes yes.
10.3.5 Theorem. Let $X$ be a simply connected art gallery (i.e., with $\mathbf{R}^2 \setminus X$
connected) of Lebesgue measure 1, and let $r \ge 2$ be a real number such that
$\mu(V(x)) \ge \frac{1}{r}$ for all $x \in X$. Then $X$ can be guarded by at most $C r \log r$
points, where $C$ is a suitable absolute constant.

Proof. The bound $O(r \log r)$ for the number of guards is obtained from the
epsilon net theorem (Theorem 10.2.4). Namely, we introduce the set system
$\mathcal{V} = \{V(x) : x \in X\}$, and note that $G$ is a set guarding all of $X$ if and only
if it is a transversal of $\mathcal{V}$. Further, an $\varepsilon$-net for $(X, \mathcal{V})$ with respect to $\mu$ is a
transversal of $\mathcal{V}$, since by the assumption, $\mu(V) \ge \varepsilon = \frac{1}{r}$ for each $V \in \mathcal{V}$. So
the theorem will be proved if we can show that $\dim(\mathcal{V})$ is bounded by some
constant (independent of $X$).
Tools like Proposition 10.3.2 and Proposition 10.3.3 seem to be of little
use, since the visibility regions can be arbitrarily complicated. We thus need
a different strategy, one that can make use of the simple connectedness. We
proceed by contradiction: Assuming the existence of an extremely large set
$A \subset X$ shattered by $\mathcal{V}$, we find, by a sequence of Ramsey-type steps, a
configuration forcing a hole in $X$.
Let $d$ be a sufficiently large number, and suppose that there is a $d$-point
set $A \subset X$ shattered by $\mathcal{V}$. This means that for each subset $B \subseteq A$ there
exists a point $\sigma_B \in X$ that can see all points of $B$ but no point of $A \setminus B$. We
put $\Sigma = \{\sigma_B : B \subseteq A\}$. In such a situation, we say that $A$ is shattered by $\Sigma$.
Starting with $A$ and $\Sigma$, we find a smaller shattered set in a special position.
We draw a line through each pair of points of $A$. The arrangement of these
at most $\binom{d}{2}$ lines has at most $O(d^4)$ faces (vertices, edges, and open convex
polygons), so there is one such face $F_0$ containing a subset $\Sigma' \subseteq \Sigma$ of at least
$2^d / O(d^4)$ points of $\Sigma$.

These points correspond to subsets of $A$, and so they define a set system
$\mathcal{V}_1$ on $A$. If $d_1 = \dim(\mathcal{V}_1)$ were bounded by a constant independent of $d$,
then the number of sets in $\mathcal{V}_1$ would grow at most polynomially with $d$ (by
Lemma 10.2.5). But we know that it grows exponentially, and so $d_1 \to \infty$
as $d \to \infty$. Thus, we may assume that some subset $A_1 \subseteq A$ is shattered by
a subset $\Sigma_1 \subseteq \Sigma'$, with $d_1 = |A_1|$ large, and the whole of $\Sigma_1$ lies in a single
face of the arrangement of the lines determined by points of $A_1$.
Next, we would like to ensure a similar condition in the reverse direction,
that is, all the points being shattered lying in a single cell of the arrangement
of the lines determined by the shattering points.
A simple, although wasteful, way is to apply Lemma 10.3.4 about the
dimension of the dual set system. This means that we can select sets $A_2 \subseteq \Sigma_1$
and $\Sigma_2 \subseteq A_1$ such that $A_2$ is shattered by $\Sigma_2$ and $d_2 = |A_2|$ is still large
(about $\log_2 d_1$).
Now we can repeat the procedure from the first step of the proof, this
time selecting a set $A_3 \subseteq A_2$ of size $d_3$ (still sufficiently large) and $\Sigma_3 \subseteq \Sigma_2$
such that $A_3$ is shattered by $\Sigma_3$ and all of $\Sigma_3$ lies in a single face of the
arrangement of the lines determined by the pairs of points of $A_3$. This face
must be 2-dimensional, since if it were an edge, all the points of $A_3$ and $\Sigma_3$
would be collinear, which is impossible.
We thus have all points of $A_3$ within a single 2-face of the arrangement of
the lines determined by $\Sigma_3$ and vice versa. In other words, no line determined
by two points of $A_3$ intersects $\operatorname{conv}(\Sigma_3)$, and no line determined by two points
of $\Sigma_3$ intersects $\operatorname{conv}(A_3)$. In particular, $\operatorname{conv}(A_3) \cap \operatorname{conv}(\Sigma_3) = \emptyset$. It follows
that each point of $\Sigma_3$ sees all points of $A_3$ within an angle smaller than $\pi$
and in the same clockwise angular order; let $\le_A$ be this linear order of the
points of $A_3$. Similarly, we have a common counterclockwise angular order
$\le_\Sigma$ of points of $\Sigma_3$ around any point of $A_3$.
Suppose that the initial $d$ was so large that $d_3 = |A_3| = 5$. For each
$a \in A_3$, we consider the point $\sigma(a) \in \Sigma_3$ that sees all points of $A_3$ but $a$.
Let these 5 points form a set $\Sigma_4 \subset \Sigma_3$. We have a situation indicated below,
where dashed connecting segments correspond to invisibility and they form
a matching between $A_3$ and $\Sigma_4$.

[Figure: the points of $A_3$ and $\Sigma_4$, with dashed segments (invisibility) forming a matching between them.]

Since we have 5 points on each side, we may choose an $a \in A_3$ such that
$a$ is neither the first nor the last point of $A_3$ in $\le_A$, and at the same time
$\sigma = \sigma(a) \in \Sigma_4$ is not the first or last point in $\le_\Sigma$. Then we have the following
situation (full segments indicate visibility, and the dashed segment means
invisibility):

[Figure: the points $a$, $a'$, $a''$ and $\sigma$, $\sigma'$, $\sigma''$, with full segments for visibility and a dashed segment between $a$ and $\sigma$.]

The segments $a\sigma'$ and $a'\sigma$ both lie above the line $a\sigma$, and they intersect as
indicated ($a'$ cannot lie in the triangle $a\sigma\sigma'$, because the line $aa'$ would go
between $\sigma$ and $\sigma'$, and neither can the segment $\sigma a'$ be outside that triangle,
because then the line $\sigma\sigma'$ would separate $a$ from $a'$). Similarly, the segments
$a\sigma''$ and $a''\sigma$ intersect as shown. The four segments $a\sigma'$, $a'\sigma$, $a\sigma''$, and $a''\sigma$ are
contained in $X$, and since $X$ is simply connected, the shaded quadrilateral
bounded by them must be a part of $X$. Hence $a$ and $\sigma$ can see each other.
This contradiction proves Theorem 10.3.5. □

The bound on the VC-dimension obtained from this proof is rather large:
about $10^{12}$. By a more careful analysis, avoiding the use of Lemma 10.3.4 on
the dual VC-dimension where one loses the most, the bound has been im-
proved to 23. Determining the exact VC-dimension in the worst case might
be quite challenging. The art gallery drawn in the initial picture is not chosen
only because of the author's liking for several baroque buildings with pentag-
onal symmetry, but also because it is an example where V has VC-dimension
at least 5 (Exercise 2). A more complicated example gives VC-dimension 6,
and this is the current best lower bound.
Bibliography and remarks. As was remarked in the text, for
bounding the VC-dimension of set systems defined by polynomial
inequalities, we can use the linearization method (as in the proof of
Proposition 10.3.2) or results like Theorem 6.2.1 on the number of sign
patterns. The latter can often provide asymptotically sharp bounds on
the shatter functions (which are usually the more important quantita-
tive parameters in applications); for linearizations, this happens only
in quite simple cases.
There are fairly general results bounding the VC-dimension for
families of sets defined by functions more general than polynomials;
see, e.g., Wilkie [Wil99] and Karpinski and Macintyre [KM97b].
Considerations similar to the proof of Proposition 10.3.3 appear in
Dudley [Dud78]. Lemma 10.3.4 about the VC-dimension of the dual
set system was noted by Assouad [Ass83].
The art gallery problem considered in this section was raised by
Kavraki, Latombe, Motwani, and Raghavan [KLMR98] in connection
with automatic motion planning for robots. Theorem 10.3.5, with the
proof shown, is from Kalai and Matousek [KM97a]. That paper also
proves that for galleries with h holes, the number of guards can be
bounded by a function of $\varepsilon$ and $h$, and provides an example showing
that one may need at least $\Omega(\log h)$ guards in the worst case for a suitable
fixed $\varepsilon$. Valtr [Val98] greatly improved the quantitative bounds,
obtaining the lower bound of 6 and upper bound of 23 for dim (V) for
simply connected galleries, as well as a bound of O(log2 h) for galleries
with h holes. In another paper [VaI99b], he constructed contractible
3-dimensional galleries where the visibility region of each point occu-
pies almost half of the total volume of the gallery but the number
of guards is unbounded, which shows that Theorem 10.3.5 has no
straightforward analogue in dimension 3 and higher. Here is another
result from [KM97a]: If a planar gallery X is such that among every k
points of X there are 3 that can be guarded by a single guard, then all
of X can be guarded by $O(k^3 \log k)$ guards. Let us stress that our
example was included mainly as an illustration of VC-dimension, rather
than as a typical specimen of the extensive subject of studying guards
in art galleries from the mathematical point of view. This field has a
large number of results, some of them very nice; see, e.g., the handbook
chapter [Urr00] for a survey.

Exercises
1. (a) Determine the VC-dimension of the set system consisting of all
triangles in the plane.
(b) What is the VC-dimension of the system of all convex k-gons in the
plane, for a given integer k? [2]
2. Show that $\dim(V) \ge 5$ for the art gallery shown above Theorem 10.3.5.
Can you construct an example with VC-dimension 6, or even higher?
3. Show that the unit square cannot be expressed as
$\{(x, y) \in \mathbf{R}^2 : p(x, y) \ge 0\}$ for any polynomial p(x, y).
4. (a) Let H be a finite set of lines in the plane. For a triangle T, let $H_T$ be
the set of lines of H intersecting the interior of T, and let $\mathcal{T} \subseteq 2^H$ be the
system of the sets $H_T$ for all triangles T. Show that the VC-dimension
of $\mathcal{T}$ is bounded by a constant.
(b) Using (a) and the epsilon net theorem, prove the suboptimal cutting
lemma (Lemma 6.5.1): For every finite set H of lines in the plane
and for every r, $1 < r < |H|$, there exists a $\frac{1}{r}$-cutting for H consisting
of $O(r^2 \log^2 r)$ generalized triangles. Use the proof in Section 4.6 as an
inspiration.
(c) Generalize (a) and (b) to obtain a cutting lemma for circles with the
same bound $O(r^2 \log^2 r)$ (see Exercise 4.6.3).
5. Let $d \ge 1$ be an integer, let $U = \{1, 2, \ldots, d\}$ and $V = 2^U$. Let the
shattering graph $SG_d$ have vertex set $U \cup V$ and edge set
$\{\{a, A\} : a \in U, A \in V, a \in A\}$. Prove that if H is a bipartite graph with classes R and
S, $|R| = r$ and $|S| = s$, such that $r + \log_2 s \le d$, then there is an r-element
subset $R_1 \subseteq U$ and an s-element $S_1 \subseteq V$ such that the subgraph induced
in $SG_d$ by $R_1 \cup S_1$ is isomorphic to H. Thus, the shattering graph is
"universal": It contains all sufficiently small bipartite subgraphs.
6. For a graph G, let $\mathcal{N}(G) = \{N_G(v) : v \in V(G)\}$ be the system of vertex
neighborhoods (where $N_G(v) = \{u \in V(G) : \{u, v\} \in E(G)\}$).
(a) Prove that there is a constant $d_0$ such that $\dim(\mathcal{N}(G)) \le d_0$ for all
planar G.
(b) Show that for every C there exists d = d(C) such that if G is a
graph in which every subgraph on n vertices has at most Cn edges, for
all $n \ge 1$, then $\dim(\mathcal{N}(G)) \le d$. (This implies (a) and, more generally,
shows that bounded genus of G implies bounded $\dim(\mathcal{N}(G))$.)
(c) Show that for every k there exists d = d(k) such that if $\dim(\mathcal{N}(G)) \ge d$,
then G contains a subdivision of the complete graph $K_k$ as a subgraph.
(This gives an alternative proof that if $\dim(\mathcal{N}(G))$ is large, then the genus
of G is large, too.)

10.4 Weak Epsilon Nets for Convex Sets


Weak $\varepsilon$-nets. Let $\mathcal{H}$ be the system of all closed half-planes in the plane,
and let $\mu$ be the planar Lebesgue measure restricted to a (closed) disk D of
unit area. What should the smallest possible $\varepsilon$-net for $(\mathbf{R}^2, \mathcal{H})$ with respect
to $\mu$ look like? A natural idea would be to place the points of the $\varepsilon$-net
equidistantly around the perimeter of the disk:

Is this the best way? No; according to Definition 10.2.2, three points placed
as in the picture below form a valid $\varepsilon$-net for every $\varepsilon > 0$, since any half-plane
cutting into D necessarily contains at least one of them!

[Figure: the disk D with three points placed around it, outside the disk.]

One may feel that this is cheating. The problem is that the points of this
$\varepsilon$-net are far away from where the measure is concentrated. For some
applications of $\varepsilon$-nets this is not permissible, and for this reason, $\varepsilon$-nets of this kind
are usually called weak $\varepsilon$-nets in the literature, while a "real" $\varepsilon$-net in the
above example would be required to have all of its points inside the disk D.
For $\varepsilon$-nets obtained using the epsilon net theorem (Theorem 10.2.4), this
presents no real problem, since we can always restrict the considered set
system to the subset where we want our $\varepsilon$-net to lie. In the above example
we would simply require an $\varepsilon$-net for the set system $(D, \mathcal{H}|_D)$. The restriction
to a subset does not increase the VC-dimension.
On the other hand, there are set systems of infinite VC-dimension, and
there we cannot require small $\varepsilon$-nets to exist for every restriction of the ground
set. Indeed, if (X, F) has infinite VC-dimension, then by definition, there is
an arbitrarily large $A \subseteq X$ that is shattered by F, meaning that $F|_A = 2^A$.
And the complete set system $(A, 2^A)$ certainly does not admit small $\varepsilon$-nets:
Any $\frac{1}{2}$-net, say, for $(A, 2^A)$ with respect to the uniform measure on A must
have at least $\frac{1}{2}|A|$ elements! In this sense, the epsilon net theorem is an "if
and only if" result: A set system (X, F) and all of its restrictions to smaller
ground sets admit $\varepsilon$-nets of size depending only on $\varepsilon$ if and only if dim(F) is
finite.
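
The claim about the complete set system can be checked by brute force on a tiny ground set. The Python sketch below is only an illustration under the assumption of the uniform measure; the helper names `is_eps_net` and `smallest_net_size` are hypothetical and do not come from the text.

```python
from itertools import combinations


def is_eps_net(net, ground, eps):
    """True if `net` meets every subset of `ground` with >= eps*|ground| elements."""
    net = set(net)
    for k in range(len(ground) + 1):
        if k < eps * len(ground):
            continue  # sets this small are too light to impose a constraint
        for subset in combinations(ground, k):
            if net.isdisjoint(subset):
                return False
    return True


def smallest_net_size(ground, eps):
    """Smallest size of an eps-net for the complete set system (ground, 2^ground)."""
    for size in range(len(ground) + 1):
        if any(is_eps_net(c, ground, eps) for c in combinations(ground, size)):
            return size
    return None
```

For an 8-element ground set and $\varepsilon = \frac{1}{2}$, a net must meet every 4-element subset, so its complement has fewer than 4 elements; the search is exponential and meant for very small inputs only.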
As was mentioned after the definition of VC-dimension, the (important)
system $\mathcal{K}_2$ of convex sets in the plane has infinite VC-dimension. Therefore,
the epsilon net theorem is not applicable, and we know that restrictions of
$\mathcal{K}_2$ to some bad ground sets (convex independent sets, in this case) provide
arbitrarily large complete set systems. Yet it turns out that not too
large (weak) $\varepsilon$-nets exist if the ground set is taken to be the whole plane
(or, actually, it can be restricted to any convex set). These are much less
understood than the $\varepsilon$-nets in the case of finite VC-dimension, and many
interesting questions remain open.
As has been done in the literature, we will restrict ourselves to measures
concentrated on finite point sets, and first we will talk about uniform
measures. To be on the safe side, let us restate the definition for this particular
case, keeping the traditional terminology of "weak $\varepsilon$-nets."

10.4.1 Definition (Weak epsilon net for convex sets). Let X be a
finite point set in $\mathbf{R}^d$ and $\varepsilon > 0$ a real number. A set $N \subseteq \mathbf{R}^d$ is called a
weak $\varepsilon$-net for convex sets with respect to X if every convex set containing
at least $\varepsilon|X|$ points of X contains a point of N.
In the rest of this section we consider exclusively $\varepsilon$-nets with respect to
convex sets, and so instead of "weak $\varepsilon$-net for convex sets with respect to X"
we simply say "weak $\varepsilon$-net for X."

10.4.2 Theorem (Weak epsilon net theorem). For every $d \ge 1$, $\varepsilon > 0$,
and finite $X \subset \mathbf{R}^d$, there exists a weak $\varepsilon$-net for X of size at most $f(d, \varepsilon)$,
where $f(d, \varepsilon)$ depends on d and $\varepsilon$ but not on X.

The best known bounds are $f(2, \frac{1}{r}) = O(r^2)$ in the plane and
$f(d, \frac{1}{r}) = O(r^d (\log r)^{b(d)})$ for every fixed d, with a suitable constant b(d) > 0. The proof
shown below gives $f(d, \frac{1}{r}) = O(r^{d+1})$. On the other hand, no lower bound
superlinear in r is known (for fixed d).
Proof. The proof is simple once we have the first selection lemma
(Theorem 9.1.1) at our disposal.
Let $X \subset \mathbf{R}^d$ be an n-point set. The required weak $\varepsilon$-net N is
constructed by a greedy algorithm. Set $N_0 = \emptyset$. If $N_i$ has already been
constructed, we look whether there is a convex set C containing at least $\varepsilon n$
points of X and no point of $N_i$. If not, $N_i$ is a weak $\varepsilon$-net by definition. If
yes, we set $X_i = X \cap C$, and we apply the first selection lemma to $X_i$. This
gives us a point $a_i$ contained in at least $c_d \binom{|X_i|}{d+1} = \Omega(\varepsilon^{d+1} n^{d+1})$ $X_i$-simplices.
We set $N_{i+1} = N_i \cup \{a_i\}$ and continue with the next step of the algorithm.
Altogether there are $\binom{n}{d+1}$ X-simplices. In each step of the algorithm, at
least $\Omega(\varepsilon^{d+1} n^{d+1})$ of them are "killed," meaning that they were not
intersected by $N_i$ but are intersected by $N_{i+1}$. Hence the algorithm takes at most
$O(\varepsilon^{-(d+1)})$ steps. □
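
The greedy construction is easiest to see in dimension 1, where convex sets are intervals. The Python sketch below is a simplified illustration under that assumption and not the general algorithm: in place of the first selection lemma it takes the median of the uncovered points, which on the line lies in a constant fraction of the segments they span. The function name `greedy_weak_net_1d` is hypothetical.

```python
import math


def greedy_weak_net_1d(points, eps):
    """Greedy weak eps-net for intervals (the convex sets of R^1); eps > 0 assumed."""
    xs = sorted(points)
    need = math.ceil(eps * len(xs))  # a heavy interval holds at least eps*n points
    net = []
    while True:
        # The net points cut the line into open gaps; a heavy interval that
        # avoids the net must lie inside a single gap.
        bounds = [-math.inf] + sorted(net) + [math.inf]
        heavy = None
        for lo, hi in zip(bounds, bounds[1:]):
            gap = [x for x in xs if lo < x < hi]
            if len(gap) >= need:
                heavy = gap
                break
        if heavy is None:
            return sorted(net)  # no heavy interval misses the net
        # The median of X_i plays the role of the selected point a_i.
        net.append(heavy[len(heavy) // 2])
```

Every iteration adds a point of X that is not yet in the net, so the loop terminates; when it does, each gap holds fewer than $\varepsilon n$ points, and therefore every interval containing at least $\varepsilon n$ points of X contains a net point.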

In a forthcoming application, we also need weak $\varepsilon$-nets for convex sets
with respect to a nonuniform measure (but still concentrated on finitely many
points).

10.4.3 Corollary. Let $\mu$ be a probability measure concentrated on finitely
many points in $\mathbf{R}^d$. Then weak $\varepsilon$-nets for convex sets with respect to $\mu$ exist,
of size bounded by a function of d and $\varepsilon$.

Sketch of proof. By taking $\varepsilon$ a little smaller, we can make the point weights
rational. Then the problem is reduced to the weak epsilon net theorem with
X a multiset. One can check that all ingredients of the proof go through in
this case, too. □

10.4.4 Corollary. For every finite system F of convex sets in $\mathbf{R}^d$, we have
$\tau(F) \le f(d, 1/\tau^*(F))$, where $f(d, \varepsilon)$ is as in the weak epsilon net theorem.
The proof of the analogous consequence of the epsilon net theorem,
Corollary 10.2.7, can be copied almost verbatim.

Bibliography and remarks. Weak $\varepsilon$-nets were introduced by
Haussler and Welzl [HW87]. The existence of weak $\varepsilon$-nets for convex sets
was proved by Alon, Bárány, Füredi, and Kleitman [ABFK92] by the
method shown in the text but with a slight quantitative improvement,
achieved by using the second selection lemma (Theorem 9.2.1) instead
of the first selection lemma.
The estimates for $f(d, \frac{1}{r})$ mentioned after Theorem 10.4.2 have the
following sources: The bound $O(r^2)$ in the plane is from [ABFK92] (see
Exercise 1), and the best general bound in $\mathbf{R}^d$, close to $O(r^d)$, is due to
Chazelle, Edelsbrunner, Grigni, Guibas, Sharir, and Welzl [CEG+95]. It
seems that these bounds are quite far from the truth. Intuitively, one
of the "worst" cases for constructing a weak $\varepsilon$-net should be a convex
independent set X. For such sets in the plane, though, near-linear
bounds have been obtained by Chazelle et al. [CEG+95]; they are
presented in Exercises 2 and 3 below. The original proof of the result in
Exercise 3 was formulated using hyperbolic geometry. A simple lower
bound for the size of weak $\varepsilon$-nets was noted in [Mat01]; it concerns
the dependence on d for $\varepsilon$ fixed and shows that $f(d, \frac{1}{50}) = \Omega(e^{\sqrt{d/2}})$
as $d \to \infty$.

Exercises
1. Complete the following sketch of an alternative proof of the weak epsilon
net theorem.
(a) Let X be an n-point set in the plane (assume general position if
convenient). Let h be a vertical line with half of the points of X on each
side, and let $X_1$, $X_2$ be these halves. Let M be the set of all intersections
of segments of the form $x_1 x_2$ with h, where $x_1 \in X_1$ and $x_2 \in X_2$.
Let $N_0$ be a weak $\varepsilon'$-net for M (this is a one-dimensional situation!).
Recursively construct weak $\varepsilon''$-nets $N_1$, $N_2$ for $X_1$ and $X_2$, respectively,
and set $N = N_0 \cup N_1 \cup N_2$. Show that with a suitable choice of $\varepsilon'$ and
$\varepsilon''$, N is a weak $\varepsilon$-net for X of size $O(\varepsilon^{-2})$.
(b) Generalize the proof from (a) to $\mathbf{R}^d$ (use induction on d). Estimate
the exponent of $\varepsilon$ in the resulting bound on the size of the constructed
weak $\varepsilon$-net.
2. The aim of this exercise is to show that if X is a finite set in the plane
in convex position, then for any $\varepsilon > 0$ there exists a weak $\varepsilon$-net for X of
size nearly linear in $\frac{1}{\varepsilon}$.
(a) Let an n-point convex independent set $X \subset \mathbf{R}^2$ be given and let
$\ell \le n$ be a parameter. Choose points $p_0, p_1, \ldots, p_{\ell-1}$ of X, appearing in
this order around the circumference of conv(X), in such a way that the
set $X_i$ of points of X lying (strictly) between $p_{i-1}$ and $p_i$ has at most $n/\ell$
points for each i. Construct a weak $\varepsilon'$-net $N_i$ for each $X_i$ (recursively)
with $\varepsilon' = \ell\varepsilon/3$, and let M be the set containing the intersection of the
segment $p_0 p_{j-1}$ with $p_j p_i$, for all pairs i, j, $1 \le i < j-1 \le \ell-2$. Show
that the set $N = \{p_0, \ldots, p_{\ell-1}\} \cup N_1 \cup \cdots \cup N_\ell \cup M$ is a weak $\varepsilon$-net for
X.
(b) If $f(\varepsilon)$ denotes the minimum necessary size of a weak $\varepsilon$-net for a
finite convex independent point set in the plane, derive a recurrence for
$f(\varepsilon)$ using (a) with a suitably chosen $\ell$, and prove the bound
$f(\varepsilon) = O(\frac{1}{\varepsilon} (\log \frac{1}{\varepsilon})^c)$. What is the smallest c you can get?
3. In this exercise we want to show that if X is the vertex set of a regular
convex n-gon in the plane, then there exists a weak $\varepsilon$-net for X of size
$O(\frac{1}{\varepsilon})$.
Suppose X lies on the unit circle u centered at 0. For an arc length $\alpha \le \pi$
radians, let $r(\alpha)$ be the radius of the circle centered at 0 and touching a
chord of u connecting two points on u at arc distance $\alpha$. For i = 0, 1, 2, ...,
let $N_i$ be a set of $\lfloor c/(\varepsilon(1.01)^i) \rfloor$ points, c a suitable constant, placed at regular intervals on the circle
of radius $r(\varepsilon(1.01)^i/10)$ centered at 0 (we take only those i for which
this is well-defined). Show that $\{0\} \cup \bigcup_i N_i$ is a weak $\varepsilon$-net of size $O(\frac{1}{\varepsilon})$
for X (the constants 1.01, etc., are rather arbitrary and can be greatly
improved).

10.5 The Hadwiger-Debrunner (p, q)-Problem


Let F be a finite family of convex sets in the plane. By Helly's theorem, if
every 3 sets from F intersect, then all sets of F intersect (unless F has 2 sets,
that is). What if we know only that out of every 4 sets of F, there are some 3
that intersect? Let us say that F satisfies the (4, 3)-condition. In such a case,
F may consist, for instance, of n-1 sets sharing a common point and one
extra set lying somewhere far away from the others. So we cannot hope for
a nonempty intersection of all sets. But can all the sets of F be pierced by a
bounded number of points? That is, does there exist a constant C such that
for any family F of convex sets in R2 satisfying the (4, 3)-condition there are
at most C points such that each set of F contains at least one of them?
This is the simplest nontrivial case of the so-called (p, q)-problem raised
by Hadwiger and Debrunner and solved, many years later, by Alon and Kleit-
man.

10.5.1 Theorem (The (p, q)-theorem). Let p, q, d be integers with
$p \ge q \ge d+1$. Then there exists a number $HD_d(p, q)$ such that the following is true:
Let F be a finite family of convex sets in $\mathbf{R}^d$ satisfying the (p, q)-condition;
that is, among any p sets of F there are q sets with a common point. Then
F has a transversal consisting of at most $HD_d(p, q)$ points.

Clearly, the condition $q \ge d+1$ is necessary, since n hyperplanes in
general position in $\mathbf{R}^d$ satisfy the (d, d)-condition but cannot be pierced by any
bounded number of points independent of n.
It has been known for a long time that if $p(d-1) < (q-1)d$, then $HD_d(p, q)$
exists and equals $p-q+1$ (Exercise 2). This is the only nontrivial case where
exact values, or even good estimates, of $HD_d(p, q)$ are known.
The reader might (rightly) wonder how one can get interesting examples of
families satisfying the (4, 3)-condition, say. A large collection of examples can
be obtained as follows: Choose a probability measure $\mu$ in the plane ($\mu(\mathbf{R}^2) = 1$),
and let F consist of all convex sets S with $\mu(S) > 0.5$. The (4, 3)-condition
holds, because 4 sets together have measure larger than 2, and so some point
has to be covered at least 3 times. The proof below shows that every family
F of planar convex sets fulfilling the (4, 3)-condition somewhat resembles this
example; namely, that there is a probability measure $\mu$ such that $\mu(S) > c$
for all $S \in F$, with some small positive constant c > 0 (independent of F).
Note that the existence of such $\mu$ implies the (p, 3)-condition for a sufficiently
large p = p(c).
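
The measure-counting argument behind this example can be tried out on a finite ground set. The Python sketch below is a toy illustration under simplifying assumptions: it uses the uniform measure on 5 points and arbitrary subsets, so convexity plays no role. The helper `satisfies_pq` is a hypothetical name.

```python
from itertools import combinations


def satisfies_pq(family, p, q):
    """Check the (p, q)-condition: among any p sets of the family, some q share a point."""
    return all(
        any(set.intersection(*sub) for sub in combinations(tup, q))
        for tup in combinations(family, p)
    )


# Uniform measure on 5 points: a set has measure > 0.5 iff it has at least 3 points.
ground = range(5)
heavy = [set(s) for k in range(3, 6) for s in combinations(ground, k)]
```

Here `satisfies_pq(heavy, 4, 3)` holds because four sets of at least 3 points each have total size at least 12, more than twice the ground set, so some point lies in at least 3 of them. The (3, 3)-condition fails for the same family, for instance on {0, 1, 2}, {2, 3, 4}, {0, 1, 3}.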
The Alon-Kleitman proof combines an amazing number of tools. The
whole structure of the proof, starting from basic results like Helly's theorem,
is outlined in Figure 10.1. The emphasis is on simplicity of the derivation
rather than on the best quantitative bounds (so, for example, Tverberg's
theorem is not required in full strength). The most prominent role is played
by the fractional Helly theorem and by weak $\varepsilon$-nets for convex sets. An
unsatisfactory feature of this method is that the resulting estimates for $HD_d(p, q)$
are enormously large, while the truth is probably much smaller.
Since we have prepared all of the tools and notions in advance, the proof
is now short. We do not attempt to optimize the constant resulting from the
proof, and so we may as well assume that q = d+ 1.
By Corollary 10.4.4, we know that $\tau$ is bounded by a function of $\tau^*$ for
any finite system of convex sets in $\mathbf{R}^d$. So it remains to show that if F satisfies
the (p, d+1)-condition, then $\tau^*(F) = \nu^*(F)$ is bounded.

10.5.2 Lemma (Bounded $\nu^*$). Let F be a finite family of convex sets in
$\mathbf{R}^d$ satisfying the (p, d+1)-condition. Then $\nu^*(F) \le C$, where C depends on
p and d but not on F.

Proof. The first observation is that if F satisfies the (p, d+1)-condition, then
many (d+1)-tuples of sets of F intersect. This can be seen by double counting.
Every p-tuple of sets of F contains (at least) one intersecting (d+1)-tuple,
and a single (d+1)-tuple is contained in $\binom{n-d-1}{p-d-1}$ p-tuples (where n = |F|).
Therefore, there are at least

$$\binom{n}{p} \Big/ \binom{n-d-1}{p-d-1} \ge \alpha \binom{n}{d+1}$$

intersecting (d+1)-tuples, with $\alpha > 0$ depending on p, d only. The fractional
Helly theorem (Theorem 8.1.1) implies that at least $\beta n$ sets of F have a
common point, with $\beta = \beta(d, \alpha) > 0$ a constant.³

Figure 10.1. Main steps in the proof of the (p, q)-theorem.
[Diagram; recovered box and arrow labels: the lexicographic minimum of the
intersection of d+1 convex sets in $\mathbf{R}^d$ is determined by d sets; Tverberg's
theorem (finiteness of T(d, r) suffices); double counting; fractional Helly
theorem; alternative direct proof (Exercise 10.4.2) with much worse bound;
greedy algorithm; weak $\varepsilon$-nets for convex sets of size depending only on d
and $\varepsilon$; (p, d+1)-condition ⇒ $\nu^*$ bounded; $\tau$ bounded by a function of d and
$\tau^*$ for systems of convex sets; linear programming duality ⇒ $\nu^* = \tau^*$;
(p, q)-theorem: (p, d+1)-condition ⇒ $\tau$ bounded.]
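
The double-counting bound in this proof can be sanity-checked numerically. The Python sketch below assumes the standard identity $\binom{n}{p}\binom{p}{d+1} = \binom{n}{d+1}\binom{n-d-1}{p-d-1}$, under which the ratio $\binom{n}{p} / \binom{n-d-1}{p-d-1}$ equals $\binom{n}{d+1} / \binom{p}{d+1}$, so one admissible constant is $\alpha = 1/\binom{p}{d+1}$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def tuple_lower_bound(n, p, d):
    """binom(n, p) / binom(n-d-1, p-d-1): lower bound on intersecting (d+1)-tuples."""
    return Fraction(comb(n, p), comb(n - d - 1, p - d - 1))


def alpha(p, d):
    """One admissible constant: the bound equals alpha * binom(n, d+1) exactly."""
    return Fraction(1, comb(p, d + 1))
```

For example, with n = 10, p = 4, d = 2 both `tuple_lower_bound(10, 4, 2)` and `alpha(4, 2) * comb(10, 3)` evaluate to 30.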
How is this related to the fractional packing number? It shows that a
fractional packing that has the same value on all the sets of F cannot have
size larger than $\frac{1}{\beta}$, for otherwise, the point lying in $\beta n$ sets would receive
weight greater than 1 in that fractional packing. The trick for handling other
fractional packings is to consider the sets in F with appropriate multiplicities.
Let $\psi\colon F \to [0, 1]$ be an optimal fractional packing ($\sum_{S \in F : x \in S} \psi(S) \le 1$
for all x). As we have noted in Theorem 10.1.1, we may assume that the
values of $\psi$ are rational numbers. Write $\psi(S) = \frac{m(S)}{D}$, where D and the m(S)
are integers (D is a common denominator). Let us form a new collection $F_m$
of sets, by putting m(S) copies of each S into $F_m$; so $F_m$ is a multiset of sets.
Let $N = |F_m| = \sum_{S \in F} m(S) = D \cdot \nu^*(F)$. Suppose that we could conclude
the existence of a point a lying in at least $\beta N$ sets of $F_m$ (counted with
multiplicity). Then

$$1 \ge \sum_{S \in F : a \in S} \psi(S) = \sum_{S \in F : a \in S} \frac{m(S)}{D} \ge \frac{1}{D} \cdot \beta N = \beta \nu^*(F),$$

and so $\nu^*(F) \le \frac{1}{\beta}$.
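
The passage from a rational fractional packing to a multiset of sets is just clearing denominators. A minimal Python sketch follows, assuming the packing is given as a dictionary from set labels to `Fraction` weights; the name `to_multiset` and the labels are hypothetical, and `math.lcm` requires Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm


def to_multiset(psi):
    """Return (D, m) with psi(S) = m(S) / D for every S, where D is a common denominator."""
    D = lcm(*(w.denominator for w in psi.values()))
    m = {S: int(w * D) for S, w in psi.items()}
    return D, m


psi = {"A": Fraction(1, 2), "B": Fraction(1, 3), "C": Fraction(1, 6)}
D, m = to_multiset(psi)
```

Here the multiset contains `sum(m.values())` sets in total, which equals D times the total weight of the packing, matching $N = D \cdot \nu^*(F)$ when the packing is optimal.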


The existence of a point a in at least $\beta N$ sets of $F_m$ follows from the
fractional Helly theorem, but we must be careful: The new family $F_m$ does
not have to satisfy the (p, d+1)-condition, since the (p, d+1)-condition for F
speaks only of p-tuples of distinct sets from F, while a p-tuple of sets from
$F_m$ may contain multiple copies of the same set.
Fortunately, $F_m$ does satisfy the (p', d+1)-condition with p' = d(p-1) + 1.
Indeed, a p'-tuple of sets of $F_m$ contains at least d+1 copies of the same set or
it contains p distinct sets, and in the latter case the (p, d+1)-condition for F
applies. Using the fractional Helly theorem (which does not require the sets
in the considered family to be distinct) as before, we see that there exists a
point a common to at least $\beta N$ sets of $F_m$ for some $\beta = \beta(p, d)$. Lemma 10.5.2
is proved, and this also concludes the proof of the (p, q)-theorem. □
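
The pigeonhole step behind p' = d(p-1) + 1 can be verified exhaustively for small parameters: if a selection has at most p-1 distinct sets, each taken at most d times, it has at most d(p-1) members. The Python sketch below is an illustration only; sets are represented by integer labels and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement


def forced(selection, p, d):
    """True if the selection has d+1 copies of one label or at least p distinct labels."""
    counts = Counter(selection)
    return max(counts.values()) >= d + 1 or len(counts) >= p


def always_forced(size, p, d, universe):
    """Check `forced` for every multiset of the given size over `universe` labels."""
    return all(
        forced(sel, p, d)
        for sel in combinations_with_replacement(range(universe), size)
    )
```

With p = 3 and d = 2 every selection of size 5 is forced, while size 4 is not enough (take two labels twice each), so the value d(p-1) + 1 cannot be lowered in this argument.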

Bibliography and remarks. The (p, q)-problem was posed by
Hadwiger and Debrunner in 1957, who also solved the special case in
Exercise 2 below. The solution described in this section follows Alon
and Kleitman [AK92].
Much better quantitative bounds on $HD_d(p, q)$ were obtained by
Kleitman, Gyárfás, and Tóth [KGT01] for the smallest nontrivial
values of p, q, d: $3 \le HD_2(4, 3) \le 13$.

3 By removing these $\beta n$ sets and iterating, we would get that F can be pierced by
O(log n) points. The main point of the (p, q)-theorem is to get rid of this log n
factor.

Exercises
1. For which values of p and r does the following hold? Let F be a finite
family of convex sets in $\mathbf{R}^d$, and suppose that any subfamily consisting
of at most p sets can be pierced by at most r points. Then F can be
pierced by at most C points, for some $C = C_d(p, r)$.
2. Let $p \ge q \ge d+1$ and $p(d-1) < (q-1)d$. Prove that $HD_d(p, q) \le p-q+1$.
You may want to start with the case of $HD_2(5, 4)$.
3. Let $X \subset \mathbf{R}^2$ be a (4k+1)-point set, and let
$F = \{\mathrm{conv}(Y) : Y \subset X, |Y| = 2k+1\}$.
(a) Verify that F has the (4, 3)-property, and show that if X is in convex
position, then $\tau(F) \ge 3$.
(b) Show that $\tau(F) \le 5$ (for any X).
These results are due to Alon and Rosenfeld (private communication).

10.6 A (p, q)-Theorem for Hyperplane Transversals


The technique of the proof of the (p, q)-theorem is quite general and allows
one to prove (p, q)-theorems for various families. That is, if we have some basic
family B of sets, such as the family $\mathcal{K}$ of all convex sets in Theorem 10.5.1, a
(p, q)-theorem for B means that if $F \subseteq B$ satisfies the (p, q)-condition, then
$\tau(F)$ is bounded by a function of p and q (depending on B but not on the
choice of F).
To apply the technique in such a situation, we first need to bound $\nu^*(F)$
using the (p, q)-condition. To this end, it suffices to derive a fractional
Helly-type theorem for B. Next, we need to bound $\tau(F)$ as a function of $\tau^*(F)$. If
the VC-dimension of F is bounded, this is just Corollary 10.2.7, and
otherwise, we need to prove a "weak $\varepsilon$-net theorem" for F. Here we present one
sophisticated illustration.
10.6.1 Theorem (A (p, q)-theorem for hyperplane transversals).
Let $p \ge d+1$ and let F be a finite family of convex sets in $\mathbf{R}^d$ such that
among every p members of F, there exist d+1 that have a common
hyperplane transversal (i.e., there is a hyperplane intersecting all of them). Then
there are at most C = C(p, d) hyperplanes whose union intersects all
members of F.
Note that here the piercing is not by points but by hyperplanes. Let
$\tau_{\mathrm{hyp}}(F)$, $\tau^*_{\mathrm{hyp}}(F)$, and $\nu^*_{\mathrm{hyp}}(F)$ be the notions corresponding to the transversal
number, fractional transversal number, and fractional packing number in this
setting.⁴ We prove only the planar case, since some of the required auxiliary
results become more complicated in higher dimensions.
4 We could reformulate everything in terms of piercing by points if we wished to
do so, by assigning to every $S \in F$ the set $T_S$ of all hyperplanes intersecting S.
Then, e.g., $\tau_{\mathrm{hyp}}(F) = \tau(\{T_S : S \in F\})$.

To prove Theorem 10.6.1 for d = 2, we first want to derive a fractional
Helly theorem.

10.6.2 Lemma (Fractional Helly for line transversals). If F is a family
of n convex sets in the plane such that at least $\alpha\binom{n}{3}$ triples have line
transversals, then at least $\beta n$ of the sets have a line transversal, $\beta = \beta(\alpha) > 0$.

Proof. Let F be a family as in the lemma. We distinguish two cases
depending on the number of pairs of sets in F that intersect. Throughout,
c > 0 denotes a suitable small constant multiple of $\alpha$.
First, suppose that at least $c\binom{n}{2}$ pairs $\{S, S'\} \in \binom{F}{2}$ satisfy $S \cap S' \neq \emptyset$.
Project all sets of F vertically on the x-axis. The projections form a family of
intervals with at least $c\binom{n}{2}$ intersecting pairs, and so by the one-dimensional
fractional Helly theorem, at least $\beta' n$ of these have a common point x. The
vertical line through x intersects $\beta' n$ sets of F.
Next, it remains to deal with the case of at most $c\binom{n}{2}$ intersecting pairs
in F. Call a triple $\{S_1, S_2, S_3\}$ good if it has a line transversal and its three
members are pairwise disjoint. Since each intersecting pair gives rise to at
most n triples whose members are not pairwise disjoint, there are at most
$n \cdot c\binom{n}{2}$ nondisjoint triples. For c small enough this is only a fraction of the
$\alpha\binom{n}{3}$ triples with line transversals, and so at least a constant multiple of
$\alpha\binom{n}{3}$ good triples remain.
Let $\{S_1, S_2, S_3\}$ be a good triple; we claim that its sets have a line
transversal that is a common tangent to (at least) two of them. To see this,
start with an arbitrary line transversal, translate it until it becomes tangent
to one of the $S_i$, and then rotate it while keeping tangent to $S_i$ until it
becomes tangent to an $S_j$, $i \neq j$.

Let L denote the set of all lines that are common tangents to at least
two disjoint members of F. Since two disjoint convex sets in the plane have
exactly 4 common tangents, $|L| \le 4\binom{n}{2}$.
First, to see the idea, let us make the simplifying assumption that no 3
sets of F have a common tangent. Then each line $\ell \in L$ has a unique defining
pair of disjoint sets for which it is a common tangent. As we have seen, for
each good triple $\{S_1, S_2, S_3\}$ there is a line $\ell \in L$ such that two sets of the
triple are the defining pair of $\ell$ and the third is intersected by $\ell$. Now, since
the number of good triples is at least a constant multiple of $\alpha\binom{n}{3}$ and
$|L| \le 4\binom{n}{2}$, there is an $\ell_0 \in L$ playing this role
for at least $\delta n$ of the good triples, $\delta > 0$. Each of these $\delta n$ triples contains
the defining pair of $\ell_0$ plus some other set, so altogether $\ell_0$ intersects at least
$\delta n$ sets. (Note the similarity to the proof of the fractional Helly theorem.)
Now we need to relax the simplifying assumption. Instead of working with
lines, we work with pairs $(\ell, \{S, S'\})$, where $S, S' \in F$ are disjoint and $\ell$ is
one of their common tangents, and we let L be the set of all such pairs. We
still have $|L| \le 4\binom{n}{2}$, and each good triple $\{S_1, S_2, S_3\}$ gives rise to at least
one $(\ell, \{S, S'\}) \in L$, where $\{S, S'\} \subset \{S_1, S_2, S_3\}$. The rest of the argument
is as before. □

The interesting feature is that while this fractional Helly theorem is valid,
there is no Helly theorem for line transversals! That is, for all n one can
find families of n disjoint planar convex sets (even segments) such that any
n-1 have a line transversal but there is no line transversal for all of them
(Exercise 5.1.9).
Lemma 10.6.2 implies, exactly as in the proof of Lemma 10.5.2, that $\nu^*_{\mathrm{hyp}}$
is bounded for any family satisfying the (p, d+1)-condition. It remains to
prove a weak $\varepsilon$-net result.

10.6.3 Lemma. Let L be a finite set (or multiset) of lines in the plane and
let $r \ge 1$ be given. Then there exists a set N of $O(r^2)$ lines (a weak $\varepsilon$-net)
such that whenever $S \subseteq \mathbf{R}^2$ is an (arcwise) connected set intersecting more
than $\frac{|L|}{r}$ lines of L, then it intersects a line of N.

Proof. Recall from Section 4.5 that a $\frac{1}{r}$-cutting for a set L of lines is a
collection $\{\Delta_1, \ldots, \Delta_t\}$ of generalized triangles covering the plane such that
the interior of each $\Delta_i$ is intersected by at most $\frac{|L|}{r}$ lines of L. The cutting
lemma (Lemma 4.5.3) guarantees the existence of a $\frac{1}{r}$-cutting of size $O(r^2)$.
The cutting lemma does not directly cover multisets of lines. Nevertheless,
with some care one can check that the perturbation argument works for
multisets of lines as well.
Thus, let $\{\Delta_1, \ldots, \Delta_t\}$ be a $\frac{1}{r}$-cutting for the considered L, $t = O(r^2)$.
The weak $\varepsilon$-net N is obtained by extending each side of each $\Delta_i$ into a line.
Indeed, if an arcwise connected set S intersects more than $\frac{|L|}{r}$ lines of L,
then it cannot be contained in the interior of a single $\Delta_i$, and consequently,
it intersects a line of N. □

Conclusion of the proof of Theorem 10.6.1. Lemma 10.6.3 is now
used exactly as the $\varepsilon$-net results were used before, to show that
$\tau_{\mathrm{hyp}}(F) = O(\tau^*_{\mathrm{hyp}}(F)^2)$ in this case. This proves the planar version of
Theorem 10.6.1. □

Bibliography and remarks. Theorem 10.6.1 was proved by Alon
and Kalai [AK95], as well as the results indicated in Exercises 3
and 4 below. It is related to the following conjecture of Grünbaum
and Motzkin: Let F be a family of sets in $\mathbf{R}^d$ such that the
intersection of any at most k sets of F is a disjoint union of at most k closed
convex sets. Then the Helly number of F is at most k(d+1). So here,
in contrast to Exercise 4, the Helly number is determined exactly. I
mention this mainly because of a neat proof by Amenta [Ame96] using
a technique originally developed for algorithmic purposes.
It is not completely honest to say that there is no Helly theorem for
line (and hyperplane) transversals, since there are very nice theorems
of this sort, but the assumptions must be strengthened. For example,
Hadwiger's transversal theorem asserts that if F is a finite family of
disjoint convex sets in the plane with a linear ordering $\le$ such that
every 3 members of F can be intersected by a directed line in the order
given by $\le$, then F has a line transversal. This has been generalized
to hyperplane transversals in $\mathbf{R}^d$, and many related results are known;
see, e.g., the survey Goodman, Pollack, and Wenger [GPW93].
The application of the Alon-Kleitman technique for transversals
of d-intervals in Exercise 2 below is due to Alon [Alo98]. Earlier, a
similar result with the slightly stronger bound $\tau \le (d^2 - d)\nu$ was
proved by Kaiser [Kai97] by a topological method, following an initial
breakthrough by Tardos [Tar95], who dealt with the case d = 2. By
the Alon-Kleitman method, Alon [Alo] proved analogous bounds for
families whose sets are subgraphs with at most d components of a
given tree, or, more generally, subgraphs with at most d components
of a graph G of bounded tree-width. In a sense, the latter is an "if
and only if" result, since for every k there exists w(k) such that every
graph of tree-width w(k) contains a collection of subtrees with $\nu = 1$
and $\tau \ge k$.
Alon, Kalai, Matousek, and Meshulam [AKMM01] investigated
generalizations of the Alon-Kleitman technique in the setting of ab-
stract set systems. They showed that (p, d+1)-theorems for all p fol-
low from a suitable fractional Helly property concerning (d+ 1)-tuples,
and further that a set system whose nerve is d-Leray (see the notes to
Section 8.1) has the appropriate fractional Helly property and conse-
quently satisfies (p, d+ 1)-theorems.

Exercises
1. (a) Prove that if F is a finite family of circular disks in the plane such
that every two members of F intersect, then $\tau(F)$ is bounded by a
constant (this is a very weak version of Gallai's problem mentioned at the
beginning of this chapter).
(b) Show that for every $p \ge 2$ there is an $n_0$ such that if a family of
$n_0$ disks in the plane satisfies the (p, 2)-condition, then there is a point
common to at least 3 disks of the family.
(c) Prove a (p, 2)-theorem for disks in the plane (or for balls in $\mathbf{R}^d$).
2. A d-interval is a set $J \subseteq \mathbf{R}$ of the form $J = I_1 \cup I_2 \cup \cdots \cup I_d$, where
the $I_j \subset \mathbf{R}$ are closed intervals on the real line. (In the literature this is
customarily called a homogeneous d-interval.)
(a) Let F be a finite family of d-intervals with $\nu(F) = k$. The family
may contain multiple copies of the same d-interval. Show that there is a
$\beta = \beta(d, k) > 0$ such that for any such F, there is a point contained in
at least $\beta \cdot |F|$ members of F. Can you prove this with $\beta = \frac{1}{2dk}$?
(b) Prove that $\tau(F) \le d\,\tau^*(F)$ for any finite family of d-intervals.
(c) Show that $\tau(F) \le 2d^2 \nu(F)$ for any finite family of d-intervals, or at
least that $\tau$ is bounded by a function of d and $\nu$.
3. Let $\mathcal{K}^d_k$ denote the family of all unions of at most k convex sets in $\mathbf{R}^d$
(so the d-intervals from Exercise 2 are in $\mathcal{K}^1_d$). Prove a (p, d+1)-theorem
for this family by the Alon-Kleitman technique: Whenever a finite family
$F \subset \mathcal{K}^d_k$ satisfies the (p, d+1)-condition, $\tau(F) \le f(p, d, k)$ for some
function f.
4. (a) Show that the family $\mathcal{K}^2_2$ as in Exercise 3 has no finite Helly number.
That is, for every h there exists a subfamily $F \subset \mathcal{K}^2_2$ of h+1 sets in which
every h members intersect but $\bigcap F = \emptyset$.
(b) Use the result of Exercise 3 to derive that for every $k, d \ge 1$, there
exists an h with the following property. Let $F \subset \mathcal{K}^d_k$ be a finite family
such that the intersection of any subfamily of F lies in $\mathcal{K}^d_k$ (i.e., is a union
of at most k convex sets). Suppose that every at most h members of F
have a common point. Then all the sets of F have a common point. (This
is expressed by saying that the family $\mathcal{K}^d_k$ has Helly order at most h.)
11

Attempts to Count k-Sets

Consider an n-point set $X \subset \mathbf{R}^d$, and fix an integer k. Call a k-point subset
$S \subseteq X$ a k-set of X if there exists an open half-space $\gamma$ such that $S = X \cap \gamma$;
that is, S can be "cut off" by a hyperplane. In this chapter we want to
estimate the maximum possible number of k-sets of an n-point set in $\mathbf{R}^d$, as
a function of n and k.
This question is known as the k-set problem, and it seems to be extremely
challenging. Only partial results have been found so far, and there is a sub-
stantial gap between the upper and lower bounds even for the number of
planar k-sets, in spite of considerable efforts by many researchers. So this
chapter presents work in progress, much more so than the other parts of this
book. I believe that the k-set problem deserves to be such an exception, since
it has stimulated several interesting directions of research, and the partial
results have elegant proofs.
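
To make the definition of a k-set concrete, here is a minimal brute-force sketch for the planar case. It assumes integer coordinates and general position (no three points collinear), and it uses the observation that every k-set can be cut off by a line obtained by slightly translating or rotating a line through two points of X. The function names are our own, and the cubic running time is meant for small experiments only.

```python
from itertools import permutations


def cross(o, a, b):
    """Twice the signed area of the triangle o, a, b (positive if b is left of o->a)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def planar_k_sets(points, k):
    """Return all k-sets (as frozensets) of a planar point set.

    Assumes integer coordinates, at least two points, and no three collinear.
    """
    found = set()
    for p, q in permutations(points, 2):
        left = frozenset(r for r in points if cross(p, q, r) > 0)
        # Nudging the line pq gives open half-planes that contain `left`
        # together with any subset of {p, q}.
        for extra in ((), (p,), (q,), (p, q)):
            candidate = left | frozenset(extra)
            if len(candidate) == k:
                found.add(candidate)
    return found


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
for k in range(5):
    print(k, len(planar_k_sets(square, k)))
```

For points in convex position the k-sets are exactly the runs of k consecutive hull vertices, which gives a quick sanity check on small inputs.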

11.1 Definitions and First Estimates


For technical reasons, we are going to investigate a quantity slightly different
from the number of k-sets, which turns out to be asymptotically equivalent,
however.
First we consider a planar set $X \subset \mathbf{R}^2$ in general position. A k-facet of
X is a directed segment xy, $x, y \in X$, such that exactly k points of X lie
(strictly) to the left of the directed line determined by x and y.

[Figure: a planar point set with a 4-facet marked.]

Similarly, for $X \subset \mathbf{R}^d$, a k-facet is an oriented (d-1)-dimensional simplex
with vertices $x_1, x_2, \ldots, x_d \in X$ such that the hyperplane h determined by
$x_1, x_2, \ldots, x_d$ has exactly k points of X (strictly) on its positive side. (The
orientation of the simplex means that one of the half-spaces determined by
h is designated as positive and the other one as negative.)
Let us stress that we consider k-facets only for sets X in general position
(no d+1 points on a common hyperplane). In such a case, the 0-facets are
precisely the facets of the convex hull of X, and this motivates the name
k-facet (so k-facets are not k-dimensional!).
A special case of k-facets are the halving facets. These exist only if n - d
is even, and they are the $\frac{n-d}{2}$-facets; i.e., they have exactly the same number
of points on both sides of their hyperplane. Each halving facet appears as
an $\frac{n-d}{2}$-facet with both orientations, and so halving facets can be considered
unoriented. In the plane, instead of k-facets and halving facets, one often
speaks of k-edges and halving edges. The drawing shows a planar point set
with the halving edges:

We let KFAC(X, k) denote the number of k-facets of X, and $\mathrm{KFAC}_d(n, k)$
is the maximum of KFAC(X, k) over all n-point sets $X \subset \mathbf{R}^d$ in general
position.
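
The definition of KFAC(X, k) can be checked on small examples by brute force. The following sketch (our own helper names, integer coordinates and general position assumed) decides the side of each point by the sign of a determinant, and counts each d-point simplex once for each of its two orientations.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (fine for small d)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def count_k_facets(points, k):
    """Number of k-facets of a point set in R^d given as integer tuples."""
    d = len(points[0])
    total = 0
    for simplex in combinations(points, d):
        base = simplex[0]
        rows = [[c - b for c, b in zip(v, base)] for v in simplex[1:]]
        pos = neg = 0
        for x in points:
            if x in simplex:
                continue
            s = det(rows + [[c - b for c, b in zip(x, base)]])
            if s > 0:
                pos += 1
            elif s < 0:
                neg += 1
        # One orientation sees `pos` points on its positive side, the other `neg`.
        total += (pos == k) + (neg == k)
    return total


def count_halving_facets(points):
    """Number of (unoriented) halving facets; zero if n - d is odd."""
    n, d = len(points), len(points[0])
    if (n - d) % 2:
        return 0
    return count_k_facets(points, (n - d) // 2) // 2


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(count_k_facets(square, 0), count_halving_facets(square))
```

For the square, the 0-facets are the four hull edges and the halving edges are the two diagonals.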
Levels, k-sets, and k-facets. The maximum possible number of k-sets is
attained for point sets in general position: Each k-set is defined by an open
half-space, and so a sufficiently small perturbation of X loses no k-sets (while
it may create some new ones).
Next, we want to show that for sets in general position, the number of
k-facets and the number of k-sets are closely related (although the exact
relations are not simple). The best way seems to be to view both notions in
the dual setting.
Let $X \subset \mathbf{R}^d$ be a finite set in general position. Let $H = \{\mathcal{D}(x) : x \in X\}$ be
the collection of hyperplanes dual to the points of X, where $\mathcal{D}$ is the duality
"with the origin at $x_d = -\infty$" as defined in Section 5.1.
We may assume that each k-set S of X is cut off by a nonvertical
hyperplane $h_S$ that does not pass through any point of X. If S lies below $h_S$,
then the dual point $y_S = \mathcal{D}(h_S)$ is a point lying on no hyperplane of H and
having exactly k hyperplanes of H below it. So $y_S$ lies in the interior of a
cell at level k of the arrangement of H. Similarly, if S lies above $h_S$, then $y_S$
is in a cell at level n-k. Moreover, if $y_{S_1}$ and $y_{S_2}$ lie in the same cell, then
$S_1 = S_2$, and so k-sets exactly correspond to cells of level k and n-k.
Similarly, we find that the k-facets of X correspond to vertices of the
arrangement of H of levels k or n-k-d (we need to subtract d because of
the d hyperplanes passing through the vertex that are not counted in its
level).
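
The correspondence between k-facets and vertices of the dual arrangement can be tried out in the plane. The sketch below assumes that the duality sends the point (a, b) to the line y = ax - b (our reading of the duality of Section 5.1, which is not reproduced here), takes points with distinct integer x-coordinates and no three collinear, and counts the vertices of the dual arrangement by level using exact rational arithmetic. All names are our own.

```python
from fractions import Fraction
from itertools import combinations


def dual_line(p):
    """Assumed planar duality: point (a, b) -> line y = a*x - b as (slope, intercept)."""
    a, b = p
    return (a, -b)


def vertices_by_level(lines):
    """Map level -> number of arrangement vertices at that level.

    Lines are (slope, intercept) pairs with distinct slopes, no three concurrent.
    The level of a vertex is the number of lines passing strictly below it.
    """
    counts = {}
    for (a1, b1), (a2, b2) in combinations(lines, 2):
        x = Fraction(b2 - b1, a1 - a2)
        y = a1 * x + b1
        level = sum(1 for (a, b) in lines if a * x + b < y)
        counts[level] = counts.get(level, 0) + 1
    return counts


def k_facets_from_levels(counts, n, k):
    """Planar k-facets: vertices of level k plus vertices of level n - k - 2."""
    return counts.get(k, 0) + counts.get(n - k - 2, 0)


points = [(0, 0), (1, 3), (2, 1), (4, 4), (5, 2)]
levels = vertices_by_level([dual_line(p) for p in points])
print(levels)
print([k_facets_from_levels(levels, len(points), k) for k in range(4)])
```

Summing the two level counts accounts for both orientations of each segment; when k = n - k - 2 the same vertices are counted twice, matching the fact that a halving edge is a k-facet in both orientations.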
The arrangement of H has at most $O(n^{d-1})$ unbounded cells
(Exercise 6.1.2). Therefore, all but at most $O(n^{d-1})$ cells of level k have a
topmost vertex, and the level of such a vertex is between k-d+1 and k. On
the other hand, every vertex is the topmost vertex of at most one cell
of level k. A similar relation exists between cells of level n-k and
vertices of level n-k-d. Therefore, the number of k-sets of X is at most
$O(n^{d-1}) + \sum_{j=0}^{d-1} \mathrm{KFAC}(X, k-j)$. Conversely, KFAC(X, k) can be bounded
in terms of the number of k-sets; this we leave to Exercise 2. From now on,
we thus consider only estimating $\mathrm{KFAC}_d(n, k)$.
Viewing $\mathrm{KFAC}_d(n, k)$ in terms of the k-level in a hyperplane arrangement,
we obtain some immediate bounds from the results of Section 6.3. The k-level
has certainly no more vertices than all the levels 0 through k together, and
hence
$$\mathrm{KFAC}_d(n, k) = O\left(n^{\lfloor d/2 \rfloor} (k+1)^{\lceil d/2 \rceil}\right)$$
by Theorem 6.3.1. On the other hand, the arrangements showing that
Theorem 6.3.1 is tight (constructed using cyclic polytopes) prove that for $k \le n/2$,
we have
$$\mathrm{KFAC}_d(n, k) = \Omega\left(n^{\lfloor d/2 \rfloor} (k+1)^{\lceil d/2 \rceil - 1}\right);$$

this determines $\mathrm{KFAC}_d(n, k)$ up to a factor of k.
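
As a quick illustration of how far apart these two immediate bounds are, one can substitute the two smallest dimensions into the exponents (a worked instance only, using nothing beyond the two displayed estimates):

```latex
\begin{align*}
d = 2:\quad & \mathrm{KFAC}_2(n,k) = O\bigl(n(k+1)\bigr)
  \quad\text{and}\quad \mathrm{KFAC}_2(n,k) = \Omega(n),\\
d = 3:\quad & \mathrm{KFAC}_3(n,k) = O\bigl(n(k+1)^2\bigr)
  \quad\text{and}\quad \mathrm{KFAC}_3(n,k) = \Omega\bigl(n(k+1)\bigr).
\end{align*}
```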


The levels 0 through n together have $O(n^d)$ vertices, and so for any
particular arrangement of n hyperplanes, if k is chosen at random, the expected
k-level complexity is $O(n^{d-1})$. This means that a level with a substantially
higher complexity has to be exceptional, much bigger than most other levels.
It seems hard to imagine how this could happen. Indeed, it is widely believed
that $\mathrm{KFAC}_d(n, k)$ is never much larger than $n^{d-1}$. On the other hand, levels
with somewhat larger complexity can appear, as we will see in Section 11.2.
Halving facets versus k-facets. In the rest of this chapter we will mainly
consider bounds on the halving facets; that is, we will prove estimates for the
function
$$\mathrm{HFAC}_d(n) = \tfrac{1}{2}\,\mathrm{KFAC}_d\left(n, \tfrac{n-d}{2}\right), \quad n-d \text{ even}.$$
It is easy to see that for all k, we have $\mathrm{KFAC}_d(n, k) \le 2 \cdot \mathrm{HFAC}_d(2n+d)$
(Exercise 1). Thus, for proving asymptotic bounds on $\max_{0 \le k \le n-d} \mathrm{KFAC}_d(n, k)$,
it suffices to estimate the number of halving facets. It turns out that even
a stronger result is true: The following theorem shows that upper bounds
on $\mathrm{HFAC}_d(n)$ automatically provide upper bounds on $\mathrm{KFAC}_d(n, k)$ sensitive
to k.
11.1.1 Theorem. Suppose that for some d and for all n, $\mathrm{HFAC}_d(n)$ can
be bounded by $O(n^{d-c_d})$, for some constant $c_d > 0$. Then we have, for all
$k \le \frac{n-d}{2}$,
$$\mathrm{KFAC}_d(n, k) = O\left(n^{\lfloor d/2 \rfloor} (k+1)^{\lceil d/2 \rceil - c_d}\right).$$

Proof. We use the method of the probabilistic proof of the cutting lemma
from Section 6.5 with only small modifications; we assume familiarity with
that proof. We work in the dual setting, and so we need to bound the number
of vertices of level k in the arrangement of a set H of n hyperplanes in general
position. Since for k bounded by a constant, the complexity of the k-level is
asymptotically determined by Clarkson's theorem on levels (Theorem 6.3.1),
we can assume $2 \le k \le \frac{n}{2}$.
We set $r = \frac{n}{k}$ and $p = \frac{r}{n} = \frac{1}{k}$, and we let $S \subseteq H$ be a random sample
obtained by independent Bernoulli trials with success probability p. This time
we let T(S) denote the bottom-vertex triangulation of the bottom unbounded
cell of the arrangement of S (actually, in this case it seems simpler to use the
top-vertex triangulation instead of the bottom-vertex one); the rest of the
arrangement is ignored. (For d = 2, we can take the vertical decomposition
instead.) Here is a schematic illustration for the planar case:

[Figure: the lines of S, the triangulation T(S), and level k of H.]

The conditions (C0)-(C2) as in Section 6.5 are satisfied for this T(S)
(in (C0) we have constants depending on d, of course), and as for (C3),
we have $|T(S)| = O(|S|^{\lfloor d/2 \rfloor} + 1)$ for all $S \subseteq H$ by the asymptotic upper
bound theorem (Theorem 5.5.2) and by the properties of the bottom-vertex
triangulation. Thus, an analog of Proposition 6.5.2 can be derived: For
every $t \ge 0$, the expected number of simplices with excess at least t in T(S)
is bounded as follows:

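$$\mathbf{E}\bigl[|T(S)_{\ge t}|\bigr] = O\left(2^{-t} r^{\lfloor d/2 \rfloor}\right) = O\left(2^{-t} \left(\tfrac{n}{k}\right)^{\lfloor d/2 \rfloor}\right). \qquad (11.1)$$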

Let $V_k$ denote the set of the vertices of level k in the arrangement of H,
whose size we want to estimate, and let $V_k(S)$ be the vertices in $V_k$ that
have level 0 with respect to the arrangement of S; i.e., they are covered by a
simplex of T(S).
First we claim that, typically, a significant fraction of the vertices of $V_k$
appears in $V_k(S)$, namely, $\mathbf{E}[|V_k(S)|] \ge \frac{1}{4}|V_k|$. For every $v \in V_k$, the
probability that $v \in V_k(S)$, i.e., that none of the at most k hyperplanes below v
goes into S, is at least $(1-p)^k = (1-\frac{1}{k})^k \ge \frac{1}{4}$, and the claim follows.
It remains to bound $\mathbf{E}[|V_k(S)|]$ from above. Let $\Delta \in T(S)$ be a simplex
and let $H_\Delta$ be the set of all hyperplanes of H intersecting $\Delta$. Not all of these
hyperplanes have to intersect the interior of $\Delta$ (and thus be counted in the
excess of $\Delta$), but since H is in general position, there are at most a constant
number of such exceptional hyperplanes. We note that all the vertices in
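$V_k(S) \cap \Delta$ have the same level in the arrangement of $H_\Delta$ (it is k minus the
number of hyperplanes below $\Delta$). By the assumption in the theorem, we thus
have $|V_k(S) \cap \Delta| = O(|H_\Delta|^{d-c_d}) = O((t_\Delta \frac{n}{r})^{d-c_d}) = O((t_\Delta k)^{d-c_d})$, where $t_\Delta$
is the excess of $\Delta$. Therefore,
$$\mathbf{E}[|V_k(S)|] \le O(k^{d-c_d}) \cdot \mathbf{E}\Bigl[\sum_{\Delta \in T(S)} t_\Delta^{d-c_d}\Bigr].$$
Using (11.1), the expected sum is bounded by $O((\frac{n}{k})^{\lfloor d/2 \rfloor})$; this is as in Section 6.5.
We have shown that
$$|V_k| \le 4\,\mathbf{E}[|V_k(S)|] = O\left(n^{\lfloor d/2 \rfloor} k^{\lceil d/2 \rceil - c_d}\right),$$
and Theorem 11.1.1 is proved. □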
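
For readers who want the last two computations spelled out, the following lines sketch them under the choices made in the proof (sampling probability 1/k and $k \ge 2$). The constant 4 comes from the fact that $(1-\frac{1}{k})^k$ is increasing in k, and the final exponent is plain arithmetic with $d = \lfloor d/2 \rfloor + \lceil d/2 \rceil$.

```latex
\begin{align*}
\Bigl(1-\frac{1}{k}\Bigr)^{k} &\ge \Bigl(1-\frac{1}{2}\Bigr)^{2} = \frac{1}{4}
  \qquad (k \ge 2),\\[2pt]
k^{\,d-c_d}\Bigl(\frac{n}{k}\Bigr)^{\lfloor d/2\rfloor}
  &= n^{\lfloor d/2\rfloor}\, k^{\,d-\lfloor d/2\rfloor-c_d}
   = n^{\lfloor d/2\rfloor}\, k^{\,\lceil d/2\rceil-c_d}.
\end{align*}
```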
Bibliography and remarks. We summarize the bibliography of
k-sets here, and in the subsequent sections we only mention the origins
of the particular results described there. In the following we always
assume $k \ge 1$, which allows us to write k instead of k+1 in the bounds.
The first paper concerning k-sets is by Lovász [Lov71], who proved
an $O(n^{3/2})$ bound for the number of halving edges. Straus
(unpublished) showed an $\Omega(n \log n)$ lower bound. This appeared, together
with the bound $O(n\sqrt{k})$ for planar k-sets, in Erdős, Lovász, Simmons,
and Straus [ELSS73]. The latter bound was independently found by
Edelsbrunner and Welzl [EW85]. It seems to be the natural bound to
come up with if one starts thinking about planar k-sets; there are
numerous variations of the proof (see Agarwal, Aronov, Chan, and Sharir
[AACS98]), and breaking this barrier took quite a long time. The first
progress was made by Pach, Steiger, and Szemerédi [PSS92], who
improved the upper bound by the tiny factor of $\log^* k$. A significant
breakthrough, and the current best planar upper bound of $O(nk^{1/3})$,
was achieved by Dey [Dey98]. A simpler version of his proof, involving
new insights, was provided by Andrzejak, Aronov, Har-Peled, Seidel,
and Welzl [AAHP+98].
An improvement over the $\Omega(n \log k)$ lower bound [ELSS73] was
obtained by Tóth [Tót01b], namely, $\mathrm{KFAC}_2(n, k) \ge n \exp(c\sqrt{\log k})$
for a constant c > 0 (a similar bound was found by Klawe, Paterson,
and Pippenger in the 1980s in an unpublished manuscript, but only
for the number of vertices of level k in an arrangement of n pseudolines
in the plane).
The first nontrivial bound on k-sets in higher dimension was
proved by Bárány, Füredi, and Lovász [BFL90]. They showed that
$\mathrm{HFAC}_3(n) = O(n^{2.998})$. Their method includes the main ingredients
of most of the subsequent improvements; in particular, they proved a
planar version of the second selection lemma (Theorem 9.2.1) and
conjectured the colored Tverberg theorem (see the notes to Sections 8.3
and 9.2). Aronov, Chazelle, Edelsbrunner, Guibas, Sharir, and Wenger
[ACE+91] improved the bound for the planar second selection lemma
(with a new proof) and showed that $\mathrm{HFAC}_3(n) = O(n^{8/3}\log^{5/3} n)$.
A nontrivial upper bound for every fixed dimension d,
$\mathrm{HFAC}_d(n) = O(n^{d-c_d})$ for a suitable $c_d > 0$, was obtained by Alon, Bárány,
Füredi, and Kleitman [ABFK92], following the method of [BFL90]
and using the recently established colored Tverberg theorem. Dey and
Edelsbrunner [DE94] proved a slightly better 3-dimensional bound
$\mathrm{HFAC}_3(n) = O(n^{8/3})$ by a direct and simple 3-dimensional argument
avoiding the use of a planar selection lemma (see Exercise 11.3.8). A
new significant improvement to $\mathrm{HFAC}_3(n) = O(n^{2.5})$ was achieved by
Sharir, Smorodinsky, and Tardos [SST01]; their argument is sketched
in the notes to Section 11.4.
Theorem 11.1.1 is due to Agarwal et al. [AACS98]. Their proof
uses a way of random sampling different from ours, but the idea is the
same.
Another interesting result on planar k-sets, due to Welzl [Wel86], is
$\sum_{k \in K} \mathrm{KFAC}(X, k) = O\bigl(n\sqrt{\sum_{k \in K} k}\bigr)$ for every n-point set $X \subset \mathbf{R}^2$
and every index set $K \subseteq \{1, 2, \ldots, \lfloor n/2 \rfloor\}$ (see Exercise 11.3.2). Using
identities derived by Andrzejak et al. [AAHP+98] (based on Dey's
method), the bound can be improved to $O\bigl(n(|K| \cdot \sum_{k \in K} k)^{1/3}\bigr)$; this
was communicated to me by Emo Welzl.
Edelsbrunner, Valtr, and Welzl [EVW97] showed that "dense" sets
X, i.e., n-point $X \subset \mathbf{R}^d$ such that the ratio of the maximum to
minimum interpoint distance is $O(n^{1/d})$, cannot asymptotically maximize
the number of k-sets. For example, in the plane, they proved that a
bound of $\mathrm{HFAC}_2(n) = O(n^{1+\alpha})$ for arbitrary sets implies that any
n-point dense set has at most $O(n^{1+\alpha/2})$ halving edges. Alt, Felsner,
Hurtado, and Noy [AFH+00] showed that if $X \subset \mathbf{R}^2$ is a set contained
in a union of C convex curves, then KFAC(X, k) = O(n) for all k, with
the constant of proportionality depending on C.
Several upper bounds concern the maximum combinatorial
complexity of level k for objects other than hyperplanes. For segments in
the plane, the estimate obtained by combining a result of Dey [Dey98]
with the general tools in Agarwal et al. [AACS98] is $O(nk^{1/3}\alpha(\frac{n}{k}))$.
Their method yields the same result for the level k in an arrangement
of n extendible pseudosegments (defined in Exercise 6.2.5). For
arbitrary pseudosegments, the result of Chan mentioned in that exercise
(n pseudosegments can be cut into $O(n \log n)$ extendible
pseudosegments) gives the slightly worse bound $O(nk^{1/3}\alpha(\frac{n}{k})\log^{2/3}(k+1))$.
The study of levels in arrangements of curves with more than one
pairwise intersection was initiated by Tamaki and Tokuyama [TT98],
who considered a family of n parabolas in $\mathbf{R}^2$ (here is a neat
motivation: Given n points in the plane, each of them moving along a straight
line with constant velocity, how many times can the pair of points with
median distance change?). They showed that n parabolas can be cut
into $O(n^{5/3})$ pieces in total so that the resulting collection of curves
is a family of pseudosegments (see Exercise 6). This idea of cutting
curves into pseudosegments proved to be of great importance for other
problems as well; see the notes to Section 4.5. Tamaki and Tokuyama
obtained the bound of $O(n^{2-1/12})$ for the maximum complexity of the
k-level for n parabolas. Using the tools from [AACS98] and a cutting
into extendible pseudosegments, Chan [Cha00a] improved this bound
to $O(nk^{1-2/9}\log^{2/3}(k+1))$.
All these results can be transferred without much difficulty from
parabolas to pseudocircles, which are closed planar Jordan curves,
every two intersecting at most twice. Aronov and Sharir [AS01a] proved
that if the curves are circles, then even cutting into $O(n^{3/2+\varepsilon})$
pseudosegments is possible (the best known lower bound is $\Omega(n^{4/3})$; see
Exercise 5). This upper bound was extended by Nevo, Pach, Pinchasi,
and Sharir [NPPS01] to certain families of pseudocircles: The
pseudocircles in the family should be selected from a 3-parametric family of
real algebraic curves and satisfy an additional condition; for example,
it suffices that their interiors can be pierced by O(1) points (also see
Alon, Last, Pinchasi, and Sharir [ALPS01] for related results).
Tamaki and Tokuyama constructed a family of n curves with at
most 3 pairwise intersections that cannot be cut into fewer than $\Omega(n^2)$
pseudosegments, demonstrating that their approach cannot yield
nontrivial bounds for the complexity of levels for such general curves
(Exercise 5). However, for graphs of polynomials of degree at most s,
Chan [Cha00a] obtained a cutting into roughly $O(n^{2-1/3^{s-1}})$
pseudosegments and consequently a nontrivial upper bound for levels. His
bound was improved by Nevo et al. [NPPS01].
As for higher-dimensional results, Katoh and Tokuyama [KT99]
proved the bound $O(n^2 k^{2/3})$ for the complexity of the k-level for n
triangles in $\mathbf{R}^3$.
Bounds on k-sets have surprising applications. For example, Dey's
results for planar k-sets mentioned above imply that if G is a graph
with n vertices and m edges and each edge has weight that is a linear
function of time, then the minimum spanning tree of G changes at
most $O(mn^{1/3})$ times; see Eppstein [Epp98]. The number of k-sets
of the infinite set $(\mathbf{Z}_0^+)^d$ (lattice points in the nonnegative orthant)
appears in computational algebra in connection with Gröbner bases
of certain ideals. The bounds of $O((k \log k)^{d-1})$ and $\Omega(k^{d-1} \log k)$ for
every fixed d, as well as references, can be found in Wagner [Wag01].

Exercises
1. Verify that for all k and all dimensions d, $\mathrm{KFAC}_d(n, k) \le 2 \cdot \mathrm{HFAC}_d(2n+d)$.
2. Show that every vertex in an arrangement of hyperplanes in general
position is the topmost vertex of exactly one cell. For $X \subset \mathbf{R}^d$ finite and in
general position, bound KFAC(X, k) using the numbers of j-sets of X,
$k \le j \le k+d-1$.
3. Suppose that we have a construction that provides an n-point set in the
plane with at least f(n) halving edges for all even n. Show that this
implies $\mathrm{KFAC}_2(n, k) = \Omega(\lfloor n/2k \rfloor f(2k))$ for all $k \le \frac{n}{2}$.
4. Suppose that for all even n, we can construct a planar n-point set with at
least f(n) halving edges. Show that one can construct n-point sets with
$\Omega(n f(n))$ halving facets in $\mathbf{R}^3$ (for infinitely many n, say). Can you
extend the construction to $\mathbf{R}^d$, obtaining $\Omega(n^{d-2} f(n))$ halving facets?
5. (Lower bounds for cutting curves into pseudosegments) In this exercise, $\Gamma$
is a family of n curves in the plane, such as those considered in connection
with Davenport-Schinzel sequences: Each curve intersects every vertical
line exactly once, every two curves intersect at most s times, and no 3
have a common point.
(a) Construct such a family $\Gamma$ with s = 2 (a family of pseudoparabolas)
whose arrangement has $\Omega(n^{4/3})$ empty lenses, where an empty lens is
a bounded cell of the arrangement of $\Gamma$ bounded by two of the curves.
(The number of empty lenses is obviously a lower bound for the number
of cuts required to turn $\Gamma$ into a family of pseudosegments.)
(b) Construct a family $\Gamma$ with s = 3 and with $\Omega(n^2)$ empty lenses.
6. (Cutting pseudoparabolas into pseudosegments) Let $\Gamma$ be a family of n
pseudoparabolas in the plane as in Exercise 5(a). For every two curves
$\gamma, \gamma' \in \Gamma$ with exactly two intersection points, the lens defined by $\gamma$ and
$\gamma'$ consists of the portions of $\gamma$ and $\gamma'$ between their two intersection
points, as indicated in the picture:

(a) Let A be a family of pairwise nonoverlapping lenses in the
arrangement of $\Gamma$, where two lenses are nonoverlapping if they do not share any
edge of the arrangement (but they may intersect, or one may be enclosed
in the other). The goal is to bound the maximum size of A. We define a
bipartite graph G with $V(G) = \Gamma \times \{0, 1\}$ and with E(G) consisting of all
edges $\{(\gamma, 0), (\gamma', 1)\}$ such that there is a lens in A whose lower portion
comes from $\gamma$ and upper portion from $\gamma'$. Prove that G contains no $K_{3,4}$
and hence $|A| = O(n^{5/3})$. Supposing that $K_{3,4}$ were present,
corresponding to "lower" curves $\gamma_1, \gamma_2, \gamma_3$ and "upper" curves $\gamma'_1, \ldots, \gamma'_4$, consider
the upper envelope U of $\gamma_1, \gamma_2, \gamma_3$ and the lower envelope L of $\gamma'_1, \ldots, \gamma'_4$.
(A more careful argument shows that even $K_{3,3}$ is excluded.)
(b) Show that the graph G in (a) can contain a $K_{2,r}$ for arbitrarily large r.
(c) Given $\Gamma$, define the lens set system $(X, \mathcal{L})$ with X consisting of all
bounded edges of the arrangement of $\Gamma$ and the sets of $\mathcal{L}$ corresponding
to lenses (each lens contributes the set of arrangement edges contained
in its two arcs). Check that $\tau(\mathcal{L})$ is the smallest number of cuts needed
to convert $\Gamma$ into a collection of pseudosegments, and that the result of
(a) implies $\nu(\mathcal{L}) = O(n^{5/3})$.
(d) Using the method of the proof of Clarkson's theorem on levels and
the inequality in Exercise 10.1.4(a), prove that $\tau(\mathcal{L}) = O(n^{5/3})$.
7. (The k-set polytope) Let $X \subset \mathbf{R}^d$ be an n-point set in general position
and let $k \in \{1, 2, \ldots, n-1\}$. The k-set polytope $Q_k(X)$ is the convex hull
of the set
$$\Bigl\{\sum_{x \in S} x : S \subset X,\ |S| = k\Bigr\}$$
in $\mathbf{R}^d$. Prove that the vertices of $Q_k(X)$ correspond bijectively to the
k-sets of X.
The k-set polytope was introduced by Edelsbrunner, Valtr, and Welzl
[EVW97]. It can be used for algorithmic enumeration of k-sets, for ex-
ample by the reverse search method mentioned in the notes to Section 5.5.

11.2 Sets with Many Halving Edges


Here we are going to construct n-point planar sets with a superlinear number
of halving edges. It seems more intuitive to present the constructions in the
dual setting, that is, to construct arrangements of n lines with many vertices
of level $\frac{n-2}{2}$.
A simpler construction. We begin with a construction providing $\Omega(n \log n)$
vertices of the middle level.
By induction on m, we construct a set $L_m$ of $2^m$ lines in general
position with at least $f_m = (m+1)2^{m-2}$ vertices of the middle level (i.e., level
$2^{m-1}-1$). We note that each line of $L_m$ contains at least one of the
middle-level vertices.
For m = 1 we take two nonvertical intersecting lines.
Let $m \ge 1$ and suppose that an $L_m$ satisfying the above conditions has
already been constructed. First, we select a subset $M \subset L_m$ of $2^{m-1}$ lines,
and to each line $\ell \in M$ we assign a vertex $v(\ell)$ of the middle level lying
on $\ell$, in such a way that $v(\ell) \ne v(\ell')$ for $\ell \ne \ell'$. The selection can be done
greedily: We choose a line into M, take a vertex of the middle level on it, and
exclude the other line passing through that vertex from further consideration.
Next, we replace each line of $L_m$ by a pair of lines, both almost parallel
to the original line. For a line $\ell \in M$, we let the two lines replacing $\ell$ intersect
at $v(\ell)$. Each of the remaining lines is replaced by two almost parallel lines
whose intersection is not near to any vertex of the arrangement of $L_m$. This
yields the set $L_{m+1}$.
As the following picture shows, a middle-level vertex of the form $v(\ell)$
yields 3 vertices of the new middle level (level $2^m - 1$ in the arrangement of
$L_{m+1}$):

Each of the other middle-level vertices yields 2 vertices of the new middle
level:

[Figure: a middle-level vertex not of the form $v(\ell)$ after the replacement.]

Hence the number of middle-level vertices for $L_{m+1}$ is at least
$2f_m + 2^{m-1} = 2[(m+1)2^{m-2}] + 2^{m-1} = f_{m+1}$. □
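
The recurrence behind the simpler construction is easy to check numerically. The short sketch below (function name ours) iterates $f_{m+1} = 2f_m + 2^{m-1}$ from $f_1 = 1$, compares the result with the closed form $(m+1)2^{m-2}$, and prints the ratio to $n \log_2 n$ for $n = 2^m$, which illustrates the $n \log n$ growth.

```python
from math import log2


def middle_level_vertices(m):
    """Lower bound f_m for L_m, from f_1 = 1 and f_{j+1} = 2 f_j + 2^(j-1)."""
    f = 1
    for j in range(1, m):
        f = 2 * f + 2 ** (j - 1)
    return f


for m in range(1, 11):
    n = 2 ** m                                  # number of lines in L_m
    f = middle_level_vertices(m)
    assert 4 * f == (m + 1) * 2 ** m            # closed form f_m = (m+1) 2^(m-2)
    print(n, f, round(f / (n * log2(n)), 3))
```

The printed ratio equals (m+1)/(4m), so it settles near 1/4 as m grows.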

A better construction. This construction is more complicated, but it
shows the lower bound
$$n \cdot e^{\Omega(\sqrt{\log n})}$$
for the number of vertices of the middle level (and thus for the number of
halving edges). This bound is smaller than $n^{1+\delta}$ for every $\delta > 0$ but much
larger than $n(\log n)^c$ for any constant c.
For simplicity, we will deal only with values of n of a special form, thus
providing a lower bound for infinitely many n. Simple considerations show
that $\mathrm{HFAC}_2(n)$ is nondecreasing, and this gives a bound for all n.
The construction is again inductive. We first explain the idea, and then
we describe it more formally.
In the first step, we let $L_0$ consist of two intersecting nonvertical lines.
Suppose that after m steps, a set of lines $L_m$ in general position has already
been constructed, with many vertices of the middle level. First we replace
every line $\ell \in L_m$ by $a_m$ parallel lines; let us call these lines the bundle of $\ell$.
So if v is a vertex of the middle level of $L_m$, we get $a_m$ vertices of the middle
level near v after the replacement.

[Figure: the bundle of $\ell$ crossing the bundle of $\ell'$ near v.]
Then we add two new lines $\lambda_v$ and $\mu_v$ as indicated in the next picture, and
we obtain $2a_m$ vertices of the middle level:

[Figure: the auxiliary lines $\lambda_v$ and $\mu_v$ added near the two bundles.]

If $n_m = |L_m|$ and $f_m$ is the number of vertices of the middle level in $L_m$,
the construction gives roughly $n_{m+1} \approx a_m n_m + 2f_m$ and $f_{m+1} \approx 2a_m f_m$.
This recurrence is good: With a suitable choice of the multiplicities $a_m$, it
leads to the claimed bound. But the construction as presented so far is not
at all guaranteed to work, because the new lines $\lambda_v$ and $\mu_v$ might mess up
the levels of the other vertices. We must make some extra provisions to get
this under control.
First of all, we want the auxiliary lines $\lambda_v$ and $\mu_v$ to be nearly parallel to
the old line $\ell'$ in the picture. This is achieved by letting the vertical spacing
of the $a_m$ lines in the bundle of $\ell'$ be much smaller than the spacing in the
bundle of $\ell$:

Namely, if the lines of $L_m$ are $\ell_1, \ell_2, \ldots, \ell_{n_m}$, then the vertical spacing in the
bundle of $\ell_i$ is set to $\varepsilon^i$, where $\varepsilon > 0$ is a suitable very small number.
Let $\ell_i$ be a line of $L_m$, and let $d_i$ denote the number of indices $j < i$ such
that $\ell_j$ intersects $\ell_i$ in a vertex of the middle level. In the new arrangement
of $L_{m+1}$ we obtain $a_m$ lines of the bundle of $\ell_i$ and $2d_i$ lines of the form $\lambda_v$
and $\mu_v$, which are almost parallel to $\ell_i$, and $d_i$ of them go above the bundle
and $d_i$ below. Thus, for points not very close to $\ell_i$, the effect is as if $\ell_i$ were
replicated $(a_m + 2d_i)$ times. This is still not good; we would need that all lines
have the same multiplicities. So we let D be the maximum of the $d_i$, and for
each i, we add $D - d_i$ more lines parallel to $\ell_i$ below $\ell_i$ and $D - d_i$ parallel
lines above it.
How do we control D? We do not know how many middle-level vertices
can appear on the lines of $L_{m+1}$; some vertices are necessarily there by the
construction, but some might arise "just by chance," say by the interaction of
the various auxiliary lines $\lambda_v$ and $\mu_v$, which we do not really want to analyze.
So we take a conservative attitude and deal only with the middle-level vertices
we know about for sure.
Here is the whole construction, this time how it really goes. Suppose that
we have already constructed a set $L_m = \{\ell_1, \ldots, \ell_{n_m}\}$ of lines in general
position (which includes being nonvertical) and a set $V_m$ of middle-level vertices
in the arrangement of $L_m$, such that the number of vertices of $V_m$ lying on
$\ell_i$ is no more than $D_m$, for all $i = 1, 2, \ldots, n_m$. We let $\varepsilon = \varepsilon_m$ be sufficiently
small, and we replace each $\ell_i$ by $a_m$ parallel lines with vertical spacing $\varepsilon^i$.
Then for each $v \in V_m$, we add the two lines $\lambda_v$ and $\mu_v$ as explained above,
and finally we add, for each i, the $2(D_m - d_i)$ lines parallel to $\ell_i$, half above
and half below the bundle, where $d_i$ is the number of vertices of $V_m$ lying
on $\ell_i$.
Since $L_{m+1}$ is supposed to be in general position, we should not forget
to apply a very small perturbation to $L_{m+1}$ after completing the step just
described.
For each old vertex $v \in V_m$, we now really get the $2a_m$ new middle-level
vertices near v as was indicated in the drawing above, and we put these into
$V_{m+1}$. So, writing $f_m = |V_m|$, we have
$$n_{m+1} = (a_m + 2D_m)\,n_m, \qquad f_{m+1} = 2a_m f_m.$$
What about $D_{m+1}$, the maximum number of points of $V_{m+1}$ lying on a single
line? Each line in the bundle of $\ell_i$ has exactly $d_i$ vertices of $V_{m+1}$. The lines
$\lambda_v$ get $2a_m$ vertices of $V_{m+1}$, and the remaining auxiliary lines get none. So
$$D_{m+1} = \max(D_m, 2a_m).$$
It remains to define the $a_m$, which are free parameters of the construction.
A good choice is to let $a_m = 4D_m$. Then we have $D_0 = 1$, $D_m = 8^m$, and
$a_m = 4 \cdot 8^m$. From the recurrences above, we further calculate
$$n_m = 2 \cdot 6^m \cdot 8^{1+2+\cdots+(m-1)}, \qquad f_m = 8^m \cdot 8^{1+2+\cdots+(m-1)}.$$
So $\log n_m$ is $O(m^2)$, while $\log(f_m/n_m) = \log\bigl(\tfrac{1}{2}(\tfrac{4}{3})^m\bigr) = \Omega(m)$. We indeed
have $f_m \ge n_m \cdot e^{\Omega(\sqrt{\log n_m})}$ as promised. □
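
The parameter bookkeeping of the better construction can be replayed with exact integers. The sketch below assumes the recurrences $n_{m+1} = (a_m + 2D_m)n_m$, $f_{m+1} = 2a_m f_m$, and $D_{m+1} = \max(D_m, 2a_m)$ with $a_m = 4D_m$, which is how we read the construction above (these recurrences are the ones that reproduce the stated closed forms), and it starts from two lines with one middle-level vertex. The function name is ours.

```python
from fractions import Fraction


def construction_parameters(steps):
    """Return the list of (n_m, f_m, D_m) for m = 0, ..., steps."""
    n, f, D = 2, 1, 1          # L_0: two lines, one middle-level vertex
    rows = [(n, f, D)]
    for _ in range(steps):
        a = 4 * D              # the choice a_m = 4 D_m
        n, f, D = (a + 2 * D) * n, 2 * a * f, max(D, 2 * a)
        rows.append((n, f, D))
    return rows


for m, (n, f, D) in enumerate(construction_parameters(6)):
    tri = m * (m - 1) // 2     # 1 + 2 + ... + (m - 1)
    assert D == 8 ** m
    assert n == 2 * 6 ** m * 8 ** tri
    assert f == 8 ** m * 8 ** tri
    assert Fraction(f, n) == Fraction(1, 2) * Fraction(4, 3) ** m
    print(m, n, f)
```

The ratio $f_m/n_m$ grows like $(4/3)^m$ while $\log n_m$ grows quadratically in m, which is exactly the source of the $e^{\Omega(\sqrt{\log n})}$ factor.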

Bibliography and remarks. The first construction is from Erdős
et al. [ELSS73] and the second one from Tóth [Tót01b]. In the original
papers, they are phrased in the primal setting.

11.3 The Lovász Lemma and Upper Bounds in All Dimensions

In this section we prove a basic property of the halving facets, usually called
the Lovasz lemma. It implies nontrivial upper bounds on the number of
halving facets, by a simple self-contained argument in the planar case and
by the second selection lemma (Theorem 9.2.1) in an arbitrary dimension.
We prove a slightly more precise version of the Lovasz lemma than is needed
here, since we will use it in a subsequent section. On the other hand, we
consider only halving facets, although similar results can be obtained for k-
facets as well. Sticking to halving facets simplifies matters a little, since for
other k-facets one has to be careful about the orientations.
Let $X \subset \mathbf{R}^d$ be an n-point set in general position with n - d even. Let T
be a (d-1)-point subset of X and let

$$V_T = \{x \in X \setminus T : T \cup \{x\} \text{ is a halving facet of } X\}.$$

In the plane, T has a single point and $V_T$ are the other endpoints of the
halving edges emanating from it. In 3 dimensions, conv(T) is a segment, and
a typical picture might look as follows:

where $T = \{t_1, t_2\}$ and the triangles are halving facets.

Let h be a hyperplane containing T and no point of $X \setminus T$. Since $|X \setminus T|$
is odd, one of the open half-spaces determined by h, the larger half-space,
contains more points of X than the other, the smaller half-space.
11.3.1 Lemma (Halving-facet interleaving lemma). Every hyperplane
h as above "almost halves" the halving facets containing T. More precisely, if
r is the number of points of $V_T$ in the smaller half-space of h, then the larger
half-space contains exactly r+1 points of $V_T$.

Proof. To get a better picture, we project T and $V_T$ to a 2-dimensional
plane p orthogonal to T. (For dimension 2, no projection is necessary, of
course.) Let the projection of T, which is a single point, be denoted by t and
the projection of $V_T$ by $V_T'$. Note that the points of $V_T$ project to distinct
points. The halving facets containing T project to segments emanating from t.
The hyperplane h is projected to a line h', which we draw vertically in the
following indication of the situation in the plane p:
[Figure: the situation in the projection plane, with the line h' drawn vertically, the larger half-space marked, and the rotation from a towards b indicated.]

We claim that for any two angularly consecutive segments, such as at and bt,
the angle opposite the angle atb contains a point of $V_T'$ (such as z). Indeed,
the hyperplane passing through t and a has exactly $\frac{n-d}{2}$ points of X in
both of its open half-spaces. If we start rotating it around T towards b, the
point a enters one of the open half-spaces (in the picture, the one below the
rotating hyperplane). But just before we reach b, that half-space again has
$\frac{n-d}{2}$ points. Hence there was a moment when the number of points in this
half-space went from $\frac{n-d}{2}+1$ to $\frac{n-d}{2}$, and this must have been a moment of
reaching a suitable z.
This means that for every two consecutive points of $V_T'$, there is at least
one point of $V_T'$ in the corresponding opposite wedge. There is actually exactly
one, for if there were two, their opposite wedge would have to contain another
point. Therefore, the numbers of points of $V_T$ in the half-spaces determined
by h differ exactly by 1.
To finish the proof of the lemma, it remains to observe that if we start
rotating the hyperplane h around T in either direction, the first point of VT
encountered must be in the larger half-space. So the larger half-space has
one more point of VT than the smaller half-space. (Recall that the larger
half-space is defined with respect to X, and so we did not just parrot the
definition here.) □
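As a sanity check on Lemma 11.3.1 in the plane (d = 2, where T is a single point x and $V_T$ is the set of points joined to x by a halving edge), the following minimal Python sketch finds the halving edges of a small point set by brute force and tests the interleaving property for one line through each point. The function names and the probing direction are our own illustrative choices, not part of the text; the sketch assumes an even number of points with integer coordinates in general position.

```python
from itertools import combinations

def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): +1 for a left turn, -1 for a right turn."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)

def halving_edges(pts):
    """Index pairs (i, j) whose connecting line has (n - 2) / 2 points on each side."""
    n = len(pts)
    edges = []
    for i, j in combinations(range(n), 2):
        left = sum(orient(pts[i], pts[j], pts[k]) > 0
                   for k in range(n) if k != i and k != j)
        if 2 * left == n - 2:
            edges.append((i, j))
    return edges

def interleaving_holds(pts, x, direction):
    """Check the lemma at pts[x] for the line h through pts[x] with the given direction.

    h must contain no other point of the set. The larger side of h (with respect
    to the whole point set) should contain exactly one more halving-edge
    neighbor of pts[x] than the smaller side.
    """
    p = pts[x]
    q = (p[0] + direction[0], p[1] + direction[1])
    side = {k: orient(p, q, pts[k]) for k in range(len(pts)) if k != x}
    assert 0 not in side.values(), "h must avoid all other points"
    larger = 1 if sum(side.values()) > 0 else -1   # n - 1 is odd, so no tie
    nbrs = [j if i == x else i for i, j in halving_edges(pts) if x in (i, j)]
    in_larger = sum(side[v] == larger for v in nbrs)
    return in_larger == (len(nbrs) - in_larger) + 1

# Six points on a parabola are in convex position; the halving edges
# are the three "long diagonals".
parabola = [(i, i * i) for i in range(6)]
print(halving_edges(parabola))
print(interleaving_holds(parabola, 0, (1, 0)))
```

For points in convex position every vertex has degree 1, so the check is trivial there; it becomes more informative on sets with interior points, where degrees 3, 5, ... occur.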

11.3.2 Corollary (Lovász lemma). Let $X \subset \mathbf{R}^d$ be an n-point set in
general position, and let $\ell$ be a line that is not parallel to any of the halving
facets of X. Then $\ell$ intersects the relative interior of at most $O(n^{d-1})$ halving
facets of X.

Proof. We can move $\ell$ a little so that it intersects the relative interiors of the
same halving facets as before but intersects no boundary of a halving facet.
Next, we start translating $\ell$ in a suitably chosen direction. (In the plane there
are just two directions, and both of them will do.) The direction is selected
so that we never cross any (d-3)-dimensional flat determined by the points
of X. To this end, we need to find a two-dimensional plane passing through
$\ell$ and avoiding finitely many (d-3)-dimensional flats in $\mathbf{R}^d$, none of them
intersecting $\ell$; this is always possible.

As we translate the line $\ell$, the number of halving facets currently intersected
by $\ell$ may change only as $\ell$ crosses the boundary of a halving facet F,
i.e., a (d-2)-dimensional face of F. By the halving-facet interleaving lemma,
by crossing one such face T, the number of intersected halving facets changes
by 1. After moving far enough, the translated line $\ell$ intersects no halving
facet at all. On its way, it crossed no more than $O(n^{d-1})$ boundaries, since
there are only $O(n^{d-1})$ simplices of dimension d-2 with vertices at X. This
proves the corollary. □

11.3.3 Theorem. For each $d \ge 2$, the maximum number of halving facets
satisfies
$$\mathrm{HFAC}_d(n) = O\bigl(n^{d-1/s_{d-1}}\bigr),$$
where $s_{d-1}$ is an exponent for which the statement of the second selection
lemma (Theorem 9.2.1) holds in dimension d-1. In particular, in the plane
we obtain $\mathrm{HFAC}_2(n) = O(n^{3/2})$.
For higher dimensions, this result shows that $\mathrm{HFAC}_d(n)$ is asymptotically
somewhat smaller than $n^d$, but the proof method is inadequate for proving
bounds close to $n^{d-1}$.
Theorem 11.3.3 is proved from Corollary 11.3.2 using the second selection
lemma. Let us first give a streamlined proof for the planar case, although
later on we will prove a considerably better planar bound.
Proof of Theorem 11.3.3 for d = 2. Let us project the points of X ver-
tically on the x-axis, obtaining a set Y. The projections of the halving edges
of X define a system of intervals with endpoints in Y. By Corollary 11.3.2,
any point is contained in the interior of at most O( n) of these intervals, for
otherwise, a vertical line through that point would intersect too many halving
edges.
Mark every qth point of Y (with q a parameter to be set suitably later).
Divide the intervals into two classes: those containing some marked point
in their interior and those lying in a gap between two marked points. The
number of intervals of the first class is at most O(n) per marked point, i.e.,
at most $O(n^2/q)$ in total. The number of intervals of the second class is no
more than $\binom{q+1}{2}$ per gap, i.e., at most $\bigl(\frac{n}{q}+1\bigr)\binom{q+1}{2}$ in total. Balancing both
bounds by setting $q = \lceil\sqrt{n}\,\rceil$, we get that the total number of halving edges
is $O(n^{3/2})$ as claimed. □
Note that we have implicitly applied and proved a one-dimensional second
selection lemma (Exercise 9.2.1).
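The balancing step in the planar proof can be written out as follows. This is only a sketch of the arithmetic; the constant C stands for the constant hidden in the O(n) bound of Corollary 11.3.2 and is our notation, not the text's.

```latex
\begin{align*}
\#\{\text{halving edges}\}
  &\le \underbrace{C\,\frac{n^2}{q}}_{\text{first class}}
     + \underbrace{\Bigl(\frac{n}{q}+1\Bigr)\binom{q+1}{2}}_{\text{second class}}
   = C\,\frac{n^2}{q} + \frac{(n+q)(q+1)}{2}. \\
\intertext{With $q = \lceil\sqrt{n}\,\rceil$ we have $\sqrt{n} \le q \le \sqrt{n}+1$, so}
C\,\frac{n^2}{q} &\le C\,n^{3/2},
\qquad
\frac{(n+q)(q+1)}{2} \le \frac{(n+\sqrt{n}+1)(\sqrt{n}+2)}{2} = O(n^{3/2}).
\end{align*}
```

Both terms are of order $n^{3/2}$ for this choice of q, which is why no other choice of q improves the exponent obtained by this argument.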
Proof of Theorem 11.3.3. We consider an n-point $X \subset \mathbf{R}^d$. We project
X vertically into the coordinate hyperplane $x_d = 0$, obtaining a point set Y,
which we regard as lying in $\mathbf{R}^{d-1}$. If the coordinate system is chosen suitably,
Y is in general position.
Each halving facet of X projects to a (d-1)-dimensional Y-simplex in
$\mathbf{R}^{d-1}$; let $\mathcal{F}$ be the family of these Y-simplices. If we write $|\mathcal{F}| = \alpha\binom{n}{d}$, then

by the second selection lemma, there exists a point a contained in at least
$c\,\alpha^{s_{d-1}}\binom{n}{d}$ simplices of $\mathcal{F}$. Only at most $O(n^{d-1})$ of these contain a in their
boundary, by Lemma 9.1.2, and the remaining ones have a in the interior.
By the Lovász lemma (Corollary 11.3.2) applied on the vertical line in $\mathbf{R}^d$
passing through the point a, we thus get $c\,\alpha^{s_{d-1}}\binom{n}{d} = O(n^{d-1})$. We calculate
that $|\mathcal{F}| = \alpha\binom{n}{d} = O(n^{d-1/s_{d-1}})$ as claimed. □
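The final calculation can be spelled out as follows. This is a sketch for fixed d, writing s for $s_{d-1}$ and using $\binom{n}{d} = \Theta(n^d)$; the constant C is our notation for the constant in the O-bound.

```latex
\begin{align*}
c\,\alpha^{s}\binom{n}{d} \le C\,n^{d-1}
  &\;\Longrightarrow\; \alpha^{s} = O\!\left(\frac{n^{d-1}}{n^{d}}\right) = O(n^{-1})
   \;\Longrightarrow\; \alpha = O\bigl(n^{-1/s}\bigr), \\
|\mathcal{F}| = \alpha\binom{n}{d}
  &= O\bigl(n^{-1/s}\bigr)\cdot O(n^{d}) = O\bigl(n^{d-1/s}\bigr).
\end{align*}
```

For d = 2 the planar statement of the theorem corresponds to the exponent $s_1 = 2$, giving $n^{2-1/2} = n^{3/2}$.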

Bibliography and remarks. The planar version of the Lovász
lemma (Corollary 11.3.2) originated in Lovász [Lov71]; the proof
implicitly contains the halving-facet interleaving lemma. A higher-dimensional
version of the Lovász lemma appeared in Bárány, Füredi,
and Lovász [BFL90].
Welzl [Wel01] proved an exact version of the Lovász lemma, as is
outlined in Exercises 5 and 6 below. This question is equivalent to
the upper bound theorem for convex polytopes, via the Gale transform.
The connection of k-facets and h-vectors of convex polytopes
was noted earlier by several authors (Lee [Lee91], Clarkson [Cla93],
and Mulmuley [Mul93b]), sometimes in a slightly different but essentially
equivalent form. Using this correspondence and the generalized
lower bound theorem mentioned in Section 5.5, Welzl also proved that
the maximum total number of j-facets with $j \le k$ for an n-point set in
$\mathbf{R}^3$ (or, equivalently, the maximum possible number of vertices of level
at most k in an arrangement of n planes in general position in $\mathbf{R}^3$) is
attained for a set in convex position, from which the exact maximum
can be calculated. It also implies that in $\mathbf{R}^3$, a set in convex position
minimizes the number of halving facets (triangles).
An interesting connection of this result to another problem was discovered
by Sharir and Welzl [SW01]. They quickly derived the following
theorem, which was previously established by Pach and Pinchasi
[PP01] by a difficult elementary proof: If $R, B \subset \mathbf{R}^2$ are n-point sets
("red" and "blue") with $R \cup B$ in general position, then there are at
least n balanced lines, where a line $\ell$ is balanced if $|R \cap \ell| = |B \cap \ell| = 1$
and on both sides of $\ell$ the number of red points equals the number of
blue points (for odd n, the existence of at least one balanced line follows
from the ham-sandwich theorem). A proof based on Welzl's result
in $\mathbf{R}^3$ mentioned above is outlined in Exercise 4. Let us remark that
conversely, the Pach-Pinchasi theorem implies the generalized lower
bound theorem for (d+4)-vertex polytopes in $\mathbf{R}^d$.

Exercises
1. (a) Prove the following version of the Lovász lemma in the planar case:
For a set $X \subset \mathbf{R}^2$ in general position, every vertical line $\ell$ intersects the
interiors of at most k+1 of the k-edges. [!J

(b) Using (a), prove the bound $\mathrm{KFAC}_2(n, k) = O(n\sqrt{k+1})$ (without
appealing to Theorem 11.1.1). [1]
2. Let $K \subseteq \{1, 2, \ldots, \lfloor n/2 \rfloor\}$. Using Exercise 1, prove that for any n-point
set $X \subset \mathbf{R}^2$ in general position, the total number of k-edges with $k \in K$
(or equivalently, the total number of vertices of levels $k \in K$ in an
arrangement of n lines) is at most $O\bigl(n\sqrt{\sum_{k \in K} k}\bigr)$. (Note that this is
better than applying the bound $\mathrm{KFAC}_2(n, k) = O(n\sqrt{k})$ for each $k \in K$
separately.) [1]
3. (Exact planar Lovász lemma) Let $X \subset \mathbf{R}^2$ be a 2n-point set in general
position, and let $\ell$ be a vertical line having k points of X on the left
and 2n-k points on the right. Prove that $\ell$ crosses exactly min(k, 2n-k)
halving edges of X. [2]
4. Let X be a set of 2n+1 points in $\mathbf{R}^3$ in general position, and let
$p_1, p_2, \ldots, p_{2n+1}$ be the points of X listed by increasing height (z-coordinate).
(a) Using Exercise 3, check that if $p_{k+1}$ is a vertex of conv(X), then there
are exactly min(k, 2n-k) halving triangles having $p_{k+1}$ as the middle-height
vertex (that is, the triangle is $p_i p_{k+1} p_j$ with i < k+1 < j). [1]
(b) Prove that every (2n+1)-point convex independent set $X \subset \mathbf{R}^3$ in
general position has at least $n^2$ halving triangles. [2]
(c) Assuming that each (2n+1)-point set in $\mathbf{R}^3$ in general position
has at least $n^2$ halving triangles (which follows from (b) and the result
mentioned in the notes above about the number of halving triangles
being minimized by a set in convex position), infer that if
$X = \{p_1, \ldots, p_{2n+1}\} \subset \mathbf{R}^3$ is in general position, then for every k, there are
always at least min(k, 2n-k) halving triangles having $p_{k+1}$ as the middle-height
vertex (even if $p_{k+1}$ is not extremal in X). [1]
(d) Derive from (c) the result about balanced lines mentioned in the notes
to this section: If $R, B \subset \mathbf{R}^2$ are n-point sets (red and blue points), with
$R \cup B$ in general position, then there are at least n balanced lines $\ell$ (with
$|R \cap \ell| = |B \cap \ell| = 1$ and such that on both sides of $\ell$ the number of red
points equals the number of blue points). Embed $\mathbf{R}^2$ as the z = 1 plane
in $\mathbf{R}^3$ and use a central projection on the unit sphere in $\mathbf{R}^3$ centered at 0.
[1]
See [SW01] for solutions and related results.
5. (Exact Lovász lemma) Let $X \subset \mathbf{R}^d$ be an n-point set in general position
and let $\ell$ be a directed line disjoint from the convex hulls of all (d-1)-point
subsets of X. We think of $\ell$ as being vertical and directed upwards.
We say that $\ell$ enters a j-facet F if it passes through F from the positive
side (the one with j points) to the negative side. Let $h_j = h_j(\ell, X)$
denote the number of j-facets entered by $\ell$, j = 0, 1, ..., n-d. Further,
let $s_k(\ell, X)$ be the number of (d+k)-element subsets $S \subseteq X$ such that
$\ell \cap \mathrm{conv}(S) \ne \emptyset$.
(a) Prove that for every X and $\ell$ as above, $s_k = \sum_{j=k}^{n-d} \binom{j}{k} h_j$. 8J

(b) Use (a) to show that $h_0, \ldots, h_{n-d}$ are uniquely determined by
$s_0, s_1, \ldots, s_{n-d}$. 0
(c) Infer from (b) that if X' is a set in general position obtained from
X by translating each point in a direction parallel to $\ell$, then
$h_j(\ell, X) = h_j(\ell, X')$ for all j. Derive $h_j = h_{n-d-j}$. 0
(d) Prove that for every $x \in X$ and all j, we have
$h_j(\ell, X \setminus \{x\}) \le h_j(\ell, X)$. [2]
(e) Choose $x \in X$ uniformly at random. Check that
$\mathbf{E}\bigl[h_j(\ell, X \setminus \{x\})\bigr] = \frac{n-d-j}{n}\,h_j + \frac{j+1}{n}\,h_{j+1}$. 0
(f) From (d) and (e), derive $h_{j+1} \le \frac{j+d}{j+1}\,h_j$, and conclude the exact Lovász
lemma:
$$h_j \le \min\left\{\binom{j+d-1}{d-1}, \binom{n-j-1}{d-1}\right\}.$$
o
6. (The upper bound theorem and k-facets) Let $a = (a_1, a_2, \ldots, a_n)$ be a
sequence of $n \ge d+1$ convex independent points in $\mathbf{R}^d$ in general position,
and let P be the d-dimensional simplicial convex polytope with vertex set
$\{a_1, \ldots, a_n\}$. Let $g = (g_1, \ldots, g_n)$ be the Gale transform of a,
$g_1, \ldots, g_n \in \mathbf{R}^{n-d-1}$, and let $b_i$ be a point in $\mathbf{R}^{n-d}$ obtained from $g_i$ by appending
a number $t_i$ as the last coordinate, where the $t_i$ are chosen so that
$X = \{b_1, \ldots, b_n\}$ is in general position.
(a) Let $\ell$ be the $x_{n-d}$-axis in $\mathbf{R}^{n-d}$ oriented upwards, and let
$s_k = s_k(\ell, X)$ and $h_j = h_j(\ell, X)$ be as in Exercise 5. Show that
$f_k(P) = s_{d-k-1}(\ell, X)$, k = 0, 1, ..., d-1. [2]
(b) Derive that $h_j(P) = h_j(\ell, X)$, j = 0, 1, ..., d, where $h_j(P)$ is as at the
end of Section 5.5, and thus (f) of the preceding exercise implies the
upper bound theorem in the formulation with the h-vector (5.3). 0
If (a) and (b) are applied to the cyclic polytopes, we get equality in
the bound for $h_j$ in Exercise 5(f). In fact, the reverse passage (from an
$X \subset \mathbf{R}^{n-d}$ in general position to a simplicial polytope in $\mathbf{R}^d$) is possible
as well (see [Wel01]), and so the exact Lovász lemma can also be derived
from the upper bound theorem.
7. This exercise shows limits for what can be proved about k-sets using
Corollary 11.3.2 alone.
(a) Construct an n-point set $X \subset \mathbf{R}^2$ and a collection of $\Omega(n^{3/2})$ segments
with endpoints in X such that no line intersects more than O(n) of these
segments. 0
(b) Construct an n-point set in $\mathbf{R}^3$ and a collection of $\Omega(n^{5/2})$ triangles
with vertices at these points such that no line intersects more than $O(n^2)$
triangles. 8J
8. (The Dey-Edelsbrunner proof of $\mathrm{HFAC}_3(n) = O(n^{8/3})$) Let X be an
n-point set in $\mathbf{R}^3$ in general position (make a suitable general position
assumption), and let $\mathcal{T}$ be a collection of t triangles with vertices at points
of X. By a crossing we mean a pair (T, e), where $T \in \mathcal{T}$ is a triangle

and e is an edge of another triangle from $\mathcal{T}$, such that e intersects the
interior of T in a single point (in particular, e is vertex-disjoint from T).
(a) Show that if $t \ge Cn^2$ for a suitable constant C, then two triangles
sharing exactly one vertex intersect in a segment, and conclude that at
least one crossing exists. III
(b) Show that at least $t - Cn^2$ crossings exist. [2]
(c) Show that for $t \ge C'n^2$, with C' > C being a sufficiently large constant,
at least $\Omega(t^3/n^4)$ crossings exist. Infer that there is an edge crossing
$\Omega(t^3/n^6)$ triangles. (Proceed as in the proof of the crossing number theorem.) 0
(d) Use Corollary 11.3.2 to conclude that $\mathrm{HFAC}_3(n) = O(n^{8/3})$. I2l

11.4 A Better Upper Bound in the Plane


Here we prove an improved bound on the number of halving edges in the
plane.

11.4.1 Theorem. The maximum possible number of halving edges of an
n-point set in the plane is at most $O(n^{4/3})$.
Let X be an n-point set in the plane in general position, and let us draw
all the halving edges as segments. In this way we get a drawing of a graph
(the graph of halving edges) in the plane. Let deg( x) denote the degree of x
in this graph, i.e., the number of halving edges incident to x, and let cr(X)
denote the number of pairs of the halving edges that cross. In the following
example we have cr(X) = 2, and the degrees are (1,1,1,1,1,3).

Theorem 11.4.1 follows from the crossing number theorem (Theorem 4.3.1)
and the following remarkable identity.

11.4.2 Theorem. For each n-point set X in the plane in general position,
where n is even, we have
$$\mathrm{cr}(X) + \sum_{x \in X} \binom{\frac{1}{2}(\deg(x)+1)}{2} = \binom{n/2}{2}. \qquad (11.2)$$

Proof of Theorem 11.4.1. Theorem 11.4.2 implies, in particular, that
$\mathrm{cr}(X) = O(n^2)$. The crossing number theorem shows that $\mathrm{cr}(X) = \Omega(t^3/n^2) - O(n)$,
where t is the number of halving edges, and this implies $t = O(n^{4/3})$.
□
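The implication at the end of this proof can be made explicit as follows. This is a sketch with unspecified positive constants $c_1, c_2, c_3$ (our notation) standing for the constants hidden in the asymptotic bounds.

```latex
\begin{align*}
c_1\,\frac{t^3}{n^2} - c_2\,n \;\le\; \mathrm{cr}(X) \;\le\; c_3\,n^2
  &\;\Longrightarrow\; t^3 \le \frac{(c_3\,n^2 + c_2\,n)\,n^2}{c_1} = O(n^4) \\
  &\;\Longrightarrow\; t = O\bigl(n^{4/3}\bigr).
\end{align*}
```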

Proof of Theorem 11.4.2. First we note that by the halving-facet interleaving
lemma, deg(x) is odd for every $x \in X$, and so the expression
$\frac{1}{2}(\deg(x)+1)$ in the identity (11.2) is always an integer.
For the following arguments, we formally regard the set X as a sequence
$(x_1, x_2, \ldots, x_n)$. From Section 9.3 we recall the notion of orientation of a
triple $(x_i, x_j, x_k)$: Assuming i < j < k, the orientation is positive if we make
a right turn when going from $x_i$ to $x_k$ via $x_j$, and it is negative if we make
a left turn. The order type of X describes the orientations of all the triples
$(x_i, x_j, x_k)$, $1 \le i < j < k \le n$. We observe that the order type uniquely
determines the halving edge graph: Whether $\{x_i, x_j\}$ is a halving edge or
not can be deduced from the orientations of the triples involving $x_i$ and $x_j$.
Similarly, the orientations of all triples determine whether two halving edges
cross.
The theorem is proved by a continuous motion argument. We start with
the given point sequence X, and we move its points continuously until we
reach some suitable configuration $X_0$ for which the identity (11.2) holds. For
example, $X_0$ can consist of n points in convex position, where we have $\frac{n}{2}$
halving edges and every two of them cross.
The continuous motion transforming X into $X_0$ is such that the current
sequence remains in general position, except for finitely many moments when
exactly one triple $(x_i, x_j, x_k)$ changes its orientation. The points $x_i, x_j, x_k$
thus become collinear at such a moment, but we assume that they always
remain distinct, and we also assume that no other collinearities occur at that
moment. Let us call such a moment a mutation at $\{x_i, x_j, x_k\}$.
We will investigate the changes of the graph of halving edges during the
motion, and we will show that mutations leave the left-hand side of the
identity (11.2) unchanged.
Both the graph and the crossings of its edges remain combinatorially
unchanged between the mutations. Moreover, some thought reveals that by
a mutation at {x, y, z}, only the halving edges with both endpoints among
x, y, z and their crossings with other edges can be affected; all the other
halving edges and crossings remain unchanged.
Let us first assume that {x, y} is a halving edge before the mutation at
{x, y, z} and that z lies on the segment xy at the moment of collinearity:
[Figure: the mutation in which z crosses the segment xy.]

Figure 11.1. Welzl's Little Devils.

We note that {x, z} and {y, z} cannot be halving edges before the mutation.
After the mutation, {x, y} ceases to be halving, while {x, z} and {y, z} become
halving.
Let deg( z) = 2r+ 1 (before the mutation) and let h be the line passing
through z and parallel to xy. The larger side of h, i.e., the one with more
points of X, is the one containing x and y, and by the halving-facet inter-
leaving lemma, r+ 1 of the halving edges emanating from z go into the larger
side of h and thus cross xy. So the following changes in degrees and crossings
are caused by the mutation:
• deg(z), which was 2r+1, increases by 2, and
• cr(X) decreases by r+1.
It is easy to check that the left-hand side of the identity (11.2) remains the
same after this change.
What other mutations are possible? One is the mutation inverse to the
one discussed above, with z moving in the reverse direction. We show that
there are no other types of mutations affecting the graph of halving edges.
Indeed, for any mutation, the notation can be chosen so that z crosses over
the segment xy. Just before the mutation or just after it, it is not possible for
{x, z} to be a halving edge and {y, z} not. The last remaining possibility is a
mutation with no halving edge on {x, y, z}, which leaves the graph unchanged.
Theorem 11.4.2 is proved. □
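The identity can be tested numerically on small examples. The sketch below takes (11.2) in the form $\mathrm{cr}(X) + \sum_{x} \binom{(\deg(x)+1)/2}{2} = \binom{n/2}{2}$, which agrees with the convex-position case and with the bookkeeping of the mutation analysis above, and computes both sides by brute force. All function names are our own illustrative choices; the code assumes an even number of points with integer coordinates in general position.

```python
from itertools import combinations
from math import comb

def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p)."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)

def halving_edges(pts):
    """Index pairs (i, j) whose connecting line has (n - 2) / 2 points on each side."""
    n = len(pts)
    return [(i, j) for i, j in combinations(range(n), 2)
            if 2 * sum(orient(pts[i], pts[j], pts[k]) > 0
                       for k in range(n) if k != i and k != j) == n - 2]

def crosses(pts, e, f):
    """True if the segments e and f share no endpoint and cross properly."""
    (a, b), (c, d) = e, f
    if len({a, b, c, d}) < 4:
        return False
    return (orient(pts[a], pts[b], pts[c]) != orient(pts[a], pts[b], pts[d]) and
            orient(pts[c], pts[d], pts[a]) != orient(pts[c], pts[d], pts[b]))

def identity_sides(pts):
    """Return (left-hand side, right-hand side) of the identity (11.2)."""
    n = len(pts)
    edges = halving_edges(pts)
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    cr = sum(crosses(pts, e, f) for e, f in combinations(edges, 2))
    lhs = cr + sum(comb((d + 1) // 2, 2) for d in deg)
    return lhs, comb(n // 2, 2)

# Convex position: n/2 halving edges, every two of them cross, all degrees 1.
parabola = [(i, i * i) for i in range(6)]
print(identity_sides(parabola))
```

Running `identity_sides` on random integer point sets (after checking general position) gives an inexpensive way to catch mistakes in a hand-drawn halving-edge graph.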

Tight bounds for small n. Using the identity (11.2) and the fact that
all vertices of the graph of halving edges must have odd degrees, one can
determine the exact maximum number of halving edges for small point con-
figurations (Exercise 1). Figure 11.1 shows examples of configurations with
the maximum possible number of halving edges for n = 8, 10, and 12. These
small examples seem to be misleading in various respects: For example, we
know that the maximum number of halving edges is superlinear, and so the
graph of halving edges cannot be planar for large n, and yet all the small
pictures are planar.

Bibliography and remarks. Theorem 11.4.1 was first proved by
Dey [Dey98], who discovered the surprising role of the crossings of
the halving edges. His proof works partially in the dual setting, and

it relies on a technique of decomposing the k-level in an arrangement
into convex chains discussed in Agarwal et al. [AACS98]. The identity
(11.2), with the resulting considerable simplification of Dey's proof,
was found by Andrzejak et al. [AAHP+98]. They also computed the
exact maximum number of halving edges up to n = 12 and proved
results about k-facets and k-sets in dimension 3.
Improved upper bound for k-sets in $\mathbf{R}^3$. We outline the argument of
Sharir et al. [SST01] proving that an n-point set $X \subset \mathbf{R}^3$ in general
position has at most $O(n^{2.5})$ halving triangles. Let $\mathcal{T}$ be the set of
halving triangles and let $t = |\mathcal{T}|$. We will count the number N of
crossing pairs of triangles in $\mathcal{T}$ in two ways, where a crossing pair
looks like this:

The triangles share one vertex p, and the edge of $T_1$ opposite to p
intersects the interior of $T_2$.
The Lovász lemma (Corollary 11.3.2: no line intersects more than
$O(n^2)$ halving triangles) implies $N = O(n^4)$. To see this, we first
consider pairs $(\ell, T)$, where $\ell$ is a line spanned by two points $p, q \in X$,
$T \in \mathcal{T}$, and $\ell$ intersects the interior of T. Each of the $\binom{n}{2}$ lines $\ell$
contributes at most $O(n^2)$ pairs, and each pair $(\ell, T)$ yields at most 3
crossing pairs of triangles, one for each vertex of T.
Now we are going to show that $N = \Omega(t^2/n) - O(tn)$, which together
with $N = O(n^4)$ implies $t = O(n^{2.5})$. Let $\rho$ be a horizontal
plane lying below all of X. For a set $A \subset \mathbf{R}^3$, let $A^*$ denote the central
projection of A from p into $\rho$. To bound N from below, we consider
each $p \in X$ in turn, and we associate to it a graph $G_p$ drawn in $\rho$. Let
$\gamma_p$ be the open half-space below the horizontal plane through p. The
vertex set of the geometric graph $G_p$ is $V_p = (X \cap \gamma_p)^*$. Let $\mathcal{H}_p \subseteq \mathcal{T}$
be the set of the halving triangles having p as the highest vertex, and
let $\mathcal{M}_p \subseteq \mathcal{T}$ be the triangles with p as the middle-height vertex. Each
$T \in \mathcal{H}_p$ contributes an edge of $G_p$, namely, the segment $T^*$:


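Each $T \in \mathcal{M}_p$ gives rise to an unbounded ray in $G_p$, namely, $(T \cap \gamma_p)^*$, the projection of the part of T lying below the horizontal plane through p: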



Formally, we can interpret such a ray as an edge connecting the vertex
$q^* \in V_p$ to a special vertex at infinity.
Let $m_p = |\mathcal{H}_p| + |\mathcal{M}_p|$ be the total number of edges of $G_p$, including
the rays, and let $r_p = |\mathcal{M}_p|$ be the number of rays. Write $x_p$ for the
number of edge crossings in the drawing of $G_p$. We have

$$\sum_{p \in X} m_p = 2t \quad\text{and}\quad \sum_{p \in X} r_p = t,$$
because each $T \in \mathcal{T}$ contributes to one $\mathcal{H}_p$ and one $\mathcal{M}_p$. We note that
$N \ge \sum_{p \in X} x_p$, since an edge crossing in $G_p$ corresponds to a crossing
pair of triangles with a common vertex p.
A lower bound for $x_p$ is obtained using a decomposition of $G_p$ into
convex chains, which is an idea from Agarwal et al. [AACS98] (used in
Dey's original proof of the $O(n^{4/3})$ bound for planar halving edges).
We fix a vertical direction in the projection plane $\rho$ so that no edges of $G_p$ are vertical. Each
convex chain is a contiguous sequence of (bounded or unbounded)
edges of $G_p$ that together form a graph of a convex function defined
on an interval. Each edge lies in exactly one convex chain. Let e be
an edge of $G_p$ whose right end is a (finite) vertex v. We specify how
the convex chain containing e continues to the right of v: It follows an
edge e' going from v to the right and turning upwards with respect
to v but as little as possible.

If there is no e' like this, then the considered chain ends at v:

By the halving-facet interleaving lemma, the fan of edges emanating
from v has an "antipodal" structure: For every two angularly consecutive
edges, the opposite wedge contains exactly one edge. This implies
that e is uniquely determined by e', and so we have a well-defined
decomposition of the edges of $G_p$ into convex chains. Moreover, exactly

one convex chain begins or ends at each vertex. Thus, the number $c_p$
of chains equals $\frac{1}{2}(n_p + r_p)$.
A lower bound for the number of edge crossings $x_p$ is the number
of pairs $\{C_1, C_2\}$ of chains such that an edge of $C_1$ crosses an edge of
$C_2$. The trick is to estimate the number of pairs $\{C_1, C_2\}$ that do not
cross in this way. There are two possibilities for such pairs: $C_1$ and $C_2$
can be disjoint or they can cross at a vertex:

The number of pairs $\{C_1, C_2\}$ crossing at a vertex is at most $m_p n_p$,
because the edge $e_1$ of $C_1$ entering the crossing determines both $C_1$
and the crossing vertex, and $C_2$ can be specified by choosing one of
the at most $n_p$ edges incident to that vertex. Finally, suppose that $C_1$
and $C_2$ are disjoint and $C_2$ is above $C_1$. If we fix an edge $e_1$ of $C_1$, then
$C_2$ is determined by the vertex where the line parallel to $e_1$ translated
upwards first hits $C_2$:

We obtain $x_p \ge \binom{c_p}{2} - 2 m_p n_p$, and a calculation leads to
$N \ge \sum_p x_p = \Omega(t^2/n) - O(nt)$. This concludes the proof of the $O(n^{2.5})$ bound for
halving facets in $\mathbf{R}^3$.
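For completeness, here is one way to combine the two estimates on N. This is a sketch with unspecified positive constants $c_1, c_2, c_3$ (our notation) for the hidden constants.

```latex
\begin{align*}
c_1\,\frac{t^2}{n} - c_2\,tn \;\le\; N \;\le\; c_3\,n^4
  \;\Longrightarrow\; c_1\,\frac{t^2}{n} \le c_3\,n^4 + c_2\,tn .
\end{align*}
If $c_2\,tn \le c_3\,n^4$, then $c_1 t^2/n \le 2c_3\,n^4$, so
$t \le \sqrt{2c_3/c_1}\;n^{5/2}$. Otherwise $c_1 t^2/n \le 2c_2\,tn$, so
$t \le (2c_2/c_1)\,n^2$. In both cases $t = O(n^{2.5})$.
```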
Having already introduced the decomposition of the graph of halv-
ing edges into convex chains as above, one can give an extremely sim-
ple alternative proof of Theorem 11.4.1. Namely, the graph of halving
edges is decomposed into at most n convex chains and, similarly, into
at most n concave chains. Any convex chain intersects any concave
chain in at most 2 points, and it follows that the number of edge
crossings in the graph of halving edges is O(n 2 ). The application of
the crossing number theorem finishes the proof.

Exercises
1. (a) Find the maximum possible number of halving edges for n = 4 and
n = 6, and construct the corresponding configurations. [II
(b) Check that the three graphs in Figure 11.1 are graphs of halving
edges of the depicted point sets. IT]
(c) Show that the configurations in Figure 11.1 maximize the number of
halving edges. [IJ
12

Two Applications of
High-Dimensional Polytopes

From this chapter on, our journey through discrete geometry leads us to the
high-dimensional world. Up until now, although we have often been consid-
ering geometric objects in arbitrary dimension, we could mostly rely on the
intuition from the familiar dimensions 2 and 3. In the present chapter we can
still use dimensions 2 and 3 to picture examples, but these tend to be rather
trivial. For instance, in the first section we are going to prove things about
graphs via convex polytopes, and for an n-vertex graph we need to consider
an n-dimensional polytope. It is clear that graphs with 2 or 3 vertices cannot
serve as very illuminating examples. In order to underline this shift to high
dimensions, from now on we mostly denote the dimension by n instead of d
as before, in agreement with the habits prevailing in the literature on high-
dimensional topics.
In the first and third sections we touch upon polyhedral combinatorics.
Let E be a finite set, for example the edge set of a graph G, and let F be
some interesting system of subsets of E, such as the set of all matchings in
G or the set of all Hamiltonian circuits of G. In polyhedral combinatorics
one usually considers the convex hull of the characteristic vectors of the sets
of F; the characteristic vectors are points of $\{0,1\}^E \subset \mathbf{R}^E$. For the two
examples above, we thus obtain the matching polytope of G and the traveling
salesman polytope of G. The basic problem of polyhedral combinatorics is to
find, for a given F, inequalities describing the facets of the resulting polytope.
Sometimes one succeeds in describing all facets, as is the case for the matching
polytope. This may give insights into the combinatorial structure of F, and
often it has algorithmic consequences. If we know the facets and they have
a sufficiently nice structure, we can optimize any linear function over the
polytope in polynomial time. This means that given some real weights of the
elements of E, we can find in polynomial time a maximum-weight set in F

(e.g., a maximum-weight matching). In other cases, such as for the traveling
salesman polytope, describing all facets is beyond reach. The knowledge of
some facets may still yield interesting consequences, and on the practical
side, it can provide a good approximation algorithm for the maximum-weight
set. Indeed, the largest traveling salesman problems solved in practice, with
thousands of vertices, have been attacked by these methods.
We do not treat polyhedral combinatorics in any systematic manner;
rather we focus on two gems (partially) belonging to this area. The first one
is the celebrated weak perfect graph conjecture, stating that the complement
of any perfect graph is perfect, which is proved by combining combinatorial
and polyhedral arguments. The second one is an algorithmically motivated
problem of sorting with partial information, discussed in Section 12.3. We
associate a polytope with every finite partially ordered set, and we reduce
the question to slicing the polytope into two parts of roughly equal volume
by a hyperplane. A key role in this proof is played by the Brunn-Minkowski
inequality. This fundamental geometric inequality is explained and proved in
Section 12.2.

12.1 The Weak Perfect Graph Conjecture


First we recall a few notions from graph theory. Let G = (V, E) be a finite
undirected graph on n vertices. By $\overline{G}$ we denote the complement of G, that
is, the graph $(V, \binom{V}{2} \setminus E)$. An induced subgraph of G is any graph that can
be obtained from G by deleting some vertices and all edges incident to the
deleted vertices (but an edge must not be deleted if both of its vertices remain
in the graph). Let $\omega(G)$ denote the clique number of G, which is the maximum
size of a complete subgraph of G, and let $\alpha(G) = \omega(\overline{G})$ be the independence
number of G. Explicitly, $\alpha(G)$ is the maximum size of an independent set
in G, where a set $S \subseteq V(G)$ is independent if the subgraph induced by S
in G has no edges. The chromatic number of G is the smallest number of
independent sets covering all vertices of G, and it is denoted by $\chi(G)$.
Both the problems of finding $\omega(G)$ and finding $\chi(G)$ are computationally
hard. It is NP-complete to decide whether $\omega(G) \ge k$, where k is a part of
the input, and it is NP-complete to decide whether $\chi(G) = 3$. Even approximating
$\chi(G)$ or $\omega(G)$ is hard. So classes of graphs where the clique number
and/or the chromatic number are computationally tractable are of great interest.
Perfect graphs are one of the most important such classes, and they include
many other classes found earlier. A graph G = (V, E) is called perfect
if $\omega(G') = \chi(G')$ for every induced subgraph G' of G (including G' = G).
For every graph G we have $\chi(G) \ge \omega(G)$, so a high clique number is a
"reason" for a high chromatic number. But in general it is not the only possible
reason, since there are graphs with $\omega(G) = 2$ but $\chi(G)$ arbitrarily large.

Perfect graphs are those whose chromatic number is exclusively controlled by
the cliques, and this is true for G and also for all of its induced subgraphs.
For perfect graphs, the clique number, and hence also the chromatic num-
ber, can be computed in polynomial time by a sophisticated algorithm (re-
lated to semidefinite programming briefly discussed in Section 15.5). It is not
known how hard it is to decide perfectness of a given graph. No polynomial-
time algorithm has been found, but neither has any hardness result (such as
coNP-hardness) been proved. But for graphs arising in many applications we
know in advance that they are perfect.
Typical nonperfect graphs are the odd cycles $C_{2k+1}$ of length 5 and larger,
since $\omega(C_{2k+1}) = 2$ for $k \ge 2$, while $\chi(C_{2k+1}) = 3$.
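For very small graphs, the quantities above can be computed by exhaustive search. The following sketch (with our own hypothetical helper names, and with no attempt at efficiency, since both problems are hard in general) computes the clique number and the chromatic number and tests perfectness directly from the definition by going through all induced subgraphs.

```python
from itertools import combinations

def cycle(n):
    """Adjacency sets of the cycle C_n on vertices 0, ..., n-1 (n >= 3)."""
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

def complement(adj):
    """Adjacency sets of the complement graph."""
    return {v: set(adj) - adj[v] - {v} for v in adj}

def clique_number(adj):
    """Largest k such that some k vertices are pairwise adjacent."""
    verts = list(adj)
    best = 0
    for k in range(1, len(verts) + 1):
        if any(all(b in adj[a] for a, b in combinations(S, 2))
               for S in combinations(verts, k)):
            best = k
    return best

def colorable(adj, k):
    """Backtracking test for a proper coloring with k colors."""
    verts = list(adj)
    color = {}
    def extend(i):
        if i == len(verts):
            return True
        v = verts[i]
        for c in range(k):
            if all(color.get(u) != c for u in adj[v]):
                color[v] = c
                if extend(i + 1):
                    return True
                del color[v]
        return False
    return extend(0)

def chromatic_number(adj):
    """Smallest k admitting a proper k-coloring (graph assumed nonempty)."""
    k = 1
    while not colorable(adj, k):
        k += 1
    return k

def is_perfect(adj):
    """Check omega(G') == chi(G') for every nonempty induced subgraph G'."""
    verts = list(adj)
    for k in range(1, len(verts) + 1):
        for S in map(set, combinations(verts, k)):
            sub = {v: adj[v] & S for v in S}
            if clique_number(sub) != chromatic_number(sub):
                return False
    return True

c5 = cycle(5)
print(clique_number(c5), chromatic_number(c5), is_perfect(c5))
```

The same functions confirm on small examples that a graph and its complement are perfect together, for instance for the even cycle $C_6$ and its complement.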
The following two conjectures were formulated by Berge at early stages
of research on perfect graphs. Here is the stronger one:
Strong perfect graph conjecture. A graph G is perfect if and
only if neither G nor its complement contains an odd cycle of length
5 or larger as an induced subgraph.
This is still open, in spite of considerable effort. The second conjecture is this:
Weak perfect graph conjecture. A graph is perfect if and only
if its complement is perfect.
This was proved in 1972. We reproduce a proof using convex polytopes.

12.1.1 Definition. Let G = (V, E) be a graph on n vertices. We assign a
convex polytope $P(G) \subset \mathbf{R}^n$ to G. Let the coordinates in $\mathbf{R}^n$ be indexed by
the vertices of G; i.e., if $V = \{v_1, \ldots, v_n\}$, then the points of P(G) are of
the form $x = (x_{v_1}, \ldots, x_{v_n})$. For an $x \in \mathbf{R}^n$ and a subset $U \subseteq V$, we put
$x(U) = \sum_{v \in U} x_v$.
The polytope P(G) is defined by the following inequalities:
(i) $x_v \ge 0$ for each vertex $v \in V$, and
(ii) $x(K) \le 1$ for each clique (complete subgraph) K in the graph G.

Observations.
• $P(G) \subseteq [0,1]^n$. The inequality $x_v \le 1$ is obtained from (ii) by choosing
K = {v}.
• The characteristic vector of each independent set lies in P( G).
• If a vector x E P(G) is integral (i.e., it is a 0/1 vector), then it is the
characteristic vector of an independent set.
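The definition and the observations can be tried out on a small graph. The sketch below tests membership in P(G) by enumerating all cliques; the helper names are hypothetical, and exact rational arithmetic is used to avoid rounding issues. For the 5-cycle it also exhibits the point with all coordinates 1/2, which satisfies (i) and (ii) although its coordinate sum 5/2 exceeds the independence number 2, so it cannot be a convex combination of characteristic vectors of independent sets.

```python
from fractions import Fraction
from itertools import combinations

def cycle(n):
    """Adjacency sets of the cycle C_n on vertices 0, ..., n-1 (n >= 3)."""
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

def cliques(adj):
    """All nonempty vertex sets that are pairwise adjacent (singletons included)."""
    verts = list(adj)
    return [S for k in range(1, len(verts) + 1)
            for S in combinations(verts, k)
            if all(b in adj[a] for a, b in combinations(S, 2))]

def in_polytope(adj, x):
    """Check inequalities (i) x_v >= 0 and (ii) x(K) <= 1 for every clique K."""
    return (all(x[v] >= 0 for v in adj) and
            all(sum(x[v] for v in K) <= 1 for K in cliques(adj)))

def char_vector(adj, S):
    """Characteristic vector of the vertex set S, as a dict indexed by vertices."""
    return {v: Fraction(int(v in S)) for v in adj}

c5 = cycle(5)
half = {v: Fraction(1, 2) for v in c5}
print(in_polytope(c5, char_vector(c5, {0, 2})))  # independent set
print(in_polytope(c5, char_vector(c5, {0, 1})))  # an edge, violates (ii)
print(in_polytope(c5, half), sum(half.values()))
```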
Before we start proving the weak perfect graph conjecture, let us introduce
some more notation. Let $w\colon V \to \{0, 1, 2, \ldots\}$ be a function assigning
nonnegative integer weights to the vertices of G. We define the weighted clique
number $\omega(G, w)$ as the maximum possible weight of a clique, where the weight
of a clique is the sum of the weights of its vertices. We also define the weighted

chromatic number $\chi(G, w)$ as the minimum number of independent sets such
that each vertex $v \in V$ is covered by w(v) of them.
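Both weighted parameters can be computed by brute force on tiny graphs, which is useful for experimenting with the definitions. In the sketch below the function names are our own. The covering condition is implemented as "at least w(v) times", which gives the same minimum as "exactly w(v) times" because subsets of independent sets are independent.

```python
from itertools import combinations, combinations_with_replacement

def cycle(n):
    """Adjacency sets of the cycle C_n on vertices 0, ..., n-1 (n >= 3)."""
    return {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

def weighted_clique_number(adj, w):
    """Maximum total weight of a clique (singletons count as cliques)."""
    verts = list(adj)
    return max(sum(w[v] for v in S)
               for k in range(1, len(verts) + 1)
               for S in combinations(verts, k)
               if all(b in adj[a] for a, b in combinations(S, 2)))

def independent_sets(adj):
    """All nonempty independent sets of the graph."""
    verts = list(adj)
    return [set(S) for k in range(1, len(verts) + 1)
            for S in combinations(verts, k)
            if all(b not in adj[a] for a, b in combinations(S, 2))]

def weighted_chromatic_number(adj, w):
    """Minimum number N of independent sets covering each v at least w[v] times."""
    sets = independent_sets(adj)
    N = 0
    while True:
        # Try every multiset of N independent sets.
        for family in combinations_with_replacement(sets, N):
            if all(sum(v in I for I in family) >= w[v] for v in adj):
                return N
        N += 1

c5 = cycle(5)
two = {v: 2 for v in c5}
print(weighted_clique_number(c5, two), weighted_chromatic_number(c5, two))
```

For the 5-cycle with all weights equal to 2, the two values differ (4 versus 5), in line with the fact that $C_5$ is not perfect; for a bipartite graph such as $C_4$ they coincide for every weight function we tried.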
Now we can formulate the main theorem.
12.1.2 Theorem. The following conditions are equivalent for a graph G:
(i) G is perfect.
(ii) $\omega(G, w) = \chi(G, w)$ for any nonnegative integral weight function w.
(iii) All vertices of the polytope P(G) are integral (and thus correspond to
the independent sets in G).
(iv) The graph $\overline{G}$ (the complement of G) is perfect.

Proof of (i) $\Rightarrow$ (ii). This part is purely graph-theoretic. For every weight
function $w\colon V \to \{0, 1, 2, \ldots\}$, we need to exhibit a covering of V by
independent sets witnessing $\chi(G, w) = \omega(G, w)$. If w attains only values 0 and 1,
then we can use (i) directly, since selecting an induced subgraph of G is the
same as specifying a 0/1 weight function on the vertices.
For other values of w we proceed by induction on w(V). Let w be given
and let $v_0$ be a vertex with $w(v_0) > 1$. We define a new weight function $w'$:
\[
w'(v) = \begin{cases} w(v) - 1 & \text{for } v = v_0, \\ w(v) & \text{for } v \ne v_0. \end{cases}
\]

Since $w'(V) < w(V)$, by the inductive hypothesis we assume that we have
independent sets $I_1, I_2, \ldots, I_N$ covering each v exactly $w'(v)$ times, where
$N = \omega(G, w')$. If $\omega(G, w) > N$, then we can obtain the appropriate covering
for w by adding the independent set $\{v_0\}$, so let us suppose $\omega(G, w) = N$.
Let the notation be chosen so that $v_0 \in I_1$. We define another weight
function $w''$:
\[
w''(v) = \begin{cases} w(v) - 1 & \text{for } v \in I_1, \\ w(v) & \text{for } v \notin I_1. \end{cases}
\]
We claim that $\omega(G, w'') < N$. If not, then there exists a clique K with
$w''(K) = N = \omega(G, w')$. Since $w'' \le w'$, by the choice of the $I_i$ we have
$N \le w'(K) = \sum_{i=1}^{N} |I_i \cap K|$. Since a clique intersects an independent set in at most one
vertex, K has to intersect each $I_i$. In particular, it intersects $I_1$, and so
$w(K) > w''(K) = N$, contradicting $\omega(G, w) = N$.
We thus have $\omega(G, w'') < N$. By the inductive hypothesis, we can produce
a covering by independent sets showing that $\chi(G, w'') < N$. By adding $I_1$ to
it we obtain a covering witnessing $\chi(G, w) = N$.
Proof of (ii) $\Rightarrow$ (iii). Let $x = (x_{v_1}, \ldots, x_{v_n})$ be a vertex of the convex
polytope P(G). Since all the inequalities defining P(G) have rational coefficients,
x has rational coordinates, and we can find a natural number q such that
w = qx is an integral vector. We interpret the coordinates of w as weights of
the vertices of G. Let K be a clique with weight $N = \omega(G, w)$. One of the
inequalities defining P(G) is $x(K) \le 1$, and hence $N = w(K) \le q$.
By (ii) we have $\chi(G, w) = \omega(G, w) \le q$, and so there are independent sets
$I_1, \ldots, I_q$ (some of them may be empty) covering each vertex $v \in V$ precisely
$w_v$ times. Let $c_i$ be the characteristic vector of $I_i$; then this property of the
sets $I_i$ can be written as $x = \sum_{i=1}^{q} \frac{1}{q} c_i$. Thus x is a convex combination of
the $c_i$, and since it is a vertex of P(G), it must be equal to some $c_i$, which is
a characteristic vector of an independent set in G.
Proof of (iii) $\Rightarrow$ (iv). It suffices to prove $\chi(\overline{G}) = \omega(\overline{G})$ for every G
satisfying (iii), since (iii) is preserved by passing to an induced subgraph
(right?).
We prove that a graph G fulfilling (iii) has a clique K intersecting all
independent sets of the maximum size $\alpha(G)$. Then the graph $G \setminus K$ has
independence number $\alpha(G) - 1$, and by repeating the same procedure we can
cover G by $\alpha(G)$ cliques, which amounts to a proper coloring of $\overline{G}$ with
$\alpha(G) = \omega(\overline{G})$ colors.
To find the required K, let us consider all the independent sets of size
$\alpha = \alpha(G)$ in G and let $M \subseteq P(G)$ be the convex hull of their characteristic
vectors. We note that M lies in the hyperplane $h = \{x : x(V) = \alpha\}$. This h
defines a (proper) face of P(G), for otherwise, we would have vertices of P(G)
on both sides of h, and in particular, there would be a vertex z with $z(V) > \alpha$.
This is impossible, since by (iii), z would correspond to an independent set
bigger than $\alpha$.

Each facet of P(G) corresponds to an equality in some of the inequalities
defining P(G). The equality can be either of the form $x_v = 0$ or of the form
$x(K) = 1$. The face $F = P(G) \cap h$ is the intersection of some of the facets.
Not all of these facets can be of the type $x_v = 0$, since then their intersection
would contain 0, while $0 \notin h$. Hence all $x \in M$ satisfy $x(K) = 1$ for a certain
clique K, and this means that $K \cap I \ne \emptyset$ for each independent set I of size $\alpha$.
Proof of (iv) $\Rightarrow$ (i). This is the implication (i) $\Rightarrow$ (iv) for the graph $\overline{G}$. □

Bibliography and remarks. Perfect graphs were introduced by
Berge [Ber61], [Ber62], who also formulated the two perfect graph
conjectures. The weak perfect graph conjecture was first proved
(combinatorially) by Lovász [Lov72]. The proof shown in this section follows
Grötschel, Lovász, and Schrijver [GLS88], whose account is based on
the ideas of [Lov72] and of Fulkerson [Ful70].
Grötschel et al. [GLS88] denote the polytope P(G) by QSTAB(G)
and call it the clique-constrained stable set polytope (another name in
the literature is the fractional stable set polytope). Here stable set is
another common name for an independent set, and the stable set
polytope $\mathrm{STAB}(G) \subset \mathbf{R}^{|V|}$ is the convex hull of the characteristic vectors of
all independent sets in G. As we have seen, STAB(G) = QSTAB(G)
if and only if G is a perfect graph. Polynomial-time algorithms for
perfect graphs are based on beautiful geometric ideas (related to the
famous Lovász $\vartheta$-function), and they are presented in [GLS88] or in
Lovász [Lov] (as well as in many other sources).
Polyhedral combinatorics was initiated mainly by the results of
Edmonds [Edm65]. For a graph G = (V, E), let M(G) denote the
matching polytope of G, that is, the convex hull of the characteristic
vectors of the matchings in G. According to Edmonds'
matching polytope theorem, M(G) is described by the following
inequalities: $x_e \ge 0$ for all $e \in E$, $\sum_{e \in E:\, v \in e} x_e \le 1$ for all $v \in V$, and
$\sum_{e \in E:\, e \subseteq S} x_e \le \frac{1}{2}(|S| - 1)$ for all $S \subseteq V$ of odd cardinality. For
bipartite G, the constraints of the last type are not necessary (this is an
older result of Birkhoff).
A modern textbook on combinatorial optimization, with an in-
troduction to polyhedral combinatorics, is Cook, Cunningham, Pul-
leyblank, and Schrijver [CCPS98]. It also contains references to the-
oretical and practical studies of the traveling salesman problem by
polyhedral methods.
A key step in many results of polyhedral combinatorics is proving
that a certain system of inequalities defines an integral polytope,
i.e., one with all vertices in $\mathbf{Z}^n$. Let us mention just one important
related concept: the total unimodularity. An m x n matrix A is totally
unimodular if every square submatrix of A has determinant 0, 1,
or -1. Total unimodularity can be tested in polynomial time (using
a deep characterization theorem of Seymour). All polyhedra defined
by totally unimodular matrices are integral, in the sense formulated
in Exercise 6. For other aspects of integral polytopes (sometimes also
called lattice polytopes) see, e.g., Barvinok [Bar97] (and Section 2.2).

Exercises
1. What are the integral vertices of the polytope $P(C_5)$? Find some
nonintegral vertex (and prove that it is really a vertex!). [!]
2. Prove that for every graph G and every clique K in G, the inequality
$x(K) \le 1$ defines a facet of the polytope P(G). In other words, there
is an $x \in P(G)$ for which x(K) = 1 is the only inequality among those
defining P(G) that is satisfied with equality. [!]
3. (On König's edge-covering theorem) Explain why bipartite graphs are
perfect, and why the perfectness of the complements of bipartite graphs is
equivalent to König's edge-covering theorem asserting that the maximum
number of vertex-disjoint edges in a bipartite graph equals the minimum
number of vertices needed to intersect all edges (also see Exercise 10.1.5).
[!]
4. (Comparability graphs and Dilworth's theorem) For a finite partially
ordered set $(X, \le)$ (see Section 12.3 for the definition), let G = (X, E)
be the graph with $E = \{\{u, v\} \in \binom{X}{2} : u < v \text{ or } v < u\}$; that is, edges
correspond to pairs of comparable elements. Any graph isomorphic to
such a G is called a comparability graph. We also need the notions of a
chain (a subset of X linearly ordered by $\le$) and an antichain (a subset
of X with no two elements comparable under $\le$).
(a) Prove that any finite $(X, \le)$ is the union of at most c antichains,
where c is the length of the longest chain, and check that this implies the
perfectness of comparability graphs. [TI
(b) Derive from (a) the Erdős-Szekeres lemma: If $a_1, a_2, \ldots, a_n$ are
arbitrary real numbers, then there exist indices $i_1 < i_2 < \cdots < i_k$ with
$k^2 \ge n$ and such that the subsequence $a_{i_1}, a_{i_2}, \ldots, a_{i_k}$ is monotone
(nondecreasing or decreasing). IT]
(c) Check that the perfectness of the complements of comparability
graphs is equivalent to the following theorem of Dilworth [Dil50]: Any
finite $(X, \le)$ is the union of at most a chains, where a is the maximum
number of elements of an antichain. [2]
5. (Hoffman's characterization of polytope integrality) Let P be a (bounded)
convex polytope in $\mathbf{R}^n$ such that for every $a \in \mathbf{Z}^n$, the minimum of the
function $x \mapsto \langle a, x \rangle$ over all $x \in P$ is an integer. Prove that all vertices
of P are integral (i.e., they belong to $\mathbf{Z}^n$). 0
6. (Kruskal-Hoffman theorem)
(a) Show that if A is a nonsingular n x n totally unimodular matrix
(all square submatrices have determinant 0 or ±1), then the mapping
$x \mapsto Ax$ maps $\mathbf{Z}^n$ bijectively onto $\mathbf{Z}^n$. IT]
(b) Show that if A is an m x n totally unimodular matrix and b is an
m-dimensional integer vector such that the system Ax = b has a real
solution x, then Ax = b has an integral solution as well. IT]
(c) Let A be an m x n totally unimodular matrix and let $u, v \in \mathbf{Z}^n$
and $w, z \in \mathbf{Z}^m$ be integer vectors. Show that all vertices of the convex
polyhedron given by the inequalities $u \le x \le v$ and $w \le Ax \le z$ are
integral. [2]
7. (Helly-type theorem for lattice points in convex sets)
(a) Let A be a set of $2^d + 1$ points in $\mathbf{Z}^d$. Prove that there are $a, b \in A$
with $\frac{1}{2}(a + b) \in \mathbf{Z}^d$. IT]
(b) Let $\gamma_1, \ldots, \gamma_n$ be closed half-spaces in $\mathbf{R}^d$, $n \ge 2^d + 1$, and suppose
that the intersection of every $2^d$ of them contains a lattice point (a point
of $\mathbf{Z}^d$). Prove that there exists a lattice point common to all the $\gamma_i$. [TI
(c) Prove that the number $2^d$ in (b) is the best possible, i.e., there are $2^d$
half-spaces such that every $2^d - 1$ of them have a common lattice point
but there is no lattice point common to all of them. 12]
(d) Extend the Helly-type theorem in (b) to arbitrary convex sets instead
of half-spaces. IT]
The result in (d) was proved by Doignon [Doi73]; his proof starts with (a)
and proceeds on the level of abstract convexity (while the proof suggested
in (b) is more geometric).

12.2 The Brunn-Minkowski Inequality


Let us consider a 3-dimensional convex loaf of bread and slice it by three
parallel planar cuts.

As we will derive below, the middle cut cannot have area smaller than both
of the other two cuts. Let us choose the coordinate system so that the cuts
are perpendicular to the $x_1$-axis and denote by v(t) the area of the cut by the
plane $x_1 = t$. Then the claim can be stated as follows: For any $t_1 < t < t_2$ we
have $v(t) \ge \min(v(t_1), v(t_2))$. Thus, there is some $t_0$ such that the function
$t \mapsto v(t)$ is nondecreasing on $(-\infty, t_0]$ and nonincreasing on $[t_0, \infty)$. Such a
function is called unimodal. A similar result is true for any convex body C in
$\mathbf{R}^{n+1}$ if v(t) denotes the n-dimensional volume of the intersection of C with
the hyperplane $\{x_1 = t\}$.
How can one prove such a statement? In the planar case, with n = 1,
it is easy to see that v(t) is a concave function on the interval obtained by
projecting C on the $x_1$-axis.

This might tempt one to think that v(t) is concave on the appropriate interval
in higher dimension, too, but this is false in general! (See Exercise 1.) There
is concavity in the game, but the right function to look at in $\mathbf{R}^{n+1}$ is $v(t)^{1/n}$.
Perhaps a little more intuitively, we can define r(t) as the radius of the
n-dimensional ball whose volume equals v(t). We have $r(t) = R_n v(t)^{1/n}$, where
$R_n$ is the radius of a unit-volume ball in $\mathbf{R}^n$; let us call r(t) the equivalent
radius of C at t.
12.2.1 Theorem (Brunn's inequality for slice volumes). Let
$C \subset \mathbf{R}^{n+1}$ be a compact convex body and let the interval $[t_{\min}, t_{\max}]$ be the
projection of C on the $x_1$-axis. Then the equivalent radius function r(t) (or,
equivalently, the function $v(t)^{1/n}$) is concave on $[t_{\min}, t_{\max}]$. Consequently,
for any $t_1 < t < t_2$ we have $v(t) \ge \min(v(t_1), v(t_2))$.
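A small numerical illustration of why the exponent 1/n matters, given as a hedged Python sketch rather than as part of the text: the body chosen here (the cone with apex at the origin over the cube {1} x [0,1]^n) and the helper names are assumptions made for the example. Its slice by {x_1 = t} is a cube of side t, so v(t) = t^n.

```python
def slice_volume(t, n):
    """n-dimensional volume of the slice {x_1 = t} of the cone over {1} x [0,1]^n."""
    return t ** n

def midpoint_concave(f, a, b, tol=1e-12):
    """Test the midpoint concavity inequality f((a+b)/2) >= (f(a) + f(b)) / 2."""
    return f((a + b) / 2) >= (f(a) + f(b)) / 2 - tol

n = 3

def v(t):
    return slice_volume(t, n)

def v_root(t):
    return slice_volume(t, n) ** (1.0 / n)

print(midpoint_concave(v, 0.0, 1.0))       # False: v(1/2) = 1/8 is below 1/2
print(midpoint_concave(v_root, 0.0, 1.0))  # True: v(t)^(1/n) = t is linear, hence concave
```

The check uses only a single midpoint, so it can refute concavity but not prove it; here v(t)^{1/n} = t is visibly linear.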
Brunn's inequality is a consequence of the following more general and
more widely applicable statement dealing with two arbitrary compact sets.

12.2.2 Theorem (Brunn-Minkowski inequality). Let A and B be
nonempty compact sets in $\mathbf{R}^n$. Then

\[
\mathrm{vol}(A + B)^{1/n} \ge \mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}.
\]


Here $A + B = \{a + b : a \in A,\ b \in B\}$ denotes the Minkowski sum of A
and B. If A' is a translated copy of A, and B' a translated copy of B, then
A' + B' is a translated copy of A + B. So the position of A + B with respect
to A and B depends on the choice of coordinate system, but the shape of
A + B does not. One way of interpreting the Minkowski sum is as follows:
Keep A fixed, pick a point $b_0 \in B$, and translate B into all possible positions
for which $b_0$ lies in A. Then A + B is the union of all such translates. Here is
a planar example:

[Figure: a planar example of the Minkowski sum A + B, with the point $b_0$ of B marked.]

Sometimes it is also useful to express the Minkowski sum A + B as a projection
of the Cartesian product $A \times B \subset \mathbf{R}^{2n}$ by the mapping $(x, y) \mapsto x + y$,
$x, y \in \mathbf{R}^n$.
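To make the Minkowski sum concrete, here is a short Python sketch under stated assumptions: the sets, side lengths, and function names are chosen only for illustration. It forms the Minkowski sum of two finite point sets and then checks the Brunn-Minkowski inequality in the one case where everything is explicit, namely two axis-parallel boxes, whose Minkowski sum is the box with added side lengths.

```python
from math import prod

def minkowski_sum(A, B):
    """Minkowski sum of two finite point sets given as collections of coordinate tuples."""
    return {tuple(a_i + b_i for a_i, b_i in zip(a, b)) for a in A for b in B}

def box_volume(sides):
    """Volume of an axis-parallel box with the given side lengths."""
    return prod(sides)

# Finite sets: corner sets of a 2 x 1 box and of a 1 x 3 box.
A = {(0, 0), (2, 0), (0, 1), (2, 1)}
B = {(0, 0), (1, 0), (0, 3), (1, 3)}
S = minkowski_sum(A, B)
print(len(S), (3, 4) in S)  # the sum contains the far corner of the 3 x 4 box

# Boxes: the sides of the sum are the sums of the sides.
sides_A, sides_B = (2.0, 1.0), (1.0, 3.0)
sides_sum = tuple(x + y for x, y in zip(sides_A, sides_B))
n = len(sides_A)
lhs = box_volume(sides_sum) ** (1 / n)
rhs = box_volume(sides_A) ** (1 / n) + box_volume(sides_B) ** (1 / n)
print(lhs >= rhs)  # True for this pair of boxes
```

Forming all pairwise sums is exactly the projection of A x B under (x, y) -> x + y described above, restricted to finite sets.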
Proof of Brunn's inequality for slice volumes from the Brunn-Minkowski
inequality. First we consider "convex combinations" of sets
$A, B \subset \mathbf{R}^n$ of the form $(1-t)A + tB$, where $t \in [0, 1]$ and where tA stands for
$\{ta : a \in A\}$. As t goes from 0 to 1, $(1-t)A + tB$ changes shape continuously
from A to B.
Now, if A and B are both convex and we place them into $\mathbf{R}^{n+1}$ so that A
lies in the hyperplane $\{x_1 = 0\}$ and B in the hyperplane $\{x_1 = 1\}$, it is not
difficult to check that $(1-t)A + tB$ is the slice of the convex body $\mathrm{conv}(A \cup B)$
by the hyperplane $\{x_1 = t\}$; see Exercise 2:

Let us consider the situation as in Brunn's inequality, where $C \subset \mathbf{R}^{n+1}$
is a convex body. Let A and B be the slices of C by the hyperplanes $\{x_1 = t_1\}$
and $\{x_1 = t_2\}$, respectively, where $t_1 < t_2$ are such that $A, B \ne \emptyset$. For
convenient notation, we change the coordinate system so that $t_1 = 0$ and
$t_2 = 1$. To prove the concavity of the function $v(t)^{1/n}$ in Brunn's inequality,
we need to show that for all $t \in (0, 1)$,
\[
(1-t)\,\mathrm{vol}(A)^{1/n} + t\,\mathrm{vol}(B)^{1/n} \le \mathrm{vol}(M)^{1/n}, \tag{12.1}
\]
where M is the slice of C by the hyperplane $h_t = \{x_1 = t\}$. Let
$C' = \mathrm{conv}(A \cup B)$ and $M' = C' \cap h_t$. We have $C' \subseteq C$ and $M' \subseteq M$. By the
remark above, $M' = (1-t)A + tB$, and so the Brunn-Minkowski inequality
applied to the sets $(1-t)A$ and $tB$ yields
\begin{align*}
\mathrm{vol}(M)^{1/n} \ge \mathrm{vol}(M')^{1/n} &= \mathrm{vol}((1-t)A + tB)^{1/n} \\
&\ge \mathrm{vol}((1-t)A)^{1/n} + \mathrm{vol}(tB)^{1/n} \\
&= (1-t)\,\mathrm{vol}(A)^{1/n} + t\,\mathrm{vol}(B)^{1/n}.
\end{align*}
This verifies (12.1). □

Proof of the Brunn-Minkowski inequality. The idea of this proof is
simple but perhaps surprising in this context. Call a set $A \subseteq \mathbf{R}^n$ a brick set if
it is a union of finitely many closed axis-parallel boxes with disjoint interiors.
First we show that it suffices to prove the inequality for brick sets (which is
easy but a little technical), and then for brick sets the proof goes by induction
on the number of bricks.
12.2.3 Lemma. If the Brunn-Minkowski inequality holds for all nonempty
brick sets $A', B' \subset \mathbf{R}^n$, then it is valid for all nonempty compact sets
$A, B \subset \mathbf{R}^n$ as well.

Proof. We use a basic fact from measure theory, namely, that if
$X_1 \supseteq X_2 \supseteq X_3 \supseteq \cdots$ is a sequence of measurable sets of finite volume in $\mathbf{R}^n$
such that $X = \bigcap_{i=1}^{\infty} X_i$,
then the numbers $\mathrm{vol}(X_i)$ converge to $\mathrm{vol}(X)$.
Let $A, B \subset \mathbf{R}^n$ be nonempty and compact. For k = 1, 2, ..., consider the
closed axis-parallel cubes with side length $2^{-k}$ centered at the points of the
scaled grid $2^{-k}\mathbf{Z}^n$ (these cubes cover $\mathbf{R}^n$ and have disjoint interiors). Let $A_k$
be the union of all such cubes intersecting the set A, and similarly for $B_k$.

[Figure: a compact set A and the brick set $A_k$ formed by the grid cubes intersecting it.]
We have $A_1 \supseteq A_2 \supseteq \cdots$ and $\bigcap_k A_k = A$ (since any point not belonging to A
has a positive distance from it, and the distance of any point of $A_k$ from A
is at most $2^{-k}\sqrt{n}$). Therefore, $\mathrm{vol}(A_k) \to \mathrm{vol}(A)$ and $\mathrm{vol}(B_k) \to \mathrm{vol}(B)$.
We claim that $A + B \supseteq \bigcap_k (A_k + B_k)$. To see this, let $x \in A_k + B_k$ for all k.
We pick $y_k \in A_k$ and $z_k \in B_k$ with $x = y_k + z_k$, and by passing to convergent
subsequences we may assume that $y_k \to y \in A$ and $z_k \to z \in B$. Then we
obtain $x = y + z \in A + B$. Thus $\lim_{k\to\infty} \mathrm{vol}(A_k + B_k) \le \mathrm{vol}(A + B)$. By
the Brunn-Minkowski inequality for the brick sets $A_k$, $B_k$, we have
\[
\mathrm{vol}(A + B)^{1/n} \ge \lim_{k\to\infty} \mathrm{vol}(A_k + B_k)^{1/n}
\ge \lim_{k\to\infty} \bigl(\mathrm{vol}(A_k)^{1/n} + \mathrm{vol}(B_k)^{1/n}\bigr)
= \mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}.
\]
□

Proof of the Brunn-Minkowski inequality for brick sets. Let A
and B be brick sets consisting of k bricks in total. If k = 2, then both
A and B, and A + B too, are bricks. Then if $x_1, \ldots, x_n$ are the sides of
A and $y_1, \ldots, y_n$ are the sides of B, it suffices to establish the inequality
$\bigl(\prod_{i=1}^{n} x_i\bigr)^{1/n} + \bigl(\prod_{i=1}^{n} y_i\bigr)^{1/n} \le \bigl(\prod_{i=1}^{n} (x_i + y_i)\bigr)^{1/n}$; we leave this to Exercise 3.
Now let k > 2 and suppose that the Brunn-Minkowski inequality holds
for all pairs A, B of brick sets together consisting of fewer than k bricks. Let
A and B together have k bricks, and let the notation be chosen so that A
has at least two bricks. Then it is easily seen that there exists a hyperplane h
parallel to some of the coordinate hyperplanes and with at least one full brick
of A on one side and at least one full brick of A on the other side (Exercise 4).
By a suitable choice of the coordinate system, we may assume that h is the
hyperplane $\{x_1 = 0\}$.
Let A' be the part of A on one side of h and A'' the part on the other side.
More precisely, A' is the closure of $A \cap h^{\oplus}$, where $h^{\oplus}$ is the open half-space
$\{x_1 > 0\}$, and similarly, A'' is the closure of $A \cap h^{\ominus}$. Hence both A' and A''
have at least one brick fewer than A.
Next, we translate the set B in the $x_1$-direction in such a way that the
hyperplane h divides its volume in the same ratio as A is divided (translation
does not influence the validity of the Brunn-Minkowski inequality). Let B'
and B'' be the respective parts of B.
[Figure: the hyperplane h divides A into A' and A'', and the translated B into B' and B''.]
Putting $p = \mathrm{vol}(A')/\mathrm{vol}(A)$, we also have $p = \mathrm{vol}(B')/\mathrm{vol}(B)$. (If vol(A) = 0
or vol(B) = 0, then the Brunn-Minkowski inequality is obvious.)
The sets A' and B' together have fewer than k bricks, so we can use the
inductive assumption for them, and similarly for A", B".
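The set A' + B' is contained in the closed half-space $\{x_1 \ge 0\}$, and
A'' + B'' lies in the opposite closed half-space $\{x_1 \le 0\}$. Therefore, crucially,
$\mathrm{vol}(A + B) \ge \mathrm{vol}(A' + B') + \mathrm{vol}(A'' + B'')$. We calculate
\begin{align*}
\mathrm{vol}(A + B) &\ge \mathrm{vol}(A' + B') + \mathrm{vol}(A'' + B'') \\
\text{(induction)}\quad &\ge \bigl[\mathrm{vol}(A')^{1/n} + \mathrm{vol}(B')^{1/n}\bigr]^n + \bigl[\mathrm{vol}(A'')^{1/n} + \mathrm{vol}(B'')^{1/n}\bigr]^n \\
&= \bigl[p^{1/n}\,\mathrm{vol}(A)^{1/n} + p^{1/n}\,\mathrm{vol}(B)^{1/n}\bigr]^n \\
&\quad + \bigl[(1-p)^{1/n}\,\mathrm{vol}(A)^{1/n} + (1-p)^{1/n}\,\mathrm{vol}(B)^{1/n}\bigr]^n \\
&= \bigl[\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}\bigr]^n .
\end{align*}
This concludes the proof of the Brunn-Minkowski inequality. □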
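For readers who want the last equality of the calculation spelled out, here is a short worked step. The abbreviations $a = \mathrm{vol}(A)^{1/n}$ and $b = \mathrm{vol}(B)^{1/n}$ are introduced only for this display; the step uses $\mathrm{vol}(A') = p\,\mathrm{vol}(A)$, $\mathrm{vol}(A'') = (1-p)\,\mathrm{vol}(A)$, and the same for B.

```latex
\begin{align*}
\bigl[p^{1/n} a + p^{1/n} b\bigr]^n + \bigl[(1-p)^{1/n} a + (1-p)^{1/n} b\bigr]^n
  &= p\,(a + b)^n + (1-p)\,(a + b)^n \\
  &= (a + b)^n .
\end{align*}
```

So the choice of the translation of B, which makes the same ratio p appear for both sets, is what lets the two bracketed terms recombine.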

Bibliography and remarks. Brunn's inequality for slice volumes


appears in Brunn's dissertation from 1887 and in his Habilitations-
schrift from 1889. Minkowski's formulation of Theorem 12.2.2 (proved
for convex sets) was published in the 1910 edition of his book [Min96].
A proof for arbitrary compact sets was given by Lusternik in 1935;
see, e.g., Sangwine-Yager [SY93] for references.
The proof of the Brunn-Minkowski inequality presented here follows
Appendix III in Milman and Schechtman [MS86]. Several other
proofs are known. A modern one, explained in Ball [Bal97], derives a
more general inequality dealing with functions. Namely, if $t \in (0, 1)$
and f, g, and h are nonnegative measurable functions $\mathbf{R}^n \to \mathbf{R}$
such that $h((1-t)x + ty) \ge f(x)^{1-t} g(y)^{t}$ for all $x, y \in \mathbf{R}^n$, then
$\int_{\mathbf{R}^n} h \ge \bigl(\int_{\mathbf{R}^n} f\bigr)^{1-t} \bigl(\int_{\mathbf{R}^n} g\bigr)^{t}$ (the Prékopa-Leindler inequality). By
letting f, g, and h be the characteristic functions of A, B, and $(1-t)A + tB$,
respectively, we obtain $\mathrm{vol}((1-t)A + tB) \ge \mathrm{vol}(A)^{1-t}\,\mathrm{vol}(B)^{t}$. This is
an alternative form of the Brunn-Minkowski inequality, from which
the version in Theorem 12.2.2 follows quickly (see Exercise 5).
Advantageously, the dimension does not appear in the Prékopa-Leindler
inequality, and it is simple to derive the general case from the
1-dimensional case by induction; see Exercise 7. This passage to a
dimension-free form of the inequality, which can be proved from the 1-dimensional
case by a simple product argument, is typical in the modern theory of
geometric inequalities (a similar phenomenon for measure concentration
inequalities is mentioned in the notes to Section 14.2).
The Brunn-Minkowski inequality is just the first step in a sophisticated
theory; see Schneider [Sch93] or Sangwine-Yager [SY93].
Among the most prominent notions are the mixed volumes. As was
discovered by Minkowski, if $K_1, \ldots, K_r \subset \mathbf{R}^n$ are convex bodies and
$\lambda_1, \lambda_2, \ldots, \lambda_r$ are nonnegative real parameters, then
$\mathrm{vol}(\lambda_1 K_1 + \lambda_2 K_2 + \cdots + \lambda_r K_r)$ is a homogeneous symmetric polynomial of degree n.
For $1 \le i_1 \le i_2 \le \cdots \le i_n \le r$, the coefficient of $\lambda_{i_1} \lambda_{i_2} \cdots \lambda_{i_n}$
is denoted by $V(K_{i_1}, K_{i_2}, \ldots, K_{i_n})$ and called the mixed volume of
$K_{i_1}, K_{i_2}, \ldots, K_{i_n}$. A powerful generalization of the Brunn-Minkowski
inequality, the Alexandrov-Fenchel inequality, states that for any convex
$A, B, K_3, \ldots, K_n \subset \mathbf{R}^n$, we have
\[
V(A, B, K_3, \ldots, K_n)^2 \ge V(A, A, K_3, \ldots, K_n) \cdot V(B, B, K_3, \ldots, K_n).
\]

Exercises
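1. Let A be a single point and B the n-dimensional unit cube. What is the
function $v(t) = \mathrm{vol}((1-t)A + tB)$? Show that $v(t)^{\beta}$ is not concave on
[0, 1] for any $\beta > \frac{1}{n}$. II]
2. Let $A, B \subseteq \mathbf{R}^n$ be convex sets. Show that the sets $\mathrm{conv}((\{0\} \times A) \cup
(\{1\} \times B))$ and $\bigcup_{t \in [0,1]} [\{t\} \times ((1-t)A + tB)]$ (in $\mathbf{R}^{n+1}$) are equal. 0
3. Prove that
\[
\Bigl(\prod_{i=1}^{n} x_i\Bigr)^{1/n} + \Bigl(\prod_{i=1}^{n} y_i\Bigr)^{1/n} \le \Bigl(\prod_{i=1}^{n} (x_i + y_i)\Bigr)^{1/n}
\]
for arbitrary positive reals $x_i$, $y_i$. m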


4. Show that for any brick set A with at least two bricks, there exists a
hyperplane h parallel to one of the coordinate hyperplanes that has at
least one full brick of A on each side. 0
5. (Dimension-free form of Brunn-Minkowski) Consider the following two
statements:
(i) Theorem 12.2.2, i.e., $\mathrm{vol}(A + B)^{1/n} \ge \mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}$ for every
nonempty compact $A, B \subset \mathbf{R}^n$.
(ii) For all compact $C, D \subset \mathbf{R}^n$ and all $t \in (0, 1)$, $\mathrm{vol}((1-t)C + tD) \ge
\mathrm{vol}(C)^{1-t}\,\mathrm{vol}(D)^{t}$.
(a) Derive (ii) from (i); prove and use the inequality $(1-t)x + ty \ge x^{1-t} y^{t}$
(x, y positive reals, $t \in (0, 1)$). m
(b) Prove (i) from (ii). 0


6. Give a short proof of the 1-dimensional Brunn-Minkowski inequality:
$\mathrm{vol}(A + B) \ge \mathrm{vol}(A) + \mathrm{vol}(B)$ for any nonempty measurable $A, B \subset \mathbf{R}$.
~
7. (Brunn-Minkowski via Prékopa-Leindler) The goal is to establish
statement (ii) in Exercise 5.
(a) Let $f, g, h\colon \mathbf{R} \to \mathbf{R}$ be bounded nonnegative measurable functions
such that $h((1-t)x + ty) \ge f(x)^{1-t} g(y)^{t}$ for all $x, y \in \mathbf{R}$ and all $t \in (0, 1)$.
Use the one-dimensional Brunn-Minkowski inequality (Exercise 6) to
prove $\int h \ge (1-t)\bigl(\int f\bigr) + t\bigl(\int g\bigr)$ (all integrals over $\mathbf{R}$); by the inequality
in Exercise 5(a), the latter expression is at least $\bigl(\int f\bigr)^{1-t}\bigl(\int g\bigr)^{t}$. First
show that we may assume sup f = sup g = 1. ~
(b) Prove statement (ii) in Exercise 5 by induction on the dimension,
using (a) in the induction step. 0

12.3 Sorting Partially Ordered Sets


Here we present an amazing application of polyhedral combinatorics and
of the Brunn-Minkowski inequality in a problem in theoretical computer
science: sorting of partially ordered sets. We recall that a partially ordered set,
or poset for short, is a pair $(X, \preceq)$, where X is a set and $\preceq$ is a binary relation
on X (called an ordering) satisfying three axioms: reflexivity ($x \preceq x$ for all
x), transitivity ($x \preceq y$ and $y \preceq z$ implies $x \preceq z$), and weak antisymmetry (if
$x \preceq y$ and $y \preceq x$, then x = y). The ordering $\preceq$ is linear if every two elements
$x, y \in X$ are comparable; that is, $x \preceq y$ or $y \preceq x$.
Let X be a given finite set with some linear ordering $\le$. For example,
the elements of X could be identical-looking golden coins ordered by their
weights (assuming that no two weights exactly coincide). We want to sort
X according to $\le$; that is, to list the elements of X in increasing order. We
can get information about $\le$ by pairwise comparisons: We can choose two
elements $a, b \in X$ and ask an oracle whether $a \le b$ or $a \ge b$. In our example,
we have precise scales such that only one coin fits on each scale, which allows
us to make pairwise comparisons. Our sorting procedure may be adaptive:
The elements to be compared next may be selected depending on the outcome
of previous comparisons. We want to make as few comparisons as possible.
In the usual sorting problem we begin with no information about the
ordering $\le$ whatsoever. As is well known, $\Theta(n \log n)$ comparisons are sufficient
and also necessary in the worst case. Here we consider a different setting,
when we start with some information already given. Namely, we obtain
(explicitly) some partial ordering $\preceq$ on X, and we are guaranteed that $x \preceq y$
implies $x \le y$; that is, $\le$ is a linear extension of $\preceq$. In the example with coins,
some weighings have already been made for us before we start. How many
comparisons do we need to sort?
Let $E(\preceq)$ denote the set of all linear extensions of a partial ordering $\preceq$
and let $e(\preceq) = |E(\preceq)|$ be the number of linear extensions. To sort means to
select one among the $e(\preceq)$ possible linear extensions. Since a comparison of
distinct elements a and b can have two outcomes, we need at least $\log_2 e(\preceq)$
comparisons in the worst case to distinguish the appropriate linear extension.
Is this lower bound always asymptotically tight? Can one always sort using
$O(\log_2 e(\preceq))$ comparisons, for any $\preceq$? An affirmative answer is implied by
the following theorem:
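The quantities $e(\preceq)$ and the lower bound $\log_2 e(\preceq)$ can be computed by brute force for tiny posets. The following Python sketch is only an illustration: the function name `linear_extensions`, the representation of a partial ordering by a list of pairs, and the three-coin example are assumptions made here, and enumerating all permutations is feasible only for very small X.

```python
from itertools import permutations
from math import ceil, log2

def linear_extensions(elements, relations):
    """List all linear orderings of `elements` (tuples, smallest element first)
    that respect every pair (a, b) in `relations`, read as "a precedes b"."""
    result = []
    for order in permutations(elements):
        position = {x: i for i, x in enumerate(order)}
        if all(position[a] < position[b] for a, b in relations):
            result.append(order)
    return result

# Hypothetical example: three coins, and we already know that a is lighter than b.
elements = ["a", "b", "c"]
relations = [("a", "b")]

extensions = linear_extensions(elements, relations)
e = len(extensions)
print(e)              # 3 linear extensions: abc, acb, cab
print(ceil(log2(e)))  # 2 comparisons are needed in the worst case
```

Since a list of pairs and its transitive closure have the same linear extensions, the sketch does not need to compute the closure explicitly.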

12.3.1 Theorem (Efficient comparison theorem). Let $(X, \preceq)$ be a
poset, and suppose that $\preceq$ is not linear. Then there exist elements $a, b \in X$
such that
\[
\delta \le \frac{e(\preceq + (a, b))}{e(\preceq)} \le 1 - \delta,
\]
where $\delta > 0$ is an absolute constant and $\preceq + (a, b)$ stands for the transitive
closure of the relation $\preceq \cup \{(a, b)\}$, that is, the partial ordering we obtain
from $\preceq$ if we are told that a precedes b.

How do we use this for sorting $\preceq$? For the first comparison, we choose
the two elements a, b as in the theorem. Depending on the outcome of this
comparison, we pass either to the partial ordering $\preceq + (a, b)$ or to $\preceq + (b, a)$.
In both cases, the number of linear extensions has been reduced at least by the factor
$1 - \delta$: For $a \le b$ this is clear by the theorem, and for $a \ge b$ this follows
from the equality $e(\preceq + (a, b)) + e(\preceq + (b, a)) = e(\preceq)$. Hence, proceeding by
induction, we can sort any partial ordering $\preceq$ using at most $\lceil \log_{1/(1-\delta)} e(\preceq) \rceil$
comparisons.
The conjectured "right" value of $\delta$ in Theorem 12.3.1 is $\frac{1}{3} \approx 0.33$;
obviously, one cannot do any better for the poset

(meaning that (a, b) is the only pair of distinct elements in the relation $\preceq$).
The proof below gives $\delta = \frac{1}{2e} \approx 0.184$, and more complicated proofs yield
better values, although $\frac{1}{3}$ seems still elusive.
Order polytopes. We assign certain convex polytopes to partial orderings.

12.3.2 Definition (Order polytope). Let $(X, \preceq)$ be an n-element poset.
Let the coordinates in $\mathbf{R}^n$ be indexed by the elements of X. We define a
polytope $P(\preceq)$, the order polytope of $\preceq$, as the set of all $x \in [0, 1]^n$ satisfying
the following inequalities:

$x_a \le x_b$ for every $a, b \in X$ with $a \preceq b$.

Here is an alternative description of the order polytope:



12.3.3 Observation. The vertices of the order polytope $P(\preceq)$ are precisely
the characteristic vectors of all up-sets in $(X, \preceq)$, where an up-set is a subset
$U \subseteq X$ such that if $a \in U$ and $a \preceq b$, then $b \in U$.

Proof. It is easy to see that the characteristic vector of an up-set is in
$P(\preceq)$, and that any 0/1 vector in $P(\preceq)$ determines an up-set. It remains to
check that all vertices of $P(\preceq)$ are integral. Any vertex is the intersection of
some n facet hyperplanes. Since all potential facet hyperplanes have the form
$x_a = x_b$, or $x_a = 0$, or $x_a = 1$, the integrality is obvious. □

12.3.4 Observation. Let X be an n-element set.
(i) If $\le$ is a linear ordering on X, then $P(\le)$ is a simplex of volume 1/n!.
(ii) For any partial ordering $\preceq$ on X, the simplices of the form $P(\le)$, where $\le$
is a linear extension of $\preceq$, cover $P(\preceq)$ and have disjoint interiors. Hence
$\mathrm{vol}(P(\preceq)) = \frac{1}{n!}\, e(\preceq)$.

Here is the order polytope of a 3-element poset:

[Figure: the order polytope of a 3-element poset inside the unit cube, with coordinate axes $x_a$, $x_b$, $x_c$.]

It is subdivided into 3 tetrahedra corresponding to linear extensions.


Proof of Observation 12.3.4. In (i), consider the ordering $1 \le 2 \le \cdots \le n$.
The characteristic vectors of up-sets have the form (0, 0, ..., 0, 1, 1, ..., 1).
There are n+1 of them, and they are affinely independent, so $P(\le)$ is a
simplex. Other linear orderings differ by a permutation of coordinates, so
we get congruent simplices. The volume could be calculated directly, but it
follows easily from considerations below.
As for (ii), any point $(x_1, \ldots, x_n) \in P(\preceq)$ with distinct coordinates
determines a unique linear extension of $\preceq$, namely the one given by the natural
ordering of its coordinates as real numbers. Conversely, for any linear
extension $\le\, \in E(\preceq)$, we have $P(\le) \subseteq P(\preceq)$ by definition. Hence the congruent
simplices corresponding to linear extensions subdivide $P(\preceq)$.
To see that the simplices have volume 1/n!, take the discrete ordering (no
two distinct elements are comparable) for $\preceq$. The order polytope is the unit
cube $[0, 1]^n$, and it is subdivided into n! congruent simplices corresponding
to the n! possible linear orderings. □

Height and center of gravity. Let X be a finite set and $\le$ a linear
ordering on it. For $a \in X$, we define the height of a in $\le$, denoted by $h_{\le}(a)$,
as $|\{x \in X : x \le a\}|$. For a poset $(X, \preceq)$, the height of an element is defined
as the average height over all linear extensions:
\[
h_{\preceq}(a) = \frac{1}{e(\preceq)} \sum_{\le\, \in E(\preceq)} h_{\le}(a).
\]
If $\preceq$ is clear from context, we omit it in the subscript and we write just h(a).
The "good" elements a, b in the efficient comparison theorem can be
selected using the height. Namely, we show that any two distinct a, b with
$|h(a) - h(b)| < 1$ will do. (It is simple to check that if $\preceq$ is not a linear
ordering, then such a and b always exist; see Exercise 1.)
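The selection rule just described can be tried on a small example. The Python sketch below is a brute-force illustration under assumptions made here (the helper names, the representation of the ordering by a list of pairs, and the example poset consisting of two disjoint chains); it computes the average heights and, for each pair with $|h(a) - h(b)| < 1$, the fraction of linear extensions in which a precedes b.

```python
from fractions import Fraction
from itertools import combinations, permutations

def linear_extensions(elements, relations):
    """All linear orderings (tuples, smallest element first) respecting the pairs
    (a, b) in `relations`, each read as "a precedes b"."""
    result = []
    for order in permutations(elements):
        position = {x: i for i, x in enumerate(order)}
        if all(position[a] < position[b] for a, b in relations):
            result.append(order)
    return result

def average_heights(elements, relations):
    """Height of each element: its 1-based position averaged over all linear extensions."""
    extensions = linear_extensions(elements, relations)
    return {
        a: Fraction(sum(order.index(a) + 1 for order in extensions), len(extensions))
        for a in elements
    }

# Hypothetical example: two disjoint chains, a below b and c below d.
elements = ["a", "b", "c", "d"]
relations = [("a", "b"), ("c", "d")]

heights = average_heights(elements, relations)
total = len(linear_extensions(elements, relations))

for a, b in combinations(elements, 2):
    if abs(heights[a] - heights[b]) < 1:
        refined = len(linear_extensions(elements, relations + [(a, b)]))
        print(a, b, Fraction(refined, total))  # prints "a c 1/2" and "b d 1/2"
```

In this example the heights are 5/3 for a and c and 10/3 for b and d, so the rule selects the pairs (a, c) and (b, d), and either comparison halves the number of linear extensions.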
We now relate the height to the order polytope.

12.3.5 Lemma. For any n-element poset $(X, \preceq)$, the center of gravity of the
order polytope $P(\preceq)$ is $c = (c_a : a \in X)$, where $c_a = \frac{1}{n+1} h_{\preceq}(a)$.

Proof. The center of gravity of $P(\preceq)$ is the arithmetic average of centers
of gravity of the simplices $P(\le)$ with $\le\, \in E(\preceq)$. Hence it suffices to prove
the lemma for a linear ordering $\le$. By permuting coordinates, it suffices to
calculate that for the simplex with vertices of the form (0, ..., 0, 1, ..., 1), the
center of gravity is $\frac{1}{n+1}(1, 2, \ldots, n)$. This is left as Exercise 2. □
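The simplex computation in the proof can be checked numerically. The sketch below assumes the standard fact that the center of gravity of a simplex is the average of its vertices; the function names and the choice n = 4 are illustrative only.

```python
from fractions import Fraction

def chain_simplex_vertices(n):
    """Vertices (0,...,0,1,...,1) of the order polytope of the chain 1 < 2 < ... < n:
    the k-th vertex has its last k coordinates equal to 1, for k = 0, ..., n."""
    return [tuple(0 if i < n - k else 1 for i in range(n)) for k in range(n + 1)]

def centroid(vertices):
    """Average of the vertices (the center of gravity when they span a simplex)."""
    m = len(vertices)
    dim = len(vertices[0])
    return tuple(Fraction(sum(v[i] for v in vertices), m) for i in range(dim))

n = 4
c = centroid(chain_simplex_vertices(n))
print(c == tuple(Fraction(i, n + 1) for i in range(1, n + 1)))  # True
```

The i-th coordinate equals 1 in exactly i of the n+1 vertices, which is where the vector (1, 2, ..., n)/(n+1) comes from.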

Proof of the efficient comparison theorem. Given the poset $(X, \preceq)$,
we consider two elements $a, b \in X$ with $|h(a) - h(b)| < 1$. We want to show
that the number of linear extensions of both $\preceq + (a, b)$ and $\preceq + (b, a)$ is at
least a constant fraction of $e(\preceq)$. Consider the order polytopes $P = P(\preceq)$,
$P_{\le} = P(\preceq + (a, b))$, and $P_{\ge} = P(\preceq + (b, a))$. Geometrically, P is sliced into
$P_{\le}$ and $P_{\ge}$ by the hyperplane $h = \{x \in \mathbf{R}^n : x_a = x_b\}$.

By Observation 12.3.4(ii), it suffices to show that the volumes of both $P_{\le}$
and $P_{\ge}$ are at least a constant fraction of vol(P).
For convenience, let us introduce a new coordinate system in $\mathbf{R}^n$, where
the first coordinate $y_1$ is $x_b - x_a$ and the others complete it to an orthonormal
coordinate system $(y_1, \ldots, y_n)$. Hence h is the hyperplane $y_1 = 0$. Let c(P)
denote the center of gravity of P, and let $c_1 = c_1(P)$ be its $y_1$-coordinate.
What geometric information do we have about $P$? It is a convex body
with the following properties:
• The projection of $P$ onto the $y_1$-axis is the interval $[-1, 1]$. This is because
  there is an up-set of $\preceq$ containing $a$ and not $b$, and also an up-set
  containing $b$ but not $a$, and thus $P$ has a vertex with $x_a = 1$, $x_b = 0$ and
  a vertex with $x_a = 0$, $x_b = 1$.
• We have $-\frac{1}{n+1} < c_1 < \frac{1}{n+1}$, since $c_1 = \frac{1}{n+1}(h(a) - h(b))$ and
  $|h(a) - h(b)| < 1$.

The proof of Theorem 12.3.1 is finished by showing that any compact
convex body $P \subset \mathbf{R}^n$ with these two properties satisfies

$$\mathrm{vol}(P_{\le}) \ge \frac{1}{2e}\,\mathrm{vol}(P) \quad\text{and}\quad \mathrm{vol}(P_{\ge}) \ge \frac{1}{2e}\,\mathrm{vol}(P),$$

where $P_{\le}$ is the part of $P$ in the half-space $\{y_1 \le 0\}$ and $P_{\ge}$ is the other
part.
For $t \in [-1, 1]$, let $P_t$ be the $(n-1)$-dimensional slice of $P$ by the hyperplane
$\{y_1 = t\}$, and let $r(t)$ be the equivalent radius of $P_t$, i.e., the radius of
an $(n-1)$-dimensional ball of volume $\mathrm{vol}_{n-1}(P_t)$. By Brunn's inequality for
slice volumes (Theorem 12.2.1), $r(t)$ is concave on $[-1, 1]$.
The $y_1$-coordinate of the center of gravity of $P$ can be expressed as

$$c_1 = \frac{1}{\mathrm{vol}(P)} \int_{-1}^{1} t \,\mathrm{vol}_{n-1}(P_t)\, dt$$

(imagine $P$ composed of thin plates perpendicular to the $y_1$-axis). Hence $c_1$ is
fully determined by the function $r(t)$. In other words, the shapes of the slices
of $P$ do not really matter; only their volumes do, and so we may imagine that
$P$ is a rotational body whose slice $P_t$ is an $(n-1)$-dimensional ball of radius
$r(t)$ centered at $(t, 0, \dots, 0)$.
We want to show that if $c_1(P) \ge -\frac{1}{n+1}$, then $\mathrm{vol}(P_{\ge}) \ge \frac{1}{2e}\mathrm{vol}(P)$. The
inequality for $\mathrm{vol}(P_{\le})$ follows by symmetry. The key step is to pass to another,
especially simple, rotational convex body $K$. The slice $K_t$ of $K$ has radius
$\kappa(t)$; the functions $\kappa(t)$ and $r(t)$ are schematically plotted below:

[Figure: graphs of $r(t)$ and $\kappa(t)$ over the $t$-axis, with marks at $-1$, $0$, $1$, and $u$, the points $Y$, $W$, $V$, $U$, and the regions labeled "-" and "+".]

The graph of the function $\kappa(t)$ consists of two linear segments, and so $K$ is
a double cone. First we construct the function $\kappa(t)$ for $t$ positive. Here the
graph is a segment starting at the point $V = (0, r(0))$ and ending at the point
$U = (u, 0)$. The number $u$ is chosen so that $\mathrm{vol}(K_{\ge}) = \mathrm{vol}(P_{\ge})$. Since $r(t)$ is
concave and $\kappa(t)$ is linear on $[0, u]$, we have $u \ge 1$. Moreover, as $t$ grows from
0 to 1, we first have $r(t) \ge \kappa(t)$, and then from some point on $r(t) \le \kappa(t)$.
This ensures that the center of gravity of $K_{\ge}$ is to the right of the center of
gravity of $P_{\ge}$ (we can imagine that $P_{\ge}$ is transformed into $K_{\ge}$ by peeling
off some mass in the region labeled "-" and moving it right, to the region
labeled "+").

Next, we define $\kappa(t)$ for $t < 0$. We extend the segment $UV$ to the left until
the (unique) point $W$ such that when $YWV$ is the graph of $\kappa(t)$ for negative
$t$, we have $\mathrm{vol}(K_{\le}) = \mathrm{vol}(P_{\le})$. As $t$ goes from 0 down to $-1$, $\kappa(t)$ is first above
$r(t)$ and then below it. This is because at $V$, the segment $WU$ decreases more
steeply than the function $r(t)$. Therefore, we also have $c_1(K_{\le}) \ge c_1(P_{\le})$, and
hence $c_1(K) \ge c_1(P) \ge -\frac{1}{n+1}$. So, as was noted above, it remains to show
that $\mathrm{vol}(K_{\ge}) \ge \frac{1}{2e}\mathrm{vol}(K)$, which is a more or less routine calculation.
We fix the notation as in the following picture:

[Figure: notation for the double cone $K$: the cones $K_1$ and $K_2$, their heights $h_1$ and $h_2$, and the offset $\Delta$.]

We note that $c_1(K)$ is a weighted average of $c_1(K_1)$ and $c_1(K_2)$; the weights
are the volumes of $K_1$ and $K_2$, whose ratio is $h_1 : h_2$. The center of gravity of
an $n$-dimensional cone is at $\frac{1}{n+1}$ of its height, and hence $c_1(K_1) = -\Delta - \frac{h_1}{n+1}$
and $c_1(K_2) = \frac{h_2}{n+1} - \Delta$. Therefore,

$$c_1(K) = \frac{h_1\left(-\frac{h_1}{n+1}\right) + h_2\left(\frac{h_2}{n+1}\right)}{h_1 + h_2} - \Delta = \frac{h_2 - h_1}{n+1} - \Delta.$$

We have $\Delta = 1 - h_1$, and so from the condition $c_1(K) \ge -\frac{1}{n+1}$ we obtain
$h_2 + n h_1 \ge n$. We substitute $h_1 = u - h_2 + 1$ and rearrange, which yields

$$\frac{u}{h_2} \ge 1 - \frac{1}{n}. \tag{12.2}$$
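The closed form for $c_1(K)$ can be sanity-checked numerically. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, the double cone is normalized so that its widest slice has radius 1 (the center of gravity does not depend on this scaling), and the integral is approximated by a midpoint rule. It compares the numerical value of $\int t\,\kappa(t)^{n-1}\,dt \,/ \int \kappa(t)^{n-1}\,dt$ with $\frac{h_2 - h_1}{n+1} - \Delta$, where $\Delta = 1 - h_1$.

```python
def double_cone_centroid_numeric(n, h1, h2, steps=20000):
    """Midpoint-rule approximation of the y_1-coordinate of the center of
    gravity of the double cone with apexes at t = -1 and t = u = h2 - delta,
    common base at t = -delta, where delta = 1 - h1."""
    delta = 1.0 - h1
    left, right = -1.0, h2 - delta
    dt = (right - left) / steps
    num = den = 0.0
    for i in range(steps):
        t = left + (i + 0.5) * dt
        if t < -delta:
            radius = (t - left) / h1      # left cone K_1
        else:
            radius = (right - t) / h2     # right cone K_2
        weight = radius ** (n - 1)        # slice volume up to a constant factor
        num += t * weight
        den += weight
    return num / den

def double_cone_centroid_formula(n, h1, h2):
    """Closed form (h2 - h1)/(n + 1) - delta with delta = 1 - h1."""
    return (h2 - h1) / (n + 1) - (1.0 - h1)
```

For instance, with $n = 3$, $h_1 = 0.5$, $h_2 = 2$ the closed form gives $\frac{1.5}{4} - 0.5 = -0.125$.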
We are interested in bounding $\mathrm{vol}(K_{\ge})$ from below. The cone $K_{\ge}$ is similar
to $K_2$, with ratio $u/h_2$. So

$$\mathrm{vol}(K_{\ge}) = \left(\frac{u}{h_2}\right)^{n} \mathrm{vol}(K_2) = \left(\frac{u}{h_2}\right)^{n} \frac{h_2}{h_1 + h_2}\,\mathrm{vol}(K) = \frac{u}{u+1}\left(\frac{u}{h_2}\right)^{n-1} \mathrm{vol}(K).$$

Now we substitute for $u/h_2$ from (12.2), obtaining

$$\mathrm{vol}(K_{\ge}) \ge \frac{u}{u+1}\left(1 - \frac{1}{n}\right)^{n-1} \mathrm{vol}(K).$$

Finally, $\frac{u}{u+1} \ge \frac{1}{2}$ (as $u \ge 1$) and $(1 - \frac{1}{n})^{n-1} > e^{-1}$ for all $n$, so $\mathrm{vol}(K_{\ge}) \ge
\frac{1}{2e}\mathrm{vol}(K)$ follows. □

Bibliography and remarks. The statement of the efficient comparison
theorem with $\delta = \frac{1}{3}$, known as the "$\frac{1}{3}$-$\frac{2}{3}$ conjecture," was conjectured
by Kislitsyn [Kis68] and, later but independently, by Fredman
(unpublished) and by Linial [Lin84]. In this strongest possible form it
remains a challenging open problem in the theory of partially ordered
sets (see Trotter [Tro92], [Tro95] for overviews of this interesting area).
The problem of sorting with partial information was considered
by Fredman [Fre76], who proved that any $n$-element partially ordered
set $(X, \preceq)$ can be sorted by at most $\log_2(e(\preceq)) + 2n$ comparisons.
This is optimal unless $e(\preceq)$ is only subexponential in $n$. The efficient
comparison theorem was first proved, with $\delta = \frac{3}{11} \approx 0.2727$,
by Kahn and Saks [KS84]. Their proof is quite complicated, and instead
of the Brunn-Minkowski inequality it employs the more powerful
Aleksandrov-Fenchel inequality. The constant $\frac{3}{11}$ is optimal for their
approach, in the sense that if $a$ and $b$ are elements of a poset such that
$|h(a) - h(b)| < 1$, then the comparison of $a$ and $b$ generally need not
reduce the number of linear extensions by any better ratio.
The simpler proof presented in this section is due to Kahn and
Linial [KL91], and a similar one, with a slightly worse $\delta$, was found
by Karzanov and Khachiyan; see [Kha89]. The method is inspired
by proofs of a result about splitting a convex body by a hyperplane
passing exactly through the center of gravity (Exercise 3), proved by
Grünbaum [Grü60] (see [KL91] for more remarks on the history).
Observation 12.3.4, on which all the proofs of Theorem 12.3.1 are based,
is from Linial [Lin84].
The current best value of $\delta = (5 - \sqrt{5})/10 \approx 0.2764$ was achieved by
Brightwell, Felsner, and Trotter [BFT95]. They extend the Kahn-Saks
method, and instead of two elements $a$ and $b$ with $|h(a) - h(b)| < 1$,
they consider three elements $a, b, c$ with $h(a) \le h(b) \le h(c) \le h(a) + 2$.
Interestingly, they also construct an infinite (countable) poset for
which their value of $\delta$ is optimal (and so the natural infinite analogue
of the $\frac{1}{3}$-$\frac{2}{3}$ conjecture is false). In order to formulate this result, one
needs a probability measure on the set of all linear extensions of the
considered poset. Their poset is thin, meaning that the maximum size
of an antichain is bounded by a constant, and the probability measure
is obtained by taking a limit over a sequence of finite intervals in the
poset.
The proofs of the efficient comparison theorem do not provide
an efficient algorithm for actually computing suitable elements $a, b$.
General methods for estimating the volume of convex bodies, mentioned
in Section 13.2, yield a polynomial-time randomized algorithm.
Kahn and Kim [KK95] gave a deterministic polynomial-time adaptive
sorting procedure that sorts any given $n$-element poset $(X, \preceq)$
by $O(\log(e(\preceq)))$ comparisons. We at least mention some interesting
concepts in their algorithm. Instead of the order polytope, they consider
the chain polytope; this is the convex hull of the characteristic
vectors of all antichains in $(X, \preceq)$. Equivalently, it is the stable set
polytope $\mathrm{STAB}(G)$ (see Section 12.1) of the comparability graph $G$
of $(X, \preceq)$, where $G = G(\preceq) = (X, \{\{x, y\} : x \prec y \text{ or } y \prec x\})$. As
was shown by Stanley [Sta86], the chain polytope has the same volume
as the order polytope. The next key notion is the entropy of a
graph. For a given graph $G = (V, E)$ and a probability distribution
$p\colon V \to [0, 1]$ on its vertices, the entropy $H(G, p)$ can be defined as
$\min_{x \in \mathrm{STAB}(G)} \left(-\sum_{v \in V} p_v \log_2 x_v\right)$ (there are several equivalent
definitions). Graph entropy was introduced by Körner [Kor73], and he
and his coworkers achieved remarkable results in extremal set theory
and related fields using this concept (see, e.g., Gargano, Körner, and
Vaccaro [GKV94]). The entropy can be approximated in deterministic
polynomial time, and the adaptive sorting algorithm of Kahn and Kim
chooses the next comparison as one that increases the entropy of the
comparability graph as much as possible (this need not always be an
"efficient comparison" in the sense of Theorem 12.3.1).

Exercises
1. Let $(X, \preceq)$ be a finite poset. Prove that if $\preceq$ is not a linear ordering, then
there always exist $a, b \in X$ with $|h(a) - h(b)| < 1$.
2. Show that the center of gravity of a simplex with vertices $a_0, a_1, \dots, a_d$
is the same as the center of gravity of its vertex set.
3. Let $K$ be a bounded convex body in $\mathbf{R}^n$, $h$ a hyperplane passing through
the center of gravity of $K$, and $K_1$ and $K_2$ the parts into which $K$ is
divided by $h$.
(a) Prove that $\mathrm{vol}(K_1), \mathrm{vol}(K_2) \ge \left(\frac{n}{n+1}\right)^n \mathrm{vol}(K)$.
(b) Show that the bound in (a) cannot be improved in general.
13

Volumes in High Dimension

We begin with comparing the volume of the n-dimensional cube with the
volume of the unit ball inscribed in it, in order to realize that volumes of
"familiar" bodies behave quite differently in high dimensions from what the
3-dimensional intuition suggests. Then we calculate that any convex polytope
in the unit ball $B^n$ whose number of vertices is at most polynomial in $n$
occupies only a tiny fraction of $B^n$ in terms of volume. This has interesting
consequences for deterministic algorithms for approximating the volume of
a given convex body: If they look only at polynomially many points of the
considered body, then they are unable to distinguish a gigantic ball from a
tiny polytope. Finally, we prove a classical result, John's lemma, which states
that for every $n$-dimensional symmetric convex body $K$ there are two similar
ellipsoids with ratio $\sqrt{n}$ such that the smaller ellipsoid lies inside $K$ and the
larger one contains $K$. So, in a very crude scale where the ratio $\sqrt{n}$ can be
ignored, each symmetric convex body looks like an ellipsoid.
Besides presenting nice and important results, this chapter could help
the reader in acquiring proficiency and intuition in geometric computations,
which are skills obtainable mainly by practice. Several calculations of non-
trivial length are presented in detail, and while some parts do not require
any great ideas, they still contain useful small tricks.

13.1 Volumes, Paradoxes of High Dimension, and Nets
In the next section we are going to estimate the volumes of various convex
polytopes. Here we start, more modestly, with the volumes of the simplest
bodies.
The ball in the cube. Let $V_n$ denote the volume of the $n$-dimensional ball
$B^n$ of unit radius. A neat way of calculating $V_n$ is indicated in Exercise 2;
the result, which can be verified in various other ways and found in many
books of formulas, is

$$V_n = \frac{\pi^{n/2}}{\Gamma\left(\frac{n}{2} + 1\right)}.$$

Here $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt$ is the usual gamma function, with $\Gamma(k+1) = k!$
for natural numbers $k$.
Let us compare the volume of the unit cube $[0, 1]^n$ with that of the inscribed
ball (of radius $\frac{1}{2}$).

(Using Exercise 1, the reader may want to add the crosspolytope inscribed
in both bodies to the comparison.) For dimension $n = 3$, the volume of
the ball is about 0.52, but for $n = 11$ it is already less than $10^{-3}$. Using
Stirling's formula, we find that it behaves roughly like $\left(\frac{\pi e}{2n}\right)^{n/2}$. For large $n$,
the inscribed ball is thus like a negligible dust particle in the cube, as far as
the volume is concerned.
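These numbers are easy to reproduce. The following Python sketch (the function names are chosen for this illustration only) evaluates the volume $\pi^{n/2} / (\Gamma(\frac{n}{2}+1)\, 2^n)$ of the ball of radius $\frac{1}{2}$ inscribed in the unit cube, using the standard library function `math.gamma`, and the rough estimate $(\pi e / 2n)^{n/2}$, which ignores the polynomial factor in Stirling's formula.

```python
import math

def unit_ball_volume(n):
    """Volume V_n of the n-dimensional ball of unit radius."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def inscribed_ball_volume(n):
    """Volume of the ball of radius 1/2 inscribed in the unit cube [0,1]^n."""
    return unit_ball_volume(n) / 2 ** n

def rough_estimate(n):
    """The expression (pi*e/(2n))^(n/2); it overestimates the true volume
    because the factor sqrt(pi*n) from Stirling's formula is dropped."""
    return (math.pi * math.e / (2 * n)) ** (n / 2)

# Print the exact value and the rough estimate for a few dimensions.
for n in (3, 10, 11, 20):
    print(n, inscribed_ball_volume(n), rough_estimate(n))
```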
This can be experienced if one tries to generate random points uniformly
distributed in the unit ball $B^n$. A straightforward method is first to generate
a random point $x$ in the cube $[-1, 1]^n$, by producing $n$ independent random
numbers $x_1, x_2, \dots, x_n \in [-1, 1]$. If $\|x\| > 1$, then $x$ is discarded and the
experiment is repeated, and if $\|x\| \le 1$, then $x$ is the desired random point
in the unit ball. This works reasonably in dimensions below 10, say, but in
dimension 20, we expect about 40 million discarded points for each accepted
point, and the method is rather useless.
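The rejection method just described can be sketched in a few lines of Python. This is a minimal illustration, not a recommended sampler; the function names and the `max_tries` safeguard are assumptions of the sketch. The second function computes the expected number of discarded points per accepted point, namely $2^n / V_n - 1$, from the acceptance probability $V_n / 2^n$.

```python
import math
import random

def sample_ball_by_rejection(n, rng, max_tries=1_000_000):
    """Draw points uniformly from the cube [-1,1]^n until one falls into
    the unit ball; return that point and the number of draws used."""
    for tries in range(1, max_tries + 1):
        x = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        if sum(c * c for c in x) <= 1.0:
            return x, tries
    raise RuntimeError("no point accepted; the dimension is probably too high")

def expected_rejections(n):
    """Expected number of discarded points per accepted point."""
    acceptance = math.pi ** (n / 2) / math.gamma(n / 2 + 1) / 2 ** n
    return 1.0 / acceptance - 1.0

point, tries = sample_ball_by_rejection(3, random.Random(1))
print(tries, expected_rejections(20))   # the second value is about 4.06e7
```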
Another way of comparing the ball and the cube is to picture the sizes of
the $n$-dimensional ball having the same volume as the unit cube:

[Figure: balls having the same volume as the unit cube, for $n = 2$, $n = 10$, and $n = 50$.]

For large $n$, the radius grows approximately like $0.24\sqrt{n}$. This indicates that
the $n$-dimensional unit cube is actually quite a huge body; for example, its
diameter (the length of the longest diagonal) is $\sqrt{n}$. Here is another example
illustrating the largeness of the unit cube quite vividly.

Balls enclosing a ball. Place balls of radius $\frac{1}{2}$ into each of the $2^n$ vertices
of the unit cube $[0, 1]^n$ so that they touch along the edges of the cube, and
consider the ball concentric with the cube and just touching the other balls:

Obviously, this ball is quite small, and it is fully contained in the cube, right?
No: Already for n = 5 it starts protruding out through the facets.
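The threshold $n = 5$ follows from a one-line computation: the center of the cube is at distance $\frac{\sqrt{n}}{2}$ from each vertex, so the central ball has radius $\frac{\sqrt{n}}{2} - \frac{1}{2}$, and it sticks out of the cube exactly when this exceeds $\frac{1}{2}$, i.e., when $n > 4$. A small Python check (with function names chosen for this illustration):

```python
import math

def central_ball_radius(n):
    """Radius of the ball centered at the center of [0,1]^n that touches
    the balls of radius 1/2 placed at the vertices."""
    return math.sqrt(n) / 2 - 0.5

def first_protruding_dimension():
    """Smallest n for which the central ball reaches beyond the facets,
    which lie at distance 1/2 from the center of the cube."""
    n = 1
    while central_ball_radius(n) <= 0.5:
        n += 1
    return n

print(first_protruding_dimension())
```

In dimension 4 the central ball has radius exactly $\frac{1}{2}$ and touches the facets of the cube; from dimension 5 on it protrudes.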
Proper pictures. If a planar sketch of a high-dimensional convex body
should convey at least a partially correct intuition about the distribution of
the mass, say for the unit cube, it is perhaps best to give up the convexity
in the drawing! According to Milman [Mil98], a "realistic" sketch of a high-
dimensional convex body might look like this:

Strange central sections: the Busemann-Petty problem. Let $K$ and
$L$ be convex bodies in $\mathbf{R}^n$ symmetric about 0, and suppose that for every
hyperplane $h$ passing through 0, we have $\mathrm{vol}_{n-1}(K \cap h) \le \mathrm{vol}_{n-1}(L \cap h)$. It
seems very plausible that this should imply $\mathrm{vol}(K) \le \mathrm{vol}(L)$; this conjecture
of Busemann and Petty used to be widely believed (after all, it was known
that if the volumes of the sections are equal for all $h$, then $K = L$). But as it
turned out, it is true only for $n \le 4$, while in dimensions $n \ge 5$ it can fail! In
fact, for large dimensions, one of the counterexamples is the unit cube and
the ball of an appropriate radius: It is known that all sections of the unit
cube have volume at most $\sqrt{2}$, while in large dimensions, the unit-volume
ball has sections of volume about $\sqrt{e}$.
Nets in a sphere. We conclude this section by introducing a generally
useful tool. Let $S^{n-1} = \{x \in \mathbf{R}^n : \|x\| = 1\}$ denote the unit sphere in $\mathbf{R}^n$
(note that $S^2$ is the 2-dimensional sphere living in $\mathbf{R}^3$). We are given a
number $\eta > 0$, and we want to place a reasonably small finite set $N$ of points
on $S^{n-1}$ in such a way that each $x \in S^{n-1}$ has some point of $N$ at distance
no larger than $\eta$. Such an $N$ is called $\eta$-dense in $S^{n-1}$. For example, the
set $N = \{e_1, -e_1, \dots, e_n, -e_n\}$ of the $2n$ orthonormal unit vectors of the
standard basis is $\sqrt{2}$-dense. But it is generally difficult to find good explicit
constructions for arbitrary $\eta$ and $n$. The following simple but clever existential
argument yields an $\eta$-dense set whose size has essentially the best possible
order of magnitude.
Let us call a subset $N \subseteq S^{n-1}$ $\eta$-separated if every two distinct points of
$N$ have (Euclidean) distance greater than $\eta$. In a sense, this is opposite to
being $\eta$-dense.
In order to construct a small $\eta$-dense set, we start with the empty set
and keep adding points one by one. The trick is that we do not worry about
$\eta$-density along the way, but we always keep the current set $\eta$-separated.
Clearly, if no more points can be added, the current set must be $\eta$-dense.
The result of this algorithm is called an $\eta$-net.¹ That is, $N \subseteq S^{n-1}$ is an
$\eta$-net if it is an inclusion-maximal $\eta$-separated subset of $S^{n-1}$; i.e., if $N$ is
$\eta$-separated but $N \cup \{x\}$ is not $\eta$-separated for any $x \in S^{n-1} \setminus N$. (These
definitions apply to an arbitrary metric space in place of $S^{n-1}$.) A volume
argument bounds the maximum size of an $\eta$-net.

13.1.1 Lemma (Size of $\eta$-nets in the sphere). For each $\eta \in (0, 1]$, any
$\eta$-net $N \subseteq S^{n-1}$ satisfies

$$|N| \le \left(\frac{4}{\eta}\right)^{n}.$$

Later on, we will check that for $\eta$ small, no $\eta$-dense set can be much
smaller (Exercise 14.1.3).
Proof. For each $x \in N$, consider the ball of radius $\frac{\eta}{2}$ centered at $x$. These
balls are all disjoint, and they are contained in the ball $B(0, 1+\eta) \subseteq B(0, 2)$.
Therefore, $\mathrm{vol}(B(0, 2)) \ge |N| \,\mathrm{vol}(B(0, \frac{\eta}{2}))$, and since $\mathrm{vol}(B(0, r))$ in $\mathbf{R}^n$ is
proportional to $r^n$, the lemma follows. □
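The greedy construction of an $\eta$-separated set is easy to try out. The Python sketch below is a finite stand-in and rests on an assumption not made in the text: it works with a finite list of random candidate points on the sphere rather than with the whole sphere. The resulting set is therefore $\eta$-separated and $\eta$-dense with respect to the candidates, but it is a true $\eta$-net of $S^{n-1}$ only if no further point of the sphere can be added. The names `random_unit_vector` and `greedy_separated_set` are hypothetical.

```python
import math
import random

def random_unit_vector(n, rng):
    """A random point of the unit sphere S^(n-1) (normalized Gaussian vector)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def greedy_separated_set(candidates, eta):
    """Scan the candidates once and keep a point whenever it is at distance
    greater than eta from all points kept so far."""
    net = []
    for x in candidates:
        if all(math.dist(x, y) > eta for y in net):
            net.append(x)
    return net

rng = random.Random(0)
n, eta = 3, 0.5
candidates = [random_unit_vector(n, rng) for _ in range(2000)]
net = greedy_separated_set(candidates, eta)
print(len(net), (4 / eta) ** n)   # size found versus the bound of Lemma 13.1.1
```

Every rejected candidate was within distance $\eta$ of a point kept earlier, which is exactly the reason why a maximal $\eta$-separated set is $\eta$-dense.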

Bibliography and remarks. Most of the material of this section is
well known and standard. As for the Busemann-Petty problem, which
we are not going to pursue any further in this book, information can be
found, e.g., in Gardner, Koldobsky, and Schlumprecht [GKS99] (recent
unified solution for all dimensions), in Ball [Bal], or in the Handbook
of Convex Geometry [GW93].

Exercises
1. Calculate the volume of the $n$-dimensional crosspolytope, i.e., the convex
hull of $\{e_1, -e_1, \dots, e_n, -e_n\}$, where $e_i$ is the $i$th vector in the standard
basis of $\mathbf{R}^n$.
2. (Ball volume via the Gaussian distribution)
(a) Let $I_n = \int_{\mathbf{R}^n} e^{-\|x\|^2}\, dx$, where $\|x\| = (x_1^2 + \cdots + x_n^2)^{1/2}$ is the Euclidean
norm. Express $I_n$ using $I_1$.
(b) Express $I_n$ using $V_n = \mathrm{vol}(B^n)$ and a suitable one-dimensional integral,
by considering the contribution to $I_n$ of the spherical shell with
inner radius $r$ and outer radius $r + dr$.
(c) Calculate $I_n$ by using (b) for $n = 2$ and (a).
(d) Integrating by parts, set up a recurrence and calculate the integral
appearing in (b). Compute $V_n$.
This calculation appears in Pisier [Pis89] (also see Ball [Bal97]).
3. Let $X \subset S^{n-1}$ be such that every two points of $X$ have (Euclidean)
distance at least $\sqrt{2}$. Prove that $|X| \le 2n$.

¹ Not to be confused with the notion of $\varepsilon$-net considered in Chapter 10; unfortunately,
the same name is customarily used for two rather unrelated concepts.

13.2 Hardness of Volume Approximation


The theorem in this section can be regarded as a variation on one of the
"paradoxes of high dimension" mentioned in the previous section, namely,
that the volume of the ball inscribed in the unit cube becomes negligible as
the dimension grows. The theorem addresses a dual situation: the volume of
a convex polytope inscribed in the unit ball.
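13.2.1 Theorem. Let $B^n$ denote the unit ball in $\mathbf{R}^n$, and let $P$ be a convex
polytope contained in $B^n$ and having at most $N$ vertices. Then

$$\frac{\mathrm{vol}(P)}{\mathrm{vol}(B^n)} \le \left(\frac{C \ln\left(\frac{N}{n} + 1\right)}{n}\right)^{n/2}$$

with an absolute constant $C$.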


Thus, unless the number of vertices is exponential in n, the polytope is
very tiny compared to the ball.
For $N > n e^{n/C}$, the bound in the theorem is greater than 1, and so it
makes little sense, since we always have $\mathrm{vol}(P) \le \mathrm{vol}(B^n)$. Thus, a reasonable
range of $N$ is $n+1 \le N \le e^{cn}$ for some positive constant $c > 0$. It turns out
that the bound is tight in this range, up to the value of $C$, as discussed in
the next section. This may be surprising, since the elementary proof below
makes seemingly quite rough estimates.
Let us remark that the weaker bound

$$\frac{\mathrm{vol}(P)}{\mathrm{vol}(B^n)} \le \left(\frac{C \ln N}{n}\right)^{n/2} \tag{13.1}$$

is somewhat easier to prove than the one in Theorem 13.2.1. The difference
between these two bounds is immaterial for $N > n^2$, say. It becomes significant,
for example, for comparing the largest possible volume of a polytope
in $B^n$ with $n \log n$ vertices with the volume of the largest simplex in $B^n$.
Application to hardness of volume approximation. Computing or
estimating the volume of a given convex body in $\mathbf{R}^n$, with $n$ large, is a
fundamental algorithmic problem. Many combinatorial counting problems
can be reduced to it, such as counting the number of linear extensions of a
given poset, as we saw in Section 12.3. Since many of these counting problems
are computationally intractable, one cannot expect to compute the volume
precisely, and so approximation up to some multiplicative factor is sought.
It turns out that no polynomial-time deterministic algorithm can gener-
ally achieve approximation factor better than exponential in the dimension.
A concrete lower bound, derived with the help of Theorem 13.2.1, is $(cn/\log n)^n$.
This can also be almost achieved: An algorithm is known with factor $(c'n)^n$.
In striking contrast to this, there are randomized polynomial-time algorithms
that can approximate the volume within a factor of $(1+\varepsilon)$ for each
fixed $\varepsilon > 0$ with high probability. Here "randomized" means that the algorithm
makes random decisions (like coin tosses) during its computation; it
does not imply any randomness of the input. These are marvelous develop-
ments, but they are not treated in this book. We only briefly explain the
relation of Theorem 13.2.1 to the deterministic volume approximation.
To understand this connection, one needs to know how the input con-
vex body is presented to an algorithm. A general convex body cannot be
exactly described by finitely many parameters, so caution is certainly neces-
sary. One way of specifying certain convex bodies, namely, convex polytopes,
is to give them as convex hulls of finite point sets (V-presentation) or as in-
tersections of finite sets of half-spaces (H-presentation). But there are many
other computationally important convex bodies that are not polytopes, or
have no polynomial-size V-presentation or H-presentation. We will meet an
example in Section 15.5, where the convex body lives in the space of $n \times n$
real matrices and is the intersection of a polytope with the cone consisting
of all positive semidefinite matrices.
In order to abstract the considerations from the details of the presentation
of the input body, the oracle model was introduced for computation with
convex bodies. If $K \subset \mathbf{R}^n$ is a convex body, a membership oracle for $K$ is,
roughly speaking, an algorithm (subroutine, black box) that for any given
input point $x \in \mathbf{R}^n$ outputs YES if $x \in K$ and NO if $x \notin K$.
This is simplified, because in order to be able to compute with the body,
one needs to assume more. Namely, $K$ should contain a ball $B(0, r)$ and
be contained in a ball $B(0, R)$, where $R$ and $r > 0$ are written using at
most polynomially many digits. On the other hand, the oracle need not (and
often cannot) be exact, so a wrong answer is allowed for points very close to
the boundary. These are important but rather technical issues, and we will
ignore them. Let us note that a polynomial-time membership oracle can be
constructed for both V-presented and H-presented polytopes, as well as for
many other bodies.
Let us now assume that a deterministic algorithm approximates the volume
of each convex body given by a suitable membership oracle. First we
call the algorithm with $K = B^n$, the unit ball. The algorithm asks the oracle
about some points $\{x_1, x_2, \dots, x_N\}$, gets the correct answers, and outputs
an estimate for $\mathrm{vol}(B^n)$. Next, we call the algorithm with the body
$K = \mathrm{conv}(\{x_1, x_2, \dots, x_N\} \cap B^n)$. The answers of the oracle are exactly the
same, and since the algorithm has no other information about the body $K$
and it is deterministic, it has to output the same volume estimate as it did
for $B^n$. But by Theorem 13.2.1, $\mathrm{vol}(B^n)/\mathrm{vol}(K) \ge (cn/\ln(N/n+1))^{n/2}$, and
so the error of the approximation must be at least this factor. If $N$, the number
of oracle calls, is polynomial in $n$, it follows that the error is at least
$(c'n/\log n)^{n/2}$.
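The indistinguishability argument can be replayed on a toy example. Everything in the following Python sketch is an assumption made for illustration: `toy_volume_algorithm` is a hypothetical (and useless) deterministic procedure that probes a fixed list of points, and the second oracle is not a full membership test for the convex hull. It answers only on points that were already queried, where membership in $K = \mathrm{conv}(\{x_1, \dots, x_N\} \cap B^n)$ is clear: accepted query points are generators of $K$, and rejected ones lie outside $B^n \supseteq K$.

```python
def ball_oracle(x):
    """Membership oracle for the unit ball B^n."""
    return sum(c * c for c in x) <= 1.0

def toy_volume_algorithm(oracle, n):
    """Hypothetical deterministic 'algorithm': it queries fixed points on the
    coordinate axes and returns a value depending only on the answers."""
    queries = []
    for i in range(n):
        for scale in (0.5, 1.0, 1.5):
            for sign in (1.0, -1.0):
                x = [0.0] * n
                x[i] = sign * scale
                queries.append(tuple(x))
    answers = [oracle(x) for x in queries]
    estimate = sum(answers) / len(answers)   # any function of the answers
    return estimate, queries, answers

def hull_oracle_on_queries(accepted_points):
    """Oracle for K = conv(accepted_points), valid ONLY on the query points
    of the first run (see the lead-in for why this restriction suffices)."""
    accepted = set(accepted_points)
    return lambda x: x in accepted

n = 4
est_ball, queries, answers_ball = toy_volume_algorithm(ball_oracle, n)
accepted = [x for x, inside in zip(queries, answers_ball) if inside]
est_hull, _, answers_hull = toy_volume_algorithm(hull_oracle_on_queries(accepted), n)
print(est_ball == est_hull, answers_ball == answers_hull)
```

Both runs see the same answers and therefore return the same estimate, although the two bodies have very different volumes.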
By more refined consideration, one can improve the lower bound to approximately
the square of the quantity just given. The idea is to input the
dual body $K^*$ into the algorithm, too, for which it gets the same answers, and
then use a deep result (the inverse Blaschke-Santaló inequality) stating that
$\mathrm{vol}(K)\,\mathrm{vol}(K^*) \ge c^n/n!$ for any centrally symmetric $n$-dimensional convex
body $K$, with an absolute constant $c > 0$ (some technical steps are omitted
here). This improvement is interesting because, as was remarked above,
for symmetric convex bodies it almost matches the performance of the best
known algorithm.
Idea of the proof of Theorem 13.2.1. Let $V$ be the set of vertices of the
polytope $P \subset B^n$, $|V| = N$. We choose a suitable parameter $k < n$ and prove
that for every $x \in P$, there is a $k$-tuple $J$ of points of $V$ such that $x$ is close to
$\mathrm{conv}(J)$. Then $\mathrm{vol}(P)$ is simply estimated as $\binom{N}{k}$ times the maximum possible
volume of the appropriate neighborhood of the convex hull of $k$ points in $B^n$.
Here is the first step towards realizing this program.
13.2.2 Lemma. Let $S$ in $\mathbf{R}^n$ be an $n$-dimensional simplex, i.e., the convex
hull of $n+1$ affinely independent points, and let $R = R(S)$ and $\rho = \rho(S)$ be
the circumradius and inradius of $S$, respectively, that is, the radius of the
smallest enclosing ball and of the largest inscribed ball. Then $\frac{R}{\rho} \ge n$.

Proof. We first sketch the proof of an auxiliary claim: Among all simplices
contained in $B^n$, the regular simplex inscribed in $B^n$ has the largest volume.
The volume of a simplex is proportional to the $(n-1)$-dimensional volume of
its base times the corresponding height. It follows that in a maximum-volume
simplex $S$ inscribed in $B^n$, the hyperplane passing through a vertex $v$ of $S$
and parallel to the facet of $S$ not containing $v$ is tangent to $B^n$, for otherwise,
$v$ could be moved to increase the height:

It can be easily shown (Exercise 2) that this property characterizes the regular
simplex (so the regular simplex is even the unique maximum).
Another, slightly more difficult, argument shows that if $S$ is a simplex
of minimum volume circumscribed about $B^n$, then each facet of $S$ touches
$B^n$ at its center of gravity (Exercise 3), and it follows that the volume is
minimized by the regular simplex circumscribed about $B^n$.
Let $S_0 \subset B^n$ be a simplex. We consider two auxiliary regular simplices
$S_1$ and $S_2$, where $S_1$ is inscribed in $B^n$ and $S_2$ satisfies $\mathrm{vol}(S_2) = \mathrm{vol}(S_0)$.
Since $\mathrm{vol}(S_1) \ge \mathrm{vol}(S_0) = \mathrm{vol}(S_2)$, $S_1$ is at least as big as $S_2$, and so $\rho(S_0) \le
\rho(S_2) \le \rho(S_1)$. A calculation shows that $\rho(S_1) = \frac{1}{n}$ (Exercise 1(a)). □

Let $F$ be a $j$-dimensional simplex in $\mathbf{R}^n$. We define the orthogonal
$\rho$-neighborhood $F_\rho$ of $F$ as the set of all $x \in \mathbf{R}^n$ for which there is a $y \in F$ such
that the segment $xy$ is orthogonal to $F$ and $\|x - y\| \le \rho$. The next drawing
shows orthogonal neighborhoods in $\mathbf{R}^3$ of a 1-simplex and of a 2-simplex:

The orthogonal $\rho$-neighborhood of $F$ can be expressed as the Cartesian product
of $F$ with a $\rho$-ball of dimension $n-j$, and so
$\mathrm{vol}_n(F_\rho) = \mathrm{vol}_j(F) \cdot \rho^{n-j} \cdot \mathrm{vol}_{n-j}(B^{n-j})$.
13.2.3 Lemma. Let $S$ be an $n$-dimensional simplex contained in $B^n$, let
$x \in S$, and let $k$ be an integer parameter, $1 \le k \le n$. Then there is a $k$-tuple
$J$ of affinely independent vertices of $S$ such that $x$ lies in the orthogonal
$\rho$-neighborhood of $\mathrm{conv}(J)$, where

$$\rho = \rho(n, k) = \left(\sum_{i=k}^{n} \frac{1}{i^2}\right)^{1/2}.$$

Proof. We proceed by induction on $n - k$. For $n = k$, this is Lemma 13.2.2:
Consider the largest ball centered at $x$ and contained in $S$; it has radius at
most $\frac{1}{n}$, it touches some facet $F$ of $S$ at a point $y$, and the segment $xy$ is
perpendicular to $F$, witnessing $x \in F_{1/n}$.
For $k < n$, using the case $k = n$, let $S'$ be a facet of $S$ and $x' \in S'$ a point
at distance at most $\frac{1}{n}$ from $x$ with $xx' \perp S'$. By the inductive assumption,
we find a $(k-1)$-face $F$ of $S'$ and a point $y \in F$ with $\|x' - y\| \le \rho(n-1, k)$
and $x'y \perp F$. Here is an illustration for $n = 3$ and $k = 2$:

Then $xx' \perp x'y$ (because the whole of $S'$ is perpendicular to $xx'$), and so
$\|x - y\|^2 = \|x - x'\|^2 + \|x' - y\|^2 \le \rho(n, k)^2$. Finally, $xy \perp F$, since both
the vectors $x' - y$ and $x - x'$ lie in the orthogonal complement of the linear
subspace generated by $F - y$. □
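The induction adds one term at a time: the case $k = n$ contributes $\rho(n, n) = \frac{1}{n}$, and each passage to a facet adds the squared distance $\frac{1}{n^2}$, so that $\rho(n, k)^2 = \frac{1}{n^2} + \rho(n-1, k)^2 = \sum_{i=k}^{n} \frac{1}{i^2}$. The short Python sketch below (the function name `rho` is chosen for this illustration) evaluates this quantity; it also compares it with the elementary telescoping bound $\sum_{i=k}^{n} \frac{1}{i^2} \le \sum_{i=k}^{n} \frac{1}{i(i-1)} < \frac{1}{k-1}$ for $k \ge 2$, which is a standard estimate added here as an assumption of the sketch.

```python
import math

def rho(n, k):
    """rho(n, k) = (sum_{i=k}^{n} 1/i^2)^(1/2), for 1 <= k <= n."""
    return math.sqrt(sum(1.0 / (i * i) for i in range(k, n + 1)))

# The recursion used in the induction step, and the telescoping bound.
for n, k in [(10, 10), (10, 4), (100, 20)]:
    print(n, k, rho(n, k), 1.0 / math.sqrt(max(k - 1, 1)))
```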

Proof of Theorem 13.2.1. By Carathéodory's theorem and Lemma 13.2.3,
$P = \mathrm{conv}(V)$ is covered by the union of all the orthogonal $\rho$-neighborhoods
$\mathrm{conv}(J)_\rho$, $J \in \binom{V}{k}$, where $\rho = \rho(n, k)$ is as in the lemma. The maximum
$(k-1)$-dimensional volume of $\mathrm{conv}(J)$ is no larger than the $(k-1)$-dimensional
volume of the regular $(k-1)$-simplex inscribed in $B^{k-1}$, which is

$$M(k-1) = \left(\frac{k}{k-1}\right)^{(k-1)/2} \frac{\sqrt{k}}{(k-1)!};$$

see Exercise 1(b). (If we only want to prove the weaker estimate (13.1) and do
not care about the value of $C$, then $M(k-1)$ can also be trivially estimated
by $\mathrm{vol}_{k-1}(B^{k-1})$ or even by $2^{k-1}$.)
What remains is calculation. We have

$$\frac{\mathrm{vol}(P)}{\mathrm{vol}(B^n)} \le \binom{N}{k} \cdot M(k-1) \cdot \rho(n, k)^{n-k+1} \cdot \frac{\mathrm{vol}_{n-k+1}(B^{n-k+1})}{\mathrm{vol}(B^n)}. \tag{13.2}$$

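We first estimate

$$\rho(n, k)^2 = \sum_{i=k}^{n} \frac{1}{i^2} \le \sum_{i=k}^{n} \frac{1}{i(i-1)} = \frac{1}{k-1} - \frac{1}{n} < \frac{1}{k-1}.$$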
We now set

$$k = \left\lfloor \frac{n}{\ln\left(\frac{N}{n} + 1\right)} \right\rfloor$$

(for obtaining the weaker estimate (13.1), the simpler value $k = \lfloor \frac{n}{\ln N} \rfloor$ is
more convenient). We may assume that $\ln N$ is much smaller than $n$, for
otherwise, the bound in the theorem is trivially valid, and so $k$ is larger than
any suitable constant. In particular, we can ignore the integer part in the
definition of $k$.
For estimating the various terms in (13.2), it is convenient to work with
the natural logarithm of the quantities. The logarithm of the bound we are
heading for is $\frac{n}{2}\left(\ln\ln(\frac{N}{n}+1) - \ln n + O(1)\right)$, and so terms up to $O(n)$ can
be ignored if we do not care about the value of the constant $C$. Further, we
find that $k \ln k = k \ln n - k \ln\ln(\frac{N}{n}+1) = k \ln n + O(n)$. This is useful for
estimating $\ln(k!) = k \ln k - O(k) = k \ln n - O(n)$.
Now, we can bound the logarithms of the terms in (13.2) one by one.
We have $\ln\binom{N}{k} \le k \ln N - \ln(k!) = k(\ln(\frac{N}{n}) + \ln n) - \ln(k!) \le n + k \ln n -
k \ln n + O(n) = O(n)$; this term is negligible. Next, $\ln M(k-1)$ contributes
about $-\ln(k!) = -k \ln n + O(n)$. The main contribution comes from the term
$\ln \rho(n, k)^{n-k+1} \le -(n-k) \ln \sqrt{k} + O(n) = \frac{n}{2}\left(-\ln n + \ln\ln(\frac{N}{n}+1)\right) + \frac{k}{2} \ln n +
O(n)$. Finally, $\ln\left(\mathrm{vol}_{n-k+1}(B^{n-k+1})/\mathrm{vol}(B^n)\right) = \ln\left(\Gamma(\frac{n}{2}+1)/\Gamma(\frac{n-k+1}{2}+1)\right) +
O(n) \le \ln n^{k/2} + O(n) = \frac{k}{2} \ln n + O(n)$. The term $-k \ln n$ originating from
$M(k-1)$ cancels out nicely with the two terms $\frac{k}{2} \ln n$, and altogether we
obtain $\frac{n}{2}\left(-\ln n + \ln\ln(\frac{N}{n}+1) + O(1)\right)$ as claimed in the theorem. □

Bibliography and remarks. Our presentation of Theorem 13.2.1
mostly follows Bárány and Füredi [BF87]. They pursued the hardness
of deterministic volume approximation, inspired by an earlier result of
Elekes [Ele86] (see Exercise 5). They proved the weaker bound (13.1);
the stronger bound in Theorem 13.2.1, in a slightly different form, was
obtained in their subsequent paper [BF88].
Theorem 13.2.1 was also derived by Carl and Pajor [CP88] from a
work of Carl [Car85] (they provide similar near-tight bounds for $\ell_p$-balls).
A dual version of Theorem 13.2.1 was independently discovered by
Gluskin [Glu89] and by Bourgain, Lindenstrauss, and Milman [BLM89].
The dual setting deals with the minimum volume of the intersection
of $N$ symmetric slabs in $\mathbf{R}^n$. Namely, let $u_1, u_2, \dots, u_N \in \mathbf{R}^n$ be given
(nonzero) vectors, and let $K = \bigcap_{i=1}^{N} \{x \in \mathbf{R}^n : |\langle u_i, x\rangle| \le 1\}$ (the width
of the $i$th slab is $\frac{2}{\|u_i\|}$). The dual analogue of Theorem 13.2.1 is this:
Whenever all $\|u_i\| \le 1$, we have $\mathrm{vol}(B^n)/\mathrm{vol}(K) \le \left(\frac{C}{n} \ln(\frac{N}{n}+1)\right)^{n/2}$.
A short and beautiful proof can be found in Ball's handbook chapter
[Bal]. There are also bounds based on the sum of norms of the
Ui' Namely, for all p E [1, (0), we have vol(K)l/n;::: ~ , where
yp/2.R

R = (~L,;:llluiIIP)l/P (Euclidean norms!), as was proved by Ball


and Pajor [BP90]; it also follows from Gluskin's work [Glu89]. For
$p = 2$, this result was established earlier by Vaaler. It has the following
nice reformulation: The intersection of the cube $[-1, 1]^N$ with any
$n$-flat through 0 has $n$-dimensional volume at least $2^n$ (see [Bal] for
more information and related results).
The setting with slabs and that of Theorem 13.2.1 are connected
by the Blaschke-Santaló inequality² and the inverse Blaschke-Santaló
inequality. The former states that $\mathrm{vol}(K)\,\mathrm{vol}(K^*) \le \mathrm{vol}(B^n)^2 \le c_1^n/n!$
for every centrally symmetric convex body $K$ in $\mathbf{R}^n$ (or, more generally,
for every convex body K having 0 as the center of gravity). It
allows one to pass from the setting with slabs to the setting of
Theorem 13.2.1: If the intersection of the slabs $\{x : |\langle u_i, x\rangle| \le 1\}$ has
large volume, then $\mathrm{conv}\{u_1, \ldots, u_N\}$ has small volume. The inverse
Blaschke-Santaló inequality, as was mentioned in the text, asserts that
$\mathrm{vol}(K)\,\mathrm{vol}(K^*) \ge c^n/n!$ for a suitable c > 0, and it can thus be used
for the reverse transition. It is much more difficult than the
Blaschke-Santaló inequality and it was proved by Bourgain and Milman; see,
e.g., [Mil98] for discussion and references.

² In the literature one often finds it as either Blaschke's inequality or Santaló's
inequality. Blaschke proved it for $n \le 3$ and Santaló for all n; see, e.g., the
chapter by Lutwak in the Handbook of Convex Geometry [GW93].
Let us remark that the weaker bound $\left(\frac{C}{n}\ln N\right)^{n/2}$ is relatively
easy to prove in the dual setting with slabs (Exercise 14.1.4), which
together with the Blaschke-Santaló inequality gives (13.1).
Theorem 13.2.1 concerns the situation where vol(P) is small compared
to $\mathrm{vol}(B^n)$. The smallest number of vertices of P such that
$\mathrm{vol}(P) \ge (1-\varepsilon)\,\mathrm{vol}(B^n)$ for a small $\varepsilon > 0$ was investigated by
Gordon, Reisner, and Schütt [GRS97]. In an earlier work they constructed
polytopes with N vertices giving $\varepsilon = O(nN^{-2/(n-1)})$, and in the
paper mentioned they proved that this is asymptotically optimal for
$N \ge (Cn)^{(n-1)/2}$, with a suitable constant C.
The oracle model for computation with convex bodies was introduced
by Grötschel, Lovász, and Schrijver [GLS88]. A deterministic
polynomial-time algorithm approximating the volume of a convex
body given by a suitable oracle (weak separation oracle) achieving the
approximation factor $n!(1+\varepsilon)$, for every $\varepsilon > 0$, was given by Betke
and Henk [BH93] (the geometric idea goes back at least to Macbeath
[Mac50]). The algorithm chooses an arbitrary direction $v_1$ and finds
the supporting hyperplanes $h_1^+$ and $h_1^-$ of K perpendicular to $v_1$. Let
$p_1^+$ and $p_1^-$ be contact points of $h_1^+$ and $h_1^-$ with K. The next direction
$v_2$ is chosen perpendicular to the affine hull of $\{p_1^+, p_1^-\}$, etc.

[Figure: the supporting hyperplanes $h_1^+$, $h_1^-$ of K, the contact points $p_1^+$, $p_1^-$, and the direction $v_1$.]

After n steps, the n pairs of hyperplanes determine a parallelotope
$P \supseteq K$, while $Q = \mathrm{conv}\{p_1^+, p_1^-, \ldots, p_n^+, p_n^-\} \subseteq K$, and it is not hard to
show that $\mathrm{vol}(P)/\mathrm{vol}(Q) \le n!$ (the extra factor $(1+\varepsilon)^n$ arises because
the oracle is not exact).
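
The two steps described above can be tried out in the plane. The following is only a sketch under simplifying assumptions that are not in the text: n = 2, the body K is a convex polygon given by its vertex list, and the oracle is replaced by exact maximization over the vertices. The function names and the sample pentagon are hypothetical.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def contact_points(points, v):
    """Contact points of the two supporting lines of conv(points) perpendicular to v."""
    return max(points, key=lambda p: dot(p, v)), min(points, key=lambda p: dot(p, v))

def polygon_area(vertices):
    """Shoelace formula; the vertices must be listed in order along the boundary."""
    m = len(vertices)
    s = sum(vertices[i][0] * vertices[(i + 1) % m][1]
            - vertices[(i + 1) % m][0] * vertices[i][1] for i in range(m))
    return abs(s) / 2.0

def sandwich_areas(points, v1=(1.0, 0.0)):
    """Return (area of the parallelogram P, area of the quadrilateral Q); v1 must be a unit vector."""
    p1_plus, p1_minus = contact_points(points, v1)
    d1 = (p1_plus[0] - p1_minus[0], p1_plus[1] - p1_minus[1])
    length = math.hypot(d1[0], d1[1])
    # Second direction: unit vector perpendicular to the line through p1_plus and p1_minus.
    v2 = (-d1[1] / length, d1[0] / length)
    p2_plus, p2_minus = contact_points(points, v2)
    w1 = dot(p1_plus, v1) - dot(p1_minus, v1)   # width of K in direction v1
    w2 = dot(p2_plus, v2) - dot(p2_minus, v2)   # width of K in direction v2
    sin_angle = abs(v1[0] * v2[1] - v1[1] * v2[0])
    area_p = w1 * w2 / sin_angle
    area_q = polygon_area([p1_plus, p2_plus, p1_minus, p2_minus])
    return area_p, area_q

pentagon = [(0, 0), (4, 0), (5, 2), (3, 4), (0, 3)]
area_p, area_q = sandwich_areas(pentagon)
print(area_q, polygon_area(pentagon), area_p)
```

In the plane the ratio of the two areas is at most 2! = 2, so the area of K is sandwiched within a factor of 2.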
The first polynomial-time randomized algorithm for approximating
the volume with arbitrary precision was discovered by Dyer, Frieze,
and Kannan [DFK91]. Its parameters have been improved many times
since then; see, e.g., Kannan, Lovász, and Simonovits [KLS97]. A recent
success of these methods is a polynomial-time approximation algorithm
for the permanent of a nonnegative matrix by Jerrum, Sinclair,
and Vigoda [JSV01].
By considerations partially indicated in Exercise 4, Bárány and
Füredi [BF87] showed that in deterministic polynomial time one cannot
approximate the width of a convex body within a factor better
than $\Omega(\sqrt{n/\log n})$. Brieden, Gritzmann, Kannan, Klee, Lovász, and
Simonovits [BGK+99] provided a matching upper bound (up to a constant),
and they showed that in this case even randomized algorithms
are not more powerful. They also considered a variety of other parameters
of the convex body, such as diameter, inradius, and circumradius,
attaining similar results and improving many previous bounds from
[GLS88].
Lemma 13.2.2 appears in Fejes Tóth [Tót65].

Exercises
1. (a) Calculate the inradius and circumradius of a regular n-dimensional
simplex.
(b) Calculate the volume of the regular n-dimensional simplex inscribed
in the unit ball $B^n$. [2]
2. Suppose that the vertices of an n-dimensional simplex S lie on the sphere
$S^{n-1}$ and for each vertex v, the hyperplane tangent to $S^{n-1}$ at v is parallel
to the facet of S opposite to v. Check that S is regular. [2]
3. Let $S \subset \mathbf{R}^n$ be a simplex circumscribed about $B^n$ and let F be a facet
of S touching $B^n$ at a point c. Show that if c is not the center of gravity
of F, then there is another simplex S' (arising by slightly moving the
hyperplane that determines the facet F) that contains $B^n$ and has volume
smaller than vol(S).
4. The width of a convex body K is the minimum distance of two parallel
hyperplanes such that K lies between them. Prove that the convex hull
of N points in $B^n$ has width at most $O(\sqrt{(\ln N)/n})$.
5. (A weaker but simpler estimate) Let $V \subset \mathbf{R}^n$ be a finite set. Prove
that $\mathrm{conv}(V) \subseteq \bigcup_{v \in V} B(\frac{1}{2}v, \frac{1}{2}\|v\|)$, where B(x, r) is the ball of radius r
centered at x. Deduce that the convex hull of N points contained in $B^n$
has volume at most $\frac{N}{2^n}\,\mathrm{vol}(B^n)$.
This is essentially the argument of Elekes [Ele86].

13.3 Constructing Polytopes of Large Volume


For all N in the range $2n \le N \le 4^n$, we construct a polytope $P \subset B^n$ with
N vertices containing a ball of radius $r = \Omega(((\ln\frac{N}{n})/n)^{1/2})$. This shows that
the bound in Theorem 13.2.1 is tight for $N \ge 2n$, since $\mathrm{vol}(P)/\mathrm{vol}(B^n) \ge r^n$.
We begin with two extreme cases.
First we construct a k-dimensional polytope $P_0 \subset B^k$ with $4^k$ vertices
containing the ball $\frac{1}{2}B^k$. There are several possible ways; the simplest is based
on $\eta$-nets. We choose a 1-net $V \subset S^{k-1}$ and set $P_0 = \mathrm{conv}(V)$. According to
Lemma 13.1.1, we have $N = |V| \le 4^k$. If there were an x with $\|x\| = \frac{1}{2}$ not
lying in $P_0$, then the separating hyperplane passing through x and avoiding $P_0$ would
define a cap (shaded) whose center y would be at distance at least 1 from V,
contradicting the choice of V as a 1-net.
Another extreme case is with N = 2q vertices in dimension n = q. Then
we can take the crosspolytope, i.e., the convex hull of the vectors $e_1, -e_1, \ldots,
e_q, -e_q$, where $(e_1, \ldots, e_q)$ is the standard orthonormal basis. The radius of
the inscribed ball is $r = \frac{1}{\sqrt{q}}$, which matches the asserted formula.
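Next, suppose that n = qk for integers q and k and set $N = q4^k$. From
$N = q4^k = \frac{n}{k}4^k$ we have $\frac{N}{n} = 4^k/k \ge e^k$ and $k \le \ln\frac{N}{n}$, and so
$q \ge n/\ln\frac{N}{n}$. In the other direction, $\frac{N}{n} \le 4^k$ gives $k \ge (\ln\frac{N}{n})/\ln 4$, so that
$q = n/k = O(n/\ln\frac{N}{n})$ and $\frac{1}{2\sqrt{q}} = \Omega(((\ln\frac{N}{n})/n)^{1/2})$.
Hence it suffices to construct an N-vertex polytope $P \subset B^n$ containing the
ball $rB^n$ with $r = \frac{1}{2\sqrt{q}}$.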
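
As a sanity check on the parameters, the relation between $r = \frac{1}{2\sqrt{q}}$ and the target order $((\ln\frac{N}{n})/n)^{1/2}$ can be tabulated. This is only an illustrative sketch; the function name and the sample values of q and k are assumptions, and the constant $1/(2\sqrt{\ln 4})$ comes from the elementary estimate $N/n \le 4^k$.

```python
import math

def construction_parameters(q, k):
    """Return (n, N, r, scale) for q blocks of dimension k."""
    n = q * k
    N = q * 4 ** k
    r = 1.0 / (2.0 * math.sqrt(q))             # inradius guaranteed by the construction
    scale = math.sqrt(math.log(N / n) / n)     # the order of magnitude claimed for r
    return n, N, r, scale

for q, k in [(1, 1), (3, 2), (10, 5), (50, 8)]:
    n, N, r, scale = construction_parameters(q, k)
    # The ratio r / scale should stay above 1 / (2 * sqrt(ln 4)), about 0.42.
    print(n, N, round(r, 4), round(r / scale, 4))
```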
The construction of P is a combination of the two constructions above.
We interpret $\mathbf{R}^n$ as the product $\mathbf{R}^k \times \mathbf{R}^k \times \cdots \times \mathbf{R}^k$ (q factors). In each of
the copies of $\mathbf{R}^k$, we choose a polytope $P_0$ with $4^k$ vertices as above, and we
let P be the convex hull of their union. More formally,

$$P = \mathrm{conv}\{(\underbrace{0, \ldots, 0}_{(i-1)k\times}, x_1, x_2, \ldots, x_k, 0, 0, \ldots, 0) : (x_1, \ldots, x_k) \in V,\ i = 1, 2, \ldots, q\},$$

where V is the vertex set of $P_0$.
We want to show that P contains the ball $rB^n$, $r = \frac{1}{2\sqrt{q}}$. Let x be a point
of norm $\|x\| \le r$ and let $x^{(i)}$ be the vector obtained from x by retaining the
coordinates in the ith block, i.e., in positions $(i-1)k+1, \ldots, ik$, and setting
all the other coordinates to 0. These $x^{(i)}$ are pairwise orthogonal, and x lies
in the q-dimensional subspace spanned by them. Let $y^{(i)} = \frac{x^{(i)}}{2\|x^{(i)}\|}$ be the
vector of length $\frac{1}{2}$ in the direction of $x^{(i)}$. Each $y^{(i)}$, and also each $-y^{(i)}$, is
contained in P, since $P_0$ contains the ball of radius $\frac{1}{2}$. The convex hull of the
vectors $\pm y^{(i)}$ is a q-dimensional crosspolytope of circumradius $\frac{1}{2}$, and so it
contains all vectors of norm $\frac{1}{2\sqrt{q}}$ in the subspace spanned by the $x^{(i)}$, including x.
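
The containment argument can be replayed numerically: writing $x = \sum_i \lambda_i y^{(i)}$ with $\lambda_i = 2\|x^{(i)}\|$, the Cauchy-Schwarz inequality gives $\sum_i \lambda_i \le 2\sqrt{q}\,\|x\| \le 1$, so x is a convex combination of the $y^{(i)}$ and the origin. The sketch below checks this for one random point; the helper name, the seed, and the values q = 6, k = 4 are hypothetical choices.

```python
import math
import random

def block_decomposition(x, q, k):
    """Return coefficients lambda_i and vectors y^(i) with x = sum_i lambda_i * y^(i)."""
    coefficients, ys = [], []
    for i in range(q):
        block = x[i * k:(i + 1) * k]
        norm = math.sqrt(sum(c * c for c in block))
        if norm == 0.0:
            continue                      # an empty block contributes nothing
        y = [0.0] * (q * k)
        y[i * k:(i + 1) * k] = [c / (2.0 * norm) for c in block]   # length 1/2
        ys.append(y)
        coefficients.append(2.0 * norm)
    return coefficients, ys

random.seed(0)
q, k = 6, 4
r = 1.0 / (2.0 * math.sqrt(q))
x = [random.gauss(0.0, 1.0) for _ in range(q * k)]
norm_x = math.sqrt(sum(c * c for c in x))
x = [r * c / norm_x for c in x]           # a point of norm exactly r
coefficients, ys = block_decomposition(x, q, k)
print(sum(coefficients))                  # at most 1 by Cauchy-Schwarz
```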
This construction assumes that n and N are of a special form, but it
is not difficult to extend the bounds to all $n \ge 2$ and all N in the range
$2n \le N \le 4^n$ by monotonicity considerations; we omit the details. This
proves that the bound in Theorem 13.2.1 is tight up to the value of the
constant C for $2n \le N \le 4^n$. □

Bibliography and remarks. Several proofs are known for the lower
bound almost matching Theorem 13.2.1 (Bárány and Füredi [BF87],
Carl and Pajor [CP88], Kochol [Koc94]). In Bárány and Füredi [BF87],
the appropriate polytope is obtained essentially as the convex hull of
N random points on $S^{n-1}$ (for technical reasons, d special vertices are
added), and the volume estimate is derived from an exact formula for
the expected surface measure of the convex hull of N random points
on $S^{n-1}$ due to Buchta, Müller, and Tichy [BMT95].
The idea of the beautifully simple construction in the text is due
to Kochol [Koc94]. His treatment of the basic case with exponentially
large N is different, though: He takes points of a suitably scaled integer
lattice contained in $B^k$ for V, which yields an efficient construction
(unlike the argument with a 1-net used in the text, which is only
existential).

Exercises
1. (Polytopes in $B^n$ with polynomially many facets)
(a) Show that the cube inscribed in the unit ball $B^n$, which is a convex
polytope with 2n facets, has volume of a larger order of magnitude than
any convex polytope in $B^n$ with polynomially many vertices (and so,
concerning volume, "facets are better than vertices").
(b) Prove that the inradius of any convex polytope with N facets contained
in $B^n$ is at most $O(\sqrt{(\ln(N/n + 1))/n})$ (and so, in this respect,
facets are not better than vertices).
These observations are from Brieden and Kochol [BK00].

13.4 Approximating Convex Bodies by Ellipsoids


One of the most important issues in the life of convex bodies is their ap-
proximation by ellipsoids, since ellipsoids are in many respects the simplest
imaginable compact convex bodies. The following result tells us how well
they can generally be approximated (or how badly, depending on the point
of view).

13.4.1 Theorem (John's lemma). Let $K \subset \mathbf{R}^n$ be a bounded closed
convex body with nonempty interior. Then there exists an ellipsoid $E_{in}$ such
that
$$E_{in} \subseteq K \subseteq E_{out},$$
where $E_{out}$ is $E_{in}$ expanded from its center by the factor n. If K is symmetric
about the origin, then we have the improved approximation
$$E_{in} \subseteq K \subseteq E_{out} = \sqrt{n} \cdot E_{in}.$$


Thus, K can be approximated from outside and from inside by similar
ellipsoids with ratio 1 : n, or $1 : \sqrt{n}$ for the centrally symmetric case. Both
these ratios are the best possible in general, as is shown by K being the
regular simplex in the general case and the cube in the centrally symmetric
case.

In order to work with ellipsoids, we need a rigorous definition. A suitable
one is to consider ellipsoids as affine images of the unit ball: If $B^n$ denotes
the unit ball in $\mathbf{R}^n$, an ellipsoid E is a set $E = f(B^n)$, where $f\colon \mathbf{R}^n \to \mathbf{R}^n$
is an affine map of the form $f\colon x \mapsto Ax + c$. Here x is regarded as a column
vector, $c \in \mathbf{R}^n$ is a translation vector, and A is a nonsingular $n \times n$ matrix.
A very simple case is that of c = 0 and A a diagonal matrix with positive
entries $a_1, a_2, \ldots, a_n$ on the diagonal. Then

$$E = \left\{x \in \mathbf{R}^n : \frac{x_1^2}{a_1^2} + \frac{x_2^2}{a_2^2} + \cdots + \frac{x_n^2}{a_n^2} \le 1\right\}, \qquad (13.3)$$

as is easy to check; this is an ellipsoid with center at 0 and with semiaxes
$a_1, a_2, \ldots, a_n$. In this case we have $\mathrm{vol}(E) = a_1 a_2 \cdots a_n \cdot \mathrm{vol}(B^n)$. An arbitrary
ellipsoid E can be brought to this form by a suitable translation and
rotation about the origin. In the language of linear algebra, this corresponds
to diagonalizing a positive definite matrix using an orthonormal basis
consisting of its eigenvectors; see Exercise 1.
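
The diagonal case can be checked by a small experiment: map random points of the unit ball by $x \mapsto Ax$ with A diagonal and test the inequality in (13.3). This is a sketch only; the helper names, the seed, and the sample semiaxes are hypothetical, and the volume of $B^n$ is computed from the standard formula $\pi^{n/2}/\Gamma(n/2+1)$, which is not stated in the text.

```python
import math
import random

def random_point_in_ball(n, rng):
    """Uniformly distributed random point of the unit ball B^n."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in g))
    radius = rng.random() ** (1.0 / n)
    return [radius * c / norm for c in g]

def satisfies_13_3(x, semiaxes, tol=1e-12):
    """Test the inequality sum of x_i^2 / a_i^2 <= 1 from (13.3)."""
    return sum((xi / ai) ** 2 for xi, ai in zip(x, semiaxes)) <= 1.0 + tol

def ellipsoid_volume(semiaxes):
    """vol(E) = a_1 * ... * a_n * vol(B^n)."""
    n = len(semiaxes)
    ball_volume = math.pi ** (n / 2.0) / math.gamma(n / 2.0 + 1.0)
    return math.prod(semiaxes) * ball_volume

rng = random.Random(1)
semiaxes = [2.0, 3.0, 0.5]
images = [[a * c for a, c in zip(semiaxes, random_point_in_ball(3, rng))]
          for _ in range(1000)]
print(all(satisfies_13_3(x, semiaxes) for x in images), ellipsoid_volume(semiaxes))
```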
Proof of Theorem 13.4.1. In both cases in the theorem, $E_{in}$ is chosen as
an ellipsoid of the largest possible volume contained in K. Easy compactness
considerations show that a maximum-volume ellipsoid exists. In fact, it is
also unique, but we will not prove this. (Alternatively, the proof can be done
starting with the smallest-volume ellipsoid enclosing K, but this has some
technical disadvantages. For example, its existence is not so obvious.)
We prove only the centrally symmetric case of John's lemma. The
nonsymmetric case follows the same idea, but the calculations are different and
more complicated, and we leave them to Exercise 2.
So we suppose that K is symmetric about 0, and we fix the ellipsoid
$E_{in}$ of maximum volume contained in K. It is easily seen that $E_{in}$ can be
assumed to be symmetric, too. We make a linear transformation so that $E_{in}$
becomes the unit ball $B^n$. Assuming that the enlarged ball $\sqrt{n} \cdot B^n$ does not
contain K, we derive a contradiction by exhibiting an ellipsoid $E' \subseteq K$ with
$\mathrm{vol}(E') > \mathrm{vol}(B^n)$.
We know that there is a point $x \in K$ with $\|x\| > \sqrt{n}$. For convenience, we
may suppose that $x = (s, 0, 0, \ldots, 0)$, $s > \sqrt{n}$. To finish the proof, we check
that the region $R = \mathrm{conv}(B^n \cup \{-x, x\})$
contains an ellipsoid E' of volume larger than $\mathrm{vol}(B^n)$.

The calculation is a little unpleasant but not so bad, after all. The region
R is a rotational body; all the sections by hyperplanes perpendicular to the
$x_1$-axis are balls. We naturally also choose E' with this property: The semiaxis
in the $x_1$-direction is some a > 1, while the slice with the hyperplane $\{x_1 = 0\}$
is a ball of a suitable radius b < 1. We have $\mathrm{vol}(E') = ab^{n-1}\,\mathrm{vol}(B^n)$, and
so we want to choose a and b such that $ab^{n-1} > 1$ and $E' \subseteq R$. By the
rotational symmetry, it suffices to consider the planar situation and make
sure that the ellipse with semiaxes a and b is contained in the planar region
depicted above.
In order to avoid direct computation of a tangent to the ellipse, we multiply
the $x_1$-coordinate of all points by the factor $\frac{b}{a}$. This turns our ellipse
into a ball of radius b (drawn dashed in the accompanying figure).

A bit of trigonometry yields

$$t = \frac{s}{\sqrt{s^2-1}}, \qquad b = \frac{t \cdot \frac{b}{a}s}{\sqrt{t^2 + \left(\frac{b}{a}s\right)^2}},$$

where t is the height at which the tangent from x to the unit ball meets the
vertical axis. This leads to $a^2 = s^2(1-b^2) + b^2$. We now choose b just a little smaller than 1;
a suitable parameterization is $b^2 = 1-\varepsilon$ for a small $\varepsilon > 0$. We want to show
that $ab^{n-1} > 1$, and for convenience, we work with the square. We have

$$a^2 b^{2(n-1)} = (1 + (s^2-1)\varepsilon)(1-\varepsilon)^{n-1}.$$

The Maclaurin series of the right-hand side in the variable $\varepsilon$ is $1 + (s^2 - n)\varepsilon +
O(\varepsilon^2)$. Since $s^2 > n$, the expression indeed exceeds 1 for all sufficiently small
$\varepsilon > 0$. Theorem 13.4.1 is proved. □
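
The final calculation can be checked numerically in the planar picture. The sketch below picks sample values (n = 5, s = 2.5, $\varepsilon$ = 0.01, all hypothetical), computes a and b from $b^2 = 1-\varepsilon$ and $a^2 = s^2(1-b^2)+b^2$, and tests that boundary points of the ellipse lie in the region $\mathrm{conv}(B^2 \cup \{-x, x\})$. The membership test uses the elementary fact that the tangents from (s, 0) touch the unit circle at the points with first coordinate 1/s.

```python
import math

def enlarged_ellipsoid(n, s, eps):
    """Return (a, b, a * b^(n-1)) for b^2 = 1 - eps and a^2 = s^2 (1 - b^2) + b^2."""
    b = math.sqrt(1.0 - eps)
    a = math.sqrt(s * s * (1.0 - b * b) + b * b)
    return a, b, a * b ** (n - 1)

def in_region(u, v, s, tol=1e-9):
    """Membership of (u, v) in the convex hull of the unit disk and the points (-s, 0), (s, 0)."""
    u, v = abs(u), abs(v)
    if u * u + v * v <= 1.0 + tol:
        return True
    # Outside the disk: the point must lie in the triangle cut off by the two tangents.
    return 1.0 / s <= u <= s + tol and v <= (s - u) / math.sqrt(s * s - 1.0) + tol

n, s, eps = 5, 2.5, 0.01                 # s^2 = 6.25 > n
a, b, volume_ratio = enlarged_ellipsoid(n, s, eps)
boundary = [(a * math.cos(2 * math.pi * j / 3600), b * math.sin(2 * math.pi * j / 3600))
            for j in range(3600)]
print(volume_ratio, all(in_region(u, v, s) for u, v in boundary))
```

A volume ratio above 1 together with the containment is exactly the contradiction used in the proof.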

Bibliography and remarks. Theorem 13.4.1 was obtained by
John [Joh48]. He actually proved a stronger statement, which can
be quite useful in many applications. Roughly speaking, it says that
the maximum-volume inscribed ellipsoid has many points of contact
with K that "fix" it within K. The statement and proof are nicely
explained in Ball [Bal97].
As was remarked in the text, the maximum-volume ellipsoid contained
in K is unique. The same is true for the minimum-volume
enclosing ellipsoid of K; a proof of the latter fact is outlined in
Exercise 3. The uniqueness was proved independently by several authors,
and the oldest such results seem to be due to Löwner (see Danzer,
Grünbaum, and Klee [DGK63] for references). The minimum-volume
enclosing ellipsoid is sometimes called the Löwner-John ellipsoid, but
in other sources the same name refers to the maximum-volume inscribed
ellipsoid.
The exact computation of the smallest enclosing ellipsoid for a
given convex body K is generally hard. For example, it is NP-hard
to compute the smallest enclosing ellipsoid of a given finite set if the
dimension is a part of input (there are linear-time algorithms for every
fixed dimension; see, e.g., Matoušek, Sharir, and Welzl [MSW96]).
But under suitable algorithmic assumptions on the way that a convex
body K is given (weak separation oracle), it is possible to compute
in polynomial time an enclosing ellipsoid such that its shrinking by a
factor of roughly $n^{3/2}$ (roughly n in the centrally symmetric case) is
contained in K (if K is given as an H-polytope, then these factors can
be improved to the nearly worst-case optimal $n+1$ and $\sqrt{n+1}$,
respectively). Finding such approximating ellipsoids is a basic subroutine
in other important algorithms; see Grötschel, Lovász, and Schrijver
[GLS88] for more information.
There are several other significant ellipsoids associated with a given
convex body that approximate it in various ways; see, e.g., Linden-
strauss and Milman [LM93] and Tomczak-Jaegermann [TJ89].

Exercises
1. Let E be the ellipsoid $f(B^n)$, where $f\colon x \mapsto Ax$ for an $n \times n$ nonsingular
matrix A.
(a) Show that $E = \{x \in \mathbf{R}^n : x^T B x \le 1\}$. What is the matrix B?
(b) Recall or look up appropriate theorems in linear algebra showing that
there is an orthonormal matrix T such that $B' = TBT^{-1}$ is a diagonal
matrix with the eigenvalues of B on the diagonal (check and use the fact
that B is positive definite in our case).
(c) What is the geometric meaning of T, and what is the relation of the
entries of $TBT^{-1}$ to the semiaxes of the ellipsoid E?
2. Prove the part of Theorem 13.4.1 dealing with not necessarily symmetric
convex bodies.
3. (Uniqueness of the smallest enclosing ellipsoid) Let $X \subset \mathbf{R}^n$ be a bounded
set that is not contained in a hyperplane (i.e., it contains n+1 affinely
independent points). Let $\mathcal{E}(X)$ be the set of all ellipsoids in $\mathbf{R}^n$ containing X.
(a) Prove that there exists an $E_0 \in \mathcal{E}(X)$ with $\mathrm{vol}(E_0) = \inf\{\mathrm{vol}(E) : E \in
\mathcal{E}(X)\}$. (Show that the infimum can be taken over a suitable compact
subset of $\mathcal{E}(X)$.)
(b) Let $E_1, E_2$ be ellipsoids in $\mathbf{R}^n$; check that after a suitable affine
transformation of coordinates, we may assume that
$E_1 = \{x \in \mathbf{R}^n : \sum_{i=1}^{n} \frac{x_i^2}{a_i^2} \le 1\}$ and $E_2 = \{x \in \mathbf{R}^n : \|x - c\| \le 1\}$.
Define $E = \{x \in \mathbf{R}^n : \frac{1}{2}\sum_{i=1}^{n} \frac{x_i^2}{a_i^2} + \frac{1}{2}\sum_{i=1}^{n} (x_i - c_i)^2 \le 1\}$.
Verify that $E_1 \cap E_2 \subseteq E$, that E is an ellipsoid,
and that $\mathrm{vol}(E) \le \max(\mathrm{vol}(E_1), \mathrm{vol}(E_2))$, with equality only if $E_1 = E_2$.
Conclude that the smallest-volume enclosing ellipsoid of X is unique.
4. (Uniqueness of the smallest enclosing ball)
(a) In analogy with Exercise 3, prove that for every bounded set $X \subset \mathbf{R}^n$,
there exists a unique minimum-volume ball containing X.
(b) Show that if $X \subset \mathbf{R}^n$ is finite then the smallest enclosing ball is
determined by at most n+1 points of X; that is, there exists an at most
(n+1)-point subset of X whose smallest enclosing ball is the same as that
of X.
5. (a) Let $P \subset \mathbf{R}^2$ be a convex polygon with n vertices. Prove that there
are three consecutive vertices of P such that the area of their convex hull
is at most $O(n^{-3})$ times the area of P.
(b) Using (a) and the fact that every triangle with vertices at integer
points has area at least $\frac{1}{2}$ (check!), prove that every convex n-gon with
integral vertices has area $\Omega(n^3)$.
Remark. Rényi and Sulanke [RS64] proved that the worst case in (a) is
the regular convex n-gon.
14 Measure Concentration and Almost Spherical Sections

In the first two sections we are going to discuss measure concentration on
a high-dimensional unit sphere. Roughly speaking, measure concentration
says that if $A \subseteq S^{n-1}$ is a set occupying at least half of the sphere, then
almost all points of $S^{n-1}$ are quite close to A, at distance about $O(n^{-1/2})$.
Measure concentration is an extremely useful technical tool in high-dimensional
geometry. From the point of view of probability theory, it provides
tail estimates for random variables defined on $S^{n-1}$, and in this respect it
resembles Chernoff-type tail estimates for the sums of independent random
variables. But it is of a more general nature, more like tail estimates for
Lipschitz functions on discrete spaces obtained using martingales.
The second main theme of this chapter is almost-spherical sections of
convex bodies. Given a convex body $K \subset \mathbf{R}^n$, we want to find a k-dimensional
subspace L of $\mathbf{R}^n$ such that $K \cap L$ is almost spherical; i.e., it contains a
ball of some radius r and is contained in the concentric ball of radius $(1+\varepsilon)r$.
A remarkable Ramsey-type result, Dvoretzky's theorem, shows that with k
being about $\varepsilon^{-2}\log n$, such a k-dimensional almost-spherical section exists
for every K. We also include an application concerning convex polytopes,
showing that a high-dimensional centrally symmetric convex polytope cannot
have both a small number of vertices and a small number of facets.
Both measure concentration and the existence of almost-spherical sections
are truly high-dimensional phenomena, practically meaningless in the familiar
dimensions 2 and 3. The low-dimensional intuition is of little use here, but
perhaps by studying many results and examples one can develop intuition on
what to expect in high dimensions.
We present only a few selected results from an extensive and well-
developed theory of high-dimensional convexity. Most of it was built in the
so-called local theory of Banach spaces, which deals with the geometry of
finite-dimensional subspaces of various Banach spaces. In the literature, the
theorems are usually formulated in the language of Banach spaces, so instead
of symmetric convex bodies, one speaks about norms, and so on. Here we
introduce some rudimentary terminology concerning normed spaces, but we
express most of the notions in geometric language, hoping to make it more
accessible to nonspecialists in Banach spaces. So, for example, in the formu-
lation of Dvoretzky's theorem, we do not speak about the Banach-Mazur
distance to an inner product norm but rather about almost spherical convex
bodies. On the other hand, for a more serious study of this theory, the lan-
guage of normed spaces seems necessary.

14.1 Measure Concentration on the Sphere


Let P denote the usual surface measure on the unit Euclidean sphere $S^{n-1}$,
scaled so that all of $S^{n-1}$ has measure 1 (a rigorous definition will be
mentioned later). This P is a probability measure, and we often think of $S^{n-1}$ as
a probability space. For a set $A \subseteq S^{n-1}$, P[A] is the P-measure of A and also
the probability that a random point of $S^{n-1}$ falls into A. The letter P should
suggest "probability of," and the notation P[A] is analogous to Prob[A] used
elsewhere in the book.
Measure concentration on the sphere can be approached in two steps. The
first step is the observation, interesting but rather easy to prove, that for large
n, most of $S^{n-1}$ lies quite close to the "equator." For example, the following
diagram shows the width of the band around the equator that contains 90%
of the measure, for various dimensions n:

[Figure: three spheres with the band containing 90% of the measure shaded, for n = 3, n = 11, and n = 101.]

That is, if the width of the gray stripe is 2w, then

$$P[\{x \in S^{n-1} : -w \le x_n \le w\}] = 0.9.$$

As we will see later, w is of order $n^{-1/2}$ for large n. (Of course, one might
ask why the measure is concentrated just around the "equator" $x_n = 0$. But
counterintuitive as it may sound, it is concentrated around any equator, i.e.,
near any hyperplane containing the origin.)
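
The half-width w of the 90% band can be approximated numerically. The sketch below relies on a standard fact that is not stated in the text: for $n \ge 3$, the coordinate $x_n$ of a uniformly random point of $S^{n-1}$ has density proportional to $(1-x^2)^{(n-3)/2}$ on $[-1, 1]$. The function names, the midpoint rule, and the number of steps are illustrative choices.

```python
import math

def band_measure(n, w, steps=2000):
    """P[|x_n| <= w] on S^(n-1), n >= 3, by the midpoint rule."""
    def integral(upper):
        h = upper / steps
        return h * sum((1.0 - ((j + 0.5) * h) ** 2) ** ((n - 3) / 2.0)
                       for j in range(steps))
    return integral(w) / integral(1.0)

def band_half_width(n, mass=0.9):
    """Bisection for the w with P[|x_n| <= w] = mass."""
    lo, hi = 0.0, 1.0
    for _ in range(40):
        mid = (lo + hi) / 2.0
        if band_measure(n, mid) < mass:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

for n in (3, 11, 101):
    w = band_half_width(n)
    # The product w * sqrt(n) should settle down as n grows.
    print(n, round(w, 4), round(w * math.sqrt(n), 4))
```

For n = 3 the density is constant, so the band is simply the slab $|x_3| \le 0.9$; this gives a quick check of the routine.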
The second, considerably deeper, step shows that the measure on $S^{n-1}$
is concentrated not only around the equator, but near the boundary of any
(measurable) subset $A \subset S^{n-1}$ covering half of the sphere. Here is a precise
quantitative formulation.

14.1.1 Theorem (Measure concentration for the sphere). Let $A \subseteq
S^{n-1}$ be a measurable set with $P[A] \ge \frac{1}{2}$, and let $A_t$ denote the t-neighborhood
of A, that is, the set of all $x \in S^{n-1}$ whose Euclidean distance to A is
at most t. Then
$$1 - P[A_t] \le 2e^{-t^2 n/2}.$$

Thus, if A occupies half of the sphere, almost all points of the sphere
lie at distance at most $O(n^{-1/2})$ from A; only extremely small reserves can
vegetate undisturbed by the nearness of A. (There is nothing very special
about measure $\frac{1}{2}$ here; see Exercise 1 for an analogous result with $P[A] =
\alpha \in (0, \frac{1}{2})$.) To recover the concentration around the equator, it suffices to
choose A as the northern hemisphere and then as the southern hemisphere.
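
For A a hemisphere the bound of Theorem 14.1.1 can be compared with a direct numerical computation. The sketch assumes the same standard density $(1-x^2)^{(n-3)/2}$ for one coordinate as a fact from outside the text, and uses the elementary observation that a point with $x_n = h > 0$ has Euclidean distance $\sqrt{2 - 2\sqrt{1-h^2}}$ from the closed hemisphere $\{x_n \le 0\}$. All names are hypothetical.

```python
import math

def cap_measure(n, h, steps=20000):
    """P[x_n > h] on S^(n-1) for 0 <= h <= 1 and n >= 3, by the midpoint rule."""
    def integral(lower):
        width = (1.0 - lower) / steps
        return width * sum((1.0 - (lower + (j + 0.5) * width) ** 2) ** ((n - 3) / 2.0)
                           for j in range(steps))
    return integral(h) / (2.0 * integral(0.0))

def far_from_hemisphere(n, t):
    """1 - P[A_t] for A the closed hemisphere {x_n <= 0}, with 0 <= t <= sqrt(2)."""
    # dist(x, A) > t  exactly when  sqrt(1 - x_n^2) < 1 - t^2 / 2.
    h = math.sqrt(1.0 - (1.0 - t * t / 2.0) ** 2)
    return cap_measure(n, h)

for n in (3, 10, 100):
    for t in (0.1, 0.5, 1.0):
        print(n, t, far_from_hemisphere(n, t), 2.0 * math.exp(-t * t * n / 2.0))
```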
We present a simple and direct geometric proof of a slightly weaker version
of Theorem 14.1.1, with $-t^2 n/4$ in the exponent instead of $-t^2 n/2$. It deals
with both the steps mentioned above in one stroke.
It is based on the Brunn-Minkowski inequality: $\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n} \le
\mathrm{vol}(A + B)^{1/n}$ for any nonempty compact sets $A, B \subset \mathbf{R}^n$ (Theorem 12.2.2).
We actually use a slightly different version of the inequality, which resembles
the well known inequality between the arithmetic and geometric means, at
least optically:
$$\mathrm{vol}(\tfrac{1}{2}(A + B)) \ge \sqrt{\mathrm{vol}(A)\,\mathrm{vol}(B)}. \qquad (14.1)$$
This is easily derived from the usual version: We have $\mathrm{vol}(\frac{1}{2}(A + B))^{1/n} \ge
\mathrm{vol}(\frac{1}{2}A)^{1/n} + \mathrm{vol}(\frac{1}{2}B)^{1/n} = \frac{1}{2}(\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n}) \ge (\mathrm{vol}(A)\,\mathrm{vol}(B))^{1/2n}$
by the inequality $\frac{1}{2}(a + b) \ge \sqrt{ab}$.
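Proof of a weaker version of Theorem 14.1.1. For a set $A \subseteq S^{n-1}$,
we define $\tilde{A}$ as the union of all the segments connecting the points of A to 0:
$\tilde{A} = \{\alpha x : x \in A, \alpha \in [0, 1]\} \subseteq B^n$. Then we have
$$P[A] = \mu(\tilde{A}),$$
where $\mu(\tilde{A}) = \mathrm{vol}(\tilde{A})/\mathrm{vol}(B^n)$ is the normalized volume of $\tilde{A}$; in fact, this
can be taken as the definition of P[A].
Let $t \in [0, 1]$, let $P[A] \ge \frac{1}{2}$, and let $B = S^{n-1} \setminus A_t$. Then $\|a - b\| \ge t$ for
all $a \in A$, $b \in B$.
14.1.2 Lemma. For any $\tilde{x} \in \tilde{A}$ and $\tilde{y} \in \tilde{B}$, we have $\left\|\frac{\tilde{x}+\tilde{y}}{2}\right\| \le 1 - t^2/8$.
Proof of the lemma. Let $\tilde{x} = \alpha x$, $\tilde{y} = \beta y$, $x \in A$, $y \in B$.
First we calculate, by the Pythagorean theorem and by elementary
calculus,
$$\left\|\frac{x + y}{2}\right\| \le \sqrt{1 - \frac{t^2}{4}} \le 1 - \frac{t^2}{8}.$$
For passing to $\tilde{x}$ and $\tilde{y}$, we may assume that $\beta = 1$. Then
$$\left\|\frac{\tilde{x} + \tilde{y}}{2}\right\| = \left\|\frac{\alpha x + y}{2}\right\| \le \alpha\left\|\frac{x + y}{2}\right\| + (1 - \alpha)\left\|\frac{y}{2}\right\| \le \alpha\left(1 - \frac{t^2}{8}\right) + (1 - \alpha)\cdot\frac{1}{2} \le 1 - \frac{t^2}{8}.$$
The lemma is proved.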
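
Lemma 14.1.2 lends itself to a quick randomized spot check. The sketch below draws random unit vectors x and y, takes t as their distance capped at 1 (so that the hypothesis $\|x - y\| \ge t$, $t \in [0, 1]$ holds), and random scalings $\alpha, \beta \in [0, 1]$. The function names and the seed are hypothetical; such a check illustrates the statement and does not replace the proof.

```python
import math
import random

def random_unit_vector(n, rng):
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in g))
    return [c / norm for c in g]

def lemma_slack(n, rng):
    """Return (1 - t^2/8) - ||(alpha*x + beta*y)/2|| for one random instance."""
    x, y = random_unit_vector(n, rng), random_unit_vector(n, rng)
    t = min(1.0, math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y))))
    alpha, beta = rng.random(), rng.random()
    midpoint_norm = math.sqrt(sum(((alpha * a + beta * b) / 2.0) ** 2
                                  for a, b in zip(x, y)))
    return (1.0 - t * t / 8.0) - midpoint_norm

rng = random.Random(2)
slacks = [lemma_slack(n, rng) for n in (2, 3, 10, 50) for _ in range(500)]
print(min(slacks))   # the lemma predicts a nonnegative value
```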

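By the lemma, the set $\frac{1}{2}(\tilde{A} + \tilde{B})$ is contained in the ball of radius $1 - t^2/8$
around the origin. Applying Brunn-Minkowski in the form (14.1) to $\tilde{A}$ and
$\tilde{B}$, we have
$$\left(1 - \frac{t^2}{8}\right)^n \ge \mu\left(\tfrac{1}{2}(\tilde{A} + \tilde{B})\right) \ge \sqrt{\mu(\tilde{A})\,\mu(\tilde{B})} = \sqrt{P[A]\,P[B]} \ge \sqrt{\tfrac{1}{2}P[B]}.$$
So
$$P[B] \le 2\left(1 - \frac{t^2}{8}\right)^{2n} \le 2e^{-t^2 n/4}.$$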

Bibliography and remarks. The simple proof of the slightly
weaker measure concentration result for the sphere shown in this section
is due to Arias-de-Reyna, Ball, and Villa [ABV98]. More about
the history of measure concentration and related results will be
mentioned in the next section.

Exercises
1. Derive the following from Theorem 14.1.1: If $A \subseteq S^{n-1}$ satisfies $P[A] \ge
\alpha$, $0 < \alpha \le \frac{1}{2}$, then $1 - P[A_t] \le 2e^{-(t-t_0)^2 n/2}$, where $t_0$ is such that
$2e^{-t_0^2 n/2} < \alpha$.
2. Let $A, B \subset S^{n-1}$ be measurable sets with distance at least 2t. Prove that
$\min(P[A], P[B]) \le 2e^{-t^2 n/2}$. [2]
3. Use Theorem 14.1.1 to show that any 1-dense set in the unit sphere $S^{n-1}$
has at least $\frac{1}{2}e^{n/8}$ points. [2]
4. Let $K = \bigcap_{i=1}^{N} \{x \in \mathbf{R}^n : |\langle u_i, x\rangle| \le 1\}$ be the intersection of symmetric
slabs determined by unit vectors $u_1, \ldots, u_N \in \mathbf{R}^n$. Using Theorem 14.1.1,
prove that $\mathrm{vol}(B^n)/\mathrm{vol}(K) \le \left(\frac{C}{n}\ln N\right)^{n/2}$ for a suitable constant C.
The relation to Theorem 13.2.1 is explained in the notes to Section 13.2.

14.2 Isoperimetric Inequalities and More on Concentration

The usual proof of Theorem 14.1.1 (measure concentration) has two steps.
First, $P[A_t]$ is bounded for A the hemisphere (which is elementary calculus),
and second, it is shown that among all sets A of measure $\frac{1}{2}$, the hemisphere
has the smallest $P[A_t]$. The latter result is an example of an isoperimetric
inequality.
Before we formulate this inequality, let us begin with the mother of all
isoperimetric inequalities, the one for planar geometric figures. It states that
among all planar geometric figures with a given perimeter, the circular disk
has the largest possible area. (This is well known but not so easy to prove
rigorously.) More general isoperimetric inequalities are usually formulated
using the volume of a neighborhood instead of "perimeter." They claim that
among all sets of a given volume in some metric space under consideration,
a ball of that volume has the smallest volume of the t-neighborhood:

(In the picture, assuming that the dark areas are the same, the light gray area
is the smallest for the disk.) Letting t -+ 0, one can get a statement involving
the perimeter or surface area. But the formulation with t-neighborhood makes
sense even in spaces where "surface area" is not defined; it suffices to have a
metric and a measure on the considered space.
Here is this "neighborhood" form of isoperimetric inequality for the
Euclidean space $\mathbf{R}^n$ with Lebesgue measure.
14.2.1 Proposition. For any compact set $A \subset \mathbf{R}^n$ and any $t \ge 0$, we have
$\mathrm{vol}(A_t) \ge \mathrm{vol}(B_t)$, where B is a ball of the same volume as A.
Although we do not need this particular result in the further development,
let us digress and mention a nice proof using the Brunn-Minkowski inequality
(Theorem 12.2.2).
Proof. By rescaling, we may assume that B is a ball of unit radius. Then
$A_t = A + tB$, and so

$$\mathrm{vol}(A_t) = \mathrm{vol}(A + tB) \ge \left(\mathrm{vol}(A)^{1/n} + t\,\mathrm{vol}(B)^{1/n}\right)^n = (1 + t)^n\,\mathrm{vol}(B) = \mathrm{vol}(B_t).$$
□
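
In the plane, Proposition 14.2.1 can be illustrated with a rectangle. The sketch uses the Steiner formula for the area of the t-neighborhood of a planar convex set (area plus t times perimeter plus $\pi t^2$), which is a standard fact not stated in the text; the function names and sample dimensions are hypothetical.

```python
import math

def rectangle_neighborhood_area(a, b, t):
    """Area of the t-neighborhood of an a x b rectangle (Steiner formula)."""
    return a * b + 2.0 * t * (a + b) + math.pi * t * t

def disk_neighborhood_area(area, t):
    """Area of the t-neighborhood of a disk with the given area."""
    radius = math.sqrt(area / math.pi)
    return math.pi * (radius + t) ** 2

for a, b in [(1.0, 1.0), (4.0, 0.25), (10.0, 0.1)]:
    for t in (0.0, 0.5, 2.0):
        # The rectangle's neighborhood is never smaller than the disk's.
        print(a, b, t, rectangle_neighborhood_area(a, b, t), disk_neighborhood_area(a * b, t))
```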
For the sphere $S^{n-1}$ with the usual Euclidean metric inherited from $\mathbf{R}^n$,
an r-ball is a spherical cap, i.e., an intersection of $S^{n-1}$ with a half-space. The
isoperimetric inequality states that for all measurable sets $A \subseteq S^{n-1}$ and all
$t \ge 0$, we have $P[A_t] \ge P[C_t]$, where C is a spherical cap with P[C] = P[A].
We are not going to prove this; no really simple proof seems to be known.
The measure concentration on the sphere (Theorem 14.1.1) is a rather
direct consequence of this isoperimetric inequality, by the argument already
indicated above. If $P[A] = \frac{1}{2}$, then $P[A_t] \ge P[C_t]$, where C is a cap with
$P[C] = \frac{1}{2}$, i.e., a hemisphere. Thus, it suffices to estimate the measure of the
complementary cap $S^{n-1} \setminus C_t$.¹
Gaussian concentration. There are many other metric probability spaces
with measure concentration phenomena analogous to Theorem 14.1.1. Perhaps
the most important one is $\mathbf{R}^n$ with the Euclidean metric and with the
n-dimensional Gaussian measure $\gamma$, given by

$$\gamma(A) = (2\pi)^{-n/2} \int_A e^{-\|x\|^2/2}\,dx.$$

This is a probability measure on $\mathbf{R}^n$ corresponding to the n-dimensional
normal distribution. Let $Z_1, Z_2, \ldots, Z_n$ be independent real random variables,
each of them with the standard normal distribution N(0, 1), i.e., such that

$$\mathrm{Prob}[Z_i \le z] = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-t^2/2}\,dt$$

for all $z \in \mathbf{R}$. Then the vector $(Z_1, Z_2, \ldots, Z_n) \in \mathbf{R}^n$ is distributed according
to the measure $\gamma$. This $\gamma$ is spherically symmetric; the density function
$(2\pi)^{-n/2} e^{-\|x\|^2/2}$ depends only on the distance of x from the origin. The
distance from the origin of a point chosen at random according to this distribution is sharply
concentrated around $\sqrt{n}$, and in many respects, choosing a random point
according to $\gamma$ is similar to choosing a random point from the uniform
distribution on the sphere $\sqrt{n}\,S^{n-1}$.
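
The concentration of the norm around $\sqrt{n}$ is easy to observe by sampling. The following sketch draws points with independent N(0, 1) coordinates using the standard library; the dimension n = 400, the sample size, and the seed are arbitrary illustrative choices.

```python
import math
import random

def gaussian_norms(n, samples, seed=0):
    """Euclidean norms of random points of R^n with independent N(0, 1) coordinates."""
    rng = random.Random(seed)
    return [math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)))
            for _ in range(samples)]

n = 400
norms = gaussian_norms(n, 200)
mean_norm = sum(norms) / len(norms)
largest_deviation = max(abs(r - math.sqrt(n)) for r in norms)
# The norms cluster around sqrt(n) = 20 with deviations of constant order.
print(math.sqrt(n), mean_norm, largest_deviation)
```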
The isoperimetric inequality for the Gaussian measure claims that among
all sets A with given $\gamma(A)$, a half-space has the smallest possible measure
of the t-neighborhood. By simple calculation, this yields the corresponding
theorem about measure concentration for the Gaussian measure:
14.2.2 Theorem (Gaussian measure concentration). Let a measurable
set $A \subseteq \mathbf{R}^n$ satisfy $\gamma(A) \ge \frac{1}{2}$. Then $\gamma(A_t) \ge 1 - e^{-t^2/2}$.
¹ Theorem 14.1.1 provides a good upper bound for the measure of a spherical cap,
but sometimes a lower bound is useful, too. Here are fairly precise estimates; for
convenience they are expressed with a different parameterization. Let $C(\tau) =
\{x \in S^{n-1} : x_1 \ge \tau\}$ denote the spherical cap of height $1 - \tau$. Then for $0 \le \tau \le
\sqrt{2/n}$, we have $\frac{1}{12} \le P[C(\tau)] \le \frac{1}{2}$, and for $\sqrt{2/n} \le \tau < 1$, we have

$$\frac{1}{6\tau\sqrt{n}}\,(1 - \tau^2)^{(n-1)/2} \le P[C(\tau)] \le \frac{1}{2\tau\sqrt{n}}\,(1 - \tau^2)^{(n-1)/2}.$$

These formulas are taken from Brieden et al. [BGK+99].

Note that the dimension does not appear in this inequality, and indeed
the Gaussian concentration has infinite-dimensional versions as well. Measure
concentration on $S^{n-1}$, with slightly suboptimal constants, can be proved as
an easy consequence of the Gaussian concentration; see, for example, Milman
and Schechtman [MS86] (Appendix V) or Pisier [Pis89].
Most of the results in the sequel obtained using measure concentration on
the sphere can be derived from the Gaussian concentration as well. In more
advanced applications the Gaussian concentration is often technically preferable,
but here we stick to the perhaps more intuitive measure concentration
on the sphere.
Other important "continuous" spaces with concentration results similar to
Theorem 14.1.1 include the n-dimensional torus (the n-fold Cartesian product
$S^1 \times \cdots \times S^1 \subset \mathbf{R}^{2n}$) and the group SO(n) of all rotations around the origin
in $\mathbf{R}^n$ (see Section 14.4 for a little more about SO(n)).
Discrete metric spaces. Similar concentration inequalities also hold in
many discrete metric spaces encountered in combinatorics. One of the simplest
examples is the n-dimensional Hamming cube $C_n = \{0, 1\}^n$. The points
are n-component vectors of 0's and 1's, and their Hamming distance is the
number of positions where they differ. The "volume" of a set $A \subseteq \{0, 1\}^n$
is defined as $P[A] = \frac{1}{2^n}|A|$. An r-ball B is the set of all 0/1 vectors that
differ from a given vector in at most r coordinates, and so its volume is
$P[B] = 2^{-n}\left(1 + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{r}\right)$. The isoperimetric inequality for the
Hamming cube, due to Harper, is exactly of the form announced above:
If $A \subseteq C_n$ is any set with $P[A] \ge P[B]$, then $P[A_t] \ge P[B_t]$.
Of course, if A is an r-ball, then $A_t$ is an (r+t)-ball and we have equality.
Suitable estimates (tail estimates for the binomial distribution in probability
theory) then give an analogue of Theorem 14.1.1:

14.2.3 Theorem (Measure concentration for the cube). Let $A \subseteq C_n$
satisfy $P[A] \ge \frac12$. Then $1 - P[A_t] \le e^{-t^2/2n}$.
This is very similar to the situation for $S^{n-1}$, only the scaling is different:
While the Hamming cube $C_n$ has diameter n, and the interesting range of t
is from about $\sqrt{n}$ to n, the sphere $S^{n-1}$ has diameter 2, and the interesting
t are in the range from about $1/\sqrt{n}$ to 2.
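For a concrete instance, one can take A to be the Hamming ball of radius ⌊n/2⌋ around the zero vector, so that $A_t$ is the ball of radius ⌊n/2⌋ + t, and evaluate both sides exactly with binomial coefficients. This is only a sanity check of the theorem on one family of sets, written as a small sketch; the helper names are introduced here for illustration.

```python
from math import comb, exp


def ball_measure(n: int, r: int) -> float:
    """P[B] for a Hamming ball of radius r in {0,1}^n (radius capped at n)."""
    r = min(r, n)
    return sum(comb(n, j) for j in range(r + 1)) / 2 ** n


def check_cube_concentration(n: int) -> bool:
    """Check 1 - P[A_t] <= exp(-t^2 / 2n) for A = ball of radius n // 2."""
    r = n // 2
    if ball_measure(n, r) < 0.5:  # hypothesis P[A] >= 1/2 of the theorem
        return False
    for t in range(n + 1):
        complement = 1 - ball_measure(n, r + t)
        if complement > exp(-t * t / (2 * n)) + 1e-12:
            return False
    return True


print(check_cube_concentration(20), check_cube_concentration(101))
```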
Another significant discrete metric space with similar measure concentration
is the space $S_n$ of all permutations of $\{1, 2, \ldots, n\}$ (i.e., bijective
mappings $\{1, 2, \ldots, n\} \to \{1, 2, \ldots, n\}$). The distance of two permutations $p_1$ and
$p_2$ is $|\{i: p_1(i) \ne p_2(i)\}|$, and the measure is the usual uniform probability
measure on $S_n$, where every single permutation has measure $\frac{1}{n!}$. Here a
measure concentration inequality reads $1 - P[A_t] \le e^{-(t-3\sqrt{n})^2/8n}$ for all $A \subseteq S_n$
with $P[A] \ge \frac12$. The expander graphs, to be discussed in Section 15.5, also
offer an example of spaces with measure concentration; see Exercise 15.5.7.

Bibliography and remarks. A modern treatment of measure concentration
is the book Ledoux [Led01], to which we refer for more
material and references. A concise introduction to concentration of
Lipschitz functions and discrete isoperimetric inequalities, including
some very recent material and combinatorial applications, is contained
in the second edition of the book by Alon and Spencer [AS00d]. Older
material on measure concentration in discrete metric spaces, with
martingale proofs and several combinatorial examples, can be found in
Bollobás's survey [Bol87]. For isoperimetric inequalities and measure
concentration on manifolds see also Gromov [Gro98] (or Gromov's ap-
pendix in [MS86]).
The Euclidean isoperimetric inequality (the ball has the smallest
surface for a given volume) has a long and involved history. It has been
"known" since antiquity, but full and rigorous proofs were obtained
only in the nineteenth century; see, e.g., Talenti [Tal93] for references.
The quick proof via Brunn-Minkowski is taken from Pisier [Pis89].
The exact isoperimetric inequality for the sphere was first proved
(according to [FLM77]) by Schmidt [Sch48]. Figiel, Lindenstrauss, and
Milman [FLM77] have a 3-page proof based on symmetrization.
Measure concentration on the sphere and on other spaces was first
recognized as an important general tool in the local theory of Banach
spaces, and its use was mainly pioneered by Milman. Several nice
surveys with numerous applications, mainly in Banach spaces but also
elsewhere, are available, such as Lindenstrauss [Lin92], Lindenstrauss
and Milman [LM93], Milman [Mil98], and some chapters of the book
Benyamini and Lindenstrauss [BL99].
The Gaussian isoperimetric inequality was obtained by Borell
[Bor75] and independently by Sudakov and Tsirel'son [ST74]. A proof
can also be found in Pisier [Pis89]. Ball [Bal97] derives a slightly weaker
version of the Gaussian concentration directly using the Prekopa-
Leindler inequality mentioned in the notes to Section 12.2. The ex-
act isoperimetric inequality for the Hamming cube is due to Harper
[Har66]. We will indicate a short proof of measure concentration for
product spaces, including the Hamming cube, in the notes to the next
section.
More recently, very significant progress was made in the area of
measure concentration and similar inequalities, especially on product
spaces, mainly associated with the name of Talagrand; see, for
instance, [Tal95] or the already mentioned book [Led01]. Talagrand's
proof method, which works by establishing suitable one-dimensional
inequalities and extending them to product spaces by a clever induc-
tion, also gives most of the concentration results previously obtained
with the help of martingales.

Many new isoperimetric and concentration inequalities, as well as
new proofs of known results, have been obtained by a function-theoretic
(as opposed to geometric) approach. Here concentration inequalities
are usually derived from other types of inequalities, such as
logarithmic Sobolev inequalities (estimating the entropy of a random
variable). One advantage of this is that while concentration inequalities
usually do not behave well under products, entropy estimates extend
to products automatically, and so it suffices to prove one-dimensional
versions.
Reverse isoperimetric inequality. The smallest possible surface area
of a set with given volume is determined by the isoperimetric inequal-
ity. In the other direction, the surface area can be arbitrarily large for
a given volume, but a meaningful question is obtained if one consid-
ers affine-equivalence classes of convex bodies. The following reverse
isoperimetric inequality was proved by Ball (see [Bal97] or [Bal]): For
every n-dimensional convex body C there exists an affine image $\tilde{C}$
of unit volume whose surface area is no larger than the surface area
of the n-dimensional unit-volume regular simplex. Among symmetric
convex bodies, the extremal body is the cube.

14.3 Concentration of Lipschitz Functions


Here we derive a form of the measure concentration that is very suitable for
applications. It says that any Lipschitz function on a high-dimensional sphere
is tightly concentrated around its expectation. (Any measurable real function
$f\colon S^{n-1} \to \mathbf{R}$ can be regarded as a random variable, and its expectation is
given by $E[f] = \int_{S^{n-1}} f(x)\,dP(x)$.)
We recall that a mapping $f$ between metric spaces is C-Lipschitz, where
$C > 0$ is a real number, if the distance of $f(x)$ and $f(y)$ is never larger than
$C$ times the distance of $x$ and $y$. We first show that a 1-Lipschitz function
$f\colon S^{n-1} \to \mathbf{R}$ is concentrated around its median. The median of a real-valued
function $f$ is defined as

\[
\operatorname{med}(f) = \sup\{t \in \mathbf{R}: P[f \le t] \le \tfrac12\}.
\]

Here P is the considered probability measure on the domain of $f$; in our
case, it is the normalized surface measure on $S^{n-1}$. The notation $P[f \le t]$
is the usual probability-theory shorthand for $P[\{x \in S^{n-1}: f(x) \le t\}]$. The
following lemma looks obvious, but an actual proof is perhaps not completely
obvious:
14.3.1 Lemma. Let $f\colon \Omega \to \mathbf{R}$ be a measurable function on a space $\Omega$ with
a probability measure P. Then

\[
P[f < \operatorname{med}(f)] \le \tfrac12 \quad\text{and}\quad P[f > \operatorname{med}(f)] \le \tfrac12.
\]



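Proof. The first inequality can be derived from the σ-additivity of the
measure P:

\[
P[f < \operatorname{med}(f)]
= \sum_{k=1}^{\infty} P\bigl[\operatorname{med}(f) - \tfrac{1}{k-1} < f \le \operatorname{med}(f) - \tfrac{1}{k}\bigr]
= \sup_{k \ge 1} P\bigl[f \le \operatorname{med}(f) - \tfrac{1}{k}\bigr] \le \tfrac12
\]

(for $k = 1$ the term $\frac{1}{k-1}$ is read as $\infty$, so the first summand is
$P[f \le \operatorname{med}(f) - 1]$).
The second inequality follows similarly. □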
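On a finite probability space the definition of the median and the statement of Lemma 14.3.1 can be checked directly. The sketch below assumes a finitely supported f described by a dictionary of exact probabilities; `median_sup` is a name introduced here. For such f the set $\{t: P[f \le t] \le \frac12\}$ is an open half-line ending at the first value whose cumulative probability exceeds $\frac12$, which is what the loop returns.

```python
from fractions import Fraction


def median_sup(dist: dict) -> object:
    """med(f) = sup{t : P[f <= t] <= 1/2} for a finitely supported f.

    dist maps each value of f to its probability (Fractions summing to 1).
    """
    cumulative = Fraction(0)
    for value in sorted(dist):
        cumulative += dist[value]
        if cumulative > Fraction(1, 2):
            return value
    raise ValueError("probabilities must sum to 1")


def lemma_holds(dist: dict) -> bool:
    """Check P[f < med(f)] <= 1/2 and P[f > med(f)] <= 1/2."""
    med = median_sup(dist)
    below = sum(p for v, p in dist.items() if v < med)
    above = sum(p for v, p in dist.items() if v > med)
    return below <= Fraction(1, 2) and above <= Fraction(1, 2)


coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
print(median_sup(coin), lemma_holds(coin))
```

For the fair coin the supremum definition picks the larger value, since $P[f \le t] = \frac12$ for every t in [0, 1).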
We are ready to prove that any 1-Lipschitz function $S^{n-1} \to \mathbf{R}$ is
concentrated around its median:
14.3.2 Theorem (Lévy's lemma). Let $f\colon S^{n-1} \to \mathbf{R}$ be 1-Lipschitz. Then
for all $t \in [0,1]$,
\[
P[f > \operatorname{med}(f) + t] \le 2e^{-t^2 n/2} \quad\text{and}\quad P[f < \operatorname{med}(f) - t] \le 2e^{-t^2 n/2}.
\]

For example, on 99% of $S^{n-1}$, the function $f$ attains values deviating
from $\operatorname{med}(f)$ by at most $3.5\,n^{-1/2}$.
Proof. We prove only the first inequality. Let $A = \{x \in S^{n-1}: f(x) \le
\operatorname{med}(f)\}$. By Lemma 14.3.1, $P[A] \ge \frac12$. Since $f$ is 1-Lipschitz, we have
$f(x) \le \operatorname{med}(f) + t$ for all $x \in A_t$. Therefore, by Theorem 14.1.1, we get
$P[f > \operatorname{med}(f) + t] \le 1 - P[A_t] \le 2e^{-t^2 n/2}$. □
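The constant 3.5 in the example comes from solving $4e^{-t^2 n/2} \le 0.01$ for $t\sqrt{n}$. The sketch below does that arithmetic and then, as an illustration only, samples the particular 1-Lipschitz function $f(x) = x_1$ (whose median is 0 by symmetry); the choice of this test function, the sample sizes, and the helper names are assumptions made here.

```python
import math
import random


def deviation_constant(fraction_outside: float) -> float:
    """Smallest c with 4 * exp(-c^2 / 2) <= fraction_outside, where t = c / sqrt(n)."""
    return math.sqrt(2 * math.log(4 / fraction_outside))


def random_unit_vector(n: int, rng: random.Random) -> list:
    """Uniform random point of S^(n-1): a normalized Gaussian vector."""
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def empirical_outside(n: int, samples: int, c: float, seed: int = 0) -> float:
    """Fraction of sampled points with |x_1 - 0| > c / sqrt(n)."""
    rng = random.Random(seed)
    t = c / math.sqrt(n)
    hits = sum(abs(random_unit_vector(n, rng)[0]) > t for _ in range(samples))
    return hits / samples


print(deviation_constant(0.01))          # a little below 3.5
print(empirical_outside(50, 4000, 3.5))  # should be well below 0.01
```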

The median is generally difficult to compute. But for a 1-Lipschitz function,
it cannot be too far from the expectation, which is usually easier to
estimate:
14.3.3 Proposition. Let $f\colon S^{n-1} \to \mathbf{R}$ be 1-Lipschitz. Then

\[
|\operatorname{med}(f) - E[f]| \le 12\,n^{-1/2}.
\]

Proof.
\[
|\operatorname{med}(f) - E[f]| \le E\bigl[|f - \operatorname{med}(f)|\bigr]
\le \sum_{k=0}^{\infty} \frac{k+1}{\sqrt{n}} \cdot P\Bigl[|f - \operatorname{med}(f)| \ge \frac{k}{\sqrt{n}}\Bigr]
\le n^{-1/2} \sum_{k=0}^{\infty} (k+1) \cdot 4e^{-k^2/2} \le 12\,n^{-1/2}
\]

(the numerical estimate of the last sum is not important; it is important that
it converges to some constant, which is obvious). □
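The numerical value of the last sum is easy to confirm. A short sketch, assuming the summand is $(k+1) \cdot 4e^{-k^2/2}$ as in the proof; truncating at 50 terms is a choice made here and is harmless because the terms decay superexponentially.

```python
import math


def levy_tail_sum(terms: int = 50) -> float:
    """Partial sum of (k + 1) * 4 * exp(-k^2 / 2) for k = 0, ..., terms - 1."""
    return sum((k + 1) * 4 * math.exp(-k * k / 2) for k in range(terms))


print(levy_tail_sum())  # roughly 10.66, comfortably below 12
```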

We derive a consequence of Lévy's lemma on finding k-dimensional
subspaces where a given Lipschitz function is almost constant. But first we need
some notions and results.

Random rotations and random subspaces. We want to speak about
a random k-dimensional (linear) subspace of $\mathbf{R}^n$. We thus need to specify a
probability measure on the set of all k-dimensional linear subspaces of $\mathbf{R}^n$
(so-called Grassmann manifold or Grassmannian). An elegant way of doing
this is via random rotations.
A rotation $\rho$ is an isometry of $\mathbf{R}^n$ fixing the origin and preserving the
orientation. In algebraic terms, $\rho$ is a linear mapping $x \mapsto Ax$ given by
an orthonormal matrix A with determinant 1. The result of performing the
rotation $\rho$ on the standard orthonormal basis $(e_1, \ldots, e_n)$ in $\mathbf{R}^n$ is an n-tuple
of orthonormal vectors, and these vectors are the columns of A.
The group of all rotations in $\mathbf{R}^n$ around the origin with the operation
of composition (corresponding to multiplication of the matrices) is denoted
by SO(n), which stands for the special orthogonal group. With the natural
topology (obtained by regarding the corresponding matrices as points in
$\mathbf{R}^{n^2}$), it is a compact group. By a general theorem in the theory of topological
groups, there is a unique Borel probability measure on SO(n) (the Haar
measure) that is invariant under the action of the elements of SO(n). Here is
a more concrete description of this probability measure. To obtain a random
rotation $\rho$, we first choose a vector $a_1 \in S^{n-1}$ uniformly at random. Then
we pick $a_2$ orthogonal to $a_1$; this $a_2$ is drawn from the uniform distribution
on the (n-2)-dimensional sphere that is the intersection of $S^{n-1}$ with the
hyperplane perpendicular to $a_1$ and passing through 0. Then $a_3$ is chosen
from the unit sphere within the (n-2)-dimensional subspace perpendicular
to $a_1$ and $a_2$, and so on.
In the sequel we need only the following intuitively obvious fact about
a random rotation $\rho \in SO(n)$: For every fixed $u \in S^{n-1}$, $\rho(u)$ is a random
vector of $S^{n-1}$. Therefore, if $u \in S^{n-1}$ is fixed, $A \subseteq S^{n-1}$ is measurable, and
$\rho \in SO(n)$ is random, then the probability of $\rho(u) \in A$ equals $P[A]$.
Let $L_0$ be the k-dimensional subspace spanned by the first k coordinate
vectors $e_1, e_2, \ldots, e_k$. A random k-dimensional linear subspace $L \subset \mathbf{R}^n$ can
be defined as $\rho(L_0)$, where $\rho \in SO(n)$ is a random rotation.
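The column-by-column description of a random rotation translates into a short program. The sketch below is one possible realization, under assumptions made here: each new column is a Gaussian vector projected onto the orthogonal complement of the columns already chosen and then normalized (which gives the uniform distribution on the corresponding sphere), and the sign of the last column is flipped if necessary so that the determinant is +1. The function names are illustrative. The first k columns then span a random k-dimensional subspace.

```python
import math
import random


def determinant(rows: list) -> float:
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in rows]
    n, det = len(m), 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            return 0.0
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det
        det *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return det


def random_rotation(n: int, rng: random.Random) -> list:
    """Return the columns a_1, ..., a_n of a random rotation matrix."""
    cols = []
    while len(cols) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for a in cols:  # remove the components along the chosen columns
            dot = sum(x * y for x, y in zip(v, a))
            v = [x - dot * y for x, y in zip(v, a)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-9:
            cols.append([x / norm for x in v])
    if determinant(cols) < 0:  # enforce orientation-preserving
        cols[-1] = [-x for x in cols[-1]]
    return cols


columns = random_rotation(5, random.Random(1))
random_subspace_basis = columns[:2]  # basis of a random 2-dimensional subspace
print(determinant(columns))
```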
By Lévy's lemma, a 1-Lipschitz function on $S^{n-1}$ is "almost constant" on
a subset A occupying almost all of $S^{n-1}$. Generally we do not know anything
about the shape of such an A. But the next proposition shows that the
almost-constant behavior can be guaranteed on the intersection of $S^{n-1}$ with
a linear subspace of $\mathbf{R}^n$ of relatively large dimension.

14.3.4 Proposition (Subspace where a Lipschitz function is almost
constant). Let $f\colon S^{n-1} \to \mathbf{R}$ be a 1-Lipschitz function and let $\delta \in (0,1]$.
Then there is a linear subspace $L \subseteq \mathbf{R}^n$ such that all values of $f$ restricted
to $S^{n-1} \cap L$ are in the interval $[\operatorname{med}(f) - \delta, \operatorname{med}(f) + \delta]$ and

\[
\dim L \ge \frac{\delta^2}{8\log(8/\delta)} \cdot n - 1.
\]

Proof. Let $L_0$ be the subspace spanned by the first $k = \bigl\lceil \frac{n\delta^2}{8\log(8/\delta)} - 1 \bigr\rceil$
coordinate vectors. Fix a $\frac{\delta}{2}$-net N (as defined above Lemma 13.1.1) in $S^{n-1} \cap L_0$.
Let $\rho \in SO(n)$ be a random rotation. For $x \in N$, $\rho(x)$ is a random point,
and so by Lévy's lemma, the probability that $|f(\rho(x)) - \operatorname{med}(f)| > \frac{\delta}{2}$ for
at least one point $x \in N$ is no more than $|N| \cdot 4e^{-\delta^2 n/8}$. Using the bound
$|N| \le (\frac{8}{\delta})^k$ from Lemma 13.1.1, we calculate that with a positive probability,
$|f(y) - \operatorname{med}(f)| \le \frac{\delta}{2}$ for all $y \in \rho(N)$.
We choose a $\rho$ with this property and let $L = \rho(L_0)$. For each $x \in S^{n-1} \cap L$,
there is some $y \in \rho(N)$ with $\|x - y\| \le \frac{\delta}{2}$, and since $f$ is 1-Lipschitz, we obtain
$|f(x) - \operatorname{med}(f)| \le |f(x) - f(y)| + |f(y) - \operatorname{med}(f)| \le \delta$. □
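To get a feeling for the sizes involved, the sketch below evaluates the dimension bound of Proposition 14.3.4 and the logarithm of the union bound $(8/\delta)^k \cdot 4e^{-\delta^2 n/8}$ used in the proof. It assumes that log denotes the natural logarithm; the function names and the sample values n = 10^6, δ = 0.5 are chosen here for illustration.

```python
import math


def dim_lower_bound(n: int, delta: float) -> float:
    """delta^2 / (8 log(8/delta)) * n - 1, with the natural logarithm assumed."""
    return delta ** 2 / (8 * math.log(8 / delta)) * n - 1


def log_failure_bound(n: int, delta: float, k: int) -> float:
    """Natural log of (8/delta)^k * 4 * exp(-delta^2 n / 8)."""
    return k * math.log(8 / delta) + math.log(4) - delta ** 2 * n / 8


n, delta = 10 ** 6, 0.5
k = math.floor(dim_lower_bound(n, delta))
print(k, log_failure_bound(n, delta, k))  # a negative log means probability < 1
```

For these sample values the guaranteed dimension is a little over eleven thousand, roughly 1% of n.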

Bibliography and remarks. Lévy's lemma and a measure concentration
result similar to Theorem 14.1.1 were found by Lévy [Lev51].
Analogues of Lévy's lemma for other spaces with measure concentration
follow by the same argument. On the other hand, a measure
concentration inequality for sets follows from concentration of Lipschitz
functions (a Lévy's lemma) on the considered space (Exercise 1).
For some spaces, concentration of Lipschitz functions can be proved
directly. Often this is done using martingales (see [Led01], [AS00d],
[MS86], [Bol87]). Here we outline a proof without martingales (following
[Led01]) for product spaces.
Let $\Omega$ be a space with a probability measure P and a metric $\rho$. The
Laplace functional $E = E_{\Omega,P,\rho}$ is a function $(0, \infty) \to \mathbf{R}$ defined by

\[
E(\lambda) = \sup\bigl\{E[e^{\lambda f}]: f\colon \Omega \to \mathbf{R} \text{ is 1-Lipschitz and } E[f] = 0\bigr\}.
\]

First we show that a bound on $E(\lambda)$ implies concentration of Lipschitz
functions. Assume that $E(\lambda) \le e^{a\lambda^2/2}$ for some $a > 0$ and all $\lambda > 0$,
and let $f\colon \Omega \to \mathbf{R}$ be 1-Lipschitz. We may suppose that $E[f] = 0$.
Using Markov's inequality for the random variable $Y = e^{\lambda f}$, we have
$P[f \ge t] = P[Y \ge e^{t\lambda}] \le E[Y]/e^{t\lambda} \le E(\lambda)/e^{t\lambda} \le e^{a\lambda^2/2 - \lambda t}$, and
setting $\lambda = \frac{t}{a}$ yields $P[f \ge t] \le e^{-t^2/2a}$.
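The choice of λ in the last step is the minimizer of the exponent $a\lambda^2/2 - \lambda t$, a quadratic in λ. A tiny sketch confirming this numerically; the function names and the sample values a = 2, t = 3 are introduced here for illustration.

```python
def chernoff_exponent(a: float, lam: float, t: float) -> float:
    """The exponent a * lam^2 / 2 - lam * t from the Markov-inequality bound."""
    return a * lam * lam / 2 - lam * t


def optimal_lambda(a: float, t: float) -> float:
    """Minimizer of the exponent over lam > 0 (vertex of the parabola)."""
    return t / a


a, t = 2.0, 3.0
best = optimal_lambda(a, t)
print(best, chernoff_exponent(a, best, t), -t * t / (2 * a))
```

The last two printed numbers agree, and any other λ gives a larger exponent, hence a weaker bound.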
Next, for some spaces, $E(\lambda)$ can be bounded directly. Here we show
that if $(\Omega, \rho)$ has diameter at most 1, then $E(\lambda) \le e^{\lambda^2/2}$. This can be
proved by the following elegant trick. First we note that $e^{E[f]} \le E[e^f]$
for any $f$, by Jensen's inequality in integral form, and so if $E[f] = 0$,
then $E[e^{-f}] \ge 1$. Then, for a 1-Lipschitz $f$ with $E[f] = 0$, we calculate

\[
\begin{aligned}
E[e^{\lambda f}] &= \int_{\Omega} e^{\lambda f(x)}\,dP(x) \\
&\le \Bigl(\int_{\Omega} e^{-\lambda f(y)}\,dP(y)\Bigr)\Bigl(\int_{\Omega} e^{\lambda f(x)}\,dP(x)\Bigr) \\
&= \int_{\Omega}\!\int_{\Omega} e^{\lambda(f(x)-f(y))}\,dP(x)\,dP(y) \\
&= \sum_{i=0}^{\infty} \int_{\Omega}\!\int_{\Omega} \frac{(\lambda(f(x)-f(y)))^i}{i!}\,dP(x)\,dP(y).
\end{aligned}
\]

For i even, we can bound the integrand by $\lambda^i/i!$, since $|f(x) - f(y)| \le 1$.
For odd i, the integral vanishes by symmetry. The resulting bound
is $\sum_{k=0}^{\infty} \lambda^{2k}/(2k)! \le e^{\lambda^2/2}$. (If the diameter is D, then we obtain
$E(\lambda) \le e^{D^2\lambda^2/2}$.)
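The final inequality compares the even part of the exponential series, which is cosh λ, with $e^{\lambda^2/2}$; it holds term by term because $(2k)! \ge 2^k k!$. A quick numerical confirmation, as a sketch with a truncation length chosen here:

```python
import math


def even_series(lam: float, terms: int = 60) -> float:
    """Partial sum of lam^(2k) / (2k)! for k = 0, ..., terms - 1 (this is cosh)."""
    return sum(lam ** (2 * k) / math.factorial(2 * k) for k in range(terms))


for lam in (0.1, 1.0, 3.0, 5.0):
    print(lam, even_series(lam), math.exp(lam * lam / 2))
```

In every row the second number is at most the third.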
Finally, we prove that the Laplace functional is submultiplicative.
Let $(\Omega_1, P_1, \rho_1)$ and $(\Omega_2, P_2, \rho_2)$ be spaces, let $\Omega = \Omega_1 \times \Omega_2$,
$P = P_1 \times P_2$, and $\rho = \rho_1 + \rho_2$ (that is, $\rho((x, y), (x', y')) = \rho_1(x, x') +
\rho_2(y, y')$). We claim that $E_{\Omega,P,\rho}(\lambda) \le E_{\Omega_1,P_1,\rho_1}(\lambda) \cdot E_{\Omega_2,P_2,\rho_2}(\lambda)$. To
verify this, let $f\colon \Omega \to \mathbf{R}$ be 1-Lipschitz with $E[f] = 0$, and set
$g(y) = E_x[f(x, y)] = \int_{\Omega_1} f(x, y)\,dP_1(x)$. We observe that $g$, being
a weighted average of 1-Lipschitz functions, is 1-Lipschitz. We have

\[
E[e^{\lambda f}] = \int_{\Omega_2} e^{\lambda g(y)} \Bigl(\int_{\Omega_1} e^{\lambda(f(x,y) - g(y))}\,dP_1(x)\Bigr) dP_2(y).
\]

The function $x \mapsto f(x, y) - g(y)$ is 1-Lipschitz and has zero expectation
for each y, and the inner integral is at most $E_{\Omega_1,P_1,\rho_1}(\lambda)$. Since $g$ is
1-Lipschitz and $E[g] = 0$, we have $\int_{\Omega_2} e^{\lambda g(y)}\,dP_2(y) \le E_{\Omega_2,P_2,\rho_2}(\lambda)$
and we are done.
By combining the above, we obtain, among others, that if each
of n spaces $(\Omega_i, P_i, \rho_i)$ has diameter at most 1 and $(\Omega, P, \rho)$ is the
product, then $P[f \ge E[f] + t] \le e^{-t^2/2n}$ for all 1-Lipschitz $f\colon \Omega \to \mathbf{R}$.
In particular, this applies to the Hamming cube.
Proposition 14.3.4 is due to Milman [Mil69], [Mil71].

Exercises
1. Derive the measure concentration on the sphere (Theorem 14.1.1) from
Levy's lemma. 0

14.4 Almost Spherical Sections: The First Steps


For a real number $t \ge 1$, we call a convex body K t-almost spherical if it
contains a (Euclidean) ball B of some radius r and it is contained in the
concentric ball of radius tr.

Given a centrally symmetric convex body $K \subset \mathbf{R}^n$ and $\varepsilon > 0$, we are
interested in finding a k-dimensional (linear) subspace L, with k as large as
possible, such that the "section" $K \cap L$ is $(1+\varepsilon)$-almost spherical.
Ellipsoids. First we deal with ellipsoids, where the existence of large
spherical sections is not very surprising. But in the sequel it gives us additional
freedom: Instead of looking for a $(1+\varepsilon)$-spherical section of a given convex
body, we can as well look for a $(1+\varepsilon)$-ellipsoidal section, while losing only a
factor of at most 2 in the dimension. This means that we are free to
transform a given body by any (nonsingular) affine map, which is often convenient.
Let us remark that in the local theory of Banach spaces, almost-ellipsoidal
sections are usually as good as almost-spherical ones, and so the following
lemma is often not even mentioned.

14.4.1 Lemma (Ellipsoids have large spherical sections). For any
(2k-1)-dimensional ellipsoid E, there is a k-flat L passing through the center
of E such that $E \cap L$ is a Euclidean ball.

Proof. Let $E = \bigl\{x \in \mathbf{R}^{2k-1}: \sum_{i=1}^{2k-1} \frac{x_i^2}{a_i^2} \le 1\bigr\}$ with $0 < a_1 \le a_2 \le \cdots \le
a_{2k-1}$. We define the k-dimensional linear subspace L by a system of k - 1
linear equations. The ith equation is

\[
x_i \sqrt{\frac{1}{a_i^2} - \frac{1}{a_k^2}} = x_{2k-i} \sqrt{\frac{1}{a_k^2} - \frac{1}{a_{2k-i}^2}},
\]

$i = 1, 2, \ldots, k-1$. It is chosen so that

\[
\frac{x_i^2}{a_i^2} + \frac{x_{2k-i}^2}{a_{2k-i}^2} = \frac{1}{a_k^2}\bigl(x_i^2 + x_{2k-i}^2\bigr)
\]

for $x \in L$. It follows that for $x \in L$, we have $x \in E$ if and only if $\|x\| \le a_k$,
and so $E \cap L$ is a ball of radius $a_k$. The reader is invited to find a geometric
meaning of this proof and/or express it in the language of eigenvalues. □
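The construction in the proof can be tested on a concrete ellipsoid. The sketch below assumes strictly increasing semi-axes (so that no equation degenerates) and builds a basis of L consisting of the kth coordinate vector and, for each i, one vector supported on coordinates i and 2k-i; it then checks that the quadratic form of E agrees with $\|x\|^2/a_k^2$ on L. The function names and the sample axes 1, 2, 3, 4, 5 are chosen here for illustration.

```python
import math
import random


def ball_section_basis(a: list) -> list:
    """Basis of the k-dimensional subspace L for semi-axes a_1 < ... < a_(2k-1)."""
    m = len(a)
    k = (m + 1) // 2
    mid = a[k - 1]
    e_k = [0.0] * m
    e_k[k - 1] = 1.0
    basis = [e_k]
    for i in range(k - 1):  # 0-based i is paired with j = m - 1 - i
        j = m - 1 - i
        c = math.sqrt(1 / a[i] ** 2 - 1 / mid ** 2)
        d = math.sqrt(1 / mid ** 2 - 1 / a[j] ** 2)
        v = [0.0] * m
        v[i], v[j] = d, c  # satisfies x_i * c = x_j * d
        basis.append(v)
    return basis


def ellipsoid_form(a: list, x: list) -> float:
    """The quadratic form sum x_i^2 / a_i^2 defining E."""
    return sum(xi * xi / ai ** 2 for xi, ai in zip(x, a))


axes = [1.0, 2.0, 3.0, 4.0, 5.0]  # k = 3, so a_k = 3
basis = ball_section_basis(axes)
rng = random.Random(0)
coeffs = [rng.uniform(-1, 1) for _ in basis]
x = [sum(c * b[t] for c, b in zip(coeffs, basis)) for t in range(len(axes))]
print(ellipsoid_form(axes, x), sum(t * t for t in x) / axes[2] ** 2)
```

The two printed values coincide, which is the statement that E ∩ L is a ball of radius $a_k$.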

To make formulas simpler, we consider only the case $\varepsilon = 1$ (2-almost
spherical sections) in the rest of this section. An arbitrary $\varepsilon > 0$ can always
be handled very similarly.

The cube. The cube $[-1, 1]^n$ is a good test case for finding almost-spherical
sections; it seems hard to imagine how a cube could have very round slices.
In some sense, this intuition is not totally wrong, since the almost-spherical
sections of a cube can have only logarithmic dimension, as we verify next.
(But the n-dimensional crosspolytope has $(1+\varepsilon)$-spherical sections of
dimension as high as $c(\varepsilon)n$, and yet it does not look any rounder than the cube; so
much for the intuition.)
The intersection of the cube with a k-dimensional linear subspace of $\mathbf{R}^n$
is a k-dimensional convex polytope with at most 2n facets.

14.4.2 Lemma. Let P be a k-dimensional 2-almost spherical convex polytope.
Then P has at least $\frac12 e^{k/8}$ facets.
Therefore, any 2-almost spherical section of the cube has dimension at
most O(log n).
Proof of Lemma 14.4.2. After a suitable affine transform, we may assume
$\frac12 B^k \subseteq P \subseteq B^k$. Each point $x \in S^{k-1}$ is separated from P by one of the facet
hyperplanes. For each facet F of P, the facet hyperplane $h_F$ cuts off a cap $C_F$
of $S^{k-1}$, and these caps together cover all of $S^{k-1}$. The cap $C_F$ is at distance
at least $\frac12$ from the hemisphere defined by the hyperplane $h'_F$ parallel to $h_F$
and passing through 0.

By Theorem 14.1.1 (measure concentration), we have $P[C_F] \le 2e^{-k/8}$. Since
the caps cover $S^{k-1}$, there are at least $1/(2e^{-k/8}) = \frac12 e^{k/8}$ of them. □
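Combining the lemma with the bound of 2n facets for a section of the cube gives an explicit ceiling on the dimension: $2n \ge \frac12 e^{k/8}$ forces $k \le 8\ln(4n)$. A one-function sketch of this arithmetic (the function name is introduced here):

```python
import math


def max_section_dim_of_cube(n: int) -> int:
    """Largest integer k compatible with 2n >= (1/2) * exp(k / 8)."""
    return math.floor(8 * math.log(4 * n))


print(max_section_dim_of_cube(1000))  # 66
```

So even in dimension 1000, a 2-almost spherical section of the cube has dimension at most 66 by this estimate.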

Next, we show that the n-dimensional cube actually does have 2-almost
spherical sections of dimension $\Omega(\log n)$. First we need a k-dimensional
2-almost spherical polytope with $4^k$ facets. We note that if P is a convex
polytope with $B^k \subset P \subset tB^k$, then the dual polytope $P^*$ satisfies $\frac{1}{t} B^k \subset
P^* \subset B^k$ (Exercise 1). So it suffices to construct a k-dimensional 2-almost
spherical polytope with $4^k$ vertices, and this was done in Section 13.3: We can
take any 1-net in $S^{k-1}$ as the vertex set. (Let us remark that an exponential
lower bound for the number of vertices also follows from Theorem 13.2.1.)
By at most doubling the number of facets, we may assume that our
k-dimensional 2-almost spherical polytope is centrally symmetric. It remains
to observe that every k-dimensional centrally symmetric convex polytope P
with 2n facets is an affine image of the section $[-1, 1]^n \cap L$ for a suitable
k-dimensional linear subspace $L \subseteq \mathbf{R}^n$. Indeed, such a P can be expressed as the
intersection $\bigcap_{i=1}^{n} \{x \in \mathbf{R}^k: |\langle a_i, x\rangle| \le 1\}$, where $\pm a_1, \ldots, \pm a_n$ are suitably
normalized normal vectors of the facets of P. Let $f\colon \mathbf{R}^k \to \mathbf{R}^n$ be the linear
map given by
\[
f(x) = (\langle a_1, x\rangle, \langle a_2, x\rangle, \ldots, \langle a_n, x\rangle).
\]
Since P is bounded, the $a_i$ span all of $\mathbf{R}^k$, and so $f$ has rank k. Consequently,
its image $L = f(\mathbf{R}^k)$ is a k-dimensional subspace of $\mathbf{R}^n$. We have $P =
f^{-1}([-1, 1]^n)$, and so the intersection $[-1, 1]^n \cap L$ is the affine image of P.
We see that the n-dimensional cube has 2-almost ellipsoidal sections of
dimension $\Omega(\log n)$ (as well as 2-almost spherical sections, by Lemma 14.4.1).
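The smallest interesting instance of this embedding is the regular hexagon (k = 2, with 2n = 6 facets), which appears as a planar section of the 3-dimensional cube. The sketch below uses a hexagon with inradius 1 as a hypothetical example polytope; the map `f` is the one from the text, and the particular normals, the plane equation $y_1 - y_2 + y_3 = 0$, and the helper names are worked out here for this example only.

```python
import math

# Unit normals of the three pairs of opposite sides of a regular hexagon
# with inradius 1 (an example chosen here: k = 2, n = 3).
NORMALS = [(math.cos(math.radians(d)), math.sin(math.radians(d))) for d in (0, 60, 120)]


def f(x: tuple) -> tuple:
    """The linear map x -> (<a_1, x>, <a_2, x>, <a_3, x>) from R^2 to R^3."""
    return tuple(ax * x[0] + ay * x[1] for ax, ay in NORMALS)


def in_cube(y: tuple) -> bool:
    """Membership in [-1, 1]^3, with a small tolerance for rounding."""
    return all(abs(c) <= 1 + 1e-12 for c in y)


RADIUS = 2 / math.sqrt(3)  # circumradius of the hexagon
HEX_VERTICES = [
    (RADIUS * math.cos(math.radians(30 + 60 * j)), RADIUS * math.sin(math.radians(30 + 60 * j)))
    for j in range(6)
]

for v in HEX_VERTICES:
    y = f(v)
    # Each image lies in the plane y1 - y2 + y3 = 0 and on the cube's boundary.
    print([round(c, 6) for c in y], round(y[0] - y[1] + y[2], 12), in_cube(y))
```

The six images are midpoints of edges of the cube, which is the familiar hexagonal cross-section.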
Next, we make preparatory steps for finding almost-spherical sections of
arbitrary centrally symmetric convex bodies. These considerations are most
conveniently formulated in the language of norms.
Reminder on norms. We recall that a norm on a real vector space Z is
a mapping that assigns a nonnegative real number $\|x\|_Z$ to each $x \in Z$ such
that $\|x\|_Z = 0$ implies $x = 0$, $\|\alpha x\|_Z = |\alpha| \cdot \|x\|_Z$ for all $\alpha \in \mathbf{R}$, and the
triangle inequality holds: $\|x + y\|_Z \le \|x\|_Z + \|y\|_Z$. (Since we have reserved
$\|\cdot\|$ for the Euclidean norm, we write other norms with various subscripts,
or occasionally we use the symbol $|\cdot|$.)
Norms are in one-to-one correspondence with closed bounded convex
bodies symmetric about 0 and containing 0 in their interior. Here we need only
one direction of this correspondence: Given a convex body K with the listed
properties, we assign to it the norm $\|\cdot\|_K$ given by

\[
\|x\|_K = \min\bigl\{t > 0: \tfrac{x}{t} \in K\bigr\} \quad (x \ne 0).
\]

Here is an illustration:
[Figure: a body K with a point x on its boundary, $\|x\|_K = 1$, and a point y with $\|y\|_K = 3$.]
It is easy to verify the axioms of the norm (the convexity of K is needed for
the triangle inequality). The body K is the unit ball of the norm $\|\cdot\|_K$. The
norm of points decreases by blowing up the body K.
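The definition of $\|x\|_K$ can be turned into a computation whenever membership in K can be tested. The sketch below assumes K is given by a membership oracle `contains` (a name introduced here) and finds the minimum t by bisection, which is valid because, for a convex K containing 0, the set of admissible t is a half-line. The squares used as examples are chosen here for illustration.

```python
def gauge(x: list, contains, tol: float = 1e-9) -> float:
    """Approximate ||x||_K = min{t > 0 : x / t in K} by bisection."""
    if all(c == 0 for c in x):
        return 0.0
    hi = 1.0
    while not contains([c / hi for c in x]):  # find some t with x / t in K
        hi *= 2
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if contains([c / mid for c in x]):
            hi = mid
        else:
            lo = mid
    return hi


def unit_square(p: list) -> bool:
    return max(abs(c) for c in p) <= 1


def doubled_square(p: list) -> bool:
    return max(abs(c) for c in p) <= 2


print(gauge([3.0, 1.0], unit_square), gauge([3.0, 1.0], doubled_square))
```

Blowing the square up by a factor of 2 halves the norm of the same point, in line with the last remark above.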
General body: the first attempt. Let $K \subset \mathbf{R}^n$ be a convex body defining
a norm (i.e., closed, bounded, symmetric, 0 in the interior). Let us define the
function $f_K\colon S^{n-1} \to \mathbf{R}$ as the restriction of the norm $\|\cdot\|_K$ on $S^{n-1}$; that
is, $f_K(x) = \|x\|_K$. We note that K is t-almost spherical if (and only if) there
is a number $a > 0$ such that $a \le f_K(x) \le ta$ for all $x \in S^{n-1}$. So for finding
a large almost-spherical section of K, we need a linear subspace L such that
$f_K$ does not vary too much on $S^{n-1} \cap L$, and this is where Proposition 14.3.4,
about subspaces where a Lipschitz function is almost constant, comes in.
Of course, that proposition has its assumptions, and one of them is that
$f_K$ is 1-Lipschitz. A sufficient condition for that is that K should contain the
unit ball:
14.4.3 Observation. Suppose that the convex body K contains the R-ball
B(0, R). Then $\|x\|_K \le \frac{1}{R}\|x\|$ for all x, and the function $x \mapsto \|x\|_K$ is
$\frac{1}{R}$-Lipschitz with respect to the Euclidean metric. □

Then we can easily prove the following result.

14.4.4 Proposition. Let $K \subset \mathbf{R}^n$ be a convex body defining a norm and
such that $B^n \subseteq K$, and let $m = \operatorname{med}(f_K)$, where $f_K$ is as above. Then there
exists a 2-almost-spherical section of K of dimension at least

\[
\Omega\Bigl(\frac{nm^2}{\log(24/m)}\Bigr).
\]

Proof. By Observation 14.4.3, $f_K$ is 1-Lipschitz. Let us set $\delta = \frac{m}{3}$ (note
that $B^n \subseteq K$ also implies $m \le 1$). Proposition 14.3.4 shows that there is a
subspace L such that $f_K \in [\frac23 m, \frac43 m]$ on $S^{n-1} \cap L$, where

\[
\dim L = \Omega\Bigl(\frac{n\delta^2}{\log(8/\delta)}\Bigr) = \Omega\Bigl(\frac{nm^2}{\log(24/m)}\Bigr). \tag{14.2}
\]

The section $K \cap L$ is 2-almost spherical. □
A slight improvement. It turns out that the factor log(24/m) in the result
just proved can be eliminated by a refined argument, which uses the fact that
$f_K$ comes from a norm.
14.4.5 Theorem. With the assumptions as in Proposition 14.4.4, a 2-almost
spherical section exists of dimension at least $\beta nm^2$, where $\beta > 0$ is an absolute
constant.

Proof. The main new observation is that for our fK' we can afford a much
less dense net N in the proof of Proposition 14.3.4. Namely, it suffices to let
N be a i-net in $S^{k-1}$, where $k = \lceil \beta m^2 n \rceil$.
If $\beta > 0$ is sufficiently small, Lévy's lemma gives the existence of a rotation
p such that ~~m :::; fK(Y) :::; ~~m for all Y E p(N); this is exactly as in the
proof of Proposition 14.3.4. It remains to verify ~m :::; fK(X) :::; tm for all
x E sn-l n L, where L = p(Lo). This is implied by the following claim with
a = ~~m and 1·1 = /I. 11K:
Claim. Let N be a i-net in Sk-l with respect to the Euclidean metric, and
let I . I be a norm on Rk satisfying ia :::; IYI :::; a for all yEN and for some
number a > O. Then ~a :::; Ixl :::; ~a for all x E Sk-l.

To prove the claim, we begin with the upper bound (this is where the
new trick lies). Let M = max{lxI: x E Sk-l} and let Xo E Sk-l be a point
where M is attained. Choose a Yo EN at distance at most ~ from xo, and let
z = (xo - yo)/llxo - yoll be the unit vector in the direction of Xo - Yo. Then
M = Ixol :::; Iyol + Ixo - yol :::; a + Ilxo - yoll . Izl :::; a + ~M. The resulting
inequality M :::; a + ~M yields M :::; ~a.
The lower bound is now routine: If x E Sk-l and yEN is at distance at
most ~ from it, then Ixl ~ Iyl - Ix - yl ~ ia - . ~ ~a ~ ~a. The claim, as
well as Theorem 14.4.5, is proved. D
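The bookkeeping in the claim can be written out for a generic net. The sketch below uses two symbols that are introduced here for illustration and are not taken from the text: η for the fineness of the net and γ for the relative tolerance on the net, so that the norm lies in $[(1-\gamma)m, (1+\gamma)m]$ on N. The upper bound follows from $M \le a + \eta M$ with $a = (1+\gamma)m$, and the lower bound from $|x| \ge |y| - \eta M$. The resulting interval is $[\frac23 m, \frac43 m]$ exactly when $3\gamma + 4\eta = 1$; the sample values η = 1/8, γ = 1/6 are one such choice and are an assumption of this sketch.

```python
from fractions import Fraction


def net_extension_bounds(eta: Fraction, gamma: Fraction) -> tuple:
    """Bounds (in units of m) for the norm on the whole sphere.

    eta:   fineness of the net (hypothetical parameter name)
    gamma: relative tolerance of the norm on the net (hypothetical)
    """
    a = 1 + gamma               # upper bound on the net
    upper = a / (1 - eta)       # from M <= a + eta * M
    lower = (1 - gamma) - eta * upper
    return lower, upper


lower, upper = net_extension_bounds(Fraction(1, 8), Fraction(1, 6))
print(lower, upper, upper / lower)
```

With these sample values the section is 2-almost spherical, since the ratio of the two bounds is exactly 2.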

Theorem 14.4.5 yields almost-spherical sections of K, provided that we
can estimate $\operatorname{med}(f_K)$ (after rescaling K so that $B^n \subseteq K$). We must warn
that this in itself does not yet give almost spherical sections for every K
(Dvoretzky's theorem), and another twist is needed, shown in Section 14.6.
But in order to reap some benefits from the hard work done up until now,
we first explain an application to convex polytopes.

Bibliography and remarks. As was remarked in the text,
almost-spherical and almost-ellipsoidal sections are seldom distinguished in
the local theory of Banach spaces, where symmetric convex bodies are
considered up to isomorphism, i.e., up to a nonsingular linear
transform. If $K_1$ and $K_2$ are symmetric convex bodies in $\mathbf{R}^n$, their
Banach-Mazur distance $d(K_1, K_2)$ is defined as the smallest positive t for which
there is a linear transform T such that $T(K_1) \subseteq K_2 \subseteq t \cdot T(K_1)$.
So a symmetric convex body K is t-almost ellipsoidal if and only if
$d(K, B^n) \le t$. It turns out that every two symmetric compact convex
bodies $K_1, K_2 \subset \mathbf{R}^n$ satisfy $d(K_1, K_2) \le \sqrt{n}$. The logarithm of the
Banach-Mazur distance is a metric on the space of compact symmetric
convex bodies in $\mathbf{R}^n$.
Lemma 14.4.1 appears in Dvoretzky [Dvo61]. Theorem 14.4.5 is
from Figiel, Lindenstrauss, and Milman [FLM77].
There are several ways of proving that the n-dimensional crosspolytope
has almost spherical sections of dimension $\Omega(n)$ (but, perhaps
surprisingly, no explicit construction of such a section seems to be
known). A method based on Theorem 14.4.5 is indicated in
Exercise 14.6.2. A somewhat more direct way, found by Schechtman, is
to let the section L be the image of the linear map $f\colon \mathbf{R}^{cn} \to \mathbf{R}^n$
whose matrix has entries ±1 chosen uniformly and independently at
random ($c > 0$ is a suitable small constant). The proof uses martingales
(Azuma's inequality); see, e.g., Milman and Schechtman [MS86].
The existence of a C-almost spherical section of dimension $\frac{n}{2}$, with a
suitable constant C, is a consequence of a theorem of Kashin: If $B_1$
denotes the crosspolytope and $\rho$ is a random rotation, then $B_1 \cap \rho(B_1)$ is
32-almost spherical with a positive probability; see Ball [Bal97] for an
insightful exposition. The previously mentioned methods do not
provide a dimension this large, but Kashin's result does not give
$(1+\varepsilon)$-almost spherical sections for small $\varepsilon$.

Exercises
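1. Let K be a convex body containing 0 in its interior. Check that $K \subseteq B^n$
if and only if $B^n \subseteq K^*$ (recall that $K^* = \{x \in \mathbf{R}^n: \langle x, y\rangle \le 1 \text{ for all } y \in
K\}$). Derive that if $B^k \subset K \subset tB^k$, then $\frac{1}{t} B^k \subset K^* \subset B^k$. [2]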

14.5 Many Faces of Symmetric Polytopes


Can an n-dimensional convex polytope have both few vertices and few facets?
Yes, an n-simplex has n+1 vertices and n+1 facets. What about a centrally
symmetric polytope? The n-dimensional cube has only 2n facets but $2^n$
vertices. Its dual, the crosspolytope (regular octahedron for n = 3), has few
vertices but many facets. It turns out that every centrally symmetric
polytope has many facets or many vertices.

14.5.1 Theorem. There is a constant $\alpha > 0$ such that for any centrally
symmetric n-dimensional convex polytope P, we have $\log f_0(P) \cdot \log f_{n-1}(P) \ge \alpha n$
(recall that $f_0(P)$ denotes the number of vertices and $f_{n-1}(P)$ the number
of facets).

For the cube, the expression $\log f_0(P) \cdot \log f_{n-1}(P)$ is about $n \log n$, which
is even slightly larger than the lower bound in the theorem. However,
polytopes can be constructed with both $\log f_0(P)$ and $\log f_{n-1}(P)$ bounded by
$O(\sqrt{n})$ (Exercise 1).
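For the cube the product in the theorem is explicit, since $f_0 = 2^n$ and $f_{n-1} = 2n$. A small sketch, with natural logarithms assumed and a function name introduced here:

```python
import math


def cube_log_product(n: int) -> float:
    """log f_0 * log f_(n-1) for the n-dimensional cube (natural logarithms)."""
    return math.log(2 ** n) * math.log(2 * n)


for n in (10, 100, 1000):
    print(n, cube_log_product(n), cube_log_product(n) / n)
```

The last column grows like log n, which is the sense in which the cube exceeds the linear lower bound only slightly.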
Proof of Theorem 14.5.1. We use the dual polytope $P^*$ with $f_0(P) =
f_{n-1}(P^*)$, and we prove the theorem in the equivalent form $\log f_{n-1}(P^*) \cdot
\log f_{n-1}(P) \ge \alpha n$.
John's lemma (Theorem 13.4.1) claims that for any symmetric convex
body K, there exists a (nonsingular) linear map that transforms K into a
$\sqrt{n}$-almost spherical body. We can thus assume that the considered
n-dimensional polytope P is $\sqrt{n}$-almost spherical (this is crucial for the proof).
After rescaling, we may suppose $B^n \subset P \subset \sqrt{n}\,B^n$. Letting $m = \operatorname{med}(f_P)$,
where $f_P$ is the restriction of $\|\cdot\|_P$ on $S^{n-1}$ as usual, Theorem 14.4.5 tells us
that there is a linear subspace L of $\mathbf{R}^n$ with $P \cap L$ being 2-almost spherical
and with $\dim(L) = \Omega(nm^2)$. Thus, since any k-dimensional 2-almost spherical
polytope has $e^{\Omega(k)}$ facets, we have $\log f_{n-1}(P) = \Omega(nm^2)$.
Now, we look at $P^*$. Since $B^n \subset P \subset \sqrt{n}\,B^n$, by Exercise 14.4.1 we have
$n^{-1/2} B^n \subset P^* \subset B^n$. In order to apply Theorem 14.4.5, we set $\tilde{P} = \sqrt{n}\,P^*$,
and obtain a 2-almost spherical section $\tilde{L}$ of $\tilde{P}$ of dimension $\Omega(n\tilde{m}^2)$, where
$\tilde{m} = \operatorname{med}(f_{\tilde{P}})$. This implies $\log f_{n-1}(P^*) = \Omega(n\tilde{m}^2)$.
It remains to observe the following inequality:

14.5.2 Lemma. Let P be a polytope in $\mathbf{R}^n$ defining a norm and let $P^*$ be
the dual polytope. Then we have $\operatorname{med}(f_P)\operatorname{med}(f_{P^*}) \ge 1$.
We leave the easy proof as Exercise 2. Since $\tilde{m} = \operatorname{med}(f_{P^*})/\sqrt{n}$, we finally
obtain

\[
\log f_{n-1}(P) \cdot \log f_{n-1}(P^*) = \Omega(n^2 m^2 \tilde{m}^2)
= \Omega\bigl(n \operatorname{med}(f_P)^2 \operatorname{med}(f_{P^*})^2\bigr) = \Omega(n).
\]
This concludes the proof of Theorem 14.5.1. □

Bibliography and remarks. Theorem 14.5.1, as well as the example
in Exercise 1, is due to Figiel, Lindenstrauss, and Milman [FLM77].
Most of the tools in the proof come from earlier papers of Milman
[Mil69], [Mil71].

Exercises
1. Construct an n-dimensional convex polytope P with $\log f_0(P) = O(\sqrt{n})$
and $\log f_{n-1}(P) = O(\sqrt{n})$, thereby demonstrating that Theorem 14.5.1
is asymptotically optimal. Start with the interval $[0,1] \subset \mathbf{R}^1$, and
alternate the operations $(\cdot)^*$ (passing to the dual polytope) and $\times$ (Cartesian
product of polytopes. [!]
The polytopes obtained from [0, 1] by a sequence of these operations are
called Hanner polytopes, and they form an important class of examples.

2. Let K be a bounded centrally symmetric convex body in $\mathbf{R}^n$ containing 0
in its interior, and let $K^*$ be the dual body.
(a) Show that $\|x\|_K \cdot \|x\|_{K^*} \ge 1$ for all $x \in S^{n-1}$. [1]
(b) Let $f, g\colon S^{n-1} \to \mathbf{R}$ be (measurable) functions with $f(x)g(x) \ge 1$ for
all $x \in S^{n-1}$. Show that $\operatorname{med}(f)\operatorname{med}(g) \ge 1$. [2]

14.6 Dvoretzky's Theorem


Here is the remarkable Ramsey-type result in high-dimensional convexity
promised at the beginning of this chapter.

14.6.1 Theorem (Dvoretzky's theorem). For any natural number k
and any real $\varepsilon > 0$, there exists an integer $n = n(k, \varepsilon)$ with the following
property. For any n-dimensional centrally symmetric convex body $K \subseteq \mathbf{R}^n$,
there exists a k-dimensional linear subspace $L \subseteq \mathbf{R}^n$ such that the section
$K \cap L$ is $(1+\varepsilon)$-almost spherical.
The best known estimates give $n(k, \varepsilon) = e^{O(k/\varepsilon^2)}$.

Thus, no matter how "edgy" a high-dimensional K may be, there is always


a slice of not too small dimension that is almost a Euclidean ball. Another
way of expressing the statement is that any normed space of a sufficiently
large dimension contains a large subspace on which the norm is very close
to the Euclidean norm (with a suitable choice of a coordinate system in the
subspace). Note that the Euclidean norm is the only norm with this universal
property, since all sections of the Euclidean ball are again Euclidean balls.
As we saw in Section 14.4, the n-dimensional cube shows that the largest
dimension of a 2-almost spherical section is only $O(\log n)$ in the worst case.
The assumption that K is symmetric can in fact be omitted; it suffices
to require that 0 be an interior point of K. The proof of this more general
version is not much more difficult than the one shown below.
We prove Dvoretzky's theorem only for $\varepsilon = 1$, since in Section 14.4 we
prepared the tools for this particular setting. But the general case is not very
different.
Preliminary considerations. Since affine transforms of K are practically
for free in view of Lemma 14.4.1, we may assume that $B^n \subseteq K \subseteq \sqrt{n}\,B^n$
by John's lemma (Theorem 13.4.1). So the norm induced by K satisfies
$n^{-1/2}\|x\| \le \|x\|_K \le \|x\|$ for all x. If $f_K$ is the restriction of $\|\cdot\|_K$ to
$S^{n-1}$, we have the obvious bound $\mathrm{med}(f_K) \ge n^{-1/2}$. Immediate application
of Theorem 14.4.5 shows the existence of a 2-almost spherical section
of K of dimension $\Omega(n\,\mathrm{med}(f_K)^2) = \Omega(1)$, so this approach gives nothing at
all! On the other hand, it only just fails, and a small improvement in the order of
magnitude of the lower bound for $\mathrm{med}(f_K)$ already yields Dvoretzky's theorem.
We will not try to improve the estimate for $\mathrm{med}(f_K)$ directly. Instead,
we find a relatively large subspace $Z \subset \mathbf{R}^n$ such that the section $K \cap Z$ can
be enclosed in a not too large parallelotope P. Then we estimate, by direct
computation, $\mathrm{med}(f_P)$ (over the unit sphere in Z).
The selection of the subspace Z is known as the Dvoretzky-Rogers lemma.
We present a version with a particularly simple proof, where $\dim Z \approx n/\log n$.
(For our purposes, we would be satisfied with even much weaker estimates,
say $\dim Z \ge n^{\delta}$ for some fixed $\delta > 0$, but on the other hand, another proof
gives even $\dim Z = n/2$.)
14.6.2 Lemma (A version of the Dvoretzky-Rogers lemma). Let
$K \subset \mathbf{R}^n$ be a centrally symmetric convex body. Then there exist a linear
subspace $Z \subset \mathbf{R}^n$ of dimension $k = \lfloor n/\log_2 n \rfloor$, an orthonormal basis
$u_1, u_2, \ldots, u_k$ of Z, and a nonsingular linear transform T of $\mathbf{R}^n$ such that
if we let $\tilde{K} = T(K) \cap Z$, then $\|x\|_{\tilde{K}} \le \|x\|$ for all $x \in Z$ and $\|u_i\|_{\tilde{K}} \ge \frac{1}{2}$ for
all $i = 1, 2, \ldots, k$.
Geometrically, the lemma asserts that $\tilde{K}$ is sandwiched between the unit ball
$B^k$ and a parallelotope P as in the picture:

(The lemma claims that the points $2u_i$ are outside of $\tilde{K}$ or on its boundary,
and P is obtained by separating these points from $\tilde{K}$ by hyperplanes.)
Proof. By John's lemma, we may assume $B^n \subseteq K \subseteq tB^n$, where $t = \sqrt{n}$.
Interestingly, the full power of John's lemma is not needed here; the same
proof works with, say, $t = n$ or $t = n^{10}$, only the bound for k would become
worse by a constant factor.
Let $X_0 = \mathbf{R}^n$ and $K_0 = K$. Here is the main idea of the proof. The
current body $K_i$ is enclosed between an inner ball and an outer ball. Either
$K_i$ approaches the inner ball sufficiently closely at "many" places, and in
this case we can construct the desired $u_1, \ldots, u_k$, or it stays away from the
inner ball on a "large" subspace. In the latter case, we can restrict to that
subspace and inflate the inner ball. But since the outer ball remains the
same, the inflation of the inner ball cannot continue indefinitely. A precise
argument follows; for notational reasons, instead of inflating the inner ball,
we will shrink the body and the outer ball.
We consider the following condition:
(*) Each linear subspace $Y \subseteq X_0$ with $\dim(X_0) - \dim(Y) < k$ contains
a vector u with $\|u\| = 1$ and $\|u\|_{K_0} \ge \frac{1}{2}$.
This condition may or may not be satisfied. If it holds, we construct the
orthonormal basis $u_1, u_2, \ldots, u_k$ by an obvious induction. If it is not satisfied,
we obtain a subspace $X_1$ of dimension greater than $n - k$ such that
$\|x\|_{K_0} \le \frac{1}{2}\|x\|$ for all $x \in X_1$. Thus, $K_0 \cap X_1$ is twice "more spherical" than $K_0$. Setting
$K_1 = \frac{1}{2}(K_0 \cap X_1)$, we have

$$\frac{2}{t}\,\|\cdot\| \le \|\cdot\|_{K_1} \le \|\cdot\|.$$

We again check the condition (*) with $X_1$ and $K_1$ instead of $X_0$ and $K_0$. If
it holds, we find the $u_i$ within $X_1$, and if it does not, we obtain a subspace
$X_2$ of dimension greater than $n - 2k$, etc. After the ith step, we have

$$\frac{2^i}{t}\,\|\cdot\| \le \|\cdot\|_{K_i} \le \|\cdot\|.$$

This construction cannot proceed all the way to step $i = i_0 = \lfloor \log_2 n \rfloor$, since
$2^{i_0} > t = \sqrt{n}$. Thus, the condition (*) must hold for $X_{i_0-1}$ at the latest. We
have $\dim X_{i_0-1} > n - (i_0-1)k \ge k$, and so the required basis $u_1, \ldots, u_k$ can
be constructed. □

The parallelotope is no worse than the cube. From now on, we work
within the subspace Z as in Lemma 14.6.2. For convenient notation, we assume
that Z is all of $\mathbf{R}^n$ and K is as $\tilde{K}$ in the above lemma, i.e., $B^n \subseteq K$
and $\|u_i\|_K \ge \frac{1}{2}$, $i = 1, 2, \ldots, n$, where $u_1, \ldots, u_n$ is an orthonormal basis of
$\mathbf{R}^n$. (Note that the reduction of the dimension from n to $n/\log n$ is nearly
insignificant for the estimate of $n(k, \varepsilon)$ in Dvoretzky's theorem.)
The goal is to show that $\mathrm{med}(f_K) = \Omega(\sqrt{(\log n)/n})$, where $f_K$ is $\|\cdot\|_K$
restricted to $S^{n-1}$. Instead of estimating $\mathrm{med}(f_K)$, we bound the expectation
$\mathbf{E}[f_K]$. Since $f_K$ is 1-Lipschitz (we have $B^n \subseteq K$), the difference
$|\mathrm{med}(f_K) - \mathbf{E}[f_K]|$ is $O(n^{-1/2})$ by Proposition 14.3.3, which is negligible compared to
the lower bound we are heading for.
We have $\|\cdot\|_K \ge \|\cdot\|_P$, where P is the parallelotope as in the illustration
to Lemma 14.6.2. So we actually bound $\mathbf{E}[f_P]$ from below.
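First we show, by an averaging trick, that $\mathbf{E}[f_P] \ge \mathbf{E}[f_C]$, where
$f_C(x) = \frac{1}{2}\|x\|_\infty = \frac{1}{2}\max_i |x_i|$ is the norm induced by the cube C of side 4. The idea
of the averaging is to consider, together with a point $x = \sum_{i=1}^n \alpha_i u_i \in S^{n-1}$,
the $2^n$ points of the form $\sum_{i=1}^n \sigma_i \alpha_i u_i$, where $\sigma \in \{-1, 1\}^n$ is a vector of
signs. For any measurable function $f_P\colon S^{n-1} \to \mathbf{R}$, we have

$$\int_{S^{n-1}} \sum_{\sigma \in \{-1,1\}^n} f_P\Bigl(\sum_{i=1}^n \sigma_i \alpha_i u_i\Bigr)\,dP(\alpha)
= \sum_{\sigma \in \{-1,1\}^n} \int_{S^{n-1}} f_P\Bigl(\sum_{i=1}^n \sigma_i \alpha_i u_i\Bigr)\,dP(\alpha)
= 2^n \int_{S^{n-1}} f_P(x)\,dP(x) = 2^n\,\mathbf{E}[f_P].$$

The following lemma with $v_i = \alpha_i u_i$ and $|\cdot| = \|\cdot\|_P$ implies that the integrand
on the left-hand side is always at least $2^n \max_i \|\alpha_i u_i\|_P \ge 2^n \cdot \frac{1}{2} \max_i |\alpha_i|$,
and so indeed $\mathbf{E}[f_P] \ge \mathbf{E}[f_C]$.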
14.6.3 Lemma. Let $v_1, v_2, \ldots, v_n$ be arbitrary vectors in a normed space
with norm $|\cdot|$. Then

$$\sum_{\sigma \in \{-1,1\}^n} \Bigl|\sum_{i=1}^n \sigma_i v_i\Bigr| \ge 2^n \max_i |v_i|.$$

The proof is left as Exercise 1. It remains to estimate $\mathbf{E}[f_C]$ from below.
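
The inequality of Lemma 14.6.3 can be checked numerically for small n by enumerating all sign vectors. The following is only a sanity check under stated assumptions, not a proof: the helper names (`sign_sum_total`, `lemma_gap`) and the choice of three test norms on R^3 are illustrative and do not come from the text.

```python
import itertools
import math
import random


def sign_sum_total(vectors, norm):
    """Sum of norm(sum_i s_i * v_i) over all sign vectors s in {-1, 1}^n."""
    dim = len(vectors[0])
    total = 0.0
    for signs in itertools.product((-1, 1), repeat=len(vectors)):
        combo = [sum(s * v[j] for s, v in zip(signs, vectors)) for j in range(dim)]
        total += norm(combo)
    return total


def l1_norm(x):
    return sum(abs(t) for t in x)


def l2_norm(x):
    return math.sqrt(sum(t * t for t in x))


def max_norm(x):
    return max(abs(t) for t in x)


def lemma_gap(vectors, norm):
    """Left-hand side minus right-hand side of Lemma 14.6.3 (should be >= 0)."""
    rhs = 2 ** len(vectors) * max(norm(v) for v in vectors)
    return sign_sum_total(vectors, norm) - rhs


# Six random vectors in R^3 (an arbitrary test instance).
random.seed(0)
sample = [[random.gauss(0.0, 1.0) for _ in range(3)] for _ in range(6)]
for norm in (l1_norm, l2_norm, max_norm):
    print(norm.__name__, lemma_gap(sample, norm) >= -1e-9)
```

The enumeration costs $2^n$ norm evaluations, so it is usable only for very small n.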

14.6.4 Lemma. For a suitable positive constant c and for all n we have

$$\mathbf{E}[f_C] = \frac{1}{2} \int_{S^{n-1}} \|x\|_\infty \,dP(x) \ge c\,\sqrt{\frac{\log n}{n}},$$

where $\|x\|_\infty = \max_i |x_i|$ is the $\ell_\infty$ (or maximum) norm.

Note that once this lemma is proved, Dvoretzky's theorem (with $\varepsilon = 1$)
follows from what we have already done and from Theorem 14.4.5.

Proof of Lemma 14.6.4. There are various proofs; a neat way is based
on the generally useful fact that the n-dimensional normal distribution is
spherically symmetric around the origin. We use probabilistic terminology.
Let $Z_1, Z_2, \ldots, Z_n$ be independent random variables, each of them with the
standard normal distribution N(0, 1). As was mentioned in Section 14.1, the
random vector $Z = (Z_1, Z_2, \ldots, Z_n)$ has a spherically symmetric (Gaussian)
distribution, and consequently, the random variable $Z/\|Z\|$ is uniformly distributed
in $S^{n-1}$. Thus

$$\mathbf{E}[f_C] = \frac{1}{2}\,\mathbf{E}\Bigl[\frac{\|Z\|_\infty}{\|Z\|}\Bigr].$$

We show first, that we have $\|Z\| \le \sqrt{3n}$ with probability at least $\frac{2}{3}$, and
second, that for a suitable constant $c_1 > 0$, $\|Z\|_\infty \ge c_1\sqrt{\log n}$ holds with
probability at least $\frac{2}{3}$. It follows that both these events occur simultaneously
with probability at least $\frac{1}{3}$, and so $\mathbf{E}[f_C] \ge c\sqrt{\log n / n}$ as claimed.
As for the Euclidean norm $\|Z\|$, we obtain $\mathbf{E}[\|Z\|^2] = n\,\mathbf{E}[Z_1^2] = n$,
since an N(0, 1) random variable has variance 1. By Markov's inequality,
$\mathrm{Prob}[\|Z\| \ge \sqrt{3n}] = \mathrm{Prob}[\|Z\|^2 \ge 3\,\mathbf{E}[\|Z\|^2]] \le \frac{1}{3}$.
Further, by the independence of the $Z_i$ we have

$$\mathrm{Prob}[\|Z\|_\infty \le z] = \mathrm{Prob}[|Z_i| \le z \text{ for all } i = 1, 2, \ldots, n]
= \mathrm{Prob}[|Z_1| \le z]^n = \Bigl(1 - \frac{2}{\sqrt{2\pi}} \int_z^\infty e^{-t^2/2}\,dt\Bigr)^n.$$

We can estimate $\int_z^\infty e^{-t^2/2}\,dt \ge \int_z^{z+1} e^{-t^2/2}\,dt \ge e^{-(z+1)^2/2}$. Thus, setting
$z = \sqrt{\ln n} - 1$, we have $\mathrm{Prob}[\|Z\|_\infty \le z] \le \bigl(1 - \frac{2}{\sqrt{2\pi}}\,n^{-1/2}\bigr)^n$, which is below
$\frac{1}{3}$ for sufficiently large n. Lemma 14.6.4 is proved. □
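
The order of magnitude in Lemma 14.6.4 can be observed in a small simulation that uses the same device as the proof: a uniform random point of the sphere is obtained by normalizing a standard Gaussian vector. This is a minimal Monte Carlo sketch; the function names, the sample sizes, and the seed are assumptions made for illustration. It estimates $\mathbf{E}[\|x\|_\infty]$ (that is, $2\,\mathbf{E}[f_C]$) and prints its ratio to $\sqrt{(\ln n)/n}$, which should stay bounded away from 0 as n grows.

```python
import math
import random


def random_unit_vector(n, rng):
    """Uniform random point of S^{n-1}: normalize a standard Gaussian vector."""
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    length = math.sqrt(sum(t * t for t in z))
    return [t / length for t in z]


def mean_max_norm(n, trials, rng):
    """Monte Carlo estimate of E[||x||_inf] for x uniform on S^{n-1}."""
    total = 0.0
    for _ in range(trials):
        total += max(abs(t) for t in random_unit_vector(n, rng))
    return total / trials


rng = random.Random(1)
for n in (50, 200, 800):
    estimate = mean_max_norm(n, 200, rng)
    # Ratio of the estimate to sqrt(ln n / n), the order claimed by the lemma.
    print(n, round(estimate / math.sqrt(math.log(n) / n), 3))
```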

Bibliography and remarks. Dvoretzky and Rogers [DR50] investigated
so-called unconditional convergence in infinite-dimensional
Banach spaces, and as an auxiliary result, they proved a statement
similar to Lemma 14.6.2, with the dimension of the subspace about
$\sqrt{n}$. They used the largest inscribed ellipsoid and a variational argument
(somewhat similar to the proof of John's lemma). The lemma
actually holds with an $\frac{n}{2}$-dimensional subspace; for a proof due to
Johnson, again using the largest inscribed ellipsoid, see Benyamini
and Lindenstrauss [BL99]. The proof of Lemma 14.6.2 presented in
this section is from Figiel, Lindenstrauss, and Milman [FLM77].
Dvoretzky's theorem was conjectured by Grothendieck [Gro56] and
first proved by Dvoretzky [Dvo59], [Dvo61]. His proof was quite complicated,
and the estimate for the dimension of the almost spherical
section was somewhat worse than that in Theorem 14.6.1. Since
then, several other proofs have appeared; see Lindenstrauss [Lin92]

for an insightful summary. The proof shown above essentially follows
Figiel et al. [FLM77], who improved and streamlined Milman's proof
[Mil71] based on measure concentration. A modern proof using measure
concentration for the Gaussian measure instead of that for the
sphere can be found in Pisier [Pis89]. Gordon [Gor88] has a proof with
more probability-theoretic flavor, using certain inequalities for Gaussian
random variables (an extension of the so-called Slepian's lemma).
The dependence of the dimension of the almost spherical section
on n is of order log n, which is tight, as we have seen. In terms of $\varepsilon$,
the proof presented gives a bound proportional to $\varepsilon^2/\log\frac{1}{\varepsilon}$, and the
best known general bound is proportional to $\varepsilon^2$ (Gordon [Gor88]).
A version of Dvoretzky's theorem for not necessarily symmetric
convex bodies was established by Larman and Mani [LM75], and Gordon's
proof [Gor88] is also formulated in this setting.
For $x \in \mathbf{R}^n$, let $\|x\|_p = (|x_1|^p + \cdots + |x_n|^p)^{1/p}$ denote the $\ell_p$-norm
of x. Here $p \in [1, \infty)$, and for the limit case $p = \infty$ we have
$\|x\|_\infty = \max_i |x_i|$. For not too large p, the unit balls of $\ell_p$-norms have
much larger almost spherical sections than is guaranteed by Dvoretzky's
theorem. For $p \in [1, 2]$, the dimension of a $(1+\varepsilon)$-almost spherical
section is $c_\varepsilon n$, and for $p \ge 2$, it is $c_\varepsilon n^{2/p}$. These results are obtained by
the probabilistic method, and no explicitly given sections with comparable
dimensions seem to be known; see, e.g., [MS86]. There are many
other estimates on the dimension of almost spherical sections, for example
in terms of the so-called type and cotype of a Banach space, as
well as bounds for the dimension of almost spherical projections. For
example, by a result of Milman, for any centrally symmetric n-dimensional
convex body K there is a section of an orthogonal projection
of K that is $(1+\varepsilon)$-almost spherical and has dimension at least $c(\varepsilon)n$
(which is surprising, since both for sections alone and for projections
alone the dimension of an almost spherical section can be only logarithmic).
Such things and much more information can be found in the
books Milman and Schechtman [MS86], Pisier [Pis89], and Tomczak-Jaegermann
[TJ89].

Exercises
1. Prove Lemma 14.6.3. [I]
2. (Large almost spherical sections of the crosspolytope) Use Theorem 14.4.5
and the method of the proof of Lemma 14.6.4 for proving that the n-dimensional
unit ball of the $\ell_1$-norm has a 2-almost spherical section of
dimension at least cn, for a suitable constant c > 0. [I]
15
Embedding Finite Metric
Spaces into Normed Spaces

15.1 Introduction: Approximate Embeddings


We recall that a metric space is a pair $(X, \rho)$, where X is a set and
$\rho\colon X \times X \to [0, \infty)$ is a metric, satisfying the following axioms: $\rho(x, y) = 0$ if and only if
$x = y$, $\rho(x, y) = \rho(y, x)$, and $\rho(x, y) + \rho(y, z) \ge \rho(x, z)$.
A metric $\rho$ on an n-point set X can be specified by an $n \times n$ matrix of
real numbers (actually $\binom{n}{2}$ numbers suffice because of the symmetry). Such
tables really arise, for example, in microbiology: X is a collection of bacterial
strains, and for every two strains, one can obtain their dissimilarity, which
is some measure of how much they differ. Dissimilarity can be computed
by assessing the reaction of the considered strains to various tests, or by
comparing their DNA, and so on. 1 It is difficult to see any structure in a
large table of numbers, and so we would like to represent a given metric
space in a more comprehensible way.
For example, it would be very nice if we could assign to each $x \in X$ a point
f(x) in the plane in such a way that $\rho(x, y)$ equals the Euclidean distance of
f(x) and f(y). Such representation would allow us to see the structure of the
metric space: tight clusters, isolated points, and so on. Another advantage
would be that the metric would now be represented by only 2n real numbers,
the coordinates of the n points in the plane, instead of $\binom{n}{2}$ numbers as before.
Moreover, many quantities concerning a point set in the plane can be
computed by efficient geometric algorithms, which are not available for an
arbitrary metric space.

1 There are various measures of dissimilarity, and not all of them yield a metric,
but many do.

This sounds very good, and indeed it is too good to be generally true: It
is easy to find examples of small metric spaces that cannot be represented in
this way by a planar point set. One example is 4 points, each two of them
at distance 1; such points cannot be found in the plane. On the other hand,
they exist in 3-dimensional Euclidean space.
Perhaps less obviously, there are 4-point metric spaces that cannot be
represented (exactly) in any Euclidean space. Here are two examples:

[Figure: two graphs on 4 vertices, a square (4-cycle) and a star with three leaves.]
The metrics on these 4-point sets are given by the indicated graphs; that is,
the distance of two points is the number of edges of a shortest path connecting
them in the graph. For example, in the second picture, the center has distance
1 from the leaves, and the mutual distances of the leaves are 2.
So far we have considered isometric embeddings. A mapping $f\colon X \to Y$,
where X is a metric space with a metric $\rho$ and Y is a metric space with
a metric $\sigma$, is called an isometric embedding if it preserves distances, i.e.,
if $\sigma(f(x), f(y)) = \rho(x, y)$ for all $x, y \in X$. But in many applications we
need not insist on preserving the distance exactly; rather, we can allow some
distortion, say by 10%. A notion of an approximate embedding is captured
by the following definition.
15.1.1 Definition (D-embedding of metric spaces). A mapping $f\colon X \to Y$,
where X is a metric space with a metric $\rho$ and Y is a metric space with
a metric $\sigma$, is called a D-embedding, where $D \ge 1$ is a real number, if there
exists a number $r > 0$ such that for all $x, y \in X$,

$$r \cdot \rho(x, y) \le \sigma(f(x), f(y)) \le D \cdot r \cdot \rho(x, y).$$

The infimum of the numbers D such that f is a D-embedding is called the
distortion of f.
Note that this definition permits scaling of all distances in the same ratio
r, in addition to the distortion of the individual distances by factors between
1 and D. If Y is a Euclidean space (or a normed space), we can rescale the
image at will, and so we can choose the scaling factor r at our convenience.
Mappings with a bounded distortion are sometimes called bi-Lipschitz
mappings. This is because the distortion of f can be equivalently defined using
the Lipschitz constants of f and of the inverse mapping $f^{-1}$. Namely, if we
define the Lipschitz norm of f by $\|f\|_{\mathrm{Lip}} = \sup\{\sigma(f(x), f(y))/\rho(x, y) : x, y \in X,\ x \ne y\}$,
then the distortion of f equals $\|f\|_{\mathrm{Lip}} \cdot \|f^{-1}\|_{\mathrm{Lip}}$.
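
For a finite metric space, the formula $\|f\|_{\mathrm{Lip}} \cdot \|f^{-1}\|_{\mathrm{Lip}}$ can be evaluated directly by looping over all pairs of points. The sketch below is an illustration under stated assumptions: the function `distortion`, the encoding of the star metric, and the particular planar placement of the star (center at the origin, leaves at the vertices of an equilateral triangle on the unit circle) are choices made here, not constructions from the text, and the map f is assumed to be injective.

```python
import itertools
import math


def distortion(points, rho, sigma, f):
    """Distortion ||f||_Lip * ||f^{-1}||_Lip of an injective map f on a finite metric space.

    points: list of hashable points of X; rho, sigma: distance functions on X and Y;
    f: dict mapping each point of X to its image in Y.
    """
    lip_f = 0.0    # largest expansion ratio sigma / rho
    lip_inv = 0.0  # largest contraction ratio rho / sigma
    for x, y in itertools.combinations(points, 2):
        image_dist = sigma(f[x], f[y])
        lip_f = max(lip_f, image_dist / rho(x, y))
        lip_inv = max(lip_inv, rho(x, y) / image_dist)
    return lip_f * lip_inv


def star_metric(x, y):
    """Shortest-path metric of the star with center "c" and leaves 0, 1, 2."""
    if x == y:
        return 0.0
    return 1.0 if "c" in (x, y) else 2.0


star_points = ["c", 0, 1, 2]
placement = {"c": (0.0, 0.0)}
for j in range(3):
    angle = 2.0 * math.pi * j / 3.0
    placement[j] = (math.cos(angle), math.sin(angle))

# Center-leaf distances are preserved, leaf-leaf distances shrink from 2 to sqrt(3).
print(distortion(star_points, star_metric, math.dist, placement))
```

For this placement the value is $2/\sqrt{3} \approx 1.155$: the expansion ratio is 1 and the largest contraction ratio is $2/\sqrt{3}$. Because the two factors are multiplied, rescaling the image does not change the result, in line with the remark on the scaling factor r.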
We are going to study the possibility of D-embedding of n-point metric
spaces into Euclidean spaces and into various normed spaces. As usual, we
cover only a small sample of results. Many of them are negative, showing
that certain metric spaces cannot be embedded too well. But in Section 15.2

we start on an optimistic note: We present a surprising positive result of
considerable theoretical and practical importance. Before that, we review a
few definitions concerning $\ell_p$-spaces.
The spaces $\ell_p$ and $\ell_p^d$. For a point $x \in \mathbf{R}^d$ and $p \in [1, \infty)$, let

$$\|x\|_p = \Bigl(\sum_{i=1}^d |x_i|^p\Bigr)^{1/p}$$

denote the $\ell_p$-norm of x. Most of the time, we will consider the case p = 2,
i.e., the usual Euclidean norm $\|x\|_2 = \|x\|$. Another particularly important
case is p = 1, the $\ell_1$-norm (sometimes called the Manhattan distance). The
$\ell_\infty$-norm, or maximum norm, is given by $\|x\|_\infty = \max_i |x_i|$. It is the limit of
the $\ell_p$-norms as $p \to \infty$.
Let $\ell_p^d$ denote the space $\mathbf{R}^d$ equipped with the $\ell_p$-norm. In particular, we
write $\ell_2^d$ in order to stress that we mean $\mathbf{R}^d$ with the usual Euclidean norm.
Sometimes we are interested in embeddings into some space $\ell_p^d$, with p
given but without restrictions on the dimension d; for example, we can ask
whether there exists some Euclidean space into which a given metric space
embeds isometrically. Then it is convenient to speak about $\ell_p$, which is the
space of all infinite sequences $x = (x_1, x_2, \ldots)$ of real numbers with $\|x\|_p < \infty$,
where $\|x\|_p = \bigl(\sum_{i=1}^\infty |x_i|^p\bigr)^{1/p}$. In particular, $\ell_2$ is the (separable) Hilbert
space. The space $\ell_p$ contains each $\ell_p^d$ isometrically, and it can be shown that
any finite metric space isometrically embeddable into $\ell_p$ can be isometrically
embedded into $\ell_p^d$ for some d. (In fact, every n-point subspace of $\ell_p$ can be
isometrically embedded into $\ell_p^d$ with $d \le \binom{n}{2}$; see Exercise 15.5.2.)
Although the spaces $\ell_p$ are interesting mathematical objects, we will not
really study them; we only use embeddability into $\ell_p$ as a convenient shorthand
for embeddability into $\ell_p^d$ for some d.

Bibliography and remarks. This chapter aims at providing an
overview of important results concerning low-distortion embeddings
of finite metric spaces. The scope is relatively narrow, and we almost
do not discuss even closely related areas, such as isometric embeddings.
Another recent survey, with fewer proofs and mainly focused
on algorithmic aspects, is Indyk [Ind01].
For studying approximate embeddings, it may certainly be helpful
to understand isometric embeddings, and here extensive theory is
available. For example, several ingenious characterizations of isometric
embeddability into $\ell_2$ can be found in old papers of Schoenberg (e.g.,
[Sch38], building on the work of mathematicians like Menger and von
Neumann). A recent book concerning isometric embeddings, and embeddings
into $\ell_1$ in particular, is Deza and Laurent [DL97].
Another closely related area is the investigation of bi-Lipschitz
maps, usually $(1+\varepsilon)$-embeddings with $\varepsilon > 0$ small, defined on an open

subset of a Euclidean space (or a Banach space) and being local homeomorphisms.
These mappings are called quasi-isometries (the definition
of a quasi-isometry is slightly more general, though), and the
main question is how close to an isometry such a mapping has to be,
in terms of the dimension and $\varepsilon$; see Benyamini and Lindenstrauss
[BL99], Chapters 14 and 15, for an introduction.

Exercises
1. Consider the two 4-point examples presented above (the square and the
star); prove that they cannot be isometrically embedded into $\ell_2$. ~ Can
you determine the minimum necessary distortion for embedding into $\ell_2$?
2. (a) Prove that a bijective mapping f between metric spaces is a D-embedding
if and only if $\|f\|_{\mathrm{Lip}} \cdot \|f^{-1}\|_{\mathrm{Lip}} \le D$. IT]
(b) Let $(X, \rho)$ be a metric space, $|X| \ge 3$. Prove that the distortion
of an embedding $f\colon X \to Y$, where $(Y, \sigma)$ is a metric space, equals the
supremum of the factors by which f "spoils" the ratios of distances; that
is,

$$\sup\Bigl\{\frac{\sigma(f(x), f(y))/\sigma(f(z), f(t))}{\rho(x, y)/\rho(z, t)} : x, y, z, t \in X,\ x \ne y,\ z \ne t\Bigr\}.$$

15.2 The Johnson-Lindenstrauss Flattening Lemma


It is easy to show that there is no isometric embedding of the vertex set
V of an n-dimensional regular simplex into a Euclidean space of dimension
k < n. In this sense, the (n+1)-point set $V \subset \ell_2^n$ is truly n-dimensional.
The situation changes drastically if we do not insist on exact isometry: As
we will see, the set V, and any other (n+1)-point set in $\ell_2^n$, can be almost
isometrically embedded into $\ell_2^k$ with $k = O(\log n)$ only!

15.2.1 Theorem (Johnson-Lindenstrauss flattening lemma). Let X
be an n-point set in a Euclidean space (i.e., $X \subset \ell_2$), and let $\varepsilon \in (0, 1]$
be given. Then there exists a $(1+\varepsilon)$-embedding of X into $\ell_2^k$, where
$k = O(\varepsilon^{-2} \log n)$.

This result shows that any metric question about n points in $\ell_2^n$ can
be considered for points in $\ell_2^{O(\log n)}$, if we do not mind a distortion of the
distances by at most 10%, say. For example, to represent n points of $\ell_2^n$ in
a computer, we need to store $n^2$ numbers. To store all of their distances, we
need about $n^2$ numbers as well. But by the flattening lemma, we can store
only $O(n \log n)$ numbers and still reconstruct any of the $n^2$ distances with
error at most 10%.

Various proofs of the flattening lemma, including the one below, provide
efficient randomized algorithms that find the almost isometric embedding
into $\ell_2^k$ quickly. Numerous algorithmic applications have recently been found:
in fast clustering of high-dimensional point sets, in approximate searching
for nearest neighbors, in approximate multiplication of matrices, and also in
purely graph-theoretic problems, such as approximating the bandwidth of a
graph or multicommodity flows.

The proof of Theorem 15.2.1 is based on the following lemma, of independent
interest.
15.2.2 Lemma (Concentration of the length of the projection). For
a unit vector $x \in S^{n-1}$, let

$$f(x) = \sqrt{x_1^2 + x_2^2 + \cdots + x_k^2}$$

be the length of the projection of x on the subspace $L_0$ spanned by the first
k coordinates. Consider $x \in S^{n-1}$ chosen at random. Then f(x) is sharply
concentrated around a suitable number m = m(n, k):

$$P[f(x) \ge m + t] \le 2e^{-t^2 n/2} \quad\text{and}\quad P[f(x) \le m - t] \le 2e^{-t^2 n/2},$$

where P is the uniform probability measure on $S^{n-1}$. For n larger than a
suitable constant and $k \ge 10 \ln n$, we have $m \ge \frac{1}{2}\sqrt{k/n}$.
In the lemma, the k-dimensional subspace is fixed and x is random. Equivalently,
if x is a fixed unit vector and L is a random k-dimensional subspace
of $\ell_2^n$ (as introduced in Section 14.3), the length of the projection of x on L
obeys the bounds in the lemma.
Proof of Lemma 15.2.2. The orthogonal projection $p\colon \ell_2^n \to \ell_2^k$ given by
$(x_1, \ldots, x_n) \mapsto (x_1, \ldots, x_k)$ is 1-Lipschitz, and so f is 1-Lipschitz as well.
Lévy's lemma (Theorem 14.3.2) gives the tail estimates as in the lemma
with m = med(f). It remains to establish the lower bound for m. It is not
impossibly difficult to do it by elementary calculation (we need to find the
measure of a simple region on $S^{n-1}$). But we can also avoid the calculation
by a trick combined with a general measure concentration result.
For random $x \in S^{n-1}$, we have $1 = \mathbf{E}[\|x\|^2] = \sum_{i=1}^n \mathbf{E}[x_i^2]$. By symmetry,
$\mathbf{E}[x_i^2] = \frac{1}{n}$, and so $\mathbf{E}[f^2] = \frac{k}{n}$. We now show that, since f is tightly
concentrated, $\mathbf{E}[f^2]$ cannot be much larger than $m^2$, and so m is not too
small.
For any $t \ge 0$, we can estimate

$$\frac{k}{n} = \mathbf{E}[f^2] \le P[f \le m + t] \cdot (m + t)^2 + P[f > m + t] \cdot \max_x \bigl(f(x)^2\bigr)
\le (m + t)^2 + 2e^{-t^2 n/2}.$$

Let us set $t = \sqrt{k/(5n)}$. Since $k \ge 10 \ln n$, we have $2e^{-t^2 n/2} \le \frac{2}{n}$, and from
the above inequality we calculate $m \ge \sqrt{(k-2)/n} - t \ge \frac{1}{2}\sqrt{k/n}$.
Let us remark that a more careful calculation shows that $m = \sqrt{k/n} + O(1/\sqrt{n})$
for all k. □
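
The concentration asserted in Lemma 15.2.2 is easy to observe empirically. The sketch below samples random unit vectors (by normalizing Gaussian vectors, as in Section 14.6) and compares the sample median of f with $\sqrt{k/n}$. The names `projection_length` and `sample_lengths`, the parameters n = 400 and k = 64, and the seed are illustrative assumptions; this is a simulation, not part of the proof.

```python
import math
import random
import statistics


def projection_length(n, k, rng):
    """f(x) for a uniform random x in S^{n-1}: length of the first k coordinates."""
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    head = sum(t * t for t in z[:k])
    return math.sqrt(head / sum(t * t for t in z))


def sample_lengths(n, k, trials, rng):
    return [projection_length(n, k, rng) for _ in range(trials)]


rng = random.Random(2)
n, k = 400, 64  # k >= 10 ln n (about 59.9), as the lemma requires
lengths = sample_lengths(n, k, 300, rng)
# Sample median of f next to sqrt(k/n); the lemma guarantees m >= 0.5 * sqrt(k/n).
print(statistics.median(lengths), math.sqrt(k / n))
```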

Proof of the flattening lemma (Theorem 15.2.1). We may assume
that n is sufficiently large. Let $X \subset \ell_2^n$ be a given n-point set. We set
$k = 200\,\varepsilon^{-2} \ln n$ (the constant can be improved). If $k \ge n$, there is nothing to
prove, so we assume k < n. Let L be a random k-dimensional linear subspace
of $\ell_2^n$ (obtained by a random rotation of $L_0$).
The chosen L is a copy of $\ell_2^k$. We let $p\colon \ell_2^n \to L$ be the orthogonal projection
onto L. Let m be the number around which $\|p(x)\|$ is concentrated, as in
Lemma 15.2.2. We prove that for any two distinct points $x, y \in \ell_2^n$, the
condition

$$(1 - \tfrac{\varepsilon}{3})\,m\,\|x - y\| \le \|p(x) - p(y)\| \le (1 + \tfrac{\varepsilon}{3})\,m\,\|x - y\| \qquad (15.1)$$

is violated with probability at most $n^{-2}$. Since there are fewer than $n^2$ pairs of
distinct $x, y \in X$, there exists some L such that (15.1) holds for all $x, y \in X$.
In such a case, the mapping p is a D-embedding of X into $\ell_2^k$ with
$D \le \frac{1+\varepsilon/3}{1-\varepsilon/3} \le 1 + \varepsilon$ (for $\varepsilon \le 1$).
Let x and y be fixed. First we reformulate the condition (15.1). Let $u = x - y$;
since p is a linear mapping, we have $p(x) - p(y) = p(u)$, and (15.1) can
be rewritten as $(1 - \frac{\varepsilon}{3})\,m\,\|u\| \le \|p(u)\| \le (1 + \frac{\varepsilon}{3})\,m\,\|u\|$. This is invariant under
scaling, and so we may suppose that $\|u\| = 1$. The condition thus becomes

$$\bigl|\,\|p(u)\| - m\,\bigr| \le \tfrac{\varepsilon}{3}\,m. \qquad (15.2)$$

By Lemma 15.2.2 and the remark following it, the probability of violating
(15.2), for u fixed and L random, is at most

$$4e^{-\varepsilon^2 m^2 n/18} \le 4e^{-\varepsilon^2 k/72} < n^{-2}.$$

This proves the Johnson-Lindenstrauss flattening lemma. □


Alternative proofs. There are several variations of the proof, which are
more suitable from the computational point of view (if we really want to
produce the embedding into $\ell_2^{O(\log n)}$).
In the above proof we project the set X on a random k-dimensional
subspace L. Such an L can be chosen by selecting an orthonormal basis
$(b_1, b_2, \ldots, b_k)$, where $b_1, \ldots, b_k$ is a random k-tuple of unit orthogonal
vectors. The coordinates of the projection of x to L are the scalar
products $\langle b_1, x\rangle, \ldots, \langle b_k, x\rangle$. It turns out that the condition of orthogonality
of the $b_i$ can be dropped. That is, we can pick unit vectors $b_1, \ldots, b_k \in S^{n-1}$
independently at random and define a mapping $p\colon X \to \ell_2^k$ by $x \mapsto$

$(\langle b_1, x\rangle, \ldots, \langle b_k, x\rangle)$. Using suitable concentration results, one can verify that
p is a $(1+\varepsilon)$-embedding with probability close to 1. The procedure of picking
the $b_i$ is computationally much simpler.
Another way is to choose each component of each $b_i$ from the normal
distribution N(0, 1), all the nk choices of the components being independent.
The distribution of each $b_i$ in $\mathbf{R}^n$ is rotationally symmetric (as was mentioned
in Section 14.1). Therefore, for every fixed $u \in S^{n-1}$, the scalar product $\langle b_i, u\rangle$
also has the normal distribution N(0, 1) and $\|p(u)\|^2$, the squared length of
the image, has the distribution of $\sum_{i=1}^k Z_i^2$, where the $Z_i$ are independent
N(0, 1). This is the well known Chi-Square distribution with k degrees of
freedom, and a strong concentration result analogous to Lemma 15.2.2 can
be found in books on probability theory (or derived from general measure-concentration
results for the Gaussian measure or from Chernoff-type tail
estimates). A still different method, particularly easy to implement but with
a more difficult proof, uses independent random vectors $b_i \in \{-1, 1\}^n$.
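
The variant with independent N(0, 1) components can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not a tuned implementation: the function names, the test data, and the target dimension k = 200 are chosen here for illustration and do not follow the bound $k = 200\,\varepsilon^{-2} \ln n$ from the proof, and the factor $1/\sqrt{k}$ is added only so that lengths are preserved on average (the distortion itself does not depend on this scaling).

```python
import itertools
import math
import random


def gaussian_projection(points, k, rng):
    """Map points of R^n to R^k by x -> (1/sqrt(k)) * (<b_1, x>, ..., <b_k, x>),
    where the vectors b_i have independent N(0, 1) components."""
    n = len(points[0])
    basis = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    scale = 1.0 / math.sqrt(k)
    return [[scale * sum(b_j * x_j for b_j, x_j in zip(b, x)) for b in basis]
            for x in points]


def pairwise_distortion(points, images):
    """Largest ratio of image distance to original distance, divided by the smallest."""
    ratios = [math.dist(images[i], images[j]) / math.dist(points[i], points[j])
              for i, j in itertools.combinations(range(len(points)), 2)]
    return max(ratios) / min(ratios)


rng = random.Random(3)
points = [[rng.uniform(-1.0, 1.0) for _ in range(300)] for _ in range(15)]
images = gaussian_projection(points, 200, rng)
print(pairwise_distortion(points, images))
```

The returned quantity is the distortion of the map restricted to the given point set in the sense of Definition 15.1.1, so a value close to 1 means that all pairwise distances are scaled by nearly the same factor.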

Bibliography and remarks. The flattening lemma is from Johnson
and Lindenstrauss [JL84]. They were interested in the following
question: Given a metric space Y, an n-point subspace $X \subset Y$, and a
1-Lipschitz mapping $f\colon X \to \ell_2$, what is the smallest C = C(n) such
that there is always a C-Lipschitz mapping $\bar{f}\colon Y \to \ell_2$ extending f?
They obtained the upper bound $C = O(\sqrt{\log n})$, together with an
almost matching lower bound.
The alternative proof of the flattening lemma using independent
normal random variables was given by Indyk and Motwani [IM98]. A
streamlined exposition of a similar proof can be found in Dasgupta and
Gupta [DG99]. For more general concentration results and techniques
using the Gaussian distribution see, e.g., [Pis89], [MS86].
Achlioptas [Ach01] proved that the components of the $b_i$ can also
be chosen as independent uniform ±1 random variables. Here the distribution
of $\langle b_i, u\rangle$ does depend on u but the proof shows that for every
$u \in S^{n-1}$, the concentration of $\|p(u)\|^2$ is at least as strong as in the
case of the normally distributed $b_i$. This is established by analyzing
higher moments of the distribution.
The sharpest known upper bound on the dimension needed for a
$(1+\varepsilon)$-embedding of an n-point Euclidean metric is $\frac{8}{\varepsilon^2}(1 + o(1)) \ln n$,
where o(1) is with respect to $\varepsilon \to 0$ [IM98], [DG99], [Ach01]. The
main term is optimal for the current proof method; see Exercises 3
and 15.3.4.
The Johnson-Lindenstrauss flattening lemma has been applied
in many algorithms, both in theory and practice; see the survey
[Ind01] or, for example, Kleinberg [Kle97], Indyk and Motwani [IM98],
Borodin, Ostrovsky, and Rabani [BOR99].

Exercises
1. Let $x, y \in S^{n-1}$ be two points chosen independently and uniformly at
random. Estimate their expected (Euclidean) distance, assuming that n
is large. 0
2. Let $L \subseteq \mathbf{R}^n$ be a fixed k-dimensional linear subspace and let x be a
random point of $S^{n-1}$. Estimate the expected distance of x from L, assuming
that n is large. 0
3. (Lower bound for the flattening lemma)
(a) Consider the n+1 points $0, e_1, e_2, \ldots, e_n \in \mathbf{R}^n$ (where the $e_i$ are the
vectors of the standard orthonormal basis). Check that if these points
with their Euclidean distances are $(1+\varepsilon)$-embedded into $\ell_2^k$, then there
exist unit vectors $v_1, v_2, \ldots, v_n \in \mathbf{R}^k$ with $|\langle v_i, v_j\rangle| \le 100\varepsilon$ for all $i \ne j$
(the constant can be improved). ~
(b) Let A be an $n \times n$ symmetric real matrix with $a_{ii} = 1$ for all i and
$|a_{ij}| \le n^{-1/2}$ for all i, j, $i \ne j$. Prove that A has rank at least $\frac{n}{2}$. [:±J
(c) Let A be an $n \times n$ real matrix of rank d, let k be a positive integer,
and let B be the $n \times n$ matrix with $b_{ij} = a_{ij}^k$. Prove that the rank of B is
at most $\binom{k+d}{k}$. [:±J
(d) Using (a)-(c), prove that if the set as in (a) is $(1+\varepsilon)$-embedded into
$\ell_2^k$, where $100\,n^{-1/2} \le \varepsilon \le \frac{1}{2}$, then

$$k = \Omega\Bigl(\frac{1}{\varepsilon^2 \log\frac{1}{\varepsilon}}\,\log n\Bigr).$$ 0
This proof is due to Alon (unpublished manuscript, Tel Aviv University).

15.3 Lower Bounds By Counting


In this section we explain a construction providing many "essentially different"
n-point metric spaces, and we derive a general lower bound on the
minimum distortion required to embed all these spaces into a d-dimensional
normed space. The key ingredient is a construction of graphs without short
cycles.
Graphs without short cycles. The girth of a graph G is the length of
the shortest cycle in G. Let $m(\ell, n)$ denote the maximum possible number
of edges of a simple graph on n vertices containing no cycle of length $\ell$ or
shorter, i.e., with girth at least $\ell+1$.
We have $m(2, n) = \binom{n}{2}$, since the complete graph $K_n$ has girth 3. Next,
m(3, n) is the maximum number of edges of a triangle-free graph on n vertices,
and it equals $\lfloor n/2 \rfloor \cdot \lceil n/2 \rceil$ by Turán's theorem; the extremal example is the
complete bipartite graph $K_{\lfloor n/2 \rfloor, \lceil n/2 \rceil}$. Another simple observation is that for
all k, $m(2k+1, n) \ge \frac{1}{2}\,m(2k, n)$. This is because any graph G has a bipartite

subgraph H that contains at least half of the edges of G (see footnote 2). So it suffices to
care about even cycles and to consider $\ell$ even, remembering that the bounds
for $\ell = 2k$ and $\ell = 2k+1$ are almost the same up to a factor of 2.
Here is a simple general upper bound on $m(\ell, n)$.

15.3.1 Lemma. For all n and $\ell$,

$$m(\ell, n) \le n^{1 + 1/\lfloor \ell/2 \rfloor} + n.$$

Proof. It suffices to consider even $\ell = 2k$. Let G be a graph with n vertices
and m = m(2k, n) edges. The average degree is $\bar{d} = 2m/n$. There is a subgraph
$H \subseteq G$ with minimum degree at least $\delta = \frac{1}{2}\bar{d}$. Indeed, by deleting a vertex
of degree smaller than $\delta$ the average degree does not decrease, and so H can
be obtained by a repeated deletion of such vertices.
Let $v_0$ be a vertex of H. The crucial observation is that, since H has no
cycle of length 2k or shorter, the subgraph of H induced by all vertices at
distance at most k from $v_0$ is a tree:

The number of vertices in this tree is at least
$1 + \delta + \delta(\delta-1) + \cdots + \delta(\delta-1)^{k-1} \ge (\delta-1)^k$, and this is no more than n.
So $\delta \le n^{1/k} + 1$ and $m = \frac{1}{2}\bar{d}\,n = \delta n \le n^{1+1/k} + n$. □
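
The first step of the proof, passing to a subgraph of minimum degree at least $\delta = \frac{1}{2}\bar{d} = m/n$ by repeatedly deleting vertices of smaller degree, is a simple procedure. The following sketch is one possible implementation; the function name `min_degree_subgraph` and the representation of the graph (vertices 0, ..., n-1 and a list of edges) are assumptions made for illustration.

```python
import random


def min_degree_subgraph(n, edges):
    """Repeatedly delete vertices of degree < delta, where delta = m/n is half
    the average degree. Returns (delta, adjacency dict of the remaining graph)."""
    delta = len(edges) / n
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    queue = [v for v in adj if len(adj[v]) < delta]
    while queue:
        v = queue.pop()
        if v not in adj:
            continue  # already deleted earlier
        for w in adj.pop(v):
            adj[w].discard(v)
            if len(adj[w]) < delta:
                queue.append(w)
    return delta, adj


# A random graph on 30 vertices as an arbitrary test instance.
rng = random.Random(5)
edges = [(u, v) for u in range(30) for v in range(u + 1, 30) if rng.random() < 0.2]
delta, remaining = min_degree_subgraph(30, edges)
print(len(remaining), min(len(nb) for nb in remaining.values()) >= delta)
```

As in the proof, each deletion removes fewer than m/n edges and one vertex, so the average degree does not decrease and the remaining graph is nonempty whenever the input has at least one edge.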
This simple argument yields essentially the best known upper bound.
But it was asymptotically matched only for a few small values of $\ell$, namely,
for $\ell \in \{4, 5, 6, 7, 10, 11\}$. For m(4, n) and m(5, n), we need bipartite graphs
without $K_{2,2}$; these were briefly discussed in Section 4.5, and we recall that
they can have up to $n^{3/2}$ edges, as is witnessed by the finite projective plane.
The remaining listed cases use clever algebraic constructions.
For the other $\ell$, the record is also held by algebraic constructions; they
are not difficult to describe, but proving that they work needs quite deep
mathematics. For all $\ell \equiv 1 \pmod 4$ (and not on the list above), they yield
$m(\ell, n) = \Omega(n^{1+4/(3\ell-7)})$, while for $\ell \equiv 3 \pmod 4$, they lead to
$m(\ell, n) = \Omega(n^{1+4/(3\ell-9)})$.
Here we prove a weaker but simple lower bound by the probabilistic
method.

² To see this, divide the vertices of $G$ into two classes $A$ and $B$ arbitrarily, and
while there is a vertex in one of the classes having more neighbors in its class
than in the other class, move such a vertex to the other class; the number of
edges between $A$ and $B$ increases in each step. For another proof, assign each
vertex randomly to $A$ or $B$ and check that the expected number of edges between
$A$ and $B$ is $\frac{1}{2}|E(G)|$.
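
The local-improvement argument in the footnote can be sketched in code. This is an illustrative sketch under stated assumptions: the function name `bipartite_half` and the edge-list representation are hypothetical, not from the text. A vertex is moved whenever it has more neighbors in its own class than in the other; each move increases the number of crossing edges, so the loop terminates.

```python
from itertools import combinations


def bipartite_half(vertices, edges):
    """Split the vertices into classes 0/1 by local improvement.

    Returns the class assignment and the list of edges going between
    the two classes (the edges of the bipartite subgraph H).
    """
    side = {v: 0 for v in vertices}          # arbitrary start: all in class 0
    nbrs = {v: [] for v in vertices}
    for u, v in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)
    moved = True
    while moved:
        moved = False
        for v in vertices:
            same = sum(1 for w in nbrs[v] if side[w] == side[v])
            if same > len(nbrs[v]) - same:   # more neighbours in own class
                side[v] = 1 - side[v]        # move v to the other class
                moved = True
    cut = [(u, v) for u, v in edges if side[u] != side[v]]
    return side, cut


# Hypothetical example: the complete graph K5 (10 edges).
k5_vertices = list(range(5))
k5_edges = list(combinations(k5_vertices, 2))
side, cut = bipartite_half(k5_vertices, k5_edges)
```

When the loop stops, every vertex has at least half of its neighbors in the other class, which gives at least $\frac{1}{2}|E(G)|$ crossing edges.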

15.3.2 Lemma. For all $\ell \ge 3$ and $n \ge 2$, we have

$$m(\ell, n) \ge \tfrac{1}{9}\, n^{1+1/(\ell-1)}.$$

Of course, for odd $\ell$ we obtain an $\Omega(n^{1+1/(\ell-2)})$ bound by using the lemma
for $\ell-1$.
Proof. First we note that we may assume $n \ge 4^{\ell-1} \ge 16$, for otherwise, the
bound in the lemma is verified by a path, say.
We consider the random graph $G(n, p)$ with $n$ vertices, where each of the
$\binom{n}{2}$ possible edges is present with probability $p$, $0 < p < 1$, and these choices
are mutually independent. The value of $p$ is going to be chosen later.
Let $E$ be the set of edges of $G(n, p)$ and let $F \subseteq E$ be the edges contained
in cycles of length $\ell$ or shorter. By deleting all edges of $F$ from $G(n, p)$, we
obtain a graph with no cycles of length $\ell$ or shorter. If we manage to show,
for some $m$, that the expectation $\mathbf{E}[|E \setminus F|]$ is at least $m$, then there is an
instance of $G(n, p)$ with $|E \setminus F| \ge m$, and so there exists a graph with $n$
vertices, $m$ edges, and of girth greater than $\ell$.
We have $\mathbf{E}[|E|] = \binom{n}{2} p$. What is the probability that a fixed pair $e =
\{u, v\}$ of vertices is an edge of $F$? First, $e$ must be an edge of $G(n, p)$, which
has probability $p$, and second, there must be a path of length between 2 and
$\ell-1$ connecting $u$ and $v$. The probability that all the edges of a given potential
path of length $k$ are present is $p^k$, and there are fewer than $n^{k-1}$ possible
paths from $u$ to $v$ of length $k$. Therefore, the probability of $e \in F$ is at most
$\sum_{k=2}^{\ell-1} p^{k+1} n^{k-1}$, which can be bounded by $2p^{\ell} n^{\ell-2}$, provided that $np \ge 2$.
Then $\mathbf{E}[|F|] \le \binom{n}{2} \cdot 2p^{\ell} n^{\ell-2}$, and by the linearity of expectation, we have

$$\mathbf{E}[|E \setminus F|] = \mathbf{E}[|E|] - \mathbf{E}[|F|] \ge \binom{n}{2}\, p \left(1 - 2p^{\ell-1} n^{\ell-2}\right).$$

Now, we maximize this expression as a function of $p$; a somewhat rough but
simple choice is $p = \frac{n^{1/(\ell-1)}}{2n}$, which leads to $\mathbf{E}[|E \setminus F|] \ge \frac{1}{9} n^{1+1/(\ell-1)}$ (the
constant can be improved somewhat). The assumption $np \ge 2$ follows from
$n \ge 4^{\ell-1}$. Lemma 15.3.2 is proved. □
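
The deletion argument in this proof is easy to try out experimentally. The sketch below is illustrative only: the function names, the fixed seed, and the parameters $n = 200$, $\ell = 4$ are assumptions for the example. It samples $G(n, p)$ with the value of $p$ chosen in the proof and deletes every edge lying on a cycle of length at most $\ell$. An edge $\{u, v\}$ lies on such a cycle exactly when $u$ and $v$ are joined by a path of length at most $\ell-1$ avoiding that edge, which is tested by a depth-limited breadth-first search.

```python
import itertools
import random
from collections import deque


def short_cycle_edges(n, edges, ell):
    """Return the set of edges that lie on some cycle of length <= ell."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    bad = set()
    for u, v in edges:
        # BFS from u, never using the edge {u, v}, to depth ell - 1.
        dist = {u: 0}
        queue = deque([u])
        while queue:
            a = queue.popleft()
            if dist[a] >= ell - 1:
                continue
            for b in adj[a]:
                if a == u and b == v:      # skip the edge itself
                    continue
                if b not in dist:
                    dist[b] = dist[a] + 1
                    queue.append(b)
        if v in dist:
            bad.add((u, v))
    return bad


def graph_without_short_cycles(n, ell, seed=0):
    """Sample G(n, p) and delete all edges on cycles of length <= ell."""
    rng = random.Random(seed)
    p = n ** (1.0 / (ell - 1)) / (2 * n)   # the choice of p from the proof
    edges = [e for e in itertools.combinations(range(n), 2)
             if rng.random() < p]
    bad = short_cycle_edges(n, edges, ell)
    return [e for e in edges if e not in bad]


kept = graph_without_short_cycles(200, 4)
```

Any cycle of length at most $\ell$ that survived the deletion would be a cycle of the sampled graph, so all its edges would have been removed; hence the remaining graph has girth greater than $\ell$.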

There are several ways of proving a lower bound for $m(\ell, n)$ similar to that
in Lemma 15.3.2, i.e., roughly $n^{1+1/\ell}$; one of the alternatives is indicated in
Exercise 1 below. But obtaining a significantly better bound in an elementary
way and improving on the best known bounds (of roughly $n^{1+4/(3\ell)}$) remain
challenging open problems.
We now use the knowledge about graphs without short cycles in lower
bounds for distortion.
15.3.3 Proposition (Distortion versus dimension). Let $Z$ be a $d$-dimensional
normed space, such as some $\ell_p^d$, and suppose that all $n$-point metric
spaces can be $D$-embedded into $Z$. Let $\ell$ be an integer with $D < \ell \le 5D$ (it
is essential that $\ell$ be strictly larger than $D$, while the upper bound is only
for technical convenience). Then

$$d \ge \frac{1}{\log_2 \frac{16 D \ell}{\ell - D}} \cdot \frac{m(\ell, n)}{n}.$$

Proof. Let $G$ be a graph with vertex set $V = \{v_1, v_2, \ldots, v_n\}$, with
$m = m(\ell, n)$ edges, and with no cycle of length $\ell$ or shorter. Let $\mathcal{G}$ denote the set of all subgraphs $H \subseteq G$ obtained
from $G$ by deleting some edges (but retaining all vertices). For each $H \in \mathcal{G}$,
we define a metric $\rho_H$ on the set $V$ by $\rho_H(u, v) = \min(\ell, d_H(u, v))$, where
$d_H(u, v)$ is the length of a shortest path connecting $u$ and $v$ in $H$.
The idea of the proof is that $\mathcal{G}$ contains many essentially different metric
spaces, and if the dimension of $Z$ were small, then there would not be sufficiently
many essentially different placements of $n$ points in $Z$.
Suppose that for every $H \in \mathcal{G}$ there exists a $D$-embedding $f_H\colon (V, \rho_H) \to
Z$. By rescaling, we make sure that $\frac{1}{D}\rho_H(u, v) \le \|f_H(u) - f_H(v)\|_Z \le
\rho_H(u, v)$ for all $u, v \in V$. We may also assume that the images of all points
are contained in the $\ell$-ball $B_Z(0, \ell) = \{x \in Z : \|x\|_Z \le \ell\}$.
Set $\beta = \frac{1}{4}\left(\frac{\ell}{D} - 1\right)$. We have $0 < \beta \le 1$. Let $N$ be a $\beta$-net in $B_Z(0, \ell)$. The
notion of $\beta$-net was defined above Lemma 13.1.1, and that lemma showed that
a $\beta$-net in the $(d-1)$-dimensional Euclidean sphere has cardinality at most
$(4/\beta)^d$. Exactly the same volume argument proves that in our case $|N| \le (4\ell/\beta)^d$.
For every $H \in \mathcal{G}$, we define a new mapping $g_H\colon V \to N$ by letting $g_H(v)$
be the nearest point to $f_H(v)$ in $N$ (ties resolved arbitrarily). We prove that
for distinct $H_1, H_2 \in \mathcal{G}$, the mappings $g_{H_1}$ and $g_{H_2}$ are distinct.
The edge sets of $H_1$ and $H_2$ differ, so we can choose a pair $u, v$ of vertices
that form an edge in one of them, say in $H_1$, and not in the other one ($H_2$).
We have $\rho_{H_1}(u, v) = 1$, while $\rho_{H_2}(u, v) = \ell$, for otherwise, a $u$-$v$ path in $H_2$
of length smaller than $\ell$ and the edge $\{u, v\}$ would induce a cycle of length
at most $\ell$ in $G$. Thus

$$\|g_{H_1}(u) - g_{H_1}(v)\|_Z < \|f_{H_1}(u) - f_{H_1}(v)\|_Z + 2\beta \le 1 + 2\beta$$

and

$$\|g_{H_2}(u) - g_{H_2}(v)\|_Z > \|f_{H_2}(u) - f_{H_2}(v)\|_Z - 2\beta \ge \frac{\ell}{D} - 2\beta = 1 + 2\beta.$$

Therefore, $g_{H_1}(u) \ne g_{H_2}(u)$ or $g_{H_1}(v) \ne g_{H_2}(v)$.

We have shown that there are at least $|\mathcal{G}|$ distinct mappings $V \to N$. The
number of all mappings $V \to N$ is $|N|^n$, and so

$$2^m = |\mathcal{G}| \le |N|^n \le \left(\frac{4\ell}{\beta}\right)^{nd}.$$

The bound in the proposition follows by calculation. □
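
The final calculation can be checked numerically. The sketch below is an illustration under stated assumptions, not a statement from the text: the function name is hypothetical, and the bound is taken in the form obtained by combining $2^m \le |N|^n$ with a net of size at most $(4\ell/\beta)^d$, where $\beta = \frac{1}{4}(\ell/D - 1)$ as in the proof.

```python
import math


def dimension_lower_bound(m_edges, n, D, ell):
    """Lower bound on d from 2**m <= |N|**n with |N| <= (4*ell/beta)**d."""
    if not D < ell <= 5 * D:
        raise ValueError("need D < ell <= 5D")
    beta = (ell / D - 1) / 4               # net parameter from the proof
    net_log = math.log2(4 * ell / beta)    # log2 of the net size per dimension
    return m_edges / (n * net_log)


# Hypothetical example: D = 2, ell = 3, and m(3, n) = floor(n/2) * ceil(n/2).
n = 1000
m3 = (n // 2) * ((n + 1) // 2)
bound = dimension_lower_bound(m3, n, D=2, ell=3)
```

For these parameters the net size per dimension is $4\ell/\beta = 96$, and the bound is roughly $n/26$, i.e., linear in $n$ with a small constant.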

15.3.4 Corollary ("Incompressibility" of general metric spaces). If
$Z$ is a normed space such that all $n$-point metric spaces can be $D$-embedded
into $Z$, where $D > 1$ is considered fixed and $n \to \infty$, then we have

• $\dim Z = \Omega(n)$ for $D < 3$,
• $\dim Z = \Omega(\sqrt{n})$ for $D < 5$,
• $\dim Z = \Omega(n^{1/3})$ for $D < 7$.

This follows from Proposition 15.3.3 by substituting the asymptotically
optimal bounds for $m(3, n)$, $m(5, n)$, and $m(7, n)$. The constant of proportionality
in the first bound goes to 0 as $D \to 3$, and similarly for the other
bounds.
The corollary shows that there is no normed space of dimension signifi-
cantly smaller than n in which one could represent all n-point metric spaces
with distortion smaller than 3. So, for example, one cannot save much space
by representing a general n-point metric space by the coordinates of points
in some suitable normed space.
It is very surprising that, as we will see later, it is possible to 3-embed all
$n$-point metric spaces into a particular normed space of dimension close to
$\sqrt{n}$. So the value 3 for the distortion is a real threshold! Similar thresholds
occur at the values 5 and 7. Most likely this continues for all odd integers D,
but we cannot prove this because of the lack of tight bounds for the number
of edges in graphs without short cycles.
Another consequence of Proposition 15.3.3 concerns embedding into Eu-
clidean spaces, without any restriction on dimension.

15.3.5 Proposition (Lower bound on embedding into Euclidean
spaces). For all $n$, there exist $n$-point metric spaces that cannot be embedded
into $\ell_2$ (i.e., into any Euclidean space) with distortion smaller than
$c \log n / \log\log n$, where $c > 0$ is a suitable positive constant.

Proof. If an $n$-point metric space is $D$-embedded into $\ell_2^n$, then by the
Johnson-Lindenstrauss flattening lemma, it can be $(2D)$-embedded into $\ell_2^d$
with $d \le C \log n$ for some specific constant $C$.
For contradiction, suppose that $D \le c_1 \log n / \log\log n$ with a sufficiently
small $c_1 > 0$. Set $\ell = 4D$ and assume that $\ell$ is an integer. By Lemma 15.3.2,
we have $m(\ell, n) \ge \frac{1}{9} n^{1+1/(\ell-1)} \ge C_1 n \log n$, where $C_1$ can be made as large as
we wish by adjusting $c_1$. So Proposition 15.3.3 gives $d \ge \frac{C_1}{5} \log n$. If $C_1 > 5C$,
we have a contradiction. □
In the subsequent sections the lower bound in Proposition 15.3.5 will be
improved to $\Omega(\log n)$ by a completely different method, and then we will see
that this latter bound is tight.

Bibliography and remarks. The problem of constructing small
graphs with given girth and minimum degree has a rich history; see,
e.g., Bollobás [Bol85] for most of the earlier results.
In the proof of Lemma 15.3.1 we have derived that any graph of
minimum degree $\delta$ and girth $2k+1$ has at least $1 + \delta \sum_{i=0}^{k-1} (\delta-1)^i$ vertices,
and a similar lower bound for girth $2k$ is $2 \sum_{i=0}^{k-1} (\delta-1)^i$. Graphs
attaining these bounds (they are called Moore graphs for odd girth
and generalized polygon graphs for even girth) are known to exist only
in very few cases (see, e.g., Biggs [Big93] for a nice exposition). Alon,
Hoory, and Linial [AHL01] proved by a neat argument using random
walks that the same formulas still bound the number of vertices from
below if $\delta$ is the average degree (rather than minimum degree) of the
graph. But none of this helps improve the bound on $m(\ell, n)$ by any
substantial amount.
The proof of Lemma 15.3.2 is a variation on well known proofs by
Erdős.
The constructions mentioned in the text attaining the asymptotically
optimal value of $m(\ell, n)$ for several small $\ell$ are due to Benson
[Ben66] (constructions with similar properties appeared earlier in Tits
[Tit59], where they were investigated for different reasons). As for the
other $\ell$, graphs with the parameters given in the text were constructed
by Lazebnik, Ustimenko, and Woldar [LUW95], [LUW96] by algebraic
methods, improving on earlier bounds (such as those in Lubotzky,
Phillips, Sarnak [LPS88]; also see the notes to Section 15.5).
Proposition 15.3.5 and the basic idea of Proposition 15.3.3 were
invented by Bourgain [Bou85]. The explicit use of graphs without
short cycles and the detection of the "thresholds" in the behavior
of the dimension as a function of the distortion appeared in Matousek
[Mat96b].
Proposition 15.3.3 implies that a normed space that should accom-
modate all n-point metric spaces with a given small distortion must
have large dimension. But what if we consider just one n-point metric
space M, and we ask for the minimum dimension of a normed space Z
such that M can be D-embedded into Z? Here Z can be "customized"
to M, and the counting argument as in the proof of Proposition 15.3.3
cannot work. By a nice different method, using the rank of certain
matrices, Arias-de-Reyna and Rodriguez-Piazza [AR92] proved that
for each D < 2, there are n-point metric spaces that do not D-embed
into any normed space of dimension below c(D)n, for some c(D) > O.
In [Mat96b] their technique was extended, and it was shown that for
any $D > 1$, the required dimension is at least $c(\lfloor D \rfloor)\, n^{1/(2\lfloor D \rfloor)}$, so for a
fixed $D$ it is at least a fixed power of $n$. The proof again uses graphs
without short cycles. An interesting open problem is whether the pos-
sibility of selecting the norm in dependence on the metric can ever
help substantially. For example, we know that if we want one normed
space for all n-point metric spaces, then a linear dimension is needed
for all distortions below 3. But the lower bounds in [AR92], [Mat96b]
for a customized normed space force linear dimension only for distor-
tion D < 2. Can every n-point metric space M be 2.99-embedded, say,
into some normed space Z = Z(M) of dimension o(n)?

We have examined the tradeoff between dimension and distortion
when the distortion is a fixed number. One may also ask for the minimum
distortion if the dimension $d$ is fixed; this was considered in
Matoušek [Mat90b]. For fixed $d$, all $\ell_p$-norms on $\mathbf{R}^d$ are equivalent
up to a constant, and so it suffices to consider embeddings into $\ell_2^d$.
Considering the $n$-point metric space with all distances equal to 1,
a simple volume argument shows that an embedding into $\ell_2^d$ has distortion
at least $\Omega(n^{1/d})$. The exponent can be improved by a factor
of roughly 2; more precisely, for any $d \ge 1$, there exist $n$-point metric
spaces requiring distortion $\Omega(n^{1/\lfloor (d+1)/2 \rfloor})$ for embedding into $\ell_2^d$
(these spaces are even isometrically embeddable into $\ell_2^{d+1}$). They are
obtained by taking a $q$-dimensional simplicial complex that cannot
be embedded into $\mathbf{R}^{2q}$ (a Van Kampen-Flores complex; for modern
treatment see, e.g., [Sar91] or [Živ97]), considering a geometric realization
of such a complex in $\mathbf{R}^{2q+1}$, and filling it with points uniformly
(taking an $\eta$-net within it for a suitable $\eta$, in the metric sense); see
Exercise 3 below for the case $q = 1$. For $d = 1$ and $d = 2$, this bound
is asymptotically tight, as can be shown by an inductive argument
[Mat90b]. It is also almost tight for all even $d$. An upper bound of
$O(n^{2/d} \log^{3/2} n)$ for the distortion is obtained by first embedding the
considered metric space into $\ell_2$ (Theorem 15.7.1), and then projecting
on a random $d$-dimensional subspace; the analysis is similar to
the proof of the Johnson-Lindenstrauss flattening lemma. It would
be interesting to close the gap for odd $d \ge 3$; the case $d = 1$ suggests
that perhaps the lower bound might be the truth. It is also rather puzzling
that the (suspected) bound for the distortion for fixed dimension,
$D \approx n^{1/\lfloor (d+1)/2 \rfloor}$, looks optically similar to the (suspected) bound for
dimension given the distortion (Corollary 15.3.4), $d \approx n^{1/\lfloor (D+1)/2 \rfloor}$. Is
this a pure coincidence, or is it trying to tell us something?

Exercises
1. (Erdős-Sachs construction) This exercise indicates an elegant proof, by
Erdős and Sachs [ES63], of the existence of graphs without short cycles
whose number of edges is not much smaller than in Lemma 15.3.2 and
that are regular. Let $\ell \ge 3$ and $\delta \ge 3$.
(a) (Starting graph) For all $\delta$ and $\ell$, construct a finite $\delta$-regular graph
$G(\delta, \ell)$ with no cycles of length $\ell$ or shorter; the number of vertices does
not matter. One possibility is by double induction: Construct $G(\delta+1, \ell)$
using $G(\delta, \ell)$ and $G(\delta', \ell-1)$ with a suitable $\delta'$.
(b) Let $G$ be a $\delta$-regular graph of girth at least $\ell+1$ and let $u$ and $v$ be
two vertices of $G$ at distance at least $\ell+2$. Delete them together with
their incident edges, and connect their neighbors by a matching.
[Figure: the vertices $u$ and $v$ are deleted and their neighbors are joined by a matching.]
Check that the resulting graph still does not contain any cycle of length
at most $\ell$.
(c) Show that starting with a graph as in (a) and reducing it by the
operations as in (b), we arrive at a $\delta$-regular graph of girth $\ell+1$ and with
at most $1 + \delta + \delta(\delta-1) + \cdots + \delta(\delta-1)^{\ell}$ vertices. What is the resulting
asymptotic lower bound for $m(\ell, n)$, with $\ell$ fixed and $n \to \infty$?
2. (Sparse spanners) Let $G$ be a graph with $n$ vertices and with positive real
weights on edges, which represent the edge lengths. A subgraph $H$ of $G$ is
called a $t$-spanner of $G$ if the distance of any two vertices $u, v$ in $H$ is no
more than $t$ times their distance in $G$ (both the distances are measured
in the shortest-path metric). Using Lemma 15.3.1, prove that for every
$G$ and every integer $t \ge 2$, there exists a $t$-spanner with $O(n^{1+1/\lfloor t/2 \rfloor})$
edges.
3. Let $G_n$ denote the graph arising from $K_5$, the complete graph on 5 vertices,
by subdividing each edge $n-1$ times; that is, every two of the original
vertices of $K_5$ are connected by a path of length $n$. Prove that the
vertex set of $G_n$, considered as a metric space with the graph-theoretic
distance, cannot be embedded into the plane with distortion smaller than
$\mathrm{const} \cdot n$.
4. (Another lower bound for the flattening lemma)
(a) Given $\varepsilon \in (0, \frac{1}{2})$ and $n$ sufficiently large in terms of $\varepsilon$, construct a
collection $\mathcal{V}$ of ordered $n$-tuples of points of $\ell_2^n$ such that the distance of
every two points in each $V \in \mathcal{V}$ is between two suitable constants, no two
$V \ne V' \in \mathcal{V}$ can have the same $(1+\varepsilon)$-embedding (that is, there are $i, j$
such that the distances between the $i$th point and the $j$th point in $V$ and
in $V'$ differ by a factor of at least $1+\varepsilon$), and $\log |\mathcal{V}| = \Omega(\varepsilon^{-2} n \log n)$.
(b) Use (a) and the method of this section to prove a lower bound of
$\Omega\!\left(\frac{1}{\varepsilon^2 \log(1/\varepsilon)} \log n\right)$ for the dimension in the Johnson-Lindenstrauss flattening
lemma.

15.4 A Lower Bound for the Hamming Cube


We have established the existence of $n$-point metric spaces requiring the
distortion close to $\log n$ for embedding into $\ell_2$ (Proposition 15.3.5), but we
have not constructed any specific metric space with this property. In this
section we prove a weaker lower bound, only $\Omega(\sqrt{\log n})$, but for a specific
and very simple space: the Hamming cube. Later on, we extend the proof
method and exhibit metric spaces with $\Omega(\log n)$ lower bound, which turns
out to be optimal. We recall that $C_m$ denotes the space $\{0, 1\}^m$ with the
Hamming (or $\ell_1$) metric, where the distance of two 0/1 sequences is the
number of places where they differ.

15.4.1 Theorem. Let $m \ge 2$ and $n = 2^m$. Then there is no $D$-embedding
of the Hamming cube $C_m$ into $\ell_2$ with $D < \sqrt{m} = \sqrt{\log_2 n}$. That is, the
natural embedding, where we regard $\{0, 1\}^m$ as a subspace of $\ell_2^m$, is optimal.
The reader may remember, perhaps with some dissatisfaction, that at the
beginning of this chapter we mentioned the 4-cycle as an example of a metric
space that cannot be isometrically embedded into any Euclidean space, but
we gave no reason. Now, we are obliged to rectify this, because the 4-cycle is
just the 2-dimensional Hamming cube.
The intuitive reason why the 4-cycle cannot be embedded isometrically
is that if we embed the vertices so that the edges have the right length,
then at least one of the diagonals is too short. We make this precise using
a slightly more complicated notation than necessary, in anticipation of later
developments.
Let $V$ be a finite set, let $\rho$ be a metric on $V$, and let $E, F \subseteq \binom{V}{2}$
be nonempty sets of pairs of points of $V$. As our running example, $V =
\{v_1, \ldots, v_4\}$ is the set of vertices of the 4-cycle, $\rho$ is the graph metric on
it, $E = \{\{v_1, v_2\}, \{v_2, v_3\}, \{v_3, v_4\}, \{v_4, v_1\}\}$ are the edges, and $F =
\{\{v_1, v_3\}, \{v_2, v_4\}\}$ are the diagonals.

[Figure: the 4-cycle with the edges $E$ and the diagonals $F$ marked.]

Let us introduce the abbreviated notation

$$\rho^2(E) = \sum_{\{u,v\} \in E} \rho(u, v)^2.$$

We consider the ratio

$$R_{E,F}(\rho) = \sqrt{\frac{\rho^2(F)}{\rho^2(E)}};$$

the subscripts $E, F$ will be omitted unless there is danger of confusion. For
our 4-cycle, $R(\rho)$ is a kind of ratio of "diagonals to edges" but with quadratic
averages of distances, and it equals $\sqrt{2}$ (right?).
Next, let $f\colon V \to \ell_2^d$ be a $D$-embedding of the considered metric space into
a Euclidean space. This defines another metric $\sigma$ on $V$: $\sigma(u, v) = \|f(u) -
f(v)\|$. With the same $E$ and $F$, let us now look at the ratio $R(\sigma)$.
If $f$ is a $D$-embedding, then $R(\sigma) \ge R(\rho)/D$. But according to the idea
mentioned above, in any embedding of the 4-cycle into a Euclidean space, the
diagonals are always too short, and so $R(\sigma)$ can be expected to be smaller
than $\sqrt{2}$ in this case. This is confirmed by the following lemma, which (with
$x_i = f(v_i)$) shows that $R(\sigma) \le 1$ and therefore $D \ge \sqrt{2}$.

15.4.2 Lemma (Short diagonals lemma). Let $x_1, x_2, x_3, x_4$ be arbitrary
points in a Euclidean space. Then

$$\|x_1 - x_3\|^2 + \|x_2 - x_4\|^2 \le \|x_1 - x_2\|^2 + \|x_2 - x_3\|^2 + \|x_3 - x_4\|^2 + \|x_4 - x_1\|^2.$$
Proof. Four points can be assumed to lie in $\mathbf{R}^3$, so one could start some
stereometric calculations. But a better way is to observe that it suffices to
prove the lemma for points on the real line! Indeed, for the $x_i$ in some $\mathbf{R}^d$ we
can write the 1-dimensional inequality for each coordinate and then add these
inequalities together. (This is the reason for using squares in the definition
of the ratio $R(\sigma)$: Squares of Euclidean distances split into the contributions
of individual coordinates, and so they are easier to handle than the distances
themselves.)
If the $x_i$ are real numbers, we calculate

$$(x_1 - x_2)^2 + (x_2 - x_3)^2 + (x_3 - x_4)^2 + (x_4 - x_1)^2 - (x_1 - x_3)^2 - (x_2 - x_4)^2 = (x_1 - x_2 + x_3 - x_4)^2 \ge 0,$$

and this is the desired inequality. □
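
The identity used in this proof is easy to test numerically. The following sketch is only an illustration: the function names and the random test points are assumptions for the example. It compares the sum of squared sides minus the sum of squared diagonals with the coordinatewise sum of $(x_1 - x_2 + x_3 - x_4)^2$.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance of two points given as sequences."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def short_diagonals_gap(x1, x2, x3, x4):
    """Sum of squared sides minus sum of squared diagonals."""
    sides = (sq_dist(x1, x2) + sq_dist(x2, x3)
             + sq_dist(x3, x4) + sq_dist(x4, x1))
    diagonals = sq_dist(x1, x3) + sq_dist(x2, x4)
    return sides - diagonals


def alternating_sum_sq(x1, x2, x3, x4):
    """Squared norm of x1 - x2 + x3 - x4, computed per coordinate."""
    return sum((a - b + c - d) ** 2 for a, b, c, d in zip(x1, x2, x3, x4))


rng = random.Random(1)
pts = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
gap = short_diagonals_gap(*pts)
square_gap = short_diagonals_gap((0, 0), (1, 0), (1, 1), (0, 1))
```

For the unit square the gap is 0, in agreement with the identity: equality holds exactly when $x_1 - x_2 + x_3 - x_4 = 0$, that is, when the four points form a (possibly degenerate) parallelogram.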


Proof of Theorem 15.4.1. We proceed as in the 2-dimensional case. Let
$V = \{0, 1\}^m$ be the vertex set of $C_m$, let $\rho$ be the Hamming metric, let $E$ be
the set of edges of the cube (pairs of points at distance 1), and let $F$ be the
set of the long diagonals. The long diagonals are pairs of points at distance
$m$, or in other words, pairs $\{u, \bar{u}\}$, $u \in V$, where $\bar{u}$ is the vector arising from
$u$ by changing 0's to 1's and 1's to 0's.
We have $|E| = m 2^{m-1}$ and $|F| = 2^{m-1}$, and we calculate $R_{E,F}(\rho) = \sqrt{m}$.
If $\sigma$ is a metric on $V$ induced by some embedding $f\colon V \to \ell_2^d$, we want
to show that $R_{E,F}(\sigma) \le 1$; this will give the theorem. So we need to
prove that $\sigma^2(F) \le \sigma^2(E)$. This follows from the inequality for the 4-cycle
(Lemma 15.4.2) by a convenient induction.
The basis for $m = 2$ is directly Lemma 15.4.2. For larger $m$, we divide
the vertex set $V$ into two parts $V_0$ and $V_1$, where $V_0$ are the vectors with the
last component 0, i.e., of the form $u0$, $u \in \{0, 1\}^{m-1}$. The set $V_0$ induces an
$(m-1)$-dimensional subcube. Let $E_0$ be its edge set and $F_0$ the set of its long
diagonals; that is, $F_0 = \{\{u0, \bar{u}0\} : u \in \{0, 1\}^{m-1}\}$, and similarly for $E_1$ and
$F_1$. Let $E_{01} = E \setminus (E_0 \cup E_1)$ be the edges of the $m$-dimensional cube going
between the two subcubes. By induction, we have

$$\sigma^2(F_0) \le \sigma^2(E_0) \quad\text{and}\quad \sigma^2(F_1) \le \sigma^2(E_1).$$

For $u \in \{0, 1\}^{m-1}$, we consider the quadrilateral with vertices $u0, \bar{u}0, \bar{u}1, u1$;
for $u = 00$, it is indicated in the picture:

[Figure: the 3-dimensional cube with the quadrilateral 000, 110, 111, 001.]

Its sides are two edges of $E_{01}$, one diagonal from $F_0$ and one from $F_1$, and
its diagonals are from $F$. If we write the inequality of Lemma 15.4.2 for this
quadrilateral and sum up over all such quadrilaterals (they are $2^{m-2}$, since
$u$ and $\bar{u}$ yield the same quadrilaterals), we get

$$\sigma^2(F) \le \sigma^2(E_{01}) + \sigma^2(F_0) + \sigma^2(F_1).$$

By the inductive assumption for the two subcubes, the right-hand side is at
most $\sigma^2(E_{01}) + \sigma^2(E_0) + \sigma^2(E_1) = \sigma^2(E)$. □
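
The two ratios in this proof can be computed directly for a small cube. The sketch below is illustrative only: the helper names and the choice $m = 4$ are assumptions for the example. It evaluates the ratio of "long diagonals to edges" for the Hamming metric and for the Euclidean metric of the natural embedding of $\{0,1\}^m$ into $\mathbf{R}^m$.

```python
import itertools
import math


def hamming(u, v):
    """Number of coordinates in which the 0/1 vectors u and v differ."""
    return sum(a != b for a, b in zip(u, v))


def euclid(p, q):
    """Euclidean distance of p and q (natural embedding of the cube)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def ratio(dist, E, F):
    """sqrt( sum of dist^2 over F / sum of dist^2 over E )."""
    num = sum(dist(u, v) ** 2 for u, v in F)
    den = sum(dist(u, v) ** 2 for u, v in E)
    return math.sqrt(num / den)


def cube_pairs(m):
    """Edges E and long diagonals F of the m-dimensional cube."""
    V = list(itertools.product((0, 1), repeat=m))
    E = [(u, v) for u, v in itertools.combinations(V, 2)
         if hamming(u, v) == 1]
    # Each long diagonal {u, complement of u} is listed once (u[0] == 0).
    F = [(u, tuple(1 - b for b in u)) for u in V if u[0] == 0]
    return E, F


E, F = cube_pairs(4)
R_rho = ratio(hamming, E, F)     # Hamming metric: sqrt(m)
R_sigma = ratio(euclid, E, F)    # natural embedding: 1
```

For $m = 4$ this gives $R(\rho) = 2 = \sqrt{m}$ and $R(\sigma) = 1$, so the natural embedding attains the upper bound $R(\sigma) \le 1$ established in the proof.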

Bibliography and remarks. Theorem 15.4.1, found by Enflo
[Enf69], is probably the first result showing an unbounded distortion
for embeddings into Euclidean spaces. Enflo considered the problem of
uniform embeddability among Banach spaces, and the distortion was
an auxiliary device in his proof.

Exercises
1. Consider the second graph in the introductory section, the star with 3
leaves, and prove a lower bound of $\frac{2}{\sqrt{3}}$ for the distortion required to
embed into a Euclidean space. Follow the method used for the 4-cycle.
2. (Planar graphs badly embeddable into $\ell_2$) Let $G_0, G_1, \ldots$ be the following
graphs:

[Figure: the graphs $G_0, G_1, \ldots$]

$G_{i+1}$ is obtained from $G_i$ by replacing each edge by a square with two
new vertices. Using the short diagonals lemma and the method of this
section, prove that any Euclidean embedding of $G_m$ (with the graph
metric) requires distortion at least $\sqrt{m+1}$.
This result is due to Newman and Rabinovich [NR01].
3. (Almost Euclidean subspaces) Prove that for every $k$ and $\varepsilon > 0$ there
exists $n = n(k, \varepsilon)$ such that every $n$-point metric space $(X, \rho)$ contains a
$k$-point subspace that is $(1+\varepsilon)$-embeddable into $\ell_2$. Use Ramsey's theorem.
This result is due to Bourgain, Figiel, and Milman [BFM86]; it is a kind
of analogue of Dvoretzky's theorem for metric spaces.

15.5 A Tight Lower Bound via Expanders


Here we provide an explicit example of an n-point metric space that requires
distortion $\Omega(\log n)$ for embedding into any Euclidean space. It is the vertex
set of a constant-degree expander G with the graph metric. In the proof we
are going to use bounds on the second eigenvalue of G, but for readers not
familiar with the important notion of expander graphs, we first include a
little wider background.
Roughly speaking, expanders are graphs that are sparse but well con-
nected. If a model of an expander is made with vertices being little balls and
edges being thin strings, it is difficult to tear off any subset of vertices, and
the more vertices we want to tear off, the larger effort that is needed.
More formally, we define the edge expansion (also called the conductance)
$\Phi(G)$ of a graph $G = (V, E)$ as

$$\Phi(G) = \min\left\{ \frac{e(A, V \setminus A)}{|A|} : A \subseteq V,\ 1 \le |A| \le \tfrac{1}{2}|V| \right\},$$
where $e(A, B)$ is the number of edges of $G$ going between $A$ and $B$. One can
say, still somewhat imprecisely, that a graph $G$ is a good expander if $\Phi(G)$ is
not very small compared to the average degree of $G$.
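
For very small graphs the edge expansion can be computed straight from the definition. This is a brute-force sketch for illustration only: the function name and the two test graphs are assumptions, and the running time is exponential in the number of vertices.

```python
from itertools import combinations


def edge_expansion(vertices, edges):
    """Minimum of e(A, V \\ A) / |A| over all A with 1 <= |A| <= |V| / 2."""
    vertices = list(vertices)
    best = float("inf")
    for size in range(1, len(vertices) // 2 + 1):
        for subset in combinations(vertices, size):
            A = set(subset)
            # Count edges with exactly one endpoint in A.
            boundary = sum(1 for u, v in edges if (u in A) != (v in A))
            best = min(best, boundary / size)
    return best


cycle4 = [(0, 1), (1, 2), (2, 3), (3, 0)]
k4 = list(combinations(range(4), 2))
phi_cycle = edge_expansion(range(4), cycle4)
phi_k4 = edge_expansion(range(4), k4)
```

For the 4-cycle the minimum is attained by two adjacent vertices, giving the value 1; for $K_4$ it is attained by any pair of vertices, giving the value 2.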
In this section, we consider $r$-regular graphs for a suitable constant $r \ge 3$,
say $r = 3$. We need $r$-regular graphs with an arbitrary large number $n$
of vertices and with edge expansion bounded below by a positive constant
independent of $n$. Such graphs are usually called constant-degree expanders.³
It is useful to note that, for example, the edge expansion of the $n \times n$ planar
square grid tends to 0 as $n \to \infty$. More generally, it is known that constant-
degree expanders cannot be planar; they must be much more tangled than
planar graphs.
The existence of constant-degree expanders is not difficult to prove by the
probabilistic method; for every fixed $r \ge 3$, random $r$-regular graphs provide
very good expanders. With considerable effort, explicit constructions have
been found as well; see the notes to this section.

³ A rigorous definition should be formulated for an infinite family of graphs. A
family $\{G_1, G_2, \ldots\}$ of $r$-regular graphs with $|V(G_i)| \to \infty$ as $i \to \infty$ is a family
of constant-degree expanders if the edge expansion of all $G_i$ is bounded below
by a positive constant independent of $i$.

Let us remark that several notions similar to edge expansion appear in
the literature, and each of them can be used for quantifying how good an
expander a given graph is (but they usually lead to an equivalent notion of
a family of constant-degree expanders). Often it is also useful to consider
nonregular expanders or expanders with larger than constant degree, but
regular constant-degree expanders are probably used most frequently.
Now, we pass to the second eigenvalue. For our purposes it is most con-
venient to talk about eigenvalues of the Laplacian of the considered graph.
Let $G = (V, E)$ be an $r$-regular graph. The Laplacian matrix $L_G$ of $G$ is an
$n \times n$ matrix, $n = |V|$, with both rows and columns indexed by the vertices
of $G$, defined by

$$(L_G)_{uv} = \begin{cases} r & \text{for } u = v, \\ -1 & \text{if } u \ne v \text{ and } \{u, v\} \in E(G), \\ 0 & \text{otherwise.} \end{cases}$$

It is a symmetric positive semidefinite real matrix, and it has $n$ real eigenvalues
$\mu_1 = 0 \le \mu_2 \le \cdots \le \mu_n$. The second eigenvalue $\mu_2 = \mu_2(G)$ is a
fundamental parameter of the graph $G$.⁴
Somewhat similar to edge expansion, $\mu_2(G)$ describes how much $G$ "holds
together," but in a different way. The edge expansion and $\mu_2(G)$ are related
but they do not determine each other. For every $r$-regular graph $G$, we have
$\mu_2(G) \ge \frac{\Phi(G)^2}{4r}$ (see, e.g., Lovász [Lov93], Exercise 11.31 for a proof) and
$\mu_2(G) \le 2\Phi(G)$ (Exercise 6). Both the lower and the upper bound can almost
be attained for some graphs.
For our application below, we need the following fact: There are constants
$r$ and $\beta > 0$ such that for sufficiently many values of $n$ (say for at least
one $n$ between $10^k$ and $10^{k+1}$), there exists an $n$-vertex $r$-regular graph $G$
with $\mu_2(G) \ge \beta$. This follows from the existence results for constant-degree
expanders mentioned above (random 3-regular graphs will do, for example),
and actually most of the known explicit constructions of expanders bound
the second eigenvalue directly.
⁴ The notation $\mu_i$ for the eigenvalues of $L_G$ is not standard. We use it in order
to distinguish these eigenvalues from the eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ of the
adjacency matrix $A_G$ usually considered in the literature, where $(A_G)_{uv} = 1$ if
$\{u, v\} \in E(G)$ and $(A_G)_{uv} = 0$ otherwise. Here we deal exclusively with regular
graphs, for which the eigenvalues of $A_G$ are related to those of $L_G$ in a very
simple way: $\lambda_i = r - \mu_i$, $i = 1, 2, \ldots, n$, for any $r$-regular graph.

We are going to use the lower bound on $\mu_2(G)$ via the following fact:

For all real vectors $(x_v)_{v \in V}$ with $\sum_{v \in V} x_v = 0$, we have

$$x^T L_G x \ge \mu_2 \|x\|^2. \tag{15.3}$$

To understand what is going on here, we recall that every symmetric real $n \times n$
matrix has $n$ real eigenvalues (not necessarily distinct), and the corresponding
$n$ unit eigenvectors $b_1, b_2, \ldots, b_n$ form an orthonormal basis of $\mathbf{R}^n$. For the
matrix $L_G$, the unit eigenvector $b_1$ belonging to the eigenvalue $\mu_1 = 0$ is
$n^{-1/2}(1, 1, \ldots, 1)$. So the condition $\sum_{v \in V} x_v = 0$ means the orthogonality of
$x$ to $b_1$, and we have $x = \sum_{i=1}^{n} \alpha_i b_i$ for suitable real $\alpha_i$ with $\alpha_1 = 0$. We
calculate, using $x^T b_i = \alpha_i$,

$$x^T L_G x = \sum_{i=2}^{n} x^T(\alpha_i L_G b_i) = \sum_{i=2}^{n} \alpha_i \mu_i x^T b_i = \sum_{i=2}^{n} \alpha_i^2 \mu_i \ge \mu_2 \sum_{i=2}^{n} \alpha_i^2 = \mu_2 \|x\|^2.$$

This proves (15.3), and we can also see that $x = b_2$ yields equality in (15.3).
So we can write $\mu_2 = \min\{x^T L_G x : \|x\| = 1, \sum_{v \in V} x_v = 0\}$ (this is a special
case of the variational definition of eigenvalues discussed in many textbooks
of linear algebra).
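
Inequality (15.3) can be tested on a concrete regular graph. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, and the example graph is the 3-dimensional cube, a 3-regular graph whose Laplacian eigenvalues are known to be $0, 2, 4, 6$, so that $\mu_2 = 2$. It also checks that the quadratic form $x^T L_G x$ equals the sum of $(x_u - x_v)^2$ over the edges.

```python
import random


def cube_edges(m):
    """Edges of the m-dimensional cube on vertices 0 .. 2**m - 1."""
    return [(u, u ^ (1 << b)) for u in range(2 ** m) for b in range(m)
            if u < u ^ (1 << b)]


def laplacian(n, edges):
    """Laplacian matrix (degree on the diagonal, -1 for each edge)."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L


def quad_form(L, x):
    """The quadratic form x^T L x."""
    n = len(x)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))


rng = random.Random(0)
edges = cube_edges(3)
L = laplacian(8, edges)
x = [rng.uniform(-1, 1) for _ in range(8)]
mean = sum(x) / len(x)
x = [t - mean for t in x]                   # enforce sum of x_v = 0
q = quad_form(L, x)
edge_sum = sum((x[u] - x[v]) ** 2 for u, v in edges)
norm_sq = sum(t * t for t in x)
mu2 = 2                                     # known value for the 3-cube
```

Centering the vector is essential: without the condition $\sum_v x_v = 0$ the constant vector would give $x^T L_G x = 0$ and violate the inequality.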
Now, we are ready to prove the main result of this section.

15.5.1 Theorem (Expanders are badly embeddable into $\ell_2$). Let $G$
be an $r$-regular graph on an $n$-element vertex set $V$ with $\mu_2(G) \ge \beta$, where
$r \ge 3$ and $\beta > 0$ are constants, and let $\rho$ be the shortest-path metric on $V$.
Then the metric space $(V, \rho)$ cannot be $D$-embedded into a Euclidean space
for $D \le c \log n$, where $c = c(r, \beta) > 0$ is independent of $n$.

Proof. We again consider the ratios $R_{E,F}(\rho)$ and $R_{E,F}(\sigma)$ as in the proof
for the cube (Theorem 15.4.1). This time we let $E$ be the edge set of $G$, and
$F = \binom{V}{2}$ are all pairs of distinct vertices. In the graph metric all pairs in $E$
have distance 1, while most pairs in $F$ have distance about $\log n$, as we will
check below. On the other hand, it turns out that in any embedding into $\ell_2$
such that all the distances in $E$ are at most 1, a typical distance in $F$ is only
$O(1)$. The calculations follow.
We have $\rho^2(E) = |E| = \frac{rn}{2}$. To bound $\rho^2(F)$ from below, we observe that
for each vertex $v_0$, there are at most $1 + r + r(r-1) + \cdots + r(r-1)^{k-1} \le r^k + 1$
vertices at distance at most $k$ from $v_0$. So for $k = \log_r \frac{n-1}{2}$, at least half of
the pairs in $F$ have distance more than $k$, and we obtain $\rho^2(F) = \Omega(n^2 k^2) =
\Omega(n^2 \log^2 n)$. Thus

$$R_{E,F}(\rho) = \Omega\left(\sqrt{n} \cdot \log n\right).$$

Let $f\colon V \to \ell_2^d$ be an embedding into a Euclidean space, and let $\sigma$ be the
metric induced by it on $V$. To prove the theorem, it suffices to show that
$R_{E,F}(\sigma) = O(\sqrt{n})$; that is,

$$\sigma^2(F) = O(n) \cdot \sigma^2(E).$$

By the observation in the proof of Lemma 15.4.2 about splitting into coordinates,
it is enough to prove this inequality for a one-dimensional embedding.
So for every choice of real numbers $(x_v)_{v \in V}$, we want to show that

$$\sum_{\{u,v\} \in F} (x_u - x_v)^2 = O(n) \sum_{\{u,v\} \in E} (x_u - x_v)^2. \tag{15.4}$$

By adding a suitable number to all the $x_v$, we may assume that $\sum_{v \in V} x_v = 0$.
This does not change anything in (15.4), but it allows us to relate both sides
to the Euclidean norm of the vector $x$.
We calculate, using $\sum_{v \in V} x_v = 0$,

$$\sum_{\{u,v\} \in F} (x_u - x_v)^2 = (n-1) \sum_{v \in V} x_v^2 - \sum_{u \ne v} x_u x_v = n \sum_{v \in V} x_v^2 - \Big(\sum_{v \in V} x_v\Big)^2 = n \|x\|^2.$$

For the right-hand side of (15.4), the Laplace matrix enters:

$$\sum_{\{u,v\} \in E} (x_u - x_v)^2 = r \sum_{v \in V} x_v^2 - 2 \sum_{\{u,v\} \in E} x_u x_v = x^T L_G x \ge \mu_2 \|x\|^2,$$

the last inequality being (15.3). This establishes (15.4) and concludes the
proof of Theorem 15.5.1. □
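
The two sums in (15.4) can be compared on a small example. This sketch is illustrative and rests on assumptions not in the text: it uses the 6-cycle (which is only 2-regular, but the two identities above hold for any regular graph), whose second Laplacian eigenvalue is $2 - 2\cos(2\pi/6) = 1$, and the function name is hypothetical. Combining the two displayed formulas gives that the ratio of the left-hand sum to the right-hand sum is at most $n/\mu_2$.

```python
import itertools
import math
import random


def squared_ratio(n, edges, x):
    """(sum over all pairs of (x_u - x_v)^2) / (sum over edges of the same)."""
    all_pairs = itertools.combinations(range(n), 2)
    num = sum((x[u] - x[v]) ** 2 for u, v in all_pairs)
    den = sum((x[u] - x[v]) ** 2 for u, v in edges)
    return num / den


n = 6
cycle = [(i, (i + 1) % n) for i in range(n)]
mu2 = 2 - 2 * math.cos(2 * math.pi / n)     # second eigenvalue of the 6-cycle
eigvec = [math.cos(2 * math.pi * j / n) for j in range(n)]

rng = random.Random(2)
trials = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(100)]
worst = max(squared_ratio(n, cycle, x) for x in trials)
extremal = squared_ratio(n, cycle, eigvec)
```

Both sums are unchanged when a constant is added to all $x_v$, so the random vectors need not be centered. The eigenvector belonging to $\mu_2$ attains the bound $n/\mu_2 = 6$.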

The proof actually shows that the maximum of $R_{E,F}(\sigma)$ over all Euclidean
metrics $\sigma$ equals $\sqrt{n/\mu_2}$ (which is an interesting geometric interpretation of
$\mu_2$). The maximum is attained for the $\sigma$ induced by the mapping $V \to \mathbf{R}$
specified by $b_2$, the eigenvector belonging to $\mu_2$.
The cone of squared $\ell_2$-metrics and universality of the lower-bound
method. For the Hamming cubes, we obtained the exact minimum distortion
required for a Euclidean embedding. This was due to the lucky choice of
the sets $E$ and $F$ of point pairs. As we will see below, a "lucky" choice, leading
to an exact bound, exists for every finite metric space if we allow for sets of
weighted pairs. Let $(V, \rho)$ be a finite metric space and let $\eta, \varphi\colon \binom{V}{2} \to [0, \infty)$
be weight functions. We define

$$\rho^2(\eta) = \sum_{\{u,v\} \in \binom{V}{2}} \eta(u, v)\, \rho(u, v)^2$$

and similarly for $\rho^2(\varphi)$, and we let

$$R_{\eta,\varphi}(\rho) = \sqrt{\frac{\rho^2(\varphi)}{\rho^2(\eta)}}.$$

15.5.2 Proposition. Let $(V, \rho)$ be a finite metric space and let $D \ge 1$ be the
smallest number such that $(V, \rho)$ can be $D$-embedded into $\ell_2$. Then there are
weight functions $\eta, \varphi\colon \binom{V}{2} \to [0, \infty)$ such that
$R_{\eta,\varphi}(\rho) \ge D$ and $R_{\eta,\varphi}(\sigma) \le 1$
for any metric $\sigma$ induced on $V$ by an embedding into $\ell_2$.
Thus, the exact lower bound for the embeddability into Euclidean spaces
always has an "easy" proof, provided that we can guess the right weight
functions $\eta$ and $\varphi$. (As we will see below, there is even an efficient algorithm
for deciding $D$-embeddability into $\ell_2$.)
Proposition 15.5.2 is included mainly because of generally useful concepts
appearing in its proof.
Let $V$ be a fixed $n$-point set. An arbitrary function $\varphi\colon \binom{V}{2} \to \mathbf{R}$, assigning
a real number to each unordered pair of points of $V$, can be represented by a
point in $\mathbf{R}^N$, where $N = \binom{n}{2}$; the coordinates of such a point are indexed by
pairs $\{u, v\} \in \binom{V}{2}$. For example, the set of all metrics on $V$ corresponds to a
subset of $\mathbf{R}^N$ called the metric cone (also see the notes to Section 5.5). As is
not difficult to verify, it is an $N$-dimensional convex polyhedron in $\mathbf{R}^N$. Its
combinatorial structure has been studied intensively.
In the proof of Proposition 15.5.2 we will not work with the metric cone
but rather with the cone of squared Euclidean metrics, denoted by $\mathcal{L}_2$. We
define
\[ \mathcal{L}_2 = \Big\{ \big( \|f(u) - f(v)\|^2 \big)_{\{u,v\} \in \binom{V}{2}} : f\colon V \to \ell_2 \Big\}. \]

15.5.3 Observation. The set $\mathcal{L}_2$ is a convex cone.

Proof. Clearly, if $x \in \mathcal{L}_2$, then $\lambda x \in \mathcal{L}_2$ for all $\lambda \ge 0$, and so it suffices
to verify that if $x, y \in \mathcal{L}_2$, then $x + y \in \mathcal{L}_2$. Let $x, y \in \mathcal{L}_2$ correspond
to embeddings $f\colon V \to \ell_2^k$ and $g\colon V \to \ell_2^m$, respectively. We define a new
embedding $h\colon V \to \ell_2^{k+m}$ by concatenating the coordinates of $f$ and $g$; that
is,
\[ h(v) = (f(v)_1, \ldots, f(v)_k, g(v)_1, \ldots, g(v)_m) \in \ell_2^{k+m}. \]
The point of $\mathcal{L}_2$ corresponding to $h$ is $x + y$. □
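
The concatenation step can be seen on a tiny example. This is a minimal sketch with two hypothetical embeddings of a 3-point set; the point sets and names are made up for illustration only.

```python
def squared_distances(points):
    """Vector of squared Euclidean distances, indexed by pairs u < v."""
    n = len(points)
    return [sum((a - b) ** 2 for a, b in zip(points[u], points[v]))
            for u in range(n) for v in range(u + 1, n)]

# Two hypothetical embeddings of V = {0, 1, 2}.
f = [(0.0,), (1.0,), (3.0,)]              # into l_2^1
g = [(0.0, 0.0), (2.0, 0.0), (0.0, 1.0)]  # into l_2^2
h = [fv + gv for fv, gv in zip(f, g)]     # concatenated coordinates, into l_2^3

x = squared_distances(f)
y = squared_distances(g)
z = squared_distances(h)                  # coincides with x + y coordinatewise
```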
Proof of Proposition 15.5.2. Suppose that $(V, \rho)$ cannot be $D$-embedded
into any Euclidean space. We are going to exhibit $\eta$ and $\varphi$ with $R_{\eta,\varphi}(\rho) \ge D$
and $R_{\eta,\varphi}(\sigma) \le 1$ for every Euclidean $\sigma$. The claim of the proposition is easily
derived from this by a compactness argument.
Let $\mathcal{L}_2 \subset \mathbf{R}^N$ be the cone of squared Euclidean metrics on $V$ as above
and let
\[ \mathcal{K} = \Big\{ (x_{uv})_{\{u,v\} \in \binom{V}{2}} \in \mathbf{R}^N : \text{there exists an } r > 0 \text{ with }
   r^2 \rho(u,v)^2 \le x_{uv} \le D^2 r^2 \rho(u,v)^2 \text{ for all } u, v \Big\}. \]
This $\mathcal{K}$ includes all squares of metrics arising by $D$-embeddings of $(V, \rho)$. But
not all elements of $\mathcal{K}$ are necessarily squares of metrics, since the triangle
inequality may be violated. Since there is no Euclidean $D$-embedding of $(V, \rho)$,
we have $\mathcal{K} \cap \mathcal{L}_2 = \emptyset$. Both $\mathcal{K}$ and $\mathcal{L}_2$ are convex sets in $\mathbf{R}^N$, and so they can
be separated by a hyperplane, by the separation theorem (Theorem 1.2.4).
Moreover, since $\mathcal{L}_2$ is a cone and $\mathcal{K}$ is a cone minus the origin 0, the separating
hyperplane has to pass through 0. So there is an $a \in \mathbf{R}^N$ such that

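\[ \langle a, x \rangle \le 0 \text{ for all } x \in \mathcal{K} \quad \text{and} \quad
   \langle a, x \rangle \ge 0 \text{ for all } x \in \mathcal{L}_2. \tag{15.5} \]

Using this $a$, we define the desired $\eta$ and $\varphi$, as follows:
\[ \eta(u,v) = \begin{cases} a_{uv} & \text{if } a_{uv} \ge 0, \\ 0 & \text{otherwise;} \end{cases}
   \qquad
   \varphi(u,v) = \begin{cases} -a_{uv} & \text{if } a_{uv} < 0, \\ 0 & \text{otherwise.} \end{cases} \]

First we show that $R_{\eta,\varphi}(\rho) \ge D$. To this end, we employ the property
(15.5) for the following $x \in \mathcal{K}$:
\[ x_{uv} = \begin{cases} D^2 \rho(u,v)^2 & \text{if } a_{uv} \ge 0, \\ \rho(u,v)^2 & \text{if } a_{uv} < 0. \end{cases} \]
Then $\langle a, x \rangle \le 0$ boils down to $D^2 \rho^2(\eta) - \rho^2(\varphi) \le 0$, which means that
$R_{\eta,\varphi}(\rho) = \sqrt{\rho^2(\varphi)/\rho^2(\eta)} \ge D$.
Next, let $\sigma$ be a metric induced by a Euclidean embedding of $V$. This
time we apply $\langle a, x \rangle \ge 0$ with the $x \in \mathcal{L}_2$ corresponding to $\sigma$, i.e., $x_{uv} =
\sigma(u,v)^2$. This yields $\sigma^2(\eta) - \sigma^2(\varphi) \ge 0$, and so $R_{\eta,\varphi}(\sigma) \le 1$. This proves
Proposition 15.5.2. □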
Algorithmic remark: Euclidean embeddings and semidefinite programming.
The problem of deciding whether a given $n$-point metric space
$(V, \rho)$ admits a $D$-embedding into $\ell_2$ (i.e., into a Euclidean space without
restriction on the dimension), for a given $D \ge 1$, can be solved by a
polynomial-time algorithm. Let us stress that the dimension of the target Euclidean space
cannot be prescribed in this method. If we insist that the embedding be into
$\ell_2^d$, for some given $d$, we obtain a different algorithmic problem, and it is not
known how hard it is. Many other similar-looking embedding problems are
known to be NP-hard, such as the problem of $D$-embedding into $\ell_1$.
The algorithm for $D$-embedding into $\ell_2$ is based on a powerful technique
called semidefinite programming, where the problem is expressed as the existence
of a positive semidefinite matrix in a suitable convex set of matrices.
Let $(V, \rho)$ be an $n$-point metric space, let $f\colon V \to \mathbf{R}^n$ be an embedding,
and let $X$ be the $n \times n$ matrix whose columns are indexed by the elements
of $V$ and such that the $v$th column is the vector $f(v) \in \mathbf{R}^n$. The matrix
$Q = X^T X$ has both rows and columns indexed by the points of $V$, and the
entry $q_{uv}$ is the scalar product $\langle f(u), f(v) \rangle$.
The matrix $Q$ is positive semidefinite, since for any $x \in \mathbf{R}^n$, we have
$x^T Q x = (x^T X^T)(X x) = \|Xx\|^2 \ge 0$. (In fact, as is not too difficult to check,
a real symmetric $n \times n$ matrix $P$ is positive semidefinite if and only if it can
be written as $X^T X$ for some real $n \times n$ matrix $X$.)
Let $\sigma(u,v) = \|f(u) - f(v)\| = \langle f(u) - f(v), f(u) - f(v) \rangle^{1/2}$. We can express
\[ \sigma(u,v)^2 = \langle f(u), f(u) \rangle + \langle f(v), f(v) \rangle - 2 \langle f(u), f(v) \rangle = q_{uu} + q_{vv} - 2 q_{uv}. \]
Therefore, the space $(V, \rho)$ can be $D$-embedded into $\ell_2$ if and only if there
exists a symmetric real positive semidefinite matrix $Q$ whose entries satisfy
the following constraints:
\[ \rho(u,v)^2 \le q_{uu} + q_{vv} - 2 q_{uv} \le D^2 \rho(u,v)^2 \]
for all $u, v \in V$. These are linear inequalities for the unknown entries of $Q$.
The problem of finding a positive semidefinite matrix whose entries satisfy
a given system of linear inequalities can be solved efficiently, in time
polynomial in the size of the unknown matrix $Q$ and in the number of the
linear inequalities. The algorithm is not simple; we say a little more about it
in the remarks below.
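
To make the constraints concrete, here is a minimal sketch that only verifies a candidate matrix $Q$; it does not solve the semidefinite program. It assumes $Q$ is given as the Gram matrix of an explicit point set, so positive semidefiniteness holds by construction. The 4-cycle example and all function names are hypothetical.

```python
from itertools import combinations
import math

def gram_matrix(points):
    """Q[u][v] = <f(u), f(v)>; such a Q is positive semidefinite by construction."""
    return [[sum(a * b for a, b in zip(p, q)) for q in points] for p in points]

def satisfies_constraints(Q, rho, D, tol=1e-9):
    """Check rho(u,v)^2 <= q_uu + q_vv - 2 q_uv <= D^2 rho(u,v)^2 for all pairs."""
    for u, v in combinations(range(len(Q)), 2):
        sigma_sq = Q[u][u] + Q[v][v] - 2 * Q[u][v]
        if sigma_sq < rho[u][v] ** 2 - tol or sigma_sq > (D * rho[u][v]) ** 2 + tol:
            return False
    return True

# Hypothetical example: the graph metric of the 4-cycle 0-1-2-3-0,
# embedded as a square with side sqrt(2).
rho = [[0, 1, 2, 1],
       [1, 0, 1, 2],
       [2, 1, 0, 1],
       [1, 2, 1, 0]]
s = math.sqrt(2)
square = [(0.0, 0.0), (s, 0.0), (s, s), (0.0, s)]
Q = gram_matrix(square)
```

For this particular $Q$ the constraints hold with $D = \sqrt{2}$ and fail with $D = 1.2$. That says nothing about other candidate matrices, which is exactly what the semidefinite program has to decide.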

Bibliography and remarks. Theorem 15.5.1 was proved by Linial,
London, and Rabinovich [LLR95]. This influential paper introduced
methods and results concerning low-distortion embeddings, developed
in local theory of Banach spaces, into theoretical computer science, and
it gave several new results and algorithmic applications. It is very in-
teresting that using low-distortion Euclidean embeddings, one obtains
algorithmic results for certain graph problems that until then could
not be attained by other methods, although the considered problems
look purely graph-theoretic without any geometric structure. A simple
but important example is presented at the end of Section 15.7.
The bad embeddability of expanders was formulated and proved
in [LLR95] in connection with the problem of multicommodity flows
in graphs. The proof was similar to the one shown above, but it established
an $\Omega(\log n)$ bound for embedding into $\ell_1$. The result for
Euclidean spaces is a corollary, since every finite Euclidean metric
space can be isometrically embedded into $\ell_1$ (Exercise 5). An inequality
similar to (15.4) was used, but with squares of differences replaced
by absolute values of differences. Such an inequality was well known
for expanders. The method of [LLR95] was generalized for embeddings
to $\ell_p$-spaces with arbitrary $p$ in [Mat97]; it was shown that the minimum
distortion required to embed all $n$-point metric spaces into $\ell_p$ is
of order $\frac{\log n}{p}$, and a matching upper bound was proved by the method
shown in Section 15.7.
The proof of Theorem 15.5.1 given in the text can easily be extended
to prove a lower bound for $\ell_1$-embeddability as well. It actually
shows that distortion $\Omega(\log n)$ is needed for approximating the
expander metric by a squared Euclidean metric, and every $\ell_1$-metric is
a squared Euclidean metric. Squared Euclidean metrics do not generally
satisfy the triangle inequality, but that is not needed in the proof.
Those squared Euclidean metrics that do satisfy the triangle inequality
are sometimes called the metrics of negative type. Not all of these
metrics are $\ell_1$-metrics, but a challenging conjecture (made by Linial
and independently by Goemans) states that perhaps they are not very
far from $\ell_1$-metrics: Each metric of negative type might be embeddable
into $\ell_1$ with distortion bounded by a universal constant. If true, this
would have significant algorithmic consequences: Many problems can
be formulated as optimization over the cone of all $\ell_1$-metrics, which is
computationally intractable, and the metrics of negative type would
provide a good and algorithmically manageable approximation.
The formulation of the minimum distortion problem for Euclidean
embeddings as semidefinite programming is also due to [LLR95], as
well as Proposition 15.5.2. These ideas were further elaborated and
applied in examples by Linial and Magen [LM00]. The proof of
Proposition 15.5.2 given in the text is simpler than that in [LLR95], and it
extends to $\ell_p$-embeddability (Exercise 4), unlike the formulation of
the $D$-embedding problem as a semidefinite program. It was communicated
to me by Yuri Rabinovich.
Further significant progress in lower bounds for $\ell_2$-embeddings of
graphs was made by Linial, Magen, and Naor [LMN01]. They proved
that the metric of every $r$-regular graph, $r > 2$, of girth $g$ requires
distortion at least $\Omega(\sqrt{g})$ for embedding into $\ell_2$ (an $\Omega(g)$ lower bound
was conjectured in [LLR95]). They give two proofs, one based on the
concept of Markov type of a metric space due to Ball [Bal92] and
another that we now outline (adapted to the notation of this section).
Let $G = (V, E)$ be an $r$-regular graph of girth $2t+1$ or $2t+2$ for some
integer $t \ge 1$, and let $\rho$ be the metric of $G$. We set $F = \{\{u, v\} \in
\binom{V}{2} : \rho(u,v) = t\}$; note that the graph $H = (V, F)$ is $s$-regular for
$s = r(r-1)^{t-1}$. Calculating $R_{E,F}(\rho)$ is trivial, and it remains to bound
$R_{E,F}(\sigma)$ for all Euclidean metrics $\sigma$ on $V$, which amounts to finding
the largest $\beta > 0$ such that $\sigma^2(E) - \beta \cdot \sigma^2(F) \ge 0$ for all $\sigma$. Here it
suffices to consider line metrics $\sigma$; so let $x_v \in \mathbf{R}$ be the image of $v$ in
the embedding $V \to \mathbf{R}$ inducing $\sigma$. We may assume $\sum_{v \in V} x_v = 0$ and,
as in the proof in the text, $\sigma^2(E) = \sum_{\{u,v\} \in E} (x_u - x_v)^2 = x^T L_G x =
x^T (rI - A_G) x$, where $I$ is the identity matrix and $A_G$ is the adjacency
matrix of $G$, and similarly for $\sigma^2(F)$. So we require $x^T C x \ge 0$ for all $x$
with $\sum_{v \in V} x_v = 0$, where $C = (r - \beta s) I - A_G + \beta A_H$. It turns out that
there is a degree-$t$ polynomial $P_t(x)$ such that $A_H = P_t(A_G)$ (here we
need that the girth of $G$ exceeds $2t$). This $P_t(x)$ is called the Geronimus
polynomial, and it is not hard to derive a recurrence for it: $P_0(x) = 1$,
$P_1(x) = x$, $P_2(x) = x^2 - r$, and $P_t(x) = x P_{t-1}(x) - (r-1) P_{t-2}(x)$ for
$t > 2$. So $C = Q(A_G)$ for $Q(x) = r - \beta s - x + \beta P_t(x)$. As is well known, all
the eigenvalues of $A_G$ lie in the interval $[-r, r]$, and so if we make sure
that $Q(x) \ge 0$ for all $x \in [-r, r]$, all eigenvalues of $C$ are nonnegative,
and our condition holds. This leaves us with a nontrivial but doable
calculus problem whose discussion we omit.
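
The identity $A_H = P_t(A_G)$ can be checked on a small example. The sketch below assumes the Petersen graph ($r = 3$, girth $5 = 2t+1$ with $t = 2$), built as the Kneser graph on 2-subsets of a 5-element set. This choice of graph and all helper names are illustrative assumptions, not part of the argument in [LMN01].

```python
from itertools import combinations

def petersen_adjacency():
    """Petersen graph: vertices are 2-subsets of {0,...,4}, adjacent iff disjoint."""
    verts = list(combinations(range(5), 2))
    return [[1 if not set(a) & set(b) else 0 for b in verts] for a in verts]

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def geronimus(A, r, t):
    """P_t(A) via P_0 = I, P_1 = A, P_2 = A^2 - rI, P_t = A P_{t-1} - (r-1) P_{t-2}."""
    n = len(A)
    identity = [[int(i == j) for j in range(n)] for i in range(n)]
    if t == 0:
        return identity
    if t == 1:
        return A
    A2 = mat_mul(A, A)
    prev = A
    cur = [[A2[i][j] - r * identity[i][j] for j in range(n)] for i in range(n)]
    for _ in range(3, t + 1):
        AP = mat_mul(A, cur)
        prev, cur = cur, [[AP[i][j] - (r - 1) * prev[i][j] for j in range(n)]
                          for i in range(n)]
    return cur

def distance_t_adjacency(A, t):
    """Adjacency matrix of H = (V, F), F = pairs at graph distance exactly t (BFS)."""
    n = len(A)
    H = [[0] * n for _ in range(n)]
    for s in range(n):
        dist = {s: 0}
        frontier = [s]
        while frontier:
            nxt = []
            for u in frontier:
                for v in range(n):
                    if A[u][v] and v not in dist:
                        dist[v] = dist[u] + 1
                        nxt.append(v)
            frontier = nxt
        for v, d in dist.items():
            if d == t:
                H[s][v] = 1
    return H

A_G = petersen_adjacency()
A_H = distance_t_adjacency(A_G, 2)   # s-regular with s = r(r-1)^(t-1) = 6
P2 = geronimus(A_G, 3, 2)            # should coincide with A_H
```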
Semidefinite programming. The general problem of semidefinite
programming is to optimize a linear function over a set of positive semidefinite
$n \times n$ matrices defined by a system of linear inequalities. This is a convex
set in the space of all real $n \times n$ matrices, and in principle it is
not difficult to construct a polynomial-time membership oracle for it
(see the explanation following Theorem 13.2.1). Then the ellipsoid
method can solve the optimization problem in polynomial time; see
Grötschel, Lovász, and Schrijver [GLS88]. More practical algorithms
are based on interior point methods. Semidefinite programming is an
extremely powerful tool in combinatorial optimization and other areas.
For example, it provides the only known polynomial-time algorithms
for computing the chromatic number of perfect graphs and the
best known approximation algorithms for several fundamental NP-hard
graph-theoretic problems. Lovász's recent lecture notes [Lov] are
a beautiful concise introduction. Here we outline at least one lovely
application, concerning the approximation of the maximum cut in a
graph, in Exercise 8 below.
The second eigenvalue. The investigation of graph eigenvalues constitutes
a well established part of graph theory; see, e.g., Biggs [Big93]
for a nice introduction. The second eigenvalue of the Laplace matrix as
an important graph parameter was first considered by Fiedler [Fie73]
(who called it the algebraic connectivity). Tanner [Tan84] and Alon
and Milman [AM85] gave a lower bound for the so-called vertex expansion
of a regular graph (a notion similar to edge expansion) in
terms of $\mu_2(G)$, and a reverse relation was proved by Alon [Alo86a].
There are many useful analogies of graph eigenvalues with the
eigenvalues of the Laplace operator $\Delta$ on manifolds, whose theory is
classical and well developed; this is pursued to a considerable depth in
Chung [Chu97]. This point of view prefers the eigenvalues of the Laplacian
matrix of a graph, as considered in this section, to the eigenvalues
of the adjacency matrix. In fact, for nonregular graphs, a still closer
correspondence with the setting of manifolds is obtained with a differently
normalized Laplacian matrix $\mathcal{L}_G$: $(\mathcal{L}_G)_{vv} = 1$ for all $v \in V(G)$,
$(\mathcal{L}_G)_{uv} = -(\deg_G(u) \deg_G(v))^{-1/2}$ for $\{u, v\} \in E(G)$, and $(\mathcal{L}_G)_{uv} = 0$
otherwise.
Expanders have been used to address many fundamental problems of
computer science in areas such as network design, theory of computational
complexity, coding theory, on-line computation, and cryptography;
see, e.g., [RVW00] for references.
For random graphs, parameters such as edge expansion or vertex
expansion are usually not too hard to estimate (the technical difficulty
of the arguments depends on the chosen model of a random graph). On
the other hand, estimating the second eigenvalue of a random $r$-regular
graph is quite challenging, and a satisfactory answer is known only for
$r$ large (and even); see Friedman, Komlós, and Szemerédi [FKS89] or
Friedman [Fri91]. Namely, with high probability, a random $r$-regular
graph with $r$ even has $\lambda_2 \le 2\sqrt{r-1} + O(\log r)$. Here the number of
vertices $n$ is assumed to be sufficiently large in terms of $r$ and the
$O(\cdot)$ notation is with respect to $r \to \infty$. At the same time, for every
fixed $r \ge 3$ and any $r$-regular graph on $n$ vertices, $\lambda_2 \ge 2\sqrt{r-1} - o(1)$,
where this time $o(\cdot)$ refers to $n \to \infty$. So random graphs are almost
optimal for large $r$.
For many of the applications of expanders, random graphs are
not sufficient, and explicit constructions are required. In fact, explic-
itly constructed expanders often serve as substitutes for truly random
graphs; for example, they allow one to convert some probabilistic algo-
rithms into deterministic ones (derandomization) or reduce the num-
ber of random bits required by a probabilistic algorithm.
Explicit construction of expanders was a big challenge, and it has
led to excellent research employing surprisingly deep results from
classical areas of mathematics (group theory, number theory, har-
monic analysis, etc.). In the analysis of such constructions, one usually
bounds the second eigenvalue (rather than edge expansion or vertex
expansion). After the initial breakthrough by Margulis in 1973 and
several other works in this direction (see, e.g., [Mor94] or [RVW00] for
references), explicit families of constant-degree expanders matching
the quality of random graphs in several parameters (and even superseding
them in some respects) were constructed by Lubotzky, Phillips,
and Sarnak [LPS88] and independently by Margulis [Mar88]. Later
Morgenstern [Mor94] obtained similar results for many more values of
the parameters (degree and number of vertices). In particular, these
constructions achieve $\lambda_2 \le 2\sqrt{r-1}$, which is asymptotically optimal,
as was mentioned earlier.
For illustration, here is one of the constructions (from [LPS88]). Let
$p \ne q$ be primes with $p, q \equiv 1 \pmod 4$ and such that $p$ is a quadratic
nonresidue modulo $q$, let $i$ be an integer with $i^2 \equiv -1 \pmod q$, and
let $F$ denote the field of residue classes modulo $q$. The vertex set
$V(G)$ consists of all $2 \times 2$ nonsingular matrices over $F$. Two matrices
$A, B \in V(G)$ are connected by an edge iff $AB^{-1}$ is a matrix of the
form
\[ \begin{pmatrix} a_0 + i a_1 & a_2 + i a_3 \\ -a_2 + i a_3 & a_0 - i a_1 \end{pmatrix}, \]
where $a_0, a_1, a_2, a_3$ are integers with $a_0^2 + a_1^2 + a_2^2 + a_3^2 = p$,
$a_0 > 0$, $a_0$ odd, and $a_1, a_2, a_3$ even. By a theorem
of Jacobi, there are exactly $p+1$ such vectors $(a_0, a_1, a_2, a_3)$, and it
follows that the graph is $(p+1)$-regular with $q(q^2 - 1)$ vertices. A family
of constant-degree expanders is obtained by fixing $p$, say $p = 5$, and
letting $q \to \infty$.
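
The count of $p+1$ vectors is easy to confirm by brute force for small primes $p \equiv 1 \pmod 4$. The following minimal sketch only enumerates the integer vectors $(a_0, a_1, a_2, a_3)$; it does not build the graph, and the function name is illustrative.

```python
from itertools import product

def lps_generators(p):
    """All integer vectors (a0, a1, a2, a3) with a0^2 + a1^2 + a2^2 + a3^2 = p,
    a0 > 0 odd, and a1, a2, a3 even."""
    bound = int(p ** 0.5)
    evens = [a for a in range(-bound, bound + 1) if a % 2 == 0]
    odds = [a for a in range(1, bound + 1) if a % 2 == 1]
    return [(a0, a1, a2, a3)
            for a0 in odds
            for a1, a2, a3 in product(evens, repeat=3)
            if a0 * a0 + a1 * a1 + a2 * a2 + a3 * a3 == p]

counts = {p: len(lps_generators(p)) for p in (5, 13, 17, 29)}
```

For $p = 5$ the six vectors are $(1, \pm 2, 0, 0)$, $(1, 0, \pm 2, 0)$, and $(1, 0, 0, \pm 2)$.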
Reingold, Vadhan, and Wigderson [RVW00] discovered an explicit
construction of a different type. Expanders are obtained from
a constant-size initial graph by iterating certain sophisticated product
operations. Their parameters are somewhat inferior to those from
[Mar88], [LPS88], [Mor94], but the proof is relatively short, and it uses
only elementary linear algebra.
Exercises
1. Show that every real symmetric positive semidefinite $n \times n$ matrix can
be written as $X^T X$ for a real $n \times n$ matrix $X$.
2. (Dimension for isometric $\ell_p$-embeddings)
(a) Let $V$ be an $n$-point set and let $N = \binom{n}{2}$. Analogous to the set
$\mathcal{L}_2$ defined in the text, let $\mathcal{L}_1^{(\mathrm{fin})} \subset \mathbf{R}^N$ be the set of all metrics on $V$
induced by embeddings $f\colon V \to \ell_1^k$, $k = 1, 2, \ldots$. Show that $\mathcal{L}_1^{(\mathrm{fin})}$ is
the convex hull of line pseudometrics,$^5$ i.e., pseudometrics induced by
mappings $f\colon V \to \ell_1^1$.
(b) Prove that any metric from $\mathcal{L}_1^{(\mathrm{fin})}$ can be isometrically embedded
into $\ell_1^N$. That is, any $n$-point set in some $\ell_1^k$ can be realized in $\ell_1^N$.
(Examples show that one cannot do much better and that dimension
$\Omega(n^2)$ is necessary, in contrast to Euclidean embeddings, where dimension
$n-1$ always suffices.)
(c) Let $\mathcal{L}_1 \subset \mathbf{R}^N$ be all metrics induced by embeddings of $V$ into $\ell_1$ (the
space of infinite sequences with finite $\ell_1$-norm). Show that $\mathcal{L}_1 = \mathcal{L}_1^{(\mathrm{fin})}$,
and thus that any $n$-point subset of $\ell_1$ can be realized in $\ell_1^N$.
(d) Extend the considerations in (a)-(c) to $\ell_p$-metrics with arbitrary
$p \in [1, \infty)$.
See Ball [Bal90] for more on the dimension of isometric $\ell_p$-embeddings.
3. With the notation as in Exercise 2, show that every line pseudometric
$\nu$ on an $n$-point set $V$ is a nonnegative linear combination of at most
$n-1$ cut pseudometrics: $\nu = \sum_{i=1}^{n-1} \alpha_i \tau_i$, $\alpha_1, \ldots, \alpha_{n-1} \ge 0$, where each
$\tau_i$ is a cut pseudometric, i.e., a line pseudometric induced by a mapping
$\psi_i\colon V \to \{0, 1\}$. (Consequently, by Exercise 2(a), every finite metric
isometrically embeddable into $\ell_1$ is a nonnegative linear combination of cut
pseudometrics.)
4. (An $\ell_p$-analogue of Proposition 15.5.2) Let $p \in [1, \infty)$ be fixed. Using
Exercise 2, formulate and prove an appropriate $\ell_p$-analogue of
Proposition 15.5.2.
5. (Finite $\ell_2$-metrics embed isometrically into $\ell_p$)
(a) Let $p$ be fixed. Check that if for all $\varepsilon > 0$, a finite metric space
$(V, \rho)$ can be $(1+\varepsilon)$-embedded into some $\ell_p^k$, $k = k(\varepsilon)$, then $(V, \rho)$ can be
isometrically embedded into $\ell_p^N$, where $N = \binom{|V|}{2}$. Use Exercise 2.
(b) Prove that every $n$-point set in $\ell_2$ can be isometrically embedded into
$\ell_p^N$.
6. (The second eigenvalue and edge expansion) Let G be an r-regular graph
with n vertices, and let A, B ~ V be disjoint. Prove that the number of
edges connecting A to B is at least e(A, B) 2: J..l2(G) . IAIJBI (use (15.3)
with a suitable vector x), and deduce that <p(G) 2: ~ J..l2(G). 0

5 A pseudometric v satisfies all the axioms of a metric except that we may have
v(x, y) = 0 even for two distinct points x and y.
7. (Expansion and measure concentration) Let us consider the vertex set
of a graph $G$ as a metric probability space, with the usual graph metric
and with the uniform probability measure $P$ (each vertex has measure
$\frac{1}{n}$, $n = |V(G)|$). Suppose that $\Phi = \Phi(G) > 0$ and that the maximum
degree of $G$ is $\Delta$. Prove the following measure concentration inequality:
If $A \subseteq V(G)$ satisfies $P[A] \ge \frac12$,
then $1 - P[A_t] \le \frac12 e^{-t\Phi/\Delta}$, where $A_t$
denotes the $t$-neighborhood of $A$.
8. (The Goemans-Williamson approximation to MAXCUT) Let $G = (V, E)$
be a given graph and let $n = |V|$. The MAXCUT problem for $G$ is to find
the maximum possible number of "crossing" edges for a partition $V =
A \cup B$ of the vertex set into two disjoint subsets, i.e., $\max_{A \subseteq V} e(A, V \setminus A)$.
This is an NP-complete problem. The exercise outlines a geometric
randomized algorithm that finds an approximate solution using semidefinite
programming.
(a) Check that the MAXCUT problem is equivalent to computing
\[ M_{\mathrm{opt}} = \max \Big\{ \tfrac12 \sum_{\{u,v\} \in E} (1 - x_u x_v) : x_v \in \{-1, 1\},\ v \in V \Big\}. \]
(b) Let
\[ M_{\mathrm{relax}} = \max \Big\{ \tfrac12 \sum_{\{u,v\} \in E} (1 - \langle y_u, y_v \rangle) : y_v \in \mathbf{R}^n,\ \|y_v\| = 1,\ v \in V \Big\}. \]
Clearly, $M_{\mathrm{relax}} \ge M_{\mathrm{opt}}$. Verify that this relaxed version of the problem is
an instance of a semidefinite program, that is, the maximum of a linear
function over the intersection of a polytope with the cone of all symmetric
positive semidefinite real matrices.
(c) Let $(y_v : v \in V)$ be some system of unit vectors in $\mathbf{R}^n$ for which $M_{\mathrm{relax}}$
is attained. Let $r \in \mathbf{R}^n$ be a random unit vector, and set $x_v = \mathrm{sgn} \langle y_v, r \rangle$,
$v \in V$. Let $M_{\mathrm{approx}} = \frac12 \sum_{\{u,v\} \in E} (1 - x_u x_v)$ for these $x_v$. Show that
the expectation, with respect to the random choice of $r$, of $M_{\mathrm{approx}}$ is
at least $0.878 \cdot M_{\mathrm{relax}}$ (consider the expected contribution of each edge
separately). So we obtain a polynomial-time randomized algorithm producing
a solution to MAXCUT whose expected value is at least about
88% of the optimal solution.
Remark. This algorithm is due to Goemans and Williamson [GW95].
Later, Håstad [Has97] proved that no polynomial-time algorithm can
produce better approximation in the worst case than about 94% unless
P=NP (also see Feige and Schechtman [FS01] for nice mathematics showing
that the Goemans-Williamson value 0.878... is, in a certain sense,
optimal for approaches based on semidefinite programming).
15.6 Upper Bounds for $\ell_\infty$-Embeddings

In this section we explain a technique for producing low-distortion embeddings
of finite metric spaces. Although we are mainly interested in Euclidean
embeddings, here we begin with embeddings into the space $\ell_\infty$, which are
somewhat simpler. We derive almost tight upper bounds.
Let $(V, \rho)$ be an arbitrary metric space. To specify an embedding
\[ f\colon (V, \rho) \to \ell_\infty^d \]
means to define $d$ functions $f_1, \ldots, f_d\colon V \to \mathbf{R}$, the coordinates of the embedded
points. If we aim at a $D$-embedding, without loss of generality we may
require it to be nonexpanding, which means that $|f_i(u) - f_i(v)| \le \rho(u, v)$ for
all $u, v \in V$ and all $i = 1, 2, \ldots, d$. The $D$-embedding condition then means
that for every pair $\{u, v\}$ of points of $V$, there is a coordinate $i = i(u, v)$ that
"takes care" of the pair: $|f_i(u) - f_i(v)| \ge \frac{1}{D} \rho(u, v)$.
One of the key tricks in constructions of such embeddings is to take each
$f_i$ as the distance to some suitable subset $A_i \subseteq V$; that is, $f_i(u) = \rho(u, A_i) =
\min_{a \in A_i} \rho(u, a)$. By the triangle inequality, we have $|\rho(u, A_i) - \rho(v, A_i)| \le
\rho(u, v)$ for any $u, v \in V$, and so such an embedding is automatically nonexpanding.
We "only" have to choose a suitable collection of the $A_i$ that take
care of all pairs $\{u, v\}$.
We begin with a simple case: an old observation showing that every finite
metric space embeds isometrically into $\ell_\infty$.

15.6.1 Proposition (Fréchet's embedding). Let $(V, \rho)$ be an arbitrary
$n$-point metric space. Then there is an isometric embedding $f\colon V \to \ell_\infty^n$.

Proof. Here the coordinates in $\ell_\infty^n$ are indexed by the points of $V$, and the
$v$th coordinate is given by $f_v(u) = \rho(u, v)$. In the notation above, we thus put
$A_v = \{v\}$. As we have seen, the embedding is nonexpanding by the triangle
inequality. On the other hand, the coordinate $v$ takes care of the pairs $\{u, v\}$
for all $u \in V$:
\[ \|f(u) - f(v)\|_\infty \ge |f_v(u) - f_v(v)| = \rho(u, v). \]
□
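
Fréchet's embedding is short enough to write out directly. The following is a minimal sketch, assuming the metric is given as a full distance matrix; the 5-cycle example and the function names are illustrative.

```python
def frechet_embedding(rho):
    """Map each point u to the vector (rho(u, v))_{v in V}."""
    n = len(rho)
    return [[rho[u][v] for v in range(n)] for u in range(n)]

def linf_distance(a, b):
    """The l_infinity distance of two coordinate vectors."""
    return max(abs(s - t) for s, t in zip(a, b))

# Hypothetical example: the graph metric of the 5-cycle.
n = 5
rho = [[min(abs(u - v), n - abs(u - v)) for v in range(n)] for u in range(n)]
f = frechet_embedding(rho)
```

Every pairwise $\ell_\infty$ distance between the rows of `f` then equals the corresponding entry of `rho`.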
The dimension of the image in this embedding can be reduced a little;
for example, we can choose some $v_0 \in V$ and remove the coordinate
corresponding to $v_0$, and the above proof still works. To reduce the dimension
significantly, though, we have to pay the price of distortion. For example, from
Corollary 15.3.4 we know that for distortions below 3, the dimension must
generally remain at least a fixed fraction of $n$. We prove an upper bound on
the dimension needed for embeddings with a given distortion, which nearly
matches the lower bounds in Corollary 15.3.4:

15.6.2 Theorem. Let $D = 2q-1 \ge 3$ be an odd integer and let $(V, \rho)$ be an
$n$-point metric space. Then there is a $D$-embedding of $V$ into $\ell_\infty^d$ with
\[ d = O(q n^{1/q} \ln n). \]

Proof. The basic scheme of the construction is as explained above: Each
coordinate is given by the distance to a suitable subset of $V$. This time the
subsets are chosen at random with suitable densities.
Let us consider two points $u, v \in V$. What are the sets $A$ such that
$|\rho(u, A) - \rho(v, A)| \ge \Delta$, for a given real $\Delta > 0$? For some $r \ge 0$, they must
intersect the closed $r$-ball around $u$ and avoid the open $(r+\Delta)$-ball around $v$,
or conversely (with the roles of $u$ and $v$ interchanged).

In the favorable situation where the closed $r$-ball around $u$ does not contain
many fewer points of $V$ than the open $(r+\Delta)$-ball around $v$, a random $A$
with a suitable density has a reasonable chance to work. Generally we have
no control over the distribution of points around $u$ and around $v$, but by
considering several suitable balls simultaneously, we can find a good pair of
balls. We also do not know the right density needed for the sample to work,
but since we have many coordinates, we can take samples of essentially all
possible densities.
Now we begin with the formal proof. We define an auxiliary parameter
$p = n^{-1/q}$, and for $j = 1, 2, \ldots, q$, we introduce the probabilities
$p_j = \min(\frac12, p^j)$. Further, let $m = \lceil 24 n^{1/q} \ln n \rceil$. For $i = 1, 2, \ldots, m$ and
$j = 1, 2, \ldots, q$, we choose a random subset $A_{ij} \subseteq V$. The sets (and the
corresponding coordinates in $\ell_\infty^{mq}$) now have double indices, and the index $j$
influences the "density" of $A_{ij}$. Namely, each point $v \in V$ has probability $p_j$
of being included into $A_{ij}$, and these events are mutually independent. The
choices of the $A_{ij}$, too, are independent for distinct indices $i$ and $j$. Here is a
schematic illustration of the sampling:

[Figure: schematic samples $A_{*1}$, $A_{*2}$, $A_{*3}$.]

We divide the coordinates in $\ell_\infty^d$ into $q$ blocks by $m$ coordinates. For
$v \in V$, we let
\[ f(v)_{ij} = \rho(v, A_{ij}), \quad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, q. \]
We claim that with a positive probability, this $f\colon V \to \ell_\infty^{mq}$ is a $D$-embedding.
We have already noted that $f$ is nonexpanding, and the following lemma
serves for showing that with a positive probability, every pair $\{u, v\}$ is taken
care of.
15.6.3 Lemma. Let $u, v$ be two distinct points of $V$. Then there exists an
index $j \in \{1, 2, \ldots, q\}$ such that if the set $A_{ij}$ is chosen randomly as above,
then the probability of the event
\[ |\rho(u, A_{ij}) - \rho(v, A_{ij})| \ge \tfrac{1}{D} \rho(u, v) \tag{15.6} \]
is at least $\frac{p}{12}$.
First, assuming this lemma, we finish the proof of the theorem. To show
that $f$ is a $D$-embedding, it suffices to show that with a nonzero probability,
for every pair $\{u, v\}$ there are $i, j$ such that the event (15.6) in the lemma
occurs for the set $A_{ij}$. Consider a fixed pair $\{u, v\}$ and select the appropriate
index $j$ as in the lemma. The probability that the event (15.6) does not occur
for any of the $m$ indices $i$ is at most $(1 - \frac{p}{12})^m \le e^{-pm/12} \le n^{-2}$. Since there
are $\binom{n}{2} < n^2$ pairs $\{u, v\}$, the probability that we fail to choose a good set
for any of the pairs is smaller than 1. □
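
The random construction can be sketched in a few lines. This is only an illustration under stated assumptions: the metric is given as a distance matrix, an empty sample $A_{ij}$ is mapped to the constant coordinate 0 (a convention chosen here, not taken from the text), and the 8-cycle with $q = 2$ is a hypothetical example. The sketch builds the map and lets one check that it is nonexpanding; whether a particular random outcome is a $D$-embedding still has to be checked separately.

```python
import math
import random

def random_subset_embedding(rho, q, rng):
    """Coordinates rho(v, A_ij) for random A_ij of density p_j = min(1/2, p^j)."""
    n = len(rho)
    p = n ** (-1.0 / q)
    m = math.ceil(24 * n ** (1.0 / q) * math.log(n))
    coords = [[] for _ in range(n)]
    for j in range(1, q + 1):
        p_j = min(0.5, p ** j)
        for _ in range(m):
            A = [a for a in range(n) if rng.random() < p_j]
            for v in range(n):
                # Assumption of this sketch: an empty A gives the constant coordinate 0.
                coords[v].append(min((rho[v][a] for a in A), default=0))
    return coords

def linf_distance(a, b):
    """The l_infinity distance of two coordinate vectors."""
    return max(abs(s - t) for s, t in zip(a, b))

# Hypothetical example: the graph metric of the 8-cycle, with q = 2 (so D = 3).
n, q = 8, 2
rho = [[min(abs(u - v), n - abs(u - v)) for v in range(n)] for u in range(n)]
f = random_subset_embedding(rho, q, random.Random(0))
```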

Proof of Lemma 15.6.3. Set $\Delta = \frac{1}{D} \rho(u, v)$. Let $B_0 = \{u\}$, let $B_1$ be
the (closed) $\Delta$-ball around $v$, let $B_2$ be the (closed) $2\Delta$-ball around $u$, ...,
finishing with $B_q$, which is a $q\Delta$-ball around $u$ (if $q$ is even) or around $v$ (if $q$
is odd). The parameters are chosen so that the radii of $B_{q-1}$ and $B_q$ add up
to $\rho(u, v)$; that is, the last two balls just touch (recall that $D = 2q-1$).

Let $n_t$ denote the number of points of $V$ in $B_t$.
We want to select an index $j$ (and an index $t$) such that
\[ n_t \ge n^{(j-1)/q} \quad \text{and} \quad n_{t+1} \le n^{j/q}. \tag{15.7} \]
To this end, we divide the interval $[1, n]$ into $q$ intervals $I_1, I_2, \ldots, I_q$, where
\[ I_j = \big[ n^{(j-1)/q}, n^{j/q} \big]. \]
If the sequence $(n_1, n_2, \ldots, n_q)$ is not monotone increasing, i.e., if $n_{t+1} < n_t$
for some $t$, then (15.7) holds for the $j$ such that $I_j$ contains $n_t$. On the other
hand, if $1 = n_0 \le n_1 \le \cdots \le n_q \le n$, then by the pigeonhole principle, there
exist $t$ and $j$ such that the interval $I_j$ contains both $n_t$ and $n_{t+1}$. Then (15.7)
holds for this $j$ as well.
In this way, we have selected the index $j$ whose existence is claimed in the
lemma. We will show that with probability at least $\frac{p}{12}$, the set $A_{ij}$, randomly
selected with point probability $p_j$, includes a point of $B_t$ (event $E_1$) and is
disjoint from the interior of $B_{t+1}$ (event $E_2$); such an $A_{ij}$ satisfies (15.6).
Since $B_t$ and the interior of $B_{t+1}$ are disjoint, the events $E_1$ and $E_2$ are
independent.
We calculate
\[ \mathrm{Prob}[E_1] = 1 - \mathrm{Prob}[A_{ij} \cap B_t = \emptyset] = 1 - (1 - p_j)^{n_t} \ge 1 - e^{-p_j n_t}. \]
Using (15.7), we have $p_j n_t \ge p_j n^{(j-1)/q} = p_j p^{-j+1} = \min(\frac12, p^j) p^{-j+1} \ge
\min(\frac12, p)$. For $p \ge \frac12$, we get $\mathrm{Prob}[E_1] \ge 1 - e^{-1/2} > \frac13 \ge \frac{p}{3}$, while for
$p < \frac12$ we have $\mathrm{Prob}[E_1] \ge 1 - e^{-p}$, and a bit of calculus verifies that the
last expression is well above $\frac{p}{3}$ for all $p \in [0, \frac12)$.
Further,
\[ \mathrm{Prob}[E_2] \ge (1 - p_j)^{n_{t+1}} \ge (1 - p_j)^{n^{j/q}} \ge (1 - p_j)^{1/p_j} \ge \tfrac14 \]
(since $p_j \le \frac12$). Thus $\mathrm{Prob}[E_1 \cap E_2] \ge \frac{p}{12}$, which proves the lemma. □
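
The "bit of calculus" can be sanity-checked on a grid. This is a minimal numerical sketch, not a proof: it evaluates the lower bound $1 - e^{-\min(1/2,\,p)}$ for $\mathrm{Prob}[E_1]$ against $p/3$, and $(1 - p_j)^{1/p_j}$ against $\frac14$, at finitely many sample points. The grid and the variable names are arbitrary choices.

```python
import math

def prob_e1_lower_bound(p):
    """The bound 1 - exp(-min(1/2, p)) on Prob[E_1] used in the proof."""
    return 1.0 - math.exp(-min(0.5, p))

# p ranges over (0, 1]; the ratio to p/3 should stay above 1 everywhere.
grid_p = [k / 1000 for k in range(1, 1001)]
worst_ratio_e1 = min(prob_e1_lower_bound(p) / (p / 3) for p in grid_p)

# p_j ranges over (0, 1/2]; (1 - p_j)^(1/p_j) should never drop below 1/4.
grid_pj = [k / 1000 for k in range(1, 501)]
worst_e2 = min((1 - pj) ** (1 / pj) for pj in grid_pj)
```

On this grid the first minimum is attained at $p = 1$ and the second at $p_j = \frac12$, where it equals $\frac14$.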

Bibliography and remarks. The embedding method discussed
in this section was found by Bourgain [Bou85], who used it to prove
Theorem 15.7.1 explained in the subsequent section. Theorem 15.6.2
is from [Mat96b].

Exercises
1. (a) Find an isometric embedding of $\ell_1^d$ into $\ell_\infty^{2^d}$.
(b) Explain how an embedding as in (a) can be used to compute the
diameter of an $n$-point set in $\ell_1^d$ in time $O(d 2^d n)$.
2. Show that if the unit ball $K$ of some finite-dimensional normed space
is a convex polytope with $2m$ facets, then that normed space embeds
isometrically into $\ell_\infty^m$.
(Using results on approximation of convex bodies by polytopes, this yields
useful approximate embeddings of arbitrary norms into $\ell_\infty^k$.)
3. Deduce from Theorem 15.6.2 that every $n$-point metric space can be
$D$-embedded into $\ell_2^k$ with $D = O(\log^2 n)$ and $k = O(\log^2 n)$.
15.7 Upper Bounds for Euclidean Embeddings

By a method similar to the one shown in the previous section, one can also
prove a tight upper bound on Euclidean embeddings; the method was actually
invented for this problem.
15.7.1 Theorem (Bourgain's embedding into $\ell_2$). Every $n$-point metric
space $(V, \rho)$ can be embedded into a Euclidean space with distortion at most
$O(\log n)$.
The overall strategy of the embedding is similar to the embedding into $\ell_\infty^d$
in the proof of Theorem 15.6.2. The coordinates in $\ell_2^d$ are given by distances
to suitable subsets. The situation is slightly more complicated than before:
For embedding into $\ell_\infty^d$, it was enough to exhibit one coordinate "taking care"
of each pair, whereas for the Euclidean embedding, many of the coordinates
will contribute significantly to every pair. Here is the appropriate analogue
of Lemma 15.6.3.
15.7.2 Lemma. Let u, v E V be two distinct points. Then there exist real
numbers D.I,D.2, ... ,D.q ::::: 0 with D.1 + ... + D.q = tp(u,v), where q =
llog2nJ+l, and such that the following holds for each j = 1,2, ... ,q: If
Aj ~ V is a randomly chosen subset of V, with each point of V included in
Aj independently with probability 2- j , then the probability Pj of the event

satisfies Pj ::::: 112 ,

Proof. We fix u and v. We define $r_q = \frac14 \rho(u, v)$, and for
$j = 0, 1, \ldots, q-1$ we let $r_j$ be the smallest radius such that both
$|B(u, r_j)| \ge 2^j$ and $|B(v, r_j)| \ge 2^j$, where, as usual,
$B(x, r) = \{y \in V : \rho(x, y) \le r\}$. We are going to show that
the claim of the lemma holds with $\Delta_j = r_j - r_{j-1}$.
Fix $j \in \{1, 2, \ldots, q\}$ and let $A_j \subseteq V$ be a random sample
with point probability $2^{-j}$. By the definition of $r_j$,
$|B^\circ(u, r_j)| < 2^j$ or $|B^\circ(v, r_j)| < 2^j$, where
$B^\circ(x, r) = \{y \in V : \rho(x, y) < r\}$ denotes the open ball (this holds
for $j = q$, too, because $|V| < 2^q$). We choose the notation u, v so that
$|B^\circ(u, r_j)| < 2^j$. A random set $A_j$ is good if it intersects
$B(v, r_{j-1})$ and misses $B^\circ(u, r_j)$. The former set has cardinality at
least $2^{j-1}$ and the latter at most $2^j$. The calculation of the probability
that $A_j$ has these properties is identical to the calculation in the proof of
Lemma 15.6.3 with $p = \frac12$. □
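
The quantities in this proof can be explored numerically on a small example.
The following Python sketch is an illustration only, not part of the original
argument: it computes radii and increments for a finite metric given as a
distance matrix and estimates $P_j$ by sampling. The function names are
hypothetical, and the sketch assumes that each radius is truncated at
$\frac14 \rho(u, v)$; this truncation is an assumption added here so that the
increments stay nonnegative.

```python
import random


def ball_size(dist, x, r):
    """Number of points y with dist[x][y] <= r (closed ball around x)."""
    return sum(1 for d in dist[x] if d <= r)


def radii_and_increments(dist, u, v):
    """Return (r, delta) with r = [r_0..r_q] and delta = [Delta_1..Delta_q].

    Assumption (not spelled out in the text): every r_j is capped at
    rho(u, v) / 4, which keeps all increments nonnegative.
    """
    n = len(dist)
    q = n.bit_length()  # equals floor(log2 n) + 1
    cap = dist[u][v] / 4.0
    # Ball sizes only change at distances measured from u or from v.
    candidates = sorted(set(dist[u]) | set(dist[v]))
    r = []
    for j in range(q):
        need = 2 ** j
        smallest = next(c for c in candidates
                        if ball_size(dist, u, c) >= need
                        and ball_size(dist, v, c) >= need)
        r.append(min(smallest, cap))
    r.append(cap)  # r_q = rho(u, v) / 4
    delta = [r[j] - r[j - 1] for j in range(1, q + 1)]
    return r, delta


def estimate_pj(dist, u, v, j, delta_j, trials=2000, seed=0):
    """Monte Carlo estimate of P_j for samples with point probability 2^-j."""
    rng = random.Random(seed)
    n = len(dist)
    hits = 0
    for _ in range(trials):
        sample = [x for x in range(n) if rng.random() < 2.0 ** (-j)]
        if not sample:
            continue  # distance to the empty set is undefined; count as a miss
        du = min(dist[u][a] for a in sample)
        dv = min(dist[v][a] for a in sample)
        if abs(du - dv) >= delta_j:
            hits += 1
    return hits / trials
```

For instance, for the path metric on 8 points with u and v the two endpoints,
`radii_and_increments` returns increments that sum to $\frac14 \rho(u, v)$, and
`estimate_pj` can then be compared with the bound $\frac{1}{12}$ for each j.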

In the subsequent proof of Theorem 15.7.1 we will construct the embedding
in a slightly roundabout way, which sheds some light on what is really
going on. Define a line pseudometric on V to be any pseudometric $\nu$ induced
by a mapping $\varphi\colon V \to \mathbf{R}$, that is, given by
$\nu(u, v) = |\varphi(u) - \varphi(v)|$. For each $A \subseteq V$, let $\nu_A$
be the line pseudometric corresponding to the mapping $v \mapsto \rho(v, A)$.
As we have noted, each $\nu_A$ is dominated by $\rho$, i.e., $\nu_A \le \rho$
390 Chapter 15: Embedding Finite Metric Spaces into Normed Spaces

(the inequality between two (pseudo)metrics on the same point set means
inequality for each pair of points).
The following easy lemma shows that if a metric $\rho$ on V can be
approximated by a convex combination of line pseudometrics, each of them
dominated by $\rho$, then a good embedding of $(V, \rho)$ into $\ell_2$ exists.
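15.7.3 Lemma. Let $(V, \rho)$ be a finite metric space, and let
$\nu_1, \ldots, \nu_N$ be line pseudometrics on V with $\nu_i \le \rho$ for all
i and such that
\[ \sum_{i=1}^{N} \alpha_i \nu_i \ge \frac{1}{D}\,\rho \]
for some nonnegative $\alpha_1, \ldots, \alpha_N$ summing up to 1. Then
$(V, \rho)$ can be D-embedded into $\ell_2^N$.

Proof. Let $\varphi_i\colon V \to \mathbf{R}$ be a mapping inducing the line
pseudometric $\nu_i$. We define the embedding $f\colon V \to \ell_2^N$ by
\[ f(v)_i = \sqrt{\alpha_i} \cdot \varphi_i(v). \]
Then, on the one hand,
\[ \|f(u) - f(v)\|^2 = \sum_{i=1}^{N} \alpha_i \nu_i(u, v)^2 \le \rho(u, v)^2, \]
because all $\nu_i$ are dominated by $\rho$ and $\sum \alpha_i = 1$. On the
other hand,
\[ \|f(u) - f(v)\| = \Bigl(\sum_{i=1}^{N} \alpha_i \nu_i(u, v)^2\Bigr)^{1/2}
   = \Bigl(\sum_{i=1}^{N} \alpha_i\Bigr)^{1/2}
     \Bigl(\sum_{i=1}^{N} \alpha_i \nu_i(u, v)^2\Bigr)^{1/2}
   \ge \sum_{i=1}^{N} \alpha_i \nu_i(u, v) \]
by Cauchy-Schwarz, and the latter expression is at least
$\frac{1}{D}\rho(u, v)$ by the assumption. □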
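
The embedding from this proof is easy to try out. The following Python sketch
is a minimal illustration under stated assumptions: the line pseudometrics are
passed as lists of values $\varphi_i(v)$ indexed by the points $0, \ldots, n-1$,
and the function names are hypothetical.

```python
import math


def embed_from_line_maps(phis, alphas):
    """Return the points f(v) with f(v)_i = sqrt(alpha_i) * phi_i(v).

    phis[i][v] is the value of the i-th inducing map at point v, and
    alphas[i] is the (nonnegative) weight of the i-th line pseudometric.
    """
    n = len(phis[0])
    return [[math.sqrt(a) * phi[v] for phi, a in zip(phis, alphas)]
            for v in range(n)]


def euclid(x, y):
    """Euclidean distance between two coordinate lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def convex_combination(phis, alphas, u, v):
    """Value of sum_i alpha_i * nu_i(u, v) for the given pair."""
    return sum(a * abs(phi[u] - phi[v]) for phi, a in zip(phis, alphas))
```

As a usage example, for the 4-cycle metric one can take the two maps
$v \mapsto \rho(v, \{0\})$ and $v \mapsto \rho(v, \{1\})$ with weights
$\frac12, \frac12$; for every pair, `euclid` of the images then lies between
`convex_combination` and the original distance, as the two inequalities in the
proof predict.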

Proof of Theorem 15.7.1. As was remarked above, each of the line
pseudometrics $\nu_A$ corresponding to the mapping $v \mapsto \rho(v, A)$ is
dominated by $\rho$. It remains to observe that Lemma 15.7.2 provides a convex
combination of these line pseudometrics that is bounded from below by
$\frac{1}{48q} \cdot \rho$. The coefficient of each $\nu_A$ in this convex
combination is given by the probability of A appearing as one of the sets
$A_j$ in Lemma 15.7.2. More precisely, write $\pi_j(A)$ for the probability
that a random subset of V, with points picked independently with probability
$2^{-j}$, equals A. Then the claim of Lemma 15.7.2 implies, for every pair
$\{u, v\}$,
\[ \sum_{A \subseteq V} \pi_j(A) \cdot \nu_A(u, v) \ge \frac{1}{12}\,\Delta_j. \]
Summing over $j = 1, 2, \ldots, q$, we have
\[ \sum_{A \subseteq V} \Bigl(\sum_{j=1}^{q} \pi_j(A)\Bigr) \nu_A(u, v)
   \ge \frac{1}{12}\,(\Delta_1 + \cdots + \Delta_q) = \frac{1}{48}\,\rho(u, v). \]
Dividing by q and using $\sum_{A \subseteq V} \pi_j(A) = 1$, we arrive at
\[ \sum_{A \subseteq V} \alpha_A \nu_A \ge \frac{1}{48q}\,\rho, \qquad
   \sum_{A \subseteq V} \alpha_A = 1, \]
with $\alpha_A = \frac{1}{q} \sum_{j=1}^{q} \pi_j(A)$. Lemma 15.7.3 now gives
embeddability into $\ell_2$ with distortion at most 48q. Theorem 15.7.1 is
proved. □
Remarks. Almost the same proof with a slight modification of Lemma 15.7.3
shows that for each $p \in [1, \infty)$, every n-point metric space can be
embedded into $\ell_p$ with distortion $O(\log n)$; see Exercise 1.
The proof as stated produces an embedding into a space of dimension $2^n$,
since there are $2^n$ subsets $A \subseteq V$, each of them yielding one
coordinate. To reduce the dimension, one can argue that not all the sets A are
needed: by suitable Chernoff-type estimates, it follows that it is sufficient
to choose $O(\log n)$ random sets with point probability $2^{-j}$ for each j,
i.e., $O(\log^2 n)$ sets altogether (Exercise 2). Of course, for Euclidean
embeddings, an even better dimension $O(\log n)$ is obtained using the
Johnson-Lindenstrauss flattening lemma, but for other $\ell_p$, no flattening
lemma is available.
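
The randomized, dimension-reduced version of the embedding can be sketched in
a few lines of Python. This is an illustration under stated assumptions rather
than a tuned implementation: the number `m` of sets per scale (here defaulting
to 4q) is a hypothetical stand-in for "a sufficiently large constant times
log n", empty samples are simply skipped, and all function names are
hypothetical.

```python
import math
import random


def bourgain_embedding(dist, m=None, seed=0):
    """Embed the metric given by the matrix dist using random subsets.

    For each scale j = 1..q, m random sets with point probability 2^-j are
    drawn; every nonempty set A contributes the coordinate rho(v, A).
    Coordinates are scaled by 1/sqrt(number of sets), which makes the map
    nonexpanding.
    """
    n = len(dist)
    q = n.bit_length()  # floor(log2 n) + 1
    if m is None:
        m = 4 * q  # hypothetical choice, not taken from the text
    rng = random.Random(seed)
    sets = []
    for j in range(1, q + 1):
        for _ in range(m):
            sample = [x for x in range(n) if rng.random() < 2.0 ** (-j)]
            if sample:  # the distance to an empty set is undefined
                sets.append(sample)
    if not sets:
        raise ValueError("all random samples were empty; try another seed")
    scale = 1.0 / math.sqrt(len(sets))
    return [[scale * min(dist[v][a] for a in s) for s in sets]
            for v in range(n)]


def euclid(x, y):
    """Euclidean distance between two coordinate lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def distortion(dist, pts):
    """Maximum expansion times maximum contraction over all pairs."""
    n = len(dist)
    expansion = contraction = 0.0
    for u in range(n):
        for v in range(u + 1, n):
            e = euclid(pts[u], pts[v])
            if e == 0.0:
                return math.inf  # two points collapsed: not an embedding
            expansion = max(expansion, e / dist[u][v])
            contraction = max(contraction, dist[u][v] / e)
    return expansion * contraction
```

Calling `distortion(dist, bourgain_embedding(dist))` on small examples gives a
feeling for how far the actual distortion is from the worst-case bound.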
An algorithmic application: approximating the sparsest cut. We
know that every n-point metric space can be $O(\log n)$-embedded into
$\ell_1^d$ with $d = O(\log^2 n)$. By inspecting the proof, it is not difficult
to give a randomized algorithm that computes such an embedding in polynomial
expected time. We show a neat algorithmic application to a graph-theoretic
problem.
Let G = (V, E) be a graph. A cut in G is a partition of V into two
nonempty subsets A and $B = V \setminus A$. The density of the cut (A, B) is
$\frac{e(A,B)}{|A| \cdot |B|}$, where e(A, B) is the number of edges connecting
A and B. Given G, we would like to find a cut of the smallest possible density.
This problem is NP-hard, and here we discuss an efficient algorithm for finding
an approximate answer: a cut whose density is at most $O(\log n)$ times larger
than the density of the sparsest cut, where n = |V| (at the time of writing,
this was the best known approximation guarantee for any polynomial-time
algorithm). Note that this also allows us to approximate the edge expansion of
G (discussed in Section 15.5) within a multiplicative factor of $O(\log n)$.
First we reformulate the problem equivalently using cut pseudometrics. A
cut pseudometric on V is a pseudometric $\tau$ corresponding to some cut (A, B),
with $\tau(u, v) = \tau(v, u) = 1$ for $u \in A$ and $v \in B$ and
$\tau(u, v) = 0$ for $u, v \in A$ or $u, v \in B$. In other words, a cut
pseudometric is a line pseudometric induced by a mapping
$\psi\colon V \to \{0, 1\}$ (excluding the trivial case where all of V gets
mapped to the same point). Letting $F = \binom{V}{2}$, the density of the cut
(A, B) can be written as $\tau(E)/\tau(F)$, where $\tau$ is the corresponding
cut pseudometric and $\tau(E) = \sum_{\{u,v\} \in E} \tau(u, v)$. Therefore, we
would like to minimize the ratio $R_1(\tau) = \tau(E)/\tau(F)$ over all cut
pseudometrics $\tau$.
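
To make the reformulation concrete, the following Python sketch computes the
density of a cut, evaluates the ratio $\tau(E)/\tau(F)$ for an arbitrary
pseudometric given as a function, and finds the sparsest cut of a very small
graph by exhaustive search. It is meant only as an illustration: the vertex
set is assumed to be $\{0, \ldots, n-1\}$, the function names are hypothetical,
and the exhaustive search takes exponential time.

```python
from itertools import combinations


def cut_density(n, edges, side):
    """Density e(A, B) / (|A| * |B|) of the cut (A, B) with A = side."""
    a = set(side)
    crossing = sum(1 for u, v in edges if (u in a) != (v in a))
    return crossing / (len(a) * (n - len(a)))


def ratio(n, edges, tau):
    """tau(E) / tau(F) for a pseudometric tau given as a function tau(u, v)."""
    t_e = sum(tau(u, v) for u, v in edges)
    t_f = sum(tau(u, v) for u, v in combinations(range(n), 2))
    return t_e / t_f


def cut_pseudometric(side):
    """The cut pseudometric of the cut (A, V minus A) with A = side."""
    a = set(side)
    return lambda u, v: 1 if (u in a) != (v in a) else 0


def sparsest_cut_bruteforce(n, edges):
    """Try all cuts; return (density, A). Only feasible for tiny n."""
    best = None
    for size in range(1, n // 2 + 1):
        for side in combinations(range(n), size):
            d = cut_density(n, edges, side)
            if best is None or d < best[0]:
                best = (d, set(side))
    return best
```

For two triangles joined by a single edge, the search returns the cut that
separates the triangles, and `ratio` applied to the corresponding
`cut_pseudometric` gives the same value as `cut_density`.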
In the first step of the algorithm we relax the problem, and we find a
pseudometric, not necessarily a cut one, minimizing the ratio
$R_1(\rho) = \rho(E)/\rho(F)$. This can be done efficiently by linear
programming. The minimized function looks nonlinear, but we can get around this
by a simple trick: We postulate the additional condition $\rho(F) = 1$ and
minimize the linear function $\rho(E)$. The variables in the linear program are
the $\binom{n}{2}$ numbers $\rho(u, v)$ for $\{u, v\} \in F$, and the
constraints are $\rho(u, v) \ge 0$ (for all u, v), $\rho(F) = 1$, and those
expressing the triangle inequalities for all triples $u, v, w \in V$.
Having computed a $\rho_0$ minimizing $R_1(\rho)$, we find a D-embedding f of
$(V, \rho_0)$ into some $\ell_1^d$ with $D = O(\log n)$. If $\sigma_0$ is the
pseudometric induced on V by this f, we clearly have
$R_1(\sigma_0) \le D \cdot R_1(\rho_0)$. Now since $\sigma_0$ is an
$\ell_1$-pseudometric, it can be expressed as a nonnegative linear combination
of suitable cut pseudometrics (Exercise 15.5.3):
$\sigma_0 = \sum_{i=1}^{N} \alpha_i \tau_i$, $\alpha_1, \ldots, \alpha_N > 0$,
$N \le d(n-1)$. It is not difficult to check that
$R_1(\sigma_0) \ge \min\{R_1(\tau_i) : i = 1, 2, \ldots, N\}$ (Exercise 3).
Therefore, at least one of the $\tau_i$ is a cut pseudometric satisfying
$R_1(\tau_i) \le R_1(\sigma_0) \le D \cdot R_1(\rho_0) \le D \cdot R_1(\tau_0)$,
where $\tau_0$ is a cut pseudometric with the smallest possible $R_1(\tau_0)$.
Therefore, the cut corresponding to this $\tau_i$ has density at most
$O(\log n)$ times larger than the sparsest possible cut.
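
The last step, passing from an $\ell_1$-embedding to a single cut, can be
sketched as follows. The Python code below is an illustration under stated
assumptions: it uses the standard decomposition of an $\ell_1$-pseudometric
into threshold cuts along each coordinate (one natural way to obtain at most
d(n-1) cuts), the embedding is passed as a list of coordinate tuples, and the
function names are hypothetical.

```python
def cut_density(n, edges, side):
    """Density e(A, B) / (|A| * |B|) of the cut (A, B) with A = side."""
    crossing = sum(1 for u, v in edges if (u in side) != (v in side))
    return crossing / (len(side) * (n - len(side)))


def threshold_cuts(points):
    """Yield pairs (weight, A) whose weighted cut pseudometrics sum to the
    l_1 distance induced by `points`.

    In every coordinate the points are sorted; each gap between consecutive
    values gives the cut "points below the gap" with weight equal to the gap.
    """
    n = len(points)
    for c in range(len(points[0])):
        order = sorted(range(n), key=lambda v: points[v][c])
        for i in range(n - 1):
            gap = points[order[i + 1]][c] - points[order[i]][c]
            if gap > 0:
                yield gap, frozenset(order[: i + 1])


def best_threshold_cut(n, edges, points):
    """Return (density, A) for the sparsest among the threshold cuts."""
    return min(((cut_density(n, edges, side), side)
                for _, side in threshold_cuts(points)),
               key=lambda item: item[0])
```

In the full algorithm, `points` would be the images of the vertices under the
embedding of the optimal LP pseudometric; the linear-programming step itself is
not shown here.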

Bibliography and remarks. Theorem 15.7.1 is due to Bourgain
[Bou85]. The algorithmic application to approximating the sparsest
cut uses the idea of an algorithm for a somewhat more complicated
problem (multicommodity flow) found by Linial et al. [LLR95] and
independently by Aumann and Rabani [AR98].
We will briefly discuss further results proved by variations of
Bourgain's embedding technique. Many of them have been obtained in the
study of approximation algorithms and imply strong algorithmic results.
Tree metrics. Let $\mathcal{G}$ be a class of graphs and consider a graph
$G \in \mathcal{G}$. Each positive weight function
$w\colon E(G) \to (0, \infty)$ defines a metric on V(G), namely the
shortest-path metric, where the length of a path is the sum of the weights of
its edges. All subspaces of the resulting metric spaces are referred to as
$\mathcal{G}$-metrics. A tree metric is a $\mathcal{T}$-metric for
$\mathcal{T}$ the class of all trees. Tree metrics generally behave much better
than arbitrary metrics, but for embedding problems they are far from
trivial.
Bourgain [Bou86] proved, using martingales, a surprising lower
bound for embedding tree metrics into $\ell_2$: A tree metric on n points
requires distortion $\Omega(\sqrt{\log\log n})$ in the worst case. His example
is the complete binary tree with unit edge lengths, and for that example,
he also constructed an embedding with $O(\sqrt{\log\log n})$ distortion. For
embedding the complete binary tree into $\ell_p$, p > 1, the distortion is
$\Omega((\log\log n)^{\min(1/2,\,1/p)})$, with the constant of proportionality
depending on p and tending to 0 as $p \to 1$. (For Banach-space specialists, we
also remark that all tree metrics can be embedded into a given Banach
space Z with bounded distortion if and only if Z is not superreflexive.)
In Matousek [Mat99b] it was shown that the complete binary tree is
essentially the worst example; that is, every n-point tree metric can be
embedded into $\ell_p$ with distortion $O((\log\log n)^{\min(1/2,\,1/p)})$. An
alternative, elementary proof was given for the matching lower bound (see
Exercise 5 for a weaker version). Another proof of the lower bound,
very short but applying only for embeddings into $\ell_2$, was found by
Linial and Saks [LS02] (Exercise 6).
In the notes to Section 15.3 we mentioned that general n-point
metric spaces require worst-case distortion
$\Omega(n^{1/\lfloor (d+1)/2 \rfloor})$ for embedding into $\ell_2^d$,
$d \ge 2$ fixed. Gupta [Gup00] proved that for n-point tree metrics,
$O(n^{1/(d-1)})$-embeddings into $\ell_2^d$ are possible. The best known
lower bound is $\Omega(n^{1/d})$, from a straightforward volume argument.
Babilon, Matousek, Maxova, and Valtr [BMMV02] showed that every
n-vertex tree with unit-length edges can be $O(\sqrt{n})$-embedded into
$\ell_2^2$.
Planar-graph metrics and metrics with excluded minor. A planar-graph
metric is a $\mathcal{P}$-metric with $\mathcal{P}$ standing for the class of
all planar graphs (the shorter but potentially confusing term planar metric
is used in the literature). Rao [Rao99] proved that every n-point
planar-graph metric can be embedded into $\ell_2$ with distortion only
$O(\sqrt{\log n})$, as opposed to $\log n$ for general metrics. More generally,
the same method shows that whenever H is a fixed graph and Excl(H)
is the class of all graphs not containing H as a minor, then
Excl(H)-metrics can be $O(\sqrt{\log n})$-embedded into $\ell_2$. For a
matching lower bound, valid already for the class Excl($K_4$) (series-parallel
graphs), and consequently for planar-graph metrics, see Exercise 15.4.2.
We outline Rao's method of embedding. We begin with graphs
where all edges have unit weight (this is the setting in [Rao99], but
our presentation differs in some details), and then we indicate how
graphs with arbitrary edge weights can be treated. The main new
ingredient in Rao's method, compared to Bourgain's approach, is a
result of Klein, Plotkin, and Rao [KPR93] about a decomposition of
graphs with an excluded minor into pieces of low diameter. Here is the
decomposition procedure.
Let G be a graph, let $\rho$ be the corresponding graph metric (with all
edges having unit length), and let $\Delta$ be an integer parameter. We fix a
vertex $v_0 \in V(G)$ arbitrarily, we choose an integer
$r \in \{0, 1, \ldots, \Delta-1\}$ uniformly at random, and we let
$B_1 = \{v \in V(G) : \rho(v, v_0) \equiv r \pmod{\Delta}\}$. Deleting the
vertices of $B_1$ from G partitions the remaining vertices into connected
components; this is the first level of the decomposition. For each of these
components of $G \setminus B_1$, we repeat the same procedure; $\Delta$ remains
unchanged and r is chosen anew at random (but we can use the same r for all the
components). Let $B_2$ be the set of vertices deleted from G in this second
round, taken together for all the components. The second level of the
decomposition consists of the connected components of
$G \setminus (B_1 \cup B_2)$, and decompositions of levels 3, 4, ... can be
produced similarly. The following schematic drawing illustrates the two-level
decomposition; the graph is marked as the gray area, and the vertices of $B_1$
and $B_2$ are indicated by the solid and dashed arcs, respectively.

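[Figure: schematic two-level decomposition; the graph is the gray area, with
$B_1$ drawn as solid arcs and $B_2$ as dashed arcs.]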

For planar graphs, it suffices to use a 3-level decomposition, and for
every fixed graph H, there is a suitable k = k(H) such that a k-level
decomposition is appropriate for all graphs $G \in$ Excl(H).
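
The decomposition procedure described above can be sketched in Python as
follows. This is a minimal illustration for unweighted graphs, under stated
assumptions: the graph is an adjacency dictionary, distances in later levels
are measured inside the current component, the root of each component is
picked by `min` (any vertex would do), and the function names are
hypothetical.

```python
import random
from collections import deque


def bfs_dist(adj, src, alive):
    """Hop distances from src within the subgraph induced by `alive`."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y in alive and y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def components(adj, alive):
    """Connected components of the subgraph induced by `alive`."""
    seen, comps = set(), []
    for s in alive:
        if s not in seen:
            comp = set(bfs_dist(adj, s, alive))
            seen |= comp
            comps.append(comp)
    return comps


def decompose(adj, delta, levels, seed=0):
    """Return (B, components) after `levels` rounds of the procedure."""
    rng = random.Random(seed)
    boundary = set()
    comps = components(adj, set(adj))
    for _ in range(levels):
        r = rng.randrange(delta)  # the same r is used for all components
        next_comps = []
        for comp in comps:
            v0 = min(comp)  # arbitrary root of this component
            dist = bfs_dist(adj, v0, comp)
            cut = {v for v in comp if dist[v] % delta == r}
            boundary |= cut
            next_comps.extend(components(adj, comp - cut))
        comps = next_comps
    return boundary, comps
```

On a grid graph, for example, `decompose(adj, delta, 3)` returns a boundary set
B together with the connected components of the graph with B removed.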
Let $B = B_1 \cup \cdots \cup B_k$; this can be viewed as the boundary of the
components in the k-level decomposition. Here are the key properties
of the decomposition:
(i) For each vertex $v \in V(G)$, we have $\rho(v, B) \ge c_1 \Delta$ with
probability at least $c_2$, for suitable constants $c_1, c_2 > 0$. The
probability is with respect to the random choices of the parameters r at each
level of the decomposition. (This is not hard to see; for example, in the
first level of the decomposition, for every fixed v, $\rho(v, v_0)$ is some
fixed number and it has a good chance to be at least $c_1 \Delta$ away, modulo
$\Delta$, from a random r.)
(ii) Each component in the resulting decomposition has diameter
at most $O(\Delta)$. (This is not so easy to prove, and it is where one needs
k = k(H) sufficiently large. For $H = K_{3,3}$, which includes the case of
planar graphs, the proof is a relatively simple case analysis.)
Next, we describe the embedding of V(G) into $\ell_2$ in several steps.
First we consider $\Delta$ and the decomposition as above fixed, and we let
$G_1, \ldots, G_m$ be the components of $G \setminus B$. For all the $G_i$, we
choose random signs $\sigma(G_i) \in \{-1, +1\}$ uniformly and independently.
For a vertex $x \in V(G)$, we define $\sigma(x) = 0$ if $x \in B$ and
$\sigma(x) = \sigma(G_i)$ if $x \in V(G_i)$. Then we define the mapping
$\varphi_{B,\sigma}\colon V(G) \to \mathbf{R}$ by
$\varphi_{B,\sigma}(x) = \sigma(x) \cdot \rho(x, B)$ (the distance of x to the
boundary signed by the component's sign). This $\varphi_{B,\sigma}$ induces a
line pseudometric $\nu_{B,\sigma}$, and it is easy to see that
$\nu_{B,\sigma}$ is dominated by $\rho$.
Let C be a constant such that all the $G_i$ have diameter at most
$C\Delta$, and let $x, y \in V(G)$ be such that
$C\Delta < \rho(x, y) \le 2C\Delta$. Such x and y certainly lie in distinct
components, and $\sigma(x) \ne \sigma(y)$ with probability $\frac12$. With
probability at least $c_2$, we have $\rho(x, B) \ge c_1 \Delta$, and so with a
fixed positive probability, $\nu_{B,\sigma}$ places x and y at distance
at least $c_1 \Delta$.
Now, we still keep $\Delta$ fixed and consider $\nu_{B,\sigma}$ for all
possible B and $\sigma$. Letting $\alpha_{B,\sigma}$ be the probability that a
particular pair $(B, \sigma)$ results from the decomposition procedure, we have
\[ \sum_{B,\sigma} \alpha_{B,\sigma}\,\nu_{B,\sigma}(x, y) = \Omega(\rho(x, y)) \]
whenever $C\Delta < \rho(x, y) \le 2C\Delta$. As in the proof of Lemma 15.7.3,
this yields a 1-Lipschitz embedding $f_\Delta\colon V(G) \to \ell_2^N$ (for
some N) that shortens distances for pairs x, y as above by at most a constant
factor. (It is not really necessary to use all the possible pairs
$(B, \sigma)$ in the embedding; it is easy to show that
const $\cdot \log n$ independent random B and $\sigma$ will do.)
To construct the final embedding $f\colon V(G) \to \ell_2$, we let f(v) be the
concatenation of the vectors $f_\Delta(v)$ for
$\Delta \in \{2^j : 1 \le 2^j \le \mathrm{diam}(G)\}$. No distance is expanded
by more than $O(\sqrt{\log \mathrm{diam}(G)}) = O(\sqrt{\log n})$, and the
contraction is at most by a constant factor, and so we have an embedding into
$\ell_2$ with distortion $O(\sqrt{\log n})$.
Why do we get a better bound than for Bourgain's embedding?
In both cases we have about $\log n$ groups of coordinates in the embedding.
In Rao's embedding we know that for every pair (x, y), one
of the groups contributes at least a fixed fraction of $\rho(x, y)$ (and no
group contributes more than $\rho(x, y)$). Thus, the sum of squares of the
contributions is between $\rho(x, y)^2$ and $\rho(x, y)^2 \log n$, up to
constant factors. In Bourgain's embedding (with a comparable scaling) no group
contributes more than $\rho(x, y)$, and the sum of the contributions of all
groups is at least a fixed fraction of $\rho(x, y)$. But since we do not know
how the contributions are distributed among the groups, we can conclude only
that the sum of squares of the contributions is between
$\rho(x, y)^2 / \log n$ and $\rho(x, y)^2 \log n$.
It remains to sketch the modifications of Rao's embedding for a
graph G with arbitrary nonnegative weights on edges. For the unweighted
case, we defined $B_1$ as the vertices lying exactly at the given
distances from $v_0$. In the weighted case, there need not be vertices
exactly at these distances, but we can add artificial vertices by
subdividing the appropriate edges; this is a minor technical issue. A more
serious problem is that the distances $\rho(x, y)$ can be in a very wide
range, not just from 1 to n. We let $\Delta$ run through all the relevant
powers of 2 (that is, such that $C\Delta < \rho(x, y) \le 2C\Delta$ for some
$x \ne y$), but for producing the decomposition for a particular $\Delta$, we
use a modified graph $G_\Delta$ obtained from G by contracting all edges
shorter than $\frac{\Delta}{2n}$. In this way, we can have many more than
$\log n$ values of $\Delta$, but only $O(\log n)$ of them are relevant for each
pair (x, y), and the analysis works as before.
Gupta, Newman, Rabinovich, and Sinclair [GNRS99] proved that
any Excl($K_4$)-metric, as well as any Excl($K_{2,3}$)-metric, can be
O(1)-embedded into $\ell_1$, and they conjectured that for any H,
Excl(H)-metrics might be O(1)-embeddable into $\ell_1$ (the constant depending
on H).
Volume-respecting embeddings. Feige [Fei00] introduced an interesting
strengthening of the notion of the distortion of an embedding,
concerning embeddings into Euclidean spaces. Let
$f\colon (V, \rho) \to \ell_2$ be an embedding that for simplicity we require
to be 1-Lipschitz (nonexpanding). The usual distortion of f is determined by
looking at pairs of points, while Feige's notion takes into account all
k-tuples for some $k \ge 2$. For example, if V has 3 points, every two with
distance 1, then the following two embeddings into $\ell_2^2$ have about the
same distortion:


[Figure: the two embeddings of the three points into the plane, shown side by
side.]
But while the left embedding is good in Feige's sense for k = 3, the
right one is completely unsatisfactory. For a k-point set
$P \subset \ell_2$, define Evol(P) as the (k-1)-dimensional volume of the
simplex spanned by P (so Evol(P) = 0 if P is affinely dependent). For a k-point
metric space $(S, \rho)$, the volume Vol(S) is defined as
$\sup_f \mathrm{Evol}(f(S))$, where the supremum is over all 1-Lipschitz
$f\colon S \to \ell_2$. An embedding $f\colon (V, \rho) \to \ell_2$ is (k, D)
volume-respecting if for every k-point subset $S \subseteq V$, we have
$D \cdot \mathrm{Evol}(f(S))^{1/(k-1)} \ge \mathrm{Vol}(S)^{1/(k-1)}$. For D
small, this means that the image of any k-tuple spans nearly as large a volume
as it possibly can for a 1-Lipschitz map. (Note, for example, that
an isometric embedding of a path into $\ell_2$ is not volume-respecting.)
Feige showed that Vol(S) can be approximated quite well by an
intrinsic parameter of the metric space (not referring to embeddings),
namely, by the tree volume Tvol(S), which equals the product of the
edge lengths in a minimum spanning tree on S (with respect to the
metric on S). Namely,
$\mathrm{Vol}(S) \le \frac{1}{(k-1)!} \mathrm{Tvol}(S) \le 2^{(k-2)/2}\,\mathrm{Vol}(S)$.
He proved that for any n-point metric space and all $k \ge 2$, the embedding
as in the proof of Theorem 15.7.1 is
$(k, O(\log n + \sqrt{k \log n \log k}))$ volume-respecting (the result in the
conference version of his paper is slightly weaker).
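
The two volume notions are easy to compute for explicit point sets. The
following Python sketch is an illustration only: it evaluates Evol(P) through
the standard Gram-determinant formula for the volume of a simplex and Tvol
through Prim's algorithm for a minimum spanning tree. The function names are
hypothetical, and the small determinant routine is included only to keep the
example self-contained.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    n = len(m)
    result = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            return 0.0
        if p != c:
            m[c], m[p] = m[p], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return result


def evol(points):
    """(k-1)-dimensional volume of the simplex spanned by k points."""
    k = len(points)
    vecs = [[a - b for a, b in zip(p, points[0])] for p in points[1:]]
    gram = [[sum(x * y for x, y in zip(u, v)) for v in vecs] for u in vecs]
    return math.sqrt(max(det(gram), 0.0)) / math.factorial(k - 1)


def tvol(dist):
    """Product of the edge lengths of a minimum spanning tree (Prim)."""
    n = len(dist)
    in_tree = {0}
    product = 1.0
    while len(in_tree) < n:
        w, v = min((dist[a][b], b) for a in in_tree
                   for b in range(n) if b not in in_tree)
        product *= w
        in_tree.add(v)
    return product
```

For a point set P in Euclidean space the identity map is 1-Lipschitz, so
`evol(P)` is at most Vol(P), and hence at most `tvol` of its distance matrix
divided by (k-1)!. Three collinear points give `evol` equal to 0, in line with
the remark that an isometric embedding of a path is not volume-respecting.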
The notion of volume-respecting embeddings currently still looks
somewhat mysterious. In an attempt to convey some feeling about
it, we outline Feige's application and indicate the use of the
volume-respecting condition in it. He considered the problem of approximating
the bandwidth of a given n-vertex graph G. The bandwidth is
the minimum, over all bijective maps
$\varphi\colon V(G) \to \{1, 2, \ldots, n\}$, of
$\max\{|\varphi(u) - \varphi(v)| : \{u, v\} \in E(G)\}$ (so it has the flavor
of an approximate embedding problem). Computing the bandwidth is NP-hard,
but Feige's ingenious algorithm approximates it within a factor of
$O((\log n)^{\mathrm{const}})$. The algorithm has two main steps: First, embed
the graph (as a metric space) into $\ell_2^m$, with m being some suitable power
of $\log n$, by a (k, D) volume-respecting embedding f, where $k = \log n$
and D is as small as one can get. Second, let A be a random line in
$\ell_2^m$ and let $\psi(v)$ denote the orthogonal projection of f(v) on A.
This $\psi\colon V(G) \to A$ is almost surely injective, and so it provides a
linear ordering of the vertices, that is, a bijective map
$\varphi\colon V(G) \to \{1, 2, \ldots, n\}$, and this is used for estimating
the bandwidth.
To indicate the analysis, we need the notion of local density of the
graph G: $\mathrm{ld}(G) = \max\{|B(v, r)|/r : v \in V(G),\ r = 1, 2, \ldots, n\}$,
where B(v, r) are all vertices at distance at most r from v. It is not hard to
see that ld(G) is, up to a constant factor, a lower bound for the bandwidth,
and Feige's analysis shows that $O(\mathrm{ld}(G)(\log n)^{\mathrm{const}})$ is
an upper bound.
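
For very small graphs, both quantities can be computed directly. The following
Python sketch is an illustration only: it evaluates the local density from
breadth-first-search distances and the bandwidth by trying all orderings, which
takes exponential time. The graph is assumed to be connected, to have at least
one edge, and to be given as an adjacency dictionary; the function names are
hypothetical.

```python
from collections import deque
from itertools import permutations


def hop_distances(adj, src):
    """Graph distances from src by breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


def local_density(adj):
    """max |B(v, r)| / r over all vertices v and radii r = 1..n."""
    n = len(adj)
    best = 0.0
    for v in adj:
        dist = hop_distances(adj, v)
        for r in range(1, n + 1):
            ball = sum(1 for d in dist.values() if d <= r)
            best = max(best, ball / r)
    return best


def bandwidth_bruteforce(adj):
    """Minimum over all orderings of the longest edge; tiny graphs only."""
    verts = list(adj)
    best = len(verts)
    for perm in permutations(range(len(verts))):
        pos = dict(zip(verts, perm))
        width = max(abs(pos[u] - pos[v]) for u in adj for v in adj[u])
        best = min(best, width)
    return best
```

Since a ball of radius r must fit into an interval of length 2rb in any
ordering with bandwidth b, one gets $|B(v, r)| \le 2rb + 1$, so the local
density is at most $2b + 1$. A path on 5 vertices, with local density 3 and
bandwidth 1, shows that a constant factor is indeed needed in the comparison.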
One first verifies that with high probability, if $\{u, v\} \in E(G)$, then
the images $\psi(u)$ and $\psi(v)$ on A are close; concretely,
$|\psi(u) - \psi(v)| \le \Delta = O(\sqrt{(\log n)/m})$. For proving this, it
suffices to know that f is 1-Lipschitz, and it is an immediate consequence of
measure concentration on the sphere. If b is the bandwidth obtained from the
ordering given by $\psi$, then some interval of length $\Delta$ on A contains
the images of b vertices. Call a k-tuple $S \subset V(G)$ squeezed if
$\psi(S)$ lies in an interval of length $\Delta$. If b is large, then there are
many squeezed S. On the other hand, one proves that, not surprisingly, if
ld(G) is small, then Vol(S) is large for all but a few k-tuples
$S \subset V(G)$. Now, the volume-respecting condition enters: If Vol(S) is
large, then conv(f(S)) has large (k-1)-dimensional volume. It turns out that
the projection of a convex set in $\ell_2^m$ with large (k-1)-dimensional
volume on a random line is unlikely to be short, and so S with large Vol(S) is
unlikely to be squeezed. Thus, by estimating the number of squeezed k-tuples in
two ways, one gets an inequality bounding b from above in terms of ld(G).
Vempala [Vem98] applied volume-respecting embeddings in another
algorithmic problem, this time concerning arrangement of graph
vertices in the plane. Moreover, he also gave alternative proofs of some
of Feige's lemmas. Rao in the already mentioned paper [Rao99] also
obtained improved volume-respecting embeddings for planar metrics.
Bartal's trees. As we have seen, in Bourgain's method, for a given
metric $\rho$ one constructs a convex combination
$\sum \alpha_i \nu_i \ge \frac{1}{D}\rho$, where $\nu_i$ are line
pseudometrics dominated by $\rho$. An interesting "dual" result
was found by Bartal [Bar96], following earlier work in this direction by
Alon, Karp, Peleg, and West [AKPW95]. He approximated a given $\rho$
by a convex combination $\sum_{i=1}^{N} \alpha_i \tau_i$, where this time the
inequalities go in the opposite direction: $\tau_i \ge \rho$ and
$\sum \alpha_i \tau_i \le D\rho$, with $D = O(\log^2 n)$ (later he improved
this to $O(\log n \log\log n)$ in [Bar98]). The $\tau_i$ are not
line metrics (and in general they cannot be), but they are tree metrics,
and even of a special form, the so-called hierarchically well-separated
trees. This means that $\tau_i$ is given as the shortest-path metric of a
rooted tree with weighted edges such that the distances from each
vertex to all of its sons are the same, and if v is a son of u, and w a
son of v, then $\tau_i(u, v) \ge K \cdot \tau_i(v, w)$, where $K \ge 1$ is a
parameter that can be set at will (and the constant in the bound on D depends
on it).
This result has been used in approximation algorithms for problems
involving metric spaces, according to the following scheme: Choose
$i \in \{1, 2, \ldots, N\}$ at random, with each i having probability
$\alpha_i$, solve the problem in question for the tree metric $\tau_i$, and
show that the expected value of the solution is not very far from the optimal
solution for the original metric $\rho$.
Since tree metrics embed isometrically into $\ell_1$, Bartal's result also
implies $O(\log n \log\log n)$-embeddability of all n-point metric spaces
into $\ell_1$, which is just a little weaker than Bourgain's approach (and it
also implies that $\Omega(\log n)$ is a lower bound in Bartal's setting). For a
simpler proof of a weaker version of Bartal's result see Indyk [Ind01].

Exercises
1. (Embedding into $\ell_p$) Prove that under the assumptions of Lemma 15.7.3,
the metric space $(V, \rho)$ can be embedded into $\ell_p^N$,
$1 \le p \le \infty$, with distortion at most D. (You may want to start with
the rather easy cases p = 1 and $p = \infty$, and use Hölder's inequality for
an arbitrary p.)
2. (Dimension reduction for the embedding)
(a) Let $E_1, \ldots, E_m$ be independent events, each of them having
probability at least $\frac{1}{12}$. Prove that the probability of no more
than $\frac{m}{24}$ of the $E_i$ occurring is at most $e^{-cm}$, for a
sufficiently small positive constant c. Use suitable Chernoff-type estimates or
direct estimates of binomial coefficients.
(b) Modify the proof of Theorem 15.7.1 as follows: For each
$j = 1, 2, \ldots, q$, pick sets $A_{ij}$ independently at random,
$i = 1, 2, \ldots, m$, where the points are included in $A_{ij}$ with
probability $2^{-j}$ and where $m = C \log n$ for a sufficiently large constant
C. Using (a) and Lemmas 15.7.2 and 15.7.3, prove that with a positive
probability, the embedding $f\colon V \to \ell_2^{qm}$ given by
$f(v)_{ij} = \rho(v, A_{ij})$ has distortion $O(\log n)$.
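3. Let $a_1, a_2, \ldots, a_n$, $b_1, b_2, \ldots, b_n$,
$\alpha_1, \alpha_2, \ldots, \alpha_n$ be positive real numbers.
Show that
\[ \frac{\alpha_1 a_1 + \alpha_2 a_2 + \cdots + \alpha_n a_n}
        {\alpha_1 b_1 + \alpha_2 b_2 + \cdots + \alpha_n b_n}
   \ge \min\Bigl\{\frac{a_1}{b_1}, \frac{a_2}{b_2}, \ldots, \frac{a_n}{b_n}\Bigr\}. \]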
4. Let $P_n$ be the metric space $\{0, 1, \ldots, n\}$ with the metric
inherited from $\mathbf{R}$ (or a path of length n with the graph metric).
Prove the following Ramsey-type result: For every D > 1 and every
$\varepsilon > 0$ there exists an $n = n(D, \varepsilon)$ such that whenever
$f\colon P_n \to (Z, \sigma)$ is a D-embedding of $P_n$ into some metric space,
then there are a < b < c, $b = \frac{a+c}{2}$, such that f restricted to the
subspace $\{a, b, c\}$ of $P_n$ is a $(1+\varepsilon)$-embedding. That is,
if a sufficiently long path is D-embedded, then it contains a scaled copy
of a path of length 2 embedded with distortion close to 1.
Can you extend the proof so that it provides a scaled copy of a path of
length k?
5. (Lower bound for embedding trees into $\ell_2$)
(a) Show that for every $\varepsilon > 0$ there exists $\delta > 0$ with the
following property. Let $x_0, x_1, x_2, x_2' \in \ell_2$ be points such that
$\|x_0 - x_1\|, \|x_1 - x_2\|, \|x_1 - x_2'\| \in [1, 1+\delta]$ and
$\|x_0 - x_2\|, \|x_0 - x_2'\| \in [2, 2+\delta]$ (so all the
distances are almost like the graph distances in the following tree, except
possibly for the one marked by a dotted line).

Then $\|x_2 - x_2'\| \le \varepsilon$; that is, the remaining distance must be
very short.
(b) Let $T_{k,m}$ denote the complete k-ary tree of height m; the following
picture shows $T_{3,2}$:

Show that for every r and m there exists k such that whenever the leaves
of $T_{k,m}$ are colored by r colors, there is a subtree of $T_{k,m}$
isomorphic to $T_{2,m}$ with all leaves having the same color.

(c) Use (a), (b), and Exercise 4 to prove that for any D > 1 there exist
m and k such that the tree $T_{k,m}$ considered as a metric space with the
shortest-path metric cannot be D-embedded into $\ell_2$.
6. (Another lower bound for embedding trees into $\ell_2$)
(a) Let $x_0, x_1, \ldots, x_n$ be arbitrary points in a Euclidean space (we
think of them as images of the vertices of a path of length n under some
embedding). Let
$\Gamma = \{(a, a+2^k, a+2^{k+1}) : a = 0, 1, 2, \ldots,\ a+2^{k+1} \le n,\ k = 0, 1, 2, \ldots\}$.
Prove that
\[ \sum_{(a,b,c) \in \Gamma} \frac{\|x_a - 2x_b + x_c\|^2}{(c-a)^2}
   \le \sum_{a=0}^{n-1} \|x_a - x_{a+1}\|^2; \]
this shows that an average triple $(x_a, x_b, x_c)$ is "straight" (and provides
an alternative solution to Exercise 4 for $Z = \ell_2$).
(b) Prove that the complete binary tree $T_{2,m}$ requires
$\Omega(\sqrt{\log m})$ distortion for embedding into $\ell_2$. Consider a
nonexpanding embedding $f\colon V(T_{2,m}) \to \ell_2$ and sum the inequalities
as in (a) over all images of the root-to-leaf paths.
7. (Bourgain's embedding of complete binary trees into $\ell_2$) Let
$B_m = T_{2,m}$ be the complete binary tree of height m (notation as in
Exercise 5). We identify the vertices of $B_m$ with words of length at most m
over the alphabet $\{0, 1\}$: The root of $B_m$ is the empty word, and the sons
of a vertex w are the vertices w0 and w1. We define the embedding
$f\colon V(B_m) \to \ell_2^{|V(B_m)|-1}$, where the coordinates in the range of
f are indexed by the vertices of $B_m$ distinct from the root, i.e., by
nonempty words. For a word $w \in V(B_m)$ of length a, let
$f(w)_u = \sqrt{a-b+1}$ if u is a nonempty initial segment of w of length b,
and $f(w)_u = 0$ otherwise. Prove that this embedding has distortion
$O(\sqrt{\log m})$.
8. Prove that any finite tree metric can be isometrically embedded into
$\ell_1$.
9. (Low-dimensional embedding of trees)
(a) Let T be a tree (in the graph-theoretic sense) on $n \ge 3$ vertices. Prove
that there exist subtrees $T_1$ and $T_2$ of T that share a single vertex and
no edge and together cover T, such that
$\max(|V(T_1)|, |V(T_2)|) \le 1 + \frac{2}{3}n$.
(b) Using (a), prove that every tree metric space with n points can be
isometrically embedded into $\ell_\infty^d$ with $d = O(\log n)$.
This result is from [LLR95].
What Was It About? An
Informal Summary

Chapter 1
• Linear and affine notions (dependence, hull, subspace, mapping); hyper-
plane, k-flat.
• General position: Degenerate configurations have measure zero in the
space of all configurations, provided that degeneracy can be described by
countably many polynomial equations.
• Convex set, hull, combination.
• Separation theorem: Disjoint convex sets can be separated by a hyper-
plane; strictly so if one of them is compact and the other closed.
• Theorems involving the dimension: Helly (if $\mathcal{F}$ is a finite family
of convex sets with empty intersection, then there is a subfamily of at most
d+1 sets with empty intersection), Radon (d+2 points can be partitioned into
two subsets with intersecting convex hulls), Carathéodory (if
$x \in \mathrm{conv}(X)$, then $x \in \mathrm{conv}(Y)$ for some at most
(d+1)-point $Y \subseteq X$).
• Centerpoint of X: Every half-space containing it contains at least
$\frac{1}{d+1}$ of X. It always exists by Helly. Ham-sandwich: Any d mass
distributions in $\mathbf{R}^d$ can be simultaneously bisected by a hyperplane.

Chapter 2
• Minkowski's theorem: A 0-symmetric convex body of volume larger than
$2^d$ contains a nonzero integer point.
• General lattice: a discrete subgroup of $(\mathbf{R}^d, +)$. It can be
written as the set of all integer linear combinations of at most d linearly
independent vectors (basis). Determinant = volume of the parallelotope spanned
by a basis.
• Minkowski for general lattices: Map the lattice onto $\mathbf{Z}^d$ by a
linear mapping.

Chapter 3
• Erdős-Szekeres theorem: Every sufficiently large set in the plane in
general position contains k points in convex position. How large? Exponential
in k.
• What about k-holes (vertex sets of empty convex k-gons)? For k = 5
yes (in sufficiently large sets), for $k \ge 7$ no (Horton sets), k = 6 is a
challenging open problem.

Chapter 4
• Szemerédi-Trotter theorem: m distinct points and n distinct lines in the
plane have at most $O(m^{2/3} n^{2/3} + m + n)$ incidences.
• This is tight in the worst case. Example for m = n: Use the
$k \times 4k^2$ grid and lines y = ax + b with $a = 0, 1, \ldots, 2k-1$ and
$b = 0, 1, \ldots, 2k^2-1$.
• Crossing number theorem: A simple graph with n vertices and $m \ge 4n$
edges needs $\Omega(m^3/n^2)$ crossings. Proof: At least m-3n crossings, since
planar graphs have fewer than 3n edges, then random sampling.
• Forbidden bipartite subgraphs: A graph on n vertices without $K_{r,s}$ has
$O(n^{2-1/r})$ edges.
• Cutting lemma: Given n lines and r, the plane can be subdivided into
$O(r^2)$ generalized triangles such that the interior of each triangle is
intersected by at most $\frac{n}{r}$ lines. Proof of a weaker version:
Triangulate the arrangement of a random sample and show that triangles
intersected by many lines won't survive. Application: geometric
divide-and-conquer.
• For unit distances and distinct distances in the plane, bounds can be
proved, but a final answer seems to be far away.
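
The grid example for the tightness of the Szemerédi-Trotter bound can be
checked by direct counting. The following Python sketch is an illustration
only (the function name is hypothetical): it builds the $k \times 4k^2$ grid
and the lines y = ax + b from the summary above and counts incidences by brute
force.

```python
def grid_incidences(k):
    """Return (number of points, number of lines, number of incidences)
    for the k x 4k^2 grid and the lines y = a*x + b with
    a in {0, ..., 2k-1} and b in {0, ..., 2k^2 - 1}."""
    points = {(x, y) for x in range(k) for y in range(4 * k * k)}
    lines = [(a, b) for a in range(2 * k) for b in range(2 * k * k)]
    incidences = sum(1 for a, b in lines for x in range(k)
                     if (x, a * x + b) in points)
    return len(points), len(lines), incidences
```

Every line meets the grid in exactly k points, because
$ax + b < 2k \cdot k + 2k^2 = 4k^2$ for all x in the grid. So there are
$n = 4k^3$ points, $4k^3$ lines, and $4k^4$ incidences, which is of order
$n^{4/3}$.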

Chapter 5
• Geometric duality: Sends a point a to the hyperplane
$\langle a, x \rangle = 1$ and vice versa; preserves incidences and sidedness.
• Convex polytope: the convex hull of a finite set and also the intersection
of finitely many half-spaces.
• Face, vertex, edge, facet, ridge. A polytope is the convex hull of its ver-
tices. A face of a face is a face. Face lattice. Duality turns it upside down.
Simplex. Simple and simplicial polytopes.
• The convex hull of $n$ points in $\mathbf{R}^d$ can have as many as $\Omega(n^{\lfloor d/2 \rfloor})$ facets;
cyclic polytopes.
• This is as bad as it can get: Given the number of vertices, cyclic polytopes
maximize the number of faces in each dimension (upper bound theorem).
• Gale transform: An $n$-point sequence in $\mathbf{R}^d$ (affinely spanning $\mathbf{R}^d$) is
mapped to a sequence of $n$ vectors in $\mathbf{R}^{n-d-1}$. Properties: a simple linear
algebra. Faces of the convex hull go to subsets whose complement contains
0 in the convex hull.
• 3-dimensional polytopes are nice: Their graphs correspond to vertex
3-connected planar graphs (Steinitz theorem), and they can be realized
with rational coordinates. From dimension 4 on, bad things can happen
(irrational or doubly exponential coordinates may be required, recognition
is difficult).
• Voronoi diagram. It is the projection of a convex polyhedron in dimension
one higher (lifting using the paraboloid). Delaunay triangulation (defined
using empty balls; dual to the Voronoi diagram).
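The facet count of cyclic polytopes can be explored on small cases. The sketch below relies on Gale's evenness criterion, a standard fact not spelled out in this summary: a $d$-subset $F$ of the vertices (ordered along the moment curve) forms a facet exactly when any two vertices outside $F$ have an even number of vertices of $F$ between them. The function name is hypothetical.

```python
from itertools import combinations

def cyclic_polytope_facets(n, d):
    """Facets of the cyclic polytope with n vertices in R^d (Gale's evenness)."""
    facets = []
    for subset in combinations(range(n), d):
        members = set(subset)
        outside = [i for i in range(n) if i not in members]
        # Between any two non-members there must be an even number of members.
        if all(
            sum(1 for v in subset if i < v < j) % 2 == 0
            for i, j in combinations(outside, 2)
        ):
            facets.append(subset)
    return facets
```

For $d = 4$ this gives $n(n-3)/2$ facets, quadratic in $n$, in line with the $\Omega(n^{\lfloor d/2 \rfloor})$ bound.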

Chapter 6
• Arrangement of hyperplanes (faces, vertices, edges, facets, cells). For $d$
fixed, there are $O(n^d)$ faces.
• Clarkson's theorem on levels: At most $O(n^{\lfloor d/2 \rfloor} k^{\lceil d/2 \rceil})$ vertices are at
level at most $k$. Proof: Express the expected number of level-0 vertices
of a random sample in two ways!
• Zone theorem: The zone of a hyperplane has $O(n^{d-1})$ vertices. Proof:
Delete a random hyperplane, and look at how many zone faces are sliced
into two by adding it back.
• Proof of the cutting lemma by a finer sampling argument: Vertically
decompose the arrangement of a sample taken with probability p, show
that the number of trapezoids intersected by at least tnp lines decreases
exponentially with t, take i-cuttings within the trapezoids.
• Canonical triangulation, cutting lemma in $\mathbf{R}^d$ ($O(r^d)$ simplices).
• Milnor-Thom theorem: The arrangement of the zero sets of $n$ polynomials
of degree at most $D$ in $d$ real variables has at most $O(Dn/d)^d$ faces.
• Most arrangements of pseudolines are nonstretchable (by Milnor-Thom).
Similarly for many other combinatorial descriptions of geometric config-
urations; usually most of them cannot be realized.

Chapter 7
• Davenport-Schinzel sequences of order $s$ (no $abab\ldots$ with $s+2$ letters);
maximum length $\lambda_s(n)$. Correspond to lower envelopes of curves: The
curves are graphs of functions defined everywhere, every two intersecting
at most $s$ times. Lower envelopes of segments yield DS sequences of
order 3.
• $\lambda_3(n) = \Theta(n\alpha(n))$; $\lambda_s(n)$ is almost linear for every fixed $s$.
• The lower envelope of $n$ algebraic surface patches in $\mathbf{R}^d$, as well as a
single cell in their arrangement, have complexity $O(n^{d-1+\varepsilon})$. Charging
schemes and more random sampling.
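The defining condition of a Davenport-Schinzel sequence is easy to test directly. The following sketch assumes the usual convention that equal adjacent symbols are also forbidden (not stated in this summary); the function names are hypothetical.

```python
from itertools import permutations

def longest_alternation(seq, a, b):
    """Length of the longest subsequence a b a b ... (starting with a) in seq."""
    length, want = 0, a
    for x in seq:
        if x == want:
            length += 1
            want = b if want == a else a
    return length

def is_davenport_schinzel(seq, s):
    """True if seq has no equal adjacent symbols and no alternation of length s + 2."""
    if any(x == y for x, y in zip(seq, seq[1:])):
        return False
    symbols = set(seq)
    return all(
        longest_alternation(seq, a, b) < s + 2
        for a, b in permutations(symbols, 2)
    )
```

The greedy scan in `longest_alternation` suffices because taking the earliest available occurrence of the wanted symbol never shortens an alternation.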

Chapter 8
• Fractional Helly theorem: If a family of $n$ convex sets has $\alpha\binom{n}{d+1}$
intersecting $(d+1)$-tuples, then there is a point common to at least $\frac{\alpha}{d+1} n$ of
the sets.
• Colored Carathéodory theorem: If each of $d+1$ sets contains 0 in the
convex hull, then we can pick one point from each set so that the convex
hull of the picked points contains 0.
• Tverberg's theorem: $(d+1)(r-1)+1$ points can be partitioned into $r$
subsets with intersecting convex hulls (the number is the smallest conceivable
one: $r-1$ simplices plus one extra point).
• Colored Tverberg theorem: Given points partitioned into $d+1$ color
classes by $t$ points each, we can choose $r$ disjoint rainbow subsets with
intersecting convex hulls, $t = t(d, r)$. Only topological proofs are known.

Chapter 9
• The dimension is considered fixed in this chapter. First selection lemma:
Given n points, there exists a point contained in a fixed fraction of all
simplices with vertices in the given points.
• Second selection lemma: If $\alpha\binom{n}{d+1}$ of the simplices are marked, we can
find a point in many of the marked simplices (at least $\Omega(\alpha^{s_d}\binom{n}{d+1})$).
Needs colored Tverberg and Erdős-Simonovits.
• Order type. Same-type lemma: Given $n$ points in general position and $k$
fixed, one can find $k$ disjoint subsets of size $\Omega(n)$, all of whose transversals
have the same order type.
• A hypergraph regularity lemma: For an $\varepsilon > 0$ and a $k$-partite hypergraph
of density bounded below by a constant $\beta > 0$ and with color classes
$X_1, \ldots, X_k$ of size $n$, we can choose subsets $Y_1 \subseteq X_1, \ldots, Y_k \subseteq X_k$, $|Y_1| =
\cdots = |Y_k| \ge cn$, $c = c(k, \beta, \varepsilon) > 0$, such that any $Z_1 \subseteq Y_1, \ldots, Z_k \subseteq Y_k$
with $|Z_i| \ge \varepsilon|Y_i|$ induce some edge.
• Positive-fraction selection lemma: Given n red, n white, and n blue points
in the plane, we can choose {'2 points of each color so that all red-white-
blue triangles have a common point; similarly in Rd.

Chapter 10
• Set systems; transversal number $\tau$, packing number $\nu$. Fractional
transversal and fractional packing; $\nu^* = \tau^*$ by LP duality.
• Epsilon net, shattered set, VC-dimension. Shatter function lemma: A set
system on $n$ points with VC-dimension $d$ has at most $\sum_{k=0}^{d} \binom{n}{k}$ sets.
• Epsilon net theorem: A random sample of $C \frac{d}{\varepsilon} \log \frac{1}{\varepsilon}$ points in a set system
of VC-dimension $d$ is an $\varepsilon$-net with high probability. In particular, $\varepsilon$-nets
exist of size depending only on $d$ and $\varepsilon$.
• Corollary: $\tau = O(\tau^* \log \tau^*)$ for bounded VC-dimension.
• Half-spaces in $\mathbf{R}^d$ have VC-dimension $d+1$. Lifting (Veronese map) and
the shatter function lemma show that systems of sets in $\mathbf{R}^d$ definable by
Boolean combinations of a bounded number of bounded-degree polynomial
inequalities have bounded VC-dimension.
• Weak epsilon nets for convex sets: Convex sets have infinite VC-dimension,
but given a finite set $X$ and $\varepsilon > 0$, we can choose a weak $\varepsilon$-net
of size at most $f(d, \varepsilon)$, that is, a set (generally not a subset of $X$) that
intersects every convex $C$ with $|C \cap X| \ge \varepsilon|X|$.
• Consequently, $\tau$ is bounded by a function of $\tau^*$ for any finite system of
convex sets in $\mathbf{R}^d$.
• Alon-Kleitman $(p, q)$-theorem: Let $\mathcal{F}$ be a system of convex sets such
that among every $p$ sets, some $q$ intersect ($p \ge q \ge d+1$). Then $\tau(\mathcal{F})$ is
bounded by a function of $d, p, q$. Proof: First bound $\nu^*$ using fractional
Helly; then $\tau$ is bounded in terms of $\tau^* = \nu^*$ as above.
• A similar (p, q)-theorem for hyperplane transversals of convex sets (even
though no Helly theorem!).
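The shatter function lemma can be checked on a small set system. The sketch below computes the VC-dimension by brute force and uses the system of all intervals on $n$ points as a test case; both helper names are assumptions made for illustration.

```python
from itertools import combinations
from math import comb

def vc_dimension(ground, sets):
    """Largest size of a subset of `ground` shattered by `sets` (brute force)."""
    sets = [frozenset(s) for s in sets]
    best = 0
    for size in range(1, len(ground) + 1):
        shattered_some = False
        for subset in combinations(ground, size):
            # A subset is shattered if all 2^size traces occur.
            traces = {s & frozenset(subset) for s in sets}
            if len(traces) == 2 ** size:
                shattered_some = True
                break
        if not shattered_some:
            break  # subsets of shattered sets are shattered, so we can stop
        best = size
    return best

def intervals(n):
    """All sets {i, ..., j-1} of consecutive points of {0, ..., n-1}."""
    return {frozenset(range(i, j)) for i in range(n + 1) for j in range(i, n + 1)}
```

Intervals have VC-dimension 2, and there are exactly $\binom{n}{0} + \binom{n}{1} + \binom{n}{2}$ of them, so the bound of the lemma is attained in this case.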

Chapter 11
• k-sets, k-facets (only for sets in general position!), halving facets. Dual:
cells of level k, vertices of level k. The k-set problem is still unsolved.
Straightforward bounds from Clarkson's theorem on levels.
• Bounds for halving facets yield bounds for k-facets sensitive to k.
• A recursive planar construction with a superlinear number of halving
edges.
• Lovász lemma: No line intersects more than $O(n^{d-1})$ halving facets.
Proof: When a moving line crosses the convex hull of $d-1$ points of $X$,
the number of halving facets intersected changes by 1 (halving-facet
interleaving lemma).
• Implies an upper bound of $O(n^{d-\delta(d)})$ for halving facets by the second
selection lemma.
• In the plane a continuous motion argument proves that the crossing number
of the halving-edge graph is $O(n^2)$, and consequently, it has $O(n^{4/3})$
edges by the crossing number theorem. This is the best we can do in the
plane, although $O(n^{1+\varepsilon})$ for every fixed $\varepsilon > 0$ is suspected.
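Halving edges of a small planar point set can be listed by brute force. This is only a sketch under the assumptions of an even number of points in general position; `halving_edges` is a hypothetical helper name.

```python
from itertools import combinations

def halving_edges(points):
    """Pairs of points whose connecting line has equally many points on each side.

    Assumes an even number of points, no three of them collinear.
    """
    n = len(points)
    result = []
    for p, q in combinations(points, 2):
        # Count the points strictly to the left of the directed line from p to q.
        left = sum(
            1 for r in points
            if (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]) > 0
        )
        if left == (n - 2) // 2:
            result.append((p, q))
    return result
```

For $n$ points in convex position this returns the $n/2$ pairs of opposite vertices; the superlinear constructions mentioned above need more cleverly arranged point sets.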

Chapter 12
• Perfect graph ($\chi = \omega$ hereditarily). Weak perfect graph conjecture (now
theorem): A graph is perfect iff its complement is.
• Proof via the polytope $\{x \in \mathbf{R}^V : x \ge 0,\ x(K) \le 1 \text{ for every clique } K\}$.
• Brunn's slice volume inequality: For a compact convex $C \subset \mathbf{R}^{n+1}$,
$\mathrm{vol}_n(\{x \in C : x_1 = t\})^{1/n}$ is a concave function of $t$ (as long as the
slices do not miss the body).
• Brunn-Minkowski inequality: $\mathrm{vol}(A)^{1/n} + \mathrm{vol}(B)^{1/n} \le \mathrm{vol}(A + B)^{1/n}$ for
nonempty compact $A, B \subset \mathbf{R}^n$.
• A partially ordered set with $N$ linear extensions can be sorted by
$O(\log N)$ comparisons. There always exists a comparison that reduces
the number of linear extensions by a fixed fraction: Compare elements
whose average heights differ by less than 1.
• Order polytope: $0 \le x \le 1$, $x_a \le x_b$ whenever $a \preceq b$. Linear extensions
correspond to congruent simplices and good comparison to dividing the
volume evenly by a hyperplane $x_a = x_b$. The best ratio is not known
(conjectured to be $\frac{1}{3} : \frac{2}{3}$).
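For very small posets, the effect of a comparison on the number of linear extensions can be computed by enumeration. The sketch below is illustrative only (exponential time, hypothetical function names); relations are given as pairs `(a, b)` meaning that `a` precedes `b`.

```python
from fractions import Fraction
from itertools import combinations, permutations

def linear_extensions(elements, relations):
    """All orderings of `elements` consistent with the pairs (a, b), a before b."""
    result = []
    for perm in permutations(elements):
        pos = {x: i for i, x in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in relations):
            result.append(perm)
    return result

def best_comparison(elements, relations):
    """Pair (a, b) whose comparison splits the linear extensions most evenly."""
    exts = linear_extensions(elements, relations)
    best_pair, best_ratio = None, Fraction(0)
    for a, b in combinations(elements, 2):
        a_first = sum(1 for e in exts if e.index(a) < e.index(b))
        ratio = Fraction(min(a_first, len(exts) - a_first), len(exts))
        if ratio > best_ratio:
            best_pair, best_ratio = (a, b), ratio
    return best_pair, best_ratio
```

On three elements with the single relation that `a` precedes `b`, there are 3 linear extensions and the best comparison splits them $1 : 2$, which is the extremal case of the conjectured $\frac{1}{3} : \frac{2}{3}$ ratio.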

Chapter 13
• Volumes and other things in high dimensions behave differently from
what we know in $\mathbf{R}^2$ and $\mathbf{R}^3$. For example, the ball inscribed in the unit
cube has a tiny volume.
• An $\eta$-net is an inclusion-maximal $\eta$-separated set. It is mainly useful
because it is $\eta$-dense. In $S^{n-1}$, a simple volume argument yields $\eta$-nets
of size at most $(4/\eta)^n$.
• An $N$-vertex convex polytope inscribed in the unit ball $B^n$ occupies at
most $O(\ln(\frac{N}{n}+1)/n)^{n/2}$ of the volume of $B^n$. Thus, with polynomially
many vertices, the error of deterministic volume approximation is
exponential in the worst case.
• Polytopes with such volume can be constructed: For $N = 2n$ use the
crosspolytope, for $N = 4^n$ a 1-net in the dual $S^{n-1}$, and interpolate
using a product.
• Ellipsoid: an affine image of $B^n$. John's lemma: Every $n$-dimensional
convex body has inner and outer ellipsoids with ratio at most $n$, and
a symmetric convex body admits the better ratio $\sqrt{n}$. The maximum-volume
inscribed ellipsoid (which is unique) will do as the inner ellipsoid.
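The remark about the ball inscribed in the unit cube can be made concrete with the standard volume formula $\mathrm{vol}(B^n) = \pi^{n/2}/\Gamma(n/2+1)$. A minimal sketch (function names are assumptions):

```python
from math import gamma, pi

def ball_volume(n, r=1.0):
    """Volume of the n-dimensional Euclidean ball of radius r."""
    return pi ** (n / 2) / gamma(n / 2 + 1) * r ** n

def inscribed_ball_fraction(n):
    """Fraction of the unit cube [0, 1]^n occupied by the inscribed ball (radius 1/2)."""
    return ball_volume(n, 0.5)
```

In the plane the fraction is $\pi/4$, while in dimension 20 it is already below $10^{-7}$.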

Chapter 14
• Measure concentration on $S^{n-1}$: For any set $A$ occupying half of the
sphere, almost all of $S^{n-1}$ is at most $O(n^{-1/2})$ away from $A$. Quantitatively,
$1 - P[A_t] \le 2e^{-t^2 n/2}$.
• Similar concentration phenomena in many other high-dimensional spaces:
Gaussian measure on $\mathbf{R}^n$, cube $\{0, 1\}^n$, permutations, etc.
• Many concentration inequalities can be proved via isoperimetric inequalities.
Isoperimetric inequality: Among all sets of given volume, the ball
has the smallest volume of a $t$-neighborhood.
• Lévy's lemma: A 1-Lipschitz function $f$ on $S^{n-1}$ is within $O(n^{-1/2})$ of
its median on most of $S^{n-1}$.
• Consequently (using $\eta$-nets), there is a high-dimensional subspace on
which $f$ is almost constant (use a random subspace).
• Normed spaces, norm induced by a symmetric convex body.
• For any $n$-dimensional symmetric convex polytope, $\log(f_0) \log(f_{n-1}) =
\Omega(n)$ (many vertices or many facets).
• Dvoretzky's theorem: For every $k$ and $\varepsilon > 0$ there exists $n$ ($n = e^{O(k/\varepsilon^2)}$
suffices) such that any $n$-dimensional convex body has a $k$-dimensional
$(1+\varepsilon)$-spherical section. In other words, any high-dimensional normed
space has an almost Euclidean subspace.

Chapter 15
• Metric space; the distortion of a mapping between two metric spaces,
$D$-embedding. Spaces $\ell_p^d$ and $\ell_p$.
• Flattening lemma: Any $n$-point Euclidean metric space can be $(1+\varepsilon)$-embedded
into $\ell_2^k$, $k = O(\varepsilon^{-2} \log n)$ (project on a random $k$-dimensional
subspace).
• Lower bound for $D$-embedding into a $d$-dimensional normed space: counting;
take all subgraphs of a graph without short cycles and with many
edges.
• The $m$-dimensional Hamming cube needs $\sqrt{m}$ distortion for embedding
into $\ell_2$ (short diagonals and induction).
• Edge expansion (conductance), second eigenvalue of the Laplacian matrix.
Constant-degree expanders need $\Omega(\log n)$ distortion for embedding
into $\ell_2$ (tight). Method: Compare sums of squared distances over the
edges and over all pairs, in the graph and in the target space.
• $D$-embeddability into $\ell_2$ is polynomial-time decidable by semidefinite
programming.
• All $n$-point spaces embed isometrically into $\ell_\infty^n$. For embeddings with
smaller dimension, use distances to random subsets of suitable density
as coordinates. A similar method yields $O(\log n)$-embedding into $\ell_2$ (or
any other $\ell_p$).
• Example of algorithmic application: approximating the sparsest cut. Embed
the graph metric into $\ell_1$ with low distortion; this yields a cut
pseudometric defining a sparse cut.
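The isometric embedding of an $n$-point metric space into $\ell_\infty^n$ can be written down explicitly by using distances to all points as coordinates (the Fréchet embedding). A minimal sketch, with hypothetical helper names:

```python
def frechet_embedding(points, dist):
    """Map each point x to the vector (dist(x, p_1), ..., dist(x, p_n))."""
    return {x: [dist(x, p) for p in points] for x in points}

def linf(u, v):
    """The l_infinity distance between two coordinate vectors."""
    return max(abs(a - b) for a, b in zip(u, v))
```

By the triangle inequality each coordinate changes by at most `dist(x, y)` between `x` and `y`, and the coordinate belonging to `y` attains this value, so all distances are preserved exactly.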
Hints to Selected Exercises

1.2.7(a). The existence of an $x \ge 0$ with $Ax = b$ means that $b$ lies in the
convex cone generated by the columns of $A$. If $b$ is not in the cone, then it
can be separated from it as in Exercise 6(b).
1.2.7(b). Apply (a) with the $d \times (n+d)$ matrix $(A \mid I_d)$, where $I_d$ is the
identity matrix.
1-;-d-+
1.3.5 ( c ). Jr=-2d77 :;( 1:-7)•
1.3.8(b). By Helly's theorem, $K = \bigcap_{x \in X} \mathrm{conv}(V(x)) \ne \emptyset$. Prove that $K$ is
the kernel.
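1.3.10(b). Assign the set $H_x = \{(a, b) \in \mathbf{R}^d \times \mathbf{R} : \langle a, x \rangle < b\}$ to each $x \in X$
and the set $G_y = \{(a, b) \in \mathbf{R}^d \times \mathbf{R} : \langle a, y \rangle \ge b\}$ to each $y \in Y$. Use Helly's
theorem.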
1.4.1(a). Express 'Y as U:l Gi , where G1 C;;; G2 C;;; ••• are compact. Then
P,C'Y) = 2:::1 p,(Gi+l \ Gi ) by the a-additivity of p,.
(More generally, every
Borel probability measure on a separable metric space is regular: The measure
of any set can be approximated with arbitrary precision by the measure of a
compact set contained in it.)
2.1.4(c). Let $p(x)$ be a polynomial with integer coefficients having $\alpha$ as a
root. If $\deg(p) = d$ and $|\alpha - m/n| < 1/n^{d+1}$, say, then $n^d p(m/n)$ is integral,
but $|n^d p(m/n)| < 1$ for large $n$.
2.1.5(a). Seek a nonzero vector in $\mathbf{Z}^3$ close to the line $y = \alpha_1 x$, $z = \alpha_2 x$.
2.2.1. Show that elementary row operations on the matrix, which do not
change the determinant, also preserve the volume. Diagonalize the matrix.
3.1.4. Project orthogonally on a suitable plane and apply Erdős-Szekeres.
3.2.4. It suffices to deal with the case k = 4m. First prove by induction that
a 2m-point cup contained in a Horton set has at least 2m - 2m points of the
set above it.
4.1.2. Place points on two circles lying in orthogonal planes in R4.
4.3.3. Choose a point set $P$, one point in each of the $m$ cells. From each top
edge, cut off a little segment $ab$ and replace it by the segments $ap$ and $pb$,
where $p \in P$ lies below the edge. Each line is replaced by a polygonal curve.
Consider a graph drawing with P as the vertices and the polygonal curves
defining edges.
4.3.4(c). Consider a drawing of G witnessing pair-cr(G) = k. At most 2k
edges are involved in any crossings, and the remaining ones (the good edges)
form a planar graph. Redraw the edges with crossings so that they do not
intersect any of the good edges and, subject to this, have the minimum pos-
sible number of crossings.
4.4.1(a). $O(n^{10/7}) = O(n^{1.43})$.
4.4.1 (b). Let C i be the points of C that are the centers of at least 2i and
at most 2i+l circles. We have ICil = qi :::; n/2i. One incidence of a line of the
form Cuv with acE C i contributes at most 2i+2 edges.
4.4.2(b). Look at u,v with /1(U, v) 2: 4y1(:4, and suppose that at least half
of the uv edges have their partner edges adjacent to u, say. These partner
edges connect u to at least 2y1(:4 distinct neighbor vertices. By (a), at most
yI(:4/2 of these partner edges may belong to E h .
4.4.2(c). We get lEI = O(IE \ Ehl) = O(n4/3d;/6); at the same time, lEI 2:
ndd2. This gives di = O(n 2 / 5 ) and Icirc(n, n) = O(n 7 / 5 ) = O(n1. 4).
4.7.1. Consider a trapezoid $ABB'A'$; $AB$ is the bottom side and $A'B'$ the
top side. Suppose $AB$ is contained in an edge $CD$ of $P_j$ and $A'B'$ is an edge
of $P_{j+1}$ (the few other possible cases are discussed similarly). Let $A_1$ be the
intersection of the level $qj + i$ with the vertical line $AA'$, and similarly for $B_1$.
The segments $A'B'$, $A'A_1$, and $B'B_1$ each have at most $q+1$ intersections.
Observe that if $AA_1$ has some $a$ intersections, then $CA$ also has at least $a$
intersections, and similarly for $BB_1$ and $BD$. At the same time $CD$ has at
most $q+1$ intersections altogether. Therefore, $AA_1$, $AB$, and $BB_1$ have no
more than $q+1$ intersections in total.
5.1.9(b). Geometric duality and Helly's theorem.
5.1.9(c). The first segment $s_1$ is a chord of the unit circle passing near the
center. Each $s_{i+1}$ has one endpoint on the unit circle, and the other endpoint
almost touches $s_i$ near the center.
5.3.2. Ask in this way: Given a normal vector $a \in \mathbf{R}^d$ of a hyperplane, which
vertices maximize the linear function $x \mapsto \langle a, x \rangle$? For example, for the cube,
if $a_i > 0$, then $x_i$ has to be $+1$; if $a_i < 0$, then $x_i = -1$; and for $a_i = 0$ both
$x_i = \pm 1$ are possible.
5.3.8. If the removed vertices $u, v$ lie in a common 2-face $f$, let $h$ be the plane
defining $f$; from each vertex there is an edge going "away from $h$," except for
the vertices of a single face $g \ne f$ "opposite" to $f$. The graph of the face $g$
is connected and can be reached from any other vertex. If $u, v$ do not share
a 2-face, pass a plane $h$ through them and one more vertex $w$. The subgraph
on the vertices below $h$ is connected, and so is the subgraph on the vertices
above $h$; they are connected via the vertex $w$.
5.4.2. Do not forget to check that $\beta$ is not contained in any hyperplane.
5.5.1(c). The simplest example seems to be the product of an $n$-vertex
4-dimensional cyclic polytope with its dual.
5.7.11(c). Assume $n \ge 2$. If $x, y$ are points on the surface of such an
intersection $P$, coming from the surface of the same ball $K_i$, show that the shorter
of the great circle arcs on $K_i$ connecting $x$ and $y$ lies entirely on the surface of
$P$ (this is a kind of "convexity" of the facets). Infer that each ball contributes
at most one facet, and use Euler's formula.
6.1.5. $n! \cdot C_n$, where $C_n = \frac{1}{n+1}\binom{2n}{n}$ is the $n$th Catalan number.
6.1.6(a). One possibility is a perturbation argument. Another one is a proof
by induction, adding one line at a time.
6.1.7(b). Warning: The $\binom{n}{2}$ lines determined by $n$ points in general position
are not in general position!
6.2.2(b). Assuming that no $s_i$ is vertical, write $s_i = \{(x, y) \in \mathbf{R}^2 : c_i \le x \le
d_i,\ y = a_i x + b_i\}$. Whether $s_i$ and $s_j$ intersect can be determined from the
signs of the $O(n^2)$ polynomials $a_i - a_j$, $c_i - c_j$, $d_i - d_j$, $c_i(a_i - a_j) + b_i - b_j$,
$d_i(a_i - a_j) + b_i - b_j$, $i, j = 1, 2, \ldots, n$.
6.2.2(c). Use the lower bound for the quantity K(n, n) in Chapter 4.
6.3.4(a). First derive $X_W \ge |W| - n$, and then use it for a random sample
of the lines.
6.4.3(a). Define an incidence graph between lines and the considered $m$ cells
(incidence = the line contributes an edge to the cell). This graph contains
no $K_{2,5}$, since two cells have at most 4 "common tangents."
6.4.3. Each of the given $n$ cells either lies completely within a single triangle
$\Delta_i$, or it is in the zone of an edge of some triangle. Use the zone theorem for
bounding the total number of edges of cells of the latter type.
6.5.2(a). $E[X^2] = \sum_{i,j} E[X_i X_j]$. $E[X_i X_j] = p^2$ for $i \ne j$ and $E[X_i^2] = p$.
The result is $p^2 n(n-1) + pn$.
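The formula can be verified exactly for small $n$, under the assumption (suggested by the hint, but not quoted from the exercise) that $X = X_1 + \cdots + X_n$ is a sum of independent 0/1 variables, each equal to 1 with probability $p$. The function name is hypothetical.

```python
from fractions import Fraction
from math import comb

def second_moment(n, p):
    """E[X^2] for X = X_1 + ... + X_n, independent X_i equal to 1 with probability p."""
    # X is binomially distributed, so sum k^2 * P[X = k] over all k.
    return sum(k * k * comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1))
```

With `Fraction` arguments the comparison with $p^2 n(n-1) + pn$ is exact rather than approximate.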
7.1.1. Construct the curves from left to right: Start with n horizontal lines
on the left and always "bring down" the curve required by the sequence.
7.1.4. Warning: The abab subsequence can appear!
7.1.8(b). For simplicity assume that the $s_i$ and $t_i$ are all distinct and
let $E = \{s_1, t_1, \ldots, s_n, t_n\}$. Call a vertex $v$ active for an interval $I \subseteq \mathbf{R}$ if $v$
appears on the lower envelope of $L_t$ for some $t \in I$ and $I \cap \{s_i, s_j, t_i, t_j\} \ne \emptyset$,
where $\ell_i, \ell_j$ are the lines defining $v$. Let $g(I)$ be the number of active vertices
for $I$ and let $g(m) = \max\{g(I) : |I \cap E| \le m\}$. Split $I$ in the middle of $E \cap I$
and derive $g(m) \le O(m) + g(\lfloor m/2 \rfloor) + g(\lceil m/2 \rceil)$.
7.3.2(b). Zero out the first and last 1 in each row. Go through the matrix
column by column and write down the row indices of 1's. Deleting contiguous
repetitions produces a Davenport-Schinzel sequence with no $ababa$.
7.4.1(b). Given a sequence w witnessing 'l/J;(m,n), replace each of the m
segments in the decomposition of w by the list of its symbols (and erase
contiguous repetitions if needed).
8.1.2. Make the sets compact as in the proof of the fractional Helly theorem.
Consider all $d$-element collections $\mathcal{K}$ containing one set from each $\mathcal{C}_i$ but one,
and let $v_{\mathcal{K}}$ be the lexicographic minimum of the intersection $\bigcap \mathcal{K}$. Let $\mathcal{K}_0$
be such that $v = v_{\mathcal{K}_0}$ is the lexicographically largest among all $v_{\mathcal{K}}$, and let
$i_0$ be the index such that $\mathcal{K}_0$ contains no set from $\mathcal{C}_{i_0}$. Show that for each
$C \in \mathcal{C}_{i_0}$, $v$ is the minimum of $C \cap \bigcap \mathcal{K}_0$, and in particular, $v \in C$.
8.2.1. Regard $B \cup T$ as a Gale transform of a point sequence and reformulate
the problem using that sequence. Or lift $B \cup T$ into $\mathbf{R}^{d+1}$ suitably.
9.2.2(b). For $d = 3$: Choose $k$ points on the moment curve, say, and replace
each by a cluster of $n/k$ points. Use all tetrahedra having two vertices in one
cluster and the other two vertices in another cluster. There are about $n^4/k^2$
such tetrahedra, and no point is contained in more than $n^4/k^4$ of them if the
clusters are small and $k$ is not too large compared to $n$.
9.3.1(b). Be careful with degenerate cases; first determine the dimension of
the affine hull of $p_1, \ldots, p_{d+1}$ and test whether $p_{d+2}$ lies in it. Then you may
need to use some number of other affinely independent points among the $p_i$.
9.3.3(a). Let $x_i, x_i' \in X_i$ be such that $(x_1, \ldots, x_{d+1})$ and $(x_1', \ldots, x_{d+1}')$
have different orientations. Let $y_i$ be a point moving along the segment $x_i x_i'$
at constant speed, starting at $x_i$ at time 0 and reaching $x_i'$ at time 1. By
continuity of the determinant, all the $y_i$ lie in a common hyperplane at some
moment, and this hyperplane intersects the convex hulls of all the $X_i$.
9.3.3(b). Let the hyperplane $h$ intersect all the $C_i$, and let $a_i \in h \cap C_i$. Use
Radon's lemma.
9.3.3(c). Suppose that $0 \in \mathrm{conv}(\bigcup_{i \in I} C_i) \cap \mathrm{conv}(\bigcup_{j \notin I} C_j)$. Then there are
points $x_i \in C_i$, $i = 1, 2, \ldots, d+1$, such that $0 \in \mathrm{conv}\{x_i : i \in I\}$ and $0 \in
\mathrm{conv}\{x_j : j \notin I\}$. Hence the vectors $\{x_i : i \in I\}$ are linearly dependent, as well
as those of $\{x_j : j \notin I\}$. Thus, the linear subspace generated by all the $x_i$ has
dimension at most $d-1$.
9.3.5(a). Partition $P$ into 3 sets and apply the same-type lemma. If $Y_1, Y_2, Y_3$
are the resulting sets, then each line misses at least one $\mathrm{conv}(Y_i)$. Let $P'$ be
the $Y_i$ whose convex hull is missed by the largest number of lines of $L$.
9.3.5(b). First apply (a) with P consisting of the left endpoints of the seg-
ments of B. Then apply (a) again with the right endpoints of the remaining
segments and the remaining lines. Finally, discard either the lines intersected
by all segments or those intersected by no segment.
9.3.5(c). Use (b) twice.
9.4.4. Consider the complete bipartite graphs with classes $V_i$ and $V_j$, $1 \le
i < j \le 4$, and color each of their edges randomly either red or blue with
equal probability. A triple $\{u, v, w\}$ with $u \in V_i$, $v \in V_j$, $w \in V_k$, $i < j < k$,
is present if and only if the edges $\{u, v\}$ and $\{u, w\}$ have distinct colors.
10.1.3. Choose the appropriate number of points independently at random
according to the distribution given by an optimal fractional transversal.
10.1.4(a). Let $m_k$ be the number of yet uncovered sets after the last step
$i$ such that $x_i$ covered more than $k$ previously uncovered sets ($m_d = |\mathcal{F}|$,
$m_0 = 0$). Derive $t \le \sum_{k=1}^{d} \frac{m_k - m_{k-1}}{k}$ and note that $m_k \le \nu_k(\mathcal{F})$.
10.1.6(b). By the Farkas lemma, it suffices to check the following: For all
$u \in \mathbf{R}^m$, $v \in \mathbf{R}^n$, and $z \in \mathbf{R}$ such that $u \ge 0$, $v \ge 0$, $z \ge 0$, $u^T A \le zc$, and
$Av \ge zb$, we have $u^T b \le c^T v$. For $z \ne 0$ this is (a), and for $z = 0$ choose
$x_0 \in P$ and $y_0 \in D$ and use $u^T b \le u^T A x_0 \le 0$ and $c^T v \ge y_0^T A v \ge 0$.
10.2.2. All subsets of size at most d.
10.3.1. 7.
10.3.3. Such a $p$ would have to be 0 on the boundary, but if a polynomial is
0 on a segment, then it is 0 on the whole line containing that segment.
10.3.4(b). Choose a ~-net $S \subseteq L$ for the set system $(L, T)$ and triangulate
the arrangement of $S$. No dangerous triangle appears in this triangulation.
10.3.6(c). The shattering graph SCd considered in Exercise 5 contains a
subdivision of Kd where each edge is subdivided once. Some care is needed,
since some vertices might be both shattering and shattered in C.
10.4.1(b). This method gives size 0 (€_2 d - I
).

1O.4.2(b). (a) yields f(€) ::; m + £f(£€/3); set £ = 3/JE. The exponent of
log ~ is log2 3.
10.4.3. We may assume that € is sufficiently small. Let C be convex with
IC n X I ;::: En. Then C n X contains points a, b, c such that the shortest of the
3 arcs determined by them, call it G, is at least O(€). Show that the triangle
abc contains a point of N i , where i is the smallest with €(l.Ol)i /10 > G.
10.5.2. If $x$ is the last among the lexicographic minima of $d$-wise intersections
of $\mathcal{F}$, the family $\{F \in \mathcal{F} : x \notin F\}$ satisfies the $(p-d, q-d+1)$-condition.
10.5.3(b). By ham-sandwich, choose lines $\ell, \ell'$ with $|R_i \cap X| \le k+1$, where
$R_1, \ldots, R_4$ are the "quadrants" determined by $\ell$ and $\ell'$. The point $\ell \cap \ell'$ and
centerpoints of $R_i \cap X$ form a transversal.
10.6.1(a). No need to invoke the Alon-Kleitman machinery here.
10.6.1(b). Use Ramsey's theorem.
10.6.2(a). Count the incidences of endpoints with intervals (it can be
assumed that all the intervals have distinct endpoints). To get a better $\beta$, apply
Turán's theorem.
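10.6.3. For $\mathcal{F} \subset \mathcal{K}_k^d$ finite, let $\mathcal{G} = \bigcup_{S \in \mathcal{F}} \{S_1, S_2, \ldots, S_k\}$, where $S = S_1 \cup
\cdots \cup S_k$ with the $S_i$ convex. If $\mathcal{F}$ has many intersecting $(d+1)$-tuples, then $\mathcal{G}$
has many intersecting $(d+1)$-tuples and so fractional Helly for $\mathcal{F}$, with worse
parameters, follows from that for $\mathcal{G}$.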
10.6.4. Let C = f(d+1, d, k), where f(p, d, k) is as in Exercise 3, and h =
(d+1)C. Let F' be the family of all intersections of C-tuples of sets of F.
This $\mathcal{F}'$ has the $(d+1, d+1)$-property, and so it has a $C$-point transversal $T$.
Show that some point of $T$ is contained in all members of $\mathcal{F}$.
11.1.4. In R3: Place the planar construction on ~ points into the xz plane
so that all of its points lie very near 0 and all the halving edges are almost
parallel to the x-axis. A set A of ~ points is placed on the line x = 0, y = 1,
and the remaining ~ points are the reflected set - A.
11.1.5(a). Use the lower bound for $K(n, n)$ in Chapter 4.
11.1.6(a). All the 12 lenses corresponding to such a $K_{3,4}$ are contained in
$L \cup U$, and so $L$ intersects $U$ at least 24 times. This is impossible, since $U$
has at most 5 edges and $L$ at most 7 edges (using $\lambda_2(n) \le 2n-1$).
11.1.6(d). To bound $\nu_k(\mathcal{L})$, fix a $k$-packing $\mathcal{M} \subseteq \mathcal{L}$, take a random sample
$R \subseteq \Gamma$, and consider the family $\mathcal{A}$ of all lenses $\ell$ in the arrangement of
$R$ "inherited" from $\mathcal{M}$ and such that none of the extremal edges of $\ell$ are
contained in any other lens in the arrangement of $R$. Extremal edges of a lens
are those contained in the lens and adjacent to one of its two end-vertices.
11.3.2. By Exercise 1(a), a vertical line intersects the interior of at most
$\sum_{k \in K} (k+1)$ $k$-edges with $k \in K$. Argue as in the proof of the planar case of
Theorem 11.3.3.
11.3.4(b). These halving triangles are not influenced by projecting the other
points of $X$ centrally from $p_{k+1}$ on a sphere around $p_{k+1}$.
11.3.5(a). Let $V$ be the vertex set of a $j$-facet $F$ entered by $\ell$. Among the
$j$ points below the hyperplane defined by $V$ we can choose any $k$ points and
add them to $V$, obtaining an $S$ with $F$ being the facet of $\mathrm{conv}(S)$ through
which $\ell$ leaves $\mathrm{conv}(S)$.
11.3.5(b). See the end of Section 5.5 for a similar trick.
11.3.5(c). For $h_j = h_{n-d-j}$, let $X'$ be the mirror reflection of $X$ by a
horizontal hyperplane.
11.3.5(d). Move $x$ far up.
11.3.6(a). Corollary 5.6.3(iii).
11.3.6(b). Use (a) and the formulas expressing the $f_k$ using the $h_j$ and the
$s_k$ using the $h_j$, respectively.
11.3.8(a). Draw a tiny sphere $\sigma$ around a vertex incident to at least $3n$
triangles. The intersections of the triangles with $\sigma$ define a graph drawn
on $\sigma$. With $n$ vertices and at least $3n$ edges, the graph is nonplanar.
12.1.5. Let $v$ be a vertex of $P$. First check that there is an $a \in \mathbf{Z}^n$ such
that $v$ is the unique vertex minimizing $\langle a, v \rangle$. Moreover, we may assume that
$a' = a + (1, 0, \ldots, 0)$, too, has this property. Then $v_1 = \langle a', v \rangle - \langle a, v \rangle \in \mathbf{Z}$.
12.1.6(b). We need that each integral $b \in A\mathbf{R}^n$ is the image of an integer
point. Let $\bar{A}$ be a regular $k \times k$ submatrix of $A$ with $k = \mathrm{rank}(A)$; we may
assume that $\bar{A}$ is contained in the first $k$ rows and in the first $k$ columns of
$A$. Let $\bar{b}$ consist of the first $k$ components of $b$; then $\bar{x} = \bar{A}^{-1} \bar{b}$ is integral by
(a). Append $n - k$ zero components to $\bar{x}$.
12.1.6( c). A vertex is determined by some n of the inequalities holding with
equality; use (b).
12.1.7(b). It suffices to consider $n = 2^d + 1$. For contradiction, suppose that
$\mathbf{Z}^d \cap \bigcap_{i=1}^{n} \gamma_i = \emptyset$. For $i = 1, 2, \ldots, n$, let $\gamma_i'$ be $\gamma_i$ translated as far outwards
as possible so that $\mathbf{Z}^d \cap \mathrm{int}\bigl((\bigcap_{j=1}^{i} \gamma_j') \cap (\bigcap_{j=i+1}^{n} \gamma_j)\bigr) = \emptyset$. Show that each
$\gamma_i'$ contributes a facet of $P' = \bigcap_{i=1}^{n} \gamma_i'$ and there is a $z_i \in \mathbf{Z}^d$ in the relative
interior of this facet. Applying (a) to $\{z_1, \ldots, z_n\}$ yields a lattice point interior
to $P'$.
12.2.5(b). Suppose $\mathrm{vol}(A), \mathrm{vol}(B) > 0$, fix $t$ with $\mathrm{vol}(A)/(1-t)^n = \mathrm{vol}(B)/t^n$,
and set $C = \frac{1}{1-t} A$ and $D = \frac{1}{t} B$.
12.2.7(a). Consider the horizontal slice Fy = {x E R: f(x) = y}, and
Gy, Hy defined analogously. We have I f = IOI vol(Fy) dy. The assumption
implies (l-t)Fy + tGy ~ H y. Apply the one-dimensional Brunn-Minkowski
to (l-t)Fy and tGy and integrate over y.
12.2.7(b). Let $f(u)$ be the $(n-1)$-dimensional volume of the slice of $C$ by
the hyperplane $x_1 = u$; similarly for $g(u)$ and $D$ and for $h(u)$ and $C+D$.
13.1.1. $2^n/n!$.
13.1.2(b). $I_n = n V_n \int_0^\infty e^{-r^2} r^{n-1}\, dr$.
13.2.3. Fix the coordinate system so that $c = 0$ and $F$ lies in the coordinate
hyperplane $h = \{x_n = 0\}$. Since 0 is not the center of gravity, for some $i$
we have $I = \int_F x_i \, dx \ne 0$. Without loss of generality, $i = 1$ and $I > 0$. Let
$h_1$ be $h$ slightly rotated around the flat $\{x_1 = x_n = 0\}$; i.e., $h_1 = \{x \in
\mathbf{R}^n : \langle a, x \rangle = 0\}$ with $a = (\varepsilon, 0, \ldots, 0, 1)$. Let $S_1$ be the simplex determined
by the same facet hyperplanes as $S$ except that $h$ is replaced by $h_1$. The
difference $\mathrm{vol}(S) - \mathrm{vol}(S_1)$ is proportional to $\varepsilon I + O(\varepsilon^2)$ as $\varepsilon \to 0$. Let $h'$
be a parallel translation of $h_1$ that touches $B^n$ (near 0), and let $S'$ be the
corresponding simplex. Calculation shows that $|\mathrm{vol}(S_1) - \mathrm{vol}(S')| = O(\varepsilon^2)$.
13.2.5. The Thales theorem implies that if $x \notin B(\frac{1}{2} v, \frac{1}{2} \|v\|)$, then $v$ lies in
the open half-space $\gamma_x$ containing 0 and bounded by the hyperplane passing
through $x$ and perpendicular to $0x$.
13.3.1(b). Geometric duality and Theorem 13.2.1.
13.4.4(b). Helly's theorem for suitable sets in $\mathbf{R}^{n+1}$.
13.4.5(a). Since the ratio of areas is invariant under affine transforms, we
may assume that $P$ contains $B(0, 1)$ and is contained in $B(0, 2)$. Infer that
99% of the edges of $P$ have length $O(\frac{1}{n})$ and 99% of the angles are $\pi - O(\frac{1}{n})$.
Then there are two consecutive short edges with angle close to $\pi$.
14.1.4. Choose a radius r such that the caps cut off from rBn by the
considered slabs together cover at most half of the surface of r Bn. Then
vol(K) 2: vol(K n rBn) 2: ~rn.
14.6.1. Suppose that $\max_i |v_i| = |v_1|$. For any fixed choice of $\sigma_2, \ldots, \sigma_n$, use
$\frac{1}{2}(|x + y| + |x - y|) \ge |y|$ with $y = v_1$ and $x = \sum_{i=2}^{n} \sigma_i v_i$.
14.6.2. We need to bound n- l / 2EUIZlldIIZlll from below for Z as in
Lemma 14.6.4. Each IZil is at least a small constant f3 > 0 with proba-
bility at least ~; derive that IIZlll = !l(n) with probability at least ~.
15.2.3(b). Let $\lambda_1, \ldots, \lambda_n$ be the eigenvalues of $A$. The rank is the number
of nonzero $\lambda_i$. Estimate $\sum \lambda_i^2$ in two ways: First use the trace of $A^T A$, and
then the trace of $A$ and Cauchy-Schwarz.
15.2.3(d). If $v_1, \ldots, v_n \in \mathbf{R}^k$, then the matrix $A$ with $a_{ij} = \langle v_i, v_j \rangle$ has rank
at most $k$.
15.3.4(a). Let n = 2m+1 and let each n-tuple in V have the form
(0, el, e2,"" em, em+l + 10C:Wl, em+l + 10c:w2,.··, e2m + 10c:wm), where each
Wi is an 0/1 vector with l406e: 2 J ones among the first m positions and zeros
elsewhere.
15.4.2. Let $G_i = (V_i, E_i)$, where $V_0 \subset V_1 \subset \cdots \subset V_m$. For each $e \in E_{i-1}$, we
have a pair $\{u_e, v_e\}$ of new vertices in $G_i$ in the square that replaces $e$; let
$F_i = \{\{u_e, v_e\} : e \in E_{i-1}\}$. With notation as in the proof of Theorem 15.4.1,
put $E = E_m$ and $F = E_0 \cup \bigcup_{i=1}^{m} F_i$ and show that $R_{E,F}(\rho) = \sqrt{m+1}$, while
$R_{E,F}(\sigma) \le 1$. For the latter, sum up the inequalities $\sigma^2(F_i) + \sigma^2(E_{i-1}) \le
\sigma^2(E_i)$, $i = 1, 2, \ldots, m$, obtained from the short diagonals lemma.
15.4.3. Color the pairs of points; the color of $\{x, y\}$ is the remainder of
$\lceil \log_{1+\varepsilon/2} \rho(x, y) \rceil$ modulo $r$, where $r$ is a sufficiently large integer. Show by
induction that a homogeneous set can be embedded satisfactorily.
15.5.2(b). By (a) and Carathéodory's theorem, every metric in $\mathcal{L}_1^{(\mathrm{fin})}$ is a
convex combination of at most N + 1 line metrics. To get rid of the extra +1,
use the fact that $\mathcal{L}_1^{(\mathrm{fin})}$ is a convex cone.
15.5.8(c). The expectation of $\frac{1}{2}(1 - x_u x_v)$ is the probability that the hyper-
plane through 0 perpendicular to r separates $y_u$ and $y_v$, and this equals $\vartheta/\pi$,
where $\vartheta \in [0, \pi)$ is the angle of $y_u$ and $y_v$. On the other hand, the contribution
of the edge $\{u, v\}$ to $M_{\mathrm{relax}}$ is $\frac{1}{2}(1 - \langle y_u, y_v \rangle) = (1 - \cos\vartheta)/2$. The constant
0.878... is the minimum of $\frac{2}{\pi} \cdot \frac{\vartheta}{1 - \cos\vartheta}$, $0 \le \vartheta \le \pi$.
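The value of the constant can be confirmed numerically. The sketch below evaluates the ratio of the separation probability $\vartheta/\pi$ to the edge contribution $(1 - \cos\vartheta)/2$ on a fine grid; the function names `ratio` and `minimize_on_grid` and the grid size are illustrative assumptions, and a grid search is used only for simplicity.

```python
import math


def ratio(theta):
    """(theta / pi) divided by (1 - cos(theta)) / 2, i.e. (2 / pi) * theta / (1 - cos(theta))."""
    return (2.0 / math.pi) * theta / (1.0 - math.cos(theta))


def minimize_on_grid(f, lo, hi, steps):
    """Evaluate f at lo + (hi - lo) * i / steps for i = 1..steps and return (argmin, min)."""
    best_x, best_val = None, float("inf")
    for i in range(1, steps + 1):  # i starts at 1 so that theta = 0 (division by zero) is skipped
        x = lo + (hi - lo) * i / steps
        val = f(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


theta_star, alpha = minimize_on_grid(ratio, 0.0, math.pi, 200_000)
print(f"minimum {alpha:.6f} attained near theta = {theta_star:.4f}")
```

The ratio tends to infinity as $\vartheta \to 0$ and equals 1 at $\vartheta = \pi$, so the minimum lies in the interior of the interval.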
15.7.5(c). Suppose that there is a D-embedding f of $T_{k,m}$. For every leaf $\ell$,
consider f restricted to the path $p(\ell)$ from the root to $\ell$, fix a triple $\{a_\ell, b_\ell, c_\ell\}$
of vertices as in Exercise 4 (a scaled copy of $P_2$), and label the corresponding
leaf by the distances of $a_\ell, b_\ell, c_\ell$ from the root. Using (b), choose a $T_{2,m}$
subtree where all leaves have the same labels, consider leaves $\ell$ and $\ell'$ of this
subtree such that $p(\ell)$ and $p(\ell')$ first meet at $b_\ell = b_{\ell'}$, and use (a) with
$x_0 = f(a_\ell)$, $x_1 = f(b_\ell)$, $x_2 = f(c_\ell)$, $x_2' = f(c_{\ell'})$.
15.7.6(a). Sum the parallelogram identities $\frac{1}{(c-a)^2}\bigl(\|(x_a - x_b) - (x_b - x_c)\|^2 +
\|(x_a - x_b) + (x_b - x_c)\|^2\bigr) = \frac{2}{(c-a)^2}\bigl(\|x_a - x_b\|^2 + \|x_b - x_c\|^2\bigr)$ over $(a, b, c) \in \Gamma$.
Bibliography

The references are sorted alphabetically by the abbreviations (rather than by
the authors' names).
[AA92] P. K. Agarwal and B. Aronov. Counting facets and incidences.
Discrete Comput. Geom., 7:359-369, 1992. (refs: pp. 46, 47)
[AACS98] P. K. Agarwal, B. Aronov, T. M. Chan, and M. Sharir. On
levels in arrangements of lines, segments, planes, and triangles.
Discrete Comput. Geom., 19(3):315-331, 1998. (refs: pp. 269,
270, 271, 286, 287)
[AAHP+98] A. Andrzejak, B. Aronov, S. Har-Peled, R. Seidel, and E. Welzl.
Results on k-sets and j-facets via continuous motion arguments.
In Proc. 14th Annu. ACM Sympos. Comput. Geom., pages 192-
199, 1998. (refs: pp. 269, 270, 286)
[AAP+97] P. K. Agarwal, B. Aronov, J. Pach, R. Pollack, and M. Sharir.
Quasi-planar graphs have a linear number of edges. Combina-
torica, 17:1-9, 1997. (ref: p. 177)
[AAS01] P. K. Agarwal, B. Aronov, and M. Sharir. On the complexity
of many faces in arrangements of circles. In Proc. 42nd IEEE
Symposium on Foundations of Computer Science, 2001. (refs:
pp. 47, 70)
[ABFK92] N. Alon, I. Bárány, Z. Füredi, and D. Kleitman. Point selections
and weak $\varepsilon$-nets for convex hulls. Combin., Probab. Comput.,
1(3):189-200, 1992. (refs: pp. 215, 254, 270)
[ABS97] D. Avis, D. Bremner, and R. Seidel. How good are convex hull
algorithms? Comput. Geom. Theory Appl., 7:265-302, 1997.
(ref: p. 106)
[ABV98] J. Arias-de-Reyna, K. Ball, and R. Villa. Concentration of the
distance in finite dimensional normed spaces. Mathematika,
45:245-252, 1998. (ref: p. 332)
[ACE+91] B. Aronov, B. Chazelle, H. Edelsbrunner, L. J. Guibas,
M. Sharir, and R. Wenger. Points and triangles in the plane
and halving planes in space. Discrete Comput. Geom., 6:435-
442, 1991. (refs: pp. 215, 270)
[Ach01] D. Achlioptas. Database-friendly random projections. In Proc.
20th ACM SIGACT-SIGMOD-SIGART Symposium on Princi-
ples of Database Systems, pages 274-281, 2001. (ref: p. 361)
[ACNS82] M. Ajtai, V. Chvatal, M. Newborn, and E. Szemeredi. Crossing-
free subgraphs. Ann. Discrete Math., 12:9-12, 1982. (ref: p. 56)
[AEG+94] B. Aronov, P. Erdős, W. Goddard, D. J. Kleitman, M. Kluger-
man, J. Pach, and L. J. Schulman. Crossing families. Combi-
natorica, 14:127-134, 1994. (ref: p. 177)
[AEGS92] B. Aronov, H. Edelsbrunner, L. Guibas, and M. Sharir. The
number of edges of many faces in a line segment arrangement.
Combinatorica, 12(3):261-274, 1992. (ref: p. 46)
[AF92] D. Avis and K. Fukuda. A pivoting algorithm for convex hulls
and vertex enumeration of arrangements and polyhedra. Dis-
crete Comput. Geom., 8:295-313, 1992. (ref: p. 106)
[AF00] N. Alon and E. Friedgut. On the number of permutations avoid-
ing a given pattern. J. Combin. Theory, Ser. A, 81:133-140,
2000. (ref: p. 177)
[AFH+00] H. Alt, S. Felsner, F. Hurtado, M. Noy, and E. Welzl. A class
of point-sets with few k-sets. Comput. Geom. Theor. Appl.,
16:95-101, 2000. (ref: p. 270)
[AFR85] N. Alon, P. Frankl, and V. Rödl. Geometrical realization of set
systems and probabilistic communication complexity. In Proc.
26th IEEE Symposium on Foundations of Computer Science,
pages 277-280, 1985. (ref: p. 140)
[AG86] N. Alon and E. Győri. The number of small semispaces of a
finite set of points in the plane. J. Combin. Theory Ser. A,
41:154-157, 1986. (ref: p. 145)
[AGHV01] P. K. Agarwal, L. J. Guibas, J. Hershberger, and E. Veach.
Maintaining the extent of a moving point set. Discrete Comput.
Geom., 26:353-374, 2001. (ref: p. 194)
[AH00] R. Aharoni and P. E. Haxell. Hall's theorem for hypergraphs.
J. Graph Theory, 35:83-88, 2000. (ref: p. 235)
[Aha01] R. Aharoni. Ryser's conjecture for tri-partite hypergraphs.
Combinatorica, 21:1-4, 2001. (ref: p. 235)
[AHL01] N. Alon, S. Hoory, and N. Linial. The Moore bound for irregular
graphs. Graphs and Combinatorics, 2001. In press. (ref: p. 367)
[AI88] F. Aurenhammer and H. Imai. Geometric relations among
Voronoi diagrams. Geom. Dedicata, 27:65-75, 1988. (ref:
p. 121)
[Ajt98] M. Ajtai. Worst-case complexity, average-case complexity and
lattice problems. Documenta Math. J. DMV, Extra volume
ICM 1998, vol. III:421-428, 1998. (ref: p. 26)
[AK85] N. Alon and G. Kalai. A simple proof of the upper bound
theorem. European J. Combin., 6:211-214, 1985. (ref: p. 103)
[AK92] N. Alon and D. Kleitman. Piercing convex sets and the Had-
wiger-Debrunner (p,q)-problem. Adv. Math., 96(1):103-112,
1992. (ref: p. 258)
[AK95] N. Alon and G. Kalai. Bounding the piercing number. Discrete
Comput. Geom., 13:245-256, 1995. (ref: p. 261)
[AK00] F. Aurenhammer and R. Klein. Voronoi diagrams. In J.-R. Sack
and J. Urrutia, editors, Handbook of Computational Geometry,
pages 201-290. Elsevier Science Publishers B.V. North-Holland,
Amsterdam, 2000. (refs: pp. 120, 121)
[AKMM01] N. Alon, G. Kalai, J. Matousek, and R. Meshulam. Transversal
numbers for hypergraphs arising in geometry. Adv. Appl. Math.,
2001. In press. (ref: p. 262)
[AKP89] N. Alon, M. Katchalski, and W. R. Pulleyblank. The maximum
size of a convex polygon in a restricted set of points in the plane.
Discrete Comput. Geom., 4:245-251, 1989. (ref: p. 33)
[AKPW95] N. Alon, R. M. Karp, D. Peleg, and D. West. A graph-theoretic
game and its application to the k-server problem. SIAM J.
Computing, 24(1):78-100, 1995. (ref: p. 398)
[AKV92] R. Adamec, M. Klazar, and P. Valtr. Generalized Davenport-
Schinzel sequences with linear upper bound. Discrete Math.,
108:219-229, 1992. (ref: p. 176)
[Alo] N. Alon. Covering a hypergraph of subgraphs. Discrete Math.
In press. (ref: p. 262)
[Alo86a] N. Alon. Eigenvalues and expanders. Combinatorica, 6:83-96,
1986. (ref: p. 381)
[Alo86b] N. Alon. The number of polytopes, configurations, and real
matroids. Mathematika, 33:62-71, 1986. (ref: p. 140)
[Alo98] N. Alon. Piercing d-intervals. Discrete Comput. Geom., 19:333-
334, 1998. (ref: p. 262)
[ALPS01] N. Alon, H. Last, R. Pinchasi, and M. Sharir. On the complex-
ity of arrangements of circles in the plane. Discrete Comput.
Geom., 26:465-492, 2001. (ref: p. 271)
[AM85] N. Alon and V. D. Milman. $\lambda_1$, isoperimetric inequalities for
graphs, and superconcentrators. J. Combinatorial Theory, Ser.
B, 38(1):73-88, 1985. (ref: p. 381)
[Ame96] N. Amenta. A short proof of an interesting Helly-type theorem.
Discrete Comput. Geom., 15:423-427, 1996. (ref: p. 261)
[AMS94] B. Aronov, J. Matousek, and M. Sharir. On the sum of squares
of cell complexities in hyperplane arrangements. J. Combin.
Theory Ser. A, 65:311-321, 1994. (refs: pp. 47, 152)
[AMS98] P. K. Agarwal, J. Matousek, and O. Schwarzkopf. Computing
many faces in arrangements of lines and segments. SIAM J.
Comput., 27(2):491-505, 1998. (ref: p. 162)
[APS93] B. Aronov, M. Pellegrini, and M. Sharir. On the zone of a
surface in a hyperplane arrangement. Discrete Comput. Geom.,
9(2):177-186, 1993. (ref: p. 151)
[AR92] J. Arias-de-Reyna and L. Rodriguez-Piazza. Finite metric
spaces needing high dimension for Lipschitz embeddings in Ba-
nach spaces. Israel J. Math., 79:103-113, 1992. (ref: p. 367)
[AR98] Y. Aumann and Y. Rabani. An $O(\log k)$ approximate min-
cut max-flow theorem and approximation algorithm. SIAM J.
Comput., 27(1):291-301, 1998. (ref: p. 392)
[Aro00] B. Aronov. A lower bound for Voronoi diagram complexity.
Manuscript, Polytechnic University, Brooklyn, New York, 2000.
(refs: pp. 123, 192)
[ARS99] N. Alon, L. Ronyai, and T. Szabo. Norm-graphs: variations
and applications. J. Combin. Theory Ser. B, 76:280-290, 1999.
(ref: p. 68)
[AS94] B. Aronov and M. Sharir. Castles in the air revisited. Discrete
Comput. Geom., 12:119-150, 1994. (ref: p. 193)
[AS00a] P. K. Agarwal and M. Sharir. Arrangements and their appli-
cations. In J.-R. Sack and J. Urrutia, editors, Handbook of
Computational Geometry, pages 49-119. North-Holland, Ams-
terdam, 2000. (refs: pp. 47, 128, 145, 168, 191)
[AS00b] P. K. Agarwal and M. Sharir. Davenport-Schinzel sequences
and their geometric applications. In J.-R. Sack and J. Urru-
tia, editors, Handbook of Computational Geometry, pages 1-47.
North-Holland, Amsterdam, 2000. (ref: p. 168)
[AS00c] P. K. Agarwal and M. Sharir. Pipes, cigars, and kreplach: The
union of Minkowski sums in three dimensions. Discrete Comput.
Geom., pages 645-685, 2000. (ref: p. 194)
[AS00d] N. Alon and J. Spencer. The Probabilistic Method (2nd edition).
J. Wiley and Sons, New York, NY, 2000. First edition 1993.
(refs: pp. 336, 340)
[AS01a] B. Aronov and M. Sharir. Cutting circles into pseudo-segments
and improved bounds for incidences. Discrete Comput. Geom.,
2001. To appear. (refs: pp. 44, 46, 69, 70, 271)
[AS01b] B. Aronov and M. Sharir. Distinct distances in three dimen-
sions. Manuscript, School of Computer Science, Tel Aviv Uni-
versity, 2001. (ref: p. 45)
[Ass83] P. Assouad. Density and dimension (in French). Ann. Inst.
Fourier (Grenoble), 33:233-282, 1983. (ref: p. 250)
[ASS89] P. K. Agarwal, M. Sharir, and P. Shor. Sharp upper and lower
bounds on the length of general Davenport-Schinzel sequences.
J. Combin. Theory Ser. A, 52(2):228-274, 1989. (ref: p. 176)
[ASS96] P. K. Agarwal, M. Sharir, and O. Schwarzkopf. The overlay of
lower envelopes and its applications. Discrete Comput. Geom.,
15:1-13, 1996. (ref: p. 192)
[AST97] B. Aronov, M. Sharir, and B. Tagansky. The union of convex
polyhedra in three dimensions. SIAM J. Comput., 26:1670-
1688, 1997. (ref: p. 194)
[Aur91] F. Aurenhammer. Voronoi diagrams: A survey of a fundamental
geometric data structure. ACM Comput. Surv., 23(3):345-405,
September 1991. (ref: p. 120)
[Avi93] D. Avis. The m-core properly contains the m-divisible points
in space. Pattern Recognit. Lett., 14(9):703-705, 1993. (ref:
p. 205)
[Bal] K. Ball. Convex geometry and functional analysis. In W. B.
Johnson and J. Lindenstrauss, editors, Handbook of Banach
Spaces. North-Holland, Amsterdam. In press. (refs: pp. 314,
320, 337)
[Bal90] K. Ball. Isometric embedding in $\ell_p$-spaces. European J. Com-
bin., 11(4):305-311, 1990. (ref: p. 383)
[Bal92] K. Ball. Markov chains, Riesz transforms and Lipschitz maps.
Geom. Funct. Anal., 2(2):137-172, 1992. (ref: p. 380)
[Bal97] K. Ball. An elementary introduction to modern convex geome-
try. In S. Levi, editor, Flavors of Geometry (MSRI Publications
vol. 31), pages 1-58. Cambridge University Press, Cambridge,
1997. (refs: pp. viii, 300, 315, 327, 336, 337, 346)
[Bar82] I. Bárány. A generalization of Carathéodory's theorem. Discrete
Math., 40:141-152, 1982. (refs: pp. 198, 199, 210)
[Bar89] I. Bárány. Intrinsic volumes and f-vectors of random polytopes.
Math. Ann., 285(4):671-699, 1989. (ref: p. 99)
[Bar93] A. I. Barvinok. A polynomial time algorithm for counting inte-
gral points in polyhedra when the dimension is fixed. In Proc.
34th IEEE Symposium on Foundations of Computer Science,
pages 566-572, 1993. (ref: p. 24)
[Bar96] Y. Bartal. Probabilistic approximation of metric spaces and its
algorithmic applications. In Proc. 37th IEEE Symposium on
Foundations of Computer Science, pages 184-193, 1996. (ref:
p. 398)
[Bar97] A. I. Barvinok. Lattice points and lattice polytopes. In J. E.
Goodman and J. O'Rourke, editors, Handbook of Discrete and
Computational Geometry, chapter 7, pages 133-152. CRC Press
LLC, Boca Raton, FL, 1997. (refs: pp. 24, 294)
[Bar98] Y. Bartal. On approximating arbitrary metrics by tree metrics.
In Proc. 30th Annu. ACM Sympos. on Theory of Computing,
pages 161-168, 1998. (ref: p. 398)
[Bas98] S. Basu. On the combinatorial and topological complexity of a
single cell. In Proc. 39th IEEE Symposium on Foundations of
Computer Science, pages 606-616, 1998. (ref: p. 193)
[BCM99] H. Brönnimann, B. Chazelle, and J. Matoušek. Product range
spaces, sensitive sampling, and derandomization. SIAM J.
Comput., 28:1552-1575, 1999. (ref: p. 106)
[BCR98] J. Bochnak, M. Coste, and M.-F. Roy. Real Algebraic Geometry.
Springer, Berlin etc., 1998. Transl. from the French, revised and
updated edition. (refs: pp. 135, 191)
[BD93] D. Bienstock and N. Dean. Bounds for rectilinear crossing num-
bers. J. Graph Theory, 17(3):333-348, 1993. (ref: p. 58)
[BDV91] A. Bialostocki, P. Dierker, and B. Voxman. Some notes on the
Erdős-Szekeres theorem. Discrete Math., 91(3):231-238, 1991.
(ref: p. 38)
[Bec83] J. Beck. On the lattice property of the plane and some prob-
lems of Dirac, Motzkin and Erdos in combinatorial geometry.
Combinatorica, 3(3-4):281-297, 1983. (refs: pp. 45, 50)
[Ben66] C. T. Benson. Minimal regular graphs of girth eight and twelve.
Canad. J. Math., 18:1091-1094, 1966. (ref: p. 367)
[BEPY91] M. Bern, D. Eppstein, P. Plassman, and F. Yao. Horizon the-
orems for lines and polygons. In J. Goodman, R. Pollack, and
W. Steiger, editors, Discrete and Computational Geometry: Pa-
pers from the DIMACS Special Year, volume 6 of DIMACS
Series in Discrete Mathematics and Theoretical Computer Sci-
ence, pages 45-66. American Mathematical Society, Association
for Computing Machinery, Providence, RI, 1991. (ref: p. 151)
[Ber61] C. Berge. Färbungen von Graphen, deren sämtliche bzw.
deren ungerade Kreise starr sind (Zusammenfassung). Wis-
senschaftliche Zeitschrift, Martin Luther Universität Halle-
Wittenberg, Math.-Naturwiss. Reihe, pages 114-115, 1961. (ref:
p. 293)
[Ber62] C. Berge. Sur une conjecture relative au problème des codes op-
timaux. Communication, 13ème assemblée générale de l'URSI,
Tokyo, 1962. (ref: p. 293)
[BF84] E. Boros and Z. Füredi. The number of triangles covering the
center of an n-set. Geom. Dedicata, 17:69-77, 1984. (ref:
p. 210)
[BF87] I. Bárány and Z. Füredi. Computing the volume is difficult.
Discrete Comput. Geom., 2:319-326, 1987. (refs: pp. 320, 322,
324)
[BF88] I. Bárány and Z. Füredi. Approximation of the sphere by poly-
topes having few vertices. Proc. Amer. Math. Soc., 102(3):651-
659, 1988. (ref: p. 320)
[BFL90] I. Bárány, Z. Füredi, and L. Lovász. On the number of halving
planes. Combinatorica, 10:175-183, 1990. (refs: pp. 205, 215,
229, 269, 270, 280)
[BFM86] J. Bourgain, T. Figiel, and V. Milman. On Hilbertian subsets
of finite metric spaces. Israel J. Math., 55:147-152, 1986. (ref:
p. 373)
[BFT95] G. R. Brightwell, S. Felsner, and W. T. Trotter. Balancing pairs
and the cross product conjecture. Order, 12(4):327-349, 1995.
(ref: p. 308)
[BGK+99] A. Brieden, P. Gritzmann, R. Kannan, V. Klee, L. Lovász, and
M. Simonovits. Deterministic and randomized polynomial-time
approximation of radii. 1999. To appear in Mathematika. Pre-
liminary version in Proc. 39th IEEE Symposium on Foundations
of Computer Science, 1998, pages 244-251. (refs: pp. 322, 334)
[BH93] U. Betke and M. Henk. Approximating the volume of convex
bodies. Discrete Comput. Geom., 10:15-21, 1993. (ref: p. 321)
[Big93] N. Biggs. Algebraic Graph Theory. Cambridge Univ. Press,
Cambridge, 1993. 2nd edition. (refs: pp. 367, 381)
[BK63] W. Bonnice and V. L. Klee. The generation of convex hulls.
Math. Ann., 152:1-29, 1963. (ref: p. 8)
[BK00] A. Brieden and M. Kochol. A note on cutting planes, vol-
ume approximation and Mahler's conjecture. Manuscript, TU
München, 2000. (ref: p. 324)
[BL81] L. J. Billera and C. W. Lee. A proof of the sufficiency of Mc-
Mullen's conditions for f-vectors of simplicial polytopes. J.
Combin. Theory Ser. A, 31(3):237-255, 1981. (ref: p. 105)
[BL92] I. Bárány and D. Larman. A colored version of Tverberg's
theorem. J. London Math. Soc. II. Ser., 45:314-320, 1992. (ref:
p. 205)
[BL99] Y. Benyamini and J. Lindenstrauss. Nonlinear Functional Anal-
ysis, Vol. I, Colloquium Publications 48. American Mathemat-
ical Society (AMS), Providence, RI, 1999. (refs: pp. 336, 352,
358)
[BLM89] J. Bourgain, J. Lindenstrauss, and V. Milman. Approximation
of zonoids by zonotopes. Acta Math., 162:73-141, 1989. (ref:
p. 320)
[BLPS99] W. Banaszczyk, A. E. Litvak, A. Pajor, and S. J. Szarek. The
flatness theorem for nonsymmetric convex bodies via the lo-
cal theory of Banach spaces. Math. Oper. Res., 24(3):728-750,
1999. (ref: p. 24)
[BLZV94] A. Björner, L. Lovász, R. Živaljević, and S. Vrećica. Chessboard
complexes and matching complexes. J. London Math. Soc.,
49:25-39, 1994. (ref: p. 205)
[BMMV02] R. Babilon, J. Matousek, J. Maxova, and P. Valtr. Low-
distortion embeddings of trees. In Proc. Graph Drawing 2001.
Springer, Berlin etc., 2002. In press. (ref: p. 393)
[BMT95] C. Buchta, J. Müller, and R. F. Tichy. Stochastical approxima-
tion of convex bodies. Math. Ann., 271:225-235, 1985. (ref:
p. 324)
[BO97] I. Bárány and S. Onn. Colourful linear programming and its
relatives. Math. Oper. Res., 22:550-567, 1997. (refs: pp. 199,
204)
[Bol85] B. Bollobás. Random Graphs. Academic Press (Harcourt Brace
Jovanovich, Publishers), London-Orlando etc., 1985. (ref:
p. 366)
[Bol87] B. Bollobás. Martingales, isoperimetric inequalities and random
graphs. In 52. Combinatorics, Eger (Hungary), Colloq. Math.
Soc. J. Bolyai, pages 113-139. Math. Soc. J. Bolyai, Budapest,
1987. (refs: pp. 336, 340)
[Bor75] C. Borell. The Brunn-Minkowski inequality in Gauss space.
Invent. Math., 30(2):207-216, 1975. (ref: p. 336)
[BOR99] A. Borodin, R. Ostrovsky, and Y. Rabani. Subquadratic ap-
proximation algorithms for clustering problems in high dimen-
sional spaces. In Proc. 31st Annual ACM Symposium on Theory
of Computing, pages 435-444, 1999. (ref: p. 361)
[Bou85] J. Bourgain. On Lipschitz embedding of finite metric spaces in
Hilbert space. Israel J. Math., 52:46-52, 1985. (refs: pp. 367,
388, 392)
[Bou86] J. Bourgain. The metrical interpretation of superreflexivity in
Banach spaces. Israel J. Math., 56:222-230, 1986. (ref: p. 392)
[BP90] K. Ball and A. Pajor. Convex bodies with few faces. Proc.
Amer. Math. Soc., 110(1):225-231, 1990. (ref: p. 320)
[BPR96] S. Basu, R. Pollack, and M.-F. Roy. On the number of cells
defined by a family of polynomials on a variety. Mathematika,
43:120-126, 1996. (ref: p. 135)
[Bre93] G. Bredon. Topology and Geometry (Graduate Texts in Math-
ematics 139). Springer-Verlag, Berlin etc., 1993. (ref: p. 4)
[Bro66] W. G. Brown. On graphs that do not contain a Thomsen graph.
Canad. Math. Bull., 9:281-285, 1966. (ref: p. 68)
[Bro83] A. Bronsted. An Introduction to Convex Polytopes. Springer-
Verlag, New York, NY, 1983. (ref: p. 85)
[BS89] J. Bokowski and B. Sturmfels. Computational Synthetic Geom-
etry. Lect. Notes in Math. 1355. Springer-Verlag, Heidelberg,
1989. (ref: p. 138)
[BSTY98] J.-D. Boissonnat, M. Sharir, B. Tagansky, and M. Yvinec.
Voronoi diagrams in higher dimensions under certain polyhe-
dral distance functions. Discrete Comput. Geom., 19(4):473-
484, 1998. (ref: p. 194)
[BT89] T. Bisztriczky and G. Fejes Tóth. A generalization of the
Erdős-Szekeres convex n-gon theorem. J. Reine Angew. Math.,
395:167-170, 1989. (ref: p. 33)
[BV82] E. O. Buchman and F. A. Valentine. Any new Helly numbers?
Amer. Math. Mon., 89:370-375, 1982. (ref: p. 13)
[BV98] I. Bárány and P. Valtr. A positive fraction Erdős-Szekeres the-
orem. Discrete Comput. Geom., 19:335-342, 1998. (ref: p. 220)
[BVS+99] A. Bjorner, M. Las Vergnas, B. Sturmfels, N. White, and
G. M. Ziegler. Oriented Matroids (2nd edition). Encyclopedia
of Mathematics 46. Cambridge University Press, Cambridge,
1999. (refs: pp. 100, 137, 139, 222)
[Can69] R. Canham. A theorem on arrangements of lines in the plane.
Israel J. Math., 7:393-397, 1969. (ref: p. 46)
[Car07] C. Carathéodory. Über den Variabilitätsbereich der Koeffizien-
ten von Potenzreihen, die gegebene Werte nicht annehmen.
Math. Ann., 64:95-115, 1907. (refs: pp. 8, 98)
[Car85] B. Carl. Inequalities of Bernstein-Jackson-type and the de-
gree of compactness of operators in Banach spaces. Ann. Inst.
Fourier, 35(3):79-118, 1985. (ref: p. 320)
[Cas59] J. Cassels. An Introduction to the Geometry of Numbers.
Springer-Verlag, Heidelberg, 1959. (ref: p. 20)
[CCPS98] W. J. Cook, W. H. Cunningham, W. R. Pulleyblank, and
A. Schrijver. Combinatorial Optimization. Wiley, New York,
NY, 1998. (ref: p. 294)
[CEG+90] K. Clarkson, H. Edelsbrunner, L. Guibas, M. Sharir, and
E. Welzl. Combinatorial complexity bounds for arrangements
of curves and spheres. Discrete Comput. Geom., 5:99-160, 1990.
(refs: pp. 44, 45, 46, 47, 68, 152)
[CEG+93] B. Chazelle, H. Edelsbrunner, L. Guibas, M. Sharir, and
J. Snoeyink. Computing a face in an arrangement of line seg-
ments and related problems. SIAM J. Comput., 22:1286-1302,
1993. (ref: p. 162)
[CEG+94] B. Chazelle, H. Edelsbrunner, L. Guibas, J. Hershberger, R. Sei-
del, and M. Sharir. Selecting heavily covered points. SIAM J.
Comput., 23:1138-1151, 1994. (ref: p. 215)
[CEG+95] B. Chazelle, H. Edelsbrunner, M. Grigni, L. Guibas, M. Sharir,
and E. Welzl. Improved bounds on weak $\varepsilon$-nets for convex sets.
Discrete Comput. Geom., 13:1-15, 1995. (ref: p. 254)
[CEGS89] B. Chazelle, H. Edelsbrunner, L. Guibas, and M. Sharir. A
singly-exponential stratification scheme for real semi-algebraic
varieties and its applications. In Proc. 16th Internat. Colloq.
Automata Lang. Program., volume 372 of Lecture Notes Com-
put. Sci., pages 179-192. Springer-Verlag, Berlin etc., 1989.
(ref: p. 162)
[CEM+96] K. L. Clarkson, D. Eppstein, G. L. Miller, C. Sturtivant, and
S.-H. Teng. Approximating center points with iterative Radon
points. Internat. J. Comput. Geom. Appl., 6:357-377, 1996.
(ref: p. 16)
[CF90] B. Chazelle and J. Friedman. A deterministic view of random
sampling and its use in geometry. Combinatorica, 10(3):229-
249, 1990. (refs: pp. 68, 161)
[CGL85] B. Chazelle, L. J. Guibas, and D. T. Lee. The power of geometric
duality. BIT, 25:76-90, 1985. (ref: p. 151)
[Cha93a] B. Chazelle. Cutting hyperplanes for divide-and-conquer. Dis-
crete Comput. Geom., 9(2):145-158, 1993. (refs: pp. 69, 162)
[Cha93b] B. Chazelle. An optimal convex hull algorithm in any fixed
dimension. Discrete Comput. Geom., 10:377-409, 1993. (ref:
p. 106)
[Cha00a] T. M. Chan. On levels in arrangements of curves. In Proc. 41st
IEEE Symposium on Foundations of Computer Science, pages
219-227, 2000. (refs: pp. 140, 271)
[Cha00b] T. M. Chan. Random sampling, halfspace range reporting, and
construction of $(\le k)$-levels in three dimensions. SIAM J. Com-
put., 30(2):561-575, 2000. (ref: p. 106)
[Cha00c] B. Chazelle. The Discrepancy Method. Cambridge University
Press, Cambridge, 2000. (ref: p. 162)
[Chu84] F. R. K. Chung. The number of different distances determined
by n points in the plane. J. Combin. Theory Ser. A, 36:342-
354, 1984. (ref: p. 45)
[Chu97] F. Chung. Spectral Graph Theory. Regional Conference Series
in Mathematics 92. Amer. Math. Soc., Providence, 1997. (ref:
p. 381)
L. P. Chew, K. Kedem, M. Sharir, B. Tagansky, and E. Welzl.
Voronoi diagrams of lines in 3-space under polyhedral convex
distance functions. J. Algorithms, 29(2):238-255, 1998. (ref:
p. 192)
[Cla87] K. L. Clarkson. New applications of random sampling in compu-
tational geometry. Discrete Comput. Geom., 2:195-222, 1987.
(refs: pp. 68, 72)
[Cla88a] K. L. Clarkson. Applications of random sampling in computa-
tional geometry, II. In Proc. 4th Annu. ACM Sympos. Comput.
Geom., pages 1-11, 1988. (refs: pp. 145, 161)
[Cla88b] K. L. Clarkson. A randomized algorithm for closest-point
queries. SIAM J. Comput., 17:830-847, 1988. (ref: p. 161)
[Cla93] K. L. Clarkson. A bound on local minima of arrangements that
implies the upper bound theorem. Discrete Comput. Geom.,
10:427-433, 1993. (refs: pp. 103, 280)
[CLO92] D. Cox, J. Little, and D. O'Shea. Ideals, Varieties, and Algo-
rithms. Springer-Verlag, New York, NY, 1992. (ref: p. 135)
[CP88] B. Carl and A. Pajor. Gelfand numbers of operators with values
in a Hilbert space. Invent. Math., 94:479-504, 1988. (refs:
pp. 320, 324)
[CS89] K. L. Clarkson and P. W. Shor. Applications of random sam-
pling in computational geometry, II. Discrete Comput. Geom.,
4:387-421, 1989. (refs: pp. 68, 105, 145, 161)
[CS99] J. H. Conway and N. J. A. Sloane. Sphere Packings, Lattices and
Groups (3rd edition). Grundlehren der Mathematischen Wis-
senschaften 290. Springer-Verlag, New York etc., 1999. (ref:
p. 24)
[CST92] F. R. K. Chung, E. Szemerédi, and W. T. Trotter. The number
of different distances determined by a set of points in the Eu-
clidean plane. Discrete Comput. Geom., 7:1-11, 1992. (ref:
p. 45)
[Dan63] G. B. Dantzig. Linear Programming and Extensions. Princeton
University Press, Princeton, NJ, 1963. (ref: p. 93)
[Dan86] L. Danzer. On the solution of the problem of Gallai about
circular discs in the Euclidean plane (in German). Stud. Sci.
Math. Hung., 21:111-134, 1986. (ref: p. 235)
[dBvKOS97] M. de Berg, M. van Kreveld, M. Overmars, and O. Schwarzkopf.
Computational Geometry: Algorithms and Applications.
Springer-Verlag, Berlin, 1997. (refs: pp. 116, 122, 162)
[DE94] T. K. Dey and H. Edelsbrunner. Counting triangle crossings
and halving planes. Discrete Comput. Geom., 12:281-289, 1994.
(ref: p. 270)
[Del34] B. Delaunay. Sur la sphère vide. A la mémoire de Georges
Voronoi. Izv. Akad. Nauk SSSR, Otdelenie Matematicheskih i
Estestvennyh Nauk, 7:793-800, 1934. (ref: p. 120)
[Dey98] T. K. Dey. Improved bounds on planar k-sets and related
problems. Discrete Comput. Geom., 19:373-382, 1998. (refs:
pp. 269, 270, 285)
[DFK91] M. E. Dyer, A. Frieze, and R. Kannan. A random polynomial
time algorithm for approximating the volume of convex bodies.
J. ACM, 38:1-17, 1991. (ref: p. 321)
[dFPP90] H. de Fraysseix, J. Pach, and R. Pollack. How to draw a planar
graph on a grid. Combinatorica, 10(1):41-51, 1990. (ref: p. 94)
[DFPS00] A. Deza, K. Fukuda, D. Pasechnik, and M. Sato. Generating
vertices with symmetries. In Proc. of the 5th Workshop on Al-
gorithms and Computation, Tokyo University, pages 1-8, 2000.
(ref: p. 106)
[DG99] S. Dasgupta and A. Gupta. An elementary proof of the
Johnson-Lindenstrauss lemma. Technical Report TR-99-06,
Intl. Comput. Sci. Inst., Berkeley, CA, 1999. (ref: p. 361)
[DGK63] L. Danzer, B. Grünbaum, and V. Klee. Helly's theorem and its
relatives. In Convexity, volume 7 of Proc. Symp. Pure Math.,
pages 101-180. American Mathematical Society, Providence,
1963. (refs: pp. 8, 12, 13, 327)
[Dil50] R. P. Dilworth. A decomposition theorem for partially ordered
sets. Annals of Math., 51:161-166, 1950. (ref: p. 295)
[Dir42] G. L. Dirichlet. Verallgemeinerung eines Satzes aus der Lehre
von Kettenbrüchen nebst einigen Anwendungen auf die Theorie
der Zahlen. In Bericht über die zur Bekanntmachung geeigneten
Verhandlungen der Königlich Preussischen Akademie der Wis-
senschaften zu Berlin, pages 93-95. 1842. Reprinted in L.
Kronecker (editor): G. L. Dirichlet's Werke Vol. I, G. Reimer,
Berlin 1889, reprinted Chelsea, New York 1969. (ref: p. 21)
[Dir50] G. L. Dirichlet. Über die Reduktion der positiven quadratischen
Formen mit drei unbestimmten ganzen Zahlen. J. Reine Angew.
Math., 40:209-227, 1850. (ref: p. 120)
[DL97] M. M. Deza and M. Laurent. Geometry of Cuts and Metrics.
Algorithms and Combinatorics 15. Springer-Verlag, Berlin etc.,
1997. (refs: pp. 107, 357)
[Dol'92] V. L. Dol'nikov. A generalization of the ham sandwich theorem.
Mat. Zametki, 52(2):27-37, 1992. In Russian; English transla-
tion in Math. Notes 52,2:771-779, 1992. (ref: p. 16)
[Doi73] J.-P. Doignon. Convexity in cristallographical lattices. J. Ge-
ometry, 3:71-85, 1973. (ref: p. 295)
[DR50] A. Dvoretzky and C. A. Rogers. Absolute and unconditional
convergence in normed linear spaces. Proc. Natl. Acad. Sci.
USA, 36:192-197, 1950. (ref: p. 352)
[DS65] H. Davenport and A. Schinzel. A combinatorial problem con-
nected with differential equations. Amer. J. Math., 87:684-689,
1965. (ref: p. 175)
[Dud78] R. M. Dudley. Central limit theorems for empirical measures.
Ann. Probab., 6:899-929, 1978. (ref: p. 250)
[DV02] H. Djidjev and I. Vrt'o. An improved lower bound for crossing
numbers. In Proc. Graph Drawing 2001. Springer, Berlin etc.,
2002. In press. (ref: p. 57)
[Dvo59] A. Dvoretzky. A theorem on convex bodies and applications to
Banach spaces. Proc. Natl. Acad. Sci. USA, 45:223-226, 1959.
Errata. Ibid. 1554. (ref: p. 352)
[Dvo61] A. Dvoretzky. Some results on convex bodies and Banach
spaces. In Proc. Int. Symp. Linear Spaces 1960, pages 123-
160. Jerusalem Academic Press, Jerusalem; Pergamon, Oxford,
1961. (refs: pp. 346, 352)
[Dwo97] C. Dwork. Positive applications of lattices to cryptography. In
Proc. 22nd International Symposium on Mathematical Foun-
dations of Computer Science (Lect. Notes Comput. Sci. 1295),
pages 44-51. Springer, Berlin, 1997. (ref: p. 26)
[Eck85] J. Eckhoff. An upper-bound theorem for families of convex sets.
Geom. Dedicata, 19:217-227, 1985. (ref: p. 197)
[Eck93] J. Eckhoff. Helly, Radon and Carathéodory type theorems.
In P. M. Gruber and J. M. Wills, editors, Handbook of Convex
Geometry. North-Holland, Amsterdam, 1993. (refs: pp. 8, 12,
13)
[Ede89] H. Edelsbrunner. The upper envelope of piecewise linear func-
tions: Tight complexity bounds in higher dimensions. Discrete
Comput. Geom., 4:337-343, 1989. (ref: p. 186)
[Ede98] H. Edelsbrunner. Geometry of modeling biomolecules. In P. K.
Agarwal, L. E. Kavraki, and M. Mason, editors, Proc. Work-
shop Algorithmic Found. Robot. A. K. Peters, Natick, MA, 1998.
(ref: p. 122)
[Edm65] J. Edmonds. Maximum matching and a polyhedron with 0,1-
vertices. J. Res. National Bureau of Standards (B), 69:125-130,
1965. (ref: p. 294)
[EE94] Gy. Elekes and P. Erdős. Similar configurations and pseudo
grids. In K. Böröczky et al., editors, Intuitive Geometry. Pro-
ceedings of the 3rd International Conference Held in Szeged,
Hungary, From 2 To 7 September, 1991, Colloq. Math. Soc.
János Bolyai. 63, pages 85-104. North-Holland, Amsterdam,
1994. (refs: pp. 47, 51)
[EFPR93] P. Erdős, Z. Füredi, J. Pach, and I. Ruzsa. The grid revisited.
Discrete Math., 111:189-196, 1993. (ref: p. 47)
[EGS90] H. Edelsbrunner, L. Guibas, and M. Sharir. The complexity
of many cells in arrangements of planes and related problems.
Discrete Comput. Geom., 5:197-216, 1990. (ref: p. 46)
[EHP89] P. Erdos, D. Hickerson, and J. Pach. A problem of Leo Moser
about repeated distances on the sphere. Amer. Math. Mon.,
96:569-575, 1989. (ref: p. 45)
[EKZ01] D. Eppstein, G. Kuperberg, and G. M. Ziegler. Fat 4-polytopes
and fatter 3-spheres. Manuscript, TU Berlin, 2001. (ref: p. 107)
[Ele86] Gy. Elekes. A geometric inequality and the complexity of com-
puting the volume. Discrete Comput. Geom., 1:289-292, 1986.
(refs: pp. 320, 322)
[Ele97] Gy. Elekes. On the number of sums and products. Acta Arith.,
81(4):365-367, 1997. (ref: p. 50)
[Ele99] Gy. Elekes. On the number of distinct distances and certain
algebraic curves. Period. Math. Hung., 38(3):173-177, 1999.
(ref: p. 48)
[Ele01] Gy. Elekes. Sums versus products in number theory, algebra
and Erdos geometry. In G. Halasz et al., editors, Paul Erdos
and His Mathematics. J. Bolyai Math. Soc., Budapest, 2001. In
press. (refs: pp. 47, 48, 49, 54)
[ELSS73] P. Erdos, L. Lovasz, A. Simmons, and E. Straus. Dissection
graphs of planar point sets. In J. N. Srivastava, editor, A Sur-
vey of Combinatorial Theory, pages 139-154. North-Holland,
Amsterdam, Netherlands, 1973. (refs: pp. 269, 276)
[Enf69] P. Enflo. On the nonexistence of uniform homeomorphisms
between Lp-spaces. Ark. Mat., 8:103-105, 1969. (ref: p. 372)
[EOS86] H. Edelsbrunner, J. O'Rourke, and R. Seidel. Constructing
arrangements of lines and hyperplanes with applications. SIAM
J. Comput., 15:341-363, 1986. (ref: p. 151)
[EP71] P. Erdos and G. Purdy. Some extremal problems in geometry.
J. Combin. Theory, 10(3):246-252, 1971. (ref: p. 50)
[Epp95] D. Eppstein. Dynamic Euclidean minimum spanning trees and
extrema of binary functions. Discrete Comput. Geom., 13:111-
122, 1995. (ref: p. 124)
[Epp98] D. Eppstein. Geometric lower bounds for parametric matroid
optimization. Discrete Comput. Geom., 20:463-476, 1998. (ref:
p. 271)
[ER00] Gy. Elekes and L. Rónyai. A combinatorial problem on poly-
nomials and rational functions. J. Combin. Theory Ser. B,
89(1):1-20, 2000. (ref: p. 48)
[Erd46] P. Erdos. On a set of distances of n points. Amer. Math.
Monthly, 53:248-250, 1946. (refs: pp. 44, 45, 53, 54, 68)
[Erd60] P. Erdos. On sets of distances of n points in Euclidean space.
Publ. Math. Inst. Hungar. Acad. Sci., 5:165-169, 1960. (ref:
p. 45)
[ES35] P. Erdos and G. Szekeres. A combinatorial problem in geometry.
Compositio Math., 2:463-470, 1935. (refs: pp. 32, 33)
[ES63] P. Erdos and H. Sachs. Regular graphs with given girth and
minimal number of knots (in German). Wiss. Z. Martin-Luther-
Univ. Halle-Wittenberg, Math.-Naturwiss. Reihe, 12:251-258,
1963. (ref: p. 368)
[ES83] P. Erdos and M. Simonovits. Supersaturated graphs and hy-
pergraphs. Combinatorica, 3:181-192, 1983. (ref: p. 215)
[ES96] H. Edelsbrunner and N. R. Shah. Incremental topological flip-
ping works for regular triangulations. Algorithmica, 15:223-241,
1996. (ref: p. 121)
[ES00] A. Efrat and M. Sharir. On the complexity of the union of
fat objects in the plane. Discrete Comput. Geom., 23:171-189,
2000. (ref: p. 194)
[ESS93] H. Edelsbrunner, R. Seidel, and M. Sharir. On the zone theorem
for hyperplane arrangements. SIAM J. Comput., 22(2):418-429,
1993. (ref: p. 151)
[EVW97] H. Edelsbrunner, P. Valtr, and E. Welzl. Cutting dense point
sets in half. Discrete Comput. Geom., 17(3):243-255, 1997.
(refs: pp. 270, 273)
[EW85] H. Edelsbrunner and E. Welzl. On the number of line sepa-
rations of a finite set in the plane. Journal of Combinatorial
Theory Ser. A, 38:15-29, 1985. (ref: p. 269)
[EW86] H. Edelsbrunner and E. Welzl. Constructing belts in two-
dimensional arrangements with applications. SIAM J. Com-
put., 15:271-284, 1986. (ref: p. 75)
[Far94] G. Farkas. Applications of Fourier's mechanical principle (in
Hungarian). Math. Termes. Ertesito, 12:457-472, 1893/94. Ger-
man translation in Math. Nachr. Ungarn 12:1-27, 1895. (ref:
p. 8)
[Fei00] U. Feige. Approximating the bandwidth via volume respecting
embeddings. J. Comput. Syst. Sci., 60:510-539, 2000. (ref:
p. 396)
[Fel97] S. Felsner. On the number of arrangements of pseudolines.
Discrete Comput. Geom., 18:257-267, 1997. (ref: p. 139)
[FH92] Z. Füredi and P. Hajnal. Davenport-Schinzel theory of matri-
ces. Discrete Math., 103:233-251, 1992. (ref: p. 177)
[Fie73] M. Fiedler. Algebraic connectivity of graphs. Czechosl. Math.
J., 23(98):298-305, 1973. (ref: p. 381)
[FKS89] J. Friedman, J. Kahn, and E. Szemeredi. On the second eigen-
value of random regular graphs. In Proceedings of the Twenty
First Annual ACM Symposium on Theory of Computing, pages
587-598, 1989. (ref: p. 381)
[FLM77] T. Figiel, J. Lindenstrauss, and V. D. Milman. The dimension of
almost spherical sections of convex bodies. Acta Math., 139:53-
94, 1977. (refs: pp. 336, 346, 348, 352, 353)
[FR01] P. Frankl and V. Rodl. Extremal problems on set systems.
Random Structures and Algorithms, 2001. In press. (refs:
pp. 226, 227)
[Fre73] G. A. Freiman. Foundations of a Structural Theory of Set Addi-
tion. Translations of Mathematical Monographs. Vol. 37. Amer-
ican Mathematical Society, Providence, RI, 1973. (ref: p. 47)
[Fre76] M. L. Fredman. How good is the information theory bound in
sorting? Theor. Comput. Sci., 1:355-361, 1976. (ref: p. 308)
[Fri91] J. Friedman. On the second eigenvalue and random walks in
random d-regular graphs. Combinatorica, 11:331-362, 1991.
(ref: p. 381)
[FS01] U. Feige and G. Schechtman. On the optimality of the random
hyperplane rounding technique for MAXCUT. In Proc. 33rd
Annual ACM Symposium on Theory of Computing, 2001. (ref:
p. 384)
[Ful70] D. R. Fulkerson. The perfect graph conjecture and pluperfect
graph theorem. In R. C. Bose et al., editors, Proc. of the Second
Chapel Hill Conference on Combinatorial Mathematics and Its
Applications, pages 171-175. Univ. of North Carolina, Chapel
Hill, North Carolina, 1970. (ref: p. 293)
[Für96] Z. Füredi. New asymptotics for bipartite Turán numbers. J.
Combin. Theory Ser. A, 75:141-144, 1996. (ref: p. 68)
[Gal56] D. Gale. Neighboring vertices on a convex polyhedron. In
H. W. Kuhn and A. W. Tucker, editors, Linear Inequalities and
Related Systems, Annals of Math. Studies 38, pages 255-263.
Princeton University Press, Princeton, 1956. (ref: p. 114)
[Gal63] D. Gale. Neighborly and cyclic polytopes. In V. Klee, editor,
Convexity, volume 7 of Proc. Symp. Pure Math., pages 225-232.
American Mathematical Society, 1963. (ref: p. 98)
[GGL95] R. L. Graham, M. Grotschel, and L. Lovasz, editors. Handbook
of Combinatorics. North-Holland, Amsterdam, 1995. (refs:
pp. viii, 85)
[GJ00] E. Gawrilow and M. Joswig. polymake: a framework
for analyzing convex polytopes. In G. Kalai and G. M.
Ziegler, editors, Polytopes-Combinatorics and Computation,
pages 43-74. Birkhauser, Basel, 2000. Software available
at http://www.math.tu-berlin.de/diskregeom/polymake/.
(ref: p. 85)
[GKS99] R. J. Gardner, A. Koldobsky, and T. Schlumprecht. An analytic
solution to the Busemann-Petty problem on sections of convex
bodies. Annals of Math., 149:691-703, 1999. (ref: p. 314)
[GKV94] L. Gargano, J. Korner, and U. Vaccaro. Capacities: From in-
formation theory to extremal set theory. J. Combin. Theory,
Ser. A, 68(2):296-316, 1994. (ref: p. 309)
[GL87] P. M. Gruber and C. G. Lekkerkerker. Geometry of Numbers.
North-Holland, Amsterdam, 2nd edition, 1987. (ref: p. 20)
[GLS88] M. Grotschel, L. Lovasz, and A. Schrijver. Geometric Al-
gorithms and Combinatorial Optimization, volume 2 of Algo-
rithms and Combinatorics. Springer-Verlag, Berlin etc., 1988.
2nd edition 1993. (refs: pp. 24, 26, 293, 321, 322, 327, 381)
[Glu89] E. D. Gluskin. Extremal properties of orthogonal paral-
lelepipeds and their applications to the geometry of Banach
spaces. Math. USSR Sbornik, 64(1):85-96, 1989. (ref: p. 320)
[GM90] H. Gazit and G. L. Miller. Planar separators and the Euclidean
norm. In Proc. 1st Annu. SIGAL Internat. Sympos. Algorithms.
Information Processing Society of Japan, Springer-Verlag, Au-
gust 1990. (ref: p. 57)
[GNRS99] A. Gupta, I. Newman, Yu. Rabinovich, and A. Sinclair. Cuts,
trees and ℓ1-embeddings of graphs. In Proc. 40th IEEE Sym-
posium on Foundations of Computer Science, pages 399-409,
1999. Also submitted to Combinatorica. (ref: p. 396)
[GO97] J. E. Goodman and J. O'Rourke, editors. Handbook of Discrete
and Computational Geometry. CRC Press LLC, Boca Raton,
FL, 1997. (refs: pp. viii, 85)
[Goo97] J. E. Goodman. Pseudoline arrangements. In J. E. Goodman
and J. O'Rourke, editors, Handbook of Discrete and Computa-
tional Geometry, pages 83-110. CRC Press LLC, Boca Raton,
FL, 1997. (refs: pp. 136, 139)
[Gor88] Y. Gordon. Gaussian processes and almost spherical sections of
convex bodies. Ann. Probab., 16:180-188, 1988. (ref: p. 353)
[Gow98] W. T. Gowers. A new proof of Szemeredi's theorem for
arithmetic progressions of length four. Geom. Funct. Anal.,
8(3):529-551, 1998. (refs: pp. 48, 227)
[GP84] J. E. Goodman and R. Pollack. On the number of k-subsets
of a set of n points in the plane. J. Combin. Theory Ser. A,
36:101-104, 1984. (ref: p. 145)
[GP86] J. E. Goodman and R. Pollack. Upper bounds for configurations
and polytopes in R^d. Discrete Comput. Geom., 1:219-227, 1986.
(ref: p. 140)
[GP93] J. E. Goodman and R. Pollack. Allowable sequences and or-
der types in discrete and computational geometry. In J. Pach,
editor, New Trends in Discrete and Computational Geometry,
volume 10 of Algorithms and Combinatorics, pages 103-134.
Springer, Berlin etc., 1993. (ref: p. 220)
[GPS90] J. E. Goodman, R. Pollack, and B. Sturmfels. The intrinsic
spread of a configuration in R^d. J. Amer. Math. Soc., 3:639-
651, 1990. (ref: p. 138)
[GPW93] J. E. Goodman, R. Pollack, and R. Wenger. Geometric transver-
sal theory. In J. Pach, editor, New Trends in Discrete and
Computational Geometry, volume 10 of Algorithms and Com-
binatorics, pages 163-198. Springer, Berlin etc., 1993. (ref:
p. 262)
[GPW96] J. E. Goodman, R. Pollack, and R. Wenger. Bounding the num-
ber of geometric permutations induced by k-transversals. J.
Combin. Theory Ser. A, 75:187-197, 1996. (ref: p. 220)
[GPWZ94] J. E. Goodman, R. Pollack, R. Wenger, and T. Zamfirescu.
Arrangements and topological planes. Amer. Math. Monthly,
101(10):866-878, 1994. (ref: p. 136)
[Gro56] A. Grothendieck. Sur certaines classes de suites dans les espaces
de Banach et le theoreme de Dvoretzky-Rogers. Bol. Soc. Math.
Sao Paulo, 8:81-110, 1956. (ref: p. 352)
[Gro98] M. Gromov. Metric Structures for Riemannian and non-
Riemannian spaces. Birkhauser, Basel, 1998. (ref: p. 336)
[GRS97] Y. Gordon, S. Reisner, and C. Schütt. Umbrellas and polytopal
approximation of the Euclidean ball. J. Approximation Theory,
90(1):9-22, 1997. Erratum ibid. 95:331, 1998. (ref: p. 321)
[Grü60] B. Grünbaum. Partitions of mass-distributions and of convex
bodies by hyperplanes. Pac. J. Math., 10:1257-1267, 1960.
(ref: p. 308)
[Grü67] B. Grünbaum. Convex Polytopes. John Wiley & Sons, New
York, NY, 1967. (refs: pp. 85, 114)
[Grü72] B. Grünbaum. Arrangements and Spreads. Regional Conf. Ser.
Math. American Mathematical Society, Providence, RI, 1972.
(ref: p. 128)
[Gru93] P. M. Gruber. Geometry of numbers. In P. M. Gruber and J. M.
Wills, editors, Handbook of Convex Geometry (Vol. B), pages
739-763. North-Holland, Amsterdam, 1993. (ref: p. 20)
[Gup00] A. Gupta. Embedding tree metrics into low dimensional Eu-
clidean spaces. Discrete Comput. Geom., 24:105-116, 2000.
(ref: p. 393)
[GW93] P. M. Gruber and J. M. Wills, editors. Handbook of Convex Ge-
ometry (volumes A and B). North-Holland, Amsterdam, 1993.
(refs: pp. viii, 8, 85, 314, 320)
[GW95] M. X. Goemans and D. P. Williamson. Improved approximation
algorithms for maximum cut and satisfiability problems using
semidefinite programming. J. ACM, 42:1115-1145, 1995. (ref:
p. 384)
[Har66] L. H. Harper. Optimal numberings and isoperimetric problems
on graphs. J. Combin. Theory, 1:385-393, 1966. (ref: p. 336)
[Har79] H. Harborth. Konvexe Fünfecke in ebenen Punktmengen. Elem.
Math., 33:116-118, 1979. (ref: p. 37)
[Has97] J. Hastad. Some optimal inapproximability results. In Proc.
29th Annual ACM Symposium on Theory of Computing, pages
1-10, 1997. (ref: p. 384)
[Hoc96] D. Hochbaum, editor. Approximation Algorithms for NP-hard
Problems. PWS Publ. Co., Florence, Kentucky, 1996. (ref:
p. 236)
[Hor83] J. D. Horton. Sets with no empty convex 7-gons. Canad. Math.
Bull., 26:482-484, 1983. (ref: p. 37)
[HS86] S. Hart and M. Sharir. Nonlinearity of Davenport-Schinzel
sequences and of generalized path compression schemes. Com-
binatorica, 6:151-177, 1986. (refs: pp. 173, 175)
[HS94] D. Halperin and M. Sharir. New bounds for lower envelopes
in three dimensions, with applications to visibility in terrains.
Discrete Comput. Geom., 12:313-326, 1994. (refs: pp. 189,
192)
[HS95] D. Halperin and M. Sharir. Almost tight upper bounds for
the single cell and zone problems in three dimensions. Discrete
Comput. Geom., 14:385-410, 1995. (ref: p. 193)
[HW87] D. Haussler and E. Welzl. Epsilon-nets and simplex range
queries. Discrete Comput. Geom., 2:127-151, 1987. (refs:
pp. 68, 242, 254)
[IM98] P. Indyk and R. Motwani. Approximate nearest neighbors: To-
wards removing the curse of dimensionality. In Proc. 30th An-
nual ACM Symposium on Theory of Computing, pages 604-613,
1998. (ref: p. 361)
[Ind01] P. Indyk. Algorithmic applications of low-distortion embed-
dings. In Proc. 42nd IEEE Symposium on Foundations of Com-
puter Science, 2001. (refs: pp. 357, 361, 398)
[JL84] W. B. Johnson and J. Lindenstrauss. Extensions of Lipschitz
mappings into a Hilbert space. Contemp. Math., 26:189-206,
1984. (ref: p. 361)
[JM94] S. Jadhav and A. Mukhopadhyay. Computing a centerpoint of
a finite planar set of points in linear time. Discrete Comput.
Geom., 12:291-312, 1994. (ref: p. 16)
[Joh48] F. John. Extremum problems with inequalities as subsidiary
conditions. In Studies and Essays, presented to R. Courant on
his 60th birthday, January 8, 1948, pages 187-204. Interscience
Publishers, Inc., New York, N. Y., 1948. Reprinted in: J. Moser
(editor): Fritz John, Collected papers, Volume 2, Birkhauser,
Boston, Massachusetts, 1985, pages 543-560. (ref: p. 327)
[JSV01] M. Jerrum, A. Sinclair, and E. Vigoda. A polynomial-time
approximation algorithm for the permanent of a matrix with
non-negative entries. In Proc. 33rd Annu. ACM Symposium on
Theory of Computing, pages 712-721, 2001. Also available in
Electronic Colloquium on Computational Complexity, Report
TR00-079, http://eccc.uni-trier.de/eccc/. (ref: p. 322)
[Kai97] T. Kaiser. Transversals of d-intervals. Discrete Comput. Geom.,
18:195-203, 1997. (ref: p. 262)
[Kal84] G. Kalai. Intersection patterns of convex sets. Israel J. Math.,
48:161-174, 1984. (ref: p. 197)
[Kal86] G. Kalai. Characterization of f-vectors of families of convex
sets in R^d. II: Sufficiency of Eckhoff's conditions. J. Combin.
Theory, Ser. A, 41:167-188, 1986. (ref: p. 197)
[Kal88] G. Kalai. A simple way to tell a simple polytope from its graph.
J. Combin. Theory, Ser. A, 49(2):381-383, 1988. (ref: p. 93)
[Kal91] G. Kalai. The diameter of graphs of convex polytopes and f-
vector theory. In Applied Geometry and Discrete Mathematics
(The Victor Klee Festschrift), DIMACS Series in Discr. Math.
and Theoret. Comput. Sci. Vol. 4, pages 387-411. Amer. Math.
Soc., Providence, RI, 1991. (ref: p. 104)
[Kal92] G. Kalai. A subexponential randomized simplex algorithm. In
Proc. 24th Annu. ACM Sympos. Theory Comput., pages 475-
482, 1992. (ref: p. 93)
[Kal97] G. Kalai. Linear programming, the simplex algorithm and sim-
ple polytopes. Math. Program., 79B:217-233, 1997. (ref: p. 93)
[Kal01] G. Kalai. Combinatorics with a geometric flavor: Some exam-
ples. In Visions in Mathematics Towards 2000 (GAFA, special
volume), part II, pages 742-792. Birkhauser, Basel, 2001. (ref:
p. 204)
[Kan96] G. Kant. Drawing planar graphs using the canonical ordering.
Algorithmica, 16:4-32, 1996. (ref: p. 94)
[Kar01] Gy. Karolyi. Ramsey-remainder for convex sets and the Erdős-
Szekeres theorem. Discrete Applied Math., 109:163-175, 2001.
(ref: p. 33)
[Kat78] M. Katchalski. A Helly type theorem for convex sets. Can.
Math. Bull., 21:121-123, 1978. (ref: p. 13)
[KGT01] D. J. Kleitman, A. Gyárfás, and G. Tóth. Convex sets in
the plane with three of every four meeting. Combinatorica,
21(2):221-232, 2001. (ref: p. 258)
[Kha89] L. G. Khachiyan. Problems of optimal algorithms in convex
programming, decomposition and sorting (in Russian). In Yu. I.
Zhuravlev, editor, The Computer and Choice Problems, pages
161-205. Nauka, Moscow, 1989. (ref: p. 308)
[Kir03] P. Kirchberger. Über Tschebyschefsche Annäherungsmethoden.
Math. Ann., 57:509-540, 1903. (ref: p. 13)
[Kis68] S. S. Kislitsyn. Finite partially ordered sets and their corre-
sponding permutation sets (in Russian). Mat. Zametki, 4:511-
518, 1968. English translation in Math. Notes 4:798-801, 1968.
(ref: p. 308)
[KK95] J. Kahn and J.-H. Kim. Entropy and sorting. J. Assoc. Comput.
Machin., 51:390-399, 1995. (ref: p. 309)
[KL79] M. Katchalski and A. Liu. A problem of geometry in Rn. Proc.
Amer. Math. Soc., 75:284-288, 1979. (ref: p. 197)
[KL91] J. Kahn and N. Linial. Balancing extensions via Brunn-
Minkowski. Combinatorica, 11(4):363-368, 1991. (ref: p. 308)
[Kla92] M. Klazar. A general upper bound in extremal theory of se-
quences. Comment. Math. Univ. Carol., 33:737-746, 1992. (ref:
p. 176)
[Kla99] M. Klazar. On the maximum length of Davenport-Schinzel se-
quences. In R. Graham et al., editors, Contemporary Trends in
Discrete Mathematics (DIMACS Series in Discrete Mathemat-
ics and Theoretical Computer Science, Vol. 49), pages 169-178.
Amer. Math. Soc., Providence, RI, 1999. (ref: p. 176)
[Kla00] M. Klazar. The Füredi-Hajnal conjecture implies the Stanley-
Wilf conjecture. In D. Krob et al., editors, Formal Power
Series and Algebraic Combinatorics (Proceedings of the 12th
FPSAC conference, Moscow, June 25-30, 2000), pages 250-255.
Springer, Berlin etc., 2000. (ref: p. 177)
[Kle53] V. Klee. The critical set of a convex body. Amer. J. Math.,
75:178-188, 1953. (ref: p. 12)
[Kle64] V. Klee. On the number of vertices of a convex polytope. Cana-
dian J. Math, 16:701-720, 1964. (refs: pp. 103, 105)
[Kle89] R. Klein. Concrete and Abstract Voronoi Diagrams, volume
400 of Lecture Notes Comput. Sci. Springer-Verlag, Berlin etc.,
1989. (ref: p. 121)
[Kle97] J. Kleinberg. Two algorithms for nearest-neighbor search in
high dimension. In Proc. 29th Annu. ACM Sympos. Theory
Comput., pages 599-608, 1997. (ref: p. 361)
[KLL88] R. Kannan, A. K. Lenstra, and L. Lovasz. Polynomial factor-
ization and nonrandomness of bits of algebraic and some tran-
scendental numbers. Math. Comput., 50(181):235-250, 1988.
(ref: p. 26)
[KLMR98] L. E. Kavraki, J.-C. Latombe, R. Motwani, and P. Raghavan.
Randomized query processing in robot path planning. J. Com-
put. Syst. Sci., 57:50-60, 1998. (ref: p. 250)
[KLPS86] K. Kedem, R. Livne, J. Pach, and M. Sharir. On the union of
Jordan regions and collision-free translational motion amidst
polygonal obstacles. Discrete Comput. Geom., 1:59-71, 1986.
(ref: p. 194)
[KLS97] R. Kannan, L. Lovasz, and M. Simonovits. Random walks and
an O*(n^5) volume algorithm for convex bodies. Random Struc.
Algo., 11:1-50, 1997. (ref: p. 322)
[KM97a] G. Kalai and J. Matousek. Guarding galleries where every point
sees a large area. Israel J. Math, 101:125-140, 1997. (refs:
pp. 235, 250)
[KM97b] M. Karpinski and A. Macintyre. Polynomial bounds for VC
dimension of sigmoidal and general Pfaffian neural networks.
J. Syst. Comput. Sci., 54(1):169-176, 1997. (ref: p. 250)
[Koc94] M. Kochol. Constructive approximation of a ball by polytopes.
Math. Slovaca, 44(1):99-105, 1994. (ref: p. 324)
[Kol01] V. Koltun. Almost tight upper bounds for vertical decomposi-
tions in four dimensions. In Proc. 42nd IEEE Symposium on
Foundations of Computer Science, 2001. (ref: p. 162)
[Kor73] J. Korner. Coding of an information source having ambigu-
ous alphabet and the entropy of graphs. In Inform. The-
ory, statist. Decision Funct., Random Processes; Transact. 6th
Prague Conf. 1971, pages 411-425, 1973. (ref: p. 309)
[KP01] V. Kaibel and M. E. Pfetsch. Computing the face lattice of
a polytope from its vertex-facet incidences. Technical Report,
Inst. für Mathematik, TU Berlin, 2001. (ref: p. 105)
[KPR93] P. Klein, S. Plotkin, and S. Rao. Excluded minors, network
decomposition, and multicommodity flow. In Proc. 25th Annual
ACM Symposium on the Theory of Computing, pages 682-690,
1993. (ref: p. 393)
[KPT01] Gy. Karolyi, J. Pach, and G. Toth. A modular version of the
Erdős-Szekeres theorem. Studia Mathematica Hungarica, 2001.
In press. (ref: p. 38)
[KPW92] J. Komlos, J. Pach, and G. Woeginger. Almost tight bounds
for ε-nets. Discrete Comput. Geom., 7:163-173, 1992. (ref:
p. 243)
[KRS96] J. Kollar, L. Ronyai, and T. Szabó. Norm-graphs and bipartite
Turan numbers. Combinatorica, 16(3):399-406, 1996. (ref:
p. 68)
[KS84] J. Kahn and M. Saks. Balancing poset extensions. Order, 1:113-
126, 1984. (ref: p. 308)
[KS96] J. Komlos and M. Simonovits. Szemeredi's regularity lemma
and its applications in graph theory. In D. Miklos et al., edi-
tors, Combinatorics, Paul Erdos Is Eighty, Vol. 2, pages 295-
352. Janos Bolyai Mathematical Society, Budapest, 1996. (ref:
p. 226)
[KST54] T. Kővári, V. Sos, and P. Turan. On a problem of
K. Zarankiewicz. Coll. Math., 3:50-57, 1954. (ref: p. 68)
[KT99] N. Katoh and T. Tokuyama. Lovasz's lemma for the three-
dimensional K-level of concave surfaces and its applications.
In Proc. 40th IEEE Symposium on Foundations of Computer
Science, pages 389-398, 1999. (ref: p. 271)
[KV94] M. Klazar and P. Valtr. Generalized Davenport-Schinzel se-
quences. Combinatorica, 14:463-476, 1994. (ref: p. 176)
[KV01] Gy. Karolyi and P. Valtr. Point configurations in d-space with-
out large subsets in convex position. Discrete Comput. Geom.,
2001. To appear. (ref: p. 33)
[KZ00] G. Kalai and G. M. Ziegler, editors. Polytopes-Combinatorics
and Computation. DMV-seminar Oberwolfach, Germany,
November 1997. Birkhauser, Basel, 2000. (ref: p. 85)
[Lar72] D. G. Larman. On sets projectively equivalent to the vertices of
a convex polytope. Bull. Lond. Math. Soc., 4:6-12, 1972. (ref:
p. 206)
[Lat91] J.-C. Latombe. Robot Motion Planning. Kluwer Academic Pub-
lishers, Boston, 1991. (ref: p. 122)
[Led01] M. Ledoux. The Concentration of Measure Phenomenon, vol-
ume 89 of Mathematical Surveys and Monographs. Amer. Math.
Soc., Providence, RI, 2001. (refs: pp. 336, 340)
[Lee82] D. T. Lee. On k-nearest neighbor Voronoi diagrams in the plane.
IEEE Trans. Comput., C-31:478-487, 1982. (ref: p. 122)
[Lee91] C. W. Lee. Winding numbers and the generalized lower-bound
conjecture. In J. E. Goodman, R. Pollack, and W. Steiger, edi-
tors, Computational Geometry: Papers from the DIMACS spe-
cial year, DIMACS Series in Discrete Mathematics and Theo-
retical Computer Science 6, pages 209-219. Amer. Math. Soc.,
1991. (ref: p. 280)
[Lei83] F. T. Leighton. Complexity issues in VLSI. MIT Press, Cam-
bridge, MA, 1983. (ref: p. 57)
[Lei84] F. T. Leighton. New lower bound techniques for VLSI. Math.
Systems Theory, 17:47-70, 1984. (ref: p. 56)
[Len83] H. W. Lenstra. Integer programming with a fixed number of
variables. Math. Oper. Res., 8:538-548, 1983. (ref: p. 24)
[Lev26] F. Levi. Die Teilung der projektiven Ebene durch Gerade
oder Pseudogerade. Ber. Math.-Phys. Kl. sächs. Akad. Wiss.
Leipzig, 78:256-267, 1926. (ref: p. 136)
[Lev51] P. Levy. Problemes concrets d'analyse fonctionelle. Gauthier
Villars, Paris, 1951. (ref: p. 340)
[Lin84] N. Linial. The information-theoretic bound is good for merging.
SIAM J. Comput., 13:795-801, 1984. (ref: p. 308)
[Lin92] J. Lindenstrauss. Almost spherical sections; their existence and
their applications. In Jahresbericht der DMV, Jubilaeumstag.,
100 Jahre DMV, Bremen/Dtschl. 1990, pages 39-61, 1992.
(refs: pp. 336, 352)
[LLL82] A. K. Lenstra, H. W. Lenstra, Jr., and L. Lovasz. Factoring
polynomials with rational coefficients. Math. Ann., 261:514-
534, 1982. (ref: p. 25)
[LLR95] N. Linial, E. London, and Yu. Rabinovich. The geometry of
graphs and some of its algorithmic applications. Combinatorica,
15:215-245, 1995. (refs: pp. 379, 380, 392, 400)
[LM75] D. Larman and P. Mani. Almost ellipsoidal sections and projec-
tions of convex bodies. Math. Proc. Camb. Philos. Soc., 77:529-
546, 1975. (ref: p. 353)
[LM93] J. Lindenstrauss and V. D. Milman. The local theory of normed
spaces and its applications to convexity. In P. M. Gruber and
J. M. Wills, editors, Handbook of Convex Geometry, pages 1149-
1220. North-Holland, Amsterdam, 1993. (refs: pp. 327, 336)
[LM00] N. Linial and A. Magen. Least-distortion Euclidean embeddings
of graphs: Products of cycles and expanders. J. Combin. Theory
Ser. B, 79:157-171, 2000. (ref: p. 380)
[LMN01] N. Linial, A. Magen, and N. Naor. Euclidean embeddings of
regular graphs-the girth lower bound. Geometric and Func-
tional Analysis, 2001. In press. (ref: p. 380)
[LMS94] C.-Y. Lo, J. Matousek, and W. L. Steiger. Algorithms for ham-
sandwich cuts. Discrete Comput. Geom., 11:433, 1994. (ref:
p. 16)
[LO96] J.-P. Laumond and M. H. Overmars, editors. Algorithms for
Robotic Motion and Manipulation. A. K. Peters, Wellesley, MA,
1996. (ref: p. 122)
[Lov] L. Lovasz. Semidefinite programs and combinatorial optimiza-
tion. In C. Linhares-Sales and B. Reed, editors, Recent Ad-
vances in Algorithmic Discrete Mathematics. Springer, Berlin
etc. To appear. (refs: pp. 293, 381)
[Lov71] L. Lovasz. On the number of halving lines. Annal. Univ. Scie.
Budapest. de Rolando Eotvos Nominatae, Sectio Math., 14:107-
108, 1971. (refs: pp. 269, 280)
[Lov72] L. Lovasz. Normal hypergraphs and the perfect graph conjec-
ture. Discrete Math., 2:253-267, 1972. (ref: p. 293)
[Lov74] L. Lovasz. Problem 206. Matematikai Lapok, 25:181, 1974.
(ref: p. 198)
[Lov86] L. Lovasz. An Algorithmic Theory of Numbers, Graphs and
Convexity. SIAM Regional Series in Applied Mathematics.
SIAM, Philadelphia, 1986. (refs: pp. 24, 25)
[Lov93] L. Lovasz. Combinatorial Problems and Exercises (2nd edition).
Akadémiai Kiadó, Budapest, 1993. (refs: pp. 235, 374)
[LP86] L. Lovasz and M. D. Plummer. Matching Theory, volume 29 of
Ann. Discrete Math. North-Holland, 1986. (ref: p. 235)
[LPS88] A. Lubotzky, R. Phillips, and P. Sarnak. Ramanujan graphs.
Combinatorica, 8:261-277, 1988. (refs: pp. 367, 382)
[LR97] M. Laczkovich and I. Ruzsa. The number of homothetic sub-
sets. In R. L. Graham and J. Nešetřil, editors, The Mathematics
of Paul Erdos, Vol. II, volume 14 of Algorithms and Combina-
torics, pages 294-302. Springer, Berlin etc., 1997. (ref: p. 47)
[LS02] N. Linial and M. Saks. The Euclidean distortion of complete
binary trees-An elementary proof. Discr. Comput. Geom.,
2002. To appear. (ref: p. 393)
[LUW95] F. Lazebnik, V. A. Ustimenko, and A. J. Woldar. A new series
of dense graphs of high girth. Bull. Amer. Math. Soc., New
Ser., 32(1):73-79, 1995. (ref: p. 367)
[LUW96] F. Lazebnik, V. A. Ustimenko, and A. J. Woldar. A characteri-
zation of the components of the graphs D(k, q). Discrete Math.,
157(1-3):271-283, 1996. (ref: p. 367)
[LW88] N. Linial, L. Lovasz, and A. Wigderson. Rubber bands, convex
embeddings and graph connectivity. Combinatorica, 8:91-102,
1988. (ref: p. 92)
[Mac50] A. M. Macbeath. A compactness theorem for affine equivalence-
classes of convex regions. Canad. J. Math, 3:54-61, 1950. (ref:
p. 321)
[Mar88] G. A. Margulis. Explicit group-theoretic constructions of com-
binatorial schemes and their application to the design of ex-
panders and concentrators (in Russian). Probl. Peredachi Inf.,
24(1):51-60, 1988. English translation: Probl. Inf. Transm. 24,
No.1, 39-46 (1988). (ref: p. 382)
[Mat90a] J. Matousek. Construction of ε-nets. Discrete Comput. Geom.,
5:427-448, 1990. (refs: pp. 68, 75)
[Mat90b] J. Matousek. Bi-Lipschitz embeddings into low-dimensional Eu-
clidean spaces. Comment. Math. Univ. Carolinae, 31:589-600,
1990. (ref: p. 368)
[Mat92] J. Matousek. Efficient partition trees. Discrete Comput. Geom.,
8:315-334, 1992. (ref: p. 69)
[Mat96a] J. Matousek. Note on the colored Tverberg theorem. J. Com-
bin. Theory Ser. B, 66:146-151, 1996. (ref: p. 205)
[Mat96b] J. Matousek. On the distortion required for embedding finite
metric spaces into normed spaces. Israel J. Math., 93:333-344,
1996. (refs: pp. 140, 367, 388)
[Mat97] J. Matousek. On embedding expanders into ℓp spaces. Israel J.
Math., 102:189-197, 1997. (ref: p. 379)
[Mat98] J. Matousek. On constants for cuttings in the plane. Discrete
Comput. Geom., 20:427-448, 1998. (ref: p. 75)
[Mat99a] J. Matousek. Geometric Discrepancy (An Illustrated Guide).
Springer-Verlag, Berlin, 1999. (ref: p. 243)
[Mat99b] J. Matousek. On embedding trees into uniformly convex Banach
spaces. Israel J. Math, 114:221-237, 1999. (ref: p. 393)
[Mat01] J. Matousek. A lower bound for weak epsilon-nets in high di-
mension. Discrete Comput. Geom., 2001. In press. (ref: p. 254)
[McM70] P. McMullen. The maximal number of faces of a convex poly-
tope. Mathematika, 17:179-184, 1970. (ref: p. 103)
[McM93] P. McMullen. On simple polytopes. Invent. Math., 113:419-
444, 1993. (ref: p. 105)
[McM96] P. McMullen. Weights on polytopes. Discrete Comput. Geom.,
15:363-388, 1996. (ref: p. 105)
[Mic98] D. Micciancio. The shortest vector in a lattice is hard to approx-
imate within some constants. In Proc. 39th IEEE Symposium
on Foundations of Computer Science, pages 92-98, 1998. (ref:
p. 25)
[Mil64] J. W. Milnor. On the Betti numbers of real algebraic varieties.
Proc. Amer. Math. Soc., 15:275-280, 1964. (ref: p. 135)
[Mil69] V. D. Milman. Spectrum of continuous bounded functions on
the unit sphere of a Banach space. Funct. Anal. Appl., 3:67-79,
1969. (refs: pp. 341, 348)
[Mil71] V. D. Milman. New proof of the theorem of Dvoretzky on sec-
tions of convex bodies. Funct. Anal. Appl., 5:28-37, 1971. (refs:
pp. 341, 348, 353)
[Mil98] V. D. Milman. Surprising geometric phenomena in high-
dimensional convexity theory. In A. Balog et al., editors, Eu-
ropean Congress of Mathematics (ECM), Budapest, Hungary,
July 22-26, 1996. Volume II, pages 73-91. Birkhauser, Basel,
1998. (refs: pp. 313, 321, 336)
[Min96] H. Minkowski. Geometrie der Zahlen. Teubner, Leipzig, 1896.
Reprinted by Johnson, New York, NY 1968. (refs: pp. 20, 300)
[Mne89] M. E. Mnev. The universality theorems on the classification
problem of configuration varieties and convex polytopes vari-
eties. In O. Y. Viro, editor, Topology and Geometry-Rohlin
Seminar, volume 1346 of Lecture Notes Math., pages 527-544.
Springer, Berlin etc., 1989. (ref: p. 138)
[Mor94] M. Morgenstern. Existence and explicit constructions of q + 1
regular Ramanujan graphs for every prime power q. J. Combin.
Theory, Ser. B, 62(1):44-62, 1994. (ref: p. 382)
[Mos52] L. Moser. On the different distances determined by n points.
Amer. Math. Monthly, 59:85-91, 1952. (ref: p. 45)
[MPS+94] J. Matousek, J. Pach, M. Sharir, S. Sifrony, and E. Welzl. Fat
triangles determine linearly many holes. SIAM J. Comput.,
23:154-169, 1994. (ref: p. 194)
[MS71] P. McMullen and G. C. Shephard. Convex Polytopes and the
Upper Bound Conjecture, volume 3 of Lecture Notes. Cambridge
University Press, Cambridge, England, 1971. (refs: pp. 85, 114)
[MS86] V. D. Milman and G. Schechtman. Asymptotic Theory of Finite
Dimensional Normed Spaces. Lecture Notes in Math. 1200.
Springer-Verlag, Berlin etc., 1986. (refs: pp. 300, 335, 336,
340, 346, 353, 361)
[MS00] W. Morris and V. Soltan. The Erdős-Szekeres problem on
points in convex position-a survey. Bull. Amer. Math. Soc.,
New Ser., 37(4):437-458, 2000. (ref: p. 32)
[MSW96] J. Matousek, M. Sharir, and E. Welzl. A subexponential bound
for linear programming. Algorithmica, 16:498-516, 1996. (refs:
pp. 94, 327)
[Mul93a] K. Mulmuley. Computational Geometry: An Introduction
Through Randomized Algorithms. Prentice Hall, Englewood
Cliffs, NJ, 1993. (refs: pp. 161, 162)
[Mul93b] K. Mulmuley. Dehn-Sommerville relations, upper bound the-
orem, and levels in arrangements. In Proc. 9th Annu. ACM
Sympos. Comput. Geom., pages 240-246, 1993. (ref: p. 280)
[Nar00] W. Narkiewicz. The Development of Prime Number Theory.
Springer, Berlin etc., 2000. (ref: p. 54)
[NPPS01] E. Nevo, J. Pach, R. Pinchasi, and M. Sharir. Lenses in ar-
rangements of pseudocircles and their applications. Discrete
Comput. Geom., 2001. In press. (ref: p. 271)
[NR01] I. Newman and Yu. Rabinovich. A lower bound on the dis-
tortion of embedding planar metrics into Euclidean space.
Manuscript, Computer Science Department, Univ. of Haifa;
submitted to Discrete Comput. Geom., 2001. (ref: p. 372)
[Nyk00] H. Nyklova. Almost empty convex polygons. KAM-DIMATIA
Series 498-2000 (technical report), Charles University, Prague,
2000. (ref: p. 39)
[OBS92] A. Okabe, B. Boots, and K. Sugihara. Spatial Tessellations:
Concepts and Applications of Voronoi Diagrams. John Wiley
& Sons, Chichester, UK, 1992. (ref: p. 120)
[OP49] O. A. Oleinik and I. B. Petrovskii. On the topology of real
algebraic surfaces (in Russian). Izv. Akad. Nauk SSSR, 13:389-
402, 1949. (ref: p. 135)
[OS94] S. Onn and B. Sturmfels. A quantitative Steinitz' theorem.
Beiträge zur Algebra und Geometrie / Contributions to Algebra
and Geometry, 35:125-129, 1994. (ref: p. 94)
[OT91] P. Orlik and H. Terao. Arrangements of Hyperplanes. Springer-
Verlag, Berlin etc., 1991. (ref: p. 129)
[OY85] C. O'Dunlaing and C. K. Yap. A "retraction" method for
planning the motion of a disk. J. Algorithms, 6:104-111, 1985. (ref:
p. 122)
[PA95] J. Pach and P. K. Agarwal. Combinatorial Geometry. John
Wiley & Sons, New York, NY, 1995. (refs: pp. viii, 20, 24, 44,
45, 50, 53, 56, 57, 92, 243)
[Pac98] J. Pach. A Tverberg-type result on multicolored simplices.
Comput. Geom.: Theor. Appl., 10:71-76, 1998. (refs: pp. 220,
226, 229)
[Pac99] J. Pach. Geometric graph theory. In J. D. Lamb et al., editors,
Surveys in Combinatorics. Proceedings of the 17th British com-
binatorial conference, University of Kent at Canterbury, UK,
1999, Lond. Math. Soc. Lect. Note Ser. 267, pages 167-200.
Cambridge University Press, 1999. (ref: p. 56)
[Pin02] R. Pinchasi. Gallai-Sylvester theorem for pairwise intersecting
unit circles. Discrete Comput. Geom., 2002. To appear. (ref:
p. 44)
[Pis89] G. Pisier. The Volume of Convex Bodies and Banach Space Ge-
ometry. Cambridge University Press, Cambridge, 1989. (refs:
pp. 315, 335, 336, 353, 361)
[Por02] A. Por. A partitioned version of the Erdős-Szekeres theorem.
Discrete Comput. Geom., 2002. To appear. (ref: p. 220)
[PP01] J. Pach and R. Pinchasi. On the number of balanced lines.
Discrete Comput. Geom., 25:611-628, 2001. (ref: p. 280)
[PR93] R. Pollack and M.-F. Roy. On the number of cells defined by a
set of polynomials. C. R. Acad. Sci. Paris, 316:573-577, 1993.
(ref: p. 135)
[PS89] J. Pach and M. Sharir. The upper envelope of piecewise linear
functions and the boundary of a region enclosed by convex
plates: combinatorial analysis. Discrete Comput. Geom., 4:291-
309, 1989. (ref: p. 186)
[PS92] J. Pach and M. Sharir. Repeated angles in the plane and related
problems. J. Combin. Theory Ser. A, 59:12-22, 1992. (refs:
pp. 46, 49, 50)
[PS98a] J. Pach and M. Sharir. On the number of incidences between
points and curves. Combinatorics, Probability, and Computing,
7:121-127, 1998. (refs: pp. 46, 49, 64)
[PS98b] J. Pach and J. Solymosi. Canonical theorems for convex sets.
Discrete Comput. Geom., 19:427-435, 1998. (ref: p. 220)
[PS01] J. Pach and J. Solymosi. Crossing patterns of segments. J.
Combin. Theory Ser. A, 96:316-325, 2001. (refs: pp. 223, 227)
[PSS88] R. Pollack, M. Sharir, and S. Sifrony. Separating two sim-
ple polygons by a sequence of translations. Discrete Comput.
Geom., 3:123-136, 1988. (ref: p. 176)
[PSS92] J. Pach, W. Steiger, and E. Szemerédi. An upper bound on the
number of planar k-sets. Discrete Comput. Geom., 7:109-123,
1992. (ref: p. 269)
[PSS96] J. Pach, F. Shahrokhi, and M. Szegedy. Applications of the
crossing number. Algorithmica, 16:111-117, 1996. (ref: p. 57)
[PSS01] J. Pach, I. Safruti, and M. Sharir. The union of cubes in
three dimensions. In Proc. 17th Annu. ACM Sympos. Com-
put. Geom., pages 19-28, 2001. (ref: p. 194)
[PST00] J. Pach, J. Spencer, and G. Tóth. New bounds for crossing
numbers. Discrete Comput. Geom., 24:623-644, 2000. (refs:
pp. 57, 58)
[PT97] J. Pach and G. Tóth. Graphs drawn with few crossings per
edge. Combinatorica, 17:427-439, 1997. (ref: p. 56)
[PT98] J. Pach and G. Tóth. A generalization of the Erdős-Szekeres
theorem to disjoint convex sets. Discrete Comput. Geom.,
19(3):437-445, 1998. (ref: p. 33)
[PT00] J. Pach and G. Tóth. Which crossing number is it anyway? J.
Combin. Theory Ser. B, 80:225-246, 2000. (ref: p. 58)
[Rad21] J. Radon. Mengen konvexer Körper, die einen gemeinsamen
Punkt enthalten. Math. Ann., 83:113-115, 1921. (ref: p. 12)
[Rad47] R. Rado. A theorem on general measure. J. London Math. Soc.,
21:291-300, 1947. (ref: p. 16)
[Rao99] S. Rao. Small distortion and volume respecting embeddings
for planar and Euclidean metrics. In Proc. 15th Annual ACM
Symposium on Comput. Geometry, pages 300-306, 1999. (refs:
pp. 393, 398)
[RBG01] L. Rónyai, L. Babai, and M. K. Ganapathy. On the number
of zero-patterns of a sequence of polynomials. J. Amer. Math.
Soc., 14(3):717-735 (electronic), 2001. (ref: p. 136)
[Rea68] J. R. Reay. An extension of Radon's theorem. Illinois J. Math.,
12:184-189, 1968. (ref: p. 204)
[RG97] J. Richter-Gebert. Realization Spaces of Polytopes. Lecture
Notes in Mathematics 1643. Springer, Berlin, 1997. (refs:
pp. 92, 94, 139)
[RG99] J. Richter-Gebert. The universality theorems for oriented ma-
troids and polytopes. In B. Chazelle et al., editors, Advances
in Discrete and Computational Geometry, Contemp. Math. 223,
pages 269-292. Amer. Math. Soc., Providence, RI, 1999. (refs:
pp. 94, 138, 139)
[Rou01a] J.-P. Roudneff. Partitions of points into simplices with k-
dimensional intersection. Part I: The conic Tverberg's theorem.
European J. Combinatorics, 22:733-743, 2001. (ref: p. 204)
[Rou01b] J.-P. Roudneff. Partitions of points into simplices with k-
dimensional intersection. Part II: Proof of Reay's conjecture in
dimensions 4 and 5. European J. Combinatorics, 22:745-765,
2001. (ref: p. 204)
[RS64] A. Rényi and R. Sulanke. Über die konvexe Hülle von n zufällig
gewählten Punkten II. Z. Wahrsch. Verw. Gebiete, 3:138-147,
1964. (ref: p. 328)
[Rud91] W. Rudin. Functional Analysis (2nd edition). McGraw-Hill,
New York, 1991. (ref: p. 8)
[Ruz94] I. Z. Ruzsa. Generalized arithmetical progressions and sumsets.
Acta Math. Hung., 65(4):379-388, 1994. (ref: p. 47)
[RVW00] O. Reingold, S. P. Vadhan, and A. Wigderson. Entropy waves,
the zig-zag graph product, and new constant-degree expanders
and extractors. In Proc. 41st IEEE Symposium on Foundations
of Computer Science, pages 3-13, 2000. (refs: pp. 381, 382)
[SA95] M. Sharir and P. K. Agarwal. Davenport-Schinzel Sequences and
Their Geometric Applications. Cambridge University Press,
Cambridge, 1995. (refs: pp. 168, 172, 173, 176, 181, 191)
[Sal75] G. T. Sallee. A Helly-type theorem for widths. In Geom. Metric
Lin. Spaces, Proc. Conf. East Lansing 1974, Lect. Notes Math.
490, pages 227-232. Springer, Berlin etc., 1975. (ref: p. 13)
[Sar91] K. Sarkaria. A generalized van Kampen-Flores theorem. Proc.
Amer. Math. Soc., 11:559-565, 1991. (ref: p. 368)
[Sar92] K. Sarkaria. Tverberg's theorem via number fields. Israel J.
Math., 79:317, 1992. (ref: p. 204)
[Sau72] N. Sauer. On the density of families of sets. Journal of Combi-
natorial Theory Ser. A, 13:145-147, 1972. (ref: p. 242)
[Sch01] L. Schläfli. Theorie der vielfachen Kontinuität. Denkschriften
der Schweizerischen naturforschenden Gesellschaft, 38:1-237,
1901. Written in 1850-51. Reprinted in Ludwig Schläfli, 1814-1895,
Gesammelte mathematische Abhandlungen, Birkhäuser,
Basel 1950. (ref: p. 85)
[Sch11] P. H. Schoute. Analytic treatment of the polytopes regularly
derived from the regular polytopes. Verhandelingen der Koninklijke
Akademie van Wetenschappen te Amsterdam, 11(3), 1911.
(ref: p. 85)
[Sch38] I. J. Schoenberg. Metric spaces and positive definite functions.
Trans. Amer. Math. Soc., 44:522-53, 1938. (ref: p. 357)
[Sch48] E. Schmidt. Die Brunn-Minkowski Ungleichung. Math.
Nachrichten, 1:81-157, 1948. (ref: p. 336)
[Sch86] A. Schrijver. Theory of Linear and Integer Programming.
Wiley-Interscience, New York, NY, 1986. (refs: pp. 8, 24, 25,
85)
[Sch87] C. P. Schnorr. A hierarchy of polynomial time lattice basis re-
duction algorithms. Theor. Comput. Sci., 53:201-224, 1987.
(ref: p. 25)
[Sch90] W. Schnyder. Embedding planar graphs on the grid. In Proc.
1st ACM-SIAM Sympos. Discrete Algorithms, pages 138-148,
1990. (ref: p. 94)
[Sch93] R. Schneider. Convex Bodies: The Brunn-Minkowski Theory,
volume 44 of Encyclopedia of Mathematics and Its Applications.
Cambridge University Press, Cambridge, 1993. (ref: p. 301)
[Sei91] R. Seidel. Small-dimensional linear programming and convex
hulls made easy. Discrete Comput. Geom., 6:423-434, 1991.
(ref: p. 105)
[Sei95] R. Seidel. The upper bound theorem for polytopes: an easy
proof of its asymptotic version. Comput. Geom. Theory Appl.,
5:115-116, 1995. (ref: p. 104)
[Sei97] R. Seidel. Convex hull computations. In J. E. Goodman and
J. O'Rourke, editors, Handbook of Discrete and Computational
Geometry, chapter 19, pages 361-376. CRC Press LLC, Boca
Raton, FL, 1997. (ref: p. 105)
[Sha94] M. Sharir. Almost tight upper bounds for lower envelopes in
higher dimensions. Discrete Comput. Geom., 12:327-345, 1994.
(ref: p. 192)
[Sha01] M. Sharir. The Clarkson-Shor technique revisited and
extended. In Proc. 17th Annu. ACM Sympos. Comput. Geom.,
pages 252-256, 2001. (refs: pp. 145, 146)
[She72] S. Shelah. A combinatorial problem, stability and order for
models and theories in infinitary languages. Pacific J. Math.,
41:247-261, 1972. (ref: p. 242)
[Sho91] P. W. Shor. Stretchability of pseudolines is NP-hard. In
P. Gritzman and B. Sturmfels, editors, Applied Geometry and
Discrete Mathematics: The Victor Klee Festschrift, volume 4 of
DIMACS Series in Discrete Mathematics and Theoretical Com-
puter Science, pages 531-554. AMS Press, 1991. (ref: p. 138)
[Sib81] R. Sibson. A brief description of natural neighbour interpola-
tion. In V. Barnet, editor, Interpreting Multivariate Data, pages
21-36. John Wiley & Sons, Chichester, 1981. (ref: p. 122)
[Sie89] C. L. Siegel. Lectures on the Geometry of Numbers. Notes by B.
Friedman. Rewritten by Komaravolu Chandrasekharan with the
assistance of Rudolf Suter. Springer-Verlag, Berlin etc., 1989.
(ref: p. 20)
[SST84] J. Spencer, E. Szemeredi, and W. T. Trotter. Unit distances
in the Euclidean plane. In B. Bollobas, editor, Graph Theory
and Combinatorics, pages 293-303. Academic Press, New York,
NY, 1984. (ref: p. 45)
[SST01] M. Sharir, S. Smorodinsky, and G. Tardos. An improved
bound for k-sets in three dimensions. Discrete Comput. Geom.,
26:195-204, 2001. (refs: pp. 270, 286)
[ST74] V. N. Sudakov and B. S. Tsirel'son. Extremal properties of
half-spaces for spherically invariant measures (in Russian). Zap.
Naucn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI),
41:14-24, 1974. Translation in J. Soviet. Math. 9:9-18, 1978.
(ref: p. 336)
[ST83] E. Szemerédi and W. Trotter, Jr. A combinatorial distinction
between Euclidean and projective planes. European J. Combin.,
4:385-394, 1983. (ref: p. 44)
[ST01] J. Solymosi and Cs. Tóth. Distinct distances in the plane.
Discrete Comput. Geom., 25:629-634, 2001. (refs: pp. 45, 61)
[Sta75] R. Stanley. The upper-bound conjecture and Cohen-Macaulay
rings. Stud. Appl. Math., 54:135-142, 1975. (ref: p. 104)
[Sta80] R. Stanley. The number of faces of a simplicial convex polytope.
Adv. Math., 35:236-238, 1980. (ref: p. 105)
[Sta86] R. P. Stanley. Two poset polytopes. Discrete Comput. Geom.,
1:9-23, 1986. (ref: p. 309)
[Ste26] J. Steiner. Einige Gesetze über die Theilung der Ebene und
des Raumes. J. Reine Angew. Math., 1:349-364, 1826. (ref:
p. 128)
[Ste16] E. Steinitz. Bedingt konvergente Reihen und konvexe Systeme
I; II; III. J. Reine Angew. Math., 143; 144; 146:128-175; 1-40;
1-52, 1913; 1914; 1916. (ref: p. 8)
[Ste22] E. Steinitz. Polyeder und Raumeinteilungen. Enzykl. Math.
Wiss., 3:1-139, 1922. Part 3AB12. (ref: p. 92)
[Ste85] H. Steinlein. Borsuk's antipodal theorem and its generaliza-
tions and applications: a survey. In A. Granas, editor, Methodes
topologiques en analyse nonlineaire, pages 166-235. Colloq.
Semin. Math. Super., Semin. Sci. OTAN (NATO Advanced
Study Institute) 95, Univ. de Montreal Press, Montreal, 1985.
(ref: p. 16)
[SU00] J.-R. Sack and J. Urrutia, editors. Handbook of Computational
Geometry. North-Holland, Amsterdam, 2000. (refs: pp. viii,
162)
[SV94] O. Sykora and I. Vrt'o. On VLSI layouts of the star graph and
related networks. Integration, The VLSI Journal, 17(1):83-93,
1994. (ref: p. 57)
[SW01] M. Sharir and E. Welzl. Balanced lines, halving triangles, and
the generalized lower bound theorem. In Proc. 17th Annu. ACM
Sympos. Comput. Geom., pages 315-318, 2001. (refs: pp. 280,
281)
[SY93] J. R. Sangwine-Yager. Mixed volumes. In P. M. Gruber and
J. M. Wills, editors, Handbook of Convex Geometry (Vol. A),
pages 43-71. North-Holland, Amsterdam, 1993. (refs: pp. 300,
301)
[Syl93] J. J. Sylvester. Mathematical question 11851. Educational
Times, 59:98, 1893. (ref: p. 44)
[Sze74] E. Szemeredi. On a problem of Davenport and Schinzel. Acta
Arithmetica, 25:213-224, 1974. (ref: p. 175)
[Sze78] E. Szemerédi. Regular partitions of graphs. In Problèmes
combinatoires et théorie des graphes, Orsay 1976, Colloq. int. CNRS
No. 260, pages 399-401. CNRS, Paris, 1978. (ref: p. 226)
[Sze97] L. Székely. Crossing numbers and hard Erdős problems in
discrete geometry. Combinatorics, Probability, and Computing,
6:353-358, 1997. (refs: pp. 44, 45, 56, 61)
[Tag96] B. Tagansky. A new technique for analyzing substructures in
arrangements of piecewise linear surfaces. Discrete Comput.
Geom., 16:455-479, 1996. (ref: p. 186)
[Tal93] G. Talenti. The standard isoperimetric theorem. In P. M. Gruber
and J. M. Wills, editors, Handbook of Convex Geometry
(Vol. A), pages 73-123. North-Holland, Amsterdam, 1993. (ref:
p. 336)
[Tal95] M. Talagrand. Concentration of measure and isoperimetric in-
equalities in product spaces. Publ. Math. I.H.E.S., 81:73-205,
1995. (ref: p. 336)
[Tam88] A. Tamir. Improved complexity bounds for center location
problems on networks by using dynamic data structures. SIAM
J. Discr. Math., 1:377-396, 1988. (ref: p. 169)
[Tan84] M. R. Tanner. Explicit concentrators from generalized n-gons.
SIAM J. Alg. Discr. Methods, 5(3):287-293, 1984. (ref: p. 381)
[Tar75] R. E. Tarjan. Efficiency of a good but not linear set union
algorithm. J. ACM, 22:215-225, 1975. (ref: p. 175)
[Tar95] G. Tardos. Transversals of 2-intervals, a topological approach.
Combinatorica, 15:123-134, 1995. (ref: p. 262)
[Tar01] G. Tardos. On distinct sums and distinct distances. Manuscript,
Renyi Institute, Budapest, 2001. (refs: pp. 45, 61, 63)
[Tho65] R. Thom. On the homology of real algebraic varieties (in
French). In S.S. Cairns, editor, Differential and Combinato-
rial Topology. Princeton Univ. Press, 1965. (ref: p. 135)
[Tit59] J. Tits. Sur la trialite et certains groupes qui s'en deduisent.
Publ. Math. I.H.E.S., 2:13-60, 1959. (ref: p. 367)
[TJ89] N. Tomczak-Jaegermann. Banach-Mazur Distances and Finite-
Dimensional Operator Ideals. Pitman Monographs and Surveys
in Pure and Applied Mathematics 38. J. Wiley, New York, 1989.
(refs: pp. 327, 353)
[Tót65] L. Fejes Tóth. Regular Figures (in German). Akadémiai Kiadó
Budapest, 1965. (ref: p. 322)
[Tót01a] Cs. Tóth. The Szemerédi-Trotter theorem in the complex
plane. Combinatorica, 2001. To appear. (ref: p. 44)
[Tót01b] G. Tóth. Point sets with many k-sets. Discrete Comput. Geom.,
26:187-194, 2001. (refs: pp. 269, 276)
[Tro92] W. T. Trotter. Combinatorics and Partially Ordered Sets:
Dimension Theory. Johns Hopkins Series in the Mathematical
Sciences. The Johns Hopkins University Press, 1992. (ref: p. 308)
[Tro95] W. T. Trotter. Partially ordered sets. In R. L. Graham,
M. Grotschel, and L. Lovasz, editors, Handbook of Combinatorics,
pages 433-480. North-Holland, Amsterdam, 1995. (ref:
p. 308)
[TT98] H. Tamaki and T. Tokuyama. How to cut pseudo-parabolas into
segments. Discrete Comput. Geom., 19:265-290, 1998. (refs:
pp. 70, 270)
[Tut60] W. T. Tutte. Convex representations of graphs. Proc. London
Math. Soc., 10(38):304-320, 1960. (ref: p. 92)
[TV93] H. Tverberg and S. Vrećica. On generalizations of Radon's
theorem and the ham sandwich theorem. European J. Combin.,
14:259-264, 1993. (ref: p. 204)
[TV98] G. Tóth and P. Valtr. Note on the Erdős-Szekeres theorem.
Discrete Comput. Geom., 19(3):457-459, 1998. (ref: p. 33)
[Tve66] H. Tverberg. A generalization of Radon's theorem. J. London
Math. Soc., 41:123-128, 1966. (ref: p. 203)
[Tve81] H. Tverberg. A generalization of Radon's theorem II. Bull.
Aust. Math. Soc., 24:321-325, 1981. (ref: p. 204)
[Urr00] J. Urrutia. Art gallery and illumination problems. In J.-R. Sack
and J. Urrutia, editors, Handbook of Computational Geometry,
pages 973-1027. North-Holland, 2000. (ref: p. 250)
[Val92a] P. Valtr. Convex independent sets and 7-holes in restricted
planar point sets. Discrete Comput. Geom., 7:135-152, 1992.
(refs: pp. 33, 37)
[Val92b] P. Valtr. Sets in R^d with no large empty convex subsets.
Discrete Appl. Math., 108:115-124, 1992. (ref: p. 37)
[Val94] P. Valtr. Planar point sets with bounded ratios of distances.
Doctoral Thesis, Mathematik, FU Berlin, 1994. (ref: p. 34)
[Val98] P. Valtr. Guarding galleries where no point sees a small area.
Israel J. Math., 104:1-16, 1998. (ref: p. 250)
[Val99a] P. Valtr. Generalizations of Davenport-Schinzel sequences.
In R. Graham et al., editors, Contemporary Trends in Discrete
Mathematics, volume 49 of DIMACS Series in Discrete
Mathematics and Theoretical Computer Science, pages 349-389.
Amer. Math. Soc., Providence, RI, 1999. (refs: pp. 176,
177)
[Val99b] P. Valtr. On galleries with no bad points. Discrete and
Computational Geometry, 21:193-200, 1999. (ref: p. 250)
[Val01] P. Valtr. A sufficient condition for the existence of large empty
convex polygons. Discrete Comput. Geom., 2001. To appear.
(ref: p. 38)
[VC71] V. N. Vapnik and A. Ya. Chervonenkis. On the uniform
convergence of relative frequencies of events to their probabilities.
Theory Probab. Appl., 16:264-280, 1971. (refs: pp. 242, 243)
[Vem98] S. Vempala. Random projection: a new approach to VLSI lay-
out. In Proc. 39th IEEE Symposium on Foundations of Com-
puter Science, pages 389-395, 1998. (ref: p. 397)
[Vin39] P. Vincensini. Sur une extension d'un théorème de M. J. Radon
sur les ensembles de corps convexes. Bull. Soc. Math. France,
67:115-119, 1939. (ref: p. 12)
[Vor08] G. M. Voronoi. Nouvelles applications des parametres conti-
nus a la theorie des formes quadratiques. deuxieme Memoire:
Recherches sur les parallelloedres primitifs. J. Reine Angew.
Math., 134:198-287, 1908. (ref: p. 120)
[VZ93] A. Vučić and R. Zivaljevic. Note on a conjecture of Sierksma.
Discrete Comput. Geom., 9:339-349, 1993. (ref: p. 205)
[Wag01] U. Wagner. On the number of corner cuts. Adv. Appl. Math.,
2001. In press. (ref: p. 271)
[War68] H. E. Warren. Lower bound for approximation by nonlinear
manifolds. Trans. Amer. Math. Soc., 133:167-178, 1968. (ref:
p. 135)
[Weg75] G. Wegner. d-collapsing and nerves of families of convex sets.
Arch. Math., 26:317-321, 1975. (ref: p. 197)
[Wel86] E. Welzl. More on k-sets of finite sets in the plane. Discrete
Comput. Geom., 1:95-100, 1986. (ref: p. 270)
[Wel88] E. Welzl. Partition trees for triangle counting and other range
searching problems. In Proc. 4th Annu. ACM Sympos. Comput.
Geom., pages 23-33, 1988. (ref: p. 242)
[Wel01] E. Welzl. Entering and leaving j-facets. Discrete Comput.
Geom., 25:351-364, 2001. (refs: pp. 104, 145, 280, 282)
[Wil99] A. J. Wilkie. A theorem of the complement and some new
o-minimal structures. Sel. Math., New Ser., 5(4):397-421, 1999.
(ref: p. 250)
[Wol97] T. Wolff. A Kakeya-type problem for circles. Amer. J. Math.,
119(5):985-1026, 1997. (ref: p. 44)
[WS88] A. Wiernik and M. Sharir. Planar realizations of nonlinear
Davenport-Schinzel sequences by segments. Discrete Comput.
Geom., 3:15-47, 1988. (refs: pp. 173, 176)
[WW93] W. Weil and J. A. Wieacker. Stochastic geometry. In P. M.
Gruber and J. M. Wills, editors, Handbook of Convex Geometry
(Vol. B), pages 1391-1438. North-Holland, Amsterdam, 1993.
(ref: p. 99)
[WW01] U. Wagner and E. Welzl. A continuous analogue of the upper
bound theorem. Discrete Comput. Geom., 26:205-219, 2001.
(ref: p. 114)
[Zas75] T. Zaslavsky. Facing up to Arrangements: Face-Count Formulas
for Partitions of Space by Hyperplanes, volume 154 of Memoirs
Amer. Math. Soc. American Mathematical Society, Providence,
RI, 1975. (ref: p. 128)
[Zie94] G. M. Ziegler. Lectures on Polytopes, volume 152 of Graduate
Texts in Mathematics. Springer-Verlag, Heidelberg, 1994. Cor-
rected and revised printing 1998. (refs: pp. viii, 78, 85, 86, 89,
90, 92, 93, 103, 105, 114, 129, 137)
[Ziv97] R. T. Zivaljevic. Topological methods. In J. E. Goodman and
J. O'Rourke, editors, Handbook of Discrete and Computational
Geometry, chapter 11, pages 209-224. CRC Press LLC, Boca
Raton, FL, 1997. (ref: p. 368)
[Ziv98] R. T. Zivaljevic. User's guide to equivariant methods in
combinatorics. II. Publ. Inst. Math. (Beograd) (N.S.), 64(78):107-132,
1998. (ref: p. 205)
[ZV90] R. T. Zivaljevic and S. T. Vrećica. An extension of the ham
sandwich theorem. Bull. London Math. Soc., 22:183-186, 1990.
(ref: p. 16)
[ZV92] R. T. Zivaljevic and S. Vrećica. The colored Tverberg's problem and
complexes of injective functions. J. Combin. Theory Ser. A,
61:309-318, 1992. (ref: p. 205)
Index

The index starts with notation composed of special symbols, and Greek let-
ters are listed next. Terms consisting of more than one word mostly appear
in several variants, for example, both "convex set" and "set, convex." An
entry like "armadillo, 19(8.4.1), 22(Ex. 4)" means that the term is located in
theorem (or definition, etc.) 8.4.1 on page 19 and in Exercise 4 on page 22.
For many terms, only the page with the term's definition is shown. Names or
notation used only within a single proof or remark are usually not indexed
at all. For important theorems, the index also points to the pages where they
are applied.

⌊x⌋ (floor function), xv
⌈x⌉ (ceiling function), xv
|X| (cardinality), xv
‖x‖ (Euclidean norm), xv
‖x‖_1 (ℓ_1-norm), 84
‖x‖_p (ℓ_p-norm), 357
‖x‖_∞ (maximum norm), 83, 357
‖f‖_Lip (Lipschitz norm), 356
‖x‖_Z (general norm), 344
‖x‖_K (norm induced by K), 344
Ḡ (graph complement), 290
(X choose k) (unordered k-tuples), xvi
F|_Y (restriction of a set system), 238
∂A (boundary), xv
X* (dual set), 80(5.1.3)
⟨x, y⟩ (scalar product), xv
A + B (Minkowski sum), 297
Γ(x) (gamma function), 312
Ω(·) (asymptotically at least), xv
Φ(G) (edge expansion), 373
Φ_d(n) = Σ_{i=0}^{d} (n choose i), 127(6.1)
Θ(·) (both O(·) and Ω(·)), xv
α(G) (independence number), 290
α(n) (inverse Ackermann), 173
χ(G) (chromatic number), 290
χ(G, w) (weighted chromatic number), 292
ε-approximation, 242
ε-net, 237(10.2.1), 237(10.2.2)
- size, 239(10.2.4)
- weak, 261(10.6.3)
- - for convex sets, 253(10.4.1)
ε-pushing, 102
η-dense set, 313
η-net, 314
- application, 323, 340, 343, 365, 368
η-separated set, 314
φ(d) (Euler's function), 53
λ_s(n) (maximum length of DS sequence), 167
ν(F) (packing number), 232
ν*(F) (fractional packing number), 233
ν_k(F) (simple k-packing number), 236(Ex. 4)
ω(G) (clique number), 290
ω(G, w) (weighted clique number), 291
π_F(·) (shatter function), 239
ψ(m, n) (m-decomposable DS-sequence, length), 178
ρ(Y_1, ..., Y_k) (hypergraph density), 223
σ(n) (lower envelope of segments, complexity), 166
τ(F) (transversal number), 232
τ*(F) (fractional transversal number), 232

A_k(n) (kth function in the Ackermann hierarchy), 173
A(n) (Ackermann function), 173
Ackermann function, 173
AffDep(a), 109
affine combination, 1
affine dependence, 2
affine Gale diagram, 112
affine hull, 1
affine mapping, 3
affine subspace, 1
affinely isomorphic arrangements, 133
AffVal(a), 109
Alexandrov-Fenchel inequality, 301
algebraic geometry, 131
algebraic number, 20(Ex. 4)
algebraic surface patches
- lower envelope, 189
- single cell, 191(7.7.2)
algebraic surfaces, arrangement, 130
- decomposition problem, 162
algorithm
- convex hull, 86, 105
- for ℓ_2-embedding, 378
- for centerpoint, 16
- for ham sandwich, 16
- for volume approximation, 315, 321
- Goemans-Williamson for MAXCUT, 384(Ex. 8)
- greedy, 235, 236(Ex. 4)
- LLL, 25
- simplex, 93
- sparsest cut, approximation, 391
almost convex set, 38, 39(Ex. 5)
almost orthogonal vectors, 362(Ex. 3)
t-almost spherical body, 341
almost spherical projection, 353
almost spherical section
- of a convex body, 345(14.4.5), 348(14.6.1)
- of a crosspolytope, 346, 353(Ex. 2)
- of a cube, 343
- of an ellipsoid, 342(14.4.1)
antichain, 295(Ex. 4)
approximation
- by a fraction, 19(2.1.3), 20(Ex. 4), 21(Ex. 5)
- of a sparsest cut, 391
- of edge expansion, 391
- of volume, 321
- - hardness, 315
ε-approximation, 242
arc, 54
arithmetic progression
- generalized, 47
- primes in, 53(4.2.4)
- Szemeredi's theorem, 227
arrangement
- affine isomorphism, 133
- central, 129
- isomorphism, 133
- many cells, 43, 46, 58(Ex. 3), 152(Ex. 3)
- of arbitrary sets, 130
- of hyperplanes, 126
- - number of cells, 127(6.1.1)
- - unbounded cells, 129(Ex. 2)
- of lines, 42
- of pseudolines, 132, 136
- of pseudosegments, 270
- of segments, 130
- realization space, 138
- simple, 127
- stretchable, 134, 137
- triangulation, 72(Ex. 2), 160
art gallery, 246, 250
atomic lattice, 89

B^n (unit ball in R^n), xv
B(x, r) (r-ball centered at x), xv
balanced line, 280
Balinski's theorem, 88
ball
- ℓ_1, see crosspolytope
- random point in, 312
- smallest enclosing, 13(Ex. 5)
- - uniqueness, 328(Ex. 4)
- volume, 311
Banach spaces, local theory, 329, 336
Banach-Mazur distance, 346
bandwidth, 397
basis (lattice), 21
- reduced, 25
Bezdek's conjecture, 44
bi-Lipschitz mapping, 356
binomial distribution, 240
bipartite graph, xvi
bisection width, 57
bisector, 121
Blaschke-Santaló inequality, 320
body, convex
- almost spherical, 341
- almost spherical section, 345(14.4.5), 348(14.6.1)
- approximation by ellipsoids, 325(13.4.1)
- lattice points in, 17-28
- volume approximation, 315, 321
Borsuk-Ulam theorem, application, 15, 205
bottom-vertex triangulation, 160, 161
brick set, 298
Brunn's inequality, 297(12.2.1)
- application, 306
Brunn-Minkowski inequality, 297(12.2.2)
- application, 331, 333
- dimension-free form, 301(Ex. 5)
Busemann-Petty problem, 313

C_n (Hamming cube), 335
cage representation, 93
canonical triangulation, see bottom-vertex triangulation
cap, 31
- spherical (volume), 333
Caratheodory's theorem, 6(1.2.3), 8
- application, 199, 200, 208, 319
- colorful, 199(8.2.1)
- - application, 202
Cauchy-Schwarz inequality, xvi
cell
- complexity
- - in R^2, 176
- - in higher dimensions, 191, 193
- of an arrangement, 43, 126, 130
24-cell, 95(Ex. 4)
center transversal theorem, 15(1.4.4)
centerpoint, 14(1.4.1), 210
centerpoint theorem, 14(1.4.2), 205
central arrangement, 129
chain, 295(Ex. 4)
chain polytope, 309
Chebyshev's inequality, 240
chirotope, 216
chromatic number, 290
circles
- cutting lemma, 72
- incidences, 45, 63(Ex. 1), 63(Ex. 2), 69, 70(Ex. 2), 73(Ex. 4)
- - application, 50(Ex. 8)
- touching (and planar graphs), 92
- unit
- - incidences, 42, 49(Ex. 1), 52(4.2.2), 58(Ex. 2), 70(Ex. 1)
- - Sylvester-like result, 44
circumradius, 317(13.2.2)
- approximation, 322
Clarkson's theorem on levels, 141(6.3.1)
clique number, 290
closed from above (or from below), 36
closest pair, computation, 122
coatomic lattice, 89
d-collapsible simplicial complex, 197
colored Helly theorem, 198(Ex. 2)
colored Tverberg theorem, 203(8.3.3)
- application, 213
- for r = 2, 205
colorful Caratheodory theorem, 199(8.2.1)
- application, 202
combination
- affine, 1
- convex, 6
combinatorially equivalent polytopes, 89(5.3.4)
combinatorics, polyhedral, 289
compact set, xvi
comparability graph, 294(Ex. 4), 309
complete graph, xvi
complex plane, point-line incidences, 44
complex, simplicial
- d-Leray, 197
- d-collapsible, 197
- d-representable, 197
- Van Kampen-Flores, 368
compression, path, 175
concentration
- for a Hamming cube, 335(14.2.3)
- for a sphere, 331(14.1.1)
- for an expander, 384(Ex. 7)
- for product spaces, 340
- Gaussian, 334(14.2.2)
- of projection, 359(15.2.2)
(p, q)-condition, 255
conductance, see edge expansion
cone
- convex, 9(Ex. 6), 201
- metric, 106, 377
- of squared Euclidean metrics, 377
cone(X), 201
conjecture
- 1/3-2/3, 308
- d-step, 93
- Bezdek's, 44
- Dirac-Motzkin, 50
- Furedi-Hajnal, 177
- Grunbaum-Motzkin, 261
- Hirsch, 93
- Kalai's, 204
- perfect graph, strong, 291
- perfect graph, weak, 291
- Purdy's, 48
- Reay's, 204
- Ryser's, 235
- Sierksma's, 205
- Stanley-Wilf, 177
connected graph, xvi
constant, lattice, 23
continuous motion argument, 284
continuous upper bound theorem, 114
conv(X) (convex hull), 5
convex body
- almost spherical, 341
- almost spherical section, 345(14.4.5), 348(14.6.1)
- approximation by ellipsoids, 325(13.4.1)
- lattice points in, 17-28
- volume approximation, 315, 321
convex combination, 6
convex cone, 9(Ex. 6), 201
convex function, xvi
convex hull, 5
- algorithm, 86, 105
- of random points, 99, 324
convex independent set, 30(3.1.1)
- in a grid, 34(Ex. 2)
- in higher dimension, 33
- size, 32
convex polygons, union complexity, 194
convex polyhedron, 83
convex polytope, 83
- almost spherical, number of facets, 343(14.4.2)
- integral, 295(Ex. 5)
- number of, 139(Ex. 3)
- realization, 139
- symmetric, number of facets, 347(14.4.2)
- volume
- - lower bound, 322
- - upper bound, 315(13.2.1)
convex polytopes, union complexity, 194
convex position, 30
convex set, 5(1.2.1)
convex sets
- in general position, 33
- intersection patterns, 197
- transversal, 256(10.5.1)
- upper bound theorem, 198
- VC-dimension, 238
copies, similar (counting), 47, 51(Ex. 10)
cr(G) (crossing number), 55
cr(X) (crossing number of the halving-edge graph), 283
criterion, Gale's, 97(5.4.4)
cross-ratio, 47
crossing (in a graph drawing), 54
crossing edges, pairwise, 176
crossing number, 54
- and forbidden subgraphs, 57
- odd, 58
- pairwise, 58
crossing number theorem, 55(4.3.1)
- application, 56, 61, 70, 283
- for multigraphs, 60(4.4.2)
crosspolytope, 83
- almost spherical section, 346, 353(Ex. 2)
- faces, 88
- projection, 86(Ex. 2)
cryptography, 26
cube, 83
- almost spherical section, 343
- faces, 88
- Hamming, 335
- - embedding into ℓ_2, 369
- - measure concentration, 335(14.2.3)
cubes, union complexity, 194
cup, 30
curve, moment, 97(5.4.1)
curves
- cutting into pseudosegments, 70, 271, 272(Ex. 5), 272(Ex. 6)
- incidences, 46
- lower envelope, 166, 187(7.6.1)
- single cell, 176
cut pseudometric, 383(Ex. 3), 391
cut, sparsest, approximation, 391
cutting, 66
- on the average, 68
cutting lemma, 66(4.5.3), 68
- application, 66, 261
- for circles, 72
- higher-dimensional, 160(6.5.3)
- lower bound, 71
- proof, 71, 74, 153, 162, 251(Ex. 4)
cutwidth, 57
cyclic polytope, 97(5.4.3)
- universality, 99(Ex. 3)
cylinders, union complexity, 194

𝒟 (duality), 81
𝒟_0 (duality), 78(5.1.1)
D(Δ) (defining set), 157
D-embedding, 356(15.1.1)
d-intervals, transversal, 262, 262(Ex. 2)
d-step conjecture, 93
Davenport-Schinzel sequence, 167
- asymptotics, 174
- decomposable, 178
- generalized, 174, 176
- realization by curves, 168(Ex. 1)
decomposition problem, 162
decomposition, vertical, 72(Ex. 3), 156
deep below, 35
defining set, 158
deg(x) (degree in halving-edge graph), 283
degree, xvi
Dehn-Sommerville relations, 103
Delaunay triangulation, 117, 120, 123(Ex. 5)
Delone, see Delaunay
dense set, 33
η-dense set, 313
density
- of a graph, local, 397
- of a hypergraph, 223
dependence, affine, 2
det Λ, 21
determinant
- and affine dependence, 3
- and orientation, 216
- and volume, 26(Ex. 1)
- of a lattice, 21
diagram
- Gale, 112
- power, 121
- Voronoi, 115
- - abstract, 121
- - complexity, 119(5.7.4), 122(Ex. 2), 123(Ex. 3), 192
- - farthest-point, 120
- - higher-order, 122
- wiring, 133
diameter
- and smallest enclosing ball, 13(Ex. 5)
- approximation, 322
- in ℓ_1, computation, 388(Ex. 1)
Dilworth's theorem, 294(Ex. 4)
dim(F) (VC-dimension), 238(10.2.3)
dimension
- of a polytope, 83
- Vapnik-Chervonenkis, see VC-dimension
- VC-dimension, 238(10.2.3)
Dirac-Motzkin conjecture, 50
Dirichlet tessellation, see Voronoi diagram
Dirichlet's theorem, 53
disk, largest empty, computation, 122
disks
- transversal, 231, 262(Ex. 1)
- union complexity, 124(Ex. 10), 193
distance, Banach-Mazur, 346
distances
- distinct, 42, 59(4.4.1)
- - bounds, 45
- unit, 42
- - and incidences, 49(Ex. 1)
- - for convex position, 45
- - in R^2, 45
- - in R3, 45
- - in R4, 45, 49(Ex.2)
- - lower bound, 52(4.2.2)
- - on a 2-sphere, 45
- - upper bound, 58(Ex.2)
distortion, 356(15.1.1)
distribution
- binomial, 240
- normal, 334, 352
divisible point, 204
domains of action, 120
dominated (pseudo)metric, 389
double-description method, 86
drawing (of graph), 54
- on a grid, 94
- rubber-band, 92
dual polytope, 90
dual set, 80(5.1.3)
dual set system, 245
dual shatter function, 242
duality
- of linear programming, 233(10.1.2)
- of planar graphs, 80
- transform, 78(5.1.1), 81(5.1.4)
Dvoretzky's theorem, 348(14.6.1), 352
Dvoretzky-Rogers lemma, 349(14.6.2), 352
E[·] (expectation), xv
E(⪯) (linear extensions), 303
e(⪯) = |E(⪯)|, 303
E(G) (edge set), xvi
e(Y1, ..., Yk) (number of edges on the Yi), 223
edge
- k-edge, 266
- halving, 266
- of a polytope, 87
- of an arrangement, 43, 130
edge expansion, 373
- approximation, 391
edges
- pairwise crossing, 176
- parallel, 176
Edmonds' matching polytope theorem, 294
efficient comparison theorem, 303(12.3.1)
eigenvalue, second, 374, 381
Elekes-Rónyai theorem, 48
elimination, Fourier-Motzkin, 86
ellipsoid
- almost spherical section, 342(14.4.1)
- definition, 325
- Löwner-John, 327
- smallest enclosing
- - computation, 327
- - uniqueness, 328(Ex.3)
ellipsoid method, 381
embedding
- distortion and dimension, 368
- - lower bound, 364(15.3.3)
- into ℓ1, 378, 379, 396
- into ℓ2, 399(Ex.5), 400(Ex.6), 400(Ex.7)
- - algorithm, 378
- - dimension reduction, 358(15.2.1), 362(Ex.3), 369(Ex.4)
- - lower bound, 366, 370(15.4.1), 375(15.5.1), 380
- - testability, 376(15.5.2)
- - upper bound, 388(Ex.3), 389(15.7.1)
- into ℓ∞
- - isometric, 385(15.6.1)
- - upper bound, 386(15.6.2)
- into ℓp, 379, 391, 398(Ex.2), 398(Ex.1)
- - isometric, 383(Ex.5), 383(Ex.2)
- into arbitrary normed space, 367
- isometric, 356
- of planar-graph metrics, 393
- of tree metrics, 392, 399(Ex.5), 400(Ex.6), 400(Ex.7), 400(Ex.9)
- volume-respecting, 396
D-embedding, 356(15.1.1)
entropy (graph), 309
envelope, lower
- of curves, 166, 187(7.6.1)
- of segments, 165
- - lower bound, 169(7.2.1)
- of simplices, 186
- of triangles, 183(7.5.1), 186
- superimposed projections, 192
epsilon net theorem, 239(10.2.4)
- application, 247, 251(Ex.4)
- if and only if form, 252
equivalent polytopes, combinatorially, 89(5.3.4)
equivalent radius, 297
Erdős-Sachs construction, 368(Ex.1)
Erdős-Simonovits theorem, 213(9.2.2)
Erdős-Szekeres lemma, 295(Ex.4)
Erdős-Szekeres theorem, 30(3.1.3)
- another proof, 32
- application, 35
- generalizations, 33
- positive-fraction, 220(9.3.3), 222(Ex.4)
- quantitative bounds, 32
Euler function, 53
excess, 154
Excl(H) (excluded minor class), 393
excluded minor, and metric, 393
expander, 373, 381-382
- measure concentration, 384(Ex.7)
exposed point, 95(Ex.9)
extension
- linear, 302
- of Lipschitz mapping, 361
extremal point, 87, 95(Ex.9), 95(Ex.10)
extreme (in arrangement), 145(Ex.1)
fk(P) (number of k-faces), 96
f-vector, 96
- of a representable complex, 197
face
- of a polytope, 86(5.3.1)
- of an arrangement, 126, 130
- popular, 151
face lattice, 88
facet
- k-facet, 265
- halving, 266
- - interleaving lemma, 277(11.3.1)
- - interleaving lemma, application, 279, 284, 287
- of a polytope, 87
- of an arrangement, 126
factorization, of polynomial, 26
Fano plane, 44
Farkas lemma, 7(1.2.5), 8, 9(Ex.7)
farthest-point Voronoi diagram, 120
fat objects, union complexity, 194
fat-lattice polytope, 107(Ex.1)
finite projective plane, 44, 66
first selection lemma, 208(9.1.1)
- application, 253
- proofs, 210
flag, 105, 129(Ex.6)
flat, 3
flattening lemma, 358(15.2.1)
- application, 366
- lower bound, 362(Ex.3), 369(Ex.4)
flipping (Delaunay triangulation), 120
forbidden
- permutation, 177
- short cycles, 362
- subgraph, 64
- - and crossing number, 57
- subhypergraph, 213(9.2.2)
- submatrix, 177
- subsequence, 174
forest, regular, 18(2.1.2)
form, linear, 27(Ex.4)
four-square theorem, Lagrange's, 28(Ex.1)
Fourier-Motzkin elimination, 86
fraction, approximation by, 19(2.1.3), 20(Ex.4), 21(Ex.5)
fractional Helly theorem, 195(8.1.1)
- application, 209, 211, 258
- for line transversals, 260(10.6.2)
fractional packing, 233
fractional transversal, 232
- bound, 256(10.5.2)
- for infinite systems, 235
Freiman's theorem, 47
Fréchet's embedding, 385
function
- Ackermann, 173
- convex, xvi
- dual shatter, 242
- Euler's, 53
- Lipschitz, 337
- - concentration, 337-341
- primitive recursive, 174
- rational, on Cartesian product, 48
- shatter, 239
functional, Laplace, 340
Füredi-Hajnal conjecture, 177
g(n) (number of distinct distances), 42
g-theorem, 104
g-vector, 104
Gale diagram, 112
Gale transform, 107
- application, 210, 282(Ex.6)
Gale's criterion, 97(5.4.4)
Gallai-type problem, 231
gallery, art, 246, 250
Gaussian distribution, 352
Gaussian integers, 52
Gaussian measure, 334
- concentration, 334(14.2.2)
general position, 3
- of convex sets, 33
generalized arithmetic progression, 47
generalized Davenport-Schinzel sequence, 174, 176
generalized lower bound theorem, 105
- application, 280
generalized triangle, 66(4.5.3)
genus, and VC-dimension, 251(Ex.6)
geometric graph, 56, 176
geometry
- of numbers, 17, 20
- real algebraic, 131
Geronimus polynomial, 380
girth, 362
- and ℓ2-embeddings, 380
Goemans-Williamson algorithm for MAXCUT, 384(Ex.8)
graded lattice, 89
graph, xvi
- bipartite, xvi
- comparability, 294(Ex.4), 309
- complete, xvi
- connected, xvi
- determines a simple polytope, 93
- entropy, 309
- geometric, 56, 176
- intersection, 139(Ex.2)
- isomorphism, xvi
- Kr,s-free, 65, 68
- Moore, 367
- of a polytope, 87
- - connectivity, 88, 95(Ex.8)
- perfect, 290-295
- regular, xvi
- shattering, 251(Ex.5)
- without short cycles, 362
graph drawing, 54
- on a grid, 94
- rubber-band, 92
Grassmannian, 339
greedy algorithm, 235, 236(Ex.4)
growth function, see shatter function
Grünbaum-Motzkin conjecture, 261
h(a) (height in poset), 305
H-polyhedron, 82(5.2.1)
H-polytope, 82(5.2.1)
h-vector, 102
Hadwiger's transversal theorem, 262
Hadwiger-Debrunner (p, q)-problem, 255
Hahn-Banach theorem, 8
half-space, 3
half-spaces, VC-dimension, 244(10.3.1)
Hall's marriage theorem, 235
halving edge, 266
halving facet, 266
- interleaving lemma, 277(11.3.1)
- - application, 279, 284, 287
ham-sandwich theorem, 15(1.4.3)
- application, 218
Hammer polytope, 348(Ex.1)
Hamming cube, 335
- embedding into ℓ2, 369
Harper's inequality, 335
HDd(p,q) ((p,q)-theorem), 256(10.5.1)
height, 304
Helly number, 12
Helly order, 263(Ex.4)
Helly's theorem, 10(1.3.2)
- application, 12(Ex.2), 13(Ex.5), 14(1.4.1), 82(Ex.9), 196(8.1.2), 200
- colored, 198(Ex.2)
- fractional, 195(8.1.1)
- - application, 209, 211, 258
- - for line transversals, 260(10.6.2)
Helly-type theorem, 261, 263(Ex.4)
- for containing a ray, 13(Ex.7)
- for lattice points, 295(Ex.7)
- for line transversals, 82(Ex.9)
- for separation, 13(Ex.10)
- for visibility, 13(Ex.8)
- for width, 12(Ex.4)
HFACd(n) (number of halving facets), 267
hierarchically well-separated tree, 398
high above, 35
higher-order Voronoi diagram, 122
Hilbert space, 357
Hirsch conjecture, 93
k-hole, 34
- modulo q, 38
Horton set, 36
- in Rd, 38
hull
- affine, 1
- convex, 5
- - algorithm, 86, 105
hypergraph, 211
hyperplane, 3
- linear, 109
hyperplane transversal, 259(10.6.1), 262
hyperplanes, arrangement, 126
I(Δ) (intersecting objects), 154
I(m, n) (number of point-line incidences), 41
I(P, L) (point-line incidences), 41
I1circ(m, n) (number of point-unit circle incidences), 42
Icirc(m, n) (number of point-circle incidences), 45
incidence matrix, 234
incidences, 41
- point-circle, 45, 63(Ex.1), 63(Ex.2), 69, 70(Ex.2), 73(Ex.4)
- - application, 50(Ex.8)
- point-curve, 46
- point-line, 41(4.1.1)
- - in the complex plane, 44
- - lower bound, 51(4.2.1)
- point-plane, 46
- point-unit circle, 42, 49(Ex.1), 52(4.2.2), 70(Ex.1)
- - upper bound, 58(Ex.2)
independence number, 290
independent set, 290
induced subgraph, 290
inequality
- Alexandrov-Fenchel, 301
- Blaschke-Santaló, 320
- Brunn's, 297(12.2.1)
- - application, 306
- Brunn-Minkowski, 297(12.2.2)
- - application, 331, 333
- - dimension-free form, 301(Ex.5)
- Cauchy-Schwarz, xvi
- Chebyshev's, 240
- Harper's, 335
- isoperimetric, 333-337
- - reverse, 337
- Jensen's, xvi
- Prékopa-Leindler, 300, 302(Ex.7)
- Sobolev, logarithmic, 337
inradius, 317(13.2.2)
- approximation, 322
integer programming, 25
k-interior point, 9
interpolation, 117
intersection graph, 139(Ex.2)
d-intervals, transversal, 262, 262(Ex.2)
inverse Blaschke-Santaló inequality, 320
isometric embedding, 356
isomorphism
- of arrangements, 133
- - affine, 133
- of graphs, xvi
- of hypergraphs, 211
isoperimetric inequality, 333-337
- reverse, 337
Jensen's inequality, xvi
John's lemma, 325(13.4.1)
- application, 347, 350
Johnson-Lindenstrauss flattening lemma, 358(15.2.1)
- application, 366
- lower bound, 362(Ex.3), 369(Ex.4)
join, 89
Kn (complete graph), xvi
Kr,s (complete bipartite graph), 64
Kr,s-free graph, 65, 68
Kk(t) (complete k-partite hypergraph), 212
K2 (planar convex sets), 238
K(m, n) (number of edges of m cells), 43
k-edge, 266
k-facet, 265
k-flat, 3
k-hole, 34
- modulo q, 38
k-interior point, 9
k-partite hypergraph, 211
k-set, 265
- polytope, 273(Ex.7)
k-uniform hypergraph, 211
Kővári-Sós-Turán theorem, 65(4.5.2)
Kakeya problem, 44
Kalai's conjecture, 204
kernel, 13(Ex.8)
KFACd(n, k) (maximum number of k-facets), 266
KFAC(X, k) (number of k-facets), 266
Kirchberger's theorem, 13(Ex.10)
knapsack problem, 26
Koebe's representation theorem, 92
König's edge-covering theorem, 235, 294(Ex.3)
Krasnosel'skii's theorem, 13(Ex.8)
Krein-Milman theorem, in Rd, 96(Ex.10)
Kruskal-Hoffman theorem, 295(Ex.6)
L2 (squared Euclidean metrics), 377
ℓp (countable sequences with ℓp-norm), 357
ℓp-norm, 357
ℓ_p^d (Rd with ℓp-norm), 357
ℓ1-ball, see crosspolytope
Lagrange's four-square theorem, 28(Ex.1)
Laplace functional, 340
Laplacian matrix, 374
largest empty disk, computation, 122
lattice
- face, 88
- general definition, 22
- given by a basis, 21
- shortest vector, 25
lattice basis theorem, 22(2.2.2)
lattice constant, 23
lattice packing, 23
lattice point, 17
- computation, 24
- Helly-type theorem, 295(Ex.7)
Lawrence's representation theorem, 137
lemma
- cutting, 66(4.5.3), 68
- - application, 66, 261
- - for circles, 72
- - higher-dimensional, 160(6.5.3)
- - lower bound, 71
- - proof, 71, 74, 153, 162, 251(Ex.4)
- Dvoretzky-Rogers, 349(14.6.2), 352
- Erdős-Szekeres, 295(Ex.4)
- Farkas, 7(1.2.5), 8, 9(Ex.7)
- first selection, 208(9.1.1)
- - application, 253
- - proofs, 210
- halving-facet interleaving, 277(11.3.1)
- - application, 279, 284, 287
- John's, 325(13.4.1)
- - application, 347, 350
- Johnson-Lindenstrauss flattening, 358(15.2.1)
- - application, 366
- - lower bound, 362(Ex.3), 369(Ex.4)
- Lévy's, 338(14.3.2), 340
- - application, 340, 359
- Lovász, 278(11.3.2)
- - exact, 280, 281(Ex.5)
- - planar, 280(Ex.1)
- positive-fraction selection, 228(9.5.1)
- Radon's, 9(1.3.1), 12
- - application, 11, 12(Ex.1), 222(Ex.3), 244
- - positive-fraction, 220
- regularity
- - for hypergraphs, 226
- - for hypergraphs, weak, 223(9.4.1)
- - for hypergraphs, weak, application, 227(Ex.2), 229
- - Szemerédi's, 223, 226
- same-type, 217(9.3.1)
- - application, 220, 229
- - partition version, 220
- second selection, 211(9.2.1)
- - application, 228, 279
- - lower bounds, 215(Ex.2)
- - one-dimensional, 215(Ex.1)
- shatter function, 239(10.2.5)
- - application, 245, 248
lens (in arrangement), 272(Ex.5), 272(Ex.6)
d-Leray simplicial complex, 197
level, 73, 141
- and k-sets, 266
- and higher-order Voronoi diagrams, 122
- at most k, complexity, 141(6.3.1)
- for segments, 186(Ex.2)
- for triangles, 183
- simplification, 74
Lévy's lemma, 338(14.3.2), 340
- application, 340, 359
LinDep(ā), 109
line pseudometric, 383(Ex.2), 389
line transversal, 82(Ex.9), 259(10.6.1), 262
line, balanced, 280
linear extension, 302
linear form, 27(Ex.4)
linear hyperplane, 109
linear ordering, 302
linear programming, 7
- algorithm, 93
- duality, 233(10.1.2)
linear subspace, 1
linearization, 244
lines, arrangement, 42
LinVal(ā), 109
Lipschitz function, concentration, 337-341
Lipschitz mapping, 337
- extension, 361
Lipschitz norm, 356
Lipton-Tarjan separator theorem, 57
LLL algorithm, 25
local density, 397
local theory of Banach spaces, 329, 336
location, in planar subdivision, 116
log* x (iterated logarithm), xv
Lovász lemma, 278(11.3.2)
- exact, 280, 281(Ex.5)
- planar, 280(Ex.1)
lower bound theorem, generalized, 105
- application, 280
lower envelope
- of curves, 166, 187(7.6.1)
- of segments, 165
- - lower bound, 169(7.2.1)
- of simplices, 186
- of triangles, 183(7.5.1), 186
- superimposed projections, 192
Löwner-John ellipsoid, 327
m(ℓ, n) (maximum number of edges for girth > ℓ), 362
Manhattan distance, see ℓ1-norm
many cells, complexity, 43, 46, 58(Ex.3), 152(Ex.3)
mapping
- affine, 3
- bi-Lipschitz, 356
- Lipschitz, 337
- - extension, 361
- Veronese, 244
marriage theorem, Hall's, 235
matching, 232
matching number, see packing number
matching polytope, 289, 294
matrix
- forbidden pattern, 177
- incidence, 234
- Laplacian, 374
- rank and signs, 140(Ex.4)
matroid, oriented, 137
MAXCUT problem, 384(Ex.8)
maximum norm, see ℓ∞-norm
measure
- Gaussian, 334
- on k-dimensional subspaces, 339
- on S^{n-1}, uniform, 330
- on SO(n) (Haar), 339
- uniform, 237
measure concentration
- for a Hamming cube, 335(14.2.3)
- for a sphere, 331(14.1.1)
- for an expander, 384(Ex.7)
- for product spaces, 340
- Gaussian, 334(14.2.2)
med(f) (median of f), 337
medial axis transform, 120
median, 14, 337
meet, 89
method
- double-description, 86
- ellipsoid, 381
metric
- cut, 383(Ex.3), 391
- line, 383(Ex.2), 389
- of negative type, 379
- planar-graph, 393
- shortest-path, 392
- squared Euclidean, cone, 377
- tree, 392, 398, 399(Ex.5), 400(Ex.6), 400(Ex.7), 400(Ex.9)
metric cone, 106, 377
metric polytope, 106
metric space, 355
Milnor-Thom theorem, 131, 135
minimum spanning tree, 123(Ex.6)
minimum, successive, 24
Minkowski sum, 297
Minkowski's second theorem, 24
Minkowski's theorem, 17(2.1.1)
- for general lattices, 22(2.2.1)
Minkowski-Hlawka theorem, 23
minor, excluded, and metric, 393
mixed volume, 301
molecular modeling, 122
moment curve, 97(5.4.1)
x-monotone (curve), 73
monotone subsequence, 295(Ex.4)
Moore graph, 367
motion planning, 116, 122, 193
multigraph, xvi
multiset, xv
nearest neighbor searching, 116
neighborhood, orthogonal, 318
nerve, 197
η-net, 314
- application, 323, 340, 343, 365, 368
ε-net, 237(10.2.1), 237(10.2.2)
- size, 239(10.2.4)
- weak, 261(10.6.3)
- - for convex sets, 253(10.4.1)
nonrepetitive segment, 178
norm, 344
- ℓ∞, 357
- ℓp, 357
- Lipschitz, 356
- maximum, see ℓ∞-norm
normal distribution, 334, 352
number
- algebraic, 20(Ex.4)
- chromatic, 290
- clique, 290
- crossing, 54
- - and forbidden subgraphs, 57
- - odd, 58
- - pairwise, 58
- fractional packing, 233
- fractional transversal, 232
- Helly, 12
- independence, 290
- matching, see packing number
- packing, 232
- piercing, see transversal number
- transversal, 232
- - bound using τ*, 236, 242(10.2.7)
O(·) (asymptotically at most), xv
o(·) (asymptotically smaller), xv
octahedron, generalized, see crosspolytope
odd crossing number, 58
odd-cr(G) (odd crossing number), 58
1/3-2/3 conjecture, 308
oracle (for convex body), 316, 321
order polytope, 303(12.3.2)
order type, 216, 221(Ex.1)
order, Helly, 263(Ex.4)
ordering, 302
- linear, 302
orientation, 216
oriented matroid, 137
orthogonal neighborhood, 318
P(⪯) (order polytope), 303
P[·] (uniform measure on S^{n-1}), 330
Pd,D (sets definable by polynomials), 244(10.3.2)
packing, 232
- fractional, 233
- lattice, 23
packing number, 232
pair, closest, computation, 122
pair-cr(G) (pairwise crossing number), 58
pairwise crossing edges, 176
pairwise crossing number, 58
Pappus theorem, 134
paraboloid, unit, 118
parallel edges, 176
partially ordered set, 302
k-partite hypergraph, 211
partition
- Radon, 10
- Tverberg, 200
partition theorem, 69
patches, algebraic surface
- lower envelope, 189
- single cell, 191(7.7.2)
path compression, 175
pattern, sign, of polynomials, 131
- on a variety, 135
pencil, 132
pentagon, similar copies, 51(Ex.10)
perfect graph, 290-295
permanent, approximation, 322
permutahedron, 78, 85
- faces, 95(Ex.3)
permutation, forbidden pattern, 177
perturbation argument, 5, 101
planar-graph metric, 393
plane, 3
- Fano, 44
- projective, 2
- topological, 136
planes, incidences, 46
point
- r-divisible, 204
- exposed, 95(Ex.9)
- extremal, 87, 95(Ex.9), 95(Ex.10)
- k-interior, 9
- lattice, 17
- - computation, 24
- - Helly-type theorem, 295(Ex.7)
- Radon, 10, 13(Ex.9)
- random, in a ball, 312
- Tverberg, 200
point location, 116
point-line incidences, 41(4.1.1)
- in the complex plane, 44
points, random, convex hull, 99, 324
polarity, see duality
polygons, convex, union complexity, 194
polyhedral combinatorics, 289
polyhedron
- convex, 83
- H-polyhedron, 82(5.2.1)
polymake, 85
polynomial
- factorization, 26
- Geronimus, 380
- on Cartesian products, 48
polytope (convex), 83
- almost spherical, number of facets, 343(14.4.2)
- chain, 309
- combinatorial equivalence, 89(5.3.4)
- cyclic, 97(5.4.3)
- - universality, 99(Ex.3)
- dual, 90
- fat-lattice, 107(Ex.1)
- graph, 87
- - connectivity, 88, 95(Ex.8)
- Hammer, 348(Ex.1)
- H-polytope, 82(5.2.1)
- integral, 295(Ex.5)
- k-set, 273(Ex.7)
- matching, 289, 294
- metric, 106
- number of, 139(Ex.3)
- order, 303(12.3.2)
- product, 107(Ex.1)
- realization, 94, 113, 139
- simple, 90(5.3.6)
- - determined by graph, 93
- simplicial, 90(5.3.6)
- spherical, 124(Ex.11)
- stable set, 293
- symmetric, number of facets, 347(14.4.2)
- traveling salesman, 289
- union complexity, 194
- volume
- - lower bound, 322
- - upper bound, 315(13.2.1)
- V-polytope, 82(5.2.1)
popular face, 151
poset, 302
position
- convex, 30
- general, 3
positive-fraction
- Erdős-Szekeres theorem, 220(9.3.3), 222(Ex.4)
- Radon's lemma, 220
- selection lemma, 228(9.5.1)
- Tverberg's theorem, 220
post-office problem, 116
power diagram, 121
(p, q)-condition, 255
(p, q)-theorem, 256(10.5.1)
- for hyperplane transversals, 259(10.6.1)
Prékopa-Leindler inequality, 300, 302(Ex.7)
prime
- in a ring, 52
- in arithmetic progressions, 53(4.2.4)
prime number theorem, 52
primitive recursive function, 174
Prob[·] (probability), xv
probabilistic method, application, 55, 61, 71, 142, 148, 153, 184, 240, 268, 281(Ex.5), 340, 352, 359, 364, 386-391
problem
- art gallery, 246, 250
- Busemann-Petty, 313
- decomposition, for algebraic surfaces, 162
- Gallai-type, 231
- Hadwiger-Debrunner, (p, q), 255
- k-set, 265
- Kakeya, 44
- knapsack, 26
- post-office, 116
- set cover, 235
- subset sum, 26
- Sylvester's, 44
- UNION-FIND, 175
- Zarankiewicz, 68
product space, measure concentration, 340
product, of polytopes, 107(Ex.1)
programming
- integer, 25
- linear, 7
- - algorithm, 93
- - duality, 233(10.1.2)
- semidefinite, 378, 380
projection
- almost spherical, 353
- concentration of length, 359(15.2.2)
- polytopes obtained by, 86(Ex.2)
projective plane, 2
- finite, 44, 66
pseudocircles, 271
pseudodisk, 193
pseudolattice, pentagonal, 51(Ex.10)
pseudolines, 132, 136
pseudometric, line, 383(Ex.2), 389
pseudoparabolas, 272(Ex.5), 272(Ex.6)
pseudosegments
- cutting curves into, 70, 271, 272(Ex.5), 272(Ex.6)
- extendible, 140(Ex.5)
- level in arrangement, 270
Purdy's conjecture, 48
ε-pushing, 102
QSTAB(G), 293
quadratic residue, 27
quasi-isometry, 358
Rd, 1
r-divisible point, 204
radius, equivalent, 297
Radon point, 10, 13(Ex.9)
Radon's lemma, 9(1.3.1), 12
- application, 11, 12(Ex.1), 222(Ex.3), 244
- positive-fraction, 220
rainbow simplex, 199
Ramsey's theorem, 29
- application, 30, 32, 34(Ex.3), 39(Ex.6), 99(Ex.3), 373(Ex.3)
random point in a ball, 312
random points, convex hull, 99, 324
random rotation, 339
random subspace, 339
rank, and signs, 140(Ex.4)
rational function on Cartesian product, 48
ray, Helly-type theorem, 13(Ex.7)
real algebraic geometry, 131
realization
- of a polytope, 94, 113
- of an arrangement, 138
Reay's conjecture, 204
reduced basis, 25
Reg, 157
reg(p) (Voronoi region), 115
regular forest, 18(2.1.2)
regular graph, xvi
regular simplex, 84
- volume, 319
regularity lemma
- for hypergraphs, 226
- - weak, 223(9.4.1)
- for hypergraphs, weak
- - application, 227(Ex.2)
- for hypergraphs, weak, application, 229
- Szemerédi's, 223, 226
relation, Dehn-Sommerville, 103
d-representable simplicial complex, 197
residue, quadratic, 27
restriction (of a set system), 238
reverse isoperimetric inequality, 337
reverse search, 106
ridge, 87
robot motion planning, 116, 122, 193
rotation, random, 339
Ryser's conjecture, 235
S^n (unit sphere in R^{n+1}), 313
same-type lemma, 217(9.3.1)
- application, 220, 222(Ex.5), 229
- partition version, 220
same-type transversals, 217
searching
- nearest neighbor, 116
- reverse, 106
second eigenvalue, 374, 381
second selection lemma, 211(9.2.1)
- application, 228, 279
- lower bounds, 215(Ex.2)
- one-dimensional, 215(Ex.1)
section, almost spherical
- of a convex body, 345(14.4.5), 348(14.6.1)
- of a crosspolytope, 346, 353(Ex.2)
- of a cube, 343
- of an ellipsoid, 342(14.4.1)
segments
- arrangement, 130
- intersection graph, 139(Ex.2)
- level in arrangement, 186(Ex.2)
- lower envelope, 165
- - lower bound, 169(7.2.1)
- Ramsey-type result, 222(Ex.5), 227(Ex.2)
- single cell, 176
- zone, 150
selection lemma
- first, 208(9.1.1)
- - application, 253
- - proofs, 210
- positive-fraction, 228(9.5.1)
- second, 211(9.2.1)
- - application, 228, 279
- - lower bounds, 215(Ex.2)
- - one-dimensional, 215(Ex.1)
semialgebraic set, 189
- and VC-dimension, 245
semidefinite programming, 378, 380
η-separated set, 314
separation theorem, 6(1.2.4)
- application, 8, 80, 323, 377
separation, Helly-type theorem, 13(Ex.10)
separator theorem, 57
sequence, Davenport-Schinzel, 167
- asymptotics, 174
- decomposable, 178
- generalized, 174, 176
- realization by curves, 168(Ex.1)
set
- almost convex, 38, 39(Ex.5)
- brick, 298
- convex, 5(1.2.1)
- convex independent, 30(3.1.1)
- - in a grid, 34(Ex.2)
- - in higher dimension, 33
- - size, 32
- defining, 158
- dense, 33
- dual, 80(5.1.3)
- Horton, 36
- - in Rd, 38
- independent, 290
- partially ordered, 302
- polar, see dual set
- semialgebraic, 189
- - and VC-dimension, 245
- k-set, 265
- - polytope, 273(Ex.7)
- shattered, 238(10.2.3)
- stable, see independent set
set cover problem, 235
set system, dual, 245
sets, convex
- intersection patterns, 197
- transversal, 256(10.5.1)
- upper bound theorem, 198
- VC-dimension, 238
seven-hole theorem, 35(3.2.2)
shatter function, 239
- dual, 242
shatter function lemma, 239(10.2.5)
- application, 245, 248
shattered set, 238(10.2.3)
shattering graph, 251(Ex.5)
shelling, 104
shortest vector (lattice), 25
shortest-path metric, 392
Sierksma's conjecture, 205
sign matrix, and rank, 140(Ex.4)
sign pattern, of polynomials, 131
- on a variety, 135
sign vector (of a face), 126
similar copies (counting), 47, 51(Ex.10)
simple arrangement, 127
simple k-packing, 236(Ex.4)
simple polytope, 90(5.3.6)
- determined by graph, 93
simplex, 84(5.2.3)
- circumradius and inradius, 317(13.2.2)
- faces, 88
- projection, 86(Ex.2)
- rainbow, 199
- regular, 84
- X-simplex, 208
- volume, 319
simplex algorithm, 93
simplices
- lower envelope, 186
- single cell, 193
simplicial complex
- d-Leray, 197
- d-collapsible, 197
- d-representable, 197
simplicial polytope, 90(5.3.6)
simplicial sphere, 103
simplification (of a level), 74
single cell
- in R2, 176
- in higher dimensions, 191, 193
site (in a Voronoi diagram), 115
smallest enclosing ball, 13(Ex.5)
- uniqueness, 328(Ex.4)
smallest enclosing ellipsoid
- computation, 327
- uniqueness, 328(Ex.3)
SO(n), 339
- measure concentration, 335
Sobolev inequalities, logarithmic, 337
sorting with partial information, 302-309
space
- Hilbert, 357
- ℓp, 357
- metric, 355
- realization, 138
spanner, 369(Ex.2)
spanning tree, minimum, 123(Ex.6)
sparsest cut, approximation, 391
sphere
- measure concentration, 331(14.1.1)
- simplicial, 103
spherical cap, 333
spherical polytope, 124(Ex.11)
STAB(G) (stable set polytope), 293
stable set, see independent set
stable set polytope, 293
Stanley-Wilf conjecture, 177
star-shaped, 13(Ex.8)
Steinitz theorem, 88(5.3.3), 92
- quantitative, 94
d-step conjecture, 93
stretchability, 134, 137
strong perfect graph conjecture, 291
strong upper bound theorem, 104
subgraph, xvi
- forbidden, 64
- induced, 290
subgraphs, transversal, 262
subhypergraph, 211
subsequence, monotone, 295(Ex.4)
subset sum problem, 26
subspace
- affine, 1
- linear, 1
- random, 339
successive minimum, 24
sum
- Minkowski, 297
- of squared cell complexities, 152(Ex.1)
sums and products, 50(Ex.9)
superimposed projections of lower envelopes, 192
surface patches, algebraic
- lower envelope, 189
- single cell, 191(7.7.2)
surfaces, algebraic, arrangement, 130
- decomposition problem, 162
Sylvester's problem, 44
Szemerédi regularity lemma, 223, 226
Szemerédi-Trotter theorem, 41(4.1.1)
- application, 49(Ex.5), 50(Ex.6), 50(Ex.7), 50(Ex.9), 60, 63(Ex.1)
- in the complex plane, 44
- proof, 56, 66, 69
T(d, r) (Tverberg number), 200
Tcol(d, r) (colored Tverberg number), 203
tessellation, Dirichlet, see Voronoi diagram
theorem
- Balinski's, 88
- Borsuk-Ulam, application, 15, 205
- Carathéodory's, 6(1.2.3), 8
- - application, 199, 200, 208, 319
- center transversal, 15(1.4.4)
- centerpoint, 14(1.4.2), 205
- Clarkson's, on levels, 141(6.3.1)
- colored Helly, 198(Ex.2)
- colored Tverberg, 203(8.3.3)
- - application, 213
- - for r = 2, 205
- colorful Carathéodory, 199(8.2.1)
- - application, 202
- crossing number, 55(4.3.1)
- - application, 56, 61, 70, 283
- - for multigraphs, 60(4.4.2)
- Dilworth's, 294(Ex.4)
- Dirichlet's, 53
- Dvoretzky's, 348(14.6.1), 352
- Edmonds', matching polytope, 294
- efficient comparison, 303(12.3.1)
- Elekes-Rónyai, 48
- epsilon net, 239(10.2.4)
- - application, 247, 251(Ex.4)
- - if and only if form, 252
- Erdős-Simonovits, 213(9.2.2)
- Erdős-Szekeres, 30(3.1.3)
- - another proof, 32
- - application, 35
- - generalizations, 33
- - positive-fraction, 220(9.3.3), 222(Ex.4)
- - quantitative bounds, 32
- fractional Helly, 195(8.1.1)
- - application, 209, 211, 258
- - for line transversals, 260(10.6.2)
- Freiman's, 47
- g-theorem, 104
- Hadwiger's transversal, 262
- Hahn-Banach, 8
- Hall's, marriage, 235
- ham-sandwich, 15(1.4.3)
- - application, 218
- Helly's, 10(1.3.2)
- - application, 12(Ex.2), 13(Ex.5), 14(1.4.1), 82(Ex.9), 196(8.1.2), 200
- Helly-type, 261, 263(Ex.4)
- - for containing a ray, 13(Ex.7)
- - for lattice points, 295(Ex.7)
- - for line transversals, 82(Ex.9)
- - for separation, 13(Ex.10)
- - for visibility, 13(Ex.8)
- - for width, 12(Ex.4)
- Kővári-Sós-Turán, 65(4.5.2)
- Kirchberger's, 13(Ex.10)
- Koebe's, 92
- König's, edge-covering, 235, 294(Ex.3)
- Krasnosel'skii's, 13(Ex.8)
- Krein-Milman, in Rd, 96(Ex.10)
- Kruskal-Hoffman, 295(Ex.6)
- Lagrange's, four-square, 28(Ex.1)
- lattice basis, 22(2.2.2)
- Lawrence's, representation, 137
- lower bound, generalized, 105
- - application, 280
- Milnor-Thom, 131, 135
- Minkowski's, 17(2.1.1)
- - for general lattices, 22(2.2.1)
- - second, 24
- Minkowski-Hlawka, 23
- Pappus, 134
- (p, q), 256(10.5.1)
- - for hyperplane transversals, 259(10.6.1)
- prime number, 52
- Ramsey's, 29
- - application, 30, 32, 34(Ex.3), 39(Ex.6), 99(Ex.3), 373(Ex.3)
- separation, 6(1.2.4)
- - application, 8, 80, 323, 377
- separator, Lipton-Tarjan, 57
- seven-hole, 35(3.2.2)
- Steinitz, 88(5.3.3), 92
- - quantitative, 94
- Szemerédi-Trotter, 41(4.1.1)
- - application, 49(Ex.5), 50(Ex.6), 50(Ex.7), 50(Ex.9), 60, 63(Ex.1)
- - in the complex plane, 44
- - proof, 56, 66, 69
- Tverberg's, 200(8.3.1)
- - application, 208
- - positive-fraction, 220
- - proofs, 203
- two-square, 27(2.3.1)
- upper bound, 100(5.5.1), 103
- - and k-facets, 280
- - application, 119
- - continuous analogue, 114
- - for convex sets, 198
- - formulation with h-vector, 103
- - proof, 282(Ex.6)
- - strong, 104
- weak epsilon net, 253(10.4.2)
- - another proof, 254(Ex.1)
- zone, 146(6.4.1)
- - planar, 168(Ex.5)
Thiessen polygons, 120
topological plane, 136
torus, n-dimensional, measure concentration, 335
total unimodularity, 294, 295(Ex.6)
trace (of a set system), 238
transform
- duality, 78(5.1.1), 81(5.1.4)
- Gale, 107
- - application, 210, 282(Ex.6)
- medial axis, 120
transversal, 82(Ex.9), 231
- criterion of existence, 218(9.3.2)
- fractional, 232
- - bound, 256(10.5.2)
- - for infinite systems, 235
- hyperplane, 259(10.6.1), 262
- line, 262
- of convex sets, 256(10.5.1)
- of disks, 231, 262(Ex.1)
- of d-intervals, 262, 262(Ex.2)
- of subgraphs, 262
- same-type, 217
transversal number, 232
- bound using τ*, 236, 242(10.2.7)
transversal theorem, Hadwiger's, 262
traveling salesman polytope, 289
tree
- hierarchically well-separated, 398
- spanning, minimum, 123(Ex.6)
tree metric, 392, 398, 399(Ex.5), 400(Ex.6), 400(Ex.7), 400(Ex.9)
tree volume, 396
tree-width, 262
triangle, generalized, 66(4.5.3)
triangles
- fat, union complexity, 194
- level in arrangement, 183
- lower envelope, 183(7.5.1), 186
- VC-dimension, 250(Ex.1)
triangulation
- bottom-vertex, 160, 161
- canonical, see bottom-vertex triangulation
- Delaunay, 117, 120, 123(Ex.5)
- of an arrangement, 72(Ex.2)
Tverberg partition, 200
Tverberg point, 200
Tverberg's theorem, 200(8.3.1)
- application, 208
- colored, 203(8.3.3)
- - application, 213
- - for r = 2, 205
- positive-fraction, 220
- proofs, 203
24-cell, 95(Ex.4)
two-square theorem, 27(2.3.1)
type, order, 216, 221(Ex.1)
U(n) (number of unit distances), 42
unbounded cells, number of, 129(Ex.2)
k-uniform hypergraph, 211
uniform measure, 237
unimodularity, total, 294, 295(Ex.6)
union, complexity, 193-194
- for disks, 124(Ex.10)
UNION-FIND problem, 175
unit paraboloid, 118
unit circles
- incidences, 42, 49(Ex.1), 52(4.2.2), 58(Ex.2), 70(Ex.1)
- Sylvester-like result, 44
unit distances, 42
- and incidences, 49(Ex.1)
- for convex position, 45
- in R2, 45
- in R3, 45
- in R4, 45, 49(Ex.2)
- lower bound, 52(4.2.2)
- on a 2-sphere, 45
- upper bound, 58(Ex.2)
universality of cyclic polytope, 99(Ex.3)
up-set, 304
upper bound theorem, 100(5.5.1), 103
- and k-facets, 280
- application, 119
- continuous analogue, 114
- for convex sets, 198
- formulation with h-vector, 103
- proof, 282(Ex.6)
- strong, 104
V(G) (vertex set), xvi
Vn (volume of the unit n-ball), 311
V(x) (visibility region), 247
V-polytope, 82(5.2.1)
Van Kampen-Flores simplicial complex, 368
Vapnik-Chervonenkis dimension, see VC-dimension
VC-dimension, 238(10.2.3)
- bounds, 244(10.3.2), 245(10.3.3)
- for half-spaces, 244(10.3.1)
- for triangles, 250(Ex.1)
vector
- f-vector, 96
- - of a representable complex, 197
- g-vector, 104
- h-vector, 102
- shortest (lattice), 25
- sign (of a face), 126
vectors, almost orthogonal, 362(Ex.3)
Veronese mapping, 244
vertex
- of a polytope, 87
- of an arrangement, 43, 130
vertical decomposition, 72(Ex.3), 156
visibility, 246
- Helly-type theorem, 13(Ex.8)
vol(·), xv
volume
- approximation, 321
- - hardness, 315
- mixed, 301
- of a ball, 311
- of a polytope
- - lower bound, 322
- - upper bound, 315(13.2.1)
- of a regular simplex, 319
- tree, 396
volume-respecting embedding, 396
Voronoi diagram, 115
- abstract, 121
- complexity, 119(5.7.4), 122(Ex.2), 123(Ex.3), 192
- farthest-point, 120
- higher-order, 122
weak ε-net, 261(10.6.3)
- for convex sets, 253(10.4.1)
weak epsilon net theorem, 253(10.4.2)
- another proof, 254(Ex.1)
weak perfect graph conjecture, 291
weak regularity lemma, 223(9.4.1)
- application, 227(Ex.2), 229
width, 12(Ex.4)
- approximation, 322, 322(Ex.4)
- bisection, 57
Wigner-Seitz zones, 120
wiring diagram, 133
X-simplex, 208
x-monotone (curve), 73
Zarankiewicz problem, 68
zone
- (≤k)-zone, 152(Ex.2)
- in a segment arrangement, 150
- of a hyperplane, 146
- of a surface, 150, 151
- of an algebraic variety, 150, 151
zone theorem, 146(6.4.1)
- planar, 168(Ex.5)
Graduate Texts in Mathematics
(continued from page ii)

64 EDWARDS. Fourier Series. Vol. I. 2nd ed. 96 CONWAY. A Course in Functional


65 WELLS. Differential Analysis on Complex Analysis. 2nd ed.
Manifolds. 2nd ed. 97 KOBLITZ. Introduction to Elliptic Curves
66 WATERHOUSE. Introduction to Affine and Modular Forms. 2nd ed.
Group Schemes. 98 BRÖCKER/TOM DIECK. Representations of
67 SERRE. Local Fields. Compact Lie Groups.
68 WEIDMANN. Linear Operators in Hilbert 99 GROVE/BENSON. Finite Reflection Groups.
Spaces. 2nd ed.
69 LANG. Cyclotomic Fields II. 100 BERG/CHRISTENSEN/RESSEL. Harmonic
70 MASSEY. Singular Homology Theory. Analysis on Semigroups: Theory of
71 FARKAS/KRA. Riemann Surfaces. 2nd ed. Positive Definite and Related Functions.
72 STILLWELL. Classical Topology and 101 EDWARDS. Galois Theory.
Combinatorial Group Theory. 2nd ed. 102 VARADARAJAN. Lie Groups, Lie Algebras
73 HUNGERFORD. Algebra. and Their Representations.
74 DAVENPORT. Multiplicative Number 103 LANG. Complex Analysis. 3rd ed.
Theory. 3rd ed. 104 DUBROVIN/FOMENKO/NOVIKOV. Modern
75 HOCHSCHILD. Basic Theory of Algebraic Geometry-Methods and Applications.
Groups and Lie Algebras. Part II.
76 IITAKA. Algebraic Geometry. 105 LANG. SL2(R).
77 HECKE. Lectures on the Theory of 106 SILVERMAN. The Arithmetic of Elliptic
Algebraic Numbers. Curves.
78 BURRIS/SANKAPPANAVAR. A Course in 107 OLVER. Applications of Lie Groups to
Universal Algebra. Differential Equations. 2nd ed.
79 WALTERS. An Introduction to Ergodic 108 RANGE. Holomorphic Functions and
Theory. Integral Representations in Several
80 ROBINSON. A Course in the Theory of Complex Variables.
Groups. 2nd ed. 109 LEHTO. Univalent Functions and
81 FORSTER. Lectures on Riemann Surfaces. Teichmüller Spaces.
82 BOTT/Tu. Differential Forms in Algebraic 110 LANG. Algebraic Number Theory.
Topology. 111 HUSEMOLLER. Elliptic Curves.
83 WASHINGTON. Introduction to Cyclotomic 112 LANG. Elliptic Functions.
Fields. 2nd ed. 113 KARATZAS/SHREVE. Brownian Motion and
84 IRELAND/ROSEN. A Classical Introduction Stochastic Calculus. 2nd ed.
to Modern Number Theory. 2nd ed. 114 KOBLITZ. A Course in Number Theory and
85 EDWARDS. Fourier Series. Vol. II. 2nd ed. Cryptography. 2nd ed.
86 VAN LINT. Introduction to Coding Theory. 115 BERGER/GOSTIAUX. Differential Geometry:
2nd ed. Manifolds, Curves, and Surfaces.
87 BROWN. Cohomology of Groups. 116 KELLEY/SRINIVASAN. Measure and
88 PIERCE. Associative Algebras. Integral. Vol. I.
89 LANG. Introduction to Algebraic and 117 SERRE. Algebraic Groups and Class Fields.
Abelian Functions. 2nd ed. 118 PEDERSEN. Analysis Now.
90 BRØNDSTED. An Introduction to Convex 119 ROTMAN. An Introduction to Algebraic
Polytopes. Topology.
91 BEARDON. On the Geometry of Discrete 120 ZIEMER. Weakly Differentiable Functions:
Groups. Sobolev Spaces and Functions of Bounded
92 DIESTEL. Sequences and Series in Banach Variation.
Spaces. 121 LANG. Cyclotomic Fields I and II.
93 DUBROVIN/FOMENKO/NOVIKOV. Modern Combined 2nd ed.
Geometry-Methods and Applications. 122 REMMERT. Theory of Complex Functions.
Part I. 2nd ed. Readings in Mathematics
94 WARNER. Foundations of Differentiable 123 EBBINGHAUS/HERMES et al. Numbers.
Manifolds and Lie Groups. Readings in Mathematics
95 SHIRYAEV. Probability. 2nd ed.
124 DUBROVIN/FOMENKO/NOVIKOV. Modern 154 BROWN/PEARCY. An Introduction to
Geometry-Methods and Applications. Analysis.
Part III 155 KASSEL. Quantum Groups.
125 BERENSTEIN/GAY. Complex Variables: 156 KECHRIS. Classical Descriptive Set
An Introduction. Theory.
126 BOREL. Linear Algebraic Groups. 2nd ed. 157 MALLIAVIN. Integration and
127 MASSEY. A Basic Course in Algebraic Probability.
Topology. 158 ROMAN. Field Theory.
128 RAUCH. Partial Differential Equations. 159 CONWAY. Functions of One
129 FULTON/HARRIS. Representation Theory: A Complex Variable II.
First Course. 160 LANG. Differential and Riemannian
Readings in Mathematics Manifolds.
130 DODSON/POSTON. Tensor Geometry. 161 BORWEIN/ERDELYI. Polynomials and
131 LAM. A First Course in Noncommutative Polynomial Inequalities.
Rings. 2nd ed. 162 ALPERIN/BELL. Groups and
132 BEARDON. Iteration of Rational Functions. Representations.
133 HARRIS. Algebraic Geometry: A First 163 DIXON/MORTIMER. Permutation Groups.
Course. 164 NATHANSON. Additive Number Theory:
134 ROMAN. Coding and Information Theory. The Classical Bases.
135 ROMAN. Advanced Linear Algebra. 165 NATHANSON. Additive Number Theory:
136 ADKINS/WEINTRAUB. Algebra: An Inverse Problems and the Geometry of
Approach via Module Theory. Sumsets.
137 AXLER/BOURDON/RAMEY. Harmonic 166 SHARPE. Differential Geometry: Cartan's
Function Theory. 2nd ed. Generalization of Klein's Erlangen
138 COHEN. A Course in Computational Program.
Algebraic Number Theory. 167 MORANDI. Field and Galois Theory.
139 BREDON. Topology and Geometry. 168 EWALD. Combinatorial Convexity and
140 AUBIN. Optima and Equilibria. An Algebraic Geometry.
Introduction to Nonlinear Analysis. 169 BHATIA. Matrix Analysis.
141 BECKER/WEISPFENNING/KREDEL. Gröbner 170 BREDON. Sheaf Theory. 2nd ed.
Bases. A Computational Approach to 171 PETERSEN. Riemannian Geometry.
Commutative Algebra. 172 REMMERT. Classical Topics in Complex
142 LANG. Real and Functional Analysis. Function Theory.
3rd ed. 173 DIESTEL. Graph Theory. 2nd ed.
143 DOOB. Measure Theory. 174 BRIDGES. Foundations of Real and
144 DENNIS/FARB. Noncommutative Abstract Analysis.
Algebra. 175 LICKORISH. An Introduction to Knot
145 VICK. Homology Theory. An Theory.
Introduction to Algebraic Topology. 176 LEE. Riemannian Manifolds.
2nd ed. 177 NEWMAN. Analytic Number Theory.
146 BRIDGES. Computability: A 178 CLARKE/LEDYAEV/STERN/WOLENSKI.
Mathematical Sketchbook. Nonsmooth Analysis and Control
147 ROSENBERG. Algebraic K-Theory Theory.
and Its Applications. 179 DOUGLAS. Banach Algebra Techniques in
148 ROTMAN. An Introduction to the Operator Theory. 2nd ed.
Theory of Groups. 4th ed. 180 SRIVASTAVA. A Course on Borel Sets.
149 RATCLIFFE. Foundations of 181 KRESS. Numerical Analysis.
Hyperbolic Manifolds. 182 WALTER. Ordinary Differential
150 EISENBUD. Commutative Algebra Equations.
with a View Toward Algebraic 183 MEGGINSON. An Introduction to Banach
Geometry. Space Theory.
151 SILVERMAN. Advanced Topics in 184 BOLLOBAS. Modern Graph Theory.
the Arithmetic of Elliptic Curves. 185 COX/LITTLE/O'SHEA. Using Algebraic
152 ZIEGLER. Lectures on Polytopes. Geometry.
153 FULTON. Algebraic Topology: A
First Course.
186 RAMAKRISHNAN/VALENZA. Fourier 201 HINDRY/SILVERMAN. Diophantine
Analysis on Number Fields. Geometry: An Introduction.
187 HARRIS/MORRISON. Moduli of Curves. 202 LEE. Introduction to Topological
188 GOLDBLATT. Lectures on the Hyperreals: Manifolds.
An Introduction to Nonstandard Analysis. 203 SAGAN. The Symmetric Group:
189 LAM. Lectures on Modules and Rings. Representations, Combinatorial
190 ESMONDE/MURTY. Problems in Algebraic Algorithms, and Symmetric Functions.
Number Theory. 204 ESCOFIER. Galois Theory.
191 LANG. Fundamentals of Differential 205 FÉLIX/HALPERIN/THOMAS. Rational
Geometry. Homotopy Theory. 2nd ed.
192 HIRSCH/LACOMBE. Elements of 206 MURTY. Problems in Analytic Number
Functional Analysis. Theory.
193 COHEN. Advanced Topics in Readings in Mathematics
Computational Number Theory. 207 GODSIL/ROYLE. Algebraic Graph Theory.
194 ENGEL/NAGEL. One-Parameter Semigroups 208 CHENEY. Analysis for Applied
for Linear Evolution Equations. Mathematics.
195 NATHANSON. Elementary Methods in 209 ARVESON. A Short Course on Spectral
Number Theory. Theory.
196 OSBORNE. Basic Homological Algebra. 210 ROSEN. Number Theory in Function
197 EISENBUD/HARRIS. The Geometry of Fields.
Schemes. 211 LANG. Algebra. Revised 3rd ed.
198 ROBERT. A Course in p-adic Analysis. 212 MATOUSEK. Lectures on Discrete
199 HEDENMALM/KORENBLUM/ZHU. Theory Geometry.
of Bergman Spaces. 213 FRITZSCHE/GRAUERT. From Holomorphic
200 BAO/CHERN/SHEN. An Introduction to Functions to Complex Manifolds.
Riemann-Finsler Geometry.
