L17-18 QuadTrees PDF

Quad Trees
CMSC 420
Applications of Geometric / Spatial Data Structs.
• Computer graphics, games, movies
• Computer vision, CAD, street maps (Google Maps / Google Earth)
• Human-computer interface design (windowing systems)
• Virtual reality
• Visualization (graphing complex functions)

Geometric Objects
• Scalars: 1-d points

• Point: location in d-dimensional space. d-tuple of
scalars. P=(x1,x2,x3...,xd)
- arrays: double p[d];
- structures: struct { double x, y, z; }
- good compromise:
struct Point {
    static const int DIM = 3;  // must be static to be usable as an array bound
    double coord[DIM];
};
• Vectors: direction and magnitude (length) in that direction.
Lines, Segments, Rays
• Line: infinite in both directions
- y = mx + b [slope m, intercept b]
- ax + by = c
- In higher dimensions, any two points define a line.
• Ray: infinite in one direction
• Segment: finite in both directions
• Polygons: cycle of joined line segments
- simple if they don’t cross
- convex if any line segment connecting two points on its surface lies entirely within the shape.
- convex hull of a set of points P: smallest convex set that contains P
- What’s a good representation for a polygon? A circularly linked list of points.
Geometric Operations
• P - Q is a vector going from point Q to P
• Q + v is a point at the head of vector v, if v were anchored at Q
• v + u: serially walk along v and then u. v+u is the direct shortcut.
• Great use for C++ operator overloading.
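The three operations above map directly onto overloaded operators. The sketch below shows the same idea in Python rather than C++. The class names `Point` and `Vector` and the tuple-based storage are illustrative assumptions, not something the slides prescribe.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Vector:
    coords: Tuple[float, ...]

    def __add__(self, other: "Vector") -> "Vector":
        # v + u: walk along v, then along u; the sum is the direct shortcut.
        return Vector(tuple(a + b for a, b in zip(self.coords, other.coords)))


@dataclass(frozen=True)
class Point:
    coords: Tuple[float, ...]

    def __sub__(self, other: "Point") -> Vector:
        # P - Q is the vector going from Q to P.
        return Vector(tuple(a - b for a, b in zip(self.coords, other.coords)))

    def __add__(self, v: Vector) -> "Point":
        # Q + v is the point at the head of v when v is anchored at Q.
        return Point(tuple(a + b for a, b in zip(self.coords, v.coords)))


P = Point((4.0, 6.0))
Q = Point((1.0, 2.0))
v = P - Q        # vector from Q to P
back = Q + v     # walking v from Q lands on P again
```

Keeping points and vectors as separate types means that meaningless expressions such as adding two points are simply not defined.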
Types of Queries
• Is the object in the set?
• What is the closest object to a given point?
• What objects does a query object intersect with?
• What is the first object hit by the given ray? [Ray shooting]
• What objects contain P?
• What objects are in a given range? [range queries]

Intersection of Circle & Rectangle
[Figure: circle with center C and rectangle R, which spans R.low[0] to R.high[0] in dimension 0 and R.low[1] to R.high[1] in dimension 1]
Question: how do you compute the distance from circle center to the rectangle?
Instead of a lot of special cases, break the distance down by dimension (component).
Distance = square root of the sum of the squares of the distances in each dimension:
$d = \sqrt{d_x^2 + d_y^2 + d_z^2}$
In 2-d: $d^2 = \mathrm{dist}_x(C,R)^2 + \mathrm{dist}_y(C,R)^2$
$\mathrm{dist}_x(C,R)$ is 0 unless C is in the blue regions of the figure, i.e. C[0] < R.low[0] or C[0] > R.high[0]

Distance between point C and rectangle R
distance(C, R):
    dist = 0
    for i = 0 to DIM - 1:
        if C[i] < R.low[i]:
            dist += square(R.low[i] - C[i])
        else if C[i] > R.high[i]:
            dist += square(C[i] - R.high[i])
    return sqrt(dist)
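A runnable version of the pseudocode above, as a minimal sketch. The rectangle is passed as separate `low` and `high` sequences, and the helper `circle_intersects_rect` is a hypothetical name added here to answer the intersection question from the earlier slide.

```python
import math
from typing import Sequence


def distance(c: Sequence[float], low: Sequence[float], high: Sequence[float]) -> float:
    """Distance from point c to the axis-aligned box [low, high]; 0 if c is inside."""
    total = 0.0
    for ci, lo, hi in zip(c, low, high):
        if ci < lo:
            total += (lo - ci) ** 2      # c is below the box in this dimension
        elif ci > hi:
            total += (ci - hi) ** 2      # c is above the box in this dimension
        # otherwise this dimension contributes nothing
    return math.sqrt(total)


def circle_intersects_rect(c, radius, low, high) -> bool:
    # The circle touches the box exactly when its center is within `radius` of it.
    return distance(c, low, high) <= radius
```

For example, the origin is at distance 5 from the box with corners (3, 4) and (5, 6), so a circle of radius 5 centered at the origin just touches it.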
Why are geometric (spatial) data different?
No natural ordering...
• In 1-d:
- we usually had a natural ordering on the keys (integers,
alphabetical order, ...)
- But how do you order a set of points?
• Take a step back:
- In the 1-d case, how did we use this ordering?
- Mostly, it gave us an implicit way to partition the data.
• So:
- Instead of explicitly ordering and implicitly partitioning,
we usually: explicitly partition.
- Partitioning is very natural in geometric spaces.
Why are geometric (spatial) data different?
Static case also interesting...
• In 1-d:
- usually the static case (all data known at start) is not very
interesting
- can be solved by sorting the data
(heaps => sorted lists, balanced trees => binary search)
• With geometric data,
- it’s sometimes hard to answer queries even if all data are
known (what’s the analog of binary search for a set of
points?)
- Therefore, emphasize updates less (though we’ll still
consider them)
- Model: preprocess the data (may be “slow” like
O(n log n)) and then have efficient answers to queries.
Point Data Sets – Today
• Data we want to store is a collection of d-dimensional points.
- We’ll focus on 2-d for now (hard to draw anything else)
• Simplest query: “Is point P in the collection?”

PR Quadtrees
PR Quadtrees (Point-Region)
• Recursively subdivide cells into 4 equal-sized subcells until a cell has only one point in it.
• Each division results in a single node with 4 child pointers.
• When cell contains no points, add special “no-point” node.
• When cell contains 1 point, add node containing point + data associated with that point (perhaps a pointer out to a bigger data record).
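PR Quadtrees Internal Nodes
[Figure: a cell divided into NW, NE, SW, SE quadrants, and the corresponding internal node with one child pointer per quadrant]
PR Quadtrees
[Figure: example PR quadtree over points L, M, N, P, Q, R, shown as a subdivided square and as a tree with NW, NE, SW, SE children]
Find in PR Quadtrees
[Figure: search in the same example tree over points L, M, N, P, Q, R]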
Insert in PR Quadtrees
• insert(P):
- find(P)
- If the cell where P would go is empty, then add P to it (change the node from a “no-point” node to a point node).
- If the cell where P would go has a point Q in it, repeatedly split until P is separated from Q. Then add P to the correct (empty) cell.
• How many times might you have to split? Unbounded in n: it depends on how close P and Q are, not on how many points are stored.
Delete in PR Quadtrees
• delete(P):
- find(P)
- If the cell that would contain P is empty, return not found!
- Else, remove P (change the point node to a “no-point” node).
- If at most 1 sibling of the cell has a point, merge the siblings into a single cell. Repeat until at least two siblings contain a point.
• A cell “has a point” if it is a point node or an internal node.

Features of PR Quadtrees
• Locations of splits don’t depend on exact point
values (it is a partitioning of space, not of the set of
keys)
• Leaves should be treated differently than internal nodes because:
- Empty leaf nodes are common,
- Only leaves contain data
• Bounding boxes constructed on the fly and passed into the recursive calls.
• Extension: allow a constant b > 1 points in a cell (bucket quadtrees)
Height Lemma
• If
- c is the smallest distance between any two points
- s is the side length of the initial square containing all the points
• Then
- the depth of a quadtree is ≤ log(s/c) + 3/2 (logs are base 2)
Proof: An internal node of depth i has side length $s/2^i$ and diagonal length $s\sqrt{2}/2^i$. It contains at least two points, which are at least c apart, so $s\sqrt{2}/2^i \ge c$. Hence $i \le \log(s\sqrt{2}/c) = \log(s/c) + 1/2$. Height of tree is max depth of internal node + 1, so height ≤ log(s/c) + 3/2.
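A small numeric check of the bound, assuming base-2 logarithms (each split halves the side length). The function names are hypothetical. `max_internal_depth` finds the deepest level at which a cell's diagonal is still at least c, which is the condition used in the proof, and it assumes $s\sqrt{2} \ge c$.

```python
import math


def height_bound(s: float, c: float) -> float:
    # Bound from the lemma: log2(s/c) + 3/2.
    return math.log2(s / c) + 1.5


def max_internal_depth(s: float, c: float) -> int:
    # Largest depth i with diagonal s*sqrt(2)/2**i >= c.
    i = 0
    while s * math.sqrt(2) / 2 ** (i + 1) >= c:
        i += 1
    return i


s, c = 1024.0, 1.0
deepest = max_internal_depth(s, c)   # deepest level that can still hold 2 points
bound = height_bound(s, c)
```

With s = 1024 and c = 1, an internal node can sit no deeper than level 10, so the height is at most 11, which is within the bound of 11.5.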
Size Corollary
Thm. A quadtree of depth d storing n points has O((d+1)n) nodes.
Proof: Every internal node represents a square with at least 2 points in it. Hence, each level has fewer than n internal nodes. Each internal node has 4 children, so the total number of nodes is O((d+1)n).
North Neighbor
North neighbor of a cell S at depth i is the deepest node of depth ≤ i that is adjacent to the north side of S.
• North neighbor of the root is NULL.
• North neighbor of a SW or SE node is the NW or NE node, respectively.
• North neighbor of a NE or NW node is a child of the north neighbor of its parent.
Algorithm: walk up until you get an easy case, apply easy case, and then walk down, moving to SW or SE as appropriate.
Compute North Neighbor
def NorthNeighbor(v, Q):
    if parent(v) is None: return None
    if v is SW-child: return NW-child(parent(v))
    if v is SE-child: return NE-child(parent(v))
    u = NorthNeighbor(parent(v), Q)
    if u is None or is_leaf(u): return u
    if v is NW-child: return SW-child(u)
    else: return SE-child(u)
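The pseudocode above can be made runnable with a small node class. This sketch assumes each node stores a parent pointer and which quadrant of its parent it occupies. The `Node` class and its `split` helper are illustrative, and the unused tree argument `Q` from the pseudocode is dropped.

```python
class Node:
    def __init__(self, parent=None, which=None):
        self.parent = parent
        self.which = which    # "NW", "NE", "SW" or "SE"; None for the root
        self.kids = {}        # empty dict means this node is a leaf

    def split(self):
        # Turn this leaf into an internal node with four children.
        self.kids = {q: Node(self, q) for q in ("NW", "NE", "SW", "SE")}
        return self.kids


def north_neighbor(v):
    if v.parent is None:
        return None                          # the root has no north neighbor
    if v.which == "SW":
        return v.parent.kids["NW"]           # easy case: sibling directly above
    if v.which == "SE":
        return v.parent.kids["NE"]
    u = north_neighbor(v.parent)             # walk up
    if u is None or not u.kids:
        return u                             # nothing there, or a larger leaf
    # walk down into the southern child that touches v
    return u.kids["SW"] if v.which == "NW" else u.kids["SE"]


root = Node()
top = root.split()
sw_kids = top["SW"].split()
nw_kids = top["NW"].split()
se_kids = top["SE"].split()   # the NE quadrant is deliberately left unsplit
```

In this example the north neighbor of the NE child of the SE quadrant is the whole unsplit NE quadrant, which shows why the result may be shallower than the query cell.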
Demo
An Advantage of PR quadtrees
• Since partition locations don’t depend on the data points, two different sets of data can be stored in two separate PR quadtrees (built over the same initial square)
- The partition locations will be “the same”
- E.g. a quadrant Q1 in T1 is either the same as, a superset of, a subset of, or disjoint from any quadrant Q2 in T2
- You cannot get partially overlapping quadrants
- Recursive algorithms are cleaner
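To see why quadrants from two PR quadtrees over the same initial square never partially overlap, a cell can be identified by its depth and its column and row index at that depth. The `(depth, ix, iy)` encoding and the function name `relation` below are assumptions made for illustration.

```python
def relation(a, b):
    """Relate two cells of the same root square, each given as (depth, ix, iy).

    Returns how cell a relates to cell b: "same", "superset", "subset" or "disjoint".
    """
    (da, xa, ya), (db, xb, yb) = a, b
    if da > db:
        # a is the deeper cell: answer the mirrored question and flip it.
        flipped = {"superset": "subset", "subset": "superset"}
        r = relation(b, a)
        return flipped.get(r, r)
    shift = db - da
    # Dropping the low `shift` bits of b's indices gives b's ancestor at a's depth.
    if (xb >> shift, yb >> shift) != (xa, ya):
        return "disjoint"
    return "same" if shift == 0 else "superset"
```

There is no "partial overlap" outcome: either the deeper cell's ancestor at the shallower depth is the other cell, or the two cells are disjoint.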
Issues with PR Quadtrees
• Can be inefficient:
- two closely spaced points may require a lot of levels in
the tree to split them
- Have to divide up space finely enough so that they end
up in different cells
• Generalizing to large dimensions uses a lot of space.
- octree = Quadtree in 3-D (each node has 8 pointers)
- In d dimensions, each node has $2^d$ children
- d = 20 => nodes will have ~1 million pointers!
Split & Merge Decomposition
• Subdivide into uniform blocks
• Merge similar brothers
• Subdivide non-homogeneous cells
• Group identical blocks to get regions
MX Quadtrees
• Good for image data
- smallest element is known, e.g. a pixel
- Space is recursively subdivided until smallest unit is reached
- Always subdivide to smallest unit
MX Quadtree Demo
MX (MatriX) Quadtrees
• Points are always at leaves
• All leaves with points are the same depth
• Shape of final tree independent of insertion order

MX Quadtree Notes & Applications
• Shape of final tree independent of insertion order
• Can be used to represent a matrix (especially 0/1 matrix)
- recursive decomposition of matrix (given by the MX tree) can be used for faster matrix transposition and multiplication
• Compression and transmission of images
- Hierarchy => progressive transmission:
- transmitting high levels of the tree gives you a rough image
- lower levels give you more detail
• Requires points come from a finite & discrete domain
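A minimal MX quadtree sketch over a $2^k \times 2^k$ pixel grid, assuming k ≥ 1 and integer pixel coordinates. The class name and the representation (4-element lists for internal nodes, `True` for a set pixel, `None` for an empty subtree) are illustrative choices, not taken from the slides.

```python
class MXQuadtree:
    def __init__(self, k):
        self.k = k          # the grid is 2**k by 2**k pixels, k >= 1
        self.root = None    # internal nodes are 4-element lists

    @staticmethod
    def _quadrant(x, y, level):
        # Bit `level` of x and y selects the child at this level.
        return ((y >> level) & 1) * 2 + ((x >> level) & 1)

    def insert(self, x, y):
        if self.root is None:
            self.root = [None] * 4
        node = self.root
        # Always descend exactly k levels, down to a single pixel.
        for level in range(self.k - 1, -1, -1):
            q = self._quadrant(x, y, level)
            if level == 0:
                node[q] = True                 # leaf: the pixel itself
            else:
                if node[q] is None:
                    node[q] = [None] * 4
                node = node[q]

    def contains(self, x, y):
        node = self.root
        for level in range(self.k - 1, -1, -1):
            if node is None:
                return False                   # fell into an empty subtree
            node = node[self._quadrant(x, y, level)]
        return node is True


pixels = [(0, 0), (7, 7), (3, 4)]
t1 = MXQuadtree(3)
for x, y in pixels:
    t1.insert(x, y)
t2 = MXQuadtree(3)
for x, y in reversed(pixels):
    t2.insert(x, y)
```

Every stored pixel ends up at depth k, and `t1` and `t2` have identical structure even though the pixels were inserted in opposite orders.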

Point Quadtrees
• Similar to PR Quadtrees, except we split on points in the data set, rather than evenly dividing space.
• Handling infinite space:
- Special infinity value => allow rectangles to extend to infinity in some directions
- Assume global bounding box
Point Quadtrees
Insertion into Point Quadtrees
• Insert(P):
- Find the region that would contain the point P.
- If P is encountered during the search, report Duplicate!
- Add point where you fall off the tree.
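[Figure: the point (35,40) divides the plane into NW, NE, SW, SE quadrants; the corresponding tree node (35,40) has NW, NE, SW, SE children]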
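The insertion steps above can be sketched as follows. The tie-breaking rule (a point on a dividing line goes north/east) is an assumption, the class and function names are illustrative, and the points other than (35,40) are made up for the example.

```python
class PQNode:
    def __init__(self, point):
        self.point = point
        self.kids = {}    # maps "NW"/"NE"/"SW"/"SE" to child nodes


def quadrant(split, p):
    # Quadrant of p relative to the splitting point stored in a node.
    ns = "N" if p[1] >= split[1] else "S"
    ew = "E" if p[0] >= split[0] else "W"
    return ns + ew


def insert(root, p):
    """Insert p; return (root, inserted), where inserted is False on a duplicate."""
    if root is None:
        return PQNode(p), True
    node = root
    while True:
        if node.point == p:
            return root, False               # P encountered during the search
        q = quadrant(node.point, p)
        if q not in node.kids:
            node.kids[q] = PQNode(p)         # add where we fall off the tree
            return root, True
        node = node.kids[q]


root = None
results = []
for pt in [(35, 40), (50, 10), (5, 45), (35, 40)]:
    root, ok = insert(root, pt)
    results.append(ok)
```

Unlike the PR quadtree, every node here stores a data point, so inserting the same points in a different order can produce a different tree.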
Point Quadtree Demo
Deletion from Point Quadtrees
• Reinsert all the points in the subtree rooted at the deleted node P.
• Can be expensive.
• There are some more clever ways to delete that work well under some assumptions about the data.
Some performance facts (random data):
• Cost of building a point quadtree empirically shown to be $O(n \log_4 n)$ [Finkel, Bentley] with random insertions
• Expected height is O(log n).
• Expected cost of inserting the ith node into a d-dimensional quad tree is (2/d) ln i + O(1).
More balanced Point Quadtrees
• Optimized Point Quadtree: want no subtree rooted at node A to contain more than half the nodes (points) under A.
• Assume you know all the data at the start:
x1 y1
x2 y2
x3 y3
...
• Sort the points lexicographically: primary key is x-coordinate,
secondary key is y-coordinate.
• Make root = the median of this list (middle element)
=> half the elements will be to the left of the root, half to the
right.
• Recursively apply to top and bottom halves of the list.
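One way to realize the construction above, as a sketch that assumes all points are distinct tuples. A point is sent west or east according to whether it comes before or after the median in lexicographic order, so neither side holds more than half of the points. The dictionary-based node layout is an illustrative choice.

```python
def build(points):
    """Build an optimized point quadtree from distinct (x, y) tuples."""
    if not points:
        return None
    pts = sorted(points)                   # lexicographic: x first, then y
    root = pts[len(pts) // 2]              # median of the sorted list
    groups = {"NW": [], "NE": [], "SW": [], "SE": []}
    for p in pts:
        if p == root:
            continue
        ew = "W" if p < root else "E"      # side of the median in sorted order
        ns = "N" if p[1] >= root[1] else "S"
        groups[ns + ew].append(p)
    return {"point": root, "kids": {q: build(g) for q, g in groups.items()}}


def size(node):
    return 0 if node is None else 1 + sum(size(k) for k in node["kids"].values())


def balanced(node):
    # No child subtree may hold more than half of the points under its parent.
    if node is None:
        return True
    n = size(node)
    return all(2 * size(k) <= n and balanced(k) for k in node["kids"].values())


pts = [(1, 5), (2, 3), (3, 8), (4, 1), (5, 9), (6, 2), (7, 7), (8, 4), (9, 6)]
tree = build(pts)
```

A search in such a tree has to apply the same west/east rule as `build`, including for points that share the root's x-coordinate.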
Pseudo Point Quadtrees
• Like PR quadtrees: splits don’t occur at data points.
• Like Point Quadtrees: actual key values determine splits
• Determine a point that splits up the dataset in the most balanced way.
- Overmars & van Leeuwen: for any N points, there is a partitioning point so that each quadrant contains ≤ ceil(N/(d+1)) points.
Comparison of Point-based & Trie-based Quadtrees
• “Trie-based” = MX and PR quadtrees
- rely on regular space decomposition
- data points associated only with leaf nodes
- simple deletion
- shape independent of insertion order
• Point-based quadtrees
- data points in internal nodes
- often have fewer nodes
- harder deletion
- shape depends on insertion order
Problems with Point Quadtrees
• May not be balanced...
- But expected to be if points are randomly inserted.
• Size is bounded in n.
- Partitioning key space rather than geometric space.
- Because each node contains a point, you have at most n
nodes.
• But may have lots of unused pointers if d is large!
• Solution is kd-trees.
