Spatial Point Patterns
Lecture #1
3/11/2010
Types of distributions
Three general patterns
Random - any point is equally likely to occur at any location
and the position of any point is not affected by the position of
any other point
Uniform - every point is as far from all of its neighbors as
possible
Clustered - many points are concentrated close together, and
there are large areas that contain very few, if any, points
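The three patterns can be mimicked with a few lines of base R. This is only a sketch: the unit-square study area, the 100 events, the five cluster centres, and the spread of 0.03 are all made-up values chosen for illustration, and the clustered points are not clipped to the square.

```r
set.seed(1)                      # fixed seed so the sketch is repeatable
n <- 100

# Random: x and y drawn independently and uniformly over the unit square
random_pts <- cbind(x = runif(n), y = runif(n))

# Uniform: points placed on a regular 10 x 10 lattice
g <- (1:10 - 0.5) / 10
uniform_pts <- as.matrix(expand.grid(x = g, y = g))

# Clustered: 5 hypothetical centres with 20 points scattered tightly around each
centres <- cbind(runif(5), runif(5))
idx <- rep(1:5, each = 20)
clustered_pts <- cbind(x = centres[idx, 1] + rnorm(n, sd = 0.03),
                       y = centres[idx, 2] + rnorm(n, sd = 0.03))

# Mean nearest-neighbour distance is one quick way to tell the patterns apart
mean_nn <- function(pts) {
  d <- as.matrix(dist(pts))
  diag(d) <- Inf                 # ignore the zero distance from a point to itself
  mean(apply(d, 1, min))
}
nn <- c(random = mean_nn(random_pts),
        uniform = mean_nn(uniform_pts),
        clustered = mean_nn(clustered_pts))
```

Printing `nn` should show the smallest mean nearest-neighbour distance for the clustered pattern and the largest for the uniform one.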
Methods
“Exploratory” analysis
Visualization (maps)
Estimate how intensity of point pattern varies over an area
Quadrat analysis, kernel estimation
Estimate the presence of spatial dependence among events
Nearest neighbor distances, K-function
Modeling techniques
Statistical tests for significant spatial patterns in data, compared
with the null hypothesis of complete spatial randomness (CSR)
Modeling techniques
We can conduct statistical tests for significant patterns in
our data
H0: events exhibit complete spatial randomness (CSR)
Ha: events are spatially clustered or dispersed
Some notes on R
> library(maptools)
> library(rgdal)
> library(shapefiles)
> library(spatstat)
> library(splancs)
Splancs
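> library(maptools)   # provides readShapePoly() and readShapePoints()
> # workingDir is assumed to already hold the path to the project folder
> border <- readShapePoly(paste(workingDir, "/shapefiles/FLBndy.shp", sep=""))
> str(border)
> flbord <- border@polygons[[1]]@Polygons[[1]]@coords   # boundary vertices as a matrix (for splancs)
> flinv <- readShapePoints("C:/Users/Elisabeth Root/Desktop/Quant/R/shapefiles/FL_Invasive.shp")
> flinvxy <- coordinates(flinv)   # event coordinates as a matrix (for splancs)
> flpt <- as(flinv, "ppp")   # convert the events to a spatstat point pattern
> flbdry <- as(border, "owin")   # assumed definition: the slide uses flbdry without defining it
> flppp <- ppp(flpt$x, flpt$y, window=flbdry)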
Quadrat methods
Divide the study area into subregions of equal size
Often squares, but don't have to be
Count the frequency of events in each subregion
Calculate the intensity of events in each subregion
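The steps above can be sketched in base R. The 50 events, the 100 x 100 study area, and the 4 x 4 grid of square quadrats are hypothetical values picked only to show the mechanics.

```r
set.seed(42)

# Hypothetical events in a 100 x 100 study area
x <- runif(50, 0, 100)
y <- runif(50, 0, 100)

# Divide the area into a 4 x 4 grid of equal-sized square quadrats
breaks <- seq(0, 100, by = 25)
qx <- cut(x, breaks, include.lowest = TRUE)
qy <- cut(y, breaks, include.lowest = TRUE)

# Count the frequency of events in each quadrat (empty quadrats show as 0)
counts <- table(qy, qx)

# Intensity = events per unit area in each quadrat
quadrat_area <- 25 * 25
intensity <- counts / quadrat_area
```

Printing `counts` gives a grid of frequencies like the ones shown on the quadrat slides, and `intensity` rescales them to events per unit area.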
Compare how the intensity varies across the study region R
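Quadrat counts for three example patterns (each a 2 x 5 grid of quadrats holding 20 events in total):
Left panel   Middle panel   Right panel
3  1         2  2            0   0
5  0         2  2            0   0
2  1         2  2           10  10
1  3         2  2            0   0
3  1         2  2            0   0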
Quadrats in R
Done using spatstat package
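> qt <- quadratcount(flppp, nx = 4, ny = 4)   # assumed: qt is not defined on the slide; the 4 x 4 grid is illustrative
> plot(flppp)
> plot(qt, add = TRUE, cex = .5)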
Kernel estimation
Believe it or not, we already talked about this with GWR!
Calculating the density of events within a specified search
radius around each event
A moving three-dimensional function (the kernel) of a given
radius (bandwidth) 'visits' each point in the study area
Use kernel to weight the area surrounding the point
proportionately to its distance to the event
Sum these individual kernels for the study region
Produce a smoothed surface
The kernel estimate of the intensity at s is
$$\hat{\lambda}_\tau(s) = \sum_{i=1}^{n} \frac{1}{\tau^2}\, k\!\left(\frac{s - s_i}{\tau}\right)$$
s is a location in R (the study area)
s1…sn are the locations of n events in R
k is the kernel (which is a function of the distance and bandwidth)
Summed across all points si within the radius ($\tau$)
The kernel (k) is basically a mathematical function that
calculates how the surface value “falls off” as it reaches
the radius
There are lots of different kernel functions
Most researchers believe it doesn't really matter which you use
Most common in GIS is the quartic kernel
Quartic: $k(h_i) = \frac{3}{\pi}\left(1 - \frac{h_i^2}{\tau^2}\right)^2$ for $h_i \le \tau$, and 0 otherwise
Normal (Gaussian): $k(h_i) = \frac{1}{2\pi}\, e^{-h_i^2 / (2\tau^2)}$
where $h_i$ is the distance from s to event $s_i$ and $\tau$ is the bandwidth
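With the quartic kernel the estimate becomes
$$\hat{\lambda}_\tau(s) = \sum_{d_i \le \tau} \frac{3}{\pi\tau^2}\left(1 - \frac{d_i^2}{\tau^2}\right)^2$$
$d_i$ = distance between point s and $s_i$
$\tau$ = bandwidth (radius of the circle)
At point s, the weight is $3/(\pi\tau^2)$ and drops smoothly to a value of 0 at $\tau$
Summed for all values of $d_i$ which are not larger than $\tau$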
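The quartic estimate can be written out directly in base R. This is a minimal sketch, not the splancs implementation: the function name `quartic_intensity`, the three event coordinates, and the bandwidth of 2 are hypothetical, and no edge correction is applied.

```r
# Quartic kernel intensity estimate at location s (a length-2 vector)
# events: n x 2 matrix of event coordinates; tau: bandwidth
quartic_intensity <- function(s, events, tau) {
  d <- sqrt((events[, 1] - s[1])^2 + (events[, 2] - s[2])^2)
  inside <- d <= tau                  # only events within the bandwidth contribute
  sum(3 / (pi * tau^2) * (1 - d[inside]^2 / tau^2)^2)
}

events <- rbind(c(0, 0), c(3, 0), c(10, 10))   # hypothetical coordinates

# At an event with no other event within tau, the estimate is the peak weight 3 / (pi * tau^2)
at_event <- quartic_intensity(c(0, 0), events, tau = 2)

# Far from every event the estimate is 0
far_away <- quartic_intensity(c(50, 50), events, tau = 2)
```

Evaluating the function over every cell of a grid gives the smoothed surface described above.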
[Figure: individual kernel “bumps”, each of the form $\frac{3}{\pi\tau^2}\left(1 - \frac{d_i^2}{\tau^2}\right)^2$, summed to give $\hat{\lambda}_\tau(s)$]
A few notes
Like GWR, we can use fixed and adaptive kernels
Fixed = bandwidth is a specified distance
Adaptive = fixed number of points used
Results are sensitive to change in bandwidth
When the bandwidth is large, the intensity surface appears smooth and
local details are obscured
When the bandwidth is small, the intensity appears as local spikes
at event locations
No agreement on how to select the “best” bandwidth; options include
prior information about underlying spatial process
comparison of various bandwidths
using Mean Square Error (in R)
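The sensitivity to bandwidth can be seen by evaluating the same quartic estimate over a grid with a small and a large bandwidth. Everything here is an illustrative assumption: the 30 simulated events, the 100 x 100 area, the 5-unit grid spacing, and the two bandwidths (5 and 50). No edge correction is applied.

```r
# Quartic kernel estimate at location s, with no edge correction
quartic_intensity <- function(s, events, tau) {
  d <- sqrt((events[, 1] - s[1])^2 + (events[, 2] - s[2])^2)
  inside <- d <= tau
  sum(3 / (pi * tau^2) * (1 - d[inside]^2 / tau^2)^2)
}

set.seed(7)
events <- cbind(runif(30, 0, 100), runif(30, 0, 100))   # hypothetical events

# Grid of locations at which to evaluate the surface
grid <- as.matrix(expand.grid(x = seq(0, 100, by = 5), y = seq(0, 100, by = 5)))
surface <- function(tau) apply(grid, 1, quartic_intensity, events = events, tau = tau)

small <- surface(5)     # small bandwidth: spikes at event locations
large <- surface(50)    # large bandwidth: smooth surface, local detail lost

# The spiky surface varies far more from cell to cell than the smooth one
spread <- c(small = sd(small), large = sd(large))
```

Comparing `spread["small"]` with `spread["large"]` is a crude numeric stand-in for looking at the two maps side by side.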
Kernel estimation in R
Can be done in both splancs and spatstat
splancs = quartic kernel
spatstat = Gaussian kernel
Need to make a grid to “dump” kernel estimates into
The Sobj_SpatialGrid() function in maptools takes a maxDim=
argument, which indirectly controls the cell resolution
> sG <- Sobj_SpatialGrid(border, maxDim=400)$SG
> grd <- slot(sG, "grid")
> summary(grd)
Using splancs
> k0 <- spkernel2d(flinvxy, flbord, h0=400, grd)
> k1 <- spkernel2d(flinvxy, flbord, h0=600, grd)
> k2 <- spkernel2d(flinvxy, flbord, h0=800, grd)
> k3 <- spkernel2d(flinvxy, flbord, h0=1000, grd)
> df <- data.frame(k0=k0, k1=k1, k2=k2, k3=k3)
> kernels <- SpatialGridDataFrame(grd, data=df)
> summary(kernels)
> gp <- grey.colors(5, 0.9, 0.45, 2.2)
> print(spplot(kernels, at=seq(0,.00001,length.out=20),
col.regions=colorRampPalette(gp)(21)))
Using spatstat
> plot(density(flppp, sigma = 600))
G-function
Event      x      y   Nearest neighbor   rmin
    1  66.22  32.54                 10  25.59
    2  22.52  22.39                  4  15.64
    3  31.01  81.21                  5  21.14
    4   9.47  31.02                  8  24.81
    5  30.78  60.10                  3   9.00
    6  75.21  58.93                 10  21.14
    7  79.26   7.68                 12  21.94
    8   8.23  39.93                  4   9.00
    9  98.73  42.53                  6  21.94
   10  89.78  42.53                  6  21.94
   11  65.19  92.08                  6  34.63
   12  54.46   8.48                  7  24.81
[Figure: G(r), from 0 to 1, plotted against distance r, from 0 to 35]
$$G(r) = \frac{\#[\,r_{\min}(s_i) < r\,]}{n} = \frac{\text{\# point pairs where } r_{\min} < r}{\text{\# of points in study area}}$$
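As a worked check of the formula, G(r) can be computed in base R from the rmin column of the table, taken exactly as listed. The helper name `G_hat` and the four evaluation distances are choices made for this sketch.

```r
# rmin values copied from the table, in event order
r_min <- c(25.59, 15.64, 21.14, 24.81, 9.00, 21.14,
           21.94, 9.00, 21.94, 21.94, 34.63, 24.81)

# G(r): share of events whose nearest-neighbour distance is less than r
G_hat <- function(r, r_min) sapply(r, function(ri) mean(r_min < ri))

r <- c(10, 20, 30, 40)
G <- G_hat(r, r_min)   # 2/12, 3/12, 11/12, 12/12
```

Only two events have a neighbour within 10 units, but nearly all have one within 30, so G climbs steeply between r = 20 and r = 30.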
The shape of G-function tells us the way the events
are spaced in a point pattern
Clustered = G increases rapidly at short distance
Evenness = G increases slowly up to distance where
most events spaced, then increases rapidly
How do we examine significance (significant
departure from CSR)?
This is done in R!
[Figure: G(r), from 0 to 1, plotted against distance r]
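The idea behind the significance check can be sketched in base R: simulate many CSR patterns with the same number of events, compute G(r) for each, and see whether the observed G(r) falls outside the range of the simulations. This is a simplified illustration and not the spatstat procedure: the clustered "observed" pattern, the unit-square study area, and the 99 simulations are all hypothetical, and no edge correction is applied.

```r
set.seed(123)

# Nearest-neighbour distance for every event in an n x 2 coordinate matrix
nn_dist <- function(pts) {
  d <- as.matrix(dist(pts))
  diag(d) <- Inf
  apply(d, 1, min)
}

# Empirical G(r) at a vector of distances r
G_hat <- function(pts, r) {
  nn <- nn_dist(pts)
  sapply(r, function(ri) mean(nn < ri))
}

# Hypothetical clustered pattern: two tight clumps of 15 events in the unit square
obs <- rbind(cbind(rnorm(15, 0.3, 0.02), rnorm(15, 0.3, 0.02)),
             cbind(rnorm(15, 0.7, 0.02), rnorm(15, 0.7, 0.02)))
n <- nrow(obs)
r <- seq(0.01, 0.2, by = 0.01)

# G(r) for 99 simulated CSR patterns (rows = distances, columns = simulations)
sims <- replicate(99, G_hat(cbind(runif(n), runif(n)), r))
lo <- apply(sims, 1, min)   # lower pointwise envelope
hi <- apply(sims, 1, max)   # upper pointwise envelope

# TRUE where the observed G(r) exceeds every simulated value (evidence of clustering)
above <- G_hat(obs, r) > hi
```

Plotting `lo`, `hi`, and the observed curve against `r` gives the familiar envelope picture.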
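G estimate in R
> r=seq(0,350,by=50)   # distances of interest
> G <- envelope(flppp, Gest, nsim = 59, rank = 2)   # assumed call, not shown on the slide; mirrors the envelope() call used for L later on
> plot(G)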
F-function
Three steps:
1. Randomly select m points (p1, p2, …, pm)
2. Calculate dmin(pi, s) as the minimum distance from location pi
to any event in the point pattern s
3. Calculate F(d)
$$F(d) = \frac{\#[\,d_{\min}(p_i, s) < d\,]}{m} = \frac{\text{\# of sample points where } d_{\min} < d}{\text{\# sample points}}$$
[Figure: map of randomly chosen points, events in the study area, and the dmin distance from each random point to its nearest event (scale bar: 10 meters), next to a plot of F(r), from 0 to 1, against distance r, from 0 to 25]
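The three steps translate almost line for line into base R. The 25 simulated events, the 100 x 100 study area, and m = 200 sample points are hypothetical values used only for this sketch.

```r
set.seed(2010)

# Hypothetical events in a 100 x 100 study area
events <- cbind(runif(25, 0, 100), runif(25, 0, 100))

# Step 1: randomly select m points
m <- 200
p <- cbind(runif(m, 0, 100), runif(m, 0, 100))

# Step 2: distance from each sample point to its nearest event
d_min <- apply(p, 1, function(pt) {
  min(sqrt((events[, 1] - pt[1])^2 + (events[, 2] - pt[2])^2))
})

# Step 3: F(d) = share of sample points with d_min < d
F_hat <- function(d) sapply(d, function(di) mean(d_min < di))
d <- seq(0, 150, by = 5)
Fd <- F_hat(d)
```

`Fd` starts at 0 and reaches 1 once d exceeds the largest distance from any sample point to its nearest event.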
Clustered = F(r) rises slowly at first, but more
rapidly at longer distances
Evenness = F(r) rises rapidly at first, then slowly at longer
distances
Examine significance by simulating “envelopes”
[Figure: F(r), from 0 to 1, plotted against distance r, from 0 to 25]
F estimate in R
> r=seq(0,350,by=50)   # distances of interest
> F <- envelope(flppp, Fest, nsim = 59, rank = 2)   # assumed call, not shown on the slide; needed so that plot(F) has an object to draw
> plot(F)
     lty col key  label   meaning
obs    1   1 obs  obs(r)  observed value of F(r) for data pattern
theo   2   2 theo theo(r) theoretical value of F(r) for CSR
hi     3   3 hi   hi(r)   upper pointwise envelope of F(r) from simulations
lo     4   4 lo   lo(r)   lower pointwise envelope of F(r) from simulations
K function
Limitation of nearest neighbor distance method is that it
uses only nearest distance
Considers only the shortest scales of variation
K function (Ripley, 1976) uses more points
Provides an estimate of spatial dependence over a wider range
of scales
Based on all the distances between events in the study area
Assumes isotropy over the region
Defined as:
$$K(h) = \frac{1}{\lambda}\, E(\#\,\text{events within distance } h \text{ of a randomly chosen event})$$
$\lambda$ = the intensity of events (n/A)
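A naive estimate follows directly from the definition: average the number of other events within distance h of each event, then divide by the intensity. This base R sketch applies no edge correction, so it is not equivalent to the corrected estimators in spatstat or splancs. The function name `K_hat`, the four event coordinates, and the 10 x 10 region are hypothetical.

```r
# Naive estimate of K(h) with no edge correction
# pts: n x 2 matrix of event coordinates; area: area of the study region
K_hat <- function(h, pts, area) {
  n <- nrow(pts)
  lambda <- n / area            # intensity of events (n/A)
  d <- as.matrix(dist(pts))
  diag(d) <- Inf                # an event is not counted as its own neighbour
  # mean number of other events within h of an event, divided by lambda
  sapply(h, function(hi) sum(d <= hi) / n / lambda)
}

# Four hypothetical events at the corners of a unit square, inside a 10 x 10 region
pts <- rbind(c(4, 4), c(5, 4), c(4, 5), c(5, 5))
K <- K_hat(c(0.5, 1.2, 2), pts, area = 100)   # 0, 50, 75
```

At h = 1.2 each event has two neighbours (the diagonal neighbour is about 1.41 away), so the mean count is 2 and K = 2 / 0.04 = 50; at h = 2 all three neighbours are counted, giving 75.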
Interpreting K with L
This L-function is nothing more than a standardized
version of the K function
Transforms the K function so we can easily interpret it
$$\hat{L}(h) = \sqrt{\frac{\hat{K}(h)}{\pi}} - h$$
Compare it to 0
L(h) = 0 if point process is random
Peaks of positive values = clustering
Troughs of negative values = regularity
[Figure: L(h) plotted against radius h, with example curves labeled clustered, random, and uniform]
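The transformation is a one-liner in R. The sketch below uses the fact that K(h) = πh² under CSR, so L(h) comes out as 0. The "clustered" K values are simply twice the CSR values, a made-up input to show the sign of L rather than an estimate from data.

```r
# L transformation of a K estimate, as defined above
L_hat <- function(K, h) sqrt(K / pi) - h

h <- c(1, 2, 5)

# Under CSR, K(h) = pi * h^2, so L(h) should be 0 at every h
L_csr <- L_hat(pi * h^2, h)

# Hypothetical K twice as large as under CSR (more neighbours than expected)
L_clustered <- L_hat(2 * pi * h^2, h)   # positive at every h
```

Positive values of `L_clustered` correspond to the peaks that signal clustering, while `L_csr` sits at 0 up to rounding error.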
K function in R
> L <- envelope(flppp, Lest, nsim = 59, rank = 2, global=TRUE)
> L
Simultaneous critical envelopes for L(r)
Edge correction: “iso”
Obtained from 59 simulations of CSR
Significance level of Monte Carlo test: 1/60 = 0.0166667
Data: flppp
Entries:
id label description
-- ----- -----------
r r distance argument r
obs obs(r) observed value of L(r) for data pattern
theo theo(r) theoretical value of L(r) for CSR
lo lo(r) lower critical boundary for L(r)
hi hi(r) upper critical boundary for L(r)
> plot(L)