May 2021 Examination Diet School of Mathematics & Statistics MT4537

MAY 2021 EXAMINATION DIET

SCHOOL OF MATHEMATICS & STATISTICS


MODULE CODE: MT4537
MODULE TITLE: Spatial Statistics
EXAM DURATION: 2 hours
EXAM INSTRUCTIONS: Attempt ALL questions.
The number in square brackets shows the
maximum marks obtainable for that
question or part-question.
Your answers should contain the full
working required to justify your solutions.

INSTRUCTIONS FOR ONLINE EXAMS:

Each page of your solution must have the page number, module code, and your student
ID number at the top of the page. You must make sure all pages of your solutions are
clearly legible.

MT4537 May 2021, Page 1 of 11


1. Summary characteristics

(a) Describe the spherical contact distribution function and how it is derived
from the void probability for a point process. Why is it useful? You need
only consider the $\mathbb{R}^2$ case. [2]

(b) Figure 1 shows the plot of the estimated pair correlation function for an
unseen point pattern. What can you infer about the point pattern from the
shape of the curve? [2]

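[Plot not reproduced: pair correlation function g(r) against r, with r from 0.00 to 0.25 and g(r) from 0.0 to 1.5. Curves in the legend: the Ripley-corrected estimate, the translation-corrected estimate, and the Poisson reference g_Pois(r).]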

Figure 1: Function for unseen point pattern

(c) Explain the term hard-core distance in this context, estimate it from Figure 1,
and explain why such distances may occur in reality. [2]

(d) The pair correlation function is strongly related to Ripley’s K-function. Provide
details of this relationship; justify why 1 is an important reference number;
and state the main advantage of using the pair correlation function over
the K-function. [3]

(e) Suggest a model class that may be chosen to model a pattern with a pair
correlation function shaped like the one in Figure 1. Justify your choice. [2]

(f) What theoretical K function describes the CSR situation (in $\mathbb{R}^2$)? Give its
form and explain why this is so. [2]

(g) Describe how the weighting for an isotropic edge correction is calculated,
for a pair of events $x_1$ and $x_2$ with associated window $W$ in $\mathbb{R}^2$. Show its
rationale via a sketch. [2]

2. Thinning and associated simulation

(a) A type of flower in a field is sampled in three ways: (1) by taking an aerial
photo of the field and counting the flowers in the photo (some of which are
likely to be missed because they are so small), (2) doing the same, but on
a misty day when mist might make flowers in some parts of the field more
difficult to see than others, and (3) a surveyor stands in the middle of the
field and locates all the flowers she can see. Say whether p-, p(x)- or P(x)-thinning
would be most appropriate to model the observed flower locations
in each case, and explain your answer. [3]

(b) What is the relationship between the K function for a homogeneous Poisson
point process, and that of a p-thinned variant? Explain the relationship. [2]

(c) Consider the flower scenario above, in which a surveyor searches from the
middle of the field. Suppose that the probability of her seeing a flower is
given by $\exp(-r^2/\sigma^2)$, where r is radial distance from the surveyor. If the
intensity of flowers in the field is λ throughout the field, what is the apparent
intensity at a point that is a distance r from the surveyor? How would you
simulate this using thinning? [3]
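
As an illustration only, the surveyor scenario could be sketched in base R roughly as follows. The numeric values of `lambda` and `sigma`, the square field centred on the surveyor, and the helper name `detect_prob` are all assumptions made for this sketch; none of them come from the question.

```r
# Sketch: independent thinning with a distance-dependent retention probability.
# lambda, sigma and the square window are illustrative assumptions.
set.seed(1)

lambda <- 200   # assumed flower intensity (points per unit area)
sigma  <- 0.3   # assumed scale of the detection function
half   <- 0.5   # assumed field: square [-half, half]^2, surveyor at the origin

# Step 1: simulate the complete pattern as a homogeneous Poisson process.
area <- (2 * half)^2
n    <- rpois(1, lambda * area)
x    <- runif(n, -half, half)
y    <- runif(n, -half, half)

# Step 2: detection probability exp(-r^2 / sigma^2) at radial distance r.
detect_prob <- function(r, sigma) exp(-r^2 / sigma^2)
r <- sqrt(x^2 + y^2)
p <- detect_prob(r, sigma)

# Step 3: retain each flower independently with its own probability p.
keep     <- runif(n) < p
observed <- data.frame(x = x[keep], y = y[keep])
```

Plotting `observed` over the full pattern (`x`, `y`) should show detections concentrated near the origin and becoming sparse towards the edges of the assumed field.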

3. Neyman-Scott processes

(a) What are the main properties/assumptions that define a Neyman-Scott pro-
cess? [2]

(b) The Matérn and Thomas processes are Neyman-Scott processes. Describe
each of these, including their governing parameters. Further comment on
their applicability to real-world situations. [4]

4. Gibbs processes

(a) State the density of a Gibbs process with a fixed number of points and explain
its components. How does this create regularity in a point pattern? [2]

(b) The probability density for a Strauss process may be expressed as:

$$f(x_1, \ldots, x_n) = \alpha \, \beta^{n(x)} \, \gamma^{s(x)}$$

where $x_1, \ldots, x_n$ are the points of the pattern, $\alpha$ is a normalising constant, $n(x)$ is the
number of points, and $s(x)$ is the number of point pairs that lie within $r$ units of
one another ($r$ being the interaction radius). Here $\gamma$ is the interaction parameter,
with $0 \le \gamma \le 1$. Explain how altering $\gamma$ can make this process CSR at one
extreme, or a Gibbs hard-core process at the other. [3]
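
For concreteness, the unnormalised part of the density above (everything except the constant α) could be evaluated for a small pattern as in the base R sketch below. The function name `strauss_unnormalised`, the three example points and the parameter values are hypothetical choices for illustration only.

```r
# Hypothetical helper: beta^n(x) * gamma^s(x) for a pattern stored as an
# n x 2 matrix of coordinates. The normalising constant alpha is omitted.
strauss_unnormalised <- function(pts, beta, gamma, r) {
  n <- nrow(pts)
  d <- as.matrix(dist(pts))          # pairwise Euclidean distances
  s <- sum(d[upper.tri(d)] <= r)     # s(x): number of pairs within r of each other
  beta^n * gamma^s
}

# Illustrative pattern: the first two points are 0.05 apart, the third is far away.
pts <- rbind(c(0.00, 0.00),
             c(0.05, 0.00),
             c(0.50, 0.50))

# With r = 0.1 exactly one pair is "close", so the value is beta^3 * gamma^1.
value <- strauss_unnormalised(pts, beta = 2, gamma = 0.5, r = 0.1)
print(value)
```

Re-running the call with different values of `gamma` between 0 and 1 shows how strongly each close pair is penalised.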

5. Consider a Cox process as an inhomogeneous Poisson process with random intensity
function Λ(x).

(a) Why does a Gaussian Random Field not generally serve as a driving intensity
for a point process? [1]

(b) A log Gaussian Cox process is a specific type of Cox process. How is it
defined? [2]

6. Geostatistical data

(a) Describe a variogram in the context of spatial dependence on a spatial random
field Z(x). You can assume the field is stationary and isotropic. Explain
the variogram in a real-world context. [1]

(b) What is the difference between ordinary and universal kriging? Briefly com-
ment on how this can relate to spline regression. [2]

(c) Pages 6 to 10 present outputs from models applied to data measuring heavy
metals in top-soil on a flood plain beside the river Meuse. There is location
information for the samples (x, y in metres) and several covariates - one
of which is considered here: dist, the normalised distance from the river.
Three models have been fitted to this data in order to predict, and explain,
the levels of zinc in the top-soil.

(i) Describe the structure of each of the models. [3]

(ii) Interpret the outputs for each model. [3]

(iii) Referring to model 2, briefly describe what sort of smooth s(x, y) is,
and how its complexity is controlled. [2]

(iv) What is the fundamental difference in the way spatial information
is used in models 2 and 3? Describe the philosophical difference in
approach. [2]

# Model 1
model1 <- autoKrige(log(zinc)~dist, meuse, meuse.grid, model = "Sph")
plot(model1)

[Plots not reproduced. Panels: "Kriging prediction" (colour scale roughly 4.0 to 7.5); "Kriging standard error" (colour scale roughly 0.35 to 0.50); "Experimental variogram and fitted variogram model" (semi-variance, roughly 0.05 to 0.30, against distance, 500 to 1500, with each point labelled by its number of pairs, from 17 to 1355).]

Fitted variogram annotation:
Model: Sph
Nugget: 0.07
Sill: 0.28
Range: 724

Figure 2: Model 1 outputs
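
As a rough aid to reading the variogram annotation (Model: Sph, Nugget: 0.07, Sill: 0.28, Range: 724), the base R sketch below evaluates a spherical semivariogram at a few distances. It assumes that the reported sill is the total sill (nugget plus partial sill) and that distances are in the units of the coordinates; the function name `spherical_semivariance` and the chosen distances are hypothetical.

```r
# Hypothetical helper: spherical semivariogram with a nugget effect.
# Assumes `sill` is the TOTAL sill, so the partial sill is sill - nugget.
spherical_semivariance <- function(h, nugget, sill, vrange) {
  psill  <- sill - nugget
  scaled <- pmin(h / vrange, 1)      # the curve is flat beyond the range
  g      <- nugget + psill * (1.5 * scaled - 0.5 * scaled^3)
  ifelse(h == 0, 0, g)               # the semivariance is zero at zero distance
}

# Evaluate at a few illustrative distances using the annotated parameters.
h       <- c(0, 100, 362, 724, 1500)
gamma_h <- spherical_semivariance(h, nugget = 0.07, sill = 0.28, vrange = 724)
print(round(gamma_h, 4))
```

Under these assumptions the curve starts just above the nugget for small positive distances, rises smoothly, and levels off at the sill once the distance reaches the range.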

# Model 2
> model2 <- gam(log(zinc) ~ s(dist) + s(x, y), data = meuse)
> summary(model2)

Family: gaussian
Link function: identity

Formula:
log(zinc) ~ s(dist) + s(x, y)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.88578    0.02644   222.6   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
           edf Ref.df     F  p-value
s(dist)  3.753  4.711 5.750 0.000131 ***
s(x,y)  23.407 26.723 4.034 1.37e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) = 0.792 Deviance explained = 82.9%
GCV = 0.1324 Scale est. = 0.10835 n = 155

[Plots not reproduced. Panels, each titled "Model 2": the estimated smooth s(dist, 3.75) against dist (0.0 to 0.8, vertical axis roughly -1.5 to 1.0); a panel over x (178500 to 181500) and y (330000 to 333000), presumably the spatial smooth s(x, y); and "GAM predictions", a map over the same region with a "predicted" colour scale from 5.0 to 7.0.]

Figure 3: Model 2 outputs

# Model 3
> model3 <- gamm(log(zinc) ~ s(dist), correlation = corExp(form = ~ x + y), data = meuse)
> summary(model3$lme)
Linear mixed-effects model fit by maximum likelihood
Data: strip.offset(mf)
       AIC      BIC    logLik
  160.5266 175.7437 -75.26329

Random effects:
Formula: ~Xr - 1 | g
Structure: pdIdnot

....

Correlation Structure: Exponential spatial correlation
 Formula: ~x + y | g
 Parameter estimate(s):
   range
122.0032
Fixed effects: y.0 ~ X - 1
                 Value  Std.Error  DF  t-value p-value
X(Intercept)  5.848174 0.06091025 153 96.01296   0e+00
Xs(dist)Fx1  -0.861074 0.24072381 153 -3.57702   5e-04
 Correlation:
            X(Int)
Xs(dist)Fx1 0.006

....

> summary(model3$gam)

Family: gaussian
Link function: identity

Formula:
log(zinc) ~ s(dist)

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.84817    0.06071   96.33   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
          edf Ref.df     F p-value
s(dist) 4.038  4.038 38.93  <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) = 0.655
Scale est. = 0.17914 n = 155

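[Plots not reproduced. Panels, each titled "Model 3": the estimated smooth s(dist, 4.04) against dist (0.0 to 0.8, vertical axis roughly -1.5 to 1.5); and "GAM predictions", a map over x (179000 to 181000) and y (330000 to 333000) with a "predicted" colour scale from 5.0 to 6.5.]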

Figure 4: Model 3 outputs

END OF PAPER
