STAT 520 F 11 Final
GROUND RULES:
This exam contains two parts:
Part 1. Multiple Choice (50 questions, 1 point each)
Part 2. Problems/Short Answer (10 questions, 5 points each)
The maximum number of points on this exam is 100.
Print your name at the top of this page in the upper right hand corner.
IMPORTANT: Although not always stated, it is understood that {et} is a zero-mean
white noise process with var(et) = σ_e^2.
This is a closed-book and closed-notes exam. You may use a calculator if you wish.
Cell phones are not allowed.
Any discussion or otherwise inappropriate communication between examinees, as
well as the appearance of any unnecessary material, will be dealt with severely.
You have 3 hours to complete this exam. GOOD LUCK!
PART 1: MULTIPLE CHOICE. Circle the best answer. Make sure your answer is
clearly marked. Ambiguous responses will be marked wrong.
1. Which of the following processes is stationary?
(a) An MA(1) process with θ = 1.4
(b) Yt = 12.3 + 1.1Y_{t-1} + et
(c) IMA(1,1)
(d) Yt = β_0 + β_1 t + et
2. Which statement about an AR(2) process is always true?
(a) The process is invertible.
(b) The process is stationary.
(c) The theoretical ACF ρ_k = 0 for all k > 2.
(d) The theoretical PACF φ_kk decays exponentially or according to a sinusoidal pattern
as k gets large.
3. Suppose that we have observations from an MA(1) process with θ = 0.9. Which of
the following is true?
(a) The scatterplot of Yt versus Y_{t-1} will display a negative linear trend and the scatterplot of Yt versus Y_{t-2} will display a negative linear trend.
(b) The scatterplot of Yt versus Y_{t-1} will display a positive linear trend and the scatterplot of Yt versus Y_{t-2} will display a positive linear trend.
(c) The scatterplot of Yt versus Y_{t-1} will display a negative linear trend and the scatterplot of Yt versus Y_{t-2} will display a random scatter of points.
(d) The scatterplot of Yt versus Y_{t-1} will display a positive linear trend and the scatterplot of Yt versus Y_{t-2} will display a random scatter of points.
4. Suppose that we have observed the time series Y1, Y2, ..., Yt. If the forecast error
et(l) = Y_{t+l} - Ŷt(l) has mean zero, then we say that the MMSE forecast Ŷt(l) is
(a) stationary.
(b) unbiased.
(c) consistent.
(d) complementary.
5. What did we discover about the method of moments procedure when estimating
parameters in ARIMA models?
(a) The procedure gives reliable results when the sample size n > 100.
(b) The procedure gives unbiased estimates.
(c) The procedure should not be used when models include AR components.
(d) The procedure should not be used when models include MA components.
10. If you were going to overfit this model for diagnostic purposes, which two models
would you fit?
(a) IMA(1,1) and ARMA(2,2)
(b) ARMA(1,1) and AR(2)
(c) ARMA(1,2) and ARI(1,1)
(d) IMA(1,1) and ARI(2,1)
11. We used R to generate a white noise process. We then calculated first differences
of this white noise process. What would you expect the sample ACF rk of the first differences to look like?
(a) Most of the rk estimates should be close to zero, possibly with the exception of a
small number of estimates which exceed the white noise bounds when k is larger.
(b) The rk estimates will decay very, very slowly across k.
(c) The r1 estimate should be close to -0.5 and all other rk estimates, k > 1, should be
small.
(d) It is impossible to tell unless we specify a distributional assumption for the white
noise process (e.g., normality).
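Note: a minimal R sketch (illustrative object names, not part of the exam) of the experiment described in Question 11:

set.seed(520)    # arbitrary seed for reproducibility
e <- rnorm(200)  # simulated zero-mean white noise
d <- diff(e)     # first differences e_t - e_{t-1}; an MA(1) process with theta = 1
acf(d)           # sample ACF r_k of the first differences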
12. True or False. The use of an intercept term θ_0 has the same effect in stationary
and nonstationary ARIMA models.
(a) True
(b) False
13. The augmented Dickey-Fuller unit root test can be used to test for
(a) normality.
(b) independence.
(c) stationarity.
(d) invertibility.
14. An observed time series displays a clear upward linear trend. We fit a straight-line
regression model to remove this trend, and we notice that the residuals from the straight-line
fit are stationary in the mean level. What should we do next?
(a) Search for a stationary ARMA process to model the residuals.
(b) Perform a Shapiro-Wilk test.
(c) Calculate the first differences of the residuals and then consider fitting another regression model to them.
(d) Perform a t-test for the straight line slope estimate.
15. Excluding the intercept θ_0 and the white noise variance σ_e^2, which model has the largest
number of parameters?
(a) ARIMA(1,1,1) × (2,0,1)_12
(b) ARMA(3,3)
(c) ARMA(1,1) × (1,2)_4
(d) ARIMA(2,2,3)
16. In performing diagnostics for an ARMA(1,1) model fit, I see the following output in
R:
> runs(rstandard(data.arma11.fit))
$pvalue
[1] 0.27
How do I interpret this output?
(a) The standardized residuals seem to be well modeled by a normal distribution.
(b) The standardized residuals are not well represented by a normal distribution.
(c) The standardized residuals appear to be independent.
(d) We should probably consider a model with either p > 1 or q > 1 (or both).
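Note: a sketch of commands that could produce output like the above, assuming a hypothetical observed series y and the TSA package (which supplies runs() and an rstandard() method for ARIMA fits):

library(TSA)
data.arma11.fit <- arima(y, order = c(1, 0, 1))  # fit the ARMA(1,1) model to y
runs(rstandard(data.arma11.fit))                 # runs test applied to the standardized residuals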
17. True or False. If {Yt} is a stationary process, then {∇Yt} must be stationary.
(a) True
(b) False
18. A 95 percent confidence interval for the Box-Cox transformation parameter λ
is (0.77, 1.41). Which transformation is appropriate?
(a) Square root
(b) Square
(c) Log
(d) Identity (no transformation)
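Note: an interval of this kind can be read off the profile log-likelihood of λ; a minimal R sketch, assuming a positive-valued series y and the TSA package:

library(TSA)
BoxCox.ar(y)  # plots the profile log-likelihood of lambda and marks a 95% confidence interval

Recall that λ = 1 corresponds to no transformation, λ = 0.5 to the square root, and λ = 0 to the log.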
20. If an AR(1)_12 model is the correct model for a data set, which model is also mathematically correct?
(a) AR(1)
(b) AR(12)
(c) AR(11)
(d) ARMA(1,12)
21. In Chapter 3, we discussed the use of regression to detrend a time series. Two models
used to handle seasonal trends were the cosine-trend model and the seasonal means
model. Which statement is true?
(a) With monthly data (so that the number of seasonal means is 12), the cosine trend
model is more parsimonious.
(b) Standardized residuals from the cosine trend model fit will be normally distributed
if the process is truly sinusoidal.
(c) Differencing should be used before fitting a seasonal means model.
(d) All of the above are true.
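Note: a sketch of how the two seasonal regressions might be fit in R for a monthly series y of class ts, using helpers from the TSA package (object names are illustrative):

library(TSA)
har <- harmonic(y, 1)           # first-harmonic cosine and sine regressors
cosine.fit <- lm(y ~ har)       # cosine trend: intercept plus two trigonometric coefficients
month <- season(y)              # factor with one level per calendar month
means.fit <- lm(y ~ month - 1)  # seasonal means: twelve month-specific means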
22. When we used least squares regression to fit the deterministic trend regression model
Yt = β_0 + β_1 t + Xt in Chapter 3, the only assumption we needed for the least squares
estimators β̂_0 and β̂_1 to be unbiased was that
(a) {Xt } has constant variance.
(b) {Xt } is a white noise process.
(c) {Xt } has zero mean.
(d) {Xt } is normally distributed.
23. If {Yt} is an ARI(1,1) process, then what is the correct model for {∇Yt}?
(a) Random walk with drift
(b) AR(1)
(c) IMA(2,1)
(d) None of the above
24. For a stationary ARMA(p, q) process, when the sample size n is large, the sample
autocorrelations rk
(a) follow an MA(1) process.
(b) are approximately normal.
(c) are likely not to be statistically significantly different from zero.
(d) have variances which decrease as k gets larger.
25. What is the main characteristic of an AR(1) process with parameter φ = 0.2?
(a) The mean of the process is equal to 0.2.
(b) The variance of the process is equal to (0.2)^2 = 0.04.
(c) The autocorrelation function ρ_k exhibits a slow decay across lags.
(d) None of the above.
28. In class, we proved that the autocorrelation function for a zero-mean random
walk Yt = Y_{t-1} + et is equal to
corr(Yt, Y_{t+k}) = √(t/(t + k)),
for k = 1, 2, .... Which of the following statements is true?
(a) This process is stationary.
(b) The variance of this process approaches γ_0 = 1.
(c) corr(Y1 , Y2 ) is larger than corr(Y99 , Y100 ).
(d) None of the above.
30. We used the Bonferroni criterion for judging a standardized residual (from an
ARIMA model fit) as an outlier. What essentially does this mean?
(a) We look at the mean of each residual and take the largest one as an outlier.
(b) Each residual is compared to z_0.025 ≈ 1.96, and all those beyond 1.96 (in absolute
value) are declared outliers.
(c) We perform an intervention analysis and determine if the associated parameter estimate is significant.
(d) None of the above.
35. The length of a prediction interval for Y_{t+l} computed from fitting a stationary
ARMA(p, q) model generally
(a) increases as l increases.
(b) decreases as l increases.
(c) becomes constant for l sufficiently large.
(d) tends to zero as l increases.
38. True or False. In a stationary ARMA(p, q) model, maximum likelihood estimators of model parameters (i.e., the φ's and the θ's) are approximately normal in large
samples.
(a) True
(b) False
41. You have an observed time series that has clear nonconstant variance and a sharp
linear trend over time. What should you do?
(a) Display the ACF, PACF, and EACF of the observed time series to look for candidate
models.
(b) Split the data set up into halves and then fit a linear regression model to each part.
(c) Try differencing the series first and then try a transformation to stabilize the variance.
(d) Try a variance-stabilizing transformation first and then use differencing to remove
the trend.
42. During lecture, I recounted a true story where I had asked a fortune teller (in the
French Quarter) to comment on the precision of her predictions about my future. In
what city did this story take place?
(a) New Orleans
(b) Nome
(c) Nairobi
(d) Neverland
46. What technique did we use in class to simulate the sampling distribution of sample
autocorrelations and method of moments estimators?
(a) Monte Carlo
(b) Bootstrapping
(c) Jackknifing
(d) Backcasting
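Note: a minimal Monte Carlo sketch in R, approximating the sampling distribution of the lag-1 sample autocorrelation r1 from an AR(1) process with φ = 0.5 (all values here are illustrative):

set.seed(1)
r1 <- replicate(1000, {
  y <- arima.sim(model = list(ar = 0.5), n = 60)  # simulate one AR(1) realization
  acf(y, plot = FALSE)$acf[2]                     # element 2 of $acf is the lag-1 value
})
hist(r1)  # Monte Carlo approximation to the sampling distribution of r1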
47. True or False. In a stationary ARMA(p, q) process, the MMSE forecast Ŷt(l) depends on the MA components only when l ≤ q.
(a) True
(b) False
PART 2: PROBLEMS/SHORT ANSWER. Show all of your work and explain all
of your reasoning to receive full credit.
1. Suppose that {Yt} is an MA(1)_4 process with mean μ, that is,
Yt = μ + et - Θ e_{t-4},
where {et} is a zero-mean white noise process with var(et) = σ_e^2.
(a) Find μ_t = E(Yt) and γ_0 = var(Yt).
(b) Show that {Yt } is (weakly) stationary.
2. Suppose that {et} is zero-mean white noise with var(et) = σ_e^2. Consider the model
Yt = 0.5Y_{t-1} + et - 0.2e_{t-1} - 0.15e_{t-2}.
(a) Write this model using backshift notation.
(b) Determine whether this model is stationary and/or invertible.
(c) Identify this model as an ARIMA(p, d, q) process; that is, specify p, d, and q.
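Note: the stationarity and invertibility conditions in part (b) involve the roots of the AR and MA characteristic polynomials; a numerical check in R (a companion to, not a substitute for, the algebra asked for above):

Mod(polyroot(c(1, -0.5)))         # roots of 1 - 0.5x; all moduli > 1 means the AR part is stationary
Mod(polyroot(c(1, -0.2, -0.15)))  # roots of 1 - 0.2x - 0.15x^2; all moduli > 1 means the MA part is invertible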
3. Suppose that {et} is zero-mean white noise with var(et) = σ_e^2. Consider the deterministic trend model
Yt = β_0 + β_1 t + Xt,
where Xt = X_{t-1} + et - θ e_{t-1}.
(a) Derive an expression for ∇Yt.
(b) What is the name of the process identified by {∇Yt}? Is {∇Yt} stationary?
4. Explain how the sample ACF, PACF, and EACF can be used to specify the orders p
and q of a stationary ARMA(p, q) process.
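Note: in R these three summaries are produced directly; a minimal sketch for a hypothetical stationary series y (eacf() is from the TSA package):

library(TSA)
acf(y)   # sample ACF; cuts off after lag q for a pure MA(q) process
pacf(y)  # sample PACF; cuts off after lag p for a pure AR(p) process
eacf(y)  # extended ACF table; the corner of the triangle of O's suggests (p, q) for mixed models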
5. Suppose that {et} is zero-mean white noise with var(et) = σ_e^2. Consider the random
walk with drift model
Yt = θ_0 + Y_{t-1} + et.
We have observed the data Y1, Y2, ..., Yt.
(a) Show that the minimum mean squared error (MMSE) forecast of Y_{t+1} is equal to
Ŷt(1) = θ_0 + Yt.
(b) Show that the MMSE forecast of Y_{t+l} satisfies
Ŷt(l) = θ_0 + Ŷt(l - 1),
for all l > 1.
6. In class, we looked at the number of homeruns hit by the Boston Red Sox each year
during 1909-2010. Denote this process by {Yt }.
[Figure: time series plot of the annual Red Sox homerun counts, 1909-2010 (Year on the horizontal axis, Number of homeruns on the vertical axis).]
7. I have displayed below the tsdiag output from fitting the model in Question 6 to the
Boston Red Sox homerun data.
[Figure: tsdiag output with three panels: the standardized residuals plotted over time (1920-2000), the sample ACF of the residuals, and Ljung-Box p-values plotted against the number of lags.]
I have also performed the Shapiro-Wilk and runs tests for the standardized residuals; see
the R output below:
> shapiro.test(rstandard(homerun.fit))
W = 0.9884, p-value = 0.5256
> runs(rstandard(homerun.fit))
$pvalue
[1] 0.378
Based on the information available, what do you think of the adequacy of the model fit
in Question 6? Use the back of this page if you need extra space.
8. Recall the TB data from our midterm; these data are the number of TB cases (per
month) in the United States from January 2000 to December 2009. Denote this process
by {Yt }. On the midterm, you used regression methods to detrend the data. On this
problem, we will use a SARIMA modeling approach. In the figure below, the TB data
are on the left, and the combined first-differenced data (1 - B)(1 - B^12)Yt are on the
right.
[Figure: two panels with points labeled by month. Left: the monthly TB counts, 2000-2010 (Year on the horizontal axis). Right: the combined first differences (1 - B)(1 - B^12)Yt over the same period (Year on the horizontal axis).]
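Note: the combined first differences shown in the right panel can be computed in R with nested diff() calls; a minimal sketch, assuming the monthly series is stored as a ts object named tb:

tb.diff <- diff(diff(tb, lag = 12))  # seasonal difference at lag 12, then a regular first difference
plot(tb.diff, xlab = "Year")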
9. As promised, I used R to fit the model stated in Question 8 (using maximum likelihood). Here is the output:
> tb.fit
Coefficients:
          ma1     sma1
      -0.9182  -0.4406
s.e.   0.0482   0.1098

sigma^2 estimated as 3436:  aic = 1183.63
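Note: the output reports only an ma1 and an sma1 coefficient, which is consistent with an ARIMA(0,1,1) × (0,1,1)_12 fit; since the model statement from Question 8 is not reproduced in this copy, the exact call below is an assumption:

tb.fit <- arima(tb, order = c(0, 1, 1),
                seasonal = list(order = c(0, 1, 1), period = 12),
                method = "ML")
tb.fit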
10. The model I fit to the TB data in Question 9 has been declared a very good model
(after doing thorough residual diagnostics and overfitting). In the figure below, I have
displayed the MMSE forecasts (with 95 percent prediction limits) for the next 3 years
(Jan 2010 through Dec 2012).
[Figure: the observed TB counts with MMSE forecasts and 95 percent prediction limits appended for Jan 2010 through Dec 2012 (Year on the horizontal axis, 2007-2013; Number of TB cases on the vertical axis).]
(a) A 95 percent prediction interval for the TB count in January 2012 is (527.7, 863.5).
Interpret this interval.
(b) Which assumption on the white noise terms {et } is crucial for the prediction interval
in part (a) to be valid?
(c) Careful inspection reveals that the prediction limits in the figure above tend to widen
as the lead time increases. Why is this true?