GDP Forecasting Using Time Series Analysis
Course Project
MTH517 Time Series Analysis
IIT Kanpur
Contents
1 Introduction
1.1 GDP
1.1.1 Nominal GDP
1.1.2 Real GDP
1.1.3 Components of GDP
2 Mathematical Background
2.1 Time Series
2.1.1 Related Terminology
2.2 ARIMA Model
2.2.1 Criteria for choosing order
2.3 Holt-Winters Seasonal Smoothing
2.4 Augmented Dickey-Fuller Test
2.5 Kwiatkowski Phillips Schmidt Shin (KPSS) Test
2.6 Ljung Box Test
4 Conclusion
5 Acknowledgement
1 Introduction
1.1 GDP
Gross domestic product (GDP) is a monetary measure of the market value of all final goods and services
produced in a specific period (quarterly or yearly). Governments and businesses use GDP forecasts to help
determine their strategy, multi-year plans, and budgets for the upcoming year. Under the expenditure
approach, GDP is the sum of its components:
Y = C + I + G + (X - M)
where
C (consumption) consists of private expenditures in the economy, such as durable goods, nondurable goods,
and services. Examples include food, rent, jewelry, gasoline, and medical expenses, but not the purchase of
new housing.
I (investment) includes business investment in equipment, but does not include exchanges of existing
assets. Examples include construction of a new mine, purchase of software, or purchase of machinery and
equipment for a factory.
G (government spending) is the sum of government expenditures on final goods and services. It includes
salaries of public servants, purchases of weapons for the military and any investment expenditure by a
government. It does not include any transfer payments, such as social security or unemployment benefits.
X (exports) represents gross exports. GDP captures the amount a country produces, including goods and
services produced for other nations' consumption.
M (imports) represents gross imports. Imports are subtracted since imported goods will be included in
the terms G, I, or C, and must be deducted to avoid counting foreign supply as domestic.
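The identity above can be checked with a small worked example. This is only a sketch: the function name `gdp_expenditure` and all figures are made up for illustration and do not come from the dataset used in this report.

```python
def gdp_expenditure(consumption, investment, government, exports, imports):
    """Return GDP by the expenditure approach: Y = C + I + G + (X - M)."""
    return consumption + investment + government + (exports - imports)


# Hypothetical figures in arbitrary currency units (not real data).
y = gdp_expenditure(consumption=600.0, investment=250.0, government=200.0,
                    exports=150.0, imports=180.0)
print(y)  # 1020.0
```

Because imports exceed exports in this example, net exports are negative and reduce Y below C + I + G.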
2 Mathematical Background
2.1 Time Series
A time series is a collection of observations recorded sequentially over a period of time (i.e. each observation
is recorded along with its time stamp), represented as (t, X_t). X_t may be univariate (a single variable) or
multivariate (a collection of variables).
Akaike Information Criterion (AIC)
AIC = -2 log(L) + 2k
where L is the maximized likelihood and k is the number of parameters in the model being fitted to the
data (p + q + 1).
Bayesian Information Criterion (BIC)
BIC = -2 log(L) + k log(n), with n the number of observations.
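Both criteria can be computed directly from a model's maximized log-likelihood, taking the standard forms AIC = -2 log(L) + 2k and BIC = -2 log(L) + k log(n). The sketch below is illustrative only. The inputs are assumptions: the log-likelihood is back-computed from the AIC reported later in this report for p = q = 0, and n = 55 assumes that the 57 yearly observations lose two points to second-order differencing.

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion: -2 log(L) + 2k."""
    return -2.0 * log_likelihood + 2.0 * k


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: -2 log(L) + k log(n)."""
    return -2.0 * log_likelihood + k * math.log(n)


# Assumed inputs (see the text above): k = p + q + 1 = 1 for p = q = 0.
log_lik, k, n = 99.0218, 1, 55
print(round(aic(log_lik, k), 4))     # -196.0436
print(round(bic(log_lik, k, n), 4))  # -194.0363
```

Since log(n) exceeds 2 once n is larger than about 7, the BIC penalizes each additional parameter more heavily than the AIC and tends to favor smaller models.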
2.3 Holt-Winters Seasonal Smoothing
The Holt-Winters method uses the following quantities:
ŷ_{x+m} is the forecasted value m points into the future.
l_x, the level, is the expected value of the x-th data point.
b_x is the trend or slope.
L is the season length.
α is the smoothing coefficient for the series data points.
β is the trend factor or coefficient.
γ is the smoothing factor for the seasonal component.
Initialize s_0 = x_0 and
b_0 = \frac{1}{L} \left( \frac{x_{L+1} - x_1}{L} + \frac{x_{L+2} - x_2}{L} + \frac{x_{L+3} - x_3}{L} + \dots + \frac{x_{L+L} - x_L}{L} \right)
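The initial trend is the average per-step change between corresponding points of the first two seasons. A minimal sketch follows, assuming the series is held in a Python list; the name `initial_trend` and the toy data are hypothetical. The code uses 0-based indexing, so x_1 in the formula corresponds to `series[0]`.

```python
def initial_trend(series, season_length):
    """Initial trend b0 = (1/L) * sum over i = 1..L of (x_{L+i} - x_i) / L."""
    L = season_length
    if len(series) < 2 * L:
        raise ValueError("need at least two full seasons of data")
    total = 0.0
    for i in range(L):
        # Per-step change between matching points of seasons one and two.
        total += (series[i + L] - series[i]) / L
    return total / L


# Toy series with season length 4 (made-up numbers, not the GDP data).
x = [10.0, 12.0, 14.0, 11.0, 14.0, 16.0, 18.0, 15.0]
print(initial_trend(x, 4))  # 1.0
```

Here every point in the second season is 4 units above its counterpart in the first, so the estimated trend is 4 / 4 = 1 unit per step.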
2.4 Augmented Dickey-Fuller Test
The Augmented Dickey-Fuller (ADF) test tests the null hypothesis that a unit root is present in a time series
sample. The alternative hypothesis is usually stationarity or trend-stationarity, depending on which version of
the test is used.
The intuition behind the test is as follows. If the series X_t is stationary (or trend-stationary), then it has a
tendency to return to a constant (or deterministically trending) mean. Therefore large values will tend to be
followed by smaller values (negative changes), and small values by larger values (positive changes). Accordingly,
the level of the series will be a significant predictor of the next period's change, and will have a negative
coefficient.
The null is usually rejected when the p-value is less than or equal to a specified significance level, often 0.05
(5%).
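The intuition can be illustrated by regressing the change X_t - X_{t-1} on the lagged level X_{t-1} and inspecting the sign of the slope. The sketch below is not the full ADF test: it omits the lagged difference terms, the test statistic, and the critical values. The function name `df_slope` and both toy series are made up for illustration.

```python
def df_slope(series):
    """OLS slope (with intercept) of the change x[t] - x[t-1] on the level x[t-1]."""
    lagged = series[:-1]
    delta = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(delta) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, delta))
    return sxy / sxx


# Mean-reverting toy series: each value is half the previous one.
reverting = [8.0, 4.0, 2.0, 1.0, 0.5]
# Unit-root toy series with drift: each value is the previous one plus 1.
drifting = [1.0, 2.0, 3.0, 4.0, 5.0]

print(df_slope(reverting))  # -0.5
print(df_slope(drifting))   # 0.0
```

A clearly negative slope means the level predicts the next change, which is evidence against a unit root. A slope near zero is consistent with the unit-root null.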
3 Experiments, Observations and Conclusions
3.1 Dataset
We work with yearly real GDP data for India in local currency units (rupees) for the period 1960-2016.
Because the GDP values are very large, we work with the natural logarithm (log_e) of the GDP values.
Figure 2: Comparison of the HW predicted and actual values of log GDP
Figure 3: Forecasted Values of log GDP
To compute confidence intervals (CI), the Holt-Winters method assumes that the residuals are uncorrelated
and normally distributed with zero mean and constant variance. To test this, we plot several diagnostic
graphs and use the Box-Ljung test.
The residual plot suggests a zero mean and constant variance.
The Box-Ljung test gives the following results:
X-squared = 24.382, df = 20, p-value = 0.2261
Since the p-value exceeds 0.05, we fail to reject the null hypothesis of no autocorrelation, suggesting that
the residuals are uncorrelated.
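The Box-Ljung statistic is Q = n(n + 2) Σ_{k=1}^{h} ρ_k² / (n - k), where ρ_k is the sample autocorrelation of the residuals at lag k and h is the number of lags tested. The sketch below computes Q only. The function names and the toy residuals are hypothetical, and the p-value step (comparing Q with a chi-square distribution with h degrees of freedom) is left out.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(x, max_lag):
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..h} rho_k^2 / (n - k)."""
    n = len(x)
    total = sum(autocorrelation(x, k) ** 2 / (n - k)
                for k in range(1, max_lag + 1))
    return n * (n + 2) * total


# Strongly alternating toy residuals (made-up), so lag-1 correlation is large.
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(ljung_box_q(residuals, 1))
```

For the toy residuals, ρ_1 = -0.875 and Q works out to 8.75, which would indicate autocorrelation. For the test reported above, the 5% critical value of the chi-square distribution with 20 degrees of freedom is about 31.41, and the reported statistic of 24.382 falls below it, in line with the p-value of 0.2261.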
The ACF plot of the residuals also supports this, as all correlation values lie within the significance bounds.
The histogram of the residual errors suggests a roughly normal distribution with a slight skew towards
the left. Hence we conclude that our estimated CI are reasonable.
3.3 ARIMA Model
We applied the Augmented Dickey-Fuller (ADF) test to check for stationarity, with the following results.
On log GDP:
Dickey-Fuller = -0.46268, Lag order = 3, p-value = 0.9804
The p-value exceeds the significance level, so the unit-root null is not rejected, implying non-stationarity.
On the first-order differenced series:
Dickey-Fuller = -6.61, Lag order = 3, p-value = 0.01
The p-value is below the significance level, so the unit-root null is rejected, implying stationarity.
On the second-order differenced series:
Dickey-Fuller = -6.7314, Lag order = 3, p-value = 0.01
The p-value is below the significance level, so the unit-root null is rejected, implying stationarity.
The plots of these series, however, suggest that the first-order differenced series is not stationary, whereas
the second-order differenced series is.
Figure 7: Plot of ∇(log GDP) vs Time
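Applying the KPSS test, whose null hypothesis is stationarity, gives the following results.
On the log series:
Level Stationarity : KPSS Level = 2.8804, Truncation lag parameter = 1, p-value = 0.01
Trend Stationarity : KPSS Trend = 0.70213, Truncation lag parameter = 1, p-value = 0.01
Both nulls are rejected, so the data is neither trend nor level stationary.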
On the first-order difference of the log series:
Level Stationarity : KPSS Level = 1.1825, Truncation lag parameter = 1, p-value = 0.01
Trend Stationarity : KPSS Trend = 0.028799, Truncation lag parameter = 1, p-value = 0.1
The data is trend stationary but not level stationary.
On the second-order difference of the log series:
Level Stationarity : KPSS Level = 0.015645, Truncation lag parameter = 1, p-value = 0.1
Trend Stationarity : KPSS Trend = 0.015604, Truncation lag parameter = 1, p-value = 0.1
The data is both trend stationary and level stationary.
Hence we conclude that the d parameter in the ARIMA(p,d,q) process is 2.
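The d parameter is the number of times the series is differenced before an ARMA model is fitted. A minimal sketch of repeated differencing follows; the function name `difference` and the toy series are hypothetical and are not the GDP data.

```python
def difference(series, order=1):
    """Apply first differencing `order` times: x[t] - x[t-1]."""
    out = list(series)
    for _ in range(order):
        out = [b - a for a, b in zip(out[:-1], out[1:])]
    return out


# Toy series with a quadratic trend (made-up values).
x = [1.0, 4.0, 9.0, 16.0, 25.0]
print(difference(x, 1))  # [3.0, 5.0, 7.0, 9.0]
print(difference(x, 2))  # [2.0, 2.0, 2.0]
```

One difference still leaves a trend in the toy series, while two differences reduce it to a constant, which mirrors why a second difference can be needed. Each difference shortens the series by one point.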
To determine the p and q values, we examine the ACF and PACF of ∇² log(GDP).
The ACF cuts off after lag 1 while the PACF cuts off after lag 7, implying q ≤ 1 and p ≤ 7. Calculating the
AIC and BIC for all candidate (p, q) pairs in this range, we get
AIC
p/q   0           1
0     -196.0436   -229.5718
1     -207.7324   -228.0191
2     -217.1027   -227.7479
3     -215.9419   -225.7616
4     -221.2816   -229.9328
5     -224.6599   -228.9041
6     -227.9001   -227.9575
7     -229.4173   -227.9231
BIC
p/q   0           1
0     -194.0363   -225.5572
1     -203.7177   -221.9971
2     -211.0807   -219.7185
3     -207.9126   -215.7249
4     -211.2449   -217.8888
5     -212.6159   -214.8528
6     -213.8488   -211.8989
7     -213.3586   -209.8571
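The BIC is minimized by ARIMA(0,2,1). The AIC is marginally lower for ARIMA(4,2,1) (-229.9328 versus
-229.5718), but the difference is negligible and the simpler model is preferred, so we select ARIMA(0,2,1).
Fitting this model, we obtain the root as shown.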
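The order selection can be reproduced by scanning the two tables for their smallest entries. The sketch below assumes that the first table holds AIC values and the second BIC values; that reading is supported by the gap between the tables, which grows with p + q as a BIC penalty would. The helper name `best_order` is hypothetical.

```python
# Rows are p = 0..7, columns are q = 0..1, copied from the tables above.
aic_table = [
    (-196.0436, -229.5718),
    (-207.7324, -228.0191),
    (-217.1027, -227.7479),
    (-215.9419, -225.7616),
    (-221.2816, -229.9328),
    (-224.6599, -228.9041),
    (-227.9001, -227.9575),
    (-229.4173, -227.9231),
]
bic_table = [
    (-194.0363, -225.5572),
    (-203.7177, -221.9971),
    (-211.0807, -219.7185),
    (-207.9126, -215.7249),
    (-211.2449, -217.8888),
    (-212.6159, -214.8528),
    (-213.8488, -211.8989),
    (-213.3586, -209.8571),
]


def best_order(table):
    """Return the (p, q) pair with the smallest criterion value."""
    candidates = [(p, q) for p in range(len(table)) for q in range(len(table[p]))]
    return min(candidates, key=lambda pq: table[pq[0]][pq[1]])


print(best_order(aic_table))  # (4, 1)
print(best_order(bic_table))  # (0, 1)
```

The AIC minimum at (4, 1) is only about 0.36 below the value at (0, 1), while the BIC clearly prefers (0, 1), so choosing the smaller ARIMA(0,2,1) model is a reasonable reading of both tables.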
The residual plot suggests zero mean and fairly constant variance.
The sum of squared errors over the observed values was 0.04655945.
Figure 12: Residuals of ARIMA process
4 Conclusion
The log GDP series appears to follow an ARIMA(0,2,1) process.
The sum of squared errors was nearly the same for the Holt-Winters and ARIMA fits,
suggesting that the two methods are about equally reliable on this data.
5 Acknowledgement
Wikipedia, the free encyclopedia : https://en.wikipedia.org/wiki/Main_Page
Forecasting GDP growth : homepage.univie.ac.at/robert.kunst/070107_efc.pdf