
Time Series Analysis CEN-531 Notes SKJ


CEN 531
Time Series Analysis
TIME SERIES
A time series is a set of observations generated sequentially in time, e.g., river flow, precipitation,
temperature, ...

Time series analysis is useful for many applications, such as forecasting, detecting trends in
records, filling-in missing data, and generation of synthetic data.

Time series data are classified in two categories:


(1) continuous time series: when observations are made continuously in time, and
(2) discrete time series: when observations are taken only at specific times, usually equally spaced.

A time series whose values have been observed at regular intervals, such as each day or each hour,
is termed as regularly spaced time series.

Observations in a discrete series, made at equidistant time intervals h (i.e., at times t0 + h, t0 + 2h, …, t0 + th, …, t0 + Nh), may be denoted by z(1), z(2), …, z(t), …, z(N).

Discrete time series can arise in several ways:


• Measuring values at equal time intervals to give a discrete series, or
• A variable may not have an instantaneous value but we can aggregate (or accumulate)
values over equal intervals of time. Examples: rainfall measured daily, annual groundwater
draft.

Correlated and uncorrelated series


If xt is linearly related to xt-k for k = 1, 2, …, the TS is auto-correlated (serially correlated).
Otherwise, the series is uncorrelated or independent.

A hydrologic TS may have dependence due to effect of storage / memory

Two time series xt and yt showing linear dependence between xt and yt-k for k = 0, 1, … are cross-correlated.
Two series xt and yt may not have auto-correlation but may have cross-correlation.
Two series xt and yt may have auto-correlation but may not have cross-correlation.

TS Analysis - SKJ

Two time-series xt and yt. Both have autocorrelation as well as cross-correlation.

Stationary and Nonstationary Series


• A stationary TS does not have any trend, shift, or periodicity.
• Stationary Series: statistical properties of the series do not change with time.
• Stationarity can be described as:

Let ut, t= ..., -1, 0, 1, ..., be the random variables describing successive terms of a
time series.

Further, let distribution of any set of n consecutive u's, say ut+1, ut+2, ..., ut+n, be
F(ut+1, ut+2, . . . , ut+n).
If F is independent of t for all integers n > 0, the time series is stationary.

The joint distribution of any set of n consecutive variables is the same, regardless
of their location.

• A TS whose statistical properties change with time is a nonstationary series.

• Stationarity is a property of an underlying stochastic process, not of observed data.


• Systems for water management have been designed and operated under the
assumption of stationarity.
• Stationarity assumption has been compromised by human disturbances in river
basins. Flood risk, water supply, and water quality are affected by water
infrastructure, channel modifications, drainage works, and land-cover and land-use
change.
• Question: is the nonstationarity substantial enough to require a characterization of the
process, or can a comparatively simple stationary stochastic model accurately represent
the process?

Components of a Time series


Main components of a hydrologic TS are:
• Trends and other deterministic changes,
• Cycles or periodic changes and autocorrelation,
• Almost periodic changes, such as tidal effects, and
• Components representing stochastic or random variations.

Trends
Trends are introduced in a time series due to gradual or sudden changes in the major processes that
generate the time series.
Jumps are introduced in a time series due to sudden changes in major processes that generate the
time series or due to external influence.

Changes may arise due to natural and anthropogenic causes.


A forest fire covering large parts of a basin may introduce sudden changes in flow series.
Gradual conversion of forested area in agriculture or urban areas may cause slow changes in the
discharge series.

A water quality time series may show trends if a new factory upstream begins to discharge its
untreated effluent in the river.
Closure of a diversion dam will lead to a sudden change in the series because flow will be reduced
due to diversion.

Trend Analysis
Identification of trends in hydrologic data is helpful in modeling and predicting future behaviour
of the data.
Trends can be spatial and temporal. Will discuss temporal trends only here.
A temporal trend is general increase or decrease in observed values of a variable over time.
It describes long smooth movement of variable, ignoring short term changes.
Trend analysis is performed to determine significance of a trend (if present) and to estimate its
magnitude.

Parametric and non-parametric methods are applied for trend detection.

Magnitude of Trend
Magnitude of trend in a time series is determined either by using regression analysis (parametric
test) or by using Sen’s estimator method (non-parametric method).
Both these methods assume a linear trend in time series.
Regression analysis discussed earlier - linear trend or slope of regression line is the rate of rise/fall
in the variable.

Sen’s estimator
Slopes (Ti) of all data pairs are calculated by
Ti = (xj − xk)/(j − k)   for i = 1, 2, …, N

where xj and xk are data values at time j and k (j>k), respectively.


Median of these N values of Ti is Sen’s estimator of slope
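As a sketch (the helper name and sample series are hypothetical), Sen's estimator can be computed by enumerating the slopes of all data pairs and taking their median:

```python
import numpy as np

def sens_slope(x):
    """Sen's slope estimator: median of (x[j] - x[k]) / (j - k) over all pairs j > k."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    slopes = [(x[j] - x[k]) / (j - k)
              for k in range(n - 1)
              for j in range(k + 1, n)]
    return float(np.median(slopes))

# Hypothetical annual series with a mild upward drift
trend = sens_slope([10.0, 11.5, 10.8, 12.2, 13.0, 12.6, 14.1])
```

The median makes the estimate robust to outliers, which is why it is preferred over a regression slope for skewed hydrologic records.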

Significance of Trend: Mann-Kendall Test


To ascertain the presence of statistically significant trend in hydrologic variables, nonparametric
Mann-Kendall (MK) test can be employed.

MK test checks the null hypothesis of no trend versus the alternative hypothesis of the existence of
increasing or decreasing trend.

Statistics (S) is computed as


S = Σi=1…N−1 Σj=i+1…N sgn(xj − xi)

where N is number of data points.

Let (xj − xi) = θ; the value of sgn(θ) is computed as follows:

sgn(θ) = +1 if θ > 0
sgn(θ) = 0 if θ = 0
sgn(θ) = −1 if θ < 0

This statistic represents number of positive differences minus number of negative differences for
all the differences considered.
For large samples (N>10), the test is conducted using a normal distribution with mean and variance:

E[S] = 0

Var(S) = [N(N−1)(2N+5) − Σk=1…n tk(tk−1)(2tk+5)] / 18
where n is the number of tied (zero difference between compared values) groups, and tk is the
number of data points in the kth tied group.

Standard normal deviate (Z-statistic) is computed as:

Z = (S − 1)/√Var(S)   if S > 0
Z = 0                 if S = 0
Z = (S + 1)/√Var(S)   if S < 0

If computed value of │Z│> zα/2, null hypothesis (Ho) is rejected at the significance level α in a
two-sided test.
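A minimal sketch of the test, computing S, the tie-corrected Var(S), and Z as defined above (function name and toy record are hypothetical):

```python
import math
from collections import Counter

def mann_kendall(x):
    """Return (S, Var(S), Z) for the Mann-Kendall trend test."""
    n = len(x)
    # S = number of positive differences minus number of negative differences
    s = sum((x[j] > x[i]) - (x[j] < x[i])
            for i in range(n - 1)
            for j in range(i + 1, n))
    # Tie correction: t_k is the size of each group of equal values
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in Counter(x).values())) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, var_s, z

# Strictly increasing toy record (N = 12 > 10, so the normal approximation applies)
s, var_s, z = mann_kendall(list(range(12)))
# |Z| > 1.96 rejects H0 (no trend) at the 5% level in a two-sided test
```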

Seasonality / cyclicity
TS of WR variables measured or accumulated at sub-annual time intervals (monthly, 10-daily, etc.)
normally have seasonal (periodic) patterns.
Revolution of earth around sun produces annual cycles in most hydrologic variables.
Seasonal patterns can be seen in TS, e.g., monthly rainfall, daily runoff, daily urban water demands,
and these series are said to have seasonal or periodic patterns.
Many series that are used in hydrologic studies such as urban water use, hydropower demand, may
also show weekly patterns.

Stationary Stochastic Processes

A stochastic process is called stationary if its properties are unaffected by a change of time origin.

Covariance between zt and zt+k is called the autocovariance at lag k and is calculated by

γk = Cov[zt, zt+k] = E[(zt − µ)(zt+k − µ)]

For a stationary process, the variance at time (t + k) is the same as at time t.

Estimate of the kth-lag autocovariance γk is

ck = (1/N) Σt=1…N−k (zt − z̄)(zt+k − z̄),   k = 0, 1, 2, …, K

Lag k autocorrelation is


rk = ck/c0

which implies that r0 = 1.

Plot of rk vs lag k is called the correlogram.

If correlogram rapidly falls after a few lags, it indicates weak persistence or short memory.

A slow decline of correlogram is an indicator of strong persistence or long memory.

Correlograms with long and short memory: correlograms may be slow-decaying (long memory), fast-decaying (short memory), or intermediate.

Matlab function
autocorr(x) - to draw correlogram
Excel: =CORREL(array1, array2)
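The estimates ck and rk above translate directly into code; a minimal numpy sketch (function name and toy series are hypothetical):

```python
import numpy as np

def correlogram(z, K):
    """Sample autocorrelations r_0..r_K, where r_k = c_k / c_0."""
    z = np.asarray(z, dtype=float)
    n = len(z)
    d = z - z.mean()
    c0 = (d @ d) / n                      # lag-0 autocovariance (the variance)
    return np.array([(d[: n - k] @ d[k:]) / n / c0 for k in range(K + 1)])

# Alternating toy series: strong negative lag-1 correlation
rk = correlogram([1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0], K=2)
```

Plotting rk against k gives the correlogram; r0 is 1 by construction.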


TS models use the dependence for understanding or forecasting.

Time Series Models


A mathematical model representing a TS is called a time series model.
Stochastic model: a model describing the probability structure of the time sequence of observations.

Some operators: Backward shift operator

B zt = zt−1,   B² zt = zt−2,   Bᵐ zt = zt−m

Forward shift operator


F = B⁻¹,   F zt = zt+1,   F² zt = zt+2,   Fᵐ zt = zt+m

Difference operator

∇zt = zt − zt−1 = zt − B zt = (1 − B) zt

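The difference operator can be verified numerically; a small numpy sketch on a hypothetical series:

```python
import numpy as np

z = np.array([3.0, 5.0, 4.0, 7.0, 6.0])

# B z_t = z_{t-1}: the series shifted back one step
Bz = z[:-1]
# Difference operator: (1 - B) z_t = z_t - z_{t-1}
dz = z[1:] - Bz

assert np.array_equal(dz, np.diff(z))   # numpy's built-in difference agrees
```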
Autoregressive (AR) Models 05 Nov.


AR models are extremely useful for representing certain practical series.

The current value of the process is given by a weighted sum of a pre-assigned number of past values and a random term.
These are linear models: the current value depends additively on the past values, not on their squares, logs, etc.
River flow arises from time-dependent components: surface flow, groundwater, ET, etc.
Precipitation depends on atmospheric circulation and ocean temperature, which produce long-term persistence.

Let the values of a process at equally spaced times t, t-1, t-2, … be Yt, Yt-1, Yt-2, …
Let zt, zt-1, zt-2 … be the deviations from the mean µ; for example, zt = Yt - µ.
In a pth order linear AR model, current value of the process is expressed as a finite, linear aggregate
of previous p values of the process and a shock at.
𝑧𝑡 = 𝜑1 𝑧𝑡−1 + 𝜑2 𝑧𝑡−2 + ⋯ + 𝜑𝑝 𝑧𝑡−𝑝 + 𝑎𝑡

Then zt is called an autoregressive process of order p and is denoted by AR(p).

φ1, φ2, …, φp are the autoregressive parameters or weights.


The a's are a series of independent variables, assumed to follow a normal distribution with mean 0 and variance σa².
The a's are also called white noise.

E(zt) = E(at) = 0;   Var(zt) = E(zt²) = σz² = 1;   E denotes expected value or mean.


E(zt−k at) = 0 for k > 0;
Var(at) = E(at²) = σa²

An AR operator of order p can be defined as


φ(B) = 1 − φ1B − φ2B² − ⋯ − φpBᵖ
𝜑(𝐵)𝑧𝑡 = 𝑎𝑡
AR(p) model has (p+2) parameters, 𝜇, 𝜑1 , 𝜑2 , … 𝜑𝑝 , 𝜎𝑎2

Stationarity Requirement:
A time series is called stationary if its statistical properties remain constant over time.
The order of stationarity is the highest central moment which remains constant over time.
First-order stationarity indicates a time-invariant mean.
Second-order stationarity: both mean and variance remain constant over time.

An AR process will be stable only when the model parameters lie within a certain range.
Otherwise, past effects (the influence of previous data points) would accumulate, successive values of zt would move towards infinity, and the TS would not be stationary.
If there is more than one AR model parameter, similar restrictions on parameter values can be
defined.

• An AR model can predict future behavior based on past behavior.


• Physical interpretation: AR models are suggestive of baseflow-recession-type behaviour.
• AR models are widely used in technical analysis to forecast future stock prices – prices
depend on past performance.
• AR model can be inaccurate under certain conditions, say, periods of rapid technological
change.

First-order Autoregressive model


A special model is AR model of first-order, AR(1) (also called Markov model):

𝒛𝒕 = 𝝋𝟏 𝒛𝒕−𝟏 + 𝒂𝒕 (AR1)

Coefficient ϕ1 is a number by which we multiply the value at (t-1), zt-1.


This is the part of the previous value which finds its way to the future. Short memory model !
This coefficient should always lie between -1 and 1 for the series to remain stationary.

𝑧𝑡 = 𝜑1 𝑧𝑡−1 + 𝑎𝑡

Multiply eq. AR1 by zt−1 and take expectations, noting that E(zt−1 at) = 0, E(zt²) = 1, and E(zt) = 0:

E(zt zt−1) = φ1 E(zt−1 zt−1) + E(zt−1 at)

ρ1 = φ1,   with −1 < ρ1 < 1

Also σa² = 1 − ρ1²



Example: Mean and SD of annual flows of a river are 4.7 and 0.958; r1 = 0.324.
Generate three values, for t, t+1, and t+2, using the AR(1) model. Let the at's be 0.87, −0.65, and 1.15.

Solution:
Since at is an independent standard normal variate, multiply at by σa = (1 − r1²)^0.5.
The model is:
zt = φ1 zt−1 + σa at
Here, σa = (1 − 0.324²)^0.5 = 0.946
zt = 0.324 zt−1 + 0.946 at

Assume zt−1 = 0, i.e., xt−1 = 4.7 (the series is standardized: xt = µ + σ·zt)

For time t, zt = 0.324×0 + 0.946×0.87 = 0.823;   xt = 4.7 + 0.958×0.823 = 5.49

For time t+1, zt+1 =

Same for time t+2.
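The worked example can be reproduced with a short script (a sketch; variable names are illustrative):

```python
mu, sd = 4.7, 0.958          # mean and SD of annual flows
r1 = 0.324                   # lag-one autocorrelation
shocks = [0.87, -0.65, 1.15]

sigma_a = (1 - r1**2) ** 0.5             # = 0.946
z_prev = 0.0                             # z_{t-1} = 0, i.e. x_{t-1} = mu
flows = []
for a in shocks:
    z = r1 * z_prev + sigma_a * a        # z_t = 0.324 z_{t-1} + 0.946 a_t
    flows.append(mu + sd * z)            # back-transform: x_t = mu + sd * z_t
    z_prev = z
```

The loop carries z forward, so running it yields all three values of the exercise.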

Second-order Autoregressive model


Another important model is AR model of second-order, AR(2):

𝑧𝑡 = 𝜑2,1 𝑧𝑡−1 + 𝜑2,2 𝑧𝑡−2 + 𝑎𝑡 (AR2)

From Yule-Walker equations


𝜌1 = 𝜑2,1 + 𝜑2,2 𝜌1
𝜌2 = 𝜑2,1 𝜌1 + 𝜑2,2
So, with rk as the sample estimate of ρk, the parameter estimates are

𝜑̂2,2 = (𝑟2 − 𝑟12 )/(1 − 𝑟12 )


𝜑̂2,1 = 𝑟1 (1 − 𝑟2 )/(1 − 𝑟12 )

For stationary conditions 𝜌12 < (𝜌2 + 1)/2 and −1 < 𝜌1 , 𝜌2 < 1

Variance of independent variables

𝑧𝑡 = 𝜑𝑝,1 𝑧𝑡−1 + 𝜑𝑝,2 𝑧𝑡−2 + ⋯ + 𝜑𝑝,𝑝 𝑧𝑡−𝑝 + 𝑎𝑡


Variance of the independent variables at is given by

𝜎𝑎2 = 1 − 𝜑𝑝,1 𝜌1 − 𝜑𝑝,2 𝜌2 − ⋯ − 𝜑𝑝,𝑝 𝜌𝑝

= 1 − 𝑅2 R2 = coefficient of determination

Example: Mean and SD of annual flows of a river are 1.0 and 0.182. An AR(2) model is found to fit the flow data well, with r1 = 0.458 and r2 = −0.004.
Generate values for t and t+1 using the AR(2) model. Let the at's be 1.352 and −0.532.

Solution:

We first compute the AR(2) coefficients:

φ̂2,2 = (r2 − r1²)/(1 − r1²) = (−0.004 − 0.458²)/(1 − 0.458²) = −0.271

φ̂2,1 = r1(1 − r2)/(1 − r1²) = 0.458(1 + 0.004)/(1 − 0.458²) = 0.582

σa² = 1 − φ2,1 r1 − φ2,2 r2 = 1 − 0.582×0.458 − (−0.271)×(−0.004) = 0.732
σa = 0.856
The model is:
zt = φ2,1 zt−1 + φ2,2 zt−2 + σa at

zt = 0.582 zt−1 − 0.271 zt−2 + 0.856 at

Assume zt−1 = 0 and zt−2 = 0, i.e., xt−1 = 1.0

For time t, zt = 0.582×0 − 0.271×0 + 0.856×1.352 = 1.157;   xt = 1.0 + 0.182×1.157 = 1.211

For time t+1, zt+1 = 0.582×1.157 − 0.271×0 + 0.856×(−0.532) = …;   xt+1 = 1.0 + 0.182×… =
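Again the example can be checked with a short script (a sketch; variable names are illustrative):

```python
mu, sd = 1.0, 0.182
r1, r2 = 0.458, -0.004
shocks = [1.352, -0.532]

phi22 = (r2 - r1**2) / (1 - r1**2)               # ~ -0.271
phi21 = r1 * (1 - r2) / (1 - r1**2)              # ~  0.582
sigma_a = (1 - phi21 * r1 - phi22 * r2) ** 0.5   # ~  0.856

z1, z2 = 0.0, 0.0          # z_{t-1} = z_{t-2} = 0
flows = []
for a in shocks:
    z = phi21 * z1 + phi22 * z2 + sigma_a * a
    flows.append(mu + sd * z)
    z2, z1 = z1, z          # shift the state forward
```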

Moving Average (MA) Models


Another kind of model of great practical importance represents the observed TS as a finite moving average process.
Here zt is linearly dependent on a finite number q of previous a's.

𝑧𝑡 = 𝑎𝑡 − 𝜃1 𝑎𝑡−1 − 𝜃2 𝑎𝑡−2 − ⋯ − 𝜃𝑞 𝑎𝑡−𝑞

is called a moving average (MA) process of order q and is denoted by MA(𝑞).


Similar to AR operator, a moving average operator of order q can be written as
𝜃(𝐵) = 1 − 𝜃1 𝐵 − 𝜃2 𝐵 2 − ⋯ − 𝜃𝑞 𝐵 𝑞
𝑧𝑡 = 𝜃(𝐵)𝑎𝑡
It has q+2 parameters, 𝜇, 𝜃1 , 𝜃2 , … 𝜃𝑞 , 𝜎𝑎2

Here, "moving average" is a misnomer: the weights 1, θ1, θ2, …, θq need not sum to unity, nor need they be positive.

Nov. 06
For the first-order moving-average model MA(1)

𝑧𝑡 = 𝑎𝑡 − 𝜃1 𝑎𝑡−1

ρ1 = −θ1/(1 + θ1²)

This is a quadratic in θ1, so θ1 can be estimated from its roots.

Variance of random component


𝜎𝑎2 = 1/(1 + 𝜃12 + 𝜃22 + ⋯ + 𝜃𝑞2 )
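Since ρ1 = −θ1/(1 + θ1²) is quadratic in θ1, the roots can be computed directly; a sketch that keeps the invertible root |θ1| < 1 (function name and test value are hypothetical):

```python
import math

def ma1_theta(rho1):
    """Solve rho1 = -theta1 / (1 + theta1**2) for theta1.

    The quadratic rho1*theta1**2 + theta1 + rho1 = 0 has real roots only
    when |rho1| <= 0.5; the invertible root |theta1| < 1 is returned.
    """
    disc = 1.0 - 4.0 * rho1**2
    if disc < 0:
        raise ValueError("no real solution: |rho1| must be <= 0.5 for MA(1)")
    roots = [(-1.0 + math.sqrt(disc)) / (2.0 * rho1),
             (-1.0 - math.sqrt(disc)) / (2.0 * rho1)]
    return min(roots, key=abs)           # smaller-magnitude root is invertible

theta1 = ma1_theta(-0.4)                 # check: -0.5 / (1 + 0.25) = -0.4
```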

Auto-Regressive Moving Average (ARMA) Models


Greater flexibility in fitting TS models is achieved by including both AR and MA terms in model.
This leads to mixed autoregressive-moving average ARMA (p, q) model:

𝒛𝒕 = 𝝓𝟏 𝒛𝒕−𝟏 + ⋯ + 𝝓𝒑 𝒛𝒕−𝒑 + 𝒂𝒕 − 𝜽𝟏 𝒂𝒕−𝟏 − ⋯ − 𝜽𝒒 𝒂𝒕−𝒒


or
𝑧𝑡 − 𝜙1 𝑧𝑡−1 − ⋯ − 𝜙𝑝 𝑧𝑡−𝑝 = 𝑎𝑡 − 𝜃1 𝑎𝑡−1 − ⋯ − 𝜃𝑞 𝑎𝑡−𝑞

which employs p+q+2 unknown parameters 𝜇, 𝜙1 , … , 𝜙𝑝 , 𝜃1 , … , 𝜃𝑞 , 𝜎𝑎2 ; estimated from the data.

Shorthand notation: φ(B) zt = θ(B) at

Combination of AR and MA models makes it possible to simulate many hydrologic processes by


using a small number of parameters.
ARMA(p,0) model is the same as AR(p); ARMA(0,q) model is same as MA(q).

Example: flow in a stream results from a number of causes, such as precipitation and catchment storage.
In an ARMA model, zt and at may represent the time-dependent discharge (output) and rainfall (input).
This mixed behaviour can be modelled by ARMA models.
In practice, an adequate representation of an actually occurring stationary TS can frequently be obtained with an AR, MA, or ARMA model in which p and q are not greater than 2, and often less.

ARMA(1,1) model
Simplest member of ARMA(p, q) family is the ARMA(1, 1) model which can be written as

𝑧𝑡 = 𝜑1,1 𝑧𝑡−1 + 𝑎𝑡 − 𝜃1,1 𝑎𝑡−1

Autoregressive Integrated Moving Average (ARIMA) models


ARMA models are suitable for stationary hydrologic series.
In the case of a nonstationary series, periodic or seasonal fluctuations can be removed by taking differences, and an ARMA model can be applied to the resultant series.
The resultant model is termed an Autoregressive Integrated Moving Average (ARIMA) model.

Consider a time series that is homogeneous except in level, i.e., the various segments of the series look identical except for the level about which they vary.

Such a series can be adequately represented by a model of the form:

φ(B) ∇zt = θ(B) at   (14)

where ∇ is the backward difference operator defined as

∇zt = zt − zt−1 = (1 − B) zt   (15)

Thus, ARIMA (p, d, q) is an ARMA model that is fitted to the data after taking the dth difference of the series:

φ(B) ∇ᵈ zt = θ(B) at   (16)

where d indicates that the series is differenced d times. The notation n = 1 - Bn indicates
differencing with lag of n. The first order differencing [eq. (15)] is helpful in removing the trend
of a series or non-stationarity in the mean. Two consecutive differencing operations are necessary
to remo.ve non-stationarity in the mean and slope. However, it may not always be possible to
remove non-stationarity by differencing alone, other transformations may also be needed.
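The effect of differencing is easy to demonstrate on a hypothetical trended series:

```python
import numpy as np

t = np.arange(10, dtype=float)
z = 5.0 + 2.0 * t            # hypothetical series with a linear trend (slope 2)

d1 = np.diff(z)              # first difference: z_t - z_{t-1}
# The trend is gone: the differenced series is constant (stationary in the mean)
assert np.allclose(d1, 2.0)

# A quadratic trend needs d = 2 (two consecutive differencing operations)
zq = 5.0 + 2.0 * t + 0.5 * t**2
assert np.allclose(np.diff(zq, n=2), 1.0)
```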

Spectral properties of a TS: Periodograms


Harmonic Analysis: A TS can be assumed to be composed of sine and cosine waves with different
frequencies, after trend (if present) is removed.

The term harmonic originally came from acoustics, wherein musical instruments are identified by harmonics whose frequencies are multiples of the basic frequency produced.
The French mathematician Fourier showed that a continuous function {X(t), t ∈ T} (where T is
an index set) can, in general, be equated to the sum of an infinite number of harmonics with frequencies
1/T, 2/T, 3/T, ….
Hence, this type of representation is called a Fourier series.
If a Fourier series model is fit:

zt = α0 + Σi=1…q [αi cos(2πfi t) + βi sin(2πfi t)] + et,   fi = i/N

Using the coefficients αi and βi, periodograms (somewhat similar to the correlogram) can be developed.
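A periodogram can be sketched using the standard least-squares estimates of the Fourier coefficients, αi = (2/N) Σ zt cos(2πfi t) and βi = (2/N) Σ zt sin(2πfi t) (these estimator formulas are not stated above; the test signal is hypothetical):

```python
import numpy as np

N = 64
t = np.arange(N)
# Hypothetical signal: a constant level plus one harmonic at f = 4/N
z = 3.0 + 2.0 * np.cos(2 * np.pi * 4 * t / N)

alpha, beta, power = [], [], []
for i in range(1, N // 2):
    f = i / N
    a_i = 2.0 / N * np.sum(z * np.cos(2 * np.pi * f * t))
    b_i = 2.0 / N * np.sum(z * np.sin(2 * np.pi * f * t))
    alpha.append(a_i)
    beta.append(b_i)
    power.append(a_i**2 + b_i**2)    # periodogram ordinate at f_i

peak = int(np.argmax(power)) + 1     # harmonic index carrying the most variance
```

The peak of the periodogram identifies the dominant cycle, here the fourth harmonic of the record length.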


Harmonic decomposition of a periodic signal over a time span T. The L harmonics (the first
three are shown) have frequencies 1/T, 2/T, 3/T, 4/T, …, L/T and wavelengths T, T/2,
T/3, T/4, …, T/L, where T/L ≥ 2Δt; Δt is the sampling interval.
The ordinates of the harmonics are summed algebraically to give the periodic component
xt(p) of the sequence xt.

Fitting of ARMA Models


Initially, one knows neither the order of the ARMA process to fit to an observed time series nor the order of differencing required, if any.
Therefore, the model is built iteratively: a set of candidate models is identified using the characteristics of the data, and their adequacy is tested.
Depending on the results, a model may be adopted or another candidate model identified.
While fitting a time series model, at least 50 observations should be used.
If sufficient observations are not available, one proceeds by using experience to build a preliminary model.
This model may be updated as more data become available.

While applying ARMA models, the main stages are:

• Model Identification which involves the use of the data and any information on how the series
was generated to identify a subclass of parsimonious models worthy to be considered.
• Parameter Estimation which involves an efficient use of the data to make inferences about parameters conditional on the adequacy of the considered model.


• Diagnostic Checking involves checking the fitted model in its relation to the data with the
intent to reveal model inadequacies and to achieve model improvement.

Parsimony principle
If there are two equally good models, choose one with least number of parameters.

If ARMA(1,1) and AR(3) models give equally good results, which should be chosen? ARMA(1,1), since it has fewer parameters.

Partial Autocorrelation Function (PACF)


Partial autocorrelation function is another way to represent time dependence structure of a series.
Partial autocorrelation is the autocorrelation remaining in the series after fitting a model of order
p-1 and removing the linear dependence.
It is useful in identification of the type and order of the model of a given time series.
Let kj be the jth coefficient in an AR process of order k; kk being the last coefficient.

We know

zt = k,1 zt-1 + k,2 zt-2 + … + k,k zt-k + at (5)

If equation (5) is multiplied by zt−1 and expectations are computed, we get

ρ1 = φk,1 + φk,2 ρ1 + … + φk,k ρk−1

Similarly, if equation (5) is multiplied in turn by zt−2, zt−3, …, and expectations are computed, we get a set of equations called the Yule-Walker equations:


| 1      ρ1     ρ2    ⋯  ρk−1 | | φk,1 |   | ρ1 |
| ρ1     1      ρ1    ⋯  ρk−2 | | φk,2 | = | ρ2 |
| ⋮      ⋮      ⋮        ⋮   | |  ⋮   |   | ⋮  |
| ρk−1   ρk−2   ρk−3  ⋯  1   | | φk,k |   | ρk |

or
Pk φk = ρk

Solving these equations for k = 1, 2, 3, … successively, the values of φ11, φ22, … are obtained as functions of the ρ's.

The quantity φkk, regarded as a function of lag k, is called the partial autocorrelation function (PACF).

PACF shows the order and type of model.

Partial autocorrelations may be estimated by successively fitting AR processes of orders 1, 2, 3, … by least squares and picking out the estimates φ̂11, φ̂22, … at each stage.
Or, approximate Yule-Walker estimates of successive autoregressive processes may be employed.
Estimated partial autocorrelations can then be obtained by substituting the estimates rj for the theoretical autocorrelations to yield

𝑟𝑗 = 𝜑̂𝑘1 𝑟𝑗−1 + 𝜑̂𝑘2 𝑟𝑗−2 +. . . +𝜑̂𝑘(𝑘−1) 𝑟𝑗−𝑘+1 + 𝜑̂𝑘𝑘 𝑟𝑗−𝑘 , 𝑗 = 1, 2, . . . 𝑘
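Estimation of φkk by solving the Yule-Walker system Pk φk = ρk for successive k can be sketched as follows (function name and the AR(1) autocorrelations are illustrative):

```python
import numpy as np

def pacf_from_acf(r, kmax):
    """phi_kk for k = 1..kmax by solving P_k phi_k = rho_k at each order.

    r[j] is the lag-j autocorrelation, with r[0] = 1.
    """
    phi_kk = []
    for k in range(1, kmax + 1):
        # P_k is a symmetric Toeplitz matrix of autocorrelations
        P = np.array([[r[abs(i - j)] for j in range(k)] for i in range(k)])
        rho = np.array(r[1:k + 1])
        phi = np.linalg.solve(P, rho)
        phi_kk.append(phi[-1])       # the last coefficient is phi_kk
    return phi_kk

# Theoretical ACF of an AR(1) with phi1 = 0.6: rho_j = 0.6**j
r = [0.6**j for j in range(6)]
pacf = pacf_from_acf(r, kmax=4)
# For AR(1): phi_11 = phi1 and phi_kk = 0 for k >= 2
```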

Autocorrelation function of series also gives useful information.


If autocorrelation function fails to die out rapidly, it suggests that series may be non-stationary and
may require differencing to obtain a stationary series.

Properties of the autocorrelation and partial autocorrelation functions of some time series models:

Model type                               | Autocorrelation function              | Partial autocorrelation function
AR(1), first-order autoregressive        | Decreases exponentially               | φ1,1 ≠ 0; φi,i = 0 for i = 2, 3, 4, …
AR(p), pth-order autoregressive          | Mixed type of damping from lag 1      | φi,i ≠ 0 for i ≤ p; φi,i = 0 for i > p
MA(1), first-order moving-average        | ρ1 ≠ 0; ρi = 0 for i = 2, 3, 4, …     | Decreases exponentially
MA(q), qth-order moving-average          | ρi ≠ 0 for i ≤ q; ρi = 0 for i > q    | Mixed type of damping from lag 1
ARMA(1,1), autoregressive moving-average | Decreases exponentially after lag 1   | Decreases exponentially after lag 1
ARMA(p,q), autoregressive moving-average | Mixed type of damping after lag q + 1 | Mixed type of damping after lag p − q

For most monthly hydrological series, it is often helpful to first standardize the series by subtracting the mean and dividing by the standard deviation of the corresponding month.
A first-order differencing of the resultant series (if required) is often adequate to yield a stationary series that can be modelled by the ARMA class of models.


Common techniques to estimate the parameters of a time series model are the method of moments, the method of least squares, and the method of maximum likelihood.

Model Testing
After an ARMA model has been fitted, it is necessary to apply statistical tests to check its adequacy
and suitability.
Tests for this purpose include the portmanteau lack-of-fit test, the Akaike Information Criterion (AIC), and examination of the residual correlogram.

The portmanteau lack-of-fit test checks whether the residuals of a model are independent.

In this test, the Q statistic is computed by

Q = N Σk=1…L rk²

where rk denotes autocorrelation of residuals at lag k and L is the number of lags considered. Q
approximately follows a chi-square distribution with (L-p-q) degrees of freedom.
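The Q statistic follows directly from its definition; a sketch on synthetic white-noise residuals (function name is hypothetical):

```python
import numpy as np

def portmanteau_q(residuals, L):
    """Q = N * sum_{k=1}^{L} r_k^2, with r_k the residual autocorrelations."""
    e = np.asarray(residuals, dtype=float)
    n = len(e)
    d = e - e.mean()
    c0 = (d @ d) / n
    q = sum(((d[: n - k] @ d[k:]) / n / c0) ** 2 for k in range(1, L + 1))
    return n * q

# Synthetic white-noise residuals: Q should be unremarkable when compared
# against the chi-square quantile with (L - p - q) degrees of freedom
rng = np.random.default_rng(42)
q_stat = portmanteau_q(rng.standard_normal(200), L=10)
```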

Statistic AIC is calculated by

AIC(p, q) = N ln(σε²) + 2(p + q)

where N is the sample size, and 𝜎∈2 is the maximum likelihood estimate of the residual variance.
Model which gives the minimum AIC is selected.

Examination of residuals (difference between observed and computed values of a dependent


variable) of a model is always helpful.
Residuals of an adequate model should resemble white noise: their lag-one serial correlation should be close to zero, and they should have a small variance.

The technique of overfitting, in which a more elaborate model is fitted to the data and the results are compared, has also been recommended.

Applications of TS Models
ARMA models are frequently used in rainfall-runoff modelling.
A number of well-known hydrologic models are special cases of the ARMA model.
For example, the Muskingum model of flood routing is obtained by setting certain parameters of the ARMA equation to zero.

ARMA model parameters do have a physical interpretation.


As a precipitation event occurs, the MA parameters influence the rising limb of the hydrograph; these depend on the basin's physiographic characteristics and its state.
Similarly, the AR parameters more heavily influence the recession limb of the hydrograph. These can vary from storm to storm.
