
Time Series Forecasting with Python – Cheat Sheet

Data Science with Marco

Time series analysis

ACF plot

The autocorrelation function (ACF) plot shows the autocorrelation coefficients as a function of the lag.
• Use it to determine the order q of a stationary MA(q) process
• A stationary MA(q) process has significant coefficients up until lag q
Output for an MA(2) process (i.e., q = 2): [plot not reproduced]

PACF plot

The partial autocorrelation function (PACF) plot shows the partial autocorrelation coefficients as a function of the lag.
• Use it to determine the order p of a stationary AR(p) process
• A stationary AR(p) process has significant coefficients up until lag p
Output for an AR(2) process (i.e., p = 2): [plot not reproduced]

Time series decomposition

Separate the series into 3 components: trend, seasonality, and residuals.
• Trend: long-term changes in the series
• Seasonality: periodical variations in the series
• Residuals: what is not explained by trend and seasonality
Note: m is the frequency of the data (i.e., how many observations per season)

Statistical tests

ADF test – Test for stationarity

A series is stationary if its mean, variance, and autocorrelation are constant over time. Test for stationarity with the augmented Dickey-Fuller (ADF) test.
• Null hypothesis: a unit root is present (i.e., the series is not stationary)
• We want a p-value < 0.05

Note: to make a series stationary, use differencing.
• n = 1: difference between consecutive timesteps
• n = 4: difference between values 4 timesteps apart
Differencing removes n data points.

Ljung-Box test – Residuals analysis

Used to determine if the autocorrelation of a group of data is significantly different from 0. Use it on the residuals to check if they are independent.
• Null hypothesis: the data is independently distributed (i.e., there is no autocorrelation)
• We want a p-value > 0.05

Note: print the p-values up to h lags, where h is the length of your forecast horizon.

Granger causality – Multivariate forecasting

Determine if one time series is useful in predicting the other one. Use it to validate the VAR model: if the Granger causality test fails, the VAR model is invalid.
• Null hypothesis: 𝑦2,𝑡 does not Granger-cause 𝑦1,𝑡
• Tests for predictive causality
• Tests causality in one direction only (i.e., you must run the test twice)
• We want a p-value < 0.05

Note: run the test twice to test causality in both directions.

Forecasting – Statistical models

Moving average model – MA(q)

The moving average model: the current value depends on the mean of the series, the current error term, and past error terms.
• Denoted as MA(q), where q is the order
• Use the ACF plot to find q
• Assumes stationarity. Use only on stationary data

Equation
𝑦𝑡 = 𝜇 + 𝜖𝑡 + 𝜃1 𝜖𝑡−1 + 𝜃2 𝜖𝑡−2 + ⋯ + 𝜃𝑞 𝜖𝑡−𝑞

Autoregressive model – AR(p)

The autoregressive model is a regression against itself. This means that the present value depends on past values.
• Denoted as AR(p), where p is the order
• Use the PACF plot to find p
• Assumes stationarity. Use only on stationary data

Equation
𝑦𝑡 = 𝐶 + 𝜙1 𝑦𝑡−1 + 𝜙2 𝑦𝑡−2 + ⋯ + 𝜙𝑝 𝑦𝑡−𝑝 + 𝜖𝑡

ARMA(p,q)

The autoregressive moving average model (ARMA) is the combination of the autoregressive model AR(p) and the moving average model MA(q).
• Denoted as ARMA(p,q), where p is the order of the autoregressive portion and q is the order of the moving average portion
• Cannot use the ACF or PACF plots to find the orders p and q. Must try different (p,q) values and select the model with the lowest AIC (Akaike's Information Criterion)
• Assumes stationarity. Use only on stationary data

Equation
𝑦𝑡 = 𝐶 + 𝜙1 𝑦𝑡−1 + ⋯ + 𝜙𝑝 𝑦𝑡−𝑝 + 𝜃1 𝜖𝑡−1 + ⋯ + 𝜃𝑞 𝜖𝑡−𝑞 + 𝜖𝑡

Forecasting – Statistical models

ARIMA(p,d,q)

The autoregressive integrated moving average (ARIMA) model is the combination of the autoregressive model AR(p) and the moving average model MA(q), but in terms of the differenced series.
• Denoted as ARIMA(p,d,q), where p is the order of the autoregressive portion, d is the order of integration, and q is the order of the moving average portion
• Can be used on non-stationary data

Note: the order of integration d is simply the number of times a series was differenced to become stationary.

Equation
𝑦′𝑡 = 𝐶 + 𝜙1 𝑦′𝑡−1 + ⋯ + 𝜙𝑝 𝑦′𝑡−𝑝 + 𝜃1 𝜖𝑡−1 + ⋯ + 𝜃𝑞 𝜖𝑡−𝑞 + 𝜖𝑡

SARIMA(p,d,q)(P,D,Q)m

The seasonal autoregressive integrated moving average (SARIMA) model includes a seasonal component on top of the ARIMA model.
• Denoted as SARIMA(p,d,q)(P,D,Q)m, where p, d, and q have the same meaning as in the ARIMA model
• P is the seasonal order of the autoregressive portion
• D is the seasonal order of integration
• Q is the seasonal order of the moving average portion
• m is the frequency of the data (i.e., the number of data points in one season)

SARIMAX

SARIMAX is the most general model. It combines seasonality, a moving average portion, an autoregressive portion, and exogenous variables.
• Can use external variables to forecast a series

Caveat: SARIMAX predicts the next timestep. If your horizon is longer than one timestep, you must forecast your exogenous variables too, which can amplify the error in your model.

Forecasting – Statistical models

VARMAX

The vector autoregressive moving average with exogenous variables (VARMAX) model is used for multivariate forecasting (i.e., predicting two time series at the same time).
• Assumes Granger causality. Must use the Granger causality test; if the test fails, the VARMAX model cannot be used.

BATS and TBATS

BATS and TBATS are used when the series has more than one seasonal period. This can happen when we have high-frequency data, such as daily data.
• When there is more than one seasonal period, SARIMA cannot be used. Use BATS or TBATS.
• BATS: Box-Cox transformation, ARMA errors, Trend and Seasonal components
• TBATS: Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend and Seasonal components

Exponential smoothing

Exponential smoothing uses past values to predict the future, but the weights decay exponentially as the values go further back in time.
• Simple exponential smoothing: returns flat forecasts
• Double exponential smoothing: adds a trend component. Forecasts are a straight line (increasing or decreasing)
• Triple exponential smoothing: adds a seasonal component
• Trend can be “additive” or “exponential”
• Seasonality can be “additive” or “multiplicative”

Forecasting – Deep learning models

Deep neural network (DNN)

A deep neural network stacks fully connected layers and can model non-linear relationships in the time series if the activation function is non-linear.
• Start with a simple model with few hidden layers. Experiment with training for more epochs before adding layers

Long short-term memory – LSTM

An LSTM is great at processing sequences of data, such as text and time series. Its architecture allows past information to still be used for later predictions.
• You can stack many LSTM layers in your model
• You can try combining an LSTM with a CNN
• An LSTM takes longer to train, since the data is processed in sequence

Convolutional neural network – CNN

A CNN can act as a filter for our time series, due to the convolution operation, which reduces the feature space.
• A CNN trains faster than an LSTM
• Can be combined with an LSTM. Place the CNN layer before the LSTM
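All three architectures are typically trained on sliding windows of past values. The framework-specific layer code (e.g., Keras `Dense`, `LSTM`, `Conv1D`) is omitted here; this numpy sketch only shows the shared data preparation and the shape each family of layers expects, with an illustrative window size of 12:

```python
import numpy as np

# Build sliding windows: use the past `window` values to predict the next one
def make_windows(series, window):
    X = np.stack([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return X, y

series = np.sin(np.arange(200) / 5.0)
X, y = make_windows(series, window=12)

# A DNN (stacked fully connected layers) takes flat windows: (samples, window)
print(X.shape)

# LSTM and Conv1D layers expect a feature axis: (samples, window, features)
X_seq = X[..., np.newaxis]
print(X_seq.shape)
```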

Forecasting – Deep learning models

Autoregressive deep learning model

An autoregressive deep learning model feeds its predictions back into the model to make further predictions. That way, we generate a sequence of predictions, one forecast at a time.
• Can be used with any architecture, whether it’s a DNN, LSTM, or CNN
• If early predictions are bad, the errors will magnify as more predictions are made
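The feedback loop above can be sketched in plain numpy. A simple exact recurrence for a sinusoid stands in for the trained one-step network (a DNN, LSTM, or CNN would go in its place); the window size and frequency are illustrative:

```python
import numpy as np

# Stand-in for a trained one-step model. For a pure sinusoid, the
# identity y_t = 2*cos(w)*y_{t-1} - y_{t-2} holds exactly.
w = 2 * np.pi / 12
def one_step_model(window):
    return 2 * np.cos(w) * window[-1] - window[-2]

series = np.sin(w * np.arange(48))

# Recursive multi-step forecasting: feed each prediction back into the
# input window, one forecast at a time
window = list(series[-12:])
forecasts = []
for _ in range(12):
    yhat = one_step_model(np.array(window))
    forecasts.append(yhat)
    window = window[1:] + [yhat]  # slide the window, appending the forecast
```

Because each forecast is built on previous forecasts, an error in an early step would propagate into every later step; with this exact stand-in model the 12 forecasts match the true continuation of the sinusoid.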
