Time series analysis and
prediction in the deep
learning era
Alberto Arrigoni, PhD
February 2019
Time series: analysis and prediction
What will the future hold?
Past | Now | Future
Time series applications + context
Time series prediction: e.g.
demand/sales forecasting...
Use prediction for anomaly
detection: e.g. manufacturing
settings...
Counterfactual prediction:
e.g. marketing campaigns...
Show ads
Counterfactual
Time series prediction methods
(non-comprehensive list)
Classical autoregressive models
Bayesian AR models
General machine learning
approaches
Deep learning
t+3
Number of time series (~ thousands)
[the SCALE problem]
Time series are often highly erratic,
intermittent or bursty (...and on highly
different scales)
~ 10 items
2 items
Product A Product B
...
(1)
(2)
Time series prediction and sales forecasting: issues
E.g. retail businesses
Time series belong to a hierarchy
of products/categories
E.g. online retailer selling clothes
Time series prediction and sales forecasting: issues
Now
Nike t-shirts
Clothes (total sales)
T-shirts total sales
~ 100
~ 1000
(3)
For new products historical data is
missing (the cold-start problem)
(4)
Adidas t-shirts
Classical autoregressive models
Estimate model order (AIC, BIC)
Fit model parameters
(maximum likelihood)
Autoregressive component
Moving average component
Test residuals for
randomness
De-trending by differencing
Variance stabilization by log
or Box-Cox transformation
Workflow
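The workflow above can be sketched in a few lines. This is a minimal illustration, not the procedure of any specific package: it uses a plain least-squares AR(p) fit instead of full maximum likelihood, covers only the autoregressive component, and the toy series and the names fit_ar and select_order are assumptions made for the example. The AIC comparison is rough, since each order drops a different number of initial points.

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares AR(p) fit; returns (coefficients, AIC)."""
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept plus lags 1..p of the series.
    X = np.column_stack([np.ones(len(y) - p)] +
                        [y[p - k:len(y) - k] for k in range(1, p + 1)])
    target = y[p:]
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    n = len(target)
    sigma2 = float(resid @ resid) / n
    # Gaussian AIC up to an additive constant.
    aic = n * np.log(sigma2) + 2 * (p + 1)
    return beta, aic

def select_order(y, max_p=5):
    """Return the order in 1..max_p with the lowest AIC."""
    return min(range(1, max_p + 1), key=lambda p: fit_ar(y, p)[1])

# Toy series (made-up data): exponential trend with AR(1) noise.
rng = np.random.default_rng(0)
noise = np.zeros(300)
for t in range(1, 300):
    noise[t] = 0.6 * noise[t - 1] + rng.normal(scale=0.05)
series = np.exp(0.01 * np.arange(300) + noise)

# Log transform stabilizes the variance, differencing removes the trend.
stationary = np.diff(np.log(series))
best_p = select_order(stationary)
coef, aic = fit_ar(stationary, best_p)
```

The remaining workflow step, testing the residuals for randomness, would be applied to the residuals of the selected model.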
Classical autoregressive models
THE PROS:
- Good explainability
- Solid theoretical background
- Very explicit model
- A lot of control as it is a manual process
THE CONS:
- Data is seldom stationary: trend,
seasonality, cycles need to be modeled as
well
- Computationally intensive (one model for
each time series)
- No information sharing across time series
(apart from Hyndman’s hts approach) *
- Historical data are essential for
forecasting (no cold-start)
* https://robjhyndman.com/publications/hierarchical/
Tech stack and packages
- Rob Hyndman’s online text:
https://otexts.com/fpp2/
- The well-known auto.arima
function (R forecast package), plus
ets, tbats, garch, stl...
- Python’s Pyramid (pyramid-arima, later renamed pmdarima)
- Aggregate histograms over time scales
- Transform into Fourier space
- Add low/high pass filters as variables
General machine learning approach for ts prediction
Past Yt
t
Autoregressive component
- Can use any number of methods (linear, trees,
neural networks...)
- Turn the time series prediction problem into a
supervised learning problem
- Easily extendable to support multiple input
variables
- Covariates can be easily handled and
transformed through feature engineering
Covariates
E.g. feature engineering
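Turning a series into a supervised learning problem can be sketched as follows. The helper name make_supervised and the day-of-week covariate are assumptions for illustration; the resulting matrix could be passed to any regressor (linear, trees, neural networks).

```python
import numpy as np

def make_supervised(y, n_lags, covariates=None):
    """Build (X, target): each row of X holds the n_lags previous values,
    optionally followed by the covariates observed at the target time."""
    y = np.asarray(y, dtype=float)
    X = np.array([y[t - n_lags:t] for t in range(n_lags, len(y))])
    if covariates is not None:
        cov = np.asarray(covariates, dtype=float)[n_lags:]
        X = np.hstack([X, cov])
    return X, y[n_lags:]

# Toy example: 10 observations, 3 autoregressive lags, one covariate.
y = np.arange(10, dtype=float)
day_of_week = (np.arange(10) % 7).reshape(-1, 1)
X, target = make_supervised(y, 3, day_of_week)
```

Each row of X pairs the last three observed values with the covariate of the step being predicted, so the first row is the lags 0, 1, 2 plus day-of-week 3, with target 3.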
THE PROS:
- Can model non-linear relationships
- Can model the “hierarchical structure” of the
time series through categorical variables
- Support for covariates (predictors) + feature
engineering
- One model is shared among multiple time
series
- Cold-start predictions are possible by
iteratively feeding the predictions back to the
feature space
THE CONS:
- Feature engineering takes time
- Long-term relationships between data points
need to be explicitly modeled
(autoregressive features)
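The idea of feeding predictions back into the feature space, mentioned in the pros above, can be sketched as a simple loop. The function predict_one stands for any fitted one-step model; mean_of_lags is a hypothetical stand-in used only to make the example runnable. For a genuinely new product the initial window would have to be seeded from something else (for example category-level values), which is an assumption not covered here.

```python
def recursive_forecast(history, predict_one, n_lags, horizon):
    """Multi-step forecast: each prediction is appended to the lag window
    and used as an input for the next step."""
    window = list(history[-n_lags:])
    forecasts = []
    for _ in range(horizon):
        y_hat = predict_one(window)
        forecasts.append(y_hat)
        window = window[1:] + [y_hat]
    return forecasts

def mean_of_lags(window):
    """Hypothetical stand-in for a trained one-step model."""
    return sum(window) / len(window)

result = recursive_forecast([1.0, 2.0, 3.0], mean_of_lags, n_lags=3, horizon=2)
```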
General machine learning approach for ts prediction
Tech stack and packages
- Sklearn, PySpark for feature
engineering, data reduction
Bayesian AR models (Facebook Prophet)
Prophet is a Bayesian GAM (Generalized Additive Model)
Linear trend with
changepoints
Seasonal
component
Holiday-specific
component
Sales
1) Detect changepoints in the time
series
2) Fit linear trend parameters (k and
delta)
(piecewise) linear trends: growth rate + growth rate adjustment **
* Implemented using Stan
** An additional ‘offset’ term has been omitted from the formula
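A minimal sketch of the piecewise linear trend, following the formulation in the Prophet paper: the growth rate k is adjusted by delta_j at each changepoint s_j, and the offset m is adjusted by -s_j * delta_j so the trend stays continuous. Unlike the slide formula, this sketch includes the offset term. The function name and the example numbers are assumptions.

```python
def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend value at time t with base growth rate k and base offset m."""
    rate = k
    offset = m
    for s, delta in zip(changepoints, deltas):
        if t >= s:
            rate += delta          # growth rate adjustment
            offset += -s * delta   # offset adjustment keeps the trend continuous
    return rate * t + offset

# Growth rate 1.0 that drops to 0.5 after a changepoint at t = 10.
values = [piecewise_linear_trend(t, 1.0, 0.0, [10], [-0.5]) for t in (5, 10, 20)]
```

In this example the trend passes through 5 at t = 5, reaches 10 at the changepoint, and then grows at the reduced rate to 15 at t = 20.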
Bayesian AR models (Facebook Prophet)
E.g. P = 365.25 for yearly seasonality (time in days)
Need to estimate 2N parameters (a_n and b_n) using MCMC!
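The seasonal component is a truncated Fourier series, which is where the 2N parameters come from: one cosine and one sine coefficient per harmonic. A minimal sketch, assuming the coefficients are ordered a_1, b_1, a_2, b_2, ... (the function names and the ordering are choices made for this example):

```python
import math

def fourier_features(t, period, order):
    """Return [cos, sin] pairs for harmonics 1..order at time t."""
    feats = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        feats.extend([math.cos(angle), math.sin(angle)])
    return feats

def seasonal_component(t, period, coeffs):
    """Weighted sum of Fourier features; len(coeffs) must equal 2 * order."""
    feats = fourier_features(t, period, len(coeffs) // 2)
    return sum(c * f for c, f in zip(coeffs, feats))

# Weekly seasonality with N = 2 harmonics, i.e. 4 coefficients (made-up values).
weekly = seasonal_component(3, 7, [0.5, -1.0, 0.2, 0.3])
```

Fitting then amounts to estimating the coefficient vector, whatever inference method is used.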
THE PROS:
- Uncertainty estimation
- Bayesian changepoint detection
- User-in-the-loop paradigm (Prophet)
- Black-box variational inference is
revolutionizing Bayesian inference
THE CONS:
- Bayesian inference takes time (the “scale”
issue)
- One model for each time series
- No information sharing among series
(unless you specify a hierarchical bayesian
model with shared parameters, but still...)
- Historical data are needed for prediction!
- Performance is often on par* with
autoregressive models
Tech stack and packages
- Python/R clients for Prophet *
- R package for structural bayesian
time series models: Bsts
Bayesian AR models
* Taylor et al., Forecasting at scale
* This may open endless discussions. Bottom line: depends on your data :)
Interlude: uncertainty estimation with deep learning
- Uncertainty estimation is a prerogative of Bayesian methods.
- Black box variational inference (ADVI) has sparked renewed interest in Bayesian
neural networks, but we are not there yet in terms of performance
- A DeepMind paper from NIPS 2017 introduces a simple yet effective way to estimate
predictive uncertainty using Deep Ensembles
For a TensorFlow implementation of this paper: https://arrigonialberto86.github.io/funtime/deep_ensembles.html
“Engineering Uncertainty
Estimation in Neural Networks for
Time Series Prediction at Uber”
https://eng.uber.com/neural-networks-uncertainty-estimation/
1) 2)
Interlude: Deep Ensembles
Train a deep learning model using a custom
final layer which parametrizes a Gaussian
distribution
Sample x from the Gaussian
distribution using fitted
parameters
Calculate loss to backpropagate the
error (using Gaussian likelihood)
(1)
(3)
(2)
Network output
What the network is learning: different
regions of the x space have different
variances
Generate a synthetic
dataset with different
variances
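The loss and the synthetic dataset can be sketched as follows. The Gaussian negative log-likelihood is standard; the particular data-generating function (a sine curve whose noise grows with |x|) is an assumption chosen only to give different regions different variances.

```python
import math
import random

def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of y under a Gaussian N(mu, sigma^2)."""
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2 * sigma ** 2)

def make_dataset(n, seed=0):
    """Toy data: y = sin(x) + noise, with noise std growing with |x|."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(-3, 3)
        sigma = 0.1 + 0.3 * abs(x)
        data.append((x, math.sin(x) + rng.gauss(0, sigma)))
    return data

# A large residual is penalized less when the predicted sigma is large,
# which is what lets the network learn region-specific variances.
loss_confident = gaussian_nll(3.0, 0.0, 1.0)
loss_uncertain = gaussian_nll(3.0, 0.0, 3.0)
```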
Interlude: Deep Ensembles
PREDICTION ON
TRAINING DATASET
SYNTHETIC TRAINING
DATASET
Use the network from previous
slide to predict on the training
set to see if it actually detects
variance reduction
Interlude: Deep Ensembles
The authors suggest training different NNs on the
same data (the whole training set) with random
initialization
Ensemble networks (improve generalization power)
Uniformly weighted mixture model
Predictions for regions outside of
the training dataset show
increasing variance (due to
ensembling)
In addition to ‘distribution’ modeling
and ensembling, the authors suggest
using the fast gradient sign method * to
produce adversarial training examples
(Not shown here)
* Goodfellow et al., 2014
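For a uniformly weighted mixture of Gaussians, the ensemble prediction can be summarized by the mixture mean and variance, as described in the Deep Ensembles paper. A minimal sketch for a single input point (the function name is an assumption):

```python
def combine_ensemble(mus, sigmas):
    """Mean and standard deviation of a uniformly weighted Gaussian mixture."""
    m = len(mus)
    mean = sum(mus) / m
    second_moment = sum(s ** 2 + mu ** 2 for mu, s in zip(mus, sigmas)) / m
    variance = max(second_moment - mean ** 2, 0.0)  # guard against rounding
    return mean, variance ** 0.5

# Two networks that agree: the mixture keeps their common sigma.
agree = combine_ensemble([1.0, 1.0], [2.0, 2.0])
# Two networks that disagree on the mean: the mixture variance grows.
disagree = combine_ensemble([0.0, 2.0], [1.0, 1.0])
```

Disagreement between the ensemble members adds to the predicted variance, which is why regions far from the training data show wider uncertainty.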
Interlude: Deep Ensembles
Custom GaussianLayer
Let’s just do some extra work and define a
custom layer
Interlude: Deep Ensembles
Custom layer returns both
mu and sigma
Build 2 weight matrices + 2
bias terms
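A framework-free sketch of such a layer, written with NumPy rather than TensorFlow so it stays self-contained. It is not the implementation from the linked post: the class name mirrors the slide, but the softplus used to keep sigma positive and the initialization scale are assumptions.

```python
import numpy as np

class GaussianLayer:
    """Final layer with two heads: one for mu, one for sigma."""

    def __init__(self, in_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Two weight matrices and two bias terms, one pair per output.
        self.w_mu = rng.normal(scale=0.1, size=(in_dim, 1))
        self.b_mu = np.zeros(1)
        self.w_sigma = rng.normal(scale=0.1, size=(in_dim, 1))
        self.b_sigma = np.zeros(1)

    def __call__(self, h):
        mu = h @ self.w_mu + self.b_mu
        # Softplus keeps sigma strictly positive (an assumed choice).
        sigma = np.log1p(np.exp(h @ self.w_sigma + self.b_sigma)) + 1e-6
        return mu, sigma

layer = GaussianLayer(in_dim=4)
mu, sigma = layer(np.ones((3, 4)))  # batch of 3 hidden vectors
```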
DeepAR (Amazon)
Instead of fitting separate models for each time series, we create a global model from related time
series; widely varying scales are handled through rescaling and velocity-based sampling.
Different scales
Probabilities
~1000 time series
Past Future
Covariates
Flunkert et al., 2017
DeepAR (Amazon)
ht-1
ht
ht+1
- Use an LSTM to model interactions in the time series
- As seen with the Deep Ensemble
architecture, we can predict parameters of
distributions at each time point (theta
vector)
- Time series need to be scaled for the
network to learn time-varying dynamics
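The scaling step can be sketched as follows, using the mean-based scale factor described in the DeepAR paper (one plus the average of the conditioning range). The function names are assumptions; the network sees the rescaled values, and its Gaussian outputs are multiplied back by the same factor.

```python
def scale_factor(series):
    """Item-specific scale: 1 + mean of the observed values."""
    return 1.0 + sum(series) / len(series)

def rescale(series):
    """Divide a series by its scale factor; return scaled values and the factor."""
    nu = scale_factor(series)
    return [z / nu for z in series], nu

def unscale_gaussian(mu, sigma, nu):
    """Map network outputs back to the original scale of the series."""
    return mu * nu, sigma * nu

scaled, nu = rescale([2.0, 4.0, 6.0])
mu_orig, sigma_orig = unscale_gaussian(0.8, 0.1, nu)
```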
DeepAR (Amazon)
Training Prediction *
* Likelihood/loss is customizable: Gaussian, or negative binomial for count data + overdispersion
For a commentary + code review: https://arrigonialberto86.github.io/funtime/deepar.html
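For count data, the negative binomial likelihood can be written in terms of a mean mu and a shape parameter alpha, the parametrization used in the DeepAR paper (variance = mu + alpha * mu^2). A minimal sketch of the log-probability; the function name is an assumption.

```python
import math

def neg_binomial_logpmf(z, mu, alpha):
    """Log-probability of count z under a negative binomial with
    mean mu > 0 and shape (overdispersion) alpha > 0."""
    r = 1.0 / alpha
    return (math.lgamma(z + r) - math.lgamma(z + 1) - math.lgamma(r)
            - r * math.log1p(alpha * mu)
            + z * (math.log(alpha * mu) - math.log1p(alpha * mu)))

# Sanity check on a toy setting: probabilities sum to one, mean equals mu.
probs = [math.exp(neg_binomial_logpmf(z, 3.0, 0.5)) for z in range(500)]
total = sum(probs)
mean = sum(z * p for z, p in enumerate(probs))
```

The training loss is the negative of this quantity, summed over time points and series.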
DeepAR (Amazon)
The mandatory ‘AirPassengers’ prediction example (results shown on training set)
Admittedly, this is not the use case Amazon had in mind...
DeepAR (Amazon)
- Long-term relationships are handled by
design using LSTMs
- One model is fitted for all the time series
- The hierarchical ts structure and
inter-dependencies are captured by
using covariates (even holidays,
recurrent events etc...)
- The model can be used for cold-start
predictions (using categorical covariates
with ‘descriptive’ product information)
- Hassle-free uncertainty estimation
DeepAR and the AWS ecosystem
AWS SageMaker
Deep State Space (NIPS 2018)*
A state space model (SSM) is just like a Hidden Markov Model, except the hidden states are
continuous
Observation (z_t) update
Latent state (l_t) update
In normal settings we would need to fit these parameters for each time series
zt-1 zt
zt+1
???
* Rangapuram et al, 2018, Deep State Space Models for Time Series Forecasting
Deep State Space (NIPS 2018)
Training
Prediction
Compute the negative
log-likelihood, derive the
time-varying SS
parameters using
backpropagation
Use Kalman filtering to
estimate l_t, then
recursively apply the
transition equation and the
observation model to
generate prediction
samples
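The prediction step can be illustrated on the simplest state space model, a local level model with fixed noise variances. This is a textbook sketch, not the Deep State Space model itself: there the parameters are time-varying and produced by an RNN, whereas q and r here are fixed constants and all names are assumptions.

```python
import random

def kalman_filter_local_level(obs, q, r, mean0=0.0, var0=1e6):
    """Filter a local level model: l_t = l_{t-1} + noise(q), z_t = l_t + noise(r).
    Returns the posterior mean and variance of the latest latent state."""
    mean, var = mean0, var0
    for z in obs:
        var = var + q                     # predict the next latent state
        gain = var / (var + r)            # Kalman gain
        mean = mean + gain * (z - mean)   # update with the observation
        var = (1.0 - gain) * var
    return mean, var

def sample_forecasts(mean, var, q, r, horizon, n_samples, seed=0):
    """Draw forecast paths by applying the transition and observation models."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_samples):
        level = rng.gauss(mean, var ** 0.5)
        path = []
        for _ in range(horizon):
            level += rng.gauss(0.0, q ** 0.5)              # transition equation
            path.append(level + rng.gauss(0.0, r ** 0.5))  # observation model
        paths.append(path)
    return paths

mean, var = kalman_filter_local_level([5.0] * 50, q=0.01, r=1.0)
paths = sample_forecasts(mean, var, q=0.01, r=1.0, horizon=7, n_samples=100)
```

Quantiles of the sampled paths at each horizon step give the prediction intervals.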
- Long-term relationships are handled by
design using LSTMs
- One model is fitted for all the time
series
- The hierarchical ts structure and
inter-dependencies are captured by
ad-hoc design and components of the SS
model (even holidays, recurrent events
etc...)
- The model can be used for cold-start
predictions (using categorical covariates
with ‘descriptive’ product information)
Deep State Space (NIPS 2018)
Going forward: Deep factors with GPs *
* Maddix et al., “Deep Factors with Gaussian Processes for Forecasting”, NIPS 2018
The combination of probabilistic graphical models with deep neural networks has been an active
research area recently
Global DNN backbone and local Gaussian Process (GP). The main idea is to represent each
time series as a combination of a global time series and a corresponding local model.
Diagram: RNN (+ covariates) -> global factors g_t -> time series z_it
Backpropagation to find RNN parameters to produce global factors (g_t)
+ GP hyperparameters
M4 forecasting competition winner algo (Uber, 2018)
The winning idea is often the simplest!
Hybrid Exponential Smoothing-Recurrent Neural Networks (ES-RNN) method. It
mixes hand-coded parts like ES formulas with a black-box recurrent neural network
(RNN) forecasting engine.
yt-1
yt
yt+1
Deseasonalized and normalized vector of covariates + previous state
RNN results are now part of a parametric model
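The hand-coded exponential smoothing part can be sketched as below: a multiplicative level and seasonality are updated per series, and the series is divided by both before being fed to the RNN. This is a simplified reading of the ES-RNN description, with fixed smoothing coefficients alpha and gamma and a flat initial seasonality as assumptions; in the actual method these quantities are learned together with the network weights, and further transformations may be applied.

```python
def es_normalize(y, period, alpha, gamma, init_seasonal=None):
    """Return the deseasonalized, level-normalized series plus the final
    level and the list of seasonal factors."""
    seasonal = list(init_seasonal) if init_seasonal else [1.0] * period
    level = y[0] / seasonal[0]
    normalized = []
    for t, value in enumerate(y):
        s_t = seasonal[t]
        # Level update on the deseasonalized observation.
        level = alpha * value / s_t + (1 - alpha) * level
        # Seasonal factor for the same position one period ahead.
        seasonal.append(gamma * value / level + (1 - gamma) * s_t)
        normalized.append(value / (level * s_t))
    return normalized, level, seasonal

normalized, level, seasonal = es_normalize([10.0] * 8, period=4, alpha=0.5, gamma=0.1)
```

Forecasts from the RNN would then be multiplied back by the level and the matching seasonal factors to return to the original scale.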
Summary of performance
Approaches compared: classical autoregressive models | Bayesian models (GAM/structural) | classical machine learning | deep learning approaches
Criteria: scalability | info sharing across ts | cold-start predictions | uncertainty estimation | unevenly spaced time series *
* DeepAR, Deep Factors
* Chen et al., Neural ordinary differential equations, 2018 / Futoma et al., 2017, Multitask GP + RNN
BACKUP SLIDES
Deep State Space (Amazon)
Level-trend model parametrization:
DeepAR (Amazon)
Step 1 Step 2 Step 3
Training procedure:
- Predict parameters (e.g. mu,
sigma)
- Compute likelihood of the
prediction (can be Gaussian as we
have seen with Deep Ensembles) *
- Sample next point
* Likelihood/loss is customizable: Gaussian/negative
binomial for count data + overdispersion
Training
Prediction (~ Monte Carlo)

Time series deep learning

  • 1. Time series analysis and prediction in the deep learning era Alberto Arrigoni, PhD February 2019
  • 2. Time series: analysis and prediction. What will the future hold? [Timeline: Past, Now, Future]
  • 3. Time series applications + context. Time series prediction: e.g. demand/sales forecasting... Use prediction for anomaly detection: e.g. manufacturing settings... Counterfactual prediction: e.g. marketing campaigns... [Figure labels: Show ads, Counterfactual]
  • 5. Time series prediction methods (non-comprehensive list): classical autoregressive models; Bayesian AR models; general machine learning approaches; deep learning.
  • 6. Time series prediction and sales forecasting: issues. E.g. retail businesses. (1) Number of time series (~ thousands) [the SCALE problem]. (2) Time series are often highly erratic, intermittent or bursty (...and on highly different scales). [Figure labels: Product A, Product B, ~ 10 items, 2 items]
  • 7. Time series prediction and sales forecasting: issues. E.g. online retailer selling clothes. (3) Time series belong to a hierarchy of products/categories. (4) For new products historical data is missing (the cold-start problem). [Figure labels: Clothes (total sales), T-shirts total sales, Nike t-shirts, Adidas t-shirts, ~ 100, ~ 1000, Now]
  • 8. Classical autoregressive models. Workflow: - Variance stabilization by log or Box-Cox transformation - De-trending by differencing - Estimate model order (AIC, BIC) - Fit model parameters (maximum likelihood) - Test residuals for randomness. [Equation labels: Autoregressive component, Moving average component]
  • 9. Classical autoregressive models. THE PROS: - Good explainability - Solid theoretical background - Very explicit model - A lot of control as it is a manual process. THE CONS: - Data is seldom stationary: trend, seasonality, cycles need to be modeled as well - Computationally intensive (one model for each time series) - No information sharing across time series (apart from Hyndman’s hts approach) * - Historical data are essential for forecasting (no cold-start). * https://robjhyndman.com/publications/hierarchical/ Tech stack and packages: - Rob Hyndman’s online text: https://otexts.com/fpp2/ - Infamous auto.arima function, ets, tbats, garch, stl... - Python’s Pyramid
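The workflow on slide 8 can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the AR(1) coefficient is estimated by least squares rather than by maximum likelihood, and order selection (AIC/BIC) and residual tests are left out. In practice a package such as the ones listed above would do all of this.

```python
import math
import random


def log_transform(series):
    """Variance stabilization: natural log (requires positive values)."""
    return [math.log(v) for v in series]


def difference(series):
    """De-trending: first differences y[t] - y[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(series):
    """Least-squares estimate of phi in x[t] = phi * x[t-1] + noise.

    The series is centered first. Hypothetical helper, not a library API.
    """
    mean = sum(series) / len(series)
    x = [v - mean for v in series]
    num = sum(a * b for a, b in zip(x, x[1:]))
    den = sum(a * a for a in x[:-1])
    return num / den


# Synthetic data: the log of the series grows by a drift plus AR(1) shocks.
rng = random.Random(0)
shock, log_level, raw = 0.0, 0.0, []
for _ in range(2000):
    shock = 0.6 * shock + rng.gauss(0.0, 0.05)
    log_level += 0.002 + shock
    raw.append(math.exp(log_level))

stationary = difference(log_transform(raw))  # log, then difference
phi = fit_ar1(stationary)  # should land near the 0.6 used above
```

Every series needs its own pass through these steps, which is the "one model for each time series" cost listed among the cons.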
  • 10. General machine learning approach for ts prediction - Turn the time series prediction problem into a supervised learning problem - Can use any number of methods (linear, trees, neural networks...) - Easily extendable to support multiple input variables - Covariates can be easily handled and transformed through feature engineering. E.g. feature engineering: - Aggregate histograms over time scales - Transform into Fourier space - Add low/high pass filters as variables. [Figure labels: Past, Yt, t, Autoregressive component, Covariates]
  • 11. General machine learning approach for ts prediction. THE PROS: - Can model non-linear relationships - Can model the “hierarchical structure” of the time series through categorical variables - Support for covariates (predictors) + feature engineering - One model is shared among multiple time series - Cold-start predictions are possible by iteratively feeding the predictions back to the feature space. THE CONS: - Feature engineering takes time - Long-term relationships between data points need to be explicitly modeled (autoregressive features). Tech stack and packages: - Sklearn, PySpark for feature engineering, data reduction
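Turning a forecasting problem into a supervised learning problem, as described on slides 10 and 11, mostly means building lagged (autoregressive) features. A minimal sketch follows; the function name, the `promo` covariate and the numbers are made up for illustration.

```python
def make_supervised(series, n_lags, covariates=None):
    """Build (X, y): the n_lags previous values (+ optional covariates) -> next value."""
    X, y = [], []
    for t in range(n_lags, len(series)):
        row = list(series[t - n_lags:t])      # autoregressive features
        if covariates is not None:
            row += list(covariates[t])        # covariates known at time t
        X.append(row)
        y.append(series[t])
    return X, y


sales = [10, 12, 13, 15, 18, 21]
promo = [[0], [0], [1], [0], [1], [1]]  # hypothetical covariate: promotion flag
X, y = make_supervised(sales, n_lags=3, covariates=promo)
# X[0] is [10, 12, 13, 0] and y[0] is 15
```

Any regressor (linear, trees, neural networks...) can then be trained on `X` and `y`. Rows from several time series can be stacked into the same table, which is how one model ends up shared among them.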
  • 12. Bayesian AR models (Facebook Prophet). Prophet is a Bayesian GAM (Generalized Additive Model): linear trend with changepoints + seasonal component + holiday-specific component. (Piecewise) linear trends: 1) Detect changepoints in the time series 2) Fit linear trend parameters (k and delta): growth rate and growth rate adjustment **. * Implemented using STAN. ** An additional ‘offset’ term has been omitted from the formula. [Figure axes: t, Sales]
  • 13. Bayesian AR models (Facebook Prophet). Prophet is a Bayesian GAM (Generalized Additive Model): linear trend with changepoints + seasonal component + holiday-specific component. Seasonal component: e.g. P = 365 for yearly data. Need to estimate 2N parameters (a_n and b_n) using MCMC! [Figure axes: t, Sales]
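The 2N parameters on slide 13 are the coefficients of a Fourier series with period P, which is how the Prophet paper (Taylor et al., cited on slide 14) defines the seasonal component. The sketch below only builds the Fourier terms and evaluates the sum for given coefficients; the function names and coefficient values are placeholders, and the fitting step (done by Prophet through Stan) is not shown.

```python
import math


def fourier_features(t, period, order):
    """Return [cos(2*pi*n*t/P), sin(2*pi*n*t/P)] for n = 1..order (2N values)."""
    feats = []
    for n in range(1, order + 1):
        angle = 2.0 * math.pi * n * t / period
        feats.extend([math.cos(angle), math.sin(angle)])
    return feats


def seasonal_component(t, period, a, b):
    """s(t) = sum over n of a[n] * cos(...) + b[n] * sin(...), with len(a) == len(b) == N."""
    feats = fourier_features(t, period, len(a))
    return sum(a_n * feats[2 * i] + b_n * feats[2 * i + 1]
               for i, (a_n, b_n) in enumerate(zip(a, b)))


row = fourier_features(t=0, period=365, order=3)       # 2N = 6 values
s0 = seasonal_component(0, 365, a=[1.0, 2.0], b=[5.0, 5.0])
```

A larger N lets the seasonal curve change faster, at the cost of more parameters to estimate.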
  • 14. Bayesian AR models. THE PROS: - Uncertainty estimation - Bayesian changepoint detection - User-in-the-loop paradigm (Prophet) - Black-box variational inference is revolutionizing Bayesian inference. THE CONS: - Bayesian inference takes time (the “scale” issue) - One model for each time series - No information sharing among series (unless you specify a hierarchical Bayesian model with shared parameters, but still...) - Historical data are needed for prediction! - Performance is often on par * with autoregressive models. Tech stack and packages: - Python/R clients for Prophet ** - R package for structural Bayesian time series models: Bsts. * This may open endless discussions. Bottom line: depends on your data :) ** Taylor et al., Forecasting at scale
  • 15. Interlude: uncertainty estimation with deep learning - Uncertainty estimation is a prerogative of Bayesian methods. - Black box variational inference (ADVI) has sparked renewed interest in Bayesian neural networks, but we are not there yet in terms of performance - A DeepMind paper from NIPS 2017 introduces a simple yet effective way to estimate predictive uncertainty using Deep Ensembles. For a TensorFlow implementation of this paper: https://arrigonialberto86.github.io/funtime/deep_ensembles.html See also: “Engineering Uncertainty Estimation in Neural Networks for Time Series Prediction at Uber” https://eng.uber.com/neural-networks-uncertainty-estimation/
  • 16. Interlude: Deep Ensembles - Train a deep learning model using a custom final layer which parametrizes a Gaussian distribution - Sample x from the Gaussian distribution using fitted parameters - Calculate loss to backpropagate the error (using Gaussian likelihood). [Figure label: Network output]
  • 17. Interlude: Deep Ensembles. What the network is learning: different regions of the x space have different variances. Generate a synthetic dataset with different variances. Use the network from previous slide to predict on the training set to see if it actually detects variance reduction. [Panels: SYNTHETIC TRAINING DATASET, PREDICTION ON TRAINING DATASET]
  • 18. Interlude: Deep Ensembles. Ensemble networks (improve generalization power): the authors suggest training different NNs on the same data (the whole training set) with random initialization. Uniformly weighted mixture model. Predictions for regions outside of the training dataset show increasing variance (due to ensembling). In addition to ‘distribution’ modeling and ensembling the authors suggest using the fast gradient sign method * to produce adversarial training examples (not shown here). * Goodfellow et al., 2014
  • 19. Interlude: Deep Ensembles. Custom GaussianLayer: let’s just do some extra work and define a custom layer. For a TensorFlow implementation of this paper: https://arrigonialberto86.github.io/funtime/deep_ensembles.html
  • 20. Interlude: Deep Ensembles. Custom layer returns both mu and sigma. Build 2 weight matrices + 2 bias terms
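Two pieces of the Deep Ensembles recipe from slides 16 to 20 fit in a few lines: the Gaussian negative log-likelihood used as the loss, and the moment formulas that collapse a uniformly weighted mixture of M Gaussians into one mean and one standard deviation. The sketch below is framework-free; the function names are illustrative and the three (mu, sigma) pairs are invented stand-ins for the outputs of three trained networks.

```python
import math


def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of y under N(mu, sigma^2)."""
    var = sigma ** 2
    return 0.5 * math.log(2.0 * math.pi * var) + (y - mu) ** 2 / (2.0 * var)


def combine_ensemble(mus, sigmas):
    """Mean and std of a uniformly weighted mixture of Gaussians."""
    m = len(mus)
    mean = sum(mus) / m
    second_moment = sum(s ** 2 + mu ** 2 for mu, s in zip(mus, sigmas)) / m
    var = second_moment - mean ** 2
    return mean, math.sqrt(max(var, 0.0))


# Hypothetical outputs of 3 networks for the same input x.
mu_star, sigma_star = combine_ensemble([1.0, 2.0, 3.0], [1.0, 1.0, 1.0])
```

When the networks disagree on mu, the mixture variance grows beyond the average of the individual variances. This is the mechanism behind the increasing variance outside the training region noted on slide 18.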
  • 21. DeepAR (Amazon). Instead of fitting separate models for each time series we create a global model from related time series to handle widely-varying scales through rescaling and velocity-based sampling. [Figure labels: ~1000 time series, Different scales, Past, Future, Covariates, Probabilities] Flunkert et al., 2017
  • 22. DeepAR (Amazon) - Use LSTMs to model interactions in the time series - As seen with the Deep Ensemble architecture, we can predict parameters of distributions at each time point (theta vector) - Time series need to be scaled for the network to learn time-varying dynamics. [Figure labels: h_{t-1}, h_t, h_{t+1}]
  • 23. DeepAR (Amazon). [Panels: Training, Prediction] * Likelihood/loss is customizable: Gaussian/negative binomial for count data + overdispersion
  • 24. DeepAR (Amazon). The mandatory ‘AirPassengers’ prediction example (results shown on training set). Granted, this is not the use case Amazon had in mind... For a commentary + code review: https://arrigonialberto86.github.io/funtime/deepar.html
  • 25. DeepAR (Amazon) - Long-term relationships are handled by design using LSTMs - One model is fitted for all the time series - The hierarchical ts structure and inter-dependencies are captured by using covariates (even holidays, recurrent events etc...) - The model can be used for cold-start predictions (using categorical covariates with ‘descriptive’ product information) - Hassle-free uncertainty estimation. DeepAR and the AWS ecosystem: AWS SageMaker
  • 26. Deep State Space (NIPS 2018) *. A state space model or SSM is just like a Hidden Markov Model, except the hidden states are continuous. Latent state (l_t) update; observation (z_t) update. In normal settings we would need to fit these parameters for each time series. [Figure labels: z_{t-1}, z_t, z_{t+1}, ???] * Rangapuram et al., 2018, Deep State Space Models for Time Series Forecasting
  • 27. Deep State Space (NIPS 2018). Training: compute the negative log-likelihood, derive the time-varying SS parameters using backpropagation. Prediction: use Kalman filtering to estimate l_t, then recursively apply the transition equation and the observation model to generate prediction samples
  • 28. Deep State Space (NIPS 2018) - Long-term relationships are handled by design using LSTMs - One model is fitted for all the time series - The hierarchical ts structure and inter-dependencies are captured by ad-hoc design and components of the SS model (even holidays, recurrent events etc...) - The model can be used for cold-start predictions (using categorical covariates with ‘descriptive’ product information)
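The Kalman filtering step from slide 27 can be illustrated on the simplest possible state space model. The sketch below makes strong simplifying assumptions: a scalar local-level model (l_t = l_{t-1} + noise, z_t = l_t + noise) instead of a level-trend model, and fixed noise variances instead of time-varying parameters emitted by an RNN. Function and argument names are illustrative.

```python
import math


def kalman_local_level(observations, m0, p0, state_var, obs_var):
    """Filter a scalar local-level model; return (means, variances, loglik)."""
    m, p = m0, p0
    means, variances, loglik = [], [], 0.0
    for z in observations:
        # Predict: random-walk transition keeps the mean, inflates the variance.
        m_pred, p_pred = m, p + state_var
        # Log-likelihood of z under the one-step-ahead predictive Gaussian.
        s = p_pred + obs_var
        loglik += -0.5 * (math.log(2.0 * math.pi * s) + (z - m_pred) ** 2 / s)
        # Update the latent state estimate with the Kalman gain.
        k = p_pred / s
        m = m_pred + k * (z - m_pred)
        p = (1.0 - k) * p_pred
        means.append(m)
        variances.append(p)
    return means, variances, loglik


means, variances, loglik = kalman_local_level(
    [5.0] * 50, m0=0.0, p0=10.0, state_var=0.01, obs_var=1.0)
```

The accumulated `loglik` is the quantity whose negative would serve as the training loss. After filtering, forecasts come from repeatedly applying the transition and observation equations starting from the last filtered state.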
  • 29. Going forward: Deep factors with GPs *. The combination of probabilistic graphical models with deep neural networks has been an active research area recently. Global DNN backbone and local Gaussian Process (GP). The main idea is to represent each time series as a combination of a global time series and a corresponding local model. Backpropagation to find RNN parameters to produce global factors (g_t) + GP hyperparameters. [Figure labels: RNN, g_t, z_it + covariates] * Maddix et al., “Deep Factors with Gaussian Processes for Forecasting”, NIPS 2018
  • 30. M4 forecasting competition winner algo (Uber, 2018). The winning idea is often the simplest! Hybrid Exponential Smoothing-Recurrent Neural Networks (ES-RNN) method. It mixes hand-coded parts like ES formulas with a black-box recurrent neural network (RNN) forecasting engine. RNN results are now part of a parametric model. [Figure labels: y_{t-1}, y_t, y_{t+1}; Deseasonalized and normalized vector of covariates + previous state]
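The "hand-coded ES formulas" on slide 30 can be illustrated with simple exponential smoothing, which tracks a per-series level that is then used to normalize the values before they reach the RNN. This is a rough sketch and not the competition code: it has no seasonal component, and alpha is fixed by hand here, whereas in the actual method the smoothing coefficients are fitted together with the network weights. All names are illustrative.

```python
def exponential_smoothing_level(series, alpha):
    """Simple exponential smoothing: l_t = alpha * y_t + (1 - alpha) * l_{t-1}."""
    level = series[0]          # initialize the level with the first observation
    levels = [level]
    for y in series[1:]:
        level = alpha * y + (1.0 - alpha) * level
        levels.append(level)
    return levels


def normalize_by_level(series, levels):
    """Divide each value by its level so that series on different scales become comparable."""
    return [y / l for y, l in zip(series, levels)]


y = [100.0, 110.0, 120.0, 130.0]
levels = exponential_smoothing_level(y, alpha=0.5)
rnn_inputs = normalize_by_level(y, levels)  # values close to 1 regardless of scale
```

Multiplying the network output back by the level returns the forecast to the original scale, which is one way to read "RNN results are now part of a parametric model".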
  • 31. Summary of performance. Methods: classical autoregressive models; Bayesian models (GAM/structural); classical machine learning; deep learning approaches. Criteria: scalability; info sharing across ts; cold-start predictions; uncertainty estimation; unevenly spaced time series *. [Figure labels: DeepAR, Deep Factors] * Chen et al., Neural ordinary differential equations, 2018 / Futoma et al., 2017, Multitask GP + RNN
  • 33. Deep State Space (Amazon) Level-trend model parametrization:
  • 34. DeepAR (Amazon). Training procedure: Step 1 - Predict parameters (e.g. mu, sigma). Step 2 - Compute likelihood of the prediction (can be Gaussian as we have seen with Deep Ensembles) *. Step 3 - Sample next point. * Likelihood/loss is customizable: Gaussian/negative binomial for count data + overdispersion. [Panels: Training, Prediction (~ Monte Carlo)]
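The "predict parameters, then sample next point" loop, repeated many times to obtain Monte Carlo forecast paths, can be sketched as follows. `toy_model` is a hypothetical stand-in for the trained LSTM: its formulas are invented so that the example runs without a deep learning framework. Only the sampling loop is the point here.

```python
import random


def toy_model(prev_value, state):
    """Hypothetical stand-in for the LSTM: returns (mu, sigma, new_state)."""
    new_state = 0.5 * state + 0.5 * prev_value
    mu = 0.9 * new_state + 1.0
    sigma = 0.1 + 0.05 * abs(new_state)
    return mu, sigma, new_state


def sample_paths(last_value, state, horizon, n_paths, seed=0):
    """Ancestral sampling: draw a value, feed it back in, repeat for each step."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_paths):
        value, s, path = last_value, state, []
        for _ in range(horizon):
            mu, sigma, s = toy_model(value, s)   # predict parameters
            value = rng.gauss(mu, sigma)         # sample the next point
            path.append(value)
        paths.append(path)
    return paths


def quantile(values, q):
    """Empirical quantile by nearest rank (simple, no interpolation)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[index]


paths = sample_paths(last_value=10.0, state=10.0, horizon=5, n_paths=200)
step3 = [p[2] for p in paths]  # all sampled values for the third forecast step
low, median, high = (quantile(step3, q) for q in (0.1, 0.5, 0.9))
```

Quantiles taken across the sampled paths give the prediction intervals. For count data a negative binomial draw would replace `rng.gauss`.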