
TIME SERIES PREDICTION WITH DEEP LEARNING MODELS

Zafer Üşi
5 min read · Aug 26, 2023


Abstract

In this article, I will discuss how time series prediction can be performed with deep learning models, in which scenarios the popular models built for this purpose outperform one another, and why they should be preferred.

CHILDREN OF AI: MACHINE LEARNING AND DEEP LEARNING

First, for those new to the field, let’s talk about what machine learning and deep learning are and why they have become so popular recently.

Machine learning is the process by which a computer examines the input-output relationship in various data types, such as text, images, and audio, builds functions weighted with parameters, and then uses those functions to produce output for a given task. Its main uses are regression, segmentation, and prediction.
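To make this concrete, here is a minimal sketch of that idea, assuming scikit-learn as the library (my choice for illustration; the article names none): the model learns parameter weights from input-output pairs and then produces output for a new input.

import numpy as np
from sklearn.linear_model import LinearRegression

# Toy input/output pairs, roughly following y = 2x
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.1, 4.0, 6.2, 7.9])

model = LinearRegression()
model.fit(X, y)                           # learn the parameter weights
print(model.predict(np.array([[5.0]])))   # output for an unseen input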

Deep learning, on the other hand, is a subset of machine learning that differs from it in some key features. Thanks to its multi-layered structure, deep learning can better capture the hierarchical structure in unstructured data and perform more successfully during training. For example, when extracting objects from image data, it first captures one-dimensional patterns and then two-dimensional patterns. This layered approach is a successful way to prevent loss of training success.
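As a sketch of that layered idea (a Keras example of my own; the article includes no code), the early convolution layers capture small local patterns and the deeper layers combine them into larger ones:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(64, 64, 3)),                  # an RGB image
    tf.keras.layers.Conv2D(16, 3, activation="relu"),   # small local patterns
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),   # larger combined patterns
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),    # object classes
])
model.summary()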

Among unstructured data types, deep learning is used for tasks such as object detection and object segmentation in image sets, and it is widely used in natural language processing as well as time series classification and prediction tasks on text-based data.

Figure: Neural network structure.
Figure: Object detection (image from Pixabay).

THEN WHAT IS TIME SERIES FORECASTING?

A time series prediction algorithm is a mathematical method, implementable on a computer, that serves the purpose of estimating one or more future variables by referencing data from past timestamps.

Past timestamps usually correspond to indexed real time, but their meaning is not limited to that. In other words, any set of rows sharing the same attributes can form a time series. To give an example, estimating the energy conservation performance of new flats, which has not yet been determined, from the energy conservation performance data of houses at nearby addresses is one of my current study subjects. You can also check out another of my works, on stock valuation, on my GitHub account.
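A minimal sketch of this framing, with toy data of my own: any indexed series can be turned into a supervised prediction problem by sliding a window over past values, where each window is the input and the next value is the target.

import numpy as np

def make_windows(series, window):
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])   # the past `window` steps
        y.append(series[i + window])     # the value to predict
    return np.array(X), np.array(y)

series = np.array([10, 12, 13, 15, 16, 18, 20], dtype=float)
X, y = make_windows(series, window=3)
print(X[0], "->", y[0])   # [10. 12. 13.] -> 15.0

Any model, from linear regression to an LSTM, can then be trained on these (X, y) pairs.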

WHAT IS THE DIFFERENCE BETWEEN TRADITIONAL NEURAL NETWORKS AND RECURRENT NEURAL NETWORKS?

Traditional neural networks have a simpler structure than recurrent neural networks. While traditional neural networks are less successful at examining dependencies and relationships across past time steps, RNNs can produce better results: through backpropagation, they update weight matrices that let each unit in the network evaluate the importance of historical information.

Let me illustrate this with a table.

Table: each person and the order they placed.

Linear regression models based on traditional neural networks can interpret this table as "Mike always orders pizza, Jennifer orders kebab, and Jennifer orders burgers," but they fail to take into account that John also sometimes orders pizza. So linear regression models will achieve low accuracy on such correlated series. This is where RNN models take over: they compute these input-output correlations, so the predictions will be more accurate. For more details on how RNNs do this, you can check the articles in the references.
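As a sketch of such a model (my own Keras example, not the author's code), an RNN carries a hidden state across time steps, which is what lets it pick up the correlations a plain linear model misses:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(5, 1)),    # 5 past time steps, 1 feature each
    tf.keras.layers.SimpleRNN(16),   # hidden state carried across steps
    tf.keras.layers.Dense(1),        # next-step prediction
])
model.compile(optimizer="adam", loss="mse")
model.summary()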

WHAT ARE POPULAR RECURRENT NEURAL NETWORK MODELS IN TIME SERIES PREDICTION?

Deep learning models used for time series analysis have been developed in response to changing needs over time. Popular architectures include SimpleRNN, GRU, LSTM, and Transformers, alongside classical statistical models such as ARIMA. While the SimpleRNN model is more successful in scenarios where past dependencies have less impact on future predictions, it produces less accurate output when long-range dependencies must be examined.

The reason is that when the weights of SimpleRNN units are updated with the gradient-based backpropagation method, the gradients are repeatedly multiplied by the derivatives of the nonlinear activation functions across time steps; these products approach 0, so the weights can no longer be meaningfully updated.

This has entered the literature as the problem of vanishing or exploding gradients. To handle it, another RNN architecture was created, named LSTM.
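The shrinking effect is easy to see numerically. In this illustration of my own, backpropagation through time multiplies the gradient by the derivative of tanh (always at most 1) once per time step:

import numpy as np

grad = 1.0
h = 0.8                             # a typical pre-activation value
for step in range(50):              # 50 time steps back in time
    grad *= 1.0 - np.tanh(h) ** 2   # derivative of tanh at h, about 0.56
print(grad)                         # roughly 2e-13: almost no signal left

After only 50 steps the gradient is vanishingly small, so the earliest time steps contribute essentially nothing to the weight updates.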

What is LSTM? How does its architecture work?

LSTM stands for long short-term memory. Its architecture includes gate units and a cell state. The gates are the forget gate, the input gate, and the output gate.

Forget And Input Gates

These gate units decide which information from past time steps will be included in the computation inside the neuron. The processed information from the previous unit and the input to the current unit are first forwarded to the forget and input gates. Functions within the gates then let the information more relevant to the current context pass through while forgetting the less relevant parts. The outputs of both gates update the cell state. In the last step, the output gate processes the cell state and decides what the unit's output will be.
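Here is a minimal numpy sketch of one LSTM step (dimensions, names, and parameter layout are my own simplification, not a reference implementation): the forget and input gates update the cell state, then the output gate decides what the unit emits.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # W, U, b hold parameters for the forget (f), input (i),
    # output (o) gates and the candidate values (g)
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])   # forget gate
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])   # input gate
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])   # output gate
    g = np.tanh(W["g"] @ x + U["g"] @ h_prev + b["g"])   # candidate values
    c = f * c_prev + i * g   # cell state: keep some past, add some new
    h = o * np.tanh(c)       # output gate filters the cell state
    return h, c

n, m = 4, 3                  # hidden size, input size (arbitrary)
rng = np.random.default_rng(0)
W = {k: rng.normal(size=(n, m)) for k in "figo"}
U = {k: rng.normal(size=(n, n)) for k in "figo"}
b = {k: np.zeros(n) for k in "figo"}
h, c = lstm_step(rng.normal(size=m), np.zeros(n), np.zeros(n), W, U, b)

In a full network this step is repeated at every time step, with h and c carried forward, which is how the cell state preserves long-range information.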

Transformers Models

Unlike other deep learning architectures, the Transformer model uses an attention mechanism within its encoder-decoder layers. This mechanism gives more weight to the information that seems necessary for future predictions and treats the rest as trivial. Its difference from other deep learning algorithms such as LSTM is that multiple encoder-decoder blocks working in parallel provide faster model training. This technology is used especially in chatbots. In fact, we all know one of the most popular apps using this architecture…
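A minimal numpy sketch of scaled dot-product attention, the core of this mechanism (a simplified single-head version of my own, not the full Transformer): each query weights all values by how relevant their keys look, focusing on the important time steps.

import numpy as np

def attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # relevance of each step
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over steps
    return weights @ V                                # weighted mix of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(5, 8))   # 5 time steps, dimension 8 (arbitrary)
K = rng.normal(size=(5, 8))
V = rng.normal(size=(5, 8))
print(attention(Q, K, V).shape)   # (5, 8)

Because every query attends to every key in one matrix product, the whole sequence is processed in parallel, which is where the training speedup over step-by-step recurrent models comes from.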
