Hierarchical Model Tutorial#

This tutorial illustrates how to use GluonTS’ deep-learning based hierarchical model DeepVarHierarchical. We first explain the data preparation for hierarchical/grouped time series, and then show the model training, prediction and evaluation using common use-cases.

Introduction#

The important aspect in using hierarchical models is the proper preparation of hierarchical time series data. The minimal requirements in building the hierarchical or grouped time series are the time series at the bottom-level of the hierarchy and the hierarchical/grouped aggregation matrix. GluonTS provides a simple way to construct hierarchical time series by reading the bottom-level time series and the aggregation matrix from csv files. Here we first describe the preparation of hierarchical time series before discussing the training of the hierarchical model.

Preparation of Hierarchical Time Series#

The bottom level time series are assumed to be available in a csv file as columns. The csv file should also contain the index column listing the time stamps for each row. Note that the aggregated time series are automatically constructed and should not be provided. Here is an example csv of bottom level time series. Similarly, the aggregation matrix can also be read from a csv file; here is an example. We use the standard format for the summation matrix; e.g., see Eq. 10.3 of the textbook Forecasting: Principles and Practice. Note that grouped time series can also be represented by an aggregation matrix in the same way as the hierarchical time series and hence the material presented here is also applicable to grouped time series.

[1]:

import pandas as pd
from gluonts.dataset.hierarchical import HierarchicalTimeSeries

# Load (only!) the time series at the bottom level of the hierarchy.
ts_at_bottom_level_csv = (
    "https://gist.githubusercontent.com/rshyamsundar/39e57075743537c4100a716a7b7ec047/"
    "raw/f02f5aeadad73e3f3e9cf54478606d3507826492/example_bottom_ts.csv"
)

# Make sure the dataframe has `PeriodIndex` by explicitly casting it to `PeriodIndex`.
ts_at_bottom_level = pd.read_csv(
    ts_at_bottom_level_csv,
    index_col=0,
    parse_dates=True,
).to_period()

# Load the aggregation matrix `S`.
S_csv = (
    "https://gist.githubusercontent.com/rshyamsundar/17084fd1f28021867bcf6f2d69d9b73a/raw/"
    "32780ca43f57a78f2d521a75e73b136b17f34a02/example_agg_mat.csv"
)
S = pd.read_csv(S_csv).values

hts = HierarchicalTimeSeries(
    ts_at_bottom_level=ts_at_bottom_level,
    S=S,
)

One can access all the time series of the hierarchy using ts_at_all_levels property.

[2]:

hts.ts_at_all_levels.head()

[2]:

	0	1	2	3	4	5	6
2020-03-22 00:00	0.686671	0.156873	0.529798	0.056962	0.099911	0.039827	0.489971
2020-03-22 01:00	2.189128	0.669261	1.519866	0.246535	0.422727	0.763164	0.756702
2020-03-22 02:00	1.152853	0.582213	0.570640	0.314393	0.267820	0.169645	0.400996
2020-03-22 03:00	1.198889	0.653139	0.545750	0.609158	0.043981	0.235009	0.310741
2020-03-22 04:00	2.069197	0.678490	1.390707	0.380788	0.297702	0.898429	0.492278

Model Training and Forecasting#

We now show the simplest use-case where we want to train the model on the whole dataset available and produce predictions for the future time steps. Note that this is how the model would be used in practice, once the best model has already been selected based on some user-specific evaluation criteria; see the next section for model evaluation.

We assume that the hierarchical time series hts of type HierarchicalTimeSeries has already been constructed as described above. The first step is to convert this hierarchical time series to a Dataset on which mini-batch training can be run. Here we convert it into gluonts.dataset.pandas.PandasDataset.

[3]:

dataset = hts.to_dataset()

Once the dataset is created, it is straightforward to use the hierarchical model. Here, for a quick illustration, we fix the prediction length and choose a smaller number of epochs. We train on the whole dataset and give the same as the input to the trained model (called predictor) to generate predictions (or forecasts) for future/unseen time steps. The final output forecasts is an instance of gluonts.model.forecast.SampleForecast containing sample-based forecasts for all the time series in the hierarchy.

[4]:

from gluonts.mx.model.deepvar_hierarchical import DeepVARHierarchicalEstimator
from gluonts.mx.trainer import Trainer

prediction_length = 24
estimator = DeepVARHierarchicalEstimator(
    freq=hts.freq,
    prediction_length=prediction_length,
    trainer=Trainer(epochs=2),
    S=S,
)
predictor = estimator.train(dataset)

forecast_it = predictor.predict(dataset)

# There is only one element in `forecast_it` containing forecasts for all the time series in the hierarchy.
forecasts = next(forecast_it)

100%|██████████| 50/50 [00:32<00:00,  1.53it/s, epoch=1/2, avg_epoch_loss=178]
100%|██████████| 50/50 [00:32<00:00,  1.53it/s, epoch=2/2, avg_epoch_loss=146]

Using external dynamic features#

By default, the hierarchical model DeepVarHierarchical internally creates several time-based/dynamic features for model training. These are seasonal features automatically deduced from the frequency of the target time series. One could also provide external dynamic features to the model if available. Here we show how this is done; we restart from the point where the hierarchical time series hts has already been created.

We first load the available external features from a csv file.

[5]:

dynamic_features_csv = (
    "https://gist.githubusercontent.com/rshyamsundar/d8e63bad43397c95a4f5daaa17e122f8/"
    "raw/a50657cf89f86d48cee41122f02cf5b1fcafdd2f/example_dynamic_features.csv"
)

dynamic_features_df = pd.read_csv(
    dynamic_features_csv,
    index_col=0,
    parse_dates=True,
).to_period()

The dynamic features are assumed to be available both for the “training range” (time points where the target is available) as well as for the “prediction range” (future time points where the forecast is required). Thus, dynamic features are longer than the target time series by prediction_length time steps.

For training the model, we need dynamic features only for the training range.

[6]:

dynamic_features_df_train = dynamic_features_df.iloc[:-prediction_length, :]

We create the training dataset by passing the external features dynamic_features_df_train and train the model on it.

[7]:

dataset_train = hts.to_dataset(feat_dynamic_real=dynamic_features_df_train)
estimator = DeepVARHierarchicalEstimator(
    freq=hts.freq,
    prediction_length=prediction_length,
    trainer=Trainer(epochs=2),
    S=S,
)
predictor = estimator.train(dataset_train)

100%|██████████| 50/50 [00:33<00:00,  1.48it/s, epoch=1/2, avg_epoch_loss=192]
100%|██████████| 50/50 [00:33<00:00,  1.49it/s, epoch=2/2, avg_epoch_loss=147]

To generate forecasts for future/unseen time steps, we need to pass both the past target (i.e., target in the training range) as well as full features dynamic_features_df (including those in the prediction range) to the trained model. Hence we need to create a new dataset that is separate from dataset_train, unlike in the earlier case.

[8]:

predictor_input = hts.to_dataset(feat_dynamic_real=dynamic_features_df)
forecast_it = predictor.predict(predictor_input)

# There is only one element in `forecast_it` containing forecasts for all the time series in the hierarchy.
forecasts = next(forecast_it)

Model Evaluation via Backtesting#

We now explain how the hierarchical model can be evaluated via backtesting. For ease of presentation, we describe the model evaluation for the case where external dynamic features are available. However, it is straightforward to modify the code, in case external features are not available; simply invoke the function to_dataset() below without any arguments.

We assume that time series at the bottom level ts_at_bottom_level and the aggregation matrix S have already been created as described above. We create the train-test split along the time axis by withholding the last prediction_length time points for evaluation and the remaining for training.

[9]:

prediction_length = 24
hts_train = HierarchicalTimeSeries(
    ts_at_bottom_level=ts_at_bottom_level.iloc[:-prediction_length, :],
    S=S,
)
hts_test_label = HierarchicalTimeSeries(
    ts_at_bottom_level=ts_at_bottom_level.iloc[-prediction_length:, :],
    S=S,
)

We load the external features as well and slice the features corresponding to the training range.

[10]:

dynamic_features_csv = (
    "https://gist.githubusercontent.com/rshyamsundar/d8e63bad43397c95a4f5daaa17e122f8/"
    "raw/a50657cf89f86d48cee41122f02cf5b1fcafdd2f/example_dynamic_features.csv"
)

dynamic_features_df = pd.read_csv(
    dynamic_features_csv,
    index_col=0,
    parse_dates=True,
).to_period()

dynamic_features_df_train = dynamic_features_df.iloc[:-prediction_length, :]

We convert hts_train into PandasDataset by passing the external features dynamic_features_df_train and train the hierarchical model on it.

[11]:

dataset_train = hts_train.to_dataset(feat_dynamic_real=dynamic_features_df_train)

estimator = DeepVARHierarchicalEstimator(
    freq=hts.freq,
    prediction_length=prediction_length,
    trainer=Trainer(epochs=2),
    S=S,
)

predictor = estimator.train(dataset_train)

100%|██████████| 50/50 [00:33<00:00,  1.48it/s, epoch=1/2, avg_epoch_loss=176]
100%|██████████| 50/50 [00:33<00:00,  1.49it/s, epoch=2/2, avg_epoch_loss=144]

To generate forecasts for time points corresponding to the withheld observations, we need to pass full features as well as the target in the training range to the trained model. So we create the input dataset for the predictor accordingly and generate forecasts.

[12]:

predictor_input = hts_train.to_dataset(feat_dynamic_real=dynamic_features_df)
forecast_it = predictor.predict(predictor_input)

Once the forecasts are obtained, we can evaluate them against the withheld observations. GluonTS evaluator takes as input an iterator over the true (withheld) observations and the corresponding forecasts to do the evaluation. Our forecasts are already in the correct format and our withheld observations are in hts_test_label.

[13]:

from gluonts.evaluation import MultivariateEvaluator

evaluator = MultivariateEvaluator()
agg_metrics, item_metrics = evaluator(
    ts_iterator=[hts_test_label.ts_at_all_levels],
    fcst_iterator=forecast_it,
)

print(
    f"Mean (weighted) quantile loss over all time series: "
    f"{agg_metrics['mean_wQuantileLoss']}"
)

Running evaluation: 1it [00:00, 145.95it/s]
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
Running evaluation: 1it [00:00, 154.48it/s]
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
Running evaluation: 1it [00:00, 156.65it/s]
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
Running evaluation: 1it [00:00, 157.82it/s]
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
Running evaluation: 1it [00:00, 163.84it/s]
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
Running evaluation: 1it [00:00, 140.81it/s]
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
Running evaluation: 1it [00:00, 164.84it/s]

Mean (weighted) quantile loss over all time series: 0.27951188855488635


/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)
/opt/hostedtoolcache/Python/3.8.18/x64/lib/python3.8/site-packages/pandas/core/dtypes/astype.py:138: UserWarning: Warning: converting a masked element to nan.
  return arr.astype(dtype, copy=True)