gluonts.mx.model.deepvar_hierarchical package#

class gluonts.mx.model.deepvar_hierarchical.DeepVARHierarchicalEstimator(freq: str, prediction_length: int, S: numpy.ndarray, D: Optional[numpy.ndarray] = None, num_samples_for_loss: int = 200, likelihood_weight: float = 0.0, CRPS_weight: float = 1.0, sample_LH: bool = False, coherent_train_samples: bool = True, coherent_pred_samples: bool = True, warmstart_epoch_frac: float = 0.0, seq_axis: Optional[List[int]] = None, log_coherency_error: bool = True, trainer: gluonts.mx.trainer._base.Trainer = gluonts.mx.trainer._base.Trainer(add_default_callbacks=True, callbacks=None, clip_gradient=10.0, ctx=None, epochs=100, hybridize=True, init='xavier', learning_rate=0.001, num_batches_per_epoch=50, weight_decay=1e-08), context_length: Optional[int] = None, num_layers: int = 2, num_cells: int = 40, cell_type: str = 'lstm', num_parallel_samples: int = 100, dropout_rate: float = 0.1, use_feat_dynamic_real: bool = False, cardinality: List[int] = [1], embedding_dimension: int = 5, scaling: bool = True, pick_incomplete: bool = False, lags_seq: Optional[List[int]] = None, time_features: Optional[List[Callable[[pandas.core.indexes.period.PeriodIndex], numpy.ndarray]]] = None, batch_size: int = 32, **kwargs)[source]#

Bases: gluonts.mx.model.deepvar._estimator.DeepVAREstimator

Constructs a DeepVARHierarchical estimator, which is a hierarchical extension of DeepVAR; see the usage sketch after the parameter list.

The model is described in the ICML 2021 paper: http://proceedings.mlr.press/v139/rangapuram21a.html

Parameters
  • freq – Frequency of the data to train on and predict

  • prediction_length (int) – Length of the prediction horizon

  • S – Summation or aggregation matrix.

  • D – Positive definite matrix (typically a diagonal matrix). Optional. If provided, the distance between the reconciled and unreconciled forecasts is computed with respect to the norm induced by D. Useful for weighting the distances differently for each level of the hierarchy. By default, the Euclidean distance is used.

  • num_samples_for_loss – Number of samples to draw from the predicted distribution to compute the training loss.

  • likelihood_weight – Weight for the negative log-likelihood loss. Default: 0.0. If non-zero, the negative log-likelihood (times likelihood_weight) is added to the CRPS loss (times CRPS_weight).

  • CRPS_weight – Weight for the CRPS loss component. Default: 1.0. If zero, the loss is only the negative log-likelihood (times likelihood_weight). If non-zero, the CRPS loss (times CRPS_weight) is added to the negative log-likelihood loss (times likelihood_weight).

  • sample_LH – Boolean flag to specify if likelihood should be computed using the distribution based on (coherent) samples. Default: False (in this case likelihood is computed using the parametric distribution predicted by the network).

  • coherent_train_samples – Flag to indicate whether coherence should be enforced during training. Default: True.

  • coherent_pred_samples – Flag to indicate whether coherence should be enforced during prediction. Default: True.

  • warmstart_epoch_frac – Specifies the epoch (as a fraction of the total number of epochs) at which to start enforcing coherence during training.

  • seq_axis – Specifies the list of axes that should be processed sequentially (only during training). The reference axes are: (num_samples_for_loss, batch, seq_length, target_dim). This is useful if batch processing is not possible because of insufficient memory (e.g., if both num_samples_for_loss and target_dim are very large). In such cases, use seq_axis = [1]. By default, all axes are processed in parallel.

  • log_coherency_error – Flag to indicate whether to compute and show the coherency error on the samples generated during prediction.

  • trainer – Trainer object to be used (default: Trainer())

  • context_length – Number of steps to unroll the RNN for before computing predictions (default: None, in which case context_length = prediction_length)

  • num_layers – Number of RNN layers (default: 2)

  • num_cells – Number of RNN cells for each layer (default: 40)

  • cell_type – Type of recurrent cells to use (available: ‘lstm’ or ‘gru’; default: ‘lstm’)

  • num_parallel_samples – Number of evaluation samples per time series to increase parallelism during inference. This is a model optimization that does not affect the accuracy (default: 100)

  • dropout_rate – Dropout regularization parameter (default: 0.1)

  • use_feat_dynamic_real – Whether to use the feat_dynamic_real field from the data (default: False)

  • cardinality – Number of values of each categorical feature (default: [1])

  • embedding_dimension – Dimension of the embeddings for categorical features (default: 5)

  • scaling – Whether to automatically scale the target values (default: True)

  • pick_incomplete – Whether training examples can be sampled with only a part of past_length time-units

  • lags_seq – Indices of the lagged target values to use as inputs of the RNN (default: None, in which case these are automatically determined based on freq)

  • time_features – Time features to use as inputs of the RNN (default: None, in which case these are automatically determined based on freq)

  • batch_size – The size of the batches to be used for training and prediction.
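For example, constructing the estimator for a toy two-level hierarchy might look as follows. This is a minimal sketch; the hyperparameter values (freq, prediction_length, epochs) are illustrative assumptions, not recommended settings:

    import numpy as np

    from gluonts.mx.model.deepvar_hierarchical import DeepVARHierarchicalEstimator
    from gluonts.mx.trainer import Trainer

    # Summation matrix for a toy hierarchy: one total series that
    # aggregates two bottom-level series. Row order: [total, bottom_1, bottom_2].
    S = np.array(
        [
            [1.0, 1.0],  # total = bottom_1 + bottom_2
            [1.0, 0.0],  # bottom_1
            [0.0, 1.0],  # bottom_2
        ]
    )

    estimator = DeepVARHierarchicalEstimator(
        freq="H",              # hourly data
        prediction_length=24,  # forecast one day ahead
        S=S,
        trainer=Trainer(epochs=10),
    )
    # predictor = estimator.train(train_dataset)  # train_dataset: a GluonTS dataset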

create_predictor(transformation: gluonts.transform._base.Transformation, trained_network: mxnet.gluon.block.HybridBlock) gluonts.model.predictor.Predictor[source]#

Create and return a predictor object.

Parameters
  • transformation – Transformation to be applied to data before it goes into the model.

  • trained_network – A trained HybridBlock object.

Returns

A predictor wrapping a HybridBlock used for inference.

Return type

Predictor

create_training_network() gluonts.mx.model.deepvar_hierarchical._network.DeepVARHierarchicalTrainingNetwork[source]#

Create and return the network used for training (i.e., computing the loss).

Returns

The network that computes the loss given input data.

Return type

HybridBlock

lead_time: int#
output_transform: Optional[Callable]#
prediction_length: int#
gluonts.mx.model.deepvar_hierarchical.coherency_error(S: numpy.ndarray, samples: numpy.ndarray) float[source]#

Computes the maximum relative coherency error.

\[\max_i \frac{\left|(S y_b)_i - y_i\right|}{|y_i|}\]

where \(y\) refers to the samples and \(y_b\) refers to the samples at the bottom level.

Parameters
  • S – The summation matrix S. Shape: (total_num_time_series, num_bottom_time_series)

  • samples – Samples. Shape: (*batch_shape, target_dim).

Returns

Coherency error

Return type

float
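As an illustration, the formula above corresponds to the following numpy sketch. This is a hypothetical re-implementation for clarity, assuming the convention that the bottom-level series occupy the last num_bottom_time_series positions of the target dimension; prefer the library function itself:

    import numpy as np

    def max_relative_coherency_error(S: np.ndarray, samples: np.ndarray) -> float:
        num_bottom = S.shape[1]
        y = samples                       # all levels: (..., total_num_time_series)
        y_b = samples[..., -num_bottom:]  # bottom-level samples (assumed last)
        # Re-aggregate the bottom level and compare against all levels.
        reconstructed = y_b @ S.T         # (..., total_num_time_series)
        return float(np.max(np.abs(reconstructed - y) / np.abs(y)))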

gluonts.mx.model.deepvar_hierarchical.projection_mat(S: numpy.ndarray, D: Optional[numpy.ndarray] = None) numpy.ndarray[source]#

Computes the projection matrix \(P\) for projecting base forecasts \(\bar{y}\) onto the space of coherent forecasts: \(P \bar{y}\).

More precisely,

\[P = \begin{cases} S (S^T S)^{-1} S^T, & \text{if } D \text{ is None},\\ S (S^T D S)^{-1} S^T D, & \text{otherwise}. \end{cases}\]
Parameters
  • S – The summation or the aggregation matrix. Shape: (total_num_time_series, num_bottom_time_series)

  • D – Symmetric positive definite matrix (typically a diagonal matrix). Shape: (total_num_time_series, total_num_time_series). Optional. If provided, the distance between the reconciled and unreconciled forecasts is computed with respect to the norm induced by D. Useful for weighting the distances differently for each level of the hierarchy. By default, the Euclidean distance is used.

Returns

Projection matrix, shape (total_num_time_series, total_num_time_series)

Return type

numpy.ndarray
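A direct numpy transcription of the two cases above; a sketch under the stated shapes, not necessarily the library's exact implementation:

    import numpy as np
    from typing import Optional

    def projection(S: np.ndarray, D: Optional[np.ndarray] = None) -> np.ndarray:
        if D is None:
            # Orthogonal (least-squares) projection onto the column space of S.
            return S @ np.linalg.inv(S.T @ S) @ S.T
        # Oblique projection with respect to the norm induced by D.
        return S @ np.linalg.inv(S.T @ D @ S) @ S.T @ D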

gluonts.mx.model.deepvar_hierarchical.reconcile_samples(reconciliation_mat: Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol], samples: Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol], seq_axis: Optional[List] = None) Union[mxnet.ndarray.ndarray.NDArray, mxnet.symbol.symbol.Symbol][source]#

Computes coherent samples by multiplying unconstrained samples with reconciliation_mat.

Parameters
  • reconciliation_mat – Shape: (target_dim, target_dim)

  • samples – Unconstrained samples. Shape: (*batch_shape, target_dim); during training: (num_samples, batch_size, seq_len, target_dim); during prediction: (num_parallel_samples x batch_size, seq_len, target_dim).

  • seq_axis – Specifies the list of axes that should be reconciled sequentially. By default, all axes are processed in parallel.

Returns

Coherent samples

Return type

Tensor, shape same as that of samples
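In numpy terms, reconciliation is a matrix product applied along the target dimension. A minimal sketch of the same idea (the library version operates on MXNet tensors and can iterate over seq_axis to save memory):

    import numpy as np

    # Toy hierarchy and its projection (reconciliation) matrix, with D = None.
    S = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
    P = S @ np.linalg.inv(S.T @ S) @ S.T   # (target_dim, target_dim)

    # Unconstrained samples, shaped as during training:
    # (num_samples, batch_size, seq_len, target_dim)
    samples = np.random.randn(10, 4, 6, 3)

    # Multiply along the last axis; P is symmetric here, so P.T == P.
    coherent = samples @ P.T

    # Coherency now holds up to floating-point error:
    # coherent[..., 0] ≈ coherent[..., 1] + coherent[..., 2]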