gluonts.mx.model.tpp package

class gluonts.mx.model.tpp.DeepTPPEstimator(prediction_interval_length: float, context_interval_length: float, num_marks: int, time_distr_output: gluonts.mx.model.tpp.distribution.base.TPPDistributionOutput = gluonts.mx.model.tpp.distribution.weibull.WeibullOutput(), embedding_dim: int = 5, trainer: gluonts.mx.trainer._base.Trainer = gluonts.mx.trainer._base.Trainer(add_default_callbacks=True, callbacks=None, clip_gradient=10.0, ctx=None, epochs=100, hybridize=False, init='xavier', learning_rate=0.001, num_batches_per_epoch=50, weight_decay=1e-08), num_hidden_dimensions: int = 10, num_parallel_samples: int = 100, num_training_instances: int = 100, freq: str = 'H', batch_size: int = 32)

Bases: gluonts.mx.model.estimator.GluonEstimator

DeepTPP is a multivariate point process model based on an RNN.

After each event \((\tau_i, m_i)\), we feed the inter-arrival time \(\tau_i\) and the mark \(m_i\) into the RNN. The state \(h_i\) of the RNN represents the history embedding. We use \(h_i\) to parametrize the distribution over the next inter-arrival time \(p(\tau_{i+1} | h_i)\) and the distribution over the next mark \(p(m_{i+1} | h_i)\). The distribution over the marks is always categorical, but different choices are possible for the distribution over inter-arrival times; see gluonts.mx.model.tpp.distribution.
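The generative process described above can be sketched in plain Python. This is only an illustration under stated assumptions: the RNN is replaced by fixed, hypothetical Weibull and categorical parameters (`scale`, `shape`, `mark_probs`), so the history embedding \(h_i\) is not modeled, and none of the names below belong to the GluonTS API.

```python
import random


def sample_marked_events(interval_length, scale, shape, mark_probs, seed=0):
    """Draw (inter-arrival time, mark) pairs until the interval is used up.

    Assumption: `scale`, `shape` and `mark_probs` are fixed stand-ins for
    the parameters that DeepTPP would compute from the RNN state h_i.
    """
    rng = random.Random(seed)
    marks = list(range(len(mark_probs)))
    events = []
    elapsed = 0.0
    while True:
        # tau_{i+1} ~ Weibull(scale, shape)
        tau = rng.weibullvariate(scale, shape)
        if elapsed + tau > interval_length:
            break  # the next event would fall outside the interval
        # m_{i+1} ~ Categorical(mark_probs)
        mark = rng.choices(marks, weights=mark_probs, k=1)[0]
        events.append((tau, mark))
        elapsed += tau
    return events


events = sample_marked_events(
    interval_length=5.0, scale=1.0, shape=1.5, mark_probs=[0.7, 0.2, 0.1]
)
for tau, mark in events:
    print(f"tau={tau:.3f}  mark={mark}")
```

The number of events returned is itself random, which is why the interval is specified by a length in continuous time rather than by a fixed number of steps.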

The model is a generalization of the approaches described in [DDT+16], [TWJ19] and [SBG20].

References

Parameters
  • prediction_interval_length – The length of the interval (in continuous time) that the estimator will predict at prediction time.

  • context_interval_length – The length of intervals (in continuous time) that the estimator will be trained with.

  • num_marks – The number of marks (distinct processes), i.e., the cardinality of the mark set.

  • time_distr_output – TPPDistributionOutput for the distribution over the inter-arrival times. See gluonts.mx.model.tpp.distribution for possible choices.

  • embedding_dim – The dimension of vector embeddings for marks (used as input to the GRU).

  • trainer – gluonts.mx.trainer.Trainer object which will be used to train the estimator. Note that Trainer(hybridize=False) must be set, as DeepTPPEstimator currently does not support hybridization.

  • num_hidden_dimensions – Number of hidden units in the GRU network.

  • num_parallel_samples – The number of samples returned by the learned Predictor.

  • num_training_instances – The number of training instances to be sampled from each entry in the data set provided during training.

  • freq – Similar to the freq of discrete-time models, specifies the time unit by which inter-arrival times are given.

  • batch_size – The size of the batches to be used for training and prediction.
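To make the roles of context_interval_length and num_training_instances concrete, the sketch below cuts random training windows out of one event sequence. It is a simplified illustration, not the transformation GluonTS applies internally: the helper name `sample_training_windows`, the uniform choice of the window start, and the way the first inter-arrival time is measured from the window start are all assumptions.

```python
import random
from itertools import accumulate


def sample_training_windows(
    inter_arrival_times, marks, context_interval_length, num_instances, seed=0
):
    """Cut `num_instances` windows of fixed continuous-time length.

    Hypothetical helper: each window starts at a uniformly drawn time and
    keeps the events whose arrival time falls inside it.
    """
    rng = random.Random(seed)
    arrival_times = list(accumulate(inter_arrival_times))
    total_length = arrival_times[-1]
    windows = []
    for _ in range(num_instances):
        start = rng.uniform(0.0, total_length - context_interval_length)
        end = start + context_interval_length
        previous = start
        window = []
        for t, m in zip(arrival_times, marks):
            if start < t <= end:
                # Inter-arrival time relative to the previous kept event
                # (or to the window start for the first event).
                window.append((t - previous, m))
                previous = t
        windows.append(window)
    return windows


taus = [0.5, 1.2, 0.3, 2.0, 0.7, 1.1, 0.4, 0.9]
marks = [0, 1, 0, 2, 1, 0, 0, 2]
windows = sample_training_windows(
    taus, marks, context_interval_length=3.0, num_instances=4
)
for w in windows:
    print(len(w), "events:", w)
```

Every window spans the same length of time, but the number of events it contains varies from window to window.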

create_predictor(transformation: gluonts.transform._base.Transformation, trained_network: gluonts.mx.model.tpp.deeptpp._network.DeepTPPTrainingNetwork) → gluonts.model.predictor.Predictor

Create and return a predictor object.

Parameters
  • transformation – Transformation to be applied to data before it goes into the model.

  • trained_network – A trained HybridBlock object.

Returns

A predictor wrapping a HybridBlock used for inference.

Return type

Predictor

create_training_data_loader(data: gluonts.dataset.Dataset, **kwargs) → Iterable[Dict[str, Any]]

Create a data loader for training purposes.

Parameters

data – Dataset from which to create the data loader.

Returns

The data loader, i.e. an iterable over batches of data.

Return type

DataLoader

create_training_network() → mxnet.gluon.block.HybridBlock

Create and return the network used for training (i.e., computing the loss).

Returns

The network that computes the loss given input data.

Return type

HybridBlock

create_transformation() → gluonts.transform._base.Transformation

Create and return the transformation needed for training and inference.

Returns

The transformation that will be applied entry-wise to datasets, at training and inference time.

Return type

Transformation

create_validation_data_loader(data: gluonts.dataset.Dataset, **kwargs) → Iterable[Dict[str, Any]]

Create a data loader for validation purposes.

Parameters

data – Dataset from which to create the data loader.

Returns

The data loader, i.e. an iterable over batches of data.

Return type

DataLoader

lead_time: int

prediction_length: int

class gluonts.mx.model.tpp.PointProcessGluonPredictor(input_names: typing.List[str], prediction_net: mxnet.gluon.block.Block, batch_size: int, prediction_interval_length: float, freq: str, ctx: mxnet.context.Context, input_transform: gluonts.transform._base.Transformation, dtype: typing.Type = <class 'numpy.float32'>, forecast_generator: gluonts.model.forecast_generator.ForecastGenerator = <gluonts.mx.model.tpp.predictor.PointProcessForecastGenerator object>, **kwargs)

Bases: gluonts.mx.model.predictor.GluonPredictor

Predictor object for marked temporal point process models.

TPP predictions differ from standard discrete-time models in several regards. First, at least for now, only sample forecasts implementing PointProcessSampleForecast are available. Second, similar to TPP Estimator objects, the Predictor works with prediction_interval_length as opposed to prediction_length.

The predictor also accounts for the fact that the prediction network outputs a 2-tuple of Tensors, for the samples themselves and their valid_length.
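Because each sampled sequence can contain a different number of events, a batch of samples has to be padded to a common length, with valid_length recording how many entries are real. The following sketch shows that packing convention with nested lists. The helper `pad_samples` and the zero padding value are assumptions made for illustration and are not part of the predictor's API.

```python
def pad_samples(sequences, pad_value=(0.0, 0)):
    """Pad variable-length event sequences to a rectangular layout.

    Returns (samples, valid_length), mirroring the 2-tuple convention:
    samples[i][j] is an (inter-arrival time, mark) pair, and only the
    first valid_length[i] pairs of samples[i] are real events.
    """
    valid_length = [len(seq) for seq in sequences]
    max_length = max(valid_length, default=0)
    samples = [
        list(seq) + [pad_value] * (max_length - len(seq)) for seq in sequences
    ]
    return samples, valid_length


sequences = [
    [(0.4, 1), (1.3, 0)],
    [(2.1, 2)],
    [(0.2, 0), (0.5, 0), (0.9, 1)],
]
samples, valid_length = pad_samples(sequences)
print(valid_length)  # [2, 1, 3]
print(samples[1])    # [(2.1, 2), (0.0, 0), (0.0, 0)]
```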

Parameters

prediction_interval_length – The length of the prediction interval (in continuous time).

as_symbol_block_predictor(batch: Optional[Dict[str, Any]] = None, dataset: Optional[gluonts.dataset.Dataset] = None) → gluonts.mx.model.predictor.SymbolBlockPredictor

Returns a variant of the current GluonPredictor backed by a Gluon SymbolBlock. If the current predictor is already a SymbolBlockPredictor, it just returns itself.

One of batch or dataset must be set.

Parameters
  • batch – A batch of data to use for the required forward pass after the hybridize() call of the underlying network.

  • dataset – Dataset from which a batch is extracted if batch is not set.

Returns

A predictor derived from the current one backed by a SymbolBlock.

Return type

SymbolBlockPredictor

hybridize(batch: Dict[str, Any]) → None

Hybridizes the underlying prediction network.

Parameters

batch – A batch of data to use for the required forward pass after the hybridize() call.

predict(dataset: gluonts.dataset.Dataset, num_samples: Optional[int] = None, num_workers: Optional[int] = None, num_prefetch: Optional[int] = None, **kwargs) → Iterator[gluonts.model.forecast.Forecast]

Compute forecasts for the time series in the provided dataset. This method is not implemented in this abstract class; please use one of the subclasses.

Parameters

dataset – The dataset containing the time series to predict.

Returns

Iterator over the forecasts, in the same order as the dataset iterable was provided.

Return type

Iterator[Forecast]

serialize_prediction_net(path: pathlib.Path) → None

class gluonts.mx.model.tpp.PointProcessSampleForecast(samples: Union[mxnet.ndarray.ndarray.NDArray, numpy.ndarray], valid_length: Union[mxnet.ndarray.ndarray.NDArray, numpy.ndarray], start_date: pandas._libs.tslibs.timestamps.Timestamp, freq: str, prediction_interval_length: float, item_id: Optional[str] = None, info: Optional[Dict] = None)

Bases: gluonts.model.forecast.Forecast

Sample forecast object used for temporal point process inference. It differs from standard forecast objects in that it does not implement fixed-length samples. Each sample has a variable length, which is kept in a separate valid_length attribute.

Importantly, PointProcessSampleForecast does not implement some methods (such as quantile or plot) that are available in discrete time forecasts.

Parameters
  • samples – A multidimensional array of samples, of shape (number_of_samples, max_pred_length, target_dim). The target_dim is equal to 2, where the first dimension contains the inter-arrival times and the second contains the categorical marks.

  • valid_length – An array of integers denoting the valid lengths of each sample in samples. That is, valid_length[0] == 2 implies that only the first two entries of samples[0, ...] are valid “points”.

  • start_date (pandas._libs.tslibs.period.Period) – Starting Timestamp of the sample.

  • freq – The time unit of the inter-arrival times.

  • prediction_interval_length (float) – The length of the prediction interval for which samples were drawn.

  • item_id (Optional[str]) – Item ID, if available.

  • info (Optional[Dict]) – Optional dictionary of additional information.
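As a sketch of how samples and valid_length fit together, the snippet below strips the padding from each sample and converts inter-arrival times into arrival times measured from the start of the prediction interval. Nested lists stand in for the arrays, and the helper `valid_arrival_times` is hypothetical rather than a method of PointProcessSampleForecast.

```python
from itertools import accumulate


def valid_arrival_times(samples, valid_length):
    """Return, per sample, a list of (arrival time, mark) for valid points.

    samples[i][j] is a pair [inter-arrival time, mark], i.e. the layout
    (number_of_samples, max_pred_length, 2) described above.
    """
    result = []
    for sample, n in zip(samples, valid_length):
        points = sample[:n]  # drop the padded entries
        times = accumulate(tau for tau, _ in points)
        result.append([(t, int(mark)) for t, (_, mark) in zip(times, points)])
    return result


samples = [
    [[0.5, 1.0], [1.0, 0.0], [0.0, 0.0]],
    [[2.0, 2.0], [0.0, 0.0], [0.0, 0.0]],
]
valid_length = [2, 1]
for events in valid_arrival_times(samples, valid_length):
    print(events)
# [(0.5, 1), (1.5, 0)]
# [(2.0, 2)]
```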

dim() → int

Return the dimensionality of the forecast object.

property freq

property index: pandas.core.indexes.period.PeriodIndex

mean = None

plot(**kwargs)

Plot median forecast and prediction intervals using matplotlib.

By default the 0.5 and 0.9 prediction intervals are plotted. Other intervals can be chosen by setting intervals.

This plots to the current axes object (via plt.gca()), or to ax if provided. Similarly, the color is taken from matplotlib's internal color cycle if no explicit color is set.

One can set name to use it as the label for the median forecast. Intervals are not labeled, unless show_label is set to True.

prediction_interval_length: float

prediction_length: int = None

quantile(q: Union[float, str]) → numpy.ndarray

Compute a quantile from the predicted distribution.

Parameters

q – Quantile to compute.

Returns

Value of the quantile across the prediction range.

Return type

numpy.ndarray