gluonts.torch.model.i_transformer package#

class gluonts.torch.model.i_transformer.ITransformerEstimator(prediction_length: int, context_length: Optional[int] = None, d_model: int = 32, nhead: int = 4, dim_feedforward: int = 128, dropout: float = 0.1, activation: str = 'relu', norm_first: bool = False, num_encoder_layers: int = 2, lr: float = 0.001, weight_decay: float = 1e-08, scaling: Optional[str] = 'mean', distr_output: gluonts.torch.distributions.output.Output = gluonts.torch.distributions.studentT.StudentTOutput(beta=0.0), num_parallel_samples: int = 100, batch_size: int = 32, num_batches_per_epoch: int = 50, trainer_kwargs: Optional[Dict[str, Any]] = None, train_sampler: Optional[gluonts.transform.sampler.InstanceSampler] = None, validation_sampler: Optional[gluonts.transform.sampler.InstanceSampler] = None, nonnegative_pred_samples: bool = False)[source]#

Bases: gluonts.torch.model.estimator.PyTorchLightningEstimator

An estimator training the iTransformer model for multivariate forecasting, as described in https://arxiv.org/abs/2310.06625, extended to be probabilistic.

This class uses the model defined in ITransformerModel and wraps it into an ITransformerLightningModule for training purposes: training is performed using PyTorch Lightning’s pl.Trainer class.

Parameters
  • prediction_length (int) – Length of the prediction horizon.

  • context_length – Number of time steps prior to prediction time that the model takes as inputs (default: 10 * prediction_length).

  • d_model – Size of the latent representation in the Transformer encoder.

  • nhead – Number of attention heads in the Transformer encoder, which must divide d_model.

  • dim_feedforward – Size of hidden layers in the Transformer encoder.

  • dropout – Dropout probability in the Transformer encoder.

  • activation – Activation function in the Transformer encoder.

  • norm_first – Whether to apply layer normalization before (True) rather than after (False) the attention and feed-forward blocks in each encoder layer.

  • num_encoder_layers – Number of layers in the Transformer encoder.

  • lr – Learning rate (default: 1e-3).

  • weight_decay – Weight decay regularization parameter (default: 1e-8).

  • scaling – Scaling method to apply to the input; can be “mean”, “std” or None.

  • distr_output – Distribution to use to evaluate observations and sample predictions (default: StudentTOutput()).

  • num_parallel_samples – Number of samples per time series that the resulting predictor should produce (default: 100).

  • batch_size – The size of the batches to be used for training (default: 32).

  • num_batches_per_epoch – Number of batches to be processed in each training epoch (default: 50).

  • trainer_kwargs – Additional arguments to provide to pl.Trainer for construction.

  • train_sampler – Controls the sampling of windows during training.

  • validation_sampler – Controls the sampling of windows during validation.

  • nonnegative_pred_samples – Whether the final prediction samples should be non-negative. If yes, an activation function is applied to ensure non-negativity. Note that this is applied only to the final samples at inference time, not during training.
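
The following is a minimal usage sketch, not part of the generated reference: it builds a small synthetic multivariate dataset, trains the estimator briefly, and draws sample forecasts. The dataset contents, frequency, and hyperparameter values are illustrative assumptions.

    import numpy as np
    from gluonts.dataset.common import ListDataset
    from gluonts.torch.model.i_transformer import ITransformerEstimator

    # Synthetic multivariate series: 4 variates, 200 time steps.
    target = np.cumsum(np.random.normal(size=(4, 200)), axis=-1)
    train_ds = ListDataset(
        [{"start": "2021-01-01", "target": target}],
        freq="D",
        one_dim_target=False,  # keep the 2-D (multivariate) target
    )

    estimator = ITransformerEstimator(
        prediction_length=24,
        context_length=96,
        trainer_kwargs={"max_epochs": 1},
    )
    predictor = estimator.train(train_ds)
    forecast = next(iter(predictor.predict(train_ds)))
    # forecast.samples has shape roughly
    # (num_parallel_samples, prediction_length, num_variates).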

create_lightning_module() lightning.pytorch.core.module.LightningModule[source]#

Create and return the network used for training (i.e., computing the loss).

Returns

The network that computes the loss given input data.

Return type

pl.LightningModule

create_predictor(transformation: gluonts.transform._base.Transformation, module) gluonts.torch.model.predictor.PyTorchPredictor[source]#

Create and return a predictor object.

Parameters
  • transformation – Transformation to be applied to data before it goes into the model.

  • module – A trained pl.LightningModule object.

Returns

A predictor wrapping a nn.Module used for inference.

Return type

Predictor

create_training_data_loader(data: gluonts.dataset.Dataset, module: gluonts.torch.model.i_transformer.lightning_module.ITransformerLightningModule, shuffle_buffer_length: Optional[int] = None, **kwargs) Iterable[source]#

Create a data loader for training purposes.

Parameters
  • data – Dataset from which to create the data loader.

  • module – The pl.LightningModule object that will receive the batches from the data loader.

Returns

The data loader, i.e. an iterable over batches of data.

Return type

Iterable

create_transformation() gluonts.transform._base.Transformation[source]#

Create and return the transformation needed for training and inference.

Returns

The transformation that will be applied entry-wise to datasets, at training and inference time.

Return type

Transformation

create_validation_data_loader(data: gluonts.dataset.Dataset, module: gluonts.torch.model.i_transformer.lightning_module.ITransformerLightningModule, **kwargs) Iterable[source]#

Create a data loader for validation purposes.

Parameters
  • data – Dataset from which to create the data loader.

  • module – The pl.LightningModule object that will receive the batches from the data loader.

Returns

The data loader, i.e. an iterable over batches of data.

Return type

Iterable
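
As a hedged illustration of how the create_* methods above fit together, here is a simplified sketch of roughly what PyTorchLightningEstimator.train() does internally; it is not the exact implementation, and the dataset and hyperparameters are illustrative assumptions.

    import numpy as np
    import lightning.pytorch as pl
    from gluonts.dataset.common import ListDataset
    from gluonts.torch.model.i_transformer import ITransformerEstimator

    # A toy multivariate training dataset (4 variates, 200 time steps).
    train_ds = ListDataset(
        [{"start": "2021-01-01", "target": np.random.normal(size=(4, 200))}],
        freq="D",
        one_dim_target=False,
    )

    estimator = ITransformerEstimator(prediction_length=24, context_length=96)
    transformation = estimator.create_transformation()
    module = estimator.create_lightning_module()
    train_loader = estimator.create_training_data_loader(
        transformation.apply(train_ds, is_train=True), module
    )

    trainer = pl.Trainer(max_epochs=1)
    trainer.fit(module, train_dataloaders=train_loader)

    predictor = estimator.create_predictor(transformation, module)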

lead_time: int#
prediction_length: int#
class gluonts.torch.model.i_transformer.ITransformerLightningModule(model_kwargs: dict, num_parallel_samples: int = 100, lr: float = 0.001, weight_decay: float = 1e-08)[source]#

Bases: lightning.pytorch.core.module.LightningModule

A pl.LightningModule class that can be used to train an ITransformerModel with PyTorch Lightning.

This is a thin layer around a (wrapped) ITransformerModel object that exposes the methods to compute training and validation loss.

Parameters
  • model_kwargs – Keyword arguments to construct the ITransformerModel to be trained.

  • num_parallel_samples – Number of evaluation samples per time series to sample during inference.

  • lr – Learning rate.

  • weight_decay – Weight decay regularization parameter.
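
A hedged construction sketch follows; the model_kwargs values are illustrative and simply mirror the ITransformerModel constructor arguments documented below.

    from gluonts.torch.model.i_transformer import ITransformerLightningModule

    module = ITransformerLightningModule(
        model_kwargs=dict(
            prediction_length=24,
            context_length=96,
            d_model=32,
            nhead=4,
            dim_feedforward=128,
            dropout=0.1,
            activation="relu",
            norm_first=False,
            num_encoder_layers=2,
            scaling="mean",
        ),
        lr=1e-3,
        weight_decay=1e-8,
    )
    # The resulting module can be trained with lightning.pytorch.Trainer together
    # with the data loaders produced by ITransformerEstimator.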

configure_optimizers()[source]#

Returns the optimizer to use.

forward(*args, **kwargs)[source]#

Same as torch.nn.Module.forward().

Parameters
  • *args – Whatever you decide to pass into the forward method.

  • **kwargs – Keyword arguments are also possible.

Returns

Your model’s output

training_step(batch, batch_idx: int)[source]#

Execute training step.

validation_step(batch, batch_idx: int)[source]#

Execute validation step.

class gluonts.torch.model.i_transformer.ITransformerModel(prediction_length: int, context_length: int, d_model: int, nhead: int, dim_feedforward: int, dropout: float, activation: str, norm_first: bool, num_encoder_layers: int, scaling: Optional[str], distr_output=gluonts.torch.distributions.studentT.StudentTOutput(beta=0.0), nonnegative_pred_samples: bool = False)[source]#

Bases: torch.nn.modules.module.Module

Module implementing the iTransformer model for multivariate forecasting, as described in https://arxiv.org/abs/2310.06625, extended to be probabilistic.

Parameters
  • prediction_length – Number of time points to predict.

  • context_length – Number of time steps prior to prediction time that the model takes as inputs.

  • d_model – Transformer latent dimension.

  • nhead – Number of attention heads, which must divide d_model.

  • dim_feedforward – Dimension of the transformer’s feedforward network model.

  • dropout – Dropout rate for the transformer.

  • activation – Activation function for the transformer.

  • norm_first – Whether to apply layer normalization before rather than after the attention and feed-forward blocks in each encoder layer.

  • num_encoder_layers – Number of transformer encoder layers.

  • scaling – Scaling method to apply to the input; can be “mean”, “std” or None.

  • distr_output – Distribution to use to evaluate observations and sample predictions. Default: StudentTOutput().

  • nonnegative_pred_samples – Whether the final prediction samples should be non-negative. If yes, an activation function is applied to ensure non-negativity. Note that this is applied only to the final samples at inference time, not during training.
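
A minimal sketch of exercising the module directly is shown below; the tensor layout (batch, context_length, num_variates) for the past/future arrays is an assumption that should be confirmed against describe_inputs(), and all values are random placeholders.

    import torch
    from gluonts.torch.model.i_transformer import ITransformerModel

    model = ITransformerModel(
        prediction_length=24,
        context_length=96,
        d_model=32,
        nhead=4,
        dim_feedforward=128,
        dropout=0.1,
        activation="relu",
        norm_first=False,
        num_encoder_layers=2,
        scaling="mean",
    )

    print(model.describe_inputs(batch_size=2))  # expected input names and shapes

    # Dummy batch: 2 series, 4 variates (layout assumed, see note above).
    past_target = torch.randn(2, 96, 4)
    past_observed_values = torch.ones_like(past_target)
    distr_args, loc, scale = model(past_target, past_observed_values)

    loss = model.loss(
        past_target,
        past_observed_values,
        future_target=torch.randn(2, 24, 4),
        future_observed_values=torch.ones(2, 24, 4),
    )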

describe_inputs(batch_size=1) gluonts.model.inputs.InputSpec[source]#
forward(past_target: torch.Tensor, past_observed_values: torch.Tensor) Tuple[Tuple[torch.Tensor, ...], torch.Tensor, torch.Tensor][source]#

Define the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

loss(past_target: torch.Tensor, past_observed_values: torch.Tensor, future_target: torch.Tensor, future_observed_values: torch.Tensor) torch.Tensor[source]#
training: bool#