gluonts.dataset.repository.datasets module

gluonts.dataset.repository.datasets.get_dataset(dataset_name: str, path: pathlib.Path = PosixPath('/home/runner/.gluonts/datasets'), regenerate: bool = False, dataset_writer: gluonts.dataset.DatasetWriter = JsonLinesWriter(use_gzip=True, suffix='.json', compresslevel=4), prediction_length: Optional[int] = None) -> gluonts.dataset.common.TrainDatasets

Get a repository dataset.

The datasets that can be obtained through this function have been used with different processing over time by several papers (e.g., [SFG17], [LCY+18], and [YRD15]) or are obtained through the Monash Time Series Forecasting Repository.

Parameters
  • dataset_name – Name of the dataset, for instance “m4_hourly”.

  • regenerate – Whether to regenerate the dataset even if a local file is present. If this flag is False and the file is present, the dataset will not be downloaded again.

  • path – Where the dataset should be saved.

  • prediction_length – The prediction length to be used for the dataset. If None, the default prediction length will be used. If the dataset is already materialized, setting this option to a different value has no effect unless regenerate=True is also set. Note that some datasets from the Monash Time Series Forecasting Repository do not have a default prediction length; the default then depends on the frequency of the data:

      - Minutely data: prediction length of 60 (one hour)
      - Hourly data: prediction length of 48 (two days)
      - Daily data: prediction length of 30 (one month)
      - Weekly data: prediction length of 8 (two months)
      - Monthly data: prediction length of 12 (one year)
      - Yearly data: prediction length of 4 (four years)
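The fallback rule described for prediction_length can be sketched as a small lookup. This is only an illustration of the documented behaviour, not the GluonTS implementation: the names FALLBACK_PREDICTION_LENGTH and resolve_prediction_length, as well as the plain-English frequency keys, are assumptions made for this example.

```python
from typing import Optional

# Frequency-based fallbacks, copied from the prediction_length description.
# The keys are plain-English labels chosen for this sketch; they are not
# GluonTS or pandas frequency aliases.
FALLBACK_PREDICTION_LENGTH = {
    "minutely": 60,  # one hour
    "hourly": 48,    # two days
    "daily": 30,     # one month
    "weekly": 8,     # two months
    "monthly": 12,   # one year
    "yearly": 4,     # four years
}


def resolve_prediction_length(
    frequency: str,
    requested: Optional[int] = None,
    dataset_default: Optional[int] = None,
) -> int:
    """Hypothetical helper: pick a prediction length in priority order."""
    if requested is not None:
        return requested  # an explicit prediction_length wins
    if dataset_default is not None:
        return dataset_default  # the dataset ships its own default
    # Otherwise fall back to the frequency-based default.
    return FALLBACK_PREDICTION_LENGTH[frequency.lower()]


print(resolve_prediction_length("hourly"))  # 48
```

An explicit value always takes precedence here, which mirrors the description above; an unknown frequency label raises KeyError in this sketch.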

Returns

Dataset obtained by either downloading or reloading from a local file.

Return type

gluonts.dataset.common.TrainDatasets
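A minimal usage sketch for get_dataset follows. It assumes GluonTS is installed and that the dataset can be downloaded on first use. The helpers build_get_dataset_kwargs and load_repository_dataset are hypothetical names introduced here; only get_dataset and its documented parameters come from the library.

```python
from pathlib import Path
from typing import Any, Dict, Optional


def build_get_dataset_kwargs(
    prediction_length: Optional[int] = None,
    path: Optional[Path] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for get_dataset."""
    kwargs: Dict[str, Any] = {}
    if path is not None:
        kwargs["path"] = path  # override the default dataset directory
    if prediction_length is not None:
        kwargs["prediction_length"] = prediction_length
        # A non-default prediction length is ignored for an already
        # materialized dataset, so force regeneration alongside it.
        kwargs["regenerate"] = True
    return kwargs


def load_repository_dataset(dataset_name: str = "m4_hourly", **options: Any):
    """Hypothetical wrapper around get_dataset (requires GluonTS)."""
    # Imported lazily so the helper above works without GluonTS installed.
    from gluonts.dataset.repository.datasets import get_dataset

    return get_dataset(dataset_name, **build_get_dataset_kwargs(**options))
```

Calling load_repository_dataset("m4_hourly", prediction_length=24) would regenerate the dataset on every call, which is slow; in practice one would likely regenerate once and then rely on the cached files.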

gluonts.dataset.repository.datasets.get_download_path() -> pathlib.Path
Returns

Default path for downloading GluonTS datasets or models. The path is $HOME/.gluonts/.

Return type

Path
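The documented location can be reproduced with pathlib alone. The sketch below assumes the path is exactly $HOME/.gluonts/ as stated above, and that datasets live in a datasets subdirectory, as the default path argument in the signatures on this page suggests. The function names are hypothetical stand-ins, and the real get_download_path may consult additional configuration.

```python
from pathlib import Path


def default_download_path() -> Path:
    """Hypothetical stand-in for get_download_path(): $HOME/.gluonts."""
    return Path.home() / ".gluonts"


def default_dataset_path() -> Path:
    """Assumed default dataset directory: $HOME/.gluonts/datasets."""
    return default_download_path() / "datasets"


print(default_dataset_path())
```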

gluonts.dataset.repository.datasets.materialize_dataset(dataset_name: str, path: pathlib.Path = PosixPath('/home/runner/.gluonts/datasets'), regenerate: bool = False, dataset_writer: gluonts.dataset.DatasetWriter = JsonLinesWriter(use_gzip=True, suffix='.json', compresslevel=4), prediction_length: Optional[int] = None) -> pathlib.Path

Ensures that the dataset is materialized in the directory path / dataset_name.

Parameters
  • dataset_name – Name of the dataset, for instance “m4_hourly”.

  • regenerate – Whether to regenerate the dataset even if a local file is present. If this flag is False and the file is present, the dataset will not be downloaded again.

  • path – Where the dataset should be saved.

  • prediction_length – The prediction length to be used for the dataset. If None, the default prediction length will be used. The prediction length might not be available for all datasets.

Returns

The path where the dataset is materialized.

Return type

pathlib.Path
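As a rough illustration of where files end up, the sketch below computes the documented path / dataset_name location and wraps the library call. expected_dataset_dir and materialize are hypothetical helper names; the wrapper assumes GluonTS is installed and passes only parameters documented above.

```python
from pathlib import Path


def expected_dataset_dir(dataset_name: str, path: Path) -> Path:
    """Hypothetical helper: the directory documented as path / dataset_name."""
    return path / dataset_name


def materialize(dataset_name: str, path: Path, regenerate: bool = False) -> Path:
    """Hypothetical wrapper around materialize_dataset (requires GluonTS)."""
    # Imported lazily so expected_dataset_dir works without GluonTS installed.
    from gluonts.dataset.repository.datasets import materialize_dataset

    return materialize_dataset(dataset_name, path=path, regenerate=regenerate)


print(expected_dataset_dir("m4_hourly", Path("datasets")))
```

Comparing the value returned by materialize with expected_dataset_dir is one way to check that a custom path was picked up, under the assumption that both refer to the same directory.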