gluonts.dataset.pandas module
- class gluonts.dataset.pandas.PandasDataset(dataframes: InitVar[Union[pd.DataFrame, pd.Series, Iterable[pd.DataFrame], Iterable[pd.Series], Iterable[tuple[Any, pd.DataFrame]], Iterable[tuple[Any, pd.Series]], dict[str, pd.DataFrame], dict[str, pd.Series]]], target: Union[str, list[str]] = 'target', feat_dynamic_real: Optional[list[str]] = None, past_feat_dynamic_real: Optional[list[str]] = None, timestamp: Optional[str] = None, freq: Optional[str] = None, static_features: InitVar[Optional[pd.DataFrame]] = None, future_length: int = 0, unchecked: bool = False, assume_sorted: bool = False, dtype: Type = numpy.float32)
Bases: object
A dataset type based on pandas.DataFrame.
This class is constructed with a collection of pandas.DataFrame objects, where each DataFrame represents one time series. Both the target and timestamp columns are essential. Dynamic features of a series can be specified together with the series' DataFrame, while static features can be specified in a separate DataFrame object via the static_features argument.
- Parameters
dataframes (InitVar[Union[pd.DataFrame, pd.Series, Iterable[pd.DataFrame], Iterable[pd.Series], Iterable[tuple[Any, pd.DataFrame]], Iterable[tuple[Any, pd.Series]], dict[str, pd.DataFrame], dict[str, pd.Series]]]) – A single pd.DataFrame/pd.Series, or a collection of them given as a list or dict, containing at least timestamp and target values. If a dict is provided, the key will be the associated item_id.
target (Union[str, list[str]]) – Name of the column that contains the target time series. For multivariate targets, a list of column names should be provided.
timestamp (Optional[str]) – Name of the column that contains the timestamp information.
freq (Optional[str]) – Frequency of observations in the time series. Must be a valid pandas frequency.
feat_dynamic_real (Optional[list[str]]) – List of column names that contain dynamic real features.
past_feat_dynamic_real (Optional[list[str]]) – List of column names that contain dynamic real features only available in the past.
static_features (InitVar[Optional[pd.DataFrame]]) – pd.DataFrame containing static features for the series. The index should contain the key of the series in the dataframes argument.
future_length (int) – For the target and past dynamic features, the last future_length elements are removed when iterating over the dataset.
unchecked (bool) – Whether consistency checks on indexes should be skipped. (Default: False)
assume_sorted (bool) – Whether to assume that indexes are sorted by time, and skip sorting. (Default: False)
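As an illustration of the parameters above, the following is a minimal sketch rather than a canonical recipe. It builds one DataFrame per series, keyed by item id, and passes the dict to the constructor. The column name price, the item ids, and the helper names make_frames and build_dataset are made up for this example. It also assumes that, with timestamp left as None, the timestamps are read from each DataFrame's index. The gluonts import is kept inside the function so that the data preparation runs with pandas and numpy alone.

```python
import numpy as np
import pandas as pd


def make_frames() -> dict:
    """Build two small daily series, one DataFrame per item (hypothetical data)."""
    index = pd.date_range("2021-01-01", periods=10, freq="D")
    return {
        "item_a": pd.DataFrame(
            {"target": np.arange(10.0), "price": np.linspace(1.0, 2.0, 10)},
            index=index,
        ),
        "item_b": pd.DataFrame(
            {"target": 10.0 - np.arange(10.0), "price": np.ones(10)},
            index=index,
        ),
    }


def build_dataset(frames: dict):
    """Wrap the frames in a PandasDataset (requires gluonts to be installed)."""
    # Imported lazily so make_frames() can be used without gluonts.
    from gluonts.dataset.pandas import PandasDataset

    return PandasDataset(
        frames,                        # dict keys become the item_id of each series
        target="target",               # column holding the values to forecast
        feat_dynamic_real=["price"],   # hypothetical dynamic feature column
        freq="D",                      # any valid pandas frequency
    )


frames = make_frames()
```

With gluonts installed, build_dataset(frames) should return a dataset that can be iterated over, one entry per series.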
- assume_sorted: bool = False
- dataframes: InitVar[Union[pd.DataFrame, pd.Series, Iterable[pd.DataFrame], Iterable[pd.Series], Iterable[tuple[Any, pd.DataFrame]], Iterable[tuple[Any, pd.Series]], dict[str, pd.DataFrame], dict[str, pd.Series]]]
- dtype
alias of numpy.float32
- feat_dynamic_real: Optional[list[str]] = None
- freq: Optional[str] = None
- classmethod from_long_dataframe(dataframe: pd.DataFrame, item_id: str, timestamp: Optional[str] = None, static_feature_columns: Optional[list[str]] = None, static_features: pd.DataFrame = pd.DataFrame(), **kwargs) -> PandasDataset
Construct PandasDataset out of a long data frame.
A long dataframe contains time series data (both the target series and covariates) about multiple items at once. An item_id column is used to distinguish the items and group_by accordingly.
Static features can be included in the long data frame as well (with constant value), or be given as a separate data frame indexed by the item_id values.
Note: on large datasets, this constructor can take some time to complete since it does some indexing and groupby operations on the data, and caches the result.
- Parameters
dataframe – pandas.DataFrame containing at least timestamp, target and item_id columns.
item_id – Name of the column that, when grouped by, gives the different time series.
static_feature_columns – Columns in dataframe containing static features.
static_features – Dedicated DataFrame for static features. If both static_features and static_feature_columns are specified, then the two sets of features are appended together.
**kwargs – Additional arguments. Same as for the PandasDataset class.
- Returns
Dataset containing series data from the given long dataframe.
- Return type
PandasDataset
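The long layout described for from_long_dataframe can be sketched as follows. This is an illustrative example under stated assumptions: the column names item_id, timestamp, target and store_size, the data, and the helper names make_long_frame and build_dataset are all made up here. The pandas groupby shows one sub-frame per item, which is the grouping the description refers to, not necessarily how the library performs it internally. The gluonts import is kept inside the function so the pandas part runs on its own.

```python
import pandas as pd


def make_long_frame() -> pd.DataFrame:
    """Stack two daily series into one long frame keyed by item_id."""
    days = pd.date_range("2021-01-01", periods=5, freq="D")
    sizes = {"A": 10.0, "B": 25.0}  # static feature: constant within each item
    rows = []
    for item, offset in [("A", 0.0), ("B", 100.0)]:
        for i, day in enumerate(days):
            rows.append(
                {
                    "item_id": item,
                    "timestamp": day,
                    "target": offset + i,
                    "store_size": sizes[item],
                }
            )
    return pd.DataFrame(rows)


def build_dataset(long_df: pd.DataFrame):
    """Create a PandasDataset from the long frame (requires gluonts)."""
    # Imported lazily so the pandas-only part runs without gluonts.
    from gluonts.dataset.pandas import PandasDataset

    return PandasDataset.from_long_dataframe(
        long_df,
        item_id="item_id",
        timestamp="timestamp",
        static_feature_columns=["store_size"],
        target="target",  # forwarded to PandasDataset through **kwargs
        freq="D",         # forwarded to PandasDataset through **kwargs
    )


long_df = make_long_frame()
# Pandas-only view of the grouping: one sub-frame per distinct item_id.
per_item = {key: group for key, group in long_df.groupby("item_id")}
```

Here per_item holds one five-row frame for each of the two items; with gluonts installed, build_dataset(long_df) should yield one series per item.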
- future_length: int = 0
- property num_feat_dynamic_real: int
- property num_feat_static_cat: int
- property num_feat_static_real: int
- property num_past_feat_dynamic_real: int
- past_feat_dynamic_real: Optional[list[str]] = None
- property static_cardinalities
- static_features: InitVar[Optional[pd.DataFrame]] = None
- target: Union[str, list[str]] = 'target'
- timestamp: Optional[str] = None
- unchecked: bool = False
- gluonts.dataset.pandas.is_uniform(index: pandas.core.indexes.period.PeriodIndex) -> bool
Check if index contains monotonically increasing periods, evenly spaced with frequency index.freq.
>>> ts = ["2021-01-01 00:00", "2021-01-01 02:00", "2021-01-01 04:00"]
>>> is_uniform(pd.DatetimeIndex(ts).to_period("2H"))
True
>>> ts = ["2021-01-01 00:00", "2021-01-01 04:00"]
>>> is_uniform(pd.DatetimeIndex(ts).to_period("2H"))
False
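One way to picture the check is the pandas-only sketch below. It is a hypothetical re-implementation written for illustration and is not claimed to be the library's code; the name is_uniform_sketch is made up. It regenerates the index that would result from starting at the first period and stepping by index.freq, then compares the two element-wise. It assumes a non-empty index.

```python
import pandas as pd


def is_uniform_sketch(index: pd.PeriodIndex) -> bool:
    """Hypothetical stand-in for is_uniform; assumes a non-empty index."""
    # Rebuild the evenly spaced index implied by the first period and the
    # index's own frequency, then require every position to match.
    expected = pd.period_range(index[0], periods=len(index), freq=index.freq)
    return bool((expected == index).all())


# Four consecutive daily periods: evenly spaced.
daily = pd.period_range("2021-01-01", periods=4, freq="D")
# Daily frequency with 2021-01-03 missing: not evenly spaced.
gappy = pd.PeriodIndex(["2021-01-01", "2021-01-02", "2021-01-04"], freq="D")

uniform_result = is_uniform_sketch(daily)
gappy_result = is_uniform_sketch(gappy)
```

In this sketch uniform_result is True and gappy_result is False, because the regenerated index contains 2021-01-03 where the gappy one has 2021-01-04.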