As someone who's worked in time series forecasting for a while, I haven't yet found a use case for these "time series" focused deep learning models.
On extremely high dimensional data (I worked at a credit card processor company doing fraud modeling), deep learning dominates, but there's simply no advantage in using a designated "time series" model that treats time differently than any other feature. We've tried most time series deep learning models that claim to be SoTA (N-BEATS, N-HiTS, every RNN variant that was popular pre-transformers), and they don't beat an MLP that just uses lagged values as features. I've talked to several others in the forecasting space and they've found the same result.
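For concreteness, here's a minimal sketch of what "an MLP that just uses lagged values as features" can look like. The synthetic series, lag count, and architecture are all illustrative, not anything from the comment above:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_lagged(series, n_lags):
    """Turn a 1-D series into (X, y): each row of X holds the n_lags
    previous values, y is the next value. Time becomes ordinary columns."""
    X = np.array([series[i:i + n_lags] for i in range(len(series) - n_lags)])
    y = series[n_lags:]
    return X, y

# Synthetic series with trend + seasonality, just for illustration.
rng = np.random.default_rng(0)
t = np.arange(500)
series = 0.01 * t + np.sin(2 * np.pi * t / 12) + 0.1 * rng.standard_normal(500)

X, y = make_lagged(series, n_lags=24)
X_train, X_test = X[:-50], X[-50:]   # hold out the last 50 steps
y_train, y_test = y[:-50], y[-50:]

# A plain MLP over the lag window -- no recurrence, no attention.
model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(X_train, y_train)
preds = model.predict(X_test)
```

The point is that the lag window turns forecasting into an ordinary tabular regression problem, which is why generic models compete so well here.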
On mid-dimensional data, LightGBM/XGBoost is by far the best and generally performs as well as or better than any deep learning model, while requiring much less fine-tuning and a tiny fraction of the computation time.
And on low-dimensional data, (V)ARIMA/ETS/Factor models are still king, since without adequate data, the model needs to be structured with human intuition.
As a result I'm extremely skeptical of any of these claims about a generally high performing "time series" model. Training on time series gives a model very limited understanding of the fundamental structure of how the world works, unlike a language model, so the amount of generalization ability a model will gain is very limited.
Great write-up, thank you. Do you have rough measures for what constitutes high/mid/low-dimensional data? And how do you use XGBoost et al. for multi-step forecasting, i.e. in scenarios where you want to predict multiple time steps into the future?
The added benefit is that you optimize each regressor towards its own target timestep t+1 ... t+n. A single loss on the aggregate of all timesteps is often problematic.
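A sketch of that "direct" multi-step strategy, with one regressor per horizon, each trained on its own target. I'm using sklearn's GradientBoostingRegressor as a stand-in for LightGBM/XGBoost; the helper names and the sine-wave data are made up for illustration:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def direct_multistep_fit(series, n_lags, horizons):
    """Fit one regressor per horizon h. Each model sees the same lag
    windows X but its own target (the value h steps ahead), so each
    optimizes its own loss rather than an aggregate over all steps."""
    max_h = max(horizons)
    X = np.array([series[i:i + n_lags]
                  for i in range(len(series) - n_lags - max_h + 1)])
    models = {}
    for h in horizons:
        y = series[n_lags + h - 1 : len(series) - max_h + h]
        models[h] = GradientBoostingRegressor(random_state=0).fit(X, y)
    return models

def direct_multistep_predict(models, last_window):
    """Forecast each horizon independently from the latest lag window."""
    return {h: m.predict(last_window[None, :])[0] for h, m in models.items()}

series = np.sin(np.arange(300) * 0.1)
models = direct_multistep_fit(series, n_lags=12, horizons=[1, 2, 3])
forecast = direct_multistep_predict(models, series[-12:])
```

Unlike recursive forecasting, errors don't compound across steps here, at the cost of training n separate models.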
I've found that it works well to add the prediction horizon as a numerical feature (e.g. # of days), and then replicate each row for many such horizons, while ensuring that all such rows go to the same training fold.
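Here's one way that row-replication scheme could look. This is a sketch under my own assumptions (function name, toy data, and the use of sklearn's GradientBoostingRegressor are all illustrative); the group ids are what you'd feed to something like GroupKFold to keep replicated rows in the same fold:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def build_horizon_rows(series, n_lags, horizons):
    """Replicate each lag-window row once per horizon, appending the
    horizon as a numeric feature. The target is the value h steps ahead.
    Rows sharing a window get the same group id so they can be kept in
    one CV fold (avoiding leakage across folds)."""
    rows, targets, groups = [], [], []
    max_h = max(horizons)
    for i in range(len(series) - n_lags - max_h + 1):
        window = series[i:i + n_lags]
        for h in horizons:
            rows.append(np.append(window, h))            # horizon as a feature
            targets.append(series[i + n_lags + h - 1])   # value h steps ahead
            groups.append(i)                             # fold-grouping key
    return np.array(rows), np.array(targets), np.array(groups)

series = np.sin(np.arange(300) * 0.1)
X, y, groups = build_horizon_rows(series, n_lags=12, horizons=[1, 7, 14])
model = GradientBoostingRegressor(random_state=0).fit(X, y)
```

The appeal is that a single model shares statistical strength across horizons, instead of training one regressor per step.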
Thanks for this write up. Your comment clears up a lot of the confusion I've had around these time series transformers.
How does lagged features for an MLP compare to longer sequence lengths for attention in Transformers? Are you able to lag 128 time steps in a feed forward network and get good results?
I agree that conventional (numeric) forecasting can hardly benefit from the newest approaches like transformers and LLMs. I came to this conclusion while working on the intelligent trading bot [0], experimenting with many ML algorithms. Yet there are some cases where transformers might provide significant advantages. They could be useful where (numeric) forecasting is augmented with discrete event analysis and where sequences of events are important. Another use case is where certain patterns matter, like those detected in technical analysis. For these cases, though, much more data is needed.
Foundational models can work where, so far, "needs human intuition" was the state of things. I can picture a time series model with a large enough training corpus being able to deal quite well with the typical quirks of seasonality, shocks, outliers, etc.
I fully agree regarding how things have been so far, but I’m excited to see practitioners try out models such as the one presented here — it might just work.
Reminds me a bit of how in psychology you have ANOVA, MANOVA, ANCOVA, MANCOVA, etc., but really, in the end, we are just running regressions: variables are just variables.
My read on this was that you can just dump the lagged values in as inputs and let the network figure it out just as well as the time-series-specific models do, not that time doesn't matter.
I assume the time series modelling is used to predict normal non-fraud behaviour. And then simpler algorithms are able to highlight deviations from the norm?