Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting

Li, Shiyang; Jin, Xiaoyong; Xuan, Yao; Zhou, Xiyou; Chen, Wenhu; Wang, Yu-Xiang; Yan, Xifeng

arXiv:1907.00235 (cs)

[Submitted on 29 Jun 2019 (v1), last revised 3 Jan 2020 (this version, v3)]

Abstract: Time series forecasting is an important problem across many domains, including prediction of solar plant energy output, electricity consumption, and traffic congestion. In this paper, we propose to tackle such forecasting problems with the Transformer [1]. Although impressed by its performance in our preliminary study, we found two major weaknesses: (1) locality-agnostics: the point-wise dot-product self-attention in the canonical Transformer architecture is insensitive to local context, which can make the model prone to anomalies in time series; (2) memory bottleneck: the space complexity of the canonical Transformer grows quadratically with sequence length $L$, making direct modeling of long time series infeasible. To solve these two issues, we first propose convolutional self-attention, which produces queries and keys with causal convolution so that local context can be better incorporated into the attention mechanism. Then, we propose the LogSparse Transformer with only $O(L(\log L)^{2})$ memory cost, improving forecasting accuracy for time series with fine granularity and strong long-term dependencies under a constrained memory budget. Our experiments on both synthetic data and real-world datasets show that it compares favorably to the state-of-the-art.
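The two ideas in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names (`causal_conv`, `log_sparse_indices`, `attention`), the toy weights, and the input series are all hypothetical. The abstract does not spell out the sparsity pattern, so the sketch assumes one plausible choice: each position attends to itself and to earlier positions at exponentially growing distances (1, 2, 4, ...), which gives $O(\log L)$ attended cells per position.

```python
import math

def causal_conv(x, weight):
    """Causal 1-D convolution over a sequence of vectors.

    x:      list of L input vectors, each of length d_in.
    weight: list of k matrices (d_in x d_out); weight[j] multiplies
            x[t - (k - 1) + j], so output t sees only x[t-k+1 .. t].
    Positions before the start of the series are treated as zeros.
    """
    k, d_in, d_out = len(weight), len(weight[0]), len(weight[0][0])
    out = []
    for t in range(len(x)):
        y = [0.0] * d_out
        for j in range(k):
            s = t - (k - 1) + j
            if s < 0:
                continue  # implicit left zero-padding
            for i in range(d_in):
                for o in range(d_out):
                    y[o] += x[s][i] * weight[j][i][o]
        out.append(y)
    return out

def log_sparse_indices(t):
    """Assumed pattern: t itself plus t-1, t-2, t-4, ... while non-negative."""
    idx = {t}
    step = 1
    while t - step >= 0:
        idx.add(t - step)
        step *= 2
    return sorted(idx)

def attention(q, k, v, allowed):
    """Softmax attention in which row t scores only positions in allowed(t)."""
    scale = math.sqrt(len(q[0]))
    out = []
    for t in range(len(q)):
        idx = allowed(t)
        scores = [sum(a * b for a, b in zip(q[t], k[s])) / scale for s in idx]
        m = max(scores)  # subtract the max for numerical stability
        w = [math.exp(sc - m) for sc in scores]
        z = sum(w)
        out.append([sum(w[n] * v[s][c] for n, s in enumerate(idx)) / z
                    for c in range(len(v[0]))])
    return out

# Toy univariate series, kernel size 3, two query/key channels.
L = 16
x = [[math.sin(0.3 * t)] for t in range(L)]
w_q = [[[0.5, -0.2]], [[0.1, 0.3]], [[0.4, 0.2]]]
w_k = [[[0.3, 0.1]], [[-0.2, 0.4]], [[0.6, -0.1]]]

q = causal_conv(x, w_q)   # queries summarize a local window, not one point
k = causal_conv(x, w_k)   # keys likewise
y = attention(q, k, x, log_sparse_indices)

sparse_cells = sum(len(log_sparse_indices(t)) for t in range(L))
full_cells = L * (L + 1) // 2  # cells scored by full causal attention
print(sparse_cells, full_cells)
```

With a kernel of size 1, `causal_conv` reduces to an ordinary point-wise projection, so the canonical query and key computation is a special case of this sketch. The per-layer count of scored cells under the assumed pattern grows like $L \log L$ rather than $L^2$.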

Comments:	To appear in the proceedings of NeurIPS 2019
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1907.00235 [cs.LG]
	(or arXiv:1907.00235v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1907.00235

Submission history

From: Xiyou Zhou
[v1] Sat, 29 Jun 2019 16:36:04 UTC (399 KB)
[v2] Tue, 29 Oct 2019 07:51:31 UTC (1,545 KB)
[v3] Fri, 3 Jan 2020 05:10:50 UTC (1,545 KB)
