DOI: 10.1145/3447548.3467401
KDD Conference Proceedings · Research Article · Public Access

A Transformer-based Framework for Multivariate Time Series Representation Learning

Published: 14 August 2021

Abstract

We present a novel framework for multivariate time series representation learning based on the transformer encoder architecture. The framework includes an unsupervised pre-training scheme, which can offer substantial performance benefits over fully supervised learning on downstream tasks, both when leveraging additional unlabeled data and even without it, i.e., by reusing the existing data samples. Evaluating our framework on several public multivariate time series datasets from various domains and with diverse characteristics, we demonstrate that it performs significantly better than the best currently available methods for regression and classification, even on datasets that consist of only a few hundred training samples. Given the pronounced interest in unsupervised learning across nearly all domains in the sciences and in industry, these findings represent an important landmark: ours is the first unsupervised method shown to push the limits of state-of-the-art performance for multivariate time series regression and classification.
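
As a concrete illustration of the approach the abstract describes, the sketch below shows the general shape of a transformer-encoder model for multivariate time series with a masked-value reconstruction pre-training objective. This is a minimal, hypothetical PyTorch sketch, not the authors' implementation: the class and function names, the masking ratio, the zero-fill masking strategy, and the MSE-on-masked-positions loss are illustrative assumptions rather than details taken from the paper.

```python
# Hypothetical sketch (not the authors' code): unsupervised pre-training of a
# transformer encoder on multivariate time series via masked-value reconstruction.
import torch
import torch.nn as nn


class TSTransformerEncoder(nn.Module):
    def __init__(self, n_features: int, d_model: int = 64, n_heads: int = 8,
                 n_layers: int = 3, max_len: int = 512):
        super().__init__()
        self.input_proj = nn.Linear(n_features, d_model)           # project each time step
        self.pos_embedding = nn.Parameter(torch.randn(max_len, d_model) * 0.02)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                           dim_feedforward=4 * d_model,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.output_proj = nn.Linear(d_model, n_features)          # reconstruct input values

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, n_features)
        z = self.input_proj(x) + self.pos_embedding[: x.size(1)]
        z = self.encoder(z)
        return self.output_proj(z)


def masked_reconstruction_step(model, x, mask_ratio: float = 0.15):
    """One unsupervised pre-training step: hide a fraction of the input values,
    then score reconstruction only on the hidden positions."""
    mask = torch.rand(x.shape[:2], device=x.device) < mask_ratio   # (batch, seq_len)
    x_masked = x.clone()
    x_masked[mask] = 0.0                                           # zero out masked time steps
    x_hat = model(x_masked)
    loss = ((x_hat - x) ** 2)[mask].mean()                         # MSE on masked positions only
    return loss


if __name__ == "__main__":
    model = TSTransformerEncoder(n_features=6)
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    x = torch.randn(32, 128, 6)                                    # toy batch of multivariate series
    loss = masked_reconstruction_step(model, x)
    loss.backward()
    opt.step()
    print(f"pre-training loss: {loss.item():.4f}")
```

After such pre-training, the encoder (without the reconstruction head) would typically be fine-tuned with a task-specific output layer for regression or classification on the downstream dataset.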

Supplementary Material

MP4 File (a_transformerbased_framework_for_multivariate-george_zerveas-srideepika_jayaraman-38957975-xf1A.mp4)
KDD officially edited presentation video




        Published In

        KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining
        August 2021
        4259 pages
        ISBN:9781450383325
        DOI:10.1145/3447548
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        Published: 14 August 2021


        Author Tags

        1. classification
        2. deep learning
        3. framework
        4. imputation
        5. multivariate time series
        6. regression
        7. self-supervised learning
        8. transformer
        9. unsupervised learning

        Qualifiers

        • Research-article


        Conference

        KDD '21

        Acceptance Rates

        Overall Acceptance Rate 1,133 of 8,635 submissions, 13%



        Bibliometrics & Citations

        Article Metrics

        • Downloads (last 12 months): 9,181
        • Downloads (last 6 weeks): 1,043
        Reflects downloads up to 09 Nov 2024


        Cited By

        • Identification of multimodal brain imaging biomarkers in first-episode drugs-naive major depressive disorder through a multi-site large-scale MRI consortium data. Journal of Affective Disorders, 369, 364-372 (Jan 2025). https://doi.org/10.1016/j.jad.2024.10.006
        • A pre-trained multi-step prediction informer for ship motion prediction with a mechanism-data dual-driven framework. Engineering Applications of Artificial Intelligence, 139, 109523 (Jan 2025). https://doi.org/10.1016/j.engappai.2024.109523
        • A deep learning-based hand motion classification for hand dysfunction assessment in cervical spondylotic myelopathy. Biomedical Signal Processing and Control, 99, 106884 (Jan 2025). https://doi.org/10.1016/j.bspc.2024.106884
        • Carbon emissions trading price forecasts by multi-perspective fusion. Economic Analysis Letters, 3(2), 37-48 (15 Jun 2024). https://doi.org/10.58567/eal03020002
        • Anomaly Detection in Time Series: Current Focus and Future Challenges. In Anomaly Detection - Recent Advances, AI and ML Perspectives and Applications (17 Jan 2024). https://doi.org/10.5772/intechopen.111886
        • Drive GPT – An AI Based Generative Driver Model. SAE Technical Paper Series (23 Jan 2024). https://doi.org/10.4271/2024-26-0025
        • Extraction of Features for Time Series Classification Using Noise Injection. Sensors, 24(19), 6402 (2 Oct 2024). https://doi.org/10.3390/s24196402
        • Machine Learning in Short-Reach Optical Systems: A Comprehensive Survey. Photonics, 11(7), 613 (28 Jun 2024). https://doi.org/10.3390/photonics11070613
        • Back to Basics: The Power of the Multilayer Perceptron in Financial Time Series Forecasting. Mathematics, 12(12), 1920 (20 Jun 2024). https://doi.org/10.3390/math12121920
        • Predictive Analytics of Air Temperature in Alaskan Permafrost Terrain Leveraging Two-Level Signal Decomposition and Deep Learning. Forecasting, 6(1), 55-80 (9 Jan 2024). https://doi.org/10.3390/forecast6010004
