DOI: 10.1145/3580305.3599543

Warpformer: A Multi-scale Modeling Approach for Irregular Clinical Time Series

Published: 04 August 2023

Abstract

Irregularly sampled multivariate time series are ubiquitous in various fields, particularly in healthcare, and exhibit two key characteristics: intra-series irregularity and inter-series discrepancy. Intra-series irregularity refers to the fact that time-series signals are often recorded at irregular intervals, while inter-series discrepancy refers to the significant variability in sampling rates among diverse series. However, recent advances in irregular time series modeling have primarily focused on addressing intra-series irregularity, overlooking the issue of inter-series discrepancy. To bridge this gap, we present Warpformer, a novel approach that fully considers these two characteristics. In a nutshell, Warpformer has several crucial designs, including a specific input representation that explicitly characterizes both intra-series irregularity and inter-series discrepancy, a warping module that adaptively unifies irregular time series in a given scale, and a customized attention module for representation learning. Additionally, we stack multiple warping and attention modules to learn at different scales, producing multi-scale representations that balance coarse-grained and fine-grained signals for downstream tasks. We conduct extensive experiments on widely used datasets and a new large-scale benchmark built from clinical databases. The results demonstrate the superiority of Warpformer over existing state-of-the-art approaches.
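
The warping operation described above is learned in the paper. As a rough, non-authoritative illustration only, the sketch below uses plain linear interpolation (not the paper's learned warping module) to place two hypothetical clinical channels with very different sampling rates onto the same grid of anchor points, which is the kind of unification the abstract refers to. All names and values are invented for illustration.

```python
import numpy as np

def unify_to_grid(times, values, num_anchors):
    """Resample one irregularly sampled channel onto a uniform grid of
    `num_anchors` points via linear interpolation (a fixed stand-in for
    the learned, adaptive warping described in the paper)."""
    grid = np.linspace(times.min(), times.max(), num_anchors)
    return grid, np.interp(grid, times, values)

# Two channels with very different sampling rates (inter-series discrepancy):
# frequently recorded vitals vs. sparse lab tests, both irregularly timed.
hr_t = np.array([0.0, 0.5, 1.1, 1.6, 2.0, 2.9, 3.5, 4.0])
hr_v = np.array([80.0, 82.0, 85.0, 83.0, 81.0, 79.0, 78.0, 80.0])
lab_t = np.array([0.0, 2.5, 4.0])
lab_v = np.array([1.2, 1.4, 1.1])

grid, hr_u = unify_to_grid(hr_t, hr_v, num_anchors=5)
_, lab_u = unify_to_grid(lab_t, lab_v, num_anchors=5)

# Both channels now live on a shared (channels x anchors) lattice,
# ready for attention-style representation learning.
aligned = np.stack([hr_u, lab_u])
print(aligned.shape)  # (2, 5)
```

In Warpformer the fixed interpolation step above is replaced by a learned, differentiable alignment, and several such unification scales are stacked so downstream attention sees both fine-grained and coarse-grained views.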

Supplementary Material

MP4 File (rtfp1332-2min-promo.mp4)
2-minute promotional video

Cited By

  • (2024) "Irregular Traffic Time Series Forecasting Based on Asynchronous Spatio-Temporal Graph Convolutional Networks". In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 4302-4313. DOI: 10.1145/3637528.3671665. Online publication date: 25 Aug 2024.
  • (2024) "DNA-T: Deformable Neighborhood Attention Transformer for Irregular Medical Time Series". IEEE Journal of Biomedical and Health Informatics, 28(7), 4224-4237. DOI: 10.1109/JBHI.2024.3395446. Online publication date: Jul 2024.
  • (2023) "A Co-training Approach for Noisy Time Series Learning". In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 3308-3318. DOI: 10.1145/3583780.3614759. Online publication date: 21 Oct 2023.

Published In

KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
August 2023
5996 pages
ISBN:9798400701030
DOI:10.1145/3580305

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. clinical time series
  2. irregularly sampled time series
  3. multi-scale representation

Qualifiers

  • Research-article

Conference

KDD '23

Acceptance Rates

Overall acceptance rate: 1,133 of 8,635 submissions (13%)


Article Metrics

  • Downloads (last 12 months): 642
  • Downloads (last 6 weeks): 99
Reflects downloads up to 06 Oct 2024

