Semi-Supervised Learning for Time Series Collected at a Low Sampling Rate

Authors:

Young Seop Lee,

Jae-Gil Lee

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Pages 59 - 70

https://doi.org/10.1145/3637528.3672033

Published: 24 August 2024

Abstract

Although time-series classification has many applications in healthcare and manufacturing, the high cost of data collection and labeling hinders its widespread use. To reduce data collection and labeling costs while maintaining high classification accuracy, we propose a novel problem setting, called semi-supervised learning with low-sampling-rate time series, in which the majority of time series are collected at a low sampling rate and are unlabeled whereas the minority of time series are collected at a high sampling rate and are labeled. For this novel problem scenario, we develop the SemiTSR framework equipped with the super-resolution module and the semi-supervised learning module. Here, low-sampling-rate time series are upsampled precisely, taking periodicity and trend at each timestamp into account, and both labeled and unlabeled high-sampling-rate time series are utilized for training. In particular, consistency regularization between artificially downsampled time series derived from an original high-sampling-rate time series is effective at overcoming limited sampling rates. We demonstrate that SemiTSR significantly outperforms conventional semi-supervised learning techniques by assuring high classification accuracy with low-sampling-rate time series.
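To make the consistency-regularization idea in the abstract concrete, here is a minimal sketch in Python. It is not the SemiTSR implementation: plain linear interpolation stands in for the learned super-resolution module, a random linear softmax model stands in for the classifier, and all function names (`downsample`, `upsample_linear`, `toy_classifier`, `consistency_loss`) are hypothetical. It only shows how two artificially downsampled views of one high-sampling-rate series can be compared after upsampling.

```python
import numpy as np


def downsample(x, factor, offset=0):
    """Keep every `factor`-th sample starting at `offset` (plain decimation)."""
    return x[offset::factor]


def upsample_linear(x_low, factor, offset, length):
    """Linearly interpolate back to the original grid.

    Assumption: a stand-in for a learned super-resolution module.
    """
    t_low = np.arange(offset, length, factor)  # timestamps that were kept
    t_high = np.arange(length)                 # full-resolution timestamps
    return np.interp(t_high, t_low, x_low)


def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()


def toy_classifier(x, weights):
    """Hypothetical linear classifier returning class probabilities."""
    return softmax(weights @ x)


def consistency_loss(x_high, weights, factor=4):
    """Mean squared difference between predictions on two downsampled views."""
    length = len(x_high)
    probs = []
    for offset in (0, factor // 2):  # two views with different sampling phases
        low = downsample(x_high, factor, offset)
        restored = upsample_linear(low, factor, offset, length)
        probs.append(toy_classifier(restored, weights))
    return float(np.mean((probs[0] - probs[1]) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.arange(128)
    series = np.sin(2 * np.pi * t / 32) + 0.01 * t  # periodic part plus trend
    weights = rng.normal(size=(3, 128))             # 3 hypothetical classes
    print("consistency loss:", consistency_loss(series, weights))
```

In a training loop, a term of this kind would typically be added to the supervised loss on labeled series, so that predictions stay stable across sampling phases; the weighting between the two terms is left open here.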

Supplemental Material

MP4 File - Promotion Video of SemiTSR

Short promotional video for the research track paper "Semi-Supervised Learning for Time Series Collected at a Low Sampling Rate [KDD'24]"

Published In

KDD '24: Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 2024

6901 pages

ISBN:9798400704901

DOI:10.1145/3637528

General Chairs: Ricardo Baeza-Yates (Northeastern University, USA), Francesco Bonchi (CENTAI / Eurecat, Italy)

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Mobile eXperience Business, Samsung Electronics Co., Ltd.
National Research Foundation of Korea

Conference

KDD '24: The 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

August 25 - 29, 2024

Barcelona, Spain

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
