DOI: 10.1145/3594738.3611366
Short paper · Open access

How Much Unlabeled Data is Really Needed for Effective Self-Supervised Human Activity Recognition?

Published: 08 October 2023

Abstract

The prospect of learning effective representations from unlabeled data alone has led to a boost in the development of self-supervised learning (SSL) methods for sensor-based Human Activity Recognition (HAR). Typically, (large-scale) unlabeled data are used for pre-training, and the learned weights are then used as feature extractors for recognizing activities. While prior work has focused on how increased data scale affects performance, we instead aim to determine the pre-training data efficiency of self-supervised methods: we empirically determine the minimal quantities of unlabeled data required to obtain performance comparable to using all available data. We investigate three established SSL methods for HAR on three target datasets. Of these three methods, Contrastive Predictive Coding (CPC) proves the most efficient in terms of pre-training data requirements: just 15 minutes of sensor data across participants is sufficient to obtain competitive activity recognition performance. Further, around 5 minutes of source data suffices when sufficient amounts of target application data are available. These findings can serve as a starting point for more efficient data collection practices.
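To make the protocol concrete, below is a minimal sketch of the pre-train-then-reuse pipeline the abstract describes: an encoder is trained with Contrastive Predictive Coding (CPC) on unlabeled sensor windows and would afterwards be frozen and reused as a feature extractor for activity classification. This is an illustrative PyTorch example, not the authors' implementation; all module choices, dimensions, and hyperparameters below are assumptions made for the sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CPC(nn.Module):
    """Minimal CPC model for multi-channel sensor windows (illustrative only)."""
    def __init__(self, in_channels=3, z_dim=64, c_dim=128, k=4):
        super().__init__()
        # Strided 1-D conv encoder maps a raw window to a sequence of latents z_t.
        self.encoder = nn.Sequential(
            nn.Conv1d(in_channels, z_dim, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv1d(z_dim, z_dim, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
        )
        # Autoregressive model summarizes z_<=t into a context vector c_t.
        self.gru = nn.GRU(z_dim, c_dim, batch_first=True)
        # One linear predictor per future step to be predicted from c_t.
        self.predictors = nn.ModuleList([nn.Linear(c_dim, z_dim) for _ in range(k)])
        self.k = k

    def forward(self, x):                       # x: (batch, channels, time)
        z = self.encoder(x).transpose(1, 2)     # (batch, steps, z_dim)
        c, _ = self.gru(z)                      # (batch, steps, c_dim)
        return z, c

def info_nce_loss(model, x):
    """InfoNCE: the true future latent z_{t+k} must score higher than the
    latents of the other sequences in the batch at the same time step."""
    z, c = model(x)
    B, T, _ = z.shape
    loss = 0.0
    for step, predictor in enumerate(model.predictors, start=1):
        z_future = z[:, step:, :]               # targets z_{t+step}
        z_pred = predictor(c[:, :-step, :])     # predictions made from c_t
        # Score every prediction against every sequence's true future;
        # shape (T-step, B, B), diagonal entries are the positives.
        logits = torch.einsum('btd,utd->tbu', z_pred, z_future)
        labels = torch.arange(B).expand(T - step, B)
        loss = loss + F.cross_entropy(logits.reshape(-1, B), labels.reshape(-1))
    return loss / model.k

# Pre-train on unlabeled windows; afterwards the frozen encoder (and GRU)
# would be reused as a feature extractor for a small labeled classifier.
model = CPC()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
windows = torch.randn(32, 3, 128)               # dummy batch of 3-axis sensor windows
optimizer.zero_grad()
loss = info_nce_loss(model, windows)
loss.backward()
optimizer.step()
```

Under this protocol, the paper's question amounts to how little unlabeled data the pre-training stage above actually needs before downstream recognition accuracy stops improving.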




Published In

ISWC '23: Proceedings of the 2023 ACM International Symposium on Wearable Computers
October 2023
145 pages
ISBN: 9798400701993
DOI: 10.1145/3594738
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. human activity recognition
  2. representation learning
  3. self-supervision

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

UbiComp/ISWC '23

Acceptance Rates

Overall acceptance rate: 38 of 196 submissions (19%)


Cited By

  • (2025) Impact of Pre-training Datasets on Human Activity Recognition with Contrastive Predictive Coding. Intelligent Systems, 306-320. DOI: 10.1007/978-3-031-79035-5_21. Online publication date: 30-Jan-2025.
  • (2024) Towards Learning Discrete Representations via Self-Supervision for Wearables-Based Human Activity Recognition. Sensors 24, 4 (1238). DOI: 10.3390/s24041238. Online publication date: 15-Feb-2024.
  • (2024) Evaluating Large Language Models as Virtual Annotators for Time-series Physical Sensing Data. ACM Transactions on Intelligent Systems and Technology. DOI: 10.1145/3696461. Online publication date: 20-Sep-2024.
  • (2024) Progress and Thinking on Self-Supervised Learning Methods in Computer Vision: A Review. IEEE Sensors Journal 24, 19 (29524-29544). DOI: 10.1109/JSEN.2024.3443885. Online publication date: 1-Oct-2024.
  • (2024) A Washing Machine is All You Need? On the Feasibility of Machine Data for Self-Supervised Human Activity Recognition. 2024 International Conference on Activity and Behavior Computing (ABC), 1-10. DOI: 10.1109/ABC61795.2024.10651688. Online publication date: 29-May-2024.
