DOI: 10.1145/3382507.3418826

LASO: Exploiting Locomotive and Acoustic Signatures over the Edge to Annotate IMU Data for Human Activity Recognition

Published: 22 October 2020

Abstract

Annotated IMU sensor data from smart devices and wearables is essential for developing supervised models for fine-grained human activity recognition, yet generating sufficient annotated data for diverse human activities in different environments is challenging. Existing approaches primarily rely on human-in-the-loop techniques, including active learning; however, these are tedious, costly, and time-consuming. Leveraging the acoustic data available from microphones embedded in the data collection devices, in this paper we propose LASO, a multimodal approach for automated data annotation from acoustic and locomotive information. LASO runs on the edge device itself, ensuring that only the annotated IMU data is collected while the acoustic data is discarded on the device, thus preserving the audio privacy of the user. In the absence of any pre-existing labeling information, such auto-annotation is challenging, as the IMU data needs to be sessionized for activities of different time scales in a completely unsupervised manner. We use a change-point detection technique to synchronize the locomotive information from the IMU data with the acoustic data, and then use pre-trained audio-based activity recognition models to label the IMU data while handling acoustic noise. LASO efficiently annotates IMU data, without any explicit human intervention, with a mean accuracy of 0.93 (±0.04) and 0.78 (±0.05) for two real-life datasets from workshop and kitchen environments, respectively.
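To make the sessionization step concrete, here is a minimal sketch of unsupervised change-point detection over an IMU stream, the step performed before any audio-based labeling. This is an illustration, not the authors' implementation: it uses the kernel-based PELT detector from the `ruptures` Python library as a stand-in for the paper's change-point technique, and the sampling rate and penalty values are assumptions.

```python
import numpy as np
import ruptures as rpt  # third-party change-point detection library

def sessionize_imu(accel, fs=50, penalty=10.0):
    """Split a tri-axial accelerometer stream into candidate activity
    sessions, fully unsupervised (no labels, no fixed session count).

    accel   : (N, 3) array of raw accelerometer samples
    fs      : sampling rate in Hz (assumed)
    penalty : PELT penalty controlling segmentation granularity (assumed)
    Returns a list of (start_sec, end_sec) segments.
    """
    # Work on the signal magnitude so segmentation is orientation-invariant.
    magnitude = np.linalg.norm(accel, axis=1)

    # PELT with an RBF kernel detects an unknown number of change points.
    breakpoints = rpt.Pelt(model="rbf").fit(magnitude).predict(pen=penalty)

    segments, start = [], 0
    for end in breakpoints:  # breakpoints are sample indices of segment ends
        segments.append((start / fs, end / fs))
        start = end
    return segments
```

Each returned segment is a candidate activity session whose time span can then be matched against the synchronized audio stream for labeling.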

Supplementary Material

MP4 File (3382507.3418826.mp4)
All of us are crazy about machine learning and deep learning these days. However, you need a huge amount of labeled data to train your ML or DL model. Data might be available, but how do you label it? Manual annotation is cumbersome, especially when dealing with the huge volume of data generated by sensors for human activity detection. In our work, we have developed an automated approach for labeling sensor data using auxiliary modalities: we capture acoustic signatures and feed them to existing pre-trained models to infer human activities. Although this sounds simple, it is challenging in practice, and this is where our novelty lies. We use an unsupervised change-point detection technique to precisely detect activity boundaries, followed by a feedback mechanism that prevents confounded activities from generating multiple audio-based labels. In addition, a proof-of-concept implementation shows that the framework can run on edge devices, thus preserving privacy.
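The pipeline described in the video can be summarized in a short end-to-end sketch. Everything here is hypothetical scaffolding around the idea: it assumes a `sessionize_imu` segmenter like the one sketched above and an `audio_model` object exposing a `predict(clip) -> (label, confidence)` interface standing in for a pre-trained acoustic activity classifier; the confidence-threshold test is a simplification of the paper's feedback mechanism for rejecting confounded segments.

```python
def annotate_on_edge(accel, audio, audio_model, fs_imu=50, min_conf=0.6):
    """Edge-side auto-annotation sketch: label IMU segments from audio,
    then let the raw audio go out of scope so it is never persisted.

    accel       : (N, 3) accelerometer array, time-synchronized with audio
    audio       : object with a slice(start_sec, end_sec) method (hypothetical)
    audio_model : pre-trained classifier, predict(clip) -> (label, confidence)
    """
    labeled = []
    for start_s, end_s in sessionize_imu(accel, fs=fs_imu):
        label, conf = audio_model.predict(audio.slice(start_s, end_s))

        # Feedback step (simplified): drop segments where noise or
        # overlapping sounds make the audio-derived label ambiguous.
        if conf < min_conf:
            continue

        window = accel[int(start_s * fs_imu):int(end_s * fs_imu)]
        labeled.append((window, label))

    return labeled  # only (IMU window, label) pairs leave the device
```

Only the annotated IMU windows are returned; the acoustic stream is consumed in place, which is what lets the approach preserve audio privacy on the edge.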



Published In

cover image ACM Conferences
ICMI '20: Proceedings of the 2020 International Conference on Multimodal Interaction
October 2020
920 pages
ISBN:9781450375818
DOI:10.1145/3382507
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. human-in-the-loop
  2. labeling human activity
  3. smart-environments

Qualifiers

  • Research-article

Conference

ICMI '20
Sponsor:
ICMI '20: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION
October 25 - 29, 2020
Virtual Event, Netherlands

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%

Article Metrics

  • Downloads (Last 12 months)42
  • Downloads (Last 6 weeks)2
Reflects downloads up to 20 Feb 2025

Cited By

  • (2024) Collecting Self-reported Physical Activity and Posture Data Using Audio-based Ecological Momentary Assessment. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 8(3), 1-35. DOI: 10.1145/3678584. Online publication date: 9-Sep-2024.
  • (2024) AcouDL: Context-Aware Daily Activity Recognition from Natural Acoustic Signals. 2024 IEEE International Conference on Smart Computing (SMARTCOMP), 332-337. DOI: 10.1109/SMARTCOMP61445.2024.00077. Online publication date: 29-Jun-2024.
  • (2024) Self-SLAM: A Self-supervised Learning Based Annotation Method to Reduce Labeling Overhead. Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track, 123-140. DOI: 10.1007/978-3-031-70378-2_8. Online publication date: 22-Aug-2024.
  • (2024) SelfAct: Personalized Activity Recognition Based on Self-Supervised and Active Learning. Mobile and Ubiquitous Systems: Computing, Networking and Services, 375-391. DOI: 10.1007/978-3-031-63989-0_19. Online publication date: 19-Jul-2024.
  • (2023) Robust Finger Interactions with COTS Smartwatches via Unsupervised Siamese Adaptation. Proceedings of the 36th Annual ACM Symposium on User Interface Software and Technology, 1-14. DOI: 10.1145/3586183.3606794. Online publication date: 29-Oct-2023.
  • (2023) Acconotate: Exploiting Acoustic Changes for Automatic Annotation of Inertial Data at the Source. 2023 19th International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT), 25-33. DOI: 10.1109/DCOSS-IoT58021.2023.00013. Online publication date: Jun-2023.
  • (2022) User Involvement in Training Smart Home Agents. Proceedings of the 10th International Conference on Human-Agent Interaction, 76-85. DOI: 10.1145/3527188.3561914. Online publication date: 5-Dec-2022.
  • (2022) Demo: Automated Micro-Activity Annotations for Human Activity Recognition with Inertial Sensing. 2022 IEEE International Conference on Smart Computing (SMARTCOMP), 162-164. DOI: 10.1109/SMARTCOMP55677.2022.00039. Online publication date: Jun-2022.
  • (2022) AmicroN: Framework for Generating Micro-Activity Annotations for Human Activity Recognition. 2022 IEEE International Conference on Smart Computing (SMARTCOMP), 26-31. DOI: 10.1109/SMARTCOMP55677.2022.00019. Online publication date: Jun-2022.
