
Real-time Arm Skeleton Tracking and Gesture Inference Tolerant to Missing Wearable Sensors

Authors:

Kaishun Wu

MobiSys '19: Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services

Pages 287 - 299

https://doi.org/10.1145/3307334.3326109

Published: 12 June 2019

Abstract

This paper presents ArmTroi, a wearable system for understanding and analyzing people's detailed arm motions, primarily using the motion sensors in wrist-worn wearable devices. ArmTroi achieves real-time 3D arm skeleton tracking and reliable gesture inference that tolerates missing wearable sensors, enabling numerous useful application designs. ArmTroi addresses two major challenges. First, the skeleton of each arm is determined by the locations of the elbow and wrist, whereas a wearable device senses only a single point at the wrist. We find that the potential solution space is huge, and this underconstrained nature fundamentally challenges accurate, real-time arm skeleton tracking. Second, wearable sensors may not reliably provide sensory data, for example when the user is not wearing a device. Yet learning tools for gesture inference, such as deep learning, typically have static network structures, which require nontrivial network adaptation to match the input's varying availability and ensure reliable gesture inference. We propose effective techniques to address the above challenges, and all computations can be conducted on the user's smartphone. ArmTroi is thus a fully lightweight and portable system. We develop a prototype, and an extensive evaluation shows the efficacy of the ArmTroi design.
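
The underconstrained nature of the first challenge can be illustrated with a small geometric sketch. It is not ArmTroi's method, which the abstract does not detail. It assumes, purely for illustration, that the shoulder is fixed at the origin, that the wrist position is already known, and that the upper-arm and forearm lengths are given. The function name `elbow_candidates` and all numeric values are hypothetical. Even under these generous assumptions, the elbow can lie anywhere on a circle, so one wrist point does not pin down the arm skeleton.

```python
import math


def _cross(a, b):
    """Cross product of two 3D vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def elbow_candidates(wrist, upper_len, fore_len, n=8):
    """Sample n elbow positions consistent with one wrist position.

    Assumption (not from the paper): the shoulder sits at the origin.
    The elbow is upper_len from the shoulder and fore_len from the wrist,
    so it lies on the circle where those two spheres intersect.
    """
    d = math.sqrt(sum(c * c for c in wrist))  # shoulder-to-wrist distance
    if d == 0 or d > upper_len + fore_len or d < abs(upper_len - fore_len):
        return []  # wrist is unreachable with these segment lengths

    # Distance from the shoulder to the circle's centre along the axis.
    a = (upper_len ** 2 - fore_len ** 2 + d ** 2) / (2 * d)
    r = math.sqrt(max(upper_len ** 2 - a ** 2, 0.0))  # circle radius

    axis = [c / d for c in wrist]
    # Pick a helper vector that is not parallel to the axis, then build
    # two unit vectors u and v spanning the plane of the circle.
    helper = [1.0, 0.0, 0.0] if abs(axis[0]) < 0.9 else [0.0, 1.0, 0.0]
    u = _cross(axis, helper)
    u_norm = math.sqrt(sum(c * c for c in u))
    u = [c / u_norm for c in u]
    v = _cross(axis, u)

    centre = [a * c for c in axis]
    points = []
    for k in range(n):
        t = 2 * math.pi * k / n
        points.append(tuple(
            centre[i] + r * (math.cos(t) * u[i] + math.sin(t) * v[i])
            for i in range(3)
        ))
    return points


if __name__ == "__main__":
    # Hypothetical wrist position and segment lengths, in metres.
    for p in elbow_candidates((0.3, 0.2, 0.1), 0.30, 0.25, n=4):
        print(p)
```

Every returned point satisfies both length constraints, so a tracker needs extra information (for example, joint-angle limits or motion continuity) to choose among them. In practice the wrist position itself must also be estimated from inertial data, which enlarges the solution space further.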

Index Terms

Real-time Arm Skeleton Tracking and Gesture Inference Tolerant to Missing Wearable Sensors
1. Human-centered computing
  1. Ubiquitous and mobile computing

Published In

MobiSys '19: Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services

June 2019

736 pages

ISBN: 9781450366618

DOI: 10.1145/3307334

General Chairs: Junehwa Song (KAIST, South Korea), Minkyong Kim (Samsung Electronics)

Program Chairs: Nicholas D. Lane (University of Oxford & Samsung AI), Rajesh K. Balan (Singapore Management University)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMOBILE: ACM Special Interest Group on Mobility of Systems, Users, Data and Computing

In-Cooperation

SIGOPS: ACM Special Interest Group on Operating Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

ECS grant
NSFC Grant
Guangdong NSF
Fok Ying-Tong Education Foundation for Young Teachers in the Higher Education Institutions of China
NSF Grant of Shenzhen University
GRF grant

Conference

MobiSys '19

Sponsor:

SIGMOBILE

MobiSys '19: The 17th Annual International Conference on Mobile Systems, Applications, and Services

June 17 - 21, 2019

Seoul, Republic of Korea

Acceptance Rates

Overall Acceptance Rate 274 of 1,679 submissions, 16%

Bibliometrics

Total citations: 49
Total downloads: 1,137
Downloads (last 12 months): 68
Downloads (last 6 weeks): 6

Reflects downloads up to 25 Jan 2025
