DOI: 10.1145/3307334.3326109

Real-time Arm Skeleton Tracking and Gesture Inference Tolerant to Missing Wearable Sensors

Published: 12 June 2019

Abstract

This paper presents ArmTroi, a wearable system for understanding and analyzing people's detailed arm motions, primarily using the motion sensors of wrist-worn wearable devices. ArmTroi achieves real-time 3D arm skeleton tracking and reliable gesture inference that tolerates missing wearable sensors, enabling numerous useful application designs. ArmTroi copes with two major challenges. First, the skeleton of each arm is determined by the locations of both the elbow and the wrist, whereas a wearable device senses only a single point at the wrist. We find that the potential solution space is huge, and this underconstrained nature fundamentally challenges accurate, real-time arm skeleton tracking. Second, wearable sensors may not reliably provide sensory data; for example, a device may not be worn by the user. Yet the learning tools for gesture inference, such as deep learning, typically have static network structures, which require nontrivial network adaptation to match the input's varying availability and ensure reliable gesture inference. We propose effective techniques to address these challenges, and all computations can be conducted on the user's smartphone. ArmTroi is thus a fully lightweight and portable system. We develop a prototype, and extensive evaluation shows the efficacy of the ArmTroi design.
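To make the underconstrained nature of the first challenge concrete, the following is a minimal geometric sketch, not ArmTroi's method. It assumes a simplified rigid two-segment arm with the shoulder fixed at the origin, and it assumes the wrist position is already known, which is itself an idealization. The function name elbow_candidates and the segment lengths used below are hypothetical. Even under these assumptions, a single wrist point leaves a whole circle of feasible elbow positions (the intersection of two spheres), which the code samples.

```python
import math

def _cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    """Euclidean length of a 3D vector."""
    return math.sqrt(sum(c * c for c in a))

def elbow_candidates(wrist, upper_len, fore_len, n=8):
    """Sample n elbow positions consistent with one wrist position.

    Assumptions (illustrative only): shoulder at the origin, rigid upper arm
    of length upper_len, rigid forearm of length fore_len, no joint limits.
    Returns an empty list if the wrist is unreachable.
    """
    d = _norm(wrist)
    if d == 0 or d > upper_len + fore_len or d < abs(upper_len - fore_len):
        return []
    # Distance from the shoulder to the circle's center along the
    # shoulder-to-wrist axis, and the circle's radius.
    a = (upper_len ** 2 - fore_len ** 2 + d ** 2) / (2 * d)
    r = math.sqrt(max(upper_len ** 2 - a ** 2, 0.0))
    axis = tuple(c / d for c in wrist)
    # Pick a helper vector that is not parallel to the axis, then build an
    # orthonormal basis (u, v) for the plane containing the circle.
    helper = (1.0, 0.0, 0.0) if abs(axis[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _cross(axis, helper)
    u_len = _norm(u)
    u = tuple(c / u_len for c in u)
    v = _cross(axis, u)
    center = tuple(a * c for c in axis)
    points = []
    for k in range(n):
        t = 2 * math.pi * k / n
        points.append(tuple(
            center[i] + r * (math.cos(t) * u[i] + math.sin(t) * v[i])
            for i in range(3)))
    return points

# Hypothetical segment lengths in meters; every sample is a valid elbow.
candidates = elbow_candidates((0.4, 0.0, 0.0), upper_len=0.30, fore_len=0.25)
```

Every returned point lies exactly upper_len from the shoulder and fore_len from the wrist, so the wrist position alone cannot distinguish between them. Resolving this ambiguity in practice requires extra information, such as joint range-of-motion limits or motion history.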

    Published In

    MobiSys '19: Proceedings of the 17th Annual International Conference on Mobile Systems, Applications, and Services
    June 2019
    736 pages
    ISBN:9781450366618
    DOI:10.1145/3307334
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. arm tracking
    2. deep learning
    3. gesture inference
    4. mobile sensing

    Qualifiers

    • Research-article

    Funding Sources

    • ECS grant
    • NSFC Grant
    • Guangdong NSF
    • Fok Ying-Tong Education Foundation for Young Teachers in the Higher Education Institutions of China
    • NSF Grant of Shenzhen University
    • GRF grant

    Conference

    MobiSys '19
    Acceptance Rates

    Overall Acceptance Rate 274 of 1,679 submissions, 16%

    Article Metrics

    • Downloads (Last 12 months): 68
    • Downloads (Last 6 weeks): 6
    Reflects downloads up to 25 Jan 2025

    Cited By

    • (2024) Artificial Intelligence of Things: A Survey. ACM Transactions on Sensor Networks 21:1 (1-75). DOI: 10.1145/3690639. Online publication date: 30-Aug-2024
    • (2024) Large Model for Small Data: Foundation Model for Cross-Modal RF Human Activity Recognition. Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems (436-449). DOI: 10.1145/3666025.3699349. Online publication date: 4-Nov-2024
    • (2024) Vi2ACT: Video-enhanced Cross-modal Co-learning with Representation Conditional Discriminator for Few-shot Human Activity Recognition. Proceedings of the 32nd ACM International Conference on Multimedia (1848-1856). DOI: 10.1145/3664647.3681664. Online publication date: 28-Oct-2024
    • (2024) Wi-Limb: Recognizing Moving Body Limbs Using a Single WiFi Link. Proceedings of the 30th Annual International Conference on Mobile Computing and Networking (2275-2281). DOI: 10.1145/3636534.3698117. Online publication date: 4-Dec-2024
    • (2024) Spatial-Temporal Masked Autoencoder for Multi-Device Wearable Human Activity Recognition. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 7:4 (1-25). DOI: 10.1145/3631415. Online publication date: 12-Jan-2024
    • (2024) Detecting Face-Touching Gestures with Smartwatches and Deep Learning Networks. 2024 47th International Conference on Telecommunications and Signal Processing (TSP) (249-252). DOI: 10.1109/TSP63128.2024.10605968. Online publication date: 10-Jul-2024
    • (2024) Finger Tracking Using Wrist-Worn EMG Sensors. IEEE Transactions on Mobile Computing 23:12 (14099-14110). DOI: 10.1109/TMC.2024.3439018. Online publication date: Dec-2024
    • (2024) Enhancing the Applicability of Sign Language Translation. IEEE Transactions on Mobile Computing 23:9 (8634-8648). DOI: 10.1109/TMC.2024.3350111. Online publication date: Sep-2024
    • (2024) WiRITE: General and Practical Wi-Fi Based Hand-Writing Recognition. IEEE Transactions on Mobile Computing 23:4 (2943-2957). DOI: 10.1109/TMC.2023.3265988. Online publication date: Apr-2024
    • (2024) Orientation Estimation Piloted by Deep Reinforcement Learning. 2024 IEEE/ACM Ninth International Conference on Internet-of-Things Design and Implementation (IoTDI) (134-145). DOI: 10.1109/IoTDI61053.2024.00016. Online publication date: 13-May-2024
