MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices

Authors:

Henry Hoffmann,

Karan Ahuja

UIST '24: Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology

Article No.: 70, Pages 1 - 11

https://doi.org/10.1145/3654777.3676461

Published: 11 October 2024

Abstract

There has been a continued trend towards minimizing instrumentation for full-body motion capture, going from specialized rooms and equipment, to arrays of worn sensors, and most recently to sparse inertial pose capture methods. However, as these techniques migrate towards lower-fidelity IMUs on ubiquitous commodity devices, such as phones, watches, and earbuds, challenges arise, including compromised online performance and temporal consistency, and loss of global translation due to sensor noise and drift. Addressing these challenges, we introduce MobilePoser, a real-time system for full-body pose and global translation estimation using any available subset of IMUs already present in these consumer devices. MobilePoser employs a multi-stage deep neural network for kinematic pose estimation followed by a physics-based motion optimizer, achieving state-of-the-art accuracy while remaining lightweight. We conclude with a series of demonstrative applications that illustrate the unique potential of MobilePoser across a variety of fields, such as health and wellness, gaming, and indoor navigation.

Index Terms

MobilePoser: Real-Time Full-Body Pose Estimation and 3D Human Translation from IMUs in Mobile Consumer Devices
1. Human-centered computing
  1. Ubiquitous and mobile computing

Published In

UIST '24: Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology

October 2024

2334 pages

ISBN: 9798400706288

DOI: 10.1145/3654777

Editors: Lining Yao (University of California, Berkeley), Mayank Goel (Carnegie Mellon University), Alexandra Ion (Carnegie Mellon University), Pedro Lopes (University of Chicago)

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

NSF

Conference

UIST '24

UIST '24: The 37th Annual ACM Symposium on User Interface Software and Technology

October 13 - 16, 2024

Pittsburgh, PA, USA

Acceptance Rates

Overall Acceptance Rate 561 of 2,567 submissions, 22%
