DOI: 10.1145/3613904.3641944

iPose: Interactive Human Pose Reconstruction from Video

Published: 11 May 2024

Abstract

Reconstructing 3D human poses from video has wide applications, such as character animation and sports analysis. Automatic 3D pose reconstruction methods have demonstrated promising results, but failures still occur due to the diversity of human actions, capture conditions, and depth ambiguities. Manual intervention therefore remains indispensable, yet it can be time-consuming and requires professional skills. We present iPose, an interactive tool that facilitates intuitive human pose reconstruction from a given video. Our tool combines human perception, for specifying pose appearance and ensuring controllability, with video frame processing algorithms, for precision and automation. A user manipulates the projection of a 3D pose via 2D operations on top of video frames, and the 3D pose is updated correspondingly while satisfying both kinematic and video frame constraints. Pose updates are propagated temporally to reduce user workload. We evaluate the effectiveness of iPose with a user study on the 3DPW dataset and through expert interviews.
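
To make the 2D-to-3D editing idea concrete, the sketch below is a minimal illustration, not the paper's actual solver: it assumes a weak-perspective camera and shows how a user-dragged 2D joint position can be mapped back to an update of the corresponding 3D joint. The function names (`weak_perspective_project`, `drag_joint`), the camera parameters, and the simple per-joint update are hypothetical; the real system works on joint rotations along the kinematic chain and enforces additional video frame and temporal constraints.

```python
import numpy as np

def weak_perspective_project(joints_3d, scale, trans_2d):
    """Project 3D joints (N, 3) to 2D: x_2d = scale * x_3d[:, :2] + trans_2d."""
    return scale * joints_3d[:, :2] + trans_2d

def drag_joint(joints_3d, scale, trans_2d, joint_idx, target_2d,
               step=0.5, iters=50):
    """Move one 3D joint so that its 2D projection approaches a user-dragged
    target pixel by iteratively shrinking the reprojection error.
    Illustrative only: a full pose editor would instead solve for joint
    rotations under kinematic-chain, frame, and temporal constraints."""
    joints = joints_3d.copy()
    for _ in range(iters):
        proj = weak_perspective_project(joints[[joint_idx]], scale, trans_2d)[0]
        residual = target_2d - proj                       # 2D error in pixels
        joints[joint_idx, :2] += step * residual / scale  # back-project the error
    return joints

# Example: drag the third joint's projection toward pixel (410, 260).
joints = np.array([[0.0, 0.0, 2.0], [0.2, 0.3, 2.1], [0.4, 0.6, 2.2]])
scale, trans = 500.0, np.array([320.0, 240.0])
edited = drag_joint(joints, scale, trans, joint_idx=2,
                    target_2d=np.array([410.0, 260.0]))
print(weak_perspective_project(edited, scale, trans)[2])  # ~ [410., 260.]
```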

Supplemental Material

MP4 File - Video Preview
MP4 File - Video Presentation (transcript included)
MP4 File - Video Figure: demonstrates the system with examples


Published In

CHI '24: Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems
May 2024, 18961 pages
ISBN: 9798400703300
DOI: 10.1145/3613904
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].


Publisher

Association for Computing Machinery, New York, NY, United States



Author Tags

  1. Monocular reconstruction
  2. Human pose estimation
  3. User interface
  4. Video processing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CHI '24

Acceptance Rates

Overall Acceptance Rate 6,199 of 26,314 submissions, 24%


Article Metrics

  • Total Citations: 0
  • Total Downloads: 899
  • Downloads (last 12 months): 899
  • Downloads (last 6 weeks): 33

Reflects downloads up to 20 Feb 2025.
