MotioNet: 3D Human Motion Reconstruction from Monocular Video with Skeleton Consistency

Published: 04 September 2020

Abstract

We introduce MotioNet, a deep neural network that directly reconstructs the motion of a 3D human skeleton from a monocular video. While previous methods rely on either rigging or inverse kinematics (IK) to associate a consistent skeleton with temporally coherent joint rotations, our method is the first data-driven approach that directly outputs a kinematic skeleton, which is a complete, commonly used motion representation. At the crux of our approach lies a deep neural network with embedded kinematic priors, which decomposes sequences of 2D joint positions into two separate attributes: a single, symmetric skeleton encoded by bone lengths, and a sequence of 3D joint rotations associated with global root positions and foot contact labels. These attributes are fed into an integrated forward kinematics (FK) layer that outputs 3D positions, which are compared to a ground truth. In addition, an adversarial loss is applied to the velocities of the recovered rotations to ensure that they lie on the manifold of natural joint rotations. The key advantage of our approach is that it learns to infer natural joint rotations directly from the training data rather than assuming an underlying model, or inferring them from joint positions using a data-agnostic IK solver. We show that enforcing a single consistent skeleton along with temporally coherent joint rotations constrains the solution space, leading to a more robust handling of self-occlusions and depth ambiguities.
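
The forward kinematics step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the quaternion parameterization, the rest-pose `directions` array, and the names `quat_to_matrix` and `forward_kinematics` are assumptions made for illustration only. The sketch shows how one fixed set of bone lengths, combined with per-frame joint rotations and root positions, yields 3D joint positions whose bone lengths stay constant over time.

```python
import numpy as np


def quat_to_matrix(q):
    """Convert quaternions (..., 4) in (w, x, y, z) order to rotation matrices (..., 3, 3)."""
    q = np.asarray(q, dtype=float)
    q = q / np.linalg.norm(q, axis=-1, keepdims=True)  # guard against non-unit input
    w, x, y, z = q[..., 0], q[..., 1], q[..., 2], q[..., 3]
    m = np.stack([
        1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y),
        2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x),
        2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y),
    ], axis=-1)
    return m.reshape(q.shape[:-1] + (3, 3))


def forward_kinematics(parents, directions, bone_lengths, rotations, root_positions):
    """Compute 3D joint positions for a whole sequence.

    parents        : length-J sequence, parents[0] == -1 and parents[j] < j (assumed ordering)
    directions     : (J, 3) unit rest-pose direction from each joint's parent (assumption)
    bone_lengths   : (J,) one length per joint, shared by all frames; entry 0 is unused
    rotations      : (T, J, 4) local joint rotations as quaternions
    root_positions : (T, 3) global root position per frame
    returns        : (T, J, 3) global joint positions
    """
    offsets = np.asarray(directions, dtype=float) * np.asarray(bone_lengths, dtype=float)[:, None]
    local = quat_to_matrix(rotations)            # (T, J, 3, 3)
    num_frames, num_joints = local.shape[:2]
    global_rot = np.empty_like(local)
    positions = np.empty((num_frames, num_joints, 3))

    global_rot[:, 0] = local[:, 0]
    positions[:, 0] = root_positions
    for j in range(1, num_joints):
        p = parents[j]
        # Accumulate rotations down the chain, then place the joint at the
        # parent's position plus the rotated rest-pose offset.
        global_rot[:, j] = global_rot[:, p] @ local[:, j]
        positions[:, j] = positions[:, p] + global_rot[:, p] @ offsets[j]
    return positions


# Toy example: a 3-joint chain along +y, two frames, root rotated 90 degrees
# about z in the second frame.
parents = [-1, 0, 1]
directions = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]])
bone_lengths = np.array([0.0, 2.0, 3.0])
rotations = np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (2, 3, 1))
rotations[1, 0] = [np.sqrt(0.5), 0.0, 0.0, np.sqrt(0.5)]
joints = forward_kinematics(parents, directions, bone_lengths, rotations, np.zeros((2, 3)))
print(joints.shape)
```

Because the offsets are built once from `bone_lengths` and only rotated per frame, every frame of the output shares the same skeleton by construction. In a learned setting, a loss on `joints` against ground-truth positions would propagate gradients back to both the rotations and the bone lengths, provided the same computation is written in a differentiable framework.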

      Published In

      ACM Transactions on Graphics, Volume 40, Issue 1
      February 2021
      139 pages
      ISSN: 0730-0301
      EISSN: 1557-7368
      DOI: 10.1145/3420236
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 04 September 2020
      Accepted: 01 June 2020
      Revised: 01 March 2020
      Received: 01 October 2019
      Published in TOG Volume 40, Issue 1

      Author Tags

      1. Pose estimation
      2. motion analysis
      3. motion capturing

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Funding Sources

      • Israel Science Foundation
      • European Union’s Horizon 2020 Research and Innovation Programme
      • National Key R&D Program of China
