Weakly Supervised 3D Hand Pose Estimation via Biomechanical Constraints

Spurr, Adrian; Iqbal, Umar; Molchanov, Pavlo; Hilliges, Otmar; Kautz, Jan

doi:10.1007/978-3-030-58520-4_13

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12362)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Estimating 3D hand pose from 2D images is a difficult inverse problem due to the inherent scale and depth ambiguities. Current state-of-the-art methods train fully supervised deep neural networks with 3D ground-truth data. However, acquiring 3D annotations is expensive, typically requiring calibrated multi-view setups or labour-intensive manual annotations. While annotations of 2D keypoints are much easier to obtain, how to efficiently leverage such weakly-supervised data to improve the task of 3D hand pose prediction remains an important open question. The key difficulty stems from the fact that direct application of additional 2D supervision mostly benefits the 2D proxy objective but does little to alleviate the depth and scale ambiguities. Embracing this challenge, we propose a set of novel losses that constrain the prediction of a neural network to lie within the range of biomechanically feasible 3D hand configurations. We show by extensive experiments that our proposed constraints significantly reduce the depth ambiguity and allow the network to more effectively leverage additional 2D annotated images. For example, on the challenging FreiHAND dataset, using additional 2D annotation without our proposed biomechanical constraints reduces the depth error by only $15\%$, whereas the error is reduced significantly by $50\%$ when the proposed biomechanical constraints are used.
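The abstract does not spell out the form of the proposed losses, so the following is only an illustrative sketch of the general idea of penalising predictions that leave a feasible range. It applies an interval penalty to bone lengths. The 21-joint layout, the function names, and the numeric bounds are all assumptions made for this example and are not taken from the paper.

```python
import math

# Assumed 21-joint layout (hypothetical): joint 0 is the wrist and each of the
# five fingers is a chain of four joints. PARENTS[i] is the parent of joint i.
PARENTS = [-1, 0, 1, 2, 3, 0, 5, 6, 7, 0, 9, 10, 11, 0, 13, 14, 15, 0, 17, 18, 19]

def bone_lengths(joints):
    """Lengths of the 20 bones of a hand given as 21 (x, y, z) joint positions."""
    return [math.dist(joints[i], joints[PARENTS[i]]) for i in range(1, len(PARENTS))]

def interval_penalty(value, lower, upper):
    """Zero inside [lower, upper]; grows linearly with the distance outside it."""
    return max(lower - value, 0.0) + max(value - upper, 0.0)

def bone_length_loss(joints, lower, upper):
    """Mean interval penalty over all bones; the bounds are placeholder values."""
    lengths = bone_lengths(joints)
    return sum(interval_penalty(b, lower, upper) for b in lengths) / len(lengths)

# Toy hand: every finger is a straight chain of unit-length bones.
directions = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (-1, 0, 0), (0, -1, 0)]
toy_hand = [(0.0, 0.0, 0.0)] + [
    tuple((k + 1) * c for c in d) for d in directions for k in range(4)
]

print(bone_length_loss(toy_hand, 0.5, 1.5))  # every bone lies inside the interval
print(bone_length_loss(toy_hand, 2.0, 3.0))  # every bone is shorter than the lower bound
```

A penalty of this shape needs no 3D ground truth, only the predicted joints, which is why it can in principle be applied to images that carry 2D annotations alone. In a training setup the same computation would be written with a differentiable tensor library rather than plain Python.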

A. Spurr—This work was done during an internship at NVIDIA.

Acknowledgments

We are grateful to Christoph Gebhardt and Shoaib Ahmed Siddiqui for their help with figure creation and to Abhishek Badki for helpful discussions.

Author information

Authors and Affiliations

Advanced Interactive Technologies, ETH Zurich, Zürich, Switzerland
Adrian Spurr & Otmar Hilliges
NVIDIA, Santa Clara, USA
Adrian Spurr, Umar Iqbal, Pavlo Molchanov & Jan Kautz

Corresponding author

Correspondence to Adrian Spurr.

Editor information

Editors and Affiliations

University of Oxford, Oxford, UK
Andrea Vedaldi
Graz University of Technology, Graz, Austria
Horst Bischof
University of Freiburg, Freiburg im Breisgau, Germany
Thomas Brox
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 1284 KB)

About this paper

Cite this paper

Spurr, A., Iqbal, U., Molchanov, P., Hilliges, O., Kautz, J. (2020). Weakly Supervised 3D Hand Pose Estimation via Biomechanical Constraints. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds) Computer Vision – ECCV 2020. ECCV 2020. Lecture Notes in Computer Science, vol 12362. Springer, Cham. https://doi.org/10.1007/978-3-030-58520-4_13

DOI: https://doi.org/10.1007/978-3-030-58520-4_13
Published: 19 November 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58519-8
Online ISBN: 978-3-030-58520-4
eBook Packages: Computer Science, Computer Science (R0)
