HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video

Fan, Zicong; Parelli, Maria; Kadoglou, Maria Eleni; Kocabas, Muhammed; Chen, Xu; Black, Michael J.; Hilliges, Otmar

arXiv:2311.18448 (cs)

[Submitted on 30 Nov 2023]

Abstract: Since humans interact with diverse objects every day, the holistic 3D capture of these interactions is important to understand and model human behaviour. However, most existing methods for hand-object reconstruction from RGB either assume pre-scanned object templates or rely heavily on limited 3D hand-object data, restricting their ability to scale and generalize to more unconstrained interaction settings. To this end, we introduce HOLD, the first category-agnostic method that reconstructs an articulated hand and object jointly from a monocular interaction video. We develop a compositional articulated implicit model that can reconstruct disentangled 3D hand and object from 2D images. We further incorporate hand-object constraints to improve hand-object poses and consequently the reconstruction quality. Our method does not rely on 3D hand-object annotations while outperforming fully-supervised baselines in both in-the-lab and challenging in-the-wild settings. Moreover, we qualitatively show its robustness in reconstructing from in-the-wild videos. Code: this https URL

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2311.18448 [cs.CV]
	(or arXiv:2311.18448v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2311.18448

Submission history

From: Zicong Fan
[v1] Thu, 30 Nov 2023 10:50:35 UTC (10,611 KB)
