
Showing 1–50 of 74 results for author: Pons-Moll, G

  1. arXiv:2408.16536

    cs.CV

    Are Pose Estimators Ready for the Open World? STAGE: Synthetic Data Generation Toolkit for Auditing 3D Human Pose Estimators

    Authors: Nikita Kister, István Sárándi, Anna Khoreva, Gerard Pons-Moll

    Abstract: The estimation of 3D human poses from images has progressed tremendously over the last few years as measured on standard benchmarks. However, performance in the open world remains underexplored, as current benchmarks cannot capture its full extent. Especially in safety-critical systems, it is crucial that 3D pose estimators are audited before deployment, and their sensitivity towards single factor…

    Submitted 28 August, 2024; originally announced August 2024.

  2. arXiv:2408.13953

    cs.CV

    InterTrack: Tracking Human Object Interaction without Object Templates

    Authors: Xianghui Xie, Jan Eric Lenssen, Gerard Pons-Moll

    Abstract: Tracking human object interaction from videos is important to understand human behavior from the rapidly growing stream of video data. Previous video-based methods require predefined object templates while single-image-based methods are template-free but lack temporal consistency. In this paper, we present a method to track human object interaction without any object shape templates. We decompose…

    Submitted 25 August, 2024; originally announced August 2024.

    Comments: 17 pages, 13 figures and 6 tables. Project page: https://virtualhumans.mpi-inf.mpg.de/InterTrack/

  3. arXiv:2407.07532

    cs.CV

    Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation

    Authors: István Sárándi, Gerard Pons-Moll

    Abstract: With the explosive growth of available training data, single-image 3D human modeling is ahead of a transition to a data-centric paradigm. A key to successfully exploiting data scale is to design flexible models that can be supervised from various heterogeneous data sources produced by different researchers or vendors. To this end, we propose a simple yet powerful paradigm for seamlessly unifying d…

    Submitted 10 July, 2024; originally announced July 2024.

  4. arXiv:2406.08475

    cs.CV

    Human 3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion Models

    Authors: Yuxuan Xue, Xianghui Xie, Riccardo Marin, Gerard Pons-Moll

    Abstract: Creating realistic avatars from a single RGB image is an attractive yet challenging problem. Due to its ill-posed nature, recent works leverage powerful priors from 2D diffusion models pretrained on large datasets. Although 2D diffusion models demonstrate strong generalization capability, they cannot provide multi-view shape priors with guaranteed 3D consistency. We propose Human 3Diffusion: Realis…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: Project Page: https://yuxuan-xue.com/human-3diffusion

  5. arXiv:2404.01758

    cs.CV

    GEARS: Local Geometry-aware Hand-object Interaction Synthesis

    Authors: Keyang Zhou, Bharat Lal Bhatnagar, Jan Eric Lenssen, Gerard Pons-Moll

    Abstract: Generating realistic hand motion sequences in interaction with objects has gained increasing attention with the growing interest in digital humans. Prior work has illustrated the effectiveness of employing occupancy-based or distance-based virtual sensors to extract hand-object interaction features. Nonetheless, these methods show limited generalizability across object categories, shapes and sizes…

    Submitted 11 May, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

  6. arXiv:2403.15064

    cs.CV cs.GR

    Recent Trends in 3D Reconstruction of General Non-Rigid Scenes

    Authors: Raza Yunus, Jan Eric Lenssen, Michael Niemeyer, Yiyi Liao, Christian Rupprecht, Christian Theobalt, Gerard Pons-Moll, Jia-Bin Huang, Vladislav Golyanik, Eddy Ilg

    Abstract: Reconstructing models of the real world, including 3D geometry, appearance, and motion of real scenes, is essential for computer graphics and computer vision. It enables the synthesizing of photorealistic novel views, useful for the movie industry and AR/VR applications. It also facilitates the content creation necessary in computer games and AR/VR by avoiding laborious manual design processes. Fu…

    Submitted 6 May, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: 42 pages, 18 figures, 5 tables; State-of-the-Art Report at EUROGRAPHICS 2024. Project page: https://razayunus.github.io/non-rigid-star

  7. arXiv:2403.11237

    cs.CV

    FORCE: Dataset and Method for Intuitive Physics Guided Human-object Interaction

    Authors: Xiaohan Zhang, Bharat Lal Bhatnagar, Sebastian Starke, Ilya Petrov, Vladimir Guzov, Helisa Dhamo, Eduardo Pérez-Pellitero, Gerard Pons-Moll

    Abstract: Interactions between humans and objects are influenced not only by the object's pose and shape, but also by physical attributes such as object mass and surface friction. They introduce important motion nuances that are essential for diversity and realism. Despite advancements in recent kinematics-based methods, this aspect has been overlooked. Generating nuanced human motion presents two challenges…

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: 24 pages, 9 figures

  8. arXiv:2403.03122

    cs.CV

    NRDF: Neural Riemannian Distance Fields for Learning Articulated Pose Priors

    Authors: Yannan He, Garvita Tiwari, Tolga Birdal, Jan Eric Lenssen, Gerard Pons-Moll

    Abstract: Faithfully modeling the space of articulations is a crucial task that allows recovery and generation of realistic poses, and remains a notorious challenge. To this end, we introduce Neural Riemannian Distance Fields (NRDFs), data-driven priors modeling the space of plausible articulations, represented as the zero-level-set of a neural field in a high-dimensional product-quaternion space. To train…

    Submitted 11 April, 2024; v1 submitted 5 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024. Project page: https://virtualhumans.mpi-inf.mpg.de/nrdf

  9. arXiv:2401.12051

    cs.CV cs.AI

    CloSe: A 3D Clothing Segmentation Dataset and Model

    Authors: Dimitrije Antić, Garvita Tiwari, Batuhan Ozcomlekci, Riccardo Marin, Gerard Pons-Moll

    Abstract: 3D clothing modeling and datasets play a crucial role in the entertainment, animation, and digital fashion industries. Existing work often lacks detailed semantic understanding or uses synthetic datasets, lacking realism and personalization. To address this, we first introduce CloSe-D: a novel large-scale dataset containing 3D clothing segmentation of 3167 scans, covering a range of 18 distinct clot…

    Submitted 22 January, 2024; originally announced January 2024.

  10. arXiv:2401.04143

    cs.CV

    RHOBIN Challenge: Reconstruction of Human Object Interaction

    Authors: Xianghui Xie, Xi Wang, Nikos Athanasiou, Bharat Lal Bhatnagar, Chun-Hao P. Huang, Kaichun Mo, Hao Chen, Xia Jia, Zerui Zhang, Liangxian Cui, Xiao Lin, Bingqiao Qian, Jie Xiao, Wenfei Yang, Hyeongjin Nam, Daniel Sungho Jung, Kihoon Kim, Kyoung Mu Lee, Otmar Hilliges, Gerard Pons-Moll

    Abstract: Modeling the interaction between humans and objects has been an emerging research direction in recent years. Capturing human-object interaction is, however, a very challenging task due to heavy occlusion and complex dynamics, which requires understanding not only 3D human pose and object pose but also the interaction between them. Reconstruction of 3D humans and objects has been two separate resear…

    Submitted 7 January, 2024; originally announced January 2024.

    Comments: 14 pages, 5 tables, 7 figures. Technical report of the CVPR'23 workshop: RHOBIN challenge (https://rhobin-challenge.github.io/)

  11. arXiv:2312.14024

    cs.CV

    NICP: Neural ICP for 3D Human Registration at Scale

    Authors: Riccardo Marin, Enric Corona, Gerard Pons-Moll

    Abstract: Aligning a template to 3D human point clouds is a long-standing problem crucial for tasks like animation, reconstruction, and enabling supervised learning pipelines. Recent data-driven methods leverage predicted surface correspondences. However, they are not robust to varied poses, identities, or noise. In contrast, industrial solutions often rely on expensive manual annotations or multi-view capt…

    Submitted 21 July, 2024; v1 submitted 21 December, 2023; originally announced December 2023.

    Comments: Accepted at ECCV 2024

  12. arXiv:2312.11360

    cs.CV cs.AI cs.GR

    Paint-it: Text-to-Texture Synthesis via Deep Convolutional Texture Map Optimization and Physically-Based Rendering

    Authors: Kim Youwang, Tae-Hyun Oh, Gerard Pons-Moll

    Abstract: We present Paint-it, a text-driven high-fidelity texture map synthesis method for 3D meshes via neural re-parameterized texture optimization. Paint-it synthesizes texture maps from a text description by synthesis-through-optimization, exploiting the Score-Distillation Sampling (SDS). We observe that directly applying SDS yields undesirable texture quality due to its noisy gradients. We reveal the…

    Submitted 7 May, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

    Comments: CVPR 2024. Project page: https://kim-youwang.github.io/paint-it

  13. arXiv:2312.07063

    cs.CV

    Template Free Reconstruction of Human-object Interaction with Procedural Interaction Generation

    Authors: Xianghui Xie, Bharat Lal Bhatnagar, Jan Eric Lenssen, Gerard Pons-Moll

    Abstract: Reconstructing human-object interaction in 3D from a single RGB image is a challenging task, and existing data-driven methods do not generalize beyond the objects present in the carefully curated 3D interaction datasets. Capturing large-scale real data to learn strong interaction and 3D shape priors is very expensive due to the combinatorial nature of human-object interactions. In this paper, we pr…

    Submitted 6 April, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: CVPR'24 camera ready version. 25 pages, 20 figures. Project page: https://virtualhumans.mpi-inf.mpg.de/procigen-hdm

  14. arXiv:2311.13655

    cs.CV

    GAN-Avatar: Controllable Personalized GAN-based Human Head Avatar

    Authors: Berna Kabadayi, Wojciech Zielonka, Bharat Lal Bhatnagar, Gerard Pons-Moll, Justus Thies

    Abstract: Digital humans and, especially, 3D facial avatars have attracted a lot of attention in the past years, as they are the backbone of several applications like immersive telepresence in AR or VR. Despite the progress, facial avatars reconstructed from commodity hardware are incomplete and miss out on parts of the side and back of the head, severely limiting the usability of the avatar. This limitation i…

    Submitted 22 November, 2023; originally announced November 2023.

    Comments: Website: https://ganavatar.github.io/ , Video: https://www.youtube.com/watch?v=uAi5IVrzzZY&ab_channel=JustusThies , Accepted to 3DV 2024

  15. arXiv:2308.14847

    cs.CV

    NSF: Neural Surface Fields for Human Modeling from Monocular Depth

    Authors: Yuxuan Xue, Bharat Lal Bhatnagar, Riccardo Marin, Nikolaos Sarafianos, Yuanlu Xu, Gerard Pons-Moll, Tony Tung

    Abstract: Obtaining personalized 3D animatable avatars from a monocular camera has several real-world applications in gaming, virtual try-on, animation, and VR/XR, etc. However, it is very challenging to model dynamic and fine-grained clothing deformations from such sparse data. Existing methods for modeling 3D humans from depth data have limitations in terms of computational efficiency, mesh coherency, and…

    Submitted 27 October, 2023; v1 submitted 28 August, 2023; originally announced August 2023.

    Comments: Accepted to ICCV 2023; Homepage at: https://yuxuan-xue.com/nsf

  16. arXiv:2306.00777

    cs.CV

    Object pop-up: Can we infer 3D objects and their poses from human interactions alone?

    Authors: Ilya A. Petrov, Riccardo Marin, Julian Chibane, Gerard Pons-Moll

    Abstract: The intimate entanglement between object affordances and human poses is of large interest, among others, for behavioural sciences, cognitive psychology, and Computer Vision communities. In recent years, the latter has developed several object-centric approaches: starting from items, learning pipelines synthesizing human poses and dynamics in a realistic way, satisfying both geometrical and functi…

    Submitted 27 October, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

    Comments: Accepted at CVPR'23

  17. arXiv:2304.02061

    cs.CV

    Generating Continual Human Motion in Diverse 3D Scenes

    Authors: Aymen Mir, Xavier Puig, Angjoo Kanazawa, Gerard Pons-Moll

    Abstract: We introduce a method to synthesize animator-guided human motion across 3D scenes. Given a set of sparse (3 or 4) joint locations (such as the location of a person's hand and two feet) and a seed motion sequence in a 3D scene, our method generates a plausible motion sequence starting from the seed motion while satisfying the constraints imposed by the provided keypoints. We decompose the continual…

    Submitted 30 October, 2023; v1 submitted 4 April, 2023; originally announced April 2023.

  18. arXiv:2303.16479

    cs.CV

    Visibility Aware Human-Object Interaction Tracking from Single RGB Camera

    Authors: Xianghui Xie, Bharat Lal Bhatnagar, Gerard Pons-Moll

    Abstract: Capturing the interactions between humans and their environment in 3D is important for many applications in robotics, graphics, and vision. Recent works to reconstruct the 3D human and object from a single RGB image do not have consistent relative translation across frames because they assume a fixed depth. Moreover, their performance drops significantly when the object is occluded. In this work,…

    Submitted 31 October, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

    Comments: Accepted to CVPR 2023, edited acknowledgement

  19. arXiv:2210.12003

    cs.CV

    HDHumans: A Hybrid Approach for High-fidelity Digital Humans

    Authors: Marc Habermann, Lingjie Liu, Weipeng Xu, Gerard Pons-Moll, Michael Zollhoefer, Christian Theobalt

    Abstract: Photo-real digital human avatars are of enormous importance in graphics, as they enable immersive communication over the globe, improve gaming and entertainment experiences, and can be particularly beneficial for AR and VR settings. However, current avatar generation approaches either fall short in high-fidelity novel view synthesis, generalization to novel motions, reproduction of loose clothing,…

    Submitted 14 July, 2023; v1 submitted 21 October, 2022; originally announced October 2022.

  20. arXiv:2208.00790

    cs.CV cs.GR

    Skeleton-free Pose Transfer for Stylized 3D Characters

    Authors: Zhouyingcheng Liao, Jimei Yang, Jun Saito, Gerard Pons-Moll, Yang Zhou

    Abstract: We present the first method that automatically transfers poses between stylized 3D characters without skeletal rigging. In contrast to previous attempts to learn pose transformations on fixed or topology-equivalent skeleton templates, our method focuses on a novel scenario to handle skeleton-free characters with diverse shapes, topologies, and mesh connectivities. The key idea of our method is to…

    Submitted 28 July, 2022; originally announced August 2022.

    Comments: Accepted at ECCV 2022. Project website https://zycliao.github.io/sfpt

  21. arXiv:2207.13807

    cs.CV

    Pose-NDF: Modeling Human Pose Manifolds with Neural Distance Fields

    Authors: Garvita Tiwari, Dimitrije Antic, Jan Eric Lenssen, Nikolaos Sarafianos, Tony Tung, Gerard Pons-Moll

    Abstract: We present Pose-NDF, a continuous model for plausible human poses based on neural distance fields (NDFs). Pose or motion priors are important for generating realistic new poses and for reconstructing accurate poses from noisy or partial observations. Pose-NDF learns a manifold of plausible poses as the zero level set of a neural implicit function, extending the idea of modeling implicit surfaces i…

    Submitted 27 July, 2022; originally announced July 2022.

    Comments: Project page: https://virtualhumans.mpi-inf.mpg.de/posendf

    Journal ref: European Conference on Computer Vision (ECCV 2022), Oral Presentation

  22. arXiv:2206.01203

    cs.CV

    Box2Mask: Weakly Supervised 3D Semantic Instance Segmentation Using Bounding Boxes

    Authors: Julian Chibane, Francis Engelmann, Tuan Anh Tran, Gerard Pons-Moll

    Abstract: Current 3D segmentation methods heavily rely on large-scale point-cloud datasets, which are notoriously laborious to annotate. Few attempts have been made to circumvent the need for dense per-point annotations. In this work, we look at weakly-supervised 3D semantic instance segmentation. The key idea is to leverage 3D bounding box labels which are easier and faster to annotate. Indeed, we show tha…

    Submitted 31 October, 2023; v1 submitted 2 June, 2022; originally announced June 2022.

    Comments: Project page: https://virtualhumans.mpi-inf.mpg.de/box2mask/

    Journal ref: European Conference on Computer Vision (ECCV), 2022, Oral Presentation

  23. arXiv:2205.07982

    cs.CV

    TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement

    Authors: Keyang Zhou, Bharat Lal Bhatnagar, Jan Eric Lenssen, Gerard Pons-Moll

    Abstract: We present TOCH, a method for refining incorrect 3D hand-object interaction sequences using a data prior. Existing hand trackers, especially those that rely on very few cameras, often produce visually unrealistic results with hand-object intersection or missing contacts. Although correcting such errors requires reasoning about temporal aspects of interaction, most previous works focus on static gr…

    Submitted 27 October, 2023; v1 submitted 16 May, 2022; originally announced May 2022.

  24. arXiv:2205.06254

    cs.CV

    Learned Vertex Descent: A New Direction for 3D Human Model Fitting

    Authors: Enric Corona, Gerard Pons-Moll, Guillem Alenyà, Francesc Moreno-Noguer

    Abstract: We propose a novel optimization-based paradigm for 3D human model fitting on images and scans. In contrast to existing approaches that directly regress the parameters of a low-dimensional statistical body model (e.g. SMPL) from input images, we train an ensemble of per-vertex neural fields network. The network predicts, in a distributed manner, the vertex descent direction towards the ground truth…

    Submitted 19 July, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: Project page: https://www.iri.upc.edu/people/ecorona/lvd/

    Journal ref: ECCV 2022

  25. arXiv:2205.02830

    cs.CV

    Interaction Replica: Tracking Human-Object Interaction and Scene Changes From Human Motion

    Authors: Vladimir Guzov, Julian Chibane, Riccardo Marin, Yannan He, Yunus Saracoglu, Torsten Sattler, Gerard Pons-Moll

    Abstract: Our world is not static and humans naturally cause changes in their environments through interactions, e.g., opening doors or moving furniture. Modeling changes caused by humans is essential for building digital twins, e.g., in the context of shared physical-virtual spaces (metaverses) and robotics. In order for widespread adoption of such emerging applications, the sensor setup used to capture th…

    Submitted 18 March, 2024; v1 submitted 5 May, 2022; originally announced May 2022.

    Comments: International Conference on 3D Vision 2024 (3DV'24)

  26. arXiv:2205.00541

    cs.CV

    COUCH: Towards Controllable Human-Chair Interactions

    Authors: Xiaohan Zhang, Bharat Lal Bhatnagar, Vladimir Guzov, Sebastian Starke, Gerard Pons-Moll

    Abstract: Humans interact with an object in many different ways by making contact at different locations, creating a highly complex motion space that can be difficult to learn, particularly when synthesizing such human interactions in a controllable manner. Existing works on synthesizing human scene interaction focus on the high-level control of action but do not consider the fine-grained control of motion.…

    Submitted 1 May, 2022; originally announced May 2022.

  27. arXiv:2204.10850

    cs.CV

    Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation

    Authors: Verica Lazova, Vladimir Guzov, Kyle Olszewski, Sergey Tulyakov, Gerard Pons-Moll

    Abstract: We present a novel method for performing flexible, 3D-aware image content manipulation while enabling high-quality novel view synthesis. While NeRF-based approaches are effective for novel view synthesis, such models memorize the radiance for every point in a scene within a neural network. Since these models are scene-specific and lack a 3D scene representation, classical editing such as shape man…

    Submitted 22 April, 2022; originally announced April 2022.

  28. arXiv:2204.06950

    cs.CV

    BEHAVE: Dataset and Method for Tracking Human Object Interactions

    Authors: Bharat Lal Bhatnagar, Xianghui Xie, Ilya A. Petrov, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll

    Abstract: Modelling interactions between humans and objects in natural environments is central to many applications including gaming, virtual and mixed reality, as well as human behavior analysis and human-robot collaboration. This challenging operation scenario requires generalization to a vast number of objects, scenes, and human actions. Unfortunately, no such dataset exists. Moreover, this data needs…

    Submitted 14 April, 2022; originally announced April 2022.

    Comments: Accepted at CVPR'22

    Journal ref: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022

  29. arXiv:2204.02445

    cs.CV

    CHORE: Contact, Human and Object REconstruction from a single RGB image

    Authors: Xianghui Xie, Bharat Lal Bhatnagar, Gerard Pons-Moll

    Abstract: Most prior works in perceiving 3D humans from images reason about humans in isolation, without their surroundings. However, humans are constantly interacting with the surrounding objects, thus calling for models that can reason about not only the human but also the object and their interaction. The problem is extremely challenging due to heavy occlusions between humans and objects, diverse interaction typ…

    Submitted 31 October, 2023; v1 submitted 5 April, 2022; originally announced April 2022.

    Comments: Accepted at ECCV 2022, edited the acknowledgement

  30. arXiv:2111.10563

    cs.CV

    A Deeper Look into DeepCap

    Authors: Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt

    Abstract: Human performance capture is a highly important computer vision problem with many applications in movie production and virtual/augmented reality. Many previous performance capture approaches either required expensive multi-view setups or did not recover dense space-time coherent geometry with frame-to-frame correspondences. We propose a novel deep learning approach for monocular dense human perfor…

    Submitted 20 November, 2021; originally announced November 2021.

    Comments: arXiv admin note: substantial text overlap with arXiv:2003.08325

  31. arXiv:2108.08807

    cs.CV

    Neural-GIF: Neural Generalized Implicit Functions for Animating People in Clothing

    Authors: Garvita Tiwari, Nikolaos Sarafianos, Tony Tung, Gerard Pons-Moll

    Abstract: We present Neural Generalized Implicit Functions (Neural-GIF), to animate people in clothing as a function of the body pose. Given a sequence of scans of a subject in various poses, we learn to animate the character for new poses. Existing methods have relied on template-based representations of the human body (or clothing). However, such models usually have fixed and limited resolutions, require di…

    Submitted 20 August, 2021; v1 submitted 19 August, 2021; originally announced August 2021.

  32. arXiv:2107.02407

    cs.CV

    NRST: Non-rigid Surface Tracking from Monocular Video

    Authors: Marc Habermann, Weipeng Xu, Helge Rhodin, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt

    Abstract: We propose an efficient method for non-rigid surface tracking from monocular RGB videos. Given a video and a template mesh, our algorithm sequentially registers the template non-rigidly to each frame. We formulate the per-frame registration as an optimization problem that includes a novel texture term specifically tailored towards tracking objects with uniform texture but fine-scale structure, suc…

    Submitted 12 July, 2021; v1 submitted 6 July, 2021; originally announced July 2021.

  33. Real-time Deep Dynamic Characters

    Authors: Marc Habermann, Lingjie Liu, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt

    Abstract: We propose a deep videorealistic 3D human character model displaying highly realistic shape, motion, and dynamic appearance learned in a new weakly supervised way from multi-view imagery. In contrast to previous work, our controllable 3D character displays dynamics, e.g., the swing of the skirt, dependent on skeletal body motion in an efficient data-driven way, without requiring complex physics si…

    Submitted 31 August, 2021; v1 submitted 4 May, 2021; originally announced May 2021.

    Journal ref: ACM Transactions on Graphics (SIGGRAPH 2021)

  34. arXiv:2104.06935

    cs.CV cs.LG

    Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

    Authors: Julian Chibane, Aayush Bansal, Verica Lazova, Gerard Pons-Moll

    Abstract: Recent neural view synthesis methods have achieved impressive quality and realism, surpassing classical pipelines which rely on multi-view reconstruction. State-of-the-art methods, such as NeRF, are designed to learn a single scene with a neural network and require dense multi-view inputs. Testing on a new scene requires re-training from scratch, which takes 2-3 days. In this work, we introduce St…

    Submitted 14 April, 2021; originally announced April 2021.

    Comments: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2021

    Journal ref: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2021

  35. arXiv:2103.17265

    cs.CV

    Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors

    Authors: Vladimir Guzov, Aymen Mir, Torsten Sattler, Gerard Pons-Moll

    Abstract: We introduce the Human POSEitioning System (HPS), a method to recover the full 3D pose of a human registered with a 3D scan of the surrounding environment using wearable sensors. Using IMUs attached to the body limbs and a head-mounted camera looking outwards, HPS fuses camera-based self-localization with IMU-based human body tracking. The former provides drift-free but noisy position and orientation…

    Submitted 31 March, 2021; originally announced March 2021.

    Comments: 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

  36. arXiv:2103.06871

    cs.CV

    SMPLicit: Topology-aware Generative Model for Clothed People

    Authors: Enric Corona, Albert Pumarola, Guillem Alenyà, Gerard Pons-Moll, Francesc Moreno-Noguer

    Abstract: In this paper we introduce SMPLicit, a novel generative model to jointly represent body pose, shape and clothing geometry. In contrast to existing learning-based approaches that require training specific models for each type of garment, SMPLicit can represent in a unified manner different garment topologies (e.g. from sleeveless tops to hoodies and to open jackets), while controlling other propert…

    Submitted 2 April, 2021; v1 submitted 11 March, 2021; originally announced March 2021.

    Comments: Accepted at CVPR 2021

  37. arXiv:2102.06837

    cs.CV

    Learning Speech-driven 3D Conversational Gestures from Video

    Authors: Ikhsanul Habibie, Weipeng Xu, Dushyant Mehta, Lingjie Liu, Hans-Peter Seidel, Gerard Pons-Moll, Mohamed Elgharib, Christian Theobalt

    Abstract: We propose the first approach to automatically and jointly synthesize both the synchronous 3D conversational body and hand gestures, as well as 3D face and head animations, of a virtual character from speech input. Our algorithm uses a CNN architecture that leverages the inherent correlation between facial expression and hand gestures. Synthesis of conversational body gestures is a multi-modal pro…

    Submitted 12 February, 2021; originally announced February 2021.

  38. arXiv:2102.01161

    cs.CV

    Adjoint Rigid Transform Network: Task-conditioned Alignment of 3D Shapes

    Authors: Keyang Zhou, Bharat Lal Bhatnagar, Bernt Schiele, Gerard Pons-Moll

    Abstract: Most learning methods for 3D data (point clouds, meshes) suffer significant performance drops when the data is not carefully aligned to a canonical orientation. Aligning real-world 3D data collected from different sources is non-trivial and requires manual intervention. In this paper, we propose the Adjoint Rigid Transform (ART) Network, a neural module which can be integrated with a variety of 3D…

    Submitted 27 October, 2023; v1 submitted 1 February, 2021; originally announced February 2021.

  39. arXiv:2011.13961

    cs.CV

    D-NeRF: Neural Radiance Fields for Dynamic Scenes

    Authors: Albert Pumarola, Enric Corona, Gerard Pons-Moll, Francesc Moreno-Noguer

    Abstract: Neural rendering techniques combining machine learning with geometric reasoning have arisen as one of the most promising approaches for synthesizing novel views of a scene from a sparse set of images. Among these, Neural Radiance Fields (NeRF) stands out: it trains a deep network to map 5D input coordinates (representing spatial location and viewing direction) into a volume density and view…

    Submitted 27 November, 2020; originally announced November 2020.

  40. SelfPose: 3D Egocentric Pose Estimation from a Headset Mounted Camera

    Authors: Denis Tome, Thiemo Alldieck, Patrick Peluse, Gerard Pons-Moll, Lourdes Agapito, Hernan Badino, Fernando De la Torre

    Abstract: We present a solution to egocentric 3D body pose estimation from monocular images captured from downward-looking fish-eye cameras installed on the rim of a head-mounted VR device. This unusual viewpoint leads to images with unique visual appearance, with severe self-occlusions and perspective distortions that result in drastic differences in resolution between lower and upper body. We propose an e…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: 14 pages. arXiv admin note: substantial text overlap with arXiv:1907.10045

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020

  41. arXiv:2010.13938

    cs.CV cs.LG

    Neural Unsigned Distance Fields for Implicit Function Learning

    Authors: Julian Chibane, Aymen Mir, Gerard Pons-Moll

    Abstract: In this work we target a learnable output representation that allows continuous, high resolution outputs of arbitrary shape. Recent works represent 3D surfaces implicitly with a Neural Network, thereby breaking previous barriers in resolution, and ability to represent diverse topologies. However, neural implicit representations are limited to closed surfaces, which divide the space into inside and…

    Submitted 26 October, 2020; originally announced October 2020.

    Comments: Neural Information Processing Systems (NeurIPS) 2020

    Journal ref: Neural Information Processing Systems (NeurIPS) 2020

  42. arXiv:2010.13508  [pdf, other]

    cs.CV

    SHARP 2020: The 1st Shape Recovery from Partial Textured 3D Scans Challenge Results

    Authors: Alexandre Saint, Anis Kacem, Kseniya Cherenkova, Konstantinos Papadopoulos, Julian Chibane, Gerard Pons-Moll, Gleb Gusev, David Fofi, Djamila Aouada, Bjorn Ottersten

    Abstract: The SHApe Recovery from Partial textured 3D scans challenge, SHARP 2020, is the first edition of a challenge fostering and benchmarking methods for recovering complete textured 3D scans from raw incomplete data. SHARP 2020 is organised as a workshop in conjunction with ECCV 2020. There are two complementary challenges, the first one on 3D human scans, and the second one on generic objects. Challen…

    Submitted 26 October, 2020; originally announced October 2020.

    Comments: SHARP workshop, ECCV 2020

  43. arXiv:2010.12447  [pdf, other]

    cs.CV

    LoopReg: Self-supervised Learning of Implicit Surface Correspondences, Pose and Shape for 3D Human Mesh Registration

    Authors: Bharat Lal Bhatnagar, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll

    Abstract: We address the problem of fitting 3D human models to 3D scans of dressed humans. Classical methods optimize both the data-to-model correspondences and the human model parameters (pose and shape), but are reliable only when initialized close to the solution. Some methods initialize the optimization based on fully supervised correspondence predictors, which is not differentiable end-to-end, and can…

    Submitted 26 November, 2021; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: NeurIPS'20 (Oral)

    Journal ref: NeurIPS 2020

  44. arXiv:2009.09458  [pdf, other]

    cs.CV

    Implicit Feature Networks for Texture Completion from Partial 3D Data

    Authors: Julian Chibane, Gerard Pons-Moll

    Abstract: Prior work to infer 3D texture use either texture atlases, which require uv-mappings and hence have discontinuities, or colored voxels, which are memory inefficient and limited in resolution. Recent work, predicts RGB color at every XYZ coordinate forming a texture field, but focus on completing texture given a single 2D image. Instead, we focus on 3D texture and geometry completion from partial a…

    Submitted 20 September, 2020; originally announced September 2020.

    Comments: SHARP Workshop, European Conference on Computer Vision (ECCV), 2020

    Journal ref: SHARP Workshop, European Conference on Computer Vision (ECCV), 2020

  45. arXiv:2007.11610  [pdf, other]

    cs.CV

    SIZER: A Dataset and Model for Parsing 3D Clothing and Learning Size Sensitive 3D Clothing

    Authors: Garvita Tiwari, Bharat Lal Bhatnagar, Tony Tung, Gerard Pons-Moll

    Abstract: While models of 3D clothing learned from real data exist, no method can predict clothing deformation as a function of garment size. In this paper, we introduce SizerNet to predict 3D clothing conditioned on human body shape and garment size parameters, and ParserNet to infer garment meshes and shape under clothing with personal details in a single pass from an input mesh. SizerNet allows to estima…

    Submitted 22 July, 2020; originally announced July 2020.

    Comments: European Conference on Computer Vision 2020

  46. arXiv:2007.11432  [pdf, other]

    cs.CV

    Combining Implicit Function Learning and Parametric Models for 3D Human Reconstruction

    Authors: Bharat Lal Bhatnagar, Cristian Sminchisescu, Christian Theobalt, Gerard Pons-Moll

    Abstract: Implicit functions represented as deep learning approximations are powerful for reconstructing 3D surfaces. However, they can only produce static surfaces that are not controllable, which provides limited ability to modify the resulting model by editing its pose or shape parameters. Nevertheless, such features are essential in building flexible models for both computer graphics and computer vision…

    Submitted 26 November, 2021; v1 submitted 22 July, 2020; originally announced July 2020.

    Comments: Accepted at ECCV'20 (Oral)

  47. arXiv:2007.11341  [pdf, other]

    cs.CV

    Unsupervised Shape and Pose Disentanglement for 3D Meshes

    Authors: Keyang Zhou, Bharat Lal Bhatnagar, Gerard Pons-Moll

    Abstract: Parametric models of humans, faces, hands and animals have been widely used for a range of tasks such as image-based reconstruction, shape correspondence estimation, and animation. Their key strength is the ability to factor surface variations into shape and pose dependent components. Learning such models requires lots of expert knowledge and hand-defined object-specific constraints, making the le…

    Submitted 22 July, 2020; originally announced July 2020.

  48. arXiv:2007.09548  [pdf, other]

    cs.CV

    Kinematic 3D Object Detection in Monocular Video

    Authors: Garrick Brazil, Gerard Pons-Moll, Xiaoming Liu, Bernt Schiele

    Abstract: Perceiving the physical world in 3D is fundamental for self-driving applications. Although temporal motion is an invaluable resource to human vision for detection, tracking, and depth perception, such features have not been thoroughly utilized in modern 3D object detectors. In this work, we propose a novel method for monocular video-based 3D object detection which carefully leverages kinematic mot…

    Submitted 18 July, 2020; originally announced July 2020.

    Comments: To appear in ECCV 2020

  49. arXiv:2003.08325  [pdf, other]

    cs.CV

    DeepCap: Monocular Human Performance Capture Using Weak Supervision

    Authors: Marc Habermann, Weipeng Xu, Michael Zollhoefer, Gerard Pons-Moll, Christian Theobalt

    Abstract: Human performance capture is a highly important computer vision problem with many applications in movie production and virtual/augmented reality. Many previous performance capture approaches either required expensive multi-view setups or did not recover dense space-time coherent geometry with frame-to-frame correspondences. We propose a novel deep learning approach for monocular dense human perfor…

    Submitted 18 March, 2020; originally announced March 2020.

  50. arXiv:2003.04583  [pdf, other]

    cs.CV cs.GR

    TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style

    Authors: Chaitanya Patel, Zhouyingcheng Liao, Gerard Pons-Moll

    Abstract: In this paper, we present TailorNet, a neural model which predicts clothing deformation in 3D as a function of three factors: pose, shape and style (garment geometry), while retaining wrinkle detail. This goes beyond prior models, which are either specific to one style and shape, or generalize to different shapes producing smooth results, despite being style specific. Our hypothesis is that (even…

    Submitted 15 March, 2020; v1 submitted 10 March, 2020; originally announced March 2020.

    Comments: Accepted to CVPR 2020. Chaitanya Patel and Zhouyingcheng Liao contributed equally