
gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction (CVPR 2023)

This repository is the official implementation of gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction. It also provides implementations for Grasping Field and AlignSDF. Project webpage: https://zerchen.github.io/projects/gsdf.html

Abstract: Signed distance functions (SDFs) are an attractive framework that has recently shown promising results for 3D shape reconstruction from images. SDFs seamlessly generalize to different shape resolutions and topologies but lack explicit modelling of the underlying 3D geometry. In this work, we exploit the hand structure and use it as guidance for SDF-based shape reconstruction. In particular, we address reconstruction of hands and manipulated objects from monocular RGB images. To this end, we estimate poses of hands and objects and use them to guide 3D reconstruction. More specifically, we predict kinematic chains of pose transformations and align SDFs with highly-articulated hand poses. We improve the visual features of 3D points with geometry alignment and further leverage temporal information to enhance the robustness to occlusion and motion blur. We conduct extensive experiments on the challenging ObMan and DexYCB benchmarks and demonstrate significant improvements of the proposed method over the state of the art.

Installation

Please follow the instructions below to set up the environment.

conda create -n gsdf python=3.9
conda activate gsdf
conda install pytorch==1.9.0 torchvision==0.10.0 torchaudio==0.9.0 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r requirements.txt
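
You can optionally sanity-check the environment afterwards (a minimal check, just confirming that PyTorch imports and sees the GPU):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"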

Dataset

  1. ObMan dataset preparation.
  • Download ObMan data from the official website.
  • Set up a soft link from the download path to ${ROOT}/datasets/obman/data.
  • Download processed SDF files and json files.
  • Run ${ROOT}/preprocess/cocoify_obman.py to generate LMDB training files (example commands follow the listing below). The data organization looks like this:
    ${ROOT}/datasets/obman
    ├── splits
    │   ├── obman_train.json
    │   └── obman_test.json
    ├── obman.py
    └── data
        ├── val
        ├── train
        │   ├── rgb
        │   ├── rgb.lmdb
        │   ├── sdf_hand
        │   ├── sdf_hand.lmdb
        │   ├── sdf_obj
        │   └── sdf_obj.lmdb
        └── test
            ├── rgb
            ├── mesh_hand
            └── mesh_obj
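
In shell terms, the soft-link and preprocessing steps above might look as follows (a sketch: /path/to/obman is a placeholder for your actual download location, and cocoify_obman.py may take additional arguments, so check the script before running):

# /path/to/obman is a placeholder for the ObMan download directory
ln -s /path/to/obman ${ROOT}/datasets/obman/data
python ${ROOT}/preprocess/cocoify_obman.py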
  2. DexYCB dataset preparation.
  • Download DexYCB data from the official webpage.
  • Set up a soft link from the download path to ${ROOT}/datasets/dexycb/data.
  • Download processed SDF files and json files.
  • Run ${ROOT}/preprocess/cocoify_dexycb.py to generate LMDB training files (example commands follow the listing below). The data organization looks like this:
    ${ROOT}/datasets/dexycb
    ├── splits
    │   ├── toolkit
    │   ├── dexycb_train_s0.json
    │   └── dexycb_test_s0.json
    ├── dexycb.py
    └── data
        ├── 20200709-subject-01
        ├── ...
        ├── 20201022-subject-10
        ├── bop
        ├── models
        ├── mesh_data
        ├── sdf_data
        ├── rgb_s0.lmdb
        ├── sdf_hand_s0.lmdb
        └── sdf_obj_s0.lmdb
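
The same pattern applies to DexYCB (again a sketch, with a placeholder download path):

# /path/to/dexycb is a placeholder for the DexYCB download directory
ln -s /path/to/dexycb ${ROOT}/datasets/dexycb/data
python ${ROOT}/preprocess/cocoify_dexycb.py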

Training

  1. Create the output directory and change into the tools directory:
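
mkdir ${ROOT}/outputs
cd ${ROOT}/tools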

  2. ${ROOT}/playground provides implementations of different models:

    ${ROOT}/playground
     ├── pose_kpt                  # The pose estimation component used by gSDF
     ├── hsdf_osdf_1net            # SDF network with a single backbone, as in Grasping Field or AlignSDF
     ├── hsdf_osdf_2net            # SDF network with two backbones, as in gSDF
     ├── hsdf_osdf_2net_pa         # Same as hsdf_osdf_2net, but with additional pixel-aligned visual features
     ├── hsdf_osdf_2net_video_pa   # Same as hsdf_osdf_2net_pa, but with a spatio-temporal transformer over multiple frames
    
  3. Train the Grasping Field model:

CUDA_VISIBLE_DEVICES=4,5,6,7 python train.py --gpu 4-7 -e ../playground/hsdf_osdf_1net/experiments/obman_resnet18_hnerf3_onerf3.yaml
  4. Train the AlignSDF model:

CUDA_VISIBLE_DEVICES=4,5,6,7 python train.py --gpu 4-7 -e ../playground/hsdf_osdf_1net/experiments/obman_resnet18_hkine6_otrans6.yaml

  5. Train the gSDF model:

For ObMan

# First, train a checkpoint for hand pose estimation.
CUDA_VISIBLE_DEVICES=4,5,6,7 python train.py --gpu 4-7 -e ../playground/pose_kpt/experiments/obman_hand.yaml

# Then, load the pretrained pose checkpoint and train the SDF model.
CUDA_VISIBLE_DEVICES=4,5,6,7 python train.py --gpu 4-7 -e ../playground/hsdf_osdf_2net_pa/experiments/obman_presnet18_sresnet18_hkine6_okine6.yaml

For DexYCB

# First, train a checkpoint for hand pose estimation.
CUDA_VISIBLE_DEVICES=4,5,6,7 python train.py --gpu 4-7 -e ../playground/pose_kpt/experiments/dexycb_s0_hand.yaml

# Then, load the pretrained pose checkpoint and train the SDF model.
CUDA_VISIBLE_DEVICES=4,5,6,7 python train.py --gpu 4-7 -e ../playground/hsdf_osdf_2net_pa/experiments/dexycbs0_presnet18_sresnet18_hkine6_okine6.yaml hand_point_latent 51 obj_point_latent 72 ckpt ../outputs/pose_kpt/dexycbs0_29k_resnet18_rot0_6d_h1_o0_norm0_e100_b128_vw1.0_ocrw0.0_how1.0_sow0.0/model_dump/snapshot_99.pth.tar

# Train the model that processes multiple frames (DexYCB provides videos).
CUDA_VISIBLE_DEVICES=4,5,6,7 python train.py --gpu 4-7 -e ../playground/hsdf_osdf_2net_video_pa/experiments/dexycbs0_3frames_presnet18_sresnet18_hkine6_okine6.yaml hand_point_latent 51 obj_point_latent 72 ckpt path_to_pretrained_model

Testing and Evaluation

When training finishes, the script launches testing automatically. You can also launch testing explicitly:

For ObMan

CUDA_VISIBLE_DEVICES=1 python test.py --gpu 1 -e ../outputs/hsdf_osdf_2net_pa/gsdf_obman/exp.yaml

For DexYCB

CUDA_VISIBLE_DEVICES=7 python test.py --gpu 7 -e ../outputs/hsdf_osdf_2net_pa/dexycbs0_29k_resnet18_resnet18_h1_o1_sdf5_cls0_rot0_hand_kine_51_obj_kine_72_np2000_adf1_e1600_ae1201_scale6.2_b64_hsw0.5_osw0.5_hcw0.0_vw0.5/exp.yaml

After the testing phase ends, you can evaluate the performance.

For ObMan

CUDA_VISIBLE_DEVICES=1 python eval.py --gpu 1 -e ../outputs/hsdf_osdf_2net_pa/gsdf_obman

For DexYCB

CUDA_VISIBLE_DEVICES=7 python eval.py -e ../outputs/hsdf_osdf_2net_pa/dexycbs0_29k_resnet18_resnet18_h1_o1_sdf5_cls0_rot0_hand_kine_51_obj_kine_72_np2000_adf1_e1600_ae1201_scale6.2_b64_hsw0.5_osw0.5_hcw0.0_vw0.5

Citation

If you find this work useful, please consider citing:

@InProceedings{chen2023gsdf,
author       = {Chen, Zerui and Chen, Shizhe and Schmid, Cordelia and Laptev, Ivan},
title        = {{gSDF}: {Geometry-Driven} Signed Distance Functions for {3D} Hand-Object Reconstruction},
booktitle    = {CVPR},
year         = {2023},
}

Acknowledgement

Parts of the code are built upon manopth, PoseNet, PCL, Grasping Field, and HALO. Many thanks to the authors for their great work!
