KGpose: Keypoint-Graph Driven End-to-End Multi-Object 6D Pose Estimation via Point-Wise Pose Voting
Authors:
Andrew Jeong
Abstract:
This letter presents KGpose, a novel end-to-end framework for 6D pose estimation of multiple objects. Our approach combines keypoint-based method with learnable pose regression through `keypoint-graph', which is a graph representation of the keypoints. KGpose first estimates 3D keypoints for each object using an attentional multi-modal feature fusion of RGB and point cloud features. These keypoint…
▽ More
This letter presents KGpose, a novel end-to-end framework for 6D pose estimation of multiple objects. Our approach combines keypoint-based method with learnable pose regression through `keypoint-graph', which is a graph representation of the keypoints. KGpose first estimates 3D keypoints for each object using an attentional multi-modal feature fusion of RGB and point cloud features. These keypoints are estimated from each point of point cloud and converted into a graph representation. The network directly regresses 6D pose parameters for each point through a sequence of keypoint-graph embedding and local graph embedding which are designed with graph convolutions, followed by rotation and translation heads. The final pose for each object is selected from the candidates of point-wise predictions. The method achieves competitive results on the benchmark dataset, demonstrating the effectiveness of our model. KGpose enables multi-object pose estimation without requiring an extra localization step, offering a unified and efficient solution for understanding geometric contexts in complex scenes for robotic applications.
△ Less
Submitted 11 July, 2024;
originally announced July 2024.
A Meta-Learning Approach for Medical Image Registration
Authors:
Heejung Park,
Gyeong Min Lee,
Soopil Kim,
Ga Hyung Ryu,
Areum Jeong,
Sang Hyun Park,
Min Sagong
Abstract:
Non-rigid registration is a necessary but challenging task in medical imaging studies. Recently, unsupervised registration models have shown good performance, but they often require a large-scale training dataset and long training times. Therefore, in real world application where only dozens to hundreds of image pairs are available, existing models cannot be practically used. To address these limi…
▽ More
Non-rigid registration is a necessary but challenging task in medical imaging studies. Recently, unsupervised registration models have shown good performance, but they often require a large-scale training dataset and long training times. Therefore, in real world application where only dozens to hundreds of image pairs are available, existing models cannot be practically used. To address these limitations, we propose a novel unsupervised registration model which is integrated with a gradient-based meta learning framework. In particular, we train a meta learner which finds an initialization point of parameters by utilizing a variety of existing registration datasets. To quickly adapt to various tasks, the meta learner was updated to get close to the center of parameters which are fine-tuned for each registration task. Thereby, our model can adapt to unseen domain tasks via a short fine-tuning process and perform accurate registration. To verify the superiority of our model, we train the model for various 2D medical image registration tasks such as retinal choroid Optical Coherence Tomography Angiography (OCTA), CT organs, and brain MRI scans and test on registration of retinal OCTA Superficial Capillary Plexus (SCP). In our experiments, the proposed model obtained significantly improved performance in terms of accuracy and training time compared to other registration models.
△ Less
Submitted 21 April, 2021;
originally announced April 2021.