iNeRF: Inverting Neural Radiance Fields for Pose Estimation

L Yen-Chen, P Florence, JT Barron… - 2021 IEEE/RSJ International Conference on Intelligent Robots and …, 2021 - ieeexplore.ieee.org
We present iNeRF, a framework that performs mesh-free pose estimation by "inverting" a Neural Radiance Field (NeRF). NeRFs have been shown to be remarkably effective for the task of view synthesis: synthesizing photorealistic novel views of real-world scenes or objects. In this work, we investigate whether we can apply analysis-by-synthesis via NeRF for mesh-free, RGB-only 6DoF pose estimation: given an image, find the translation and rotation of a camera relative to a 3D object or scene. Our method assumes that no object mesh models are available during either training or test time. Starting from an initial pose estimate, we use gradient descent to minimize the residual between pixels rendered from a NeRF and pixels in an observed image. In our experiments, we first study 1) how to sample rays during pose refinement for iNeRF to collect informative gradients and 2) how different batch sizes of rays affect iNeRF on a synthetic dataset. We then show that for complex real-world scenes from the LLFF dataset [21], iNeRF can improve NeRF by estimating the camera poses of novel images and using these images as additional training data for NeRF. Finally, we show iNeRF can perform category-level object pose estimation, including object instances not seen during training, with RGB images by inverting a NeRF model inferred from a single view.
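
The core loop described above (start from an initial pose, render, compare with the observed pixels, and descend the gradient of the residual) can be illustrated with a deliberately small toy. This is only a sketch under strong assumptions and not the authors' implementation: the "NeRF" is replaced by a hypothetical analytic 2D intensity field (`radiance`), the pose is a 2D rigid transform (`theta`, `tx`, `ty`) instead of a 6DoF pose, gradients come from central finite differences rather than backpropagation through a network, and a single fixed batch of pixels is sampled once rather than resampled each step. A simple step-halving rule is added so the loss never increases.

```python
import math
import random


def radiance(x, y):
    """Hypothetical stand-in for a trained NeRF: a smooth 2D intensity field."""
    a = math.exp(-((x - 0.3) ** 2 + (y + 0.2) ** 2) / 0.5)
    b = 0.5 * math.exp(-((x + 0.5) ** 2 + (y - 0.4) ** 2) / 0.3)
    return a + b


def render(pose, pixels):
    """Render intensities for camera-frame pixels under pose (theta, tx, ty)."""
    theta, tx, ty = pose
    c, s = math.cos(theta), math.sin(theta)
    return [radiance(c * u - s * v + tx, s * u + c * v + ty) for u, v in pixels]


def photometric_loss(pose, pixels, observed):
    """Mean squared residual between rendered and observed pixels."""
    rendered = render(pose, pixels)
    return sum((r - o) ** 2 for r, o in zip(rendered, observed)) / len(pixels)


def numerical_grad(pose, pixels, observed, eps=1e-4):
    """Central finite differences (assumption: replaces backprop through NeRF)."""
    grad = []
    for i in range(len(pose)):
        hi, lo = list(pose), list(pose)
        hi[i] += eps
        lo[i] -= eps
        diff = photometric_loss(hi, pixels, observed) - photometric_loss(lo, pixels, observed)
        grad.append(diff / (2 * eps))
    return grad


def refine_pose(init_pose, pixels, observed, steps=200, lr=0.5):
    """Gradient descent on the pose; a step is accepted only if the loss drops."""
    pose = list(init_pose)
    loss = photometric_loss(pose, pixels, observed)
    history = [loss]
    for _ in range(steps):
        grad = numerical_grad(pose, pixels, observed)
        step = lr
        for _ in range(30):  # halve the step until the loss decreases
            candidate = [p - step * g for p, g in zip(pose, grad)]
            cand_loss = photometric_loss(candidate, pixels, observed)
            if cand_loss < loss:
                pose, loss = candidate, cand_loss
                break
            step *= 0.5
        history.append(loss)
    return pose, history


# Camera-frame pixel grid on [-1, 1]^2 and one fixed batch of sampled "rays".
grid = [(-1 + 2 * i / 15, -1 + 2 * j / 15) for i in range(16) for j in range(16)]
batch = random.Random(0).sample(grid, 64)

true_pose = (0.10, 0.05, -0.08)      # unknown pose that produced the observation
observed = render(true_pose, batch)  # "observed image" restricted to the batch
init_pose = (0.0, 0.0, 0.0)          # initial pose estimate

est_pose, history = refine_pose(init_pose, batch, observed)
print("initial loss:", history[0])
print("final loss:  ", history[-1])
print("estimated pose:", est_pose)
```

In this toy the batch is chosen uniformly at random; the abstract's first experiment concerns exactly this choice, since which rays are sampled determines how informative the gradients are.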