6DGS: 6D Pose Estimation from a Single Image and a 3D Gaussian Splatting Model
Description
Abstract. We propose 6DGS to estimate the camera pose of a target
RGB image given a 3D Gaussian Splatting (3DGS) model representing the scene. 6DGS avoids the iterative process typical of analysis-by-synthesis methods (e.g. iNeRF) that also require an initialization of the
camera pose in order to converge. Instead, our method estimates a 6DoF
pose by inverting the 3DGS rendering process. Starting from the object
surface, we define a radiant Ellicell that uniformly generates rays departing from each of the ellipsoids that parameterize the 3DGS model. Each Ellicell
ray is associated with the rendering parameters of each ellipsoid, which in
turn are used to obtain the best bindings between the target image pixels
and the cast rays. These pixel-ray bindings are then ranked to select the
best-scoring bundle of rays, whose intersection provides the camera
center and, in turn, the camera rotation. The proposed solution obviates
the necessity of an “a priori” pose for initialization, and it solves 6DoF
pose estimation in closed form, without the need for iterations. Moreover,
compared to the existing Novel View Synthesis (NVS) baselines for pose
estimation, 6DGS can improve the overall average rotational accuracy
by 12% and translation accuracy by 22% on real scenes, despite not requiring any initialization pose. At the same time, our method operates
near real-time, reaching 15 fps on consumer hardware.
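The abstract states that the camera center is obtained from the intersection of the selected bundle of rays. As an illustration only, the sketch below shows one standard way to compute such a point: the weighted least-squares point closest to a set of 3D lines. It is not taken from the paper, which may solve this step differently. The names `intersect_rays` and `_solve3`, and the idea of passing the pixel-ray binding scores as `weights`, are assumptions made for this example.

```python
import math


def _solve3(A, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(3):
        # Pick the row with the largest entry in this column as pivot.
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("rays are (nearly) parallel; no unique closest point")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate this column from every other row.
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]


def intersect_rays(origins, directions, weights=None):
    """Return the 3D point minimizing the weighted squared distance to all rays.

    Each ray i is the line through origins[i] along directions[i].
    With P_i = I - d_i d_i^T (projector orthogonal to the unit direction d_i),
    the minimizer p solves (sum_i w_i P_i) p = sum_i w_i P_i o_i.
    """
    if weights is None:
        weights = [1.0] * len(origins)  # hypothetical default: equal scores
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for o, d, w in zip(origins, directions, weights):
        norm = math.sqrt(sum(c * c for c in d))
        d = [c / norm for c in d]  # normalize the ray direction
        for i in range(3):
            for j in range(3):
                p_ij = (1.0 if i == j else 0.0) - d[i] * d[j]
                A[i][j] += w * p_ij
                b[i] += w * p_ij * o[j]
    return _solve3(A, b)


# Three rays that all pass through the point (1, 2, 3).
center = intersect_rays(
    origins=[(0, 0, 0), (5, 2, 3), (1, -4, 0)],
    directions=[(1, 2, 3), (-4, 0, 0), (0, 6, 3)],
)
```

With noisy rays the lines no longer meet exactly, and the same linear system returns the point closest to all of them in the weighted least-squares sense. Since the solve is a single 3x3 system, this step needs no iterations or initial pose estimate.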
Files
2407.15484v1.pdf (22.1 MB)
md5:76853f8768cea18ec282d0359298d238