Monday, February 23, 2009

Monocular Visual Odometry in Urban Environments Using an Omnidirectional Camera

Main idea: Perform visual SLAM using preemptive RANSAC with the 5-point and 2-point algorithms to compute the camera position accurately. SIFT correspondences are used to compute the epipolar geometry, from which the translation and rotation are obtained in a decoupled way.


Landmark detection and tracking. This approach uses the SIFT detector, descriptor, and matching, with slight modifications to some of the thresholds. Matching is done in two steps, each computing the epipolar geometry with a different distance threshold to the epipolar line (20 and 2 pixels).
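The two-step filtering can be sketched as follows. This is only an illustration, not the authors' code: it assumes a 3x3 fundamental matrix F acting on homogeneous pixel coordinates (a perspective-style model, which an omnidirectional camera only approximates), and the names epipolar_distance and filter_matches are made up for this example.

```python
import math

def epipolar_distance(F, p1, p2):
    """Distance in pixels from p2 to the epipolar line F * (x1, y1, 1)."""
    x1, y1 = p1
    x2, y2 = p2
    # Epipolar line in the second image: a*x + b*y + c = 0
    a = F[0][0] * x1 + F[0][1] * y1 + F[0][2]
    b = F[1][0] * x1 + F[1][1] * y1 + F[1][2]
    c = F[2][0] * x1 + F[2][1] * y1 + F[2][2]
    norm = math.hypot(a, b)
    if norm == 0.0:
        return float("inf")  # degenerate line, treat as an outlier
    return abs(a * x2 + b * y2 + c) / norm

def filter_matches(F, matches, max_dist):
    """Keep the matches whose second point lies within max_dist of its epipolar line."""
    return [(p1, p2) for p1, p2 in matches
            if epipolar_distance(F, p1, p2) <= max_dist]

# Toy F for a pure sideways translation: epipolar lines are horizontal.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
matches = [((10.0, 5.0), (30.0, 5.0)),
           ((10.0, 5.0), (30.0, 15.0)),
           ((10.0, 5.0), (30.0, 6.5))]
loose = filter_matches(F, matches, 20.0)  # first pass, 20 pixels
tight = filter_matches(F, loose, 2.0)     # second pass, 2 pixels
```

In a real pipeline the epipolar geometry would be re-estimated from the loose set before applying the 2-pixel threshold; the same F is reused here only to keep the example short.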

After the matching between two frames is computed, the landmarks are triangulated to get an initial reconstruction of the scene.

The trajectory estimation relies on the pose estimation process, which takes as input a set of 3D-2D correspondences and the correspondences between the two frames. In this approach, the orientation estimation is decoupled from the position estimation.

Epipolar geometry gives the relative position of the new camera only up to scale, i.e. only the direction of the translation is estimated. A single 3D point is enough to recover that scale, which makes the estimated camera position consistent with both the 3D points and the epipolar geometry. Two 3D-2D correspondences are used to estimate the full camera position (not only the scale) while keeping its orientation fixed. Preemptive RANSAC is performed, followed by iterative refinement.
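To see why one 3D point fixes the scale, here is a minimal sketch under an assumed convention (not taken from the paper): a landmark X is seen along the ray x, with x parallel to R X + s d, where R is the known rotation, d the unit translation direction from the epipolar geometry, and s the unknown scale. The constraint x × (R X + s d) = 0 is linear in s. The function name recover_scale and the helpers are made up for this example.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def mat_vec(R, v):
    return tuple(dot(row, v) for row in R)

def recover_scale(R, d, X, x):
    """Least-squares s such that the ray x is parallel to R X + s d."""
    a = cross(x, d)              # coefficient of s in x × (R X + s d) = 0
    b = cross(x, mat_vec(R, X))  # constant term
    denom = dot(a, a)
    if denom < 1e-12:
        raise ValueError("ray is parallel to the translation direction")
    return -dot(a, b) / denom

# Synthetic check: identity rotation, translation 2.5 along the x axis.
R = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
d = (1.0, 0.0, 0.0)
X = (1.0, 2.0, 10.0)
x = (3.5, 2.0, 10.0)  # ray towards R X + 2.5 d (length does not matter)
s = recover_scale(R, d, X, x)
```

With noisy data one would compute s for many landmarks inside a RANSAC loop and keep the value with the most support; a single point is the minimal case.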

Triangulation. In this step the orientation is estimated by taking a small set of high-quality points. If only 2 landmarks are available, a closed-form two-view algorithm is used. Otherwise, a DLT algorithm for n views is used.
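As a rough illustration of linear n-view triangulation, the sketch below solves the inhomogeneous form of the DLT equations through the normal equations; the standard DLT instead takes the null vector of the stacked system with an SVD. It assumes 3x4 projection matrices and normalized image coordinates (u, v), which is a pinhole-style simplification of the omnidirectional model. The function names are made up for this example.

```python
def solve3(M, y):
    """Solve the 3x3 system M x = y by Gaussian elimination with partial pivoting."""
    A = [list(M[i]) + [y[i]] for i in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("degenerate configuration")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for k in range(col, 4):
                A[r][k] -= f * A[col][k]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        rest = sum(A[i][k] * x[k] for k in range(i + 1, 3))
        x[i] = (A[i][3] - rest) / A[i][i]
    return x

def triangulate(projections, observations):
    """Least-squares 3D point from n >= 2 views."""
    AtA = [[0.0] * 3 for _ in range(3)]
    Atb = [0.0] * 3
    for P, (u, v) in zip(projections, observations):
        # Each view gives two equations: (c * P[2] - P[i]) . (X, 1) = 0
        for c, row in ((u, P[0]), (v, P[1])):
            a = [c * P[2][k] - row[k] for k in range(3)]
            rhs = row[3] - c * P[2][3]
            for i in range(3):
                Atb[i] += a[i] * rhs
                for j in range(3):
                    AtA[i][j] += a[i] * a[j]
    return solve3(AtA, Atb)

# Three synthetic cameras with identity rotation observing X = (0.5, 0.2, 4).
P_a = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
P_b = [[1.0, 0.0, 0.0, -1.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
P_c = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, -1.0], [0.0, 0.0, 1.0, 0.0]]
obs = [(0.125, 0.05), (-0.125, 0.05), (0.125, -0.2)]
X_two = triangulate([P_a, P_b], obs[:2])
X_all = triangulate([P_a, P_b, P_c], obs)
```

The inhomogeneous form fixes the last homogeneous coordinate to 1, so it breaks down for points at infinity; the SVD-based form does not have that limitation.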


