Tuesday, February 24, 2009

A visual bag of words method for interactive qualitative localization and mapping

Main idea: The bag-of-words approach is adapted to perform visual localization. It is based on building a dictionary from the words (features) extracted from images. Localization is then performed by comparing the words from a new scene with the ones stored in the dictionary.

Visual localization
  • The goal of the classifier is to infer the room from an image.
  • The classifier can be trained incrementally using a voting scheme.
  • Active learning is performed when the classifier fails: the user then provides the correct label for the place that was misclassified.
In more detail (assuming the map is already built)
    • The features are extracted and the corresponding words are found in the dictionary. These words then vote at a first level for the rooms in which they have been perceived.
    • The quality of the vote is measured by the difference between the highest and the second-highest vote counts.
    • The winning category votes at a second level.
    • This process is repeated with the other feature spaces and with new images until the quality of the second level vote reaches a given threshold.
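
The steps above can be sketched roughly as follows. This is only a minimal illustration, not the authors' implementation: the names first_level_vote, vote_quality and localize are made up for this post, the dictionary lookup is assumed to have already turned features into word ids, and I assume that the quality of the second-level vote is measured the same way as the first-level one (maximum minus second maximum) and that a tied first-level vote casts no second-level vote.

```python
from collections import Counter


def first_level_vote(words, word_rooms):
    """Each word votes for every room in which it has been perceived."""
    votes = Counter()
    for word in words:
        for room in word_rooms.get(word, ()):
            votes[room] += 1
    return votes


def vote_quality(votes):
    """Return (winner, margin), where margin = top count - second count."""
    ranked = votes.most_common(2)
    if not ranked:
        return None, 0
    winner, top = ranked[0]
    second = ranked[1][1] if len(ranked) > 1 else 0
    return winner, top - second


def localize(observations, word_rooms_by_space, threshold):
    """observations: iterable of (feature_space, word_ids) pairs, one per
    feature space and image. Returns a room name, or None if the
    second-level vote never reaches the threshold."""
    second_level = Counter()
    for space, words in observations:
        votes = first_level_vote(words, word_rooms_by_space[space])
        winner, margin = vote_quality(votes)
        if winner is None or margin == 0:
            continue  # assumption: an empty or tied vote is ignored
        second_level[winner] += 1  # the winning category votes at level two
        room, quality = vote_quality(second_level)
        if quality >= threshold:
            return room
    return None


# Hypothetical toy data: word id -> rooms where the word was seen.
word_rooms_by_space = {
    "sift": {1: {"kitchen"}, 2: {"kitchen", "office"}, 3: {"office"}},
    "color": {10: {"kitchen"}, 11: {"office"}},
}
observations = [("sift", [1, 2]), ("color", [10]), ("sift", [1, 2, 3])]
result = localize(observations, word_rooms_by_space, threshold=2)
```

When localize returns None, the system is in the situation where it would ask the user for the right label (the active learning case mentioned above).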
Mapping involves two processes
  • Building the dictionary
  • Gathering data for the classifier
  • User-aided approach
  • Memorize in which category a given word has been perceived
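
A rough sketch of how the dictionary and the classifier data could be stored together. The summary above does not say how words are created, so the rule used here (a feature matches the nearest word if it lies within a fixed radius, otherwise it becomes a new word) is my assumption, and the class name Dictionary and its methods are hypothetical.

```python
import math


class Dictionary:
    """Incremental visual dictionary that also remembers, for each word,
    the categories (rooms) in which it has been perceived."""

    def __init__(self, radius):
        self.radius = radius   # assumption: fixed matching radius
        self.words = []        # word centers (feature vectors)
        self.categories = []   # per word: set of categories

    def lookup(self, feature):
        """Return the index of the nearest word within radius, else None."""
        best, best_dist = None, float("inf")
        for index, center in enumerate(self.words):
            dist = math.dist(feature, center)
            if dist < best_dist:
                best, best_dist = index, dist
        return best if best_dist <= self.radius else None

    def learn(self, feature, category):
        """Add the feature (creating a word if needed) and record the
        user-provided category for that word."""
        index = self.lookup(feature)
        if index is None:
            self.words.append(tuple(feature))
            self.categories.append(set())
            index = len(self.words) - 1
        self.categories[index].add(category)
        return index


# Hypothetical 2-D features labelled by the user.
d = Dictionary(radius=1.0)
a = d.learn((0.0, 0.0), "kitchen")
b = d.learn((0.5, 0.0), "office")   # close to word 0, so it is reused
c = d.learn((5.0, 5.0), "office")   # far from everything, so a new word
```

The categories list is exactly the information the first-level vote needs: for each word, the rooms it can vote for.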
Features used
  • SIFT
  • local color histograms
  • local normalized grey level histograms
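
As an illustration of the last feature type, here is a small sketch of a normalized grey-level histogram over one image patch. The exact normalization is not stated above, so this is one possible reading: grey levels are stretched to the patch's own min/max range before binning, and the bins are divided by the pixel count. The function name grey_histogram and the bins parameter are mine.

```python
def grey_histogram(patch, bins=8):
    """patch: 2-D list of grey values. Returns a histogram that sums to 1."""
    pixels = [value for row in patch for value in row]
    lo, hi = min(pixels), max(pixels)
    span = hi - lo
    hist = [0] * bins
    for value in pixels:
        # Assumption: normalize grey levels to [0, 1] within the patch.
        t = (value - lo) / span if span else 0.0
        hist[min(int(t * bins), bins - 1)] += 1
    count = len(pixels)
    return [h / count for h in hist]


patch = [[0, 255], [0, 255]]
hist = grey_histogram(patch, bins=2)
```

Each such histogram would then be treated as a feature vector and matched against the dictionary of its own feature space.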
