TY - GEN
T1 - Learning to Localize in New Environments from Synthetic Training Data
AU - Winkelbauer, Dominik
AU - Denninger, Maximilian
AU - Triebel, Rudolph
N1 - Publisher Copyright: © 2021 IEEE
PY - 2021
Y1 - 2021
AB - Most existing approaches for visual localization either need a detailed 3D model of the environment or, in the case of learning-based methods, must be retrained for each new scene. This can either be very expensive or simply impossible for large, unknown environments, for example in search-and-rescue scenarios. Although there are learning-based approaches that operate scene-agnostically, the generalization capability of these methods is still outperformed by classical approaches. In this paper, we present an approach that can generalize to new scenes by applying specific changes to the model architecture, including an extended regression part, the use of hierarchical correlation layers, and the exploitation of scale and uncertainty information. Our approach outperforms the 5-point algorithm using SIFT features on equally big images and additionally surpasses all previous learning-based approaches that were trained on different data. It is also superior to most of the approaches that were specifically trained on the respective scenes. We also evaluate our approach in a scenario with only very few reference images, showing that under such more realistic conditions our learning-based approach considerably exceeds both existing learning-based and classical methods.
UR - http://www.scopus.com/inward/record.url?scp=85125471526&partnerID=8YFLogxK
U2 - 10.1109/ICRA48506.2021.9560872
DO - 10.1109/ICRA48506.2021.9560872
M3 - Conference contribution
AN - SCOPUS:85125471526
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 5840
EP - 5846
BT - 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2021 IEEE International Conference on Robotics and Automation, ICRA 2021
Y2 - 30 May 2021 through 5 June 2021
ER -