TY - JOUR
T1 - Image-to-Image Translation for Enhanced Feature Matching, Image Retrieval and Visual Localization
AU - Mueller, M. S.
AU - Sattler, T.
AU - Pollefeys, M.
AU - Jutzi, B.
N1 - Publisher Copyright:
© 2019 ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. All rights reserved.
PY - 2019/9/16
Y1 - 2019/9/16
N2 - The performance of machine learning and deep learning algorithms for image analysis depends significantly on the quantity and quality of the training data. The generation of annotated training data is often costly, time-consuming and laborious. Data augmentation is a powerful option to overcome these drawbacks. Therefore, we augment training data by rendering images with arbitrary poses from 3D models to increase the quantity of training images. These training images usually show artifacts and are of limited use for advanced image analysis. Therefore, we propose to use image-to-image translation to transform images from a rendered domain to a captured domain. We show that translated images in the captured domain are of higher quality than the rendered images. Moreover, we demonstrate that image-to-image translation based on rendered 3D models enhances the performance of common computer vision tasks, namely feature matching, image retrieval and visual localization. The experimental results clearly show the improvement of translated images over rendered images for all investigated tasks. In addition, we present the advantages of utilizing translated images over exclusively captured images for visual localization.
AB - The performance of machine learning and deep learning algorithms for image analysis depends significantly on the quantity and quality of the training data. The generation of annotated training data is often costly, time-consuming and laborious. Data augmentation is a powerful option to overcome these drawbacks. Therefore, we augment training data by rendering images with arbitrary poses from 3D models to increase the quantity of training images. These training images usually show artifacts and are of limited use for advanced image analysis. Therefore, we propose to use image-to-image translation to transform images from a rendered domain to a captured domain. We show that translated images in the captured domain are of higher quality than the rendered images. Moreover, we demonstrate that image-to-image translation based on rendered 3D models enhances the performance of common computer vision tasks, namely feature matching, image retrieval and visual localization. The experimental results clearly show the improvement of translated images over rendered images for all investigated tasks. In addition, we present the advantages of utilizing translated images over exclusively captured images for visual localization.
KW - 3D Models
KW - Convolutional Neural Networks
KW - Data Augmentation
KW - Feature Matching
KW - Generative Adversarial Networks
KW - Image Retrieval
KW - Image-to-Image Translation
KW - Visual Localization
UR - http://www.scopus.com/inward/record.url?scp=85084672809&partnerID=8YFLogxK
U2 - 10.5194/isprs-annals-IV-2-W7-111-2019
DO - 10.5194/isprs-annals-IV-2-W7-111-2019
M3 - Conference article
AN - SCOPUS:85084672809
SN - 2194-9042
VL - 4
SP - 111
EP - 119
JO - ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
JF - ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
IS - 2/W7
T2 - 1st Photogrammetric Image Analysis and Munich Remote Sensing Symposium, PIA 2019+MRSS 2019
Y2 - 18 September 2019 through 20 September 2019
ER -