Learning visual overlapping image pairs for SfM via CNN fine-tuning with photogrammetric geometry information

Qianbao Hou, Rui Xia, Jiahuan Zhang, Yu Feng, Zongqian Zhan, Xin Wang

Research output: Contribution to journal › Review article › peer-review

12 Scopus citations

Abstract

Efficient and accurate identification of visual overlapping image pairs is an ongoing challenge for large-scale Structure from Motion (SfM). Recently, CNN-based methods have demonstrated the ability to find visually similar image pairs, yet Bag-of-Words (BoW) or visual vocabulary tree (VoC) approaches with hand-crafted or learning-based local features are still widely embedded in 3D reconstruction tasks. To explore the differences between these approaches, in this work we fine-tuned several popular CNNs (AlexNet, VGG, ResNet) following rules tailored to determining visual overlapping image pairs for SfM. More specifically, a new training dataset (called LOIP), consisting of regular photogrammetric images and crowdsourced images from the Internet, is generated by fully considering photogrammetric requirements and 3D mesh models. Local regional overlapping information from paired images is employed in the fine-tuning procedure. To aggregate feature maps from various channels, multiple learnable NetVLAD layers for the regional information are employed to further improve retrieval performance. Comprehensive experiments demonstrate that image retrieval performance is improved and that the time cost of image matching is significantly reduced by using the identified visual overlapping pairs. Furthermore, the SfM results are basically on par with those of several state-of-the-art CNN-based and VoC methods.

Original language: English
Article number: 103162
Journal: International Journal of Applied Earth Observation and Geoinformation
Volume: 116
State: Published - Feb 2023

Keywords

  • CNN-based fine-tuning
  • Image retrieval
  • NetVLAD
  • Visual overlapping image pairs
