Unsupervised Deep Joint Segmentation of Multitemporal High-Resolution Images

Sudipan Saha, Lichao Mou, Chunping Qiu, Xiao Xiang Zhu, Francesca Bovolo, Lorenzo Bruzzone

Research output: Contribution to journal › Article › peer-review

51 Scopus citations

Abstract

High/very-high-resolution (HR/VHR) multitemporal images are important in remote sensing for monitoring the dynamics of the Earth's surface. Unsupervised object-based image analysis provides an effective way to analyze such images. Image semantic segmentation assigns pixel labels from meaningful object groups and has been extensively studied for single-image analysis, but it has not been explored for multitemporal images. In this article, we propose to extend supervised semantic segmentation to the unsupervised joint semantic segmentation of multitemporal images. We propose a novel method that processes multitemporal images by feeding them separately to a deep network comprising trainable convolutional layers. The training process does not involve any external labels, and segmentation labels are obtained from the argmax classification of the final layer. A novel loss function is used to detect object segments in individual images and to establish a correspondence between distinct multitemporal segments. Multitemporal semantic labels and the weights of the trainable layers are jointly optimized iteratively. We tested the method on three different HR/VHR data sets from Munich, Paris, and Trento, and the results show the method to be effective. We further extended the proposed joint segmentation method to change detection (CD) and tested it on a VHR multisensor data set from Trento.
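The abstract names two mechanisms without giving formulas: labels taken as the argmax of the final layer, and a loss that both segments each image and ties segments together across dates. The NumPy sketch below illustrates only that general idea. The function names, the `alpha` weight, the random stand-in feature maps, and the cross-entropy form of the loss are all assumptions made for illustration; the paper's actual network and loss function are not specified in this abstract and may differ.

```python
import numpy as np

def argmax_labels(features):
    """Map a (C, H, W) final-layer response to an (H, W) integer label map."""
    return np.argmax(features, axis=0)

def softmax(features):
    """Channel-wise softmax, shifted by the per-pixel maximum for stability."""
    shifted = features - features.max(axis=0, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=0, keepdims=True)

def cross_entropy(features, labels):
    """Mean pixel-wise cross-entropy of softmax(features) against integer labels."""
    probs = softmax(features)
    rows, cols = np.indices(labels.shape)
    picked = probs[labels, rows, cols]  # probability assigned to the target label
    return float(-np.log(picked + 1e-12).mean())

def joint_loss(feat_t1, feat_t2, alpha=0.5):
    """Hypothetical joint loss (an assumption, not the paper's formulation).

    'within' pulls each image toward its own argmax labels; 'across' pulls
    each image toward the other date's labels to encourage correspondence.
    """
    lab1 = argmax_labels(feat_t1)
    lab2 = argmax_labels(feat_t2)
    within = cross_entropy(feat_t1, lab1) + cross_entropy(feat_t2, lab2)
    across = cross_entropy(feat_t1, lab2) + cross_entropy(feat_t2, lab1)
    return within + alpha * across

# Random arrays stand in for the network outputs of two co-registered images.
rng = np.random.default_rng(0)
feat_t1 = rng.normal(size=(4, 8, 8))
feat_t2 = rng.normal(size=(4, 8, 8))
labels_t1 = argmax_labels(feat_t1)
loss = joint_loss(feat_t1, feat_t2)
```

In an actual training loop, the feature maps would come from the convolutional network, and the labels and network weights would be updated alternately over iterations, as the abstract describes.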

Original language: English
Article number: 9091105
Pages (from-to): 8780-8792
Number of pages: 13
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 58
Issue number: 12
State: Published - Dec 2020
Externally published: Yes

Keywords

  • Deep learning
  • high resolution (HR)
  • multitemporal image
  • segmentation
