One for All: Toward Unified Foundation Models for Earth Vision

Zhitong Xiong, Yi Wang, Fahong Zhang, Xiao Xiang Zhu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Foundation models characterized by extensive parameters and trained on large-scale datasets have demonstrated remarkable efficacy across various downstream tasks for remote sensing data. Current remote sensing foundation models typically specialize in a single modality or a specific spatial resolution range, limiting their versatility for downstream datasets. While there have been attempts to develop multi-modal remote sensing foundation models, they typically employ separate vision encoders for each modality or spatial resolution, necessitating a switch in backbones contingent upon the input data. To address this issue, we introduce a simple yet effective method, termed OFA-Net (One-For-All Network): employing a single, shared Transformer backbone for multiple data modalities with different spatial resolutions. Using the masked image modeling mechanism, we pre-train a single Transformer backbone on a curated multi-modal dataset with this simple design. Then the backbone model can be used in different downstream tasks, thus forging a path towards a unified foundation backbone model in Earth vision. The proposed method is evaluated on 12 distinct downstream tasks and demonstrates promising performance.
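
As a rough illustration of the idea in the abstract, and not the authors' implementation, the sketch below shows one way inputs with different channel counts could be turned into tokens of a common width for a single shared backbone, followed by the random masking step used in masked image modeling. The modality names, channel counts, patch size, embedding width, mask ratio, and the per-modality linear projection are all assumptions made for the example; the backbone itself is omitted.

```python
import random

PATCH = 4        # assumed patch size
EMBED_DIM = 16   # assumed shared token width
MASK_RATIO = 0.75  # assumed mask ratio

def patchify(image, patch):
    """Split an H x W x C image (nested lists) into flattened patch vectors."""
    h, w = len(image), len(image[0])
    patches = []
    for top in range(0, h - patch + 1, patch):
        for left in range(0, w - patch + 1, patch):
            flat = []
            for r in range(top, top + patch):
                for c in range(left, left + patch):
                    flat.extend(image[r][c])
            patches.append(flat)
    return patches

def linear(vec, weights):
    """Project a vector with a weight matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def random_mask(num_tokens, mask_ratio, rng):
    """Return sorted (visible, masked) token indices chosen at random."""
    order = list(range(num_tokens))
    rng.shuffle(order)
    num_masked = int(num_tokens * mask_ratio)
    return sorted(order[num_masked:]), sorted(order[:num_masked])

rng = random.Random(0)

# Hypothetical modalities and channel counts, each an 8 x 8 dummy image.
channels = {"sar": 2, "multispectral": 13, "rgb": 3}
images = {
    name: [[[rng.random() for _ in range(c)] for _ in range(8)] for _ in range(8)]
    for name, c in channels.items()
}

# One small projection per modality (an assumption) maps patches of
# different widths to the same token width, so one backbone can consume them.
projections = {
    name: [[rng.gauss(0.0, 0.02) for _ in range(PATCH * PATCH * c)]
           for _ in range(EMBED_DIM)]
    for name, c in channels.items()
}

tokens = {
    name: [linear(p, projections[name]) for p in patchify(img, PATCH)]
    for name, img in images.items()
}

# Only the visible tokens would be fed to the shared backbone; the masked
# ones would be the reconstruction targets during pre-training.
visible, masked = random_mask(len(tokens["sar"]), MASK_RATIO, rng)
```

Every modality ends up as a list of tokens of length `EMBED_DIM`, which is the property that lets one backbone serve all inputs in this sketch without switching encoders.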

Original language: English
Title of host publication: IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2734-2738
Number of pages: 5
ISBN (Electronic): 9798350360325
State: Published - 2024
Event: 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024 - Athens, Greece
Duration: 7 Jul 2024 → 12 Jul 2024

Publication series

Name: International Geoscience and Remote Sensing Symposium (IGARSS)

Conference

Conference: 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024
Country/Territory: Greece
City: Athens
Period: 7/07/24 → 12/07/24

Keywords

  • Earth observation
  • Foundation models
  • remote sensing
  • self-supervised learning
