Self-attention for raw optical Satellite Time Series Classification

Marc Rußwurm, Marco Körner

Research output: Contribution to journal › Article › peer-review

247 Scopus citations

Abstract

The amount of available Earth observation data has increased dramatically in recent years. Efficiently making use of the entire body of information is a current challenge in remote sensing; it demands lightweight problem-agnostic models that do not require region- or problem-specific expert knowledge. End-to-end trained deep learning models can make use of raw sensory data by learning feature extraction and classification in one step, solely from data. Still, many methods proposed in remote sensing research require implicit feature extraction through data preprocessing or explicit design of features. In this work, we compare recent deep learning models for crop type classification on raw and preprocessed Sentinel-2 data. We concentrate on the common neural network architectures for time series, i.e., 1D-convolutions, recurrence, and the novel self-attention architecture. Our central findings are that data preprocessing still increased the overall classification performance for all models, while the choice of model was less crucial. Self-attention and recurrent neural networks, by their architecture, outperformed convolutional neural networks on raw satellite time series. We explore this by a feature importance analysis based on gradient backpropagation that exploits the differentiable nature of deep learning models. Further, we qualitatively show how self-attention scores focus selectively on a few classification-relevant observations.

Original language: English
Pages (from-to): 421-435
Number of pages: 15
Journal: ISPRS Journal of Photogrammetry and Remote Sensing
Volume: 169
State: Published - Nov 2020

Keywords

  • Crop type mapping
  • Deep learning
  • Multitemporal Earth observation
  • Self-attention
  • Time series classification
  • Transformer
  • Vegetation monitoring
