TY - JOUR
T1 - Residual Shuffling Convolutional Neural Networks for Deep Semantic Image Segmentation Using Multi-Modal Data
AU - Chen, K.
AU - Weinmann, M.
AU - Gao, X.
AU - Yan, M.
AU - Hinz, S.
AU - Jutzi, B.
AU - Weinmann, M.
N1 - Publisher Copyright:
© 2018 Copernicus GmbH. All rights reserved.
PY - 2018/5/28
Y1 - 2018/5/28
AB - In this paper, we address the deep semantic segmentation of aerial imagery based on multi-modal data. Given multi-modal data composed of true orthophotos and the corresponding Digital Surface Models (DSMs), we extract a variety of hand-crafted radiometric and geometric features which are provided separately and in different combinations as input to a modern deep learning framework. The latter is represented by a Residual Shuffling Convolutional Neural Network (RSCNN) combining the characteristics of a Residual Network with the advantages of atrous convolution and a shuffling operator to achieve a dense semantic labeling. Via performance evaluation on a benchmark dataset, we analyze the value of different feature sets for the semantic segmentation task. The derived results reveal that the use of radiometric features yields better classification results than the use of geometric features for the considered dataset. Furthermore, the consideration of data from both modalities leads to an improvement of the classification results. However, the derived results also indicate that the use of all defined features is less favorable than the use of selected features. Consequently, data representations derived via feature extraction and feature selection techniques still provide a gain if used as the basis for deep semantic segmentation.
KW - Aerial Imagery
KW - CNN
KW - Deep Learning
KW - Multi-Modal Data
KW - Residual Network
KW - Semantic Segmentation
UR - http://www.scopus.com/inward/record.url?scp=85048425189&partnerID=8YFLogxK
DO - 10.5194/isprs-annals-IV-2-65-2018
M3 - Conference article
AN - SCOPUS:85048425189
SN - 2194-9042
VL - 4
SP - 65
EP - 72
JO - ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
JF - ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
IS - 2
T2 - 2018 ISPRS TC II Mid-term Symposium "Towards Photogrammetry 2020"
Y2 - 4 June 2018 through 7 June 2018
ER -