Vehicle Instance Segmentation from Aerial Image and Video Using a Multitask Learning Residual Fully Convolutional Network

Lichao Mou, Xiao Xiang Zhu

Research output: Contribution to journalArticlepeer-review

168 Scopus citations

Abstract

Object detection and semantic segmentation are two main themes in object retrieval from high-resolution remote sensing images, which have recently achieved remarkable performance by surfing the wave of deep learning and, more notably, convolutional neural networks. In this paper, we are interested in a novel, more challenging problem of vehicle instance segmentation, which entails identifying, at a pixel level, where the vehicles appear as well as associating each pixel with a physical instance of a vehicle. In contrast, vehicle detection and semantic segmentation each only concern one of the two. We propose to tackle this problem with a semantic boundary-aware multitask learning network. More specifically, we utilize the philosophy of residual learning to construct a fully convolutional network that is capable of harnessing multilevel contextual feature representations learned from different residual blocks. We theoretically analyze and discuss why residual networks can produce better probability maps for pixelwise segmentation tasks. Then, based on this network architecture, we propose a unified multitask learning network that can simultaneously learn two complementary tasks, namely, segmenting vehicle regions and detecting semantic boundaries. The latter subproblem is helpful for differentiating 'touching' vehicles that are usually not correctly separated into instances. Currently, data sets with a pixelwise annotation for vehicle extraction are the ISPRS data set and the IEEE GRSS DFC2015 data set over Zeebrugge, which specializes in a semantic segmentation. Therefore, we built a new, more challenging data set for vehicle instance segmentation, called the Busy Parking Lot Unmanned Aerial Vehicle Video data set, and we make our data set available at http://www.sipeo.bgu.tum.de/downloads so that it can be used to benchmark future vehicle instance segmentation algorithms.

Original languageEnglish
Article number8408520
Pages (from-to)6699-6711
Number of pages13
JournalIEEE Transactions on Geoscience and Remote Sensing
Volume56
Issue number11
DOIs
StatePublished - Nov 2018

Keywords

  • Boundary-aware multitask learning network
  • Fully convolutional network (FCN)
  • High-resolution remote sensing image/video
  • Instance semantic segmentation
  • Residual neural network (ResNet)
  • Vehicle detection

Fingerprint

Dive into the research topics of 'Vehicle Instance Segmentation from Aerial Image and Video Using a Multitask Learning Residual Fully Convolutional Network'. Together they form a unique fingerprint.

Cite this