A Survey of the Four Pillars for Small Object Detection: Multiscale Representation, Contextual Information, Super-Resolution, and Region Proposal

Guang Chen, Haitao Wang, Kai Chen, Zhijun Li, Zida Song, Yinlong Liu, Wenkai Chen, Alois Knoll

Publication: Contribution to journal › Article › Peer-reviewed

113 citations (Scopus)

Abstract

Although advanced deep learning techniques have brought great progress in generic object detection, detecting small objects in images remains a challenging problem in computer vision because of their limited size, scarce appearance and geometry cues, and the lack of large-scale datasets of small targets. Improving the performance of small object detection has wide significance for many real-world applications, such as self-driving cars, unmanned aerial vehicles, and robotics. In this article, the first-ever survey of recent studies in deep learning-based small object detection is presented. Our review begins with a brief introduction of the four pillars for small object detection: multiscale representation, contextual information, super-resolution, and region proposal. Then, a collection of state-of-the-art datasets for small object detection is listed, and the performance of different methods on these datasets is reported. Moreover, state-of-the-art small object detection networks are investigated, with a special focus on the differences from, and modifications to, generic object detection architectures that improve detection performance. Finally, several promising directions and tasks for future work in small object detection are provided. Researchers can track up-to-date studies at: https://github.com/tjtum-chenlab/SmallObjectDetectionList.

Original language: English
Pages (from - to): 936-953
Number of pages: 18
Journal: IEEE Transactions on Systems, Man, and Cybernetics: Systems
Volume: 52
Issue number: 2
Publication status: Published - 1 Feb 2022
