Transfer Learning from Simulated to Real Scenes for Monocular 3D Object Detection

  • Sondos Mohamed
  • Walter Zimmer
  • Ross Greer
  • Ahmed Alaaeldin Ghita
  • Modesto Castrillón-Santana
  • Mohan Trivedi
  • Alois Knoll
  • Salvatore Mario Carta
  • Mirko Marras
  • University of Cagliari
  • Technical University of Munich
  • University of California
  • SETLabs Research GmbH
  • Las Palmas de Gran Canaria University
  • University of California, San Diego

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Accurately detecting 3D objects from monocular images in dynamic roadside scenarios remains a challenging problem due to varying camera perspectives and unpredictable scene conditions. This paper introduces a two-stage training strategy to address these challenges. Our approach initially trains a model on the large-scale synthetic dataset, RoadSense3D, which offers a diverse range of scenarios for robust feature learning. Subsequently, we fine-tune the model on a combination of real-world datasets to enhance its adaptability to practical conditions. Experimental results of the Cube R-CNN model on challenging public benchmarks show a remarkable improvement in detection performance, with a mean average precision rising from 0.26 to 12.76 on the TUM Traffic A9 Highway dataset and from 2.09 to 6.60 on the DAIR-V2X-I dataset, when performing transfer learning. Code, data, and qualitative video results are available at https://roadsense3d.github.io.
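
The two-stage idea in the abstract (pre-train on plentiful synthetic data, then fine-tune on scarcer real data) can be illustrated with a toy sketch. This is not the authors' Cube R-CNN pipeline: the 1-D linear model, the invented "synthetic" and "real" data, the learning rate, and every function name below are assumptions made purely for illustration.

```python
import random


def train(weights, data, lr, epochs):
    """Plain SGD on a 1-D linear model y = w*x + b; returns the updated (w, b)."""
    w, b = weights
    for _ in range(epochs):
        for x, y in data:
            err = (w * x + b) - y
            w -= lr * err * x
            b -= lr * err
    return w, b


def mse(weights, data):
    """Mean squared error of the linear model on a dataset."""
    w, b = weights
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)


def sample(rng, n, slope, intercept):
    """Draw n noise-free points from y = slope*x + intercept with x in [-1, 1]."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, slope * x + intercept) for x in xs]


rng = random.Random(0)

# Hypothetical stand-ins: a large "synthetic" set and a small, slightly
# shifted "real" set (the shift mimics a simulation-to-real domain gap).
synthetic = sample(rng, 500, slope=2.0, intercept=1.0)
real_train = sample(rng, 20, slope=2.2, intercept=0.8)
real_test = sample(rng, 200, slope=2.2, intercept=0.8)

# Stage 1: pre-train on synthetic data only.
pretrained = train((0.0, 0.0), synthetic, lr=0.05, epochs=5)

# Stage 2: fine-tune the pre-trained weights on the small real set.
finetuned = train(pretrained, real_train, lr=0.05, epochs=5)

# Baseline: same real-data budget, but starting from scratch.
scratch = train((0.0, 0.0), real_train, lr=0.05, epochs=5)

print("synthetic only :", mse(pretrained, real_test))
print("real only      :", mse(scratch, real_test))
print("two-stage      :", mse(finetuned, real_test))
```

In this toy setting the fine-tuned model starts close to the real-data solution, so the same small training budget leaves it with a lower test error than either the synthetic-only or the from-scratch model. The actual gains reported in the abstract come from the authors' experiments, not from this sketch.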

Original language: English
Title of host publication: Computer Vision – ECCV 2024 Workshops, Proceedings
Editors: Alessio Del Bue, Cristian Canton, Jordi Pont-Tuset, Tatiana Tommasi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 309-325
Number of pages: 17
ISBN (Print): 9783031918124
DOIs
State: Published - 2025
Event: Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024 - Milan, Italy
Duration: 29 Sep 2024 to 4 Oct 2024

Publication series

Name: Lecture Notes in Computer Science
Volume: 15630 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: Workshops that were held in conjunction with the 18th European Conference on Computer Vision, ECCV 2024
Country/Territory: Italy
City: Milan
Period: 29/09/24 to 4/10/24

Keywords

  • Intelligent Transportation Systems
  • Intelligent Vehicles
  • Monocular 3D Object Detection
  • Synthetic Data
  • Transfer Learning
