Transfer Learning from Simulated to Real Scenes for Monocular 3D Object Detection

Mohamed, Sondos; Zimmer, Walter; Greer, Ross; Ghita, Ahmed Alaaeldin; Castrillón-Santana, Modesto Fernando; Trivedi, Mohan Manubhai; Knoll, Alois; Carta, Salvatore Mario; Marras, Mirko

doi:10.1007/978-3-031-91813-1_20

Published September 29, 2024 | Version v1

Conference paper | Open access

1. University of Cagliari
2. Technical University of Munich
3. University of California, Merced
4. SETLabs Research GmbH, Munich, Germany
5. Universidad de Las Palmas de Gran Canaria
6. University of California, San Diego

Accurately detecting 3D objects from monocular images in dynamic roadside scenarios remains challenging because camera perspectives vary and scene conditions are unpredictable. This paper introduces a two-stage training strategy to address these challenges. The model is first trained on RoadSense3D, a large-scale synthetic dataset that offers a diverse range of scenarios for robust feature learning. It is then fine-tuned on a combination of real-world datasets to improve its adaptability to practical conditions. Experiments with the Cube R-CNN model on challenging public benchmarks show that transfer learning raises mean average precision from 0.26 to 12.76 on the TUM Traffic A9 Highway dataset and from 2.09 to 6.60 on the DAIR-V2X-I dataset. Code, data, and qualitative video results are available at https://roadsense3d.github.io.
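The two-stage idea described in the abstract (pre-train on synthetic data, then fine-tune on real data) can be illustrated with a deliberately tiny sketch. This is not the authors' pipeline: it assumes a toy linear regression in place of Cube R-CNN, and the `synthetic` and `real` sets, their slopes and intercepts, the learning rates, and the epoch counts are all hypothetical values chosen only to mimic a large source domain and a small, slightly shifted target domain.

```python
from typing import List, Tuple

Sample = Tuple[float, float]   # (input, target)
Params = Tuple[float, float]   # (weight, bias)


def make_data(n: int, slope: float, intercept: float) -> List[Sample]:
    """Noise-free points on a line, with inputs evenly spaced in [0, 1]."""
    return [(i / (n - 1), slope * i / (n - 1) + intercept) for i in range(n)]


def mse(params: Params, data: List[Sample]) -> float:
    """Mean squared error of the linear model w * x + b on the given data."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(params: Params, data: List[Sample], lr: float, epochs: int) -> Params:
    """Full-batch gradient descent on the mean squared error."""
    w, b = params
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return w, b


# Hypothetical stand-ins: a large "synthetic" set and a small "real" set
# drawn from a slightly different line (a toy domain gap).
synthetic = make_data(200, slope=2.0, intercept=1.0)
real = make_data(10, slope=2.3, intercept=0.7)

# Stage 1: pre-train from zero-initialized parameters on the synthetic set.
pretrained = train((0.0, 0.0), synthetic, lr=0.5, epochs=500)

# Stage 2: fine-tune the pre-trained parameters on the real set.
finetuned = train(pretrained, real, lr=0.1, epochs=50)

# Baseline: the same stage-2 budget on the real set, starting from zeros.
scratch = train((0.0, 0.0), real, lr=0.1, epochs=50)

for name, params in [
    ("pretrained only", pretrained),
    ("fine-tuned", finetuned),
    ("real only", scratch),
]:
    print(f"{name:>16}: MSE on real set = {mse(params, real):.4f}")
```

In this toy setting, and with the error measured on the same small real set used for fine-tuning, the fine-tuned parameters end up with a lower error than both the pre-trained-only parameters and the parameters trained on the real set alone under the same budget. The sketch says nothing about the size of the gains reported for the actual detector.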

Files

Name	Size
Abstract.pdf (md5:ba9af8508363c73e74f01d709b813871)	120.6 kB

