TY - GEN
T1 - Camera-LiDAR Inconsistency Analysis for Active Learning in Object Detection
AU - Rivera, Esteban
AU - Serra Do Nascimento, Ana Clara
AU - Lienkamp, Markus
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Today, deep learning detectors for autonomous driving are delivering impressive results on public datasets and in real-world applications. However, these detectors require large amounts of data, especially labeled data, to achieve the performance needed to ensure safe driving. The process of collecting and tagging data is expensive and cumbersome. Therefore, the recent focus of the industry has been on how to achieve similar performance while limiting the amount of labeled data required to train such models. Within the cross-modal active learning paradigm, we propose and analyze new strategies to exploit the inconsistencies between camera and LiDAR detectors to improve sampling efficiency and label only the samples that promise improvements for model training. For this, we leverage the 2D projection of the bounding boxes to equalize the output quality of camera and LiDAR detections. Finally, we achieve up to 0.6% AP improvement for camera and 2% improvement for LiDAR over random sampling on the KITTI dataset using a sampling strategy based on the number of detected objects.
KW - active learning
KW - autonomous driving
KW - data efficiency
KW - multimodality
KW - object detection
UR - http://www.scopus.com/inward/record.url?scp=85199757291&partnerID=8YFLogxK
U2 - 10.1109/IV55156.2024.10588859
DO - 10.1109/IV55156.2024.10588859
M3 - Conference contribution
AN - SCOPUS:85199757291
T3 - IEEE Intelligent Vehicles Symposium, Proceedings
SP - 97
EP - 103
BT - 35th IEEE Intelligent Vehicles Symposium, IV 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 35th IEEE Intelligent Vehicles Symposium, IV 2024
Y2 - 2 June 2024 through 5 June 2024
ER -