TY - JOUR
T1 - Vision Language Models in Autonomous Driving
T2 - A Survey and Outlook
AU - Zhou, Xingcheng
AU - Liu, Mingyu
AU - Yurtsever, Ekim
AU - Zagar, Bare Luka
AU - Zimmer, Walter
AU - Cao, Hu
AU - Knoll, Alois C.
N1 - Publisher Copyright: Authors
PY - 2024
Y1 - 2024
N2 - The applications of Vision-Language Models (VLMs) in Autonomous Driving (AD) have attracted widespread attention due to their outstanding performance and their ability to leverage Large Language Models (LLMs). By integrating language data, driving systems can understand real-world environments more deeply, improving driving safety and efficiency. In this work, we present a comprehensive and systematic survey of the advances in language models in this domain, encompassing perception and understanding, navigation and planning, decision-making and control, end-to-end autonomous driving, and data generation. We introduce the mainstream VLM tasks and the commonly used metrics. Additionally, we review current studies and applications in various areas and thoroughly summarize the existing language-enhanced autonomous driving datasets. Finally, we discuss the benefits and challenges of VLMs in AD and outline current research gaps and future trends for researchers. https://github.com/ge25nab/Awesome-VLM-AD-ITS
AB - The applications of Vision-Language Models (VLMs) in Autonomous Driving (AD) have attracted widespread attention due to their outstanding performance and their ability to leverage Large Language Models (LLMs). By integrating language data, driving systems can understand real-world environments more deeply, improving driving safety and efficiency. In this work, we present a comprehensive and systematic survey of the advances in language models in this domain, encompassing perception and understanding, navigation and planning, decision-making and control, end-to-end autonomous driving, and data generation. We introduce the mainstream VLM tasks and the commonly used metrics. Additionally, we review current studies and applications in various areas and thoroughly summarize the existing language-enhanced autonomous driving datasets. Finally, we discuss the benefits and challenges of VLMs in AD and outline current research gaps and future trends for researchers. https://github.com/ge25nab/Awesome-VLM-AD-ITS
KW - Autonomous Driving
KW - Autonomous vehicles
KW - Computational modeling
KW - Conditional Data Generation
KW - Data models
KW - Decision Making
KW - End-to-End Autonomous Driving
KW - Intelligent Vehicle
KW - Language-guided Navigation
KW - Large Language Model
KW - Planning
KW - Surveys
KW - Task analysis
KW - Vision Language Model
KW - Visualization
UR - http://www.scopus.com/inward/record.url?scp=85193496822&partnerID=8YFLogxK
U2 - 10.1109/TIV.2024.3402136
DO - 10.1109/TIV.2024.3402136
M3 - Article
AN - SCOPUS:85193496822
SN - 2379-8858
SP - 1
EP - 20
JO - IEEE Transactions on Intelligent Vehicles
JF - IEEE Transactions on Intelligent Vehicles
ER -