TY - GEN
T1 - Continuous Control of Autonomous Vehicles using Plan-assisted Deep Reinforcement Learning
AU - Dwivedi, Tanay
AU - Betz, Tobias
AU - Sauerbeck, Florian
AU - Manivannan, P. V.
AU - Lienkamp, Markus
N1 - Publisher Copyright:
© 2022 ICROS.
PY - 2022
Y1 - 2022
N2 - End-to-end deep reinforcement learning (DRL) is emerging as a promising paradigm for autonomous driving. Although DRL provides an elegant framework for accomplishing final goals without extensive manual engineering, capturing plans and behavior with deep neural networks remains an open problem. As a result, end-to-end architectures are currently limited to simple driving scenarios and often perform sub-optimally when rare, unique conditions are encountered. We propose a novel plan-assisted deep reinforcement learning framework that, along with the typical state-space, leverages a 'trajectory-space' to learn optimal control. While the trajectory-space, generated by an external planner, intrinsically captures the agent's high-level plans, world models are used to understand the dynamics of the environment for learning behavior in latent space. An actor-critic network, trained in imagination, uses these latent features to predict the policy and the state-value function. Based primarily on DreamerV2 and Racing Dreamer, the proposed model is first trained in a simulator and subsequently tested on the F1TENTH race car. We evaluate our model for best lap times against parameter-tuned and learning-based controllers on unseen race tracks and demonstrate that it generalizes to complex scenarios where other approaches perform sub-optimally. Furthermore, we show the model's enhanced stability as a trajectory tracker and establish the improvement in interpretability achieved by the proposed framework.
AB - End-to-end deep reinforcement learning (DRL) is emerging as a promising paradigm for autonomous driving. Although DRL provides an elegant framework for accomplishing final goals without extensive manual engineering, capturing plans and behavior with deep neural networks remains an open problem. As a result, end-to-end architectures are currently limited to simple driving scenarios and often perform sub-optimally when rare, unique conditions are encountered. We propose a novel plan-assisted deep reinforcement learning framework that, along with the typical state-space, leverages a 'trajectory-space' to learn optimal control. While the trajectory-space, generated by an external planner, intrinsically captures the agent's high-level plans, world models are used to understand the dynamics of the environment for learning behavior in latent space. An actor-critic network, trained in imagination, uses these latent features to predict the policy and the state-value function. Based primarily on DreamerV2 and Racing Dreamer, the proposed model is first trained in a simulator and subsequently tested on the F1TENTH race car. We evaluate our model for best lap times against parameter-tuned and learning-based controllers on unseen race tracks and demonstrate that it generalizes to complex scenarios where other approaches perform sub-optimally. Furthermore, we show the model's enhanced stability as a trajectory tracker and establish the improvement in interpretability achieved by the proposed framework.
KW - Autonomous driving
KW - artificial intelligence
KW - deep reinforcement learning
KW - intelligent vehicles
KW - world models
UR - http://www.scopus.com/inward/record.url?scp=85146590718&partnerID=8YFLogxK
U2 - 10.23919/ICCAS55662.2022.10003698
DO - 10.23919/ICCAS55662.2022.10003698
M3 - Conference contribution
AN - SCOPUS:85146590718
T3 - International Conference on Control, Automation and Systems
SP - 244
EP - 250
BT - 2022 22nd International Conference on Control, Automation and Systems, ICCAS 2022
PB - IEEE Computer Society
T2 - 22nd International Conference on Control, Automation and Systems, ICCAS 2022
Y2 - 27 November 2022 through 1 December 2022
ER -