Lohrasbi, Saeedeh
2025-11-27 | 2025-11-27 | 2025-11-27 | 2025-10-31
https://hdl.handle.net/10012/22654

In the context of autonomous driving, reinforcement learning (RL) presents a powerful paradigm: agents capable of learning to drive efficiently in unseen situations through experience. However, this promise is shadowed by a fundamental concern: how can we entrust decision-making to agents that rely on trial-and-error learning in safety-critical environments where errors may carry severe consequences? This thesis advances a step toward resolving this dilemma by integrating three foundational pillars: adversarial robustness, simulation realism, and model-based safety.

We begin with a comprehensive survey of adversarial attacks and corresponding defences within the domains of deep learning (DL) and deep reinforcement learning (DRL) for autonomous vehicles. This survey reveals the porous boundary between safety and security: both natural disturbances and adversarial perturbations can destabilize learned policies. Motivated by this insight, we introduce the Optimism Induction Attack (OIA), a novel adversarial technique that manipulates an RL agent's perception of safety, causing it to act with unwarranted confidence in hazardous situations. Evaluated in the context of an Adaptive Cruise Control (ACC) task, the OIA significantly impairs policy performance, exposing critical vulnerabilities in state-of-the-art RL algorithms.

To counter the demonstrated threats, we present a systematic defence architecture. We develop REVEAL, a high-fidelity simulation framework designed to support the training and evaluation of safe RL agents under realistic vehicle dynamics, traffic scenarios, and adversarial conditions. By narrowing the gap between abstract simulation and real-world complexity, REVEAL facilitates rigorous and nuanced testing, which is essential for safety-critical applications. To enhance learning efficiency within this environment, we employ a transfer learning (TL) strategy: policies initially trained in simplified simulators (e.g., SUMO) are adapted and fine-tuned in REVEAL, leading to faster convergence and improved safety performance during both training and deployment.

Central to our approach is the development of a Multi-Output Control Barrier Function (MO-CBF), which simultaneously supervises throttle and brake commands to enforce safety constraints in real time. Rather than relying on hard overrides, the MO-CBF operates cooperatively with the learning agent, gently adjusting unsafe actions and introducing corresponding penalties during training. This enables the agent not only to learn safe behaviour but also to internalize safety principles and anticipate potentially unsafe scenarios.
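To make the cooperative filtering idea above concrete, the following is a minimal sketch of a control-barrier-function safety filter for a car-following task. It is not the thesis's MO-CBF: the headway-based barrier, the class-K gain, the point-mass dynamics, and the collapse of throttle and brake into one signed acceleration are all simplifying assumptions made for illustration. It only shows the general pattern of minimally adjusting an unsafe RL action and exposing the size of the correction as a training penalty.

```python
# Illustrative sketch only: a distance-based control barrier function (CBF)
# acting as a cooperative safety filter on a longitudinal acceleration command.
# The thesis's MO-CBF supervises throttle and brake as separate outputs; here
# both are folded into one signed acceleration for brevity. All symbols, gains,
# and the point-mass car-following model are assumptions for this example.

from dataclasses import dataclass


@dataclass
class ACCState:
    gap: float      # distance to the lead vehicle [m]
    v_ego: float    # ego speed [m/s]
    v_lead: float   # lead-vehicle speed [m/s]


# Assumed parameters (not taken from the thesis).
D_MIN = 5.0                # minimum standstill gap [m]
T_HEADWAY = 1.5            # desired time headway [s]
ALPHA = 0.5                # class-K gain in the condition h_dot + ALPHA * h >= 0
A_MIN, A_MAX = -6.0, 3.0   # actuator limits [m/s^2] (full brake / full throttle)


def barrier(x: ACCState) -> float:
    """h(x) >= 0 encodes 'gap exceeds the headway-dependent safe distance'."""
    return x.gap - T_HEADWAY * x.v_ego - D_MIN


def safe_accel_bound(x: ACCState) -> float:
    """Largest acceleration satisfying h_dot + ALPHA * h >= 0 under the
    point-mass model gap_dot = v_lead - v_ego, v_ego_dot = a."""
    return ((x.v_lead - x.v_ego) + ALPHA * barrier(x)) / T_HEADWAY


def cbf_filter(a_rl: float, x: ACCState) -> tuple[float, float]:
    """Minimally adjust the RL action; return (safe action, correction penalty).

    The penalty can be added to the training loss so the agent learns to
    propose actions the filter rarely needs to touch.
    """
    a_safe = min(a_rl, safe_accel_bound(x))
    a_safe = max(A_MIN, min(A_MAX, a_safe))
    return a_safe, abs(a_rl - a_safe)


if __name__ == "__main__":
    # Throttle request while closing on a slower lead vehicle: the filter
    # replaces it with moderate braking and reports the correction penalty.
    x = ACCState(gap=50.0, v_ego=25.0, v_lead=15.0)
    a_safe, penalty = cbf_filter(a_rl=2.0, x=x)
    print(f"filtered action: {a_safe:.2f} m/s^2, correction penalty: {penalty:.2f}")
```

In this simplified single-output form the minimal correction is just a clip against the CBF bound; a multi-output version constraining throttle and brake channels jointly would instead solve a small optimization for the least-intrusive joint adjustment.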
Our empirical evaluation demonstrates the effectiveness of the proposed framework across a spectrum of disturbances, adversarial inputs, and realistic high-risk maneuvers. The results consistently show improved safety and robustness, highlighting the framework's capacity to transform RL agents from vulnerable learners into trustworthy autonomous systems.

In summary, this thesis presents a comprehensive methodology for safe and secure RL in autonomous driving. By grounding agent training in high-fidelity simulation, leveraging adversarial awareness, and embedding real-time model-based safety mechanisms, we provide a cohesive and scalable pathway toward deploying RL in the real world with confidence.

Language: en
Keywords: autonomous driving; reinforcement learning; safe reinforcement learning; adversarial attack and defence; control barrier functions; cybersecurity
Title: Safety and Security of Reinforcement Learning for Autonomous Driving
Type: Doctoral Thesis