A Unified Multi-Frame Strategy for Autonomous Vehicle Perception and Localization Using Radar, Camera, LiDAR, and HD Map Fusion

Date

2024-11-26

Advisor

Khajepour, Amir
Shaker, George

Publisher

University of Waterloo

Abstract

This thesis presents a novel unified perception-localization module by developing an advanced late-fusion system that integrates radar, LiDAR, camera, and High Definition (HD) map information. Emphasizing reliability, robustness to inclement weather, and real-time implementation, the approach focuses particularly on radar data, the most crucial component in the fusion process. Although radar data is rich in features and robust to weather changes, it often contains false alarms and clutter, which, if left unaddressed, compromise the fusion process. Therefore, the radar point cloud is first refined to remove these false alarms and clutter. Unlike existing approaches in the literature that require massive neural networks, this thesis presents a stochastic alternative that provides the same level of accuracy while maintaining real-time performance and minimal resource usage. Using statistical information from the HD map, the system builds a likelihood of object occurrence in the Frenet coordinate system. This likelihood is then fused with a specific set of radar features to formulate a probabilistic classifier. The classifier detects clutter and false alarms and removes them, preventing the radar point cloud from misleading the downstream fusion process. This approach significantly reduces noise and irrelevant data, achieving a precision of up to 94 percent on our dataset, and ensures that the radar focuses primarily on potential obstacles and critical elements such as cars and pedestrians. A video of the results is available at https://www.youtube.com/watch?v=cNb_OR19BQk.

In the subsequent phase, the refined radar data is fused with LiDAR and camera data using a novel, robust frustum approach in a late-fusion manner. Unlike previous works that rely solely on perspective equations, our method constructs a cost function and employs the Hungarian matching algorithm to associate camera data with the 3D positions of objects. This late-fusion methodology significantly enhances overall detection accuracy and reliability by effectively integrating data from multiple sensors. The resulting objects are then fed into a novel tracking module that uses the radar's radial velocity measurements as partial observations and associates objects across frames using the Shortest Simple Path algorithm and radar features. For each object, the tracking module builds an Extended Kalman Filter that provides a full-state estimate of its motion, including position, velocity, and acceleration. Videos demonstrating the performance of the perception (https://www.youtube.com/watch?v=LEIqpTD-pwQ) and of the tracking (https://www.youtube.com/watch?v=LNXKZDAgO40) are available.
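As an illustration of the clutter classifier described above, the following minimal Python sketch combines an HD-map occurrence likelihood (evaluated at a radar point's Frenet coordinates) with independent radar-feature likelihoods in a naive-Bayes-style posterior. The function names, the feature models, and the 0.5 threshold are illustrative assumptions, not the thesis implementation.

import numpy as np

def object_posterior(log_map_obj, log_map_clutter, log_feat_obj, log_feat_clutter):
    # Posterior probability that a radar return is a real object rather than
    # clutter, assuming the HD-map likelihood and the radar-feature
    # likelihoods are conditionally independent given the class.
    log_obj = log_map_obj + np.sum(log_feat_obj)
    log_clu = log_map_clutter + np.sum(log_feat_clutter)
    return 1.0 / (1.0 + np.exp(log_clu - log_obj))

def keep_point(log_map_obj, log_map_clutter, log_feat_obj, log_feat_clutter, threshold=0.5):
    # Points whose posterior falls below the threshold are treated as clutter
    # and removed before the downstream fusion step.
    return object_posterior(log_map_obj, log_map_clutter, log_feat_obj, log_feat_clutter) >= threshold

The camera-to-3D association step can likewise be sketched as a cost matrix solved with the Hungarian algorithm (SciPy's linear_sum_assignment). The pinhole projection, the intrinsic matrix K, and the pixel gate below are assumptions made for the example and stand in for the thesis's frustum-based cost function.

import numpy as np
from scipy.optimize import linear_sum_assignment

def project_to_image(points_cam, K):
    # Project Nx3 points, already expressed in the camera frame (z > 0),
    # onto the image plane with a pinhole model.
    uvw = (K @ points_cam.T).T           # Nx3 homogeneous pixel coordinates
    return uvw[:, :2] / uvw[:, 2:3]      # Nx2 pixel coordinates

def associate(det_centers_px, obj_centers_cam, K, gate_px=50.0):
    # Cost matrix: pixel distance between every camera detection and every
    # projected 3D object centre; the Hungarian algorithm finds the
    # minimum-cost one-to-one assignment.
    proj = project_to_image(obj_centers_cam, K)
    cost = np.linalg.norm(det_centers_px[:, None, :] - proj[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)
    # Keep only matches whose reprojection error passes a simple pixel gate.
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < gate_px]

Finally, the radar radial velocity enters each per-object Extended Kalman Filter as a partial observation of the velocity state. A minimal sketch of the measurement model and its Jacobian is given below, assuming a planar constant-acceleration state [px, py, vx, vy, ax, ay]; the state layout is an assumption for the example only.

import numpy as np

def radial_velocity(x):
    # h(x): component of the object's velocity along the radar line of sight.
    p, v = x[:2], x[2:4]
    return (p @ v) / np.linalg.norm(p)

def radial_velocity_jacobian(x):
    # 1x6 Jacobian of h(x) with respect to [px, py, vx, vy, ax, ay],
    # used as the measurement matrix H in the EKF update.
    p, v = x[:2], x[2:4]
    r = np.linalg.norm(p)
    dh_dp = v / r - (p @ v) * p / r**3
    dh_dv = p / r
    return np.concatenate([dh_dp, dh_dv, np.zeros(2)])[None, :]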
In the localization phase, the previously processed perception information is used to improve positioning accuracy without relying on the Global Positioning System (GPS), yielding a unified module that substantially reduces resource consumption. Moreover, methods relying on GPS and inertial sensors such as the Inertial Measurement Unit (IMU) are susceptible to weather and road conditions, which can compromise the reliability of autonomous driving. In contrast, a robust perception unit can overcome this challenge by using radar, a sensor resilient to inclement weather.

In this approach, the fused radar-LiDAR point cloud is used to form a local map across multiple frames, and the vehicle's odometry is estimated with the Iterative Closest Point (ICP) algorithm. Then, using camera-based lane detection, the lateral distances to the centerline and to the curb are appended to the odometry measurement. Finally, the vehicle's motion model is rewritten in the Frenet frame, and the measurement models from the radar-LiDAR point cloud and the camera lane detection are incorporated in an Extended Kalman Filter (EKF) formulation. This approach outperforms state-of-the-art positioning solutions in both accuracy and resource consumption. A video of the performance with localization in the loop is available at https://drive.google.com/file/d/1WdNrda2y7kJk67JrCC8o5mcbgH78zvoh/view?usp=share_link.
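To make the formulation concrete, here is a minimal sketch of such a filter under strong simplifying assumptions: planar motion, a reduced state [s, d, theta] in the Frenet frame of the HD-map centerline, additive process noise, and a scalar lateral measurement. The thesis uses a richer motion model and additional measurements; here the ICP-derived odometry increment drives the prediction and the camera-based lateral offset drives the update.

import numpy as np

class FrenetEKF:
    def __init__(self):
        self.x = np.zeros(3)            # [s (along-track), d (lateral), theta (heading error)]
        self.P = np.eye(3)

    def predict(self, ds, dd, dtheta, Q):
        # Propagate with an odometry increment estimated by ICP on the
        # local radar-LiDAR map (assumed already expressed in Frenet terms).
        self.x += np.array([ds, dd, dtheta])
        self.P += Q                      # additive process noise for this simple model

    def update_lateral(self, d_meas, R=0.05):
        # Fuse the camera lane detection: measured lateral offset to the centerline.
        H = np.array([[0.0, 1.0, 0.0]])  # the measurement observes only the lateral state d
        y = d_meas - H @ self.x
        S = H @ self.P @ H.T + R
        K = self.P @ H.T / S
        self.x = self.x + (K * y).ravel()
        self.P = (np.eye(3) - K @ H) @ self.P

# Example usage with illustrative numbers:
ekf = FrenetEKF()
ekf.predict(ds=1.2, dd=0.0, dtheta=0.01, Q=np.diag([0.02, 0.02, 0.001]))
ekf.update_lateral(d_meas=0.35)

Because the lateral offset is measured directly against the HD-map centerline, this update bounds the lateral drift that pure ICP odometry would otherwise accumulate, which is the intuition behind combining the two measurement sources in one filter.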

Description

Keywords

radar, perception, localization

LC Subject Headings

Citation