Multi-sensor Fusion for Positioning and Semantic Information Extraction in Indoor Environments
This thesis addresses three significant challenges in multi-sensor data fusion for real-world applications: (1) the lack of multi-sensor datasets for mapping and positioning in underground environments; (2) the inefficiency and error-proneness of most existing multi-sensor calibration methods; and (3) the lack of effective methods for semantic segmentation of 3D point clouds in indoor environments, an important direction for last-mile autonomous driving and indoor positioning. The thesis proposes three methods to address these problems. (1) A modular mobile robot platform for collecting multi-sensor datasets in underground environments: for various indoor and underground scenarios (e.g., office buildings and underground parking lots), different mobile robots, such as mobile LiDAR backpacks, robotic dogs, and wheeled multi-sensor carts, are used to collect rich 2D-3D data, and accurate trajectories are obtained with optimized algorithms. (2) Target-free automatic multi-sensor calibration: a target-free self-calibration method is proposed for LiDAR-camera pairs. The method uses the trajectory constraint between the LiDAR and the camera to optimize the calibration parameters, and a graph optimization algorithm corrects camera trajectory drift, yielding fully automatic calibration. (3) A novel 3D SLAM algorithm tailored to indoor point cloud scenes, consisting of a front-end odometry (ranging) method and a back-end loop-closure detection and optimization algorithm based on sub-maps. On this basis, a front- and back-end 3D point cloud segmentation framework is proposed that uses the PointNet algorithm to build high-precision semantic point cloud models; the fused 3D semantic map is then used to extract environmental semantic information for last-mile autonomous driving. The above methods are validated on the collected datasets.
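The trajectory constraint between the LiDAR and the camera can be illustrated as a classical hand-eye calibration problem: paired relative motions from LiDAR odometry (A) and camera odometry (B) are related to the unknown extrinsic X by A·X = X·B. The sketch below is only an assumption about the structure of such a method, not the thesis implementation; all function names are illustrative, and the thesis additionally applies graph optimization to correct camera trajectory drift, which is omitted here.

```python
import numpy as np

def so3_log(R):
    """Axis-angle vector of a rotation matrix (angle assumed in (0, pi))."""
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return (angle / (2.0 * np.sin(angle))) * w

def so3_exp(v):
    """Rotation matrix from an axis-angle vector (Rodrigues' formula)."""
    angle = np.linalg.norm(v)
    if angle < 1e-12:
        return np.eye(3)
    k = v / angle
    K = np.array([[0.0, -k[2], k[1]], [k[2], 0.0, -k[0]], [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def calibrate_extrinsics(lidar_motions, camera_motions):
    """Solve A_i X = X B_i for the extrinsic X = (R_x, t_x).

    Each motion is an (R, t) pair; A_i comes from LiDAR odometry and
    B_i from camera odometry over the same time interval.
    """
    # Rotation: axis(A_i) = R_x axis(B_i); solve as an orthogonal
    # Procrustes problem via SVD (Park & Martin style).
    M = np.zeros((3, 3))
    for (Ra, _), (Rb, _) in zip(lidar_motions, camera_motions):
        M += np.outer(so3_log(Ra), so3_log(Rb))
    U, _, Vt = np.linalg.svd(M)
    Rx = U @ np.diag([1.0, 1.0, np.linalg.det(U @ Vt)]) @ Vt
    # Translation: (R_a - I) t_x = R_x t_b - t_a, stacked over all pairs.
    A = np.vstack([Ra - np.eye(3) for Ra, _ in lidar_motions])
    b = np.concatenate([Rx @ tb - ta
                        for (_, ta), (_, tb) in zip(lidar_motions, camera_motions)])
    tx, *_ = np.linalg.lstsq(A, b, rcond=None)
    return Rx, tx
```

Recovering both rotation and translation requires motion pairs with at least two non-parallel rotation axes, which is why calibration trajectories typically include turns about different directions.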
The results show that: (1) compared with traditional manual calibration methods, the automatic LiDAR-camera calibration method reduces the rotation RMSE from a maximum of 6.637° to 0.564°, and the translation RMSE from a maximum of 0.197 m; (2) point cloud quality analysis and evaluation of the optimized SLAM algorithm show that the front-end trajectory and attitude estimation (odometry) algorithm, combined with back-end loop-closure optimization, provides centimeter-level accuracy for localization and map construction; and (3) indoor 3D point cloud segmentation experiments demonstrate that the PSIF method outperforms the alternatives and improves the training efficiency of PointNet, with an average semantic modeling accuracy across categories above 84.8%.
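The rotation RMSE figures above are presumably computed from the geodesic angle between estimated and ground-truth rotations; a minimal sketch of that standard metric follows (function names are illustrative, not taken from the thesis):

```python
import numpy as np

def rotation_error_deg(R_est, R_gt):
    """Geodesic angle between two rotation matrices, in degrees."""
    cos_angle = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

def rmse(errors):
    """Root-mean-square error over a sequence of scalar errors."""
    errors = np.asarray(errors, dtype=float)
    return float(np.sqrt(np.mean(errors ** 2)))
```

Translation error is handled analogously with the Euclidean norm of the difference between estimated and ground-truth translation vectors, so its RMSE carries units of meters.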
Cite this version of the work
Dedong Zhang (2023). Multi-sensor Fusion for Positioning and Semantic Information Extraction in Indoor Environments. UWSpace. http://hdl.handle.net/10012/19512