Vision-Inertial SLAM using Natural Features in Outdoor Environments
Simultaneous Localization and Mapping (SLAM) is a recursive probabilistic inferencing process used for robot navigation when Global Positioning Systems (GPS) are unavailable. SLAM operates by building a map of the robot environment, while concurrently localizing the robot within this map. The ultimate goal of SLAM is to operate anywhere using the environment's natural features as landmarks. Such a goal is difficult to achieve for several reasons. Firstly, different environments contain different types of natural features, each exhibiting large variance in its shape and appearance. Secondly, objects look differently from different viewpoints and it is therefore difficult to always recognize them. Thirdly, in most outdoor environments it is not possible to predict the motion of a vehicle using wheel encoders because of errors caused by slippage. Finally, the design of a SLAM system to operate in a large-scale outdoor setting is in itself a challenge. <br /><br /> The above issues are addressed as follows. Firstly, a camera is used to recognize the environmental context (e. g. , indoor office, outdoor park) by analyzing the holistic spectral content of images of the robot's surroundings. A type of feature (e. g. , trees for a park) is then chosen for SLAM that is likely observable in the recognized setting. A novel tree detection system is introduced, which is based on perceptually organizing the content of images into quasi-vertical structures and marking those structures that intersect ground level as tree trunks. Secondly, a new tree recognition system is proposed, which is based on extracting Scale Invariant Feature Transform (SIFT) features on each tree trunk region and matching trees in feature space. Thirdly, dead-reckoning is performed via an Inertial Navigation System (INS), bounded by non-holonomic constraints. INS are insensitive to slippage and varying ground conditions. Finally, the developed Computer Vision and Inertial systems are integrated within the framework of an Extended Kalman Filter into a working Vision-INS SLAM system, named VisSLAM. <br /><br /> VisSLAM is tested on data collected during a real test run in an outdoor unstructured environment. Three test scenarios are proposed, ranging from semi-automatic detection, recognition, and initialization to a fully automated SLAM system. The first two scenarios are used to verify the presented inertial and Computer Vision algorithms in the context of localization, where results indicate accurate vehicle pose estimation for the majority of its journey. The final scenario evaluates the application of the proposed systems for SLAM, where results indicate successful operation for a long portion of the vehicle journey. Although the scope of this thesis is to operate in an outdoor park setting using tree trunks as landmarks, the developed techniques lend themselves to other environments using different natural objects as landmarks.