RGB-D Scene Flow via Grouping Rigid Motions
Robotics and artificial intelligence have seen dramatic advances in technology and algorithms over the last decade. Computer vision algorithms play a crucial role in enabling robots and machines to understand their environment. A fundamental cue for understanding an environment is the motion within the scene, otherwise known as scene flow. Scene flow is the 3D velocity of each imaged point captured by a camera. The 3D information of the scene can be acquired by RGB-D cameras, which produce both colour and depth images and have proven useful for solving many computer vision tasks. Scene flow has numerous applications, including motion segmentation, 3D mapping, robotic navigation and obstacle avoidance, and gesture recognition. Most state-of-the-art RGB-D scene flow methods are set in a variational framework and formulated as an energy minimization problem. While these methods provide high accuracy, they are computationally expensive and not robust to larger motions in the scene. The main contribution of this research is a method for efficiently estimating approximate RGB-D scene flow. A new approach to scene flow estimation is introduced, based on matching 3D points from one frame to the next in a hierarchical fashion. The key observation is that most scene motions in everyday life consist of rigid motions, so large parts of the scene follow the same motion. The new method takes advantage of this fact by grouping the 3D data in each frame according to like motions, using concepts from spectral clustering. A simple coarse-to-fine voxelization scheme provides fast estimates of motion and accommodates larger motions. This approach is much more tractable than existing methods and does not depend on the convergence of an objective function in an optimization framework.
Because the scene is assumed to be composed of rigidly moving parts, non-rigid motions are not accurately estimated, and hence the method yields only an approximate scene flow estimate. Still, quickly determining approximate motions in a scene is tremendously useful for any computer vision task that benefits from motion cues. Evaluation is performed on a custom RGB-D dataset because the RGB-D scene flow datasets presented to date are mostly based on qualitative evaluation. The dataset consists of real scenes that demonstrate realistic scene flow. Experimental results show that the presented method provides reliable scene flow estimates with significantly faster runtime and handles larger motions better than current methods.
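The idea of grouping matched 3D points by like motion can be illustrated with a small sketch. It is not the method of this work, whose affinity measure and clustering procedure are not specified here. It assumes a common spectral technique: candidate matches are scored pairwise by how well the distance between two points is preserved across frames (a necessary condition for sharing a rigid motion), and the leading eigenvector of the resulting affinity matrix picks out the dominant rigid group. The function names, the Gaussian affinity, and the parameters `sigma` and `ratio` are all hypothetical choices made for illustration.

```python
import math

def pairwise_consistency(src, dst, sigma=0.05):
    """Affinity between candidate matches i and j (assumed Gaussian form).

    The value is close to 1 when the distance between points i and j is the
    same in both frames, as it must be if both follow one rigid motion.
    """
    n = len(src)
    M = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d_src = math.dist(src[i], src[j])
            d_dst = math.dist(dst[i], dst[j])
            M[i][j] = M[j][i] = math.exp(-((d_src - d_dst) ** 2) / (2 * sigma ** 2))
    return M

def leading_eigenvector(M, iters=100):
    """Power iteration for the dominant eigenvector of a symmetric matrix."""
    n = len(M)
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def dominant_rigid_group(src, dst, sigma=0.05, ratio=0.5):
    """Indices of matches that belong to the largest motion-consistent group."""
    v = leading_eigenvector(pairwise_consistency(src, dst, sigma))
    top = max(v)
    return [i for i, x in enumerate(v) if x >= ratio * top]

# Toy data: five points share a translation of 0.1 along x; two points move
# in unrelated ways and play the role of other motions or bad matches.
src = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0), (1.0, 1.0, 1.0),
       (0.5, 0.5, 2.0), (2.0, 2.0, 1.0), (3.0, 0.0, 2.0)]
dst = [(x + 0.1, y, z) for (x, y, z) in src[:5]] + [(2.5, 2.7, 1.3), (2.2, 0.4, 2.5)]

print(dominant_rigid_group(src, dst))
```

On this toy input the five translated points form the selected group and the two unrelated points are excluded. Removing the selected matches and repeating would, under the same assumptions, recover further rigid groups one at a time.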