Semantic-Aware Active Perception for Next-Best-View Grasp Planning

Abstract

Robotic grasping is a cornerstone of manufacturing automation, and recent advances in deep learning have brought data-driven adaptability to vision-based grasping. However, achieving human-like performance in cluttered environments requires additional capabilities, such as correctly perceiving the object to be retrieved and efficiently planning viewpoints to reconstruct the target object for better grasping under heavy occlusion. To address these challenges, we propose a semantic-aware Next-Best-View (NBV) planning framework that integrates geometric and semantic information gains for targeted exploration. The proposed method maintains a semantic–geometric voxel representation that incrementally accumulates semantic detections across views, guiding viewpoint selection toward regions most likely to reveal graspable target surfaces. We evaluate the framework in simulation and real-world experiments using a Franka Emika Panda arm under heavy occlusion. The proposed approach achieves an 84% success rate in simulation and 10/10 successful grasps in real-world experiments, outperforming baselines in simulation while matching the real-world performance of a geometric NBV method despite requiring no prior knowledge of object locations.
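The abstract does not give implementation details, but the core idea of scoring candidate viewpoints by a combination of geometric and semantic information gain can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the names (SemanticVoxelGrid, view_utility, next_best_view), the weights w_geom and w_sem, the running-average semantic fusion rule, and the precomputed visibility sets are all hypothetical stand-ins, not the paper's actual method.

```python
import numpy as np

# Assumed occupancy convention: 0 = unknown, 1 = free, 2 = occupied.
UNKNOWN, FREE, OCCUPIED = 0, 1, 2

class SemanticVoxelGrid:
    """Toy semantic-geometric voxel map: an occupancy state plus an
    incrementally fused target-class probability per voxel."""

    def __init__(self, shape=(40, 40, 40)):
        self.state = np.full(shape, UNKNOWN, dtype=np.uint8)
        # P(voxel belongs to the target object), fused across views.
        self.target_prob = np.zeros(shape, dtype=np.float32)

    def fuse_detection(self, voxel_idx, detection_conf):
        """Accumulate a semantic detection with a simple running average
        (a hypothetical stand-in for the paper's accumulation rule)."""
        p = self.target_prob[voxel_idx]
        self.target_prob[voxel_idx] = 0.5 * p + 0.5 * detection_conf

def view_utility(grid, visible_idx, w_geom=1.0, w_sem=2.0):
    """Score a candidate view by the voxels it would observe:
    geometric gain rewards unknown voxels (new surface coverage),
    semantic gain rewards voxels likely belonging to the target."""
    states = grid.state[visible_idx]
    geom_gain = np.count_nonzero(states == UNKNOWN)
    sem_gain = grid.target_prob[visible_idx].sum()
    return w_geom * geom_gain + w_sem * sem_gain

def next_best_view(grid, candidates):
    """Pick the candidate view with the highest combined utility.
    `candidates` maps a view pose to the voxel indices it can see;
    in a real system visibility would come from ray casting."""
    return max(candidates, key=lambda pose: view_utility(grid, candidates[pose]))

# Example: a view covering a voxel with an accumulated target detection
# outscores one covering only unknown space, biasing exploration toward
# regions likely to reveal the target.
grid = SemanticVoxelGrid()
grid.fuse_detection((np.array([5]), np.array([5]), np.array([5])), 0.9)
view_a = (np.array([5, 6]), np.array([5, 6]), np.array([5, 6]))
view_b = (np.array([20, 21]), np.array([20, 21]), np.array([20, 21]))
best = next_best_view(grid, {"view_a": view_a, "view_b": view_b})
print(best)  # -> "view_a"
```

With equal geometric gain, the semantic term breaks the tie in favor of the view overlapping accumulated target detections, which matches the abstract's description of guiding viewpoint selection toward likely target surfaces.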
