Semantic-Aware Active Perception for Next-Best-View Grasp Planning

Abstract

Robotic grasping is a cornerstone of manufacturing automation, and recent advances in deep learning have brought data-driven adaptability to vision-based grasping. However, achieving human-like performance in cluttered environments requires additional capabilities, such as correctly perceiving the object to be retrieved and efficiently planning viewpoints to reconstruct the target object for better grasping under heavy occlusion. To address these challenges, we propose a semantic-aware Next-Best-View (NBV) planning framework that integrates geometric and semantic information gains for targeted exploration. The proposed method maintains a semantic–geometric voxel representation that incrementally accumulates semantic detections across views, guiding viewpoint selection toward regions most likely to reveal graspable target surfaces. We evaluate the framework in simulation and real-world experiments using a Franka Emika Panda arm under heavy occlusion. The proposed approach achieves an 84% success rate in simulation and 10/10 successful grasps in real-world experiments, outperforming baselines in simulation while matching the real-world performance of a geometric NBV method despite requiring no prior knowledge of object locations.
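The abstract does not give implementation details, but the core idea of scoring candidate viewpoints by a combination of geometric and semantic information gain can be illustrated with a minimal Python sketch. Everything here is an assumption for illustration: the names (SemanticVoxelGrid, view_utility, next_best_view), the weights w_geom and w_sem, the running-average semantic fusion rule, and the precomputed visibility sets are all hypothetical stand-ins, not the paper's actual method.

```python
import numpy as np

# Assumed occupancy convention: 0 = unknown, 1 = free, 2 = occupied.
UNKNOWN, FREE, OCCUPIED = 0, 1, 2

class SemanticVoxelGrid:
    """Toy semantic-geometric voxel map: an occupancy state plus an
    incrementally fused target-class probability per voxel."""

    def __init__(self, shape=(40, 40, 40)):
        self.state = np.full(shape, UNKNOWN, dtype=np.uint8)
        # P(voxel belongs to the target object), fused across views.
        self.target_prob = np.zeros(shape, dtype=np.float32)

    def fuse_detection(self, voxel_idx, detection_conf):
        """Accumulate a semantic detection with a simple running average
        (a hypothetical stand-in for the paper's accumulation rule)."""
        p = self.target_prob[voxel_idx]
        self.target_prob[voxel_idx] = 0.5 * p + 0.5 * detection_conf

def view_utility(grid, visible_idx, w_geom=1.0, w_sem=2.0):
    """Score a candidate view by the voxels it would observe:
    geometric gain rewards unknown voxels (new surface coverage),
    semantic gain rewards voxels likely belonging to the target."""
    states = grid.state[visible_idx]
    geom_gain = np.count_nonzero(states == UNKNOWN)
    sem_gain = grid.target_prob[visible_idx].sum()
    return w_geom * geom_gain + w_sem * sem_gain

def next_best_view(grid, candidates):
    """Pick the candidate view with the highest combined utility.
    `candidates` maps a view pose to the voxel indices it can see;
    in a real system visibility would come from ray casting."""
    return max(candidates, key=lambda pose: view_utility(grid, candidates[pose]))

# Example: a view covering a voxel with an accumulated target detection
# outscores one covering only unknown space, biasing exploration toward
# regions likely to reveal the target.
grid = SemanticVoxelGrid()
grid.fuse_detection((np.array([5]), np.array([5]), np.array([5])), 0.9)
view_a = (np.array([5, 6]), np.array([5, 6]), np.array([5, 6]))
view_b = (np.array([20, 21]), np.array([20, 21]), np.array([20, 21]))
best = next_best_view(grid, {"view_a": view_a, "view_b": view_b})
print(best)  # -> "view_a"
```

With equal geometric gain, the semantic term breaks the tie in favor of the view overlapping accumulated target detections, which matches the abstract's description of guiding viewpoint selection toward likely target surfaces.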
