On Deep Learning for Nonprehensile Manipulation

Caro, Steven

dc.contributor.author	Caro, Steven
dc.date.accessioned	2026-06-23T14:08:09Z
dc.date.available	2026-06-23T14:08:09Z
dc.date.issued	2026-06-23
dc.date.submitted	2026-05-29
dc.description.abstract	Nonprehensile manipulation, i.e., interaction without grasping, is a fundamental capability for mobile robots operating in unstructured environments, yet it remains a challenging control problem due to complex contact dynamics and underactuated physics. Progress in this domain has been further hindered by the lack of standardized evaluation frameworks, leading to fragmented research efforts. This thesis addresses these gaps through two primary contributions: a unified benchmarking suite and a novel hierarchical control architecture. First, we introduce Bench-Push, a comprehensive benchmark designed specifically for pushing-based mobile robot tasks. Unlike existing benchmarks that penalize environment interaction, Bench-Push provides diverse environments, ranging from navigation-centric mazes to manipulation-centric delivery tasks, and introduces novel metrics to quantify the trade-off between task efficiency and interaction effort. We validate the framework by evaluating state-of-the-art baselines and demonstrating successful zero-shot transfer to a physical robot. Second, we propose the Hierarchical Reinforcement Learning - Diffusion Policy (HeRD), a hybrid architecture designed to solve long-horizon manipulation tasks. We identify that while Reinforcement Learning (RL) excels at strategic decision-making, it struggles to learn precise low-level contact dynamics. Conversely, generative diffusion models synthesize smooth, context-aware trajectories but lack high-level planning capabilities. HeRD bridges this gap by decoupling the control hierarchy: a high-level RL planner, using a Spatial Action Map action space, selects strategic subgoals, which are then executed by a low-level, goal-conditioned diffusion policy. Extensive experiments in the Box-Delivery task demonstrate that HeRD significantly outperforms both state-of-the-art learning-based methods (SAM) and classical motion planners (Greedy Heuristic, Hierarchical RRT*). HeRD achieves higher success rates and greater interaction efficiency in both simulation and real-world deployments. Furthermore, we demonstrate that HeRD is capable of robust zero-shot generalization to unseen, unstructured clutter, successfully navigating complex environments where classical planners experience catastrophic failure.
dc.identifier.uri	https://hdl.handle.net/10012/23657
dc.language.iso	en
dc.pending	false
dc.publisher	University of Waterloo	en
dc.subject	mobile manipulation
dc.subject	reinforcement learning
dc.subject	learning from demonstration
dc.title	On Deep Learning for Nonprehensile Manipulation
dc.type	Master Thesis
uws-etd.degree	Master of Applied Science
uws-etd.degree.department	Electrical and Computer Engineering
uws-etd.degree.discipline	Electrical and Computer Engineering
uws-etd.degree.grantor	University of Waterloo	en
uws-etd.embargo.terms	0
uws.contributor.advisor	Smith, Stephen
uws.contributor.affiliation1	Faculty of Engineering
uws.peerReviewStatus	Unreviewed	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.scholarLevel	Graduate	en
uws.typeOfResource	Text	en

Files

Original bundle

Name: Caro_Steven.pdf
Size: 5.97 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission

Collections

Theses
Electrical and Computer Engineering