Generative Models for Planning and Decision-Making

dc.contributor.author: Karthikeyan, Akash
dc.date.accessioned: 2025-08-14T20:04:33Z
dc.date.available: 2025-08-14T20:04:33Z
dc.date.issued: 2025-08-14
dc.date.submitted: 2025-08-08
dc.description.abstract: Generative models have achieved remarkable progress across domains such as vision and language, yet their application to sequential decision-making and planning remains challenging. In reinforcement learning and robotics, agents must handle task hierarchies and long-horizon dependencies, adapt to harder unseen tasks and environments, and, especially in multi-agent settings, respond to adversarial or evolving opponents. Despite progress in behavioral cloning and offline policy learning, existing approaches often struggle to generalize beyond the training distribution or to learn robust, interactive behaviors in competitive games. These limitations restrict current systems to narrow tasks with short temporal horizons or to deterministic settings. For instance, behavioral planners trained on single-goal environments struggle to scale to multi-task missions requiring subgoal discovery and adaptive reasoning, as there is no straightforward mechanism for iterative test-time adaptation to unseen tasks. Similarly, in multi-agent reinforcement learning, standard policy optimization often yields unimodal, brittle strategies that overfit to specific opponents and fail to converge to a Nash equilibrium in continuous state-action games. This thesis explores challenges and opportunities in using generative models for planning and decision-making, focusing on energy-based and diffusion-based models that serve as both representations and solvers for planning and policy learning. In the single-agent setting, we introduce GenPlan, a discrete-flow planner that reframes planning as iterative denoising over trajectories using an energy-guided diffusion process. This formulation enables task and goal discovery, as well as adaptation to unseen environments. In the multi-agent setting, we propose DiffFP, a diffusion policy gradient method within the fictitious play framework. By approximating best responses through diffusion models, DiffFP captures multimodal strategies, improves sample efficiency, and remains robust to evolving opponents in dynamic, continuous state-action games. Our empirical studies show that GenPlan outperforms baselines by over 10% on adaptive planning tasks, generalizing from single-task demonstrations to complex, compositional multi-task missions. Likewise, DiffFP achieves up to 3× faster convergence and 30× higher success rates than baseline reinforcement learning algorithms on multi-agent benchmarks. These results demonstrate the potential of generative modeling not only for representation learning, but as a unified substrate for planning, learning, and decision-making across settings.
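The abstract's view of planning as iterative denoising over trajectories under an energy guide can be illustrated with a small, self-contained sketch. This is not the thesis implementation (see the linked GenPlan repository for that): the quadratic goal-plus-smoothness energy, the annealed noise schedule, and the numerical gradient are all illustrative assumptions chosen to keep the example runnable.

```python
import numpy as np

def energy(traj, goal):
    # Toy energy (assumption, not the thesis objective): penalize the final
    # state's distance to the goal, plus a smoothness term on the trajectory.
    return np.sum((traj[-1] - goal) ** 2) + 0.1 * np.sum(np.diff(traj, axis=0) ** 2)

def grad_energy(traj, goal, eps=1e-4):
    # Forward-difference numerical gradient of the energy w.r.t. every
    # trajectory coordinate (fine for this tiny example).
    base = energy(traj, goal)
    g = np.zeros_like(traj)
    for idx in np.ndindex(traj.shape):
        t = traj.copy()
        t[idx] += eps
        g[idx] = (energy(t, goal) - base) / eps
    return g

def denoise_plan(goal, horizon=16, dim=2, steps=300, lr=0.1, seed=0):
    # Energy-guided iterative denoising: start from pure noise and refine the
    # whole trajectory with gradient steps toward lower energy, injecting
    # noise that anneals to zero (a Langevin-style schedule).
    rng = np.random.default_rng(seed)
    traj = rng.normal(size=(horizon, dim))
    for k in range(steps):
        noise_scale = 0.1 * (1.0 - k / steps)
        traj = traj - lr * grad_energy(traj, goal) + noise_scale * rng.normal(size=traj.shape)
    return traj

goal = np.array([1.0, 1.0])
plan = denoise_plan(goal)  # (16, 2) trajectory whose endpoint approaches the goal
```

The design point the sketch makes is the one the abstract emphasizes: the plan is not rolled out step by step but refined as a whole, so swapping the energy (e.g. to encode a different subgoal) changes the plan at test time without retraining.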
dc.identifier.uri: https://hdl.handle.net/10012/22172
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.relation.uri: https://github.com/CL2-UWaterloo/GenPlan/
dc.relation.uri: https://github.com/CL2-UWaterloo/DiffFP/
dc.subject: generative models
dc.subject: reinforcement learning
dc.subject: diffusion model
dc.title: Generative Models for Planning and Decision-Making
dc.type: Master Thesis
uws-etd.degree: Master of Applied Science
uws-etd.degree.department: Electrical and Computer Engineering
uws-etd.degree.discipline: Electrical and Computer Engineering
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 1 year
uws.comment.hidden: Full Name: Akash Karthikeyan; Student ID: 21104705
uws.contributor.advisor: Pant, Yash
uws.contributor.affiliation1: Faculty of Engineering
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text

Files

Original bundle

Name: Karthikeyan_Akash.pdf
Size: 5.02 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission