Stackelberg Multi-Agent Reinforcement Learning for Hierarchical Environments

Pereira, Sahil

dc.contributor.advisor	Crowley, Mark
dc.contributor.author	Pereira, Sahil
dc.date.accessioned	2020-05-14T15:13:32Z
dc.date.available	2020-05-14T15:13:32Z
dc.date.issued	2020-05-14
dc.date.submitted	2020-05-03
dc.description.abstract	This thesis explores the application of multi-agent reinforcement learning in domains containing asymmetries between agents, caused by differences in information and position, resulting in a hierarchy of leaders and followers. Leaders are agents that have access to follower agent policies and the ability to commit to an action before the followers. These followers can observe actions taken by leaders and respond to maximize their own payoffs. Since leaders know the follower policies, they can manipulate the followers to elicit a better payoff for themselves. In this work, we focus on the problem of training agents in a multi-agent setting with continuous actions at different levels of hierarchy to obtain the best payoffs at their given positions. To address this problem, we propose a new algorithm, Stackelberg Multi-Agent Reinforcement Learning (SMARL), that incorporates the Stackelberg equilibrium concept into the multi-agent deep deterministic policy gradient (MADDPG) algorithm. This enables us to efficiently train agents at all levels in the hierarchy. Since maximization over a continuous action space is intractable, we propose a method to solve our Stackelberg formulation for continuous actions using conditional actions and gradient descent. We evaluate our algorithm on multiple mixed cooperative and competitive multi-agent domains, consisting of our custom-built highway driving environment and a subset of the multi-agent particle environments. We show that agents trained using our proposed algorithm outperform those trained with existing methods in most hierarchical domains, and are comparable in the rest.	en
dc.identifier.uri	http://hdl.handle.net/10012/15851
dc.language.iso	en	en
dc.pending	false
dc.publisher	University of Waterloo	en
dc.subject	reinforcement learning	en
dc.subject	multi-agent	en
dc.subject	stackelberg model	en
dc.subject	hierarchical environments	en
dc.subject	game theory	en
dc.subject	machine learning	en
dc.subject	continuous space	en
dc.subject	policy gradient	en
dc.subject	markov games	en
dc.subject	actor critic	en
dc.title	Stackelberg Multi-Agent Reinforcement Learning for Hierarchical Environments	en
dc.type	Master Thesis	en
uws-etd.degree	Master of Applied Science	en
uws-etd.degree.department	Electrical and Computer Engineering	en
uws-etd.degree.discipline	Electrical and Computer Engineering	en
uws-etd.degree.grantor	University of Waterloo	en
uws.contributor.advisor	Crowley, Mark
uws.contributor.affiliation1	Faculty of Engineering	en
uws.peerReviewStatus	Unreviewed	en
uws.published.city	Waterloo	en
uws.published.country	Canada	en
uws.published.province	Ontario	en
uws.scholarLevel	Graduate	en
uws.typeOfResource	Text	en

Files

Original bundle

Name: Pereira_Sahil.pdf
Size: 37.8 MB
Format: Adobe Portable Document Format
Description: Masters Thesis

Collections

Theses
Electrical and Computer Engineering