Stackelberg Multi-Agent Reinforcement Learning for Hierarchical Environments

dc.contributor.author: Pereira, Sahil
dc.date.accessioned: 2020-05-14T15:13:32Z
dc.date.available: 2020-05-14T15:13:32Z
dc.date.issued: 2020-05-14
dc.date.submitted: 2020-05-03
dc.description.abstract: This thesis explores the application of multi-agent reinforcement learning in domains containing asymmetries between agents, caused by differences in information and position, that result in a hierarchy of leaders and followers. Leaders are agents that have access to follower agents' policies and can commit to an action before the followers act. The followers observe the actions taken by the leaders and respond to maximize their own payoffs. Since the leaders know the follower policies, they can manipulate the followers to elicit a better payoff for themselves. In this work, we focus on training agents with continuous actions at different levels of a hierarchy to obtain the best payoffs at their given positions. To address this problem, we propose a new algorithm, Stackelberg Multi-Agent Reinforcement Learning (SMARL), which incorporates the Stackelberg equilibrium concept into the multi-agent deep deterministic policy gradient (MADDPG) algorithm, enabling efficient training of agents at all levels of the hierarchy. Since exact maximization over a continuous action space is intractable, we propose a method that solves our Stackelberg formulation for continuous actions using conditional actions and gradient descent. We evaluate our algorithm on multiple mixed cooperative and competitive multi-agent domains, consisting of our custom-built highway driving environment and a subset of the multi-agent particle environments. Agents trained with our proposed algorithm outperform those trained with existing methods in most hierarchical domains and are comparable in the rest.
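The bi-level idea described in the abstract can be made concrete: the leader commits to an action a_L, the follower best-responds with BR(a_L) = argmax over a_F of J_F(a_L, a_F), and the leader optimizes J_L(a_L, BR(a_L)) through that response. Below is a minimal Python sketch of this conditional-action, gradient-descent scheme on a hypothetical quadratic game; the payoff functions and all names are illustrative assumptions, not the thesis's SMARL implementation or its environments.

# Toy continuous Stackelberg game (hypothetical payoffs, not from the thesis):
#   leader payoff   J_L(a_L, a_F) = -(a_L - 1)^2 + 0.5 * a_F
#   follower payoff J_F(a_L, a_F) = -(a_F - a_L)^2
# The follower best-responds to the leader's committed action; the leader,
# knowing this response function, optimizes its own action through it.

def follower_best_response(a_leader, steps=50, lr=0.2):
    """Follower maximizes J_F conditioned on the leader's action
    via gradient ascent (the 'conditional action' idea)."""
    a_f = 0.0
    for _ in range(steps):
        grad = -2.0 * (a_f - a_leader)  # dJ_F / da_F
        a_f += lr * grad
    return a_f

def leader_payoff(a_leader):
    """Leader's payoff evaluated at the follower's best response."""
    a_f = follower_best_response(a_leader)
    return -(a_leader - 1.0) ** 2 + 0.5 * a_f

# Leader optimizes its payoff by numerical gradient ascent,
# differentiating through the follower's (converged) response.
a_l, lr, eps = 0.0, 0.1, 1e-4
for _ in range(200):
    grad = (leader_payoff(a_l + eps) - leader_payoff(a_l - eps)) / (2 * eps)
    a_l += lr * grad

print(f"leader action   ~ {a_l:.3f}")
print(f"follower action ~ {follower_best_response(a_l):.3f}")

For these payoffs the follower's best response is a_F = a_L, so the leader's effective objective is -(a_L - 1)^2 + 0.5 * a_L, maximized at a_L = 1.25; the nested gradient loop recovers this Stackelberg solution, with the follower settling at the same value.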
dc.identifier.uri: http://hdl.handle.net/10012/15851
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.subject: reinforcement learning
dc.subject: multi-agent
dc.subject: Stackelberg model
dc.subject: hierarchical environments
dc.subject: game theory
dc.subject: machine learning
dc.subject: continuous space
dc.subject: policy gradient
dc.subject: Markov games
dc.subject: actor critic
dc.title: Stackelberg Multi-Agent Reinforcement Learning for Hierarchical Environments
dc.type: Master Thesis
uws-etd.degree: Master of Applied Science
uws-etd.degree.department: Electrical and Computer Engineering
uws-etd.degree.discipline: Electrical and Computer Engineering
uws-etd.degree.grantor: University of Waterloo
uws.contributor.advisor: Crowley, Mark
uws.contributor.affiliation1: Faculty of Engineering
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text

Files

Original bundle

Name: Pereira_Sahil.pdf
Size: 37.8 MB
Format: Adobe Portable Document Format
Description: Master's Thesis

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission