Obedience-based Multi-Agent Cooperation for Sequential Social Dilemmas

dc.contributor.author: Gupta, Gaurav
dc.date.accessioned: 2020-05-14T15:38:46Z
dc.date.available: 2020-05-14T15:38:46Z
dc.date.issued: 2020-05-14
dc.date.submitted: 2020-05-05
dc.description.abstract: We propose a mechanism for achieving cooperation and communication in Multi-Agent Reinforcement Learning (MARL) settings by intrinsically rewarding agents for obeying the commands of other agents. At every timestep, agents exchange commands through a cheap-talk channel. During the following timestep, agents are rewarded both for taking actions that conform to the commands they received and for giving successful commands. We refer to this approach as obedience-based learning. We demonstrate the potential for obedience-based approaches to enhance coordination and communication in challenging sequential social dilemmas, where traditional MARL approaches often collapse without centralized training or specialized architectures. We also demonstrate the flexibility of this approach with regard to population heterogeneity and vocabulary size. Obedience-based learning stands out as an intuitive form of cooperation with minimal complexity and overhead that can be applied to heterogeneous populations. In contrast, previous works on sequential social dilemmas are often restricted to homogeneous populations and require complete knowledge of every player's reward structure. Obedience-based learning is a promising direction for exploration in the field of cooperative MARL. [en]
dc.identifier.uri: http://hdl.handle.net/10012/15853
dc.language.iso: en [en]
dc.pending: false
dc.publisher: University of Waterloo [en]
dc.relation.uri: https://github.com/gauravg11/sequential_social_dilemma_games [en]
dc.subject: Reinforcement Learning [en]
dc.subject: Cooperation [en]
dc.subject: Multi-Agent Reinforcement Learning [en]
dc.subject: Intrinsic Reward [en]
dc.subject: Cheap-Talk Communication [en]
dc.title: Obedience-based Multi-Agent Cooperation for Sequential Social Dilemmas [en]
dc.type: Master Thesis [en]
uws-etd.degree: Master of Mathematics [en]
uws-etd.degree.department: David R. Cheriton School of Computer Science [en]
uws-etd.degree.discipline: Computer Science [en]
uws-etd.degree.grantor: University of Waterloo [en]
uws.contributor.advisor: Hoey, Jesse
uws.contributor.affiliation1: Faculty of Mathematics [en]
uws.peerReviewStatus: Unreviewed [en]
uws.published.city: Waterloo [en]
uws.published.country: Canada [en]
uws.published.province: Ontario [en]
uws.scholarLevel: Graduate [en]
uws.typeOfResource: Text [en]
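
Illustrative sketch (not taken from the thesis or its repository): the abstract describes rewarding agents both for obeying commands received on the previous timestep and for issuing commands that are obeyed. The Python below shows one way such an intrinsic reward could be computed. Everything here is an assumption made for illustration: the function name obedience_rewards, the representation of a command as a target action, the nested-dictionary layout, and the bonus weights obey_bonus and command_bonus. The actual reward shaping in the thesis may differ.

```python
from typing import Dict, Hashable

Agent = Hashable
Action = int


def obedience_rewards(
    commands: Dict[Agent, Dict[Agent, Action]],
    actions: Dict[Agent, Action],
    env_rewards: Dict[Agent, float],
    obey_bonus: float = 1.0,      # hypothetical weight for obeying a command
    command_bonus: float = 1.0,   # hypothetical weight for a successful command
) -> Dict[Agent, float]:
    """Add obedience-based intrinsic rewards to environment rewards.

    commands[sender][receiver] is the action that `sender` asked `receiver`
    to take, sent over the cheap-talk channel on the previous timestep.
    actions[agent] is the action each agent actually took this timestep.
    """
    total = dict(env_rewards)  # copy so the caller's dict is not mutated
    for sender, issued in commands.items():
        for receiver, commanded_action in issued.items():
            if receiver == sender:
                continue  # an agent does not command itself in this sketch
            if actions[receiver] == commanded_action:
                total[receiver] += obey_bonus    # receiver conformed
                total[sender] += command_bonus   # sender's command succeeded
    return total


# Example: agent "b" obeys "a", while "a" ignores the command from "b".
example = obedience_rewards(
    commands={"a": {"b": 2}, "b": {"a": 0}},
    actions={"a": 1, "b": 2},
    env_rewards={"a": 0.0, "b": 0.5},
)
```

In this example "b" earns the obedience bonus and "a" earns the bonus for a successful command, so both gain intrinsic reward from a single obeyed command, while the ignored command from "b" changes nothing.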

Files

Original bundle

Name: Gupta_Gaurav.pdf
Size: 1.33 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission