The Bounds of Algorithmic Collusion: Q-learning, Gradient Learning, and the Folk Theorem

Publisher

London School of Economics, University of Waterloo

Abstract

We study the behaviour that emerges when learning agents interact strategically and repeatedly, under a wide range of learning dynamics including Q-learning, projected gradient, replicator and log-barrier dynamics. Going beyond the better-understood classes of potential games and zero-sum games, we consider general repeated games with finite recall under different forms of monitoring. We obtain a Folk Theorem-style result that characterises the set of payoff vectors these dynamics can attain, revealing a wide range of possibilities for the emergence of algorithmic collusion. Achieving this requires a novel technical approach which, to the best of our knowledge, yields the first convergence result for multi-agent Q-learning algorithms in repeated games.
