The Bounds of Algorithmic Collusion: Q-learning, Gradient Learning, and the Folk Theorem
| dc.contributor.author | Askenazi-Golan, Galit | |
| dc.contributor.author | Mergoni Cecchelli, Domenico | |
| dc.contributor.author | Plumb, Edward | |
| dc.contributor.author | Possnig, Clemens | |
| dc.date.accessioned | 2026-06-10T16:18:20Z | |
| dc.date.available | 2026-06-10T16:18:20Z | |
| dc.date.issued | 2026-03-03 | |
| dc.description.abstract | We explore the behaviour that emerges when learning agents interact strategically and repeatedly, for a wide range of learning dynamics including Q-learning, projected gradient, replicator, and log-barrier dynamics. Going beyond the better-understood classes of potential games and zero-sum games, we consider general repeated games with finite recall under different forms of monitoring. We obtain a Folk Theorem-style result and characterise the set of payoff vectors that these dynamics can attain, revealing a wide range of possibilities for the emergence of algorithmic collusion. Achieving this requires a novel technical approach which, to the best of our knowledge, yields the first convergence result for multi-agent Q-learning algorithms in repeated games. | |
| dc.identifier.uri | https://hdl.handle.net/10012/23583 | |
| dc.language.iso | en | |
| dc.publisher | London School of Economics, University of Waterloo | |
| dc.title | The Bounds of Algorithmic Collusion: Q-learning, Gradient Learning, and the Folk Theorem | |
| dc.type | Preprint | |
| uws.contributor.affiliation1 | Faculty of Arts | |
| uws.contributor.affiliation2 | Economics | |
| uws.peerReviewStatus | Unreviewed | |
| uws.scholarLevel | Faculty | |
| uws.typeOfResource | Text | en |