Possnig, Clemens
2026-06-10
2026-06-10
2026-03-20
https://hdl.handle.net/10012/23580
This paper develops an analytical framework to study when sophisticated machine learning algorithms may learn to collude. Algorithms observe a state variable and update policies to maximize long-term payoffs; their long-run policies correspond to the stable equilibria of a tractable differential equation. In a repeated Bertrand game, I derive necessary and sufficient conditions under which Nash equilibria are learned. This reveals how the interplay between monitoring technology (state variables) and market conditions determines whether competitive or collusive outcomes emerge. I apply these insights to evaluate two key regulatory policies: limiting algorithmic data inputs and imposing competition in the software provider market.
en
Multi-agent reinforcement learning
Repeated games
Collusion
Learning in games
Monitoring, Market Primitives, and the Stability of Algorithmic Collusion
Preprint