Linearizing Contextual Multi-Armed Bandit Problems with Latent Dynamics

Nelson, Elliot

Files

Nelson_Elliot.pdf (1.34 MB)

Date

2022-02-10

Authors

Nelson, Elliot

Advisor

Poupart, Pascal

Publisher

University of Waterloo

Abstract

In many real-world applications of multi-armed bandit problems, both rewards and observed contexts are influenced by confounding latent variables that evolve stochastically over time. Although the observed contexts and rewards are nonlinearly related, prior knowledge of the latent graphical structure can be leveraged to reduce the problem to the linear bandit setting. We develop a linearized latent Thompson sampling algorithm (L2TS), which exploits prior knowledge of the dependence of observed contexts on the hidden state to build a least-squares estimator of the latent transition matrix, and uses the resulting approximate posterior beliefs over the latent space as context features in a linear bandit problem. We upper bound the error in the reward parameter estimates of our method, demonstrating the role of the latent dynamics and the evolution of posterior beliefs. We also demonstrate through experiments that our approach outperforms related bandit algorithms. Lastly, we derive a theoretical bound that demonstrates the influence of the latent dynamics and the information-theoretic structure of the problem on Bayesian inference over the latent space. Overall, our approach uses prior knowledge to reduce a complex decision-making problem to a simpler one to which existing solutions and methods can be applied.
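
The linearization idea in the abstract can be illustrated with a minimal sketch. This is not the thesis's L2TS algorithm. It assumes a discrete latent state, a known transition matrix (the thesis instead estimates it by least squares), and a mean reward that depends only on the arm and the latent state. It also omits the Thompson sampling step. All names and numbers below are hypothetical. The point is only that the expected reward, although nonlinear in the raw context, is linear in the posterior belief vector.

```python
def belief_update(belief, transition, emission_lik):
    """One forward-filter step of a discrete hidden Markov model.

    belief[i]        : current posterior P(z_t = i | history)
    transition[i][j] : P(z_{t+1} = j | z_t = i)  (assumed known in this sketch)
    emission_lik[j]  : likelihood of the new observed context given z_{t+1} = j
    """
    n = len(belief)
    # Predict: push the belief through the latent dynamics.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Correct: reweight by how well each latent state explains the new context.
    unnorm = [predicted[j] * emission_lik[j] for j in range(n)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def expected_reward(belief, theta_arm):
    """Expected reward of one arm, linear in the belief vector."""
    return sum(b * th for b, th in zip(belief, theta_arm))


# Hypothetical two-state example; every number here is made up.
transition = [[0.9, 0.1], [0.2, 0.8]]
belief = belief_update([0.5, 0.5], transition, emission_lik=[0.7, 0.1])

# Assumed mean reward of each arm in each latent state.
theta = {"arm_0": [1.0, 0.0], "arm_1": [0.0, 1.0]}
best_arm = max(theta, key=lambda a: expected_reward(belief, theta[a]))
```

In a bandit setting the reward parameters `theta` would be unknown and learned online, with the belief vector playing the role of the context features in a standard linear bandit method.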

Keywords

multi-armed bandits, sequential decision making, Bayesian inference, latent variable models, hidden Markov models, linear regression, Thompson sampling, exploration

URI

http://hdl.handle.net/10012/18068

Collections

Theses
Computer Science
