Asking for Help with a Cost in Reinforcement Learning
Date
2020-05-15
Authors
Vandenhof, Colin
Advisor
Law, Edith
Publisher
University of Waterloo
Abstract
Reinforcement learning (RL) is a powerful tool for developing
intelligent agents, and the use of neural networks makes RL techniques more
scalable to challenging real-world applications, from task-oriented dialogue
systems to autonomous driving. However, one of the major bottlenecks to the
adoption of RL is efficiency, as it often takes many time steps to learn an
acceptable policy. To address this problem, we investigate the idea of
allowing the agent to ask for advice from a teacher. We formalize this
concept in a framework called ask-for-help RL, which entails augmenting a
Markov decision process with a teacher-query action that can be taken at a
fixed cost in any state. In this task, the agent must balance three options:
exploration, exploitation, and querying the teacher. To make this trade-off, we
propose an action selection strategy that is rooted in the classical notion
of value-of-information, and suggest a practical implementation that is based
on deep Q-learning. This algorithm, called VOE/Q, can jointly decide between
taking a particular environment action or querying the teacher, and is
sensitive to the query cost. We then perform experiments in two domains: a
maze navigation task and the Atari game Freeway. When the teacher is
excluded, the algorithm shows substantial gains over many other exploration
strategies from the literature. With the teacher included, we again find that
the algorithm outperforms baselines. By taking advantage of the teacher, the
agent can achieve higher cumulative reward than with standard RL alone.
Together, our results point to a promising approach to both RL and
ask-for-help RL.
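
As a rough illustration of the framework described above, the following minimal sketch wraps a small tabular MDP with one extra teacher-query action that is available in every state at a fixed cost. It is not the thesis implementation. The names `TabularMDP`, `AskForHelpMDP`, `teacher`, and `query_cost` are hypothetical, and the sketch assumes that a query leaves the environment state unchanged and returns the teacher's suggested action; the abstract does not specify these details.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, Optional, Tuple

State = Hashable
Action = int


@dataclass
class TabularMDP:
    """Tiny deterministic MDP (hypothetical helper for this sketch).

    transitions[(state, action)] -> (next_state, reward, done)
    """
    transitions: Dict[Tuple[State, Action], Tuple[State, float, bool]]
    num_actions: int
    start: State


class AskForHelpMDP:
    """Adds one teacher-query action, with index `num_actions`, to an MDP.

    Assumptions (not taken from the source): a query costs `query_cost`,
    does not change the state, and returns the teacher's suggested action.
    """

    def __init__(self, mdp: TabularMDP,
                 teacher: Callable[[State], Action],
                 query_cost: float) -> None:
        self.mdp = mdp
        self.teacher = teacher
        self.query_cost = query_cost
        self.query_action = mdp.num_actions  # index of the extra action
        self.state = mdp.start

    def reset(self) -> State:
        self.state = self.mdp.start
        return self.state

    def step(self, action: Action) -> Tuple[State, float, bool, Optional[Action]]:
        """Return (state, reward, done, advice); advice is None unless queried."""
        if action == self.query_action:
            # Pay the fixed cost and receive advice; the state stays the same.
            return self.state, -self.query_cost, False, self.teacher(self.state)
        next_state, reward, done = self.mdp.transitions[(self.state, action)]
        self.state = next_state
        return next_state, reward, done, None


# Three-state chain: action 0 stays put, action 1 moves right; state 2 is the goal.
chain = TabularMDP(
    transitions={
        (0, 0): (0, 0.0, False), (0, 1): (1, 0.0, False),
        (1, 0): (1, 0.0, False), (1, 1): (2, 1.0, True),
    },
    num_actions=2,
    start=0,
)
env = AskForHelpMDP(chain, teacher=lambda s: 1, query_cost=0.1)
env.reset()
state, reward, done, advice = env.step(env.query_action)  # ask the teacher
state, reward, done, _ = env.step(advice)                 # follow the advice
```

Because the query is an ordinary action with a negative reward, an agent that maximizes return on the wrapped process has to decide for itself when advice is worth its cost, which is the trade-off the abstract describes.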
Keywords
reinforcement learning, apprenticeship learning, imitation learning, learning from demonstration, human-in-the-loop, interactive reinforcement learning, deep reinforcement learning, active learning
LC Subject Headings
Reinforcement learning, Active learning