Learning-Free Methods for Goal Conditioned Reinforcement Learning from Images
We are interested in training goal-conditioned reinforcement learning agents to reach arbitrary goals specified as images. To keep the agent fully general, we provide it with only images of the environment and the goal image. Prior methods in goal-conditioned reinforcement learning from images rely on a learned lower-dimensional representation of images. We argue that these learned latent representations are not necessary to solve a variety of goal-conditioned tasks from images. We show that a goal-conditioned reinforcement learning policy can be trained successfully end-to-end from pixels using simple reward functions. In contrast to prior work, we demonstrate that negative raw pixel distance is a strong baseline reward function. We also show that the negative Euclidean distance between feature vectors produced by a random convolutional neural network outperforms learned latent representations such as those produced by convolutional variational autoencoders.
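The two learning-free rewards named in the abstract can be illustrated with a minimal NumPy sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the names `pixel_reward`, `RandomConvEncoder`, and `feature_reward` are hypothetical, and the encoder is a single randomly initialized convolutional layer with ReLU and global average pooling, since the abstract does not describe the actual network architecture.

```python
import numpy as np


def pixel_reward(obs, goal):
    """Negative Euclidean distance between two images of equal shape."""
    # Cast to float first so uint8 subtraction cannot wrap around.
    diff = obs.astype(np.float64) - goal.astype(np.float64)
    return -float(np.linalg.norm(diff.ravel()))


class RandomConvEncoder:
    """Fixed, untrained conv layer + ReLU + global average pooling.

    Hypothetical stand-in for a random convolutional network; the
    weights are sampled once and never updated.
    """

    def __init__(self, in_channels=3, out_channels=8, kernel=3, seed=0):
        rng = np.random.default_rng(seed)
        # Weight layout: (out_channels, kernel, kernel, in_channels)
        self.w = rng.standard_normal(
            (out_channels, kernel, kernel, in_channels)
        )

    def __call__(self, img):
        x = img.astype(np.float64) / 255.0  # assumes H x W x C, values 0..255
        k = self.w.shape[1]
        # Shape: (H-k+1, W-k+1, C, k, k), one window per output position.
        windows = np.lib.stride_tricks.sliding_window_view(
            x, (k, k), axis=(0, 1)
        )
        # Valid (unpadded) convolution, then ReLU.
        fmap = np.maximum(np.einsum("hwcij,oijc->hwo", windows, self.w), 0.0)
        # Global average pooling gives one feature per output channel.
        return fmap.mean(axis=(0, 1))


def feature_reward(encoder, obs, goal):
    """Negative Euclidean distance between random-feature vectors."""
    return -float(np.linalg.norm(encoder(obs) - encoder(goal)))


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    obs = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
    goal = rng.integers(0, 256, size=(16, 16, 3), dtype=np.uint8)
    enc = RandomConvEncoder()
    print("pixel reward:", pixel_reward(obs, goal))
    print("feature reward:", feature_reward(enc, obs, goal))
```

Both rewards reach their maximum of zero when the observation matches the goal image, and neither requires training a representation. In a goal-conditioned setup, a reward of this kind would be computed at each step from the current observation and the goal image.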
Cite this version of the work
Alexander Van de Kleut (2021). Learning-Free Methods for Goal Conditioned Reinforcement Learning from Images. UWSpace. http://hdl.handle.net/10012/16908