Rainbow: Combining Improvements in Deep Reinforcement Learning

Published October 2017 - arXiv

Main concepts

Context

Rainbow DQN (Hessel et al., 2017) is best summarized as multiple improvements on top of the original Nature DQN (Mnih et al., 2015) applied together. Specifically, the Deep Q-Network (DQN) (Mnih et al., 2015) combines the off-policy algorithm Q-Learning with a convolutional neural network as the function approximator to map raw pixels to action-value functions. Since then, multiple improvements have been proposed, such as Double Q-Learning (Van Hasselt et al., 2016), Dueling Network Architectures (Wang et al., 2015), Prioritized Experience Replay (Schaul et al., 2015), and Noisy Networks (Fortunato et al., 2017). Additionally, distributional reinforcement learning (Bellemare et al., 2017) proposed the technique of predicting a distribution over possible value function bins through the C51 algorithm. Rainbow DQN combines all of the above techniques into a single off-policy algorithm for state-of-the-art sample efficiency on Atari benchmarks. Additionally, Rainbow also makes use of multi-step returns.

-- CURL paper
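
To make two of these components concrete, here is a minimal sketch (not the authors' code) of how the bootstrapped target changes when Double Q-Learning and multi-step returns are combined, as in Rainbow: the online network selects the greedy action at the state reached n steps later, while the target network evaluates it. The toy network architecture, hyperparameters, and function names below are illustrative assumptions, not values from the paper.

```python
# Illustrative sketch: double-DQN target with n-step returns (assumed shapes/values).
import torch
import torch.nn as nn

n_actions, gamma, n_step = 4, 0.99, 3

# Toy online/target networks standing in for the convolutional Q-network.
online_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, n_actions))
target_net.load_state_dict(online_net.state_dict())

def double_dqn_n_step_target(n_step_reward, state_n, done):
    """Return r_t^(n) + gamma^n * Q_target(s_{t+n}, argmax_a Q_online(s_{t+n}, a)).

    n_step_reward : (batch,) discounted sum of the next n rewards
    state_n       : (batch, 8) state observed n steps later
    done          : (batch,) 1.0 if the episode terminated within those n steps
    """
    with torch.no_grad():
        # Double Q-Learning: the online network picks the greedy action ...
        best_action = online_net(state_n).argmax(dim=1, keepdim=True)
        # ... and the target network evaluates it, reducing overestimation bias.
        next_q = target_net(state_n).gather(1, best_action).squeeze(1)
    return n_step_reward + (gamma ** n_step) * (1.0 - done) * next_q

# Toy usage on a random batch of 32 transitions.
target = double_dqn_n_step_target(
    torch.rand(32), torch.randn(32, 8), torch.zeros(32))
print(target.shape)  # torch.Size([32])
```

In the full Rainbow agent this scalar target is further replaced by a projected categorical (C51) value distribution, the per-transition loss drives the Prioritized Experience Replay sampling probabilities, the Q-network uses the dueling architecture, and exploration comes from noisy linear layers rather than epsilon-greedy action selection.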


Last update: April 9, 2020