Rainbow: Combining Improvements in Deep Reinforcement Learning
Published October 2017 - arXiv
Main concepts
Context
At this point,
Links
Rainbow DQN (Hessel et al., 2017) is best summarized as multiple improvements on top of the original Nature DQN (Mnih et al., 2015) applied together. Specifically, Deep Q-Network (DQN) (Mnih et al., 2015) combines the off-policy algorithm Q-learning with a convolutional neural network as the function approximator to map raw pixels to action-value functions. Since then, multiple improvements have been proposed, such as Double Q-Learning (Van Hasselt et al., 2016), Dueling Network Architectures (Wang et al., 2015), Prioritized Experience Replay (Schaul et al., 2015), and Noisy Networks (Fortunato et al., 2017). Additionally, distributional reinforcement learning (Bellemare et al., 2017) proposed the technique of predicting a distribution over possible value function bins through the C51 algorithm. Rainbow DQN combines all of the above techniques into a single off-policy algorithm for state-of-the-art sample efficiency on Atari benchmarks. Additionally, Rainbow also made use of multi-step returns.
-- CURL paper
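As a concrete illustration of how two of these ingredients compose, below is a minimal PyTorch sketch (not code from the paper) of a dueling network head that outputs a C51-style categorical value distribution per action. The feature size, number of actions, number of atoms, and support range are illustrative assumptions, and the class and variable names are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DuelingDistributionalHead(nn.Module):
    """Illustrative head combining the dueling and C51 (distributional) ideas."""

    def __init__(self, in_features=512, n_actions=6, n_atoms=51,
                 v_min=-10.0, v_max=10.0):
        super().__init__()
        self.n_actions, self.n_atoms = n_actions, n_atoms
        # Fixed support of the categorical value distribution (the C51 "atoms").
        self.register_buffer("support", torch.linspace(v_min, v_max, n_atoms))
        # Dueling streams: state value V(s) and advantages A(s, a),
        # each producing logits over the n_atoms value bins.
        self.value = nn.Linear(in_features, n_atoms)
        self.advantage = nn.Linear(in_features, n_actions * n_atoms)

    def forward(self, features):
        b = features.size(0)
        v = self.value(features).view(b, 1, self.n_atoms)
        a = self.advantage(features).view(b, self.n_actions, self.n_atoms)
        # Dueling combination; the mean advantage is subtracted for identifiability.
        logits = v + a - a.mean(dim=1, keepdim=True)
        # Per-action probability distribution over the value bins.
        probs = F.softmax(logits, dim=2)
        # Expected Q-values, e.g. for (double) Q-learning action selection.
        q = (probs * self.support).sum(dim=2)
        return probs, q

# Usage with dummy convolutional features for a batch of 4 states:
head = DuelingDistributionalHead()
probs, q = head(torch.randn(4, 512))
print(probs.shape, q.shape)  # torch.Size([4, 6, 51]) torch.Size([4, 6])
```

In the full algorithm, the expected Q-values from such a head would drive double Q-learning action selection, while prioritized replay, noisy linear layers, and multi-step targets supply the remaining Rainbow components.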