A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents

Published in Advances in Neural Information Processing Systems (NeurIPS), 2018

This paper studies techniques for efficient policy detection and reuse when playing against non-stationary agents in Markov games. We propose a new deep BPR+ algorithm that extends the recent BPR+ algorithm with a neural network as the value-function approximator.

Recommended citation: Yan Zheng, Zhaopeng Meng, Jianye Hao, Zongzhang Zhang, Tianpei Yang, and Changjie Fan (2018). “A Deep Bayesian Policy Reuse Approach Against Non-Stationary Agents.” Advances in Neural Information Processing Systems. 962–972.