A Graph-Based Soft Actor Critic Approach in Multi-Agent Reinforcement Learning
[Abstract] Multi-Agent Reinforcement Learning (MARL) is widely used to solve various real-world problems. In MARL, the environment contains multiple agents, and a good grasp of the environment can guide agents toward cooperative strategies. In Centralized Training with Decentralized Execution (CTDE), a centralized critic is used to guide the learning of cooperative strategies. However, the presence of multiple agents leads to the curse of dimensionality and to interference from other agents' strategies, making it difficult for the centralized critic to learn good cooperative strategies. We propose a graph-based approach to overcome these problems. It uses a graph neural network that takes agents' partial observations as input and aggregates information between agents through graph operations to extract information about the whole environment. In this way, agents improve their understanding of the overall state of the environment and of the other agents while avoiding the dimensional explosion. We then combine a dual-critic dynamic decomposition method with soft actor-critic to train the policy: the former learns from both individual and global rewards, avoiding the influence of other agents' strategies, while the latter helps the agents learn a better policy. We call this approach Multi-Agent Graph-based soft Actor-Critic (MAGAC). We compare the proposed method with several classical MARL algorithms in the Multi-agent Particle Environment (MPE). The experimental results show that our method achieves faster learning while learning better policies.
[Subject Classification] Automation Engineering
[Keywords] Multi-Agent Systems; deep reinforcement learning (DRL); graph convolutional neural network
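As an illustration of the graph-based aggregation described in the abstract, the following is a minimal sketch (not the paper's actual implementation) of one graph-convolution step that combines each agent's partial observation with its neighbors' embeddings, producing per-agent features that a centralized critic could consume. All class names, dimensions, and the adjacency scheme are assumptions for illustration only.

```python
# Hypothetical sketch of graph-based aggregation of agents' partial observations.
import torch
import torch.nn as nn


class ObservationAggregator(nn.Module):
    def __init__(self, obs_dim: int, hidden_dim: int):
        super().__init__()
        self.encode = nn.Linear(obs_dim, hidden_dim)       # per-agent observation encoder
        self.mix = nn.Linear(2 * hidden_dim, hidden_dim)    # combine self + neighbor info

    def forward(self, obs: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # obs: (n_agents, obs_dim), adj: (n_agents, n_agents) row-normalized adjacency
        h = torch.relu(self.encode(obs))                     # (n_agents, hidden_dim)
        neighbor = adj @ h                                   # aggregate neighbors' embeddings
        return torch.relu(self.mix(torch.cat([h, neighbor], dim=-1)))


# Usage: 3 agents with 8-dimensional partial observations on a fully connected agent graph.
n_agents, obs_dim = 3, 8
agg = ObservationAggregator(obs_dim, hidden_dim=16)
obs = torch.randn(n_agents, obs_dim)
adj = torch.ones(n_agents, n_agents) / n_agents              # uniform row-normalized adjacency
embeddings = agg(obs, adj)                                   # (3, 16), input for a centralized critic
print(embeddings.shape)
```

In this sketch the aggregated embeddings would feed the centralized critic during training, while each agent's policy still acts on its own observation at execution time, in keeping with the CTDE setting.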