Title
Learning to communicate using counterfactual reasoning
Author
Abstract
Learning to communicate in order to share state information is an active problem in multi-agent reinforcement learning (MARL). The credit assignment problem, the non-stationarity of the communication environment, and the need for agents to be influenced by incoming messages are major challenges in this research field that must be overcome to learn a valid communication protocol. This paper introduces multi-agent counterfactual communication learning (MACC), a novel method that adapts counterfactual reasoning to overcome the credit assignment problem for communicating agents. The non-stationarity of the communication environment while learning the communication Q-function is then overcome by constructing the communication Q-function from the action policies of the other agents and the Q-function of the action environment. As this exact construction of the communication Q-function can be computationally intensive for a large number of agents, two approximation methods are proposed. Additionally, a social loss function is introduced to create influenceable agents, which is required to learn a valid communication protocol. Our experiments show that MACC outperforms state-of-the-art baselines in four different scenarios in the Particle environment. Finally, we demonstrate the scalability of MACC in a matrix environment.
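The counterfactual credit assignment the abstract refers to can be illustrated with a minimal sketch: an agent's chosen action is scored against a counterfactual baseline that marginalises its own action out of the joint Q-function while the other agents' actions stay fixed. The function name and the concrete numbers below are illustrative, not taken from the paper.

```python
# Minimal sketch of counterfactual-baseline credit assignment (the idea MACC
# adapts for communicating agents). All names and values are illustrative.

def counterfactual_advantage(q_values, policy, action):
    """Advantage of `action` over a policy-weighted counterfactual baseline.

    q_values: Q(s, u) for each alternative action u of one agent, with the
              other agents' actions held fixed.
    policy:   pi(u | s), this agent's probability for each action u.
    action:   index of the action actually taken.
    """
    # Baseline: expected Q if the agent had sampled from its own policy,
    # everything else in the joint action unchanged.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    return q_values[action] - baseline

# Example: three actions; the taken action's Q exceeds the baseline,
# so this agent receives positive credit for the joint outcome.
adv = counterfactual_advantage([1.0, 2.0, 0.5], [0.2, 0.5, 0.3], action=1)
print(adv)  # 2.0 - 1.35 = 0.65
```

MACC applies this style of reasoning to messages as well as actions, so that each agent can be credited for how its communication changed the outcome.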
Language
English
Source (book)
Adaptive and Learning Agents Workshop (ALA), co-located with AAMAS, 11-13 May 2022, Auckland, New Zealand
Publication
2022
Volume/pages
p. 1-9
Medium
E-only publication
Full text (open access)