Enhancing Cooperation through Selective Interaction and Long-term Experiences in Multi-Agent Reinforcement Learning

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


The significance of network structures in promoting group cooperation within social dilemmas has been widely recognized. Prior studies attribute this
facilitation to the assortment of strategies driven by spatial interactions. Although reinforcement learning has been employed to investigate the impact of dynamic interaction on the evolution of cooperation, it remains unclear how agents develop neighbour-selection behaviours and how strategic assortment forms within an explicit interaction structure. To address this, our
study introduces a computational framework based on multi-agent reinforcement learning in the spatial Prisoner’s Dilemma game. This framework allows agents to select dilemma strategies and interacting neighbours based on their long-term experiences, differing from existing research that relies on preset social norms or external incentives. By modelling each agent using two distinct Q-networks, we disentangle the coevolutionary dynamics between cooperation and interaction. The results indicate that long-term experience enables agents to develop the
ability to identify non-cooperative neighbours and exhibit a preference for interaction with cooperative ones. This emergent self-organizing behaviour
leads to the clustering of agents with similar strategies, thereby increasing network reciprocity and enhancing group cooperation.
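The two-network design described above can be illustrated with a minimal sketch: each agent holds one Q-estimator over dilemma actions (cooperate/defect) and a separate one over which neighbour to interact with, both updated with discounted returns so that long-term experience shapes partner preference. All class, parameter, and neighbour names below are assumptions for illustration, not the paper's implementation; tabular Q-learning stands in for the paper's Q-networks.

```python
import random

# Prisoner's Dilemma payoffs for the row player: (R, S, T, P) = (3, 0, 5, 1)
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

class TwoQAgent:
    """Agent with two separate Q-estimators, mirroring the abstract's
    two distinct Q-networks: one over dilemma strategies, one over
    interaction neighbours."""

    def __init__(self, neighbours, alpha=0.1, gamma=0.9, eps=0.1):
        self.q_strategy = {"C": 0.0, "D": 0.0}         # dilemma-action values
        self.q_partner = {n: 0.0 for n in neighbours}  # neighbour values
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def _greedy(self, q):
        # Epsilon-greedy choice over a Q-table.
        if random.random() < self.eps:
            return random.choice(list(q))
        return max(q, key=q.get)

    def act(self):
        """Select a dilemma strategy and an interaction partner independently."""
        return self._greedy(self.q_strategy), self._greedy(self.q_partner)

    def learn(self, strategy, partner, reward):
        """Discounted Q-updates on both tables, so preferences reflect
        long-term (not one-shot) experience."""
        for q, a in ((self.q_strategy, strategy), (self.q_partner, partner)):
            best_next = max(q.values())
            q[a] += self.alpha * (reward + self.gamma * best_next - q[a])

# Toy usage: one agent repeatedly meeting a fixed cooperator ("n0") and a
# fixed defector ("n1"); over time it should prefer interacting with n0.
random.seed(0)
agent = TwoQAgent(neighbours=["n0", "n1"])
fixed_strategy = {"n0": "C", "n1": "D"}
for _ in range(2000):
    s, p = agent.act()
    agent.learn(s, p, PAYOFF[(s, fixed_strategy[p])])
print(agent.q_partner["n0"] > agent.q_partner["n1"])
```

In this toy setting the agent's partner Q-values come to favour the cooperative neighbour, since interactions with it yield higher long-run payoff; in the paper's spatial setting, the same mechanism operating across a lattice of learning agents is what drives the clustering and network reciprocity reported in the results.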
Original language: English
Title of host publication: Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (IJCAI-24)
Publication status: Accepted/In press - 16 Apr 2024


