SMACv2: An Improved Benchmark for Cooperative Multi-Agent Reinforcement Learning

  • Benjamin Ellis*
  • Jonathan Cook
  • Skander Moalla
  • Mikayel Samvelyan
  • Mingfei Sun
  • Anuj Mahajan
  • Jakob N. Foerster
  • Shimon Whiteson

*Corresponding author for this work

Research output: Chapter in Book/Conference proceeding › Conference contribution › peer-review

Abstract

The availability of challenging benchmarks has played a key role in the recent progress of machine learning. In cooperative multi-agent reinforcement learning, the StarCraft Multi-Agent Challenge (SMAC) has become a popular testbed for centralised training with decentralised execution. However, after years of sustained improvement on SMAC, algorithms now achieve near-perfect performance. In this work, we conduct new analysis demonstrating that SMAC lacks the stochasticity and partial observability to require complex closed-loop policies. In particular, we show that an open-loop policy conditioned only on the timestep can achieve non-trivial win rates for many SMAC scenarios. To address this limitation, we introduce SMACv2, a new version of the benchmark where scenarios are procedurally generated and require agents to generalise to previously unseen settings (from the same distribution) during evaluation. We also introduce the extended partial observability challenge (EPO), which augments SMACv2 to ensure meaningful partial observability. We show that these changes ensure the benchmark requires the use of closed-loop policies. We evaluate state-of-the-art algorithms on SMACv2 and show that it presents significant challenges not present in the original benchmark. Our analysis illustrates that SMACv2 addresses the discovered deficiencies of SMAC and can help benchmark the next generation of MARL methods.
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 36
Subtitle of host publication: (NeurIPS 2023)
Editors: A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, S. Levine
Publisher: Neural Information Processing Systems Foundation
Pages: 37567-37593
Number of pages: 27
ISBN (Electronic): 9781713899921
Publication status: Published - Jul 2024
Event: 37th Conference on Neural Information Processing Systems, NeurIPS 2023 - New Orleans, United States
Duration: 10 Dec 2023 to 16 Dec 2023

Publication series

Name: Advances in Neural Information Processing Systems
Volume: 36
ISSN (Print): 1049-5258

Conference

Conference: 37th Conference on Neural Information Processing Systems, NeurIPS 2023
Country/Territory: United States
City: New Orleans
Period: 10/12/23 to 16/12/23

Keywords

  • cs.LG
  • cs.MA
