Data-Driven Robust Multi-Agent Reinforcement Learning

  • Yudan Wang
  • Yue Wang
  • Yi Zhou
  • Alvaro Velasquez
  • Shaofeng Zou
  • SUNY Buffalo
  • University of Utah
  • Air Force Research Laboratory

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Multi-agent reinforcement learning (MARL) in the collaborative setting aims to find a joint policy that maximizes the accumulated reward averaged over all agents. In this paper, we focus on MARL under model uncertainty, where the transition kernel is assumed to lie in an uncertainty set, and the goal is to optimize the worst-case performance over that set. We investigate the model-free setting, where the uncertainty set is centered on an unknown Markov decision process from which a single sample trajectory can be obtained sequentially. We develop a robust multi-agent Q-learning algorithm that is model-free and fully decentralized. We prove that the proposed algorithm converges to the minimax robust policy and further characterize its sample complexity. Compared to vanilla multi-agent Q-learning, our algorithm offers provable robustness under model uncertainty without incurring additional computational or memory cost.
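
As a rough illustration of what a robust, decentralized tabular Q-learning step can look like, the Python sketch below combines a worst-case one-step target with a consensus average over agents. It is not taken from the paper: the abstract does not specify the uncertainty set or the communication scheme, so the R-contamination set (contamination level `R`), the doubly stochastic mixing matrix `W`, and the function names `robust_target` and `decentralized_robust_q_step` are all assumptions made for this example.

```python
import numpy as np


def robust_target(q, reward, next_state, gamma, R):
    """Worst-case one-step target for one agent's Q-table.

    ASSUMPTION: an R-contamination uncertainty set, in which an adversary
    may redirect probability mass R to the lowest-valued state.
    q has shape (n_states, n_actions).
    """
    v = q.max(axis=1)  # greedy value of every state
    worst_case_value = (1.0 - R) * v[next_state] + R * v.min()
    return reward + gamma * worst_case_value


def decentralized_robust_q_step(qs, W, s, a, rewards, s_next, alpha, gamma, R):
    """One synchronous update for all agents (hypothetical helper).

    qs      : (n_agents, n_states, n_actions) local Q-tables
    W       : (n_agents, n_agents) mixing matrix, assumed doubly stochastic
    s, a    : current state index and joint-action index
    rewards : (n_agents,) local rewards observed by each agent
    """
    # Consensus step: each agent averages the tables of its neighbours.
    mixed = np.tensordot(W, qs, axes=1)
    new_qs = mixed.copy()
    for i in range(qs.shape[0]):
        # Local robust temporal-difference correction on the visited entry.
        target = robust_target(qs[i], rewards[i], s_next, gamma, R)
        new_qs[i, s, a] = mixed[i, s, a] + alpha * (target - qs[i, s, a])
    return new_qs
```

With `R = 0` and a single agent (`W = [[1.0]]`), the step reduces to the ordinary Q-learning update. Under this assumed uncertainty set, the robust target adds only one minimum over the state values, which is in the spirit of the abstract's remark about avoiding extra computational and memory cost.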

Original language: English
Title of host publication: 2022 IEEE 32nd International Workshop on Machine Learning for Signal Processing, MLSP 2022
Publisher: IEEE Computer Society
ISBN (Electronic): 9781665485470
State: Published - 2022
Event: 32nd IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2022 - Xi'an, China
Duration: Aug 22 2022 to Aug 25 2022

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
Volume: 2022-August
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371

Conference

Conference: 32nd IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2022
Country/Territory: China
City: Xi'an
Period: 08/22/22 to 08/25/22

Keywords

  • Distributionally robust
  • finite-time analysis
  • model-free
  • robust MDP
  • sample complexity
