Near Optimal Convergence to Coarse Correlated Equilibrium in General-Sum Markov Games
2511.02157v1
cs.GT, cs.AI, cs.LG, cs.SY, eess.SY, math.OC
2025-11-06
Authors:
Asrin Efe Yorulmaz, Tamer Başar
Abstract
No-regret learning dynamics play a central role in game theory, enabling
decentralized convergence to equilibrium for concepts such as Coarse Correlated
Equilibrium (CCE) or Correlated Equilibrium (CE). In this work, we improve the
convergence rate to CCE in general-sum Markov games, reducing it from the
previously best-known rate of $\mathcal{O}(\log^5 T / T)$ to a sharper
$\mathcal{O}(\log T / T)$. This matches the best-known convergence rate for CE
in terms of the number of iterations $T$, while also improving the dependence on
the action set size from polynomial to polylogarithmic, which yields exponential
gains in high-dimensional settings. Our approach builds on recent advances in
adaptive step-size techniques for no-regret algorithms in normal-form games,
and extends them to the Markovian setting via a stage-wise scheme that adjusts
learning rates based on real-time feedback. We frame policy updates as an
instance of Optimistic Follow-the-Regularized-Leader (OFTRL), customized for
value-iteration-based learning. The resulting self-play algorithm achieves, to
our knowledge, the fastest known convergence rate to CCE in Markov games.
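
As a rough illustration of the OFTRL building block mentioned above, the following sketch runs self-play in a two-player normal-form game. It is not the paper's Markov-game algorithm: it assumes a single state, a negative-entropy regularizer, a fixed step size `eta` (rather than the adaptive, stage-wise step sizes described in the abstract), and the most recent utility vector as the optimistic prediction. The function names `oftrl_entropy_step`, `self_play`, and `cce_gap` are hypothetical and chosen only for this example.

```python
import numpy as np

def oftrl_entropy_step(cum_utility, prediction, eta):
    """One OFTRL update on the simplex with a negative-entropy regularizer.

    Maximizing <x, cum_utility + prediction> + (1/eta) * entropy(x) has the
    closed-form solution x proportional to exp(eta * (cum_utility + prediction)).
    """
    scores = eta * (cum_utility + prediction)
    scores = scores - scores.max()  # shift for numerical stability
    weights = np.exp(scores)
    return weights / weights.sum()

def self_play(A, B, T=2000, eta=0.1):
    """Both players run OFTRL; returns the time-averaged joint distribution.

    A[i, j] and B[i, j] are the payoffs of players 1 and 2 when player 1
    plays action i and player 2 plays action j (assumed to lie in [0, 1]).
    """
    n, m = A.shape
    cum_x, cum_y = np.zeros(n), np.zeros(m)
    x, y = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    joint = np.zeros((n, m))
    for _ in range(T):
        joint += np.outer(x, y)       # record the product policy played this round
        u_x, u_y = A @ y, B.T @ x     # expected utility of each pure action
        cum_x += u_x
        cum_y += u_y
        # Optimism: reuse the latest utility vector as a guess for the next one.
        x = oftrl_entropy_step(cum_x, u_x, eta)
        y = oftrl_entropy_step(cum_y, u_y, eta)
    return joint / T

def cce_gap(joint, A, B):
    """Largest gain any player obtains by deviating to one fixed action."""
    value_1, value_2 = (joint * A).sum(), (joint * B).sum()
    best_dev_1 = (A @ joint.sum(axis=0)).max()    # against player 2's marginal
    best_dev_2 = (B.T @ joint.sum(axis=1)).max()  # against player 1's marginal
    return max(best_dev_1 - value_1, best_dev_2 - value_2)

if __name__ == "__main__":
    # A rescaled "battle of the sexes" game, used purely as a toy input.
    A = np.array([[1.0, 0.0], [0.0, 0.5]])
    B = np.array([[0.5, 0.0], [0.0, 1.0]])
    sigma = self_play(A, B)
    print("approximate CCE gap:", cce_gap(sigma, A, B))
```

The averaged joint distribution returned by `self_play` is the object whose CCE gap shrinks as $T$ grows, since the gap equals the largest per-player regret divided by $T$. In the Markov setting described in the abstract, an update of this kind would presumably be applied per state, with the payoff matrices replaced by value-iteration-based estimates.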