MAGIC-MASK: Multi-Agent Guided Inter-Agent Collaboration with Mask-Based Explainability for Reinforcement Learning
2510.00274v1
cs.AI, cs.LG, cs.MA
2025-10-05
Авторы:
Maisha Maliha, Dean Hougen
Abstract
Understanding the decision-making process of Deep Reinforcement Learning
agents remains a key challenge for deploying these systems in safety-critical
and multi-agent environments. While prior explainability methods like
StateMask, have advanced the identification of critical states, they remain
limited by computational cost, exploration coverage, and lack of adaptation to
multi-agent settings. To overcome these limitations, we propose a
mathematically grounded framework, MAGIC-MASK (Multi-Agent Guided Inter-agent
Collaboration with Mask-Based Explainability for Reinforcement Learning), that
extends perturbation-based explanation to Multi-Agent Reinforcement Learning.
Our method integrates Proximal Policy Optimization, adaptive epsilon-greedy
exploration, and lightweight inter-agent collaboration to share masked state
information and peer experience. This collaboration enables each agent to
perform saliency-guided masking and share reward-based insights with peers,
reducing the time required for critical state discovery, improving explanation
fidelity, and leading to faster and more robust learning. The core novelty of
our approach lies in generalizing explainability from single-agent to
multi-agent systems through a unified mathematical formalism built on
trajectory perturbation, reward fidelity analysis, and Kullback-Leibler
divergence regularization. This framework yields localized, interpretable
explanations grounded in probabilistic modeling and multi-agent Markov decision
processes. We validate our framework on both single-agent and multi-agent
benchmarks, including a multi-agent highway driving environment and Google
Research Football, demonstrating that MAGIC-MASK consistently outperforms
state-of-the-art baselines in fidelity, learning efficiency, and policy
robustness while offering interpretable and transferable explanations.
Ссылки и действия
Дополнительные ресурсы: