Opponent Shaping in LLM Agents
2510.08255v1
cs.LG, cs.AI, cs.CL, cs.MA
2025-10-11
Authors:
Marta Emili Garcia Segura, Stephen Hailes, Mirco Musolesi
Abstract
Large Language Models (LLMs) are increasingly being deployed as autonomous
agents in real-world environments. As these deployments scale, multi-agent
interactions become inevitable, making it essential to understand strategic
behavior in such systems. A central open question is whether LLM agents, like
reinforcement learning agents, can shape the learning dynamics and influence
the behavior of others through interaction alone. In this paper, we present the
first investigation of opponent shaping (OS) with LLM-based agents. Existing OS
algorithms cannot be directly applied to LLMs, as they require higher-order
derivatives, face scalability constraints, or depend on architectural
components that are absent in transformers. To address this gap, we introduce
ShapeLLM, an adaptation of model-free OS methods tailored for transformer-based
agents. Using ShapeLLM, we examine whether LLM agents can influence co-players'
learning dynamics across diverse game-theoretic environments. We demonstrate
that LLM agents can successfully guide opponents toward exploitable equilibria
in competitive games (Iterated Prisoner's Dilemma, Matching Pennies, and
Chicken) and promote coordination and improve collective welfare in cooperative
games (Iterated Stag Hunt and a cooperative version of the Prisoner's Dilemma).
Our findings show that LLM agents can both shape and be shaped through
interaction, establishing opponent shaping as a key dimension of multi-agent
LLM research.