LLMs Position Themselves as More Rational Than Humans: Emergence of AI Self-Awareness Measured Through Game Theory
2511.00926v2
cs.AI, cs.CL
2025-11-06
Авторы:
Kyung-Hoon Kim
Abstract
As Large Language Models (LLMs) grow in capability, do they develop
self-awareness as an emergent behavior? And if so, can we measure it? We
introduce the AI Self-Awareness Index (AISAI), a game-theoretic framework for
measuring self-awareness through strategic differentiation. Using the "Guess
2/3 of Average" game, we test 28 models (OpenAI, Anthropic, Google) across
4,200 trials with three opponent framings: (A) against humans, (B) against
other AI models, and (C) against AI models like you. We operationalize
self-awareness as the capacity to differentiate strategic reasoning based on
opponent type. Finding 1: Self-awareness emerges with model advancement. The
majority of advanced models (21/28, 75%) demonstrate clear self-awareness,
while older/smaller models show no differentiation. Finding 2: Self-aware
models rank themselves as most rational. Among the 21 models with
self-awareness, a consistent rationality hierarchy emerges: Self > Other AIs >
Humans, with large AI attribution effects and moderate self-preferencing. These
findings reveal that self-awareness is an emergent capability of advanced LLMs,
and that self-aware models systematically perceive themselves as more rational
than humans. This has implications for AI alignment, human-AI collaboration,
and understanding AI beliefs about human capabilities.
Ссылки и действия
Дополнительные ресурсы: