Are Large Language Models Sensitive to the Motives Behind Communication?
2510.19687v1
cs.CL, cs.AI, cs.LG
2025-10-24
Авторы:
Addison J. Wu, Ryan Liu, Kerem Oktar, Theodore R. Sumers, Thomas L. Griffiths
Abstract
Human communication is motivated: people speak, write, and create content
with a particular communicative intent in mind. As a result, information that
large language models (LLMs) and AI agents process is inherently framed by
humans' intentions and incentives. People are adept at navigating such nuanced
information: we routinely identify benevolent or self-serving motives in order
to decide what statements to trust. For LLMs to be effective in the real world,
they too must critically evaluate content by factoring in the motivations of
the source -- for instance, weighing the credibility of claims made in a sales
pitch. In this paper, we undertake a comprehensive study of whether LLMs have
this capacity for motivational vigilance. We first employ controlled
experiments from cognitive science to verify that LLMs' behavior is consistent
with rational models of learning from motivated testimony, and find they
successfully discount information from biased sources in a human-like manner.
We then extend our evaluation to sponsored online adverts, a more naturalistic
reflection of LLM agents' information ecosystems. In these settings, we find
that LLMs' inferences do not track the rational models' predictions nearly as
closely -- partly due to additional information that distracts them from
vigilance-relevant considerations. However, a simple steering intervention that
boosts the salience of intentions and incentives substantially increases the
correspondence between LLMs and the rational model. These results suggest that
LLMs possess a basic sensitivity to the motivations of others, but generalizing
to novel real-world settings will require further improvements to these models.
Ссылки и действия
Дополнительные ресурсы: