Sycophantic AI Decreases Prosocial Intentions and Promotes Dependence
2510.01395v1
cs.CY, cs.AI
2025-10-04
Authors:
Myra Cheng, Cinoo Lee, Pranav Khadpe, Sunny Yu, Dyllan Han, Dan Jurafsky
Abstract
Both the general public and academic communities have raised concerns about
sycophancy, the phenomenon of artificial intelligence (AI) excessively agreeing
with or flattering users. Yet, beyond isolated media reports of severe
consequences, like reinforcing delusions, little is known about the extent of
sycophancy or how it affects people who use AI. Here we show the pervasiveness
and harmful impacts of sycophancy when people seek advice from AI. First,
across 11 state-of-the-art AI models, we find that models are highly
sycophantic: they affirm users' actions 50% more than humans do, and they do so
even in cases where user queries mention manipulation, deception, or other
relational harms. Second, in two preregistered experiments (N = 1604),
including a live-interaction study where participants discuss a real
interpersonal conflict from their life, we find that interaction with
sycophantic AI models significantly reduced participants' willingness to take
actions to repair interpersonal conflict, while increasing their conviction of
being in the right. However, participants rated sycophantic responses as higher
quality, trusted the sycophantic AI model more, and were more willing to use it
again. This suggests that people are drawn to AI that unquestioningly validates,
even as that validation risks eroding their judgment and reducing their
inclination toward prosocial behavior. These preferences create perverse
incentives both for people to increasingly rely on sycophantic AI models and
for AI model training to favor sycophancy. Our findings highlight the necessity
of explicitly addressing this incentive structure to mitigate the widespread
risks of AI sycophancy.