Conflict Adaptation in Vision-Language Models
2510.24804v1
cs.CV, cs.CL
2025-10-31
Авторы:
Xiaoyang Hu
Abstract
A signature of human cognitive control is conflict adaptation: improved
performance on a high-conflict trial following another high-conflict trial.
This phenomenon offers an account for how cognitive control, a scarce resource,
is recruited. Using a sequential Stroop task, we find that 12 of 13
vision-language models (VLMs) tested exhibit behavior consistent with conflict
adaptation, with the lone exception likely reflecting a ceiling effect. To
understand the representational basis of this behavior, we use sparse
autoencoders (SAEs) to identify task-relevant supernodes in InternVL 3.5 4B.
Partially overlapping supernodes emerge for text and color in both early and
late layers, and their relative sizes mirror the automaticity asymmetry between
reading and color naming in humans. We further isolate a conflict-modulated
supernode in layers 24-25 whose ablation significantly increases Stroop errors
while minimally affecting congruent trials.
Ссылки и действия
Дополнительные ресурсы: