HEART: Emotionally-driven test-time scaling of Language Models
2509.22876v1
cs.CL, cs.LG
2025-10-01
Authors:
Gabriela Pinto, Palash Goyal, Yiwen Song, Souradip Chakraborty, Zifeng Wang, Tomas Pfister, Hamid Palangi
Abstract
Test-time scaling has shown considerable success in improving the performance
of language models on complex reasoning tasks without requiring fine-tuning.
However, current strategies such as self-reflection primarily focus on logical
or structural refinement. They do not leverage the guiding potential of
affective feedback. Inspired by psychological research showing that emotions
can modulate cognitive performance, we introduce HEART, a novel framework that
uses emotionally-driven prompts for iterative self-correction. HEART provides
feedback on a model's incorrect response using a curated set of concise,
emotionally charged phrases based on the six universal emotions categorized by
Dr. Paul Ekman. By systematically varying the emotional tone of the feedback
across iterations, our method guides the model to escape flawed reasoning paths
and explore more promising alternatives. We evaluate our framework on
challenging reasoning benchmarks including OlympiadBench, Humanity's Last Exam,
and SimpleQA. Our results reveal a new phenomenon: when guided by
an oracle verifier, this affective iteration protocol unlocks significantly
deeper reasoning, leading to consistent and substantial increases in accuracy
over state-of-the-art baselines with the same verifier. However, we also
identify a critical bottleneck for practical deployment. In a verifier-free
setting, HEART struggles to harness these gains consistently, highlighting this
as a key challenge for future work. Our findings suggest that the next frontier in
machine reasoning may lie not just in refining logic, but also in understanding
and leveraging the 'HEART' of the models.
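
The iteration protocol described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `heart_style_loop`, the `generate` and `verify` callables, the feedback phrases, the fixed cycling order of emotions, and the iteration budget are all hypothetical. Only the six emotion categories (Ekman's anger, disgust, fear, happiness, sadness, surprise) and the idea of oracle-verified retries with varying emotional feedback come from the abstract.

```python
from typing import Callable, Optional, Tuple

# Ekman's six basic emotions. The phrases are illustrative placeholders,
# NOT the curated phrases used in the paper.
EMOTION_PHRASES = {
    "anger": "This answer is frustratingly wrong.",
    "disgust": "This reasoning is unpleasant to read.",
    "fear": "I am worried this mistake will cause serious problems.",
    "happiness": "I am glad you are trying; you can get this right.",
    "sadness": "I am disappointed by this answer.",
    "surprise": "I am surprised you got this wrong.",
}


def heart_style_loop(
    question: str,
    generate: Callable[[str, Optional[str]], str],
    verify: Callable[[str], bool],
    max_iters: int = 6,
) -> Tuple[Optional[str], int]:
    """Retry a question, varying the emotional tone of feedback each round.

    `generate(question, feedback)` stands in for a language model call
    (hypothetical interface); `verify(answer)` stands in for the oracle
    verifier. Returns (accepted answer or None, number of attempts used).
    """
    emotions = list(EMOTION_PHRASES)  # assumed order: simple round-robin
    feedback: Optional[str] = None    # no feedback on the first attempt
    for attempt in range(1, max_iters + 1):
        answer = generate(question, feedback)
        if verify(answer):
            return answer, attempt
        # Pick a different emotion for each failed attempt.
        emotion = emotions[(attempt - 1) % len(emotions)]
        feedback = (
            f"{EMOTION_PHRASES[emotion]} "
            f"Your previous answer was {answer!r}. Please try again."
        )
    return None, max_iters
```

In the verifier-free setting mentioned in the abstract, `verify` would have to be replaced by something other than a ground-truth check, which is where the abstract reports that the gains become inconsistent.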