Ensembling Large Language Models to Characterize Affective Dynamics in Student-AI Tutor Dialogues
2510.13862v1
cs.CL, cs.AI, cs.HC
2025-10-18
Авторы:
Chenyu Zhang, Sharifa Alghowinem, Cynthia Breazeal
Abstract
While recent studies have examined the leaning impact of large language model
(LLM) in educational contexts, the affective dynamics of LLM-mediated tutoring
remain insufficiently understood. This work introduces the first ensemble-LLM
framework for large-scale affect sensing in tutoring dialogues, advancing the
conversation on responsible pathways for integrating generative AI into
education by attending to learners' evolving affective states. To achieve this,
we analyzed two semesters' worth of 16,986 conversational turns exchanged
between PyTutor, an LLM-powered AI tutor, and 261 undergraduate learners across
three U.S. institutions. To investigate learners' emotional experiences, we
generate zero-shot affect annotations from three frontier LLMs (Gemini, GPT-4o,
Claude), including scalar ratings of valence, arousal, and
learning-helpfulness, along with free-text emotion labels. These estimates are
fused through rank-weighted intra-model pooling and plurality consensus across
models to produce robust emotion profiles. Our analysis shows that during
interaction with the AI tutor, students typically report mildly positive affect
and moderate arousal. Yet learning is not uniformly smooth: confusion and
curiosity are frequent companions to problem solving, and frustration, while
less common, still surfaces in ways that can derail progress. Emotional states
are short-lived--positive moments last slightly longer than neutral or negative
ones, but they are fragile and easily disrupted. Encouragingly, negative
emotions often resolve quickly, sometimes rebounding directly into positive
states. Neutral moments frequently act as turning points, more often steering
students upward than downward, suggesting opportunities for tutors to intervene
at precisely these junctures.
Ссылки и действия
Дополнительные ресурсы: