Who's Asking? Simulating Role-Based Questions for Conversational AI Evaluation

2510.16829v1 cs.CL, cs.AI, cs.CY, cs.HC 2025-10-22

Authors:

Navreet Kaur, Hoda Ayad, Hayoung Jung, Shravika Mittal, Munmun De Choudhury, Tanushree Mitra

Abstract

Language model users often embed personal and social context in their questions. The asker's role -- implicit in how the question is framed -- creates specific needs for an appropriate response. However, most evaluations, while capturing the model's capability to respond, often ignore who is asking. This gap is especially critical in stigmatized domains such as opioid use disorder (OUD), where accounting for users' contexts is essential to provide accessible, stigma-free responses. We propose CoRUS (COmmunity-driven Roles for User-centric Question Simulation), a framework for simulating role-based questions. Drawing on role theory and posts from an online OUD recovery community (r/OpiatesRecovery), we first build a taxonomy of asker roles -- patients, caregivers, practitioners. Next, we use it to simulate 15,321 questions that embed each role's goals, behaviors, and experiences. Our evaluations show that these questions are both highly believable and comparable to real-world data. When used to evaluate five LLMs, for the same question but differing roles, we find systematic differences: vulnerable roles, such as patients and caregivers, elicit more supportive responses (+17%) and reduced knowledge content (-19%) in comparison to practitioners. Our work demonstrates how implicitly signaling a user's role shapes model responses, and provides a methodology for role-informed evaluation of conversational AI.

