Point of Order: Action-Aware LLM Persona Modeling for Realistic Civic Simulation

2511.17813v1 cs.CL, cs.AI, cs.LG, cs.SD 2025-11-25
Authors:

Scott Merrill, Shashank Srivastava

Abstract

Large language models offer opportunities to simulate multi-party deliberation, but realistic modeling remains limited by a lack of speaker-attributed data. Transcripts produced via automatic speech recognition (ASR) assign anonymous speaker labels (e.g., Speaker_1), preventing models from capturing consistent human behavior. This work introduces a reproducible pipeline to transform public Zoom recordings into speaker-attributed transcripts with metadata like persona profiles and pragmatic action tags (e.g., [propose_motion]). We release three local government deliberation datasets: Appellate Court hearings, School Board meetings, and Municipal Council sessions. Fine-tuning LLMs to model specific participants using this "action-aware" data produces a 67% reduction in perplexity and nearly doubles classifier-based performance metrics for speaker fidelity and realism. Turing-style human evaluations show our simulations are often indistinguishable from real deliberations, providing a practical and scalable method for complex realistic civic simulations.

Related articles

AuditoryBench++: Can Language Models Understand Auditory Knowledge without Heari...

Context: Enabling multidimensional interactions between text and audio is a key demand in modern ...

2025-09-24

PARCO: Phoneme-Augmented Robust Contextual ASR via Contrastive Entity Disambigua...

Context: Automatic speech recognition (ASR) is widely used in various fields but faces significant ...

2025-09-06