Mapping from Meaning: Addressing the Miscalibration of Prompt-Sensitive Language Models
2510.17028v1
cs.CL, cs.LG
2025-10-22
Авторы:
Kyle Cox, Jiawei Xu, Yikun Han, Rong Xu, Tianhao Li, Chi-Yang Hsu, Tianlong Chen, Walter Gerych, Ying Ding
Abstract
An interesting behavior in large language models (LLMs) is prompt
sensitivity. When provided with different but semantically equivalent versions
of the same prompt, models may produce very different distributions of answers.
This suggests that the uncertainty reflected in a model's output distribution
for one prompt may not reflect the model's uncertainty about the meaning of the
prompt. We model prompt sensitivity as a type of generalization error, and show
that sampling across the semantic ``concept space'' with paraphrasing
perturbations improves uncertainty calibration without compromising accuracy.
Additionally, we introduce a new metric for uncertainty decomposition in
black-box LLMs that improves upon entropy-based decomposition by modeling
semantic continuities in natural language generation. We show that this
decomposition metric can be used to quantify how much LLM uncertainty is
attributed to prompt sensitivity. Our work introduces a new way to improve
uncertainty calibration in prompt-sensitive language models, and provides
evidence that some LLMs fail to exhibit consistent general reasoning about the
meanings of their inputs.
Ссылки и действия
Дополнительные ресурсы: