Embedding Trust: Semantic Isotropy Predicts Nonfactuality in Long-Form Text Generation
2510.21891v1
cs.CL, cs.AI, cs.LG, stat.ME, stat.ML
2025-10-29
Авторы:
Dhrupad Bhardwaj, Julia Kempe, Tim G. J. Rudner
Abstract
To deploy large language models (LLMs) in high-stakes application domains
that require substantively accurate responses to open-ended prompts, we need
reliable, computationally inexpensive methods that assess the trustworthiness
of long-form responses generated by LLMs. However, existing approaches often
rely on claim-by-claim fact-checking, which is computationally expensive and
brittle in long-form responses to open-ended prompts. In this work, we
introduce semantic isotropy -- the degree of uniformity across normalized text
embeddings on the unit sphere -- and use it to assess the trustworthiness of
long-form responses generated by LLMs. To do so, we generate several long-form
responses, embed them, and estimate the level of semantic isotropy of these
responses as the angular dispersion of the embeddings on the unit sphere. We
find that higher semantic isotropy -- that is, greater embedding dispersion --
reliably signals lower factual consistency across samples. Our approach
requires no labeled data, no fine-tuning, and no hyperparameter selection, and
can be used with open- or closed-weight embedding models. Across multiple
domains, our method consistently outperforms existing approaches in predicting
nonfactuality in long-form responses using only a handful of samples --
offering a practical, low-cost approach for integrating trust assessment into
real-world LLM workflows.