Are language models aware of the road not taken? Token-level uncertainty and hidden state dynamics

arXiv:2511.04527v1 [cs.CL, cs.AI], 2025-11-08

Authors:

Amir Zur, Atticus Geiger, Ekdeep Singh Lubana, Eric Bigelow

Abstract

When a language model generates text, the selection of individual tokens might lead it down very different reasoning paths, making uncertainty difficult to quantify. In this work, we consider whether reasoning language models represent the alternate paths that they could take during generation. To test this hypothesis, we use hidden activations to control and predict a language model's uncertainty during chain-of-thought reasoning. In our experiments, we find a clear correlation between how uncertain a model is at different tokens and how easily the model can be steered by controlling its activations. This suggests that activation interventions are most effective when there are alternate paths available to the model; in other words, when it has not yet committed to a particular final answer. We also find that hidden activations can predict a model's future outcome distribution, demonstrating that models implicitly represent the space of possible paths.

