Causal Reflection with Language Models
2508.04495v1
cs.LG, cs.CL
2025-08-09
Authors:
Abi Aryan, Zac Liu
Summary
Modern language models (LLMs) show impressive fluency and factual recall, but they often rely on spurious correlations and brittle patterns where robust causal reasoning is required. Similarly, reinforcement learning agents that optimize for rewards do not develop causal understanding. We propose Causal Reflection, a new architecture that explicitly models causality as a dynamic function of state, action, time, and perturbation. We also introduce a Reflect mechanism that identifies mismatches between predicted and observed outcomes and derives causal hypotheses to revise the agent's internal model. In this system, LLMs act not as black-box reasoners but as structured inference engines that translate formal causal outputs into natural-language explanations. Our work lays the theoretical groundwork for Causal Reflective agents that can adapt, self-correct, and explain causal relations in changing environments.
Abstract
While LLMs exhibit impressive fluency and factual recall, they struggle with
robust causal reasoning, often relying on spurious correlations and brittle
patterns. Similarly, traditional Reinforcement Learning agents also lack causal
understanding, optimizing for rewards without modeling why actions lead to
outcomes. We introduce Causal Reflection, a framework that explicitly models
causality as a dynamic function over state, action, time, and perturbation,
enabling agents to reason about delayed and nonlinear effects. Additionally, we
define a formal Reflect mechanism that identifies mismatches between predicted
and observed outcomes and generates causal hypotheses to revise the agent's
internal model. In this architecture, LLMs serve not as black-box reasoners,
but as structured inference engines translating formal causal outputs into
natural language explanations and counterfactuals. Our framework lays the
theoretical groundwork for Causal Reflective agents that can adapt,
self-correct, and communicate causal understanding in evolving environments.
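
The two ideas named in the abstract, a causal function over state, action, time, and perturbation, and a Reflect step that compares predicted with observed outcomes, can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the names `Query`, `toy_causal_fn`, and `reflect`, the scalar types, the two-step delay, and the wording of the hypothesis are not taken from the paper, which the abstract describes only at a high level.

```python
from dataclasses import dataclass
from typing import Callable, Optional


# Hypothetical container: the abstract names these four arguments
# but does not specify how they are represented.
@dataclass(frozen=True)
class Query:
    state: float
    action: float
    time: int            # number of steps elapsed since the action
    perturbation: float


CausalFn = Callable[[Query], float]


def toy_causal_fn(q: Query) -> float:
    """Illustrative delayed, nonlinear effect (an assumption, not the paper's model)."""
    # The action only takes effect after two steps, and then quadratically.
    delayed_action = q.action if q.time >= 2 else 0.0
    return q.state + delayed_action ** 2 + q.perturbation


def reflect(model: CausalFn, q: Query, observed: float,
            tol: float = 1e-6) -> Optional[str]:
    """Return None if the prediction matches, else a plain-text causal hypothesis."""
    predicted = model(q)
    gap = observed - predicted
    if abs(gap) <= tol:
        return None
    return (f"Predicted {predicted:.2f} but observed {observed:.2f} at t={q.time}; "
            f"hypothesis: an unmodeled factor contributes {gap:+.2f}.")
```

In this sketch the hypothesis is a fixed template string; in the framework the abstract describes, turning formal causal outputs into natural-language explanations and counterfactuals is the role assigned to the LLM.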