Causal Reflection with Language Models

2508.04495v1 cs.LG, cs.CL 2025-08-09
Authors:

Abi Aryan, Zac Liu

Summary (translated from Russian)

**Summary** Modern large language models (LLMs) exhibit impressive fluency and factual recall, but they often rely on spurious correlations and brittle patterns when robust causal reasoning is required. Similarly, reinforcement learning agents that optimize for rewards do not develop causal understanding. We propose Causal Reflection, a new architecture that explicitly models causality as a dynamic function of state, action, time, and perturbation. We also introduce a Reflect mechanism that identifies mismatches between predicted and observed outcomes and derives causal hypotheses for revising the agent's internal model. In this system, LLMs act not as black-box reasoners but as structured inference engines that translate formal causal outputs into natural-language explanations. Our work lays the theoretical groundwork for Causal Reflective agents that can adapt, self-correct, and explain causal relations in changing environments.

Abstract

While LLMs exhibit impressive fluency and factual recall, they struggle with robust causal reasoning, often relying on spurious correlations and brittle patterns. Similarly, traditional Reinforcement Learning agents also lack causal understanding, optimizing for rewards without modeling why actions lead to outcomes. We introduce Causal Reflection, a framework that explicitly models causality as a dynamic function over state, action, time, and perturbation, enabling agents to reason about delayed and nonlinear effects. Additionally, we define a formal Reflect mechanism that identifies mismatches between predicted and observed outcomes and generates causal hypotheses to revise the agent's internal model. In this architecture, LLMs serve not as black-box reasoners, but as structured inference engines translating formal causal outputs into natural language explanations and counterfactuals. Our framework lays the theoretical groundwork for Causal Reflective agents that can adapt, self-correct, and communicate causal understanding in evolving environments.
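To make the two ideas in the abstract more concrete, here is a minimal Python sketch. It is not the authors' formulation: the names `Event`, `linear_model`, and `reflect`, the scalar state, and the one-parameter linear model family are all assumptions chosen for illustration. It shows a causal function over state, action, time, and perturbation, and a Reflect-style step that compares a prediction with an observation and, on a mismatch, ranks candidate hypotheses by how well they explain the observation. The LLM's role of turning the selected hypothesis into a natural language explanation is left out.

```python
from dataclasses import dataclass
from typing import Callable, List

# NOTE: every name here is hypothetical; the abstract does not specify an API.


@dataclass(frozen=True)
class Event:
    state: float         # toy scalar state
    action: float        # toy scalar action
    time: int            # discrete time step (carried along, unused by the toy model)
    perturbation: float  # external disturbance


# A causal model maps (state, action, time, perturbation) to a predicted next state.
CausalModel = Callable[[Event], float]


def linear_model(gain: float) -> CausalModel:
    """Assumed toy family: next_state = state + gain * action + perturbation."""
    def predict(e: Event) -> float:
        return e.state + gain * e.action + e.perturbation
    return predict


def reflect(
    current_gain: float,
    event: Event,
    observed: float,
    candidate_gains: List[float],
    tol: float = 1e-6,
) -> List[float]:
    """Return candidate gains ranked by fit, or [] if the current model already fits."""
    predicted = linear_model(current_gain)(event)
    if abs(predicted - observed) <= tol:
        return []  # prediction matches observation: nothing to revise
    # Each candidate gain is one "causal hypothesis"; rank by prediction error.
    return sorted(
        candidate_gains,
        key=lambda g: abs(linear_model(g)(event) - observed),
    )


event = Event(state=1.0, action=2.0, time=0, perturbation=0.5)
hypotheses = reflect(1.0, event, observed=5.5, candidate_gains=[0.5, 1.0, 2.0, 3.0])
best_gain = hypotheses[0] if hypotheses else 1.0
```

In this toy setup the agent would adopt `best_gain` as its revised internal model. A fuller version would need a model family in which time matters, since the abstract mentions delayed and nonlinear effects.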
