Stable but Miscalibrated: A Kantian View on Overconfidence from Filters to Large Language Models
2510.14925v1
cs.AI, cs.CL, cs.LG
2025-10-18
Авторы:
Akira Okutomi
Abstract
We reinterpret Kant's Critique of Pure Reason as a theory of feedback
stability, viewing reason as a regulator that keeps inference within the bounds
of possible experience. We formalize this intuition via a composite instability
index (H-Risk) combining spectral margin, conditioning, temporal sensitivity,
and innovation amplification. In linear-Gaussian simulations, higher H-Risk
predicts overconfident errors even under formal stability, revealing a gap
between nominal and epistemic stability. Extending to large language models
(LLMs), we find that fragile internal dynamics correlate with miscalibration
and hallucination, while critique-style prompts show mixed effects on
calibration and hallucination. These results suggest a structural bridge
between Kantian self-limitation and feedback control, offering a principled
lens for diagnosing -- and selectively reducing -- overconfidence in reasoning
systems. This is a preliminary version; supplementary experiments and broader
replication will be reported in a future revision.
Ссылки и действия
Дополнительные ресурсы: