Predictive Coding Enhances Meta-RL To Achieve Interpretable Bayes-Optimal Belief Representation Under Partial Observability
2510.22039v1
cs.AI, q-bio.NC
2025-10-29
Авторы:
Po-Chen Kuo, Han Hou, Will Dabney, Edgar Y. Walker
Abstract
Learning a compact representation of history is critical for planning and
generalization in partially observable environments. While meta-reinforcement
learning (RL) agents can attain near Bayes-optimal policies, they often fail to
learn the compact, interpretable Bayes-optimal belief states. This
representational inefficiency potentially limits the agent's adaptability and
generalization capacity. Inspired by predictive coding in neuroscience--which
suggests that the brain predicts sensory inputs as a neural implementation of
Bayesian inference--and by auxiliary predictive objectives in deep RL, we
investigate whether integrating self-supervised predictive coding modules into
meta-RL can facilitate learning of Bayes-optimal representations. Through state
machine simulation, we show that meta-RL with predictive modules consistently
generates more interpretable representations that better approximate
Bayes-optimal belief states compared to conventional meta-RL across a wide
variety of tasks, even when both achieve optimal policies. In challenging tasks
requiring active information seeking, only meta-RL with predictive modules
successfully learns optimal representations and policies, whereas conventional
meta-RL struggles with inadequate representation learning. Finally, we
demonstrate that better representation learning leads to improved
generalization. Our results strongly suggest the role of predictive learning as
a guiding principle for effective representation learning in agents navigating
partial observability.
Ссылки и действия
Дополнительные ресурсы: