Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents
2510.13704v1
cs.LG, cs.AI, cs.RO
2025-10-17
Авторы:
Johan Obando-Ceron, Walter Mayor, Samuel Lavoie, Scott Fujimoto, Aaron Courville, Pablo Samuel Castro
Abstract
Recent works have proposed accelerating the wall-clock training time of
actor-critic methods via the use of large-scale environment parallelization;
unfortunately, these can sometimes still require large number of environment
interactions to achieve a desired level of performance. Noting that
well-structured representations can improve the generalization and sample
efficiency of deep reinforcement learning (RL) agents, we propose the use of
simplicial embeddings: lightweight representation layers that constrain
embeddings to simplicial structures. This geometric inductive bias results in
sparse and discrete features that stabilize critic bootstrapping and strengthen
policy gradients. When applied to FastTD3, FastSAC, and PPO, simplicial
embeddings consistently improve sample efficiency and final performance across
a variety of continuous- and discrete-control environments, without any loss in
runtime speed.
Ссылки и действия
Дополнительные ресурсы: