Offline Reinforcement Learning in Large State Spaces: Algorithms and Guarantees
2510.04088v1
cs.LG, cs.AI, stat.ML
2025-10-08
Authors:
Nan Jiang, Tengyang Xie
Abstract
This article introduces the theory of offline reinforcement learning in large
state spaces, where good policies are learned from historical data without
online interactions with the environment. Key concepts introduced include
expressivity assumptions on function approximation (e.g., Bellman completeness
vs. realizability) and data coverage (e.g., all-policy vs. single-policy
coverage). A rich landscape of algorithms and results is described, depending
on the assumptions one is willing to make and the sample and computational
complexity guarantees one wishes to achieve. We also discuss open questions and
connections to adjacent areas.