On the hardness of RL with Lookahead

2510.19372v1 stat.ML, cs.LG 2025-10-24

Авторы:

Corentin Pla, Hugo Richard, Marc Abeille, Nadav Merlis, Vianney Perchet

Abstract

We study reinforcement learning (RL) with transition look-ahead, where the agent may observe which states would be visited upon playing any sequence of $\ell$ actions before deciding its course of action. While such predictive information can drastically improve the achievable performance, we show that using this information optimally comes at a potentially prohibitive computational cost. Specifically, we prove that optimal planning with one-step look-ahead ($\ell=1$) can be solved in polynomial time through a novel linear programming formulation. In contrast, for $\ell \geq 2$, the problem becomes NP-hard. Our results delineate a precise boundary between tractable and intractable cases for the problem of planning with transition look-ahead in reinforcement learning.

Ссылки и действия

Читать на arXiv Скачать PDF

Дополнительные ресурсы:

On the hardness of RL with Lookahead

Авторы:

Abstract

Ссылки и действия

Связанные статьи

Comparison of neural network training strategies for the simulation of dynamical...

Informative missingness and its implications in semi-supervised learning

Recurrent Neural Networks with Linear Structures for Electricity Price Forecasti...

Control Consistency Losses for Diffusion Bridges

Foundations of Diffusion Models in General State Spaces: A Self-Contained Introd...

Навигация