Scaling Long-Horizon LLM Agent via Context-Folding
2510.11967v1
cs.CL, cs.LG
2025-10-16
Авторы:
Weiwei Sun, Miao Lu, Zhan Ling, Kang Liu, Xuesong Yao, Yiming Yang, Jiecao Chen
Abstract
Large language model (LLM) agents are fundamentally constrained by context
length on long-horizon tasks. We introduce Context-Folding, a framework that
empowers agents to actively manage their working context. An agent can
procedurally branch into a sub-trajectory to handle a subtask and then fold it
upon completion, collapsing the intermediate steps while retaining a concise
summary of the outcome. To make this behavior learnable, we develop an
end-to-end reinforcement learning framework FoldGRPO with specific process
rewards to encourage effective task decomposition and context management. On
complex long-horizon tasks (Deep Research and SWE), our folding agent matches
or outperforms the ReAct baselines while using an active context 10$\times$
smaller and significantly outperforms models that rely on summarization-based
context management.
Ссылки и действия
Дополнительные ресурсы: