Can LLMs Translate Human Instructions into a Reinforcement Learning Agent's Internal Emergent Symbolic Representation?
2510.24259v1
cs.CL, cs.RO
2025-10-30
Authors:
Ziqi Ma, Sao Mai Nguyen, Philippe Xu
Abstract
Emergent symbolic representations are critical for enabling developmental
learning agents to plan and generalize across tasks. In this work, we
investigate whether large language models (LLMs) can translate human natural
language instructions into the internal symbolic representations that emerge
during hierarchical reinforcement learning. We apply a structured evaluation
framework to measure the translation performance of widely used LLMs (GPT,
Claude, DeepSeek, and Grok) across different internal symbolic partitions
generated by a hierarchical reinforcement learning algorithm in the Ant Maze
and Ant Fall environments. Our findings reveal that although LLMs demonstrate
some ability to translate natural language into a symbolic representation of
the environment dynamics, their performance is highly sensitive to partition
granularity and task complexity. The results expose limitations in current LLMs'
capacity for representation alignment, highlighting the need for further
research on robust alignment between language and internal agent
representations.