Balancing Specialization and Centralization: A Multi-Agent Reinforcement Learning Benchmark for Sequential Industrial Control
2510.20408v1
cs.LG, cs.AI, cs.MA, cs.SY, eess.SY
2025-10-25
Авторы:
Tom Maus, Asma Atamna, Tobias Glasmachers
Abstract
Autonomous control of multi-stage industrial processes requires both local
specialization and global coordination. Reinforcement learning (RL) offers a
promising approach, but its industrial adoption remains limited due to
challenges such as reward design, modularity, and action space management. Many
academic benchmarks differ markedly from industrial control problems, limiting
their transferability to real-world applications. This study introduces an
enhanced industry-inspired benchmark environment that combines tasks from two
existing benchmarks, SortingEnv and ContainerGym, into a sequential recycling
scenario with sorting and pressing operations. We evaluate two control
strategies: a modular architecture with specialized agents and a monolithic
agent governing the full system, while also analyzing the impact of action
masking. Our experiments show that without action masking, agents struggle to
learn effective policies, with the modular architecture performing better. When
action masking is applied, both architectures improve substantially, and the
performance gap narrows considerably. These results highlight the decisive role
of action space constraints and suggest that the advantages of specialization
diminish as action complexity is reduced. The proposed benchmark thus provides
a valuable testbed for exploring practical and robust multi-agent RL solutions
in industrial automation, while contributing to the ongoing debate on
centralization versus specialization.