AgentArcEval: An Architecture Evaluation Method for Foundation Model based Agents
2510.21031v1
cs.SE, cs.AI
2025-10-28
Авторы:
Qinghua Lu, Dehai Zhao, Yue Liu, Hao Zhang, Liming Zhu, Xiwei Xu, Angela Shi, Tristan Tan, Rick Kazman
Abstract
The emergence of foundation models (FMs) has enabled the development of
highly capable and autonomous agents, unlocking new application opportunities
across a wide range of domains. Evaluating the architecture of agents is
particularly important as the architectural decisions significantly impact the
quality attributes of agents given their unique characteristics, including
compound architecture, autonomous and non-deterministic behaviour, and
continuous evolution. However, these traditional methods fall short in
addressing the evaluation needs of agent architecture due to the unique
characteristics of these agents. Therefore, in this paper, we present
AgentArcEval, a novel agent architecture evaluation method designed specially
to address the complexities of FM-based agent architecture and its evaluation.
Moreover, we present a catalogue of agent-specific general scenarios, which
serves as a guide for generating concrete scenarios to design and evaluate the
agent architecture. We demonstrate the usefulness of AgentArcEval and the
catalogue through a case study on the architecture evaluation of a real-world
tax copilot, named Luna.
Ссылки и действия
Дополнительные ресурсы: