AgentArcEval: An Architecture Evaluation Method for Foundation Model based Agents

2510.21031v1 cs.SE, cs.AI 2025-10-28

Авторы:

Qinghua Lu, Dehai Zhao, Yue Liu, Hao Zhang, Liming Zhu, Xiwei Xu, Angela Shi, Tristan Tan, Rick Kazman

Abstract

The emergence of foundation models (FMs) has enabled the development of highly capable and autonomous agents, unlocking new application opportunities across a wide range of domains. Evaluating the architecture of agents is particularly important as the architectural decisions significantly impact the quality attributes of agents given their unique characteristics, including compound architecture, autonomous and non-deterministic behaviour, and continuous evolution. However, these traditional methods fall short in addressing the evaluation needs of agent architecture due to the unique characteristics of these agents. Therefore, in this paper, we present AgentArcEval, a novel agent architecture evaluation method designed specially to address the complexities of FM-based agent architecture and its evaluation. Moreover, we present a catalogue of agent-specific general scenarios, which serves as a guide for generating concrete scenarios to design and evaluate the agent architecture. We demonstrate the usefulness of AgentArcEval and the catalogue through a case study on the architecture evaluation of a real-world tax copilot, named Luna.

Ссылки и действия

Читать на arXiv Скачать PDF

Дополнительные ресурсы:

AgentArcEval: An Architecture Evaluation Method for Foundation Model based Agents

Авторы:

Abstract

Ссылки и действия

Связанные статьи

Automating Complex Document Workflows via Stepwise and Rollback-Enabled Operatio...

Quantitative Analysis of Technical Debt and Pattern Violation in Large Language ...

MANTRA: a Framework for Multi-stage Adaptive Noise TReAtment During Training

Beyond Greenfield: The D3 Framework for AI-Driven Productivity in Brownfield Eng...

LLM-as-a-Judge for Scalable Test Coverage Evaluation: Accuracy, Operational Reli...

Навигация