The Argument is the Explanation: Structured Argumentation for Trust in Agents

2510.03442v1 cs.LG, cs.AI, cs.MA 2025-10-08

Authors:

Ege Cakar, Per Ola Kristensson

Abstract

Humans are black boxes: we cannot observe their neural processes, yet society functions by evaluating verifiable arguments. AI explainability should follow this principle: stakeholders need verifiable reasoning chains, not mechanistic transparency. We propose using structured argumentation to provide a level of explanation and verification that neither interpretability nor LLM-generated explanation can offer. Our pipeline converts LLM text into argument graphs, enabling verification at each inferential step. It achieves a state-of-the-art 94.44 macro F1 on the published AAEC train/test split (5.7 points above prior work) and 0.81 macro F1 on Argumentative MicroTexts relation classification, about 0.07 above previously published results with comparable data setups. We demonstrate this idea on multi-agent risk assessment using the Structured What-If Technique, where specialized agents collaborate transparently to carry out risk assessment otherwise performed by humans alone. Using Bipolar Assumption-Based Argumentation, we capture support/attack relationships, thereby enabling automatic hallucination detection via fact nodes attacking arguments. We also provide a verification mechanism that enables iterative refinement through test-time feedback without retraining. For easy deployment, we provide a Docker container for the fine-tuned AMT model, and the rest of the code, together with the Bipolar ABA Python package, on GitHub.
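
The idea of fact nodes attacking arguments can be illustrated with a small sketch. The code below is not the authors' Bipolar ABA package and does not implement Bipolar ABA semantics; it is a simplified, hypothetical graph in which every name (`ArgumentGraph`, `add_node`, `add_relation`, `flag_claims`, and the example node labels) is an assumption made for illustration. It marks a claim as flagged when a fact node attacks it directly, and as affected when it is reached through support edges from a flagged claim.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ArgumentGraph:
    """Toy bipolar graph: nodes are 'fact' or 'claim'; edges attack or support."""
    kinds: dict = field(default_factory=dict)    # node name -> "fact" | "claim"
    attacks: set = field(default_factory=set)    # (source, target) pairs
    supports: set = field(default_factory=set)   # (source, target) pairs

    def add_node(self, name: str, kind: str) -> None:
        if kind not in ("fact", "claim"):
            raise ValueError(f"unknown node kind: {kind}")
        self.kinds[name] = kind

    def add_relation(self, source: str, target: str, relation: str) -> None:
        if source not in self.kinds or target not in self.kinds:
            raise KeyError("both endpoints must be added before relating them")
        if relation == "attack":
            self.attacks.add((source, target))
        elif relation == "support":
            self.supports.add((source, target))
        else:
            raise ValueError(f"unknown relation: {relation}")

    def flag_claims(self) -> tuple[set, set]:
        """Return (flagged, affected).

        flagged:  claims directly attacked by a fact node.
        affected: other claims reachable via support edges from a flagged claim,
                  i.e. claims whose support chain includes a flagged claim.
        """
        flagged = {
            tgt for src, tgt in self.attacks
            if self.kinds[src] == "fact" and self.kinds[tgt] == "claim"
        }
        affected = set()
        queue = deque(flagged)
        while queue:
            node = queue.popleft()
            for src, tgt in self.supports:
                if src == node and tgt not in flagged and tgt not in affected:
                    affected.add(tgt)
                    queue.append(tgt)
        return flagged, affected


# Hypothetical risk-assessment fragment (labels invented for illustration).
g = ArgumentGraph()
g.add_node("F1: valve is rated for 10 bar", "fact")
g.add_node("C1: valve is rated for 50 bar", "claim")
g.add_node("C2: overpressure risk is negligible", "claim")
g.add_node("C3: operators are trained annually", "claim")
g.add_relation("C1: valve is rated for 50 bar",
               "C2: overpressure risk is negligible", "support")
g.add_relation("F1: valve is rated for 10 bar",
               "C1: valve is rated for 50 bar", "attack")

flagged, affected = g.flag_claims()
print("flagged:", sorted(flagged))
print("affected:", sorted(affected))
```

In this toy example the fact node contradicts C1, so C1 is flagged as a possible hallucination and C2, which depends on C1 for support, is reported as affected; C3 is untouched. A reviewer or a feedback loop could then ask the generating agent to revise only the flagged step.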
