RAISE: A Unified Framework for Responsible AI Scoring and Evaluation
2510.18559v1
cs.LG, cs.AI, cs.CE, cs.CY
2025-10-23
Авторы:
Loc Phuc Truong Nguyen, Hung Thanh Do
Abstract
As AI systems enter high-stakes domains, evaluation must extend beyond
predictive accuracy to include explainability, fairness, robustness, and
sustainability. We introduce RAISE (Responsible AI Scoring and Evaluation), a
unified framework that quantifies model performance across these four
dimensions and aggregates them into a single, holistic Responsibility Score. We
evaluated three deep learning models: a Multilayer Perceptron (MLP), a Tabular
ResNet, and a Feature Tokenizer Transformer, on structured datasets from
finance, healthcare, and socioeconomics. Our findings reveal critical
trade-offs: the MLP demonstrated strong sustainability and robustness, the
Transformer excelled in explainability and fairness at a very high
environmental cost, and the Tabular ResNet offered a balanced profile. These
results underscore that no single model dominates across all responsibility
criteria, highlighting the necessity of multi-dimensional evaluation for
responsible model selection. Our implementation is available at:
https://github.com/raise-framework/raise.