FinTrust: A Comprehensive Benchmark of Trustworthiness Evaluation in Finance Domain
2510.15232v1
cs.LG, cs.CL
2025-10-21
Авторы:
Tiansheng Hu, Tongyan Hu, Liuyang Bai, Yilun Zhao, Arman Cohan, Chen Zhao
Abstract
Recent LLMs have demonstrated promising ability in solving finance related
problems. However, applying LLMs in real-world finance application remains
challenging due to its high risk and high stakes property. This paper
introduces FinTrust, a comprehensive benchmark specifically designed for
evaluating the trustworthiness of LLMs in finance applications. Our benchmark
focuses on a wide range of alignment issues based on practical context and
features fine-grained tasks for each dimension of trustworthiness evaluation.
We assess eleven LLMs on FinTrust and find that proprietary models like o4-mini
outperforms in most tasks such as safety while open-source models like
DeepSeek-V3 have advantage in specific areas like industry-level fairness. For
challenging task like fiduciary alignment and disclosure, all LLMs fall short,
showing a significant gap in legal awareness. We believe that FinTrust can be a
valuable benchmark for LLMs' trustworthiness evaluation in finance domain.
Ссылки и действия
Дополнительные ресурсы: