Auditing Pay-Per-Token in Large Language Models
2510.05181v1
cs.CR, cs.AI, cs.CY
2025-10-09
Authors:
Ander Artola Velasco, Stratis Tsirtsis, Manuel Gomez-Rodriguez
Abstract
Millions of users rely on a market of cloud-based services to obtain access
to state-of-the-art large language models. However, it has been very recently
shown that the de facto pay-per-token pricing mechanism used by providers
creates a financial incentive for them to strategize and misreport the (number
of) tokens a model used to generate an output. In this paper, we develop an
auditing framework based on martingale theory that enables a trusted
third-party auditor who sequentially queries a provider to detect token
misreporting. Crucially, we show that our framework is guaranteed to always
detect token misreporting, regardless of the provider's (mis-)reporting policy,
and that, with high probability, it does not falsely flag a faithful provider
as unfaithful.
To validate our auditing framework, we conduct experiments across a wide range
of (mis-)reporting policies using several large language models from the
$\texttt{Llama}$, $\texttt{Gemma}$ and $\texttt{Ministral}$ families, and input
prompts from a popular crowdsourced benchmarking platform. The results show
that our framework detects an unfaithful provider after observing fewer than
$\sim 70$ reported outputs, while maintaining the probability of falsely
flagging a faithful provider below $\alpha = 0.05$.
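
The abstract does not spell out the test statistic, so the following is only a generic illustration of how a martingale-based sequential audit can control the false-flag probability at level $\alpha$. It is a minimal sketch of a standard betting-style test martingale, not the authors' construction. The function name `audit_stopping_time`, the `bet` parameter, and the per-output `discrepancies` signal are all hypothetical: each discrepancy is assumed to be normalized to $[-1, 1]$ and to have non-positive conditional mean when the provider reports faithfully. Under that assumption the running wealth is a non-negative supermartingale starting at 1, so by Ville's inequality it reaches $1/\alpha$ with probability at most $\alpha$.

```python
def audit_stopping_time(discrepancies, alpha=0.05, bet=0.5):
    """Generic betting-style sequential test (illustrative, not the paper's method).

    discrepancies: iterable of per-output values in [-1, 1]; ASSUMED to have
        non-positive conditional mean if the provider is faithful.
    alpha: target bound on the probability of falsely flagging a faithful provider.
    bet: fixed fraction of wealth wagered each round, in (0, 1].

    Returns the 1-based index of the output at which the provider is flagged,
    or None if the wealth never reaches 1 / alpha.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")
    if not 0.0 < bet <= 1.0:
        raise ValueError("bet must lie in (0, 1]")

    wealth = 1.0              # test martingale starts at 1
    threshold = 1.0 / alpha   # Ville's inequality: P(wealth ever >= 1/alpha) <= alpha

    for t, x in enumerate(discrepancies, start=1):
        if not -1.0 <= x <= 1.0:
            raise ValueError("each discrepancy must lie in [-1, 1]")
        # Multiplicative update; the factor is non-negative because bet * x >= -1.
        wealth *= 1.0 + bet * x
        if wealth >= threshold:
            return t          # enough evidence accumulated: flag the provider
    return None               # never flagged on this sequence


# A provider whose every output shows the maximal discrepancy is flagged quickly,
# while a sequence with no discrepancy is never flagged.
print(audit_stopping_time([1.0] * 20))
print(audit_stopping_time([0.0] * 20))
```

A fixed `bet` keeps the sketch short; adaptive betting fractions are a common refinement, but whether and how the paper uses them is not stated in the abstract.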