Watermarking Large Language Models in Europe: Interpreting the AI Act in Light of Technology
2511.03641v1
cs.CR, cs.AI, cs.CL, cs.CY, 68T01, 68727, 68T30, 68T35, 68T37, 68T50
2025-11-07
Авторы:
Thomas Souverain
Abstract
To foster trustworthy Artificial Intelligence (AI) within the European Union,
the AI Act requires providers to mark and detect the outputs of their
general-purpose models. The Article 50 and Recital 133 call for marking methods
that are ''sufficiently reliable, interoperable, effective and robust''. Yet,
the rapidly evolving and heterogeneous landscape of watermarks for Large
Language Models (LLMs) makes it difficult to determine how these four standards
can be translated into concrete and measurable evaluations. Our paper addresses
this challenge, anchoring the normativity of European requirements in the
multiplicity of watermarking techniques. Introducing clear and distinct
concepts on LLM watermarking, our contribution is threefold. (1) Watermarking
Categorisation: We propose an accessible taxonomy of watermarking methods
according to the stage of the LLM lifecycle at which they are applied - before,
during, or after training, and during next-token distribution or sampling. (2)
Watermarking Evaluation: We interpret the EU AI Act's requirements by mapping
each criterion with state-of-the-art evaluations on robustness and
detectability of the watermark, and of quality of the LLM. Since
interoperability remains largely untheorised in LLM watermarking research, we
propose three normative dimensions to frame its assessment. (3) Watermarking
Comparison: We compare current watermarking methods for LLMs against the
operationalised European criteria and show that no approach yet satisfies all
four standards. Encouraged by emerging empirical tests, we recommend further
research into watermarking directly embedded within the low-level architecture
of LLMs.