PVMark: Enabling Public Verifiability for LLM Watermarking Schemes
2510.26274v1
cs.CR, cs.CL, cs.LG
2025-11-01
Авторы:
Haohua Duan, Liyao Xiang, Xin Zhang
Abstract
Watermarking schemes for large language models (LLMs) have been proposed to
identify the source of the generated text, mitigating the potential threats
emerged from model theft. However, current watermarking solutions hardly
resolve the trust issue: the non-public watermark detection cannot prove itself
faithfully conducting the detection. We observe that it is attributed to the
secret key mostly used in the watermark detection -- it cannot be public, or
the adversary may launch removal attacks provided the key; nor can it be
private, or the watermarking detection is opaque to the public. To resolve the
dilemma, we propose PVMark, a plugin based on zero-knowledge proof (ZKP),
enabling the watermark detection process to be publicly verifiable by third
parties without disclosing any secret key. PVMark hinges upon the proof of
`correct execution' of watermark detection on which a set of ZKP constraints
are built, including mapping, random number generation, comparison, and
summation. We implement multiple variants of PVMark in Python, Rust and Circom,
covering combinations of three watermarking schemes, three hash functions, and
four ZKP protocols, to show our approach effectively works under a variety of
circumstances. By experimental results, PVMark efficiently enables public
verifiability on the state-of-the-art LLM watermarking schemes yet without
compromising the watermarking performance, promising to be deployed in
practice.
Ссылки и действия
Дополнительные ресурсы: