DSSmoothing: Toward Certified Dataset Ownership Verification for Pre-trained Language Models via Dual-Space Smoothing
2510.15303v1
cs.CR, cs.AI, cs.CY
2025-10-21
Authors:
Ting Qiao, Xing Liu, Wenke Huang, Jianbin Li, Zhaoxin Fan, Yiming Li
Abstract
Large web-scale datasets have driven the rapid advancement of pre-trained
language models (PLMs), but unauthorized data usage has raised serious
copyright concerns. Existing dataset ownership verification (DOV) methods
typically assume that watermarks remain stable during inference; however, this
assumption often fails under natural noise and adversary-crafted perturbations.
We propose the first certified dataset ownership verification method for PLMs
based on dual-space smoothing (i.e., DSSmoothing). To address the challenges of
text discreteness and semantic sensitivity, DSSmoothing introduces continuous
perturbations in the embedding space to capture semantic robustness and applies
controlled token reordering in the permutation space to capture sequential
robustness. DSSmoothing consists of two stages: in the first stage, triggers
are collaboratively embedded in both spaces to generate norm-constrained and
robust watermarked datasets; in the second stage, randomized smoothing is
applied in both spaces during verification to compute the watermark robustness
(WR) of suspicious models and statistically compare it with the principal
probability (PP) values of a set of benign models. Theoretically, DSSmoothing
provides provable robustness guarantees for dataset ownership verification by
ensuring that WR consistently exceeds PP under bounded dual-space
perturbations. Extensive experiments on multiple representative web datasets
demonstrate that DSSmoothing achieves stable and reliable verification
performance and exhibits robustness against potential adaptive attacks.
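
The verification stage can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy linear classifier over position-weighted embeddings, models the "controlled token reordering" as a bounded number of random adjacent swaps, and replaces the statistical comparison with a simple margin rule. All names (`perturb`, `smoothed_target_rate`, `verify`, `make_linear_model`) and all parameter values are hypothetical.

```python
import numpy as np

def perturb(emb, sigma, max_swaps, rng):
    """Apply one dual-space perturbation to a (seq_len, dim) embedding matrix."""
    out = emb.copy()
    # Permutation space (assumed form): a bounded number of adjacent token swaps.
    for _ in range(rng.integers(0, max_swaps + 1)):
        i = rng.integers(0, len(out) - 1)
        out[[i, i + 1]] = out[[i + 1, i]]
    # Embedding space: isotropic Gaussian noise with standard deviation sigma.
    return out + rng.normal(0.0, sigma, size=out.shape)

def smoothed_target_rate(predict, emb, target, sigma=0.1, max_swaps=2, n=500, seed=0):
    """Monte Carlo estimate of how often `predict` returns `target` under smoothing."""
    rng = np.random.default_rng(seed)
    hits = sum(predict(perturb(emb, sigma, max_swaps, rng)) == target for _ in range(n))
    return hits / n

def verify(wr, benign_pps, margin=0.1):
    """Stand-in decision rule: WR must exceed every benign value by a margin."""
    return wr > max(benign_pps) + margin

def make_linear_model(w, pos_weights):
    """Toy binary classifier: position-weighted pooling, then a linear score."""
    def predict(emb):
        pooled = pos_weights @ emb
        return int(pooled @ w > 0)
    return predict

# Toy setup: the trigger lives along embedding dimension 0.
seq_len, dim, target = 6, 8, 1
pos_weights = np.linspace(0.5, 1.5, seq_len) / seq_len
trigger_emb = np.zeros((seq_len, dim))
trigger_emb[:, 0] = 1.0

# The suspicious model responds to the trigger direction; benign models do not.
suspicious = make_linear_model(np.eye(dim)[0], pos_weights)
benign = [make_linear_model(np.eye(dim)[k], pos_weights) for k in (1, 2, 3)]

wr = smoothed_target_rate(suspicious, trigger_emb, target)
pps = [smoothed_target_rate(m, trigger_emb, target, seed=k) for k, m in enumerate(benign, 1)]
verified = verify(wr, pps)
```

In this toy setting the suspicious model keeps predicting the target label under both kinds of perturbation, while the benign models respond to the trigger at roughly chance level, so the margin rule flags the suspicious model. A faithful implementation would use the paper's own definitions of WR and PP and its hypothesis test in place of the margin.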