Verifiable Fine-Tuning for LLMs: Zero-Knowledge Training Proofs Bound to Data Provenance and Policy
2510.16830v1
cs.CR, cs.CL, 68T07, 94A60, 68Q25, I.2.6; G.1.6; E.3; C.2.4
2025-10-22
Авторы:
Hasan Akgul, Daniel Borg, Arta Berisha, Amina Rahimova, Andrej Novak, Mila Petrov
Abstract
Large language models are often adapted through parameter efficient fine
tuning, but current release practices provide weak assurances about what data
were used and how updates were computed. We present Verifiable Fine Tuning, a
protocol and system that produces succinct zero knowledge proofs that a
released model was obtained from a public initialization under a declared
training program and an auditable dataset commitment. The approach combines
five elements. First, commitments that bind data sources, preprocessing,
licenses, and per epoch quota counters to a manifest. Second, a verifiable
sampler that supports public replayable and private index hiding batch
selection. Third, update circuits restricted to parameter efficient fine tuning
that enforce AdamW style optimizer semantics and proof friendly approximations
with explicit error budgets. Fourth, recursive aggregation that folds per step
proofs into per epoch and end to end certificates with millisecond
verification. Fifth, provenance binding and optional trusted execution property
cards that attest code identity and constants. On English and bilingual
instruction mixtures, the method maintains utility within tight budgets while
achieving practical proof performance. Policy quotas are enforced with zero
violations, and private sampling windows show no measurable index leakage.
Federated experiments demonstrate that the system composes with probabilistic
audits and bandwidth constraints. These results indicate that end to end
verifiable fine tuning is feasible today for real parameter efficient
pipelines, closing a critical trust gap for regulated and decentralized
deployments.