Score-based Membership Inference on Diffusion Models
2509.25003v1
cs.LG, cs.CV
2025-10-01
Authors:
Mingxing Rao, Bowen Qu, Daniel Moyer
Abstract
Membership inference attacks (MIAs) against diffusion models have emerged as
a pressing privacy concern, as these models may inadvertently reveal whether a
given sample was part of their training set. We present a theoretical and
empirical study of score-based MIAs, focusing on the predicted noise vectors
that diffusion models learn to approximate. We show that the expected denoiser
output points toward a kernel-weighted local mean of nearby training samples,
such that its norm encodes proximity to the training set and thereby reveals
membership. Building on this observation, we propose SimA, a single-query
attack that provides a principled, efficient alternative to existing
multi-query methods. SimA achieves consistently strong performance across
variants of DDPM and Latent Diffusion Models (LDMs). Notably, we find that Latent
Diffusion Models are surprisingly less vulnerable than pixel-space models, due
to the strong information bottleneck imposed by their latent auto-encoder. We
further investigate this by varying the regularization hyperparameter
($\beta$ in $\beta$-VAE) of the latent channel and suggest a strategy to make LDM
training more robust to MIA. Our results solidify the theory of score-based
MIAs, while highlighting that the Latent Diffusion class of methods requires a
better understanding of inversion for the VAE, and not simply inversion of the
diffusion process.
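
The claim that the norm of the predicted noise encodes proximity to the training set can be illustrated on toy data. The sketch below is not the authors' SimA implementation: it assumes an idealized denoiser with a closed form for a finite training set (a softmax-weighted mean of the scaled training points), a single noise level `alpha_bar`, and synthetic Gaussian data. The names `optimal_eps` and `membership_score` are hypothetical and chosen only for illustration.

```python
import math
import random


def optimal_eps(x, train, alpha_bar):
    """Noise predicted by an idealized denoiser fit to a finite training set.

    Assumption: x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps, so the
    posterior mean is a Gaussian-kernel-weighted mean of the scaled training points.
    """
    sigma2 = 1.0 - alpha_bar
    scale = math.sqrt(alpha_bar)
    centers = [[scale * v for v in p] for p in train]
    # Log kernel weights: -||x - center||^2 / (2 * sigma^2)
    logits = [
        -sum((a - b) ** 2 for a, b in zip(x, c)) / (2.0 * sigma2)
        for c in centers
    ]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    local_mean = [
        sum(w * c[j] for w, c in zip(weights, centers)) / total
        for j in range(len(x))
    ]
    # The predicted noise points from the local mean to the query, rescaled.
    return [(a - b) / math.sqrt(sigma2) for a, b in zip(x, local_mean)]


def membership_score(x, train, alpha_bar):
    """Single-query score: norm of the predicted noise (lower suggests member)."""
    return math.sqrt(sum(e * e for e in optimal_eps(x, train, alpha_bar)))


random.seed(0)
dim, n = 8, 200
train = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]
held_out = [[random.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]

alpha_bar = 0.99  # assumed low-noise timestep
member_scores = [membership_score(x, train, alpha_bar) for x in train]
nonmember_scores = [membership_score(x, train, alpha_bar) for x in held_out]

print("mean member score:    ", sum(member_scores) / n)
print("mean non-member score:", sum(nonmember_scores) / n)
```

In this toy setting a training point sits almost exactly on its own kernel center, so its predicted-noise norm is small, while a held-out point is pulled toward the nearest training points and receives a larger norm. A trained network only approximates this idealized denoiser, so the separation in practice depends on the model and the chosen timestep.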