ObCLIP: Oblivious CLoud-Device Hybrid Image Generation with Privacy Preservation
2510.04153v1
cs.CR, cs.LG
2025-10-08
Authors:
Haoqi Wu, Wei Dai, Ming Xu, Li Wang, Qiang Yan
Abstract
Diffusion models have gained significant popularity due to their remarkable
capabilities in image generation, albeit at the cost of intensive computational
requirements. Meanwhile, despite their widespread deployment in inference
services such as Midjourney, concerns about the potential leakage of sensitive
information in uploaded user prompts have arisen. Existing solutions either
lack rigorous privacy guarantees or fail to strike an effective balance between
utility and efficiency. To bridge this gap, we propose ObCLIP, a plug-and-play
safeguard that enables oblivious cloud-device hybrid generation. By oblivious,
we mean that each input prompt is transformed into a set of semantically similar candidate
prompts that differ only in sensitive attributes (e.g., gender, ethnicity). The
cloud server processes all candidate prompts without knowing which one is the
real one, thus preventing any prompt leakage. To mitigate server cost, only a
small portion of the denoising steps is performed on the large cloud model. The
intermediate latents are then sent back to the client, which selects the
targeted latent and completes the remaining denoising using a small device
model. Additionally, we analyze and incorporate several cache-based
accelerations that leverage temporal and batch redundancy, effectively reducing
computation cost with minimal utility degradation. Extensive experiments across
multiple datasets demonstrate that ObCLIP provides rigorous privacy and
utility comparable to that of cloud models, at a slightly increased server cost.
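
The hybrid flow described in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: the names `make_candidates`, `cloud_generate`, `device_finish`, and `hybrid_generate`, the `{attr}` placeholder, and the arithmetic "denoiser" are all hypothetical stand-ins chosen for illustration, and the cache-based accelerations are omitted. It only shows the control flow: the client builds candidate prompts that differ in one sensitive attribute, the server runs the first few steps on every candidate, and the client keeps the latent at the index only it knows and finishes locally.

```python
import random
from typing import List, Sequence

Latent = List[float]


def make_candidates(prompt: str, placeholder: str, values: Sequence[str]) -> List[str]:
    """Build candidate prompts that differ only in one sensitive attribute."""
    return [prompt.replace(placeholder, v) for v in values]


def toy_denoise_step(latent: Latent, prompt: str, strength: float) -> Latent:
    """Stand-in for one denoising step (NOT a real diffusion model):
    pull each latent value toward a prompt-dependent target."""
    target = (sum(ord(c) for c in prompt) % 100) / 100.0
    return [x + strength * (target - x) for x in latent]


def cloud_generate(candidates: Sequence[str], init_latent: Latent, steps: int) -> List[Latent]:
    """Server side: run the first `steps` steps on every candidate.
    The server treats all candidates identically and never sees the real index."""
    latents = []
    for prompt in candidates:
        z = list(init_latent)
        for _ in range(steps):
            z = toy_denoise_step(z, prompt, 0.5)  # "large" model, assumed
        latents.append(z)
    return latents


def device_finish(latent: Latent, prompt: str, steps: int) -> Latent:
    """Client side: complete the remaining steps with a "small" model, assumed."""
    z = list(latent)
    for _ in range(steps):
        z = toy_denoise_step(z, prompt, 0.2)
    return z


def hybrid_generate(template: str, real_value: str, values: Sequence[str],
                    total_steps: int = 10, cloud_steps: int = 3, seed: int = 0) -> Latent:
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)  # so the position of the real prompt carries no information
    candidates = make_candidates(template, "{attr}", shuffled)
    real_index = shuffled.index(real_value)  # kept on the client only
    init_latent = [rng.gauss(0.0, 1.0) for _ in range(4)]

    latents = cloud_generate(candidates, init_latent, cloud_steps)  # sent to the cloud
    chosen = latents[real_index]  # client selects the targeted latent
    return device_finish(chosen, candidates[real_index], total_steps - cloud_steps)


if __name__ == "__main__":
    result = hybrid_generate("a photo of a {attr} doctor", "female",
                             ["male", "female", "nonbinary"])
    print(result)
```

In this sketch the server's work grows linearly with the number of candidates, which is why the abstract emphasizes running only a small portion of the steps in the cloud and exploiting redundancy across the batch.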