Cocoon: A System Architecture for Differentially Private Training with Correlated Noises
2510.07304v1
cs.AR, cs.AI, cs.CR, cs.LG
2025-10-10
Авторы:
Donghwan Kim, Xin Gu, Jinho Baek, Timothy Lo, Younghoon Min, Kwangsik Shin, Jongryool Kim, Jongse Park, Kiwan Maeng
Abstract
Machine learning (ML) models memorize and leak training data, causing serious
privacy issues to data owners. Training algorithms with differential privacy
(DP), such as DP-SGD, have been gaining attention as a solution. However,
DP-SGD adds a noise at each training iteration, which degrades the accuracy of
the trained model. To improve accuracy, a new family of approaches adds
carefully designed correlated noises, so that noises cancel out each other
across iterations. We performed an extensive characterization study of these
new mechanisms, for the first time to the best of our knowledge, and show they
incur non-negligible overheads when the model is large or uses large embedding
tables. Motivated by the analysis, we propose Cocoon, a hardware-software
co-designed framework for efficient training with correlated noises. Cocoon
accelerates models with embedding tables through pre-computing and storing
correlated noises in a coalesced format (Cocoon-Emb), and supports large models
through a custom near-memory processing device (Cocoon-NMP). On a real system
with an FPGA-based NMP device prototype, Cocoon improves the performance by
2.33-10.82x(Cocoon-Emb) and 1.55-3.06x (Cocoon-NMP).