Is the Hard-Label Cryptanalytic Model Extraction Really Polynomial?
2510.06692v1
cs.LG, cs.CR
2025-10-10
Авторы:
Akira Ito, Takayuki Miura, Yosuke Todo
Abstract
Deep Neural Networks (DNNs) have attracted significant attention, and their
internal models are now considered valuable intellectual assets. Extracting
these internal models through access to a DNN is conceptually similar to
extracting a secret key via oracle access to a block cipher. Consequently,
cryptanalytic techniques, particularly differential-like attacks, have been
actively explored recently. ReLU-based DNNs are the most commonly and widely
deployed architectures. While early works (e.g., Crypto 2020, Eurocrypt 2024)
assume access to exact output logits, which are usually invisible, more recent
works (e.g., Asiacrypt 2024, Eurocrypt 2025) focus on the hard-label setting,
where only the final classification result (e.g., "dog" or "car") is available
to the attacker. Notably, Carlini et al. (Eurocrypt 2025) demonstrated that
model extraction is feasible in polynomial time even under this restricted
setting.
In this paper, we first show that the assumptions underlying their attack
become increasingly unrealistic as the attack-target depth grows. In practice,
satisfying these assumptions requires an exponential number of queries with
respect to the attack depth, implying that the attack does not always run in
polynomial time. To address this critical limitation, we propose a novel attack
method called CrossLayer Extraction. Instead of directly extracting the secret
parameters (e.g., weights and biases) of a specific neuron, which incurs
exponential cost, we exploit neuron interactions across layers to extract this
information from deeper layers. This technique significantly reduces query
complexity and mitigates the limitations of existing model extraction
approaches.
Ссылки и действия
Дополнительные ресурсы: