On the Adversarial Robustness of Learning-based Conformal Novelty Detection
2510.00463v1
stat.ML, cs.LG, eess.SP, stat.ME
2025-10-04
Authors:
Daofu Zhang, Mehrdad Pournaderi, Hanne M. Clifford, Yu Xiang, Pramod K. Varshney
Abstract
This paper studies the adversarial robustness of conformal novelty detection.
In particular, we focus on AdaDetect, a powerful learning-based framework for
novelty detection with finite-sample false discovery rate (FDR) control. While
AdaDetect provides rigorous statistical guarantees under benign conditions, its
behavior under adversarial perturbations remains unexplored. We first formulate
an oracle attack setting that quantifies the worst-case degradation of FDR,
deriving an upper bound that characterizes the statistical cost of attacks.
This idealized formulation directly motivates a practical and effective attack
scheme that requires only query access to AdaDetect's output labels. Coupling
these formulations with two popular and complementary black-box adversarial
algorithms, we systematically evaluate the vulnerability of AdaDetect on
synthetic and real-world datasets. Our results show that adversarial
perturbations can significantly increase the FDR while maintaining high
detection power, exposing fundamental limitations of current error-controlled
novelty detection methods and motivating the development of more robust
alternatives.
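
For readers unfamiliar with the setting, the following is a minimal sketch of generic conformal novelty detection with FDR control: conformal p-values are computed from held-out null scores and then passed to the Benjamini-Hochberg procedure. It is an illustration under stated assumptions, not the authors' implementation. AdaDetect additionally learns its score function from data, which is replaced here by hard-coded toy scores, and the function names are hypothetical.

```python
def conformal_pvalues(null_scores, test_scores):
    """Conformal p-value for each test score (larger score = more novel).

    p_j = (1 + #{i : null_i >= test_j}) / (n + 1), with n null scores.
    """
    n = len(null_scores)
    return [(1 + sum(s >= t for s in null_scores)) / (n + 1)
            for t in test_scores]


def benjamini_hochberg(pvalues, alpha):
    """Return the sorted indices rejected by Benjamini-Hochberg at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda j: pvalues[j])
    k = 0  # largest rank whose p-value falls under the BH threshold
    for rank, j in enumerate(order, start=1):
        if pvalues[j] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Toy scores (assumed, not from the paper): 99 null scores and 5 test scores.
null_scores = list(range(99))
test_scores = [500, 400, 50, 300, 20]

pvals = conformal_pvalues(null_scores, test_scores)
discoveries = benjamini_hochberg(pvals, alpha=0.1)
print(pvals)
print(discoveries)
```

In this toy run the three test points whose scores exceed every null score are flagged as novelties. An attack of the kind studied here would perturb test points so that true nulls receive small p-values and enter the rejection set, which inflates the FDR.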