TED++: Submanifold-Aware Backdoor Detection via Layerwise Tubular-Neighbourhood Screening
2510.14299v1
cs.LG, cs.AI, 68T07, 62H30, 53Z50, I.2.6; I.5.1; K.6.5
2025-10-18
Авторы:
Nam Le, Leo Yu Zhang, Kewen Liao, Shirui Pan, Wei Luo
Abstract
As deep neural networks power increasingly critical applications, stealthy
backdoor attacks, where poisoned training inputs trigger malicious model
behaviour while appearing benign, pose a severe security risk. Many existing
defences are vulnerable when attackers exploit subtle distance-based anomalies
or when clean examples are scarce. To meet this challenge, we introduce TED++,
a submanifold-aware framework that effectively detects subtle backdoors that
evade existing defences. TED++ begins by constructing a tubular neighbourhood
around each class's hidden-feature manifold, estimating its local ``thickness''
from a handful of clean activations. It then applies Locally Adaptive Ranking
(LAR) to detect any activation that drifts outside the admissible tube. By
aggregating these LAR-adjusted ranks across all layers, TED++ captures how
faithfully an input remains on the evolving class submanifolds. Based on such
characteristic ``tube-constrained'' behaviour, TED++ flags inputs whose
LAR-based ranking sequences deviate significantly. Extensive experiments are
conducted on benchmark datasets and tasks, demonstrating that TED++ achieves
state-of-the-art detection performance under both adaptive-attack and
limited-data scenarios. Remarkably, even with only five held-out examples per
class, TED++ still delivers near-perfect detection, achieving gains of up to
14\% in AUROC over the next-best method. The code is publicly available at
https://github.com/namle-w/TEDpp.