Understanding Sensitivity of Differential Attention through the Lens of Adversarial Robustness
2510.00517v1
cs.LG, cs.CR
2025-10-04
Authors:
Tsubasa Takahashi, Shojiro Yamabe, Futa Waseda, Kento Sasaki
Abstract
Differential Attention (DA) has been proposed as a refinement to standard
attention, suppressing redundant or noisy context through a subtractive
structure and thereby reducing contextual hallucination. While this design
sharpens task-relevant focus, we show that it also introduces a structural
fragility under adversarial perturbations. Our theoretical analysis identifies
negative gradient alignment (a configuration encouraged by DA's subtraction) as
the key driver of sensitivity amplification, leading to increased gradient
norms and elevated local Lipschitz constants. We empirically validate this
Fragile Principle through systematic experiments on ViT/DiffViT and evaluations
of pretrained CLIP/DiffCLIP, spanning five datasets in total. These results
demonstrate higher attack success rates, frequent gradient opposition, and
stronger local sensitivity compared to standard attention. Furthermore,
depth-dependent experiments reveal a robustness crossover: stacking DA layers
attenuates small perturbations via depth-dependent noise cancellation, though
this protection fades under larger attack budgets. Overall, our findings
uncover a fundamental trade-off: DA improves discriminative focus on clean
inputs but increases adversarial vulnerability, underscoring the need to
jointly design for selectivity and robustness in future attention mechanisms.
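
The link between negative gradient alignment and larger gradient norms can be illustrated with a toy calculation. This is a minimal sketch, not the paper's analysis: it assumes the output of a subtractive mechanism can be written as f1(x) - lam * f2(x), so that its input gradient is g1 - lam * g2, and it uses the identity ||g1 - lam*g2||^2 = ||g1||^2 + lam^2 * ||g2||^2 - 2 * lam * <g1, g2>. The vectors, the value of LAMBDA, and all function names are hypothetical and chosen only for illustration.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Euclidean norm."""
    return math.sqrt(dot(u, u))

def cosine(u, v):
    """Cosine similarity; negative values mean the gradients oppose each other."""
    return dot(u, v) / (norm(u) * norm(v))

def subtractive_grad_norm(g1, g2, lam):
    """Norm of g1 - lam * g2, the input gradient of f1(x) - lam * f2(x)."""
    diff = [a - lam * b for a, b in zip(g1, g2)]
    return norm(diff)

LAMBDA = 0.5  # hypothetical subtraction weight, not taken from the paper

g1 = [1.0, 0.0]  # toy gradient of the first branch
cases = {
    "aligned": [1.0, 0.0],     # cosine = +1
    "orthogonal": [0.0, 1.0],  # cosine = 0
    "opposed": [-1.0, 0.0],    # cosine = -1 (negative gradient alignment)
}

results = {
    name: (cosine(g1, g2), subtractive_grad_norm(g1, g2, LAMBDA))
    for name, g2 in cases.items()
}

for name, (c, n) in results.items():
    print(f"{name:>10}: cosine={c:+.1f}  ||g1 - lam*g2||={n:.3f}")
```

In this toy setting, the combined gradient norm is smallest when the two branch gradients are aligned and largest when they are opposed, since the cross term -2 * lam * <g1, g2> turns positive once the inner product is negative. A larger input-gradient norm means a small perturbation can move the output further, which is the sense in which the abstract ties gradient opposition to elevated local sensitivity.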