Foundation Visual Encoders Are Secretly Few-Shot Anomaly Detectors
2510.01934v1
cs.CV, cs.AI, cs.LG
2025-10-04
Авторы:
Guangyao Zhai, Yue Zhou, Xinyan Deng, Lars Heckler, Nassir Navab, Benjamin Busam
Abstract
Few-shot anomaly detection streamlines and simplifies industrial safety
inspection. However, limited samples make accurate differentiation between
normal and abnormal features challenging, and even more so under
category-agnostic conditions. Large-scale pre-training of foundation visual
encoders has advanced many fields, as the enormous quantity of data helps to
learn the general distribution of normal images. We observe that the anomaly
amount in an image directly correlates with the difference in the learnt
embeddings and utilize this to design a few-shot anomaly detector termed
FoundAD. This is done by learning a nonlinear projection operator onto the
natural image manifold. The simple operator acts as an effective tool for
anomaly detection to characterize and identify out-of-distribution regions in
an image. Extensive experiments show that our approach supports multi-class
detection and achieves competitive performance while using substantially fewer
parameters than prior methods. Backed up by evaluations with multiple
foundation encoders, including fresh DINOv3, we believe this idea broadens the
perspective on foundation features and advances the field of few-shot anomaly
detection.
Ссылки и действия
Дополнительные ресурсы: