Unified Unsupervised Anomaly Detection via Matching Cost Filtering
2510.03363v1
cs.CV, cs.AI, eess.IV
2025-10-08
Авторы:
Zhe Zhang, Mingxiu Cai, Gaochang Wu, Jing Zhang, Lingqiao Liu, Dacheng Tao, Tianyou Chai, Xiatian Zhu
Abstract
Unsupervised anomaly detection (UAD) aims to identify image- and pixel-level
anomalies using only normal training data, with wide applications such as
industrial inspection and medical analysis, where anomalies are scarce due to
privacy concerns and cold-start constraints. Existing methods, whether
reconstruction-based (restoring normal counterparts) or embedding-based
(pretrained representations), fundamentally conduct image- or feature-level
matching to generate anomaly maps. Nonetheless, matching noise has been largely
overlooked, limiting their detection ability. Beyond earlier focus on unimodal
RGB-based UAD, recent advances expand to multimodal scenarios, e.g., RGB--3D
and RGB--Text, enabled by point cloud sensing and vision--language models.
Despite shared challenges, these lines remain largely isolated, hindering a
comprehensive understanding and knowledge transfer. In this paper, we advocate
unified UAD for both unimodal and multimodal settings in the matching
perspective. Under this insight, we present Unified Cost Filtering (UCF), a
generic post-hoc refinement framework for refining anomaly cost volume of any
UAD model. The cost volume is constructed by matching a test sample against
normal samples from the same or different modalities, followed by a learnable
filtering module with multi-layer attention guidance from the test sample,
mitigating matching noise and highlighting subtle anomalies. Comprehensive
experiments on 22 diverse benchmarks demonstrate the efficacy of UCF in
enhancing a variety of UAD methods, consistently achieving new state-of-the-art
results in both unimodal (RGB) and multimodal (RGB--3D, RGB--Text) UAD
scenarios. Code and models will be released at
https://github.com/ZHE-SAPI/CostFilter-AD.
Ссылки и действия
Дополнительные ресурсы: