research-article

Open access

FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization

Authors:

Hao Li,

Jinqiao WangAuthors Info & Claims

MM '24: Proceedings of the 32nd ACM International Conference on Multimedia

Pages 2041 - 2049

https://doi.org/10.1145/3664647.3680685

Published: 28 October 2024 Publication History

PDF eReader

Abstract

Zero-shot anomaly detection (ZSAD) methods detect anomalies without prior access to known normal or abnormal samples within target categories. Existing methods typically rely on pretrained multimodal models, computing similarities between manually crafted textual features representing ''normal'' or ''abnormal'' semantics and image patch features to detect anomalies. However, the generic descriptions of ''abnormal'' often fail to precisely match diverse types of anomalies across different object categories. Additionally, computing feature similarities for single patches struggles to pinpoint specific locations of anomalies with various sizes and scales. To address these issues, we propose a novel ZSAD method called FiLo, comprising two components: adaptively learned Fine-Grained Description (FG-Des) and position-enhanced High-Quality Localization (HQ-Loc). FG-Des introduces fine-grained anomaly descriptions for each category using Large Language Models (LLMs) and employs adaptively learned textual templates to enhance the accuracy and interpretability of anomaly detection. HQ-Loc, utilizing Grounding DINO for preliminary localization, position-enhanced text prompts, and Multi-scale Multi-shape Cross-modal Interaction (MMCI) module, facilitates more accurate localization of anomalies of different sizes and shapes. Experimental results on datasets like MVTec and VisA demonstrate that FiLo significantly improves the performance of ZSAD in both detection and localization, achieving state-of-the-art performance with an image-level AUC of 83.9% and a pixel-level AUC of 95.9% on the VisA dataset. Code is available at https://github.com/CASIA-IVA-Lab/FiLo.

Supplemental Material

MP4 File - 1502-video

FiLo: Zero-Shot Anomaly Detection by Fine-Grained Description and High-Quality Localization

Download
6.20 MB

References

[1]

Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. 2023. Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023).

Abstract

Supplemental Material

References

Cited By

Index Terms

Recommendations

Robust Anomaly Detection and Localization via Simulated Anomalies

Autoencoding Binary Classifiers for Supervised Anomaly Detection

AugPaste: One-Shot Anomaly Detection for Medical Images

Comments

Information

Published In

Sponsors

Publisher

Publication History

Check for updates

Author Tags

Qualifiers

Funding Sources

Conference

Acceptance Rates

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

View options

PDF

eReader

Login options

Full Access

Share

Share this Publication link

Share on social media

Affiliations