Seo, Jungyeon; Hong, Kibeom
- Web of Science citations: 0
- Scopus citations: 0

Abstract
Anomaly detection (AD) aims to identify regions in an image that deviate from the expected distribution of normal visual data, a task critical for applications such as industrial inspection. Recent CLIP-based approaches have enabled zero-shot anomaly detection by comparing image features with text-derived embeddings, leveraging pretrained vision-language alignment. While effective in general scenarios, these methods struggle to capture domain-specific normality and often fail to accurately localize subtle anomalies. We introduce a novel framework that integrates CLIP-guided mask inference with a diffusion-based generative inpainting module trained on normal data. To improve semantic consistency and reconstruction fidelity, we incorporate score distillation sampling (SDS) loss, which aligns the inpainted output with the distribution of normal images in the embedding space. Our method is model-agnostic and can be integrated into existing CLIP-based detectors without requiring anomaly annotations. Experiments on datasets from industrial and medical domains demonstrate consistent improvements when integrated with various backbones in both image-level and pixel-level detection tasks. Qualitative results show improved reconstruction and precise localization of fine-grained anomalies.
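The zero-shot comparison the abstract describes — scoring an image feature against text-derived embeddings for "normal" and "anomalous" prompts — can be sketched minimally as a cosine-similarity softmax. This is an illustrative sketch of the general CLIP-style scoring scheme, not the paper's implementation; the function names, toy vectors, and temperature value are assumptions.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def anomaly_score(img_emb, normal_emb, anomalous_emb, temp=0.07):
    # Softmax over the image's similarity to the "normal" and "anomalous"
    # text embeddings; the probability mass assigned to the anomalous
    # prompt is used as the anomaly score in [0, 1].
    s_n = cosine(img_emb, normal_emb) / temp
    s_a = cosine(img_emb, anomalous_emb) / temp
    m = max(s_n, s_a)  # subtract max for numerical stability
    e_n, e_a = math.exp(s_n - m), math.exp(s_a - m)
    return e_a / (e_n + e_a)

# Toy 2-D embeddings (real CLIP embeddings are high-dimensional):
img = [1.0, 0.0]
normal_txt = [0.0, 1.0]
anomalous_txt = [1.0, 0.1]
score = anomaly_score(img, normal_txt, anomalous_txt)
```

In practice the same score can be computed per patch embedding rather than per image, which is what enables pixel-level localization maps.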
Keywords
- Title: VLM-Guided Inpainting for Anomaly Detection
- Authors: Seo, Jungyeon; Hong, Kibeom
- Publication date: 2025-09
- Type: Y
- Journal: Journal of Multimedia Information System
- Volume: 12
- Issue: 3
- Pages: 87–94