VLM-Guided Inpainting for Anomaly Detection
Abstract

Anomaly detection (AD) aims to identify regions in an image that deviate from the expected distribution of normal visual data, a task critical for applications such as industrial inspection. Recent CLIP-based approaches have enabled zero-shot anomaly detection by comparing image features with text-derived embeddings, leveraging pretrained vision-language alignment. While effective in general scenarios, these methods struggle to capture domain-specific normality and often fail to accurately localize subtle anomalies. We introduce a novel framework that integrates CLIP-guided mask inference with a diffusion-based generative inpainting module trained on normal data. To improve semantic consistency and reconstruction fidelity, we incorporate score distillation sampling (SDS) loss, which aligns the inpainted output with the distribution of normal images in the embedding space. Our method is model-agnostic and can be integrated into existing CLIP-based detectors without requiring anomaly annotations. Experiments on datasets from industrial and medical domains demonstrate consistent improvements when integrated with various backbones in both image-level and pixel-level detection tasks. Qualitative results show improved reconstruction and precise localization of fine-grained anomalies.
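As a rough illustration of the score distillation sampling (SDS) loss mentioned in the abstract, its gradient with respect to the inpainted image is commonly approximated as w(t) · (ε̂(x_t, t) − ε), where ε is the injected noise and ε̂ is the diffusion model's noise prediction at timestep t. Below is a minimal NumPy sketch; the noise predictor is a hypothetical placeholder standing in for the paper's pretrained diffusion model, and all names are illustrative, not taken from the authors' implementation.

```python
import numpy as np

def sds_gradient(x, noise_predictor, t, alpha_bar, rng):
    """Approximate SDS gradient w(t) * (eps_hat - eps) for image x.

    noise_predictor is a stand-in for a pretrained diffusion model
    (hypothetical here); alpha_bar is the cumulative noise schedule.
    """
    eps = rng.standard_normal(x.shape)                # sampled Gaussian noise
    a = alpha_bar[t]
    x_t = np.sqrt(a) * x + np.sqrt(1.0 - a) * eps     # forward-diffused image
    eps_hat = noise_predictor(x_t, t)                 # model's noise estimate
    w = 1.0 - a                                       # a common weighting choice
    return w * (eps_hat - eps)

# Toy usage with a "perfect" predictor that recovers the injected noise;
# a real model would be a trained denoising U-Net.
rng = np.random.default_rng(0)
alpha_bar = np.linspace(0.99, 0.01, 10)
x = rng.standard_normal((4, 4))

def perfect_predictor(x_t, t):
    a = alpha_bar[t]
    return (x_t - np.sqrt(a) * x) / np.sqrt(1.0 - a)

g = sds_gradient(x, perfect_predictor, t=5, alpha_bar=alpha_bar, rng=rng)
# when the predictor exactly recovers eps, the SDS gradient vanishes
```

In the paper's setting, this gradient would push the inpainted region toward the normal-data distribution learned by the diffusion model; the toy predictor above only demonstrates the mechanics of the update.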

Keywords

Anomaly Detection; Vision-Language Models; Diffusion Models; Score Distillation Sampling
Title
VLM-Guided Inpainting for Anomaly Detection
Authors
Seo, Jungyeon; Hong, Kibeom
DOI
10.33851/JMIS.2025.12.3.87
Publication Date
2025-09
Type
Y
Journal
Journal of Multimedia Information System
Volume
12
Issue
3
Pages
87–94