paper · arXiv · Trust 82 · Primary · Published 2d ago · Live · 21h ago

GenAU: Language-Grounded Industrial Anomaly Understanding with Vision-Language Models

Industrial inspection requires more than binary anomaly detection: a practical system should determine whether an anomaly exists, localize the defective region, identify the defect type, and provide interpretable visual evidence. Existing CLIP-based methods detect and localize anomalies well but offer limited language-level defect understanding, while instruction-tuned vision-language models can describe defects but do not natively produce pixel-level masks. We introduce GenAU, a Generalist vision-language framework for industrial Anomaly Understanding that unifies image-level detection, pixel

Lineage graph

Paper → model → repo connections mined from source citations (Tier-1 exact match).

Why these links exist

Linked via arXiv author → GenAU: Language-Grounded Industrial Anomaly Understanding with Vision-Language Models:
Hongkuan Zhou, Tristan Rehm, Nadeem Nazer, Lavdim Halilaj, Jingcheng Wu, Steffen Staab

Implements

repo: vlm-starter

Covers

news: Into the Omniverse: Three Workflows for Improving Vision AI Agent Accuracy With Synthetic Data and Fine-Tuning

Has model

model: VioletVision-3B

authored (incoming)

person: Hongkuan Zhou, Tristan Rehm, Nadeem Nazer, Lavdim Halilaj, Jingcheng Wu, Steffen Staab

Topics

cs.CV