Unlocking Visual Anomaly Detection: Navigating Challenges and Pioneering with Vision-Language Models

Hossein Kashiani
Clemson University

Time: 2025-12-03, 12:00 - 13:00 ET
Location: Rice 540 and Zoom

Abstract: Visual anomaly detection (VAD) is pivotal for ensuring quality in manufacturing, medical imaging, and safety inspections, yet it continues to face challenges such as data scarcity, domain shifts, and the need for precise localization and reasoning. This seminar explores VAD fundamentals, core challenges, and recent advancements leveraging vision-language models and multimodal large language models (MLLMs). We contrast CLIP-based methods for efficient zero/few-shot detection with MLLM-driven reasoning for explainable, threshold-free outcomes. Drawing from recent studies, we highlight emerging trends, benchmarks, and future directions toward building adaptable, real-world VAD systems. This talk is designed for researchers and practitioners interested in AI-driven inspection and next-generation multimodal approaches.

Bio: Hossein Kashiani is a fourth-year Ph.D. student in the IS-WiN Lab at Clemson University (CU), advised by Dr. Fatemeh Afghah. Before joining CU, he worked as a research assistant on biometrics at West Virginia University. He completed his Master’s degree in Electrical Engineering at Iran University of Science & Technology. His research focuses on improving the generalization of machine learning models to unseen domains, with applications in anomaly detection, biometrics, healthcare, visual perception, and scene understanding.