Zero-shot Shape Classification of Nanoparticles in SEM Images using Vision Foundation Models

2508.03235v1 cs.CV 2025-08-09

Authors:

Freida Barnatan, Emunah Goldstein, Einav Kalimian, Orchen Madar, Avi Huri, David Zitoun, Ya'akov Mandelbaum, Moshe Amitay

Summary

The paper proposes a method for classifying nanoparticle shapes in scanning electron microscopy (SEM) images that is accessible to a broad range of users. It combines two vision foundation models, the Segment Anything Model (SAM) for object segmentation and DINOv2 for feature extraction, with a lightweight classifier. The approach achieves high classification accuracy without extensive fine-tuning of the models, which streamlines data preparation. In an evaluation on three different datasets, the method outperformed a fine-tuned YOLOv11 and ChatGPT o4-mini-high. The work opens a path toward more accessible and efficient nanoparticle analysis systems in real industrial settings.

Abstract

Accurate and efficient characterization of nanoparticle morphology in Scanning Electron Microscopy (SEM) images is critical for ensuring product quality in nanomaterial synthesis and accelerating development. However, conventional deep learning methods for shape classification require extensive labeled datasets and computationally demanding training, limiting their accessibility to the typical nanoparticle practitioner in research and industrial settings. In this study, we introduce a zero-shot classification pipeline that leverages two vision foundation models: the Segment Anything Model (SAM) for object segmentation and DINOv2 for feature embedding. By combining these models with a lightweight classifier, we achieve high-precision shape classification across three morphologically diverse nanoparticle datasets - without the need for extensive parameter fine-tuning. Our methodology outperforms both a fine-tuned YOLOv11 baseline and a ChatGPT o4-mini-high baseline, demonstrating robustness to small datasets, subtle morphological variations, and domain shifts from natural to scientific imaging. Quantitative clustering metrics on PCA plots of the DINOv2 features are discussed as a means of assessing the progress of the chemical synthesis. This work highlights the potential of foundation models to advance automated microscopy image analysis, offering an alternative to traditional deep learning pipelines in nanoparticle research that is both more efficient and more accessible to the user.
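
The last stage of the pipeline described in the abstract, a lightweight classifier operating on frozen feature embeddings, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not say which classifier the authors use, so a nearest-centroid rule is swapped in; the embeddings are synthetic random vectors standing in for DINOv2 features of SAM-segmented particles; and the shape labels in `SHAPES` are hypothetical.

```python
import math
import random

def make_synthetic_embeddings(labels, dim=16, per_class=20, noise=0.3, seed=0):
    """Stand-in for DINOv2 features: one random centre per class plus Gaussian noise."""
    rng = random.Random(seed)
    data = []
    for label in labels:
        centre = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for _ in range(per_class):
            vec = [c + rng.gauss(0.0, noise) for c in centre]
            data.append((vec, label))
    rng.shuffle(data)
    return data

def fit_centroids(samples):
    """Average the embeddings of each class; this is the whole 'training' step."""
    sums, counts = {}, {}
    for vec, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}

def predict(centroids, vec):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))

def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, vec) == label for vec, label in samples)
    return hits / len(samples)

# Hypothetical shape classes; the abstract does not list the actual ones.
SHAPES = ["sphere", "cube", "rod"]

data = make_synthetic_embeddings(SHAPES)
train, test = data[:45], data[45:]   # small labelled set, held-out remainder
centroids = fit_centroids(train)
test_accuracy = accuracy(centroids, test)
print(f"held-out accuracy on synthetic embeddings: {test_accuracy:.2f}")
```

In a real setting the synthetic vectors would be replaced by embeddings computed from segmented particle crops. The point of the sketch is only that, once the features are fixed, the classifier itself needs no gradient-based training.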
