Quickly Tuning Foundation Models for Image Segmentation
2508.17283v1
cs.CV, cs.LG
2025-08-27
Authors:
Breenda Das, Lennart Purucker, Timur Carstensen, Frank Hutter
Abstract
Foundation models like SAM (Segment Anything Model) exhibit strong zero-shot
image segmentation performance, but often fall short on domain-specific tasks.
Fine-tuning these models typically requires significant manual effort and
domain expertise. In this work, we introduce QTT-SEG, a meta-learning-driven
approach for automating and accelerating the fine-tuning of SAM for image
segmentation. Built on the Quick-Tune hyperparameter optimization framework,
QTT-SEG predicts high-performing configurations using meta-learned cost and
performance models, efficiently navigating a search space of over 200 million
possibilities. We evaluate QTT-SEG on eight binary and five multiclass
segmentation datasets under tight time constraints. Our results show that
QTT-SEG consistently improves upon SAM's zero-shot performance and surpasses
AutoGluon Multimodal, a strong AutoML baseline, on most binary tasks within
three minutes. On multiclass datasets, QTT-SEG delivers consistent gains as
well. These findings highlight the promise of meta-learning in automating model
adaptation for specialized segmentation tasks. Code available at:
https://github.com/ds-brx/QTT-SEG/
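
To make the idea of cost- and performance-guided configuration selection more concrete, the following is a minimal sketch and not the QTT-SEG implementation. It assumes two already-trained predictors are available as plain callables (`predict_score` and `predict_cost`, both hypothetical names), and it ranks candidates by predicted score per second, a simplified stand-in for the acquisition function a framework such as Quick-Tune would use. The `Config` fields and all numbers are illustrative assumptions; only the 180-second budget reflects the three-minute limit mentioned in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Config:
    """Hypothetical fine-tuning configuration; the fields are illustrative only."""
    learning_rate: float
    batch_size: int


def pick_next_config(
    candidates: Sequence[Config],
    predict_score: Callable[[Config], float],
    predict_cost: Callable[[Config], float],
    remaining_seconds: float,
) -> Optional[Config]:
    """Return the affordable candidate with the best predicted score per second.

    Candidates whose predicted cost is non-positive or exceeds the remaining
    budget are skipped. Returns None if no candidate fits the budget.
    """
    affordable = [
        c for c in candidates if 0.0 < predict_cost(c) <= remaining_seconds
    ]
    if not affordable:
        return None
    return max(affordable, key=lambda c: predict_score(c) / predict_cost(c))


# Toy stand-ins for the meta-learned predictors (made-up values, not results).
candidates = [Config(1e-3, 4), Config(1e-4, 8), Config(1e-5, 16)]
toy_score = {candidates[0]: 0.70, candidates[1]: 0.80, candidates[2]: 0.85}
toy_cost = {candidates[0]: 60.0, candidates[1]: 90.0, candidates[2]: 400.0}

best = pick_next_config(
    candidates,
    predict_score=lambda c: toy_score[c],
    predict_cost=lambda c: toy_cost[c],
    remaining_seconds=180.0,  # three-minute budget
)
print(best)
```

In this toy setup the third candidate has the highest predicted score but is excluded because its predicted cost exceeds the budget. A real system would also refit or update its predictors as fine-tuning runs complete, which this sketch omits.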