Learning What Helps: Task-Aligned Context Selection for Vision Tasks
2512.00489v1
cs.CV
2025-12-04
Authors:
Jingyu Guo, Emir Konuk, Fredrik Strand, Christos Matsoukas, Kevin Smith
Abstract
Humans often resolve visual uncertainty by comparing an image with relevant examples, but ViTs lack the ability to identify which examples would improve their predictions. We present Task-Aligned Context Selection (TACS), a framework that learns to select paired examples which truly improve task performance rather than those that merely appear similar. TACS jointly trains a selector network with the task model through a hybrid optimization scheme combining gradient-based supervision and reinforcement learning, making retrieval part of the learning objective. By aligning selection with task rewards, TACS enables discriminative models to discover which contextual examples genuinely help. Across 18 datasets covering fine-grained recognition, medical image classification, and medical image segmentation, TACS consistently outperforms similarity-based retrieval, particularly in challenging or data-limited settings.