TRIM: Token-wise Attention-Derived Saliency for Data-Efficient Instruction Tuning
2510.07118v1
cs.CL, cs.LG
2025-10-10
Авторы:
Manish Nagaraj, Sakshi Choudhary, Utkarsh Saxena, Deepak Ravikumar, Kaushik Roy
Abstract
Instruction tuning is essential for aligning large language models (LLMs) to
downstream tasks and commonly relies on large, diverse corpora. However, small,
high-quality subsets, known as coresets, can deliver comparable or superior
results, though curating them remains challenging. Existing methods often rely
on coarse, sample-level signals like gradients, an approach that is
computationally expensive and overlooks fine-grained features. To address this,
we introduce TRIM (Token Relevance via Interpretable Multi-layer Attention), a
forward-only, token-centric framework. Instead of using gradients, TRIM
operates by matching underlying representational patterns identified via
attention-based "fingerprints" from a handful of target samples. Such an
approach makes TRIM highly efficient and uniquely sensitive to the structural
features that define a task. Coresets selected by our method consistently
outperform state-of-the-art baselines by up to 9% on downstream tasks and even
surpass the performance of full-data fine-tuning in some settings. By avoiding
expensive backward passes, TRIM achieves this at a fraction of the
computational cost. These findings establish TRIM as a scalable and efficient
alternative for building high-quality instruction-tuning datasets.
Ссылки и действия
Дополнительные ресурсы: