GaLLoP: Gradient-based Sparse Learning on Low-Magnitude Parameters
2510.19778v1
cs.LG, cs.CL
2025-10-24
Авторы:
Anand Choudhary, Yasser Sulaıman, Lukas Mauch, Ghouthi Boukli Hacene, Fabien Cardinaux, Antoine Bosselut
Abstract
Sparse fine-tuning techniques adapt LLMs to downstream tasks by only tuning a
sparse subset of model parameters. However, the effectiveness of sparse
adaptation depends on optimally selecting the model parameters to be
fine-tuned. In this work, we introduce a novel sparse fine-tuning technique
named GaLLoP: Gradient-based Sparse Learning on Low-Magnitude Parameters, which
fine-tunes only those model parameters which have the largest gradient
magnitudes on downstream tasks and the smallest pre-trained magnitudes,
intuitively prioritizing parameters that are highly task-relevant, but
minimally disruptive to pre-trained knowledge. Our experimentation with LLaMA3
8B and Gemma 2B as base models shows that GaLLoP consistently improves or
matches the in-distribution as well as out-of-distribution performance obtained
via the usage of other leading parameter-efficient fine-tuning techniques,
including LoRA, DoRA, and SAFT. Our analysis demonstrates that GaLLoP mitigates
catastrophic forgetting and memorization of task data, as important pre-trained
parameters remain unchanged, and stabilizes performance relative to other
fine-tuning techniques, robustly generalizing across most random seeds.
Ссылки и действия
Дополнительные ресурсы: