Rethinking Parameter Sharing for LLM Fine-Tuning with Multiple LoRAs
2509.25414v1
cs.LG, cs.AI, cs.CL
2025-10-02
Authors:
Hao Ban, Kaiyi Ji
Abstract
Large language models are often adapted using parameter-efficient techniques
such as Low-Rank Adaptation (LoRA), formulated as $y = W_0x + BAx$, where $W_0$
denotes the pre-trained parameters and $x$ is the input to the adapted layer. While
multi-adapter extensions often employ multiple LoRAs, prior studies suggest
that the inner $A$ matrices are highly similar during training and thus
suitable for sharing. We revisit this phenomenon and find that this similarity
is largely attributable to the identical initialization rather than shared
knowledge, with $B$ playing a more critical role in knowledge encoding and
transfer. Motivated by these insights, we propose \textbf{ALoRA}, an asymmetric
multi-LoRA design with multiple $A$ matrices and a single shared $B$ in
multi-task fine-tuning, and \textbf{Fed-ALoRA}, which shares $B$ across clients
in federated fine-tuning under both homogeneous and heterogeneous settings,
using a novel matrix decomposition strategy to accommodate heterogeneous
ranks across clients. Experiments on commonsense reasoning, math reasoning,
a multi-task NLP dataset, and a federated NLP dataset demonstrate that our methods
achieve more balanced performance across tasks with comparable or superior
average accuracy relative to existing multi-LoRA approaches. Code is
available at https://github.com/OptMN-Lab/ALoRA.
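
To make the asymmetric design concrete, the following is a minimal NumPy sketch of the LoRA forward pass $y = W_0x + BAx$ and of a variant with several $A$ matrices and one shared $B$. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the multiple $A$ matrices are selected or combined, so the sketch simply assumes one $A$ per task chosen by a task index. The function names, dimensions, and random values are hypothetical and do not reflect the paper's initialization or training setup.

```python
import numpy as np

def lora_forward(x, W0, B, A):
    """Single-LoRA forward pass: y = W0 x + B A x."""
    return W0 @ x + B @ (A @ x)

def alora_forward(x, W0, B_shared, A_list, task_id):
    """Asymmetric multi-LoRA sketch: one A per task, a single shared B.

    Assumption: the task index selects which A is used. The abstract does
    not specify the selection or combination rule.
    """
    return W0 @ x + B_shared @ (A_list[task_id] @ x)

# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
d_out, d_in, rank, num_tasks = 6, 8, 2, 3

W0 = rng.normal(size=(d_out, d_in))        # frozen pre-trained weights
B_shared = rng.normal(size=(d_out, rank))  # shared across all tasks
A_list = [rng.normal(size=(rank, d_in)) for _ in range(num_tasks)]  # one per task

x = rng.normal(size=d_in)
outputs = [alora_forward(x, W0, B_shared, A_list, t) for t in range(num_tasks)]

# Adapter parameter count: num_tasks matrices A plus one B,
# versus num_tasks full (A, B) pairs for fully separate LoRAs.
shared_b_params = num_tasks * rank * d_in + rank * d_out
separate_params = num_tasks * (rank * d_in + rank * d_out)
```

With these sizes, sharing $B$ gives 60 adapter parameters against 84 for three fully separate LoRAs. The saving grows with the number of tasks, since only the $A$ matrices are replicated.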