FlyLoRA: Boosting Task Decoupling and Parameter Efficiency via Implicit Rank-Wise Mixture-of-Experts
2510.08396v1
cs.LG, cs.AI, cs.CL
2025-10-11
Авторы:
Heming Zou, Yunliang Zang, Wutong Xu, Yao Zhu, Xiangyang Ji
Abstract
Low-Rank Adaptation (LoRA) is a widely used parameter-efficient fine-tuning
method for foundation models, but it suffers from parameter interference,
resulting in suboptimal performance. Although Mixture-of-Experts (MoE)-based
LoRA variants show promise in mitigating intra-task correlations in single-task
instruction tuning, they introduce additional router parameters and remain
ineffective in multi-task model merging where inter-task interference arises.
Inspired by the fly olfactory circuit, we propose FlyLoRA, an implicit
MoE-based LoRA variant that introduces: (1) rank-wise expert activation in the
up-projection matrix, and (2) an implicit router that unifies expert routing
and down-projection, where a frozen sparse random projection matrix replaces
the traditional dense trainable version. This design resolves the trade-off
between intra-task decorrelation and computational efficiency by eliminating
the need for an explicit router, while inherently mitigating inter-task
interference due to the orthogonality property of random matrices. Extensive
experiments across four domains -- general knowledge understanding, scientific
question answering, mathematical reasoning, and code generation -- demonstrate
consistent performance improvements over existing methods. Beyond empirical
gains, FlyLoRA highlights how biological structures can inspire innovations in
AI technologies. Code is available at https://github.com/gfyddha/FlyLoRA.
Ссылки и действия
Дополнительные ресурсы: