Adaptive Shared Experts with LoRA-Based Mixture of Experts for Multi-Task Learning
2510.00570v1
cs.CV, cs.AI, cs.LG
2025-10-04
Authors:
Minghao Yang, Ren Togo, Guang Li, Takahiro Ogawa, Miki Haseyama
Abstract
Mixture-of-Experts (MoE) has emerged as a powerful framework for multi-task
learning (MTL). However, existing MoE-MTL methods often rely on single-task
pretrained backbones and suffer from redundant adaptation and inefficient
knowledge sharing during the transition from single-task to multi-task learning
(STL to MTL). To address these limitations, we propose adaptive shared experts
(ASE) within a low-rank adaptation (LoRA) based MoE, where shared experts are
assigned router-computed gating weights jointly normalized with sparse experts.
This design facilitates the STL-to-MTL transition and enhances both expert
specialization and cooperation. Furthermore, we incorporate fine-grained experts by increasing
the number of LoRA experts while proportionally reducing their rank, enabling
more effective knowledge sharing under a comparable parameter budget. Extensive
experiments on the PASCAL-Context benchmark, under unified training settings,
demonstrate that ASE consistently improves performance across diverse
configurations and validate the effectiveness of fine-grained designs for MTL.
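
The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes that the router emits one logit per expert (shared and sparse alike), that the top-k sparse experts are selected per token, and that a single softmax is taken over the shared logits together with the selected sparse logits. The function names `ase_gate` and `lora_moe_params` and all shapes are hypothetical. The parameter count uses the standard LoRA factorization (a rank-r pair of matrices of sizes r x d_in and d_out x r) and ignores router parameters, which grow with the number of experts.

```python
import numpy as np


def ase_gate(logits_shared, logits_sparse, k):
    """Gating weights for one token (assumed scheme, not the paper's code).

    Shared experts are always active; only the top-k sparse experts are.
    All active experts are normalized together by one softmax.
    """
    logits_shared = np.asarray(logits_shared, dtype=float)
    logits_sparse = np.asarray(logits_sparse, dtype=float)

    # Indices of the k largest sparse-expert logits.
    topk = np.argsort(logits_sparse)[-k:]

    # Joint softmax over shared logits and the selected sparse logits.
    joint = np.concatenate([logits_shared, logits_sparse[topk]])
    joint = np.exp(joint - joint.max())  # subtract max for numerical stability
    joint /= joint.sum()

    n_shared = logits_shared.shape[0]
    w_shared = joint[:n_shared]
    w_sparse = np.zeros_like(logits_sparse)  # unselected experts get weight 0
    w_sparse[topk] = joint[n_shared:]
    return w_shared, w_sparse


def lora_moe_params(num_experts, rank, d_in, d_out):
    """LoRA parameters for num_experts experts on one d_in x d_out layer."""
    return num_experts * rank * (d_in + d_out)
```

Under these assumptions, the shared and sparse weights for a token sum to 1, so a shared expert competes with the selected sparse experts rather than receiving a fixed weight. Likewise, `lora_moe_params(4, 16, 768, 768)` and `lora_moe_params(16, 4, 768, 768)` return the same count, which is the sense in which quadrupling the number of experts while quartering their rank keeps the LoRA budget comparable.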