Learning from All: Concept Alignment for Autonomous Distillation from Multiple Drifting MLLMs
2510.04142v1
cs.CV, cs.AI, cs.LG
2025-10-08
Авторы:
Xiaoyu Yang, Jie Lu, En Yu
Abstract
This paper identifies a critical yet underexplored challenge in distilling
from multimodal large language models (MLLMs): the reasoning trajectories
generated by multiple drifting teachers exhibit concept drift, whereby their
reasoning distributions evolve unpredictably and transmit biases to the student
model, ultimately compromising its performance. To tackle this issue, we
pioneer a theoretical connection between concept drift and knowledge
distillation, casting the non-stationary reasoning dynamics from multiple MLLM
teachers as next-token prediction of multi-stream reasoning trajectories.Guided
by concept drift, we introduce the "learn, compare, critique" paradigm,
culminating in autonomous preference optimization (APO). Under the active
guidance of the teachers, the student model first learns and self-distils
preferred thinking by comparing multiple teachers. It then engages in critical
reflection over the drifting inference from teachers, performing concept
alignment through APO, ultimately yielding a robust, consistent, and
generalizable model.Extensive experiments demonstrate our superior performance
of consistency, robustness and generalization within knowledge distillation.
Besides, we also contributed a large-scale dataset, CXR-MAX (Multi-teachers
Alignment X-rays), comprising 170,982 distilled reasoning trajectories derived
from publicly accessible MLLMs based on MIMIC-CXR. Our code and data are public
at: https://anonymous.4open.science/r/Autonomous-Distillation/.
Ссылки и действия
Дополнительные ресурсы: