Simplifying Knowledge Transfer in Pretrained Models
2510.22208v1
cs.LG, cs.CV
2025-10-29
Authors:
Siddharth Jain, Shyamgopal Karthik, Vineet Gandhi
Abstract
Pretrained models are ubiquitous in the current deep learning landscape,
offering strong results on a broad range of tasks. Recent works have shown that
models differing in various design choices exhibit categorically diverse
generalization behavior, resulting in one model grasping distinct data-specific
insights unavailable to the other. In this paper, we propose to leverage large
publicly available model repositories as an auxiliary source of model
improvements. We introduce a data partitioning strategy where pretrained models
autonomously adopt either the role of a student, seeking knowledge, or that of
a teacher, imparting knowledge. Experiments across various tasks demonstrate
the effectiveness of our proposed approach. In image classification, we
improved the performance of ViT-B by approximately 1.4% through bidirectional
knowledge transfer with ViT-T. For semantic segmentation, our method boosted
all evaluation metrics by enabling knowledge transfer both within and across
backbone architectures. In video saliency prediction, our approach achieved a
new state-of-the-art. We further extend our approach to knowledge transfer
between multiple models, leading to considerable performance improvements for
all model participants.
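
The abstract does not say how the data is partitioned, so the following is only a minimal sketch under an assumed rule: for each training sample, the model that assigns higher softmax confidence to the ground-truth class acts as the teacher, and the other acts as the student. The function names (`softmax`, `assign_roles`), the confidence criterion, and the tie-breaking choice are all hypothetical illustrations, not the authors' confirmed method.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one sample's logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def assign_roles(logits_a, logits_b, labels):
    """Return, per sample, which model ("A" or "B") acts as teacher.

    Assumed rule (not taken from the paper): the model with higher
    confidence on the ground-truth class teaches; ties go to model A.
    """
    roles = []
    for la, lb, y in zip(logits_a, logits_b, labels):
        conf_a = softmax(la)[y]
        conf_b = softmax(lb)[y]
        roles.append("A" if conf_a >= conf_b else "B")
    return roles

# Toy logits for three samples and two classes (made-up values).
logits_a = [[2.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
logits_b = [[0.0, 2.0], [0.0, 3.0], [0.5, 0.5]]
labels = [0, 1, 0]

roles = assign_roles(logits_a, logits_b, labels)
print(roles)
```

Under this assumed rule, each model would receive a distillation signal only on the samples where the other model is the teacher, which is one way the transfer could be bidirectional.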