Model Merging Improves Zero-Shot Generalization in Bioacoustic Foundation Models
2511.05171v1
cs.LG, cs.AI, cs.SD
2025-11-11
Авторы:
Davide Marincione, Donato Crisostomi, Roberto Dessi, Emanuele Rodolà, Emanuele Rossi
Abstract
Foundation models capable of generalizing across species and tasks represent
a promising new frontier in bioacoustics, with NatureLM being one of the most
prominent examples. While its domain-specific fine-tuning yields strong
performance on bioacoustic benchmarks, we observe that it also introduces
trade-offs in instruction-following flexibility. For instance, NatureLM
achieves high accuracy when prompted for either the common or scientific name
individually, but its accuracy drops significantly when both are requested in a
single prompt. We address this by applying a simple model merging strategy that
interpolates NatureLM with its base language model, recovering
instruction-following capabilities with minimal loss of domain expertise.
Finally, we show that the merged model exhibits markedly stronger zero-shot
generalization, achieving over a 200% relative improvement and setting a new
state-of-the-art in closed-set zero-shot classification of unseen species.
Ссылки и действия
Дополнительные ресурсы: