📄 An Evaluation of Interleaved Instruction Tuning on Semantic Reasoning Performance in an Audio MLLM
2025-11-06
Authors:
Jiawei Liu, Enis Berk Çoban, Zarina Schevchenko, Hao Tang, Zhigang Zhu, Michael I Mandel, Johanna Devaney
Abstract:
Standard training for Multi-modal Large Language Models (MLLMs) involves
concatenating non-textual information, like vision or audio, with a text
prompt. This approach may not encourage deep integration of modalities,
limiting the model's ability to leverage the core language model's reasoning
capabilities. This work examined the impact of interleaved instruction tuning
in an audio MLLM, where audio tokens are interleaved within the prompt. Using
the Listen, Think, and Understand (LTU) model as ...
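
The contrast the abstract draws between concatenating audio with the text prompt and interleaving audio tokens within it can be illustrated with a small sketch. This is not the LTU implementation, and the abstract does not specify one. The function names, the `<audio>` placeholder, and the use of plain strings as stand-ins for tokens are all assumptions made for illustration.

```python
from typing import List

Token = str  # hypothetical stand-in for a token id or embedding


def concat_prompt(audio: List[Token], text: List[Token]) -> List[Token]:
    """Conventional layout: every audio token comes first, then the text prompt."""
    return audio + text


def interleave_prompt(
    text: List[Token], audio: List[Token], placeholder: Token = "<audio>"
) -> List[Token]:
    """Interleaved layout: audio tokens are spliced in wherever the
    (hypothetical) placeholder appears inside the text prompt."""
    out: List[Token] = []
    for tok in text:
        if tok == placeholder:
            out.extend(audio)  # replace the placeholder with the audio span
        else:
            out.append(tok)
    return out


# Toy example: two audio tokens and a short question.
audio_tokens = ["a1", "a2"]
plain_text = ["Is", "the", "sound", "a", "dog", "?"]
templated_text = ["Is", "the", "sound", "<audio>", "a", "dog", "?"]

concatenated = concat_prompt(audio_tokens, plain_text)
interleaved = interleave_prompt(templated_text, audio_tokens)
```

In the concatenated layout the audio span sits entirely before the question, while in the interleaved layout it appears at the point in the sentence that refers to it.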