VioPTT: Violin Technique-Aware Transcription from Synthetic Data Augmentation
2509.23759v2
cs.SD, cs.LG
2025-10-01
Authors:
Ting-Kang Wang, Yueh-Po Peng, Li Su, Vincent K. M. Cheung
Abstract
While automatic music transcription is well-established in music information
retrieval, most models are limited to transcribing pitch and timing information
from audio, and thus omit crucial expressive and instrument-specific nuances.
One example is playing technique on the violin, which affords its distinct
palette of timbres for maximal emotional impact. Here, we propose VioPTT
(Violin Playing Technique-aware Transcription), a lightweight, end-to-end model
that directly transcribes violin playing technique in addition to pitch, onset,
and offset. Furthermore, we release MOSA-VPT, a novel, high-quality synthetic
violin playing technique dataset to circumvent the need for manually labeled
annotations. Leveraging this dataset, our model demonstrated strong
generalization to real-world note-level violin technique recordings in addition
to achieving state-of-the-art transcription performance. To our knowledge,
VioPTT is the first model to combine violin transcription and playing
technique prediction within a unified framework.