HNote: Extending YNote with Hexadecimal Encoding for Fine-Tuning LLMs in Music Modeling
2509.25694v1
cs.SD, cs.AI
2025-10-02
Authors:
Hung-Ying Chu, Shao-Yu Wei, Guan-Wei Chen, Tzu-Wei Hung, ChengYang Tsai, Yu-Cheng Lin
Abstract
Recent advances in large language models (LLMs) have created new
opportunities for symbolic music generation. However, existing formats such as
MIDI, ABC, and MusicXML are either overly complex or structurally inconsistent,
limiting their suitability for token-based learning architectures. To address
these challenges, we propose HNote, a novel hexadecimal-based notation system
extended from YNote, which encodes both pitch and duration within a fixed
32-unit measure framework. This design ensures alignment, reduces ambiguity,
and is directly compatible with LLM architectures. We converted 12,300
Jiangnan-style songs, generated from traditional folk pieces, from YNote into
HNote and fine-tuned LLaMA-3.1 (8B) using parameter-efficient LoRA.
Experimental results show that HNote achieves a syntactic correctness rate of
82.5%, and BLEU and ROUGE evaluations demonstrate strong symbolic and
structural similarity, producing stylistically coherent compositions. This
study establishes HNote as an effective framework for integrating LLMs with
cultural music modeling.
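
To make the fixed 32-unit measure constraint concrete, the following is a minimal sketch of a syntactic check of the kind a correctness rate could be computed from. The abstract does not give the HNote token layout, so the layout here is an assumption for illustration only: each note is four hex digits, two for pitch and two for duration in measure units. The names `parse_measure` and `is_valid_measure` are hypothetical and not part of HNote.

```python
import string

# Fixed measure length stated in the abstract.
UNITS_PER_MEASURE = 32

# ASSUMPTION (not the HNote specification): a note token is four hex digits,
# two for pitch followed by two for duration in measure units.
TOKEN_LENGTH = 4


def parse_measure(measure: str) -> list[tuple[int, int]]:
    """Split a space-separated measure into (pitch, duration) pairs."""
    notes = []
    for token in measure.split():
        if len(token) != TOKEN_LENGTH:
            raise ValueError(f"bad token length: {token!r}")
        if not all(ch in string.hexdigits for ch in token):
            raise ValueError(f"non-hex token: {token!r}")
        pitch = int(token[:2], 16)
        duration = int(token[2:], 16)
        notes.append((pitch, duration))
    return notes


def is_valid_measure(measure: str) -> bool:
    """Return True if every token parses and the durations sum to 32 units."""
    try:
        notes = parse_measure(measure)
    except ValueError:
        return False
    return sum(duration for _, duration in notes) == UNITS_PER_MEASURE
```

Under this assumed layout, a measure of four notes with duration `08` each fills the 32 units exactly, while a measure whose durations fall short or whose tokens are malformed is rejected.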