Story2MIDI: Emotionally Aligned Music Generation from Text

2512.02192v1 cs.SD, cs.AI, cs.CL 2025-12-03

Authors:

Mohammad Shokri, Alexandra C. Salem, Gabriel Levine, Johanna Devaney, Sarah Ita Levitan

Abstract

In this paper, we introduce Story2MIDI, a sequence-to-sequence Transformer-based model for generating emotion-aligned music from a given piece of text. To develop this model, we construct the Story2MIDI dataset by merging existing datasets for sentiment analysis from text and emotion classification in music. The resulting dataset contains pairs of text blurbs and music pieces that evoke the same emotions in the reader or listener. Despite the small scale of our dataset and limited computational resources, our results indicate that our model effectively learns emotion-relevant features in music and incorporates them into its generation process, producing samples with diverse emotional responses. We evaluate the generated outputs using objective musical metrics and a human listening study, confirming the model's ability to capture intended emotional cues.
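The dataset construction described in the abstract, pairing text blurbs with music pieces that evoke the same emotion, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `pair_by_emotion`, the emotion labels, and the file names are hypothetical, and the sketch assumes both source datasets have already been mapped to a shared set of emotion labels.

```python
import random
from collections import defaultdict


def pair_by_emotion(texts, midis, seed=0):
    """Pair each text with a randomly chosen music piece of the same emotion.

    texts: list of (text, emotion_label) tuples   (hypothetical format)
    midis: list of (midi_path, emotion_label) tuples (hypothetical format)
    Returns a list of (text, midi_path, emotion_label) tuples.
    Texts whose label has no matching music piece are skipped.
    """
    rng = random.Random(seed)  # fixed seed so the pairing is reproducible

    # Index the music pieces by their emotion label.
    by_label = defaultdict(list)
    for path, label in midis:
        by_label[label].append(path)

    pairs = []
    for text, label in texts:
        candidates = by_label.get(label)
        if candidates:
            pairs.append((text, rng.choice(candidates), label))
    return pairs


# Illustrative data only; labels and file names are made up.
texts = [
    ("I finally got the job!", "happy"),
    ("The house felt empty.", "sad"),
    ("He slammed the door.", "angry"),
]
midis = [("a.mid", "happy"), ("b.mid", "sad"), ("c.mid", "sad")]

pairs = pair_by_emotion(texts, midis)
```

In this example the "angry" text is dropped because no music piece carries that label. A real pipeline would also need to reconcile the two datasets' label schemes before pairing, which the abstract does not detail.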
