MusRec: Zero-Shot Text-to-Music Editing via Rectified Flow and Diffusion Transformers

2511.04376v1 cs.SD, cs.AI, cs.LG, cs.MM, eess.AS 2025-11-08
Authors:

Ali Boudaghi, Hadi Zare

Abstract

Music editing has emerged as an important and practical area of artificial intelligence, with applications ranging from video game and film music production to personalizing existing tracks according to user preferences. However, existing models face significant limitations, such as being restricted to editing synthesized music generated by their own models, requiring highly precise prompts, or necessitating task-specific retraining, thus lacking true zero-shot capability. Leveraging recent advances in rectified flow and diffusion transformers, we introduce MusRec, the first zero-shot text-to-music editing model capable of performing diverse editing tasks on real-world music efficiently and effectively. Experimental results demonstrate that our approach outperforms existing methods in preserving musical content, structural consistency, and editing fidelity, establishing a strong foundation for controllable music editing in real-world scenarios.

Related papers

On the de-duplication of the Lakh MIDI dataset

## Context The Lakh MIDI Dataset (LMD) is one of the largest publicly available sources of symbolic music. It cont...

2025-09-24

The Name-Free Gap: Policy-Aware Stylistic Control in Music Generation

#### Context Text-to-music models such as MusicGen successfully capture broad musical attributes, such as ...

2025-09-05

From Discord to Harmony: Decomposed Consonance-based Training for Improved Audio...

## Context Audio Chord Estimation (ACE) is a key task in the field of music information...

2025-09-05