📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 0

Последнее обновление: сегодня

📄 AWARE: Audio Watermarking with Adversarial Resistance to Edits

2025-10-22

Авторы:

Kosta Pavlović, Lazar Stanarević, Petar Nedić, Slavko Kovačević, Igor Djurović

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Prevailing practice in learning-based audio watermarking is to pursue robustness by expanding the set of simulated distortions during training. However, such surrogates are narrow and prone to overfitting. This paper presents AWARE (Audio Watermarking with Adversarial Resistance to Edits), an alternative approach that avoids reliance on attack-simulation stacks and handcrafted differentiable distortions. Embedding is obtained via adversarial optimization in the time-frequency domain under a leve...

ID: 2510.17512v1 cs.SD, cs.LG, cs.MM, eess.AS

arXiv PDF

📄 Language Model Based Text-to-Audio Generation: Anti-Causally Aligned Collaborative Residual Transformers

2025-10-08

Авторы:

Juncheng Wang, Chao Xu, Cheng Yu, Zhe Hu, Haoyu Xie, Guoqi Yu, Lei Shang, Shujun Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

While language models (LMs) paired with residual vector quantization (RVQ) tokenizers have shown promise in text-to-audio (T2A) generation, they still lag behind diffusion-based models by a non-trivial margin. We identify a critical dilemma underpinning this gap: incorporating more RVQ layers improves audio reconstruction fidelity but exceeds the generation capacity of conventional LMs. To address this, we first analyze RVQ dynamics and uncover two key limitations: 1) orthogonality of features a...

ID: 2510.04577v1 cs.SD, cs.LG, cs.MM, eess.AS

arXiv PDF