Studies for: A Human-AI Co-Creative Sound Artwork Using a Real-time Multi-channel Sound Generation Model
2510.25228v1
cs.SD, cs.AI
2025-10-31
Authors:
Chihiro Nagashima, Akira Takahashi, Zhi Zhong, Shusuke Takahashi, Yuki Mitsufuji
Abstract
This paper explores the integration of AI technologies into the artistic
workflow through the creation of Studies for, a generative sound installation
developed in collaboration with sound artist Evala
(https://www.ntticc.or.jp/en/archive/works/studies-for/). The installation
employs SpecMaskGIT, a lightweight yet high-quality sound generation AI model,
to generate and play back eight-channel sound in real time, creating an
immersive auditory experience over the course of a three-month exhibition. The
work is grounded in the concept of a "new form of archive," which aims to
preserve an artist's style while expanding beyond that artist's past
artworks through the continued generation of new sound elements. This speculative
approach to archival preservation is facilitated by training the AI model on a
dataset consisting of over 200 hours of Evala's past sound artworks.
By addressing key requirements in the co-creation of art using AI, this study
highlights the value of three aspects: (1) integrating artist feedback, (2)
using datasets derived from an artist's past works, and (3) ensuring the
inclusion of unexpected, novel outputs. In Studies for, the model
was designed to reflect the artist's identity while generating new,
previously unheard sounds, making it a fitting realization of the concept of "a
new form of archive." We propose a Human-AI co-creation framework for
effectively incorporating sound generation AI models into the sound art
creation process and suggest new possibilities for creating and archiving sound
art that extend an artist's work beyond their physical existence. Demo page:
https://sony.github.io/studies-for/