3DiFACE: Synthesizing and Editing Holistic 3D Facial Animation

2509.26233v1 cs.GR, cs.AI, cs.CV 2025-10-02

Authors:

Balamurugan Thambiraja, Malte Prinzler, Sadegh Aliakbarian, Darren Cosker, Justus Thies

Abstract

Creating personalized 3D animations with precise control and realistic head motions remains challenging for current speech-driven 3D facial animation methods. Editing these animations is especially complex and time-consuming, requires precise control, and is typically handled by highly skilled animators. Most existing works focus on controlling the style or emotion of the synthesized animation and cannot edit or regenerate parts of an input animation. They also overlook the fact that multiple plausible lip and head movements can match the same audio input. To address these challenges, we present 3DiFACE, a novel method for holistic speech-driven 3D facial animation. Our approach produces diverse plausible lip and head motions for a single audio input and allows for editing via keyframing and interpolation. Specifically, we propose a fully-convolutional diffusion model that can leverage the viseme-level diversity in our training corpus. Additionally, we employ speaking-style personalization and a novel sparsely-guided motion diffusion to enable precise control and editing. Through quantitative and qualitative evaluations, we demonstrate that our method is capable of generating and editing diverse holistic 3D facial animations given a single audio input, with control between high fidelity and diversity. Code and models are available here: https://balamuruganthambiraja.github.io/3DiFACE
