Efficient High-Resolution Image Editing with Hallucination-Aware Loss and Adaptive Tiling
2510.06295v1
cs.CV, cs.AI, cs.LG
2025-10-10
Authors:
Young D. Kwon, Abhinav Mehrotra, Malcolm Chadwick, Alberto Gil Ramos, Sourav Bhattacharya
Abstract
High-resolution (4K) image-to-image synthesis has become increasingly
important for mobile applications. Existing diffusion models for image editing
face significant challenges, in terms of memory and image quality, when
deployed on resource-constrained devices. In this paper, we present
MobilePicasso, a novel system that enables efficient image editing at high
resolutions, while minimising computational cost and memory usage.
MobilePicasso comprises three stages: (i) performing image editing at a
standard resolution with a hallucination-aware loss, (ii) applying latent
projection to avoid going through pixel space, and (iii) upscaling the edited
image latent to a higher resolution with adaptive context-preserving tiling.
Our user study with 46 participants reveals that MobilePicasso not only
improves image quality by 18-48% but also reduces hallucinations by 14-51% over
existing methods. MobilePicasso demonstrates significantly lower latency, e.g.,
up to a 55.8× speed-up, yet with a small increase in runtime memory, e.g.,
a mere 9% increase over prior work. Surprisingly, the on-device runtime of
MobilePicasso is observed to be faster than a server-based high-resolution
image editing model running on an A100 GPU.