FreqCa: Accelerating Diffusion Models via Frequency-Aware Caching
2510.08669v1
cs.LG, cs.AI, cs.CV
2025-10-14
Авторы:
Jiacheng Liu, Peiliang Cai, Qinming Zhou, Yuqi Lin, Deyang Kong, Benhao Huang, Yupei Pan, Haowen Xu, Chang Zou, Junshu Tang, Shikang Zheng, Linfeng Zhang
Abstract
The application of diffusion transformers is suffering from their significant
inference costs. Recently, feature caching has been proposed to solve this
problem by reusing features from previous timesteps, thereby skipping
computation in future timesteps. However, previous feature caching assumes that
features in adjacent timesteps are similar or continuous, which does not always
hold in all settings. To investigate this, this paper begins with an analysis
from the frequency domain, which reveal that different frequency bands in the
features of diffusion models exhibit different dynamics across timesteps.
Concretely, low-frequency components, which decide the structure of images,
exhibit higher similarity but poor continuity. In contrast, the high-frequency
bands, which decode the details of images, show significant continuity but poor
similarity. These interesting observations motivate us to propose
Frequency-aware Caching (FreqCa)
which directly reuses features of low-frequency components based on their
similarity, while using a second-order Hermite interpolator to predict the
volatile high-frequency ones based on its continuity.
Besides, we further propose to cache Cumulative Residual Feature (CRF)
instead of the features in all the layers, which reduces the memory footprint
of feature caching by 99%.
Extensive experiments on FLUX.1-dev, FLUX.1-Kontext-dev, Qwen-Image, and
Qwen-Image-Edit demonstrate its effectiveness in both generation and editing.
Codes are available in the supplementary materials and will be released on
GitHub.
Ссылки и действия
Дополнительные ресурсы: