Flow Matching with Semidiscrete Couplings
2509.25519v1
cs.LG, stat.ML
2025-10-03
Authors:
Alireza Mousavi-Hosseini, Stephen Y. Zhang, Michal Klein, Marco Cuturi
Abstract
Flow models parameterized as time-dependent velocity fields can generate data
from noise by integrating an ODE. These models are often trained using flow
matching, i.e. by sampling random pairs of noise and target points
$(\mathbf{x}_0,\mathbf{x}_1)$ and ensuring that the velocity field is aligned,
on average, with $\mathbf{x}_1-\mathbf{x}_0$ when evaluated along a segment
linking $\mathbf{x}_0$ to $\mathbf{x}_1$. While these pairs are sampled
independently by default, they can also be selected more carefully by matching
batches of $n$ noise points to $n$ target points using an optimal transport (OT)
solver. Although promising in theory, the OT flow matching (OT-FM) approach is
not widely used in practice. Zhang et al. (2025) pointed out recently that
OT-FM truly starts paying off when the batch size $n$ grows significantly,
which only a multi-GPU implementation of the Sinkhorn algorithm can handle.
Unfortunately, the costs of running Sinkhorn can quickly balloon, requiring
$O(n^2/\varepsilon^2)$ operations for every $n$ pairs used to fit the velocity
field, where $\varepsilon$ is a regularization parameter that should
typically be small to yield better results. To fulfill the theoretical promises of
OT-FM, we propose to move away from batch-OT and rely instead on a semidiscrete
OT (SD-OT) formulation that leverages the fact that the target dataset is
usually of finite size $N$. The SD-OT problem is solved by estimating a dual
potential vector using SGD; using that vector, freshly sampled noise vectors at
train time can then be matched with data points at the cost of a maximum inner
product search (MIPS). Semidiscrete FM (SD-FM) removes the quadratic dependency
on $n/\varepsilon$ that bottlenecks OT-FM. SD-FM beats both FM and OT-FM on all
training metrics and inference budget constraints, across multiple datasets, on
unconditional/conditional generation, or when using mean-flow models.
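
The pipeline described in the abstract (fit a dual potential by SGD, then match fresh noise to data points through an inner product search) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a squared Euclidean cost, uniform weights on the $N$ data points, no entropic regularization, standard Gaussian noise, and a $1/\sqrt{t}$ step size; the function names `assign`, `fit_potential`, and `sample_fm_batch` are hypothetical.

```python
import numpy as np


def assign(x, y, g):
    """Return, for each row of x, the index of its matched row in y."""
    # argmin_j 0.5*||x - y_j||^2 - g_j equals argmax_j <x, y_j> + (g_j - 0.5*||y_j||^2),
    # i.e. a maximum inner product search with a per-point offset.
    scores = x @ y.T + (g - 0.5 * np.sum(y**2, axis=1))
    return np.argmax(scores, axis=1)


def fit_potential(y, steps=500, batch=256, lr=1.0, seed=0):
    """Estimate the dual potential g by stochastic gradient ascent (assumed uniform weights)."""
    rng = np.random.default_rng(seed)
    n_data, dim = y.shape
    g = np.zeros(n_data)
    for step in range(steps):
        x = rng.standard_normal((batch, dim))  # fresh noise batch
        counts = np.bincount(assign(x, y, g), minlength=n_data)
        # Stochastic gradient of the dual objective: target mass minus matched mass.
        g += lr / np.sqrt(step + 1) * (1.0 / n_data - counts / batch)
    return g


def sample_fm_batch(y, g, n, rng):
    """Draw n coupled (noise, data) pairs and the flow matching inputs and targets."""
    x0 = rng.standard_normal((n, y.shape[1]))  # noise points
    x1 = y[assign(x0, y, g)]                   # matched data points
    t = rng.uniform(size=(n, 1))               # interpolation times in [0, 1)
    x_t = (1.0 - t) * x0 + t * x1              # point on the segment from x0 to x1
    return t, x_t, x1 - x0                     # regression target is x1 - x0
```

A training loop would regress the velocity field on `x1 - x0` evaluated at `(t, x_t)`. At scale, the dense product `x @ y.T` would presumably be replaced by a dedicated MIPS index, which this sketch does not attempt.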