Efficient High-Accuracy PDEs Solver with the Linear Attention Neural Operator
2510.16816v1
cs.LG, cs.AI, math-ph, math.MP, physics.comp-ph
2025-10-22
Авторы:
Ming Zhong, Zhenya Yan
Abstract
Neural operators offer a powerful data-driven framework for learning mappings
between function spaces, in which the transformer-based neural operator
architecture faces a fundamental scalability-accuracy trade-off: softmax
attention provides excellent fidelity but incurs quadratic complexity
$\mathcal{O}(N^2 d)$ in the number of mesh points $N$ and hidden dimension $d$,
while linear attention variants reduce cost to $\mathcal{O}(N d^2)$ but often
suffer significant accuracy degradation. To address the aforementioned
challenge, in this paper, we present a novel type of neural operators, Linear
Attention Neural Operator (LANO), which achieves both scalability and high
accuracy by reformulating attention through an agent-based mechanism. LANO
resolves this dilemma by introducing a compact set of $M$ agent tokens $(M \ll
N)$ that mediate global interactions among $N$ tokens. This agent attention
mechanism yields an operator layer with linear complexity $\mathcal{O}(MN d)$
while preserving the expressive power of softmax attention. Theoretically, we
demonstrate the universal approximation property, thereby demonstrating
improved conditioning and stability properties. Empirically, LANO surpasses
current state-of-the-art neural PDE solvers, including Transolver with
slice-based softmax attention, achieving average $19.5\%$ accuracy improvement
across standard benchmarks. By bridging the gap between linear complexity and
softmax-level performance, LANO establishes a scalable, high-accuracy
foundation for scientific machine learning applications.