Wave-PDE Nets: Trainable Wave-Equation Layers as an Alternative to Attention
2510.04304v1
cs.LG, cs.CL
2025-10-08
Авторы:
Harshil Vejendla
Abstract
We introduce Wave-PDE Nets, a neural architecture whose elementary operation
is a differentiable simulation of the second-order wave equation. Each layer
propagates its hidden state as a continuous field through a medium with
trainable spatial velocity c(x) and damping {\gamma}(x). A symplectic spectral
solver based on FFTs realises this propagation in O(nlog n) time. This
oscillatory, global mechanism provides a powerful alternative to attention and
first-order state-space models. We prove that a single Wave-PDE layer is a
universal approximator. On language and vision benchmarks, Wave-PDE Nets match
or exceed Transformer performance while demonstrating superior practical
efficiency, reducing wall-clock time by up to 30% and peak memory by 25%.
Ablation studies confirm the critical role of symplectic integration and a
spectral Laplacian for stability and performance. Visualizations of the learned
physical parameters reveal that the model learns intuitive strategies for
information propagation. These results position Wave-PDE Nets as a
computationally efficient and robust architecture with a strong physical
inductive bias.
Ссылки и действия
Дополнительные ресурсы: