Clebsch-Gordan Transformer: Fast and Global Equivariant Attention
2509.24093v1
cs.LG, cs.CV, cs.RO
2025-10-01
Authors:
Owen Lewis Howell, Linfeng Zhao, Xupeng Zhu, Yaoyao Qian, Haojie Huang, Lingfeng Sun, Wil Thomason, Robert Platt, Robin Walters
Abstract
The global attention mechanism is one of the keys to the success of the
transformer architecture, but it incurs quadratic computational costs in
relation to the number of tokens. On the other hand, equivariant models, which
leverage the underlying geometric structures of problem instances, often achieve
superior accuracy in physical, biochemical, computer vision, and robotic tasks,
at the cost of additional compute requirements. As a result, existing
equivariant transformers only support low-order equivariant features and local
context windows, limiting their expressiveness and performance. This work
proposes the Clebsch-Gordan Transformer, which achieves efficient global attention through a
novel Clebsch-Gordan Convolution on $\mathrm{SO}(3)$ irreducible representations. Our
method enables equivariant modeling of features at all orders while achieving
$O(N \log N)$ complexity in the number of input tokens. Additionally, the proposed method
scales well with high-order irreducible features by exploiting the sparsity of
the Clebsch-Gordan matrix. Lastly, we incorporate optional token
permutation equivariance through either weight sharing or data augmentation. We
evaluate our method on a diverse set of benchmarks, including n-body
simulation, QM9, ModelNet point cloud classification, and a robotic grasping
dataset, showing clear gains over existing equivariant transformers in GPU
memory usage, speed, and accuracy.
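
The sparsity of the Clebsch-Gordan matrix mentioned in the abstract can be illustrated with a small count. The sketch below is not the authors' implementation; it only applies the two textbook selection rules for SO(3) Clebsch-Gordan coefficients in the complex (m-indexed) basis: the block for degrees (l1, l2, l3) vanishes unless |l1 - l2| <= l3 <= l1 + l2, and an entry vanishes unless m3 = m1 + m2. The function names (allowed_entries, dense_entries, sparsity_table) are hypothetical and chosen for illustration. The count is an upper bound on the number of nonzero entries, since some allowed entries are also zero, and the pattern differs in the real basis used by many libraries.

```python
def allowed_entries(l1, l2, l3):
    """Count (m1, m2, m3) triples that the selection rules allow to be nonzero.

    Assumes the complex basis, where m3 must equal m1 + m2.
    This is an upper bound: some allowed entries can still be zero.
    """
    # Triangle rule: the whole block vanishes outside |l1 - l2| <= l3 <= l1 + l2.
    if not abs(l1 - l2) <= l3 <= l1 + l2:
        return 0
    count = 0
    for m1 in range(-l1, l1 + 1):
        for m2 in range(-l2, l2 + 1):
            # Projection rule: m3 is forced to m1 + m2 and must satisfy |m3| <= l3.
            if abs(m1 + m2) <= l3:
                count += 1
    return count


def dense_entries(l1, l2, l3):
    """Size of the dense (2*l1+1) x (2*l2+1) x (2*l3+1) coefficient block."""
    return (2 * l1 + 1) * (2 * l2 + 1) * (2 * l3 + 1)


def sparsity_table(max_l):
    """Rows of (l, allowed, dense, allowed / dense) for the l x l -> l block."""
    rows = []
    for l in range(1, max_l + 1):
        allowed = allowed_entries(l, l, l)
        dense = dense_entries(l, l, l)
        rows.append((l, allowed, dense, allowed / dense))
    return rows


if __name__ == "__main__":
    for l, allowed, dense, fraction in sparsity_table(6):
        print(f"l={l}: at most {allowed} of {dense} entries nonzero ({fraction:.3f})")
```

For l1 = l2 = l3 = 1, at most 7 of the 27 entries can be nonzero. Because m3 is fixed by m1 and m2, the allowed fraction is at most 1 / (2*l3 + 1), so the dense block becomes increasingly wasteful as the feature order grows.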