Fast and Interpretable Protein Substructure Alignment via Optimal Transport
2510.11752v1
q-bio.QM, cs.AI, cs.LG
2025-10-16
Authors:
Zhiyu Wang, Bingxin Zhou, Jing Wang, Yang Tan, Weishu Zhao, Pietro Liò, Liang Hong
Abstract
Proteins are essential biological macromolecules that execute life functions.
Local motifs within protein structures, such as active sites, are the most
critical components for linking structure to function and are key to
understanding protein evolution and enabling protein engineering. Existing
computational methods struggle to identify and compare these local structures,
which leaves a significant gap in understanding protein structures and
harnessing their functions. This study presents PLASMA, the first deep learning
framework for efficient and interpretable residue-level protein substructure
alignment. We reformulate the problem as a regularized optimal transport task
and leverage differentiable Sinkhorn iterations. For a pair of input protein
structures, PLASMA outputs a clear alignment matrix with an interpretable
overall similarity score. Through extensive quantitative evaluations and three
biological case studies, we demonstrate that PLASMA achieves accurate,
lightweight, and interpretable residue-level alignment. Additionally, we
introduce PLASMA-PF, a training-free variant that provides a practical
alternative when training data are unavailable. Our method addresses a critical
gap in protein structure analysis tools and offers new opportunities for
functional annotation, evolutionary studies, and structure-based drug design.
Reproducibility is ensured via our official implementation at
https://github.com/ZW471/PLASMA-Protein-Local-Alignment.git.
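As a rough illustration of what "regularized optimal transport with Sinkhorn iterations" means, the sketch below runs a generic entropy-regularized Sinkhorn solver on a pairwise cost matrix between two sets of residue embeddings. It is not the PLASMA implementation: the random embeddings, the cosine-distance cost, the uniform marginals, the `sinkhorn` helper, and the values of `epsilon` and `n_iters` are all assumptions chosen for the example, and the final cost summary is an illustrative quantity rather than the similarity score defined in the paper.

```python
import numpy as np


def sinkhorn(cost, epsilon=0.5, n_iters=500):
    """Entropy-regularized OT with uniform marginals (illustrative helper)."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)        # assumed uniform mass on query residues
    b = np.full(m, 1.0 / m)        # assumed uniform mass on candidate residues
    K = np.exp(-cost / epsilon)    # Gibbs kernel derived from the cost matrix
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)            # rescale rows to match marginal a
        v = b / (K.T @ u)          # rescale columns to match marginal b
    # Transport plan: diag(u) @ K @ diag(v)
    return u[:, None] * K * v[None, :]


# Hypothetical per-residue embeddings (5 and 7 residues, 8 dimensions each).
rng = np.random.default_rng(0)
query = rng.normal(size=(5, 8))
candidate = rng.normal(size=(7, 8))

# Cosine distance as an assumed pairwise cost between residues.
q_unit = query / np.linalg.norm(query, axis=1, keepdims=True)
c_unit = candidate / np.linalg.norm(candidate, axis=1, keepdims=True)
cost = 1.0 - q_unit @ c_unit.T

plan = sinkhorn(cost)                          # soft residue-to-residue alignment
transport_cost = float((plan * cost).sum())    # illustrative summary, not PLASMA's score
print(plan.round(3))
print(transport_cost)
```

Each entry `plan[i, j]` can be read as the amount of mass that query residue `i` assigns to candidate residue `j`. Every step is a differentiable array operation, which is the property that lets Sinkhorn iterations sit inside a trainable model. Forcing all mass to be matched under uniform marginals is a simplification made here for brevity.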