GeoPep: A geometry-aware masked language model for protein-peptide binding site prediction
2510.27040v1
eess.SP, cs.LG
2025-11-04
Authors:
Dian Chen, Yunkai Chen, Tong Lin, Sijie Chen, Xiaolin Cheng
Abstract
Multimodal approaches that integrate protein structure and sequence have
achieved remarkable success in protein-protein interface prediction. However,
extending these methods to protein-peptide interactions remains challenging:
peptides are conformationally flexible, and structural data are scarce, both of
which hinder direct training of structure-aware
models. To address these limitations, we introduce GeoPep, a novel framework
for peptide binding site prediction that leverages transfer learning from ESM3,
a multimodal protein foundation model. GeoPep fine-tunes ESM3's rich
pre-learned representations from protein-protein binding to compensate for the
scarcity of protein-peptide binding data. The fine-tuned model is further
integrated with a parameter-efficient neural network architecture capable of
learning complex patterns from sparse data. Furthermore, the model is trained
using distance-based loss functions that exploit 3D structural information to
enhance binding site prediction. Comprehensive evaluations demonstrate that
GeoPep significantly outperforms existing methods in protein-peptide binding
site prediction by effectively capturing sparse and heterogeneous binding
patterns.