Distilled Protein Backbone Generation
2510.03095v1
cs.LG, cs.AI, stat.ML
2025-10-07
Авторы:
Liyang Xie, Haoran Zhang, Zhendong Wang, Wesley Tansey, Mingyuan Zhou
Abstract
Diffusion- and flow-based generative models have recently demonstrated strong
performance in protein backbone generation tasks, offering unprecedented
capabilities for de novo protein design. However, while achieving notable
performance in generation quality, these models are limited by their generating
speed, often requiring hundreds of iterative steps in the reverse-diffusion
process. This computational bottleneck limits their practical utility in
large-scale protein discovery, where thousands to millions of candidate
structures are needed. To address this challenge, we explore the techniques of
score distillation, which has shown great success in reducing the number of
sampling steps in the vision domain while maintaining high generation quality.
However, a straightforward adaptation of these methods results in unacceptably
low designability. Through extensive study, we have identified how to
appropriately adapt Score identity Distillation (SiD), a state-of-the-art score
distillation strategy, to train few-step protein backbone generators which
significantly reduce sampling time, while maintaining comparable performance to
their pretrained teacher model. In particular, multistep generation combined
with inference time noise modulation is key to the success. We demonstrate that
our distilled few-step generators achieve more than a 20-fold improvement in
sampling speed, while achieving similar levels of designability, diversity, and
novelty as the Proteina teacher model. This reduction in inference cost enables
large-scale in silico protein design, thereby bringing diffusion-based models
closer to real-world protein engineering applications.
Ссылки и действия
Дополнительные ресурсы: