MarS-FM: Generative Modeling of Molecular Dynamics via Markov State Models
2509.24779v2
cs.LG, q-bio.BM
2025-10-01
Авторы:
Kacper Kapuśniak, Cristian Gabellini, Michael Bronstein, Prudencio Tossou, Francesco Di Giovanni
Abstract
Molecular Dynamics (MD) is a powerful computational microscope for probing
protein functions. However, the need for fine-grained integration and the long
timescales of biomolecular events make MD computationally expensive. To address
this, several generative models have been proposed to generate surrogate
trajectories at lower cost. Yet, these models typically learn a fixed-lag
transition density, causing the training signal to be dominated by frequent but
uninformative transitions. We introduce a new class of generative models, MSM
Emulators, which instead learn to sample transitions across discrete states
defined by an underlying Markov State Model (MSM). We instantiate this class
with Markov Space Flow Matching (MarS-FM), whose sampling offers more than two
orders of magnitude speedup compared to implicit- or explicit-solvent MD
simulations. We benchmark Mars-FM ability to reproduce MD statistics through
structural observables such as RMSD, radius of gyration, and secondary
structure content. Our evaluation spans protein domains (up to 500 residues)
with significant chemical and structural diversity, including unfolding events,
and enforces strict sequence dissimilarity between training and test sets to
assess generalization. Across all metrics, MarS-FM outperforms existing
methods, often by a substantial margin.
Ссылки и действия
Дополнительные ресурсы: