A Standardized Benchmark for Machine-Learned Molecular Dynamics using Weighted Ensemble Sampling
2510.17187v1
cs.LG, q-bio.BM, 92B20
2025-10-22
Авторы:
Alexander Aghili, Andy Bruce, Daniel Sabo, Sanya Murdeshwar, Kevin Bachelor, Ionut Mistreanu, Ashwin Lokapally, Razvan Marinescu
Abstract
The rapid evolution of molecular dynamics (MD) methods, including
machine-learned dynamics, has outpaced the development of standardized tools
for method validation. Objective comparison between simulation approaches is
often hindered by inconsistent evaluation metrics, insufficient sampling of
rare conformational states, and the absence of reproducible benchmarks. To
address these challenges, we introduce a modular benchmarking framework that
systematically evaluates protein MD methods using enhanced sampling analysis.
Our approach uses weighted ensemble (WE) sampling via The Weighted Ensemble
Simulation Toolkit with Parallelization and Analysis (WESTPA), based on
progress coordinates derived from Time-lagged Independent Component Analysis
(TICA), enabling fast and efficient exploration of protein conformational
space. The framework includes a flexible, lightweight propagator interface that
supports arbitrary simulation engines, allowing both classical force fields and
machine learning-based models. Additionally, the framework offers a
comprehensive evaluation suite capable of computing more than 19 different
metrics and visualizations across a variety of domains. We further contribute a
dataset of nine diverse proteins, ranging from 10 to 224 residues, that span a
variety of folding complexities and topologies. Each protein has been
extensively simulated at 300K for one million MD steps per starting point (4
ns). To demonstrate the utility of our framework, we perform validation tests
using classic MD simulations with implicit solvent and compare protein
conformational sampling using a fully trained versus under-trained CGSchNet
model. By standardizing evaluation protocols and enabling direct, reproducible
comparisons across MD approaches, our open-source platform lays the groundwork
for consistent, rigorous benchmarking across the molecular simulation
community.
Ссылки и действия
Дополнительные ресурсы: