Coder as Editor: Code-driven Interpretable Molecular Optimization
2510.14455v1
cs.LG, q-bio.BM
2025-10-18
Authors:
Wenyu Zhu, Chengzhu Li, Xiaohe Tian, Yifan Wang, Yinjun Jia, Jianhui Wang, Bowen Gao, Ya-Qin Zhang, Wei-Ying Ma, Yanyan Lan
Abstract
Molecular optimization is a central task in drug discovery that requires
precise structural reasoning and domain knowledge. While large language models
(LLMs) have shown promise in generating high-level editing intentions in
natural language, they often struggle to faithfully execute these
modifications, particularly when operating on non-intuitive representations like
SMILES. We introduce MECo, a framework that bridges reasoning and execution by
translating editing actions into executable code. MECo reformulates molecular
optimization for LLMs as a cascaded framework: generating human-interpretable
editing intentions from a molecule and property goal, followed by translating
those intentions into executable structural edits via code generation. Our
approach achieves over 98% accuracy in reproducing held-out realistic edits
derived from chemical reactions and target-specific compound pairs. On
downstream optimization benchmarks spanning physicochemical properties and
target activities, MECo substantially improves consistency by 38 to 86 percentage
points, to over 90%, and achieves higher success rates than SMILES-based baselines
while preserving structural similarity. By aligning intention with execution,
MECo enables consistent, controllable and interpretable molecular design,
laying the foundation for high-fidelity feedback loops and collaborative
human-AI workflows in drug discovery.
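
As a rough illustration of the intention-then-code cascade described in the abstract, the sketch below expresses an edit as a small piece of executable code applied to a molecular graph. Everything in it is an assumption made for illustration: the toy `Mol` class, the `run_edit` helper, and the hard-coded `INTENTION` and `GENERATED_CODE` strings (standing in for LLM output) are hypothetical names, and the abstract does not say which chemistry toolkit MECo's generated code targets.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Mol:
    """Toy heavy-atom graph (hypothetical; not the paper's representation)."""
    atoms: list                                 # element symbol per atom index
    bonds: dict = field(default_factory=dict)   # frozenset({i, j}) -> bond order

    def add_bond(self, i, j, order=1):
        self.bonds[frozenset((i, j))] = order


# Stage 1 (assumed): a human-readable editing intention, as an LLM might emit it.
INTENTION = "Replace the terminal carbon (atom 2) with an oxygen to raise polarity."

# Stage 2 (assumed): the same intention translated into executable code.
GENERATED_CODE = '''
def edit(mol):
    mol.atoms[2] = "O"   # swap terminal carbon for oxygen
    return mol
'''


def run_edit(mol, source):
    """Execute generated edit code on a copy, leaving the input untouched."""
    namespace = {}
    exec(source, namespace)          # defines edit() inside the namespace
    return namespace["edit"](copy.deepcopy(mol))


# Propane heavy-atom skeleton: C0-C1-C2
propane = Mol(atoms=["C", "C", "C"])
propane.add_bond(0, 1)
propane.add_bond(1, 2)

edited = run_edit(propane, GENERATED_CODE)

# A simple check that the executed edit matches the stated intention:
# atom 2 is now oxygen and the bond skeleton is unchanged.
is_consistent = edited.atoms[2] == "O" and edited.bonds == propane.bonds
```

Because the edit is code rather than a rewritten SMILES string, the change it makes can be inspected and checked against the stated intention, which is the kind of alignment between intention and execution the abstract emphasizes. Executing model-generated code with `exec` would need sandboxing in any real setting.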