Steering an Active Learning Workflow Towards Novel Materials Discovery via Queue Prioritization
2509.25538v1
cs.LG, cond-mat.mtrl-sci, cs.AI
2025-10-02
Authors:
Marcus Schwarting, Logan Ward, Nathaniel Hudson, Xiaoli Yan, Ben Blaiszik, Santanu Chaudhuri, Eliu Huerta, Ian Foster
Abstract
Generative AI poses both opportunities and risks for solving inverse design
problems in the sciences. Generative tools provide the ability to expand and
refine a search space autonomously, but do so at the cost of exploring
low-quality regions until the model is sufficiently fine-tuned. Here, we propose a queue
prioritization algorithm that combines generative modeling and active learning
in the context of a distributed workflow for exploring complex design spaces.
We find that incorporating an active learning model to prioritize top design
candidates can prevent a generative AI workflow from expending resources on
nonsensical candidates and halt potential generative model decay. For an
existing generative AI workflow for discovering novel molecular structure
candidates for carbon capture, our active learning approach significantly
increases the number of high-quality candidates identified by the generative
model. We find that, out of 1000 novel candidates, our workflow without active
learning can generate an average of 281 high-performing candidates, while our
proposed prioritization with active learning can generate an average of 604
high-performing candidates.
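
The queue prioritization idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `prioritize` and `pop_top` are hypothetical, and the `score` callable is an assumed stand-in for an active learning model that predicts candidate quality. The sketch only shows the general pattern of ranking generated candidates so that the highest-scoring ones are evaluated first.

```python
import heapq
from typing import Callable, Iterable, List, Tuple


def prioritize(
    candidates: Iterable[str], score: Callable[[str], float]
) -> List[Tuple[float, str]]:
    """Build a priority queue of candidates ordered by predicted score.

    `score` is a hypothetical stand-in for an active learning surrogate model.
    heapq is a min-heap, so scores are negated to pop the best candidate first.
    """
    heap = [(-score(c), c) for c in candidates]
    heapq.heapify(heap)
    return heap


def pop_top(heap: List[Tuple[float, str]], k: int) -> List[str]:
    """Remove and return up to k candidates with the highest predicted scores."""
    return [heapq.heappop(heap)[1] for _ in range(min(k, len(heap)))]
```

In a workflow of this kind, only the candidates returned by `pop_top` would be passed on to expensive evaluation, and the surrogate behind `score` could be retrained as new results arrive; both of these steps are assumptions about usage, not details taken from the abstract.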