Synthesizability Prediction of Crystalline Structures with a Hierarchical Transformer and Uncertainty Quantification
2510.19251v1
cond-mat.mtrl-sci, cs.LG
2025-10-24
Authors:
Danial Ebrahimzadeh, Sarah Sharif, Yaser Mike Banad
Abstract
Predicting which hypothetical inorganic crystals can be experimentally
realized remains a central challenge in accelerating materials discovery.
SyntheFormer is a positive-unlabeled framework that learns synthesizability
directly from crystal structure, combining a Fourier-transformed crystal
periodicity (FTCP) representation with hierarchical feature extraction,
Random-Forest feature selection, and a compact deep MLP classifier. The model
is trained on historical data from 2011 through 2018 and evaluated
prospectively on future years from 2019 to 2025, where the positive class
constitutes only 1.02 per cent of samples. Under this temporally separated
evaluation, SyntheFormer achieves a test area under the ROC curve of 0.735 and,
with dual-threshold calibration, attains high-recall screening with 97.6 per
cent recall at 94.2 per cent coverage, which minimizes missed opportunities
while preserving discriminative power. Crucially, the model recovers
experimentally confirmed metastable compounds that lie far from the convex hull
and simultaneously assigns low scores to many thermodynamically stable yet
unsynthesized candidates, demonstrating that stability alone is insufficient to
predict experimental attainability. By aligning structure-aware representation
with uncertainty-aware decision rules, SyntheFormer provides a practical route
to prioritize synthesis targets and focus laboratory effort on the most
promising new inorganic materials.
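
The temporally separated evaluation and the dual-threshold screening rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The `Record` fields, the function names, the threshold values, and the exact definitions of recall and coverage are assumptions made for illustration; the abstract does not specify them. Here, coverage is taken to be the fraction of test samples that receive a confident decision (accept or reject), and recall is the fraction of known positives that are not rejected.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # All fields are hypothetical; the abstract does not define a data schema.
    year: int      # year the structure was reported
    score: float   # model synthesizability score in [0, 1]
    label: int     # 1 = experimentally synthesized, 0 = unlabeled


def temporal_split(records, first_train_year=2011, last_train_year=2018):
    """Train on the historical window, test on all later years."""
    train = [r for r in records if first_train_year <= r.year <= last_train_year]
    test = [r for r in records if r.year > last_train_year]
    return train, test


def dual_threshold_decision(score, t_low, t_high):
    """Assumed rule: accept above t_high, reject below t_low, defer in between."""
    if score >= t_high:
        return "accept"
    if score < t_low:
        return "reject"
    return "defer"


def screening_metrics(records, t_low, t_high):
    """Return (recall, coverage) under the assumed definitions in the lead-in."""
    if not records:
        raise ValueError("records must not be empty")
    decisions = [dual_threshold_decision(r.score, t_low, t_high) for r in records]
    # Coverage: share of samples with a confident (non-deferred) decision.
    coverage = sum(d != "defer" for d in decisions) / len(records)
    # Recall: share of known positives that were not rejected.
    positive_decisions = [d for r, d in zip(records, decisions) if r.label == 1]
    if not positive_decisions:
        raise ValueError("no positive samples to compute recall")
    recall = sum(d != "reject" for d in positive_decisions) / len(positive_decisions)
    return recall, coverage
```

In practice the two thresholds would be tuned on held-out data from the training window, so that no information from 2019 onward leaks into the calibration.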