BigBang-Proton Technical Report: Next-Word-Prediction is Scientific Multitask Learner
2510.00129v1
cs.LG, cond-mat.mtrl-sci, cs.AI, physics.comp-ph, 68T05, 68T50, 00A69, 94A99, I.2.6; I.2.7; J.2; I.6.3; K.4.1
2025-10-05
Authors:
Hengkui Wu, Liujiang Liu, Jihua He, Qihao Wang, Keke Zhao, Shuyang Hu, Renle Fu, Dahao Liang, Lingyu Zeng, Bruce Liu, Yuan Liu, Jin Zhan, Jiaqiang Niu, Xinglong Jia, Yaqin Hu, Wenjun Ji, Panpan Chi, Ken Chen, Hengyuan Wu, Yingsi Xin, Yongfeng Zhu, Yuexin Wang, Manqi Ruan, Ningtao Bian, Xiaohua Wu, Weipeng Xu
Abstract
We introduce BigBang-Proton, a unified sequence-based architecture for
auto-regressive language modeling pretrained on cross-scale, cross-structure,
cross-discipline real-world scientific tasks to construct a scientific
multi-task learner. BigBang-Proton incorporates three fundamental innovations
compared to mainstream general-purpose LLMs: a Theory-Experiment Learning
paradigm that aligns large-scale numerical experimental data with theoretical
text corpora; Binary Patch Encoding, which replaces byte pair encoding (BPE)
tokenization; and Monte Carlo Attention, which substitutes for traditional
transformer architectures.
Through next-word-prediction pretraining on cross-discipline scientific
datasets of real-world problems mixed with general textual corpus, followed by
fine-tuning and inference on downstream tasks, BigBang-Proton demonstrates
100% accuracy on arithmetic addition of up to 50 digits, performance on
par with leading specialized models in particle physics jet tagging, an MAE
matching that of specialized models in inter-atomic potential simulation, performance
comparable to traditional spatiotemporal models in water quality prediction,
and benchmark-exceeding performance in genome modeling. These results prove
that language-guided scientific computing can match or exceed the performance
of task-specific scientific models while maintaining multitask learning
capabilities. We further hypothesize that scaling the pretraining to the scale
of the universe is a fundamental step toward developing a material-world
foundational model.