Decoding Partial Differential Equations: Cross-Modal Adaptation of Decoder-only Models to PDEs
2510.05278v1
cs.LG, cs.CL
2025-10-09
Авторы:
Paloma García-de-Herreros, Philipp Slusallek, Dietrich Klakow, Vagrant Gautam
Abstract
Large language models have shown great success on natural language tasks in
recent years, but they have also shown great promise when adapted to new
modalities, e.g., for scientific machine learning tasks. Even though
decoder-only models are more popular within NLP and scale exceedingly well at
generating natural language, most proposed approaches for cross-modal
adaptation focus on encoder-only models, raising the question of how model
architecture affects these approaches. In this paper, we therefore perform a
series of ablation studies to answer this question, systematically comparing
encoder-only and decoder-only models on cross-modal adaptation for
time-dependent simulation tasks based on partial differential equations (PDEs).
We find that decoder-only models are far worse than encoder-only models, when
existing approaches are applied unmodified. In contrast to several other
domains, scaling decoder-only models also does not help. To harness the
potential of decoder-only models in this context, we introduce two novel
approaches, Parallel Flipping and Sequence Doubling, attempting to mimic
bidirectionality in autoregressive models. Both our methods improve overall
performance using decoder-only models for all tasks and all cross-model
adaptation methods, closing the gap to encoder-only model performance. We hope
that our findings broaden the spectrum of models used on cross-modal adaptation
tasks to further scientific ML.
Ссылки и действия
Дополнительные ресурсы: