Evaluating the effectiveness of LLM-based interoperability
2510.23893v1
cs.SE, cs.AI
2025-10-30
Авторы:
Rodrigo Falcão, Stefan Schweitzer, Julien Siebert, Emily Calvet, Frank Elberzhager
Abstract
Background: Systems of systems are becoming increasingly dynamic and
heterogeneous, and this adds pressure on the long-standing challenge of
interoperability. Besides its technical aspect, interoperability has also an
economic side, as development time efforts are required to build the
interoperability artifacts. Objectives: With the recent advances in the field
of large language models (LLMs), we aim at analyzing the effectiveness of
LLM-based strategies to make systems interoperate autonomously, at runtime,
without human intervention. Method: We selected 13 open source LLMs and curated
four versions of a dataset in the agricultural interoperability use case. We
performed three runs of each model with each version of the dataset, using two
different strategies. Then we compared the effectiveness of the models and the
consistency of their results across multiple runs. Results: qwen2.5-coder:32b
was the most effective model using both strategies DIRECT (average pass@1 >=
0.99) and CODEGEN (average pass@1 >= 0.89) in three out of four dataset
versions. In the fourth dataset version, which included an unit conversion, all
models using the strategy DIRECT failed, whereas using CODEGEN
qwen2.5-coder:32b succeeded with an average pass@1 = 0.75. Conclusion: Some
LLMs can make systems interoperate autonomously. Further evaluation in
different domains is recommended, and further research on reliability
strategies should be conducted.
Ссылки и действия
Дополнительные ресурсы: