Analyzing the Impact of Participant Failures in Cross-Silo Federated Learning
2511.14456v1
cs.DC, cs.AI
2025-11-20
Авторы:
Fabian Stricker, David Bermbach, Christian Zirpins
Abstract
Federated learning (FL) is a new paradigm for training machine learning (ML) models without sharing data. While applying FL in cross-silo scenarios, where organizations collaborate, it is necessary that the FL system is reliable; however, participants can fail due to various reasons (e.g., communication issues or misconfigurations). In order to provide a reliable system, it is necessary to analyze the impact of participant failures. While this problem received attention in cross-device FL where mobile devices with limited resources participate, there is comparatively little research in cross-silo FL.
Therefore, we conduct an extensive study for analyzing the impact of participant failures on the model quality in the context of inter-organizational cross-silo FL with few participants. In our study, we focus on analyzing generally influential factors such as the impact of the timing and the data as well as the impact on the evaluation, which is important for deciding, if the model should be deployed. We show that under high skews the evaluation is optimistic and hides the real impact. Furthermore, we demonstrate that the timing impacts the quality of the trained model. Our results offer insights for researchers and software architects aiming to build robust FL systems.
Ссылки и действия
Дополнительные ресурсы: