📊 Статистика дайджестов

Всего дайджестов: 34607 Добавлено сегодня: 484

Последнее обновление: сегодня

📄 An All-Reduce Compatible Top-K Compressor for Communication-Efficient Distributed Learning

2025-11-04

Авторы:

Chuyan Chen, Chenyang Ma, Zhangxin Li, Yutong He, Yanjie Dong, Kun Yuan

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Communication remains a central bottleneck in large-scale distributed machine learning, and gradient sparsification has emerged as a promising strategy to alleviate this challenge. However, existing gradient compressors face notable limitations: Rand-$K$ discards structural information and performs poorly in practice, while Top-$K$ preserves informative entries but loses the contraction property and requires costly All-Gather operations. In this paper, we propose ARC-Top-$K$, an {All-Reduce}-Com...

ID: 2510.26709v2 cs.LG, cs.DC

arXiv PDF

📄 SERFLOW: A Cross-Service Cost Optimization Framework for SLO-Aware Dynamic ML Inference

2025-11-04

Авторы:

Zongshun Zhang, Ibrahim Matta

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Dynamic offloading of Machine Learning (ML) model partitions across different resource orchestration services, such as Function-as-a-Service (FaaS) and Infrastructure-as-a-Service (IaaS), can balance processing and transmission delays while minimizing costs of adaptive inference applications. However, prior work often overlooks real-world factors, such as Virtual Machine (VM) cold starts, requests under long-tail service time distributions, etc. To tackle these limitations, we model each ML quer...

ID: 2510.27182v1 cs.LG, cs.DC

arXiv PDF

📄 ReSpec: Towards Optimizing Speculative Decoding in Reinforcement Learning Systems

2025-11-01

Авторы:

Qiaoling Chen, Zijun Liu, Peng Sun, Shenggui Li, Guoteng Wang, Ziming Liu, Yonggang Wen, Siyuan Feng, Tianwei Zhang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Adapting large language models (LLMs) via reinforcement learning (RL) is often bottlenecked by the generation stage, which can consume over 75\% of the training time. Speculative decoding (SD) accelerates autoregressive generation in serving systems, but its behavior under RL training remains largely unexplored. We identify three critical gaps that hinder the naive integration of SD into RL systems: diminishing speedups at large batch sizes, drafter staleness under continual actor updates, and d...

ID: 2510.26475v1 cs.LG, cs.DC

arXiv PDF

📄 An All-Reduce Compatible Top-K Compressor for Communication-Efficient Distributed Learning

2025-11-01

Авторы:

Chuyan Chen, Chenyang Ma, Zhangxin Li, Yutong He, Yanjie Dong, Kun Yuan

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Communication remains a central bottleneck in large-scale distributed machine learning, and gradient sparsification has emerged as a promising strategy to alleviate this challenge. However, existing gradient compressors face notable limitations: Rand-$K$\ discards structural information and performs poorly in practice, while Top-$K$\ preserves informative entries but loses the contraction property and requires costly All-Gather operations. In this paper, we propose ARC-Top-$K$, an {All-Reduce}-C...

ID: 2510.26709v1 cs.LG, cs.DC

arXiv PDF

📄 Machine Learning and CPU (Central Processing Unit) Scheduling Co-Optimization over a Network of Computing Centers

2025-10-31

Авторы:

Mohammadreza Doostmohammadian, Zulfiya R. Gabidullina, Hamid R. Rabiee

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

In the rapidly evolving research on artificial intelligence (AI) the demand for fast, computationally efficient, and scalable solutions has increased in recent years. The problem of optimizing the computing resources for distributed machine learning (ML) and optimization is considered in this paper. Given a set of data distributed over a network of computing-nodes/servers, the idea is to optimally assign the CPU (central processing unit) usage while simultaneously training each computing node lo...

ID: 2510.25176v1 cs.LG, cs.DC, cs.MA, cs.SY, eess.SY, math.OC

arXiv PDF

📄 Sentinel: Dynamic Knowledge Distillation for Personalized Federated Intrusion Detection in Heterogeneous IoT Networks

2025-10-29

Авторы:

Gurpreet Singh, Keshav Sood, P. Rajalakshmi, Yong Xiang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Federated learning (FL) offers a privacy-preserving paradigm for machine learning, but its application in intrusion detection systems (IDS) within IoT networks is challenged by severe class imbalance, non-IID data, and high communication overhead.These challenges severely degrade the performance of conventional FL methods in real-world network traffic classification. To overcome these limitations, we propose Sentinel, a personalized federated IDS (pFed-IDS) framework that incorporates a dual-mod...

ID: 2510.23019v1 cs.LG, cs.DC

arXiv PDF

📄 AirFed: Federated Graph-Enhanced Multi-Agent Reinforcement Learning for Multi-UAV Cooperative Mobile Edge Computing

2025-10-29

Авторы:

Zhiyu Wang, Suman Raj, Rajkumar Buyya

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Multiple Unmanned Aerial Vehicles (UAVs) cooperative Mobile Edge Computing (MEC) systems face critical challenges in coordinating trajectory planning, task offloading, and resource allocation while ensuring Quality of Service (QoS) under dynamic and uncertain environments. Existing approaches suffer from limited scalability, slow convergence, and inefficient knowledge sharing among UAVs, particularly when handling large-scale IoT device deployments with stringent deadline constraints. This paper...

ID: 2510.23053v1 cs.LG, cs.DC

arXiv PDF

📄 Accelerating Mobile Inference through Fine-Grained CPU-GPU Co-Execution

2025-10-28

Авторы:

Zhuojin Li, Marco Paolieri, Leana Golubchik

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Deploying deep neural networks on mobile devices is increasingly important but remains challenging due to limited computing resources. On the other hand, their unified memory architecture and narrower gap between CPU and GPU performance provide an opportunity to reduce inference latency by assigning tasks to both CPU and GPU. The main obstacles for such collaborative execution are the significant synchronization overhead required to combine partial results, and the difficulty of predicting execu...

ID: 2510.21081v1 cs.LG, cs.DC, cs.PF

arXiv PDF

📄 Benchmarking Catastrophic Forgetting Mitigation Methods in Federated Time Series Forecasting

2025-10-28

Авторы:

Khaled Hallak, Oudom Kem

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Catastrophic forgetting (CF) poses a persistent challenge in continual learning (CL), especially within federated learning (FL) environments characterized by non-i.i.d. time series data. While existing research has largely focused on classification tasks in vision domains, the regression-based forecasting setting prevalent in IoT and edge applications remains underexplored. In this paper, we present the first benchmarking framework tailored to investigate CF in federated continual time series fo...

ID: 2510.21491v1 cs.LG, cs.DC, stat.ML, 68T07, 68W15, 62M10, I.2.6; I.2.7; I.5.1; I.5.4

arXiv PDF

📄 ADP-VRSGP: Decentralized Learning with Adaptive Differential Privacy via Variance-Reduced Stochastic Gradient Push

2025-10-25

Авторы:

Xiaoming Wu, Teng Liu, Xin Wang, Ming Yang, Jiguo Yu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Differential privacy is widely employed in decentralized learning to safeguard sensitive data by introducing noise into model updates. However, existing approaches that use fixed-variance noise often degrade model performance and reduce training efficiency. To address these limitations, we propose a novel approach called decentralized learning with adaptive differential privacy via variance-reduced stochastic gradient push (ADP-VRSGP). This method dynamically adjusts both the noise variance and ...

ID: 2510.20157v1 cs.LG, cs.DC

arXiv PDF

Показано 21 - 30 из 83 записей