📊 Статистика дайджестов

Всего дайджестов: 34123 Добавлено сегодня: 101

Последнее обновление: сегодня

📄 A Multi-Agent, Policy-Gradient approach to Network Routing

2025-12-04

Авторы:

Nigel Tao, Jonathan Baxter, Lex Weaver

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Network routing is a distributed decision problem which naturally admits numerical performance measures, such as the average time for a packet to travel from source to destination. OLPOMDP, a policy-gradient reinforcement learning algorithm, was successfully applied to simulated network routing under a number of network models. Multiple distributed agents (routers) learned co-operative behavior without explicit inter-agent communication, and they avoided behavior which was individually desirable...

ID: 2512.03211v1 cs.LG, cs.NI

arXiv PDF

📄 How to DP-fy Your Data: A Practical Guide to Generating Synthetic Data With Differential Privacy

2025-12-04

Авторы:

Natalia Ponomareva, Zheng Xu, H. Brendan McMahan, Peter Kairouz, Lucas Rosenblatt, Vincent Cohen-Addad, Cristóbal Guzmán, Ryan McKenna, Galen Andrew, Alex Bie, Da Yu, Alex Kurakin, Morteza Zadimoghaddam, Sergei Vassilvitskii, Andreas Terzis

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

High quality data is needed to unlock the full potential of AI for end users. However finding new sources of such data is getting harder: most publicly-available human generated data will soon have been used. Additionally, publicly available data often is not representative of users of a particular system -- for example, a research speech dataset of contractors interacting with an AI assistant will likely be more homogeneous, well articulated and self-censored than real world commands that end u...

ID: 2512.03238v1 cs.CR, cs.AI, cs.LG, stat.ML

arXiv PDF

📄 Iterative Tilting for Diffusion Fine-Tuning

2025-12-04

Авторы:

Jean Pachebat, Giovanni Conforti, Alain Durmus, Yazid Janati

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We introduce iterative tilting, a gradient-free method for fine-tuning diffusion models toward reward-tilted distributions. The method decomposes a large reward tilt $\exp(λr)$ into $N$ sequential smaller tilts, each admitting a tractable score update via first-order Taylor expansion. This requires only forward evaluations of the reward function and avoids backpropagating through sampling chains. We validate on a two-dimensional Gaussian mixture with linear reward, where the exact tilted distrib...

ID: 2512.03234v1 stat.ML, cs.LG

arXiv PDF

📄 Convergence of a class of gradient-free optimisation schemes when the objective function is noisy, irregular, or both

2025-12-04

Авторы:

Christophe Andrieu, Nicolas Chopin, Ettore Fincato, Mathieu Gerber

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We investigate the convergence properties of a class of iterative algorithms designed to minimize a potentially non-smooth and noisy objective function, which may be algebraically intractable and whose values may be obtained as the output of a black box. The algorithms considered can be cast under the umbrella of a generalised gradient descent recursion, where the gradient is that of a smooth approximation of the objective function. The framework we develop includes as special cases model-based ...

ID: 2512.03225v1 stat.CO, cs.LG, stat.ML

arXiv PDF

📄 SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning

2025-12-04

Авторы:

Salman Rahman, Sruthi Gorantla, Arpit Gupta, Swastik Roy, Nanyun Peng, Yang Liu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Process reward models (PRMs) that provide dense, step-level feedback have shown promise for reinforcement learning, yet their adoption remains limited by the need for expensive step-level annotations or ground truth references. We propose SPARK: a three-stage framework where in the first stage a generator model produces diverse solutions and a verifier model evaluates them using parallel scaling (self-consistency) and sequential scaling (meta-critique). In the second stage, we use these verifica...

ID: 2512.03244v1 cs.LG, cs.AI, cs.CL

arXiv PDF

📄 Novelty detection on path space

2025-12-04

Авторы:

Ioannis Gasteratos, Antoine Jacquier, Maud Lemercier, Terry Lyons, Cristopher Salvi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We frame novelty detection on path space as a hypothesis testing problem with signature-based test statistics. Using transportation-cost inequalities of Gasteratos and Jacquier (2023), we obtain tail bounds for false positive rates that extend beyond Gaussian measures to laws of RDE solutions with smooth bounded vector fields, yielding estimates of quantiles and p-values. Exploiting the shuffle product, we derive exact formulae for smooth surrogates of conditional value-at-risk (CVaR) in terms o...

ID: 2512.03243v1 stat.ML, cs.LG, math.PR, math.ST

arXiv PDF

📄 Too Late to Recall: Explaining the Two-Hop Problem in Multimodal Knowledge Retrieval

2025-12-04

Авторы:

Constantin Venhoff, Ashkan Khakzar, Sonia Joseph, Philip Torr, Neel Nanda

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Training vision language models (VLMs) aims to align visual representations from a vision encoder with the textual representations of a pretrained large language model (LLM). However, many VLMs exhibit reduced factual recall performance compared to their LLM backbones, raising the question of how effective multimodal fine-tuning is at extending existing mechanisms within the LLM to visual inputs. We argue that factual recall based on visual inputs requires VLMs to solve a two-hop problem: (1) fo...

ID: 2512.03276v1 cs.LG

arXiv PDF

📄 Learning Network Sheaves for AI-native Semantic Communication

2025-12-04

Авторы:

Enrico Grimaldi, Mario Edoardo Pandolfo, Gabriele D'Acunto, Sergio Barbarossa, Paolo Di Lorenzo

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Recent advances in AI call for a paradigm shift from bit-centric communication to goal- and semantics-oriented architectures, paving the way for AI-native 6G networks. In this context, we address a key open challenge: enabling heterogeneous AI agents to exchange compressed latent-space representations while mitigating semantic noise and preserving task-relevant meaning. We cast this challenge as learning both the communication topology and the alignment maps that govern information exchange amon...

ID: 2512.03248v1 cs.MA, cs.AI, cs.LG, eess.SP

arXiv PDF

📄 Multi-Frequency Federated Learning for Human Activity Recognition Using Head-Worn Sensors

2025-12-04

Авторы:

Dario Fenoglio, Mohan Li, Davide Casnici, Matias Laporte, Shkurta Gashi, Silvia Santini, Martin Gjoreski, Marc Langheinrich

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Human Activity Recognition (HAR) benefits various application domains, including health and elderly care. Traditional HAR involves constructing pipelines reliant on centralized user data, which can pose privacy concerns as they necessitate the uploading of user data to a centralized server. This work proposes multi-frequency Federated Learning (FL) to enable: (1) privacy-aware ML; (2) joint ML model learning across devices with varying sampling frequency. We focus on head-worn devices (e.g., ear...

ID: 2512.03287v1 cs.LG, cs.DC

arXiv PDF

📄 BlendedNet++: A Large-Scale Blended Wing Body Aerodynamics Dataset and Benchmark

2025-12-04

Авторы:

Nicholas Sung, Steven Spreizer, Mohamed Elrefaie, Matthew C. Jones, Faez Ahmed

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Despite progress in machine learning-based aerodynamic surrogates, the scarcity of large, field-resolved datasets limits progress on accurate pointwise prediction and reproducible inverse design for aircraft. We introduce BlendedNet++, a large-scale aerodynamic dataset and benchmark focused on blended wing body (BWB) aircraft. The dataset contains over 12,000 unique geometries, each simulated at a single flight condition, yielding 12,490 aerodynamic results for steady RANS CFD. For every case, w...

ID: 2512.03280v1 cs.LG, cs.AI

arXiv PDF

1
2
40
41
42
43
44
1398
1399

Показано 411 - 420 из 13987 записей