📊 Digest statistics

Total digests: 34022
Added today: 82

Last updated: today

📄 Fast Decoding for Non-Adaptive Learning of Erdős--Rényi Random Graphs

2025-11-25

Authors:

Hoang Ta, Jonathan Scarlett

Annotation:

We study the problem of learning an unknown graph via group queries on node subsets, where each query reports whether at least one edge is present among the queried nodes. In general, learning arbitrary graphs with $n$ nodes and $k$ edges is hard in the non-adaptive setting, requiring $\Omega\big(\min\{k^2\log n,\,n^2\}\big)$ tests even when a small error probability is allowed. We focus on learning Erdős--Rényi (ER) graphs $G\sim\mathrm{ER}(n,q)$ in the non-adaptive setting, where the expected numbe...
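The query model described in this abstract can be illustrated with a small sketch. It only shows what a single group query returns on a sampled ER graph; it is not the decoding algorithm from the paper. The function names `sample_er_graph` and `group_query` and the edge representation are assumptions made for illustration.

```python
import random
from itertools import combinations


def sample_er_graph(n, q, rng):
    """Sample an ER graph on nodes 0..n-1: each possible edge is present
    independently with probability q. Edges are stored as frozensets."""
    return {frozenset(pair) for pair in combinations(range(n), 2) if rng.random() < q}


def group_query(edges, subset):
    """Return True if at least one edge has both endpoints in `subset`."""
    nodes = set(subset)
    return any(edge <= nodes for edge in edges)


if __name__ == "__main__":
    rng = random.Random(0)
    graph = sample_er_graph(n=8, q=0.2, rng=rng)
    # A non-adaptive design fixes all query subsets before seeing any answer.
    queries = [rng.sample(range(8), 4) for _ in range(5)]
    for subset in queries:
        print(sorted(subset), group_query(graph, subset))
```

Each query returns one bit, so the learner has to recover the edge set from the pattern of positive and negative answers across all subsets.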

ID: 2511.17240v1 cs.IT, cs.DM, cs.LG, math.PR

📄 High-Dimensional Asymptotics of Differentially Private PCA

2025-11-15

Authors:

Youngjoo Yun, Rishabh Dudeja

Annotation:

In differential privacy, statistics of a sensitive dataset are privatized by introducing random noise. Most privacy analyses provide privacy bounds specifying a noise level sufficient to achieve a target privacy guarantee. Sometimes, these bounds are pessimistic and suggest adding excessive noise, which overwhelms the meaningful signal. It remains unclear if such high noise levels are truly necessary or a limitation of the proof techniques. This paper explores whether we can obtain sharp privacy...

ID: 2511.07270v1 math.ST, cs.IT, cs.LG, math.PR, stat.ML

📄 Parallel Sampling via Autospeculation

2025-11-15

Authors:

Nima Anari, Carlo Baronio, CJ Chen, Alireza Haqi, Frederic Koehler, Anqi Li, Thuy-Duong Vuong

Annotation:

We present parallel algorithms to accelerate sampling via counting in two settings: any-order autoregressive models and denoising diffusion models. An any-order autoregressive model accesses a target distribution $μ$ on $[q]^n$ through an oracle that provides conditional marginals, while a denoising diffusion model accesses a target distribution $μ$ on $\mathbb{R}^n$ through an oracle that provides conditional means under Gaussian noise. Standard sequential sampling algorithms require $\widetild...
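For orientation, the standard sequential baseline mentioned in this abstract can be sketched as follows: coordinates are sampled one at a time, each from the conditional marginal returned by the oracle. This is only the sequential baseline, not the parallel algorithm of the paper. The brute-force oracle built from an explicit joint table, and the names `make_oracle` and `sequential_sample`, are assumptions for illustration and only work at toy sizes.

```python
import random


def make_oracle(joint):
    """Build a conditional-marginal oracle from an explicit joint table.

    `joint` maps each tuple in [q]^n to its probability (toy sizes only).
    """
    def oracle(pinned, i):
        # Sum the mass consistent with the pinned coordinates, grouped by x_i.
        weights = {}
        for x, p in joint.items():
            if all(x[j] == v for j, v in pinned.items()):
                weights[x[i]] = weights.get(x[i], 0.0) + p
        total = sum(weights.values())
        return {v: w / total for v, w in weights.items()}
    return oracle


def sequential_sample(oracle, n, order, rng):
    """Sample one configuration with one oracle call per coordinate."""
    pinned = {}
    for i in order:
        marginal = oracle(pinned, i)
        values = list(marginal)
        probs = [marginal[v] for v in values]
        pinned[i] = rng.choices(values, weights=probs, k=1)[0]
    return tuple(pinned[i] for i in range(n))


if __name__ == "__main__":
    joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
    oracle = make_oracle(joint)
    # "Any-order" means the coordinate order may be chosen freely.
    print(sequential_sample(oracle, n=2, order=[1, 0], rng=random.Random(0)))
```

The loop makes one oracle call per coordinate, and each call depends on the previous answers, which is the sequential dependence that parallel methods aim to shorten.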

ID: 2511.07869v1 cs.DS, cs.DC, cs.LG, math.PR

📄 ODE approximation for the Adam algorithm: General and overparametrized setting

2025-11-08

Authors:

Steffen Dereich, Arnulf Jentzen, Sebastian Kassing

Annotation:

The Adam optimizer is currently presumably the most popular optimization method in deep learning. In this article we develop an ODE based method to study the Adam optimizer in a fast-slow scaling regime. For fixed momentum parameters and vanishing step-sizes, we show that the Adam algorithm is an asymptotic pseudo-trajectory of the flow of a particular vector field, which is referred to as the Adam vector field. Leveraging properties of asymptotic pseudo-trajectories, we establish convergence re...

ID: 2511.04622v1 math.OC, cs.LG, math.PR

📄 Precise asymptotic analysis of Sobolev training for random feature models

2025-11-07

Authors:

Katharine E Fisher, Matthew TC Li, Youssef Marzouk, Timo Schorlepp

Annotation:

Gradient information is widely useful and available in applications, and is therefore natural to include in the training of neural networks. Yet little is known theoretically about the impact of Sobolev training -- regression with both function and gradient data -- on the generalization error of highly overparameterized predictive models in high dimensions. In this paper, we obtain a precise characterization of this training modality for random feature (RF) models in the limit where the number o...

ID: 2511.03050v1 stat.ML, cond-mat.dis-nn, cs.LG, math.PR, math.ST, stat.TH

📄 Limit Theorems for Stochastic Gradient Descent in High-Dimensional Single-Layer Networks

2025-11-06

Authors:

Parsa Rangriz

Annotation:

This paper studies the high-dimensional scaling limits of online stochastic gradient descent (SGD) for single-layer networks. Building on the seminal work of Saad and Solla, which analyzed the deterministic (ballistic) scaling limits of SGD corresponding to the gradient flow of the population loss, we focus on the critical scaling regime of the step size. Below this critical scale, the effective dynamics are governed by ballistic (ODE) limits, but at the critical scale, new correction term appea...

ID: 2511.02258v1 stat.ML, cs.LG, math.PR, math.ST, stat.TH

📄 Global Dynamics of Heavy-Tailed SGDs in Nonconvex Loss Landscape: Characterization and Control

2025-10-28

Authors:

Xingyu Wang, Chang-Han Rhee

Annotation:

Stochastic gradient descent (SGD) and its variants enable modern artificial intelligence. However, theoretical understanding lags far behind their empirical success. It is widely believed that SGD has a curious ability to avoid sharp local minima in the loss landscape, which are associated with poor generalization. To unravel this mystery and further enhance such capability of SGDs, it is imperative to go beyond the traditional local convergence analysis and obtain a comprehensive understanding ...

ID: 2510.20905v1 cs.LG, math.PR

📄 Local regression on path spaces with signature metrics

2025-10-22

Authors:

Christian Bayer, Davit Gogolashvili, Luca Pelizzari

Annotation:

We study nonparametric regression and classification for path-valued data. We introduce a functional Nadaraya-Watson estimator that combines the signature transform from rough path theory with local kernel regression. The signature transform provides a principled way to encode sequential data through iterated integrals, enabling direct comparison of paths in a natural metric space. Our approach leverages signature-induced distances within the classical kernel regression framework, achieving comp...
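The general idea of kernel regression on signature features can be sketched as below. The abstract is truncated here and does not specify the metric or kernel, so everything concrete in this sketch is an assumption: signatures truncated at level 2, piecewise-linear paths, Euclidean distance between truncated signatures, and a Gaussian kernel. The names `signature_level2` and `nadaraya_watson` are hypothetical.

```python
import math


def signature_level2(path):
    """Level-1 and level-2 signature terms of a piecewise-linear path.

    `path` is a list of points, each a list of d floats. The result is a
    flat list: d level-1 terms followed by d*d level-2 terms.
    """
    d = len(path[0])
    level1 = [0.0] * d
    level2 = [[0.0] * d for _ in range(d)]
    for prev, cur in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, cur)]
        for i in range(d):
            for j in range(d):
                # Chen's identity for appending one linear segment.
                level2[i][j] += level1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            level1[i] += delta[i]
    return level1 + [v for row in level2 for v in row]


def nadaraya_watson(train_paths, train_targets, query_path, bandwidth):
    """Kernel-weighted average of targets, with weights from signature distance.

    Assumes at least one weight is nonzero (bandwidth not too small).
    """
    sig_q = signature_level2(query_path)
    weights = []
    for p in train_paths:
        sig_p = signature_level2(p)
        dist2 = sum((a - b) ** 2 for a, b in zip(sig_p, sig_q))
        weights.append(math.exp(-dist2 / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, train_targets)) / total


if __name__ == "__main__":
    right_then_up = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
    up_then_right = [[0.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    paths, targets = [right_then_up, up_then_right], [1.0, -1.0]
    print(nadaraya_watson(paths, targets, right_then_up, bandwidth=0.5))
```

The two example paths share the same endpoints, so their level-1 terms agree. Only the level-2 terms, which record the order of the moves, tell them apart.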

ID: 2510.16728v1 stat.ML, cs.LG, math.PR, stat.ME, 60L10, 60L20, 62G05, 62G08

📄 Asymptotically optimal reinforcement learning in Block Markov Decision Processes

2025-10-17

Authors:

Thomas van Vuren, Fiona Sloothaak, Maarten G. Wolf, Jaron Sanders

Annotation:

The curse of dimensionality renders Reinforcement Learning (RL) impractical in many real-world settings with exponentially large state and action spaces. Yet, many environments exhibit exploitable structure that can accelerate learning. To formalize this idea, we study RL in Block Markov Decision Processes (BMDPs). BMDPs model problems with large observation spaces, but where transition dynamics are fully determined by latent states. Recent advances in clustering methods have enabled the efficie...

ID: 2510.13748v1 cs.LG, math.PR, stat.ML, 90C40, 62H30, 60J20

📄 Dimension-Free Minimax Rates for Learning Pairwise Interactions in Attention-Style Models

2025-10-16

Authors:

Shai Zucker, Xiong Wang, Fei Lu, Inbar Seroussi

Annotation:

We study the convergence rate of learning pairwise interactions in single-layer attention-style models, where tokens interact through a weight matrix and a non-linear activation function. We prove that the minimax rate is $M^{-\frac{2\beta}{2\beta+1}}$ with $M$ being the sample size, depending only on the smoothness $\beta$ of the activation, and crucially independent of token count, ambient dimension, or rank of the weight matrix. These results highlight a fundamental dimension-free statistical...
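To get a feel for the rate $M^{-\frac{2\beta}{2\beta+1}}$ quoted in this abstract, it can be evaluated for a few sample sizes and smoothness levels. This is plain arithmetic on the stated formula. Constants and any lower-order factors are ignored, and the function name `minimax_rate` is hypothetical.

```python
def minimax_rate(M, beta):
    """Evaluate M ** (-2*beta / (2*beta + 1)) for sample size M and smoothness beta."""
    return M ** (-2.0 * beta / (2.0 * beta + 1.0))


if __name__ == "__main__":
    for beta in (0.5, 1.0, 2.0):
        for M in (100, 10_000):
            print(f"beta={beta}, M={M}: rate={minimax_rate(M, beta):.6f}")
```

The exponent $-\frac{2\beta}{2\beta+1}$ tends to $-1$ as $\beta$ grows, so smoother activations give rates closer to $M^{-1}$. No dimension or token count appears in the formula.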

ID: 2510.11789v1 stat.ML, cs.LG, math.PR, math.ST, stat.TH

Showing 11 - 20 of 43 entries