📊 Digest statistics
Total digests: 34022. Added today: 82.
Last updated: today
Authors:
Hoang Ta, Jonathan Scarlett
Russian summary not found
Available fields: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
We study the problem of learning an unknown graph via group queries on node subsets, where each query reports whether at least one edge is present among the queried nodes. In general, learning arbitrary graphs with \(n\) nodes and \(k\) edges is hard in the non-adaptive setting, requiring \(\Omega\big(\min\{k^2\log n,\,n^2\}\big)\) tests even when a small error probability is allowed. We focus on learning Erdős--Rényi (ER) graphs \(G\sim\mathrm{ER}(n,q)\) in the non-adaptive setting, where the expected numbe...
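The query model in this abstract can be sketched in a few lines. This is only an illustration of a single group query, assuming the hidden graph is stored as a list of undirected edges; the names `group_query` and `edges` are hypothetical and not taken from the paper, and no learning algorithm is implemented here.

```python
from itertools import combinations


def group_query(edges, subset):
    """Return True if at least one edge has both endpoints inside `subset`."""
    edge_set = {frozenset(e) for e in edges}
    # Check every unordered pair of queried nodes against the edge set.
    return any(frozenset(pair) in edge_set for pair in combinations(subset, 2))


# Hypothetical 5-node graph with two edges.
edges = [(0, 1), (2, 4)]
print(group_query(edges, {0, 2, 3}))  # no edge lies inside this subset
print(group_query(edges, {1, 2, 4}))  # edge (2, 4) lies inside this subset
```

A non-adaptive scheme would fix all subsets in advance and observe only these yes/no answers.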
Authors:
Youngjoo Yun, Rishabh Dudeja
Russian summary not found
Annotation:
In differential privacy, statistics of a sensitive dataset are privatized by introducing random noise. Most privacy analyses provide privacy bounds specifying a noise level sufficient to achieve a target privacy guarantee. Sometimes, these bounds are pessimistic and suggest adding excessive noise, which overwhelms the meaningful signal. It remains unclear if such high noise levels are truly necessary or a limitation of the proof techniques. This paper explores whether we can obtain sharp privacy...
📄 Parallel Sampling via Autospeculation
2025-11-15
Authors:
Nima Anari, Carlo Baronio, CJ Chen, Alireza Haqi, Frederic Koehler, Anqi Li, Thuy-Duong Vuong
Russian summary not found
Annotation:
We present parallel algorithms to accelerate sampling via counting in two settings: any-order autoregressive models and denoising diffusion models. An any-order autoregressive model accesses a target distribution $μ$ on $[q]^n$ through an oracle that provides conditional marginals, while a denoising diffusion model accesses a target distribution $μ$ on $\mathbb{R}^n$ through an oracle that provides conditional means under Gaussian noise. Standard sequential sampling algorithms require $\widetild...
Authors:
Steffen Dereich, Arnulf Jentzen, Sebastian Kassing
Russian summary not found
Annotation:
The Adam optimizer is currently presumably the most popular optimization
method in deep learning. In this article we develop an ODE based method to
study the Adam optimizer in a fast-slow scaling regime. For fixed momentum
parameters and vanishing step-sizes, we show that the Adam algorithm is an
asymptotic pseudo-trajectory of the flow of a particular vector field, which is
referred to as the Adam vector field. Leveraging properties of asymptotic
pseudo-trajectories, we establish convergence re...
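For orientation, the recursion being analyzed can be sketched as follows. This is the commonly used form of the Adam update with bias correction, written for a scalar parameter; it is an assumption that the paper studies exactly this variant, and the function names and default hyperparameters below are illustrative only.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; `t` is the 1-based step counter."""
    m = beta1 * m + (1.0 - beta1) * grad         # first-moment (momentum) estimate
    v = beta2 * v + (1.0 - beta2) * grad * grad  # second-moment estimate
    m_hat = m / (1.0 - beta1 ** t)               # bias corrections
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


def run_adam(grad_fn, theta0, steps, lr):
    """Iterate Adam with fixed momentum parameters and a constant step size."""
    theta, m, v = theta0, 0.0, 0.0
    for t in range(1, steps + 1):
        theta, m, v = adam_step(theta, grad_fn(theta), m, v, t, lr=lr)
    return theta


# Toy objective f(x) = x^2 with gradient 2x, started at x = 1.
print(run_adam(lambda x: 2.0 * x, 1.0, steps=2000, lr=0.01))
```

The momentum parameters `beta1` and `beta2` stay fixed while `lr` is the step size, which is the quantity sent to zero in the regime the abstract describes.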
Authors:
Katharine E Fisher, Matthew TC Li, Youssef Marzouk, Timo Schorlepp
Russian summary not found
Annotation:
Gradient information is widely useful and available in applications, and is
therefore natural to include in the training of neural networks. Yet little is
known theoretically about the impact of Sobolev training -- regression with
both function and gradient data -- on the generalization error of highly
overparameterized predictive models in high dimensions. In this paper, we
obtain a precise characterization of this training modality for random feature
(RF) models in the limit where the number o...
📄 Limit Theorems for Stochastic Gradient Descent in High-Dimensional Single-Layer Networks
2025-11-06
Authors:
Parsa Rangriz
Russian summary not found
Annotation:
This paper studies the high-dimensional scaling limits of online stochastic
gradient descent (SGD) for single-layer networks. Building on the seminal work
of Saad and Solla, which analyzed the deterministic (ballistic) scaling limits
of SGD corresponding to the gradient flow of the population loss, we focus on
the critical scaling regime of the step size. Below this critical scale, the
effective dynamics are governed by ballistic (ODE) limits, but at the critical
scale, a new correction term appea...
📄 Global Dynamics of Heavy-Tailed SGDs in Nonconvex Loss Landscape: Characterization and Control
2025-10-28
Authors:
Xingyu Wang, Chang-Han Rhee
Russian summary not found
Annotation:
Stochastic gradient descent (SGD) and its variants enable modern artificial
intelligence. However, theoretical understanding lags far behind their
empirical success. It is widely believed that SGD has a curious ability to
avoid sharp local minima in the loss landscape, which are associated with poor
generalization. To unravel this mystery and further enhance such capability of
SGDs, it is imperative to go beyond the traditional local convergence analysis
and obtain a comprehensive understanding ...
Authors:
Christian Bayer, Davit Gogolashvili, Luca Pelizzari
Russian summary not found
Annotation:
We study nonparametric regression and classification for path-valued data. We
introduce a functional Nadaraya-Watson estimator that combines the signature
transform from rough path theory with local kernel regression. The signature
transform provides a principled way to encode sequential data through iterated
integrals, enabling direct comparison of paths in a natural metric space. Our
approach leverages signature-induced distances within the classical kernel
regression framework, achieving comp...
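The combination described in this abstract can be sketched roughly as below. This is a minimal illustration under several assumptions that are not taken from the paper: paths are piecewise linear, the signature is truncated at level 2, the distance is the Euclidean distance between truncated signatures, and the kernel is Gaussian. All function names are hypothetical.

```python
import math


def signature_level2(path):
    """Level-2 truncated signature of a piecewise-linear path, flattened to a list.

    `path` is a list of points, each a list of d floats.
    """
    d = len(path[0])
    s1 = [0.0] * d                      # level 1: total increment so far
    s2 = [[0.0] * d for _ in range(d)]  # level 2: iterated integrals
    for prev, curr in zip(path, path[1:]):
        delta = [c - p for p, c in zip(prev, curr)]
        for i in range(d):
            for j in range(d):
                # s1[i] is X^i - X_0^i at the start of the current segment.
                s2[i][j] += s1[i] * delta[j] + 0.5 * delta[i] * delta[j]
        for i in range(d):
            s1[i] += delta[i]
    return s1 + [x for row in s2 for x in row]


def signature_distance(p, q):
    """Euclidean distance between truncated signatures (an assumed choice)."""
    sp, sq = signature_level2(p), signature_level2(q)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sp, sq)))


def nadaraya_watson(train_paths, train_labels, query_path, bandwidth):
    """Kernel-weighted average of labels, weighted by signature distance."""
    weights = [
        math.exp(-((signature_distance(p, query_path) / bandwidth) ** 2))
        for p in train_paths
    ]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase the bandwidth")
    return sum(w * y for w, y in zip(weights, train_labels)) / total


# Two hypothetical 2-D training paths with labels 0 and 1.
paths = [[[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
labels = [0.0, 1.0]
print(nadaraya_watson(paths, labels, [[0.0, 0.0], [0.9, 0.1]], bandwidth=1.0))
```

Because the signature depends only on the path's increments and their ordering, two samplings of the same straight segment give the same features, which is what makes the distance a comparison of paths rather than of sample grids.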
Authors:
Thomas van Vuren, Fiona Sloothaak, Maarten G. Wolf, Jaron Sanders
Russian summary not found
Annotation:
The curse of dimensionality renders Reinforcement Learning (RL) impractical
in many real-world settings with exponentially large state and action spaces.
Yet, many environments exhibit exploitable structure that can accelerate
learning. To formalize this idea, we study RL in Block Markov Decision
Processes (BMDPs). BMDPs model problems with large observation spaces, but
where transition dynamics are fully determined by latent states. Recent
advances in clustering methods have enabled the efficie...
📄 Dimension-Free Minimax Rates for Learning Pairwise Interactions in Attention-Style Models
2025-10-16
Authors:
Shai Zucker, Xiong Wang, Fei Lu, Inbar Seroussi
Russian summary not found
Annotation:
We study the convergence rate of learning pairwise interactions in
single-layer attention-style models, where tokens interact through a weight
matrix and a non-linear activation function. We prove that the minimax rate is
$M^{-\frac{2\beta}{2\beta+1}}$ with $M$ being the sample size, depending only
on the smoothness $\beta$ of the activation, and crucially independent of token
count, ambient dimension, or rank of the weight matrix. These results highlight
a fundamental dimension-free statistical...
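To get a feel for the stated rate, the expression $M^{-\frac{2\beta}{2\beta+1}}$ can be evaluated directly. The helper below is a hypothetical convenience function, not something from the paper, and it ignores the constants that a minimax bound leaves unspecified.

```python
def minimax_rate(sample_size, beta):
    """Evaluate M ** (-2*beta / (2*beta + 1)) for sample size M and smoothness beta."""
    if sample_size <= 0 or beta <= 0:
        raise ValueError("sample_size and beta must be positive")
    return sample_size ** (-2.0 * beta / (2.0 * beta + 1.0))


# Smoother activations (larger beta) give a faster rate at the same sample size.
for beta in (0.5, 1.0, 2.0):
    print(beta, minimax_rate(1000, beta))
```

For example, with $\beta = 1$ the exponent is $-2/3$, so $M = 1000$ samples give a rate of order $10^{-2}$; the token count, ambient dimension, and rank do not enter the expression.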
Showing 11-20 of 43 entries