📊 Digest statistics
Total digests: 34022 Added today: 82
Last updated: today
📄 Kernel Learning with Adversarial Features: Numerical Efficiency and Adaptive Regularization
2025-10-28 Authors:
Antônio H. Ribeiro, David Vävinggren, Dave Zachariah, Thomas B. Schön, Francis Bach
Annotation:
Adversarial training has emerged as a key technique to enhance model
robustness against adversarial input perturbations. Many of the existing
methods rely on computationally expensive min-max problems that limit their
application in practice. We propose a novel formulation of adversarial training
in reproducing kernel Hilbert spaces, shifting from input to feature-space
perturbations. This reformulation enables the exact solution of inner
maximization and efficient optimization. It also provides...
📄 Neural Collapse under Gradient Flow on Shallow ReLU Networks for Orthogonally Separable Data
2025-10-28 Authors:
Hancheng Min, Zhihui Zhu, René Vidal
Annotation:
Among many mysteries behind the success of deep networks lies the exceptional
discriminative power of their learned representations as manifested by the
intriguing Neural Collapse (NC) phenomenon, where simple feature structures
emerge at the last layer of a trained neural network. Prior works on the
theoretical understandings of NC have focused on analyzing the optimization
landscape of matrix-factorization-like problems by considering the last-layer
features as unconstrained free optimization ...
Authors:
Noah Oberweis, Semih Cayci
Annotation:
Continuous-time models provide important insights into the training dynamics
of optimization algorithms in deep learning. In this work, we establish a
non-asymptotic convergence analysis of stochastic gradient Langevin dynamics
(SGLD), which is an Itô stochastic differential equation (SDE) approximation
of stochastic gradient descent in continuous time, in the lazy training regime.
We show that, under regularity conditions on the Hessian of the loss function,
SGLD with multiplicative and state...
Authors:
Fangyuan Sun, Ilyas Fatkhullin, Niao He
Annotation:
Stochastic Natural Gradient Variational Inference (NGVI) is a widely used
method for approximating posterior distributions in probabilistic models.
Despite its empirical success and foundational role in variational inference,
its theoretical underpinnings remain limited, particularly in the case of
non-conjugate likelihoods. While NGVI has been shown to be a special instance
of Stochastic Mirror Descent, and recent work has provided convergence
guarantees using relative smoothness and strong conv...
Authors:
Xiaoxing Ren, Nicola Bastianello, Thomas Parisini, Andreas A. Malikopoulos
Annotation:
In this paper, we study the problem of reinforcement learning in multi-agent
systems where communication among agents is limited. We develop a decentralized
actor-critic learning framework in which each agent performs several local
updates of its policy and value function, where the latter is approximated by a
multi-layer neural network, before exchanging information with its neighbors.
This local training strategy substantially reduces the communication burden
while maintaining coordination acr...
Authors:
Youngsik Hwang, Dong-Young Lim
Annotation:
Machine unlearning (MU) aims to remove the influence of specific data from a
trained model. However, approximate unlearning methods, often formulated as a
single-objective optimization (SOO) problem, face a critical trade-off between
unlearning efficacy and model fidelity. This leads to three primary challenges:
the risk of over-forgetting, a lack of fine-grained control over the unlearning
process, and the absence of metrics to holistically evaluate the trade-off. To
address these issues, we re...
Authors:
Egor Shulgin, Sultan AlRashed, Francesco Orabona, Peter Richtárik
Annotation:
The Muon optimizer has rapidly emerged as a powerful, geometry-aware
alternative to AdamW, demonstrating strong performance in large-scale training
of neural networks. However, a critical theory-practice disconnect exists:
Muon's efficiency relies on fast, approximate orthogonalization, yet all prior
theoretical work analyzes an idealized, computationally intractable version
assuming exact SVD-based updates. This work moves beyond the ideal by providing
the first analysis of the inexact orthogon...
Authors:
Fabian Schaipp
Annotation:
The training of diffusion models is often absent in the evaluation of new
optimization techniques. In this work, we benchmark recent optimization
algorithms for training a diffusion model for denoising flow trajectories. We
observe that Muon and SOAP are highly efficient alternatives to AdamW (18%
lower final loss). We also revisit several recent phenomena related to the
training of models for text or image applications in the context of diffusion
model training. This includes the impact of the ...
📄 Rethinking PCA Through Duality
2025-10-23 Authors:
Jan Quan, Johan Suykens, Panagiotis Patrinos
Annotation:
Motivated by the recently shown connection between self-attention and
(kernel) principal component analysis (PCA), we revisit the fundamentals of
PCA. Using the difference-of-convex (DC) framework, we present several novel
formulations and provide new theoretical insights. In particular, we show the
kernelizability and out-of-sample applicability for a PCA-like family of
problems. Moreover, we uncover that simultaneous iteration, which is connected
to the classical QR algorithm, is an instance o...
Authors:
Jostein Barry-Straume, Adwait D. Verulkar, Arash Sarshar, Andrey A. Popov, Adrian Sandu
Annotation:
The objective of designing a control system is to steer a dynamical system
with a control signal, guiding it to exhibit the desired behavior. The
Hamilton-Jacobi-Bellman (HJB) partial differential equation offers a framework
for optimal control system design. However, numerical solutions to this
equation are computationally intensive, and analytical solutions are frequently
unavailable. Knowledge-guided machine learning methodologies, such as
physics-informed neural networks (PINNs), offer new a...
Showing 41-50 of 157 entries