📊 Digest statistics
Total digests: 35039
Added today: 432
Last updated: today
Authors:
Xuan Tang, Jichu Li, Difan Zou
Russian summary not found
Available fields: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
The rapid scaling of large language models (LLMs) has made low-precision
training essential for reducing memory, improving efficiency, and enabling
larger models and datasets. Existing convergence theories for adaptive
optimizers, however, assume all components are exact and neglect hardware-aware
quantization, leaving open the question of why low-precision training remains
effective. We introduce the first theoretical framework for analyzing the
convergence of adaptive optimizers, including Ada...
Authors:
Myeongho Jeon, Jan Sobotka, Suhwan Choi, Maria Brbić
Russian summary not found
Annotation:
As future superhuman models become increasingly complex, accurately
supervising their behavior may exceed human capabilities. Recent works have
demonstrated that in such scenarios, weak models can effectively supervise
strong models, a phenomenon known as weak-to-strong generalization. However, we
find that naive weak-to-strong generalization fails under distribution shifts,
often leading to worse performance of the strong model than its weak
supervisors. To address this, we propose RAVEN, a rob...
Authors:
Aymane El Firdoussi, El Mahdi Chayti, Mohamed El Amine Seddik, Martin Jaggi
Russian summary not found
Annotation:
Fine-tuning has proven to be highly effective in adapting pre-trained models
to perform better on new desired tasks with minimal data samples. Among the
most widely used approaches are reparameterization methods, which update a
target module by augmenting its frozen weight matrix with an additional
trainable weight matrix. The most prominent example is Low-Rank Adaptation
(LoRA), which has gained significant attention in recent years. In this paper, we
introduce a new class of reparameterization metho...
Authors:
Stephan Rabanser, Nicolas Papernot
Russian summary not found
Annotation:
Selective classifiers improve model reliability by abstaining on inputs the
model deems uncertain. However, few practical approaches achieve the
gold-standard performance of a perfect-ordering oracle that accepts examples
exactly in order of correctness. Our work formalizes this shortfall as the
selective-classification gap and presents the first finite-sample decomposition
of this gap into five distinct sources of looseness: Bayes noise, approximation
error, ranking error, statistical noise, and i...
Authors:
Wang Zixian
Russian summary not found
Annotation:
Anchored Direct Preference Optimization (ADPO) is a unified framework that
generalizes Direct Preference Optimization (DPO) with soft preferences,
reference-policy anchoring, and groupwise extensions. While standard DPO
assumes hard binary labels and pairwise comparisons, ADPO introduces: (i) soft
preference probabilities that encode uncertainty and mitigate gradient drift;
(ii) arbitrary reference-policy anchors that stabilize training via groupwise
shift invariance and implicit KL regularizati...
Authors:
Stephan Rabanser, Nicolas Papernot
Russian summary not found
Annotation:
Selective classifiers improve model reliability by abstaining on inputs the
model deems uncertain. However, few practical approaches achieve the
gold-standard performance of a perfect-ordering oracle that accepts examples
exactly in order of correctness. Our work formalizes this shortfall as the
selective-classification gap and presents the first finite-sample decomposition
of this gap into five distinct sources of looseness: Bayes noise, approximation
error, ranking error, statistical noise, and i...
📄 Latent Discrete Diffusion Models
2025-10-23
Authors:
Dario Shariatian, Alain Durmus, Stefano Peluchetti
Russian summary not found
Annotation:
We study discrete diffusion for language and other categorical data and focus
on a common limitation of masked denoisers: reverse transitions typically
factorize across positions, which can weaken joint structure and degrade
quality in few-step generation. We propose Latent Discrete Diffusion
Models (LDDMs), which couple a masked discrete diffusion over tokens with a
continuous diffusion over latent embeddings. The latent channel provides a
softer signal and carries cross-token dependenci...
📄 Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options
2025-10-23
Authors:
Joongkyu Lee, Seouh-won Yi, Min-hwan Oh
Russian summary not found
Annotation:
We study online preference-based reinforcement learning (PbRL) with the goal
of improving sample efficiency. While a growing body of theoretical work has
emerged, motivated by PbRL's recent empirical success, particularly in aligning
large language models (LLMs), most existing studies focus only on pairwise
comparisons. A few recent works (Zhu et al., 2023; Mukherjee et al., 2024;
Thekumparampil et al., 2024) have explored using multiple comparisons and
ranking feedback, but their performance guar...
📄 Structured Temporal Causality for Interpretable Multivariate Time Series Anomaly Detection
2025-10-22
Authors:
Dongchan Cho, Jiho Han, Keumyeong Kang, Minsang Kim, Honggyu Ryu, Namsoon Jung
Russian summary not found
Annotation:
Real-world multivariate time series anomalies are rare and often unlabeled.
Additionally, prevailing methods rely on increasingly complex architectures
tuned to benchmarks, detecting only fragments of anomalous segments and
overstating performance. In this paper, we introduce OracleAD, a simple and
interpretable unsupervised framework for multivariate time series anomaly
detection. OracleAD encodes each variable's past sequence into a single causal
embedding to jointly predict the present time p...
📄 Symmetry and Generalisation in Neural Approximations of Renormalisation Transformations
2025-10-22
Authors:
Cassidy Ashworth, Pietro Liò, Francesco Caso
Russian summary not found
Annotation:
Deep learning models have proven enormously successful at using multiple
layers of representation to learn relevant features of structured data.
Encoding physical symmetries into these models can improve performance on
difficult tasks, and recent work has motivated the principle of parameter
symmetry breaking and restoration as a unifying mechanism underlying their
hierarchical learning dynamics. We evaluate the role of parameter symmetry and
network expressivity in the generalisation behaviour ...
Showing 41-50 of 124 records