📊 Digest statistics

Total digests: 35039 Added today: 432

Last updated: today

📄 A Convergence Analysis of Adaptive Optimizers under Floating-point Quantization

2025-10-28

Authors:

Xuan Tang, Jichu Li, Difan Zou

Russian summary not found

Annotation:

The rapid scaling of large language models (LLMs) has made low-precision training essential for reducing memory, improving efficiency, and enabling larger models and datasets. Existing convergence theories for adaptive optimizers, however, assume all components are exact and neglect hardware-aware quantization, leaving open the question of why low-precision training remains effective. We introduce the first theoretical framework for analyzing the convergence of adaptive optimizers, including Ada...

ID: 2510.21314v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 Weak-to-Strong Generalization under Distribution Shifts

2025-10-28

Authors:

Myeongho Jeon, Jan Sobotka, Suhwan Choi, Maria Brbić

Russian summary not found

Annotation:

As future superhuman models become increasingly complex, accurately supervising their behavior may exceed human capabilities. Recent works have demonstrated that in such scenarios, weak models can effectively supervise strong models, a phenomenon known as weak-to-strong generalization. However, we find that naive weak-to-strong generalization fails under distribution shifts, often leading to worse performance of the strong model than its weak supervisors. To address this, we propose RAVEN, a rob...

ID: 2510.21332v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 $α$-LoRA: Effective Fine-Tuning via Base Model Rescaling

2025-10-28

Authors:

Aymane El Firdoussi, El Mahdi Chayti, Mohamed El Amine Seddik, Martin Jaggi

Russian summary not found

Annotation:

Fine-tuning has proven highly effective at adapting pre-trained models to new tasks with minimal data samples. Among the most widely used approaches are reparameterization methods, which update a target module by augmenting its frozen weight matrix with an additional trainable weight matrix. The most prominent example is Low-Rank Adaptation (LoRA), which has gained significant attention in recent years. In this paper, we introduce a new class of reparameterization metho...
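As a rough illustration of the reparameterization idea described in this abstract, the sketch below shows a standard LoRA-style forward pass: a frozen weight matrix plus a trainable low-rank product. This is not the α-LoRA method itself (the abstract is truncated before it is described). The function names, the `scale` parameter, and the shape convention (`w_frozen` is d_in × d_out, `a` is d_in × r, `b` is r × d_out) are assumptions made for this example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_forward(x, w_frozen, a, b, scale=1.0):
    """Compute y = x @ W + scale * (x @ A) @ B.

    W stays frozen; only the low-rank factors A and B would be trained.
    All names and the scale parameter are illustrative assumptions.
    """
    base = matmul(x, w_frozen)           # frozen pre-trained path
    update = matmul(matmul(x, a), b)     # trainable low-rank path
    return [
        [p + scale * q for p, q in zip(base_row, update_row)]
        for base_row, update_row in zip(base, update)
    ]


x = [[1.0, 2.0]]
w = [[1.0, 0.0], [0.0, 1.0]]
a = [[1.0], [1.0]]                       # rank r = 1
b_zero = [[0.0, 0.0]]                    # zero-initialized B leaves the base model unchanged
b_trained = [[0.5, -0.5]]

y_initial = lora_forward(x, w, a, b_zero)
y_adapted = lora_forward(x, w, a, b_trained)
```

With `B` initialized to zero, the adapted module reproduces the frozen model exactly, which is the usual starting point for this family of methods.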

ID: 2510.21345v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 What Does It Take to Build a Performant Selective Classifier?

2025-10-27

Authors:

Stephan Rabanser, Nicolas Papernot

Russian summary not found

Annotation:

Selective classifiers improve model reliability by abstaining on inputs the model deems uncertain. However, few practical approaches achieve the gold-standard performance of a perfect-ordering oracle that accepts examples exactly in order of correctness. Our work formalizes this shortfall as the selective-classification gap and presents the first finite-sample decomposition of this gap into five distinct sources of looseness: Bayes noise, approximation error, ranking error, statistical noise, and i...
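The abstaining behaviour described in this abstract can be sketched with a simple confidence threshold. This is a generic baseline, not the paper's method or its gap decomposition. The function names and the max-probability confidence score are assumptions chosen for illustration.

```python
def selective_predict(probs, threshold):
    """Return the predicted class index, or None to abstain.

    Confidence is taken to be the largest class probability (an assumption;
    other confidence scores are possible).
    """
    confidence = max(probs)
    if confidence < threshold:
        return None
    return probs.index(confidence)


def coverage_and_selective_accuracy(prob_rows, labels, threshold):
    """Return (fraction of inputs accepted, accuracy on the accepted inputs)."""
    preds = [selective_predict(p, threshold) for p in prob_rows]
    accepted = [(p, y) for p, y in zip(preds, labels) if p is not None]
    coverage = len(accepted) / len(labels)
    accuracy = sum(p == y for p, y in accepted) / len(accepted) if accepted else None
    return coverage, accuracy


prob_rows = [[0.9, 0.1], [0.55, 0.45], [0.2, 0.8]]
labels = [0, 1, 1]
coverage, accuracy = coverage_and_selective_accuracy(prob_rows, labels, threshold=0.7)
```

Here the second input is rejected because its confidence (0.55) falls below the threshold, so the classifier trades coverage for accuracy on the inputs it accepts.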

ID: 2510.20242v2 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 ADPO: Anchored Direct Preference Optimization

2025-10-25

Authors:

Wang Zixian

Russian summary not found

Annotation:

Anchored Direct Preference Optimization (ADPO) is a unified framework that generalizes Direct Preference Optimization (DPO) with soft preferences, reference-policy anchoring, and groupwise extensions. While standard DPO assumes hard binary labels and pairwise comparisons, ADPO introduces: (i) soft preference probabilities that encode uncertainty and mitigate gradient drift; (ii) arbitrary reference-policy anchors that stabilize training via groupwise shift invariance and implicit KL regularizati...
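For context, the sketch below computes the standard pairwise DPO loss that this abstract generalizes, with an optional soft preference probability. The soft-label form shown here is a common label-smoothing-style extension and is an assumption; it is not taken from the ADPO paper, whose exact objective is cut off in the abstract. All function and parameter names are illustrative.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1, pref_prob=1.0):
    """Pairwise DPO loss for one (preferred, rejected) response pair.

    logp_* are log-probabilities under the policy, ref_logp_* under the
    reference policy. pref_prob=1.0 gives standard DPO with a hard label;
    values below 1.0 give an assumed soft-label variant.
    """
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return -(pref_prob * log_sigmoid(margin) + (1.0 - pref_prob) * log_sigmoid(-margin))


# Policy identical to the reference: the margin is zero and the loss is log(2).
loss_at_reference = dpo_loss(-1.0, -1.0, -1.0, -1.0)

# Policy favours the preferred response more than the reference does.
loss_improved = dpo_loss(-1.0, -2.0, -1.5, -1.5)
```

The loss drops below log(2) once the policy raises the preferred response's likelihood relative to the reference, which is the behaviour the hard-label objective rewards.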

ID: 2510.18913v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 What Does It Take to Build a Performant Selective Classifier?

2025-10-25

Authors:

Stephan Rabanser, Nicolas Papernot

Russian summary not found

Annotation:

ID: 2510.20242v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 Latent Discrete Diffusion Models

2025-10-23

Authors:

Dario Shariatian, Alain Durmus, Stefano Peluchetti

Russian summary not found

Annotation:

We study discrete diffusion for language and other categorical data and focus on a common limitation of masked denoisers: reverse transitions typically factorize across positions, which can weaken joint structure and degrade quality in few-step generation. We propose Latent Discrete Diffusion Models (LDDMs), which couple a masked discrete diffusion over tokens with a continuous diffusion over latent embeddings. The latent channel provides a softer signal and carries cross-token dependenci...

ID: 2510.18114v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 Preference-based Reinforcement Learning beyond Pairwise Comparisons: Benefits of Multiple Options

2025-10-23

Authors:

Joongkyu Lee, Seouh-won Yi, Min-hwan Oh

Russian summary not found

Annotation:

We study online preference-based reinforcement learning (PbRL) with the goal of improving sample efficiency. While a growing body of theoretical work has emerged, motivated by PbRL's recent empirical success, particularly in aligning large language models (LLMs), most existing studies focus only on pairwise comparisons. A few recent works (Zhu et al., 2023; Mukherjee et al., 2024; Thekumparampil et al., 2024) have explored using multiple comparisons and ranking feedback, but their performance guar...

ID: 2510.18713v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 Structured Temporal Causality for Interpretable Multivariate Time Series Anomaly Detection

2025-10-22

Authors:

Dongchan Cho, Jiho Han, Keumyeong Kang, Minsang Kim, Honggyu Ryu, Namsoon Jung

Russian summary not found

Annotation:

Real-world multivariate time series anomalies are rare and often unlabeled. Additionally, prevailing methods rely on increasingly complex architectures tuned to benchmarks, detecting only fragments of anomalous segments and overstating performance. In this paper, we introduce OracleAD, a simple and interpretable unsupervised framework for multivariate time series anomaly detection. OracleAD encodes each variable's past sequence into a single causal embedding to jointly predict the present time p...

ID: 2510.16511v1 cs.LG, cs.AI, stat.ML

arXiv PDF

📄 Symmetry and Generalisation in Neural Approximations of Renormalisation Transformations

2025-10-22

Authors:

Cassidy Ashworth, Pietro Liò, Francesco Caso

Russian summary not found

Annotation:

Deep learning models have proven enormously successful at using multiple layers of representation to learn relevant features of structured data. Encoding physical symmetries into these models can improve performance on difficult tasks, and recent work has motivated the principle of parameter symmetry breaking and restoration as a unifying mechanism underlying their hierarchical learning dynamics. We evaluate the role of parameter symmetry and network expressivity in the generalisation behaviour ...

ID: 2510.16591v1 cs.LG, cond-mat.stat-mech, cs.AI, stat.ML

arXiv PDF

Showing 41-50 of 124 entries