📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 0

Последнее обновление: сегодня

📄 How Muon's Spectral Design Benefits Generalization: A Study on Imbalanced Data

2025-10-29

Авторы:

Bhavya Vasudeva, Puneesh Deora, Yize Zhao, Vatsal Sharan, Christos Thrampoulidis

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The growing adoption of spectrum-aware matrix-valued optimizers such as Muon and Shampoo in deep learning motivates a systematic study of their generalization properties and, in particular, when they might outperform competitive algorithms. We approach this question by introducing appropriate simplifying abstractions as follows: First, we use imbalanced data as a testbed. Second, we study the canonical form of such optimizers, which is Spectral Gradient Descent (SpecGD) -- each update step is $U...

ID: 2510.22980v1 cs.LG, stat.ML

arXiv PDF

📄 Adaptive Forests For Classification

2025-10-29

Авторы:

Dimitris Bertsimas, Yubing Cui

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Random Forests (RF) and Extreme Gradient Boosting (XGBoost) are two of the most widely used and highly performing classification and regression models. They aggregate equally weighted CART trees, generated randomly in RF or sequentially in XGBoost. In this paper, we propose Adaptive Forests (AF), a novel approach that adaptively selects the weights of the underlying CART models. AF combines (a) the Optimal Predictive-Policy Trees (OP2T) framework to prescribe tailored, input-dependent unequal we...

ID: 2510.22991v1 cs.LG, stat.ML

arXiv PDF

📄 The Benchmarking Epistemology: Construct Validity for Evaluating Machine Learning Models

2025-10-29

Авторы:

Timo Freiesleben, Sebastian Zezulka

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Predictive benchmarking, the evaluation of machine learning models based on predictive performance and competitive ranking, is a central epistemic practice in machine learning research and an increasingly prominent method for scientific inquiry. Yet, benchmark scores alone provide at best measurements of model performance relative to an evaluation dataset and a concrete learning problem. Drawing substantial scientific inferences from the results, say about theoretical tasks like image classifica...

ID: 2510.23191v1 cs.LG, stat.ML

arXiv PDF

📄 GCAO: Group-driven Clustering via Gravitational Attraction and Optimization

2025-10-29

Авторы:

Qi Li, Jun Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Traditional clustering algorithms often struggle with high-dimensional and non-uniformly distributed data, where low-density boundary samples are easily disturbed by neighboring clusters, leading to unstable and distorted clustering results. To address this issue, we propose a Group-driven Clustering via Gravitational Attraction and Optimization (GCAO) algorithm. GCAO introduces a group-level optimization mechanism that aggregates low-density boundary points into collaboratively moving groups, r...

ID: 2510.23259v1 cs.LG, stat.ML

arXiv PDF

📄 Robust Non-negative Proximal Gradient Algorithm for Inverse Problems

2025-10-29

Авторы:

Hanzhang Wang, Zonglin Liu, Jingyi Xu, Chenyang Wang, Zhiwei Zhong, Qiangqiang Shen

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Proximal gradient algorithms (PGA), while foundational for inverse problems like image reconstruction, often yield unstable convergence and suboptimal solutions by violating the critical non-negativity constraint. We identify the gradient descent step as the root cause of this issue, which introduces negative values and induces high sensitivity to hyperparameters. To overcome these limitations, we propose a novel multiplicative update proximal gradient algorithm (SSO-PGA) with convergence guaran...

ID: 2510.23362v1 cs.LG, stat.ML

arXiv PDF

📄 An Information-Theoretic Analysis of Out-of-Distribution Generalization in Meta-Learning with Applications to Meta-RL

2025-10-29

Авторы:

Xingtu Liu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

In this work, we study out-of-distribution generalization in meta-learning from an information-theoretic perspective. We focus on two scenarios: (i) when the testing environment mismatches the training environment, and (ii) when the training environment is broader than the testing environment. The first corresponds to the standard distribution mismatch setting, while the second reflects a broad-to-narrow training scenario. We further formalize the generalization problem in meta-reinforcement lea...

ID: 2510.23448v1 cs.LG, stat.ML

arXiv PDF

📄 An Analytic Theory of Quantum Imaginary Time Evolution

2025-10-29

Авторы:

Min Chen, Bingzhi Zhang, Quntao Zhuang, Junyu Liu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Quantum imaginary time evolution (QITE) algorithm is one of the most promising variational quantum algorithms (VQAs), bridging the current era of Noisy Intermediate-Scale Quantum devices and the future of fully fault-tolerant quantum computing. Although practical demonstrations of QITE and its potential advantages over the general VQA trained with vanilla gradient descent (GD) in certain tasks have been reported, a first-principle, theoretical understanding of QITE remains limited. Here, we aim ...

ID: 2510.22481v1 quant-ph, cs.AI, cs.LG, stat.ML

arXiv PDF

📄 HRM-Agent: Training a recurrent reasoning model in dynamic environments using reinforcement learning

2025-10-29

Авторы:

Long H Dang, David Rawlinson

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The Hierarchical Reasoning Model (HRM) has impressive reasoning abilities given its small size, but has only been applied to supervised, static, fully-observable problems. One of HRM's strengths is its ability to adapt its computational effort to the difficulty of the problem. However, in its current form it cannot integrate and reuse computation from previous time-steps if the problem is dynamic, uncertain or partially observable, or be applied where the correct action is undefined, characteris...

ID: 2510.22832v1 cs.AI, cs.LG, stat.ML, 68T07 (Primary) 62M45, 37N99 (Secondary), I.2.6; I.2.8

arXiv PDF

📄 From Information to Generative Exponent: Learning Rate Induces Phase Transitions in SGD

2025-10-28

Авторы:

Konstantinos Christopher Tsiolis, Alireza Mousavi-Hosseini, Murat A. Erdogdu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

To understand feature learning dynamics in neural networks, recent theoretical works have focused on gradient-based learning of Gaussian single-index models, where the label is a nonlinear function of a latent one-dimensional projection of the input. While the sample complexity of online SGD is determined by the information exponent of the link function, recent works improved this by performing multiple gradient steps on the same sample with different learning rates -- yielding a non-correlation...

ID: 2510.21020v1 cs.LG, stat.ML

arXiv PDF

📄 Amortized Active Generation of Pareto Sets

2025-10-28

Авторы:

Daniel M. Steinberg, Asiri Wijesinghe, Rafael Oliveira, Piotr Koniusz, Cheng Soon Ong, Edwin V. Bonilla

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We introduce active generation of Pareto sets (A-GPS), a new framework for online discrete black-box multi-objective optimization (MOO). A-GPS learns a generative model of the Pareto set that supports a-posteriori conditioning on user preferences. The method employs a class probability estimator (CPE) to predict non-dominance relations and to condition the generative model toward high-performing regions of the search space. We also show that this non-dominance CPE implicitly estimates the probab...

ID: 2510.21052v1 cs.LG, stat.ML

arXiv PDF

Показано 141 - 150 из 385 записей