📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Covariance Scattering Transforms

2025-11-15

Авторы:

Andrea Cavallo, Ayushman Raghuvanshi, Sundeep Prabhakar Chepuri, Elvin Isufi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Machine learning and data processing techniques relying on covariance information are widespread as they identify meaningful patterns in unsupervised and unlabeled settings. As a prominent example, Principal Component Analysis (PCA) projects data points onto the eigenvectors of their covariance matrix, capturing the directions of maximum variance. This mapping, however, falls short in two directions: it fails to capture information in low-variance directions, relevant when, e.g., the data contai...

ID: 2511.08878v1 cs.LG, stat.ML

arXiv PDF

📄 Several Supporting Evidences for the Adaptive Feature Program

2025-11-15

Авторы:

Yicheng Li, Qian Lin

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Theoretically exploring the advantages of neural networks might be one of the most challenging problems in the AI era. An adaptive feature program has recently been proposed to analyze the feature learning characteristic property of neural networks in a more abstract way. Motivated by the celebrated Le Cam equivalence, we advocate the over-parametrized sequence models to further simplify the analysis of the training dynamics of adaptive feature program and present several supporting evidences fo...

ID: 2511.09425v1 cs.LG, stat.ML

arXiv PDF

📄 Boosted GFlowNets: Improving Exploration via Sequential Learning

2025-11-15

Авторы:

Pedro Dall'Antonia, Tiago da Silva, Daniel Augusto de Souza, César Lincoln C. Mattos, Diego Mesquita

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Generative Flow Networks (GFlowNets) are powerful samplers for compositional objects that, by design, sample proportionally to a given non-negative reward. Nonetheless, in practice, they often struggle to explore the reward landscape evenly: trajectories toward easy-to-reach regions dominate training, while hard-to-reach modes receive vanishing or uninformative gradients, leading to poor coverage of high-reward areas. We address this imbalance with Boosted GFlowNets, a method that sequentially t...

ID: 2511.09677v1 cs.LG, stat.ML

arXiv PDF

📄 Global Convergence of Four-Layer Matrix Factorization under Random Initialization

2025-11-15

Авторы:

Minrui Luo, Weihang Xu, Xiang Gao, Maryam Fazel, Simon Shaolei Du

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Gradient descent dynamics on the deep matrix factorization problem is extensively studied as a simplified theoretical model for deep neural networks. Although the convergence theory for two-layer matrix factorization is well-established, no global convergence guarantee for general deep matrix factorization under random initialization has been established to date. To address this gap, we provide a polynomial-time global convergence guarantee for randomly initialized gradient descent on four-layer...

ID: 2511.09925v1 math.OC, cs.LG, stat.ML

arXiv PDF

📄 A Novel Data-Dependent Learning Paradigm for Large Hypothesis Classes

2025-11-15

Авторы:

Alireza F. Pour, Shai Ben-David

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We address the general task of learning with a set of candidate models that is too large to have a uniform convergence of empirical estimates to true losses. While the common approach to such challenges is SRM (or regularization) based learning algorithms, we propose a novel learning paradigm that relies on stronger incorporation of empirical data and requires less algorithmic decisions to be based on prior assumptions. We analyze the generalization capabilities of our approach and demonstrate i...

ID: 2511.09996v1 cs.LG, stat.ML

arXiv PDF

📄 Algorithm Design and Stronger Guarantees for the Improving Multi-Armed Bandits Problem

2025-11-15

Авторы:

Avrim Blum, Marten Garicano, Kavya Ravichandran, Dravyansh Sharma

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The improving multi-armed bandits problem is a formal model for allocating effort under uncertainty, motivated by scenarios such as investing research effort into new technologies, performing clinical trials, and hyperparameter selection from learning curves. Each pull of an arm provides reward that increases monotonically with diminishing returns. A growing line of work has designed algorithms for improving bandits, albeit with somewhat pessimistic worst-case guarantees. Indeed, strong lower bo...

ID: 2511.10619v1 cs.LG, stat.ML

arXiv PDF

📄 Laplacian Score Sharpening for Mitigating Hallucination in Diffusion Models

2025-11-15

Авторы:

Barath Chandran. C, Srinivas Anumasa, Dianbo Liu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Diffusion models, though successful, are known to suffer from hallucinations that create incoherent or unrealistic samples. Recent works have attributed this to the phenomenon of mode interpolation and score smoothening, but they lack a method to prevent their generation during sampling. In this paper, we propose a post-hoc adjustment to the score function during inference that leverages the Laplacian (or sharpness) of the score to reduce mode interpolation hallucination in unconditional diffusi...

ID: 2511.07496v1 cs.CV, cs.AI, cs.LG, stat.ML

arXiv PDF

📄 Trained on Tokens, Calibrated on Concepts: The Emergence of Semantic Calibration in LLMs

2025-11-11

Авторы:

Preetum Nakkiran, Arwen Bradley, Adam Goliński, Eugene Ndiaye, Michael Kirchhof, Sinead Williamson

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large Language Models (LLMs) often lack meaningful confidence estimates for their outputs. While base LLMs are known to exhibit next-token calibration, it remains unclear whether they can assess confidence in the actual meaning of their responses beyond the token level. We find that, when using a certain sampling-based notion of semantic calibration, base LLMs are remarkably well-calibrated: they can meaningfully assess confidence in open-domain question-answering tasks, despite not being explic...

ID: 2511.04869v1 cs.CL, cs.LG, stat.ML

arXiv PDF

📄 Efficient Swap Multicalibration of Elicitable Properties

2025-11-11

Авторы:

Lunjia Hu, Haipeng Luo, Spandan Senapati, Vatsal Sharan

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Multicalibration [HJKRR18] is an algorithmic fairness perspective that demands that the predictions of a predictor are correct conditional on themselves and membership in a collection of potentially overlapping subgroups of a population. The work of [NR23] established a surprising connection between multicalibration for an arbitrary property $\Gamma$ (e.g., mean or median) and property elicitation: a property $\Gamma$ can be multicalibrated if and only if it is elicitable, where elicitability is...

ID: 2511.04907v1 cs.LG, stat.ML

arXiv PDF

📄 Linear Gradient Prediction with Control Variates

2025-11-11

Авторы:

Kamil Ciosek, Nicolò Felicioni, Juan Elenter Litwin

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We propose a new way of training neural networks, with the goal of reducing training cost. Our method uses approximate predicted gradients instead of the full gradients that require an expensive backward pass. We derive a control-variate-based technique that ensures our updates are unbiased estimates of the true gradient. Moreover, we propose a novel way to derive a predictor for the gradient inspired by the theory of the Neural Tangent Kernel. We empirically show the efficacy of the technique o...

ID: 2511.05187v1 cs.LG, stat.ML

arXiv PDF

Показано 101 - 110 из 385 записей