📊 Digest statistics

Total digests: 34022
Added today: 82

Last updated: today
Authors:

Mehdi Dagdoug, Clement Dombry, Jean-Jil Duchamps

Annotation:
Random Forests and Gradient Boosting are among the most effective algorithms for supervised learning on tabular data. Both belong to the class of tree-based ensemble methods, where predictions are obtained by aggregating many randomized regression trees. In this paper, we develop a theoretical framework for analyzing such methods through Reproducing Kernel Hilbert Spaces (RKHSs) constructed on tree ensembles -- more precisely, on the random partitions generated by randomized regression trees. We...
ID: 2512.00397v1 stat.ML, cs.LG
Authors:

Shiqin Tang, Yining Dong, S. Joe Qin

Annotation:
In traditional multivariate data analysis, dimension reduction and regression have been treated as distinct endeavors. Established techniques such as principal component regression (PCR) and partial least squares (PLS) regression traditionally compute latent components as intermediary steps -- although with different underlying criteria -- before proceeding with the regression analysis. In this paper, we introduce an innovative regression methodology named PLS-integrated Lasso (PLS-Lasso) that i...
ID: 2511.23205v1 stat.ML, cs.LG, stat.ME
Authors:

Tomoshige Nakamura, Hiroshi Shiraishi

Annotation:
Quantile Regression Forests (QRF) are widely used for non-parametric conditional quantile estimation, yet statistical inference for variable importance measures remains challenging due to the non-smoothness of the loss function and the complex bias-variance trade-off. In this paper, we develop an asymptotic theory for variable importance defined as the difference in pinball loss risks. We first establish the asymptotic normality of the QRF estimator by handling the non-differentiable pinball loss...
ID: 2511.23212v1 stat.ML, cs.LG, math.ST
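The annotation above refers to the pinball loss and to a variable importance defined as a difference in pinball loss risks. As a rough illustration only, the standard pinball (quantile) loss at level tau can be sketched as below. The function names `pinball_loss` and `risk_difference` are hypothetical, and the empirical risk difference shown is an assumption about what such an importance measure could look like, not the estimator from the paper.

```python
def pinball_loss(y_true, q_pred, tau):
    """Average pinball (quantile) loss at level tau, with 0 < tau < 1.

    For a residual d = y - q, the loss is tau * d when d >= 0 and
    (tau - 1) * d when d < 0, which equals max(tau * d, (tau - 1) * d).
    """
    if not 0.0 < tau < 1.0:
        raise ValueError("tau must lie strictly between 0 and 1")
    if len(y_true) == 0 or len(y_true) != len(q_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, q in zip(y_true, q_pred):
        diff = y - q
        total += max(tau * diff, (tau - 1.0) * diff)
    return total / len(y_true)


def risk_difference(y_true, q_reduced, q_full, tau):
    """Hypothetical importance score (an assumption, not the paper's method):
    empirical pinball risk of predictions made without a variable, minus
    the risk of predictions made with all variables."""
    return pinball_loss(y_true, q_reduced, tau) - pinball_loss(y_true, q_full, tau)


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0]
    full = [1.1, 2.0, 2.8]      # illustrative quantile predictions
    reduced = [2.0, 2.0, 2.0]   # illustrative predictions without one variable
    print(risk_difference(y, reduced, full, tau=0.5))
```

A positive score in this sketch would mean the reduced predictions incur a higher empirical pinball risk than the full ones.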
Authors:

Zixun Huang, Jiayi Sheng, Zeyu Zheng

Annotation:
Existing reinforcement learning (RL)-based post-training methods for large language models have advanced rapidly, yet their design has largely been guided by heuristics rather than systematic theoretical principles. This gap limits our understanding of the properties of the gradient estimators and the associated optimization algorithms, thereby constraining opportunities to improve training stability and overall performance. In this work, we provide a unified theoretical framework that character...
ID: 2511.23310v1 stat.ML, cs.LG
Authors:

Dimitris Bertsimas, Caio de Prospero Iglesias, Nicholas A. G. Johnson

Annotation:
We study Sparse Multiple Kernel Learning (SMKL), which is the problem of selecting a sparse convex combination of prespecified kernels for support vector binary classification. Unlike prevailing l1 regularized approaches that approximate a sparsifying penalty, we formulate the problem by imposing an explicit cardinality constraint on the kernel weights and add an l2 penalty for robustness. We solve the resulting non-convex minimax problem via an alternating best response algorithm with two subpr...
ID: 2511.21890v1 stat.ML, cs.LG
Authors:

Aleksei G. Sorokin

Annotation:
Most scientific domains elicit the development of efficient algorithms and accessible scientific software. This thesis unifies our developments in three broad domains: Quasi-Monte Carlo (QMC) methods for efficient high-dimensional integration, Gaussian process (GP) regression for high-dimensional interpolation with built-in uncertainty quantification, and scientific machine learning (sciML) for modeling partial differential equations (PDEs) with mesh-free solvers. For QMC, we built new algorithm...
ID: 2511.21915v1 stat.ML, cs.LG, math.PR
Authors:

Soumojit Das, Nairanjana Dasgupta, Prashanta Dutta

Annotation:
Modern artificial intelligence systems make critical decisions yet often fail silently when uncertain. We develop a geometric framework for post-hoc calibration of neural network probability outputs, treating probability vectors as points on the $(c-1)$-dimensional probability simplex equipped with the Fisher--Rao metric. Our approach yields Additive Log-Ratio (ALR) calibration maps that reduce exactly to Platt scaling for binary problems (Proposition 1) while extending naturally to multi-class ...
ID: 2511.20960v1 stat.ML, cs.LG, math.ST, stat.ME
Authors:

Lechen Feng, Haoran Li, Lucky Li, Xingqiu Zhao

Annotation:
This paper investigates the partial linear model by Least Absolute Deviation (LAD) regression. We parameterize the nonparametric term using Deep Neural Networks (DNNs) and formulate a penalized LAD problem for estimation. Specifically, our model exhibits the following challenges. First, the regularization term can be nonconvex and nonsmooth, necessitating the introduction of infinite dimensional variational analysis and nonsmooth analysis into the asymptotic normality discussion. Second, our net...
ID: 2511.21115v1 stat.ML, cs.LG
Authors:

Jasraj Singh, Shelvia Wongso, Jeremie Houssineau, Badr-Eddine Chérief-Abdellatif

Annotation:
Variational inference (VI) is a cornerstone of modern Bayesian learning, enabling approximate inference in complex models that would otherwise be intractable. However, its formulation depends on expectations and divergences defined through high-dimensional integrals, often rendering analytical treatment impossible and necessitating heavy reliance on approximate learning and inference techniques. Possibility theory, an imprecise probability framework, allows one to directly model epistemic uncertaint...
ID: 2511.21223v1 stat.ML, cs.LG
Authors:

Alexandra Carpentier, Christophe Giraud, Nicolas Verzelen

Annotation:
A fundamental theoretical question in network analysis is to determine under which conditions community recovery is possible in polynomial time in the Stochastic Block Model (SBM). When the number $K$ of communities remains smaller than $\sqrt{n}$ (where $n$ denotes the number of nodes), non-trivial community recovery is possible in polynomial time above, and only above, the Kesten--Stigum (KS) threshold, originally postulated using arguments from statistical physics. When $K \geq \sqrt{n}$,...
ID: 2511.21526v1 stat.ML, cs.LG, math.PR, math.ST
Showing 31 - 40 of 564 entries