📄 Parametrising the Inhomogeneity Inducing Capacity of a Training Set, and its Impact on Supervised Learning

2025-10-23

Authors:

Gargi Roy, Dalia Chakrabarty

Abstract:

We introduce parametrisation of that property of the available training dataset, that necessitates an inhomogeneous correlation structure for the function that is learnt as a model of the relationship between the pair of variables, observations of which comprise the considered training data. We refer to a parametrisation of this property of a given training set, as its "inhomogeneity parameter". It is easy to compute this parameter for small-to-large datasets, and we demonstrate ...

ID: 2510.18332v1 stat.ML, cs.LG, 62H20, 60G10, 68T05, 68T27, 60J20

arXiv PDF

📄 Interval Prediction of Annual Average Daily Traffic on Local Roads via Quantile Random Forest with High-Dimensional Spatial Data

2025-10-23

Authors:

Ying Yao, Daniel J. Graham

Abstract:

Accurate annual average daily traffic (AADT) data are vital for transport planning and infrastructure management. However, automatic traffic detectors across national road networks often provide incomplete coverage, leading to underrepresentation of minor roads. While recent machine learning advances have improved AADT estimation at unmeasured locations, most models produce only point predictions and overlook estimation uncertainty. This study addresses that gap by introducing an interval predic...

ID: 2510.18548v1 stat.ML, cs.LG, stat.AP

arXiv PDF

📄 A Frequentist Statistical Introduction to Variational Inference, Autoencoders, and Diffusion Models

2025-10-23

Authors:

Yen-Chi Chen

Abstract:

While Variational Inference (VI) is central to modern generative models like Variational Autoencoders (VAEs) and Denoising Diffusion Models (DDMs), its pedagogical treatment is split across disciplines. In statistics, VI is typically framed as a Bayesian method for posterior approximation. In machine learning, however, VAEs and DDMs are developed from a Frequentist viewpoint, where VI is used to approximate a maximum likelihood estimator. This creates a barrier for statisticians, as the principl...

ID: 2510.18777v1 stat.ML, cs.LG, stat.CO, stat.ME

arXiv PDF

📄 Infinite Neural Operators: Gaussian processes on functions

2025-10-22

Authors:

Daniel Augusto de Souza, Yuchen Zhu, Harry Jake Cunningham, Yuri Saporito, Diego Mesquita, Marc Peter Deisenroth

Abstract:

A variety of infinitely wide neural architectures (e.g., dense NNs, CNNs, and transformers) induce Gaussian process (GP) priors over their outputs. These relationships provide both an accurate characterization of the prior predictive distribution and enable the use of GP machinery to improve the uncertainty quantification of deep neural networks. In this work, we extend this connection to neural operators (NOs), a class of models designed to learn mappings between function spaces. Specifically, ...

ID: 2510.16675v1 stat.ML, cs.LG

arXiv PDF

📄 Local regression on path spaces with signature metrics

2025-10-22

Authors:

Christian Bayer, Davit Gogolashvili, Luca Pelizzari

Abstract:

We study nonparametric regression and classification for path-valued data. We introduce a functional Nadaraya-Watson estimator that combines the signature transform from rough path theory with local kernel regression. The signature transform provides a principled way to encode sequential data through iterated integrals, enabling direct comparison of paths in a natural metric space. Our approach leverages signature-induced distances within the classical kernel regression framework, achieving comp...

ID: 2510.16728v1 stat.ML, cs.LG, math.PR, stat.ME, 60L10, 60L20, 62G05, 62G08

arXiv PDF

📄 Kernel-Based Nonparametric Tests For Shape Constraints

2025-10-22

Authors:

Rohan Sen

Abstract:

We develop a reproducing kernel Hilbert space (RKHS) framework for nonparametric mean-variance optimization and inference on shape constraints of the optimal rule. We derive statistical properties of the sample estimator and provide rigorous theoretical guarantees, such as asymptotic consistency, a functional central limit theorem, and a finite-sample deviation bound that matches the Monte Carlo rate up to regularization. Building on these findings, we introduce a joint Wald-type statistic to te...

ID: 2510.16745v2 stat.ML, cs.LG, math.ST, stat.ME, stat.TH, 62G10, 62G20, 62P05, 46E22

arXiv PDF

📄 Prediction-Augmented Trees for Reliable Statistical Inference

2025-10-22

Authors:

Vikram Kher, Argyris Oikonomou, Manolis Zampetakis

Abstract:

The remarkable success of machine learning (ML) in predictive tasks has led scientists to incorporate ML predictions as a core component of the scientific discovery pipeline. This was exemplified by the landmark achievement of AlphaFold (Jumper et al. (2021)). In this paper, we study how ML predictions can be safely used in statistical analysis of data towards scientific discovery. In particular, we follow the framework introduced by Angelopoulos et al. (2023). In this framework, we assume acces...

ID: 2510.16937v1 stat.ML, cs.LG, stat.ME, G.3

arXiv PDF

📄 Adaptive Sample Sharing for Linear Regression

2025-10-22

Authors:

Hamza Cherkaoui, Hélène Halconruy, Yohan Petetin

Abstract:

In many business settings, task-specific labeled data are scarce or costly to obtain, which limits supervised learning on a specific task. To address this challenge, we study sample sharing in the case of ridge regression: leveraging an auxiliary data set while explicitly protecting against negative transfer. We introduce a principled, data-driven rule that decides how many samples from an auxiliary dataset to add to the target training set. The rule is based on an estimate of the transfer gain ...

ID: 2510.16986v1 stat.ML, cs.LG, stat.OT

arXiv PDF

📄 Mode Collapse of Mean-Field Variational Inference

2025-10-22

Authors:

Shunan Sheng, Bohan Wu, Alberto González-Sanz

Abstract:

Mean-field variational inference (MFVI) is a widely used method for approximating high-dimensional probability distributions by product measures. It has been empirically observed that MFVI optimizers often suffer from mode collapse. Specifically, when the target measure $\pi$ is a mixture $\pi = w P_0 + (1 - w) P_1$, the MFVI optimizer tends to place most of its mass near a single component of the mixture. This work provides the first theoretical explanation of mode collapse in MFVI. We introduc...

ID: 2510.17063v1 stat.ML, cs.LG

arXiv PDF

📄 DFNN: A Deep Fréchet Neural Network Framework for Learning Metric-Space-Valued Responses

2025-10-22

Authors:

Kyum Kim, Yaqing Chen, Paromita Dubey

Abstract:

Regression with non-Euclidean responses -- e.g., probability distributions, networks, symmetric positive-definite matrices, and compositions -- has become increasingly important in modern applications. In this paper, we propose deep Fréchet neural networks (DFNNs), an end-to-end deep learning framework for predicting non-Euclidean responses -- which are considered as random objects in a metric space -- from Euclidean predictors. Our method leverages the representation-learning power of deep ne...

ID: 2510.17072v1 stat.ML, cs.LG, stat.ME

arXiv PDF
