📊 Digest statistics
Total digests: 34123. Added today: 0
Last updated: today
Authors:
Elizabeth Collins-Woodfin, Inbar Seroussi
Annotation:
We develop a framework for analyzing the training and learning rate dynamics
on a variety of high-dimensional optimization problems trained using one-pass
stochastic gradient descent (SGD) with data generated from multiple anisotropic
classes. We give exact expressions for a large class of functions of the
limiting dynamics, including the risk and the overlap with the true signal, in
terms of a deterministic solution to a system of ODEs. We extend the existing
theory of high-dimensional SGD dyn...
Authors:
Julio Enrique Castrillon-Candas, Hanfeng Gu, Caleb Meredith, Yulin Li, Xiaojing Tang, Pontus Olofsson, Mark Kon
Annotation:
In this paper we develop a deforestation detection pipeline that incorporates
optical and Synthetic Aperture Radar (SAR) data. A crucial component of the
pipeline is the construction of anomaly maps of the optical data, which is done
using the residual space of a discrete Karhunen-Loève (KL) expansion.
Anomalies are quantified using a concentration bound on the distribution of the
residual components for the nominal state of the forest. This bound does not
require prior knowledge on the dist...
📄 High-Dimensional BWDM: A Robust Nonparametric Clustering Validation Index for Large-Scale Data
2025-10-19
Authors:
Mohammed Baragilly, Hend Gabr
Annotation:
Determining the appropriate number of clusters in unsupervised learning is a
central problem in statistics and data science. Traditional validity indices
such as Calinski-Harabasz, Silhouette, and Davies-Bouldin depend on
centroid-based distances and therefore degrade in high-dimensional or
contaminated data. This paper proposes a new robust, nonparametric clustering
validation framework, the High-Dimensional Between-Within Distance Median
(HD-BWDM), which extends the recently introduced BWDM cr...
Authors:
Benjamín Castro, Camilo Ramírez, Sebastián Espinosa, Jorge F. Silva, Marcos E. Orchard, Heraldo Rozas
Annotation:
In Machine Learning (ML), a regression algorithm aims to minimize a loss
function based on data. An assessment method in this context seeks to quantify
the discrepancy between the optimal response for an input-output system and the
estimate produced by a learned predictive model (the student). Evaluating the
quality of a learned regressor remains challenging without access to the true
data-generating mechanism, as no data-driven assessment method can ensure the
achievability of global optimality...
Authors:
Runlin Zhou, Letian Li, Zemin Zheng
Annotation:
We study personalized federated learning for multivariate responses where
client models are heterogeneous yet share variable-level structure. Existing
entry-wise penalties ignore cross-response dependence, while matrix-wise fusion
over-couples clients. We propose a Sparse Row-wise Fusion (SROF) regularizer
that clusters row vectors across clients and induces within-row sparsity, and
we develop RowFed, a communication-efficient federated algorithm that embeds
SROF into a linearized ADMM framework...
Authors:
Zhikun Zhang, Guanyu Pan, Xiangjun Wang, Yong Xu, Guangtao Zhang
Annotation:
Inverse problems involving partial differential equations (PDEs) with
discontinuous coefficients are fundamental challenges in modeling complex
spatiotemporal systems with heterogeneous structures and uncertain dynamics.
Traditional numerical and machine learning approaches often face limitations in
addressing these problems due to high dimensionality, inherent nonlinearity,
and discontinuous parameter spaces. In this work, we propose a novel
computational framework that synergistically integrat...
Authors:
Pierre Glaser, David Widmann, Fredrik Lindsten, Arthur Gretton
Annotation:
We introduce the Kernel Calibration Conditional Stein Discrepancy test (KCCSD
test), a non-parametric, kernel-based test for assessing the calibration of
probabilistic models with well-defined scores. In contrast to previous methods,
our test avoids the need for possibly expensive expectation approximations
while providing control over its type-I error. We achieve these improvements by
using a new family of kernels for score-based probabilities that can be
estimated without probability density s...
Authors:
Gavin Kerrigan, Christian A. Naesseth, Tom Rainforth
Annotation:
We introduce a novel geometric framework for optimal experimental design
(OED). Traditional OED approaches, such as those based on mutual information,
rely explicitly on probability densities, leading to restrictive invariance
properties. To address these limitations, we propose the mutual transport
dependence (MTD), a measure of statistical dependence grounded in optimal
transport theory which provides a geometric objective for optimizing designs.
Unlike conventional approaches, the MTD can be ...
Authors:
Pierre Glaser, Kevin Han Huang, Arthur Gretton
Annotation:
We perform a non-asymptotic analysis of the contrastive divergence (CD)
algorithm, a training method for unnormalized models. While prior work has
established that (for exponential family distributions) the CD iterates
asymptotically converge at an $O(n^{-1 / 3})$ rate to the true parameter of the
data distribution, we show, under some regularity assumptions, that CD can
achieve the parametric rate $O(n^{-1 / 2})$. Our analysis provides results for
various data batching schemes, including the fu...
Authors:
Santiago Mazuelas, Veronica Alvarez
Annotation:
Boosting methods often achieve excellent classification accuracy, but can
experience notable performance degradation in the presence of label noise.
Existing robust methods for boosting provide theoretical robustness guarantees
for certain types of label noise, and can exhibit only moderate performance
degradation. However, previous theoretical results do not account for realistic
types of noise and finite training sizes, and existing robust methods can
provide unsatisfactory accuracies, even wi...
Showing 241-250 of 565 records