📊 Digest statistics

Total digests: 34022

Added today: 82

Last updated: today

📄 Common Structure Discovery in Collections of Bipartite Networks: Application to Pollination Systems

2025-12-04

Authors:

Louis Lacoste, Pierre Barbillon, Sophie Donnet

Annotation:

Bipartite networks are widely used to encode the ecological interactions. Being able to compare the organization of bipartite networks is a first step toward a better understanding of how environmental factors shape community structure and resilience. Yet current methods for structure detection in bipartite networks overlook shared patterns across collections of networks. We introduce the colBiSBM, a family of probabilistic models for collections of bipartite networks that extends the cla...

ID: 2512.01716v1 stat.ML, cs.LG, stat.AP

📄 Clustering Approaches for Mixed-Type Data: A Comparative Study

2025-11-27

Authors:

Badih Ghattas, Alvaro Sanchez San-Benito

Annotation:

Clustering is widely used in unsupervised learning to find homogeneous groups of observations within a dataset. However, clustering mixed-type data remains a challenge, as few existing approaches are suited for this task. This study presents the state-of-the-art of these approaches and compares them using various simulation models. The compared methods include the distance-based approaches k-prototypes, PDQ, and convex k-means, and the probabilistic methods KAy-means for MIxed LArge data (KAMILA...

ID: 2511.19755v1 stat.ML, cs.LG, stat.AP, stat.ME

📄 Structured Matching via Cost-Regularized Unbalanced Optimal Transport

2025-11-26

Authors:

Emanuele Pardini, Katerina Papagiannouli

Annotation:

Unbalanced optimal transport (UOT) provides a flexible way to match or compare nonnegative finite Radon measures. However, UOT requires a predefined ground transport cost, which may misrepresent the data's underlying geometry. Choosing such a cost is particularly challenging when datasets live in heterogeneous spaces, often motivating practitioners to adopt Gromov-Wasserstein formulations. To address this challenge, we introduce cost-regularized unbalanced optimal transport (CR-UOT), a framework...

ID: 2511.19075v1 stat.ML, cs.LG, stat.AP

📄 Gini Score under Ties and Case Weights

2025-11-21

Authors:

Alexej Brauer, Mario V. Wüthrich

Annotation:

The Gini score is a popular tool in statistical modeling and machine learning for model validation and model selection. It is a purely rank based score that allows one to assess risk rankings. The Gini score for statistical modeling has mainly been used in a binary context, in which it has many equivalent reformulations such as the receiver operating characteristic (ROC) or the area under the curve (AUC). In the actuarial literature, this rank based score for binary responses has been extended t...

ID: 2511.15446v1 stat.ML, cs.LG, stat.AP

📄 Uncertainty-Calibrated Prediction of Randomly-Timed Biomarker Trajectories with Conformal Bands

2025-11-19

Authors:

Vasiliki Tassopoulou, Charis Stamouli, Haochang Shou, George J. Pappas, Christos Davatzikos

Annotation:

Despite recent progress in predicting biomarker trajectories from real clinical data, uncertainty in the predictions poses high-stakes risks (e.g., misdiagnosis) that limit their clinical deployment. To enable safe and reliable use of such predictions in healthcare, we introduce a conformal method for uncertainty-calibrated prediction of biomarker trajectories resulting from randomly-timed clinical visits of patients. Our approach extends conformal prediction to the setting of randomly-timed tra...

ID: 2511.13911v1 stat.ML, cs.LG, stat.AP

📄 Masked Mineral Modeling: Continent-Scale Mineral Prospecting via Geospatial Infilling

2025-11-15

Authors:

Sujay Nair, Evan Coleman, Sherrie Wang, Elsa Olivetti

Annotation:

Minerals play a critical role in the advanced energy technologies necessary for decarbonization, but characterizing mineral deposits hidden underground remains costly and challenging. Inspired by recent progress in generative modeling, we develop a learning method which infers the locations of minerals by masking and infilling geospatial maps of resource availability. We demonstrate this technique using mineral data for the conterminous United States, and train performant models, with the best a...

ID: 2511.09722v1 stat.ML, cs.LG, stat.AP

📄 Signature Kernel Scoring Rule as Spatio-Temporal Diagnostic for Probabilistic Forecasting

2025-10-25

Authors:

Archer Dodson, Ritabrata Dutta

Annotation:

Modern weather forecasting has increasingly transitioned from numerical weather prediction (NWP) to data-driven machine learning forecasting techniques. While these new models produce probabilistic forecasts to quantify uncertainty, their training and evaluation may remain hindered by conventional scoring rules, primarily MSE, which ignore the highly correlated data structures present in weather and atmospheric systems. This work introduces the signature kernel scoring rule, grounded in rough pa...

ID: 2510.19110v1 stat.ML, cs.LG, stat.AP

📄 Interval Prediction of Annual Average Daily Traffic on Local Roads via Quantile Random Forest with High-Dimensional Spatial Data

2025-10-23

Authors:

Ying Yao, Daniel J. Graham

Annotation:

Accurate annual average daily traffic (AADT) data are vital for transport planning and infrastructure management. However, automatic traffic detectors across national road networks often provide incomplete coverage, leading to underrepresentation of minor roads. While recent machine learning advances have improved AADT estimation at unmeasured locations, most models produce only point predictions and overlook estimation uncertainty. This study addresses that gap by introducing an interval predic...

ID: 2510.18548v1 stat.ML, cs.LG, stat.AP

📄 Beyond PCA: Manifold Dimension Estimation via Local Graph Structure

2025-10-21

Authors:

Zelong Bi, Pierre Lafaye de Micheaux

Annotation:

Local principal component analysis (Local PCA) has proven to be an effective tool for estimating the intrinsic dimension of a manifold. More recently, curvature-adjusted PCA (CA-PCA) has improved upon this approach by explicitly accounting for the curvature of the underlying manifold, rather than assuming local flatness. Building on these insights, we propose a general framework for manifold dimension estimation that captures the manifold's local graph structure by integrating PCA with regressio...

ID: 2510.15141v1 stat.ML, cs.LG, stat.AP

📄 A Honest Cross-Validation Estimator for Prediction Performance

2025-10-11

Authors:

Tianyu Pan, Vincent Z. Yu, Viswanath Devanarayan, Lu Tian

Annotation:

Cross-validation is a standard tool for obtaining a honest assessment of the performance of a prediction model. The commonly used version repeatedly splits data, trains the prediction model on the training set, evaluates the model performance on the test set, and averages the model performance across different data splits. A well-known criticism is that such cross-validation procedure does not directly estimate the performance of the particular model recommended for future use. In this paper, we...

ID: 2510.07649v1 stat.ML, cs.LG, stat.AP, stat.ME

Showing 1-10 of 17 entries