📊 Digest statistics

Total digests: 34022. Added today: 82

Last updated: today

📄 Statistical Inference under Adaptive Sampling with LinUCB

2025-12-02

Authors:

Wei Fan, Kevin Tan, Yuting Wei

Annotation:

Adaptively collected data has become ubiquitous within modern practice. However, even seemingly benign adaptive sampling schemes can introduce severe biases, rendering traditional statistical inference tools inapplicable. This can be mitigated by a property called stability, which states that if the rate at which an algorithm takes actions converges to a deterministic limit, one can expect that certain parameters are asymptotically normal. Building on a recent line of work for the multi-armed ba...

ID: 2512.00222v1 math.ST, cs.LG, stat.ME, stat.ML

📄 A Trainable Centrality Framework for Modern Data

2025-12-02

Authors:

Minh Duc Vu, Mingshuo Liu, Doudou Zhou

Annotation:

Measuring how central or typical a data point is underpins robust estimation, ranking, and outlier detection, but classical depth notions become expensive and unstable in high dimensions and are hard to extend beyond Euclidean data. We introduce Fused Unified centrality Score Estimation (FUSE), a neural centrality framework that operates on top of arbitrary representations. FUSE combines a global head, trained from pairwise distance-based comparisons to learn an anchor-free centrality score, wit...

ID: 2511.22959v1 cs.LG, stat.ME, stat.ML

📄 Hierarchical Linkage Clustering Beyond Binary Trees and Ultrametrics

2025-11-26

Authors:

Maximilien Dreveton, Matthias Grossglauser, Daichi Kuroda, Patrick Thiran

Annotation:

Hierarchical clustering seeks to uncover nested structures in data by constructing a tree of clusters, where deeper levels reveal finer-grained relationships. Traditional methods, including linkage approaches, face three major limitations: (i) they always return a hierarchy, even if none exists, (ii) they are restricted to binary trees, even if the true hierarchy is non-binary, and (iii) they are highly sensitive to the choice of linkage function. In this paper, we address these issues by introd...

ID: 2511.18056v1 cs.LG, stat.ME, stat.ML
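
Limitations (i) and (ii) from the abstract can be seen in a minimal sketch of classical single-linkage agglomeration. This is a naive textbook procedure, not the method the paper introduces; the function name `single_linkage` and the toy data are assumptions made for illustration. On four equally spaced points, which have no natural nested structure, the procedure still returns a full binary tree, with every merge at the same height.

```python
from itertools import combinations


def single_linkage(points):
    """Naive single-linkage agglomeration on 1-D points.

    Returns a list of merges (cluster_a, cluster_b, distance), where each
    cluster is a tuple of point indices. Every step merges exactly two
    clusters, so the result is always a binary tree with n - 1 merges.
    """
    def gap(pair):
        # Single linkage: cluster distance is the closest pair of members.
        a, b = pair
        return min(abs(points[i] - points[j]) for i in a for j in b)

    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=gap)
        merges.append((a, b, gap((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Illustrative data: equally spaced points with no meaningful hierarchy.
evenly_spaced = [0.0, 1.0, 2.0, 3.0]
merges = single_linkage(evenly_spaced)
for a, b, d in merges:
    print(a, b, d)
```

All three merges happen at distance 1.0, so a single four-way split would describe the data more faithfully than the forced sequence of binary merges.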

📄 RFX: High-Performance Random Forests with GPU Acceleration and QLORA Compression

2025-11-26

Authors:

Chris Kuchar

Annotation:

RFX (Random Forests X), where X stands for compression or quantization, presents a production-ready implementation of Breiman and Cutler's Random Forest classification methodology in Python. RFX v1.0 provides complete classification: out-of-bag error estimation, overall and local importance measures, proximity matrices with QLORA compression, case-wise analysis, and interactive visualization (rfviz)--all with CPU and GPU acceleration. Regression, unsupervised learning, CLIQUE importance, and RF-...

ID: 2511.19493v1 cs.LG, stat.ME, stat.ML
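
The out-of-bag error estimation mentioned in the abstract rests on a simple piece of bookkeeping: each tree is trained on a bootstrap sample, and the points that sample misses serve as held-out data for that tree. The sketch below shows only this in-bag/out-of-bag split in plain Python. It does not use or reproduce the RFX API; `bootstrap_split` and the sample size are hypothetical names and values chosen for illustration.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample of size n_samples with replacement.

    Returns (in_bag, out_of_bag): the drawn indices (with repeats) and the
    indices that were never drawn.
    """
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000
in_bag, oob = bootstrap_split(n, rng)

# In expectation a fraction (1 - 1/n)**n of the points, close to exp(-1),
# is left out of each bootstrap sample.
oob_fraction = len(oob) / n
print(f"out-of-bag fraction: {oob_fraction:.3f}")
```

In a full random forest, each point's out-of-bag prediction would be aggregated only over the trees whose bootstrap sample missed that point.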

📄 Statistical Properties of Rectified Flow

2025-11-07

Authors:

Gonzalo Mena, Arun Kumar Kuchibhotla, Larry Wasserman

Annotation:

Rectified flow (Liu et al., 2022; Liu, 2022; Wu et al., 2023) is a method for defining a transport map between two distributions, and enjoys popularity in machine learning, although theoretical results supporting the validity of these methods are scant. The rectified flow can be regarded as an approximation to optimal transport, but in contrast to other transport methods that require optimization over a function space, computing the rectified flow only requires standard statistical tools such as...

ID: 2511.03193v2 math.ST, cs.LG, stat.ME, stat.ML, stat.TH

📄 Bayesian model selection and misspecification testing in imaging inverse problems only from noisy and partial measurements

2025-11-04

Authors:

Tom Sprunck, Marcelo Pereyra, Tobias Liaudat

Annotation:

Modern imaging techniques heavily rely on Bayesian statistical models to address difficult image reconstruction and restoration tasks. This paper addresses the objective evaluation of such models in settings where ground truth is unavailable, with a focus on model selection and misspecification diagnosis. Existing unsupervised model evaluation methods are often unsuitable for computational imaging due to their high computational cost and incompatibility with modern image priors defined implicitl...

ID: 2510.27663v1 eess.IV, cs.LG, stat.ME, stat.ML

📄 Topic Analysis with Side Information: A Neural-Augmented LDA Approach

2025-11-01

Authors:

Biyi Fang, Kripa Rajshekhar, Truong Vo, Diego Klabjan

Annotation:

Traditional topic models such as Latent Dirichlet Allocation (LDA) have been widely used to uncover latent structures in text corpora, but they often struggle to integrate auxiliary information such as metadata, user attributes, or document labels. These limitations restrict their expressiveness, personalization, and interpretability. To address this, we propose nnLDA, a neural-augmented probabilistic topic model that dynamically incorporates side information through a neural prior mechanism. nn...

ID: 2510.24918v1 cs.LG, stat.ME, stat.ML

📄 Embedding Trust: Semantic Isotropy Predicts Nonfactuality in Long-Form Text Generation

2025-10-29

Authors:

Dhrupad Bhardwaj, Julia Kempe, Tim G. J. Rudner

Annotation:

To deploy large language models (LLMs) in high-stakes application domains that require substantively accurate responses to open-ended prompts, we need reliable, computationally inexpensive methods that assess the trustworthiness of long-form responses generated by LLMs. However, existing approaches often rely on claim-by-claim fact-checking, which is computationally expensive and brittle in long-form responses to open-ended prompts. In this work, we introduce semantic isotropy -- the degree of u...

ID: 2510.21891v1 cs.CL, cs.AI, cs.LG, stat.ME, stat.ML

📄 SHAP-Based Supervised Clustering for Sample Classification and the Generalized Waterfall Plot

2025-10-14

Authors:

Justin Lin, Julia Fukuyama

Annotation:

In this growing age of data and technology, large black-box models are becoming the norm due to their ability to handle vast amounts of data and learn incredibly complex input-output relationships. The deficiency of these methods, however, is their inability to explain the prediction process, making them untrustworthy and their use precarious in high-stakes situations. SHapley Additive exPlanations (SHAP) analysis is an explainable AI method growing in popularity for its ability to explain model...

ID: 2510.08737v1 cs.LG, stat.ME, stat.ML
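
SHAP attributes a prediction to individual features using Shapley values from cooperative game theory. As a rough illustration of what those values are, the sketch below computes them exactly by enumerating feature subsets for a tiny model. Everything here is an assumption made for the example: the toy `model`, the input `x`, the all-zero `baseline`, and the choice to represent a missing feature by its baseline value. It is not the paper's clustering method, and practical SHAP implementations rely on approximations rather than full enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a set function `value` on features 0..n-1."""
    features = range(n_features)
    phi = [0.0] * n_features
    for i in features:
        others = [j for j in features if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy setup: one additive term and one interaction term.
x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]


def model(z):
    return 2.0 * z[0] + z[1] * z[2]


def value(subset):
    # Features outside the coalition are replaced by their baseline value.
    z = [x[j] if j in subset else baseline[j] for j in range(len(x))]
    return model(z)


phi = shapley_values(value, len(x))
print(phi)
```

The attributions sum to `model(x) - model(baseline)`, which is 8 here: feature 0 receives its additive contribution of 2, and features 1 and 2 split the interaction term of 6 equally.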

📄 Robust Spatiotemporally Contiguous Anomaly Detection Using Tensor Decomposition

2025-10-04

Authors:

Rachita Mondal, Mert Indibi, Tapabrata Maiti, Selin Aviyente

Annotation:

Anomaly detection in spatiotemporal data is a challenging problem encountered in a variety of applications, including video surveillance, medical imaging data, and urban traffic monitoring. Existing anomaly detection methods focus mainly on point anomalies and cannot deal with temporal and spatial dependencies that arise in spatio-temporal data. Tensor-based anomaly detection methods have been proposed to address this problem. Although existing methods can capture dependencies across different m...

ID: 2510.00460v1 cs.LG, stat.ME, stat.ML

Showing 1 - 10 of 13 entries