📊 Digest statistics

Total digests: 34022. Added today: 82.

Last updated: today

📄 An Approach to Variable Clustering: K-means in Transposed Data and its Relationship with Principal Component Analysis

2025-12-02

Authors:

Victor Saquicela, Kenneth Palacio-Baus, Mario Chifla

Annotation:

Principal Component Analysis (PCA) and K-means constitute fundamental techniques in multivariate analysis. Although they are frequently applied independently or sequentially to cluster observations, the relationship between them, especially when K-means is used to cluster variables rather than observations, has been scarcely explored. This study seeks to address this gap by proposing an innovative method that analyzes the relationship between clusters of variables obtained by applying K-means on...

ID: 2512.00979v1 stat.ML, cs.AI, cs.LG

arXiv PDF
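
To make the idea of clustering variables rather than observations concrete, here is a minimal sketch that is not taken from the paper: it standardizes each column, treats every variable as a point (the transposed data matrix), and runs a plain K-means on those points. The toy data, the helper names, and the farthest-first initialization are all assumptions chosen for illustration. One reason this setup connects to correlation-based methods such as PCA is a standard identity: for variables standardized with the population standard deviation over n observations, the squared Euclidean distance between two variables equals 2n(1 - r), where r is their correlation.

```python
def standardize_variables(rows):
    """Return one standardized vector per variable (the transposed data matrix)."""
    n = len(rows)
    variables = []
    for column in zip(*rows):
        mean = sum(column) / n
        std = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
        variables.append([(v - mean) / std for v in column])
    return variables


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=20):
    """Plain Lloyd iterations; farthest-first init is an illustrative choice."""
    centers = [list(points[0])]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(farthest))
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each variable goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(vals) / len(members) for vals in zip(*members)]
    return labels


# Hypothetical toy data: 6 observations, 4 variables.
# Variables 0 and 1 are perfectly correlated; variables 2 and 3 are nearly so.
a = [1, 2, 3, 4, 5, 6]
b = [3, -1, 4, -2, 5, 0]
b_noisy = [3.1, -1.0, 4.0, -2.1, 5.0, 0.1]
rows = [[a[i], 2 * a[i] + 1, b[i], b_noisy[i]] for i in range(6)]

variable_labels = kmeans(standardize_variables(rows), k=2)
print(variable_labels)
```

On this toy data the two correlated pairs end up in separate clusters. Negatively correlated variables would be far apart under this distance, so a real analysis might need a sign-invariant dissimilarity instead.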

📄 Discriminative classification with generative features: bridging Naive Bayes and logistic regression

2025-12-02

Authors:

Zachary Terner, Alexander Petersen, Yuedong Wang

Annotation:

We introduce Smart Bayes, a new classification framework that bridges generative and discriminative modeling by integrating likelihood-ratio-based generative features into a logistic-regression-style discriminative classifier. From the generative perspective, Smart Bayes relaxes the fixed unit weights of Naive Bayes by allowing data-driven coefficients on density-ratio features. From a discriminative perspective, it constructs transformed inputs as marginal log-density ratios that explicitly qua...

ID: 2512.01097v1 stat.ML, cs.AI, cs.LG, stat.CO, stat.ME

arXiv PDF
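
The mechanism described in this abstract (marginal log-density ratios used as inputs to a logistic-regression-style classifier) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the Gaussian marginals, the gradient-descent fit, the toy data, and every function name are assumptions. Setting all weights to 1 and the intercept to the log prior ratio would recover the usual Naive Bayes decision rule.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log-density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def fit_marginals(X, y):
    """Per-class, per-feature Gaussian (mean, variance); Gaussianity is an assumption."""
    params = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        stats = []
        for col in zip(*rows):
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-6
            stats.append((mean, var))
        params[label] = stats
    return params


def log_ratio_features(x, params):
    """One feature per input: log p(x_j | y=1) - log p(x_j | y=0)."""
    return [gaussian_logpdf(v, *params[1][j]) - gaussian_logpdf(v, *params[0][j])
            for j, v in enumerate(x)]


def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def fit_logistic(Z, y, lr=0.1, n_iter=200):
    """Batch gradient descent on the mean log-loss."""
    w, b, n = [0.0] * len(Z[0]), 0.0, len(Z)
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for z, t in zip(Z, y):
            err = sigmoid(b + sum(wi * zi for wi, zi in zip(w, z))) - t
            grad_b += err
            for j, zj in enumerate(z):
                grad_w[j] += err * zj
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(x, params, w, b):
    z = log_ratio_features(x, params)
    return 1 if b + sum(wi * zi for wi, zi in zip(w, z)) > 0 else 0


# Hypothetical toy data: feature 0 separates the classes, feature 1 is mostly noise.
X = [(0.0, 1.0), (0.5, -1.0), (1.0, 0.5), (-0.5, -0.5),
     (3.0, 0.8), (3.5, -0.9), (2.5, 0.4), (4.0, -0.6)]
y = [0, 0, 0, 0, 1, 1, 1, 1]

params = fit_marginals(X, y)
Z = [log_ratio_features(x, params) for x in X]
w, b = fit_logistic(Z, y)
train_predictions = [predict(x, params, w, b) for x in X]
print(w, b, train_predictions)
```

The learned weights play the role of the data-driven coefficients mentioned in the abstract: a feature whose density ratio carries little information can receive a weight near zero instead of the fixed unit weight of Naive Bayes.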

📄 FAST: Topology-Aware Frequency-Domain Distribution Matching for Coreset Selection

2025-11-26

Authors:

Jin Cui, Boran Zhao, Jiajun Xu, Jiaqi Guo, Shuo Guan, Pengju Ren

Annotation:

Coreset selection compresses large datasets into compact, representative subsets, reducing the energy and computational burden of training deep neural networks. Existing methods are either: (i) DNN-based, which are tied to model-specific parameters and introduce architectural bias; or (ii) DNN-free, which rely on heuristics lacking theoretical guarantees. Neither approach explicitly constrains distributional equivalence, largely because continuous distribution matching is considered inapplicable...

ID: 2511.19476v1 stat.ML, cs.AI, cs.LG

arXiv PDF

📄 Implicit Bias of the JKO Scheme

2025-11-20

Authors:

Peter Halmos, Boris Hanin

Annotation:

Wasserstein gradient flow provides a general framework for minimizing an energy functional $J$ over the space of probability measures on a Riemannian manifold $(M,g)$. Its canonical time-discretization, the Jordan-Kinderlehrer-Otto (JKO) scheme, produces for any step size $η>0$ a sequence of probability distributions $ρ_k^η$ that approximate to first order in $η$ Wasserstein gradient flow on $J$. But the JKO scheme also has many other remarkable properties not shared by other first order integra...

ID: 2511.14827v1 stat.ML, cs.AI, cs.LG, math.AP

arXiv PDF
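
For readers unfamiliar with the JKO scheme named in this abstract, its standard textbook form is written out below using the abstract's symbols $J$, $η$ and $ρ_k^η$. The Wasserstein-2 distance $W_2$ on $(M,g)$ is standard notation that the truncated abstract does not spell out, so treat its appearance here as an assumption about the paper's conventions.

```latex
\rho_{k+1}^{\eta} \in \operatorname*{arg\,min}_{\rho}
  \left\{ J(\rho) + \frac{1}{2\eta}\, W_2^2\!\left(\rho, \rho_k^{\eta}\right) \right\},
  \qquad k = 0, 1, 2, \dots
```

Each step trades off decreasing the energy $J$ against staying close, in Wasserstein distance, to the previous iterate; a smaller $η$ penalizes movement more strongly.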

📄 A general framework for adaptive nonparametric dimensionality reduction

2025-11-15

Authors:

Antonio Di Noia, Federico Ravenda, Antonietta Mira

Annotation:

Dimensionality reduction is a fundamental task in modern data science. Several projection methods specifically tailored to take into account the non-linearity of the data via local embeddings have been proposed. Such methods are often based on local neighbourhood structures and require tuning the number of neighbours that define this local structure, and the dimensionality of the lower-dimensional space onto which the data are projected. Such choices critically influence the quality of the resul...

ID: 2511.09486v1 stat.ML, cs.AI, cs.LG, stat.ME

arXiv PDF

📄 Self-adaptive weighting and sampling for physics-informed neural networks

2025-11-11

Authors:

Wenqian Chen, Amanda Howard, Panos Stinis

Annotation:

Physics-informed deep learning has emerged as a promising framework for solving partial differential equations (PDEs). Nevertheless, training these models on complex problems remains challenging, often leading to limited accuracy and efficiency. In this work, we introduce a hybrid adaptive sampling and weighting method to enhance the performance of physics-informed neural networks (PINNs). The adaptive sampling component identifies training points in regions where the solution exhibits rapid var...

ID: 2511.05452v1 stat.ML, cs.AI, cs.LG, physics.comp-ph

arXiv PDF

📄 Data-driven Projection Generation for Efficiently Solving Heterogeneous Quadratic Programming Problems

2025-11-01

Authors:

Tomoharu Iwata, Futoshi Futami

Annotation:

We propose a data-driven framework for efficiently solving quadratic programming (QP) problems by reducing the number of variables in high-dimensional QPs using instance-specific projection. A graph neural network-based model is designed to generate projections tailored to each QP instance, enabling us to produce high-quality solutions even for previously unseen problems. The model is trained on heterogeneous QPs to minimize the expected objective value evaluated on the projected solutions. This...

ID: 2510.26061v1 stat.ML, cs.AI, cs.LG, math.OC

arXiv PDF

📄 Using latent representations to link disjoint longitudinal data for mixed-effects regression

2025-10-31

Authors:

Clemens Schächter, Maren Hackenberg, Michelle Pfaffenlehner, Félix B. Tambe-Ndonfack, Thorsten Schmidt, Astrid Pechmann, Janbernd Kirschner, Jan Hasenauser, Harald Binder

Annotation:

Many rare diseases offer limited established treatment options, leading patients to switch therapies when new medications emerge. To analyze the impact of such treatment switches within the low sample size limitations of rare disease trials, it is important to use all available data sources. This, however, is complicated when usage of measurement instruments change during the observation period, for example when instruments are adapted to specific age ranges. The resulting disjoint longitudinal ...

ID: 2510.25531v1 stat.ML, cs.AI, cs.LG, 68T07, G.3; I.2.6; J.3

arXiv PDF

📄 E-Scores for (In)Correctness Assessment of Generative Model Outputs

2025-10-31

Authors:

Guneet S. Dhillon, Javier González, Teodora Pandeva, Alicia Curth

Annotation:

While generative models, especially large language models (LLMs), are ubiquitous in today's world, principled mechanisms to assess their (in)correctness are limited. Using the conformal prediction framework, previous works construct sets of LLM responses where the probability of including an incorrect response, or error, is capped at a desired user-defined tolerance level. However, since these methods are based on p-values, they are susceptible to p-hacking, i.e., choosing the tolerance level po...

ID: 2510.25770v1 stat.ML, cs.AI, cs.LG

arXiv PDF

📄 Frequentist Validity of Epistemic Uncertainty Estimators

2025-10-29

Authors:

Anchit Jain, Stephen Bates

Annotation:

Decomposing prediction uncertainty into its aleatoric (irreducible) and epistemic (reducible) components is critical for the development and deployment of machine learning systems. A popular, principled measure for epistemic uncertainty is the mutual information between the response variable and model parameters. However, evaluating this measure requires access to the posterior distribution of the model parameters, which is challenging to compute. In view of this, we introduce a frequentist meas...

ID: 2510.22063v1 stat.ML, cs.AI, cs.LG, math.ST, stat.TH

arXiv PDF
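
The Bayesian measure this abstract refers to, the mutual information between the response and the model parameters, is commonly written as the decomposition below. The symbols ($x$ for the input, $y$ for the response, $θ$ for the parameters, $\mathcal{D}$ for the training data, $\mathbb{H}$ for Shannon entropy) are assumed notation, and this is the standard posterior-based quantity, not the frequentist measure the paper goes on to introduce.

```latex
\underbrace{I(y;\theta \mid x, \mathcal{D})}_{\text{epistemic}}
  = \underbrace{\mathbb{H}\!\left[\, \mathbb{E}_{\theta \sim p(\theta \mid \mathcal{D})}\, p(y \mid x, \theta) \right]}_{\text{total}}
  - \underbrace{\mathbb{E}_{\theta \sim p(\theta \mid \mathcal{D})}\, \mathbb{H}\!\left[\, p(y \mid x, \theta) \right]}_{\text{aleatoric}}
```

Both terms on the right require expectations over the posterior $p(θ \mid \mathcal{D})$, which is the computational difficulty the abstract points to.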

Showing 1-10 of 35 entries