📊 Digest statistics

Total digests: 34022. Added today: 82

Last updated: today

📄 Provable Benefit of Sign Descent: A Minimal Model Under Heavy-Tailed Class Imbalance

2025-12-02

Authors:

Robin Yadav, Shuo Xie, Tianhao Wang, Zhiyuan Li

Abstract:

Adaptive optimization methods (such as Adam) play a major role in LLM pretraining, significantly outperforming Gradient Descent (GD). Recent studies have proposed new smoothness assumptions on the loss function to explain the advantages of adaptive algorithms with structured preconditioners, e.g., coordinate-wise or layer-wise, and steepest descent methods w.r.t. non-Euclidean norms, e.g., $\ell_\infty$ norm or spectral norm, over GD. However, it remains unclear how these smoothness assumptions ...

ID: 2512.00763v1 cs.LG, cs.AI
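
As a rough illustration of the contrast drawn in the abstract above, the sketch below puts a plain GD update next to a sign descent update, which is steepest descent with respect to the $\ell_\infty$ norm. The function names, the step size, and the toy gradient are illustrative assumptions and are not taken from the paper.

```python
def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def gd_step(w, grad, lr):
    """Gradient descent: each coordinate moves proportionally to its gradient."""
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def sign_descent_step(w, grad, lr):
    """Sign descent: each coordinate moves by exactly lr, using only the gradient sign."""
    return [wi - lr * sign(gi) for wi, gi in zip(w, grad)]


# Toy gradient with very different magnitudes per coordinate (hypothetical values).
w = [1.0, 1.0, 1.0]
grad = [100.0, -0.01, 0.0]
lr = 0.1

w_gd = gd_step(w, grad, lr)
w_sign = sign_descent_step(w, grad, lr)
```

With this toy gradient, GD moves the first coordinate by 10 and the second by only 0.001, while sign descent moves both by 0.1 and leaves the zero-gradient coordinate untouched.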


📄 Preventing Model Collapse via Contraction-Conditioned Neural Filters

2025-12-02

Authors:

Zongjian Han, Yiran Liang, Ruiwen Wang, Yiwei Luo, Yilin Huang, Xiaotong Song, Dongqing Wei

Abstract:

This paper presents a neural network filter method based on contraction operators to address model collapse in recursive training of generative models. Unlike \cite{xu2024probabilistic}, which requires superlinear sample growth ($O(t^{1+s})$), our approach completely eliminates the dependence on increasing sample sizes within an unbiased estimation framework by designing a neural filter that learns to satisfy contraction conditions. We develop specialized neural network architectures and loss fu...

ID: 2512.00757v1 cs.LG, cs.AI


📄 Limitations of Using Identical Distributions for Training and Testing When Learning Boolean Functions

2025-12-02

Authors:

Jordi Pérez-Guijarro

Abstract:

When the distributions of the training and test data do not coincide, the problem of understanding generalization becomes considerably more complex, prompting a variety of questions. In this work, we focus on a fundamental one: Is it always optimal for the training distribution to be identical to the test distribution? Surprisingly, assuming the existence of one-way functions, we find that the answer is no. That is, matching distributions is not always the best scenario, which contrasts with the...

ID: 2512.00791v1 cs.LG, cs.AI


📄 Causal Invariance and Counterfactual Learning Driven Cooperative Game for Multi-Label Classification

2025-12-02

Authors:

Yijia Fan, Jusheng Zhang, Kaitong Cai, Jing Yang, Keze Wang

Abstract:

Multi-label classification (MLC) remains vulnerable to label imbalance, spurious correlations, and distribution shifts, challenges that are particularly detrimental to rare label prediction. To address these limitations, we introduce the Causal Cooperative Game (CCG) framework, which conceptualizes MLC as a cooperative multi-player interaction. CCG unifies explicit causal discovery via Neural Structural Equation Models with a counterfactual curiosity reward to drive robust feature learning. Furt...

ID: 2512.00812v1 cs.LG, cs.AI


📄 Topological Federated Clustering via Gravitational Potential Fields under Local Differential Privacy

2025-12-02

Authors:

Yunbo Long, Jiaquan Zhang, Xi Chen, Alexandra Brintrup

Abstract:

Clustering non-independent and identically distributed (non-IID) data under local differential privacy (LDP) in federated settings presents a critical challenge: preserving privacy while maintaining accuracy without iterative communication. Existing one-shot methods rely on unstable pairwise centroid distances or neighborhood rankings, degrading severely under strong LDP noise and data heterogeneity. We present Gravitational Federated Clustering (GFC), a novel approach to privacy-preserving fede...

ID: 2512.00849v1 cs.LG, cs.AI


📄 HBLLM: Wavelet-Enhanced High-Fidelity 1-Bit Quantization for LLMs

2025-12-02

Authors:

Ningning Chen, Weicai Ye, Ying Jiang

Abstract:

We introduce HBLLM, a wavelet-enhanced high-fidelity $1$-bit post-training quantization method for Large Language Models (LLMs). By leveraging Haar wavelet transforms to enhance expressive capacity through frequency decomposition, HBLLM significantly improves quantization fidelity while maintaining minimal overhead. This approach features two innovative structure-aware grouping strategies: (1) frequency-aware multi-parameter intra-row grouping and (2) $\ell_2$-norm-based saliency-driven column s...

ID: 2512.00862v1 cs.LG, cs.AI
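
To make the two ingredients named in the abstract above more concrete, here is a minimal sketch that applies a one-level Haar transform to a weight row, binarizes each frequency band to signs with one shared scale, and transforms back. It is only a generic illustration of "Haar transform plus 1-bit quantization": the function names, the mean-absolute-value scale, and the single decomposition level are assumptions, and the paper's grouping and column-selection strategies are not reproduced here.

```python
import math


def haar_forward(row):
    """One-level orthonormal Haar transform of an even-length list."""
    s = math.sqrt(2.0)
    approx = [(row[i] + row[i + 1]) / s for i in range(0, len(row), 2)]
    detail = [(row[i] - row[i + 1]) / s for i in range(0, len(row), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed pairs."""
    s = math.sqrt(2.0)
    row = []
    for a, d in zip(approx, detail):
        row.extend([(a + d) / s, (a - d) / s])
    return row


def binarize(values):
    """1-bit quantization: keep only signs, with one shared scale (mean |v|)."""
    scale = sum(abs(v) for v in values) / len(values)
    return [scale if v >= 0 else -scale for v in values]


def quantize_row(row):
    """Binarize the low- and high-frequency bands separately, then reconstruct."""
    approx, detail = haar_forward(row)
    return haar_inverse(binarize(approx), binarize(detail))


reconstructed = quantize_row([4.0, 2.0, -1.0, -3.0])
```

Because each band gets its own scale, the reconstruction can take more than two distinct values even though every stored coefficient is a single sign bit plus a shared scale per band.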


📄 Light-Weight Benchmarks Reveal the Hidden Hardware Cost of Zero-Shot Tabular Foundation Models

2025-12-02

Authors:

Aayam Bansal, Ishaan Gangwani

Abstract:

Zero-shot foundation models (FMs) promise training-free prediction on tabular data, yet their hardware footprint remains poorly characterized. We present a fully reproducible benchmark that reports test accuracy together with wall-clock latency, peak CPU RAM, and peak GPU VRAM on four public datasets: Adult-Income, Higgs-100k, Wine-Quality, and California-Housing. Two open FMs (TabPFN-1.0 and TabICL-base) are compared against tuned XGBoost, LightGBM, and Random Forest baselines on a single NVIDI...

ID: 2512.00888v1 cs.LG, cs.AI
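
The kind of measurement described in the abstract above can be approximated with the standard library alone. The sketch below times a callable with `time.perf_counter` and records peak Python-level allocations with `tracemalloc`. The helper name `measure` and the toy workload are assumptions; `tracemalloc` does not capture whole-process RAM or GPU VRAM, so this is not the benchmark's own harness.

```python
import time
import tracemalloc


def measure(fn, *args, **kwargs):
    """Run fn once and return (result, wall_clock_seconds, peak_traced_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current_bytes, peak_bytes).
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, elapsed, peak


def toy_workload(n):
    """Stand-in for a model's predict call (hypothetical)."""
    return [i * i for i in range(n)]


result, seconds, peak_bytes = measure(toy_workload, 10_000)
```

Reporting accuracy next to `seconds` and `peak_bytes` for each model gives the side-by-side view the abstract describes, within the limits of what `tracemalloc` can see.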


📄 Beyond High-Entropy Exploration: Correctness-Aware Low-Entropy Segment-Based Advantage Shaping for Reasoning LLMs

2025-12-02

Authors:

Xinzhu Chen, Xuesheng Li, Zhongxiang Sun, Weijie Yu

Abstract:

Reinforcement Learning with Verifiable Rewards (RLVR) has become a central approach for improving the reasoning ability of large language models. Recent work studies RLVR through token entropy, arguing that high-entropy tokens drive exploration and should receive stronger updates. However, they overlook the fact that most of a reasoning trajectory consists of low-entropy segments that encode stable and reusable structural patterns. Through qualitative and quantitative analyses, we find that the ...

ID: 2512.00908v1 cs.LG, cs.AI
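
For readers unfamiliar with token entropy, the sketch below computes the Shannon entropy of a next-token distribution from raw logits and groups consecutive low-entropy positions into segments. The threshold value, the function names, and the segment definition are illustrative assumptions; the paper's actual segmentation and advantage shaping are not shown in the truncated abstract and are not reproduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def token_entropy(logits):
    """Shannon entropy (in nats) of the next-token distribution."""
    return -sum(p * math.log(p) for p in softmax(logits) if p > 0.0)


def low_entropy_segments(entropies, threshold):
    """Return half-open (start, end) index ranges of consecutive entropies below threshold."""
    segments, start = [], None
    for i, h in enumerate(entropies):
        if h < threshold:
            if start is None:
                start = i
        elif start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(entropies)))
    return segments


# Hypothetical per-token entropies along one reasoning trajectory.
entropies = [0.1, 0.2, 1.5, 0.05, 0.05, 0.05, 2.0]
segments = low_entropy_segments(entropies, threshold=0.5)
```

A uniform distribution over the vocabulary gives the maximum entropy, while a sharply peaked one gives an entropy near zero, which is what makes a simple threshold usable for this kind of split.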


📄 Multi-Modal AI for Remote Patient Monitoring in Cancer Care

2025-12-02

Authors:

Yansong Liu, Ronnie Stafford, Pramit Khetrapal, Huriye Kocadag, Graça Carvalho, Patricia de Winter, Maryam Imran, Amelia Snook, Adamos Hadjivasiliou, D. Vijay Anand, Weining Lin, John Kelly, Yukun Zhou, Ivana Drobnjak

Abstract:

For patients undergoing systemic cancer therapy, the time between clinic visits is full of uncertainties and risks of unmonitored side effects. To bridge this gap in care, we developed and prospectively trialed a multi-modal AI framework for remote patient monitoring (RPM). This system integrates multi-modal data from the HALO-X platform, such as demographics, wearable sensors, daily surveys, and clinical events. Our observational trial is one of the largest of its kind and has collected over 2....

ID: 2512.00949v1 cs.LG, cs.AI


📄 Operator-Theoretic Framework for Gradient-Free Federated Learning

2025-12-02

Authors:

Mohit Kumar, Mathias Brucker, Alexander Valentinitsch, Adnan Husakovic, Ali Abbas, Manuela Geiß, Bernhard A. Moser

Abstract:

Federated learning must address heterogeneity, strict communication and computation limits, and privacy while ensuring performance. We propose an operator-theoretic framework that maps the $L^2$-optimal solution into a reproducing kernel Hilbert space (RKHS) via a forward operator, approximates it using available data, and maps back with the inverse operator, yielding a gradient-free scheme. Finite-sample bounds are derived using concentration inequalities over operator norms, and the framework ...

ID: 2512.01025v1 cs.LG, cs.AI

