📊 Digest statistics

Total digests: 34123 Added today: 101

Last updated: today

📄 Category learning in deep neural networks: Information content and geometry of internal representations

2025-10-25

Authors:

Laurent Bonnasse-Gahot, Jean-Pierre Nadal

Abstract:

In animals, category learning enhances discrimination between stimuli close to the category boundary. This phenomenon, called categorical perception, was also empirically observed in artificial neural networks trained on classification tasks. In previous modeling works based on neuroscience data, we show that this expansion/compression is a necessary outcome of efficient learning. Here we extend our theoretical framework to artificial networks. We show that minimizing the Bayes cost (mean of the...

ID: 2510.19021v1 cs.LG, cs.IT, math.IT, q-bio.NC

arXiv PDF

📄 Abstain Mask Retain Core: Time Series Prediction by Adaptive Masking Loss with Representation Consistency

2025-10-25

Authors:

Renzhao Liang, Sizhe Xu, Chenggang Xie, Jingru Chen, Feiyang Ren, Shu Yang, Takahiro Yabe

Abstract:

Time series forecasting plays a pivotal role in critical domains such as energy management and financial markets. Although deep learning-based approaches (e.g., MLP, RNN, Transformer) have achieved remarkable progress, the prevailing "long-sequence information gain hypothesis" exhibits inherent limitations. Through systematic experimentation, this study reveals a counterintuitive phenomenon: appropriately truncating historical data can paradoxically enhance prediction accuracy, indicating that e...

ID: 2510.19980v1 cs.LG, cs.IT, math.IT

arXiv PDF

📄 Connecting Jensen-Shannon and Kullback-Leibler Divergences: A New Bound for Representation Learning

2025-10-25

Authors:

Reuben Dorent, Polina Golland, William Wells III

Abstract:

Mutual Information (MI) is a fundamental measure of statistical dependence widely used in representation learning. While direct optimization of MI via its definition as a Kullback-Leibler divergence (KLD) is often intractable, many recent methods have instead maximized alternative dependence measures, most notably, the Jensen-Shannon divergence (JSD) between joint and product of marginal distributions via discriminative losses. However, the connection between these surrogate objectives and MI re...

ID: 2510.20644v1 cs.LG, cs.IT, math.IT

arXiv PDF
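
As a small illustration of the two quantities named in the abstract above, the sketch below computes mutual information as the KL divergence between a joint distribution and the product of its marginals, and the Jensen-Shannon divergence between the same pair. It is a minimal sketch for discrete distributions only: the function names and the toy 2x2 joint table are assumptions for illustration, and it does not reproduce the bound derived in the paper.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as equal-length lists.

    Assumes q[i] > 0 wherever p[i] > 0 (true for the pairs used below).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their mixture m."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def joint_and_product(joint):
    """Flatten a joint table p(x, y) and the product of its marginals p(x)p(y)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    flat_joint = [v for row in joint for v in row]
    flat_product = [a * b for a in px for b in py]
    return flat_joint, flat_product

# Hypothetical toy joint distribution with positively correlated X and Y.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
pj, pp = joint_and_product(joint)

mutual_information = kl_divergence(pj, pp)  # MI = KL(joint || product of marginals)
jsd = js_divergence(pj, pp)                 # JSD between the same two distributions

print(f"MI  = {mutual_information:.4f} nats")
print(f"JSD = {jsd:.4f} nats")
```

The JSD is always bounded above by log 2, whereas the KL-based mutual information is not bounded in general, which is one reason the relation between the two is not obvious.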

📄 The Effect of Label Noise on the Information Content of Neural Representations

2025-10-12

Authors:

Ali Hussaini Umar, Franky Kevin Nando Tezoh, Jean Barbier, Santiago Acevedo, Alessandro Laio

Abstract:

In supervised classification tasks, models are trained to predict a label for each data point. In real-world datasets, these labels are often noisy due to annotation errors. While the impact of label noise on the performance of deep learning models has been widely studied, its effects on the networks' hidden representations remain poorly understood. We address this gap by systematically comparing hidden representations using the Information Imbalance, a computationally efficient proxy of conditi...

ID: 2510.06401v1 cs.LG, cs.IT, cs.NE, math.IT, stat.ML

arXiv PDF

📄 Black-box Detection of LLM-generated Text Using Generalized Jensen-Shannon Divergence

2025-10-11

Authors:

Shuangyi Chen, Ashish Khisti

Abstract:

We study black-box detection of machine-generated text under practical constraints: the scoring model (proxy LM) may mismatch the unknown source model, and per-input contrastive generation is costly. We propose SurpMark, a reference-based detector that summarizes a passage by the dynamics of its token surprisals. SurpMark quantizes surprisals into interpretable states, estimates a state-transition matrix for the test text, and scores it via a generalized Jensen-Shannon (GJS) gap between the test...

ID: 2510.07500v1 cs.LG, cs.IT, math.IT

arXiv PDF

📄 Efficient Generalization via Multimodal Co-Training under Data Scarcity and Distribution Shift

2025-10-11

Authors:

Tianyu Bell Pan, Damon L. Woodard

Abstract:

This paper explores a multimodal co-training framework designed to enhance model generalization in situations where labeled data is limited and distribution shifts occur. We thoroughly examine the theoretical foundations of this framework, deriving conditions under which the use of unlabeled data and the promotion of agreement between classifiers for different modalities lead to significant improvements in generalization. We also present a convergence analysis that confirms the effectiveness of ...

ID: 2510.07509v1 cs.LG, cs.IT, math.IT

arXiv PDF

📄 Some theoretical improvements on the tightness of PAC-Bayes risk certificates for neural networks

2025-10-11

Authors:

Diego García-Pérez, Emilio Parrado-Hernández, John Shawe-Taylor

Abstract:

This paper presents four theoretical contributions that improve the usability of risk certificates for neural networks based on PAC-Bayes bounds. First, two bounds on the KL divergence between Bernoulli distributions enable the derivation of the tightest explicit bounds on the true risk of classifiers across different ranges of empirical risk. The paper next focuses on the formalization of an efficient methodology based on implicit differentiation that enables the introduction of the optimizatio...

ID: 2510.07935v1 cs.LG, cs.IT, math.IT, stat.ML

arXiv PDF
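
For readers unfamiliar with the KL divergence between Bernoulli distributions mentioned in the abstract above, the sketch below shows the standard quantity and its usual numerical inversion by bisection, which turns an empirical risk and a KL budget into an upper bound on the true risk. This is a generic textbook construction under stated assumptions: the function names, the clipping constant, and the example values are hypothetical, and the paper's own explicit bounds are not reproduced here.

```python
import math

def bernoulli_kl(q, p):
    """kl(q || p) between Bernoulli(q) and Bernoulli(p), in nats."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)  # clip p to avoid log(0) and division by zero
    total = 0.0
    if q > 0:
        total += q * math.log(q / p)
    if q < 1:
        total += (1 - q) * math.log((1 - q) / (1 - p))
    return total

def kl_inverse_upper(q, budget, iters=100):
    """Largest p >= q with kl(q || p) <= budget, found by bisection.

    kl(q || p) is increasing in p on [q, 1], so bisection applies.
    """
    lo, hi = q, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if bernoulli_kl(q, mid) > budget:
            hi = mid
        else:
            lo = mid
    return lo

# Hypothetical numbers: empirical risk 0.10 and a KL budget of 0.05.
empirical_risk = 0.10
budget = 0.05
risk_bound = kl_inverse_upper(empirical_risk, budget)
pinsker_bound = empirical_risk + math.sqrt(budget / 2)  # looser explicit bound

print(f"kl-inverse bound: {risk_bound:.4f}")
print(f"Pinsker bound:    {pinsker_bound:.4f}")
```

The bisection result is never looser than the Pinsker-style bound, since kl(q || p) >= 2(p - q)^2; explicit closed-form bounds trade some of that tightness for being differentiable and cheap to evaluate.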

📄 Spectral Scaling Laws in Language Models: How Effectively Do Feed-Forward Networks Use Their Latent Space?

2025-10-04

Authors:

Nandan Kumar Jha, Brandon Reagen

Abstract:

As large language models (LLMs) scale, the question is not only how large they become, but how much of their capacity is effectively utilized. Existing scaling laws relate model size to loss, yet overlook how components exploit their latent space. We study feed-forward networks (FFNs) and recast width selection as a spectral utilization problem. Using a lightweight diagnostic suite -- Hard Rank (participation ratio), Soft Rank (Shannon rank), Spectral Concentration, and the composite Spectral Ut...

ID: 2510.00537v1 cs.LG, cs.IT, math.IT

arXiv PDF
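
The abstract above names a participation ratio and a Shannon rank as spectral diagnostics. The sketch below uses the common textbook definitions of both, applied to a list of non-negative eigenvalues. The truncated abstract does not give the paper's exact formulas, so these definitions, the function names, and the example spectra are assumptions for illustration only.

```python
import math

def participation_ratio(eigenvalues):
    """(sum of eigenvalues)^2 / (sum of squared eigenvalues)."""
    total = sum(eigenvalues)
    return total * total / sum(e * e for e in eigenvalues)

def shannon_rank(eigenvalues):
    """exp of the Shannon entropy of the normalized spectrum."""
    total = sum(eigenvalues)
    probs = [e / total for e in eigenvalues if e > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return math.exp(entropy)

# Hypothetical spectra: one flat, one dominated by a single direction.
flat_spectrum = [1.0, 1.0, 1.0, 1.0]
peaked_spectrum = [10.0, 0.1, 0.1, 0.1]

for name, spectrum in [("flat", flat_spectrum), ("peaked", peaked_spectrum)]:
    pr = participation_ratio(spectrum)
    sr = shannon_rank(spectrum)
    print(f"{name}: participation ratio = {pr:.3f}, Shannon rank = {sr:.3f}")
```

Both measures equal the dimension for a perfectly flat spectrum and approach 1 when a single eigenvalue dominates, which is why they can serve as effective-rank estimates.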

📄 A Unified Probabilistic Framework for Dictionary Learning with Parsimonious Activation

2025-10-02

Authors:

Zihui Zhao, Yuanbo Tang, Jieyu Ren, Xiaoping Zhang, Yang Li

Abstract:

Dictionary learning is traditionally formulated as an $L_1$-regularized signal reconstruction problem. While recent developments have incorporated discriminative, hierarchical, or generative structures, most approaches rely on encouraging representation sparsity over individual samples that overlook how atoms are shared across samples, resulting in redundant and sub-optimal dictionaries. We introduce a parsimony promoting regularizer based on the row-wise $L_\infty$ norm of the coefficient matri...

ID: 2509.25690v1 cs.LG, cs.IT, math.IT

arXiv PDF

📄 Beyond Point Estimates: Likelihood-Based Full-Posterior Wireless Localization

2025-10-02

Authors:

Haozhe Lei, Hao Guo, Tommy Svensson, Sundeep Rangan

Abstract:

Modern wireless systems require not only position estimates, but also quantified uncertainty to support planning, control, and radio resource management. We formulate localization as posterior inference of an unknown transmitter location from receiver measurements. We propose Monte Carlo Candidate-Likelihood Estimation (MC-CLE), which trains a neural scoring network using Monte Carlo sampling to compare true and candidate transmitter locations. We show that in line-of-sight simulations with a mu...

ID: 2509.25719v1 cs.LG, cs.IT, cs.SY, eess.SY, math.IT

arXiv PDF

Showing 21 - 30 of 58 entries