📊 Digest statistics

Total digests: 34022 Added today: 82

Last updated: today

📄 The Adaptivity Barrier in Batched Nonparametric Bandits: Sharp Characterization of the Price of Unknown Margin

2025-11-07

Authors:

Rong Jiang, Cong Ma

Annotation:

We study batched nonparametric contextual bandits under a margin condition when the margin parameter $\alpha$ is unknown. To capture the statistical price of this ignorance, we introduce the regret inflation criterion, defined as the ratio between the regret of an adaptive algorithm and that of an oracle knowing $\alpha$. We show that the optimal regret inflation grows polynomially with the horizon $T$, with exponent precisely given by the value of a convex optimization problem involving the dimen...

ID: 2511.03708v1 math.ST, cs.LG, stat.ML, stat.TH

arXiv PDF

📄 Epidemiology of Large Language Models: A Benchmark for Observational Distribution Knowledge

2025-11-07

Authors:

Drago Plecko, Patrik Okanovic, Torsten Hoefler, Elias Bareinboim

Annotation:

Artificial intelligence (AI) systems hold great promise for advancing various scientific disciplines, and are increasingly used in real-world applications. Despite their remarkable progress, further capabilities are expected in order to achieve more general types of intelligence. A critical distinction in this context is between factual knowledge, which can be evaluated against true or false answers (e.g., "what is the capital of England?"), and probabilistic knowledge, reflecting probabilistic ...

ID: 2511.03070v1 cs.AI, cs.LG, stat.ML

arXiv PDF

📄 Estimation of Toeplitz Covariance Matrices using Overparameterized Gradient Descent

2025-11-06

Authors:

Daniel Busbib, Ami Wiesel

Annotation:

We consider covariance estimation under Toeplitz structure. Numerous sophisticated optimization methods have been developed to maximize the Gaussian log-likelihood under Toeplitz constraints. In contrast, recent advances in deep learning demonstrate the surprising power of simple gradient descent (GD) applied to overparameterized models. Motivated by this trend, we revisit Toeplitz covariance estimation through the lens of overparameterized GD. We model the $P\times P$ covariance as a sum of $K$...

ID: 2511.01605v1 cs.LG, stat.ML

arXiv PDF

📄 Cross-Treatment Effect Estimation for Multi-Category, Multi-Valued Causal Inference via Dynamic Neural Masking

2025-11-06

Authors:

Xiaopeng Ke, Yihan Yu, Ruyue Zhang, Zhishuo Zhou, Fangzhou Shi, Chang Men, Zhengdan Zhu

Annotation:

Counterfactual causal inference faces significant challenges when extended to multi-category, multi-valued treatments, where complex cross-effects between heterogeneous interventions are difficult to model. Existing methodologies remain constrained to binary or single-type treatments and suffer from restrictive assumptions, limited scalability, and inadequate evaluation frameworks for complex intervention scenarios. We present XTNet, a novel network architecture for multi-category, multi-value...

ID: 2511.01641v1 cs.LG, stat.ML

arXiv PDF

📄 Bridging Lifelong and Multi-Task Representation Learning via Algorithm and Complexity Measure

2025-11-06

Authors:

Zhi Wang, Chicheng Zhang, Ramya Korlakai Vinayak

Annotation:

In lifelong learning, a learner faces a sequence of tasks with shared structure and aims to identify and leverage it to accelerate learning. We study the setting where such structure is captured by a common representation of data. Unlike multi-task learning or learning-to-learn, where tasks are available upfront to learn the representation, lifelong learning requires the learner to make use of its existing knowledge while continually gathering partial information in an online fashion. In this pa...

ID: 2511.01847v1 cs.LG, stat.ML

arXiv PDF

📄 Learning CNF formulas from uniform random solutions in the local lemma regime

2025-11-06

Authors:

Weiming Feng, Xiongxin Yang, Yixiao Yu, Yiyao Zhang

Annotation:

We study the problem of learning an $n$-variable $k$-CNF formula $\Phi$ from its i.i.d. uniform random solutions, which is equivalent to learning a Boolean Markov random field (MRF) with $k$-wise hard constraints. Revisiting Valiant's algorithm (Commun. ACM'84), we show that it can exactly learn (1) $k$-CNFs with bounded clause intersection size under Lovász local lemma type conditions, from $O(\log n)$ samples; and (2) random $k$-CNFs near the satisfiability threshold, from $\widetilde{O}(n^{...

ID: 2511.02487v1 cs.DS, cs.LG, stat.ML

arXiv PDF

📄 Quantitative Bounds for Length Generalization in Transformers

2025-11-04

Authors:

Zachary Izzo, Eshaan Nichani, Jason D. Lee

Annotation:

We study the problem of length generalization (LG) in transformers: the ability of a model trained on shorter sequences to maintain performance when evaluated on much longer, previously unseen inputs. Prior work by Huang et al. (2025) established that transformers eventually achieve length generalization once the training sequence length exceeds some finite threshold, but left open the question of how large it must be. In this work, we provide the first quantitative bounds on the required traini...

ID: 2510.27015v1 cs.LG, stat.ML

arXiv PDF

📄 Causal Masking on Spatial Data: An Information-Theoretic Case for Learning Spatial Datasets with Unimodal Language Models

2025-11-04

Authors:

Jared Junkin, Samuel Nathanson

Annotation:

Language models are traditionally designed around causal masking. In domains with spatial or relational structure, causal masking is often viewed as inappropriate, and sequential linearizations are instead used. Yet the question of whether it is viable to accept the information loss introduced by causal masking on nonsequential data has received little direct study, in part because few domains offer both spatial and sequential representations of the same dataset. In this work, we investigate thi...

ID: 2510.27009v1 cs.AI, cs.LG, stat.ML

arXiv PDF

📄 LLMs as In-Context Meta-Learners for Model and Hyperparameter Selection

2025-11-01

Authors:

Youssef Attia El Hili, Albert Thomas, Malik Tiomoko, Abdelhakim Benechehab, Corentin Léger, Corinne Ancourt, Balázs Kégl

Annotation:

Model and hyperparameter selection are critical but challenging in machine learning, typically requiring expert intuition or expensive automated search. We investigate whether large language models (LLMs) can act as in-context meta-learners for this task. By converting each dataset into interpretable metadata, we prompt an LLM to recommend both model families and hyperparameters. We study two prompting strategies: (1) a zero-shot mode relying solely on pretrained knowledge, and (2) a meta-inform...

ID: 2510.26510v1 cs.LG, stat.ML

arXiv PDF

📄 On Measuring Localization of Shortcuts in Deep Networks

2025-11-01

Authors:

Nikita Tsoy, Nikola Konstantinov

Annotation:

Shortcuts, spurious rules that perform well during training but fail to generalize, present a major challenge to the reliability of deep networks (Geirhos et al., 2020). However, the impact of shortcuts on feature representations remains understudied, obstructing the design of principled shortcut-mitigation methods. To overcome this limitation, we investigate the layer-wise localization of shortcuts in deep models. Our novel experiment design quantifies the layer-wise contribution to accuracy de...

ID: 2510.26560v1 cs.LG, stat.ML

arXiv PDF

Showing 121 - 130 of 385 entries