📊 Статистика дайджестов
Всего дайджестов: 34022 Добавлено сегодня: 82
Последнее обновление: сегодня
Авторы:
Rong Jiang, Cong Ma
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
We study batched nonparametric contextual bandits under a margin condition
when the margin parameter $\alpha$ is unknown. To capture the statistical price
of this ignorance, we introduce the regret inflation criterion, defined as the
ratio between the regret of an adaptive algorithm and that of an oracle knowing
$\alpha$. We show that the optimal regret inflation grows polynomial with the
horizon $T$, with exponent precisely given by the value of a convex
optimization problem involving the dimen...
📄 Epidemiology of Large Language Models: A Benchmark for Observational Distribution Knowledge
2025-11-07Авторы:
Drago Plecko, Patrik Okanovic, Torsten Hoefler, Elias Bareinboim
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Artificial intelligence (AI) systems hold great promise for advancing various
scientific disciplines, and are increasingly used in real-world applications.
Despite their remarkable progress, further capabilities are expected in order
to achieve more general types of intelligence. A critical distinction in this
context is between factual knowledge, which can be evaluated against true or
false answers (e.g., "what is the capital of England?"), and probabilistic
knowledge, reflecting probabilistic ...
Авторы:
Daniel Busbib, Ami Wiesel
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
We consider covariance estimation under Toeplitz structure. Numerous
sophisticated optimization methods have been developed to maximize the Gaussian
log-likelihood under Toeplitz constraints. In contrast, recent advances in deep
learning demonstrate the surprising power of simple gradient descent (GD)
applied to overparameterized models. Motivated by this trend, we revisit
Toeplitz covariance estimation through the lens of overparameterized GD. We
model the $P\times P$ covariance as a sum of $K$...
Авторы:
Xiaopeng Ke, Yihan Yu, Ruyue Zhang, Zhishuo Zhou, Fangzhou Shi, Chang Men, Zhengdan Zhu
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Counterfactual causal inference faces significant challenges when extended to
multi-category, multi-valued treatments, where complex cross-effects between
heterogeneous interventions are difficult to model. Existing methodologies
remain constrained to binary or single-type treatments and suffer from
restrictive assumptions, limited scalability, and inadequate evaluation
frameworks for complex intervention scenarios.
We present XTNet, a novel network architecture for multi-category,
multi-value...
📄 Bridging Lifelong and Multi-Task Representation Learning via Algorithm and Complexity Measure
2025-11-06Авторы:
Zhi Wang, Chicheng Zhang, Ramya Korlakai Vinayak
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
In lifelong learning, a learner faces a sequence of tasks with shared
structure and aims to identify and leverage it to accelerate learning. We study
the setting where such structure is captured by a common representation of
data. Unlike multi-task learning or learning-to-learn, where tasks are
available upfront to learn the representation, lifelong learning requires the
learner to make use of its existing knowledge while continually gathering
partial information in an online fashion. In this pa...
Авторы:
Weiming Feng, Xiongxin Yang, Yixiao Yu, Yiyao Zhang
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
We study the problem of learning a $n$-variables $k$-CNF formula $\Phi$ from
its i.i.d. uniform random solutions, which is equivalent to learning a Boolean
Markov random field (MRF) with $k$-wise hard constraints. Revisiting Valiant's
algorithm (Commun. ACM'84), we show that it can exactly learn (1) $k$-CNFs with
bounded clause intersection size under Lov\'asz local lemma type conditions,
from $O(\log n)$ samples; and (2) random $k$-CNFs near the satisfiability
threshold, from $\widetilde{O}(n^{...
Авторы:
Zachary Izzo, Eshaan Nichani, Jason D. Lee
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
We study the problem of length generalization (LG) in transformers: the
ability of a model trained on shorter sequences to maintain performance when
evaluated on much longer, previously unseen inputs. Prior work by Huang et al.
(2025) established that transformers eventually achieve length generalization
once the training sequence length exceeds some finite threshold, but left open
the question of how large it must be. In this work, we provide the first
quantitative bounds on the required traini...
Авторы:
Jared Junkin, Samuel Nathanson
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Language models are traditionally designed around causal masking. In domains
with spatial or relational structure, causal masking is often viewed as
inappropriate, and sequential linearizations are instead used. Yet the question
of whether it is viable to accept the information loss introduced by causal
masking on nonsequential data has received little direct study, in part because
few domains offer both spatial and sequential representations of the same
dataset. In this work, we investigate thi...
Авторы:
Youssef Attia El Hili, Albert Thomas, Malik Tiomoko, Abdelhakim Benechehab, Corentin Léger, Corinne Ancourt, Balázs Kégl
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Model and hyperparameter selection are critical but challenging in machine
learning, typically requiring expert intuition or expensive automated search.
We investigate whether large language models (LLMs) can act as in-context
meta-learners for this task. By converting each dataset into interpretable
metadata, we prompt an LLM to recommend both model families and
hyperparameters. We study two prompting strategies: (1) a zero-shot mode
relying solely on pretrained knowledge, and (2) a meta-inform...
Авторы:
Nikita Tsoy, Nikola Konstantinov
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Shortcuts, spurious rules that perform well during training but fail to
generalize, present a major challenge to the reliability of deep networks
(Geirhos et al., 2020). However, the impact of shortcuts on feature
representations remains understudied, obstructing the design of principled
shortcut-mitigation methods. To overcome this limitation, we investigate the
layer-wise localization of shortcuts in deep models. Our novel experiment
design quantifies the layer-wise contribution to accuracy de...
Показано 121 -
130
из 385 записей