📊 Digest statistics
Total digests: 34022 Added today: 82
Last updated: today
Authors:
Shayan Kiyani, Hamed Hassani, George Pappas, Aaron Roth
Annotation:
Calibration has emerged as a foundational goal in "trustworthy machine
learning", in part because of its strong decision theoretic semantics.
Independent of the underlying distribution, and independent of the decision
maker's utility function, calibration promises that amongst all policies
mapping predictions to actions, the uniformly best policy is the one that
"trusts the predictions" and acts as if they were correct. But this is true
only of fully calibrated forecasts, which are tr...
📄 Finding the Sweet Spot: Trading Quality, Cost, and Speed During Inference-Time LLM Reflection
2025-10-25
Authors:
Jack Butler, Nikita Kozodoi, Zainab Afolabi, Brian Tyacke, Gaiar Baimuratov
Annotation:
As Large Language Models (LLMs) continue to evolve, practitioners face
increasing options for enhancing inference-time performance without model
retraining, including budget tuning and multi-step techniques like
self-reflection. While these methods improve output quality, they create
complex trade-offs among accuracy, cost, and latency that remain poorly
understood across different domains. This paper systematically compares
self-reflection and budget tuning across mathematical reasoning and tra...
Authors:
Dechen Zhang, Junwei Su, Difan Zou
Annotation:
The use of low-bit quantization has emerged as an indispensable technique for
enabling the efficient training of large-scale models. Despite its widespread
empirical success, a rigorous theoretical understanding of its impact on
learning performance remains notably absent, even in the simplest linear
regression setting. We present the first systematic theoretical study of this
fundamental question, analyzing finite-step stochastic gradient descent (SGD)
for high-dimensional linear regression und...
Authors:
Yi-Shan Chu, Yueh-Cheng Kuo
Annotation:
We revisit the Universal Approximation Theorem (UAT) through the lens of the
tropical geometry of neural networks and introduce a constructive,
geometry-aware initialization for sigmoidal multi-layer perceptrons (MLPs).
Tropical geometry shows that Rectified Linear Unit (ReLU) networks admit
decision functions with a combinatorial structure often described as a tropical
rational, namely a difference of tropical polynomials. Focusing on planar
binary classification, we design purely sigmoidal MLPs...
Authors:
Dechen Zhang, Zhenmei Shi, Yi Zhang, Yingyu Liang, Difan Zou
Annotation:
Kernel ridge regression (KRR) is a foundational tool in machine learning,
with recent work emphasizing its connections to neural networks. However,
existing theory primarily addresses the i.i.d. setting, while real-world data
often exhibits structured dependencies - particularly in applications like
denoising score learning where multiple noisy observations derive from shared
underlying signals. We present the first systematic study of KRR generalization
for non-i.i.d. data with signal-noise cau...
Authors:
Gabriele Visentin, Patrick Cheridito
Annotation:
In this paper, we show that interventionally robust optimization problems in
causal models are continuous under the $G$-causal Wasserstein distance, but may
be discontinuous under the standard Wasserstein distance. This highlights the
importance of using generative models that respect the causal structure when
augmenting data for such tasks. To this end, we propose a new normalizing flow
architecture that satisfies a universal approximation property for causal
structural models and can be effici...
Authors:
Mátyás Schubert, Tom Claassen, Sara Magliacane
Annotation:
Causal discovery methods can identify valid adjustment sets for causal effect
estimation for a pair of target variables, even when the underlying causal
graph is unknown. Global causal discovery methods focus on learning the whole
causal graph and therefore enable the recovery of optimal adjustment sets,
i.e., sets with the lowest asymptotic variance, but they quickly become
computationally prohibitive as the number of variables grows. Local causal
discovery methods offer a more scalable alterna...
Authors:
Ningkang Peng, Yuzhe Mao, Yuhao Zhang, Linjin Qian, Qianfeng Yu, Yanhui Gu, Yi Chen, Li Kong
Annotation:
Out-of-Distribution (OOD) detection is a cornerstone for the safe deployment
of AI systems in the open world. However, existing methods treat OOD detection
as a binary classification problem, a cognitive flattening that fails to
distinguish between semantically close (Near-OOD) and distant (Far-OOD) unknown
risks. This limitation poses a significant safety bottleneck in applications
requiring fine-grained risk stratification. To address this, we propose a
paradigm shift from a conventional proba...
Authors:
Kyla Chasalow, Skyler Wu, Susan Murphy
Annotation:
Missing data in online reinforcement learning (RL) poses challenges compared
to missing data in standard tabular data or in offline policy learning. The
need to impute and act at each time step means that imputation cannot be put
off until enough data exist to produce stable imputation models. It also means
future data collection and learning depend on previous imputations. This paper
proposes fully online imputation ensembles. We find that maintaining multiple
imputation pathways may help balan...
Authors:
Adrián Pérez-Herrero, Paulo Félix, Jesús Presedo, Carl Henrik Ek
Annotation:
We present a method that models the evolution of an unbounded number of time
series clusters by switching among an unknown number of regimes with linear
dynamics. We develop a Bayesian non-parametric approach using a hierarchical
Dirichlet process as a prior on the parameters of a Switching Linear Dynamical
System and a Gaussian process prior to model the statistical variations in
amplitude and temporal alignment within each cluster. By modeling the evolution
of time series patterns, the method ...
Showing 11-20 of 35 entries