📊 Digest statistics
Total digests: 34022 Added today: 82
Last updated: today
Authors:
Anchit Jain, Stephen Bates
Russian summary not found
Available fields: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Abstract:
Decomposing prediction uncertainty into its aleatoric (irreducible) and
epistemic (reducible) components is critical for the development and deployment
of machine learning systems. A popular, principled measure for epistemic
uncertainty is the mutual information between the response variable and model
parameters. However, evaluating this measure requires access to the posterior
distribution of the model parameters, which is challenging to compute. In view
of this, we introduce a frequentist meas...
Authors:
Yuma Ichikawa, Shuhei Kashiwamura, Ayaka Sakata
Russian summary not found
Abstract:
Quantized neural network training optimizes a discrete, non-differentiable
objective. The straight-through estimator (STE) enables backpropagation through
surrogate gradients and is widely used. While previous studies have primarily
focused on the properties of surrogate gradients and their convergence, the
influence of quantization hyperparameters, such as bit width and quantization
range, on learning dynamics remains largely unexplored. We theoretically show
that in the high-dimensional limit,...
Authors:
Hao Zeng, Jianguo Huang, Bingyi Jing, Hongxin Wei, Bo An
Russian summary not found
Abstract:
Large reasoning models (LRMs) have achieved remarkable progress in complex
problem-solving tasks. Despite this success, LRMs typically suffer from high
computational costs during deployment, highlighting a need for efficient
inference. A popular direction of efficiency improvement is to switch the LRM
between thinking and nonthinking modes dynamically. However, such approaches
often introduce additional reasoning errors and lack statistical guarantees for
the performance loss, which are critical...