📊 Digest statistics
Total digests: 34022 Added today: 82
Last updated: today
Authors:
Adrian Kosowski, Przemysław Uznański, Jan Chorowski, Zuzanna Stamirowska, Michał Bartoszkiewicz
Russian summary not found
Available fields: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
The relationship between computing systems and the brain has served as
motivation for pioneering theoreticians since John von Neumann and Alan Turing.
Uniform, scale-free biological networks, such as the brain, have powerful
properties, including generalizing over time, which is the main barrier for
Machine Learning on the path to Universal Reasoning Models.
We introduce 'Dragon Hatchling' (BDH), a new Large Language Model
architecture based on a scale-free biologically inspired network of \$n...
Authors:
Gholamali Aminian, Andrew Elliott, Tiger Li, Timothy Cheuk Hin Wong, Victor Claude Dehon, Lukasz Szpruch, Carsten Maple, Christopher Read, Martin Brown, Gesine Reinert, Mo Mamouei
Russian summary not found
Annotation:
Detecting payment fraud in real-world banking streams requires models that
can exploit both the order of events and the irregular time gaps between them.
We introduce FraudTransformer, a sequence model that augments a vanilla
GPT-style architecture with (i) a dedicated time encoder that embeds either
absolute timestamps or inter-event values, and (ii) a learned positional
encoder that preserves relative order. Experiments on a large industrial
dataset -- tens of millions of transactions and auxi...
Authors:
Chris Kolb, Laetitia Frost, Bernd Bischl, David Rügamer
Russian summary not found
Annotation:
Structured sparsity regularization offers a principled way to compact neural
networks, but its non-differentiability breaks compatibility with conventional
stochastic gradient descent and requires either specialized optimizers or
additional post-hoc pruning without formal guarantees. In this work, we propose
$D$-Gating, a fully differentiable structured overparameterization that splits
each group of weights into a primary weight vector and multiple scalar gating
factors. We prove that any local ...
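The parameterization described in this abstract can be pictured with a small sketch. It assumes, as an illustration only, that the effective weight of a group is the primary vector multiplied by the product of its scalar gates; the truncated abstract does not spell out the exact form, and the function name `gated_group_weight` is hypothetical.

```python
import numpy as np

def gated_group_weight(primary, gates):
    """Effective group weight under the ASSUMED form: primary vector times
    the product of its scalar gating factors (not confirmed by the abstract)."""
    return np.asarray(primary, dtype=float) * np.prod(gates)

w = np.array([1.0, -2.0, 0.5])            # primary weight vector of one group
kept = gated_group_weight(w, [0.5, 2.0])   # gates multiply to 1: group unchanged
pruned = gated_group_weight(w, [0.0, 2.0]) # any zero gate switches the whole group off
```

Under this assumed form, driving a single scalar gate to zero removes the entire group at once, which is what makes the sparsity structured, while every operation stays differentiable.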
Authors:
Chenruo Liu, Yijun Dong, Qi Lei
Russian summary not found
Annotation:
We initiate a unified theoretical and algorithmic study of a key problem in
weak-to-strong (W2S) generalization: when fine-tuning a strong pre-trained
student with pseudolabels from a weaker teacher on a downstream task with
spurious correlations, does W2S happen, and how to improve it upon failures? We
consider two sources of spurious correlations caused by group imbalance: (i) a
weak teacher fine-tuned on group-imbalanced labeled data with a minority group
of fraction $\eta_\ell$, and (ii) a g...
Authors:
Julia Wenkmann, Damien Garreau
Russian summary not found
Annotation:
One of the most pressing challenges in artificial intelligence is to make
models more transparent to their users. Recently, explainable artificial
intelligence has come up with numerous methods to tackle this challenge. A
promising avenue is to use concept-based explanations, that is, high-level
concepts instead of plain feature importance scores. Among this class of
methods, Concept Activation Vectors (CAVs) (Kim et al., 2018) stand out as one
of the main protagonists. One interesting aspect of ...
📄 Demographic-Agnostic Fairness without Harm
2025-10-01
Authors:
Zhongteng Cai, Mohammad Mahdi Khalili, Xueru Zhang
Russian summary not found
Annotation:
As machine learning (ML) algorithms are increasingly used in social domains
to make predictions about humans, there is a growing concern that these
algorithms may exhibit biases against certain social groups. Numerous notions
of fairness have been proposed in the literature to measure the unfairness of
ML. Among them, one class that receives the most attention is
parity-based, i.e., achieving fairness by equalizing treatment or
outcomes for different social groups. However, achieving pa...
Authors:
Bo Hu, José C. Príncipe
Russian summary not found
Annotation:
Pairwise distance-based costs are crucial for self-supervised and contrastive
feature learning. Mixture Density Networks (MDNs) are a widely used approach
for generative models and density approximation, using neural networks to
produce multiple centers that define a Gaussian mixture. By combining MDNs with
contrastive costs, this paper proposes data density approximation using four
types of kernelized matrix costs: the scalar cost, the vector-matrix cost, the
matrix-matrix cost (the trace of Sc...
📄 A signal separation view of classification
2025-10-01
Authors:
H. N. Mhaskar, Ryan O'Dowd
Russian summary not found
Annotation:
The problem of classification in machine learning has often been approached
in terms of function approximation. In this paper, we propose an alternative
approach for classification in arbitrary compact metric spaces which, in
theory, yields both the number of classes, and a perfect classification using a
minimal number of queried labels. Our approach uses localized trigonometric
polynomial kernels initially developed for the point source signal separation
problem in signal processing. Rather tha...
Authors:
Dipan Maity
Russian summary not found
Annotation:
Orthogonal gradient updates have emerged as a promising direction in
optimization for machine learning. However, traditional approaches such as
SVD/QR decomposition incur prohibitive computational costs of O(n^3) and
underperform compared to well-tuned SGD with momentum, since momentum is
applied only after strict orthogonalization. Recent advances, such as Muon,
improve efficiency by applying momentum before orthogonalization and producing
semi-orthogonal matrices via Newton-Schulz iterations, ...
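For readers unfamiliar with the Newton-Schulz step mentioned in this abstract, here is a minimal sketch of the classical cubic iteration, which pushes every singular value of a matrix toward 1 using only matrix products. It is not the tuned variant used by Muon, nor this paper's method; the function name and the step count are assumptions chosen for illustration.

```python
import numpy as np

def newton_schulz_orthogonalize(G, steps=50):
    """Approximate the semi-orthogonal polar factor of G (full column rank
    assumed) with the classical cubic Newton-Schulz iteration."""
    # Frobenius normalization puts all singular values in (0, 1],
    # inside the convergence region of the iteration.
    X = G / np.linalg.norm(G)
    for _ in range(steps):
        # Each singular value s is mapped to 1.5*s - 0.5*s**3, which tends to 1.
        X = 1.5 * X - 0.5 * X @ (X.T @ X)
    return X

rng = np.random.default_rng(0)
G = rng.standard_normal((4, 3))     # stand-in for a gradient or momentum matrix
Q = newton_schulz_orthogonalize(G)  # columns of Q are approximately orthonormal
```

The loop uses only matrix multiplications, with no explicit SVD or QR factorization. In practice far fewer steps are typically used, giving an approximately rather than exactly orthogonal result.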
Authors:
Maedeh Zarvandi, Michael Timothy, Theresa Wasserer, Debarghya Ghoshdastidar
Russian summary not found
Annotation:
Kernel methods provide a theoretically grounded framework for non-linear and
non-parametric learning, with strong analytic foundations and statistical
guarantees. Yet, their scalability has long been limited by prohibitive time
and memory costs. While progress has been made in scaling kernel regression, no
framework exists for scalable kernel-based representation learning, restricting
their use in the era of foundation models where representations are learned
from massive unlabeled data. We intr...
Showing 261-270 of 385 records