📊 Статистика дайджестов
Всего дайджестов: 34022 Добавлено сегодня: 82
Последнее обновление: сегодня
Авторы:
Grigory Kovalev, Mikhail Tikhomirov
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
This work investigates distillation methods for large language models (LLMs)
with the goal of developing compact models that preserve high performance.
Several existing approaches are reviewed, with a discussion of their respective
strengths and limitations. An improved method based on the ShortGPT approach
has been developed, building upon the idea of incorporating iterative
evaluation of layer importance. At each step, importance is assessed by
measuring performance degradation when individual...
Авторы:
Arthur Satouf, Yuxuan Zong, Habiboulaye Amadou-Boubacar, Pablo Piantanida, Benjamin Piwowarski
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Generative Retrieval (GR) differs from the traditional index-then-retrieve
pipeline by storing relevance in model parameters and directly generating
document identifiers. However, GR often struggles to generalize and is costly
to scale. We introduce QUESTER (QUEry SpecificaTion gEnerative Retrieval),
which reframes GR as query specification generation - in this work, a simple
keyword query handled by BM25 - using a (small) LLM. The policy is trained
using reinforcement learning techniques (GRPO)...
Авторы:
Constanza Fierro, Fabien Roger
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Providing high-quality feedback to Large Language Models (LLMs) on a diverse
training distribution can be difficult and expensive, and providing feedback
only on a narrow distribution can result in unintended generalizations. To
better leverage narrow training data, we propose contrastive weight steering, a
simple post-training method that edits the model parameters using weight
arithmetic. We isolate a behavior direction in weight-space by subtracting the
weight deltas from two small fine-tunes...
Авторы:
Manh Nguyen, Sunil Gupta, Dai Do, Hung Le
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Hallucination mitigation remains a persistent challenge for large language
models (LLMs), even as model scales grow. Existing approaches often rely on
external knowledge sources, such as structured databases or knowledge graphs,
accessed through prompting or retrieval. However, prompt-based grounding is
fragile and domain-sensitive, while symbolic knowledge integration incurs heavy
retrieval and formatting costs. Motivated by knowledge graphs, we introduce
Graph-Retrieved Adaptive Decoding (GRAD...
Авторы:
Liran Cohen, Yaniv Nemcovesky, Avi Mendelson
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Machine unlearning aims to remove the influence of specific training data
from a model without requiring full retraining. This capability is crucial for
ensuring privacy, safety, and regulatory compliance. Therefore, verifying
whether a model has truly forgotten target data is essential for maintaining
reliability and trustworthiness. However, existing evaluation methods often
assess forgetting at the level of individual inputs. This approach may overlook
residual influence present in semantical...
Авторы:
Narjes Nourzad, Hanqing Yang, Shiyu Chen, Carlee Joe-Wong
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Cooperative multi-agent planning requires agents to make joint decisions with
partial information and limited communication. Coordination at the trajectory
level often fails, as small deviations in timing or movement cascade into
conflicts. Symbolic planning mitigates this challenge by raising the level of
abstraction and providing a minimal vocabulary of actions that enable
synchronization and collective progress. We present DR. WELL, a decentralized
neurosymbolic framework for cooperative mult...
📄 Automatic Machine Translation Detection Using a Surrogate Multilingual Translation Model
2025-11-07Авторы:
Cristian García-Romero, Miquel Esplà-Gomis, Felipe Sánchez-Martínez
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Modern machine translation (MT) systems depend on large parallel corpora,
often collected from the Internet. However, recent evidence indicates that (i)
a substantial portion of these texts are machine-generated translations, and
(ii) an overreliance on such synthetic content in training data can
significantly degrade translation quality. As a result, filtering out non-human
translations is becoming an essential pre-processing step in building
high-quality MT systems. In this work, we propose a ...
📄 Data-Efficient Adaptation and a Novel Evaluation Method for Aspect-based Sentiment Analysis
2025-11-07Авторы:
Yan Cathy Hua, Paul Denny, Jörg Wicker, Katerina Taškova
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Aspect-based Sentiment Analysis (ABSA) is a fine-grained opinion mining
approach that identifies and classifies opinions associated with specific
entities (aspects) or their categories within a sentence. Despite its rapid
growth and broad potential, ABSA research and resources remain concentrated in
commercial domains, leaving analytical needs unmet in high-demand yet
low-resource areas such as education and healthcare. Domain adaptation
challenges and most existing methods' reliance on resource...
Авторы:
Michel Wong, Ali Alshehri, Sophia Kao, Haotian He
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Text Normalization (TN) is a key preprocessing step in Text-to-Speech (TTS)
systems, converting written forms into their canonical spoken equivalents.
Traditional TN systems can exhibit high accuracy, but involve substantial
engineering effort, are difficult to scale, and pose challenges to language
coverage, particularly in low-resource settings. We propose PolyNorm, a
prompt-based approach to TN using Large Language Models (LLMs), aiming to
reduce the reliance on manually crafted rules and ena...
Авторы:
Kazi Reyazul Hasan, Mubasshira Musarrat, A. B. M. Alim Al Islam, Muhammad Abdullah Adnan
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Large language models work well for technical problem solving in English but
perform poorly when the same questions are asked in Bangla. A simple solution
would be to translate Bangla questions into English first and then use these
models. However, existing Bangla-English translation systems struggle with
technical terms. They often mistranslate specialized vocabulary, which changes
the meaning of the problem and leads to wrong answers. We present BanglaSTEM, a
dataset of 5,000 carefully selecte...
Показано 101 -
110
из 573 записей