Authors:
David Pohl, Marco Cognetta, Junyoung Lee, Naoaki Okazaki
Annotation:
Modern language models operate on subword-tokenized text in order to make a
trade-off between model size, inference speed, and vocabulary coverage. A side
effect of this is that, during inference, models are evaluated by measuring the
probability of only the specific tokenization produced as the output, despite
there being many possible ways to represent the same text with a subword
vocabulary. Recent studies have argued instead for evaluating LLMs by
marginalization - the probability mass of al...
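To make concrete the point that one text can have many subword representations, here is a minimal sketch that enumerates every segmentation of a string under a toy vocabulary and sums a probability over them. The vocabulary `VOCAB`, the helper names, and the `seq_prob` scoring callback are all hypothetical and are not taken from the paper; a real model would supply the per-tokenization probability.

```python
# Hypothetical toy vocabulary; real subword vocabularies are far larger.
VOCAB = {"u", "n", "d", "o", "un", "do", "undo"}


def segmentations(text, vocab):
    """Return every way to split `text` into a sequence of tokens from `vocab`."""
    if not text:
        return [[]]  # one way to tokenize the empty string: no tokens
    results = []
    for end in range(1, len(text) + 1):
        piece = text[:end]
        if piece in vocab:
            # Extend each tokenization of the remainder with the current piece.
            for rest in segmentations(text[end:], vocab):
                results.append([piece] + rest)
    return results


def marginal_probability(text, vocab, seq_prob):
    """Sum `seq_prob` over all tokenizations of `text`.

    `seq_prob` is an assumed callback that maps a token list to the
    probability a model assigns to that exact token sequence.
    """
    return sum(seq_prob(seg) for seg in segmentations(text, vocab))


if __name__ == "__main__":
    for seg in segmentations("undo", VOCAB):
        print(seg)
```

Exhaustive enumeration grows exponentially with text length, so this is only practical for short strings; it is meant to illustrate the quantity, not to compute it efficiently.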
Authors:
Prashant Kodali, Vaishnavi Shivkumar, Swarang Joshi, Monojit Choudhary, Ponnurangam Kumaraguru, Manish Shrivastava
Annotation:
We study model merging as a practical alternative to conventional adaptation
strategies for code-mixed NLP. Starting from a multilingual base model, we: (i)
perform continued pre-training (CPT) on unlabeled code-mixed text to obtain an
adapted checkpoint, (ii) merge the checkpoint with the base model, and (iii)
fine-tune (FT) on the downstream task data. We evaluate our approach for
sentence classification (sentiment and hate speech) tasks in English-Hindi
(En-Hi) and English-Spanish (En-Es) using XL...
Authors:
Joseph McInerney
Annotation:
This paper introduces Qomhrá, a bilingual Irish-English large language
model (LLM), developed under low-resource constraints presenting a complete
pipeline spanning bilingual continued pre-training, instruction tuning, and
alignment from human preferences. Newly accessible Irish corpora and English
text are mixed and curated to improve Irish performance while preserving
English ability. 6 closed-weight LLMs are judged for their Irish text
generation by a native speaker, a learner and other LLM...
📄 Temporal Referential Consistency: Do LLMs Favor Sequences Over Absolute Time References?
2025-10-21
Authors:
Ashutosh Bajpai, Tanmoy Chakraborty
Annotation:
The increasing acceptance of large language models (LLMs) as an alternative
to knowledge sources marks a significant paradigm shift across various domains,
including time-sensitive fields such as law, healthcare, and finance. To
fulfill this expanded role, LLMs must not only be factually accurate but also
demonstrate consistency across temporal dimensions, necessitating robust
temporal reasoning capabilities. Despite this critical requirement, efforts to
ensure temporal consistency in LLMs remai...
Authors:
Wael Rashwan, Hossam M. Zawbaa, Sourav Dutta, Haytham Assem
Annotation:
Detecting out-of-scope (OOS) user utterances remains a key challenge in
task-oriented dialogue systems and, more broadly, in open-set intent
recognition. Existing approaches often depend on strong distributional
assumptions or auxiliary calibration modules. We present DROID (Dual
Representation for Out-of-Scope Intent Detection), a compact end-to-end
framework that combines two complementary encoders -- the Universal Sentence
Encoder (USE) for broad semantic generalization and a domain-adapted
T...
Authors:
Nzubechukwu C. Ohalete, Kevin B. Gittner, Lauren M. Matheny
Annotation:
Large Language Models (LLMs) are highly sensitive to prompt design, and
developing optimized prompting techniques is crucial for generating consistent,
high-quality outputs. In this study, we introduce COSTAR-A, a novel prompt
engineering framework that enhances the existing COSTAR method, which stands
for Context, Objective, Style, Tone, Audience, and Response, by adding the
'Answer' component at the end. We demonstrate that while the original COSTAR
framework improves prompt clarity and aligns out...
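As a rough illustration of the component order named in the abstract (Context, Objective, Style, Tone, Audience, Response, then the added 'Answer'), the sketch below assembles a prompt from labelled sections. The function name, the header layout, and the example section contents are assumptions made for illustration; the abstract does not specify how the paper formats each section or what the 'Answer' component should contain.

```python
# Component order follows the abstract: COSTAR plus a trailing 'Answer'.
COSTAR_A_FIELDS = [
    "Context", "Objective", "Style", "Tone", "Audience", "Response", "Answer",
]


def build_costar_a_prompt(**sections):
    """Assemble a prompt with one labelled block per COSTAR-A component.

    The '# NAME' header layout is an assumption, not taken from the paper.
    Keyword arguments are the lowercase component names.
    """
    missing = [name for name in COSTAR_A_FIELDS if name.lower() not in sections]
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))
    blocks = [
        "# " + name.upper() + "\n" + sections[name.lower()]
        for name in COSTAR_A_FIELDS
    ]
    return "\n\n".join(blocks)


if __name__ == "__main__":
    # Hypothetical section contents, for illustration only.
    print(build_costar_a_prompt(
        context="You are reviewing a survey response.",
        objective="Classify the sentiment of the response.",
        style="Concise and factual.",
        tone="Neutral.",
        audience="Data analysts.",
        response="A single label.",
        answer="State only the final label.",
    ))
```

Requiring every component up front keeps prompts comparable across runs, which matters when the goal is consistent outputs.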
📄 Augmenting Dialog with Think-Aloud Utterances for Modeling Individual Personality Traits by LLM
2025-10-14
Authors:
Seiya Ishikura, Hiroaki Yamada, Tatsuya Hiraoka, Hiroaki Yamada, Takenobu Tokunaga
Annotation:
This study proposes augmenting dialog data with think-aloud utterances (TAUs)
for modeling individual personalities in text chat with LLMs. A TAU is a
verbalization of a speaker's thought before articulating the utterance. We
expect that "persona LLMs" trained with TAU-augmented data can better mimic the speaker's
personality traits. We tested whether the trained persona LLMs acquire the
human personality with respect to the Big Five, a framework characterizing human
personality traits from five aspects. The re...
Authors:
Ruitong Liu, Yan Wen, Te Sun, Yunjia Wu, Pingyang Huang, Zihang Yu, Siyuan Li
Annotation:
Fusing Knowledge Graphs with Large Language Models is crucial for
knowledge-intensive tasks like knowledge graph completion. The prevailing
paradigm, prefix-tuning, simply concatenates knowledge embeddings with text
inputs. However, this shallow fusion overlooks the rich relational semantics
within KGs and imposes a significant implicit reasoning burden on the LLM to
correlate the prefix with the text. To address these issues, we propose
Semantic-condition Tuning (SCT), a new knowledge injection paradi...
📄 Comprehensiveness Metrics for Automatic Evaluation of Factual Recall in Text Generation
2025-10-11
Authors:
Adam Dejl, James Barry, Alessandra Pascale, Javier Carnerero Cano
Annotation:
Despite demonstrating remarkable performance across a wide range of tasks,
large language models (LLMs) have also been found to frequently produce outputs
that are incomplete or selectively omit key information. In sensitive domains,
such omissions can result in significant harm comparable to that posed by
factual inaccuracies, including hallucinations. In this study, we address the
challenge of evaluating the comprehensiveness of LLM-generated texts, focusing
on the detection of missing informa...
Authors:
Zihao Li, Shaoxiong Ji, Jörg Tiedemann
Annotation:
Test-time scaling (TTS) has enhanced the performance of Reasoning Models
(RMs) on various tasks such as math and coding, yet its efficacy in machine
translation (MT) remains underexplored. This paper investigates whether
increased inference-time computation improves translation quality. We evaluate
12 RMs across a diverse suite of MT benchmarks spanning multiple domains,
examining three scenarios: direct translation, forced-reasoning extrapolation,
and post-editing. Our findings show that for ge...