📊 Digest statistics

Total digests: 34123

Added today: 101

Last updated: today

📄 ToolExpander: Extending the Frontiers of Tool-Using Reinforcement Learning to Weak LLMs

2025-10-11

Authors:

Fu Chen, Peng Wang, Xiyin Li, Wen Li, Shichi Lei, Dongdong Xiang

Abstract:

Training Large Language Models (LLMs) with Group Relative Policy Optimization (GRPO) encounters a significant challenge: models often fail to produce accurate responses, particularly in small-scale architectures. This limitation not only diminishes performance improvements and undermines the potential of GRPO but also frequently leads to mid-training collapse, adversely affecting stability and final efficacy. To address these issues, we propose ToolExpander, a novel framework that advances tool-...

ID: 2510.07737v1 cs.CL, cs.LG

📄 Instance Relation Learning Network with Label Knowledge Propagation for Few-shot Multi-label Intent Detection

2025-10-11

Authors:

Shiman Zhao, Shangyuan Li, Wei Chen, Tengjiao Wang, Jiahui Yao, Jiabin Zheng, Kam Fai Wong

Abstract:

Few-shot Multi-label Intent Detection (MID) is crucial for dialogue systems, aiming to detect multiple intents of utterances in low-resource dialogue domains. Previous studies focus on a two-stage pipeline. They first learn representations of utterances with multiple labels and then use a threshold-based strategy to identify multi-label results. However, these methods rely on representation classification and ignore instance relations, leading to error propagation. To solve the above issues, we ...

ID: 2510.07776v1 cs.CL, cs.LG

📄 Climate Knowledge in Large Language Models

2025-10-11

Authors:

Ivan Kuznetsov, Jacopo Grassi, Dmitrii Pantiukhin, Boris Shapkin, Thomas Jung, Nikolay Koldunov

Abstract:

Large language models (LLMs) are increasingly deployed for climate-related applications, where understanding internal climatological knowledge is crucial for reliability and misinformation risk assessment. Despite growing adoption, the capacity of LLMs to recall climate normals from parametric knowledge remains largely uncharacterized. We investigate the capacity of contemporary LLMs to recall climate normals without external retrieval, focusing on a prototypical query: mean July 2-m air tempera...

ID: 2510.08043v1 cs.CL, cs.LG, physics.ao-ph

📄 Investigating Counterclaims in Causality Extraction from Text

2025-10-11

Authors:

Tim Hagen, Niklas Deckers, Felix Wolter, Harrisen Scells, Martin Potthast

Abstract:

Research on causality extraction from text has so far almost entirely neglected counterclaims. Existing causality extraction datasets focus solely on "procausal" claims, i.e., statements that support a relationship. "Concausal" claims, i.e., statements that refute a relationship, are entirely ignored or even accidentally annotated as procausal. We address this shortcoming by developing a new dataset that integrates concausality. Based on an extensive literature review, we first show that concaus...

ID: 2510.08224v1 cs.CL, cs.LG

📄 On the Relationship Between the Choice of Representation and In-Context Learning

2025-10-11

Authors:

Ioana Marinescu, Kyunghyun Cho, Eric Karl Oermann

Abstract:

In-context learning (ICL) is the ability of a large language model (LLM) to learn a new task from a few demonstrations presented as part of the context. Past studies have attributed a large portion of the success of ICL to the way these in-context demonstrations are represented, particularly to how labels are represented in classification tasks. On the other hand, observations of the learning capacity of ICL (i.e., the extent to which more in-context demonstrations can lead to higher performance...

ID: 2510.08372v1 cs.CL, cs.LG

📄 TTOM: Test-Time Optimization and Memorization for Compositional Video Generation

2025-10-11

Authors:

Leigang Qu, Ziyang Wang, Na Zheng, Wenjie Wang, Liqiang Nie, Tat-Seng Chua

Abstract:

Video Foundation Models (VFMs) exhibit remarkable visual generation performance, but struggle in compositional scenarios (e.g., motion, numeracy, and spatial relation). In this work, we introduce Test-Time Optimization and Memorization (TTOM), a training-free framework that aligns VFM outputs with spatiotemporal layouts during inference for better text-image alignment. Rather than direct intervention to latents or attention per-sample in existing work, we integrate and optimize new parameters gu...

ID: 2510.07940v1 cs.CV, cs.AI, cs.CL, cs.LG, cs.MM

📄 VoiceAgentBench: Are Voice Assistants ready for agentic tasks?

2025-10-11

Authors:

Dhruv Jain, Harshit Shukla, Gautam Rajeev, Ashish Kulkarni, Chandra Khatri, Shubham Agarwal

Abstract:

Large-scale Speech Language Models (SpeechLMs) have enabled voice assistants capable of understanding natural spoken queries and performing complex tasks. However, existing speech benchmarks primarily focus on isolated capabilities such as transcription, or question-answering, and do not systematically evaluate agentic scenarios encompassing multilingual and cultural understanding, as well as adversarial robustness. To address this, we introduce VoiceAgentBench, a comprehensive benchmark designe...

ID: 2510.07978v1 cs.AI, cs.CL, cs.LG

📄 Beyond Pass@k: Breadth-Depth Metrics for Reasoning Boundaries

2025-10-11

Authors:

Marius Dragoi, Ioana Pintilie, Florin Gogianu, Florin Brad

Abstract:

Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful paradigm to improve Large Language Models on reasoning tasks such as coding, math or logic. To assess the reasoning boundary (the fraction of problems a model can solve) researchers often report Pass@k at large sampling budgets. Recent results reveal a crossover phenomenon: while RLVR models outperform the base model at small k values, the base model usually outperforms them when sampling a very large number of compl...

ID: 2510.08325v1 cs.AI, cs.CL, cs.LG, I.2.6; I.2.7
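
For readers unfamiliar with the metric named in this abstract, here is a minimal sketch of how Pass@k is commonly estimated from a fixed sampling budget. It assumes the widely used unbiased estimator 1 - C(n-c, k)/C(n, k), where n completions are sampled per problem and c of them are correct; the truncated abstract does not say which estimator the authors use, and the function names `pass_at_k` and `mean_pass_at_k` are illustrative only.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate Pass@k for one problem from n samples, c of them correct.

    Assumed estimator: 1 - C(n - c, k) / C(n, k), i.e. one minus the
    probability that k samples drawn without replacement are all wrong.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average Pass@k over problems; results holds (n, c) pairs per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

At large k the average approaches the fraction of problems with at least one correct sample, which is the sense in which Pass@k at large budgets is read as a reasoning boundary.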

📄 Looking to Learn: Token-wise Dynamic Gating for Low-Resource Vision-Language Modelling

2025-10-11

Authors:

Bianca-Mihaela Ganescu, Suchir Salhan, Andrew Caines, Paula Buttery

Abstract:

Training vision-language models on cognitively-plausible amounts of data requires rethinking how models integrate multimodal information. Within the constraints of the Vision track for the BabyLM Challenge 2025, we propose a lightweight decoder-based architecture with (1) token-wise dynamic gating for adaptive fusion of linguistic and visual cues, (2) feature modulation and channel attention to maximise the utility of limited visual information and (3) auxiliary contrastive objectives for visual...

ID: 2510.08470v1 cs.AI, cs.CL, cs.LG

📄 AutoMLGen: Navigating Fine-Grained Optimization for Coding Agents

2025-10-11

Authors:

Shangheng Du, Xiangchao Yan, Dengyang Jiang, Jiakang Yuan, Yusong Hu, Xin Li, Liang He, Bo Zhang, Lei Bai

Abstract:

Large language models (LLMs) have shown impressive performance in general programming tasks. However, in Machine Learning Engineering (MLE) scenarios such as AutoML and Kaggle competitions, achieving high performance depends heavily on expert intervention and repeated adjustments rather than simply generating correct code. When applied directly to these tasks, LLMs often lack fine-grained domain priors, and existing MLE approaches that use linear or tree-structured searches limit knowledge trans...

ID: 2510.08511v1 cs.AI, cs.CL, cs.LG

Showing 241 - 250 of 575 entries