📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Agentic Entropy-Balanced Policy Optimization

2025-10-18

Авторы:

Guanting Dong, Licheng Bao, Zhongyuan Wang, Kangzhi Zhao, Xiaoxi Li, Jiajie Jin, Jinghan Yang, Hangyu Mao, Fuzheng Zhang, Kun Gai, Guorui Zhou, Yutao Zhu, Ji-Rong Wen, Zhicheng Dou

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Recently, Agentic Reinforcement Learning (Agentic RL) has made significant progress in incentivizing the multi-turn, long-horizon tool-use capabilities of web agents. While mainstream agentic RL algorithms autonomously explore high-uncertainty tool-call steps under the guidance of entropy, excessive reliance on entropy signals can impose further constraints, leading to the training collapse. In this paper, we delve into the challenges caused by entropy and propose the Agentic Entropy-Balanced Po...

ID: 2510.14545v1 cs.LG, cs.AI, cs.CL, cs.IR

arXiv PDF

📄 Reasoning with Sampling: Your Base Model is Smarter Than You Think

2025-10-18

Авторы:

Aayush Karan, Yilun Du

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Frontier reasoning models have exhibited incredible capabilities across a wide array of disciplines, driven by posttraining large language models (LLMs) with reinforcement learning (RL). However, despite the widespread success of this paradigm, much of the literature has been devoted to disentangling truly novel behaviors that emerge during RL but are not present in the base models. In our work, we approach this question from a different angle, instead asking whether comparable reasoning capabil...

ID: 2510.14901v1 cs.LG, cs.AI, cs.CL

arXiv PDF

📄 Circuit Insights: Towards Interpretability Beyond Activations

2025-10-18

Авторы:

Elena Golimblevskaia, Aakriti Jain, Bruno Puri, Ammar Ibrahim, Wojciech Samek, Sebastian Lapuschkin

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The fields of explainable AI and mechanistic interpretability aim to uncover the internal structure of neural networks, with circuit discovery as a central tool for understanding model computations. Existing approaches, however, rely on manual inspection and remain limited to toy tasks. Automated interpretability offers scalability by analyzing isolated features and their activations, but it often misses interactions between features and depends strongly on external LLMs and dataset quality. Tra...

ID: 2510.14936v1 cs.LG, cs.AI, cs.CL

arXiv PDF

📄 Max It or Miss It: Benchmarking LLM On Solving Extremal Problems

2025-10-17

Авторы:

Binxin Gao, Jingjun Han

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Test-time scaling has enabled Large Language Models (LLMs) with remarkable reasoning capabilities, particularly in mathematical domains, through intermediate chain-of-thought (CoT) reasoning before generating final answers. However, the specific sources and mechanisms underlying these reasoning capabilities remain insufficiently understood. Optimization reasoning, i.e. finding extrema under constraints, represents a fundamental abstraction that underpins critical applications in planning, contro...

ID: 2510.12997v1 cs.LG, cs.AI, cs.CL

arXiv PDF

📄 On the Reasoning Abilities of Masked Diffusion Language Models

2025-10-17

Авторы:

Anej Svete, Ashish Sabharwal

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Masked diffusion models (MDMs) for text offer a compelling alternative to traditional autoregressive language models. Parallel generation makes them efficient, but their computational capabilities and the limitations inherent to their parallelism remain largely unexplored. To this end, we characterize what types of reasoning problems MDMs can provably solve and how efficiently. We do this by connecting MDMs to the well-understood reasoning frameworks of chain of thought (CoT) and padded looped t...

ID: 2510.13117v1 cs.LG, cs.AI, cs.CL

arXiv PDF

📄 K-Merge: Online Continual Merging of Adapters for On-device Large Language Models

2025-10-17

Авторы:

Donald Shenaj, Ondrej Bohdal, Taha Ceritli, Mete Ozay, Pietro Zanuttigh, Umberto Michieli

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

On-device deployment of Large Language Models (LLMs) frequently leverages Low-Rank Adapters (LoRAs) to support diverse downstream tasks under tight resource constraints. To address the limited storage capacity of mobile devices, recent works have explored model merging techniques to fuse multiple LoRAs into a single one. In practice, however, LoRAs are often delivered incrementally, as users request support for new tasks (e.g., novel problem types or languages). This scenario introduces a new ch...

ID: 2510.13537v1 cs.LG, cs.AI, cs.CL

arXiv PDF

📄 Demystifying Hybrid Thinking: Can LLMs Truly Switch Between Think and No-Think?

2025-10-16

Авторы:

Shouren Wang, Wang Yang, Xianxuan Long, Qifan Wang, Vipin Chaudhary, Xiaotian Han

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Hybrid thinking enables LLMs to switch between reasoning and direct answering, offering a balance between efficiency and reasoning capability. Yet our experiments reveal that current hybrid thinking LLMs only achieve partial mode separation: reasoning behaviors often leak into the no-think mode. To understand and mitigate this, we analyze the factors influencing controllability and identify four that matter most: (1) larger data scale, (2) using think and no-think answers from different question...

ID: 2510.12680v1 cs.LG, cs.AI, cs.CL

arXiv PDF

📄 Group-Adaptive Adversarial Learning for Robust Fake News Detection Against Malicious Comments

2025-10-15

Авторы:

Zhao Tong, Chunlin Gong, Yimeng Gu, Haichao Shi, Qiang Liu, Shu Wu, Xiao-Yu Zhang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The spread of fake news online distorts public judgment and erodes trust in social media platforms. Although recent fake news detection (FND) models perform well in standard settings, they remain vulnerable to adversarial comments-authored by real users or by large language models (LLMs)-that subtly shift model decisions. In view of this, we first present a comprehensive evaluation of comment attacks to existing fake news detectors and then introduce a group-adaptive adversarial training strateg...

ID: 2510.09712v1 cs.LG, cs.AI, cs.CL

arXiv PDF

📄 It's 2025 -- Narrative Learning is the new baseline to beat for explainable machine learning

2025-10-15

Авторы:

Gregory D. Baker

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

In this paper, we introduce Narrative Learning, a methodology where models are defined entirely in natural language and iteratively refine their classification criteria using explanatory prompts rather than traditional numerical optimisation. We report on experiments to evaluate the accuracy and potential of this approach using 3 synthetic and 3 natural datasets and compare them against 7 baseline explainable machine learning models. We demonstrate that on 5 out of 6 of these datasets, Narrative...

ID: 2510.09723v1 cs.LG, cs.AI, cs.CL, 68T05 (Primary), 68T50, I.2.6; I.2.7; I.5.4

arXiv PDF

📄 Machine learning methods fail to provide cohesive atheoretical construction of personality traits from semantic embeddings

2025-10-15

Авторы:

Ayoub Bouguettaya, Elizabeth M. Stuart

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The lexical hypothesis posits that personality traits are encoded in language and is foundational to models like the Big Five. We created a bottom-up personality model from a classic adjective list using machine learning and compared its descriptive utility against the Big Five by analyzing one million Reddit comments. The Big Five, particularly Agreeableness, Conscientiousness, and Neuroticism, provided a far more powerful and interpretable description of these online communities. In contrast, ...

ID: 2510.09739v1 cs.LG, cs.AI, cs.CL, cs.HC

arXiv PDF

Показано 101 - 110 из 278 записей