📊 Статистика дайджестов

Всего дайджестов: 34123 Добавлено сегодня: 101

Последнее обновление: сегодня

📄 SSR: Socratic Self-Refine for Large Language Model Reasoning

2025-11-15

Авторы:

Haizhou Shi, Ye Liu, Bo Pang, Zeyu Leo Liu, Hao Wang, Silvio Savarese, Caiming Xiong, Yingbo Zhou, Semih Yavuz

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large Language Models (LLMs) have demonstrated remarkable reasoning abilities, yet existing test-time frameworks often rely on coarse self-verification and self-correction, limiting their effectiveness on complex tasks. In this paper, we propose Socratic Self-Refine (SSR), a novel framework for fine-grained evaluation and precise refinement of LLM reasoning. Our proposed SSR decomposes model responses into verifiable (sub-question, sub-answer) pairs, enabling step-level confidence estimation thr...

ID: 2511.10621v1 cs.CL, cs.AI, cs.LG

arXiv PDF

📄 Instella: Fully Open Language Models with Stellar Performance

2025-11-15

Авторы:

Jiang Liu, Jialian Wu, Xiaodong Yu, Yusheng Su, Prakamya Mishra, Gowtham Ramesh, Sudhanshu Ranjan, Chaitanya Manem, Ximeng Sun, Ze Wang, Pratik Prabhanjan Brahma, Zicheng Liu, Emad Barsoum

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large language models (LLMs) have demonstrated remarkable performance across a wide range of tasks, yet the majority of high-performing models remain closed-source or partially open, limiting transparency and reproducibility. In this work, we introduce Instella, a family of fully open three billion parameter language models trained entirely on openly available data and codebase. Powered by AMD Instinct MI300X GPUs, Instella is developed through large-scale pre-training, general-purpose instructi...

ID: 2511.10628v1 cs.CL, cs.AI, cs.LG

arXiv PDF

📄 Black-Box On-Policy Distillation of Large Language Models

2025-11-15

Авторы:

Tianzhu Ye, Li Dong, Zewen Chi, Xun Wu, Shaohan Huang, Furu Wei

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Black-box distillation creates student large language models (LLMs) by learning from a proprietary teacher model's text outputs alone, without access to its internal logits or parameters. In this work, we introduce Generative Adversarial Distillation (GAD), which enables on-policy and black-box distillation. GAD frames the student LLM as a generator and trains a discriminator to distinguish its responses from the teacher LLM's, creating a minimax game. The discriminator acts as an on-policy rewa...

ID: 2511.10643v1 cs.CL, cs.AI

arXiv PDF

📄 Minimal and Mechanistic Conditions for Behavioral Self-Awareness in LLMs

2025-11-11

Авторы:

Matthew Bozoukov, Matthew Nguyen, Shubkarman Singh, Bart Bussmann, Patrick Leask

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Recent studies have revealed that LLMs can exhibit behavioral self-awareness: the ability to accurately describe or predict their own learned behaviors without explicit supervision. This capability raises safety concerns as it may, for example, allow models to better conceal their true abilities during evaluation. We attempt to characterize the minimal conditions under which such self-awareness emerges, and the mechanistic processes through which it manifests. Through controlled finetuning exper...

ID: 2511.04875v1 cs.CL, cs.AI, cs.LG

arXiv PDF

📄 First is Not Really Better Than Last: Evaluating Layer Choice and Aggregation Strategies in Language Model Data Influence Estimation

2025-11-11

Авторы:

Dmytro Vitel, Anshuman Chhabra

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Identifying how training samples influence/impact Large Language Model (LLM) decision-making is essential for effectively interpreting model decisions and auditing large-scale datasets. Current training sample influence estimation methods (also known as influence functions) undertake this goal by utilizing information flow through the model via its first-order and higher-order gradient terms. However, owing to the large model sizes of today consisting of billions of parameters, these influence c...

ID: 2511.04715v1 cs.CL, cs.AI, cs.LG

arXiv PDF

📄 Learning to reason about rare diseases through retrieval-augmented agents

2025-11-11

Авторы:

Ha Young Kim, Jun Li, Ana Beatriz Solana, Carolin M. Pirkl, Benedikt Wiestler, Julia A. Schnabel, Cosmin I. Bercea

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Rare diseases represent the long tail of medical imaging, where AI models often fail due to the scarcity of representative training data. In clinical workflows, radiologists frequently consult case reports and literature when confronted with unfamiliar findings. Following this line of reasoning, we introduce RADAR, Retrieval Augmented Diagnostic Reasoning Agents, an agentic system for rare disease detection in brain MRI. Our approach uses AI agents with access to external medical knowledge by em...

ID: 2511.04720v1 cs.CL, cs.AI

arXiv PDF

📄 Minimal and Mechanistic Conditions for Behavioral Self-Awareness in LLMs

2025-11-11

Авторы:

Matthew Bozoukov, Matthew Nguyen, Shubkarman Singh, Bart Bussmann, Patrick Leask

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

ID: 2511.04875v2 cs.CL, cs.AI, cs.LG

arXiv PDF

📄 BudgetMem: Learning Selective Memory Policies for Cost-Efficient Long-Context Processing in Language Models

2025-11-11

Авторы:

Chandra Vamsi Krishna Alla, Harish Naidu Gaddam, Manohar Kommi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large Language Models (LLMs) face significant computational and memory constraints when processing long contexts, despite growing demand for applications requiring reasoning over extensive documents, multi-session dialogues, and book length texts. While recent advances have extended context windows to 100K-1M tokens, such approaches incur prohibitive costs for resource constrained deployments. We propose BudgetMem, a novel memory augmented architecture that learns what to remember rather than re...

ID: 2511.04919v1 cs.CL, cs.AI, I.2.7; I.2.6; H.3.3

arXiv PDF

📄 Too Good to be Bad: On the Failure of LLMs to Role-Play Villains

2025-11-11

Авторы:

Zihao Yi, Qingxuan Jiang, Ruotian Ma, Xingyu Chen, Qu Yang, Mengru Wang, Fanghua Ye, Ying Shen, Zhaopeng Tu, Xiaolong Li, Linus

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large Language Models (LLMs) are increasingly tasked with creative generation, including the simulation of fictional characters. However, their ability to portray non-prosocial, antagonistic personas remains largely unexamined. We hypothesize that the safety alignment of modern LLMs creates a fundamental conflict with the task of authentically role-playing morally ambiguous or villainous characters. To investigate this, we introduce the Moral RolePlay benchmark, a new dataset featuring a four-le...

ID: 2511.04962v1 cs.CL, cs.AI

arXiv PDF

📄 Pluralistic Behavior Suite: Stress-Testing Multi-Turn Adherence to Custom Behavioral Policies

2025-11-11

Авторы:

Prasoon Varshney, Makesh Narsimhan Sreedhar, Liwei Jiang, Traian Rebedea, Christopher Parisien

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large language models (LLMs) are typically aligned to a universal set of safety and usage principles intended for broad public acceptability. Yet, real-world applications of LLMs often take place within organizational ecosystems shaped by distinctive corporate policies, regulatory requirements, use cases, brand guidelines, and ethical commitments. This reality highlights the need for rigorous and comprehensive evaluation of LLMs with pluralistic alignment goals, an alignment paradigm that emphas...

ID: 2511.05018v1 cs.CL, cs.AI, cs.LG

arXiv PDF

Показано 331 - 340 из 2054 записей