📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Majority of the Bests: Improving Best-of-N via Bootstrapping

2025-11-26

Авторы:

Amin Rakhsha, Kanika Madan, Tianyu Zhang, Amir-massoud Farahmand, Amir Khasahmadi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Sampling multiple outputs from a Large Language Model (LLM) and selecting the most frequent (Self-consistency) or highest-scoring (Best-of-N) candidate is a popular approach to achieve higher accuracy in tasks with discrete final answers. Best-of-N (BoN) selects the output with the highest reward, and with perfect rewards, it often achieves near-perfect accuracy. With imperfect rewards from reward models, however, BoN fails to reliably find the correct answer and its performance degrades drastic...

ID: 2511.18630v1 cs.LG, cs.AI, cs.CL, stat.ME, stat.ML

arXiv PDF