Authors:
Jiexi Xu
Abstract:
The performance of Large Language Models (LLMs) depends heavily on the chosen
prompting strategy, yet static approaches such as Zero-Shot, Few-Shot, or
Chain-of-Thought (CoT) impose a rigid efficiency-accuracy trade-off. Highly
accurate strategies like Self-Consistency (SC) incur substantial computational
waste on simple tasks, while lightweight methods often fail on complex inputs.
This paper introduces the Prompt Policy Network (PPN), a lightweight
reinforcement learning framework that formali...