📄 When to Reason: Semantic Router for vLLM
2025-10-14
Authors:
Chen Wang, Xunzhuo Liu, Yuhan Liu, Yue Zhu, Xiangxi Mo, Junchen Jiang, Huamin Chen
Abstract:
Large Language Models (LLMs) demonstrate substantial accuracy gains when
augmented with reasoning modes such as chain-of-thought and inference-time
scaling. However, reasoning also incurs significant costs in inference latency
and token usage, with environmental and financial impacts, which are
unnecessary for many simple prompts. We present a semantic router that
classifies queries based on their reasoning requirements and selectively
applies reasoning only when beneficial. Our approach achieve...
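
The routing idea described in the abstract (classify each query by whether it needs reasoning, then enable reasoning only for those that do) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the cue list, the `needs_reasoning` and `route` helpers, and the `RoutingDecision` type are hypothetical, and a plain keyword check stands in for whatever classifier the authors actually use.

```python
from dataclasses import dataclass

# Hypothetical cue phrases. This keyword check is only a stand-in for a
# real classifier; it is not the method described in the paper.
REASONING_CUES = ("prove", "derive", "step by step", "calculate", "why", "debug")


@dataclass
class RoutingDecision:
    prompt: str
    use_reasoning: bool  # True -> send the prompt with reasoning mode enabled


def needs_reasoning(prompt: str) -> bool:
    """Return True if the prompt contains any of the cue phrases."""
    text = prompt.lower()
    return any(cue in text for cue in REASONING_CUES)


def route(prompt: str) -> RoutingDecision:
    """Decide per prompt whether reasoning mode should be applied."""
    return RoutingDecision(prompt=prompt, use_reasoning=needs_reasoning(prompt))


if __name__ == "__main__":
    for p in ("What is the capital of France?",
              "Prove that the sum of two even numbers is even."):
        decision = route(p)
        print(f"{decision.use_reasoning!s:5}  {decision.prompt}")
```

In a real deployment the boolean would control how the request is forwarded to the model server; that step is left out here because the abstract does not specify it.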