📊 Статистика дайджестов
Всего дайджестов: 34022 Добавлено сегодня: 0
Последнее обновление: сегодня
Авторы:
Hailay Kidu Teklehaymanot, Wolfgang Nejdl
Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']
Annotation:
Tokenization disparities pose a significant barrier to achieving equitable
access to artificial intelligence across linguistically diverse populations.
This study conducts a large-scale cross-linguistic evaluation of tokenization
efficiency in over 200 languages to systematically quantify computational
inequities in large language models (LLMs). Using a standardized experimental
framework, we applied consistent preprocessing and normalization protocols,
followed by uniform tokenization through t...