📄 The Lossy Horizon: Error-Bounded Predictive Coding for Lossy Text Compression (Episode I)
2025-10-29
Authors:
Nnamdi Aghanya, Jun Li, Kewei Wang
Abstract:
Large Language Models (LLMs) can achieve near-optimal lossless compression by
acting as powerful probability models. We investigate their use in the lossy
domain, where reconstruction fidelity is traded for higher compression ratios.
This paper introduces Error-Bounded Predictive Coding (EPC), a lossy text codec
that leverages a Masked Language Model (MLM) as a decompressor. Instead of
storing a subset of original tokens, EPC allows the model to predict masked
content and stores minimal, rank-ba...