FLToP CTC: Frame-Level Token Pruning via Relative Threshold for Efficient and Memory-Saving Decoding on Diverse Platforms

arXiv:2510.09085v1 | cs.LG, cs.SD, eess.AS | 2025-10-14
Authors:

Atul Shree, Harshith Jupuru

Abstract

CTC-based ASR systems face computational and memory bottlenecks in resource-limited environments. Traditional CTC decoders, which can account for up to 90% of processing time in some systems (e.g., wav2vec2-large on L4 GPUs), are inefficient because they perform exhaustive token-level operations. This paper introduces Frame Level Token Pruning for Connectionist Temporal Classification (FLToP CTC), a novel decoding algorithm that prunes tokens at the frame level, guided by a relative probability threshold. By dynamically eliminating low-probability tokens in each frame, FLToP CTC reduces compute and memory demands with negligible WER degradation. On LibriSpeech, FLToP CTC achieves a 10.5x runtime speedup and a 2.78x memory reduction versus standard CTC decoders. Its simplicity enables seamless integration into CTC decoders across platforms (CPUs, GPUs, etc.). FLToP CTC thus addresses CTC decoding bottlenecks, offering scalability for resource-limited environments and real-time applications and making speech recognition more accessible and efficient.
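The abstract does not spell out the exact pruning rule, so the following is only a minimal sketch of what frame-level pruning with a relative threshold might look like. It assumes that, in each frame, a token is kept when its probability is at least a fixed fraction of that frame's highest token probability. The names `prune_frame`, `prune_utterance`, and `rel_threshold` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def prune_frame(probs: Sequence[float], rel_threshold: float) -> List[int]:
    """Return indices of the tokens kept for one frame.

    Assumed rule (not taken from the paper): keep token i when
    probs[i] >= rel_threshold * max(probs).
    """
    if not 0.0 <= rel_threshold <= 1.0:
        raise ValueError("rel_threshold must be in [0, 1]")
    cutoff = rel_threshold * max(probs)
    return [i for i, p in enumerate(probs) if p >= cutoff]


def prune_utterance(
    frames: Sequence[Sequence[float]], rel_threshold: float
) -> List[List[int]]:
    """Apply the per-frame rule independently to every frame."""
    return [prune_frame(frame, rel_threshold) for frame in frames]


# Toy posteriors over a 4-token vocabulary for 3 frames (made-up numbers).
frames = [
    [0.90, 0.05, 0.03, 0.02],  # confident frame: only the top token survives
    [0.45, 0.35, 0.15, 0.05],  # ambiguous frame: two candidates survive
    [0.25, 0.25, 0.25, 0.25],  # flat frame: nothing can be pruned
]
kept = prune_utterance(frames, rel_threshold=0.5)
print(kept)  # [[0], [0, 1], [0, 1, 2, 3]]
```

Under this assumed rule the top token of every frame always survives, and a downstream decoder would only need to expand hypotheses with the surviving tokens, which is where the compute and memory savings would come from.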

