LASER: An LLM-based ASR Scoring and Evaluation Rubric
2510.07437v1
cs.CL, cs.AI, cs.LG, eess.AS
2025-10-11
Авторы:
Amruta Parulekar, Preethi Jyothi
Abstract
Standard ASR evaluation metrics like Word Error Rate (WER) tend to unfairly
penalize morphological and syntactic nuances that do not significantly alter
sentence semantics. We introduce an LLM-based scoring rubric LASER that
leverages state-of-the-art LLMs' in-context learning abilities to learn from
prompts with detailed examples. Hindi LASER scores using Gemini 2.5 Pro
achieved a very high correlation score of 94% with human annotations. Hindi
examples in the prompt were also effective in analyzing errors in other Indian
languages such as Marathi, Kannada and Malayalam. We also demonstrate how a
smaller LLM like Llama 3 can be finetuned on word-pair examples derived from
reference and ASR predictions to predict what kind of penalty should be applied
with close to 89% accuracy.