Transformer-Based Low-Resource Language Translation: A Study on Standard Bengali to Sylheti

arXiv:2510.18898v1 [cs.CL, cs.CY] 2025-10-24

Authors:

Mangsura Kabir Oni, Tabia Tanzin Prama

Abstract

Machine Translation (MT) has advanced from rule-based and statistical methods to neural approaches based on the Transformer architecture. While these methods have achieved impressive results for high-resource languages, low-resource varieties such as Sylheti remain underexplored. In this work, we investigate Bengali-to-Sylheti translation by fine-tuning multilingual Transformer models and comparing them with zero-shot large language models (LLMs). Experimental results demonstrate that fine-tuned models significantly outperform LLMs, with mBART-50 achieving the highest translation adequacy and MarianMT showing the strongest character-level fidelity. These findings highlight the importance of task-specific adaptation for underrepresented languages and contribute to ongoing efforts toward inclusive language technologies.
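The abstract distinguishes translation adequacy from character-level fidelity but does not name the metrics behind either. As an illustration only, the sketch below computes a simplified character n-gram F-score in the spirit of chrF, which is one common way to measure character-level overlap between a system output and a reference. The function names, the maximum n-gram order of 3, and the recall weight `beta = 2` are assumptions made for this example, not settings reported by the authors.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams of length n, ignoring whitespace."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def char_fscore(hypothesis: str, reference: str,
                max_n: int = 3, beta: float = 2.0) -> float:
    """Simplified chrF-style score in [0, 1] (illustrative, not the paper's metric).

    Precision and recall are averaged over n-gram orders 1..max_n, then
    combined into an F-beta score that weights recall beta times as much
    as precision.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            # Skip orders longer than either string.
            continue
        # Counter intersection keeps the minimum count of each shared n-gram.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

Because the score works on characters rather than whole words, a hypothesis that differs from the reference by a single suffix still receives partial credit, which is why metrics of this kind are often preferred for morphologically rich or orthographically variable languages. For reported results, an established implementation should be used instead of this sketch.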
