Efficient Toxicity Detection in Gaming Chats: A Comparative Study of Embeddings, Fine-Tuned Transformers and LLMs

2510.17924v1 cs.CL, cs.AI, I.2.7 2025-10-23

Authors:

Yehor Tereshchenko, Mika Hämäläinen

Abstract

This paper presents a comprehensive comparative analysis of Natural Language Processing (NLP) methods for automated toxicity detection in online gaming chats. Traditional machine learning models with embeddings, large language models (LLMs) with zero-shot and few-shot prompting, fine-tuned transformer models, and retrieval-augmented generation (RAG) approaches are evaluated. The evaluation framework assesses three critical dimensions: classification accuracy, processing speed, and computational costs. A hybrid moderation system architecture is proposed that optimizes human moderator workload through automated detection and incorporates continuous learning mechanisms. The experimental results demonstrate significant performance variations across methods, with fine-tuned DistilBERT achieving optimal accuracy-cost trade-offs. The findings provide empirical evidence for deploying cost-effective, efficient content moderation systems in dynamic online gaming environments.
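The hybrid moderation idea mentioned in the abstract, where automated detection reduces the load on human moderators, can be illustrated with a minimal routing sketch. This is not the authors' architecture: the thresholds `low` and `high`, the `Decision` type, and the `toy_score` function are all hypothetical, and in practice the score would come from a trained classifier such as a fine-tuned DistilBERT model. Confident predictions are handled automatically and only the uncertain middle band is sent to a human.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Decision:
    message: str
    score: float   # estimated toxicity in [0, 1]
    action: str    # "allow", "block", or "review"


def route(message: str,
          score_fn: Callable[[str], float],
          low: float = 0.2,
          high: float = 0.8) -> Decision:
    """Route one chat message using two hypothetical confidence thresholds."""
    score = score_fn(message)
    if score >= high:
        action = "block"      # confidently toxic: act automatically
    elif score <= low:
        action = "allow"      # confidently benign: no human needed
    else:
        action = "review"     # uncertain: send to a human moderator
    return Decision(message, score, action)


def review_fraction(decisions: List[Decision]) -> float:
    """Share of messages that still require a human moderator."""
    if not decisions:
        return 0.0
    return sum(d.action == "review" for d in decisions) / len(decisions)


def toy_score(message: str) -> float:
    """Hypothetical stand-in for a real classifier:
    the fraction of tokens that appear on a tiny blocklist."""
    blocklist = {"idiot", "trash"}
    tokens = [t.strip(".,!?") for t in message.lower().split()]
    if not tokens:
        return 0.0
    return sum(t in blocklist for t in tokens) / len(tokens)


if __name__ == "__main__":
    chat = ["gg well played", "idiot trash", "you idiot ok"]
    decisions = [route(m, toy_score) for m in chat]
    for d in decisions:
        print(f"{d.action:7s} {d.score:.2f} {d.message}")
    print("human review share:", round(review_fraction(decisions), 2))
```

Widening the gap between `low` and `high` sends more messages to human review, while narrowing it automates more decisions at the risk of more errors. Decisions made by moderators on the "review" band could also be logged as labeled data for later retraining, which is one plausible reading of the continuous learning mechanism the abstract refers to.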
