Extended LSTM: Adaptive Feature Gating for Toxic Comment Classification
2510.17018v1
cs.CL, cs.LG
2025-10-22
Авторы:
Noor Islam S. Mohammad
Abstract
Toxic comment detection remains a challenging task, where transformer-based
models (e.g., BERT) incur high computational costs and degrade on minority
toxicity classes, while classical ensembles lack semantic adaptability. We
propose xLSTM, a parameter-efficient and theoretically grounded framework that
unifies cosine-similarity gating, adaptive feature prioritization, and
principled class rebalancing. A learnable reference vector {v} in {R}^d
modulates contextual embeddings via cosine similarity, amplifying toxic cues
and attenuating benign signals to yield stronger gradients under severe class
imbalance. xLSTM integrates multi-source embeddings (GloVe, FastText, BERT CLS)
through a projection layer, a character-level BiLSTM for morphological cues,
embedding-space SMOTE for minority augmentation, and adaptive focal loss with
dynamic class weighting. On the Jigsaw Toxic Comment benchmark, xLSTM attains
96.0% accuracy and 0.88 macro-F1, outperforming BERT by 33% on threat and 28%
on identity_hate categories, with 15 times fewer parameters and 50ms inference
latency. Cosine gating contributes a +4.8% F1 gain in ablations. The results
establish a new efficiency adaptability frontier, demonstrating that
lightweight, theoretically informed architectures can surpass large pretrained
models on imbalanced, domain-specific NLP tasks.
Ссылки и действия
Дополнительные ресурсы: