ExpressNet-MoE: A Hybrid Deep Neural Network for Emotion Recognition
2510.13493v1
cs.CV, cs.LG, I.2.10; I.5.2; H.4.2
2025-10-17
Авторы:
Deeptimaan Banerjee, Prateek Gothwal, Ashis Kumer Biswas
Abstract
In many domains, including online education, healthcare, security, and
human-computer interaction, facial emotion recognition (FER) is essential.
Real-world FER is still difficult despite its significance because of some
factors such as variable head positions, occlusions, illumination shifts, and
demographic diversity. Engagement detection, which is essential for
applications like virtual learning and customer services, is frequently
challenging due to FER limitations by many current models. In this article, we
propose ExpressNet-MoE, a novel hybrid deep learning model that blends both
Convolution Neural Networks (CNNs) and Mixture of Experts (MoE) framework, to
overcome the difficulties. Our model dynamically chooses the most pertinent
expert networks, thus it aids in the generalization and providing flexibility
to model across a wide variety of datasets. Our model improves on the accuracy
of emotion recognition by utilizing multi-scale feature extraction to collect
both global and local facial features. ExpressNet-MoE includes numerous
CNN-based feature extractors, a MoE module for adaptive feature selection, and
finally a residual network backbone for deep feature learning. To demonstrate
efficacy of our proposed model we evaluated on several datasets, and compared
with current state-of-the-art methods. Our model achieves accuracies of 74.77%
on AffectNet (v7), 72.55% on AffectNet (v8), 84.29% on RAF-DB, and 64.66% on
FER-2013. The results show how adaptive our model is and how it may be used to
develop end-to-end emotion recognition systems in practical settings.
Reproducible codes and results are made publicly accessible at
https://github.com/DeeptimaanB/ExpressNet-MoE.