Ethic-BERT: An Enhanced Deep Learning Model for Ethical and Non-Ethical Content Classification

2510.12850v1 cs.CY, cs.AI 2025-10-17

Авторы:

Mahamodul Hasan Mahadi, Md. Nasif Safwan, Souhardo Rahman, Shahnaj Parvin, Aminun Nahar, Kamruddin Nur

Abstract

Developing AI systems capable of nuanced ethical reasoning is critical as they increasingly influence human decisions, yet existing models often rely on superficial correlations rather than principled moral understanding. This paper introduces Ethic-BERT, a BERT-based model for ethical content classification across four domains: Commonsense, Justice, Virtue, and Deontology. Leveraging the ETHICS dataset, our approach integrates robust preprocessing to address vocabulary sparsity and contextual ambiguities, alongside advanced fine-tuning strategies like full model unfreezing, gradient accumulation, and adaptive learning rate scheduling. To evaluate robustness, we employ an adversarially filtered "Hard Test" split, isolating complex ethical dilemmas. Experimental results demonstrate Ethic-BERT's superiority over baseline models, achieving 82.32% average accuracy on the standard test, with notable improvements in Justice and Virtue. In addition, the proposed Ethic-BERT attains 15.28% average accuracy improvement in the HardTest. These findings contribute to performance improvement and reliable decision-making using bias-aware preprocessing and proposed enhanced AI model.

Ссылки и действия

Читать на arXiv Скачать PDF

Дополнительные ресурсы:

Ethic-BERT: An Enhanced Deep Learning Model for Ethical and Non-Ethical Content Classification

Авторы:

Abstract

Ссылки и действия

Связанные статьи

Humanity in the Age of AI: Reassessing 2025's Existential-Risk Narratives

When AI Takes the Couch: Psychometric Jailbreaks Reveal Internal Conflict in Fro...

Artificial Intelligence / Human Intelligence: Who Controls Whom?

First, do NOHARM: towards clinically safe large language models

AI-Driven Document Redaction in UK Public Authorities: Implementation Gaps, Regu...

Навигация