Group-Adaptive Adversarial Learning for Robust Fake News Detection Against Malicious Comments
2510.09712v1
cs.LG, cs.AI, cs.CL
2025-10-15
Authors:
Zhao Tong, Chunlin Gong, Yimeng Gu, Haichao Shi, Qiang Liu, Shu Wu, Xiao-Yu Zhang
Abstract
The spread of fake news online distorts public judgment and erodes trust in
social media platforms. Although recent fake news detection (FND) models
perform well in standard settings, they remain vulnerable to adversarial
comments, authored by real users or by large language models (LLMs), that subtly
shift model decisions. In view of this, we first present a comprehensive
evaluation of comment attacks on existing fake news detectors and then
introduce a group-adaptive adversarial training strategy to improve the
robustness of FND models. To be specific, our approach comprises three steps:
(1) dividing adversarial comments into three psychologically grounded
categories: perceptual, cognitive, and societal; (2) generating diverse,
category-specific attacks via LLMs to enhance adversarial training; and (3)
applying a Dirichlet-based adaptive sampling mechanism (InfoDirichlet Adjusting
Mechanism) that dynamically adjusts the learning focus across different comment
categories during training. Experiments on benchmark datasets show that our
method maintains strong detection accuracy while substantially increasing
robustness to a wide range of adversarial comment perturbations.
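
The abstract does not give the update rule behind the Dirichlet-based adaptive sampling in step (3), so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' InfoDirichlet Adjusting Mechanism. It assumes that each comment category has a Dirichlet concentration parameter, that categories with a larger share of the recent training loss receive a larger concentration, and that per-batch category counts are drawn from the sampled mixing weights. The function names, the `momentum`, `scale`, and `floor` parameters, and the loss values are all hypothetical.

```python
import random

# The three comment categories named in the abstract.
CATEGORIES = ("perceptual", "cognitive", "societal")


def update_concentration(alpha, losses, momentum=0.5, scale=3.0, floor=0.1):
    """Hypothetical rule: move each concentration toward the category's loss share.

    Categories that currently incur more loss get a larger concentration,
    so they tend to be sampled more often in later batches.
    """
    total = sum(losses)
    if total <= 0:
        return list(alpha)  # nothing to learn from; keep the current focus
    return [
        max(floor, (1 - momentum) * a + momentum * scale * (loss / total))
        for a, loss in zip(alpha, losses)
    ]


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_batch_counts(alpha, batch_size, rng):
    """Sample mixing weights, then split a batch across categories."""
    weights = sample_dirichlet(alpha, rng)
    picks = rng.choices(range(len(alpha)), weights=weights, k=batch_size)
    counts = [picks.count(i) for i in range(len(alpha))]
    return weights, counts


rng = random.Random(0)
alpha = [1.0, 1.0, 1.0]          # start with no preference among categories
losses = [0.2, 0.3, 1.5]         # made-up per-category adversarial losses
alpha = update_concentration(alpha, losses)
weights, counts = sample_batch_counts(alpha, 32, rng)
for name, w, c in zip(CATEGORIES, weights, counts):
    print(f"{name}: weight={w:.3f}, samples={c}")
```

Sampling the weights from a Dirichlet, rather than using the loss shares directly, keeps some randomness in the category mix, so a category with low current loss is still revisited occasionally.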