Enhancing Security in Deep Reinforcement Learning: A Comprehensive Survey on Adversarial Attacks and Defenses
2510.20314v1
cs.CR, cs.AI, cs.LG
2025-10-25
Авторы:
Wu Yichao, Wang Yirui, Ding Panpan, Wang Hailong, Zhu Bingqian, Liu Chun
Abstract
With the wide application of deep reinforcement learning (DRL) techniques in
complex fields such as autonomous driving, intelligent manufacturing, and smart
healthcare, how to improve its security and robustness in dynamic and
changeable environments has become a core issue in current research. Especially
in the face of adversarial attacks, DRL may suffer serious performance
degradation or even make potentially dangerous decisions, so it is crucial to
ensure their stability in security-sensitive scenarios. In this paper, we first
introduce the basic framework of DRL and analyze the main security challenges
faced in complex and changing environments. In addition, this paper proposes an
adversarial attack classification framework based on perturbation type and
attack target and reviews the mainstream adversarial attack methods against DRL
in detail, including various attack methods such as perturbation state space,
action space, reward function and model space. To effectively counter the
attacks, this paper systematically summarizes various current robustness
training strategies, including adversarial training, competitive training,
robust learning, adversarial detection, defense distillation and other related
defense techniques, we also discuss the advantages and shortcomings of these
methods in improving the robustness of DRL. Finally, this paper looks into the
future research direction of DRL in adversarial environments, emphasizing the
research needs in terms of improving generalization, reducing computational
complexity, and enhancing scalability and explainability, aiming to provide
valuable references and directions for researchers.
Ссылки и действия
Дополнительные ресурсы: