Generalist++: A Meta-learning Framework for Mitigating Trade-off in Adversarial Training

2510.13361v1 cs.LG, cs.AI, cs.CR 2025-10-17

Authors:

Yisen Wang, Yichuan Mo, Hongjun Wang, Junyi Li, Zhouchen Lin

Abstract

Despite the rapid progress of neural networks, they remain highly vulnerable to adversarial examples, for which adversarial training (AT) is currently the most effective defense. While AT has been extensively studied, its practical applications expose two major limitations: natural accuracy tends to degrade significantly compared with standard training, and robustness does not transfer well across attacks crafted under different norm constraints. Unlike prior works that attempt to address only one issue within a single network, we propose to partition the overall generalization goal into multiple sub-tasks, each assigned to a dedicated base learner. By specializing in its designated objective, each base learner quickly becomes an expert in its field. In the later stages of training, we interpolate their parameters to form a knowledgeable global learner, while periodically redistributing the global parameters back to the base learners to prevent their optimization trajectories from drifting too far from the shared target. We term this framework Generalist and introduce three variants tailored to different application scenarios. Both theoretical analysis and extensive experiments demonstrate that Generalist achieves lower generalization error and significantly alleviates the trade-off problems compared with baseline methods. Our results suggest that Generalist provides a promising step toward developing fully robust classifiers in the future.
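The interpolate-then-redistribute cycle described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact mixing rule, so the fixed-weight average below is an assumption, and the names `interpolate`, `redistribute`, and the plain-dict parameter format are hypothetical and not taken from the paper.

```python
from typing import Dict, List

# Hypothetical parameter format: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def interpolate(base_params: List[Params], weights: List[float]) -> Params:
    """Form global parameters as a weighted average of the base learners.

    Assumption: a fixed-weight average; the paper's actual mixing rule
    may differ.
    """
    if not base_params or len(base_params) != len(weights):
        raise ValueError("need one weight per base learner")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    merged: Params = {}
    for name, reference in base_params[0].items():
        merged[name] = [
            sum(w * p[name][i] for w, p in zip(weights, base_params)) / total
            for i in range(len(reference))
        ]
    return merged


def redistribute(global_params: Params, n_learners: int) -> List[Params]:
    """Give every base learner its own independent copy of the global parameters."""
    return [
        {name: list(values) for name, values in global_params.items()}
        for _ in range(n_learners)
    ]


# Two base learners, e.g. one specialised on natural accuracy and one on robustness.
natural = {"w": [1.0, 3.0]}
robust = {"w": [3.0, 5.0]}
global_params = interpolate([natural, robust], [0.5, 0.5])
natural, robust = redistribute(global_params, 2)
```

In a full training loop, each base learner would take several optimisation steps on its own sub-task between calls to `interpolate`, and `redistribute` would run only periodically so that the learners can specialise before being pulled back toward the shared target.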

