📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 0

Последнее обновление: сегодня

📄 Machine Unlearning Meets Adversarial Robustness via Constrained Interventions on LLMs

2025-10-08

Авторы:

Fatmazohra Rezkellah, Ramzi Dakhmouche

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

With the increasing adoption of Large Language Models (LLMs), more customization is needed to ensure privacy-preserving and safe generation. We address this objective from two critical aspects: unlearning of sensitive information and robustness to jail-breaking attacks. We investigate various constrained optimization formulations that address both aspects in a \emph{unified manner}, by finding the smallest possible interventions on LLM weights that either make a given vocabulary set unreachable ...

ID: 2510.03567v1 cs.LG, cs.CL, cs.CR, cs.CY, math.OC

arXiv PDF