📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Superpixel Attack: Enhancing Black-box Adversarial Attack with Image-driven Division Areas

2025-12-04

Авторы:

Issa Oe, Keiichiro Yamamura, Hiroki Ishikura, Ryo Hamahira, Katsuki Fujisawa

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Deep learning models are used in safety-critical tasks such as automated driving and face recognition. However, small perturbations in the model input can significantly change the predictions. Adversarial attacks are used to identify small perturbations that can lead to misclassifications. More powerful black-box adversarial attacks are required to develop more effective defenses. A promising approach to black-box adversarial attacks is to repeat the process of extracting a specific image area a...

ID: 2512.02062v1 cs.CR, cs.AI, cs.CV

arXiv PDF

📄 GEO-Detective: Unveiling Location Privacy Risks in Images with LLM Agents

2025-12-02

Авторы:

Xinyu Zhang, Yixin Wu, Boyang Zhang, Chenhao Lin, Chao Shen, Michael Backes, Yang Zhang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Images shared on social media often expose geographic cues. While early geolocation methods required expert effort and lacked generalization, the rise of Large Vision Language Models (LVLMs) now enables accurate geolocation even for ordinary users. However, existing approaches are not optimized for this task. To explore the full potential and associated privacy risks, we present Geo-Detective, an agent that mimics human reasoning and tool use for image geolocation inference. It follows a procedu...

ID: 2511.22441v1 cs.CR, cs.AI, cs.CV, cs.LG

arXiv PDF

📄 SPQR: A Standardized Benchmark for Modern Safety Alignment Methods in Text-to-Image Diffusion Models

2025-11-26

Авторы:

Mohammed Talha Alam, Nada Saadi, Fahad Shamshad, Nils Lukas, Karthik Nandakumar, Fahkri Karray, Samuele Poppi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Text-to-image diffusion models can emit copyrighted, unsafe, or private content. Safety alignment aims to suppress specific concepts, yet evaluations seldom test whether safety persists under benign downstream fine-tuning routinely applied after deployment (e.g., LoRA personalization, style/domain adapters). We study the stability of current safety methods under benign fine-tuning and observe frequent breakdowns. As true safety alignment must withstand even benign post-deployment adaptations, we...

ID: 2511.19558v1 cs.CR, cs.AI, cs.CV, cs.LG

arXiv PDF

📄 BackWeak: Backdooring Knowledge Distillation Simply with Weak Triggers and Fine-tuning

2025-11-18

Авторы:

Shanmin Wang, Dongdong Zhao

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Knowledge Distillation (KD) is essential for compressing large models, yet relying on pre-trained "teacher" models downloaded from third-party repositories introduces serious security risks -- most notably backdoor attacks. Existing KD backdoor methods are typically complex and computationally intensive: they employ surrogate student models and simulated distillation to guarantee transferability, and they construct triggers in a way similar to universal adversarial perturbations (UAPs), which be...

ID: 2511.12046v1 cs.CR, cs.AI, cs.CV, cs.LG

arXiv PDF

📄 AttackVLA: Benchmarking Adversarial and Backdoor Attacks on Vision-Language-Action Models

2025-11-18

Авторы:

Jiayu Li, Yunhan Zhao, Xiang Zheng, Zonghuan Xu, Yige Li, Xingjun Ma, Yu-Gang Jiang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Vision-Language-Action (VLA) models enable robots to interpret natural-language instructions and perform diverse tasks, yet their integration of perception, language, and control introduces new safety vulnerabilities. Despite growing interest in attacking such models, the effectiveness of existing techniques remains unclear due to the absence of a unified evaluation framework. One major issue is that differences in action tokenizers across VLA architectures hinder reproducibility and fair compar...

ID: 2511.12149v1 cs.CR, cs.AI, cs.CV

arXiv PDF

📄 T2I-RiskyPrompt: A Benchmark for Safety Evaluation, Attack, and Defense on Text-to-Image Model

2025-10-29

Авторы:

Chenyu Zhang, Tairen Zhang, Lanjun Wang, Ruidong Chen, Wenhui Li, Anan Liu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Using risky text prompts, such as pornography and violent prompts, to test the safety of text-to-image (T2I) models is a critical task. However, existing risky prompt datasets are limited in three key areas: 1) limited risky categories, 2) coarse-grained annotation, and 3) low effectiveness. To address these limitations, we introduce T2I-RiskyPrompt, a comprehensive benchmark designed for evaluating safety-related tasks in T2I models. Specifically, we first develop a hierarchical risk taxonomy, ...

ID: 2510.22300v1 cs.CR, cs.AI, cs.CV

arXiv PDF

📄 An Experimental Study of Trojan Vulnerabilities in UAV Autonomous Landing

2025-10-28

Авторы:

Reza Ahmari, Ahmad Mohammadi, Vahid Hemmati, Mohammed Mynuddin, Mahmoud Nabil Mahmoud, Parham Kebria, Abdollah Homaifar, Mehrdad Saif

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

This study investigates the vulnerabilities of autonomous navigation and landing systems in Urban Air Mobility (UAM) vehicles. Specifically, it focuses on Trojan attacks that target deep learning models, such as Convolutional Neural Networks (CNNs). Trojan attacks work by embedding covert triggers within a model's training data. These triggers cause specific failures under certain conditions, while the model continues to perform normally in other situations. We assessed the vulnerability of Urba...

ID: 2510.20932v1 cs.CR, cs.AI, cs.CV, cs.RO

arXiv PDF

📄 ISO/IEC-Compliant Match-on-Card Face Verification with Short Binary Templates

2025-10-22

Авторы:

Abdelilah Ganmati, Karim Afdel, Lahcen Koutti

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We present a practical match-on-card design for face verification in which compact 64/128-bit templates are produced off-card by PCA-ITQ and compared on-card via constant-time Hamming distance. We specify ISO/IEC 7816-4 and 14443-4 command APDUs with fixed-length payloads and decision-only status words (no score leakage), together with a minimal per-identity EEPROM map. Using real binary codes from a CelebA working set (55 identities, 412 images), we (i) derive operating thresholds from ROC/DET,...

ID: 2510.16078v1 cs.CR, cs.AI, cs.CV, 68T10, 68T45, 94A60, I.4.8; I.5.4; I.2.10

arXiv PDF

📄 SafeGuider: Robust and Practical Content Safety Control for Text-to-Image Models

2025-10-09

Авторы:

Peigui Qi, Kunsheng Tang, Wenbo Zhou, Weiming Zhang, Nenghai Yu, Tianwei Zhang, Qing Guo, Jie Zhang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Text-to-image models have shown remarkable capabilities in generating high-quality images from natural language descriptions. However, these models are highly vulnerable to adversarial prompts, which can bypass safety measures and produce harmful content. Despite various defensive strategies, achieving robustness against attacks while maintaining practical utility in real-world applications remains a significant challenge. To address this issue, we first conduct an empirical study of the text en...

ID: 2510.05173v2 cs.CR, cs.AI, cs.CV, I.2

arXiv PDF

📄 EditTrack: Detecting and Attributing AI-assisted Image Editing

2025-10-04

Авторы:

Zhengyuan Jiang, Yuyang Zhang, Moyang Guo, Neil Zhenqiang Gong

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

In this work, we formulate and study the problem of image-editing detection and attribution: given a base image and a suspicious image, detection seeks to determine whether the suspicious image was derived from the base image using an AI editing model, while attribution further identifies the specific editing model responsible. Existing methods for detecting and attributing AI-generated images are insufficient for this problem, as they focus on determining whether an image was AI-generated/edite...

ID: 2510.01173v1 cs.CR, cs.AI, cs.CV, cs.LG

arXiv PDF

Показано 1 - 10 из 18 записей