📊 Статистика дайджестов

Всего дайджестов: 34022 Добавлено сегодня: 82

Последнее обновление: сегодня

📄 Death by a Thousand Prompts: Open Model Vulnerability Analysis

2025-11-07

Авторы:

Amy Chang, Nicholas Conley, Harish Santhanalakshmi Ganesan, Adam Swanda

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Open-weight models provide researchers and developers with accessible foundations for diverse downstream applications. We tested the safety and security postures of eight open-weight large language models (LLMs) to identify vulnerabilities that may impact subsequent fine-tuning and deployment. Using automated adversarial testing, we measured each model's resilience against single-turn and multi-turn prompt injection and jailbreak attacks. Our findings reveal pervasive vulnerabilities across all ...

ID: 2511.03247v1 cs.CR, cs.LG

arXiv PDF

📄 Black-Box Differentially Private Nonparametric Confidence Intervals Under Minimal Assumptions

2025-11-06

Авторы:

Tomer Shoham, Moshe Shenfeld, Noa Velner-Harris, Katrina Ligett

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

We introduce a simple, general framework that takes any differentially private estimator of any arbitrary quantity as a black box, and from it constructs a differentially private nonparametric confidence interval of that quantity. Our approach repeatedly subsamples the data, applies the private estimator to each subsample, and then post-processes the resulting empirical CDF to a confidence interval. Our analysis uses the randomness from the subsampling to achieve privacy amplification. Under mil...

ID: 2511.01303v1 cs.CR, cs.LG

arXiv PDF

📄 Panther: A Cost-Effective Privacy-Preserving Framework for GNN Training and Inference Services in Cloud Environments

2025-11-06

Авторы:

Congcong Chen, Xinyu Liu, Kaifeng Huang, Lifei Wei, Yang Shi

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Graph Neural Networks (GNNs) have marked significant impact in traffic state prediction, social recommendation, knowledge-aware question answering and so on. As more and more users move towards cloud computing, it has become a critical issue to unleash the power of GNNs while protecting the privacy in cloud environments. Specifically, the training data and inference data for GNNs need to be protected from being stolen by external adversaries. Meanwhile, the financial cost of cloud computing is a...

ID: 2511.01654v1 cs.CR, cs.LG

arXiv PDF

📄 PrivGNN: High-Performance Secure Inference for Cryptographic Graph Neural Networks

2025-11-06

Авторы:

Fuyi Wang, Zekai Chen, Mingyuan Fan, Jianying Zhou, Lei Pan, Leo Yu Zhang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Graph neural networks (GNNs) are powerful tools for analyzing and learning from graph-structured (GS) data, facilitating a wide range of services. Deploying such services in privacy-critical cloud environments necessitates the development of secure inference (SI) protocols that safeguard sensitive GS data. However, existing SI solutions largely focus on convolutional models for image and text data, leaving the challenge of securing GNNs and GS data relatively underexplored. In this work, we desi...

ID: 2511.02185v1 cs.CR, cs.LG

arXiv PDF

📄 An Automated Framework for Strategy Discovery, Retrieval, and Evolution in LLM Jailbreak Attacks

2025-11-06

Авторы:

Xu Liu, Yan Chen, Kan Ling, Yichi Zhu, Hengrun Zhang, Guisheng Fan, Huiqun Yu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

The widespread deployment of Large Language Models (LLMs) as public-facing web services and APIs has made their security a core concern for the web ecosystem. Jailbreak attacks, as one of the significant threats to LLMs, have recently attracted extensive research. In this paper, we reveal a jailbreak strategy which can effectively evade current defense strategies. It can extract valuable information from failed or partially successful attack attempts and contains self-evolution from attack inter...

ID: 2511.02356v1 cs.CR, cs.LG

arXiv PDF

📄 Verifying LLM Inference to Prevent Model Weight Exfiltration

2025-11-06

Авторы:

Roy Rinberg, Adam Karvonen, Alex Hoover, Daniel Reuter, Keri Warr

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

As large AI models become increasingly valuable assets, the risk of model weight exfiltration from inference servers grows accordingly. An attacker controlling an inference server may exfiltrate model weights by hiding them within ordinary model outputs, a strategy known as steganography. This work investigates how to verify model responses to defend against such attacks and, more broadly, to detect anomalous or buggy behavior during inference. We formalize model exfiltration as a security game,...

ID: 2511.02620v1 cs.CR, cs.LG

arXiv PDF

📄 AutoAdv: Automated Adversarial Prompting for Multi-Turn Jailbreaking of Large Language Models

2025-11-06

Авторы:

Aashray Reddy, Andrew Zagula, Nicholas Saban

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large Language Models (LLMs) remain vulnerable to jailbreaking attacks where adversarial prompts elicit harmful outputs, yet most evaluations focus on single-turn interactions while real-world attacks unfold through adaptive multi-turn conversations. We present AutoAdv, a training-free framework for automated multi-turn jailbreaking that achieves up to 95% attack success rate on Llama-3.1-8B within six turns a 24 percent improvement over single turn baselines. AutoAdv uniquely combines three ada...

ID: 2511.02376v1 cs.CL, cs.AI, cs.CR, cs.LG

arXiv PDF

📄 On Selecting Few-Shot Examples for LLM-based Code Vulnerability Detection

2025-11-04

Авторы:

Md Abdul Hannan, Ronghao Ni, Chi Zhang, Limin Jia, Ravi Mangal, Corina S. Pasareanu

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Large language models (LLMs) have demonstrated impressive capabilities for many coding tasks, including summarization, translation, completion, and code generation. However, detecting code vulnerabilities remains a challenging task for LLMs. An effective way to improve LLM performance is in-context learning (ICL) - providing few-shot examples similar to the query, along with correct answers, can improve an LLM's ability to generate correct solutions. However, choosing the few-shot examples appro...

ID: 2510.27675v1 cs.SE, cs.CR, cs.LG

arXiv PDF

📄 Attention Augmented GNN RNN-Attention Models for Advanced Cybersecurity Intrusion Detection

2025-11-01

Авторы:

Jayant Biradar, Smit Shah, Tanmay Naik

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

In this paper, we propose a novel hybrid deep learning architecture that synergistically combines Graph Neural Networks (GNNs), Recurrent Neural Networks (RNNs), and multi-head attention mechanisms to significantly enhance cybersecurity intrusion detection capabilities. By leveraging the comprehensive UNSW-NB15 dataset containing diverse network traffic patterns, our approach effectively captures both spatial dependencies through graph structural relationships and temporal dynamics through seque...

ID: 2510.25802v1 cs.CR, cs.LG

arXiv PDF

📄 ALMGuard: Safety Shortcuts and Where to Find Them as Guardrails for Audio-Language Models

2025-11-01

Авторы:

Weifei Jin, Yuxin Cao, Junjie Su, Minhui Xue, Jie Hao, Ke Xu, Jin Song Dong, Derui Wang

Саммари на русском не найдено
Доступные поля: ['id', 'arxiv_id', 'title', 'authors', 'abstract', 'summary_ru', 'categories', 'published_date', 'created_at']

Annotation:

Recent advances in Audio-Language Models (ALMs) have significantly improved multimodal understanding capabilities. However, the introduction of the audio modality also brings new and unique vulnerability vectors. Previous studies have proposed jailbreak attacks that specifically target ALMs, revealing that defenses directly transferred from traditional audio adversarial attacks or text-based Large Language Model (LLM) jailbreaks are largely ineffective against these ALM-specific threats. To addr...

ID: 2510.26096v1 cs.SD, cs.CR, cs.LG

arXiv PDF

Показано 41 - 50 из 168 записей