CompressionAttack: Exploiting Prompt Compression as a New Attack Surface in LLM-Powered Agents
2510.22963v1
cs.CR, cs.AI
2025-10-29
Authors:
Zesen Liu, Zhixiang Zhang, Yuchong Xie, Dongdong She
Abstract
LLM-powered agents often use prompt compression to reduce inference costs,
but this introduces a new security risk. Compression modules, which are
optimized for efficiency rather than safety, can be manipulated by adversarial
inputs, causing semantic drift and altering LLM behavior. This work identifies
prompt compression as a novel attack surface and presents CompressionAttack,
the first framework to exploit it. CompressionAttack includes two strategies:
HardCom, which uses discrete adversarial edits for hard compression, and
SoftCom, which performs latent-space perturbations for soft compression.
In experiments on multiple LLMs, the attacks achieve up to 80% attack success and
98% preference flips while remaining highly stealthy and transferable. Case studies in VSCode
Cline and Ollama confirm real-world impact, and current defenses prove
ineffective, highlighting the need for stronger protections.