QueryIPI: Query-agnostic Indirect Prompt Injection on Coding Agents
2510.23675v1
cs.CR, cs.AI
2025-10-30
Authors:
Yuchong Xie, Zesen Liu, Mingyu Luo, Zhixiang Zhang, Kaikai Zhang, Zongjie Li, Ping Chen, Shuai Wang, Dongdong She
Abstract
Modern coding agents integrated into IDEs combine powerful tools and
system-level actions, exposing a high-stakes attack surface. Existing Indirect
Prompt Injection (IPI) studies focus mainly on query-specific behaviors,
leading to unstable attacks with lower success rates. We identify a more
severe, query-agnostic threat that remains effective across diverse user
inputs. Crafting such an attack becomes feasible by exploiting a common vulnerability:
leakage of the agent's internal prompt, which turns the attack into a
constrained white-box optimization problem. We present QueryIPI, the first
query-agnostic IPI method for coding agents. QueryIPI refines malicious tool
descriptions through an iterative, prompt-based process informed by the leaked
internal prompt. Experiments on five simulated agents show that QueryIPI
achieves up to 87 percent success, outperforming baselines, and the generated
malicious descriptions also transfer to real-world systems, highlighting a
practical security risk to modern LLM-based coding agents.