LLMAtKGE: Large Language Models as Explainable Attackers against Knowledge Graph Embeddings
2510.11584v1
cs.CL, cs.CR
2025-10-15
Authors:
Ting Li, Yang Yang, Yipeng Yu, Liang Yao, Guoqing Chao, Ruifeng Xu
Abstract
Adversarial attacks on knowledge graph embeddings (KGE) aim to disrupt the
model's link prediction ability by removing or inserting triples. A recent
black-box method has attempted to incorporate textual and structural
information to enhance attack performance. However, it is unable to generate
human-readable explanations, and exhibits poor generalizability. In the past
few years, large language models (LLMs) have demonstrated powerful capabilities
in text comprehension, generation, and reasoning. In this paper, we propose
LLMAtKGE, a novel LLM-based framework that selects attack targets and generates
human-readable explanations. To provide the LLM with sufficient factual context
under limited input constraints, we design a structured prompting scheme that
explicitly formulates the attack as multiple-choice questions while
incorporating KG factual evidence. To address the context-window limitation and
hesitation issues, we introduce semantics-based and centrality-based filters,
which compress the candidate set while preserving high recall of
attack-relevant information. Furthermore, to efficiently integrate both
semantic and structural information into the filter, we precompute high-order
adjacency and fine-tune the LLM with a triple classification task to enhance
filtering performance. Experiments on two widely used knowledge graph datasets
demonstrate that our attack outperforms the strongest black-box baselines,
provides explanations via reasoning, and shows competitive performance
compared with white-box methods. Comprehensive ablation and case studies
further validate its capability to generate explanations.
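
To make the idea of a filtered, multiple-choice attack prompt more concrete, the following is a minimal sketch and not the authors' implementation. It assumes plain degree centrality as the centrality measure, a deletion-style attack, and a hypothetical prompt wording; the names `filter_candidates` and `build_prompt` are illustrative only, and the abstract does not specify any of these details.

```python
from collections import Counter
from string import ascii_uppercase
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def degree_centrality(triples: List[Triple]) -> Counter:
    """Count how many triples each entity appears in (assumed centrality measure)."""
    degree = Counter()
    for head, _, tail in triples:
        degree[head] += 1
        degree[tail] += 1
    return degree


def filter_candidates(triples: List[Triple], target: Triple, k: int = 4) -> List[Triple]:
    """Keep the k neighbouring triples whose endpoints have the highest total degree.

    Hypothetical stand-in for a centrality-based filter: it shrinks the
    candidate set so that it fits into a limited prompt.
    """
    head, _, tail = target
    degree = degree_centrality(triples)
    neighbours = [
        tr for tr in triples
        if tr != target and ({tr[0], tr[2]} & {head, tail})
    ]
    # The sort is stable, so ties keep their original order.
    neighbours.sort(key=lambda tr: degree[tr[0]] + degree[tr[2]], reverse=True)
    return neighbours[:k]


def build_prompt(target: Triple, candidates: List[Triple]) -> str:
    """Format the attack as a multiple-choice question (wording is an assumption)."""
    lines = [
        f"Target triple: ({target[0]}, {target[1]}, {target[2]})",
        "Which triple, if deleted, would most weaken the prediction of the target?",
    ]
    # zip stops at 26 options, one per uppercase letter.
    for letter, (h, r, t) in zip(ascii_uppercase, candidates):
        lines.append(f"{letter}. ({h}, {r}, {t})")
    lines.append("Answer with one letter and a short explanation.")
    return "\n".join(lines)


if __name__ == "__main__":
    kg = [
        ("a", "r", "b"),
        ("b", "r", "c"),
        ("c", "r", "d"),
        ("a", "r", "d"),
        ("x", "r", "y"),
    ]
    target_triple = ("a", "r", "b")
    print(build_prompt(target_triple, filter_candidates(kg, target_triple, k=2)))
```

The resulting string would be sent to an LLM of choice; the model call itself is omitted because the abstract does not name a specific model or API.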