Inside CORE-KG: Evaluating Structured Prompting and Coreference Resolution for Knowledge Graphs
2510.26512v1
cs.CL, cs.AI, cs.IR, cs.LG
2025-11-01
Авторы:
Dipak Meher, Carlotta Domeniconi
Abstract
Human smuggling networks are increasingly adaptive and difficult to analyze.
Legal case documents offer critical insights but are often unstructured,
lexically dense, and filled with ambiguous or shifting references, which pose
significant challenges for automated knowledge graph (KG) construction. While
recent LLM-based approaches improve over static templates, they still generate
noisy, fragmented graphs with duplicate nodes due to the absence of guided
extraction and coreference resolution. The recently proposed CORE-KG framework
addresses these limitations by integrating a type-aware coreference module and
domain-guided structured prompts, significantly reducing node duplication and
legal noise. In this work, we present a systematic ablation study of CORE-KG to
quantify the individual contributions of its two key components. Our results
show that removing coreference resolution results in a 28.32% increase in node
duplication and a 4.32% increase in noisy nodes, while removing structured
prompts leads to a 4.34% increase in node duplication and a 73.33% increase in
noisy nodes. These findings offer empirical insights for designing robust
LLM-based pipelines for extracting structured representations from complex
legal texts.