Knowledge Graph Augmented Large Language Models for Disease Prediction

2512.01210v2 cs.AI 2025-12-04

Authors:

Ruiyu Wang, Tuan Vinh, Ran Xu, Yuyin Zhou, Jiaying Lu, Carl Yang, Francisco Pasquel

Abstract

Electronic health records (EHRs) support powerful clinical prediction models, but existing methods typically provide coarse, post hoc explanations that offer limited value for patient-level decision making. We introduce a knowledge graph (KG)-guided chain-of-thought (CoT) framework that generates clinically grounded and temporally consistent reasoning for visit-level disease prediction in MIMIC-III. ICD-9 codes are mapped to PrimeKG, from which disease-relevant nodes and multi-hop reasoning paths are extracted and used as scaffolds for CoT generation; only explanations whose conclusions match observed outcomes are retained. Lightweight LLaMA-3.1-Instruct-8B and Gemma-7B models are then fine-tuned on this supervision corpus. Across ten PrimeKG-mapped diseases and limited training cohorts (400 and 1000 cases), KG-guided models outperform strong classical baselines, achieving AUROC values of 0.66-0.70 and macro-AUPR values of 0.40-0.47. The models also transfer zero-shot to the CRADLE cohort, improving accuracy from approximately 0.40-0.51 to 0.72-0.77. A blinded clinician evaluation shows consistent preference for KG-guided CoT explanations in clarity, relevance, and clinical correctness.
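The two data-construction steps named in the abstract, multi-hop path extraction and retention of only those explanations whose conclusions match the observed outcome, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy graph `TOY_KG`, its placeholder node and relation names, the functions `multi_hop_paths` and `keep_consistent`, the `max_hops` parameter, and the record fields `visit_id` and `predicted` are hypothetical and are not taken from the paper or from PrimeKG. The abstract does not specify the search procedure; breadth-first search is used here only as one plausible choice.

```python
from collections import deque

# Hypothetical toy graph: node -> list of (relation, neighbor).
# Placeholder names only; this is NOT real PrimeKG content.
TOY_KG = {
    "dx:A": [("rel_1", "phenotype:X"), ("rel_2", "disease:D")],
    "phenotype:X": [("rel_3", "disease:D")],
    "dx:B": [("rel_4", "phenotype:X")],
}


def multi_hop_paths(kg, source, target, max_hops=3):
    """Enumerate simple paths from source to target using at most max_hops edges.

    Breadth-first search, so shorter paths are returned before longer ones.
    """
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target and len(path) > 1:
            paths.append(path)
            continue
        if len(path) - 1 >= max_hops:
            continue  # hop budget exhausted for this partial path
        for _relation, neighbor in kg.get(node, []):
            if neighbor not in path:  # keep paths simple (no revisits)
                queue.append(path + [neighbor])
    return paths


def keep_consistent(candidates, observed):
    """Retain candidate explanations whose predicted label equals the observed one."""
    return [c for c in candidates if c["predicted"] == observed[c["visit_id"]]]


paths = multi_hop_paths(TOY_KG, "dx:A", "disease:D")

# Hypothetical candidate explanations and observed outcomes (made-up labels).
candidates = [
    {"visit_id": 1, "predicted": True, "text": "explanation 1"},
    {"visit_id": 2, "predicted": True, "text": "explanation 2"},
]
observed = {1: True, 2: False}
kept = keep_consistent(candidates, observed)
```

In this sketch the extracted paths would serve as the scaffold handed to a CoT generator, and `keep_consistent` plays the role of the outcome-matching filter; how the actual framework prompts the model or compares conclusions to outcomes is not described in the abstract.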
