Discourse-Aware Scientific Paper Recommendation via QA-Style Summarization and Multi-Level Contrastive Learning
2511.03330v1
cs.IR, cs.AI
2025-11-07
Авторы:
Shenghua Wang, Zhen Yin
Abstract
The rapid growth of open-access (OA) publications has intensified the
challenge of identifying relevant scientific papers. Due to privacy constraints
and limited access to user interaction data, recent efforts have shifted toward
content-based recommendation, which relies solely on textual information.
However, existing models typically treat papers as unstructured text,
neglecting their discourse organization and thereby limiting semantic
completeness and interpretability. To address these limitations, we propose
OMRC-MR, a hierarchical framework that integrates QA-style OMRC (Objective,
Method, Result, Conclusion) summarization, multi-level contrastive learning,
and structure-aware re-ranking for scholarly recommendation. The QA-style
summarization module converts raw papers into structured and
discourse-consistent representations, while multi-level contrastive objectives
align semantic representations across metadata, section, and document levels.
The final re-ranking stage further refines retrieval precision through
contextual similarity calibration. Experiments on DBLP, S2ORC, and the newly
constructed Sci-OMRC dataset demonstrate that OMRC-MR consistently surpasses
state-of-the-art baselines, achieving up to 7.2% and 3.8% improvements in
Precision@10 and Recall@10, respectively. Additional evaluations confirm that
QA-style summarization produces more coherent and factually complete
representations. Overall, OMRC-MR provides a unified and interpretable
content-based paradigm for scientific paper recommendation, advancing
trustworthy and privacy-aware scholarly information retrieval.
Ссылки и действия
Дополнительные ресурсы: