Evaluating LLM Story Generation through Large-scale Network Analysis of Social Structures
2510.18932v1
cs.CL, cs.LG
2025-10-24
Authors:
Hiroshi Nonaka, K. E. Perry
Abstract
Evaluating the creative capabilities of large language models (LLMs) in
complex tasks often requires human assessments that are difficult to scale. We
introduce a novel, scalable methodology for evaluating LLM story generation by
analyzing underlying social structures in narratives as signed character
networks. To demonstrate its effectiveness, we conduct a large-scale
comparative analysis using networks from over 1,200 stories, generated by four
leading LLMs (GPT-4o, GPT-4o mini, Gemini 1.5 Pro, and Gemini 1.5 Flash) and a
human-written corpus. Our findings, based on network properties like density,
clustering, and signed edge weights, show that LLM-generated stories
consistently exhibit a strong bias toward tightly knit, positive relationships,
which aligns with findings from prior research using human assessment. Our
proposed approach provides a valuable tool for evaluating limitations and
tendencies in the creative storytelling of current and future LLMs.
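
As a rough illustration of the network properties named in the abstract (density, clustering, and signed edge weights), the following minimal sketch computes them for a tiny signed character network. The character names, edge weights, and function names are hypothetical, and the abstract does not specify how the authors extract or score their networks, so this should be read as one plausible formulation rather than the paper's method.

```python
from itertools import combinations

# Hypothetical toy network: (character_a, character_b) -> signed weight.
# Positive weights stand for friendly ties, negative for hostile ones.
edges = {
    ("Ana", "Ben"): 1.0,
    ("Ana", "Cy"): 1.0,
    ("Ben", "Cy"): 1.0,
    ("Cy", "Dee"): -1.0,
}

def build_adjacency(edges):
    """Undirected adjacency sets, ignoring edge sign."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 0.0 if n < 2 else 2 * m / (n * (n - 1))

def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

def positive_fraction(edges):
    """Share of edges whose signed weight is positive."""
    return sum(1 for w in edges.values() if w > 0) / len(edges)

adj = build_adjacency(edges)
print(f"density            = {density(adj):.3f}")
print(f"average clustering = {average_clustering(adj):.3f}")
print(f"positive fraction  = {positive_fraction(edges):.3f}")
```

Under this formulation, a story whose characters are densely connected, highly clustered, and linked almost entirely by positive edges would match the "tightly knit, positive" pattern the abstract attributes to LLM-generated stories.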