LLM-as-a-Prophet: Understanding Predictive Intelligence with Prophet Arena
2510.17638v1
cs.AI, cs.CL, cs.LG
2025-10-22
Авторы:
Qingchuan Yang, Simon Mahns, Sida Li, Anri Gu, Jibang Wu, Haifeng Xu
Abstract
Forecasting is not only a fundamental intellectual pursuit but also is of
significant importance to societal systems such as finance and economics. With
the rapid advances of large language models (LLMs) trained on Internet-scale
data, it raises the promise of employing LLMs to forecast real-world future
events, an emerging paradigm we call "LLM-as-a-Prophet". This paper
systematically investigates such predictive intelligence of LLMs. To this end,
we build Prophet Arena, a general evaluation benchmark that continuously
collects live forecasting tasks and decomposes each task into distinct pipeline
stages, in order to support our controlled and large-scale experimentation. Our
comprehensive evaluation reveals that many LLMs already exhibit impressive
forecasting capabilities, reflected in, e.g., their small calibration errors,
consistent prediction confidence and promising market returns. However, we also
uncover key bottlenecks towards achieving superior predictive intelligence via
LLM-as-a-Prophet, such as LLMs' inaccurate event recalls, misunderstanding of
data sources and slower information aggregation compared to markets when
resolution nears.
Ссылки и действия
Дополнительные ресурсы: