Ax-Prover: A Deep Reasoning Agentic Framework for Theorem Proving in Mathematics and Quantum Physics
2510.12787v1
cs.AI, cs.MA
2025-10-16
Авторы:
Marco Del Tredici, Jacob McCarran, Benjamin Breen, Javier Aspuru Mijares, Weichen Winston Yin, Jacob M. Taylor, Frank Koppens, Dirk Englund
Abstract
We present Ax-Prover, a multi-agent system for automated theorem proving in
Lean that can solve problems across diverse scientific domains and operate
either autonomously or collaboratively with human experts. To achieve this,
Ax-Prover approaches scientific problem solving through formal proof
generation, a process that demands both creative reasoning and strict syntactic
rigor. Ax-Prover meets this challenge by equipping Large Language Models
(LLMs), which provide knowledge and reasoning, with Lean tools via the Model
Context Protocol (MCP), which ensure formal correctness. To evaluate its
performance as an autonomous prover, we benchmark our approach against frontier
LLMs and specialized prover models on two public math benchmarks and on two
Lean benchmarks we introduce in the fields of abstract algebra and quantum
theory. On public datasets, Ax-Prover is competitive with state-of-the-art
provers, while it largely outperform them on the new benchmarks. This shows
that, unlike specialized systems that struggle to generalize, our tool-based
agentic theorem prover approach offers a generalizable methodology for formal
verification across diverse scientific domains. Furthermore, we demonstrate
Ax-Prover's assistant capabilities in a practical use case, showing how it
enabled an expert mathematician to formalize the proof of a complex
cryptography theorem.
Ссылки и действия
Дополнительные ресурсы: