Beyond Postconditions: Can Large Language Models infer Formal Contracts for Automatic Software Verification?
2510.12702v1
cs.SE, cs.AI, cs.PL
2025-10-16
Авторы:
Cedric Richter, Heike Wehrheim
Abstract
Automatic software verifiers have become increasingly effective at the task
of checking software against (formal) specifications. Yet, their adoption in
practice has been hampered by the lack of such specifications in real world
code. Large Language Models (LLMs) have shown promise in inferring formal
postconditions from natural language hints embedded in code such as function
names, comments or documentation. Using the generated postconditions as
specifications in a subsequent verification, however, often leads verifiers to
suggest invalid inputs, hinting at potential issues that ultimately turn out to
be false alarms.
To address this, we revisit the problem of specification inference from
natural language in the context of automatic software verification. In the
process, we introduce NL2Contract, the task of employing LLMs to translate
informal natural language into formal functional contracts, consisting of
postconditions as well as preconditions. We introduce metrics to validate and
compare different NL2Contract approaches, using soundness, bug discriminative
power of the generated contracts and their usability in the context of
automatic software verification as key metrics. We evaluate NL2Contract with
different LLMs and compare it to the task of postcondition generation
nl2postcond. Our evaluation shows that (1) LLMs are generally effective at
generating functional contracts sound for all possible inputs, (2) the
generated contracts are sufficiently expressive for discriminating buggy from
correct behavior, and (3) verifiers supplied with LLM inferred functional
contracts produce fewer false alarms than when provided with postconditions
alone. Further investigations show that LLM inferred preconditions generally
align well with developers intentions which allows us to use automatic software
verifiers to catch real-world bugs.
Ссылки и действия
Дополнительные ресурсы: