Pseudo-MDPs: A Novel Framework for Efficiently Optimizing Last Revealer Seed Manipulations in Blockchains
2510.07080v1
cs.CR, cs.LG
2025-10-10
Авторы:
Maxime Reynouard
Abstract
This study tackles the computational challenges of solving Markov Decision
Processes (MDPs) for a restricted class of problems. It is motivated by the
Last Revealer Attack (LRA), which undermines fairness in some Proof-of-Stake
(PoS) blockchains such as Ethereum (\$400B market capitalization). We introduce
pseudo-MDPs (pMDPs) a framework that naturally models such problems and propose
two distinct problem reductions to standard MDPs. One problem reduction
provides a novel, counter-intuitive perspective, and combining the two problem
reductions enables significant improvements in dynamic programming algorithms
such as value iteration. In the case of the LRA which size is parameterized by
$\kappa$ (in Ethereum's case $\kappa$= 325), we reduce the computational
complexity from $O(2^\kappa \kappa^{2^{\kappa+2}})$ to $O(\kappa^4)$ (per
iteration). This solution also provide the usual benefits from Dynamic
Programming solutions: exponentially fast convergence toward the optimal
solution is guaranteed. The dual perspective also simplifies policy extraction,
making the approach well-suited for resource-constrained agents who can operate
with very limited memory and computation once the problem has been solved.
Furthermore, we generalize those results to a broader class of MDPs, enhancing
their applicability. The framework is validated through two case studies: a
fictional card game and the LRA on the Ethereum random seed consensus protocol.
These applications demonstrate the framework's ability to solve large-scale
problems effectively while offering actionable insights into optimal
strategies. This work advances the study of MDPs and contributes to
understanding security vulnerabilities in blockchain systems.
Ссылки и действия
Дополнительные ресурсы: