A Recovery Guarantee for Sparse Neural Networks
2509.20323v1
cs.LG, math.OC, stat.ML
2025-09-26
Authors:
Sara Fridovich-Keil, Mert Pilanci
Summary
## Context
Modern machine learning relies heavily on neural networks, which are expressive but demanding in computation and memory. This makes them hard to deploy in resource-constrained environments such as mobile devices and embedded systems. Sparse neural networks, which have few nonzero weights, are one way to reduce these costs. However, sparse recovery, in which the sparse network weights are treated as the signal to be recovered, has so far lacked theoretical guarantees for neural networks. Existing approaches such as iterative magnitude pruning can perform well but are memory-inefficient. This study provides the first guarantees of sparse recovery for ReLU neural networks, focusing on two-layer, scalar-output networks.
## Method
The work studies structural properties of the sparse network weights under which recovery is possible, and pairs them with an efficient recovery algorithm. The setting is two-layer ReLU neural networks with scalar outputs. The algorithm is a simple iterative hard thresholding (IHT) scheme: each iteration updates the weights and then keeps only the largest-magnitude ones, setting the rest to zero. Its memory use grows linearly in the number of nonzero weights. Under the stated structural conditions on the sparse weights, IHT recovers them exactly. The theory is then checked with simple experiments on recovery of sparse planted MLPs, MNIST classification, and implicit neural representations.
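To make the hard-thresholding step concrete, here is a minimal sketch of classical IHT for sparse linear regression. It is not the paper's algorithm for ReLU networks: the functions `hard_threshold`, `iht`, and `demo`, the problem sizes, and the step-size rule are all illustrative assumptions. The sketch also forms dense gradients, so it does not reproduce the linear-in-nonzeros memory use described above.

```python
import numpy as np


def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and set the rest to zero."""
    s = min(s, x.size)
    out = np.zeros_like(x)
    if s > 0:
        keep = np.argpartition(np.abs(x), -s)[-s:]
        out[keep] = x[keep]
    return out


def iht(A, y, s, n_iters=500):
    """Classical IHT for min ||A x - y||^2 subject to x having at most s nonzeros."""
    # Step size 1/L, where L = ||A||_2^2 is the Lipschitz constant of the
    # gradient of 0.5 * ||A x - y||^2 (an assumption made for this sketch).
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y)                # gradient of the least-squares loss
        x = hard_threshold(x - step * grad, s)  # gradient step, then keep top s
    return x


def demo(seed=0, m=100, d=200, s=5):
    """Plant an s-sparse vector (hypothetical sizes) and run IHT on noiseless data."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((m, d)) / np.sqrt(m)
    x_true = np.zeros(d)
    support = rng.choice(d, size=s, replace=False)
    x_true[support] = rng.choice([-1.0, 1.0], size=s) * (1.0 + rng.random(s))
    y = A @ x_true
    x_hat = iht(A, y, s)
    return A, y, x_true, x_hat
```

Calling `demo()` returns the planted vector and the IHT estimate, so the two can be compared with, for example, `np.linalg.norm(x_hat - x_true)`. With the step size `1/L`, each iteration cannot increase the least-squares loss, which is why the sketch uses that conservative choice rather than a tuned one.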
## Results
The theoretical analysis shows that IHT exactly recovers the sparse weights of two-layer, scalar-output ReLU networks under specific structural conditions on those weights, using memory linear in the number of nonzero weights. Simple experiments support this result on three tasks. On planted MLP recovery, the algorithm recovers the planted sparse weights while using less memory than the baseline. On MNIST classification, the sparse networks found by IHT reach competitive accuracy with a fraction of the parameters. On implicit neural representations, the method is also competitive. Across these experiments, performance is competitive with, and often exceeds, a high-performing but memory-inefficient baseline based on iterative magnitude pruning.
## Significance
The study provides a theoretical foundation for sparse recovery in ReLU neural networks, a setting that previously had no recovery guarantees. In practice, the method offers a memory-efficient alternative to iterative magnitude pruning, which could help when working with sparse networks on devices with limited memory. Potential applications include edge computing, mobile AI, and real-time processing. The findings also add to the broader understanding of sparse optimization in neural networks, with relevance to model compression, interpretability, and energy efficiency.
## Conclusions
This work establishes the first sparse recovery guarantees for ReLU neural networks, showing that IHT exactly recovers the sparse weights of two-layer, scalar-output networks under structural conditions on those weights. Experimentally, its performance is competitive with, and often exceeds, a baseline based on iterative magnitude pruning, with lower memory use. Possible directions for future work include extending the results to deeper networks, studying the role of initialization in the recovery guarantees, and developing adaptive pruning strategies for more complex architectures.
Abstract
We prove the first guarantees of sparse recovery for ReLU neural networks,
where the sparse network weights constitute the signal to be recovered.
Specifically, we study structural properties of the sparse network weights for
two-layer, scalar-output networks under which a simple iterative hard
thresholding algorithm recovers these weights exactly, using memory that grows
linearly in the number of nonzero weights. We validate this theoretical result
with simple experiments on recovery of sparse planted MLPs, MNIST
classification, and implicit neural representations. Experimentally, we find
performance that is competitive with, and often exceeds, a high-performing but
memory-inefficient baseline based on iterative magnitude pruning.
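
For reference, the setting named in the abstract can be written in standard notation. This is a generic sketch rather than the paper's own formulation: the symbols $m$, $w_j$, $\alpha_j$, $\theta$, and $s$ are assumptions introduced here, and the paper's exact parameterization and structural conditions may differ. A two-layer, scalar-output ReLU network with $m$ hidden units, together with a sparsity constraint on its weights, is commonly written as

$$
f_\theta(x) = \sum_{j=1}^{m} \alpha_j \max\!\left(0,\; w_j^\top x\right),
\qquad
\theta = (w_1, \dots, w_m, \alpha_1, \dots, \alpha_m),
\qquad
\|\theta\|_0 \le s .
$$

Here $\|\theta\|_0$ counts the nonzero weights, so "memory that grows linearly in the number of nonzero weights" corresponds to memory on the order of $s$ in this notation.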