Optimally Deep Networks -- Adapting Model Depth to Datasets for Superior Efficiency
2510.10764v2
cs.LG, cs.AI, cs.CV
2025-10-15
Authors:
Shaharyar Ahmed Khan Tareen, Filza Khan Tareen
Abstract
Deep neural networks (DNNs) have delivered strong performance across
various tasks. However, this success often comes at the cost of unnecessarily
large model sizes, high computational demands, and substantial memory
footprints. Typically, powerful architectures are trained at full depth, but
not all datasets or tasks require such high model capacity. Training very deep
architectures on relatively low-complexity datasets frequently leads to wasted
computation, unnecessary energy consumption, and excessive memory usage, which
in turn makes deployment of models on resource-constrained devices impractical.
To address this problem, we introduce Optimally Deep Networks (ODNs), which
provide a balance between model depth and task complexity. Specifically, we
propose a NAS-like training strategy called progressive depth expansion, which
begins by training deep networks at shallower depths and incrementally
increases their depth as the earlier blocks converge, continuing this process
until the target accuracy is reached. ODNs use only the optimal depth for the
given datasets, removing redundant layers. This cuts down future training and
inference costs, lowers the memory footprint, enhances computational
efficiency, and facilitates deployment on edge devices. Empirical results show
that the optimal depths of ResNet-18 and ResNet-34 for MNIST and SVHN achieve
up to 98.64% and 96.44% reduction in memory footprint, while maintaining a
competitive accuracy of 99.31% and 96.08%, respectively.
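
The progressive depth expansion loop described in the abstract can be sketched roughly as follows. This is a minimal, framework-agnostic illustration and not the authors' implementation: the callback `train_and_evaluate`, the helper `footprint_reduction`, and the toy accuracy values are all assumptions made for the example. The abstract does not specify how convergence of earlier blocks is detected, so that detail is left to the hypothetical callback.

```python
from typing import Callable, Tuple


def progressive_depth_expansion(
    max_depth: int,
    target_accuracy: float,
    train_and_evaluate: Callable[[int], float],
) -> Tuple[int, float]:
    """Grow the network one block at a time until the target accuracy is met.

    `train_and_evaluate(depth)` is a hypothetical callback: it is assumed to
    train the first `depth` blocks (reusing the earlier, already-converged
    blocks) and return validation accuracy in [0, 1].
    """
    if max_depth < 1:
        raise ValueError("max_depth must be at least 1")
    accuracy = 0.0
    for depth in range(1, max_depth + 1):
        accuracy = train_and_evaluate(depth)
        if accuracy >= target_accuracy:
            # Shallowest depth that meets the target: stop expanding here.
            return depth, accuracy
    # Target never reached: fall back to the full-depth model.
    return max_depth, accuracy


def footprint_reduction(full_size: float, reduced_size: float) -> float:
    """Percentage reduction in memory footprint relative to the full model."""
    return 100.0 * (1.0 - reduced_size / full_size)


if __name__ == "__main__":
    # Toy stand-in for real training (made-up numbers, not results from the paper).
    toy_accuracy = {1: 0.90, 2: 0.96, 3: 0.99, 4: 0.995}
    depth, acc = progressive_depth_expansion(4, 0.98, toy_accuracy.__getitem__)
    print(f"selected depth={depth}, accuracy={acc}")
    print(f"footprint reduction={footprint_reduction(100.0, 25.0)}%")
```

In a real setting the callback would wrap the training of a truncated ResNet; the loop itself only encodes the stopping rule, which is to keep the shallowest depth whose accuracy reaches the target.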