Neural Collapse under Gradient Flow on Shallow ReLU Networks for Orthogonally Separable Data

2510.21078v1 cs.LG, math.OC 2025-10-28

Авторы:

Hancheng Min, Zhihui Zhu, René Vidal

Abstract

Among many mysteries behind the success of deep networks lies the exceptional discriminative power of their learned representations as manifested by the intriguing Neural Collapse (NC) phenomenon, where simple feature structures emerge at the last layer of a trained neural network. Prior works on the theoretical understandings of NC have focused on analyzing the optimization landscape of matrix-factorization-like problems by considering the last-layer features as unconstrained free optimization variables and showing that their global minima exhibit NC. In this paper, we show that gradient flow on a two-layer ReLU network for classifying orthogonally separable data provably exhibits NC, thereby advancing prior results in two ways: First, we relax the assumption of unconstrained features, showing the effect of data structure and nonlinear activations on NC characterizations. Second, we reveal the role of the implicit bias of the training dynamics in facilitating the emergence of NC.

Ссылки и действия

Читать на arXiv Скачать PDF

Дополнительные ресурсы:

Neural Collapse under Gradient Flow on Shallow ReLU Networks for Orthogonally Separable Data

Авторы:

Abstract

Ссылки и действия

Связанные статьи

Convergence for Discrete Parameter Updates

The Geometry of Intelligence: Deterministic Functional Topology as a Foundation ...

Beyond Scaffold: A Unified Spatio-Temporal Gradient Tracking Method

Risk-Sensitive Q-Learning in Continuous Time with Application to Dynamic Portfol...

ARM-Explainer -- Explaining and improving graph neural network predictions for t...

Навигация