RAE: A Neural Network Dimensionality Reduction Method for Nearest Neighbors Preservation in Vector Search
2509.25839v1
cs.IR, cs.AI, cs.DB
2025-10-02
Authors:
Han Zhang, Dongfang Zhao
Abstract
While high-dimensional embedding vectors are being increasingly employed in
various tasks like Retrieval-Augmented Generation and Recommendation Systems,
popular dimensionality reduction (DR) methods such as PCA and UMAP have rarely
been adopted for accelerating the retrieval process because they fail to
preserve nearest-neighbor (NN) relationships among vectors. Empowered by
neural networks' optimization capability and the bounding effect of the Rayleigh
quotient, we propose a Regularized Auto-Encoder (RAE) for k-NN preserving
dimensionality reduction. RAE constrains the network parameter variation
through regularization terms, adjusting singular values to control embedding
magnitude changes during reduction, thus preserving k-NN relationships. We
provide a rigorous mathematical analysis demonstrating that regularization
establishes an upper bound on the norm distortion rate of transformed vectors,
thereby offering provable guarantees for k-NN preservation. With modest
training overhead, RAE achieves superior k-NN recall compared to existing DR
approaches while maintaining fast retrieval efficiency.
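
The bounding effect mentioned in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' RAE: a random linear map `W` stands in for a trained encoder, and all names and sizes (`W`, `X`, `knn_recall`, `d_in`, `d_out`, `k`) are assumptions made for illustration. By the Rayleigh quotient of `W.T @ W`, the ratio `||W x|| / ||x||` can never exceed the largest singular value of `W`; when the output dimension is smaller than the input dimension, the lower bound is 0, because `W` has a non-trivial null space. The sketch also shows one common way to measure k-NN recall between the original and the reduced space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: reduce 64-dimensional vectors to 16 dimensions.
d_in, d_out, n, k = 64, 16, 500, 10
W = rng.standard_normal((d_out, d_in)) / np.sqrt(d_in)  # stand-in linear encoder
X = rng.standard_normal((n, d_in))                      # synthetic embeddings
Z = X @ W.T                                             # reduced vectors

# Rayleigh quotient bound: ||W x|| / ||x|| <= sigma_max(W) for every x != 0.
sigma = np.linalg.svd(W, compute_uv=False)
ratios = np.linalg.norm(Z, axis=1) / np.linalg.norm(X, axis=1)


def knn_indices(points, k):
    """Brute-force Euclidean k-NN indices, excluding each point itself."""
    sq = np.sum(points ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]


def knn_recall(original, reduced, k):
    """Fraction of original-space k-NN that are also k-NN in the reduced space."""
    a, b = knn_indices(original, k), knn_indices(reduced, k)
    hits = [len(set(r) & set(s)) for r, s in zip(a, b)]
    return float(np.mean(hits)) / k


recall = knn_recall(X, Z, k)
print(f"max norm ratio = {ratios.max():.3f}, sigma_max = {sigma.max():.3f}")
print(f"k-NN recall@{k} of the random projection = {recall:.3f}")
```

In this sketch the singular values are whatever the random draw produces. The abstract describes RAE as using regularization terms to adjust them during training, which is what tightens the distortion bound in the proposed method.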