A New Framework for Convex Clustering in Kernel Spaces: Finite Sample Bounds, Consistency and Performance Insights
2511.05159v1
stat.ML, cs.LG
2025-11-11
Authors:
Shubhayan Pan, Saptarshi Chakraborty, Debolina Paul, Kushal Bose, Swagatam Das
Abstract
Convex clustering is a well-regarded clustering method that resembles the
centroid-based approach of Lloyd's $k$-means but does not require a
predefined cluster count. It starts with each data point as its own centroid and
iteratively merges them. Despite its advantages, this method can fail when
dealing with data exhibiting linearly non-separable or non-convex structures.
To mitigate these limitations, we propose a kernelized extension of the convex
clustering method. This approach projects the data points into a Reproducing
Kernel Hilbert Space (RKHS) using a feature map, enabling convex clustering in
this transformed space. This kernelization not only allows for better handling
of complex data distributions but also produces an embedding in a
finite-dimensional vector space. We provide comprehensive theoretical
underpinnings for our kernelized approach, proving algorithmic convergence and
establishing finite sample bounds for our estimates. The effectiveness of our
method is demonstrated through extensive experiments on both synthetic and
real-world datasets, showing superior performance compared to state-of-the-art
clustering techniques. This work marks a significant advancement in the field,
offering an effective solution for clustering in non-linear and non-convex data
scenarios.
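
As a rough illustration of the idea described in the abstract, the sketch below embeds the data through a kernel and evaluates a convex clustering objective in the embedded space. It is not the authors' algorithm: the RBF kernel, the eigendecomposition-based embedding, the unweighted fusion penalty, and the names `rbf_gram`, `kernel_embedding`, `convex_clustering_objective`, `gamma`, and `lam` are all assumptions made for illustration, based on the standard convex clustering formulation (a squared-error fit term plus a penalty on pairwise centroid differences).

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    """Gram matrix of the RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def kernel_embedding(K, tol=1e-10):
    """Finite-dimensional embedding Z with Z @ Z.T approximately equal to K.

    Assumption: the n sample points are represented by the rows of Z,
    obtained from the eigendecomposition of the Gram matrix.
    """
    vals, vecs = np.linalg.eigh(K)
    keep = vals > tol  # drop numerically zero or negative eigenvalues
    return vecs[:, keep] * np.sqrt(vals[keep])

def convex_clustering_objective(Z, U, lam):
    """0.5 * sum_i ||z_i - u_i||^2 + lam * sum_{i<j} ||u_i - u_j||_2."""
    fit = 0.5 * np.sum((Z - U) ** 2)
    diffs = U[:, None, :] - U[None, :, :]
    norms = np.linalg.norm(diffs, axis=2)
    fusion = np.sum(np.triu(norms, k=1))  # each pair counted once
    return fit + lam * fusion
```

With `lam = 0` the objective is minimized by `U = Z`, so every point keeps its own centroid. As `lam` grows, the penalty pushes rows of `U` to coincide, which is the merging behavior the abstract refers to. A solver for the minimization itself (for example ADMM) is outside the scope of this sketch.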