Journal article Open Access
Fodor Lidija; Jakovetić Dušan; Boberić Krstićev Danijela; Škrbić Srđan
Convex clustering has received recently an increased interest as a valuable method for unsupervised learning. Unlike conventional clustering methods such as k-means, its formulation corresponds to solving a convex optimization problem and hence, alleviates initialization and local minima problems. However, while several algorithms have been proposed to solve convex clustering formulations, including those based on the alternating direction method of multipliers (ADMM), there is currently a limited body of work on developing scalable parallel and distributed algorithms and solvers for convex clustering. In this paper, we develop a parallel, ADMM-based method, for a modified convex clustering sum-of-norms (SON) formulation for master–worker architectures, where the data to be clustered are partitioned across a number of worker nodes, and we provide its efficient, open-source implementation (available on Parallel ADMM-based convex clustering. https://github.com/lidijaf/Parallel-ADMM-based-convex-clustering. Accessed on 10 June 2022) for high-performance computing (HPC) cluster environments. Extensive numerical evaluations on real and synthetic data sets demonstrate a high degree of scalability and efficiency of the method, when compared with existing alternative solvers for convex clustering.
|Data volume||39.4 MB|