Fodor Lidija
Jakovetić Dušan
Boberić Krstićev Danijela
Škrbić Srđan
2022-11-08
<p>Convex clustering has recently received increased interest as a valuable method for unsupervised learning. Unlike conventional clustering methods such as k-means, its formulation corresponds to solving a convex optimization problem and hence alleviates initialization and local minima problems. However, while several algorithms have been proposed to solve convex clustering formulations, including those based on the alternating direction method of multipliers (ADMM), there is currently a limited body of work on developing scalable <em>parallel and distributed</em> algorithms and solvers for convex clustering. In this paper, we develop a parallel ADMM-based method for a modified convex clustering sum-of-norms (SON) formulation for master–worker architectures, where the data to be clustered are partitioned across a number of worker nodes, and we provide an efficient, open-source implementation of it (available at <a href="https://github.com/lidijaf/Parallel-ADMM-based-convex-clustering">https://github.com/lidijaf/Parallel-ADMM-based-convex-clustering</a>; accessed on 10 June 2022) for high-performance computing (HPC) cluster environments. Extensive numerical evaluations on real and synthetic data sets demonstrate a high degree of scalability and efficiency of the method when compared with existing alternative solvers for convex clustering.</p>
LF developed the implementation of the algorithm and performed the empirical evaluations.
DJ contributed the theoretical advances and the design of the algorithm.
DBK and SS contributed to improving the quality of experimentation and design.
All authors participated in the main research flow development and in writing and revising the manuscript.
All authors read and approved the final manuscript.
The code for parallel ADMM-based convex clustering can be found in the following GitHub repository: https://github.com/lidijaf/Parallel-ADMM-based-convex-clustering.
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.
https://doi.org/10.1186/s13634-022-00942-8
oai:zenodo.org:7379988
eng
Zenodo
https://zenodo.org/communities/cyreneproject-eu
https://zenodo.org/communities/eu
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
EURASIP Journal on Advances in Signal Processing, 108 (2022-11-08)
Distributed optimization
ADMM
High-performance computing
Performance evaluation
A parallel ADMM-based convex clustering method
info:eu-repo/semantics/article