Published March 9, 2020 | Version v1
Conference paper Open

Parallel Differentially Private K-Means Implementation Using COMPSs Framework

  • 1. University of Novi Sad, Faculty of Sciences

Description

K-means is one of the most important clustering algorithms, but it introduces a risk of privacy disclosure during the clustering process. One approach to this problem is to apply differential privacy to the K-means clustering algorithm, which effectively prevents privacy disclosure. The increasing amounts of information generated in big data processing scenarios make clustering a challenging task. To deal with this problem, various approaches to the parallelization of clustering algorithms have been attempted. This paper presents an implementation of a differentially private K-means clustering algorithm that uses ε-differential privacy, based on the COMPSs framework for parallel computing. The experimental results show that the proposed implementation scales well and can be used to efficiently process large datasets on high-performance computing equipment.
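
As a rough illustration of the idea summarized above, the following is a minimal sketch of one differentially private K-means iteration in plain Python. It is not the paper's implementation. It assumes the Laplace mechanism, coordinates scaled to [0, 1], and neighbouring datasets that differ by adding or removing one point. All function names (`laplace`, `partial_sums`, `dp_kmeans_step`) are hypothetical. The per-partition function is written so that each call is independent, which is the kind of unit a task-based framework such as COMPSs could run in parallel. Here the calls simply run in a loop.

```python
import random


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def partial_sums(points, centroids):
    """Per-partition work: assign each point to its nearest centroid and
    accumulate per-cluster coordinate sums and point counts."""
    k, d = len(centroids), len(centroids[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for p in points:
        nearest = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
        )
        counts[nearest] += 1
        for i in range(d):
            sums[nearest][i] += p[i]
    return sums, counts


def dp_kmeans_step(partitions, centroids, epsilon, rng):
    """One noisy K-means iteration (assumes coordinates lie in [0, 1])."""
    k, d = len(centroids), len(centroids[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    # Merge step: combine the independent per-partition results.
    for part in partitions:
        s, c = partial_sums(part, centroids)
        for j in range(k):
            counts[j] += c[j]
            for i in range(d):
                sums[j][i] += s[j][i]
    # One point changes one count by 1 and one sum vector by at most d
    # in L1 norm, so the L1 sensitivity of (sums, counts) is d + 1.
    scale = (d + 1) / epsilon
    new_centroids = []
    for j in range(k):
        noisy_count = counts[j] + laplace(scale, rng)
        if noisy_count < 1.0:
            # Too few points to divide safely: keep the old centroid.
            new_centroids.append(list(centroids[j]))
            continue
        noisy = [(sums[j][i] + laplace(scale, rng)) / noisy_count for i in range(d)]
        # Clamp back into the assumed data domain.
        new_centroids.append([min(1.0, max(0.0, x)) for x in noisy])
    return new_centroids
```

Under these assumptions each call spends a privacy budget of `epsilon`. By sequential composition, running T iterations this way costs T times `epsilon` in total, so a fixed overall budget would have to be split across iterations.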

Files

2020ICIST2020_final.pdf (248.4 kB)
md5:07df8da42e67a7c1b5d6dce5a47217e6

Additional details

Funding

European Commission: I-BiDaaS – Industrial-Driven Big Data as a Self-Service Solution (grant 780787)