Published November 26, 2019 | Version v1
Conference paper Open

Privacy preserving sentiment analysis on multiple edge data streams with Apache NiFi

Description

Sentiment analysis, also known as opinion mining, plays a major role in both private and public sector Business Intelligence (BI), where it is used to improve public and customer experience. Nevertheless, de-identified sentiment scores from public social media posts can compromise individual privacy because they are vulnerable to record linkage attacks. Established privacy-preserving methods such as k-anonymity, l-diversity, and t-closeness are offline models designed exclusively for data at rest. Recently, a number of online anonymization algorithms (CASTLE, SKY, SWAF) have been proposed to meet the functional requirements of streaming applications, but without open-source implementations. In this paper, we present a reusable Apache NiFi dataflow that buffers tweets from multiple edge devices and performs anonymized sentiment analysis in real time, using randomization. The solution can be easily adapted to suit different scenarios, enabling researchers to deploy custom anonymization algorithms.
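
As a rough illustration of the buffer-then-randomize idea described above, the following Python sketch collects scored posts into fixed-size batches, drops identifiers, and perturbs each sentiment score with bounded random noise before release. It is not the paper's NiFi dataflow or its actual randomization scheme; the names `ScoredPost`, `perturb`, and `anonymize_stream`, the uniform noise model, the score range [-1, 1], and the choice to withhold a trailing partial batch are all assumptions made for the example.

```python
import random
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class ScoredPost:
    """Hypothetical record: one post with a precomputed sentiment score."""
    device_id: str  # assumed edge-device identifier
    user_id: str    # identifier that must not be released
    score: float    # assumed sentiment score in [-1.0, 1.0]


def perturb(score: float, scale: float, rng: random.Random) -> float:
    """Add uniform noise in [-scale, scale] and clamp to [-1.0, 1.0]."""
    noisy = score + rng.uniform(-scale, scale)
    return max(-1.0, min(1.0, noisy))


def anonymize_stream(
    posts: Iterable[ScoredPost],
    buffer_size: int,
    scale: float,
    rng: random.Random,
) -> Iterator[List[float]]:
    """Yield batches of perturbed scores with all identifiers removed.

    A batch is released only once `buffer_size` posts have accumulated;
    a trailing partial batch is withheld (an assumption of this sketch).
    """
    buffer: List[ScoredPost] = []
    for post in posts:
        buffer.append(post)
        if len(buffer) >= buffer_size:
            batch = [perturb(p.score, scale, rng) for p in buffer]
            rng.shuffle(batch)  # hide arrival order within the batch
            yield batch
            buffer = []


if __name__ == "__main__":
    rng = random.Random(42)
    stream = [
        ScoredPost("edge-1", "user-a", 0.8),
        ScoredPost("edge-2", "user-b", -0.3),
        ScoredPost("edge-1", "user-c", 0.1),
        ScoredPost("edge-3", "user-d", -0.9),
    ]
    for batch in anonymize_stream(stream, buffer_size=2, scale=0.2, rng=rng):
        print(batch)
```

Bounded noise alone gives no formal guarantee comparable to k-anonymity. The sketch only shows where a custom anonymization step could be plugged in between buffering and release.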

Notes

Abhinay Pandya, Panos Kostakos, Hassan Mahmood, Marta Cortes, Ekaterina Gilman, Mourad Oussalah and Susanna Pirttikangas, "Privacy preserving sentiment analysis on multiple edge data streams with Apache NiFi", in Proc. European Intelligence and Security Informatics Conference (EISIC 2019), Oulu, Finland, November 2019. (DOI: https://doi.org/10.1109/EISIC49498.2019.9108851)

Files

Privacy preserving sentiment analysis on multiple edge data streams with Apache NiFi.pdf

Additional details

Funding

European Commission
CUTLER - Coastal Urban developmenT through the LEnses of Resiliency (grant no. 770469)