Published July 6, 2020 | Version v1
Conference paper | Open

A Prototype Deep Learning Paraphrase Identification Service for Discovering Information Cascades in Social Networks

Description

Identifying the provenance of information posted on social media, and how that information has changed over time, can be very helpful in assessing its trustworthiness. Here, we introduce a novel mechanism for discovering "post-based" information cascades, including the earliest relevant post and how its information has evolved over subsequent posts. Our prototype combines dynamic data sub-sampling with several natural language processing and analysis techniques built on deep learning architectures. We evaluate its performance on EMTD, a dataset that we generated from our private experimental instance of the decentralised social network Mastodon, and on the benchmark Microsoft Research Paraphrase Corpus. We report no errors in clustering-based sub-sampling, and an average accuracy of 92% and an F1 score of 93% for paraphrase identification.
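
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how accuracy and F1 are conventionally computed for a binary paraphrase decision (1 = paraphrase, 0 = not a paraphrase). It is not the authors' evaluation code: the function name `accuracy_and_f1` and the toy labels are hypothetical and chosen only for illustration.

```python
from typing import Sequence, Tuple


def accuracy_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (accuracy, F1) for binary labels, with 1 as the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one labelled pair is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    denom = 2 * tp + fp + fn
    f1 = (2 * tp / denom) if denom else 0.0
    return accuracy, f1


# Hypothetical toy labels for eight sentence pairs (not data from the paper).
gold = [1, 1, 0, 1, 0, 0, 1, 1]
pred = [1, 0, 0, 1, 0, 1, 1, 1]
acc, f1 = accuracy_and_f1(gold, pred)
print(f"accuracy={acc:.2f} F1={f1:.2f}")
```

F1 can differ from accuracy, as in the figures above, because it ignores true negatives and weighs only how well the positive (paraphrase) class is recovered.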

Files

A Prototype Deep Learning Paraphrase Identification Service for Discovering Information Cascades in Social Networks.pdf

Additional details

Funding

European Commission
EUNOMIA - User-oriented, secure, trustful & decentralised social media (grant 825171)