Detection and Visualization of Misleading Content on Twitter
The problems of online misinformation and fake news have gained increasing prominence in an age where user-generated content and social media platforms are key forces in the shaping and diffusion of news stories. Unreliable information and misleading content are often posted and widely disseminated through popular social media platforms such as Twitter and Facebook. As a result, journalists and editors are in need of new tools that can help them speed up the verification process for content that is sourced from social media. Motivated by this need, in this paper we present a system that supports the automatic classification of multimedia Twitter posts into credible or misleading. The system leverages credibility-oriented features extracted from the tweet and the user who published it, and trains a two-step classification model based on a novel semi-supervised learning scheme. The latter uses the agreement between two independent pre-trained models on new posts as guiding signals for retraining the classification model. We analyze a large labeled dataset of tweets that shared debunked fake and confirmed real images and videos, and show that integrating the newly proposed features, and making use of bagging in the initial classifiers and of the semi-supervised learning scheme, significantly improves classification accuracy. Moreover, we present a web-based application for visualizing and communicating the classification results to end users.
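
The agreement-based retraining idea summarized above can be illustrated with a minimal sketch. It is not the system's actual implementation: the toy nearest-centroid classifier, the function name `agreement_retraining`, the choice to retrain the tweet-feature model, and the one-dimensional features are all assumptions made for illustration. Two models, one trained on tweet features and one on user features, label new posts; the posts on which they agree are added to the training set, and the retrained model decides the posts on which they disagreed.

```python
from statistics import mean


class CentroidClassifier:
    """Toy nearest-centroid classifier (hypothetical stand-in for a real model)."""

    def fit(self, X, y):
        # One centroid per label: the per-feature mean of that label's rows.
        self.centroids = {}
        for label in sorted(set(y)):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        # Assign each row to the label of the closest centroid.
        return [
            min(self.centroids, key=lambda c: sq_dist(x, self.centroids[c]))
            for x in X
        ]


def agreement_retraining(tweet_X, user_X, y, new_tweet_X, new_user_X):
    """Label new posts, using model agreement as the retraining signal."""
    # Step 1: two independent models, each trained on its own feature set.
    clf_tweet = CentroidClassifier().fit(tweet_X, y)
    clf_user = CentroidClassifier().fit(user_X, y)
    pred_tweet = clf_tweet.predict(new_tweet_X)
    pred_user = clf_user.predict(new_user_X)

    final = [None] * len(pred_tweet)
    agreed_X, agreed_y, disagreed = [], [], []
    for i, (a, b) in enumerate(zip(pred_tweet, pred_user)):
        if a == b:
            # Agreement: accept the label and keep the post for retraining.
            final[i] = a
            agreed_X.append(new_tweet_X[i])
            agreed_y.append(a)
        else:
            disagreed.append(i)

    # Step 2: retrain on original plus agreed posts, then label the rest.
    # Retraining the tweet-feature model is an assumption of this sketch.
    if disagreed:
        retrained = CentroidClassifier().fit(tweet_X + agreed_X, y + agreed_y)
        rest = retrained.predict([new_tweet_X[i] for i in disagreed])
        for i, label in zip(disagreed, rest):
            final[i] = label
    return final
```

In a realistic setting the toy classifier would be replaced by stronger learners (for example, bagged ensembles, as the abstract mentions), and the features would be the credibility-oriented tweet and user features rather than single numbers.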