Published September 15, 2020 | Version v1
Dataset Open

Check Mate: Prioritizing User Generated Multi-Media Content for Fact-Checking

Description

Given the volume of content and misinformation on social media, there is a need for systems that can support fact-checkers by prioritizing content that needs to be fact-checked. Prior research on prioritizing content for fact-checking has focused on news media articles, predominantly in English. However, there is an increasing amount of misinformation in user-generated content. Furthermore, misinformation is generated through information across modalities. In this paper, we present a novel dataset that can be used to prioritize check-worthy posts from multi-media content in Hindi. It is unique in its 1) focus on user-generated content, 2) multi-modality, and 3) use of Hindi as the primary language of content. In addition, we provide metadata for each post, such as the number of shares and likes of the post on ShareChat, a popular Indian social media platform, which allows for correlative analysis of virality and misinformation.

Notes

Additional data/information can be requested by emailing admin@tattle.co.in

Files

CheckMate_UGC_Hindi 2.zip (2.6 GB)
md5:7016d5fedef8d186f69d31a65062be3a
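
Because the archive is large, it may be worth confirming that a download completed intact by comparing it against the MD5 checksum listed above. The following is a minimal sketch using only the Python standard library. The helper names (md5_of_file, matches_checksum) and the assumption that the archive was saved locally under its listed file name are illustrative and are not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksum as listed on the record page (without the "md5:" prefix).
EXPECTED_MD5 = "7016d5fedef8d186f69d31a65062be3a"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reads avoid loading the whole 2.6 GB archive into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_MD5):
    """Hypothetical helper: True if the file's MD5 equals the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    archive = Path("CheckMate_UGC_Hindi 2.zip")
    if archive.exists():
        print("checksum OK" if matches_checksum(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

A mismatch most likely indicates an incomplete or corrupted download, in which case downloading the archive again is the simplest remedy.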