Dataset Open Access

Claim Detection and Matching for Indian Languages

Ashkan Kazemi; Kiran Garimella; Devin Gaffney; Scott A. Hale


Citation Style Language JSON Export

{
  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.4890950", 
  "language": "hin", 
  "title": "Claim Detection and Matching for Indian Languages", 
  "issued": {
    "date-parts": [
      [
        2021, 
        6, 
        1
      ]
    ]
  }, 
  "abstract": "<p>Two datasets are included in this repository: claim matching and claim detection datasets.&nbsp;The collections contain&nbsp;data in 5 languages: Bengali, English, Hindi, Malayalam and Tamil.</p>\n\n<p>The &quot;claim detection&quot;&nbsp;dataset contains textual claims from social media and fact-checking websites annotated for the&nbsp; &quot;fact-check worthiness&quot; of the claims in each message. Data points have one of the three labels of &quot;Yes&quot; (text contains one or more check-worthy claims), &quot;No&quot; and &quot;Probably&quot;.&nbsp;</p>\n\n<p>The &quot;claim matching&quot; dataset is a curated collection of pairs of textual claims from social media and fact-checking websites for the purpose of automatic and multilingual claim matching.&nbsp;Pairs of data have one of the four labels of &quot;Very Similar&quot;, &quot;Somewhat Similar&quot;, &quot;Somewhat Dissimilar&quot; and &quot;Very Dissimilar&quot;.</p>\n\n<p>All personally identifiable information (PII) including phone numbers, email addresses,&nbsp;license plate numbers and addresses have been replaced with general tags (e.g. &lt;PHONE#&gt;, &lt;ADDRESS&gt;, etc)&nbsp;to protect user anonymity. A detailed explanation on the curation and annotation process is provided in our ACL 2021 paper:&nbsp;<br>\n<a href=\"https://arxiv.org/abs/2106.00853\">Kazemi, A.; Garimella, K.; Gaffney, D.; and Hale, S. A. 2021. Claim Matching Beyond English to Scale Global Fact-Checking. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics, ACL 2021.</a></p>", 
  "author": [
    {
      "family": "Ashkan Kazemi"
    }, 
    {
      "family": "Kiran Garimella"
    }, 
    {
      "family": "Devin Gaffney"
    }, 
    {
      "family": "Scott A. Hale"
    }
  ], 
  "id": "4890950", 
  "event-place": "Online", 
  "version": "1.0", 
  "type": "dataset", 
  "event": "ACL-IJCNLP 2021"
}
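A minimal sketch of how a record like the one above might be read programmatically, using only the Python standard library. The record embedded here is a trimmed copy with a shortened abstract, and the helper names (`issued_date`, `plain_abstract`, `author_names`) are hypothetical, chosen for illustration. Note that this export stores each author's full name in the `family` field, so the sketch joins whichever of `given` and `family` is present rather than assuming both exist.

```python
import html
import json
import re
from datetime import date

# Trimmed copy of the export above; the abstract is shortened for illustration.
CSL_RECORD = r'''
{
  "publisher": "Zenodo",
  "DOI": "10.5281/zenodo.4890950",
  "title": "Claim Detection and Matching for Indian Languages",
  "issued": {"date-parts": [[2021, 6, 1]]},
  "abstract": "<p>Two datasets are included in this repository: claim matching and claim detection datasets.&nbsp;The collections contain&nbsp;data in 5 languages.</p>",
  "author": [
    {"family": "Ashkan Kazemi"},
    {"family": "Kiran Garimella"},
    {"family": "Devin Gaffney"},
    {"family": "Scott A. Hale"}
  ],
  "type": "dataset"
}
'''


def issued_date(record):
    """Convert the first CSL "date-parts" entry to a datetime.date."""
    parts = list(record["issued"]["date-parts"][0])
    # CSL permits year-only or year-month dates; pad missing parts with 1.
    year, month, day = (parts + [1, 1])[:3]
    return date(int(year), int(month), int(day))


def plain_abstract(record):
    """Strip HTML tags and entities from the abstract."""
    # Remove tags before unescaping, so escaped placeholders such as
    # &lt;PHONE#&gt; survive as literal text instead of being stripped.
    text = re.sub(r"<[^>]+>", " ", record.get("abstract", ""))
    text = html.unescape(text)
    # str.split() with no arguments also splits on non-breaking spaces.
    return " ".join(text.split())


def author_names(record):
    """Return display names, tolerating records that only fill "family"."""
    names = []
    for author in record.get("author", []):
        parts = (author.get("given", ""), author.get("family", ""))
        names.append(" ".join(p for p in parts if p))
    return names


record = json.loads(CSL_RECORD)
doi_url = "https://doi.org/" + record["DOI"]
```

With the full export in place of the trimmed copy, `plain_abstract(record)` would return the complete description as plain text, and `doi_url` gives the resolvable DOI link for the dataset.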
                   All versions   This version
Views              318            318
Downloads          214            214
Data volume        688.4 MB       688.4 MB
Unique views       290            290
Unique downloads   178            178
