Dataset Open Access
Fatma Arslan;
Naeemul Hassan;
Chengkai Li;
Mark Tremayne
The ClaimBuster dataset consists of statements extracted from all U.S. general election presidential debates (1960-2016), along with human-annotated check-worthiness labels. It contains 23,533 sentences, each assigned to one of three categories: non-factual statement, unimportant factual statement, or check-worthy factual statement.
Name | MD5 | Size |
---|---|---|
all_sentences.csv | 686ac1dd5123d9ca0d229ee9760d4962 | 5.3 MB |
crowdsourced.csv | af9649bc3cc93edbc804893720a50bde | 3.9 MB |
groundtruth.csv | 1577f6d45bf33eabe9cf760f0fb66da3 | 167.7 kB |
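The MD5 checksums listed for the three files can be used to confirm that a download completed without corruption. The following is a minimal sketch, assuming the files were saved under their listed names in a single local directory; the helper names `md5_of_file` and `verify_downloads` are hypothetical and not part of the dataset.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing for this dataset.
EXPECTED_MD5 = {
    "all_sentences.csv": "686ac1dd5123d9ca0d229ee9760d4962",
    "crowdsourced.csv": "af9649bc3cc93edbc804893720a50bde",
    "groundtruth.csv": "1577f6d45bf33eabe9cf760f0fb66da3",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    # Assumption: the CSV files sit in the current working directory.
    for name, ok in verify_downloads().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

A file reported as missing or mismatched is worth downloading again before use, since a truncated CSV could silently drop sentences.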
Metric | All versions | This version |
---|---|---|
Views | 3,373 | 1,525 |
Downloads | 2,777 | 2,571 |
Data volume | 10.0 GB | 9.0 GB |
Unique views | 2,898 | 1,385 |
Unique downloads | 1,599 | 1,453 |