Published April 16, 2021 | Version 1.0
Dataset Open

Finnish Rumor Detection Dataset and Models

  • 1. University of Helsinki

Description

Data and models for Finnish rumor detection. The entire dataset is in rumor_dataset.json. See dataset_splits.zip for the data splits used in the paper (*_test.txt and *_train.txt). The .pt files are the OpenNMT models described in the paper. bert-models.zip contains the BERT-based models trained with the bert.py (FinBERT) and bert_multi.py (Multilingual BERT) scripts.
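A first look at rumor_dataset.json can be taken with a few lines of Python. This is only a minimal sketch: the JSON schema is not described on this page, so the helper below (summarize_json, a hypothetical name) reports only the top-level type and the number of entries, and it assumes the file has been downloaded into the working directory.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and return (top-level type name, number of items)."""
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    # The schema is not documented on this page, so only generic facts are reported.
    size = len(data) if isinstance(data, (list, dict)) else 1
    return type(data).__name__, size


if __name__ == "__main__":
    # Assumption: the file sits in the current working directory.
    dataset_path = Path("rumor_dataset.json")
    if dataset_path.exists():
        kind, size = summarize_json(dataset_path)
        print(f"{dataset_path}: top-level {kind} with {size} entries")
    else:
        print(f"{dataset_path} not found; download it from the record first")
```

Inspecting the type and size before assuming field names avoids guessing at a structure that only the paper or the file itself can confirm.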

Cite:

Hämäläinen, M., Alnajjar, K., Partanen, N., & Rueter, J. (2021). Never guess what I heard... Rumor Detection in Finnish News: a Dataset and a Baseline. In Proceedings of the Third Workshop on NLP for Internet Freedom (NLP4IF): Censorship, Disinformation, and Propaganda.

Files

Files (3.7 GB)

Checksum Size
md5:4c974f627386514f524a6c770504a7f6 3.6 GB
md5:dc5cf370702666301cffbcf1516d6880 4.7 kB
md5:7d32007341c5ece9893ab5cb1307dece 4.7 kB
md5:6247aaa0c5b81b382eb050a955ae58e4 151.0 kB
md5:364add4b9f575b8171b61befdfe939fd 63.2 MB
md5:5781082f6133bfbb2d9095725dad3a7a 63.2 MB
md5:70746d4ab9c8b8d031fbcf6789892606 425.3 kB
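
The md5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch using Python's standard hashlib module. The helper names (md5_of_file, matches_listing) are illustrative, and which checksum belongs to which file has to be taken from the record itself, since the listing here does not show file names.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:4c97...'."""
    # Accept the value with or without the 'md5:' prefix used in the listing.
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Assumption: the file name and checksum pairing below is hypothetical;
    # substitute the pairing shown on the record page.
    candidate = Path("rumor_dataset.json")
    listed_value = "md5:364add4b9f575b8171b61befdfe939fd"
    if candidate.exists():
        print("checksum OK" if matches_listing(candidate, listed_value) else "checksum mismatch")
```

Reading in chunks matters for the largest file (3.6 GB), which may not fit comfortably in memory if read in a single call.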