Published October 22, 2020 | Version v1
Dataset Open

Sentiment Quantification Datasets

  • 1. ISTI-CNR

Description

These files contain the tokenized reviews used for quantification experiments on text.

IMDB is derived from the IMDB dataset from Maas et al., 2011 (https://ai.stanford.edu/~amaas/data/sentiment/).
The IMDB content in this dataset has undergone only minimal processing with respect to the original dataset; it is provided to ensure reproducibility of experiments.

The HP and Kindle datasets are Amazon reviews collected by the authors. The reviews are about the books in the Harry Potter series and the Kindle e-book reader, respectively.

Files

hp_test.txt

Files (99.2 MB)

Checksum (md5)                      Size
1308ca699724a04fad1aef1a016e4f03    15.7 MB
c0afaf100cc53125bef1cea102991480    5.2 MB
8b083174d5af1ef0871a43a7ce1430f5    30.8 MB
e1db754cf4c0b831c1b4dbd4ba9f96af    32.1 MB
973f217f94202ade02ada763bdab2ecd    12.3 MB
fff0a2e54c533bf12d1d04e41230f515    3.1 MB
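
The md5 checksums in the file listing above can be used to confirm that a download arrived intact. The following is a minimal sketch, not part of the original record: the helper names `md5_of` and `is_listed` are hypothetical, and because the listing does not show which file name belongs to which checksum, the sketch only checks whether a file's digest matches any of the six listed values.

```python
import hashlib
import sys

# Checksums copied from the file listing. The listing does not pair them
# with file names, so they are kept as an unordered set (an assumption of
# this sketch, not a statement about the dataset layout).
LISTED_MD5 = {
    "1308ca699724a04fad1aef1a016e4f03",
    "c0afaf100cc53125bef1cea102991480",
    "8b083174d5af1ef0871a43a7ce1430f5",
    "e1db754cf4c0b831c1b4dbd4ba9f96af",
    "973f217f94202ade02ada763bdab2ecd",
    "fff0a2e54c533bf12d1d04e41230f515",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """Return True if the file's MD5 digest matches any listed checksum."""
    return md5_of(path) in LISTED_MD5


if __name__ == "__main__":
    # Usage: python verify_md5.py <downloaded files...>
    for name in sys.argv[1:]:
        status = "OK" if is_listed(name) else "MISMATCH"
        print(f"{status}  {name}")
```

Reading in chunks keeps memory use flat for the larger files (up to about 32 MB here). A file reported as MISMATCH is either corrupted, incomplete, or not part of this record.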

Additional details

Related works

Is supplement to
Conference paper: 10.1145/3269206.3269287 (DOI)
Journal article: 10.1145/2700406 (DOI)
Software: https://github.com/HLT-ISTI/QuaNet (URL)

Funding

SoBigData – SoBigData Research Infrastructure 654024
European Commission