Published July 9, 2020 | Version v1.0.0
Dataset Open

Bootstrapped Lexicon of English Polarity Shifters

  • 1. Spoken Language Systems, Saarland University
  • 2. Institute for German Language, Mannheim

Description

We provide a bootstrapped lexicon of English polarity shifters and their shifting direction. We cover verbs, nouns and adjectives. Our lexicon provides 2631 shifters among a vocabulary of 9145 words, taken from WordNet v3.1 (Miller et al., 1990).

We also provide a dataset of 2631 verb phrases that are annotated for shifting polarities. The phrases are taken from the Amazon Product Review Data corpus (Jindal & Liu, 2008).

Data

1. Polarity Shifter Lexicon

A list of 9145 words, annotated for whether they are polarity shifters. Contains 2631 shifters and 6514 non-shifters.

  • File: shifters.txt
  • The lexicon is a comma-separated values (CSV) table
  • Each line follows the format POS,LEMMA,SHIFTER_LABEL,SOURCE.
    • POS: The part of speech of the word (verb, noun or adj)
    • LEMMA: The lemma representation of the word in question. Multiword expressions are separated by an underscore (WORD_WORD).
    • SHIFTER_LABEL: Whether the word is a polarity shifter (SHIFTER) or a non-shifter (NONSHIFTER)
    • SOURCE: Whether the word was part of the gold standard (GOLD_STANDARD) or was bootstrapped (BOOTSTRAPPED). All labels, both from gold standard and bootstrap output, were verified by a human annotator.
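
The four-column format described above can be read with Python's standard `csv` module. The following is a minimal sketch, not part of the dataset: the function name `read_lexicon` is hypothetical, the sample rows are invented for illustration, and the exact spelling of the POS values (`verb`, `noun`, `adj`) is an assumption.

```python
import csv
from collections import Counter
from io import StringIO


def read_lexicon(handle):
    """Yield (pos, lemma, is_shifter, source) tuples from a lexicon file handle."""
    for row in csv.reader(handle):
        if len(row) != 4:
            continue  # skip blank or malformed lines
        pos, lemma, label, source = (field.strip() for field in row)
        yield pos, lemma, label == "SHIFTER", source


# Hypothetical sample rows, invented for illustration; not copied from shifters.txt.
sample = StringIO(
    "verb,abandon,SHIFTER,GOLD_STANDARD\n"
    "noun,lack,SHIFTER,BOOTSTRAPPED\n"
    "adj,green,NONSHIFTER,BOOTSTRAPPED\n"
)

entries = list(read_lexicon(sample))

# Count shifters per part of speech.
shifter_counts = Counter(pos for pos, _, is_shifter, _ in entries if is_shifter)
```

To read the actual file, the same function could be called on `open("shifters.txt", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption.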

2. Sentiment Verb Phrases

A set of verb phrases, annotated for the polarity of the verb phrase and the polarity of a polar noun that it contains. Can be used to evaluate whether a polarity classifier correctly recognizes polarity shifting. The file starts with 400 phrases containing shifter verbs, followed by 2231 phrases containing non-shifter verbs.

  • File: sentiment_phrases.txt
  • Every item consists of:
    • The sentence from which the VP and the polar noun were extracted.
    • The VP, polar noun and the verb heading the VP.
    • Constituency parse for the VP.
    • Gold labels for VP and polar noun by a human annotator.
    • Predicted labels for VP and polar noun by RNTN tagger (Socher et al., 2013) and LEX_gold approach.
  • Items are separated by a line of asterisks (*)
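
Because items are delimited by lines of asterisks, the file can be split into items before any further processing. The sketch below is one possible approach under stated assumptions: a separator is taken to be any line consisting solely of asterisks, blank lines inside an item are dropped, and the function name `split_items` and the placeholder lines are hypothetical. The internal layout of each item is not parsed here.

```python
import re

# Assumption: a separator is any line made up only of asterisks.
SEPARATOR = re.compile(r"^\*+\s*$")


def split_items(lines):
    """Group lines into items, closing the current item at each separator line."""
    item = []
    for line in lines:
        if SEPARATOR.match(line):
            if item:
                yield item
            item = []
        elif line.strip():
            item.append(line.rstrip("\n"))
    if item:
        yield item


# Placeholder lines standing in for real items; not copied from sentiment_phrases.txt.
sample = [
    "first item, line 1\n",
    "first item, line 2\n",
    "**********\n",
    "second item, line 1\n",
    "**********\n",
]

items = list(split_items(sample))
```

Since the file lists the 400 shifter-verb phrases first, slicing the result of a full read as `items[:400]` and `items[400:]` should separate the two groups, provided the split yields exactly one entry per phrase.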

Attribution
This dataset was created as part of the following publication:

Schulder, Marc and Wiegand, Michael and Ruppenhofer, Josef (2020). "Automatic Generation of Lexica for Sentiment Polarity Shifters". In: Natural Language Engineering. doi:10.1017/S135132492000039X

If you use the data in your research or work, please cite the publication.

Notes

This work was partially supported by the German Research Foundation (DFG) under grants RU 1873/2-1 and WI 4204/2-1.

Files

Files (1.6 MB)

  • md5:4a17ffc27c9f3b240fbf4fe17783c89c (18.6 kB)
  • md5:a25c1f6632608519137ae3916cd2640a (4.5 kB)
  • md5:e5524ce70e76b7c33a9e0580a2c7023b (1.2 MB)
  • md5:95d54ac073c7c2d6c5ef021c4aaefe13 (351.5 kB)
