Published November 27, 2017 | Version 1.0.0
Dataset Open

Word Embedding of Amazon Product Review Corpus

  • 1. Spoken Language Systems, Saarland University


A word embedding of the Amazon Product Review Corpus (Jindal and Liu, 2008).

Created using Word2Vec in CBOW mode, with 500 dimensions and a window size of 5.
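
To illustrate what the window size means in CBOW mode, the following minimal sketch enumerates the (context, target) pairs a CBOW model learns from: each word is predicted from up to `window` words on either side. The function name `cbow_examples` and the example sentence are hypothetical and not part of the dataset. The sketch always uses the full window, whereas the original Word2Vec implementation samples a smaller effective window at random for each position, so this shows the idea rather than the exact training procedure used here.

```python
from typing import Iterator, List, Tuple


def cbow_examples(tokens: List[str], window: int = 5) -> Iterator[Tuple[List[str], str]]:
    """Yield (context_words, target_word) pairs for one tokenised sentence.

    Hypothetical helper for illustration only: the context is every token
    within `window` positions of the target, excluding the target itself.
    """
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]      # up to `window` tokens before the target
        right = tokens[i + 1:i + 1 + window]     # up to `window` tokens after the target
        context = left + right
        if context:                              # skip one-word sentences with no context
            yield context, target


# Made-up lemmatised sentence; a small window keeps the printout short.
sentence = ["the", "music", "calm_down", "my", "dog"]
for context, target in cbow_examples(sentence, window=2):
    print(target, "<-", context)
```

With the dataset's window size of 5, each target word would be predicted from at most ten surrounding tokens.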

Words have been lemmatised and particle verbs have been merged into a single token (e.g. calm_down).
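
Because of this preprocessing, a query such as "calm down" has to be turned into the single token calm_down before it can be looked up in the embedding. The sketch below shows one simple way to do that for already lemmatised input. It is an assumption-laden illustration, not the procedure used to build the dataset: the lookup table `PARTICLE_VERBS`, the entry give_up, and the function `merge_particle_verbs` are hypothetical, and the sketch only merges a verb with a particle that directly follows it.

```python
from typing import List, Set, Tuple

# Hypothetical lookup table of (verb lemma, particle) pairs.
# The dataset description does not say how particle verbs were identified.
PARTICLE_VERBS: Set[Tuple[str, str]] = {("calm", "down"), ("give", "up")}


def merge_particle_verbs(
    lemmas: List[str],
    particle_verbs: Set[Tuple[str, str]] = PARTICLE_VERBS,
) -> List[str]:
    """Join adjacent verb + particle lemmas into one underscore-separated token."""
    merged: List[str] = []
    i = 0
    while i < len(lemmas):
        pair_found = (
            i + 1 < len(lemmas)
            and (lemmas[i], lemmas[i + 1]) in particle_verbs
        )
        if pair_found:
            merged.append(lemmas[i] + "_" + lemmas[i + 1])
            i += 2  # consume both the verb and its particle
        else:
            merged.append(lemmas[i])
            i += 1
    return merged


print(merge_particle_verbs(["the", "music", "calm", "down", "the", "dog"]))
# → ['the', 'music', 'calm_down', 'the', 'dog']
```

Particles separated from their verb (as in "calm the dog down") are not handled by this sketch and would need syntactic information to resolve.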



This dataset was created as part of the following publication:

Marc Schulder, Michael Wiegand, Josef Ruppenhofer and Benjamin Roth (2017). "Towards Bootstrapping a Polarity Shifter Lexicon using Linguistic Features". Proceedings of the 8th International Joint Conference on Natural Language Processing (IJCNLP). Taipei, Taiwan, November 27 - December 3, 2017. DOI: 10.5281/zenodo.3365609.

If you use the data in your research or work, please cite the publication.



Files (2.9 GB)

Additional details

Related works

Is supplement to
Conference paper: 10.5281/zenodo.3365609 (DOI)


  • Jindal, Nitin and Bing Liu (2008). "Opinion Spam and Analysis." In: Proceedings of the International Conference on Web Search and Data Mining (WSDM). Palo Alto, California, USA: Association for Computing Machinery, pp. 219–230. ISBN: 978-1-59593-927-2. DOI: 10.1145/1341531.1341560