Dataset Open Access

Word Embedding of Amazon Product Review Corpus

Schulder, Marc; Wiegand, Michael

A word embedding of the Amazon Product Review Corpus (Jindal and Liu, 2008).

The embedding was trained with Word2Vec in CBOW mode, using 500 dimensions and a window size of 5.

Words have been lemmatised and particle verbs have been merged into a single token (e.g. calm_down).
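
The following is a minimal sketch of how the vectors might be read without loading the full 2.9 GB file into memory. It assumes the .txt file uses the common word2vec text layout (an optional header line with vocabulary size and dimensionality, then one token per line followed by 500 space-separated values); this record does not confirm the layout, so check the first lines of the file before relying on it. The function names iter_vectors and load_selected, and the UTF-8 encoding in the usage comment, are illustrative assumptions.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple


def iter_vectors(lines: Iterable[str], dim: int = 500) -> Iterator[Tuple[str, List[float]]]:
    """Yield (token, vector) pairs from word2vec-style text lines.

    Assumed layout (not confirmed by the dataset record): an optional
    "<vocab_size> <dim>" header, then "<token> <v1> ... <v_dim>" per line.
    """
    for line_no, line in enumerate(lines):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        # Treat a first line made of exactly two integers as the header.
        if line_no == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
            continue
        if len(parts) != dim + 1:
            raise ValueError(f"line {line_no + 1}: expected {dim + 1} fields, got {len(parts)}")
        yield parts[0], [float(x) for x in parts[1:]]


def load_selected(lines: Iterable[str], wanted: Set[str], dim: int = 500) -> Dict[str, List[float]]:
    """Collect vectors only for the tokens in `wanted`, streaming the input."""
    found: Dict[str, List[float]] = {}
    for token, vector in iter_vectors(lines, dim):
        if token in wanted:
            found[token] = vector
            if len(found) == len(wanted):
                break  # stop early once every requested token was seen
    return found


# Hypothetical usage. Tokens are lemmas, and particle verbs are single
# tokens joined by an underscore, so look up "calm_down", not "calm down":
#
# path = "amazon_product_review_corpus.particle_verbs.cbow.w5.d500.txt"
# with open(path, encoding="utf-8") as handle:
#     vectors = load_selected(handle, {"calm_down", "good"})
```

Because the words were lemmatised before training, query terms should be lemmatised the same way before lookup; inflected forms such as "calmed_down" are unlikely to be in the vocabulary.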

Attribution

This dataset was created as part of the following publication:

Marc Schulder, Michael Wiegand, Josef Ruppenhofer and Benjamin Roth (2017). "Towards Bootstrapping a Polarity Shifter Lexicon using Linguistic Features". Proceedings of the 8th International Joint Conference on Natural Language Processing (IJCNLP). Taipei, Taiwan, November 27 - December 3, 2017. DOI: 10.5281/zenodo.3365609.

If you use the data in your research or work, please cite the publication.

Files (2.9 GB)

amazon_product_review_corpus.particle_verbs.cbow.w5.d500.txt (2.9 GB)
md5: 0473c85b76f8057535944fc52911c470

amazon_product_review_corpus.particle_verbs.cbow.w5.d500.voc (8.2 MB)
md5: 228b01ddffe135922c762b1bc3a72501
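
Since the main file is large, it may be worth checking a download against the MD5 checksums listed above. The sketch below streams the file in chunks so that it never needs to fit in memory. The helper names md5_of_file and verify are illustrative, and the file names are assumed to be unchanged after download.

```python
import hashlib

# Checksums as listed in this record.
EXPECTED_MD5 = {
    "amazon_product_review_corpus.particle_verbs.cbow.w5.d500.txt": "0473c85b76f8057535944fc52911c470",
    "amazon_product_review_corpus.particle_verbs.cbow.w5.d500.voc": "228b01ddffe135922c762b1bc3a72501",
}


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected_md5: str) -> bool:
    """Check whether the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected_md5.lower()


# Hypothetical usage, assuming both files sit in the current directory:
#
# for name, checksum in EXPECTED_MD5.items():
#     print(name, "OK" if verify(name, checksum) else "MISMATCH")
```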

References

Jindal, Nitin and Bing Liu (2008). "Opinion Spam and Analysis." In: Proceedings of the International Conference on Web Search and Data Mining (WSDM). Palo Alto, California, USA: Association for Computing Machinery, pp. 219–230. ISBN: 978-1-59593-927-2. DOI: 10.1145/1341531.1341560
