Dataset Open Access

Word Embedding of Amazon Product Review Corpus

Schulder, Marc; Wiegand, Michael


JSON Export

{
  "files": [
    {
      "links": {
        "self": "https://zenodo.org/api/files/55a96a3f-7229-4d99-9a70-b7ee6d0c5195/amazon_product_review_corpus.particle_verbs.cbow.w5.d500.txt"
      }, 
      "checksum": "md5:0473c85b76f8057535944fc52911c470", 
      "bucket": "55a96a3f-7229-4d99-9a70-b7ee6d0c5195", 
      "key": "amazon_product_review_corpus.particle_verbs.cbow.w5.d500.txt", 
      "type": "txt", 
      "size": 2920592352
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/55a96a3f-7229-4d99-9a70-b7ee6d0c5195/amazon_product_review_corpus.particle_verbs.cbow.w5.d500.voc"
      }, 
      "checksum": "md5:228b01ddffe135922c762b1bc3a72501", 
      "bucket": "55a96a3f-7229-4d99-9a70-b7ee6d0c5195", 
      "key": "amazon_product_review_corpus.particle_verbs.cbow.w5.d500.voc", 
      "type": "voc", 
      "size": 8200324
    }
  ], 
  "owners": [
    74115
  ], 
  "doi": "10.5281/zenodo.3370051", 
  "stats": {
    "version_unique_downloads": 42.0, 
    "unique_views": 164.0, 
    "views": 190.0, 
    "version_views": 190.0, 
    "unique_downloads": 42.0, 
    "version_unique_views": 164.0, 
    "volume": 131607062968.0, 
    "version_downloads": 67.0, 
    "downloads": 67.0, 
    "version_volume": 131607062968.0
  }, 
  "links": {
    "doi": "https://doi.org/10.5281/zenodo.3370051", 
    "conceptdoi": "https://doi.org/10.5281/zenodo.3370050", 
    "bucket": "https://zenodo.org/api/files/55a96a3f-7229-4d99-9a70-b7ee6d0c5195", 
    "conceptbadge": "https://zenodo.org/badge/doi/10.5281/zenodo.3370050.svg", 
    "html": "https://zenodo.org/record/3370051", 
    "latest_html": "https://zenodo.org/record/3370051", 
    "badge": "https://zenodo.org/badge/doi/10.5281/zenodo.3370051.svg", 
    "latest": "https://zenodo.org/api/records/3370051"
  }, 
  "conceptdoi": "10.5281/zenodo.3370050", 
  "created": "2019-08-16T21:42:38.044559+00:00", 
  "updated": "2020-07-09T12:52:57.594303+00:00", 
  "conceptrecid": "3370050", 
  "revision": 14, 
  "id": 3370051, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.3370051", 
    "description": "<p>A word embedding of the <a href=\"https://www.cs.uic.edu/~liub/FBS/sentiment-analysis.html#datasets\">Amazon Product Review Corpus</a> (<a href=\"https://www.doi.org/10.1145/1341531.1341560\">Jindal and Liu, 2008</a>).</p>\n\n<p>Created using <a href=\"https://code.google.com/archive/p/word2vec/\">Word2Vec</a> in CBOW mode with 500 dimensions and a window size of 5.</p>\n\n<p>Words have been lemmatised and particle verbs have been merged into a single token (e.g. <code>calm_down</code>).</p>\n\n<p><strong>Attribution</strong></p>\n\n<p>This dataset was created as part of the following publication:</p>\n\n<p>Marc Schulder,&nbsp;Michael Wiegand,&nbsp;Josef Ruppenhofer&nbsp;and&nbsp;Benjamin Roth&nbsp;(2017).&nbsp;<strong>&quot;Towards Bootstrapping a Polarity Shifter Lexicon using Linguistic Features&quot;</strong>. Proceedings of the 8th International Joint Conference on Natural Language Processing (IJCNLP). Taipei, Taiwan, November 27 - December 3, 2017.&nbsp;<a href=\"https://doi.org/10.5281/zenodo.3365609\">DOI: 10.5281/zenodo.3365609</a>.</p>\n\n<p>If you use the data in your research or work, please cite the publication.</p>", 
    "language": "eng", 
    "title": "Word Embedding of Amazon Product Review Corpus", 
    "license": {
      "id": "CC-BY-4.0"
    }, 
    "relations": {
      "version": [
        {
          "count": 1, 
          "index": 0, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "3370050"
          }, 
          "is_last": true, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "3370051"
          }
        }
      ]
    }, 
    "communities": [
      {
        "id": "natural-language-processing"
      }
    ], 
    "version": "1.0.0", 
    "references": [
      "Jindal, Nitin and Bing Liu (2008). \"Opinion Spam and Analysis.\" In: Proceedings of the International Conference on Web Search and Data Mining (WSDM). Palo Alto, California, USA: Association for Computing Machinery, pp. 219\u2013230. isbn: 978-1-59593-927-2. doi: 10.1145/1341531.1341560"
    ], 
    "keywords": [
      "Word Embedding", 
      "Product Reviews"
    ], 
    "publication_date": "2017-11-27", 
    "creators": [
      {
        "orcid": "0000-0002-4183-8489", 
        "affiliation": "Spoken Language Systems, Saarland University", 
        "name": "Schulder, Marc"
      }, 
      {
        "affiliation": "Spoken Language Systems, Saarland University", 
        "name": "Wiegand, Michael"
      }
    ], 
    "access_right": "open", 
    "resource_type": {
      "type": "dataset", 
      "title": "Dataset"
    }, 
    "related_identifiers": [
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.3365609", 
        "relation": "isSupplementTo", 
        "resource_type": "publication-conferencepaper"
      }, 
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.3370050", 
        "relation": "isVersionOf"
      }
    ]
  }
}
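
Each entry in the "files" list above carries a "checksum" of the form "md5:<hex digest>" and a "size" in bytes, which can be used to confirm that a download completed intact. The following is a minimal sketch of such a check, not part of the dataset or of any Zenodo tooling. The function names (parse_checksum, file_digest, verify_download) are hypothetical, and it assumes the file has already been saved locally and that the entry is passed in as a plain dict with the same keys as in the export.

```python
import hashlib
from pathlib import Path


def parse_checksum(field: str) -> tuple[str, str]:
    """Split a checksum field such as 'md5:0473...' into (algorithm, hex digest)."""
    algorithm, _, digest = field.partition(":")
    return algorithm.lower(), digest.lower()


def file_digest(path: Path, algorithm: str = "md5", chunk_size: int = 1 << 20) -> str:
    """Hash a file in fixed-size chunks so large files are never held in memory whole."""
    hasher = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify_download(path: Path, entry: dict) -> bool:
    """Compare a local file against one entry of the record's "files" list.

    The cheap size comparison runs first; the digest is only computed
    when the size already matches.
    """
    if path.stat().st_size != entry["size"]:
        return False
    algorithm, expected = parse_checksum(entry["checksum"])
    return file_digest(path, algorithm) == expected


# Example call (assumes the .voc file sits in the current directory):
# entry = {"checksum": "md5:228b01ddffe135922c762b1bc3a72501", "size": 8200324}
# verify_download(Path("amazon_product_review_corpus.particle_verbs.cbow.w5.d500.voc"), entry)
```

Reading in chunks matters here because the .txt embedding file is listed at roughly 2.9 GB; the 1 MiB chunk size is an arbitrary choice.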
