Dataset Open Access

Sentiment analysis in Galaxy with IMDB movie review dataset

Kaivan Kamali


JSON-LD (schema.org) Export

{
  "inLanguage": {
    "alternateName": "eng", 
    "@type": "Language", 
    "name": "English"
  }, 
  "description": "<p>IMDB movie review sentiment classification dataset (Andrew L. Maas, Raymond E. Daly, Peter T. Pham, Dan Huang, Andrew Y. Ng, and Christopher Potts. (2011). Learning Word Vectors for Sentiment Analysis. The 49th Annual Meeting of the Association for Computational Linguistics (ACL 2011)). For more information, please refer to: https://ai.stanford.edu/~amaas/data/sentiment/<br>\n<br>\nThe IMDB dataset was modified as follows to prepare it for use in a Galaxy Training Tutorial (https://training.galaxyproject.org/):<br>\n<br>\nThe 50 most frequent words (mostly stop words) are excluded, and the next 10,000 most frequent words are kept. Reviews are limited to 500 words: longer reviews are trimmed and shorter reviews are padded. 25,000 reviews are used for training and 25,000 for testing. Files are in TSV (tab-separated values) format for use in Galaxy (www.usegalaxy.org).</p>", 
  "license": "https://creativecommons.org/licenses/by/4.0/legalcode", 
  "creator": [
    {
      "affiliation": "Penn State University", 
      "@type": "Person", 
      "name": "Kaivan Kamali"
    }
  ], 
  "url": "https://zenodo.org/record/4477881", 
  "datePublished": "2021-01-28", 
  "version": "1.0", 
  "keywords": [
    "IMDB", 
    "Sentiment Analysis", 
    "Movie reviews"
  ], 
  "@context": "https://schema.org/", 
  "distribution": [
    {
      "contentUrl": "https://zenodo.org/api/files/47b8692d-2671-4082-ac8f-746a0783cfa1/X_test.tsv", 
      "encodingFormat": "tsv", 
      "@type": "DataDownload"
    }, 
    {
      "contentUrl": "https://zenodo.org/api/files/47b8692d-2671-4082-ac8f-746a0783cfa1/X_train.tsv", 
      "encodingFormat": "tsv", 
      "@type": "DataDownload"
    }, 
    {
      "contentUrl": "https://zenodo.org/api/files/47b8692d-2671-4082-ac8f-746a0783cfa1/y_test.tsv", 
      "encodingFormat": "tsv", 
      "@type": "DataDownload"
    }, 
    {
      "contentUrl": "https://zenodo.org/api/files/47b8692d-2671-4082-ac8f-746a0783cfa1/y_train.tsv", 
      "encodingFormat": "tsv", 
      "@type": "DataDownload"
    }
  ], 
  "identifier": "https://doi.org/10.5281/zenodo.4477881", 
  "@id": "https://doi.org/10.5281/zenodo.4477881", 
  "@type": "Dataset", 
  "name": "Sentiment analysis in Galaxy with IMDB movie review dataset"
}
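
The preprocessing summarized in the description can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script used to build the dataset: it assumes each review is already encoded as a list of 1-based word frequency ranks, that out-of-range words are dropped rather than replaced, that trimming and padding happen at the end of the review, and that the padding value is 0. None of these details are stated in the record, and the name `encode_review` is hypothetical.

```python
from typing import List

SKIP_TOP = 50        # most frequent words excluded (from the description)
VOCAB_SIZE = 10_000  # next most frequent words kept (from the description)
MAX_LEN = 500        # fixed review length in words (from the description)
PAD = 0              # assumption: the padding value is not stated in the record


def encode_review(ranks: List[int]) -> List[int]:
    """Filter a review given as 1-based word frequency ranks, then trim or pad it."""
    # Keep ranks 51..10050: skip the top 50 words, keep the next 10,000.
    # Assumption: words outside that range are dropped, not replaced by a marker.
    kept = [r for r in ranks if SKIP_TOP < r <= SKIP_TOP + VOCAB_SIZE]
    # Assumption: trim from the end and pad at the end; the record does not say which side.
    kept = kept[:MAX_LEN]
    return kept + [PAD] * (MAX_LEN - len(kept))


if __name__ == "__main__":
    short = encode_review([1, 51, 10050, 10051, 60])
    print(len(short), short[:4])
```

Every encoded review has the same length, which is what lets the rows of a file such as X_train.tsv share a fixed number of columns.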
                   All versions   This version
Views              146            146
Downloads          166            166
Data volume        11.3 GB        11.3 GB
Unique views       121            121
Unique downloads   63             63
