Published April 1, 2014 | Version v1
Dataset Open

ArguAna TripAdvisor

  • 1. Paderborn University
  • 2. Bauhaus-Universität Weimar

Description

An English corpus for studying local sentiment flows and aspect-based sentiment analysis. It contains 2100 hotel reviews, balanced with respect to the reviews’ sentiment scores. All reviews are segmented into subsentence-level statements, each of which has been manually classified as a fact, a positive opinion, or a negative opinion. All hotel aspects mentioned in the reviews have also been annotated. The annotated corpus is available in two versions:

  • arguana-tripadvisor-annotated-plus-software-v1.zip
  • arguana-tripadvisor-annotated-v2.zip

In addition, we provide nearly 200k further hotel reviews without manual annotations:

  • v1 upon request
  • arguana-tripadvisor-unannotated-v2.zip

The corpus is free to use for scientific purposes, but not for commercial applications. In version 2, the annotated XMI files have been changed to follow a new underlying type system that is easier to extend. Note that the software of version 1 needs some adaptations to work with version 2.

If you publish any results related to the ArguAna TripAdvisor corpus, please cite our CICLing 2014 paper.

Files

Files (279.1 MB)

Checksum                                Size
md5:ef11039ebbd5088784cdf2d37bc0b65f    12.1 MB
md5:a450bbdbbf888fb171783b62aa81e332    9.6 MB
md5:85e7c4f4142fc6bfdec1ad671cd78cdb    257.4 MB
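
The MD5 checksums in the listing above can be used to check that a downloaded archive arrived intact. The following is a minimal sketch, not part of the corpus or its software: the function names `md5_of_file` and `matches_listed_checksum` are hypothetical, and because the listing does not say which checksum belongs to which archive, the sketch only tests whether a file's digest appears among the three listed values.

```python
import hashlib
import sys

# Checksums copied from the file listing. The listing does not state which
# archive each checksum belongs to, so they are kept as an unordered set.
LISTED_MD5 = {
    "ef11039ebbd5088784cdf2d37bc0b65f",
    "a450bbdbbf888fb171783b62aa81e332",
    "85e7c4f4142fc6bfdec1ad671cd78cdb",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are never held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path, listed=LISTED_MD5):
    """Return True if the file's MD5 digest is one of the listed values."""
    return md5_of_file(path) in listed


if __name__ == "__main__":
    # Usage (hypothetical script name): python verify_md5.py file1.zip file2.zip
    for name in sys.argv[1:]:
        status = "OK" if matches_listed_checksum(name) else "MISMATCH"
        print(f"{status}  {name}")
```

A reported mismatch most likely indicates an incomplete or corrupted download, in which case fetching the archive again is the usual remedy.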