ArguAna TripAdvisor
- 1. Paderborn University
- 2. Bauhaus-Universität Weimar
Description
An English corpus for studying local sentiment flows and aspect-based sentiment analysis. It contains 2100 hotel reviews, balanced with respect to the reviews’ sentiment scores. All reviews are segmented into subsentence-level statements, each of which has been manually classified as a fact, a positive opinion, or a negative opinion. In addition, all hotel aspects mentioned in the reviews have been annotated. The annotated reviews are available in the following archives:
- arguana-tripadvisor-annotated-plus-software-v1.zip
- arguana-tripadvisor-annotated-v2.zip
In addition, we provide nearly 200k further hotel reviews without manual annotations:
- v1 upon request
- arguana-tripadvisor-unannotated-v2.zip
The corpus is free to use for scientific purposes, but not for commercial applications. In version 2, the annotated XMI files have been changed to follow a new underlying type system that is easier to extend. Note that the software shipped with version 1 needs some adaptations to work with version 2.
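Since the type system differs between versions, it can help to check which annotation types an archive actually contains before adapting any code. The following is a minimal sketch, not part of the corpus software: it assumes the annotated files inside the zip archive end in `.xmi`, and the function name `count_xmi_elements` is hypothetical. It makes no assumption about the names defined by the type system and simply counts whatever XML elements it finds.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter


def count_xmi_elements(archive):
    """Count XML element names across all .xmi files in a zip archive.

    `archive` may be a file path or a binary file-like object.
    Assumption: annotated files carry the ".xmi" extension.
    """
    counts = Counter()
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(".xmi"):
                continue  # skip software, documentation, and other files
            with zf.open(name) as fh:
                for _, elem in ET.iterparse(fh, events=("end",)):
                    # ElementTree reports tags as "{namespace}local";
                    # keep only the local name.
                    counts[elem.tag.rsplit("}", 1)[-1]] += 1
    return counts
```

Calling `count_xmi_elements("arguana-tripadvisor-annotated-v2.zip").most_common(10)` on a downloaded archive would list the most frequent element names, which gives a quick view of the types that code written for version 1 has to be mapped to.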
In case you publish any results related to the ArguAna TripAdvisor corpus, please cite our CICLing 2014 paper.