Published August 17, 2021 | Version 1.0
Dataset Restricted

SEN - Sentiment analysis of Entities in News headlines

  • 1. Polish-Japanese Academy of Information Technology
  • 2. Polish-Japanese Academy of Information Technology / Institute of Computer Science Polish Academy of Sciences

Description

If you wish to use this data please cite:

Katarzyna Baraniak, Marcin Sydow,
A dataset for Sentiment analysis of Entities in News headlines (SEN),
Procedia Computer Science,
Volume 192,
2021,
Pages 3627-3636,
ISSN 1877-0509,
https://doi.org/10.1016/j.procs.2021.09.136.
(https://www.sciencedirect.com/science/article/pii/S1877050921018755)

bibtex: users.pja.edu.pl/~msyd/bibtex/sydow-baraniak-SENdataset-kes21.bib
 

SEN is a novel, publicly available, human-labelled dataset for training and testing machine learning algorithms on the problem of entity-level sentiment analysis of political news headlines.

On-line news portals play a very important role in the information society. Fair media should present reliable and objective information. In practice, however, there is an observable positive or negative bias concerning named entities (e.g. politicians) mentioned in on-line news headlines.
Our dataset consists of 3819 human-labelled political news headlines coming from several major on-line media outlets in English and Polish.

Each record contains a news headline, a named entity mentioned in the headline, and a human-annotated label (one of “positive”, “neutral”, “negative”). The SEN dataset package consists of 2 parts: SEN-en (English headlines, split into SEN-en-R and SEN-en-AMT) and SEN-pl (Polish headlines). Each headline-entity pair was annotated either by a team of volunteer researchers (the whole SEN-pl dataset and a subset of 1271 English records: the SEN-en-R subset, “R” for “researchers”) or via the Amazon Mechanical Turk service (a subset of 1360 English records: the SEN-en-AMT subset).
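The record structure described above can be illustrated with a minimal Python sketch. This is only an illustration under stated assumptions: the field names (headline, entity, label), the SenRecord class and the label_distribution helper are hypothetical and do not reflect the actual column names or file format of the SEN data files, and the example records are invented placeholders rather than rows from the dataset.

```python
from collections import Counter
from dataclasses import dataclass

# The three sentiment labels named in the dataset description.
ALLOWED_LABELS = {"positive", "neutral", "negative"}


@dataclass(frozen=True)
class SenRecord:
    """One headline-entity pair (hypothetical field names)."""
    headline: str
    entity: str
    label: str

    def __post_init__(self):
        # Reject any label outside the three documented classes.
        if self.label not in ALLOWED_LABELS:
            raise ValueError(f"unexpected label: {self.label!r}")


def label_distribution(records):
    """Count how many records carry each sentiment label."""
    return Counter(r.label for r in records)


# Invented placeholder records, NOT taken from the SEN dataset.
records = [
    SenRecord("Placeholder headline praising Entity A", "Entity A", "positive"),
    SenRecord("Placeholder headline mentioning Entity B", "Entity B", "neutral"),
    SenRecord("Placeholder headline criticising Entity A", "Entity A", "negative"),
    SenRecord("Another placeholder headline about Entity B", "Entity B", "neutral"),
]

distribution = label_distribution(records)
print(dict(distribution))
```

Checking the label distribution in this way may be a useful first step before training, since entity-level sentiment data is often imbalanced towards one class; whether that holds for SEN should be verified against the paper and the actual files.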

During the analysis of the annotations, outlying annotations were identified and removed. A separate version of the dataset without outliers is marked by "noutliers" in the data file name.

Details of the dataset preparation process and its analysis are presented in the paper.
 

In case of any questions, please contact one of the authors. Email addresses are in the paper.

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Request access

If you would like to request access to these files, please fill out the form below.

You need to satisfy these conditions in order for this request to be accepted:

Data can be used only for research purposes.

After submitting a data request, you should receive a confirmation email. If you do not, please resend the request.


Additional details

Related works

Is published in
Conference paper: 10.1016/j.procs.2021.09.136 (DOI)