MBIC – A Media Bias Annotation Dataset Including Annotator Characteristics

Spinde, Timo and Rudnitckaia, Lada and Sinha, Kanishka and Hamborg, Felix and Gipp, Bela and Donnay, Karsten

doi:10.5281/zenodo.4474336

Published January 27, 2021 | Version v1

Conference paper | Open

Many people consider news articles to be a reliable source of information on current events. However, due to the range of factors influencing news agencies, such coverage may not always be impartial. Media bias, or slanted news coverage, can have a substantial impact on public perception of events and, accordingly, can potentially alter the beliefs and views of the public. The main data gap in current research on media bias detection is the lack of a robust, representative, and diverse dataset containing annotations of biased words and sentences. In particular, existing datasets do not control for the individual background of annotators, which may affect their assessment and thus represents critical information for contextualizing their annotations. In this poster, we present a matrix-based methodology to crowdsource such data using a self-developed annotation platform. We also present MBIC (Media Bias Including Characteristics), the first sample of 1,700 statements representing various media bias instances. The statements were reviewed by ten annotators each and contain labels for media bias identification on both the word and sentence level. MBIC is the first available dataset about media bias that reports detailed information on annotator characteristics and their individual background. The current dataset already significantly extends existing data in this domain, providing unique and more reliable insights into the perception of bias. In the future, we will further extend it with respect to both the number of articles and the number of annotators per article.

Notes (English)

The files were slightly modified on 2024-01-31 due to exposure of sensitive information.

Files (4.8 MB)

Name	MD5	Size
annotations.xlsx	7ee6b60316c64bc6dd7ff715bf69054b	2.6 MB
annotators.csv	3f983a0fc940d4be71cf0753d66f65b6	195.1 kB
labeled_dataset.xlsx	ff3b030047c2adbabd3458d513c0e661	2.0 MB
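
The MD5 checksums listed for the three files can be used to check that a download is complete and uncorrupted. The following is a minimal sketch, not part of the record itself: it assumes the files were saved under their listed names in one local directory, and the function names `md5_of_file` and `verify_downloads` are illustrative.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file list of this record.
EXPECTED_MD5 = {
    "annotations.xlsx": "7ee6b60316c64bc6dd7ff715bf69054b",
    "annotators.csv": "3f983a0fc940d4be71cf0753d66f65b6",
    "labeled_dataset.xlsx": "ff3b030047c2adbabd3458d513c0e661",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each listed file name to True (match), False (mismatch), or None (not found).

    Assumption: the files sit directly in `directory` under their listed names.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if path.is_file():
            results[name] = (md5_of_file(path) == expected)
        else:
            results[name] = None
    return results


if __name__ == "__main__":
    labels = {True: "OK", False: "MISMATCH", None: "not found"}
    for name, status in verify_downloads().items():
        print(f"{name}: {labels[status]}")
```

A mismatch suggests an incomplete download or a copy of the files that differs from the version the listed checksums describe.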

	All versions	This version
Views	2,366	1,665
Downloads	2,786	1,994
Data volume	3.6 GB	2.6 GB
