There is a newer version of this record available.

Dataset Restricted Access

SemEval 2019 Task 4 - Hyperpartisan News Detection

Johannes Kiesel; Martin Potthast; Maria Mestre; Rishabh Shukla; Benno Stein; David Corney; Emmanuel Vincent; Payam Adineh

Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="" xmlns:oai_dc="" xmlns:xsi="" xsi:schemaLocation="">
  <dc:creator>Johannes Kiesel</dc:creator>
  <dc:creator>Martin Potthast</dc:creator>
  <dc:creator>Maria Mestre</dc:creator>
  <dc:creator>Rishabh Shukla</dc:creator>
  <dc:creator>Benno Stein</dc:creator>
  <dc:creator>David Corney</dc:creator>
  <dc:creator>Emmanuel Vincent</dc:creator>
  <dc:creator>Payam Adineh</dc:creator>
  <dc:description>Trial dataset for the SemEval 2019 Task 4: Hyperpartisan News Detection.

The dataset contains 200,000 articles: 100,000 hyperpartisan and 100,000 least biased. All articles are labeled by the overall bias of the publisher as provided by BuzzFeed journalists or

The trial data is not fully cleaned. Due to an encoding error, some characters are replaced by question marks. Quote tags are mostly missing. Some text is duplicated. Also, some articles may appear several times if they were published by several publishers. These errors will be fixed in the final data.</dc:description>
  <dc:subject>Hyperpartisan news</dc:subject>
  <dc:subject>SemEval 2019</dc:subject>
  <dc:subject>SemEval 2019 Task 4</dc:subject>
  <dc:subject>Biased news</dc:subject>
  <dc:subject>News bias</dc:subject>
  <dc:title>SemEval 2019 Task 4 - Hyperpartisan News Detection</dc:title>
</oai_dc:dc>
Statistics (all versions / this version)
Views: 17,641 / 137
Downloads: 8,604 / 0
Data volume: 2.5 TB / 0 Bytes
Unique views: 14,763 / 132
Unique downloads: 2,322 / 0

