4105765
doi
10.5281/zenodo.4105765
oai:zenodo.org:4105765
user-webis
El Baff, Roxanne
German Aerospace Centre (DLR)
Al-Khatib, Khalid
Bauhaus Universität, Weimar
Kiesel, Johannes
Bauhaus Universität, Weimar
Stein, Benno
Bauhaus Universität, Weimar
Potthast, Martin
Leipzig University
Webis EditorialSum Corpus 2020
Syed, Shahbaz
Leipzig University
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International
https://creativecommons.org/licenses/by/4.0/legalcode
editorial summarization
argumentation summarization
extractive summarization
<p>The Webis EditorialSum Corpus consists of 1330 manually curated extractive summaries for 266 news editorials from three diverse portals: Al-Jazeera, the Guardian, and Fox News. Each editorial has 5 summaries, and each summary is labeled for overall quality and for fine-grained properties such as thesis-relevance, persuasiveness, reasonableness, and self-containedness.</p>
<p>The files are organized as follows:</p>
<p><br>
<em>corpus.csv</em> - <strong>Contains all the editorials and their acquired summaries</strong></p>
<p><br>
Note: X ranges from 1 to 5, one value for each of the five summaries.</p>
<p>- article_id : Article ID in the corpus<br>
- title : Title of the editorial<br>
- article_text : Plain text of the editorial<br>
- summary_{X}_text : Plain text of the corresponding summary<br>
- thesis_{X}_text : Plain text of the thesis from the corresponding summary<br>
- lead : first 15% of the editorial's segments<br>
- body : segments between the lead and conclusion sections<br>
- conclusion : last 15% of the editorial's segments<br>
- article_segments: Collection of paragraphs, each further divided into a collection of segments, where each segment contains:<br>
{ "number": segment order in the editorial,<br>
"text" : segment text,<br>
"label": ADU type<br>
}<br>
- summary_{X}_segments: Collection of summary segments, each containing:<br>
{ "number": segment order in the editorial,<br>
"text" : segment text,<br>
"adu_label": ADU type from the editorial,<br>
"summary_label": can be 'thesis' or 'justification'<br>
}</p>
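<p>As a rough illustration of the column naming above, the following Python sketch walks over the five summary/thesis column pairs of one row. The row is a made-up placeholder written to an in-memory CSV, not corpus data, and the helper name iter_summaries is hypothetical.</p>

```python
import csv
import io

NUM_SUMMARIES = 5  # X ranges from 1 to 5, per the field list above


def iter_summaries(row):
    """Yield (X, summary_text, thesis_text) for one corpus row given as a dict."""
    for x in range(1, NUM_SUMMARIES + 1):
        yield x, row[f"summary_{x}_text"], row[f"thesis_{x}_text"]


# Made-up placeholder row standing in for one line of corpus.csv.
header = ["article_id", "title", "article_text"]
values = ["0", "Example title", "Example editorial text."]
for x in range(1, NUM_SUMMARIES + 1):
    header += [f"summary_{x}_text", f"thesis_{x}_text"]
    values += [f"summary {x}", f"thesis {x}"]

# Write the placeholder to an in-memory CSV and read it back by column name.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerow(values)
buffer.seek(0)

rows = list(csv.DictReader(buffer))
summaries = list(iter_summaries(rows[0]))
```

<p>With the real file, the same loop should apply to each row returned by csv.DictReader over corpus.csv. The file encoding and the serialization of the *_segments columns are not stated here, so inspect a row before parsing them. If a cell exceeds the csv module's default field size limit, csv.field_size_limit can raise it.</p>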
<p><br>
<em>quality-groups.csv</em> - <strong>Contains the IDs of the high-quality and low-quality summaries of each editorial, for each quality dimension</strong><br>
<br>
For example, in terms of overall quality, article_id 2 has four high_quality summaries (summary_1, summary_2, summary_3, summary_4) and one low_quality summary (summary_5).<br>
The corresponding summary texts can be looked up in corpus.csv.</p>
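<p>A minimal sketch of that lookup, assuming a summary ID such as summary_1 maps to the corpus.csv column summary_1_text. The column layout of quality-groups.csv is not described here, so the ID lists are passed in directly. The row contents and the helper name summary_texts are hypothetical.</p>

```python
def summary_texts(corpus_row, summary_ids):
    """Map each summary ID to its text via the summary_{X}_text columns."""
    return {sid: corpus_row[f"{sid}_text"] for sid in summary_ids}


# Placeholder row for article_id 2; the texts are invented, not corpus data.
corpus_row = {"article_id": "2"}
for x in range(1, 6):
    corpus_row[f"summary_{x}_text"] = f"placeholder text {x}"

# Groups taken from the overall-quality example for article_id 2.
high_quality = ["summary_1", "summary_2", "summary_3", "summary_4"]
low_quality = ["summary_5"]

high_texts = summary_texts(corpus_row, high_quality)
low_texts = summary_texts(corpus_row, low_quality)
```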
Zenodo
2020-10-19
info:eu-repo/semantics/other
4105764
user-webis
1603110417.201047
94974
md5:117cb5a5712a3772b0da9ab254a331d7
https://zenodo.org/records/4105765/files/quality-groups.csv
10733231
md5:b3053455c6c58580570c9e30390f7d62
https://zenodo.org/records/4105765/files/corpus.csv
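The MD5 values listed with the two files can be used to check a download. Below is a minimal Python sketch, assuming each checksum belongs to the file URL that follows it in this record; the function names md5_of_file and verify are hypothetical, and the local paths are up to the reader.

```python
import hashlib

# Checksums as listed in this record; pairing with file names is inferred from order.
EXPECTED_MD5 = {
    "quality-groups.csv": "117cb5a5712a3772b0da9ab254a331d7",
    "corpus.csv": "b3053455c6c58580570c9e30390f7d62",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, name):
    """Return True if the file at path matches the listed checksum for name."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

For instance, verify("corpus.csv", "corpus.csv") should return True for an intact download saved under that name in the working directory.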
public
10.5281/zenodo.4105764
isVersionOf
doi