10.5281/zenodo.2651010
https://zenodo.org/records/2651010
oai:zenodo.org:2651010
Thanasis Vergoulis
0000-0003-0555-4128
IMSI - "Athena" Research and Innovation Center
Ilias Kanellos
0000-0003-2146-3795
IMSI - "Athena" Research and Innovation Center
Anargiros Tzerefos
University of the Peloponnese
Serafeim Chatzopoulos
IMSI - "Athena" Research and Innovation Center
Theodore Dalamagas
IMSI - "Athena" Research and Innovation Center
Spiros Skiadopoulos
University of the Peloponnese
Domain expert readability dataset
Zenodo
2019
readability
DBLP
2019-04-25
eng
10.5281/zenodo.2651009
1.0
Creative Commons Attribution 4.0 International
Judgments gathered from 10 experts through a web-based survey on the readability of publication abstracts. The abstracts are a subset of AMiner's DBLP citation network v10 dataset (https://aminer.org/citation), drawn from the discipline of data and knowledge management. In particular, abstracts containing the following keywords were used: "database", "machine learning", "information retrieval", "data management", "cloud computing", "data mining", "algorithms", "classification", "query processing", "networks", "indexing", "distributed systems".
After reading each abstract, the expert had to answer the following questions on a 5-point scale:
Q1: Please rate how well-written the abstract is.
Q2: Does the abstract contain linguistic errors?
Q3: Please rate how clear the contribution of the paper is (based on the abstract).
For each question, the interpretation of the extreme scale values (i.e., 1 and 5) was provided. In particular, 1 = “very poorly written” / “so many ling. errors that make abstract incomprehensible” / “not clear at all” (Q1/Q2/Q3) and 5 = “excellently written” / “no errors” / “completely clear” (Q1/Q2/Q3).
The pairwise correlations (Kendall’s τ) of expert judgments on questions Q1-Q3 are presented in this table.
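As a rough illustration of the statistic named above, the following minimal sketch computes Kendall's τ between two lists of ratings. It uses the tau-b variant, which corrects for ties; this is an assumption, since the record does not state which variant was used or whether the correlations were computed between questions or between experts. The function name `kendall_tau_b` and the toy ratings are hypothetical and are not taken from the dataset.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equal-length rating sequences (tie-aware)."""
    if len(x) != len(y):
        raise ValueError("both sequences must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    # Compare every unordered pair of observations.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: contributes to nothing
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denominator = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denominator == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denominator


# Toy 5-point ratings (made up for illustration only).
q1 = [1, 2, 2, 3]
q3 = [1, 2, 3, 3]
tau = kendall_tau_b(q1, q3)
```

On a 5-point scale many pairs of ratings are tied, which is why a tie-aware variant is a natural choice here; with untied data it reduces to the plain concordant-minus-discordant ratio.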
The contained dataset is a tsv file that includes the following fields:
user_id: expert identifier
paper_id: AMiner's paper identifier from the DBLP citation network v10 dataset
rating_1: answer for Q1
rating_2: answer for Q2
rating_3: answer for Q3
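The field list above can be read with a few lines of standard-library Python. This is a minimal sketch under stated assumptions: the file is taken to have a header row, the columns are taken to appear in the listed order, and the ratings are taken to be integers from 1 to 5; none of this is confirmed by the record. The helper names `read_ratings` and `mean_rating_per_paper` and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Column names as listed in the dataset description.
FIELDS = ["user_id", "paper_id", "rating_1", "rating_2", "rating_3"]
RATING_FIELDS = FIELDS[2:]


def read_ratings(handle, has_header=True):
    """Parse tab-separated judgments into a list of dicts.

    Assumes the column order of FIELDS and integer ratings.
    """
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the (assumed) header row
    rows = []
    for record in reader:
        if not record:
            continue  # ignore blank lines
        row = dict(zip(FIELDS, record))
        for field in RATING_FIELDS:
            row[field] = int(row[field])
        rows.append(row)
    return rows


def mean_rating_per_paper(rows, field):
    """Average one rating field over all experts who judged each paper."""
    scores = defaultdict(list)
    for row in rows:
        scores[row["paper_id"]].append(row[field])
    return {paper: sum(vals) / len(vals) for paper, vals in scores.items()}


# Made-up sample rows standing in for the real file contents.
sample = (
    "user_id\tpaper_id\trating_1\trating_2\trating_3\n"
    "u1\tp1\t4\t5\t3\n"
    "u2\tp1\t2\t5\t3\n"
    "u1\tp2\t5\t4\t4\n"
)
rows = read_ratings(io.StringIO(sample))
means_q1 = mean_rating_per_paper(rows, "rating_1")
```

To read the actual file, pass an open file object instead of the `io.StringIO` sample, for example `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded tsv file.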
Please cite:
Thanasis Vergoulis, Ilias Kanellos, Anargiros Tzerefos, Serafeim Chatzopoulos, Theodore Dalamagas, Spiros Skiadopoulos. A study on the readability of scientific publications. 23rd International Conference on Theory and Practice of Digital Libraries, Oslo, Norway, 2019 (to appear).
We acknowledge support of this work by the project "Moving from Big Data Management to Data Science" (MIS 5002437/3) which is implemented under the Action "Reinforcement of the Research and Innovation Infrastructure", funded by the Operational Programme "Competitiveness, Entrepreneurship and Innovation" (NSRF 2014-2020) and co-financed by Greece and the European Union (European Regional Development Fund).