Dataset Open Access

SemEval-2021 Task 11: NLPContributionGraph - Structuring Scholarly NLP Contributions for a Research Knowledge Graph

D'Souza, Jennifer; Auer, Soeren; Pedersen, Ted

NLPContributionGraph was introduced for the first time as Task 11 at SemEval 2021. The task is defined on a dataset of Natural Language Processing (NLP) scholarly articles whose contributions are structured so that they can be integrated into Knowledge Graph (KG) infrastructures such as the Open Research Knowledge Graph. The structured contribution annotations are provided as (1) Contribution sentences: a set of sentences about the contribution in the article; (2) Scientific terms and relations: a set of scientific terms and relational cue phrases extracted from the contribution sentences; and (3) Triples: semantic statements that pair scientific terms with a relation, modeled toward subject-predicate-object RDF statements for KG building. The Triples are organized under three (mandatory) or more of twelve total information units (viz., ResearchProblem, Approach, Model, Code, Dataset, ExperimentalSetup, Hyperparameters, Baselines, Results, Tasks, Experiments, and AblationAnalysis).
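The annotation layers described above can be pictured as a small data structure. The following is a minimal sketch only, not the dataset's actual file format: the class names `Triple` and `ContributionGraph`, the method names, and the example values are all hypothetical and chosen for illustration. Only the twelve information unit names come from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# The twelve information units named in the dataset description.
INFORMATION_UNITS = frozenset({
    "ResearchProblem", "Approach", "Model", "Code", "Dataset",
    "ExperimentalSetup", "Hyperparameters", "Baselines", "Results",
    "Tasks", "Experiments", "AblationAnalysis",
})


@dataclass(frozen=True)
class Triple:
    """A subject-predicate-object statement (hypothetical container)."""
    subject: str
    predicate: str
    obj: str


@dataclass
class ContributionGraph:
    """Triples of one article, grouped by information unit (hypothetical)."""
    triples_by_unit: Dict[str, List[Triple]] = field(default_factory=dict)

    def add(self, unit: str, triple: Triple) -> None:
        # Reject any unit name that is not one of the twelve listed above.
        if unit not in INFORMATION_UNITS:
            raise ValueError(f"unknown information unit: {unit!r}")
        self.triples_by_unit.setdefault(unit, []).append(triple)

    def units_used(self) -> Set[str]:
        # Names of the information units that hold at least one triple.
        return set(self.triples_by_unit)


# Illustrative values only; they are not taken from the dataset.
graph = ContributionGraph()
graph.add("ResearchProblem",
          Triple("Contribution", "has research problem", "example problem"))
graph.add("Results",
          Triple("example model", "achieves", "example score"))
```

Grouping triples under a unit name keeps the organization by information unit explicit, and `units_used()` makes it easy to count how many units an article's annotation covers.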

The Shared Task

As a complete submission for the Shared Task, given NLP scholarly articles in plaintext format, systems had to automatically extract the following information: contribution sentences; scientific term and predicate phrases from the sentences; and (subject, predicate, object) triple statements toward KG building, organized under three or more of twelve total information units. The shared task remains open through an ongoing official online evaluation at Codalab.

Files (415.0 MB)

Name              Size      MD5
test-set.zip      225.8 MB  570925498db9f0ce8e5b0a12cd8b8aae
training-set.zip  159.1 MB  60956613134ad8a006b3ebf76ffebdbf
trial-set.zip     30.0 MB   96b9d13dedf32a188fed274f1b2ee14f
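After downloading, the archives can be checked against the MD5 checksums listed above. The following is a minimal sketch using only the Python standard library. It assumes the three zip files were saved under their original names in a single directory. The helper names `md5_of` and `verify` are hypothetical.

```python
import hashlib
from pathlib import Path

# Checksums as published in the file listing.
EXPECTED_MD5 = {
    "test-set.zip": "570925498db9f0ce8e5b0a12cd8b8aae",
    "training-set.zip": "60956613134ad8a006b3ebf76ffebdbf",
    "trial-set.zip": "96b9d13dedf32a188fed274f1b2ee14f",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each archive name to True if it exists and its MD5 matches."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify().items():
        print(f"{name}: {'OK' if ok else 'MISSING OR MISMATCH'}")
```

A mismatch most likely indicates an incomplete or corrupted download, in which case the file should be fetched again.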
Views 82
Downloads 3
Data volume 415.0 MB
Unique views 67
Unique downloads 1
