Published October 9, 2023 | Version 1
Dataset
Open
Evaluation dataset for Relation Extraction of relationships between Organisms and Natural-Products
- 1. dECMT, Cancer Research UK Manchester Institute; The University of Manchester
- 2. Idiap Research Institute
- 3. The University of Manchester; Idiap Research Institute
Description
A curated evaluation dataset for end-to-end Relation Extraction of relationships between organisms and natural-products.
Details about the manual annotation:
- For Chemicals:
- The chemical labels are annotated as they appear in the abstract.
- Individual chemicals and classes of chemicals produced by a specific organism are distinguished.
- The "type" attribute ({"chemical", "class"}) indicates whether the mentioned name refers to an individual chemical or to a class of chemicals.
- A "class" attribute for chemical entities has also been included if class information is present in the abstract.
- Wikidata and PubChem identifiers were assigned to chemicals and classes when available.
- For Organisms:
- The organism labels are annotated as they appear in the abstract.
- If an abstract first mentions only the genus name, e.g. "Plakinastrella sp.", and later specifies the species name, e.g. "Plakinastrella clathrata", then only the species name is used.
- A Wikidata identifier was assigned to all organisms.
- In some abstracts, only the genus name is mentioned.
- For Relations:
- Only the relations explicitly mentioned in the abstract are reported in the output labels.
- Relations are reported in their order of appearance in the abstract.
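The annotation conventions above can be sketched in code. This is only an illustration of the rules as stated in prose: the class `ChemicalMention`, its field names, and the helper `drop_superseded_genus_mentions` are hypothetical and do not reflect the actual keys or structure of curated_test_set.json. The helper also simplifies the genus rule by ignoring the order in which the mentions occur.

```python
from dataclasses import dataclass
from typing import List, Optional

# Allowed values of the "type" attribute, as listed in the description.
CHEMICAL_TYPES = {"chemical", "class"}
GENUS_ONLY_SUFFIXES = {"sp.", "spp."}


@dataclass
class ChemicalMention:
    """Hypothetical container; field names are assumptions, not the file schema."""
    label: str                            # name as it appears in the abstract
    type: str                             # "chemical" or "class"
    chemical_class: Optional[str] = None  # only if the abstract states a class

    def __post_init__(self) -> None:
        if self.type not in CHEMICAL_TYPES:
            raise ValueError(f"unknown type: {self.type!r}")


def is_genus_only(label: str) -> bool:
    """True for labels such as 'Plakinastrella' or 'Plakinastrella sp.'."""
    parts = label.split()
    return len(parts) == 1 or parts[1] in GENUS_ONLY_SUFFIXES


def drop_superseded_genus_mentions(labels: List[str]) -> List[str]:
    """Drop genus-only labels when a species of the same genus is also present."""
    genera_with_species = {
        label.split()[0] for label in labels if not is_genus_only(label)
    }
    return [
        label
        for label in labels
        if not (is_genus_only(label) and label.split()[0] in genera_with_species)
    ]


if __name__ == "__main__":
    mentions = ["Plakinastrella sp.", "Plakinastrella clathrata"]
    print(drop_superseded_genus_mentions(mentions))
```

When only the genus is mentioned in an abstract, the genus-only label is kept, which matches the note that some abstracts mention only the genus name.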
Files
Name | Size |
---|---|
curated_test_set.json (md5:04469028908079cbabc4dcc68456511f) | 671.0 kB |
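The MD5 checksum listed for curated_test_set.json can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library. The local path is an assumption about where the file was saved, and the function names `md5_of_file` and `verify_download` are illustrative.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record for curated_test_set.json.
EXPECTED_MD5 = "04469028908079cbabc4dcc68456511f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest matches the expected checksum."""
    return md5_of_file(path) == expected


if __name__ == "__main__":
    local = Path("curated_test_set.json")  # assumed download location
    if local.exists():
        print("checksum OK" if verify_download(local) else "checksum mismatch")
    else:
        print(f"{local} not found")
```

A mismatch usually points to an incomplete or corrupted download rather than a different dataset version, since this record lists a single version.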