There is a newer version of the record available.

Published April 17, 2022 | Version 2
Dataset Restricted

Comparison of biomedical relationship extraction methods and models for knowledge graph creation (Gene-Disease relationships)

  • 1. Bayer A.G.

Description

This is the dataset used for classifying Gene-Disease relationship types from sentences. The dataset consists of 3 files:

  • manually_annotated_set.xlsx - set of 2000 manually annotated sentences with entities
  • Unbalanced_dataset.xlsx - set of 12000 sentences: the 2000 manually annotated sentences from the first set, plus sentences added by a rule-based method, keeping only those where the extraction had confidence 1.
  • Balanced_dataset_SUB_PRED.xlsx - balanced dataset built from the 2000 manually annotated sentences plus sentences from the rule-based method with confidence 1, added so that each relationship class has at least 1400 sentences. The exception is biomarkers, for which only 1243 sentences with confidence 1 could be obtained from the portion of the data that had been processed when the dataset was built.
  • Please cite: Milosevic, Nikola, and Wolfgang Thielemann. "Comparison of biomedical relationship extraction methods and models for knowledge graph creation." Journal of Web Semantics (2022): 100756. https://doi.org/10.1016/j.websem.2022.100756
  • Article preprint available at: https://arxiv.org/abs/2201.01647 
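
The class-size target described for Balanced_dataset_SUB_PRED.xlsx can be checked with a short script once access to the files has been granted. The following is only a minimal sketch: the column layout of the spreadsheets is not documented on this page, so the label_column argument and the toy class names are hypothetical placeholders, and load_labels assumes pandas and openpyxl are installed.

```python
from collections import Counter


def load_labels(path, label_column):
    """Read one label column from an .xlsx file (needs pandas and openpyxl).

    label_column is a hypothetical name; check the real header in the file.
    """
    import pandas as pd  # imported lazily so the check below runs without it

    frame = pd.read_excel(path)
    return frame[label_column].tolist()


def under_threshold(labels, min_per_class=1400):
    """Return {class: count} for classes with fewer than min_per_class rows."""
    counts = Counter(labels)
    return {name: n for name, n in counts.items() if n < min_per_class}


# Toy labels with made-up class names, only to illustrate the check.
toy_labels = ["class_a"] * 1400 + ["class_b"] * 1243
print(under_threshold(toy_labels))
```

With the real file, the labels would come from load_labels("Balanced_dataset_SUB_PRED.xlsx", label_column) with the actual column name. Any class that under_threshold reports is one that falls short of the 1400-sentence target.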

 

Files

Restricted

The record is publicly accessible, but files are restricted to users with access.

Request access

If you would like to request access to these files, please fill out the form below.

You need to satisfy these conditions in order for this request to be accepted:

  • Users need to specify their use case and contact the publisher of the data

Additional details

Related works

Is supplement to
Journal article: 10.1016/j.websem.2022.100756 (DOI)

References

  • Milosevic, Nikola, and Wolfgang Thielemann. "Comparison of biomedical relationship extraction methods and models for knowledge graph creation." arXiv preprint arXiv:2201.01647 (2022).
  • Milosevic, Nikola, and Wolfgang Thielemann. "Comparison of biomedical relationship extraction methods and models for knowledge graph creation." Journal of Web Semantics (2022): 100756.