Published April 13, 2021 | Version 1
Dataset Open

UIMA ConceptMapper Dictionaries for the Annotation of the 2021 BioASQ Corpus with Drug Names and Terms from Epilepsy Ontologies

  • ZB MED - Information Centre for Life Sciences

Description

The dictionary files for annotating free text with terms from epilepsy ontologies using the UIMA ConceptMapper are derived from five ontologies available on NCBO BioPortal: EpSO, ESSO, EPILONT, EPISEM and FENICS.

The dictionary for the identification of drug names is derived from the DrugBank vocabulary available online at https://go.drugbank.com/releases/latest#open-data.
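As a rough illustration of what deriving a dictionary from a vocabulary table can look like, the following sketch converts a small CSV excerpt into XML in the general ConceptMapper dictionary layout (a `synonym` root with `token` and `variant` elements). It is not the conversion used for this dataset: the column names, the `id` attribute, the ` | ` synonym separator, and the sample rows are all assumptions made for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Illustrative rows only; identifiers and column names are hypothetical
# and not copied from the DrugBank vocabulary.
SAMPLE_CSV = (
    "DrugBank ID,Common name,Synonyms\n"
    "EX0001,Valproic acid,Valproate | 2-propylpentanoic acid\n"
    "EX0002,Carbamazepine,\n"
)


def vocabulary_to_dictionary(csv_text):
    """Build a <synonym> tree with one <token> per vocabulary row."""
    root = ET.Element("synonym")
    for row in csv.DictReader(io.StringIO(csv_text)):
        token = ET.SubElement(
            root, "token", canonical=row["Common name"], id=row["DrugBank ID"]
        )
        # The common name itself is kept as a variant, followed by any synonyms.
        synonyms = [s.strip() for s in row["Synonyms"].split("|") if s.strip()]
        for name in [row["Common name"]] + synonyms:
            ET.SubElement(token, "variant", base=name)
    return root


dictionary = vocabulary_to_dictionary(SAMPLE_CSV)
print(ET.tostring(dictionary, encoding="unicode"))
```

Writing the result with `ET.ElementTree(dictionary).write(path, encoding="utf-8", xml_declaration=True)` would produce a file on disk; whether its attributes match those in Dict_DrugNames.xml would need to be checked against the published file.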

Further descriptions of the UIMA-based text mining workflow and its use can be found in the following publications:

  1. Bernd Müller, Alexandra Hagelstein: Beyond Metadata: Enriching life science publications in Livivo with semantic entities from the linked data cloud. SEMANTiCS (Posters, Demos, SuCCESS) 2016
  2. Bernd Müller, Alexandra Hagelstein, Thomas Gübitz: Life Science Ontologies in Literature Retrieval: A Comparison of Linked Data Sets for Use in Semantic Search on a Heterogeneous Corpus. EKAW (Satellite Events) 2016: 158-161
  3. Bernd Müller, Christoph Poley, Jana Pössel, Alexandra Hagelstein, Thomas Gübitz: LIVIVO - the Vertical Search Engine for Life Sciences. Datenbank-Spektrum 17(1): 29-34 (2017)
  4. Bernd Müller, Dietrich Rebholz-Schuhmann: Selected Approaches Ranking Contextual Term for the BioASQ Multi-label Classification (Task6a and 7a). PKDD/ECML Workshops (2) 2019: 569-580

The dataset contains the following dictionary files:

  • Dict_DrugNames.xml - constructed from the DrugBank vocabulary
  • Dict_EpSO.xml - constructed from the EpSO ontology
  • Dict_ESSO.xml - constructed from the ESSO ontology
  • Dict_EPILONT.xml - constructed from the EPILONT ontology
  • Dict_EPISEM.xml - constructed from the EPISEM ontology
  • Dict_FENICS.xml - constructed from the FENICS ontology

The dictionaries were used with the UIMA ConceptMapper to annotate the 2021 BioASQ corpus, resulting in the BioASQ Sub-Corpus for the Pharmacology of Epilepsy (BioPepsy).
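For readers who want to inspect a dictionary without setting up a UIMA pipeline, the sketch below loads a dictionary into a lookup table and finds mentions in a sentence. It assumes the files follow the general ConceptMapper layout (`token` elements with a `canonical` attribute and nested `variant` elements with a `base` attribute); the inline sample is made up for the example. The matching is naive case-insensitive substring search, which is much simpler than ConceptMapper's token-based lookup and is only meant to show the structure of the data.

```python
import xml.etree.ElementTree as ET

# Made-up excerpt in the assumed dictionary layout.
SAMPLE_DICT = """<synonym>
  <token canonical="Valproic acid">
    <variant base="Valproic acid"/>
    <variant base="Valproate"/>
  </token>
  <token canonical="Carbamazepine">
    <variant base="Carbamazepine"/>
  </token>
</synonym>"""


def load_variants(xml_text):
    """Map each lower-cased variant string to its canonical form."""
    lookup = {}
    for token in ET.fromstring(xml_text).iter("token"):
        canonical = token.get("canonical")
        for variant in token.iter("variant"):
            lookup[variant.get("base").lower()] = canonical
    return lookup


def find_mentions(text, lookup):
    """Return the sorted canonical forms whose variants occur in the text."""
    lowered = text.lower()
    return sorted({canon for variant, canon in lookup.items() if variant in lowered})


lookup = load_variants(SAMPLE_DICT)
mentions = find_mentions("Patients received valproate or carbamazepine.", lookup)
print(mentions)
```

To try this on a downloaded file, the XML text can be read from disk first, for example with `open("Dict_DrugNames.xml", encoding="utf-8").read()`, assuming the file is in the working directory.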

Please cite this data as:

Müller, Bernd. UIMA ConceptMapper Dictionaries for the Annotation of the 2021 BioASQ Corpus with Drug Names and Terms from Epilepsy Ontologies. ZENODO, 10.5281/zenodo.4683353

Files

Files (8.7 MB)

Checksum                              Size
md5:972e9d4c4473f88b598a611095fce449  5.7 MB
md5:b8f814fd14ac79da5fd0015022523238  61.1 kB
md5:b3bc788eb8e7d36830aa81ae161cf44b  675.0 kB
md5:e0d0e70ab6c755fa88b2623f69afcc92  302.5 kB
md5:d4d88c3ecb5a880f9cdbea76eaf20ff3  1.8 MB
md5:ce911212516dc45beeab3d190078bcfd  67.1 kB
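The MD5 checksums listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the demonstration runs on a temporary stand-in file rather than on one of the actual dictionaries.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:972e...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Demonstration on a stand-in file; replace the path and the listed value
# with a downloaded dictionary and its checksum from the file listing.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "stand_in.xml")
    with open(path, "wb") as handle:
        handle.write(b"<synonym/>")
    listed = "md5:" + hashlib.md5(b"<synonym/>").hexdigest()
    print(matches_listing(path, listed))
```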

Additional details

Related works

Compiles
Dataset: 10.5281/zenodo.4680826 (DOI)
Is compiled by
Software: 10.5281/zenodo.4680086 (DOI)
Is part of
Software: 10.5281/zenodo.4682869 (DOI)
Software: https://cran.r-project.org/package=epos (URL)