Evaluation of Named Entity Recognition Systems to Improve Ontology Concept Annotation for Biomedical Knowledge Graphs
Authors/Creators
- 1. University of Pittsburgh
- 2. Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory
- 3. University of Colorado Anschutz Medical Campus
- 4. Columbia University
Description
Named Entity Recognition (NER) systems are commonly used to construct large biomedical knowledge graphs (KGs) from free text or non-standardized data. Their main role is to map biomedical entities to standardized identifiers in ontologies and databases. Although NER is only one step in KG construction, it can greatly accelerate the process. However, errors introduced by NER systems can systematically affect downstream applications of the KG. In this study, we used two NER systems, BioPortal Annotator and the OntoRunNER OGER++ wrapper, to map biomedical entities in two KGs to 13 biomedical ontologies, and we subsequently evaluated the mappings. Results from both systems contained errors, such that using the mappings in a KG without curation could lead to inaccurate inferences. We are currently evaluating the effects of the NER systems on downstream KG applications using graph analysis, embedding similarity, and data source cross-validation.
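As an illustration of what evaluating such mappings can involve, the sketch below scores one system's output against a manually curated reference. It is a minimal example under stated assumptions, not the procedure used in this study: the function name `evaluate_mappings`, the dictionary layout, the two metrics, and the `EX:` identifiers are all hypothetical placeholders.

```python
from typing import Dict, Optional


def evaluate_mappings(
    predicted: Dict[str, Optional[str]],
    curated: Dict[str, str],
) -> Dict[str, float]:
    """Score predicted entity-to-ontology mappings against a curated reference.

    predicted: entity label -> ontology ID proposed by an NER system
               (None when the system returned no mapping).
    curated:   entity label -> ontology ID accepted by a human curator.
    """
    # Entities for which the system proposed some identifier.
    attempted = {label: cid for label, cid in predicted.items() if cid is not None}

    # Only mappings that a curator has reviewed can be judged right or wrong.
    judged = {label: cid for label, cid in attempted.items() if label in curated}
    correct = sum(1 for label, cid in judged.items() if curated[label] == cid)

    precision = correct / len(judged) if judged else 0.0
    coverage = len(attempted) / len(predicted) if predicted else 0.0
    return {"precision": precision, "coverage": coverage}


# Made-up placeholder data; the "EX:" identifiers are not real ontology terms.
predicted = {
    "entity a": "EX:0001",
    "entity b": "EX:0002",
    "entity c": None,
    "entity d": "EX:0009",
}
curated = {
    "entity a": "EX:0001",
    "entity b": "EX:0003",
    "entity d": "EX:0009",
}

result = evaluate_mappings(predicted, curated)
print(result)
```

Running the same function on the output of each NER system gives comparable scores, and the entities where the prediction and the curated identifier disagree are the ones that would need review before the mappings are used in a KG.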