Published November 12, 2023 | Version v1
Conference proceeding Open

An End-to-End Approach for Asserted Named Entity Recognition and Relationship Extraction in Biomedical Text

  • 1. Computational Bioscience Program, University of Colorado, Anschutz Medical Campus, Aurora, CO, 80045, USA

Description

Abstract

In this study, we focus on subtask 2 of the BioRED track, extracting and analyzing biomedical entities and their relationships from biomedical literature. We developed an end-to-end framework that uses a series of Large Language Models (LLMs), such as the Flair model, to identify various biomedical entities and pass them to BioBERT for relation extraction. To improve the system's performance, we incorporated coreference resolution and used resources such as CRAFT and PubTator to enrich our training data for Named Entity Recognition (NER). Moreover, we applied similarity measures to the linguistic contexts of named entities to match their mentions over longer distances. We used positional data to estimate the likelihood that an extracted relation is novel. Finally, we used PheKnowLator, a graph-based knowledge base, to gain insight into the contextual environment of the entities and to weigh the likelihood of their participating in particular relations. Although our work is preliminary, these techniques show promise for finding relations between entities in biomedical papers, even when the relations are subtle, span longer distances, or are implied rather than stated directly.
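
The abstract does not say which similarity measure is applied to the linguistic contexts of entity mentions. As a minimal illustration only, the sketch below assumes a bag-of-words cosine similarity over a small token window around each mention. The function names, the window width, and the 0.5 threshold are all hypothetical choices made for this example and are not taken from the authors' system.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def context_window(tokens, index, width=3):
    """Return up to `width` tokens on each side of position `index`,
    excluding the mention token itself."""
    left = tokens[max(0, index - width):index]
    right = tokens[index + 1:index + 1 + width]
    return left + right


def cosine_similarity(a, b):
    """Cosine similarity between two token lists treated as bags of words."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def contexts_match(tokens_a, index_a, tokens_b, index_b, width=3, threshold=0.5):
    """Hypothetical rule: two mentions are candidates for linking when
    their surrounding contexts are similar enough (threshold is arbitrary)."""
    ctx_a = context_window(tokens_a, index_a, width)
    ctx_b = context_window(tokens_b, index_b, width)
    return cosine_similarity(ctx_a, ctx_b) >= threshold


if __name__ == "__main__":
    first = tokenize("BRCA1 mutations raise breast cancer risk")
    later = tokenize("this gene mutations raise breast cancer risk")
    # Compare the context of "brca1" (position 0) with that of "gene" (position 1).
    print(contexts_match(first, 0, later, 1))
```

A real system would more plausibly compare contextual embeddings than raw token counts; the bag-of-words form is used here only because it is self-contained.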

This article is part of the Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the era of Generative Models.

Additional details

Related works

Is published in
Conference proceeding: 10.5281/zenodo.10103190 (DOI)