DUTIR-BioNLP@BC8 Track 3: Genetic Phenotype Extraction and Normalization with Biomedical Pre-trained Language Models

Qi, Jiewei; Luo, Ling; Yang, Zhihao; Lin, Hongfei

doi:10.5281/zenodo.10104756

Published November 12, 2023 | Version v1

Conference proceeding | Open access

1. School of Computer Science and Technology, Dalian University of Technology, Dalian 116024, China

Abstract

Automatically extracting and normalizing key medical findings from the observations recorded during teratology physical examinations is important. BioCreative VIII Track 3 aims to advance and assess systems that automatically extract and normalize phenotype entities from electronic health records (EHRs). This paper describes the method behind our submissions to the track. Our pipeline splits phenotype concept extraction into two subtasks: Named Entity Recognition and Named Entity Normalization. Both subtasks use state-of-the-art biomedical pre-trained language models, and an ensemble method further improves the final performance. The official test-set results show that our best submission achieves F1-scores of 0.7632 on Subtask 3a and 0.7112 on Subtask 3b.
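The two-stage pipeline with an ensemble step can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: the per-model span predictions are hard-coded stand-ins for the outputs of fine-tuned biomedical language models, the ensemble is a simple majority vote (one common strategy, not necessarily the authors'), normalization is a plain dictionary lookup instead of a model, and the concept identifiers are hypothetical placeholders.

```python
from collections import Counter


def ensemble_spans(predictions, min_votes):
    """Keep a (start, end) span if at least `min_votes` models predicted it."""
    votes = Counter()
    for model_spans in predictions:
        # set() ensures each model votes at most once per span
        votes.update(set(model_spans))
    return sorted(span for span, n in votes.items() if n >= min_votes)


def normalize(mention, lexicon):
    """Map a mention to a concept ID via exact lookup (stand-in for a model)."""
    return lexicon.get(mention.lower().strip())


def extract_and_normalize(text, predictions, lexicon, min_votes=2):
    """Stage 1: vote on spans (NER). Stage 2: normalize each kept mention."""
    results = []
    for start, end in ensemble_spans(predictions, min_votes):
        mention = text[start:end]
        results.append((start, end, mention, normalize(mention, lexicon)))
    return results


# Hypothetical inputs: three "models" each return character-offset spans.
text = "low-set ears and a short neck"
predictions = [
    [(0, 12), (19, 29)],
    [(0, 12)],
    [(0, 12), (19, 29), (13, 16)],  # (13, 16) is a spurious span from one model
]
# Placeholder identifiers, not real ontology codes.
lexicon = {"low-set ears": "ID:0001", "short neck": "ID:0002"}

results = extract_and_normalize(text, predictions, lexicon)
for row in results:
    print(row)
```

In this sketch the span predicted by only one model is discarded by the vote, while the two spans with at least two votes are passed on to normalization.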

This article is part of the Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the era of Generative Models.

Files

bc8_phenotypes_dutir.pdf (539.3 kB, md5:7a741901ea78f98d6fef47153adc8482)

Additional details

Is published in: Conference proceeding: 10.5281/zenodo.10103190 (DOI)

Resource type

Conference proceeding

Publisher

Zenodo

Imprint

Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the era of Generative Models. New Orleans, USA.

Conference

AMIA 2023 Annual Symposium, New Orleans, USA, November 2023

Languages

English

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: November 10, 2023
Modified: July 10, 2024
