BioCreative VIII – Task 3: Genetic Phenotype Normalization from Dysmorphology Physical Examinations
Creators
- 1. Cedars-Sinai Medical Center, West Hollywood, CA, USA
- 2. Children's Hospital of Philadelphia, Philadelphia, PA, USA
- 3. University of Pennsylvania, Philadelphia, PA, USA
- 4. Cedars-Sinai Medical Center, West Hollywood, CA, USA
Description
Abstract
BioCreative VIII Task 3 focuses on normalizing terms mentioned in dysmorphology physical examinations to the Human Phenotype Ontology (HPO). Normalization enables computational analysis aimed at, among other applications, finding correlations between patients with rare genetic diseases, delineating undescribed genetic conditions, and furthering our understanding of existing ones. We made available 3,136 deidentified and manually annotated observations extracted from dysmorphology physical examinations of 1,652 pediatric patients. Task 3 consists of detecting all HPO terms mentioned in an observation and returning the HPO IDs associated with the detected terms. This task is challenging because mentions of HPO terms can be discontinuous, overlapping, and descriptive, which makes strict matching approaches inefficient. The large size and incompleteness of the HPO also prevent the annotation of an exhaustive training set for conventional multi-class classifiers. A total of 20 teams registered, and 5 teams submitted predictions. We summarize the corpus, the competing systems, and their results. Using a pre-trained large language model, the top system achieved an F1 score of 0.82, close to human performance, which confirms the recent advances in natural language processing widely commented on in the media. The post-evaluation period of the challenge is still open for submissions at https://codalab.lisn.upsaclay.fr/competitions/11351.
This article is part of the Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the era of Generative Models.
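To make the task description in the abstract concrete, the following is a minimal sketch of a strict-matching baseline and a set-based F1 computation over HPO IDs. It is an illustration under stated assumptions, not the organizers' evaluation script or any participant's system: the two-entry lexicon, the example sentences, and the function names `strict_match` and `f1_score` are all hypothetical, and the HPO IDs should be verified against an official HPO release. The abstract does not specify how the challenge F1 is computed, so the per-observation set-based F1 here is an assumption.

```python
# Toy lexicon: a hypothetical, hand-picked subset used only for illustration.
# Verify the IDs against an official HPO release before relying on them.
TOY_LEXICON = {
    "hypertelorism": "HP:0000316",
    "microcephaly": "HP:0000252",
}


def strict_match(observation, lexicon):
    """Return the set of HPO IDs whose label occurs verbatim in the text."""
    text = observation.lower()
    return {hpo_id for label, hpo_id in lexicon.items() if label in text}


def f1_score(predicted, gold):
    """Set-based F1 over HPO IDs for a single observation (assumed metric)."""
    if not predicted and not gold:
        return 1.0
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Invented example observations (not taken from the challenge corpus).
gold = {"HP:0000316", "HP:0000252"}
verbatim = strict_match("Microcephaly and hypertelorism noted.", TOY_LEXICON)
descriptive = strict_match("Eyes are widely spaced; head is small.", TOY_LEXICON)

print(f1_score(verbatim, gold))     # both labels appear verbatim
print(f1_score(descriptive, gold))  # descriptive wording defeats strict matching
```

The second observation describes the same two findings without using either ontology label, so the strict matcher returns nothing. This is the kind of descriptive mention the abstract identifies as a difficulty for strict matching.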
Files
Name | Size |
---|---|
bc8_phenotypes_overview.pdf (md5:797be85e2743308a950277d4298e0bf4) | 254.3 kB |
Additional details
Related works
- Is published in: Conference proceeding, 10.5281/zenodo.10103190 (DOI)