ICB-UMA at BioCreative VIII @ AMIA 2023 Task 2 SYMPTEMIST (Symptom TExt Mining Shared Task)

Gallego, Fernando; Veredas, Francisco J.

doi:10.5281/zenodo.10104058

Published November 12, 2023 | Version v1

Conference proceeding Open

1. Dept. of Computer Languages and Sciences & Research Institute of Multilingual Language Technologies, Universidad de Málaga, Málaga, Spain

Abstract

These working notes summarize the contribution of the ICB research group from the University of Málaga to the BioCreative VIII Workshop @ AMIA 2023, where we participated in Task 2 - SympTEMIST. We took part in both subtasks: recognition of symptom, sign, and clinical finding entities (subtask 1 - SymptomNER) and normalization of those entities to the corresponding SNOMED CT concepts (subtask 2 - SymptomNorm). For subtask 1, we analyzed the performance of several BERT-based models tailored to Spanish clinical data. These models, fine-tuned on the SymptomNER corpus, achieved a precision of 0.804, a recall of 0.699, and an F1-score of 0.748 on the test set. For the SymptomNorm subtask, we incorporated recent strategies based on bi-encoder and cross-encoder models, in particular SapBERT models combined with FAISS for similarity search. Finally, the model's predictions were further refined by leveraging a gazetteer with more than 150,000 concepts. Our strategy achieved an accuracy of 0.58 on the test set.
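The normalization step described in the abstract can be illustrated with a minimal sketch, under several assumptions that are not taken from the paper: the three-dimensional vectors below are made-up stand-ins for SapBERT embeddings, the brute-force cosine search stands in for a FAISS index, the concept codes are placeholders rather than real SNOMED CT identifiers, and the gazetteer is assumed to act as an exact-match lookup that takes priority over the embedding search. The cross-encoder re-ranking stage is omitted. All function and variable names are hypothetical.

```python
import math

# Hypothetical toy concept inventory: placeholder code -> (term, made-up vector).
# In a real system the vectors would come from a bi-encoder such as SapBERT.
CONCEPT_VECTORS = {
    "C001": ("fiebre", [0.9, 0.1, 0.0]),
    "C002": ("dolor abdominal", [0.1, 0.9, 0.2]),
    "C003": ("tos", [0.0, 0.2, 0.9]),
}

# Hypothetical gazetteer: lowercased surface form -> placeholder concept code.
GAZETTEER = {
    "fiebre": "C001",
    "calentura": "C001",
    "dolor abdominal": "C002",
    "tos": "C003",
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, k=2):
    """Brute-force nearest-neighbour search (stand-in for a FAISS index)."""
    scored = [(cosine(query_vec, vec), code)
              for code, (_term, vec) in CONCEPT_VECTORS.items()]
    scored.sort(reverse=True)
    return [code for _score, code in scored[:k]]


def normalize(mention, mention_vec, k=2):
    """Return a concept code: gazetteer exact match first, else best neighbour."""
    key = mention.strip().lower()
    if key in GAZETTEER:
        return GAZETTEER[key]
    return top_k(mention_vec, k)[0]


if __name__ == "__main__":
    # The gazetteer hit wins even though the vector points elsewhere.
    print(normalize("Calentura", [0.0, 0.2, 0.9]))
    # No gazetteer entry, so the nearest concept vector decides.
    print(normalize("tos seca", [0.1, 0.3, 0.8]))
```

In this sketch the gazetteer short-circuits the search, which is only one plausible reading of "refined by leveraging a gazetteer"; the authors' actual refinement rule may differ.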

This article is part of the Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the era of Generative Models.

Files

Files (114.1 kB)

Name	Size
bc8_symptemist_icbuma.pdf (md5:c5914b88675088bad2a67caac7ad8ed0)	114.1 kB

Additional details

Is published in: Conference proceeding: 10.5281/zenodo.10103190 (DOI)

	All versions	This version
Views	174	174
Downloads	139	139
Data volume	18.1 MB	18.1 MB

DOI

10.5281/zenodo.10104058

Resource type

Conference proceeding

Publisher

Zenodo

Imprint

Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the era of Generative Models. New Orleans, USA.

Conference

AMIA 2023 Annual Symposium, New Orleans, USA, November 2023

Languages

English

Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: November 10, 2023
Modified: July 10, 2024
