Published November 12, 2023 | Version v1
Conference proceeding Open

FRE @ BC8 SympTEMIST track: Named Entity Recognition

  • 1. AI & Computing Research Group, Fujitsu Research of Europe Ltd., Spain

Description

Abstract

This paper describes our submission to the SympTEMIST Named Entity Recognition (NER) shared subtask at BioCreative 2023. We submitted two systems based on a RoBERTa-architecture LLM trained on Spanish-language clinical data and available from the Hugging Face model repository. Both systems use Conditional Random Fields (CRF) and Byte-Pair Encoding dropout (BPE dropout). The second system also includes sub-subword feature-based embeddings (SSW). Our systems obtained strict F1-scores of 0.727 and 0.728 with and without SSW, respectively.
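The strict F1-scores reported above count a predicted mention as correct only when it matches a gold mention exactly. A minimal sketch of that kind of metric follows, assuming each mention is a (document id, start offset, end offset, label) tuple. The tuple layout, the function name `strict_f1`, the `SYMPTOM` label, and the toy offsets are illustrative assumptions, not the official SympTEMIST evaluation script.

```python
from typing import Iterable, Set, Tuple

# Assumed mention layout: (document id, start offset, end offset, label).
Span = Tuple[str, int, int, str]


def strict_f1(gold: Iterable[Span], pred: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact-match span scoring."""
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(pred)

    # A prediction is a true positive only if document, both offsets,
    # and label are all identical to a gold mention.
    true_positives = len(gold_set & pred_set)

    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Toy data (hypothetical offsets and label, not taken from the corpus).
gold_mentions = [
    ("doc1", 10, 18, "SYMPTOM"),
    ("doc1", 40, 52, "SYMPTOM"),
    ("doc2", 5, 11, "SYMPTOM"),
    ("doc2", 30, 44, "SYMPTOM"),
]
predicted_mentions = [
    ("doc1", 10, 18, "SYMPTOM"),  # exact match
    ("doc1", 40, 50, "SYMPTOM"),  # end offset differs, so no credit
]

precision, recall, f1 = strict_f1(gold_mentions, predicted_mentions)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case the second prediction overlaps a gold mention but receives no credit because its end offset differs, which is what separates strict scoring from overlap-based scoring.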

 

This article is part of the Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the era of Generative Models.

Files

bc8_symptemist_fre.pdf

Files (348.0 kB)

bc8_symptemist_fre.pdf (348.0 kB)
md5:c935a557c6e5c087907e521bbe624487

Additional details

Related works

Is published in
Conference proceeding: 10.5281/zenodo.10103190 (DOI)