FRE @ BC8 SympTEMIST track: Named Entity Recognition
Creators
- 1. AI & Computing Research Group, Fujitsu Research of Europe Ltd., Spain
Description
Abstract
This paper describes our submission to the SympTEMIST Named Entity Recognition (NER) shared subtask at BioCreative 2023. We submitted two systems based on a RoBERTa-architecture LLM trained on Spanish-language clinical data and available from the Hugging Face model repository. Both systems use Conditional Random Fields (CRF) and Byte-Pair Encoding dropout (BPE dropout); the second system additionally includes sub-subword (SSW) feature-based embeddings. Our systems obtained strict F1-scores of 0.727 with SSW and 0.728 without SSW.
This article is part of the Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the era of Generative Models.
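For readers unfamiliar with the strict F1-score quoted in the abstract, the following is a minimal sketch of how such a score is commonly computed for NER: a predicted entity counts as correct only if its document, character offsets, and label all match a gold entity exactly. This is not the official SympTEMIST evaluation script; the span representation, the `strict_prf` function name, the `SYMPTOM` label, and the micro-averaging over all documents are assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple

# Hypothetical span representation: (document id, start offset, end offset, label).
Span = Tuple[str, int, int, str]


def strict_prf(gold: Iterable[Span], pred: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under strict (exact-match) span matching."""
    gold_set: Set[Span] = set(gold)
    pred_set: Set[Span] = set(pred)

    # A true positive requires an exact match on document, offsets, and label.
    tp = len(gold_set & pred_set)

    precision = tp / len(pred_set) if pred_set else 0.0
    recall = tp / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Illustrative data only: the second prediction is off by one character,
# so under strict matching it is both a false positive and a missed gold span.
gold = [("doc1", 0, 5, "SYMPTOM"), ("doc1", 10, 18, "SYMPTOM"), ("doc2", 3, 9, "SYMPTOM")]
pred = [("doc1", 0, 5, "SYMPTOM"), ("doc1", 10, 17, "SYMPTOM"), ("doc2", 3, 9, "SYMPTOM")]

precision, recall, f1 = strict_prf(gold, pred)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

Under this definition a boundary error of a single character is penalised twice, once in precision and once in recall, which is why strict scores are typically lower than overlap-based ones.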
Files
bc8_symptemist_fre.pdf (348.0 kB)
md5:c935a557c6e5c087907e521bbe624487
Additional details
Related works
- Is published in conference proceeding: 10.5281/zenodo.10103190 (DOI)