HPI-DHC @ BC8 SympTEMIST Track: Detection and Normalization of Symptom Mentions with SpanMarker and xMEN
- 1. Hasso Plattner Institute for Digital Engineering, University of Potsdam, Germany
Description
Abstract
Signs and symptoms of patients are frequently reported in clinical text documents. Accurate automated extraction of symptom information is therefore essential for its integration into downstream clinical applications. In this work, we describe our contribution to the BioCreative VIII SympTEMIST shared task, a benchmark for the detection and normalization of symptom mentions in Spanish-language clinical case reports. Our systems for subtasks 1 and 2 are built upon two state-of-the-art, open-source information extraction tools: (1) SpanMarker for named entity recognition with document-level context and (2) xMEN for normalizing symptom mentions to their corresponding SNOMED CT codes. For subtask 1, our best submitted run achieves an F1 score of 0.7363, which exceeds the median across all submissions by more than 3 percentage points. Our experiments underline the positive impact of including document-level context for named entity taggers. For subtask 2, our best system for entity normalization obtains an accuracy of 0.6070, an improvement of more than 8 percentage points over the median.
This article is part of the Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the era of Generative Models.
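As a rough illustration of the two metrics quoted in the abstract, the sketch below computes a strict-match F1 score over predicted mention spans and a plain accuracy over predicted codes. It is not the official SympTEMIST evaluation script: the function names, the `Span` tuple layout, the `SYMPTOM` label, and the toy codes are all assumptions made for this example.

```python
from typing import Iterable, Sequence, Tuple

# Hypothetical span layout: (document id, start offset, end offset, label).
Span = Tuple[str, int, int, str]


def span_prf(gold: Iterable[Span], pred: Iterable[Span]) -> Tuple[float, float, float]:
    """Precision, recall and F1 under strict matching (exact offsets and label)."""
    gold_set, pred_set = set(gold), set(pred)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def normalization_accuracy(gold_codes: Sequence[str], pred_codes: Sequence[str]) -> float:
    """Fraction of mentions whose predicted code equals the gold code."""
    if len(gold_codes) != len(pred_codes):
        raise ValueError("gold and predicted code lists must be aligned")
    if not gold_codes:
        return 0.0
    correct = sum(g == p for g, p in zip(gold_codes, pred_codes))
    return correct / len(gold_codes)


# Toy data only; the offsets and codes are placeholders, not real annotations.
gold_spans = [("doc1", 0, 5, "SYMPTOM"), ("doc1", 10, 18, "SYMPTOM"), ("doc2", 3, 9, "SYMPTOM")]
pred_spans = [("doc1", 0, 5, "SYMPTOM"), ("doc1", 10, 18, "SYMPTOM"), ("doc2", 4, 9, "SYMPTOM")]
precision, recall, f1 = span_prf(gold_spans, pred_spans)

accuracy = normalization_accuracy(["C1", "C2", "C3", "C4"], ["C1", "C2", "C9", "C8"])
```

In this toy case two of three predicted spans match exactly, so precision, recall, and F1 all equal 2/3, and two of four codes match, giving an accuracy of 0.5. A partially overlapping span (such as the `doc2` prediction) counts as an error under strict matching.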
Files
Name | Size |
---|---|
bc8_symptemist_hpi.pdf (md5:a1e9a796adc8594e5a260cda3777072f) | 240.9 kB |
Additional details
Related works
- Is published in
- Conference proceeding: 10.5281/zenodo.10103190 (DOI)