Published November 12, 2023 | Version v1
Conference proceeding Open

HPI-DHC @ BC8 SympTEMIST Track: Detection and Normalization of Symptom Mentions with SpanMarker and xMEN

  • 1. Hasso Plattner Institute for Digital Engineering, University of Potsdam, Germany

Description

Abstract

Signs and symptoms of patients are frequently reported in clinical text documents. Accurate automated extraction of symptom information is therefore essential for its integration into downstream clinical applications. In this work, we describe our contribution to the BioCreative VIII SympTEMIST shared task, a benchmark for the detection and normalization of symptom mentions in Spanish-language clinical case reports. Our systems for subtasks 1 and 2 are built upon two state-of-the-art, open-source information extraction tools: (1) SpanMarker for named entity recognition with document-level context and (2) xMEN for normalizing symptom mentions to their corresponding SNOMED CT codes. For subtask 1, our best submitted run achieves an F1 score of 0.7363, which exceeds the median across all submissions by more than 3 percentage points. Our experiments underline the positive impact of including document-level context for named entity taggers. For subtask 2, our best system for entity normalization obtains an accuracy of 0.6070, an improvement of more than 8 percentage points over the median.

 

This article is part of the Proceedings of the BioCreative VIII Challenge and Workshop: Curation and Evaluation in the era of Generative Models.

Files

bc8_symptemist_hpi.pdf

md5:a1e9a796adc8594e5a260cda3777072f
240.9 kB

Additional details

Related works

Is published in
Conference proceeding: 10.5281/zenodo.10103190 (DOI)