Spa-neg: an approach for negation detection in clinical text written in Spanish
Description
Electronic health records contain valuable information written in narrative form. A relevant challenge in clinical narrative text is that concepts commonly appear negated. Several proposals have been developed to detect negation in clinical text written in Spanish. Many of these proposals adapt the NegEx algorithm to Spanish, but the results obtained indicate lower performance than NegEx implementations in other languages. Moreover, in most of these proposals, the validation process could be improved by using a shared test corpus focused on negation in clinical text. This paper proposes Spa-neg, an approach to improve negation detection in clinical text written in Spanish. Spa-neg combines three elements: i) an exploratory data analysis of how negation is written in clinical text, ii) the use of regular expressions better adapted to the way in which negation is expressed in Spanish, and iii) testing and validation using a shared annotated corpus focused on negation. The results obtained suggest that combining these elements improves negation detection. The tests performed showed an F-score of 92% on IULA Spanish, an annotated corpus for negation.
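To illustrate the general idea of regular-expression-based negation detection described above, the following is a minimal sketch in Python. It is not the Spa-neg implementation: the trigger list, the scope terminators, the five-token window, and the function name `is_negated` are all assumptions chosen for illustration, loosely following the NegEx style of "trigger, then bounded scope".

```python
import re

# Hypothetical trigger phrases for illustration only; NOT the Spa-neg lexicon.
PRE_TRIGGERS = ["no", "sin", "niega", "ausencia de", "descarta"]

# Assumed scope terminators: adversative conjunctions and strong punctuation.
TERMINATORS = ["pero", "aunque", "excepto"]

_trigger_re = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in PRE_TRIGGERS) + r")\b",
    re.IGNORECASE,
)
_terminator_re = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in TERMINATORS) + r")\b|[.;:]",
    re.IGNORECASE,
)


def is_negated(sentence: str, concept: str, window: int = 5) -> bool:
    """Return True if `concept` appears within `window` tokens after a
    negation trigger and before any scope terminator."""
    concept_re = re.compile(r"\b" + re.escape(concept) + r"\b", re.IGNORECASE)
    for trigger in _trigger_re.finditer(sentence):
        # Text following the trigger, cut at the first terminator if any.
        rest = sentence[trigger.end():]
        stop = _terminator_re.search(rest)
        scope = rest[:stop.start()] if stop else rest
        # Keep only the first `window` tokens of the scope.
        scope_text = " ".join(scope.split()[:window])
        if concept_re.search(scope_text):
            return True
    return False


if __name__ == "__main__":
    text = "No presenta dolor torácico, pero refiere disnea."
    for term in ("dolor torácico", "disnea"):
        print(term, "->", is_negated(text, term))
```

In this sketch, "dolor torácico" falls inside the scope opened by "No", while "disnea" comes after the terminator "pero" and is therefore treated as affirmed. A real system would need a lexicon and scope rules derived from corpus analysis, as the description indicates.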
Files
IWBBIO.pdf (447.1 kB)
md5:d229170d0df0ebfd76269f47cc13ee95