Medical Concept Normalization in Social Media Posts with Recurrent Neural Networks
Authors/Creators
- 1. Kazan Federal University
- 2. St. Petersburg Department of the Steklov Mathematical Institute
- 3. Moscow Institute of Physics and Technology
Description
Text mining of the scientific literature and social media has already proven itself as a reliable tool for drug repurposing and hypothesis generation. The task of mapping a disease mention to a concept in a controlled vocabulary, typically to the standard thesaurus in the Unified Medical Language System (UMLS), is known as medical concept normalization. This task is challenging because the medical terminology used by health care professionals differs substantially from the informal language of social media texts written by the lay public. To bridge this gap, we use sequence learning with recurrent neural networks and semantic representations of one- or multi-word expressions: we develop end-to-end architectures directly tailored to the task, including bidirectional Long Short-Term Memory and Gated Recurrent Units with an attention mechanism and additional semantic similarity features based on UMLS. Our evaluation on a standard benchmark shows that recurrent neural networks improve results over a strong classification baseline based on convolutional neural networks. Both a qualitative examination of mentions discovered in a dataset of user reviews collected from popular online health information platforms and a quantitative evaluation show improvements in the semantic representation of health-related expressions in social media.
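To illustrate the kind of architecture the description refers to, the sketch below encodes a mention with a bidirectional GRU, pools token states with an attention mechanism into a phrase vector, optionally concatenates UMLS-based similarity features, and classifies the result over concept identifiers (CUIs). This is a minimal sketch assuming PyTorch; the class name, dimensions, the optional `sim_features` vector, and the number of target concepts are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch: bidirectional GRU encoder + attention pooling + softmax over CUIs.
# All names and sizes are illustrative, not the paper's exact configuration.
import torch
import torch.nn as nn

class MentionNormalizer(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim, num_concepts, sim_dim=0):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)           # scores each token state
        self.classifier = nn.Linear(2 * hidden_dim + sim_dim, num_concepts)

    def forward(self, token_ids, sim_features=None):
        # token_ids: (batch, seq_len); sim_features: optional (batch, sim_dim)
        states, _ = self.encoder(self.embed(token_ids))     # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(states), dim=1)   # attention over tokens
        phrase = (weights * states).sum(dim=1)              # weighted phrase vector
        if sim_features is not None:                        # optional UMLS-based features
            phrase = torch.cat([phrase, sim_features], dim=-1)
        return self.classifier(phrase)                      # logits over concept IDs

# Usage: score a batch of mention token ids against an illustrative set of concepts.
model = MentionNormalizer(vocab_size=30000, emb_dim=200, hidden_dim=128, num_concepts=2200)
logits = model(torch.randint(0, 30000, (4, 12)))            # shape: (4, 2200)
```

A bidirectional LSTM variant follows the same pattern with `nn.LSTM` in place of `nn.GRU`; in either case the attention weights indicate which tokens of the lay-language mention drive the predicted concept.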
Files
| Name | Size |
|---|---|
| medical concept normalization.zip (md5:0d12cc18e5138df0fa100bfeb0070d3c) | 6.3 MB |