Dataset Open Access

MeSDiCon subset for CodiEsp: MESH terms in MeSDiCon mapped to ICD10 CM and ICD10 PCS

Miranda-Escalada, Antonio; Krallinger, Martin

MeSDiCon is a list, or gazetteer, of candidate names of diseases and symptoms mentioned in Spanish clinical texts. It therefore serves as a lexical resource or dictionary for the automatic detection of disease and symptom mentions, and for indexing or classifying medical texts by these concept types. Terms in MeSDiCon were mapped to the MESH terminology.

In this subset, we have mapped MESH codes to ICD10-CM and ICD10-PCS through the UMLS Metathesaurus. The resource therefore contains disease and symptom terms from Spanish clinical texts mapped to both MESH and ICD10.
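
The record does not describe how the mapping through UMLS was carried out. As a rough illustration only, one common way to link two vocabularies in the Metathesaurus is to join them on a shared concept identifier (CUI). The sketch below assumes input lines in the pipe-delimited MRCONSO.RRF layout (CUI in column 0, source abbreviation in column 11, source code in column 13) and the source abbreviations MSH, ICD10CM and ICD10PCS. The function name map_mesh_to_icd10 is hypothetical, and this is not necessarily the procedure the authors followed.

```python
from collections import defaultdict

# Column positions in a pipe-delimited MRCONSO.RRF line (assumed layout).
CUI, SAB, CODE = 0, 11, 13


def map_mesh_to_icd10(mrconso_lines, targets=("ICD10CM", "ICD10PCS")):
    """Join MeSH codes to ICD10 codes that share a UMLS concept (CUI).

    Hypothetical helper: an assumed approach, not the dataset authors' method.
    """
    mesh_by_cui = defaultdict(set)
    icd_by_cui = defaultdict(lambda: defaultdict(set))

    for line in mrconso_lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= CODE:
            continue  # skip blank or truncated lines
        cui, sab, code = fields[CUI], fields[SAB], fields[CODE]
        if sab == "MSH":
            mesh_by_cui[cui].add(code)
        elif sab in targets:
            icd_by_cui[cui][sab].add(code)

    # For every concept that has both a MeSH code and an ICD10 code,
    # attach the ICD10 codes to each MeSH code of that concept.
    mapping = defaultdict(lambda: defaultdict(set))
    for cui, mesh_codes in mesh_by_cui.items():
        for sab, icd_codes in icd_by_cui.get(cui, {}).items():
            for mesh in mesh_codes:
                mapping[mesh][sab] |= icd_codes

    return {
        mesh: {sab: sorted(codes) for sab, codes in by_sab.items()}
        for mesh, by_sab in mapping.items()
    }
```

MeSH codes whose concept has no ICD10 counterpart are simply absent from the result of this sketch.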


Please cite if you use this dataset:

Antonio Miranda-Escalada, Aitor Gonzalez-Agirre, Jordi Armengol-Estapé and Martin Krallinger. Overview of automatic clinical coding: annotations, guidelines, and solutions for non-English clinical cases at CodiEsp track of CLEF eHealth 2020. In CLEF (Working Notes). 2020

  title={Overview of automatic clinical coding: annotations, guidelines, and solutions for non-english clinical cases at codiesp track of CLEF eHealth 2020},
  author={Miranda-Escalada, Antonio and Gonzalez-Agirre, Aitor and Armengol-Estap{\'e}, Jordi and Krallinger, Martin},
  booktitle={Working Notes of Conference and Labs of the Evaluation (CLEF) Forum. CEUR Workshop Proceedings},


File structure

The file is in TSV format: fields are separated by tabs (\t). Every row of the file has the following fields:

terminology    identifier    translatedTerm    termCount    documentCount    ICD10CM-code    ICD10PCS-code

When one MESH term is mapped to more than one ICD10 code, the codes are separated by commas.
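
A minimal sketch of reading this layout in Python, using only the standard library. It assumes the seven fields appear in the order listed above, that a header row may or may not be present (this is not stated here), and that empty code cells mean no mapping. The names parse_rows and split_codes are hypothetical helpers introduced for illustration.

```python
import csv

FIELDS = [
    "terminology", "identifier", "translatedTerm",
    "termCount", "documentCount", "ICD10CM-code", "ICD10PCS-code",
]
CODE_FIELDS = ("ICD10CM-code", "ICD10PCS-code")


def split_codes(cell):
    """Split a comma-separated cell into a list of codes, dropping empties."""
    return [code.strip() for code in cell.split(",") if code.strip()]


def parse_rows(lines):
    """Yield one dict per data row from an iterable of tab-separated lines."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for values in reader:
        if len(values) != len(FIELDS):
            continue  # skip blank or malformed rows
        row = dict(zip(FIELDS, values))
        if row["terminology"] == "terminology":
            continue  # skip a header row, if the file has one (assumption)
        for name in CODE_FIELDS:
            row[name] = split_codes(row[name])
        yield row
```

To read from disk, pass an open file object, for example `open(path, encoding="utf-8", newline="")`, where `path` is wherever the downloaded file was saved. The count fields are left as strings here; convert them with `int()` if needed.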

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL).