Report Open Access

Cantemist guidelines: neoplasms morphology annotation and mapping to CIEO-3

Farré, Eulàlia; González, Gloria; Mas, Toni; Miranda-Escalada, Antonio; Krallinger, Martin

The Cantemist corpus was manually annotated by clinical experts following the Cantemist guidelines. These guidelines contain rules for annotating neoplasm morphology mentions in Spanish oncology clinical cases, as well as for mapping these annotations to CIEO-3 (the Spanish version of ICD-O-3).

Guidelines were created de novo by clinical experts in three phases:

  •  First, a zero version of the guidelines was drafted after the clinical experts reviewed the neoplasm morphology annotations in the SPACCC corpus (see the Codiesp guidelines: https://zenodo.org/record/3730567).
  •  Second, a stable version of the guidelines was reached by iteratively annotating sample sets of the Cantemist corpus until quality control was satisfactory.
  •  Third, the guidelines are iteratively refined as manual annotation continues.

 

Please cite if you use this resource:

Miranda-Escalada, A., Farré, E., & Krallinger, M. (2020). Named entity recognition, concept normalization and clinical coding: Overview of the Cantemist track for cancer text mining in Spanish, corpus, guidelines, methods and results. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020), CEUR Workshop Proceedings.

@inproceedings{miranda2020named,
  title={Named entity recognition, concept normalization and clinical coding: Overview of the {Cantemist} track for cancer text mining in {Spanish}, corpus, guidelines, methods and results},
  author={Miranda-Escalada, A and Farr{\'e}, E and Krallinger, M},
  booktitle={Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020), CEUR Workshop Proceedings},
  year={2020}
}

 

Resources:

  • Web
  • Gold Standard corpus
  • Silver Standard corpus
  • YouTube presentations
  • Participant codes

 

For more information, visit https://temu.bsc.es/cantemist/?p=4362 or email us at encargo-pln-life@bsc.es

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL).

Files (877.8 kB):

  • GUÍAS MORFOLOGÍA NEOPLÁSICA JUNE 2020.pdf (877.8 kB), md5:26a6632f5f86de1e1302ead75c204b19