10.5281/zenodo.4121183
https://zenodo.org/records/4121183
oai:zenodo.org:4121183
Farré, Eulàlia
Eulàlia
Farré
Barcelona Supercomputing Center
González, Gloria
Gloria
González
Bitac
Mas, Toni
Toni
Mas
Bitac
Miranda-Escalada, Antonio
Antonio
Miranda-Escalada
0000-0002-5654-001X
Barcelona Supercomputing Center
Krallinger, Martin
Martin
Krallinger
0000-0002-2646-8782
Barcelona Supercomputing Center
Cantemist guidelines: neoplasms morphology annotation and mapping to CIEO-3
Zenodo
2020
NLP
guidelines
annotation
clinical
neoplasm morphology
cieo
oncology
NER
normalization
ICD-O
2020-06-05
spa
10.5281/zenodo.3878178
https://zenodo.org/communities/medicalnlp
1.3
Creative Commons Attribution 4.0 International
The Cantemist corpus was manually annotated by clinical experts following the Cantemist guidelines. These guidelines contain rules for annotating neoplasm morphology mentions in Spanish oncology clinical cases, as well as for mapping these annotations to CIEO-3 (the Spanish version of ICD-O-3).
The guidelines were created de novo by clinical experts in three phases:
First, a zero version of the guidelines was produced after the clinical experts reviewed the neoplasm morphology annotations in the SPACCC corpus (see the Codiesp guidelines: https://zenodo.org/record/3730567).
Second, a stable version of the guidelines was reached by iteratively annotating sample sets of the Cantemist corpus until quality control was satisfactory.
Third, the guidelines are iteratively refined as manual annotation continues.
Please cite if you use this resource:
Miranda-Escalada, A., Farré, E., & Krallinger, M. (2020). Named entity recognition, concept normalization and clinical coding: Overview of the Cantemist track for cancer text mining in Spanish, corpus, guidelines, methods and results. In Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020), CEUR Workshop Proceedings.
@inproceedings{miranda2020named,
title={Named entity recognition, concept normalization and clinical coding: Overview of the {Cantemist} track for cancer text mining in {Spanish}, corpus, guidelines, methods and results},
author={Miranda-Escalada, A and Farr{\'e}, E and Krallinger, M},
booktitle={Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2020), CEUR Workshop Proceedings},
year={2020}
}
Resources:
Web
Gold Standard corpus
Silver Standard corpus
YouTube presentations
Participant codes
For more information, visit https://temu.bsc.es/cantemist/?p=4362 or email us at encargo-pln-life@bsc.es.
Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL).