
Published March 8, 2023 | Version 1.3
Dataset Open

MEDDOPLACE Corpus: Gold Standard annotations for Medical Documents Place-related Content Extraction

  • 1. Barcelona Supercomputing Center
  • 2. Dublin University


MEDDOPLACE stands for MEDical DOcument PLAce-related Content Extraction. It is a shared task and set of resources focused on the detection, normalization (entity linking/toponym resolution) and classification of different kinds of places, as well as related types of information such as clinical departments, nationalities or patient movements, in medical documents in Spanish.

This repository includes the Training Set of the corpus in three different formats (brat, .json, .tsv), as well as relevant gazetteers and the test set files. For more information, please check the attached README file.
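As a rough illustration of working with the brat version of the data, the sketch below reads text-bound entity lines from a brat standoff `.ann` file. It assumes the files follow the standard brat standoff layout (ID, then label and character offsets, then the covered text, separated by tabs). The `PLACE` label, the sample offsets and the helper name `parse_brat_ann` are hypothetical and are not taken from the corpus; the README remains the reference for the actual labels and file layout.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Entity:
    ann_id: str                   # e.g. "T1"
    label: str                    # entity type (hypothetical labels in the sample below)
    spans: List[Tuple[int, int]]  # one (start, end) pair per fragment
    text: str                     # surface text covered by the annotation


def parse_brat_ann(ann_text: str) -> List[Entity]:
    """Parse text-bound annotations (lines starting with 'T') from brat standoff text.

    Other line types (relations, notes, attributes) are skipped in this sketch.
    """
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue
        ann_id, label_and_offsets, surface = line.split("\t", 2)
        label, offsets = label_and_offsets.split(" ", 1)
        # Discontinuous annotations separate fragments with ';', e.g. "40 46;50 55".
        spans = [
            (int(start), int(end))
            for start, end in (frag.split() for frag in offsets.split(";"))
        ]
        entities.append(Entity(ann_id, label, spans, surface))
    return entities


# Made-up sample content, assumed for illustration only (not real corpus data).
sample = (
    "T1\tPLACE 22 31\tBarcelona\n"
    "#1\tAnnotatorNotes T1\texample-note\n"
    "T2\tPLACE 40 46;50 55\tMadrid norte\n"
)

entities = parse_brat_ann(sample)
for ent in entities:
    print(ent.ann_id, ent.label, ent.spans, ent.text)
```

Note lines (those starting with `#`) are ignored here; if the corpus stores normalization codes in such lines, they would need separate handling.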

Update (June 8th, 2023): The test set data for sub-tasks 2 and 3 (i.e. gold standard entity annotations) are now available!

MEDDOPLACE was developed by the NLP for Biomedical Information Analysis group at the Barcelona Supercomputing Center and was used as part of IberLEF 2023. For more information on the corpus, annotation scheme and task in general, please visit:

Related Links:

- MEDDOPLACE website:

- Annotation Guidelines (Spanish):

- Annotation Guidelines (English):


This work is licensed under a Creative Commons Attribution 4.0 International License.


If you have any questions or suggestions, please contact us at:

- Salvador Lima-López (<salvador [dot] limalopez [at] gmail [dot] com>)
- Martin Krallinger (<krallinger [dot] martin [at] gmail [dot] com>)


Files (5.8 MB)
