There is a newer version of the record available.

Published March 8, 2023 | Version 1.3
Dataset Open

MEDDOPLACE Corpus: Gold Standard annotations for Medical Documents Place-related Content Extraction

  • 1. Barcelona Supercomputing Center
  • 2. Dublin University

Description

MEDDOPLACE stands for MEDical DOcument PLAce-related Content Extraction. It is a shared task and set of resources focused on the detection, normalization (entity linking/toponym resolution) and classification of different kinds of places, as well as related types of information such as clinical departments, nationalities or patient movements, in medical documents in Spanish.

This repository includes the Training Set of the corpus in three different formats (brat, .json, .tsv), as well as relevant gazetteers and the test set files. For more information, please check the attached README file.
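As an illustration of working with the brat version, the following minimal sketch reads entity lines from a `.ann` file. It assumes the corpus follows the standard brat standoff layout (`ID<TAB>LABEL START END<TAB>TEXT`, with `;` separating fragments of a discontinuous span); the attached README remains the authority on the actual file layout. The `Entity` type, the `parse_brat_entities` helper and the `PLACE` label in the sample are hypothetical names chosen for this example and are not taken from the corpus.

```python
from typing import List, NamedTuple, Tuple


class Entity(NamedTuple):
    """One text-bound annotation (hypothetical container for this sketch)."""
    ann_id: str
    label: str
    spans: List[Tuple[int, int]]
    text: str


def parse_brat_entities(ann_text: str) -> List[Entity]:
    """Parse text-bound ("T") lines from the contents of a brat .ann file.

    Assumes the standard brat standoff layout; other line types
    (notes, relations, attributes) are skipped.
    """
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # not a text-bound annotation
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # malformed or incomplete line
        ann_id, type_and_offsets, surface = parts[0], parts[1], parts[2]
        label, offsets = type_and_offsets.split(" ", 1)
        spans = []
        for fragment in offsets.split(";"):  # ";" marks discontinuous spans
            start, end = fragment.split()
            spans.append((int(start), int(end)))
        entities.append(Entity(ann_id, label, spans, surface))
    return entities


# Made-up sample line; "PLACE" is a placeholder label, not a corpus label.
sample = "T1\tPLACE 23 32\tBarcelona\n#1\tAnnotatorNotes T1\texample note\n"
for entity in parse_brat_entities(sample):
    print(entity.ann_id, entity.label, entity.spans, entity.text)
```

To process a real file, pass the result of `open(path, encoding="utf-8").read()` to `parse_brat_entities`; the character offsets refer to the matching `.txt` document.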

Update (June 8th, 2023): The test set data for sub-tasks 2 and 3 (i.e. gold standard entity annotations) are now available!

MEDDOPLACE was developed by the Barcelona Supercomputing Center's NLP for Biomedical Information Analysis group and was used as part of IberLEF 2023. For more information on the corpus, annotation scheme and task in general, please visit: https://temu.bsc.es/meddoplace.

Related Links:

- MEDDOPLACE website: https://temu.bsc.es/meddoplace

- Annotation Guidelines (Spanish): https://doi.org/10.5281/zenodo.7775234

- Annotation Guidelines (English): https://doi.org/10.5281/zenodo.7928145

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Contact

If you have any questions or suggestions, please contact us at:

- Salvador Lima-López (<salvador [dot] limalopez [at] gmail [dot] com>)
- Martin Krallinger (<krallinger [dot] martin [at] gmail [dot] com>)

Files

meddoplace_train+test+gazz_230608.zip (5.8 MB)
md5:4d97ceb3d8bc462e618a741b0cb28373
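
The MD5 checksum listed for meddoplace_train+test+gazz_230608.zip can be used to confirm that a download is complete and uncorrupted. The sketch below is one way to do this with Python's standard library; it assumes the archive has been saved under its original name in the current working directory, and the helper names `md5_of_file` and `matches_expected` are hypothetical.

```python
import hashlib
from pathlib import Path

# Checksum as listed on this record for the archive.
EXPECTED_MD5 = "4d97ceb3d8bc462e618a741b0cb28373"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's MD5 digest against the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the archive sits in the current working directory.
    archive = Path("meddoplace_train+test+gazz_230608.zip")
    if archive.exists():
        print("checksum OK" if matches_expected(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an interrupted or corrupted download, in which case fetching the archive again is the simplest remedy.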