MEDDOPLACE Corpus: Gold Standard annotations for Medical Documents Place-related Content Extraction

Salvador Lima López; Eulàlia Farré-Maduell; Vicent Briva-Iglesias; Luis Gasco; Martin Krallinger

doi:10.5281/zenodo.8017179

Published March 8, 2023 | Version 1.3

Dataset Open

1. Barcelona Supercomputing Center
2. Dublin University

MEDDOPLACE stands for MEDical DOcument PLAce-related Content Extraction. It is a shared task and set of resources focused on the detection, normalization (entity linking/toponym resolution) and classification of different kinds of places, as well as related types of information such as clinical departments, nationalities or patient movements, in medical documents in Spanish.

This repository includes the Training Set of the corpus in three different formats (brat, .json, .tsv), as well as relevant gazetteers and the test set files. For more information, please check the attached README file.
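As a rough illustration of working with the brat version, the sketch below reads text-bound entity lines from a brat standoff .ann file. It assumes the standard brat layout (ID, then type and character offsets, then covered text, separated by tabs). The label PLACE_EXAMPLE and the sample strings are hypothetical and are not taken from the corpus; the attached README remains the reference for the actual file layout and label set.

```python
from typing import List, NamedTuple, Tuple


class Entity(NamedTuple):
    ann_id: str                   # e.g. "T1"
    label: str                    # entity type as written in the .ann file
    spans: List[Tuple[int, int]]  # one (start, end) pair per fragment
    text: str                     # covered text as stored by brat


def parse_brat_entities(ann_text: str) -> List[Entity]:
    """Collect text-bound annotations (lines whose ID starts with 'T').

    Other brat line types (relations, notes, attributes) are skipped.
    """
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # malformed or truncated line
        ann_id, type_and_offsets, text = parts[0], parts[1], parts[2]
        label, offsets = type_and_offsets.split(" ", 1)
        # Discontinuous annotations separate fragments with ';'
        spans = [
            (int(frag.split()[0]), int(frag.split()[1]))
            for frag in offsets.split(";")
        ]
        entities.append(Entity(ann_id, label, spans, text))
    return entities


if __name__ == "__main__":
    # Hypothetical sample content, not real corpus data.
    sample = "T1\tPLACE_EXAMPLE 21 30\tBarcelona\n#1\tAnnotatorNotes T1\texample note\n"
    for entity in parse_brat_entities(sample):
        print(entity)
```

In practice the string passed to parse_brat_entities would be the contents of one .ann file, read with UTF-8 encoding so that Spanish characters and offsets line up with the matching .txt file.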

Update (June 8th, 2023): The test set data for sub-tasks 2 and 3 (i.e. gold standard entity annotations) are now available!

MEDDOPLACE was developed by the Barcelona Supercomputing Center's NLP for Biomedical Information Analysis group and was used as part of IberLEF 2023. For more information on the corpus, the annotation scheme and the task in general, please visit: https://temu.bsc.es/meddoplace.

Related Links:

- MEDDOPLACE website: https://temu.bsc.es/meddoplace

- Annotation Guidelines (Spanish): https://doi.org/10.5281/zenodo.7775234

- Annotation Guidelines (English): https://doi.org/10.5281/zenodo.7928145

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Contact

If you have any questions or suggestions, please contact us at:

- Salvador Lima-López (<salvador [dot] limalopez [at] gmail [dot] com>)
- Martin Krallinger (<krallinger [dot] martin [at] gmail [dot] com>)

Files

Files (5.8 MB)

Name	Size	MD5
meddoplace_train+test+gazz_230608.zip	5.8 MB	4d97ceb3d8bc462e618a741b0cb28373
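
The MD5 checksum listed for the archive can be used to confirm that a download is complete and unmodified. The following is a minimal sketch using only the Python standard library; it assumes the archive was saved under its original name in the current working directory, and the function names are illustrative.

```python
import hashlib
from pathlib import Path

# Checksum as listed for the archive in the Files section.
EXPECTED_MD5 = "4d97ceb3d8bc462e618a741b0cb28373"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    archive = Path("meddoplace_train+test+gazz_230608.zip")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

MD5 is suitable here only as an integrity check against transfer errors, which is the purpose the listed checksum serves.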

