There is a newer version of the record available.

Published April 11, 2023 | Version v3
Dataset Open

MedProcNER/ProcTEMIST Corpus: Gold Standard annotations for Clinical Procedures Information Extraction

Description

MedProcNER stands for MEDical PROCedure Named Entity Recognition. It is a shared task and set of resources focused on the detection, normalization and indexing of clinical procedures in Spanish medical documents. MedProcNER is complementary to the DisTEMIST corpus (https://temu.bsc.es/distemist): both use the same document collection, which is why MedProcNER is also called ProcTEMIST.

This repository contains the Train Set of the task, a total of 750 documents. The unannotated test text files are also included so that predictions can be generated for them. Finally, we include a gazetteer of candidate SNOMED CT codes for the normalization and indexing tasks. For more information, please check the attached README file.

UPDATE (May 2, 2023): The second part of the train set, the test set texts and the gazetteer are now available.

UPDATE (May 12, 2023): We've uploaded a new version of the gazetteer that removes some ambiguous codes wrongly added from older SNOMED versions.

MedProcNER was developed by the Barcelona Supercomputing Center's NLP for Biomedical Information Analysis group and was used as part of BioASQ @ CLEF 2023. For more information on the corpus, the annotation scheme and the task in general, please visit: https://temu.bsc.es/medprocner.

Related Links:

- MedProcNER website: https://temu.bsc.es/medprocner

- MedProcNER Guidelines: https://doi.org/10.5281/zenodo.7817666

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Contact

If you have any questions or suggestions, please contact us at:

- Salvador Lima-López (<salvador [dot] limalopez [at] gmail [dot] com>)
- Martin Krallinger (<krallinger [dot] martin [at] gmail [dot] com>)

Files

medprocner_gs_train+test+gazz_230512.zip (6.6 MB)
md5:430d33d0f874b11c8b02ac41d90898ca
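
Since the record lists an MD5 checksum for the archive, it can be worth verifying the download before unpacking it. The following is a minimal sketch using only the Python standard library. The file name and checksum are copied from the listing above; the helper names (`md5_of_file`, `matches_checksum`) and the assumption that the archive sits in the current working directory are illustrative and not part of the corpus distribution.

```python
import hashlib
from pathlib import Path

# Values copied from the file listing of this record.
ARCHIVE_NAME = "medprocner_gs_train+test+gazz_230512.zip"
EXPECTED_MD5 = "430d33d0f874b11c8b02ac41d90898ca"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5=EXPECTED_MD5):
    """True if the file's MD5 equals the expected value (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()


if __name__ == "__main__":
    # Assumption: the archive was downloaded into the current directory.
    archive = Path(ARCHIVE_NAME)
    if not archive.exists():
        print(f"{archive} not found; download it first")
    elif matches_checksum(archive):
        print("checksum OK")
    else:
        print("checksum MISMATCH; the download may be corrupted or a different version")
```

A mismatch does not necessarily mean corruption: the record notes that a newer version exists, so a file fetched from a different version of the record would be expected to have a different checksum.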