Published April 25, 2024 | Version v1
Other Open

DrugTEMIST Guidelines: Annotation of Medication in Medical Documents

  • 1. Barcelona Supercomputing Center

Description

DrugTEMIST stands for Drug Text Mining Shared Task. It is a set of resources focused on the detection of medication mentions in clinical text in Spanish and other languages. DrugTEMIST is complementary to the DisTEMIST, ProcTEMIST and SympTEMIST corpora, as they all use the same document collection. 

This repository includes the Annotation Guidelines, a 17-page document that describes how to annotate medications in medical documents. The guidelines are only available in Spanish.

DrugTEMIST was released as part of the MultiCardioNER Shared Task. MultiCardioNER was developed by the NLP for Biomedical Information Analysis group at the Barcelona Supercomputing Center and was used as part of BioASQ 2024. For more information on the corpus, annotation scheme and task in general, please visit: https://temu.bsc.es/multicardioner. This task is promoted by Spanish and European projects such as DataTools4Heart, AI4HF, BARITONE and AI4ProfHealth.

Resources

License

This work is licensed under a Creative Commons Attribution 4.0 International License.

Contact

If you have any questions or suggestions, please contact us at:

- Salvador Lima-López (<salvador [dot] limalopez [at] gmail [dot] com>)
- Martin Krallinger (<krallinger [dot] martin [at] gmail [dot] com>)

Additional resources and corpora

If you are interested in MultiCardioNER, you might want to check out these corpora and resources:

  • DisTEMIST (Corpus of disease mentions and normalization to SNOMED CT)
  • MedProcNER (Corpus of clinical procedure mentions and normalization to SNOMED CT)
  • SympTEMIST (Corpus of clinical findings and normalization to SNOMED CT)
  • PharmaCoNER (Corpus of mentions of medications, drugs, chemical substances, genes, proteins and vaccines, and normalization)
  • MEDDOPROF (Corpus of mentions of professions, occupations and working status and normalization)
  • MEDDOPLACE (Corpus of place-related entity mentions, including departments, nationalities and patient movements, and normalization)
  • MEDDOCAN (Corpus of mentions of Protected Health Information (PHI))
  • CANTEMIST (Corpus of cancer tumor morphology mentions and normalization)
  • CodiESP (Corpus of clinical case reports with assigned clinical codes from ICD-10, Spanish version)
  • LivingNER (Corpus of mentions of species, including human/family members, pathogens, food, etc., and normalization to NCBI Taxonomy)
  • SPACCC-POS (Corpus of clinical case reports in Spanish annotated with POS-tags)
  • SPACCC-TOKEN (Corpus of clinical case reports in Spanish annotated with token-tags (word mention boundaries))
  • SPACCC-SPLIT (Corpus of clinical case reports in Spanish annotated with sentence boundary-tags)
  • MESINESP-2 (Corpus of records manually indexed with DeCS/MeSH terms, comprising scientific literature abstracts, clinical trials, and patent abstracts)

Files

Guías DrugTEMIST v1.pdf (1.4 MB)
md5:0d481461637ffa3855dd2c86e98bf169
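
The MD5 checksum listed above can be used to confirm that a downloaded copy of the guidelines is intact. The following is a minimal sketch in Python using only the standard library. The function names `md5_of_file` and `matches_expected` are hypothetical helpers introduced here for illustration, and the local file path is whatever name the file was saved under.

```python
import hashlib

# Checksum as listed for "Guías DrugTEMIST v1.pdf" in the Files section.
EXPECTED_MD5 = "0d481461637ffa3855dd2c86e98bf169"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected checksum."""
    return md5_of_file(path) == expected.strip().lower()
```

Calling `matches_expected("Guías DrugTEMIST v1.pdf")` on the downloaded file should return `True` if the copy is complete. MD5 is suitable here only as an integrity check against accidental corruption, not as a security guarantee.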