ClinSpEn-CC (Clinical Cases) Test + Background Set
Description
This repository contains the test and background data for the ClinSpEn-Clinical Cases sub-track. ClinSpEn is part of the Biomedical WMT 2022 shared task and aims to promote the development and evaluation of machine translation systems adapted to the medical domain. It comprises three sub-tracks: clinical cases, medical controlled vocabularies/ontologies, and clinical terms and entities extracted from medical content.
The data consists of a TSV file with three columns: document number, line number, and English line. The translation direction of this sub-track is EN>ES (English to Spanish). The clinical cases include COVID-19 case reports as well as diverse content extracted from PubMed.
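As an illustration of the three-column layout described above, the following is a minimal sketch of how the file could be read and grouped by document. It assumes the columns appear in the stated order (document number, line number, English line), that rows are tab-separated with no quoting, and that any header row can be recognised by a non-numeric line number. The function name `read_cases` and the sample rows are hypothetical and are not part of the dataset.

```python
import io
from collections import defaultdict


def read_cases(handle):
    """Group English lines by document number, ordered by line number.

    `handle` is any iterable of text lines (an open file or StringIO).
    Assumed row layout: document number <TAB> line number <TAB> English line.
    """
    docs = defaultdict(list)
    for raw in handle:
        raw = raw.rstrip("\r\n")
        if not raw:
            continue  # skip blank lines
        # Split on the first two tabs only, so tabs inside the text survive.
        parts = raw.split("\t", 2)
        if len(parts) != 3:
            continue  # skip malformed rows
        doc_id, line_no, text = parts
        if not line_no.strip().isdigit():
            continue  # skip a header row, if the file has one (assumption)
        docs[doc_id].append((int(line_no), text))
    # Sort each document's rows by line number and keep only the text.
    return {
        doc_id: [text for _, text in sorted(rows, key=lambda r: r[0])]
        for doc_id, rows in docs.items()
    }


# Invented sample rows for demonstration; not taken from the dataset.
sample = io.StringIO(
    "doc1\t2\tShe was admitted to hospital.\n"
    "doc1\t1\tA 45-year-old woman presented with fever.\n"
    "doc2\t1\tNo relevant medical history.\n"
)
cases = read_cases(sample)
for doc_id, lines in cases.items():
    print(doc_id, len(lines))
```

With the real data, the same function could be called on a file opened in text mode, for example with `open(path, encoding="utf-8")`; the UTF-8 encoding is an assumption, and the path depends on where the download is saved.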
Related Links:
- Data website with more information: https://temu.bsc.es/clinspen/
- WMT website (includes schedule, registration, ...): https://www.statmt.org/wmt22/
- CodaLab: https://codalab.lisn.upsaclay.fr/competitions/6696
ClinSpEn SAMPLE SETS:
- ClinSpEn-CC Sample Set (Clinical Cases): https://doi.org/10.5281/zenodo.6497350
- ClinSpEn-CT Sample Set (Clinical Terms): https://doi.org/10.5281/zenodo.6497372
- ClinSpEn-OC Sample Set (Ontology Concepts): https://doi.org/10.5281/zenodo.6497388
ClinSpEn TEST SETS:
- ClinSpEn-CC Test Set (Clinical Cases): https://doi.org/10.5281/zenodo.6948634
- ClinSpEn-CT Test Set (Clinical Terms): https://doi.org/10.5281/zenodo.6948669
- ClinSpEn-OC Test Set (Ontology Concepts): https://doi.org/10.5281/zenodo.6948679
Files
- Size: 12.9 MB
- md5: ad56ba4ad2fefd084095727062897d0f