SemEval-2021 Task 10: Source-Free Domain Adaptation for Semantic Processing
- 1. University of Arizona
- 2. George Mason University
- 3. Boston Children's Hospital and Harvard Medical School
Description
Data sharing restrictions are common in NLP datasets. For example, Twitter policies do not allow sharing of tweet text, though tweet IDs may be shared. Such restrictions are even more common in clinical NLP, where patient health information must be protected, and annotations over health text, when released at all, often require signing complex data use agreements. The SemEval-2021 Task 10 framework asks participants to develop semantic annotation systems in the face of data sharing constraints. A participant's goal is to develop an accurate system for a target domain when annotations exist for a related domain but cannot be distributed. Instead of annotated training data, participants are given a model trained on the annotations. Then, given unlabeled target domain data, they are asked to make predictions.
Website: https://machine-learning-for-medical-language.github.io/source-free-domain-adaptation/
CodaLab site: https://competitions.codalab.org/competitions/26152
GitHub repository: https://github.com/Machine-Learning-for-Medical-Language/source-free-domain-adaptation
Files
baselines.zip
Total size: 205.6 kB
MD5 checksum | Size |
---|---|
8dc156d135d41cf0905f9568d7791858 | 12.2 kB |
b4bd608b9a18a2bc607bc03c0b7ef926 | 109.0 kB |
56aab514f9450d8dcf6e5f1e845df789 | 2.3 kB |
171f3ab4b1545d1333753c3a3b71b551 | 41.3 kB |
b0c9194e0fba78134e29b9ba5d14de1a | 40.8 kB |
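The MD5 checksums listed above can be used to confirm that a downloaded file arrived intact. The following is a minimal sketch using only Python's standard `hashlib` module; the helper names `md5_of_file` and `verify_download` are illustrative and not part of the task's released code, and the pairing of each checksum with a file name has to be taken from the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that large archives do not need to
    fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Return True if the file's MD5 matches the published checksum.

    `expected_md5` is the hex string shown in the file listing, without
    the "md5:" prefix. Case and surrounding whitespace are ignored.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify_download(local_path, "8dc156d135d41cf0905f9568d7791858")` returns `True` only if the file saved at the hypothetical `local_path` is byte-for-byte identical to the published file with that checksum.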
Additional details
Funding
- Temporal relation discovery for clinical text 5R01LM010090-07
- National Institutes of Health
- Automated domain adaptation for clinical natural language processing 1R01LM012918-01
- National Institutes of Health