Scaling ContProto Performance to Large Multilingual Language Models for Cross-Lingual NER
Description
Natural language processing tasks such as Named Entity Recognition (NER) on non-English clinical texts can be time-consuming and expensive because annotated data is scarce. Cross-lingual transfer (CLT) circumvents this issue: a multilingual large language model fine-tuned on a task in one language can achieve high accuracy on the same task in another language. Alternatively, translation-based methods can perform NER without annotated data in the target language by translating either the training set or the test set.
Research goal: Can the performance gains from ContProto be scaled to large multilingual language models like XLM-RoBERTa or mBERT when fine-tuned for cross-lingual NER?
Autonomous synthesis report generated by Assignee Research. Tribunal consensus score: 7.5/10.
Notes
Files
| Name | Size |
|---|---|
| paper.pdf (md5:69a7d3f026a8d0c4ef2b711bb51c8e43) | 80.8 kB |