Published June 28, 2021 | Version: minor corrections on lemmas and tokenization
Dataset
Open
Enriched CONLLU Ancora for ML training
Description
This is an enriched version of the CONLLU adaptation of the AnCora corpus, intended for machine learning.
This version of the corpus was developed by BSC TeMU as part of the AINA project and has been used for multi-task training of the spaCy 3.4 models for Catalan.
Enriched version of the adaptation of the AnCora corpus to the CONLLU format, aimed at machine learning.
This version of the corpus was developed by BSC TeMU as part of the Aina project and has been used for multi-task training of the spaCy 3.0 models for Catalan.
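As a rough illustration of how a corpus in this format might be consumed, the sketch below reads CoNLL-U text sentence by sentence. It assumes the files follow the standard ten-column, tab-separated CoNLL-U layout; any extra annotation added by the enrichment is not modelled here. The function name `read_conllu` and the sample sentence are hypothetical and are not taken from the corpus.

```python
from typing import Dict, Iterable, Iterator, List

# Standard CoNLL-U column names (assumed to apply to this corpus).
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]

# Invented sample for illustration only; not an excerpt from AnCora.
SAMPLE = (
    "# text = El gat dorm.\n"
    "1\tEl\tel\tDET\t_\t_\t2\tdet\t_\t_\n"
    "2\tgat\tgat\tNOUN\t_\t_\t3\tnsubj\t_\t_\n"
    "3\tdorm\tdormir\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "4\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\t_\n"
    "\n"
)


def read_conllu(lines: Iterable[str]) -> Iterator[List[Dict[str, str]]]:
    """Yield one sentence at a time as a list of token dictionaries."""
    sentence: List[Dict[str, str]] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments and metadata are skipped in this sketch.
            continue
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
        sentence.append(dict(zip(CONLLU_FIELDS, columns)))
    if sentence:
        # Handle input that does not end with a blank line.
        yield sentence
```

To read a file instead of the sample string, an open file handle can be passed directly, for example `read_conllu(open(path, encoding="utf-8"))`, where `path` points to one of the extracted files.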
Files
ANCORA_ca_2022.zip
Total size: 11.6 MB

MD5 checksum | Size |
---|---|
md5:ea78258d289faf0f6a940c06291b3b50 | 5.7 MB |
md5:70d9820fff00313f853d5e4d5fbf87f7 | 5.9 MB |
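The listing gives an MD5 checksum for each file, so a download can be checked against it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_listed_checksum` are hypothetical, and which checksum belongs to which file is not stated in the listing, so the pairing has to be confirmed on the record page.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str, listed: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listed_checksum("ANCORA_ca_2022.zip", "md5:ea78258d289faf0f6a940c06291b3b50")` returns `True` only if that checksum is the one that corresponds to the downloaded archive, which is an assumption here.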