Published February 9, 2021 | Version 1.0
Dataset Open

Catalan Word Embeddings in FastText

Description

These Catalan FastText word embeddings were generated from the largest Catalan corpus assembled to date. The corpus contains more than 10 GB of curated, high-quality text.

If this material is useful, please cite it.

 

Copyright (c) 2021 Text Mining Unit - Barcelona Supercomputing Center

Notes

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL) and the Generalitat de Catalunya, Departament de Polítiques Digitals i Administració Pública.

Files

catalan-word-embeddings-fasttext.zip

Size: 20.0 GB
md5:1ca49371e37725fb57ec743c89cbd5b5
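
Because the archive is large, it may be worth checking the download against the md5 checksum listed above before unpacking it. The following is a minimal sketch using only the Python standard library. The local path and the helper names `md5_of_file` and `matches_expected` are assumptions made for illustration and are not part of this record.

```python
import hashlib
from pathlib import Path

# Checksum as listed in this record's file entry.
EXPECTED_MD5 = "1ca49371e37725fb57ec743c89cbd5b5"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Stream the file so a 20 GB archive is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Compare a file's md5 digest with the expected hex string."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location of the downloaded archive; adjust as needed.
    archive = Path("catalan-word-embeddings-fasttext.zip")
    if archive.exists():
        print("checksum OK" if matches_expected(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

A mismatch usually points to an incomplete or corrupted download, in which case fetching the file again is the simplest remedy.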