Dataset Open Access
Felipe Soares;
Marta Villegas;
Aitor Gonzalez-Agirre;
Jordi Armengol-Estapé;
Siamak Barzegar;
Martin Krallinger
[Plan TL/medicine/word embeddings] Word embeddings trained on three Spanish corpora: (a) the Spanish full-text articles available on SciELO.org (through December 2018), (b) all articles in the Wikipedia categories Pharmacology, Pharmacy, Medicine, and Biology (as of December 2018), and (c) the concatenation of the two previous corpora.
We used fastText to train the word embeddings.
For more information, see the corresponding article: https://www.aclweb.org/anthology/W19-1916/
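As a rough illustration of how the vectors might be inspected once extracted, the sketch below parses the textual `.vec` layout that fastText commonly writes (a header line with the vocabulary size and dimension, then one word and its values per line) and compares two words by cosine similarity. This page does not describe the contents of the archive, so the presence of `.vec` files is an assumption; `load_vec`, `cosine`, and the toy vectors are hypothetical names and data used only for illustration.

```python
import io
import math


def load_vec(handle, limit=None):
    """Parse vectors in the textual .vec layout into a dict of word -> list of floats.

    Assumes a header line "<vocabulary size> <dimension>" followed by one
    space-separated "<word> <v1> ... <vd>" entry per line.
    """
    header = handle.readline().split()
    dim = int(header[1])
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip lines that do not match the declared dimension
        vectors[parts[0]] = [float(x) for x in parts[1:]]
        if limit is not None and len(vectors) >= limit:
            break  # stop early so a very large file need not be read in full
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy data standing in for a real file; these values are invented for the example.
TOY_VEC = "3 2\nfiebre 1.0 0.0\ntos 0.8 0.6\ncélula 0.0 1.0\n"

if __name__ == "__main__":
    toy = load_vec(io.StringIO(TOY_VEC))
    print(cosine(toy["fiebre"], toy["tos"]))
```

For a real file, the same function could be called on `open(path, encoding="utf-8")`, where `path` is whichever extracted text file applies; passing `limit` keeps memory use bounded when only the most frequent words are needed.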
Name | Size |
---|---|
biomedical_embeddings_for_spanish_v2.0.zip (md5:69aef90159d92c1b181cc0ea97ca8dde) | 41.1 GB |
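Given the size of the archive, it may be worth checking the download against the MD5 digest listed with the file. The following is a minimal sketch using only the Python standard library; the local path is an assumption, and `md5_of_file` and `verify_download` are hypothetical helper names, not part of any tooling distributed with the dataset.

```python
import hashlib
from pathlib import Path

# MD5 digest published on this page for biomedical_embeddings_for_spanish_v2.0.zip
EXPECTED_MD5 = "69aef90159d92c1b181cc0ea97ca8dde"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    archive = Path("biomedical_embeddings_for_spanish_v2.0.zip")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found")
```

Reading in fixed-size chunks matters here because the archive is far larger than typical available memory.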
Soares F, Villegas M, Gonzalez-Agirre A, Krallinger M, Armengol-Estapé J. Medical Word Embeddings for Spanish: Development and Evaluation. In Proceedings of the 2nd Clinical Natural Language Processing Workshop, June 2019 (pp. 124-133).
Metric | All versions | This version |
---|---|---|
Views | 3,275 | 1,221 |
Downloads | 17,284 | 16,725 |
Data volume | 694.2 TB | 686.7 TB |
Unique views | 2,640 | 1,048 |
Unique downloads | 5,089 | 4,681 |