Dataset Open Access
Felipe Soares;
Marta Villegas;
Aitor Gonzalez-Agirre;
Jordi Armengol-Estapé;
Martin Krallinger
This version throws an error while loading the file. A newer version of this record is available.
[Plan TL/medicine/word embeddings] Word embeddings generated from Spanish corpora that include: (a) the Spanish full text available on Scielo.org (up to December 2018), (b) all articles from the following Wikipedia categories: Pharmacology, Pharmacy, Medicine and Biology (as of December 2018), and (c) the concatenation of the previous two corpora.
Two different approaches were used to generate the word embeddings: Word2Vec and fastText.
For more information, we refer to the corresponding article: https://www.aclweb.org/anthology/W19-1916/
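
As an illustration of how embeddings of this kind are typically consumed, the sketch below parses the plain-text vector layout that both Word2Vec and fastText can export (a header line with the vocabulary size and dimension, then one token and its values per line) and compares two words by cosine similarity. It assumes the archive contains files in that text layout, which this record does not confirm; the function names `parse_vec_lines` and `cosine` and the sample words are hypothetical.

```python
import math
from typing import Dict, Iterable, List


def parse_vec_lines(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse the textual ".vec" layout: 'count dim' header, then 'token v1 ... vd'.

    Assumption: the embeddings are stored as text, not in a binary format.
    """
    it = iter(lines)
    header = next(it).split()
    dim = int(header[1])  # second header field is the vector dimension
    vectors: Dict[str, List[float]] = {}
    for line in it:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip rows that do not match the declared dimension
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy data standing in for a real file; the values are made up for illustration.
sample = ["2 3", "fiebre 1.0 0.0 0.0", "dolor 0.0 1.0 0.0"]
vectors = parse_vec_lines(sample)
similarity = cosine(vectors["fiebre"], vectors["dolor"])
```

With a real file, the lines would come from `open(path, encoding="utf-8")` instead of the in-memory list. For files of several gigabytes, a dedicated library is likely a better fit than loading everything into Python lists.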
Name | Size | MD5
---|---|---
Embeddings_2019-01-01.zip | 8.6 GB | e7a3dce00bcc156e150d45ae85e02be9
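
Since the archive is large, it may be worth checking the download against the MD5 digest listed for it before unpacking. The following is a minimal sketch using only the Python standard library; the helper name `md5_of_file`, the chunk size, and the local path are illustrative assumptions, while the expected digest is the one given in this record.

```python
import hashlib


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use flat even for multi-gigabyte files.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Digest published for Embeddings_2019-01-01.zip in this record.
EXPECTED_MD5 = "e7a3dce00bcc156e150d45ae85e02be9"


def is_intact(path: str) -> bool:
    """True if the file at `path` matches the published digest."""
    return md5_of_file(path) == EXPECTED_MD5
```

A call such as `is_intact("Embeddings_2019-01-01.zip")` (assuming the file was saved under that name in the working directory) returns `False` for a truncated or corrupted download.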
Soares F, Villegas M, Gonzalez-Agirre A, Krallinger M, Armengol-Estapé J. Medical Word Embeddings for Spanish: Development and Evaluation. In Proceedings of the 2nd Clinical Natural Language Processing Workshop, June 2019 (pp. 124-133).
Statistic | All versions | This version
---|---|---
Views | 1,907 | 977
Downloads | 16,904 | 279
Data volume | 681.8 TB | 2.4 TB
Unique views | 1,552 | 868
Unique downloads | 4,845 | 207