FastText Spanish Medical Embeddings

Felipe Soares; Marta Villegas; Aitor Gonzalez-Agirre; Jordi Armengol-Estapé; Siamak Barzegar; Martin Krallinger

doi:10.5281/zenodo.3744326

Published April 15, 2020 | Version 2020-04-15

Dataset Open

1. Barcelona Supercomputing Center (BSC)

[Plan TL/medicine/word embeddings] Word embeddings generated from three Spanish corpora: (a) the Spanish full text available on SciELO.org (through December 2018); (b) all articles in the Wikipedia categories Pharmacology, Pharmacy, Medicine and Biology (as of December 2018); and (c) the concatenation of the previous two corpora.

We used fastText to train the word embeddings.

For more information, see the corresponding article: https://www.aclweb.org/anthology/W19-1916/
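As an illustration of how embeddings of this kind are typically consumed, the following is a minimal sketch that parses fastText's plain-text vector format (a header line with the vocabulary size and dimension, then one token and its values per line) and compares two words by cosine similarity. It assumes the archive contains vectors in that text format, which this record does not state; the sample words and values are made up for the demonstration, and `load_vec` and `cosine` are hypothetical helper names.

```python
import io
import math


def load_vec(handle, limit=None):
    """Parse fastText's text format: 'count dim' header, then 'token v1 ... vdim'."""
    header = handle.readline().split()
    dim = int(header[1])
    vectors = {}
    for i, line in enumerate(handle):
        if limit is not None and i >= limit:
            break  # optionally read only the first `limit` entries
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip lines that do not match the declared dimension
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return dim, vectors


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Tiny in-memory stand-in for a real vector file (values are invented).
sample = io.StringIO("3 2\nfiebre 1.0 0.0\ntos 0.8 0.6\nhueso 0.0 1.0\n")
dim, vecs = load_vec(sample)
similarity = cosine(vecs["fiebre"], vecs["tos"])
print(f"dimension={dim}, words={len(vecs)}, similarity={similarity:.3f}")
```

For a real file, the same function could be called on `open(path, encoding="utf-8")`, where `path` points to an extracted text-format vector file; passing `limit` keeps memory use manageable on large vocabularies. Libraries such as gensim can also read this format via `KeyedVectors.load_word2vec_format`.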

Notes

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL) and the ICTUSnet project (https://ictusnet-sudoe.eu/en/).

Files

Name	MD5	Size
biomedical_embeddings_for_spanish_v2.0.zip	69aef90159d92c1b181cc0ea97ca8dde	41.1 GB
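Because the archive is large (41.1 GB), it may be worth checking the download against the MD5 checksum listed for the file. The sketch below streams the file in chunks so it never has to fit in memory; `md5_of_file` and `verify` are hypothetical helper names, and the local path in the usage note is an assumption about where the file was saved.

```python
import hashlib

# Checksum published with the file listing on this record.
EXPECTED_MD5 = "69aef90159d92c1b181cc0ea97ca8dde"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest matches the expected hex string."""
    return md5_of_file(path) == expected.lower()
```

For example, `verify("biomedical_embeddings_for_spanish_v2.0.zip")` would return `True` for an intact download saved under that name in the current directory.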

Additional details

Is supplement to: https://github.com/PlanTL-SANIDAD/Biomedical-Word-Embeddings-for-Spanish

Soares F, Villegas M, Gonzalez-Agirre A, Krallinger M, Armengol-Estapé J. Medical Word Embeddings for Spanish: Development and Evaluation. In: Proceedings of the 2nd Clinical Natural Language Processing Workshop; 2019 Jun. pp. 124-133.

Natural language processing: http://id.loc.gov/authorities/subjects/sh88002425

	All versions	This version
Views	5,624	2,255
Downloads	5,487	4,825
Data volume	703.9 TB	692.9 TB
