Published February 16, 2021 | Version 3.0
Dataset Open

Spanish Biomedical Word Embeddings in FastText

Description


These word embeddings were generated from what is, to date, the largest corpus built from Spanish biomedical resources.
The corpus contains more than 6 GB of curated, high-quality text.

For the previous version (v2), see: https://zenodo.org/record/3744326#.YCu3fGj0mUk

Citation

@misc{temu2021spanish,
      title={Spanish Biomedical and Clinical Language Embeddings}, 
      author={Asier Gutiérrez-Fandiño and Jordi Armengol-Estapé and Casimiro Pio Carrino and Ona De Gibert and Aitor Gonzalez-Agirre and Marta Villegas},
      year={2021},
      eprint={2102.12843},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

 

Copyright (c) 2021 Text Mining Unit - Barcelona Supercomputing Center

Notes

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL).

Files

cased.zip

Files (34.2 GB)

Checksum                                Size
md5:35ee1214fd7dec98554a16f7ad6e0bae    18.0 GB
md5:2ab724713fdaf49e4523c4503bfd068d    18.7 kB
md5:fbe36370f103502a514a3efe4f7eac31    675 Bytes
md5:6690aa9fd67bad053382380a8365e422    16.2 GB
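
The MD5 checksums in the file listing can be used to confirm that a multi-gigabyte download arrived intact. The following is a minimal sketch, not part of the original record: the helper names `md5_of_file` and `matches_checksum` are hypothetical, and since the listing does not show which checksum belongs to which file name, the pairing in the usage comment is an assumption to be checked against the record itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks (1 MiB by default) so that archives of
    many gigabytes do not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a listed value such as 'md5:35ee...'.

    The optional 'md5:' prefix used in the file listing is stripped,
    and the comparison ignores letter case.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex


# Hypothetical usage: the file name and checksum pairing is assumed,
# because the listing does not say which checksum goes with which file.
# matches_checksum("cased.zip", "md5:35ee1214fd7dec98554a16f7ad6e0bae")
```

A result of `False` may mean either a corrupted download or that the checksum was paired with the wrong file, so it is worth trying the other large-file checksum before downloading again.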

Additional details