Spanish Biomedical Word Embeddings in FastText

Gutiérrez-Fandiño, Asier; Armengol-Estapé, Jordi; Carrino, Casimiro Pio; De Gibert, Ona; Gonzalez-Agirre, Aitor; Villegas, Marta

doi:10.5281/zenodo.4543236

Published February 16, 2021 | Version 3.0

Dataset Open

1. Barcelona Supercomputing Center

These word embeddings were generated from the largest corpus compiled from Spanish biomedical resources to date.
The corpus contains more than 6 GB of curated, high-quality text.

For the previous version (v2), see: https://zenodo.org/record/3744326#.YCu3fGj0mUk

Citation

@misc{temu2021spanish,
      title={Spanish Biomedical and Clinical Language Embeddings}, 
      author={Asier Gutiérrez-Fandiño and Jordi Armengol-Estapé and Casimiro Pio Carrino and Ona De Gibert and Aitor Gonzalez-Agirre and Marta Villegas},
      year={2021},
      eprint={2102.12843},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Notes

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL).

Files

Files (34.2 GB)

Name	MD5	Size
cased.zip	35ee1214fd7dec98554a16f7ad6e0bae	18.0 GB
LICENSE.txt	2ab724713fdaf49e4523c4503bfd068d	18.7 kB
README.md	fbe36370f103502a514a3efe4f7eac31	675 Bytes
uncased.zip	6690aa9fd67bad053382380a8365e422	16.2 GB
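
Because the archives are large (18.0 GB and 16.2 GB), it can be worth checking a download against the MD5 digests listed above before unpacking it. The following is a minimal sketch using only the Python standard library. The helper names `md5_of_file` and `verify_download` are illustrative and not part of the dataset, and the sketch assumes the files keep their listed names after download.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing above.
EXPECTED_MD5 = {
    "cased.zip": "35ee1214fd7dec98554a16f7ad6e0bae",
    "LICENSE.txt": "2ab724713fdaf49e4523c4503bfd068d",
    "README.md": "fbe36370f103502a514a3efe4f7eac31",
    "uncased.zip": "6690aa9fd67bad053382380a8365e422",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks
    so that multi-gigabyte archives never have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Return True if the file's digest matches the listed one.

    Assumes the local file name is one of the names in EXPECTED_MD5;
    any other name raises KeyError.
    """
    path = Path(path)
    return md5_of_file(path) == EXPECTED_MD5[path.name]
```

For example, `verify_download("cased.zip")` returns `True` only if the local file is byte-for-byte identical to the published archive. A `False` result usually points to an interrupted or corrupted download.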

Additional details

Is supplement to: https://github.com/PlanTL-SANIDAD/Biomedical-Word-Embeddings-for-Spanish

	All versions	This version
Views	862	862
Downloads	382	382
Data volume	4.3 TB	4.3 TB
