Spanish Biomedical Sub-word Embeddings in FastText

doi:10.5281/zenodo.4557459

Published February 23, 2021 | Version v1.0

Dataset Open

1. Barcelona Supercomputing Center

These embeddings were generated from the largest corpus of Spanish biomedical resources compiled to date.

Citation

@misc{temu2021spanish,
      title={Spanish Biomedical and Clinical Language Embeddings}, 
      author={Asier Gutiérrez-Fandiño and Jordi Armengol-Estapé and Casimiro Pio Carrino and Ona De Gibert and Aitor Gonzalez-Agirre and Marta Villegas},
      year={2021},
      eprint={2102.12843},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Notes

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan TL).

Files

Files (12.9 GB)

Name	MD5	Size
cased.zip	5b1365aa7a5f42aba05f5e8c73a61811	6.8 GB
LICENSE.txt	2ab724713fdaf49e4523c4503bfd068d	18.7 kB
README.md	fa42858ee36cfb53a7d3e06e8d161091	1.0 kB
uncased.zip	e636b4d7c0298b192b0cc892a98aed47	6.1 GB
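
Because the archives are several gigabytes each, it can be worth checking a download against the MD5 digests in the file listing above before unpacking it. The following is a minimal sketch, assuming the files were saved under their listed names in a single directory. The helper names `md5_of_file` and `verify_downloads` are hypothetical and are not part of this record or of any tooling shipped with it.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing of this record.
EXPECTED_MD5 = {
    "cased.zip": "5b1365aa7a5f42aba05f5e8c73a61811",
    "uncased.zip": "e636b4d7c0298b192b0cc892a98aed47",
    "LICENSE.txt": "2ab724713fdaf49e4523c4503bfd068d",
    "README.md": "fa42858ee36cfb53a7d3e06e8d161091",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks
    so that multi-gigabyte archives are never loaded into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Hypothetical helper: map each listed file name to True (digest matches),
    False (digest differs), or None (file not present in `directory`)."""
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = md5_of_file(path) == expected if path.is_file() else None
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        status = "missing" if ok is None else ("OK" if ok else "MISMATCH")
        print(f"{name}: {status}")
```

A mismatch usually points to an interrupted or corrupted transfer, in which case downloading the file again is the simplest remedy.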

Additional details

Is supplement to: https://github.com/PlanTL-SANIDAD/Biomedical-Word-Embeddings-for-Spanish

	All versions	This version
Views	763	610
Downloads	66	56
Data volume	324.6 GB	292.8 GB
