Published June 28, 2021 | Version 1.0
Dataset Open

Spanish Legal Domain Word & Sub-Word Embeddings

  • Barcelona Supercomputing Center

Description

Spanish Legal Word and Sub-word Embeddings in FastText

These embeddings were trained on a 9 GB corpus of Spanish legal text, the largest such corpus compiled to date.
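
As a rough illustration of what "sub-word" means here: FastText represents a word through its character n-grams, with `<` and `>` marking the word boundaries, which is what lets it produce vectors for words it never saw in training. The sketch below only lists those n-grams for one word. The function name `char_ngrams` is hypothetical, and the 3 to 6 range is an assumption: the n-gram range used to train these embeddings is not stated on this page.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, FastText style.

    The word is wrapped in '<' and '>' so that prefixes and suffixes
    get their own n-grams. min_n and max_n are assumed values, not
    the settings used for this dataset.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n over the wrapped word.
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


if __name__ == "__main__":
    print(char_ngrams("ley"))
    # ['<le', 'ley', 'ey>', '<ley', 'ley>', '<ley>']
```

The real library additionally hashes each n-gram into a fixed number of buckets and keeps a separate vector for the whole word; this sketch leaves both steps out.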

More legal domain resources: https://github.com/PlanTL-GOB-ES/lm-legal-es

Citation

@misc{gutierrezfandino2021legal,
      title={Spanish Legalese Language Model and Corpora}, 
      author={Asier Gutiérrez-Fandiño and Jordi Armengol-Estapé and Aitor Gonzalez-Agirre and Marta Villegas},
      year={2021},
      eprint={2110.12201},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Copyright

Copyright (c) 2021 Secretaría de Estado de Digitalización e Inteligencia Artificial

Notes

Funded by the Plan de Impulso de las Tecnologías del Lenguaje (Plan-TL).

Files

LICENSE.txt

Files (39.0 GB)

MD5 checksum                            Size
md5:2ab724713fdaf49e4523c4503bfd068d    18.7 kB
md5:e5b1e889c7aeb313ec269cb5cdc92b9f    1.0 kB
md5:55448c3d326918293088a28a05a60d03    5.4 GB
md5:f41bc24f64ae6a4ad5c2b9c79b3a4df3    1.6 GB
md5:ceac247ca045e3cee174c2919ad93cb5    5.4 GB
md5:2dfbb095d2edc947c7de4c49a663d515    5.4 GB
md5:5a433d80382c59823c88a6a4b65ddd0c    5.4 GB
md5:92b5324e21bc7c24778c38cda4b19e78    5.4 GB
md5:0914454cdd06f56df03e900ba8b7085f    5.4 GB
md5:2b182c5b9eadde007ad308851c1978d0    5.2 GB
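
Because most of the files are several gigabytes, it may be worth checking a download against the `md5:` value listed for it above. The following is a minimal sketch using only Python's standard library. The helper names `md5_of_file` and `matches_listing` are hypothetical, and the path in the usage note is a placeholder, since the file names are not shown in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks
    so that multi-gigabyte files do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' (or bare hex)."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_listing("path/to/downloaded_file", "md5:55448c3d326918293088a28a05a60d03")` would return `True` only if the downloaded file is byte-for-byte identical to the published one.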