Published July 1, 2020 | Version v1
Other
Open
Chilean Waiting List Corpus Embeddings
Creators
- 1. Center for Mathematical Modeling, University of Chile
Contributors
Supervisors:
- 1. Centro de Modelamiento Matemático, Universidad de Chile
- 2. Centro de Informática Médica y Telemedicina, Universidad de Chile
Description
The Chilean Waiting List Corpus Embeddings are Word2Vec word embeddings trained on 11 million unstructured free-text diagnoses obtained from the Chilean Waiting List through the Transparency Law. The training corpus comprised 56 million word tokens, with a vocabulary of 252 thousand distinct words.
Mikolov's original implementation of the Word2Vec algorithm was used to compute the embeddings with its default hyperparameters, except for the vector size, which was set to 300.
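As an illustration of how such embeddings might be consumed, the following minimal sketch parses the plain-text format that the original Word2Vec tool can write (a `vocab_size dim` header followed by one `word v1 ... vdim` line per word) and compares two words by cosine similarity. It assumes the distributed file uses that text format, which this record does not state. The function names `load_word2vec_text` and `cosine`, the three sample words, and their 2-dimensional vectors are hypothetical toy data, not values from the actual 300-dimensional embeddings.

```python
import io
import math


def load_word2vec_text(stream):
    """Parse word2vec plain-text output into a dict of word -> list of floats.

    Assumption: the stream starts with a 'vocab_size dim' header line and
    then holds one space-separated 'word v1 ... vdim' line per word.
    """
    header = stream.readline().split()
    dim = int(header[1])  # header[0] is the vocabulary size
    vectors = {}
    for line in stream:
        # The original tool leaves a trailing space on each line, so strip first.
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors, dim


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy stand-in for the real file: 3 invented words with 2-dimensional vectors.
SAMPLE = "3 2\ndolor 1.0 0.0 \ncefalea 0.8 0.6 \nfractura 0.0 1.0 \n"

vectors, dim = load_word2vec_text(io.StringIO(SAMPLE))
similarity = cosine(vectors["dolor"], vectors["cefalea"])
```

For the real file, the same function could be given an open text file handle instead of the in-memory sample. If the file turns out to be in the binary Word2Vec format, a different reader would be needed.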
Files
(163.3 MB)
File (MD5 checksum) | Size |
---|---|
md5:f262e5e09313cc04f2115c759909662f | 163.3 MB |
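The MD5 checksum listed above can be used to confirm that a download completed intact. The sketch below is one way to do that with Python's standard library. The helper names `md5_of_file` and `matches_published_checksum` are hypothetical, and the path passed to them is whatever local name the downloaded file was saved under, since the file name is not shown in this listing.

```python
import hashlib

# MD5 value copied from the file listing of this record.
EXPECTED_MD5 = "f262e5e09313cc04f2115c759909662f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path):
    """True if the file at `path` has the MD5 digest listed for this record."""
    return md5_of_file(path) == EXPECTED_MD5
```

Reading in chunks keeps memory use flat for a file of this size. A mismatch most likely indicates a truncated or corrupted download.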