Chilean Waiting List Corpus Embeddings

Villena, Fabián; Dunstan, Jocelyn

doi:10.5281/zenodo.3924799

Published July 1, 2020 | Version v1

Resource type: Other | Access: Open

1. Center for Mathematical Modeling, University of Chile

Contributors

Supervisors:

1. Centro de Modelamiento Matemático, Universidad de Chile
2. Centro de Informática Médica y Telemedicina, Universidad de Chile

The Chilean Waiting List Corpus Embeddings are Word2Vec word embeddings trained on 11 million unstructured free-text diagnoses obtained from the Chilean Waiting List through the Transparency Law. The training corpus contains 56 million word tokens and has a vocabulary of 252 thousand distinct words.

Mikolov's original implementation of the Word2Vec algorithm was used to compute the embeddings with the default hyperparameters, except for the vector size, which was set to 300.
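As a rough illustration of how the 300-dimensional vectors might be read and compared, the sketch below parses a word2vec-style text file using only the Python standard library. It assumes that cwlce.vec follows the usual word2vec text layout (a header line with the vocabulary size and vector size, then one token per line followed by its space-separated components); the record does not state the file layout, so treat this as an assumption. The function names `load_vec` and `cosine` are hypothetical helpers introduced here, not part of the dataset.

```python
import math


def load_vec(path, limit=None):
    """Read a word2vec-style text file into a {token: vector} dict.

    Assumption: the first line is "<vocabulary size> <vector size>" and
    every following line is "<token> <v1> <v2> ... <vN>".
    """
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        # Header line: only the vector size is needed for validation.
        _vocab_size, dim = (int(x) for x in handle.readline().split())
        for line in handle:
            if limit is not None and len(vectors) >= limit:
                break
            parts = line.rstrip().split(" ")
            if len(parts) != dim + 1:
                continue  # skip lines that do not match the declared size
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors, dim


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

With the file downloaded locally, a call such as `load_vec("cwlce.vec", limit=50000)` would load only the first 50,000 entries, which keeps memory use modest for a quick inspection. If the layout assumption holds, gensim's `KeyedVectors.load_word2vec_format("cwlce.vec", binary=False)` is a common alternative for nearest-neighbour queries.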

Files

Files (163.3 MB)

Name	Size	MD5
cwlce.vec	163.3 MB	f262e5e09313cc04f2115c759909662f
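The MD5 checksum listed for cwlce.vec can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `is_intact`, and the local path passed to them, are illustrative assumptions.

```python
import hashlib

# Checksum copied from the file listing for cwlce.vec.
EXPECTED_MD5 = "f262e5e09313cc04f2115c759909662f"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path, expected=EXPECTED_MD5):
    """Compare the file's digest against the expected checksum."""
    return md5_of_file(path) == expected.lower()
```

Calling `is_intact("cwlce.vec")` on the downloaded file should return `True` when the file matches the published checksum. Reading in chunks avoids holding the whole 163.3 MB file in memory at once.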

	All versions	This version
Views	1,117	1,116
Downloads	840	840
Data volume	166.2 GB	166.2 GB

Resource type

Other

Publisher

Zenodo

Thesis

Universidad de Chile

Languages

Spanish

License: Creative Commons Attribution 4.0 International

The Creative Commons Attribution license allows re-distribution and re-use of a licensed work on the condition that the creator is appropriately credited.

Technical metadata

Created: July 1, 2020
Modified: April 24, 2025
