Dataset Open Access
Newman-Griffis, Denis; Fosler-Lussier, Eric
Up-to-date information and pre-trained embeddings can be found here.
In order to support NLP research efforts related to COVID-19, we have developed resources for training vector-valued embeddings of COVID-19 related medical concepts, primarily using the CORD-19 dataset.
This resource includes only concept embeddings. Downloading the full sets of concept, term, and word embeddings from here requires a valid UMLS Terminology Services account, which is used to validate licensed access to SNOMED-CT.
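As a rough illustration of how the concept embedding files might be read, the sketch below parses a plain-text embedding file into a dictionary. The file layout is an assumption, not something this record states: it assumes a word2vec-style text format, with an optional `count dimension` header line followed by one line per concept containing a key and space-separated float values. The function names `load_text_embeddings` and `cosine_similarity` are hypothetical helpers; check the README for the actual format before relying on this.

```python
import math


def load_text_embeddings(path):
    """Read embeddings from a text file into {key: [float, ...]}.

    ASSUMPTION: word2vec-style text layout (optional "count dim" header,
    then "key v1 v2 ... vN" per line). The record does not confirm this.
    """
    vectors = {}
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle):
            parts = line.split()
            # Treat a two-token first line as the "count dim" header.
            # (A genuinely 1-dimensional first vector would be misread here.)
            if line_number == 0 and len(parts) == 2:
                continue
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With the embeddings loaded, `cosine_similarity(vectors[key_a], vectors[key_b])` gives a simple relatedness score between two concepts, where `key_a` and `key_b` are whatever identifiers the file actually uses.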
Name | MD5 | Size |
---|---|---|
CORD19_202003020_concepts.txt | e9bea0f6edf54b3f8bd28131074b48e1 | 145.9 MB |
CORD19_20200327_concepts.txt | 7b1c6c76e97a2bfef2f13f6cc3937054 | 146.2 MB |
CORD19_20200403_concepts.txt | ea48900ecbd7691e3cfb3d16fcb5f2ee | 148.9 MB |
CORD19_20200410_concepts.txt | 9bf8635f03ec82b9c100e1d27ccbddd9 | 156.7 MB |
README | 4e819183e06ee12e5278ed8112d25f80 | 499 Bytes |
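Because each file is listed with an MD5 checksum, a download can be checked for corruption before use. The following is a minimal sketch using Python's standard `hashlib`; the checksums are copied from the file listing above, while the helper names `md5_of_file` and `verify_download` and the assumption that files are saved locally under their listed names are illustrative only.

```python
import hashlib
import os

# Checksums as published in the file listing for this record.
EXPECTED_MD5 = {
    "CORD19_202003020_concepts.txt": "e9bea0f6edf54b3f8bd28131074b48e1",
    "CORD19_20200327_concepts.txt": "7b1c6c76e97a2bfef2f13f6cc3937054",
    "CORD19_20200403_concepts.txt": "ea48900ecbd7691e3cfb3d16fcb5f2ee",
    "CORD19_20200410_concepts.txt": "9bf8635f03ec82b9c100e1d27ccbddd9",
    "README": "4e819183e06ee12e5278ed8112d25f80",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path):
    """Compare a local file against the published checksum for its file name.

    Hypothetical helper: assumes the file kept its listed name when saved.
    Raises KeyError if the name is not in the listing.
    """
    expected = EXPECTED_MD5[os.path.basename(path)]
    return md5_of_file(path) == expected
```

For example, `verify_download("CORD19_20200410_concepts.txt")` returns `True` only if the local copy matches the published digest. MD5 is adequate here for detecting transfer errors, though it is not a defense against deliberate tampering.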
Metric | All versions | This version |
---|---|---|
Views | 4,893 | 1,533 |
Downloads | 358 | 150 |
Data volume | 57.1 GB | 8.1 GB |
Unique views | 4,547 | 1,469 |
Unique downloads | 279 | 109 |