Dataset Open Access

CORD-19 Diachronic Concept Embeddings

Newman-Griffis, Denis; Sivaraman, Venkatesh; Fosler-Lussier, Eric; Perer, Adam; Hochheiser, Harry

Medical concept embeddings trained on monthly releases of the CORD-19 dataset, for analysis of diachronic change.

This resource includes:

  1. Concept embeddings, trained using the JET toolkit on the set of articles added to CORD-19 in each month from March 2020 through October 2020. Concepts are SNOMED CT codes.
  2. Results of a nearest-neighborhood-based analysis of the concept embeddings, stored as a SQLite database.
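
As a rough illustration of what a nearest-neighborhood comparison across time points can look like, the sketch below finds the cosine nearest neighbors of one concept in two toy embedding sets and measures how much the two neighbor lists overlap. Everything in it is an assumption made for illustration: the vectors are invented, the labels `C1` to `C4` are placeholder identifiers rather than real SNOMED CT codes, and Jaccard overlap is one common choice of measure, not necessarily the one used to produce the database in this resource.

```python
import numpy as np

def nearest_neighbors(vectors, labels, query, k=2):
    """Return the k labels closest to `query` by cosine similarity."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    q = labels.index(query)
    sims = unit @ unit[q]
    sims[q] = -np.inf  # exclude the query concept itself
    order = np.argsort(-sims)[:k]
    return [labels[i] for i in order]

def neighbor_overlap(a, b):
    """Jaccard overlap between two neighbor lists (1.0 means identical sets)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Hypothetical placeholder concept IDs and invented 2-d vectors.
labels = ["C1", "C2", "C3", "C4"]
month_a = np.array([[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0]])
month_b = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2]])

neighbors_a = nearest_neighbors(month_a, labels, "C1")
neighbors_b = nearest_neighbors(month_b, labels, "C1")
overlap = neighbor_overlap(neighbors_a, neighbors_b)
```

A low overlap for a concept between two months suggests that its neighborhood, and so possibly its usage in the literature, has shifted.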

Database content excludes terms and definitions for SNOMED CT codes, which are licensed content owned by SNOMED International. Information on accessing SNOMED licensed content is available from SNOMED International.
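
The database schema is not described in this record, so a reasonable first step is to ask SQLite itself which tables and columns the file contains. The following is a minimal sketch using only the Python standard library. The helper name `list_tables` is hypothetical, and no table or column names are assumed in advance.

```python
import sqlite3

def list_tables(db_path):
    """Return {table_name: [column names]} for every table in a SQLite file."""
    conn = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )]
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        return {
            name: [col[1] for col in conn.execute(f'PRAGMA table_info("{name}")')]
            for name in names
        }
    finally:
        conn.close()
```

Calling `list_tables("CORD-19_analysis__2020-03__2020-10.db")` on the downloaded file would print its actual layout. Note that `sqlite3.connect` creates an empty database if the path does not exist, so an empty result may simply mean the path is wrong.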

Files (1.3 GB)

Name                                     Size
CORD-19_analysis__2020-03__2020-10.db    340.3 MB
    md5:5bc7c50f6bfd89c1c33d34d394556c92
CORD-19_monthly_embeddings.zip           998.5 MB
    md5:be0737c6c04a9e51fc04e94f384dc1f9
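
The MD5 checksums listed with each file can be used to confirm that a download completed intact. The sketch below uses Python's standard `hashlib`. The helper names `md5_of_file` and `verify` are hypothetical, and the local paths passed to them are whatever location the files were saved to.

```python
import hashlib

# Checksums copied from the file listing above.
EXPECTED_MD5 = {
    "CORD-19_analysis__2020-03__2020-10.db": "5bc7c50f6bfd89c1c33d34d394556c92",
    "CORD-19_monthly_embeddings.zip": "be0737c6c04a9e51fc04e94f384dc1f9",
}

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, name):
    """Return True if the file at `path` matches the listed checksum for `name`."""
    return md5_of_file(path) == EXPECTED_MD5[name]
```

Reading in chunks keeps memory use small, which matters for the roughly 1 GB embeddings archive.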
