Dataset Open Access

CORD-19 Diachronic Concept Embeddings

Newman-Griffis, Denis; Sivaraman, Venkatesh; Fosler-Lussier, Eric; Perer, Adam; Hochheiser, Harry

Medical concept embeddings trained on monthly releases of the CORD-19 dataset, for analysis of diachronic change.

This resource includes:

  1. Concept embeddings, trained using the JET toolkit on the set of articles added to CORD-19 in each month from March 2020 through October 2020. Concepts are SNOMED CT codes.
  2. Nearest-neighbor analysis results for the concept embeddings, stored as a SQLite database.
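As an illustration of what a nearest-neighbor comparison across months can look like, the sketch below ranks neighbors by cosine similarity and measures how much a concept's neighborhood overlaps between two months. This is a minimal sketch under stated assumptions, not the method used to build the database: the similarity measure, the overlap score, the function names, and the toy vectors are all illustrative, and the identifiers "C1" to "C4" are placeholders rather than real SNOMED CT codes.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_neighbors(embeddings, concept, k):
    """Return the k concepts most similar to `concept`, best first."""
    query = embeddings[concept]
    scored = [
        (cosine(query, vector), other)
        for other, vector in embeddings.items()
        if other != concept
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:k]]


def neighbor_overlap(month_a, month_b, concept, k):
    """Jaccard overlap of a concept's top-k neighbors in two months."""
    a = set(nearest_neighbors(month_a, concept, k))
    b = set(nearest_neighbors(month_b, concept, k))
    return len(a & b) / len(a | b)


# Toy two-dimensional embeddings; placeholder IDs, not real data.
march = {"C1": [1.0, 0.0], "C2": [0.9, 0.1], "C3": [0.8, 0.3], "C4": [0.0, 1.0]}
april = {"C1": [1.0, 0.0], "C2": [0.9, 0.1], "C3": [0.0, 1.0], "C4": [0.7, 0.2]}

print(nearest_neighbors(march, "C1", 2))
print(neighbor_overlap(march, april, "C1", 2))
```

A low overlap for a concept between two months suggests that its usage context shifted in the newly added articles.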

Database content excludes terms and definitions for SNOMED CT codes, which are licensed content owned by SNOMED International. Information on accessing SNOMED licensed content is available from SNOMED International.
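The table layout of the SQLite database is not described on this page, so a reasonable first step is to read the schema from the file itself. The sketch below lists each table with its CREATE statement using SQLite's built-in sqlite_master catalog. The helper names are illustrative, and no table or column names are assumed.

```python
import sqlite3


def open_read_only(path):
    """Open a SQLite file read-only so the download cannot be modified."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def list_tables(conn):
    """Map each table name to the SQL statement that created it."""
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return {name: sql for name, sql in rows}


# Example usage once the database file has been downloaded:
#   conn = open_read_only("CORD-19_analysis__2020-03__2020-10.db")
#   for name, sql in list_tables(conn).items():
#       print(name)
#       print(sql)
#   conn.close()
```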

Files (1.3 GB)

Name                                     Size      md5
CORD-19_analysis__2020-03__2020-10.db    340.3 MB  5bc7c50f6bfd89c1c33d34d394556c92
CORD-19_monthly_embeddings.zip           998.5 MB  be0737c6c04a9e51fc04e94f384dc1f9
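
The md5 values listed with each file can be used to confirm that a download completed intact. A minimal sketch using Python's standard hashlib module follows. The file names and checksums are copied from the listing; the function names and the assumption that the files sit in the current directory are illustrative.

```python
import hashlib

# Checksums as published in the file listing.
EXPECTED_MD5 = {
    "CORD-19_analysis__2020-03__2020-10.db": "5bc7c50f6bfd89c1c33d34d394556c92",
    "CORD-19_monthly_embeddings.zip": "be0737c6c04a9e51fc04e94f384dc1f9",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files need little memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected):
    """Return True when the file's md5 matches the expected hex digest."""
    return md5_of_file(path) == expected.lower()


# Example usage once both files have been downloaded:
#   for name, checksum in EXPECTED_MD5.items():
#       print(name, "OK" if verify(name, checksum) else "MISMATCH")
```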