There is a newer version of the record available.

Published April 6, 2021 | Version 0.1.0
Dataset Open

A living catalogue of artificial intelligence datasets and benchmarks for medical decision making

  • 1. Medical University of Vienna


We provide a comprehensive, curated catalogue of artificial intelligence datasets and benchmarks for medical decision making. At the time of the first release (April 2021), the dataset contains more than 400 biomedical and clinical datasets, of which 252 are publicly available or available upon request.

The dataset was compiled based on a systematic literature review covering biomedical and computer science literature as well as grey literature sources. All datasets were manually systematized and annotated with meta-information, such as:

  • Availability and licensing information
  • Type of source data
  • Links to source publications, main references or dataset repositories

Benchmark datasets were additionally annotated with the following information:

  • Associated task
  • Performance metrics commonly used for evaluation
  • Clinical relevance
  • The availability of data splits
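
As a rough illustration of the annotation scheme listed above, the record structure could be modelled as follows. This is only a sketch: the class names, field names, and availability labels are hypothetical and are not taken from the actual column headers of the TSV file.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DatasetAnnotation:
    """Meta-information recorded for every dataset (field names are hypothetical)."""
    name: str
    availability: str                       # assumed labels, e.g. "public", "upon request"
    license: Optional[str] = None           # licensing information, if known
    source_data_type: Optional[str] = None  # type of source data
    links: List[str] = field(default_factory=list)  # publications, references, repositories


@dataclass
class BenchmarkAnnotation(DatasetAnnotation):
    """Extra annotations recorded only for benchmark datasets."""
    task: Optional[str] = None                        # associated task
    metrics: List[str] = field(default_factory=list)  # commonly used performance metrics
    clinical_relevance: Optional[str] = None
    has_data_splits: Optional[bool] = None            # whether data splits are available


def is_accessible(entry: DatasetAnnotation) -> bool:
    """True if the entry is public or available upon request (assumed labels)."""
    return entry.availability.strip().lower() in {"public", "upon request"}
```

The real catalogue may encode availability and the other annotations differently, so the labels checked in `is_accessible` would need to be adapted to the values actually used in the file.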

In addition to the versioned TSV file on Zenodo, the dataset can also be explored live via this Google Spreadsheet. The dataset is intended as a living, extendable resource. Edit suggestions and additions are encouraged and can be submitted via the comment function of the Google Spreadsheet.


File descriptions

annotated-datasets.tsv -- contains the annotated datasets

arXiv-literature-export.tsv -- contains the original literature record export from arXiv

pubmed-literature-export.tsv -- contains the original literature record export from PubMed
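
The TSV files described above can be read with the Python standard library. The following is a minimal sketch, assuming each file is UTF-8 encoded and starts with a header row; the helper names `load_tsv` and `count_values` are illustrative, and any column name passed to `count_values` must be checked against the actual header.

```python
import csv
from collections import Counter
from typing import Dict, List


def load_tsv(path: str) -> List[Dict[str, str]]:
    """Read a tab-separated file with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def count_values(rows: List[Dict[str, str]], column: str) -> Counter:
    """Count how often each value occurs in one column (missing values count as '')."""
    return Counter((row.get(column) or "").strip() for row in rows)
```

For example, `rows = load_tsv("annotated-datasets.tsv")` followed by `list(rows[0].keys())` would show which columns the file actually contains, after which `count_values(rows, <column>)` gives a quick overview of a chosen annotation.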


Files (546.6 kB)

Size
164.7 kB
294.9 kB
86.9 kB

Additional details


European Commission
U-PGx – Ubiquitous Pharmacogenomics (U-PGx): Making actionable pharmacogenomic data and effective treatment optimization accessible to every European citizen (grant 668353)