Published October 1, 2020 | Version 1.0.0
Dataset Open

Machine learning datasets for epigenomic landscapes in epidermal differentiation

  • 1. Stanford School of Medicine

Description

Datasets for training classification and regression models on sequence and epigenomic features. The data were generated from an integrative analysis of the Genomics of Gene Regulation dataset (https://www.encodeproject.org/awards/U01HG007919). For classification models, the peak files used to label genomic regions as positives or negatives are in `ggr.label_files.tar.gz`. For regression models, the bigwig files used as target signals are available at the ENCODE portal. The processed dataset, stored as HDF5 files, is in `nn.ggr.hdf5_files.tar.gz`, along with processing details.
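Because the archives are large, it can help to inspect their contents before unpacking them. The following is a minimal sketch, assuming `nn.ggr.hdf5_files.tar.gz` has already been downloaded locally. The helper name `list_hdf5_members` is hypothetical, and the `.h5`/`.hdf5` suffixes are an assumption, since this record does not describe how files inside the archive are named.

```python
import tarfile


def list_hdf5_members(archive_path, suffixes=(".h5", ".hdf5")):
    """Return (name, size_in_bytes) pairs for members that look like HDF5 files.

    The suffixes are an assumption: the record does not state how the
    HDF5 files inside the archive are named.
    """
    suffixes = tuple(suffixes)
    members = []
    # "r:gz" opens a gzip-compressed tar archive for reading; nothing is extracted.
    with tarfile.open(archive_path, "r:gz") as archive:
        for member in archive:
            if member.isfile() and member.name.lower().endswith(suffixes):
                members.append((member.name, member.size))
    return members


# Example call (the path is a hypothetical local download location):
# for name, size in list_hdf5_members("nn.ggr.hdf5_files.tar.gz"):
#     print(size, name)
```

Listing members this way still has to read through the compressed stream, so it may take a while on the larger archive, but it avoids writing the unpacked files to disk.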

Files

Files (12.1 GB)

Checksum Size
md5:9ddc89b23413791a39fd7a62bc0fa635 118.2 MB
md5:aeca90ba8bf09e47044d2b6bd9424326 12.0 GB
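The MD5 values listed above can be used to check that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the helper names `md5_of_file` and `matches_listed_md5` are hypothetical, and the file is read in chunks so that the 12.0 GB archive does not have to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file's MD5 with a value written as 'md5:<hex>' or '<hex>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


# Example call (the path is a hypothetical local download location):
# matches_listed_md5("ggr.label_files.tar.gz", "md5:9ddc89b23413791a39fd7a62bc0fa635")
```

The listing here does not pair each checksum with a file name, so the pairing in the example call is an assumption; a downloaded file can simply be compared against both listed values.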

Additional details