Published October 1, 2020 | Version 1.0.0
Dataset
Open
Machine learning datasets for epigenomic landscapes in epidermal differentiation
Description
Datasets for training classification and regression models on sequence and epigenomic features. The data were generated from an integrative analysis of the Genomics of Gene Regulation dataset (https://www.encodeproject.org/awards/U01HG007919). For classification models, the peak files used to label genomic regions as positives or negatives are in `ggr.label_files.tar.gz`. For regression models, the bigwig files used as target signals are available from the ENCODE portal. The processed dataset, stored as HDF5 files, is in `nn.ggr.hdf5_files.tar.gz`, along with processing details.
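The internal layout of the two archives is not described on this page, so a reasonable first step after downloading is to list their members without extracting anything. The sketch below uses only Python's standard `tarfile` module; the helper name `list_archive_members` and the assumption that the archive sits in the current working directory are illustrative, not part of the dataset.

```python
import os
import tarfile


def list_archive_members(archive_path, limit=20):
    """Return up to `limit` member names from a .tar.gz archive without extracting it."""
    names = []
    with tarfile.open(archive_path, mode="r:gz") as archive:
        # Iterating the archive streams member headers one at a time,
        # so we can stop early instead of scanning a multi-gigabyte file.
        for member in archive:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names


if __name__ == "__main__":
    # Assumed local path: the label archive downloaded into the current directory.
    local_path = "ggr.label_files.tar.gz"
    if os.path.exists(local_path):
        for name in list_archive_members(local_path):
            print(name)
```

Stopping after a fixed number of members keeps the check fast for the larger `nn.ggr.hdf5_files.tar.gz` archive, where a full listing would require reading through the whole compressed stream.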
Files
(12.1 GB)
Checksum | Size |
---|---|
md5:9ddc89b23413791a39fd7a62bc0fa635 | 118.2 MB |
md5:aeca90ba8bf09e47044d2b6bd9424326 | 12.0 GB |
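The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard `hashlib`; the function names `md5_of_file` and `verify_md5` and the local file path in the usage comment are assumptions for illustration. The file is read in chunks so that the 12.0 GB archive does not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Return True if the file's MD5 matches `expected`.

    `expected` may carry the "md5:" prefix used in the file listing.
    """
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex


# Example usage (hypothetical local path for the 118.2 MB download):
# verify_md5("path/to/downloaded_file.tar.gz",
#            "md5:9ddc89b23413791a39fd7a62bc0fa635")
```

A `False` result usually points to an incomplete or interrupted download, in which case fetching the file again is the simplest remedy.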
Additional details
Related works
- Is compiled by
  - Software: https://github.com/kundajelab/tronn
  - Software: https://bitbucket.org/vervacity/ggr-project
- Is supplemented by
  - Dataset: https://www.encodeproject.org/awards/U01HG007919/
  - Dataset: https://personal.broadinstitute.org/meuleman/reg2map/HoneyBadger2_release/