Published June 6, 2023 | Version v1
Dataset Open

Enhancer design manuscript data and models

  • 1. Research Institute of Molecular Pathology (IMP)

Description

This repository holds the trained DNA accessibility and enhancer activity models, together with the data used to train and evaluate them.

Files for DNA accessibility models:

  • Accessibility_models_training_data.tar.gz
    • <fold>_sequences_<set>.fa
      • FASTA files with DNA sequences of genomic regions from train/val/test sets.
    • <fold>_sequences_activity_<set>.txt
      • Files with accessibility scores for sequences from train/val/test sets.
  • Accessibility_model_files.tar.gz
    • model.h5 and model.json files for each Results_<fold>_<tissue>_DeepSTARR2_<rep>
  • Accessibility_models_test_set_predictions.rds
    • RDS object with the accessibility models' predictions on the test set
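
The sequence files described above could be read with a few lines of Python. This is a minimal sketch, assuming the `.fa` files are standard FASTA (a `>` header line followed by one or more sequence lines) and that they follow the `<fold>_sequences_<set>.fa` naming pattern. The function names and the example fold and set values are hypothetical. The layout of the accompanying `.txt` score files is not described in this record, so it is not parsed here.

```python
import re
from pathlib import Path
from typing import Iterator, Optional, Tuple

# Pattern for <fold>_sequences_<set>.fa as described in the file listing.
# The actual fold and set strings are not given in the record.
NAME_RE = re.compile(r"^(?P<fold>.+)_sequences_(?P<set>[^_]+)\.fa$")


def fold_and_set(path) -> Optional[Tuple[str, str]]:
    """Return (fold, set) parsed from a FASTA file name, or None if it does not match."""
    match = NAME_RE.match(Path(path).name)
    return (match["fold"], match["set"]) if match else None


def read_fasta(path) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA file, joining wrapped lines."""
    header = None
    chunks = []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header = line[1:]
                chunks = []
            else:
                chunks.append(line)
    # Emit the last record, if any.
    if header is not None:
        yield header, "".join(chunks)
```

After extracting `Accessibility_models_training_data.tar.gz`, each `.fa` file could be passed to `read_fasta`, with `fold_and_set` used to group the records by fold and by train/val/test set.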

Files for enhancer activity models:

  • EnhancerActivity_models_training_data.tar.gz
    • <fold>_sequences_<set>.fa
      • FASTA files with DNA sequences of genomic regions from train/val/test sets.
    • <fold>_sequences_activity_<set>.txt
      • Files with enhancer activity labels for sequences from train/val/test sets.
  • EnhancerActivity_model_files.tar.gz
    • model.h5, model.json and Model_evaluation.pdf files for each Results_<fold>_<tissue>_<rep>
  • EnhancerActivity_models_clean_evaluation_data.rds
    • RDS object with sequences used for final evaluation of the models
  • EnhancerActivity_models_results_per_tissue_test_set.rds
    • RDS object with predictions and results of the enhancer activity models
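
The model archives listed above contain one `model.h5` and one `model.json` per `Results_...` entry. The sketch below shows one way to locate those pairs after extraction. It assumes each pair sits together in its own directory, which the record does not state explicitly. The loader further assumes that `model.json` holds a Keras architecture and `model.h5` the matching weights, and that TensorFlow is installed. Both are assumptions, and the function names are hypothetical.

```python
from pathlib import Path
from typing import List


def find_model_dirs(root) -> List[Path]:
    """Return sorted directories under root that contain both model.json and model.h5."""
    return sorted(
        json_path.parent
        for json_path in Path(root).rglob("model.json")
        if (json_path.parent / "model.h5").is_file()
    )


def load_keras_model(model_dir):
    """Rebuild a model from model.json and load the weights from model.h5.

    Assumption: the files follow the usual Keras split between an
    architecture JSON and an HDF5 weights file.
    """
    # Imported lazily so that find_model_dirs works without TensorFlow.
    from tensorflow.keras.models import model_from_json

    model_dir = Path(model_dir)
    model = model_from_json((model_dir / "model.json").read_text())
    model.load_weights(str(model_dir / "model.h5"))
    return model
```

Models saved with an older Keras release may not load cleanly in a newer one, so matching the original software versions may be necessary.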

Files (3.0 GB)

  • md5:d51a1e88ed86201d2e1a5e8fc08f3514 (129.5 MB)
  • md5:fcd59f932dc3960c587793c255f4744c (257.8 MB)
  • md5:fe406913b1a1bcece46a5a2cc2c96b5a (2.3 GB)
  • md5:a02bdda4bafe78f472d714b727fe9207 (109.7 MB)
  • md5:e483a8b3aadbb7c9e7d4eb224515ecc0 (523.3 kB)
  • md5:36f2c102c646a56df4868325c05ad091 (1.7 MB)
  • md5:bf54b5d8597d337a4ac75b35c1ecd711 (168.4 MB)
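
The MD5 checksums above can be used to confirm that a download is intact. The following is a minimal sketch using only the Python standard library; the function names are hypothetical. This listing does not show which checksum belongs to which file, so that pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected: str) -> bool:
    """Compare a file against a checksum written as 'md5:<hex>' or plain hex."""
    expected_hex = expected.strip().lower()
    if expected_hex.startswith("md5:"):
        expected_hex = expected_hex[len("md5:"):]
    return md5_of_file(path) == expected_hex
```

Reading in chunks keeps memory use low for the larger archives, such as the 2.3 GB file.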