Designing synthetic regulatory elements using DNA-Diffusion, a generative AI framework
Description
File: supplementary_tables.zip
Manuscript Supplementary tables a-g
Supplementary Data 1
File: JASPAR_motifs.txt.zip
Description: JASPAR MOODS Motif Matches by Sequence ID and Cell Type in Training, Validation, Test, and Generated Sets (with Positions and Scores)
Supplementary Data 2
File: Co_occurence_motifs_JASPAR.zip
Description: MOODS motif co-occurrence for training, test, validation, and generated sequences, separated by cell type (GM12878, HepG2, K562)
Supplementary Data 3
File: Enformer_and_Chrombpnet_predictions_enhancer_promoters_different_loci.zip
Description: Tables containing Enformer and ChromBPNet (DNase, CAGE, H3K4me3) predictions for enhancers and gene promoters at different genomic loci.
Supplementary Data 4
File: EXTRASEQ_counts.tab.zip
Description: EXTRA-Seq raw barcode counts for mRNA and DNA across five replicates
Supplementary Data 5
File: DNA-Diffusion.pt
DNA-Diffusion PyTorch weights (see: https://github.com/pinellolab/DNA-Diffusion)
Supplementary Data 6
File: mpra_model_best-epoch_07_val_MalinoisMPRA_mean_SpearmanR_0_73185.ckpt
Weights of the MPRA predictor model trained on the Malinois MPRA data.
Supplementary Data 7
File: DeepMEL_training.zip
Weights and code to train and sample DeepMEL, trained on the DHS index data for GM12878, K562, and HepG2
Supplementary Data 8
File: wgan_training.zip
Weights and code to train the WGANs for the different cell types. DeepMEL was trained on the DHS index data, with one model per cell type: GM12878, K562, and HepG2
Supplementary Data 9
File: CODA_training.zip
Weights and code to train and sample CODA, trained on the DHS index data for GM12878, K562, and HepG2
Supplementary Data 10
File: EXTRASEQ_AND_STARRSEQ_R_mpra_voom_scripts.zip
Scripts that process EXTRA-Seq and STARR-Seq DNA and mRNA counts to compute the mRNA/DNA log fold change across the experimental replicates.
Supplementary Data 11
File: DNA-Diffusion-0.0.2.zip
DNA-Diffusion library source code
Supplementary Data 12
File: MPRA_predictions.txt
MPRA predictions for the different cell types
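The mRNA/DNA log fold change mentioned under Supplementary Data 10 can be illustrated with a minimal Python sketch. This is not the archived method: the deposited scripts are written in R and, judging by the file name, use voom. The function names, the counts-per-million normalisation, and the pseudocount of 1 below are all assumptions made for illustration only.

```python
import math
from statistics import mean


def log2_ratio(mrna, dna, mrna_total, dna_total, pseudocount=1.0):
    """log2(mRNA/DNA) for one barcode in one replicate.

    Counts are scaled to counts per million using the replicate's library
    totals; the pseudocount (an assumption) avoids taking log of zero.
    """
    mrna_cpm = (mrna + pseudocount) / mrna_total * 1e6
    dna_cpm = (dna + pseudocount) / dna_total * 1e6
    return math.log2(mrna_cpm / dna_cpm)


def mean_log2_ratio(replicates):
    """Average the per-replicate log2 ratios.

    `replicates` is a list of (mrna, dna, mrna_total, dna_total) tuples,
    one tuple per replicate (hypothetical layout, not the archive's format).
    """
    return mean(log2_ratio(*rep) for rep in replicates)
```

For example, `mean_log2_ratio([(3, 1, 100, 100), (7, 1, 100, 100)])` averages a log2 ratio of about 1 and one of about 2. A voom-based analysis additionally models the mean-variance trend across replicates, which this sketch does not attempt.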
Files
(7.0 GB)
| MD5 checksum | Size |
|---|---|
| 0a26b747e73e4717662a887168d2c574 | 2.8 GB |
| 6fc0e2019d889e82dcecb153b401c604 | 105.5 MB |
| 0013ccde5e121d73a341e6cd5f303f61 | 956.3 MB |
| 7c9b8947a9d7ffb22d8719e9817ff6a4 | 38.1 MB |
| b6e54edfc308e6834c7f4e2dd63bce9e | 48.8 MB |
| 6724df51f2007b6170dcdc11a3fb9714 | 1.5 GB |
| 77db5157ea111f07446107e5c84de17c | 426.2 MB |
| 7138ed833870202b7af0cb50bb66214a | 5.5 kB |
| 9530e11640d92f90fd2739ce80dbc66a | 11.1 MB |
| 20617d74f3debb68b40837c0f01502c3 | 482.6 MB |
| b700ce077ca24608456403a472cfd0a5 | 142.8 MB |
| 9b11aa5dc72a58e97d1853a64f6e3067 | 59.9 kB |
| 608184500c533d0295a3ccc48726a405 | 411.0 MB |
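The md5 values listed above can be used to confirm that a download is intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are illustrative and are not part of the DNA-Diffusion library.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks so that
    multi-gigabyte archives do not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or '<hex>'."""
    expected_hex = expected.strip().split(":", 1)[-1].lower()
    return md5_of_file(path) == expected_hex
```

A call such as `matches_checksum("downloaded_file.zip", "md5:<value from the record>")` returns `True` when the file is intact. The listing here does not preserve which checksum belongs to which file, so the pairing has to be taken from the record page itself.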
Additional details
Software
- Repository URL
- https://github.com/pinellolab/DNA-Diffusion
- Programming language
- Python, Jupyter Notebook
- Development Status
- Active