Gene Regulatory Network inference in long lived C.elegans reveals modular properties that are predictive of novel ageing genes - Database of Physical gene-gene Interactions in young adult C.elegans.
Creators
- 1. Babraham Institute
- 2. U. Cambridge
- 3. Imperial College
- 4. Universidad Santo Tomas
- 5. ICREA
- 6. Universitat Rovira i Virgili
Description
This repository contains Supplementary Information for the manuscript by Suriyalaksh et al., “Gene Regulatory Network inference in long lived C.elegans reveals modular properties that are predictive of novel ageing genes”. It covers the curation of physical gene-gene interactions for young adult C. elegans worms.
We manually curated 239,001 regulatory interactions from 289 young adult wild-type (WT) C. elegans datasets, covering 126 genes and 495 unique transcription factors (see TableS1_datasets_for_prior.csv for references).
This repository contains three files:
- TableS1_datasets_for_prior.csv - contains the datasets used as sources for physical gene-gene or TF-gene interactions.
- TableS2_physical_priors.xlsx - contains three tabs:
  - ChIPATAC - contains physical TF-gene interactions from 115 L4 or young-adult ChIP-seq datasets from modERN (Kudron et al., 2018), plus ChIP-seq datasets (GSE28350, GSE81521) from Hochbaum et al. (2011) and Li et al. (2016).
  - eY1HATAC - contains 3,501 TF-gene interactions from the eY1H assay by Fuxman Bass et al. (2016).
  - motifATAC - contains 202 unique TF DNA recognition motifs selected with the “direct evidence” option of the CIS-BP motif database (Weirauch et al., 2014), obtained through the RTFBSDB R package (Wang et al., 2016); see TableS1_datasets_for_prior.csv.
- TableS3_WT_functional_priors.csv - contains functional knockdown data used as a gold standard to validate the inferred networks in Suriyalaksh et al. (see TableS1_datasets_for_prior.csv for sources).
---
Description of methodology to obtain regulatory interactions in TableS2:
Regulatory sequences for each gene were acquired from Ensembl (Aken et al., 2017) using the biomaRt R package (accessed on 31 Oct 2017). This study used the WBcel235/ce11 version of the C. elegans genome and WormBase WS260 genome annotations.
For motifs, TFs whose motifs overlapped with an open ATAC-seq region by at least one base pair were kept. For ChIP-seq, TF binding sites that overlapped with an open ATAC-seq region by at least one base pair were kept using bedtools intersect and bedtools merge commands.
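The "at least one base pair" overlap rule can be illustrated with a minimal Python sketch. This is not the pipeline used for the tables, which relied on bedtools; the `Interval` type, the function names, and the coordinates below are hypothetical, and BED-style half-open coordinates are assumed.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """Genomic interval in assumed BED convention: 0-based start, exclusive end."""
    chrom: str
    start: int
    end: int


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True when a and b share at least one base pair."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def keep_open_sites(sites: List[Interval], open_regions: List[Interval]) -> List[Interval]:
    """Keep only the sites that overlap any open (ATAC-seq) region."""
    return [s for s in sites if any(overlaps(s, r) for r in open_regions)]


# Hypothetical coordinates, for illustration only.
open_regions = [Interval("chrI", 100, 200)]
sites = [
    Interval("chrI", 190, 210),   # shares bases 190-199 with the open region
    Interval("chrI", 200, 220),   # starts exactly where the open region ends
    Interval("chrII", 150, 160),  # different chromosome
]
kept = keep_open_sites(sites, open_regions)
```

With half-open coordinates, a site that merely touches the end of an open region shares no base with it and is dropped. The quadratic scan is fine for a toy example but would be too slow for genome-scale data.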
An interaction from a TF to a gene was inferred by aligning transcription start sites (TSS) to the TF-binding locations from ChIP-seq and motifs, using the bedtools window command with a 1000 bp window size.
For eY1H data, an interaction is included if the TSS of the target gene overlaps with an open ATAC-seq region by at least one base pair.
For gene-gene interactions, of the 298 studies compiled in the WormExp v1.0 database (Yang et al., 2016, updated 27/07/16), 98 studies were included in the database, spanning 126 different genes (see Table S1 in this repository).
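As a rough sketch of the windowing step, the Python below links a TF to a gene when one of the TF's binding sites falls within 1000 bp on either side of the gene's TSS. It is not the authors' code: the record does not say which file was the query and which the target in bedtools window, nor how strand was handled, so treating the TSS as a 1 bp feature extended symmetrically is an assumption, and all names and coordinates are hypothetical.

```python
from collections import namedtuple

# Hypothetical record types; coordinates are assumed BED-style (0-based, half-open).
Site = namedtuple("Site", "tf chrom start end")
Tss = namedtuple("Tss", "gene chrom pos")

WINDOW = 1000  # window size in bp, as stated in the description


def infer_edges(tss_list, sites, window=WINDOW):
    """Return sorted (tf, gene) pairs whose binding site lies within the TSS window."""
    edges = set()
    for t in tss_list:
        # Extend the 1 bp TSS feature by `window` bp on both sides.
        lo = max(0, t.pos - window)
        hi = t.pos + 1 + window
        for s in sites:
            if s.chrom == t.chrom and s.start < hi and lo < s.end:
                edges.add((s.tf, t.gene))
    return sorted(edges)


# Hypothetical data, for illustration only.
tss_list = [Tss("gene-a", "chrI", 5000)]
sites = [
    Site("tf-1", "chrI", 3900, 4001),   # last base (4000) is exactly 1000 bp upstream
    Site("tf-2", "chrI", 6001, 6100),   # first base (6001) is 1001 bp downstream
    Site("tf-3", "chrII", 5000, 5010),  # different chromosome
]
edges = infer_edges(tss_list, sites)
```

Collecting pairs in a set means that several binding sites of one TF near the same TSS yield a single TF-gene interaction.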