There is a newer version of the record available.

Published April 14, 2020 | Version 0.1
Journal article Open

EpiRegio: Analysis and retrieval of regulatory elements linked to genes

  • 1. Institute for Cardiovascular Regeneration, Goethe University Hospital, 60590 Frankfurt am Main, Germany
  • 2. Institute for Cardiovascular Regeneration, Goethe University Hospital, 60590 Frankfurt am Main, Germany
  • 3. Genome Institute of Singapore, 60 Biopolis Street, Genome, 02-01 Singapore 138672; Cluster of Excellence, Multimodal Computing and Interaction, Saarland Informatics Campus, 66123 Saarbrücken, Germany
  • 4. Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany

Description

The data set contains all regulatory elements (REMs) and the additional information used to create the EpiRegio webserver (https://epiregio.de).

The data set consists of 10 tables (CSV files):

  • GenomeAnnotation: genome and annotation metadata (genomeVersion, annotationVersion and databaseName) (GenomeAnnotation_1.csv.gz)
  • GeneAnnotation: information about the genes (chr, start, end, geneID, geneSymbol, alternativeGeneID, isTF, strand and annotationVersion) (GeneAnnotation_1.csv.gz)
  • GeneExpression of Blueprint and Roadmap: one table per consortium with geneID, sampleID, expressionLog2TPM and species (GeneExpression_Blueprint_1.csv.gz and GeneExpressionRoadmap_1.csv.gz)
  • CellTypeInfo: information about the cell and tissue types used (cellTypeID, cellTypeName and cellOntologyTerm) (CellTypeInfo.csv.gz)
  • sampleInfo of Roadmap and Blueprint: one table per consortium with sampleID, originalSampleID, cellTypeID, origin and dataType (sampleInfo_Blueprint_1.csv.gz and sampleInfo_Roadmap_1.csv.gz)
  • REMAnnotation: all REMs predicted with STITCHIT (chr, start, end, geneID, REMID, regressionCoefficient, pValue, normModelScore, meanDNase1Signal, sdDNase1Signal, consortium and version) (REMAnnotationModelScore_1.csv.gz)
  • REMActivity: for each REM, the DNase signal and the standardised DNase signal per cell or tissue type (REMID, sampleID, dnase1Log2, standDnase1Log2 and version) (REMActivity_1.csv.gz)
  • clusterREMs: all clusters of REMs (CREMs) (REMID, CREMID, chr, start, end, REMsPerCREM and version) (clusterREMs_1.csv.gz)
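As an illustration of how a single table might be read, the following minimal sketch streams REMAnnotation and keeps the rows linked to one gene. It assumes the file is comma-delimited and that its columns appear in the order listed above; neither is confirmed by this record. The function name `rems_for_gene` is hypothetical.

```python
import csv
import gzip

# Column order taken from the table description; the real file layout
# (delimiter, header row, column order) is an assumption.
REM_COLUMNS = [
    "chr", "start", "end", "geneID", "REMID", "regressionCoefficient",
    "pValue", "normModelScore", "meanDNase1Signal", "sdDNase1Signal",
    "consortium", "version",
]


def rems_for_gene(path, gene_id, delimiter=","):
    """Yield one dict per REMAnnotation row whose geneID equals gene_id."""
    with gzip.open(path, "rt", newline="") as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            if len(row) != len(REM_COLUMNS):
                continue  # skip empty or malformed lines
            record = dict(zip(REM_COLUMNS, row))
            # A header row, if present, never matches a real gene ID,
            # so it is filtered out by this comparison as well.
            if record["geneID"] == gene_id:
                yield record
```

Reading row by row avoids loading a whole file into memory. All values are returned as strings and would need converting (for example `int(record["start"])`) before numeric use.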

With these tables, the underlying database of EpiRegio can easily be reconstructed. The source code for the current version of the EpiRegio webserver is available at 10.5281/zenodo.3751189.
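One way to approach such a reconstruction is to load each gzipped CSV into a relational table. The sketch below uses SQLite from the Python standard library purely for illustration; the database engine and schema used by EpiRegio itself are not stated here. The helper `load_table`, its `has_header` flag and the comma delimiter are assumptions.

```python
import csv
import gzip
import sqlite3


def load_table(conn, path, table, columns, has_header=False, delimiter=","):
    """Create `table` with the given columns and insert rows from a gzipped CSV."""
    # Identifiers cannot be bound as SQL parameters, so they are quoted;
    # quoting also protects column names such as "end", an SQL keyword.
    col_defs = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_defs})')
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        if has_header:
            next(reader, None)  # assumption: first line holds column names
        # Keep only rows with the expected number of fields
        rows = (row for row in reader if len(row) == len(columns))
        conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    conn.commit()


# Hypothetical usage, with column names taken from the description above:
# conn = sqlite3.connect("epiregio.sqlite")
# load_table(conn, "clusterREMs_1.csv.gz", "clusterREMs",
#            ["REMID", "CREMID", "chr", "start", "end", "REMsPerCREM", "version"])
```

The columns are created without declared types, so values are stored as text. A fuller reconstruction would add types, keys (for example on REMID and geneID) and indexes.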

Notes

This work has been supported by the DZHK (German Centre for Cardiovascular Research, 81Z0200101), the Cardio-Pulmonary Institute (CPI) [EXC 2026], and the DFG SFB/TRR 267 "Noncoding RNAs in the cardiovascular system".

Files

Files (6.4 GB)

MD5 checksum and size of each file:

  • md5:893482b8e2ec02346408f9d0ede2aa42 (1.0 kB)
  • md5:945f87f088b5e0dd8566a7fd23e5c9c3 (7.4 MB)
  • md5:b64ca12690dc22bdbe8b20a97a610f4d (1.0 MB)
  • md5:2b43a6632029c7b289f260d6950b2874 (21.0 MB)
  • md5:77753f17fcb15518046ac296dd84cf07 (43.1 MB)
  • md5:d7c27a9631bf719e3a1152539c887bbc (97 Bytes)
  • md5:b674a3d32658650bee61480c9c5e44db (6.4 GB)
  • md5:f5be3ba36d69cf79abb438c4305ce310 (557 Bytes)
  • md5:f1412276ae4f5dc343b0f8930366053b (1.1 kB)
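
The MD5 checksums listed for the files can be used to check a download for corruption. The following is a minimal sketch; the function name `md5_matches` is hypothetical. It reads the file in chunks so that the 6.4 GB file need not fit in memory.

```python
import hashlib


def md5_matches(path, expected, chunk_size=1 << 20):
    """Return True if the MD5 digest of the file at `path` equals `expected`.

    `expected` may be given with or without the "md5:" prefix used in the listing.
    """
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until the file is exhausted
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected.lower()
```

The listing does not show which checksum belongs to which file name, so the expected value has to be taken from the record page entry for the file that was downloaded.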

Additional details

Related works

Compiles
Software documentation: 10.5281/zenodo.3751189 (DOI)
Is derived from
Journal article: 10.5281/zenodo.3665990 (DOI)