10.5281/zenodo.3758494
https://zenodo.org/records/3758494
oai:zenodo.org:3758494
Baumgarten Nina
0000-0002-5423-8634
Institute for Cardiovascular Regeneration, Goethe University Hospital, 60590 Frankfurt am Main, Germany
Hecker Dennis
0000-0003-0272-243X
Institute for Cardiovascular Regeneration, Goethe University Hospital, 60590 Frankfurt am Main, Germany
Karunanithi Sivarajan
0000-0001-6128-5351
Institute for Cardiovascular Regeneration, Goethe University Hospital, 60590 Frankfurt am Main, Germany
Schmidt Florian
0000-0001-9222-6207
Genome Institute of Singapore, 60 Biopolis Street, Genome, 02-01 Singapore 138672; Cluster of Excellence, Multimodal Computing and Interaction, Saarland Informatics Campus, 66123 Saarbrücken, Germany
List Markus
0000-0002-0941-4168
Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
Schulz Marcel H.
0000-0002-1252-3656
Institute for Cardiovascular Regeneration, Goethe University Hospital, 60590 Frankfurt am Main, Germany
EpiRegio: Analysis and retrieval of regulatory elements linked to genes
Zenodo
2020
web server
Epigenomics
enhancer
gene regulation
enhancer target association
2020-04-14
eng
10.5281/zenodo.3665990
10.5281/zenodo.3751189
10.5281/zenodo.3750929
1
Creative Commons Attribution 4.0 International
The data set contains all regulatory elements (REMs) and the additional information used to create the EpiRegio webserver (https://epiregio.de).
The data set consists of 10 tables (CSV-files):
GenomeAnnotation: genome and annotation metadata (genomeVersion, annotationVersion and databaseName) (GenomeAnnotation_1.csv.gz)
GeneAnnotation: information about the genes (chr, start, end, geneID, geneSymbol, alternativeGeneID, isTF, strand and annotationVersion) (GeneAnnotation_1.csv.gz)
GeneExpression of Blueprint and Roadmap: one table per consortium with geneID, sampleID, expressionLog2TPM and species (GeneExpression_Blueprint_1.csv.gz and GeneExpressionRoadmap_1.csv.gz)
CellTypeInfo: information about the cell and tissue types used (cellTypeID, cellTypeName and cellOntologyTerm) (CellTypeInfo.csv.gz)
sampleInfo of Roadmap and Blueprint: one table per consortium with sampleID, originalSampleID, cellTypeID, origin and dataType (sampleInfo_Blueprint_1.csv.gz and sampleInfo_Roadmap_1.csv.gz)
REMAnnotation: all REMs predicted with STITCHIT (chr, start, end, geneID, REMID, regressionCoefficient, pValue, normModelScore, meanDNase1Signal, sdDNase1Signal, consortium and version) (REMAnnotationModelScore_1.csv.gz)
REMActivity: for each REM, the DNase signal and the standardised DNase signal per cell or tissue type (REMID, sampleID, dnase1Log2, standDnase1Log2 and version) (REMActivity_1.csv.gz)
clusterREMs: all clusters of REMs (CREMs) (REMID, CREMID, chr, start, end, REMsPerCREM and version) (clusterREMs_1.csv.gz)
With these tables, the underlying database of EpiRegio can easily be reconstructed. The source code for the current version of the EpiRegio webserver is available at 10.5281/zenodo.3751189. EpiRegio uses the STITCHIT algorithm; the corresponding manuscript is currently under revision, and the preprint is available at http://dx.doi.org/10.1101/585125.
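As a rough illustration of how one of the tables might be loaded when rebuilding the database, the sketch below reads a gzipped CSV file into SQLite using only the Python standard library. It is not taken from the EpiRegio source code. The function name load_table, the choice of SQLite, and the assumption that each file starts with a header row whose columns follow the order listed above are all hypothetical and should be checked against the actual files.

```python
import csv
import gzip
import sqlite3

# Column names as listed in the data set description. Their order in the
# file and the presence of a header row are assumptions to verify.
CLUSTER_REMS_COLUMNS = [
    "REMID", "CREMID", "chr", "start", "end", "REMsPerCREM", "version",
]


def load_table(conn, table, path, columns, has_header=True):
    """Load one gzipped CSV file into a SQLite table with untyped columns."""
    quoted = ", ".join(f'"{name}"' for name in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({quoted})')
    with gzip.open(path, "rt", newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the (assumed) header row
        # Skip blank lines; every remaining row must match the column count.
        rows = (row for row in reader if row)
        conn.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})', rows
        )
    conn.commit()
```

With a connection such as sqlite3.connect("epiregio.db"), a call like load_table(conn, "clusterREMs", "clusterREMs_1.csv.gz", CLUSTER_REMS_COLUMNS) would fill one table; the remaining files could be handled the same way with their own column lists. Values are stored as text here, so numeric columns would still need casting or typed column definitions.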
This work has been supported by the DZHK (German Centre for Cardiovascular Research, 81Z0200101), the Cardio-Pulmonary Institute (CPI) [EXC 2026] and the DFG SFB/TRR 267 Noncoding RNAs in the cardiovascular system.