Published April 14, 2020 | Version 1
Journal article
Open
EpiRegio: Analysis and retrieval of regulatory elements linked to genes
Creators
- 1. Institute for Cardiovascular Regeneration, Goethe University Hospital, 60590 Frankfurt am Main, Germany
- 2. Institute for Cardiovascular Regeneration, Goethe University Hospital, 60590 Frankfurt am Main, Germany
- 3. Genome Institute of Singapore, 60 Biopolis Street, Genome, 02-01 Singapore 138672; Cluster of Excellence, Multimodal Computing and Interaction, Saarland Informatics Campus, 66123 Saarbrücken, Germany
- 4. Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Maximus-von-Imhof-Forum 3, 85354 Freising, Germany
Description
The data set contains all regulatory elements (REMs) and the additional information used to create the EpiRegio webserver (https://epiregio.de).
The data set consists of 10 tables (gzip-compressed CSV files); the GeneExpression and sampleInfo entries below each comprise one table per consortium:
- GenomeAnnotation: information about genomeVersion, annotationVersion and databaseName (GenomeAnnotation_1.csv.gz)
- GeneAnnotation: information about the genes (chr, start, end, geneID, geneSymbol, alternativeGeneID, isTF, strand and annotationVersion) (GeneAnnotation_1.csv.gz)
- GeneExpression of Blueprint and Roadmap: one table per consortium with geneID, sampleID, expressionLog2TPM and species (GeneExpression_Blueprint_1.csv.gz and GeneExpressionRoadmap_1.csv.gz)
- CellTypeInfo: information about the cell and tissue types used (cellTypeID, cellTypeName and cellOntologyTerm) (CellTypeInfo.csv.gz)
- sampleInfo of Roadmap and Blueprint: one table per consortium with sampleID, originalSampleID, cellTypeID, origin and dataType (sampleInfo_Blueprint_1.csv.gz and sampleInfo_Roadmap_1.csv.gz)
- REMAnnotation: all REMs predicted with STITCHIT (chr, start, end, geneID, REMID, regressionCoefficient, pValue, normModelScore, meanDNase1Signal, sdDNase1Signal, consortium and version) (REMAnnotationModelScore_1.csv.gz)
- REMActivity: for each REM, the DNase signal and the standardised DNase signal per cell or tissue type (REMID, sampleID, dnase1Log2, standDnase1Log2 and version) (REMActivity_1.csv.gz)
- clusterREMs: all CREMs (REMID, CREMID, chr, start, end, REMsPerCREM and version) (clusterREMs_1.csv.gz)
With these tables, the underlying database of EpiRegio can be reconstructed. The source code for the current version of the EpiRegio webserver is available at 10.5281/zenodo.3751189. EpiRegio uses the STITCHIT algorithm; the corresponding manuscript is currently under revision, and the preprint is available at http://dx.doi.org/10.1101/585125.
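As a rough illustration of how one of these tables could be loaded into a local database, the following sketch reads a gzipped CSV file into SQLite using only the Python standard library. It is not the EpiRegio import procedure. The function name `load_csv_gz` is hypothetical, and the sketch assumes that the files are comma-separated, that the column order matches the order given in the description above, and that there is no header row unless `has_header=True` is passed. None of these assumptions is confirmed by the record, so inspect the first lines of a file before relying on them.

```python
import csv
import gzip
import sqlite3

# Column names taken from the description above; the column ORDER inside
# the CSV file is an assumption and should be checked against the data.
CLUSTER_REMS_COLUMNS = [
    "REMID", "CREMID", "chr", "start", "end", "REMsPerCREM", "version",
]


def load_csv_gz(conn, path, table, columns, has_header=False):
    """Load one gzipped CSV file into a SQLite table; return rows inserted.

    Columns are created without a declared type, so values are stored as
    text exactly as they appear in the file.
    """
    # Identifiers are quoted because names such as "end" are SQL keywords.
    col_list = ", ".join(f'"{name}"' for name in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_list})')

    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle)  # assumes a comma delimiter
        if has_header:
            next(reader, None)  # skip the header row if the file has one
        # Rows are streamed, so large files are not held in memory.
        cursor = conn.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})', reader
        )
    conn.commit()
    return cursor.rowcount


if __name__ == "__main__":
    connection = sqlite3.connect("epiregio_local.db")  # hypothetical file name
    inserted = load_csv_gz(
        connection, "clusterREMs_1.csv.gz", "clusterREMs", CLUSTER_REMS_COLUMNS
    )
    print(f"inserted {inserted} rows")
```

A row whose number of fields differs from the column list makes SQLite raise an error instead of being silently dropped, which helps to detect a wrong delimiter or an unexpected header early.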
Files
Total size: 7.1 GB
Checksum | Size |
---|---|
md5:893482b8e2ec02346408f9d0ede2aa42 | 1.0 kB |
md5:945f87f088b5e0dd8566a7fd23e5c9c3 | 7.4 MB |
md5:4fb2808ccf9b8f31bd8c29c7cfbf0a7f | 1.0 MB |
md5:2b43a6632029c7b289f260d6950b2874 | 21.0 MB |
md5:77753f17fcb15518046ac296dd84cf07 | 43.1 MB |
md5:83316853916b1df35f35d931d2d0c53f | 97 Bytes |
md5:d4668f53c78a657d26379035f96d9ef7 | 6.9 GB |
md5:058bcffb802579e8dbc7296b504d351f | 130.3 MB |
md5:f5be3ba36d69cf79abb438c4305ce310 | 557 Bytes |
md5:f1412276ae4f5dc343b0f8930366053b | 1.1 kB |
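The md5 values listed for the files can be used to check a download for corruption. The following is a minimal sketch using Python's standard `hashlib` module. The helper names `md5_of_file` and `matches_checksum` are hypothetical, and the file name in the usage example is only a guess, because this listing does not show which checksum belongs to which file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Chunked reading keeps memory use constant, also for multi-GB files.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a value such as 'md5:<hex>' or a bare '<hex>'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


if __name__ == "__main__":
    # Hypothetical pairing of file and checksum: the listing above does not
    # state which md5 value belongs to which file.
    ok = matches_checksum(
        "REMActivity_1.csv.gz", "md5:d4668f53c78a657d26379035f96d9ef7"
    )
    print("checksum matches" if ok else "checksum mismatch")
```

MD5 is adequate here for detecting accidental corruption during transfer; it is not intended as protection against deliberate tampering.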
Additional details
Related works
- Compiles
- Software documentation: 10.5281/zenodo.3751189 (DOI)
- Is derived from
- Journal article: 10.5281/zenodo.3665990 (DOI)