4299922
doi 10.5281/zenodo.4299922
oai:zenodo.org:4299922
user-eucanshare user-eucan user-eu
Spiess, Andrej (Medical Center Hamburg-Eppendorf)
Engels, Anna Lena (Medical Center Hamburg-Eppendorf)
euCanSHare. Deliverable 4.4 - Bioinformatics Toolbox
Zeller, Tanja (Medical Center Hamburg-Eppendorf)
info:eu-repo/semantics/openAccess
Creative Commons Attribution 4.0 International https://creativecommons.org/licenses/by/4.0/legalcode cc-by-4.0 spdx

The purpose of this deliverable is to demonstrate the applicability of a bioinformatics tool (part of a larger toolbox) that can either analyse external data supplied through an upload mechanism or automatically analyse data housed on the internal server. For this initial case, we selected the analysis of RNA sequencing (RNAseq) data, the de facto standard for measuring gene expression today and widely applied in the scientific community. We have programmed a tool that, as it currently stands, analyses differential gene expression between two groups, based on a provided “raw count” RNAseq matrix and three additional files containing gene annotation data, group definitions and covariates. All data are matched automatically, and an extensive analysis is then conducted, including visualizations of expression levels, variance structure analysis by decomposition (PCA), variance contribution analysis, hierarchical clustering of the top differential transcripts, profile plots, and diagnostic plots (MA plot, volcano plot). During the analysis, the data used to generate these plots are also exported automatically and named accordingly. Differential gene expression is calculated with covariate-adjusted linear models, and the resulting p-values are corrected for multiple testing.
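
The covariate-adjusted modelling and multiple-testing correction described above can be illustrated with a short sketch. It is not the deliverable's implementation: the record does not say which transformation, model fitting routine or correction method the tool uses. The log2(count + 1) transform, per-gene ordinary least squares, the Benjamini-Hochberg correction, the toy counts, the single `age` covariate and every function name below are assumptions made for illustration. Only the Python standard library is used, so the t-distribution p-value is obtained by simple numerical integration.

```python
import math

def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(A)
    M = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return [row[n:] for row in M]

def t_two_sided_p(t, df, steps=2000):
    """Two-sided Student-t p-value via Simpson integration of the density."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2) - 0.5 * math.log(df * math.pi)
    pdf = lambda x: math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))
    h = abs(t) / steps
    area = pdf(0.0) + pdf(abs(t))
    area += sum((4 if i % 2 else 2) * pdf(i * h) for i in range(1, steps))
    return min(1.0, max(0.0, 1.0 - 2.0 * area * h / 3.0))

def group_effect(y, group, covariates):
    """Fit y ~ 1 + group + covariates by OLS; return (group coefficient, p-value)."""
    X = [[1.0, float(g)] + [float(c) for c in cov] for g, cov in zip(group, covariates)]
    n, k = len(X), len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(XtX)
    beta = [sum(inv[i][j] * Xty[j] for j in range(k)) for i in range(k)]
    rss = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    df = n - k                                  # residual degrees of freedom
    se = math.sqrt(rss / df * inv[1][1])        # standard error of the group term
    return beta[1], t_two_sided_p(beta[1] / se, df)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted, running_min = [0.0] * m, 1.0
    for rank in range(m, 0, -1):                # walk from largest to smallest p
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical toy input: raw counts per gene for 8 samples, two groups, one covariate.
counts = {
    "geneA": [10, 12, 11, 13, 80, 95, 90, 85],
    "geneB": [50, 48, 52, 47, 51, 49, 53, 50],
    "geneC": [30, 20, 35, 25, 22, 33, 28, 27],
}
group = [0, 0, 0, 0, 1, 1, 1, 1]
age = [[54], [61], [47], [58], [50], [63], [49], [57]]

genes = list(counts)
fits = [group_effect([math.log2(c + 1) for c in counts[g]], group, age) for g in genes]
adj = benjamini_hochberg([p for _, p in fits])

# Print one row per gene, ordered by adjusted p-value.
for g, (coef, p), q in sorted(zip(genes, fits, adj), key=lambda r: r[2]):
    print(f"{g}\tlog2FC={coef:.3f}\tp={p:.3g}\tadj_p={q:.3g}")
```

In practice, count data are usually analysed with dedicated packages that model the mean-variance relationship of RNAseq counts; the sketch only shows where the covariates enter the model and how the correction rescales the per-gene p-values.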
Finally, a large result matrix is generated, in which the original count matrix is augmented with annotations, gene names and the complete statistical data and sorted in ascending order of the corrected p-value, so that the most differential transcripts appear at the top. In future, it is envisaged that the user will select RNAseq data deposited alongside clinical variables and define the desired grouping of the samples, which will then be sufficient to create a complete analysis output as described above.

This deliverable has been produced in the context of the euCanSHare (An EU-Canada joint infrastructure for next-generation multi-Study Heart research) Research and Innovation Action, funded by the European Union's Horizon 2020 programme (grant agreement No 825903), the Canadian Institutes of Health Research (CIHR) and the Fonds de recherche du Québec – Santé under the framework of the Canada-EU Commission Flagship Collaboration for Human data storage, integration and sharing.

eng
Zenodo 2020-11-30
info:eu-repo/semantics/report
825903 An EU-Canada joint infrastructure for next-generation multi-Study Heart research
20201202122713.0
473711 md5:497b9ed6591b953b7b0dd17a2de410de https://zenodo.org/records/4299922/files/D4.4_euCanSHare_UKE_30112020.pdf open
10.5281/zenodo.4299921 isVersionOf doi