Training material for small RNA-seq data analysis (Galaxy Training Network tutorial)
- 1. Johns Hopkins University
The data provided here are part of a Galaxy Training Network tutorial that analyzes small RNA-seq (sRNA-seq) data from a study published by Harrington et al. (DOI:10.1186/s12864-017-3692-8) to detect differential abundance of various classes of endogenous short interfering RNAs (esiRNAs). The goal of this study was to investigate "connections between differential retroTn and hp-derived esiRNA processing and cellular location, and to investigate the potential link between mRNA 3’ end cleavage and esiRNA biogenesis." To this end, sRNA-seq libraries were constructed from triplicate Drosophila tissue culture samples under conditions of either control RNAi or RNAi knockdown of a factor involved in mRNA 3’ end processing, Symplekin. This dataset (GEO Accession: GSE82128) consists of single-end, size-selected, non-rRNA-depleted sRNA-seq libraries. Because of the long processing time for the large original files, we have downsampled the original raw data files to include only reads that align to a subset of interesting transcript features including: (1) transposable elements, (2) Drosophila piRNA clusters, (3) Symplekin, and (4) genes encoding mass spectrometry-defined protein binding partners of Symplekin from Additional File 2 in the indicated paper by Harrington et al. More details on features 1 and 2 can be found here: https://github.com/bowhan/piPipes/blob/master/common/dm3/genomic_features (piRNA_Cluster, Trn). All features are from the Drosophila genome Apr. 2006 (BDGP R5/dm3) release.
- 10.1186/s12864-017-3692-8 (DOI)
- Harrington, AW et al. Drosophila melanogaster retrotransposon and inverted repeat-derived endogenous siRNAs are differentially processed in distinct cellular locations. 18, 304 (2017).
- Afgan, E et al. The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2016 update. 44, W3–W10 (2016).