Gene-level read counts from bulk RNA-seq data for 38 follicular lymphoma diagnostic biopsies
Description
Conventional (bulk) RNA sequencing was performed on unfractionated cell suspensions or snap-frozen whole-tissue material. Total RNA was isolated with TRIzol reagent, followed by purification over PureLink RNA Mini Kit columns (Invitrogen). Libraries were prepared using a polyA-enriched, strand-specific library construction protocol (doi: 10.1016/j.ccell.2016.02.009) and sequenced as paired-end 75 bp reads on an Illumina HiSeq 2500 instrument.
Raw reads were aligned to the human reference genome assembly GRCh37 (hg19) using STAR (v2.5.2a). To improve spliced alignment, STAR was provided with exon junction coordinates from the reference annotation (Gencode v19). We applied a modified version of a published bioinformatics workflow for normalization of raw read counts and differential gene expression analysis (doi: 10.12688/f1000research.9005.3). Gene-level read counts were quantified using htseq-count (HTSeq v0.11.0; intersection-strict mode, reverse strandedness) (doi: 10.1093/bioinformatics/btu638). Genes with low read counts (i.e., genes that did not reach counts per million (CPM) > 1.0 in at least 10% of samples) were removed from further analysis. Raw counts from the remaining expressed genes were then TMM-normalized and scaled to CPM using the edgeR (v3.22.2) package (doi: 10.1093/bioinformatics/btp616).
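The low-count filter described above can be illustrated with a minimal Python sketch. It is not the authors' code (the analysis itself used edgeR in R), and it makes several assumptions: counts are held in a plain dictionary mapping gene IDs to per-sample raw counts, CPM is computed from unnormalized library sizes (column sums) without TMM scaling factors, and "at least 10% of samples" is rounded up to a whole number of samples. The function names `cpm` and `filter_low_expression` are hypothetical.

```python
import math


def cpm(counts):
    """Convert raw counts to counts per million (CPM).

    counts: dict mapping gene ID -> list of raw read counts, one per sample.
    Assumption: library size is the plain column sum (no TMM factors).
    """
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        gene: [
            1e6 * c / lib_sizes[j] if lib_sizes[j] else 0.0
            for j, c in enumerate(row)
        ]
        for gene, row in counts.items()
    }


def filter_low_expression(counts, min_cpm=1.0, min_fraction=0.10):
    """Keep genes with CPM > min_cpm in at least min_fraction of samples."""
    cpm_values = cpm(counts)
    n_samples = len(next(iter(counts.values())))
    # Assumption: the sample threshold is rounded up (e.g. 4 of 38 samples).
    min_samples = math.ceil(round(min_fraction * n_samples, 9))
    return {
        gene: counts[gene]
        for gene, values in cpm_values.items()
        if sum(v > min_cpm for v in values) >= min_samples
    }
```

With 38 samples and the thresholds stated above, a gene would need CPM > 1.0 in at least 4 samples to be retained under this rounding assumption. The raw counts of the retained genes, not their CPM values, are returned so that they can be passed on to a normalization step.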
Sample IDs correspond to those referenced in Wang X et al., Nature Communications (2022).
Files
(9.9 MB)
| Checksum | Size |
|---|---|
| md5:3be22fe045a8711d23d44daf7dd5c6f4 | 9.9 MB |