Somatic and Germline Variant Calling Workflows for Canopy2
- 1. Department of Biostatistics, The University of North Carolina at Chapel Hill
- 2. Department of Biostatistics and Department of Genetics, The University of North Carolina at Chapel Hill
Description
Overview
The zipped folders contain the shell and R scripts for the somatic (_Somatic) and germline (_Germline) variant calling workflows outlined in “Canopy2: tumor phylogeny inference using bulk DNA and single-cell RNA sequencing,” by Weideman et al. We refer the user to the manuscript methods and supplement for graphical illustrations of these workflows and the associated protocols.
The workflows are demonstrated on two individuals with breast cancer from Chung et al., 2017 and three individuals with glioblastoma from Lee et al., 2017. More information about the samples can be found in the main text and supplement. The exception is GBM2 (Lee et al., 2017), which was not used in the main analysis due to size.
The script titled “master_script_authorname.sh” in each folder can be used to interactively run all steps of the workflow on your cluster. Although these master scripts and associated files are written as generically as possible, users will still need to modify the paths, file names, and sample names to match their own project. We also suggest updating the software versions to the most recent releases to avoid potential conflicts.
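As an illustration of the kind of project-specific editing described above, the following is a minimal sketch of a settings block and an input check that a user might run before launching a workflow. It is not taken from the Canopy2 scripts: the variable names (PROJECT_DIR, FASTQ_DIR, SAMPLES), the function names, and the `_R1`/`_R2` FASTQ naming scheme are all assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical settings block (names are NOT from the Canopy2 scripts).
# Edit these values, or export them beforehand, to match your project.
PROJECT_DIR="${PROJECT_DIR:-$HOME/my_project}"   # assumed project root
FASTQ_DIR="${FASTQ_DIR:-$PROJECT_DIR/fastq}"     # assumed FASTQ location
SAMPLES="${SAMPLES:-sampleA sampleB}"            # space-separated sample names

# Print the paired-end FASTQ paths expected for each sample, one per line.
# The <sample>_R1/_R2.fastq.gz pattern is an assumed naming scheme.
expected_fastqs() {
  for sample in $SAMPLES; do
    printf '%s\n' "$FASTQ_DIR/${sample}_R1.fastq.gz"
    printf '%s\n' "$FASTQ_DIR/${sample}_R2.fastq.gz"
  done
}

# List any expected inputs that do not exist on stderr.
# Returns 0 when every file is present and 1 otherwise.
check_inputs() {
  missing=$(expected_fastqs | while IFS= read -r path; do
    [ -f "$path" ] || printf '%s\n' "$path"
  done)
  if [ -n "$missing" ]; then
    printf 'missing inputs:\n%s\n' "$missing" >&2
    return 1
  fi
  return 0
}
```

After sourcing this block, calling `check_inputs` before starting a long cluster run reports mismatched paths or sample names early, rather than partway through the workflow.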
Data Availability
The FASTQ files were too large to upload here, so we refer readers to the following locations for the sequencing data.
Glioblastoma data (Lee et al., 2017): the RNA-seq (single-cell and bulk) and bulk DNA WES data were downloaded from the European Genome-phenome Archive (EGA) under accession code EGAS00001001880.
Breast cancer data (Chung et al., 2017): the single-cell and bulk RNA-seq data were downloaded from the NCBI Gene Expression Omnibus (GEO) database under accession code GSE75688. The bulk DNA WES data were downloaded from the NCBI Sequence Read Archive (SRA) under accession code SRP067248.
Files
Chung_NatureCommunications_2017_Germline.zip
Additional details
References
- Chung W, Eum HH, Lee HO, Lee KM, Lee HB, Kim KT, et al. Single-cell RNA-seq enables comprehensive tumour and immune cell profiling in primary breast cancer. Nature Communications. 2017;8(1). doi:10.1038/ncomms15081
- Lee JK, Wang J, Sa JK, Ladewig E, Lee HO, Lee IH, et al. Spatiotemporal genomic architecture informs precision oncology in glioblastoma. Nature Genetics. 2017;49(4):594–599. doi:10.1038/ng.3806