Published May 14, 2023 | Version 1.0.0
Workflow Open

Somatic and Germline Variant Calling Workflows for Canopy2

  • 1. Department of Biostatistics, The University of North Carolina at Chapel Hill
  • 2. Department of Biostatistics and Department of Genetics, The University of North Carolina at Chapel Hill

Description

Overview

Zipped folders contain shell and R scripts associated with the somatic (_Somatic) and germline (_Germline) variant calling workflows outlined in “Canopy2: tumor phylogeny inference using bulk DNA and single-cell RNA sequencing,” by Weideman et al. We refer the user to the manuscript methods and supplement for graphical illustrations and associated protocols corresponding to these workflows.

The workflows are demonstrated on two individuals with breast cancer (Chung et al., 2017) and three individuals with glioblastoma (Lee et al., 2017). Details on the samples can be found in the main text and supplement, except for GBM2 (Lee et al., 2017), which was not used in the main analysis due to size.

The script titled “master_script_authorname.sh” in each folder can be used to run all steps of the workflow interactively on your cluster. Although these master scripts and associated files are written as generically as possible, the user will still need to modify the paths, file names, and sample names to match their specific project. We also suggest updating the software to the most recent versions to avoid potential conflicts.
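Before adapting a master script, it can help to gather the project-specific paths in one place and confirm that the expected inputs exist. The following is a minimal sketch under stated assumptions, not an excerpt from the distributed scripts: the variable names (PROJECT_DIR, FASTQ_DIR), the function name check_inputs, and the "_R1.fastq.gz" / "_R2.fastq.gz" naming convention are all hypothetical and should be replaced with whatever your project uses.

```shell
#!/usr/bin/env bash
# Hypothetical project settings; these names do NOT come from the master scripts.
# Each can be overridden from the environment before sourcing this file.
PROJECT_DIR="${PROJECT_DIR:-$PWD/canopy2_project}"
FASTQ_DIR="${FASTQ_DIR:-$PROJECT_DIR/fastq}"

# Check that paired-end FASTQ files exist for every sample name given as an
# argument. Assumes (hypothetically) files named <sample>_R1.fastq.gz and
# <sample>_R2.fastq.gz inside FASTQ_DIR.
# Returns 0 if all files are present, 1 if any are missing.
check_inputs() {
    local missing=0 sample mate path
    for sample in "$@"; do
        for mate in R1 R2; do
            path="$FASTQ_DIR/${sample}_${mate}.fastq.gz"
            if [ ! -f "$path" ]; then
                echo "missing input: $path" >&2
                missing=1
            fi
        done
    done
    return "$missing"
}
```

For example, `check_inputs sampleA sampleB` (placeholder sample names) prints any missing FASTQ paths to standard error and returns a non-zero status, so it can guard the first step of a run.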

Data Availability

The FASTQ files were too large to upload here, so we refer readers to the following locations for the sequencing data.

Glioblastoma data (Lee et al., 2017): the RNA-seq (single-cell and bulk) and bulk DNA WES data were downloaded from the European Genome-phenome Archive (EGA) under accession code EGAS00001001880.

Breast cancer data (Chung et al., 2017): the single-cell and bulk RNA-seq data were downloaded from the NCBI Gene Expression Omnibus (GEO) database under accession code GSE75688. The bulk DNA WES data were downloaded from the NCBI Sequence Read Archive (SRA) under accession code SRP067248.

Files

Chung_NatureCommunications_2017_Germline.zip

Files (77.6 kB)

Checksum                                Size
md5:124e510eb8c56122a17f2ccf776ca03d    21.1 kB
md5:b3239a93e23d20aef6e312e72127cc03    18.3 kB
md5:908926d795cc5938ec42f3adb02f88e5    20.6 kB
md5:4ba89f418889601c91b27410d941387a    17.7 kB
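
Each archive is listed with an MD5 checksum, which can be compared against a local download to confirm the file arrived intact. The following is a minimal sketch, assuming GNU coreutils md5sum is installed. The function name verify_md5 is hypothetical and is not part of the distributed scripts. This listing does not show which checksum belongs to which archive, so the pairing has to be taken from the record's file table.

```shell
#!/usr/bin/env bash
# Compare a file's MD5 digest with an expected value.
# Usage: verify_md5 <file> <expected hex digest, without the "md5:" prefix>
# Returns 0 on match, 1 on mismatch, 2 if the file does not exist.
verify_md5() {
    local file="$1"
    local expected="$2"
    local actual

    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 2
    fi

    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual="$(md5sum "$file" | cut -d ' ' -f 1)"

    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file (expected $expected, got $actual)" >&2
        return 1
    fi
}
```

A call takes the path to a downloaded archive and the digest copied from the listing, with the "md5:" prefix removed. A non-zero return status indicates the download should be repeated.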

Additional details

References

  • Chung W, Eum HH, Lee HO, Lee KM, Lee HB, Kim KT, et al. Single-cell RNA-seq enables comprehensive tumour and immune cell profiling in primary breast cancer. Nature Communications. 2017;8(1). doi:10.1038/ncomms15081
  • Lee JK, Wang J, Sa JK, Ladewig E, Lee HO, Lee IH, et al. Spatiotemporal genomic architecture informs precision oncology in glioblastoma. Nature Genetics. 2017;49(4):594–599. doi:10.1038/ng.3806