Filtered and annotated SNV and indel variants in the PC3 and LNCaP human prostate cancer cell lines
Description
Paired-end reads of 150 bp (insert size 350 bp) were obtained using the Illumina HiSeq X sequencer. Whole-genome reads were aligned to the human reference genome GRCh38 build 82, and the resulting indexed BAM files were interrogated using Samtools v1.3.1 mpileup and bcftools to generate a VCF (Variant Call Format) file of single nucleotide variants (SNVs) and short indel variants. Variants private (unique) to one cell line and variants shared by both cell lines were then identified. Variants present in HapMap, 1000 Genomes Project phase 3 (2,504 human genomes), and the National Heart, Lung, and Blood Institute’s Exome Sequencing Project (ESP), which are likely to be common germline variants, were excluded (bundled variant data file available at https://goo.gl/mEogvD). Variant files (VCF) were filtered using SnpSift with the following expression: 'QUAL >= 200 && DP >= 30', where QUAL is the variant call quality (confidence) score and DP is the total read depth; both values are minimum thresholds. Filtered variants were annotated using SnpEff v4.3g. Please see https://github.com/sciseim/PCaWGS for associated scripts.
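For readers who want to see the logic of the private/shared split and the quality filter in isolation, the following is a minimal Python sketch. It is not the SnpSift implementation and is not taken from the PCaWGS repository. The function names (parse_info, passes_filter, variant_key, split_private_shared) and the toy VCF records are hypothetical, and the sketch assumes tab-separated VCF data lines in which DP is stored in the INFO column.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'DP=45;MQ=60' into a dict."""
    fields = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")  # flag-only keys get an empty value
        fields[key] = value
    return fields


def passes_filter(line, min_qual=200.0, min_dp=30):
    """Apply the thresholds QUAL >= 200 and DP >= 30 to one VCF data line."""
    cols = line.rstrip("\n").split("\t")
    qual = cols[5]                         # column 6 of a VCF record is QUAL
    depth = parse_info(cols[7]).get("DP")  # column 8 is INFO
    if qual == "." or not depth:
        return False                       # missing values fail the filter
    return float(qual) >= min_qual and int(depth) >= min_dp


def variant_key(line):
    """Identify a variant by chromosome, position, REF allele and ALT allele."""
    cols = line.rstrip("\n").split("\t")
    return (cols[0], int(cols[1]), cols[3], cols[4])


def split_private_shared(lines_a, lines_b):
    """Return (private to A, private to B, shared) sets of variant keys."""
    keys_a = {variant_key(l) for l in lines_a if not l.startswith("#")}
    keys_b = {variant_key(l) for l in lines_b if not l.startswith("#")}
    return keys_a - keys_b, keys_b - keys_a, keys_a & keys_b


# Toy records (hypothetical values, for illustration only).
pc3 = [
    "chr1\t1000\t.\tA\tG\t250\t.\tDP=40",
    "chr1\t2000\t.\tC\tT\t150\t.\tDP=50",
    "chr2\t3000\t.\tG\tGA\t300\t.\tDP=20",
]
lncap = [
    "chr1\t1000\t.\tA\tG\t220\t.\tDP=35",
    "chr3\t4000\t.\tT\tC\t500\t.\tDP=60",
]

pc3_private, lncap_private, shared = split_private_shared(pc3, lncap)
pc3_filtered = [line for line in pc3 if passes_filter(line)]
```

In this toy input, only the first PC3 record meets both thresholds: the second has QUAL below 200 and the third has DP below 30. Keying variants on chromosome, position, REF and ALT is one simple way to compare two call sets; multi-allelic records would need to be split or normalized first, which this sketch does not attempt.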