Published December 23, 2016 | Version v1
Software Open

Joint statistical inference of clonal populations from single cell and bulk tumour sequencing data

Description

ddClone (Salehi et al.) is a statistical framework that leverages data obtained from both single cell and bulk sequencing strategies. The approach is predicated on the notion that single cell sequencing data will inform and improve the clustering of allele fractions derived from bulk sequencing data in a joint statistical model.
ddClone combines a Bayesian non-parametric prior informed by single cell data with a likelihood model based on bulk sequencing data to infer clonal population architecture. Intuitively, the prior encourages genomic loci with co-occurring mutations in single cells to cluster together. Using a cell-locus binary matrix from single cell sequencing, ddClone computes a distance matrix between mutations using the Jaccard distance with exponential decay. This matrix is then used as a prior for inference over mutation clusters and their prevalences from deeply sequenced bulk data within a distance-dependent Chinese restaurant process (ddCRP) framework (Frazier and Blei 2012). The output of the model is the most probable set of mutational clusters present and the prevalence of each mutation in the population. The code is based on the ddCRP model as introduced and implemented by Frazier and Blei (2012).
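The distance step described above can be illustrated with a short Python sketch. It is not taken from the ddClone code base: the function names, the toy matrix, the decay form exp(-d / decay), the decay value, and the handling of loci that are mutated in no cell are all assumptions made for illustration.

```python
import math


def jaccard_distance(col_a, col_b):
    """Jaccard distance between two binary mutation profiles across cells."""
    both = sum(1 for a, b in zip(col_a, col_b) if a and b)
    either = sum(1 for a, b in zip(col_a, col_b) if a or b)
    # Assumed convention: loci mutated in no cell share no evidence,
    # so they are placed at the maximum distance of 1.
    if either == 0:
        return 1.0
    return 1.0 - both / either


def decayed_affinity(cell_by_locus, decay=0.5):
    """Pairwise locus affinities exp(-d / decay) from a cell-by-locus 0/1 matrix.

    The exponential form and the `decay` parameter are illustrative
    assumptions, not the settings used by ddClone.
    """
    n_loci = len(cell_by_locus[0])
    # Transpose so that each entry is one locus's profile over all cells.
    columns = [[row[j] for row in cell_by_locus] for j in range(n_loci)]
    affinity = [[0.0] * n_loci for _ in range(n_loci)]
    for i in range(n_loci):
        for j in range(n_loci):
            d = 0.0 if i == j else jaccard_distance(columns[i], columns[j])
            affinity[i][j] = math.exp(-d / decay)
    return affinity


# Toy data: 4 cells (rows) by 3 loci (columns); 1 means the mutation was observed.
cells = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
affinity = decayed_affinity(cells, decay=0.5)
```

In this toy example, loci 0 and 1 co-occur in two cells and so receive a higher affinity than loci 1 and 2, which never co-occur. In a ddCRP-style prior, a higher affinity makes it more likely a priori that two loci are linked into the same cluster.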

Files

Files (65.9 kB)

Checksum Size
md5:4fd4260805e33cab850fa4a982616255 21.8 kB
md5:b9e681b0cfa543d828c489d53d32adb4 2.5 kB
md5:b93ddd76f8e5439f704e7b72c3d74b7b 7.9 kB
md5:55650bd7bbef73f9659d8092fbb1c627 3.9 kB
md5:82ce76c80833c012a89271e3a4ef92ef 3.8 kB
md5:79237ef0cf0e307b5a6b1f3d692ca3ea 243 Bytes
md5:9d6e7964c12a97a0fccf689db270fb3e 7.3 kB
md5:7cb52b9540a936e4ee3b39e78b0021c1 746 Bytes
md5:25b4f04d7a06377347419f88463e9ef4 15.7 kB
md5:7c115728abad6608c948a1d4a10808f0 2.0 kB