clonealign assigns single cells (measured with RNA-seq) to their clones of origin, where the clones have been inferred from ultra-shallow scDNA-seq and collated into copy number profiles.

clonealign(gene_expression_data, copy_number_data, max_iter = 100,
  rel_tol = 1e-06, gene_filter_threshold = 0, learning_rate = 0.1,
  x = NULL, fix_alpha = FALSE, fix_s = NULL, dtype = "float64",
  saturate = TRUE, saturation_threshold = 6, K = NULL, B = 20,
  verbose = TRUE)

Arguments

gene_expression_data

A matrix of gene counts or a SingleCellExperiment. This should contain raw counts. See details.

copy_number_data

A matrix or data frame of copy number calls for each clone. See details.

max_iter

Maximum number of Variational Bayes iterations to perform

rel_tol

Relative tolerance (change in ELBO per iteration in percent) below which the inference is considered converged

gene_filter_threshold

Genes with total counts below or equal to this threshold will be filtered out (removes genes with no counts by default)

learning_rate

The learning rate to be passed to the Adam optimizer

x

An optional vector of covariates, e.g. corresponding to batch or patient. Can be a vector of a single covariate or a sample by covariate matrix. Note this should not contain an intercept.

fix_alpha

Should the underlying priors for clone frequencies be fixed? Default FALSE (values are inferred from the data)

fix_s

Should the size factors be fixed? If NULL they are jointly inferred from the data; otherwise supply a vector of size factors with one entry per cell.

dtype

The dtype for TensorFlow usage, either "float32" or "float64"

saturate

Should the CNV-expression relationship saturate above copy number = saturation_threshold? Default TRUE

saturation_threshold

If saturate is true, copy numbers above this will be reduced to the threshold

K

The dimensionality of the expression latent space. If left NULL, K is set to 1 if fewer than 100 genes are present and 6 otherwise.

B

Number of basis functions for spline fitting

verbose

Should warnings and convergence information be printed? Default TRUE
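As a rough illustration of what gene_filter_threshold, saturation_threshold and x describe, the base R sketch below applies a count filter, caps copy numbers at a threshold, and builds a covariate matrix without an intercept. It only mimics the documented behaviour on toy data and is not clonealign's internal code; the objects counts, cn and batch are made up for the example.

```r
set.seed(1)

# Toy cell-by-gene count matrix: 6 cells, 4 genes; gene g3 has no counts
counts <- matrix(rpois(24, lambda = 5), nrow = 6,
                 dimnames = list(paste0("cell", 1:6), paste0("g", 1:4)))
counts[, "g3"] <- 0L

# gene_filter_threshold: drop genes whose total count is <= the threshold
gene_filter_threshold <- 0
keep <- colSums(counts) > gene_filter_threshold
counts_filtered <- counts[, keep, drop = FALSE]

# saturation_threshold: copy numbers above the threshold are reduced to it
# (toy gene-by-clone matrix; columns are clones A, B, C)
cn <- matrix(c(2, 2, 8, 1,
               6, 3, 2, 7,
               4, 2, 2, 2), nrow = 4,
             dimnames = list(paste0("g", 1:4), c("A", "B", "C")))
saturation_threshold <- 6
cn_saturated <- pmin(cn, saturation_threshold)

# x: encode a batch label as a covariate matrix and drop the intercept column
batch <- factor(c("b1", "b1", "b2", "b2", "b3", "b3"))
x <- model.matrix(~ batch)[, -1, drop = FALSE]
```

Here counts_filtered keeps the three genes with non-zero totals, cn_saturated has no entry above 6, and x has one column per non-reference batch level.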

Value

An object of class clonealign_fit. The maximum likelihood estimates of the clone assignment parameters are in the clone slot. Maximum likelihood estimates of all model parameters are in the ml_params slot.

Details

Input format

gene_expression_data must either be a SingleCellExperiment or SummarizedExperiment with a counts assay representing raw gene expression counts, or a cell by gene matrix of raw counts.

copy_number_data must either be a matrix, data.frame or DataFrame with a row for each gene in gene_expression_data and a column for each of the clones. If colnames(copy_number_data) is not NULL then these names will be used for each of the clones in the final output.
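The shapes described above can be sketched with toy data in base R. The values and the names gene1 ... gene8 are invented for illustration; the only points carried over from the text are that the expression matrix is cell by gene, the copy number table is gene by clone, and the column names of the copy number table become the clone names. The call to clonealign is left commented out because it needs a working TensorFlow installation.

```r
set.seed(42)
n_cells <- 5
n_genes <- 8
genes <- paste0("gene", seq_len(n_genes))

# Cell-by-gene matrix of raw counts
gene_expression_data <- matrix(
  rpois(n_cells * n_genes, lambda = 10),
  nrow = n_cells,
  dimnames = list(paste0("cell", seq_len(n_cells)), genes)
)

# Gene-by-clone copy number calls: one row per gene, one column per clone
copy_number_data <- data.frame(
  A = c(2, 2, 2, 3, 3, 1, 2, 2),
  B = c(2, 1, 2, 2, 4, 2, 2, 3),
  C = c(3, 2, 2, 2, 2, 2, 1, 2),
  row.names = genes
)

# The number of copy number rows must match the number of genes (columns)
# in the expression matrix
stopifnot(nrow(copy_number_data) == ncol(gene_expression_data))

# These names would label the clones in the output
clone_names <- colnames(copy_number_data)

# fit <- clonealign(gene_expression_data, copy_number_data)
```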

Recommended parameter settings

As with any probabilistic model, there are many parameters to set. Through comprehensive simulations of the model's robustness to mis-specification (i.e. what is the minimum proportion of genes for which the CNV-expression relationship can hold while our inferences remain valid), we have arrived at the following guidelines for parameter settings, reflected in the default values:

  • Number of Adam iterations - if set to 1 we essentially perform gradient descent on the marginal log-likelihood, which empirically appears to have the best performance

  • Dispersions should be clone-specific with weak shrinkage (sigma = 1 appears best)

  • The generating probabilities should be fixed to be a priori equal (this corresponds to setting fix_alpha = TRUE)

  • The cell size factors are best fixed in advance by multiplying the total counts of whatever genes are passed to clonealign by the edgeR (TMM) normalization factors
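The size factor guideline in the last bullet could be sketched as follows. This is one reading of the recommendation, not code taken from the package: it assumes a cell-by-gene count matrix called counts (toy data here), uses edgeR::calcNormFactors for the TMM factors when edgeR is installed, and falls back to factors of 1 otherwise purely so the sketch still runs. The commented call shows where the result would be passed as fix_s.

```r
set.seed(7)

# Toy cell-by-gene matrix of raw counts: 20 cells, 50 genes
counts <- matrix(rpois(20 * 50, lambda = 20), nrow = 20)

# Total counts per cell over the genes that would be passed to clonealign
total_counts <- rowSums(counts)

if (requireNamespace("edgeR", quietly = TRUE)) {
  # edgeR expects genes in rows and samples (here, cells) in columns
  norm_factors <- as.numeric(edgeR::calcNormFactors(t(counts), method = "TMM"))
} else {
  # Fallback for illustration only: no TMM adjustment
  norm_factors <- rep(1, nrow(counts))
}

# One size factor per cell
size_factors <- total_counts * norm_factors

# fit <- clonealign(counts, copy_number_data, fix_s = size_factors)
```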

Controlling Variational inference

Inference is performed using reparametrization-gradient variational inference. Convergence is monitored via changes to the evidence lower bound (ELBO) - this is controlled using the rel_tol parameter. When the difference between the new and old ELBOs, normalized by the absolute value of the old, falls below rel_tol, the algorithm is considered converged. The maximum number of iterations to achieve this is set using the max_iter parameter.

In each step, maximization is performed using Adam, with learning rate given by learning_rate.
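The stopping rule described above can be written out as a small base R sketch. The helper names relative_elbo_change and has_converged, and the ELBO values, are made up for illustration. Taking the absolute value of the relative change is an assumption of this sketch, as is leaving out any scaling to percent; the package source is the reference for both.

```r
# Difference between new and old ELBO, normalized by |old ELBO|
relative_elbo_change <- function(elbo_new, elbo_old) {
  (elbo_new - elbo_old) / abs(elbo_old)
}

# Assumption: convergence is judged on the magnitude of the relative change
has_converged <- function(elbo_new, elbo_old, rel_tol = 1e-6) {
  abs(relative_elbo_change(elbo_new, elbo_old)) < rel_tol
}

# Made-up ELBO values, one per iteration (first entry is the starting value)
elbo_trace <- c(-1050, -1010, -1001, -1000.5, -1000.4999)

rel_tol <- 1e-6
max_iter <- 100

# Stop at the first iteration whose relative change falls below rel_tol,
# or after max_iter iterations, whichever comes first
converged_at <- NA
last_index <- min(length(elbo_trace), max_iter + 1)
for (i in 2:last_index) {
  if (has_converged(elbo_trace[i], elbo_trace[i - 1], rel_tol)) {
    converged_at <- i - 1
    break
  }
}
```

With these toy values the first three updates change the ELBO by far more than rel_tol, and the fourth update is the first to fall below it.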

Examples

library(SingleCellExperiment)
data(example_sce)
copy_number_data <- rowData(example_sce)[,c("A", "B", "C")]
cal <- clonealign(example_sce, copy_number_data)
#> Removing 0 genes with low counts
#> Creating Tensorflow graph...
#> running VB [>---------------------------------] 2% | change in elbo 0.014%
#> running VB [==============================>] 99% | change in elbo 0.000016%
#> clonealign inference complete
print(cal)
#> A clonealign_fit for 200 cells, 100 genes, and 3 clones
#> To access clone assignments, call x$clone
#> To access ML parameter estimates, call x$ml_params
clones <- cal$clone