Ben is looking at parallelizing the genotype log likelihood calculation using RcppParallel, for a version of rubias that updates the allele frequencies and hence gets bogged down recomputing those logls.

I looked over the RcppParallel documentation, and it looks like there is a lot of C++ templating to get right. I was wondering whether it might be possible to do the parallelization within R using mclapply instead.

It actually won't be, because Ben is doing all of that internally to the mcmc function in Rcpp. But I am still curious. The idea is to just chop a matrix up and then operate on each part.

I will do a simple test with log and sqrt.

Here is our matrix

library(tidyverse)
library(parallel)
mat <- abs(rnorm(1e8)) %>%
  matrix(nrow = 10000)
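
As an aside (not part of the original write-up), that is 1e8 doubles at 8 bytes each, so the matrix occupies roughly 800 MB of memory, which is worth keeping in mind when sub-matrices get copied and results get shipped between processes. A quick check:

# roughly 800 MB: 1e8 doubles at 8 bytes each
format(object.size(mat), units = "Mb")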

And here is our calculation

system.time(ser <- log(sqrt(mat)))
   user  system elapsed 
  1.527   0.251   1.779 

So, the serial version takes about 2 seconds. Now, the question is: can we parallelize that in a way that is any more efficient?

We will break the rows up into eight segments:

starts <- seq(1, 10000, by = floor(10000/8))
ends <- c(starts[-1] - 1, 10000) 
se <- cbind(starts, ends)

system.time({par <- mclapply(1:8, function(x) {
  log(sqrt(mat[se[x,1]:se[x,2], ]))
}) %>%
  do.call(rbind, .)})
   user  system elapsed 
  8.800   2.615   6.121 
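
Two asides, neither of which was tried in the run above: mclapply() defaults to mc.cores = getOption("mc.cores", 2L), so unless that option is set it forks only two workers; and parallel::splitIndices() builds the row chunks directly, without the starts/ends bookkeeping. A sketch of the same calculation with both of those in place:

# Sketch (not run above): splitIndices() makes the row-index chunks,
# and mc.cores = 8 sets the worker count explicitly instead of relying
# on the default of getOption("mc.cores", 2L).
chunks <- splitIndices(nrow(mat), 8)
system.time({par2 <- mclapply(chunks, function(idx) {
  log(sqrt(mat[idx, , drop = FALSE]))
}, mc.cores = 8) %>%
  do.call(rbind, .)})
all.equal(ser, par2)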

Check to see that we get the same thing:

all.equal(ser, par)
[1] TRUE

All of which shows that there is a lot of overhead in breaking the problem up into parallelizable chunks. OK.
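
A reasonable guess (not tested above) is that much of that overhead comes from each forked worker serializing its 1250 x 10000 result block back to the master process. One way to probe that would be to have each worker return only a scalar, so there is almost nothing to ship back:

# Sketch: same per-chunk work, but each worker returns a single number,
# so the elapsed time reflects the computation plus forking, with
# essentially no cost for copying results back to the master.
system.time({
  sums <- mclapply(1:8, function(x) {
    sum(log(sqrt(mat[se[x, 1]:se[x, 2], ])))
  }, mc.cores = 8)
})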
