kundajelab/tfmodisco: Parallelizing Leiden Runs
Creators
- 1. Stanford University
- 2. University of Virginia
Description
Corresponds to PR https://github.com/kundajelab/tfmodisco/pull/85.
Leiden is run with several different random seeds, and the best partition is kept, for robustness. Prior to this PR, those runs were not parallelized, because naively parallelizing leidenalg.find_partition via joblib fails with "TypeError: cannot pickle 'PyCapsule' object". In this PR, parallelism is achieved by calling a dedicated script that runs Leiden community detection; the script is launched with subprocess.Popen.
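The subprocess approach described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the inline worker script, the function names run_seeds_in_parallel and best_partition, and the JSON output format are all hypothetical, and the worker uses a seeded random number as a stand-in for the call to leidenalg.find_partition so that the sketch runs without leidenalg installed.

```python
import json
import subprocess
import sys

# Hypothetical stand-in for the dedicated clustering script (an assumption,
# not the script shipped in the PR). A real worker would load the graph and
# call leidenalg.find_partition with the given seed; here a seeded RNG fakes
# the membership vector and the quality score.
WORKER_SOURCE = r"""
import json, random, sys
seed = int(sys.argv[1])
rng = random.Random(seed)
membership = [rng.randrange(3) for _ in range(10)]  # placeholder partition
quality = rng.random()                              # placeholder quality score
json.dump({"seed": seed, "quality": quality, "membership": membership}, sys.stdout)
"""


def run_seeds_in_parallel(seeds):
    """Start one worker process per seed, then collect every result."""
    # Launch all processes before waiting on any, so they run concurrently.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE, str(seed)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for seed in seeds
    ]
    results = []
    for proc in procs:
        # Outputs here are small; large outputs would be safer written to files.
        out, _ = proc.communicate()
        if proc.returncode != 0:
            raise RuntimeError(f"worker exited with code {proc.returncode}")
        results.append(json.loads(out))
    return results


def best_partition(results):
    """Keep the run with the highest quality score."""
    return max(results, key=lambda r: r["quality"])


if __name__ == "__main__":
    runs = run_seeds_in_parallel([0, 1, 2, 3])
    print("best seed:", best_partition(runs)["seed"])
```

Because each worker is a separate Python process, nothing needs to be pickled across the process boundary; only the seed goes in as a command-line argument and plain JSON comes back, which is what sidesteps the PyCapsule error.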
Results on the BPNet Nanog task are here (same results as before, but with noticeably less time spent on the Leiden clustering steps): http://nbviewer.jupyter.org/github/kundajelab/tfmodisco_bio_experiments/blob/b3b4d7b240b8e398597100581ae791eec0a13b61/bpnet/trial1/TryBpNet_v0.5.13.0.ipynb (contrast with https://nbviewer.jupyter.org/github/kundajelab/tfmodisco_bio_experiments/blob/2ba855b85eddc4c4d7b5e3296c6e12cce04a705d/bpnet/trial1/TryBpNet_v0.5.11.0_reducemem.ipynb)
Files
Name | Size |
---|---|
kundajelab/tfmodisco-v0.5.13.0.zip (md5:e53320feeaf21ec6d497d0c839178718) | 18.3 MB |
Additional details
Related works
- Is supplement to: https://github.com/kundajelab/tfmodisco/tree/v0.5.13.0 (URL)