Published February 19, 2021 | Version v0.5.13.0
Software Open

kundajelab/tfmodisco: Parallelizing Leiden Runs

Description

Corresponds to PR https://github.com/kundajelab/tfmodisco/pull/85.

For robustness, Leiden is run with several different random seeds, and the best partition is kept. Before this PR, those runs were not parallelized, because naively parallelizing leidenalg.find_partition with joblib fails with "TypeError: cannot pickle 'PyCapsule' object". This PR achieves parallelism by making calls to a dedicated script that runs Leiden community detection; the script is invoked using subprocess.Popen.
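The subprocess pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the worker here is a hypothetical stand-in that produces a placeholder partition and quality score from the seed, and the names WORKER_SOURCE, run_seeds_in_parallel, and best_partition are assumptions made for this example. A real worker would presumably load the graph and call leidenalg.find_partition with the given seed. Because each worker is a separate Python process that reports back as text, nothing needs to be pickled.

```python
import json
import subprocess
import sys

# Hypothetical stand-in worker (an assumption for this sketch). A real worker
# would load the graph, call leidenalg.find_partition(..., seed=seed), and
# report the resulting membership and quality instead of random placeholders.
WORKER_SOURCE = """
import json, random, sys
seed = int(sys.argv[1])
rng = random.Random(seed)
membership = [rng.randrange(3) for _ in range(10)]  # placeholder partition
quality = rng.random()                              # placeholder quality
print(json.dumps({"seed": seed, "quality": quality, "membership": membership}))
"""


def run_seeds_in_parallel(seeds):
    """Start one worker process per seed, then collect every result."""
    # All processes are launched before any is waited on, so they run
    # concurrently.
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER_SOURCE, str(seed)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for seed in seeds
    ]
    results = []
    for proc in procs:
        out, _ = proc.communicate()  # waits for this worker to finish
        if proc.returncode != 0:
            raise RuntimeError(f"worker exited with code {proc.returncode}")
        results.append(json.loads(out))
    return results


def best_partition(results):
    """Return the result with the highest reported quality."""
    return max(results, key=lambda r: r["quality"])
```

Calling best_partition(run_seeds_in_parallel(range(4))) would launch four workers at once and keep the highest-scoring result. Passing results back as JSON on stdout is one simple choice; for large graphs, temporary files may be more practical.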

Results on the BPNet Nanog task match the earlier results, but the Leiden clustering steps take noticeably less time: http://nbviewer.jupyter.org/github/kundajelab/tfmodisco_bio_experiments/blob/b3b4d7b240b8e398597100581ae791eec0a13b61/bpnet/trial1/TryBpNet_v0.5.13.0.ipynb (contrast with the v0.5.11.0 run: https://nbviewer.jupyter.org/github/kundajelab/tfmodisco_bio_experiments/blob/2ba855b85eddc4c4d7b5e3296c6e12cce04a705d/bpnet/trial1/TryBpNet_v0.5.11.0_reducemem.ipynb)

Files

kundajelab/tfmodisco-v0.5.13.0.zip (18.3 MB)
md5:e53320feeaf21ec6d497d0c839178718

Additional details