[ARTEFACT] Scalable Enumeration of Trap Spaces in Boolean Networks via Answer Set Programming
This artefact contains all data and code necessary to reproduce the claims presented
in the "Scalable Enumeration of Trap Spaces in Boolean Networks via Answer Set Programming"
paper. By following the instructions in this document, you should be able to reproduce
the presented figures on your particular hardware.
> WARNING: Keep in mind that the results concern the *performance* of the tested
methods. As such, the results will vary depending on the hardware used.
> The presented results were obtained using a Ryzen 5800X fixed to 4.7GHz (no turbo/OC)
and 128GiB of DDR4-3200, with a memory limit of 64GiB. However, only a single CPU core
is used for each test. Most of the results should be replicable even with 16GB of RAM,
but a few experiments can require more memory to complete successfully.
The benchmark scripts should automatically filter out out-of-memory or otherwise
unsuccessful test runs.
## Artifact structure
- `models.zip`: Archive with all test Boolean networks. It is extracted into the
`models` directory by the `env_setup.sh` script.
- `dependencies/*`: Archives with all Python packages directly used in our testing,
plus a list of general dependencies (including fixed versions). Used by
`env_setup.sh` to create a virtual Python environment.
- `bench/*`: Python scripts for computing min/max trap spaces using individual tools.
- `env_*.sh`: Scripts for creating, validating and removing the Python environment.
- `all_*.sh` and `bulk_bench.sh`: Scripts for running all benchmark instances in bulk.
- `results-expected.zip`: An archive with the results obtained in our testing.
- `speedup.sh/py`, `aggregate.sh/py`, `variance.sh`: Other scripts for computing specific
parts of the results.
## Environment setup
> The presented results were measured using Python 3.11 and Debian 12.
First, we set up a virtual Python environment with all required dependencies.
For the compared tools (`mpbn==1.7`, `trappist==0.8`, `trapmvn`, and
`PyBoolNet==3.0.11`), we include the actual source code of the tool to
avoid any confusion.
> Note that for `mpbn` and `trappist`, the source code is modified to include a
procedure for computing *maximal* trap spaces. An unmodified version of the
Python package is also included for reference.
For our implementation (`tsconj`), we also include the full source and directly
import the Python modules from this directory.
> The full implementation of the proposed conjunctive encoding is given in
`conj.py`. Other parts of the library are adapted from `trappist`, as it already
provides similar functionality in other areas.
To recreate the testing environment, run the following commands:
```bash
# Create a virtual environment and install Python dependencies.
./env_setup.sh
# Activate the Python environment.
source ./env/bin/activate
# Check that all software is installed and usable. At this point,
# you may be asked to install `clingo`, `gringo`, `clasp` or `minizinc`.
# If you get an error for the pyboolnet package, see below.
./env_check.sh
# Unfortunately, pyboolnet currently has issues running properly
# in Python virtual environments. To (hopefully) fix this, you can
# use the following script:
./env_pyboolnet_fix.sh
# If for whatever reason you wish to destroy the current environment
# and start over, you can use this script:
./env_cleanup.sh
```
> You may encounter warning messages if you don't have the `minizinc`
solver installed, or the version is incorrect. However, `minizinc` is not
actually used in any of the experiments, so these messages can be ignored.
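If `env_check.sh` reports a missing or incompatible solver, it can also help to check manually what is available on your `PATH`. A minimal sketch (compare the reported versions with those pinned in the `dependencies` folder):
```bash
# Print the versions of the solvers that the checked tools rely on.
clingo --version
gringo --version
clasp --version
minizinc --version  # Optional; only relevant for the warning mentioned above.
```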
## (Step 1, Optional) Running individual benchmarks
To reproduce the main claims of the paper, we provide easy-to-use bulk
benchmark scripts. However, in some instances it can be useful to run
a particular benchmark in isolation. To do this, each tool has a separate
Python script in the `bench` directory.
These scripts always take:
- The number of repetitions that should be performed (integer > 0).
- The number of results that should be enumerated (integer >= 0; 0 means "all results").
- The path to the tested `.bnet` model file.
The script always prints the following results as a tab-separated
row (i.e. the results from multiple scripts can be joined into a single `.tsv` file):
- The path to the tested `.bnet` model file.
- The number of detected trap spaces, or an error message.
- The average runtime across all runs.
- The runtimes for individual runs (for variance analysis).
> Keep in mind that to correctly run the benchmark, you need to use the virtual
environment, either by using `./env/bin/python3`, or by running
`source ./env/bin/activate` first.
> Also note that some of the tools print error messages to `stdout`. As such, the
result should be checked for extra output before interpreting
it as a tab-separated table.
For example:
```bash
# Find the first 10 trap spaces, repeated 5 times.
python3 ./bench/min-tsconj.py 5 10 ./models/bbm/001.bnet
# Expected result:
# ./models/bbm/001.bnet 10 0.26280691400097567 0.2044996259992331 0.18533345000105328 0.19172101200092584 0.18470791399886366 0.20581378320021032
```
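Since each invocation prints a single row, the output of several runs can be collected into one `.tsv` file directly from the shell. A minimal sketch (the second model path and the output file name below are only placeholders):
```bash
# Run the same benchmark for two models and collect the rows into one file.
for model in ./models/bbm/001.bnet ./models/bbm/002.bnet; do
    python3 ./bench/min-tsconj.py 5 10 "$model"
done > min-tsconj-bbm.tsv

# Because some tools print error messages to stdout, flag any line that does
# not have at least the path, trap space count, and average runtime fields.
awk -F'\t' 'NF < 3' min-tsconj-bbm.tsv
```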
## (Step 2, Optional) Testing a single tool/dataset
If you want to test a single tool, you can use the `bulk_bench.sh` script.
You can find examples of its use in the `all_*.sh` scripts. In particular, the script accepts:
- A path to a directory with `.bnet` models that should be tested (typically a sub-directory of `models`).
- The number of repetitions that should be performed (integer > 0).
- The number of results that should be enumerated (integer >= 0; 0 means "all results").
- A timeout string compatible with the UNIX `timeout` utility (e.g. `10s` or `5h`).
- A path to a benchmark Python script (e.g. `./bench/min-tsconj.py`)
The `bulk_bench.sh` script runs the given Python script for all models in the specified directory
and applies the specified timeout. Note that the timeout is applied cumulatively across all
repetitions of the same run.
The output result is a `.tsv` file consisting of all rows printed for individual models by
the benchmark Python script.
> The `bulk_bench.sh` script should always be executed from the root directory of this artefact.
Examples:
```bash
# Run two repetitions of the trappist benchmark for all BBM models, yielding at most 10 trap
# spaces with a 10s timeout.
./bulk_bench.sh ./models/real-world/bbm 2 10 10s bench/bench-trappist.py
```
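To compare several tools on the same dataset under identical settings, you can simply call `bulk_bench.sh` once per benchmark script. A sketch (the exact set of script names available in `bench` may differ):
```bash
# Run the same dataset, repetitions, result count, and timeout for two tools.
for script in ./bench/min-tsconj.py ./bench/bench-trappist.py; do
    ./bulk_bench.sh ./models/real-world/bbm 2 10 10s "$script"
done
```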
## (Step 3, Optional) Runtime variance assessment
As the next step, we can test the run-to-run variance of each tool. In our tested configuration,
we carefully control for environmental variables and hence generally observe very
low run-to-run variance. However, on a noisier system (such as a laptop running multiple tasks),
the variance will likely be higher. In such a case, it is necessary to increase the number
of repetitions in each experiment during the subsequent computations.
To run the variance tests, execute the following:
```bash
./variance.sh
```
This should create a folder `results/variance` where each tool is tested 10 times on every model
in the BBM dataset. The results then contain the standard deviation of each set of runs, as well
as the standard deviation expressed as a percentage of the average runtime (i.e. `10%` means that
the standard deviation is 10% of the average runtime).
You should inspect these numbers and check that the variance is not too high, especially on
longer running benchmarks. In our case, the variance is always less than 0.1s.
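If you want to spot-check the raw numbers yourself (e.g. for a benchmark `.tsv` produced in Step 1 or Step 2), a minimal sketch, assuming the column layout described in Step 1 (field 3 is the average runtime, fields 4 and up are the individual runtimes; `my_results.tsv` is a placeholder):
```bash
# Compute the standard deviation of the individual runtimes per row and
# express it as a percentage of their average. Rows with fewer than two
# runtime samples (e.g. error rows or stray tool output) are skipped.
awk 'NF > 4 {
  n = NF - 3; mean = 0; var = 0
  for (i = 4; i <= NF; i++) mean += $i
  mean /= n
  for (i = 4; i <= NF; i++) var += ($i - mean) ^ 2
  sd = sqrt(var / n)
  printf "%s\tsd=%.4fs (%.1f%% of mean)\n", $1, sd, (100 * sd) / mean
}' my_results.tsv
```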
## (Step 4) Running benchmarks in bulk
> If you don't want to run this step, or you are only able to run it partially, you can
use the pre-computed results in `results-expected.zip`.
To run benchmarks for a specific group of models, we provide the `all_*.sh` scripts. The names
should be self-explanatory: `min` scripts compute minimal trap spaces while `max` scripts compute
maximal trap spaces. A `parallel` script will reduce computation time by running up to 4 experiments
concurrently.
> The "sequential" scripts can take a lot of time to complete due to the number of models and the enforced
timeouts. As such, running the experiments sequentially can take roughly a week. We recommend using the
parallel configuration when maximal precision is not required. In such case, the script needs
2-3 days to finish. If you wish to further reduce the runtime, we recommend diasabling some of
the tools in the script (e.g. `pyboolnet`). In particular, comparing only `tsconj` and `mpbn` can be
done in less than a day.
Before starting the scripts, you should specify a memory limit (in kB) appropriate for your system
(this can also influence the number of benchmarks you will be able to complete):
```bash
# We use ~64GiB as our memory limit.
export MEMORY_LIMIT=67108864
# Now you can run the benchmarks (note the expected running times discussed above).
./all_min.sh
./all_max.sh
```
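For example, if your machine only has 16GB of RAM (see the warning at the top of this document), you can use a correspondingly smaller limit; keep in mind that some of the larger experiments will then fail and be filtered out:
```bash
# ~16GiB memory limit, in the same units as the 64GiB example above.
export MEMORY_LIMIT=16777216
```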
## (Step 5) Building figures and tables
> This step assumes you have the `results/min` and `results/max` folders fully populated.
If that is not the case, extract `results-expected.zip` and rename the folder to `results`
(a sketch is shown below). Alternatively, you can also try to disable parts of the analysis for data that you do not have.
Note that the provided results use slightly different model paths in some instances
because we later renamed the datasets. However, the actual models have not changed.
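A minimal sketch of switching to the pre-computed results (the name of the extracted folder may differ from the `results-expected` assumed here):
```bash
# Use the pre-computed results instead of running Step 4.
unzip results-expected.zip
mv results-expected results   # Rename the extracted folder to `results`.
```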
The LaTeX source files for the figures and tables presented in the technical supplement are given
in the `figures` folder.
### (Step 5.1) Speed-up figures
First, we have to calculate the actual speedup:
```bash
./speedup.sh
```
This script uses `speedup.py` to analyse the `results` folder. It should properly account
for any error messages in the tool output. It also caps the speedup at `64x` so that these
results are displayed properly in the figure.
Subsequently, we can actually build the corresponding figures:
```bash
cd ./figures
pdflatex fig-speedup-min.tex
pdflatex fig-speedup-max.tex
```
### (Step 5.2) Cumulative figures
The process for computing the "cumulative" figures is very similar:
```bash
./aggregate.sh
cd ./figures
pdflatex fig-cumulative-min-random.tex
pdflatex fig-cumulative-min-vlbn.tex
pdflatex fig-cumulative-max-random.tex
pdflatex fig-cumulative-max-vlbn.tex
```
### (Step 5.3) Summary tables
To build the summary tables, execute the following commands:
```bash
cd ./tables
python3 mk_min_table.py > min_table.tex
python3 mk_max_table.py > max_table.tex
pdflatex min_table.tex
pdflatex max_table.tex
```
> Captions and colours were added to the tables manually.
## Files
- `artefact.zip` (40.9 MB, md5:816549379ed2aa233df2ecc982317dfc)