Welcome to the artifact of the paper 'Probabilistic Program Verification via Inductive Synthesis of Inductive Invariants'. In what follows, we describe how to reproduce the results for our tool "cegispro2" as described in our paper. 

ARTIFACT LINK: -
ADDITIONAL REQUIREMENTS: -
EXPERIMENT RUNTIME: 1.5h for replicating most of the results (on 2,3 GHz Dual-Core Intel Core i5, using --maxto 400, see below).
REPRODUCIBILITY INSTRUCTIONS:

1. Unpack the cegispro2ae folder.
2. Open a terminal and cd into cegispro2ae.
3. For installing the tool, run 
chmod +x install.sh
source ./install.sh

This may take a few minutes since all the required packages are being installed. For some packages we get an error, which is not important for running our tool and the benchmarks. You might have to re-run the above two instructions when restarting the VM or the terminal in order to set the environment variables (PYTHONPATH).

4. We are now ready to execute the cegispro2 benchmarks. We have set up a script for reproducing the results for cegispro2 provided in the paper. Assuming you are in the root directory of the artifact, cd into 

cegispro2/cegispro2/benchmarks/scripts/TACAS23/AEC


Running

python3 aec.py

will execute *all* cegispro2 benchmarks from the paper with their respective timeouts. Since this can take quite some time, you can provide one of the following options:

--run storm
--run exist
--run absynth
--run soundlower

for running only a specific set of benchmarks (run python3 aec.py --help for more details). By default, we run only those benchmarks which do *not* time out. This can be disabled by setting --noselected. Whenever a set of benchmarks terminates, the script creates a .csv file in the AEC directory containing the results (storm.csv for Figure 4.a (and Table 2), past.csv for Figure 4.b (and Table 3), exist_sub.csv for EXIST benchmarks from Figure 4.c (and Table 4), and exist_sound.csv for Figure 4.d sound bounds (also see column 'cegissound' in file scatter_soundlower.csv in the root directory of the artifact, which is the .csv file containing the data for that scatter plot.)

The benchmarks exist, absynth, and soundlower should take at most a couple of minutes. The storm benchmarks take much more time (we remark that those benchmarks were run on a cluster, so execution times might vary). You can specify an upper bound on the timeout used for each benchmark by setting 
--maxto S
where S is the new timeout in seconds (run python3 aec.py --help for more details). On a 2,3 GHz Dual-Core Intel Core i5, using --maxto 400 suffices for most of the benchmarks. Moreover, when 
--run storm
 is set, you can execute a *single* specific benchmark from Table 2 by setting, e.g.,
--specific 2.1
for running the benchmark from row 2, column 1 (indices start at 0. Run python3 aec.py --help for more details.) This can be particularly useful if you want to check specific benchmarks that take a lot of time such as benchmark 11.1 or 22.2.
You might have to scroll up a bit to see details on the run since the tool prints the computed inductive invariant, which can be huge. 


Programs from Table 2 are located in cegispro2/cegispro2/benchmarks/TACAS23_storm
Programs from Table 3 are located in cegispro2/cegispro2/benchmarks/TACAS23_ABSYNTH
Programs from Table 4 and Figure 4.d are located in cegispro2/cegispro2/benchmarks/TACAS23_EXIST




-------------- General Usage of cegispro2  ------------------

With the correct environment variables set (might require executing install.sh, see above), we can use cegispro2 by

python3 -m cegispro2.cmd PROGRAM --post POST --prop PROP --invarianttype TYPE --templaterefiner TEMP

where
- PROGRAM is a path to a pgcl program parsed by the tool Probably (https://philipp15b.github.io/probably/) (please confer the storm, absynth, and exist benchmarks given above for examples)
- POST is a postexpectation in Disjoint Normal Form (run python3 -m cegispro2.cmd --help for more info)
- PROP is a property (specifies the bound that is to be proven, run python3 -m cegispro2.cmd --help for more info)
- TYPE is either super, sub, or past 
- templaterefiner is either inductivity, fixed, or variable (cf. Appendix D and run python3 -m cegispro2.cmd --help for more info)

For sub-invariants, one can optionally set --cdb for ensuring conditional difference boundedness (cf. Theorem 4).
For employing a cooperative verifier (cf. Appendix C) with d=2, one can set --distance 2 (this is done for the storm benchmarks from Table 2).

Let us consider an example. Assuming we are in the root directory of the artifact, we run

python3 -m cegispro2.cmd cegispro2/cegispro2/benchmarks/TACAS23_EXIST/Geo01_0.pgcl --post "z" --prop "z+1" --invarianttype super --templaterefiner inductivity

to compute a super-invariant for proving wp[C](z) <= z+1. If we substitute "super" by "past", then pre and post will be ignored and we compute a certificate for proving UPAST of C (cf. Proposition 4).
 



-------------- External Tools  ------------------

This artifact does not contain the external tools, storm, Absynth, and EXIST. We do, however, include all the benchmark files used for our evaluation.
Benchmark files for storm: cegispro2/cegispro2/benchmarks/TACAS23_storm
Benchmark files for Absynth: cegispro2/cegispro2/benchmarks/TACAS23_ABSYNTH/programs_absynth
Benchmark files for EXIST: cegispro2/cegispro2/benchmarks/TACAS23_EXIST/programs_exist

Moreover, we indicate the sources of all external tools:
Storm: https://github.com/moves-rwth/storm
   Parameters for running each benchmarks are found in cegispro2/experiments_tacas23/jobs/jobs_storm.txt
Absynth: https://channgo2203.github.io/research/
Exist: https://github.com/JialuJialu/Exist

 

