This repository contains the artifact for the iFM'24 paper "Proving Termination via Measure Transfer in Equivalence Checking" by Dragana Milovančević, Carsten Fuhs, Mario Bucev and Viktor Kunčak.
Install Docker from https://docs.docker.com
Load the docker image:
docker load --input artifact2641.tar.gz
Run the docker image in interactive mode with an 8GB memory flag:
docker run -it -e JAVA_OPTS='-Xmx8g' artifact2641
Move to the artifact directory:
cd artifact
The docker image contains the following content in the directory ~/artifact/:
benchmarks/
This directory contains all the benchmarks we used to evaluate our system. The benchmarks are organized as follows:
tab1/
contains benchmarks from Table 1.
tab2/
contains benchmarks from Table 2.
stainless/
This directory contains the implementation:
transfer/
contains the implementation of measure transfer on top of the equivalence checking in Stainless.
inference/
contains the version without measure transfer.
tables/
This directory contains the material for the tables from the paper.
makefile
This file defines a set of tasks to reproduce the experiments from the paper. Step-by-Step Instructions below explain how to run them.
To run the equivalence checker with measure transfer, use the --equivchk and --comparefuns options of Stainless. The option --comparefuns specifies the names of candidate functions. The option --models specifies the names of reference functions.
For example, once in the directory ~/artifact/, the following command runs the equivalence checking for the FiniteStreams programs, stored in benchmarks/tab1/FiniteStreams.scala:
./stainless/transfer/stainless benchmarks/tab1/FiniteStreams.scala --timeout=10 --solvers=smt-z3 --equivchk=true --equivchk-transfer=true --models=finiteM --comparefuns=finite --silent-verification --no-colors
For our example run, we get the following output (followed by a Stainless summary table):
Printing equivalence checking results:
List of functions that are equivalent to model FiniteStream.finiteM: FiniteStream.finite
List of erroneous functions:
List of timed-out functions (safety):
List of timed-out functions (equivalence):
List of wrong functions:
Printing the final state:
Path for the function FiniteStream.finite: FiniteStream.finiteM
Stainless successfully transfers the decreases annotation and proves the equivalence of finite and finiteM.
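To check other benchmark files with the same settings, the example invocation can be wrapped in a small shell function. The following is a minimal sketch, not part of the artifact: the function name run_equivchk and the variables STAINLESS and DRY_RUN are hypothetical helpers introduced here, and only the flags from the example command are used.

```shell
#!/bin/sh
# Sketch: parameterize the example invocation by file, reference (model)
# function names and candidate function names.
# STAINLESS and DRY_RUN are hypothetical helper variables, not artifact options.
STAINLESS="${STAINLESS:-./stainless/transfer/stainless}"

run_equivchk() {
  file="$1"
  models="$2"
  candidates="$3"
  # Build the same command line as in the example run.
  set -- "$STAINLESS" "$file" --timeout=10 --solvers=smt-z3 \
    --equivchk=true --equivchk-transfer=true \
    "--models=$models" "--comparefuns=$candidates" \
    --silent-verification --no-colors
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$@"   # print the command instead of running it
  else
    "$@"        # run Stainless
  fi
}
```

After sourcing the snippet from ~/artifact/, calling run_equivchk benchmarks/tab1/FiniteStreams.scala finiteM finite should be equivalent to the example command; setting DRY_RUN=1 beforehand only prints the command line.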
This section contains step-by-step instructions for reproducing the results from the paper.
We set a 10s timeout for Z3 solver queries, like in the Example Run.
Our benchmarks are available in benchmarks/tab1/.
The following command computes the total size of programs in number of lines of code (Column LOC), the number of functions in reference programs (Column F), the number of measure (decreases) annotations in reference programs (Column D), and prints the outcome of our experiment (Columns I, IT, T, TT):
make print-tab1
The initial state of this artifact (upon loading the docker image) contains the final state of the experiment. To reproduce the experiment, use the following command to run the equivalence checking (Warning: The execution should take around two hours on a standard laptop):
make run-tab1
The output is logged in tables/tab1/*.log files.
Running make print-tab1 prints the summary of the new results.
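For a quick manual look at the raw logs, the equivalence results can also be extracted with standard tools. The sketch below assumes that the log files contain result lines in the same format as the example run ("List of functions that are equivalent to model ..."); this has not been checked against the actual logs, and the function name equivalent_functions is a hypothetical helper, not part of the artifact.

```shell
#!/bin/sh
# Sketch: extract "model: equivalent functions" pairs from Stainless output.
# Assumption: result lines look like those in the example run; any text
# before the phrase (such as a log prefix) is ignored.
equivalent_functions() {
  # Reads Stainless output on stdin and prints one line per matching result.
  sed -n 's/.*List of functions that are equivalent to model \([^:]*\): *\(.*\)$/\1: \2/p'
}
```

For instance, cat tables/tab1/*.log | equivalent_functions would list, for each reference function, the candidates reported as equivalent, provided the assumption about the log format holds.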
The running times (Columns IT, TT) are computed as the average of 10 runs and might differ slightly from the results in Table 1.
Our benchmarks are available in benchmarks/tab2/.
Reference solutions are available in benchmarks/tab2/refs/.
The following command computes the average size of programs in number of lines of code (Column LOC), the average number of function definitions (Column F), the average number of measure (decreases) annotations per reference program (Column D), the number of reference programs per benchmark (Column R), the number of submissions (Column S), and prints the outcome of our experiment (Column T):
make print-tab2
The initial state of this artifact (upon loading the docker image) contains the final state of the experiment. To reproduce the experiment, use the following command to run the equivalence checking (Warning: The execution should take a few hours on a standard laptop):
make run-tab2
The output is logged in tables/tab2/*.log files.
Running make print-tab2 prints the summary of the new results.
The source code is available on GitHub: https://github.com/epfl-lara/stainless. The implementation of measure transfer in equivalence checking is in core/src/main/scala/stainless/equivchk/EquivalenceChecker.scala.
For further instructions on how to build Stainless from source, please refer to the installation guide: https://epfl-lara.github.io/stainless/installation.html