Abstract

This repository contains the artifact for the iFM'24 paper "Proving Termination via Measure Transfer in Equivalence Checking" by Dragana Milovančević, Carsten Fuhs, Mario Bucev and Viktor Kunčak.

Getting Started Guide

Docker Image

Install Docker from https://docs.docker.com

Load the docker image:

docker load --input artifact2641.tar.gz

Run the docker image in interactive mode, setting the maximum JVM heap size to 8 GB:

docker run -it -e JAVA_OPTS='-Xmx8g' artifact2641

Move to the artifact directory:

cd artifact

Artifact Information

The docker image contains the following content in the directory ~/artifact/:

  1. benchmarks/

    This directory contains all the benchmarks we used to evaluate our system. The benchmarks are organized as follows:

    • tab1/ contains benchmarks from Table 1.
    • tab2/ contains benchmarks from Table 2.
  2. stainless/

    This directory contains the implementation:

    • transfer/ contains the implementation of measure transfer on top of the equivalence checking in Stainless.

    • inference/ contains the version without measure transfer.

  3. tables/

    This directory contains the material for the tables from the paper.

  4. makefile

    This file defines a set of tasks to reproduce the experiments from the paper. The Step-by-Step Instructions below explain how to run them.

Example Run

To run the equivalence checker with measure transfer, use the --equivchk and --equivchk-transfer options of Stainless, together with --models and --comparefuns. The option --models specifies the names of reference functions. The option --comparefuns specifies the names of candidate functions.

For example, from the directory ~/artifact/, the following command runs equivalence checking for the FiniteStreams programs, stored in benchmarks/tab1/FiniteStreams.scala:

./stainless/transfer/stainless  benchmarks/tab1/FiniteStreams.scala --timeout=10 --solvers=smt-z3 --equivchk=true --equivchk-transfer=true --models=finiteM --comparefuns=finite --silent-verification --no-colors

For our example run, we get the following output (followed by a Stainless summary table):

Printing equivalence checking results:
List of functions that are equivalent to model FiniteStream.finiteM: FiniteStream.finite
List of erroneous functions:
List of timed-out functions (safety):
List of timed-out functions (equivalence):
List of wrong functions:
Printing the final state:
Path for the function FiniteStream.finite: FiniteStream.finiteM

Stainless successfully transfers the decreases annotation and proves the equivalence of finite and finiteM.
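To check this result in a script rather than by eye, the report can be filtered for the line that lists the functions equivalent to a given model. The following is a minimal sketch, not part of the artifact: the helper name equivalent_to_model is hypothetical, and the sample report is copied from the example run above.

```shell
# Hypothetical helper (not part of the artifact): read an equivalence checking
# report on standard input and print the functions listed as equivalent to the
# model named in $1.
equivalent_to_model() {
  awk -v prefix="List of functions that are equivalent to model $1: " \
    'index($0, prefix) == 1 { print substr($0, length(prefix) + 1) }'
}

# Sample report, copied from the example run above.
report='Printing equivalence checking results:
List of functions that are equivalent to model FiniteStream.finiteM: FiniteStream.finite
List of erroneous functions:
List of timed-out functions (safety):
List of timed-out functions (equivalence):
List of wrong functions:
Printing the final state:
Path for the function FiniteStream.finite: FiniteStream.finiteM'

printf '%s\n' "$report" | equivalent_to_model FiniteStream.finiteM
```

Assuming Stainless writes the report to standard output, the same filter could be applied by piping the example command into equivalent_to_model FiniteStream.finiteM.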

Step-by-Step Instructions

This section contains step-by-step instructions for reproducing the results from the paper.

We set a 10s timeout for Z3 solver queries, as in the Example Run.

Table 1.

Our benchmarks are available in benchmarks/tab1/.

The following command computes the total size of programs in number of lines of code (Column LOC), the number of functions in reference programs (Column F), the number of measure (decreases) annotations in reference programs (Column D), and prints the outcome of our experiment (Columns I, IT, T, TT):

make print-tab1

The initial state of this artifact (upon loading the docker image) contains the final state of the experiment. To reproduce the experiment, use the following command to run the equivalence checking (Warning: The execution should take around two hours on a standard laptop):

make run-tab1

The output is logged in tables/tab1/*.log files. Running make print-tab1 prints the summary of the new results.

The times (Columns IT, TT) are computed as the average of 10 runs and might differ slightly from the results in Table 1.
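As an illustration of the averaging step, the sketch below computes the mean of 10 values. The helper name average is hypothetical, and the numbers are placeholders chosen for the example, not measurements from the paper. How the artifact itself computes the averages is not shown here.

```shell
# Hypothetical helper (not part of the artifact): print the arithmetic mean of
# all whitespace-separated numbers read from standard input.
average() {
  awk '{ for (i = 1; i <= NF; i++) { sum += $i; n++ } }
       END { if (n > 0) printf "%.2f\n", sum / n }'
}

# Placeholder timings in seconds for 10 runs (illustrative values only).
echo "3.5 4.5 4.0 4.0 3.5 4.5 4.0 4.0 4.0 4.0" | average
```

Because each reported time is a mean over several runs, a fresh run on different hardware will generally produce values close to, but not identical to, those in the table.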

Table 2.

Our benchmarks are available in benchmarks/tab2/. Reference solutions are available in benchmarks/tab2/refs/.

The following command computes the average size of programs in number of lines of code (Column LOC), the average number of function definitions (Column F), the average number of measure (decreases) annotations per reference program (Column D), the number of reference programs per benchmark (Column R), the number of submissions (Column S), and prints the outcome of our experiment (Column T):

make print-tab2

The initial state of this artifact (upon loading the docker image) contains the final state of the experiment. To reproduce the experiment, use the following command to run the equivalence checking (Warning: The execution should take a few hours on a standard laptop):

make run-tab2

The output is logged in tables/tab2/*.log files. Running make print-tab2 prints the summary of the new results.
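For a quick look at the raw logs before printing the summary, a small loop over the log directory can help. This is a sketch under one assumption: that the log files contain the same report format as the Example Run. The helper name summarize_logs is hypothetical and not part of the artifact.

```shell
# Hypothetical helper (not part of the artifact): for each .log file in the
# directory $1, print the file name followed by any "equivalent to model" lines.
# Assumption: logs use the same report format as the Example Run.
summarize_logs() {
  for log in "$1"/*.log; do
    [ -e "$log" ] || continue   # skip when the glob matches nothing
    printf '%s\n' "$(basename "$log")"
    grep '^List of functions that are equivalent to model ' "$log" | sed 's/^/  /'
  done
}

summarize_logs tables/tab2
```

The same call with tables/tab1 as the argument would cover the Table 1 logs.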

Further Documentation

The source code is available on GitHub: https://github.com/epfl-lara/stainless. The implementation of measure transfer in equivalence checking is in core/src/main/scala/stainless/equivchk/EquivalenceChecker.scala.

For further instructions on how to build Stainless from source, please refer to the installation guide: https://epfl-lara.github.io/stainless/installation.html