Absynthe: Abstract Interpretation-Guided Synthesis


Anonymous Author(s)



This is the readme for the artifact for our PLDI 2023 submission, Absynthe: Abstract Interpretation-Guided Synthesis (draft).

A better formatted version of the readme is available in the index.html file.

Getting Started

Our artifact uses Docker. Instructions to install Docker can be found here.

After Docker is installed, run the following command to build and enter the Docker container used for artifact evaluation. It will download the Docker image, install all system dependencies, and drop you inside the artifact image environment.

bash setup_docker.sh

Running Docker sometimes requires sudo. If the above command fails, try running it again with sudo.

If you face errors like below, ensure that you have the Docker Desktop app (or the Docker daemon) running in the background:

Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

Once inside the Docker container’s shell, you should be in the /root/absynthe directory (assumed to be the working directory from now on). Run the following command to install all the Ruby dependencies required by Absynthe.

bundle config --global silence_root_warning true
bundle install

Feel free to look around and run some of the evaluation. When you are done, type exit or press Ctrl-D to leave the container. The next time you want to resume where you left off, run:

docker start -ia absynthe-artifact

Starting over: If the Docker container is broken and you want to destroy it and start over, run the following commands, and then follow the instructions from the previous section.

docker rm absynthe-artifact
docker rmi absynthe-artifact-img

Phase 1: Basic Testing

Once you are in, you can run some basic benchmarks to ensure everything works. We have a small benchmark suite that runs much faster and includes one problem from each of our benchmark groups.

cd /root/absynthe/scripts
python3 run_sygus_benchmarks.py --smallbench      # Runs a subset of the SyGuS benchmarks
python3 run_autopandas_benchmarks.py --smallbench # Runs a subset of the AutoPandas benchmarks

Each of these scripts runs 3 benchmarks from the full suite a few times and generates two CSV files, table1.csv and table2.csv. These CSV files correspond to Tables 1 and 2 of the paper, respectively, but with only 3 benchmarks from each suite. If both files are populated, you have set up the artifact correctly!
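As a quick sanity check, a short script along the following lines (our own sketch, not part of the artifact) can confirm that the two CSV files were generated and contain data rows; run it from the scripts directory where the CSVs are written.

```python
import csv
import os

def data_rows(path):
    """Return the number of data rows in a CSV file (excluding the header),
    or -1 if the file does not exist."""
    if not os.path.exists(path):
        return -1
    with open(path, newline="") as f:
        rows = list(csv.reader(f))
    return max(len(rows) - 1, 0)

for table in ("table1.csv", "table2.csv"):
    n = data_rows(table)
    if n > 0:
        print(f"{table}: {n} benchmark rows -- setup looks correct")
    else:
        print(f"{table}: missing or empty -- re-check the setup steps above")
```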

You may see some fatal: not a git repository (or any parent up to mount point /root) error messages while the benchmarks run. These do not interfere with the evaluation results and can be silenced by initializing a Git repository in the absynthe directory. We removed the .git directory to hide commit author information for the double-blind review of the artifact.

File structure in artifact image

Under /root/absynthe:

Full Evaluation: Reproducing the tables in the paper

Table 1

Producing Table 1 requires the benchmarks to be run 11 times. We provide a Python script that runs the benchmarks multiple times and gathers the data into a CSV file, table1.csv. To run it:

cd /root/absynthe/scripts/
python3 run_sygus_benchmarks.py # Runs SyGuS strings benchmarks

Note that running these benchmarks 11 times will take a long time, and it is expected that certain benchmarks will time out (as in the paper). If 11 runs are too time consuming, you can pass an additional flag -t N to the above script, where N is the number of times the benchmarks should be run. Running the script with -h lists these options.

The benchmark suite is actually run N + 2 times, where N is the number passed with the -t flag. The extra 2 runs gather numbers with template inference disabled and with caching of small expressions disabled, respectively.
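In other words, the run count works out as in this trivial sketch:

```python
def total_runs(n):
    """Total suite executions for `-t N`: n full-featured runs, plus one run
    with template inference disabled and one with small-expression caching
    disabled."""
    return n + 2

# Passing -t 5 therefore executes the suite 7 times in total.
print(total_runs(5))
```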

It is common to see benchmarks that passed before fail when run in other configurations. The run_sygus_benchmarks script executes each benchmark with all features of Absynthe enabled (N times), with template inference disabled (1 time), and with caching of small expressions disabled (1 time). It is expected that some benchmarks will time out under some of these configurations.

table1.csv will mirror the numbers from Table 1. Benchmark times may differ because the paper's numbers were gathered on a different machine. We fixed some bugs in the tool during the paper review period, so the number of programs tested (Tested Progs column) may also differ slightly. The CSV file is also available on your local machine under absynthe/scripts/ where you extracted the artifact zip file. This supports the claims made in Table 1 of the paper.
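To eyeball the results against the paper, something like the following sketch pulls out one column per benchmark. Only the Tested Progs column name comes from the paper's table; the assumption that the first CSV column holds the benchmark name is ours, made for illustration.

```python
import csv

def column_by_benchmark(path, column="Tested Progs"):
    """Collect (benchmark, value) pairs for one column of a results CSV.
    Assumes the first CSV column holds the benchmark name (an assumption
    for illustration, not documented by the artifact)."""
    pairs = []
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            name = next(iter(row.values()))  # first column, assumed name
            pairs.append((name, row.get(column)))
    return pairs
```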

Table 2

Producing Table 2 requires the benchmarks to be run 11 times. We provide a Python script that runs the benchmarks multiple times and gathers the data into a CSV file, table2.csv. To run it:

cd /root/absynthe/scripts/
python3 run_autopandas_benchmarks.py # Runs AutoPandas benchmarks

Note that running these benchmarks 11 times will take a long time, and it is expected that certain benchmarks will time out (as in the paper). If 11 runs are too time consuming, you can pass an additional flag -t N to the above script, where N is the number of times the benchmarks should be run. Running the script with -h lists these options.

table2.csv will mirror the numbers from Table 2. Benchmark times may differ because the paper's numbers were gathered on a different machine. We fixed some bugs in the tool during the paper review period, so the number of programs tested (Tested Progs column) may also differ slightly. The CSV file is also available on your local machine under absynthe/scripts/ where you extracted the artifact zip file. This supports the claims made in Table 2 of the paper.

Note: The Depth column of the produced table2.csv file is part of the input to the benchmark suite. This can be verified in the file /root/absynthe/autopandas/benchmarks.py: for each benchmark, the length of the first array in self.seqs is its depth. The AP Neural and AP Baseline columns are sourced from the AutoPandas paper.
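The relationship can be illustrated with a small stand-in (the Bench class below is hypothetical; the real benchmark objects live in benchmarks.py):

```python
class Bench:
    """Hypothetical stand-in for a benchmark entry in autopandas/benchmarks.py."""
    def __init__(self, seqs):
        self.seqs = seqs

def depth(bench):
    # Per the readme: the length of the first array in self.seqs
    # is the depth of the benchmark.
    return len(bench.seqs[0])

# A benchmark whose first sequence has two steps has depth 2.
b = Bench(seqs=[["merge", "pivot"], ["merge"]])
print(depth(b))
```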

Benchmark timeouts: In our experience, due to differences when running inside a Docker container, certain benchmarks are flaky (due to Z3) or take longer than expected to complete. These benchmarks can be run reliably by installing Absynthe on the host system (discussed in the following section).

Flaky benchmarks: SyGuS benchmarks that use the String Length domain (phone-6,7,8) can sometimes time out or crash due to Z3 inside Docker. This often surfaces as the error no implicit conversion of nil into String. To check whether a benchmark uses the string length domain, look up the corresponding benchmark in the sygus_bench.rb file.

Deterministic timeout: In our experience, the AutoPandas benchmark SO_13261691 times out in Docker, but runs to completion much faster on the host system.

Install Absynthe from source outside of artifact environment

Absynthe can also be installed on a local system outside the artifact environment.

This will produce a working setup for Absynthe on a local system. All commands to reproduce Absynthe results should work.

Play with Absynthe

You can experiment with the implementation of Absynthe; the purposes of the key modules are described in the file structure section above.

You can add or modify a SyGuS benchmark by adding a new .sl file to the /root/absynthe/sygus-strings folder, using the existing benchmarks as examples. You may also have to add or update the benchmark's abstract specification in the /root/absynthe/test/sygus_bench.rb file. To run the benchmark suite, use the command: bundle exec rake bench

Similarly, for the AutoPandas benchmark suite you can add or modify a benchmark in the /root/absynthe/autopandas/benchmarks.py file. If you added a new benchmark, you will also need to add it to the benches list in /root/absynthe/autopandas/harness.py. To run the AutoPandas benchmark suite with Absynthe, run python3 harness.py in the /root/absynthe/autopandas directory.

To use Absynthe on a different benchmark suite, you will need to define your own abstract domain and semantics using the Absynthe framework and build a synthesizer from its API. Sample definitions of abstract domains and semantics are given in the /root/absynthe/lib/python and /root/absynthe/lib/sygus folders. The file /root/absynthe/bin/autopandas defines a self-contained synthesizer using the Absynthe API functions. Lines 10-61 contain the type signatures of the Pandas API methods, and the lines after contain the definition of the synthesizer. This file can be adapted to target a new domain. We plan to explore a better API design for building synthesis tools on top of the core Absynthe framework in the future.

Contact

We welcome suggestions and comments for further improving the artifact. Feel free to contact us on HotCRP.