Absynthe: Abstract Interpretation-Guided Synthesis
Anonymous Author(s)
This is the readme for the artifact for our PLDI 2023 submission, Absynthe: Abstract Interpretation-Guided Synthesis (draft).
A better formatted version of the readme is available in the
index.html file.
Getting Started
Our artifact uses Docker. Instructions to install Docker can be found here.
After Docker is installed, run the following command to build and enter the Docker container that will be used for artifact evaluation. This will download the docker image and install all system dependencies, and drop you inside the artifact image environment.
bash setup_docker.sh
Sometimes running Docker requires sudo. If the above
command fails, try running it again with sudo.
If you face errors like below, ensure that you have the Docker Desktop app (or the Docker daemon) running in the background:
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
Once inside the Docker container’s shell, you should be in the
/root/absynthe directory (assumed to be the working
directory from now on). Run the following command to install all the
Ruby dependencies required by Absynthe.
bundle config --global silence_root_warning true
bundle install
You can look around and do some evaluation; when you are done,
just exit or press Ctrl-D to quit. The next time you want to
pick up where you left off, just run:
docker start -ia absynthe-artifact
Starting over: In case the Docker container is broken and you want to destroy your Docker container and start over again run the following command, and then follow instructions from the previous section.
docker rm absynthe-artifact
docker rmi absynthe-artifact-img
Phase 1: Basic Testing
Once you are in, you can run some basic benchmarks to ensure everything works. We have a small benchmark suite that runs much faster and includes one problem from each of our benchmark groups.
cd /absynthe/scripts
python3 run_sygus_benchmarks.py --smallbench # Runs a subset of the SyGuS benchmarks
python3 run_autopandas_benchmarks.py --smallbench # Runs a subset of the AutoPandas benchmarks
Each of these scripts will run 3 benchmarks from the full suite a few
times and generate two CSV files, table1.csv and
table2.csv. These CSV files correspond to Tables 1 and 2 of
the paper respectively, but with only 3 benchmarks from each suite. If
these two files are populated, you have set up the artifact
correctly!
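If you want to sanity-check the outputs programmatically, a small script like the following can report whether each CSV is populated. This is only a sketch: the file names table1.csv and table2.csv come from the steps above, but the "header plus at least one data row" heuristic is our assumption.

```python
import csv
import os

def csv_populated(path):
    """Return True if the CSV file exists and has a header plus at least one data row."""
    if not os.path.exists(path):
        return False
    with open(path, newline="") as f:
        rows = [row for row in csv.reader(f) if row]
    return len(rows) >= 2

# Check the two result files produced by the small benchmark runs.
for name in ("table1.csv", "table2.csv"):
    status = "OK" if csv_populated(name) else "missing or empty"
    print(f"{name}: {status}")
```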
You may see some
fatal: not a git repository (or any parent up to mount point /root)
error messages while the benchmarks run. These do not interfere with
the evaluation results and can be fixed by creating a Git repository in
the absynthe directory (e.g. by running git init there). We removed the .git
directory to hide commit author information for the double-blind
review of the artifact.
File structure in artifact image
Under /root/absynthe:
- autopandas/: The AutoPandas benchmarks
  - benchmarks.py: The benchmarks from the AutoPandas paper (source)
  - harness.py: The benchmark runner harness
  - protocol.py: The inter-process communication protocol between the benchmark harness (Python) and Absynthe (Ruby)
  - runner.py: Infers abstractions from the input/output examples of the AutoPandas benchmarks
- bin/autopandas: Sample implementation of the Absynthe framework for the AutoPandas framework
- lib/absynthe/: Main directory containing the Absynthe implementation:
  - passes/: All visitor passes, including ones that walk over AST holes and fill in candidates (expand_hole.rb and py_expand_hole.rb), calculate program size (prog_size.rb and py_prog_size.rb), etc.
  - python/: Python-specific abstract domain and semantics definitions
    - abstract-interpreters/: pytype_interpreter.rb and pandacol_interpreter.rb are definitions of the abstract interpreters for the Python type checker and Pandas columns, respectively.
    - domains/: pytype.rb and pandas_cols.rb are definitions of the Python Types and Pandas Columns abstract domains. product.rb is a product domain for the Python abstractions.
  - sygus/: SyGuS-specific abstract domain and semantics definitions
    - abstract-interpreters/: length_ext_interpter.rb, prefix_interpter.rb, and suffix_interpter.rb are definitions of the abstract interpreters for the solver-aided string length, string prefix, and string suffix domains, respectively.
    - domains/: string_length_extended.rb, string_prefix.rb, and string_suffix.rb are definitions of the solver-aided string length, string prefix, and string suffix abstract domains. product.rb is a product domain combining these individual domains.
  - cache.rb: Cache for small expressions that enables reuse of terms during synthesis.
  - domain.rb: Base class and common methods (including the ⊆ relation) for the abstract domains.
  - instrument.rb: Instrumentation used to capture the benchmark statistics presented in the tables.
  - python.rb: Python-specific library imports
  - sygus.rb: SyGuS-specific library imports
  - synthesizer.rb: The core worklist algorithm that explores candidates during synthesis and prunes the search space based on abstract interpretation. It also calls the methods that run inference on the abstract holes.
  - template_infer.rb: Infers the template for a candidate by testing certain examples.
- scripts/*.py: Automation to run the benchmark suites multiple times and prepare the final tables in the paper.
- sygus-strings/: The SyGuS strings benchmark suite.
- test/: A few unit tests for Absynthe and the SyGuS synthesizer built from the Absynthe framework (test_helper.rb).
Full Evaluation: Reproducing the tables in the paper
Table 1
Producing Table 1 requires the benchmarks to be run 11 times. We
provide a Python script that will run the benchmark multiple times and
gather the data as a CSV file table1.csv. To run it:
cd /absynthe/scripts/
python3 run_sygus_benchmarks.py # Runs SyGuS strings benchmarks
Note that running these benchmarks 11 times will take a long time,
and it is expected that certain benchmarks will time out (as in the
paper). If running the benchmarks 11 times is too time consuming, you can
pass an additional flag -t N to the above script, where N is
the number of times the benchmarks should be run. Running the script
with -h will list these options.
The number of times the benchmark suite will be run is
N + 2, where N is the number passed with the
-t flag. The extra 2 runs are one to gather numbers without
template inference and one without caching of small
expressions.
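The run count can be spelled out as a trivial arithmetic sketch (illustration only, not code from the artifact):

```python
def total_runs(n):
    """Total benchmark-suite runs for `-t N`: N full-featured runs,
    plus one run without template inference and one without expression caching."""
    return n + 2

print(total_runs(3))  # passing -t 3 yields 5 runs in total
```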
It is common to see benchmarks that passed before fail when they are
run in other configurations. The run_sygus_benchmarks
script executes programs with all features of Absynthe enabled (N times),
with template inference disabled (1 time), and with caching of small expressions
disabled (1 time). It is expected that some benchmarks will time out
in some of these configurations.
table1.csv will mirror the numbers from Table 1. There
might be differences in the times for some benchmarks because the paper's numbers were gathered on
different machines. We have fixed some bugs in the tool during the paper
review period, so there may be some differences in the number of
programs tested (the Tested Progs column). This CSV file can be
found on your local machine under absynthe/scripts/
where you extracted the artifact zip file. This supports the claims made
in Table 1 of the paper.
Table 2
Producing Table 2 requires the benchmarks to be run 11 times. We
provide a Python script that will run the benchmark multiple times and
gather the data as a CSV file table2.csv. To run it:
cd /absynthe/scripts/
python3 run_autopandas_benchmarks.py # Runs AutoPandas benchmarks
Note that running these benchmarks 11 times will take a long time,
and it is expected that certain benchmarks will time out (as in the
paper). If running the benchmarks 11 times is too time consuming, you can
pass an additional flag -t N to the above script, where N is
the number of times the benchmarks should be run. Running the script
with -h will list these options.
table2.csv will mirror the numbers from Table 2. There
might be differences in the times for some benchmarks because the paper's numbers were gathered on
different machines. We have fixed some bugs in the tool during the paper
review period, so there may be some differences in the number of
programs tested (the Tested Progs column). This CSV file can be
found on your local machine under absynthe/scripts/
where you extracted the artifact zip file. This supports the claims made
in Table 2 of the paper.
Note: The Depth column of the produced
table2.csv file is part of the input of the benchmark
suite. This can be verified in the file
/root/absynthe/autopandas/benchmarks.py, where the length
of the first array in self.seqs is the depth of each
benchmark. The AP Neural and AP Baseline
columns are sourced from the AutoPandas
paper.
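As an illustration, the relationship can be sketched like this. The Benchmark class below is hypothetical; the real benchmarks.py defines self.seqs on its own benchmark classes.

```python
class Benchmark:
    """Hypothetical sketch of an AutoPandas benchmark entry."""
    def __init__(self, seqs):
        # The length of the first array in self.seqs is the search depth
        # reported in the Depth column of table2.csv.
        self.seqs = seqs

    @property
    def depth(self):
        return len(self.seqs[0])

b = Benchmark(seqs=[[0, 1, 2]])
print(b.depth)  # 3
```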
Benchmark timeouts: In our experience, due to differences in the environment when running inside a Docker container, certain benchmarks are flaky (due to Z3) or take longer than expected to complete. However, these benchmarks can be run reliably by installing Absynthe on the host system (discussed in the following section).
Flaky benchmarks: SyGuS benchmarks that use the
String Length domain (phone-6,7,8) can sometimes time out or crash due to
Z3 inside Docker. This will often result in the error
no implicit conversion of nil into String. You can
check whether a benchmark uses the string length domain by looking up
the corresponding benchmark in the sygus_bench.rb
file.
Deterministic timeout: In our experience, the AutoPandas benchmark SO_13261691 times out in Docker, but runs to completion much faster on the host system.
Install Absynthe from source outside of artifact environment
Absynthe can also be installed on a local system outside the artifact environment.
- Install rbenv using the rbenv-installer script here.
- Once rbenv is set up, install Ruby 3.1.2: rbenv install 3.1.2
- Make Ruby 3.1.2 the global Ruby: rbenv global 3.1.2
- Install bundler: gem install bundler -v 2.3.22
- Enter the provided absynthe directory and install all dependencies: bundle install
- Install Python dependencies for the AutoPandas benchmarks: pip3 install numpy scipy matplotlib plumbum pandas pygments
This will produce a working setup for Absynthe on a local system. All commands to reproduce Absynthe results should work.
Play with Absynthe
You can play with the implementation of Absynthe; the purposes of the key modules are given in the file structure section above.
You can add or modify a SyGuS benchmark by adding a new
.sl file to the /root/absynthe/sygus-strings
folder. You can use the existing benchmarks as examples to help you write
your own. You may have to add or update the benchmark's abstract
specification in the /root/absynthe/test/sygus_bench.rb file.
To run the benchmark suite, use the command:
bundle exec rake bench
Similarly, for the AutoPandas benchmark suite you can add or modify a
benchmark in the /root/absynthe/autopandas/benchmarks.py
file. If you added a new benchmark, you'll need to add it to the
benches list in
/root/absynthe/autopandas/harness.py. To run the AutoPandas
benchmark suite using Absynthe, run
python3 harness.py in the /root/absynthe/autopandas
directory.
To use Absynthe on a different benchmark suite, you'll need to define
your own abstract domain and semantics using the Absynthe framework and
build a synthesizer from its API. Sample definitions of abstract
domains and semantics are given in the /root/absynthe/lib/absynthe/python
and /root/absynthe/lib/absynthe/sygus folders. The file
/root/absynthe/bin/autopandas defines a self-contained
synthesizer using the Absynthe API functions. Lines 10-61 contain the
type signatures of the Pandas API methods, and the lines after contain
the definition of the synthesizer. This file can be adapted to target
a new domain. We plan to explore a better API design for
enabling easier development of synthesis tools with the core Absynthe
framework in the future.
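To give a flavor of what such a definition involves, the sketch below shows a string-prefix abstract domain with an ordering and a join, similar in spirit to the artifact's string_prefix.rb. This is conceptual Python pseudocode, not Absynthe's actual Ruby API (which domain.rb defines).

```python
class Prefix:
    """Abstract value: the set of all strings starting with a known prefix.
    The empty prefix "" represents top (no information)."""
    def __init__(self, s):
        self.s = s

    def leq(self, other):
        # self ⊆ other iff every string described by self is also described
        # by other, i.e. other's prefix is a prefix of self's prefix.
        return self.s.startswith(other.s)

    def join(self, other):
        # Least upper bound: the longest common prefix of the two values.
        common = []
        for a, b in zip(self.s, other.s):
            if a != b:
                break
            common.append(a)
        return Prefix("".join(common))

joined = Prefix("foobar").join(Prefix("foobaz"))
print(joined.s)                         # fooba
print(Prefix("foo").leq(Prefix("f")))   # True: "foo*" refines "f*"
```

A real domain for Absynthe would be written in Ruby against the base class in lib/absynthe/domain.rb, and paired with an abstract interpreter giving semantics to each grammar production.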
Contact
We would like to receive suggestions/comments about further improving the artifact. Feel free to contact us on HotCRP.