Published January 12, 2024 | Version v2

Towards Reliable SQL Synthesis: Fuzzing-Based Evaluation and Disambiguation

  • 1. Instituto de Engenharia de Sistemas e Computadores Investigação e Desenvolvimento
  • 2. Instituto Superior Técnico
  • 3. OutSystems
  • 4. Carnegie Mellon University

Description

CUBES is a program synthesizer for SQL based on input-output examples.
Given an input-output example, specified in a YAML file, it finds a query that generates the desired output from the input database.
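The benchmark YAML files under tests-examples are the authoritative reference for the spec format. As an illustrative sketch only (the key names and paths below are assumptions for illustration, not the documented schema):

```yaml
# Hypothetical input-output example; check tests-examples/*.yaml for the real schema.
inputs:                      # CSV tables forming the input database
  - tables/employees.csv
output: tables/expected.csv  # the table the synthesized query must produce
```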

Description of included files

  • analysis: directory where log files will be stored, as well as the code used to analyze the log files and create all figures in the paper. This directory already contains the log files produced by our experiments
  • helper-scripts: useful scripts for running several benchmarks or extracting information from log files
  • squares: source code for the tool
  • tests-examples: directory that contains all benchmarks
  • tests: directory that is used to specify which benchmarks should be run when using the helper scripts. Sub-directories from tests-examples can either be symlinked to here, or copied over.
  • sequential.py, cubes.py: files to run the corresponding configurations.
  • miniconda.sh, conda.txt: setup files containing all required dependencies

Virtual Machine Specific Notes

  • The password for the default user is “fase2024”
  • The VM was tested with VirtualBox 7.0

Installation (skip if using the VM artifact)

Install and setup dependencies:

source setup.sh

The following commands should be executed every time a new shell is opened after the initial setup:

eval "$(/home/tacas23/anaconda/bin/conda shell.bash hook)"
conda activate cubes

Installation testing

The script 10_mins.sh can be used to test if everything is working correctly.

It should produce the following CSV files containing results for 3 instances each:

  • analysis/data/t01.csv
  • analysis/data/t02.csv
  • analysis/data/t03.csv
  • analysis/fuzzy/t01.csv
  • analysis/fuzzy/t02.csv
  • analysis/fuzzy/t03.csv
  • analysis/fuzzy/t03_dis.csv
  • analysis/fuzzy/t03_dis_fuzz.csv
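To confirm the test run succeeded, one can check for the expected files, for example with the small convenience snippet below (not part of the artifact; run it from the repository root):

```python
# Report which of the expected result CSVs were produced by 10_mins.sh.
from pathlib import Path

expected = [
    "analysis/data/t01.csv", "analysis/data/t02.csv", "analysis/data/t03.csv",
    "analysis/fuzzy/t01.csv", "analysis/fuzzy/t02.csv", "analysis/fuzzy/t03.csv",
    "analysis/fuzzy/t03_dis.csv", "analysis/fuzzy/t03_dis_fuzz.csv",
]
missing = [f for f in expected if not Path(f).is_file()]
for f in expected:
    print(("MISSING" if f in missing else "OK     "), f)
```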

Note that it is normal for warnings/errors to appear during fuzzing and disambiguation.
These messages are about the semantics of the queries generated by the synthesizer, not about the fuzzing/disambiguation tool itself.

Running

Preset Scripts

Under 10 minutes

Script: 10_mins.sh
Instances: 3 small instances
Experiments:
  1. CUBES-SEQ (1 thread), 60-second timeout, 8 GB memout, includes fuzzy evaluation
  2. CUBES-DC4 (4 threads), 60-second timeout, 8 GB memout, includes fuzzy evaluation
  3. CUBES-DC4 (4 threads), 60-second timeout, 8 GB memout, all solutions under 60 seconds, includes fuzzy evaluation and disambiguation

Under 1 day

Script: 1_day.sh
Instances: 50 randomly selected instances (listed at the beginning of the script)
Experiments:
  4. CUBES-SEQ (1 thread), 10-minute timeout, 8 GB memout, includes fuzzy evaluation
  5. CUBES-DC4 (4 threads), 10-minute timeout, 8 GB memout, all solutions under 10 minutes, includes fuzzy evaluation and disambiguation

Full results

Script: full.sh
Instances: all
Experiments:
  6. CUBES-SEQ (1 thread), 10-minute timeout, 56 GB memout, includes fuzzy evaluation
      Same as sequential_5 in the provided log files.
  7. CUBES-SEQ (1 thread), 10-minute timeout, 56 GB memout, all solutions under 10 minutes, includes fuzzy evaluation and disambiguation
      Same as sequential_6_all600 in the provided log files.
  8. CUBES-DC4 (4 threads), 10-minute timeout, 56 GB memout, includes fuzzy evaluation
      Same as c62_4_full in the provided log files.
  9. CUBES-DC8 (8 threads), 10-minute timeout, 56 GB memout, includes fuzzy evaluation
      Same as c62_8_full in the provided log files.
  10. CUBES-DC16 (16 threads), 10-minute timeout, 56 GB memout, includes fuzzy evaluation
      Same as c62_16_full in the provided log files.
  11. CUBES-DC16 (16 threads), 10-minute timeout, 56 GB memout, all solutions under 10 minutes, includes fuzzy evaluation and disambiguation
      Same as c62_16_all600_full in the provided log files.

Running a single benchmark

You can use sequential.py or cubes.py to run the sequential or the parallel version of CUBES, respectively:

python sequential.py tests-examples/55-tests/1.yaml
python cubes.py tests-examples/55-tests/1.yaml

This will output the synthesized query for instance 55-tests/1.yaml, if such a query is found.
You can use -h or --help with either script to see the available configuration options.

Important parameters

By default, cubes.py uses as many processes as the number of available logical processors. This can be changed using the -j option. For example, if you wish to use 8 threads, you would use -j 8.

By default, cubes.py and sequential.py will terminate after finding the first solution. If you wish to use the disambiguator, you should specify how long to search for solutions before terminating using --under=TIME_IN_SECONDS.

Note that, unless some features are disabled, cubes.py should not be run with fewer than 4 threads.

  • Recommended settings for 3 threads: --no-split-complex-joins
  • Recommended settings for 2 threads: --no-split-complex-joins --probing-threads=1
  • Recommended settings for 1 thread: --no-split-complex-joins --probing-threads=0

Running all benchmarks present in the tests directory

Remember to symlink or copy the instances you want to test from tests-examples to tests.
You can use the benchmark script to evaluate the sequential or the parallel version of CUBES, respectively:

python helper-scripts/benchmark.py RUN_IDENTIFIER
python helper-scripts/benchmark.py --cubes RUN_IDENTIFIER

After running benchmark.py, a new directory will be created under analysis/data/RUN_IDENTIFIER containing the log files, as well as a file analysis/data/RUN_IDENTIFIER.csv containing the results.
You should replace RUN_IDENTIFIER with a string to identify the experiment.
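The per-run results CSV can then be inspected with any CSV tooling. The exact columns are defined by benchmark.py, so the column names in the sketch below (`name`, `status`, `time`) are placeholders, not the real schema:

```python
import csv
import io

# Placeholder data in the rough shape of analysis/data/RUN_IDENTIFIER.csv;
# the real column names come from benchmark.py and may differ.
mock_csv = io.StringIO(
    "name,status,time\n"
    "55-tests/1.yaml,solved,12.3\n"
    "55-tests/2.yaml,timeout,600.0\n"
)

rows = list(csv.DictReader(mock_csv))
solved = [r["name"] for r in rows if r["status"] == "solved"]
print(solved)  # ['55-tests/1.yaml']
```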

Important parameters

  • -t specifies the time limit for each benchmark in seconds (600 seconds by default)
  • -m specifies the memory limit for each benchmark in MB (57344 MB, i.e. 56 GB, by default)
  • -p specifies how many instances to test in parallel (note that this is unrelated to the -j argument of cubes.py)
  • other arguments are redirected to sequential.py / cubes.py

Running the disambiguator

Currently, disambiguation is implemented as a post-processing step after CUBES has generated all solutions under the desired time limit, and it requires the benchmark.py script to have been used.

To perform disambiguation you can use the following command, where RUN_IDENTIFIER should be a previously executed run:

 

The results can be found under analysis/fuzzy/RUN_IDENTIFIER_dis.csv as well as analysis/fuzzy/RUN_IDENTIFIER_dis.log.
You can use --help to see all configuration options available in the disambiguation script.

Running the fuzzy evaluation

Like the disambiguation script, fuzzy evaluation requires instances to have been run using the benchmark.py script introduced above.

To perform fuzzy evaluation you can use the following command, where RUN_IDENTIFIER should be a previously executed run:

PYTHONPATH=. python ./helper-scripts/fuzzy_check.py --run=RUN_IDENTIFIER

The results can be found under analysis/fuzzy/RUN_IDENTIFIER.csv as well as analysis/fuzzy/RUN_IDENTIFIER.log.

If you want to perform fuzzy evaluation over the results of a previous disambiguation you can use:

PYTHONPATH=. python ./helper-scripts/fuzzy_check.py --from-dis --run=RUN_IDENTIFIER

In this case, the results can be found under analysis/fuzzy/RUN_IDENTIFIER_dis_fuzz.csv as well as analysis/fuzzy/RUN_IDENTIFIER_dis_fuzz.log.
You can use --help to see other configuration options available in the fuzzy evaluation script.

Files

8.1 GB (md5:2d4d5ac241f5317432301d58de55e342)