
Precise Data-Driven Approximation for Program Analysis


## Reproduction Package ##

This package is the reproduction kit for the program repair tool `PSP`. The package contains various components that help in running PSP for its various applications.


## File System Contents ##
The package includes the following directories and files:

- Rete - Contains the source code for running Rete augmented with PSP.
- Mythril - Contains Mythril augmented with PSP's search strategy and the pending-constraints search strategy (`--strategy pending`)
- myth_coverage_metrics - A custom plugin for Mythril to record instruction and branch discovery times.
- Smartbugs - Contains the Smartbugs dataset and some scripts to sample 3 sets of 52 contracts from Smartbugs-wild
- smart-contract-run-data - Contains Mythril's run data on the 3 sets of 52 smart contracts
- Smartbugs-wild - Contains the Smartbugs-wild dataset, used mainly for sampling those 3 sets of 52 contracts (view, accepts a seed; seeds 1, 2, and 3 were used to choose the data)
- Magma - Magma dataset
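
The seeded sampling mentioned above can be illustrated with a minimal sketch. It is not the script shipped in the Smartbugs directory: the function name `sample_contracts` and the contract file names are hypothetical, and only the set size (52) and the seeds (1, 2, 3) come from this README.

```python
import random
from typing import List, Sequence


def sample_contracts(contracts: Sequence[str], seed: int, k: int = 52) -> List[str]:
    """Draw k distinct contracts reproducibly for the given seed."""
    rng = random.Random(seed)  # private generator, global random state is untouched
    # Sort first so the result does not depend on the order of the input listing.
    return rng.sample(sorted(contracts), k)


# Hypothetical pool standing in for the list of Smartbugs-wild contracts.
pool = [f"contract_{i:05d}.sol" for i in range(1000)]

# One sample per seed, mirroring the three sets used in the experiments.
sets = {seed: sample_contracts(pool, seed) for seed in (1, 2, 3)}
```

Re-running with the same seed yields the same 52 contracts, which is what makes the three sets reproducible.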

## Instructions to Run each of the Components ##

Each directory in the package has its own file that provides instructions on how to run the components.


## Building the Container for Rete

To build the Docker container, follow the steps below:

1. Go to Rete directory
> cd Rete

2. Run the following command in your terminal:

> docker build . -t reproduction_package

3. To go inside the Docker container, run the following command:

> docker run -v $(pwd):/home/Trident/ --rm -ti reproduction_package /bin/bash


# Installing Feature Extractor

To install the feature extractor, follow the steps below:

1. Go to the infra directory in Rete:

> cd infra/

2. Build the Docker container in the infra directory:

> docker build -t rete/ubuntu-16.04-llvm-3.8.1 .

3. Change to the parent directory and build the Docker container:

> cd ..
> docker build -t rete-feature .

# Random Forest Features

The package also includes a feature extraction process that is based on random forest. This process involves counting the individual features for each variable, choosing the variables with the maximum count, and ignoring the rest. The variable identification is done based on the order of variables' features fed to the random forest. If the random forest outputs 7, it means that the 7th variable is the correct one.

Here is the list of the features used by the Random Forest of Rete:

1.  ***def:*** count of variable definitions
2.  ***use:*** count of variable uses
3.  ***for_init:*** count of 'for' loop initialisation uses
4.  ***for_cond:*** count of 'for' loop condition uses
5.  ***for_lcv:*** count of loop control uses
6.  ***while_cond:*** count of 'while' loop condition uses
7.  ***if_cond:*** count of 'if' condition uses
8.  ***hole_to_def:*** distance between hole and the def
10. ***last_use:*** distance to last use
11. ***hole_window:*** count of uses in k lines around hole
12. ***operator_histo:*** multiset of counts of operator/function uses
13. ***is_global:*** local/global variables
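
As a rough illustration of how per-variable features and the forest's output relate, the sketch below concatenates the feature counts of each candidate variable in a fixed order and maps a model output back to a variable. It is a simplified sketch and not Rete's implementation: the helper names, the feature subset, and the example variables are hypothetical, and the 1-based indexing is assumed from the "7th variable" description above.

```python
from typing import Dict, List

# Hypothetical subset of the per-variable count features listed above.
FEATURES = ["def", "use", "for_cond", "while_cond", "if_cond"]


def build_input(candidates: List[str], counts: Dict[str, Dict[str, int]]) -> List[int]:
    """Concatenate each candidate's feature counts, in candidate order."""
    row: List[int] = []
    for name in candidates:
        # Features that were never observed for a variable count as 0.
        row.extend(counts[name].get(feature, 0) for feature in FEATURES)
    return row


def variable_from_output(candidates: List[str], output: int) -> str:
    """Map a 1-based model output back to the variable at that position."""
    if not 1 <= output <= len(candidates):
        raise ValueError(f"output {output} does not index a candidate variable")
    return candidates[output - 1]


# Hypothetical candidates at a hole, with made-up counts.
candidates = ["i", "n", "buf"]
counts = {
    "i": {"def": 1, "use": 4, "for_cond": 1},
    "n": {"def": 1, "use": 2, "if_cond": 1},
    "buf": {"use": 3},
}
row = build_input(candidates, counts)
chosen = variable_from_output(candidates, 2)  # an output of 2 selects "n"
```

Because the output is only a position, the order in which variables are fed to the model must be the same when building the input and when decoding the prediction.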



# Running

Run the Docker container, mounting your current directory at "/tmp" and starting "/bin/bash":

> docker run -v $(pwd):/tmp --rm -it rete /bin/bash

You can skip the mount if you do not need the current directory. You can either compile inside the container after mounting, or directly run the build already present in the "/rete" folder.

> cd /tmp/rete

> cmake .. -DF1X_LLVM=/llvm-3.8.1

> make

> chmod u+x rete


To extract a JSON file of feature information:

> ./rete -output="<path>"

To extract intermediate CDU chain data:

> ./rete -get-chain-data -output="<path>"

# Python-Prophet

To extract Prophet's Features ().

> python3 rete-feature-extracter/learning/ extract-features <file_path> --output-file <output_file_path>

To extract Prophet's Feature vector () from Prophet's features ().

> python3 rete-feature-extracter/learning/ feature-vector --buggy buggy_file.pcl --correct correct_file.pcl --mod-kind <MOD_TYPE> --line-no <line_no>

# Rete-Trident

Build and run a container:

> docker build -t rtrident .
> docker run --rm -ti rtrident /bin/bash

Build runtime:

> cd runtime
> KLEE_INCLUDE_PATH=/klee/include make

Run examples:



# Synthesizer interface

The synthesizer supports the following functionality:

- Verifying a given patch
- Generating a patch
- Generating all patches

As specification, the synthesizer uses KLEE path conditions generated by Trident runtime, conjoined with test assertions.

To verify a given patch, run the following command:

> python3.6 /path/to/ --tests <ASSERTION_SMT_FILE>:<KLEE_OUTPUT_DIR> ... \
--components <COMPONENT_SMT_FILE> ... \
--verify <LOCATION ID>:<PATCH_FILE> ... \
--templates <Template_Path> \
--depth <int> \
--model <model> \
--theta <int>

If the template path is specified, the synthesizer uses Rete's plastic-surgery-based synthesis with the templates extracted from the codebase. If no template path is given, the synthesizer defaults to naive enumeration.

The names of some files are important: the names of component files are component IDs, and the names of assertion files are test IDs.
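
The sketch below shows one way to assemble the verification command line and to derive IDs from file names. It is only an illustration: `verify_command` and `file_id` are hypothetical helpers, the script path is passed in as a parameter because it is not spelled out here, the example file names are made up, and treating the file name without its extension as the ID is an assumption.

```python
from pathlib import Path
from typing import List, Optional, Sequence, Tuple


def file_id(path: str) -> str:
    """Assumed convention: the ID is the file name without its extension."""
    return Path(path).stem


def verify_command(
    synth_script: str,
    tests: Sequence[Tuple[str, str]],    # (assertion SMT file, KLEE output dir)
    components: Sequence[str],           # component SMT files
    patches: Sequence[Tuple[str, str]],  # (location ID, patch file)
    templates: Optional[str] = None,     # omit to fall back to naive enumeration
) -> List[str]:
    """Build the argument list for verifying patches, using the flags shown above."""
    cmd = ["python3.6", synth_script, "--tests"]
    cmd += [f"{assertion}:{klee_dir}" for assertion, klee_dir in tests]
    cmd += ["--components", *components]
    cmd += ["--verify", *[f"{location}:{patch}" for location, patch in patches]]
    if templates is not None:
        cmd += ["--templates", templates]
    return cmd


# Hypothetical file names, for illustration only.
cmd = verify_command(
    "synth.py",
    tests=[("tests/t1.smt2", "klee-out-0")],
    components=["components/addition.smt2"],
    patches=[("0", "patch.json")],
    templates="templates/",
)
```

The resulting list can be passed to `subprocess.run(cmd)`; under the assumed naming convention, `file_id("components/addition.smt2")` gives `"addition"` as the component ID.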

A patch file can be either an SMT file with patch semantics (same as components without holes), or a JSON file that describes a tree of components and a valuation of constants. Here is an example of such a JSON file:

## Running Mythril

For more information on running Mythril, see its public README.

Before running Mythril, install myth_coverage_metrics module.
>  cd myth_coverage_metrics
>  python3 install

Then install Mythril using the following commands:
> cd mythril
> python3 install

In case an error is thrown, you can use Mythril's Docker container instead, but you have to modify the Dockerfile to install myth_coverage_metrics into the container.


Running PSP on Mythril as per experiments at default settings:
> ../mythril/myth analyze {directory_path}/{filename} -o jsonv2 --execution-timeout 1800 --solver-timeout 25000

Running PSP on Mythril as per experiments with pending constraints strategy:
> ../mythril/myth analyze {directory_path}/{filename} -o jsonv2 --execution-timeout 1800 --solver-timeout 25000 --strategy pending

Running PSP on Mythril as per experiments with PSP's strategy:
> ../mythril/myth analyze {directory_path}/{filename} -o jsonv2 --execution-timeout 1800 --solver-timeout 25000 --strategy psp
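
To run the three configurations over many contracts, the commands above can be generated programmatically. The following is a minimal sketch, not a script from this package: `myth_command`, the `STRATEGIES` table, and the contract path are hypothetical, while the flags and timeout values are copied from the commands above.

```python
from typing import Dict, List, Optional

# Configuration name -> value passed to --strategy (None means Mythril's default).
STRATEGIES: Dict[str, Optional[str]] = {
    "default": None,
    "pending": "pending",
    "psp": "psp",
}


def myth_command(
    contract: str,
    strategy: Optional[str] = None,
    myth: str = "../mythril/myth",
) -> List[str]:
    """Build the `myth analyze` argument list used in the experiments."""
    cmd = [
        myth, "analyze", contract,
        "-o", "jsonv2",
        "--execution-timeout", "1800",
        "--solver-timeout", "25000",
    ]
    if strategy is not None:
        cmd += ["--strategy", strategy]
    return cmd


# Hypothetical contract path, for illustration only.
commands = {
    name: myth_command("contracts/example.sol", strategy)
    for name, strategy in STRATEGIES.items()
}
```

Each entry of `commands` can be handed to `subprocess.run`, for example with `stdout` redirected to a per-configuration JSON file.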


# Getting plots from the paper:

Go to smart-contract-run-data directory

> cd smart-contract-run-data

Run the to get the plot for bugs
> python3

Run the to get the plot for bugs
> python3

This directory (i.e. smart-contract-run-data) contains mythril's run data on the 3 samples for 3 configurations.

