Artifact for Practical Type-Based Taint Checking and Inference (ICSE2025)

anonymous, anonymous

doi:10.5281/zenodo.13176879

Published August 2, 2024 | Version v2

# Artifact for "Practical Type-Based Taint Checking and Inference" (ICSE 2024)

This README.md file describes the artifact for "Practical Type-Based Taint Checking and Inference". The artifact includes the Java implementation of our tool, the benchmarks used in the paper, and the scripts to run our checker and inference on those benchmarks.

The artifact is available on Zenodo. To get started, follow the instructions below:

## Container Structure

The Docker image contains:

* TaintTyper, located at `/var/scripts/TaintTyper` (anonymized)

* Inference (Annotator), located at `/var/scripts/annotator`

* The benchmarks used in the paper's evaluation, located at `/var/benchmarks/original`

## Setup

1. Install Docker based on your system configuration: [Get Docker](https://docs.docker.com/get-docker/).

2. Import the artifact into Docker from the downloaded archive: `docker load -i artifact-icse-2024.tar`

3. Run the Docker image (allocate at least 16 GB of RAM to the container): `docker run --name tainttyper tainttyper-icse-2024 &`

4. Open a shell in the running container: `docker exec -it tainttyper bash`

This Docker container offers a means to reproduce the data reported in the paper.
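
Before importing the archive in step 2, it can be worth checking that the download is intact. The following is a minimal sketch, assuming GNU `md5sum` is available on the host; the helper name `verify_md5` is hypothetical, and the expected checksum should be copied from the Zenodo file listing.

```shell
# Hypothetical helper: compare a file's MD5 digest with an expected value.
# Usage: verify_md5 <file> <expected-md5-hex>
verify_md5() {
  file="$1"
  expected="$2"
  # md5sum prints "<digest>  <filename>"; keep only the digest.
  actual="$(md5sum "$file" | cut -d' ' -f1)"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual)" >&2
    return 1
  fi
}

# Example (checksum as shown in the Zenodo file listing):
#   verify_md5 artifact-icse-2024.tar 1ce46bb4f7eae015b46f276858270887
```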

## Expected time of execution

All experiments were conducted on an Ubuntu system equipped with 64 GB of RAM and a 13th Generation Intel Core i7 processor with 16 cores. The expected times to generate the complete table are as follows:

1. Running `TaintTyper` on all benchmarks takes approximately 4 minutes.

2. Running inference on all benchmarks takes approximately 12 hours.

## Benchmarks

All benchmarks used in the paper can be found at `/var/benchmarks/original`. To compile a benchmark and run `TaintTyper` on it, run the existing `compile.sh` script in the benchmark's root directory.

Example for `struts`:

```shell
cd /var/benchmarks/original/struts
./compile.sh
```

The script compiles the benchmark and runs `TaintTyper` on it. Note that the compilation is expected to fail for all benchmarks, with errors reported by `TaintTyper`. An example of an error reported by `TaintTyper` is shown below:

```log
[ERROR] ../struts/.../DataSourceHelper.java:[36,106] error: [argument] incompatible argument for parameter url of DriverManager.getConnection.
[ERROR] found : @RTainted String
[ERROR] required: @RUntainted String, index: X
```
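
To get an overview of a long build log, the reports can be grouped by the bracketed check name that follows `error:` (for example `[argument]`). The sketch below is only illustrative: the helper name `tally_taint_errors` is hypothetical, and it assumes every report has a header line in the format shown above.

```shell
# Hypothetical helper: tally reported errors by check name, e.g. "[argument]".
# Reads a build log on stdin and prints "<count> <check>" lines,
# most frequent first.
tally_taint_errors() {
  grep -o 'error: \[[^]]*\]' \
    | sed 's/^error: //' \
    | sort | uniq -c | sort -rn \
    | awk '{ print $1, $2 }'
}

# Example, assuming compile.sh writes the errors to stdout or stderr:
#   ./compile.sh 2>&1 | tally_taint_errors
```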

To run inference on a benchmark, run `annotator.py` from the benchmark's `analysis` directory.

Example for `struts`:

```shell
cd /var/benchmarks/original/struts/analysis
python3 annotator.py
```

The script will run the inference with all features and optimizations enabled. The inference process adds the inferred annotations directly to the source code, allowing you to observe the reduction in the final number of warnings by rerunning `compile.sh`.
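
One way to quantify that reduction is to save the build output before and after inference and compare the number of reports. This is a rough sketch under stated assumptions: the helper names and log file paths are hypothetical, and each report is assumed to have exactly one header line containing `error: [`.

```shell
# Hypothetical helper: count error header lines in a saved build log.
count_taint_errors() {
  # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
  # keeps that case from aborting scripts that run under "set -e".
  grep -c 'error: \[' "$1" || true
}

# Hypothetical helper: print the counts for two logs and their difference.
report_reduction() {
  before="$(count_taint_errors "$1")"
  after="$(count_taint_errors "$2")"
  echo "before=$before after=$after removed=$((before - after))"
}

# Example session inside the container (log file names are arbitrary):
#   cd /var/benchmarks/original/struts
#   ./compile.sh > /tmp/before.log 2>&1
#   (cd analysis && python3 annotator.py)
#   ./compile.sh > /tmp/after.log 2>&1
#   report_reduction /tmp/before.log /tmp/after.log
```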

Because this process can take a long time, the final inference output for each benchmark is stored at `/var/benchmarks/annotated/`. For instance, the annotated version of the `struts` benchmark is at `/var/benchmarks/annotated/struts`. Running inference on the original version produces exactly the results stored in the `annotated` directory.
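
To inspect what inference changed without rerunning it, the original and annotated trees can be compared directly. The sketch below is an illustration only: the helper name `added_annotations` is hypothetical, and it assumes the inferred annotations include `@RUntainted`, the annotation named in the error message above.

```shell
# Hypothetical helper: list lines that exist only in the annotated tree and
# mention a given annotation (default: @RUntainted).
# Usage: added_annotations <original-dir> <annotated-dir> [annotation]
added_annotations() {
  original="$1"
  annotated="$2"
  annotation="${3:-@RUntainted}"
  # diff exits non-zero when the trees differ, which is expected here.
  # '^+[^+]' keeps added lines and skips the "+++ <file>" headers.
  diff -ru "$original" "$annotated" | grep '^+[^+]' | grep -F -- "$annotation"
}

# Example inside the container:
#   added_annotations /var/benchmarks/original/struts /var/benchmarks/annotated/struts | wc -l
```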

We also manually annotated the benchmarks used in the paper without using inference. The manually annotated versions are available at `/var/benchmarks/manual-annotated/`.

Example for `struts`:

```
/var/benchmarks/manual-annotated/struts/
```

## Files

* `artifact-icse-2024.tar` (6.9 GB), md5: `1ce46bb4f7eae015b46f276858270887`

* `README.md` (3.5 kB), md5: `0166bfc0d6bdfb7b084d63fb8e7e93d8`
