PUnits is a pluggable type system for expressive units-of-measurement types, together with a precise, whole-program inference approach for these types. PUnits can be used in three modes: (1) modularly check the unit correctness of a program, (2) ensure a possible unit typing exists, (3) annotate a program with units. This image was created for the OOPSLA 2020 artifact evaluation.
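To make the kind of bug PUnits targets concrete, here is a minimal, hypothetical Java sketch; the class and method names are illustrative, and the unit qualifiers appear only in comments, since this plain Java compiles without the checker on the classpath:

```java
public class FallTime {
    // Under PUnits, the parameter would carry a seconds qualifier and the
    // return a meters qualifier; here the qualifiers are only comments.
    static double distanceFallen(double seconds) {
        double g = 9.81;                     // meters per second squared
        return 0.5 * g * seconds * seconds;  // meters
    }

    public static void main(String[] args) {
        double d = distanceFallen(2.0);
        System.out.println(d); // 19.62
        // A call like distanceFallen(d) would pass meters where seconds are
        // expected; plain Java accepts it, a units type system rejects it.
    }
}
```

Type-check mode reports such mismatches modularly; inference mode determines whether any unit typing exists; annotation mode writes an inferred typing back into the source.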
Pull the Docker image and start the container:
docker run -d -it txiang61/punits-artifact:oopsla2020 /bin/bash
docker exec -it <container id> /bin/bash
The PUnits project is located in /home/opprop/units-inference.
The benchmark scripts are located in /home/opprop/units-inference/benchmark
The Coq proof is located in /home/opprop/units-inference/coq-proof
Go to /home/opprop/units-inference. The project should already be built, but to be sure, pull the latest changes and compile:
git pull && ./rebuildCFCFI.sh && ./gradlew assemble
Run the test script to ensure PUnits is working properly.
./mini-tests.sh
All individual tests should pass and the overall build should finish successfully.
To check the Coq proofs, run:
cd /home/opprop/units-inference/coq-proof && ./compile.sh
This script should compile without any warnings or errors. More details on the PUnits Coq proofs are in /home/opprop/units-inference/coq-proof/README.md in the image.
Please note that depending on the hardware of your machine, these tests may take from a few minutes to more than an hour (annotation mode is the slowest). Enter the benchmark folder /home/opprop/units-inference/benchmark.
To run the type-checking benchmarks, use:
./run-benchmark-typecheck.sh <path to YAML file without the .yml extension e.g. paper-typecheck>
This script runs PUnits in type-check mode on all of the projects listed in the .yml file from a corpus. We recommend running paper-typecheck.yml, as this file contains all projects mentioned in the paper (except for Daikon, which has special build instructions). This script may take 5 to 15 minutes to run, depending on the machine. It creates a folder with the same name as the YAML file under the benchmark directory, containing all the projects run in type-check mode.
The expected result of running paper-typecheck is
Failed projects are: ['react', 'jReactPhysics3D', 'GasFlow', 'imgscalr', 'jblas', 'exp4j', 'JLargeArrays']
These errors are expected for unannotated projects.
To run the units type-check on Daikon:
./run-daikon-typecheck.sh <master/unit-error/error-fixed>
This script may take 10 to 30 minutes to run, depending on the machine. It type-checks three different versions of the Daikon project. The master version is unannotated and will issue type errors, as expected for unannotated code. The unit-error version is fully annotated with the unit bug inserted and should issue one type error. The error-fixed version is fully annotated with the bug fixed. You can ignore the output
warning: Did not find stub file /javadoc.astub on classpath or within directory / or at checker.jar/javadoc.astub
error: warnings found and -Werror specified
as these are not type errors produced by PUnits.
We claimed in the paper that PUnits is able to detect the inserted bug (Section 5.2, Daikon paragraph). The artifact supports this claim.
To inspect the overview results of type-checking:
../experiment-tools/gen-typecheck-table.sh <path to result folder e.g. paper-typecheck>
The table lists 4 kinds of type errors found in the projects. These errors are expected for unannotated projects: flows were detected that propagate units from annotated methods to defaulted @Dimensionless method parameters or returns.
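This error pattern can be sketched with a minimal, hypothetical example (the class and method names are illustrative, and the qualifiers appear only in comments, since plain Java compiles without the checker):

```java
public class FlowExample {
    // In an annotated codebase, PUnits would type this return as carrying
    // a unit, e.g. meters.
    static double altitudeMeters() {
        return 1200.0; // meters
    }

    // Unannotated helper: PUnits defaults this parameter to @Dimensionless.
    static double scale(double x) {
        return x * 2.0;
    }

    public static void main(String[] args) {
        // A unit-carrying value flowing into a defaulted @Dimensionless
        // parameter is the kind of flow the type-check table reports.
        System.out.println(scale(altitudeMeters()));
    }
}
```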
To inspect the detailed report for each project:
cat <path to result folder>/<project name>/logs/infer.log
We recommend inspecting the detailed report for project GasFlow. We claim in the paper that PUnits is able to detect unit-related errors where encapsulation-based units APIs like JScience have failed (Section 5.1.1). We also claim that PUnits enforces good coding practice by warning about type-unsafe heterogeneous methods (Section 5.1.3). The type errors issued by PUnits also match the GasFlow errors mentioned in the paper (Section 5.2, GasFlow paragraph). The artifact supports these claims.
We claim that PUnits is able to type-check all eight projects and that the errors issued are expected, as the projects are unannotated (Section 5.2, first paragraph). The artifact supports this claim.
To run whole-program-inference without annotation mode:
./run-benchmark-infer.sh <path to YAML file without the .yml extension e.g. paper-inference>
This script runs PUnits in whole-program-inference mode on all of the projects listed in the .yml file from a corpus. This script may take 5 to 30 minutes to run, depending on the machine. We recommend running paper-inference.yml, as this file contains the 6 projects mentioned in the paper that we run inference on.
The expected result of running paper-inference.yml is
Successful projects are: ['react', 'jReactPhysics3D', 'imgscalr']
Failed projects are: ['jblas', 'exp4j', 'JLargeArrays']
The failures/UNSATs are expected.
In the paper, jReactPhysics3D is evaluated to UNSAT instead of SAT. As stated in the paper (Section 5.2, jReactPhysics3D paragraph), this project evaluated to UNSAT because PUnits assumed the raw type Iterator is actually Iterator<@UnitsTop Object>. The project reached UNSAT in inference because of the flow of a value obtained from this iterator into a parameter that expects a @Dimensionless value. PUnits now bounds raw types by @Dimensionless, and thus the project reaches SAT.
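The raw-Iterator situation can be sketched as follows; the helper method is hypothetical, and the PUnits qualifiers appear only in comments:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class RawIteratorExample {
    // Unannotated parameter: defaults to @Dimensionless under PUnits.
    static int lengthOf(Object o) {
        return o.toString().length();
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        names.add("react");
        Iterator it = names.iterator(); // raw type: elements are just Object
        while (it.hasNext()) {
            // With the element bounded by @UnitsTop, this call was UNSAT
            // (top does not flow into @Dimensionless); bounding raw-type
            // elements by @Dimensionless instead lets inference reach SAT.
            System.out.println(lengthOf(it.next()));
        }
    }
}
```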
To run whole-program inference with annotation mode:
./run-benchmark-inference.sh true <path to YAML file without the .yml extension e.g. paper-annotation>
This will take more than an hour to run. This script runs PUnits in whole-program inference with annotation mode on all of the projects listed in the .yml file from a corpus. We recommend running paper-annotation.yml, as this file contains the 5 projects mentioned in the paper that we run annotation mode on.
The expected result of running paper-annotation is
----- Inference successfully inferred all 5 projects. -----
To inspect the overview results of whole program inference:
../experiment-tools/gen-inference-summary.sh <path to result folder e.g. paper-annotation>
../experiment-tools/gen-inference-table.sh <path to result folder e.g. paper-annotation>
This result supports the claims in Figures 14 and 15 of the paper. Please note that the number of variables and constraints generated may differ slightly from what is reported in the paper, as PUnits and its dependencies, the Checker Framework and Checker Framework Inference, have evolved. The final paper will use numbers consistent with the final artifact.
To inspect the detailed report for each project:
cat <path to result folder>/<project name>/logs/infer.log
To inspect the annotated source code for each successfully inferred project:
cat <path to result folder>/<project name>/annotated
To see the performance numbers run:
./run-GasFlow-performance.sh
This script invokes the main method in the GasFlow project for testing the time and memory performance difference between using encapsulation-based units APIs like JScience vs. PUnits. The first run uses the JScience library to enforce unit correctness. The second run uses PUnits to enforce unit correctness. This result supports the claim in Section 5.1.2 of the paper and that PUnits reaps the performance benefits of using primitive types instead of abstract data types for unit-wise consistent scientific computations.
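The performance difference comes from representation: encapsulation-based APIs wrap each value in an object that carries its unit at run time, while pluggable types are checked statically and erased, leaving plain primitives. A rough sketch of the two styles, using a hypothetical Quantity class (not JScience's actual API):

```java
public class UnitsPerf {
    // Encapsulation style: each value is wrapped in an object carrying its unit.
    static final class Quantity {
        final double value;
        final String unit;
        Quantity(double value, String unit) { this.value = value; this.unit = unit; }
        Quantity plus(Quantity other) {
            if (!unit.equals(other.unit)) throw new IllegalArgumentException("unit mismatch");
            return new Quantity(value + other.value, unit); // allocates on every operation
        }
    }

    // Pluggable-type style: units are checked at compile time and erased,
    // so the computation is plain primitive arithmetic with no allocation.
    static double sumMeters(double[] xs) {
        double total = 0.0;
        for (double x : xs) total += x;
        return total;
    }

    public static void main(String[] args) {
        Quantity boxed = new Quantity(3.0, "m").plus(new Quantity(4.0, "m"));
        System.out.println(boxed.value);                        // 7.0, via object allocation
        System.out.println(sumMeters(new double[] {3.0, 4.0})); // 7.0, primitives only
    }
}
```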
To get the OpenJDK compilation times for the projects:
./run-benchmark-compile.sh <path to result folder e.g. paper-typecheck>
If run with argument paper-typecheck, project imgscalr will fail to compile. This is expected: the project fails to compile due to its source and target versions, and it does not affect the PUnits evaluations.
Timing logs are created when running the benchmark scripts. compileTiming.log contains the OpenJDK compilation time, typecheckTiming.log contains the type-checking time, and inferTiming.log contains the inference or annotation time. To view the timing logs:
grep -r "Time taken" <path to result folder> | sort
You can compare the timing to the OpenJDK compilation times.
The performance overhead varies depending on the project and the machine used. The paper claims that PUnits' performance is adequate for use in a real-world software development environment (Section 5.3). Overall, the artifact supports this claim.
Go to the PUnits project folder /home/opprop/units-inference
Custom base units and aliases (optional step)
a. Check the currently supported base units with ./experiment-tools/get-num-baseunits.sh. See if any desired base units are missing.
b. src/units/qual contains all base units and unit aliases used. You can move unneeded base units and aliases to src/units/notusedqual.
c. Create new base units and new unit aliases. Look at existing files for reference.
Annotate JDK specifications and libraries (optional step)
a. All .astub files in /src/units are the annotated JDK and library specifications.
b. Create your own .astub files. Look at existing stub files for reference.
c. Add the files to the @StubFiles{} list in src/units/UnitChecker.java. You can comment out files that you won't need.
Build PUnits with ./gradlew assemble
Run PUnits
a. Run type-check mode on .java files:
/home/opprop/units-inference/script/run-units-typecheck.sh <java files>
b. Run type-check mode on a Java project: Go to the project folder and remember to clean the project first (to ensure everything is re-checked):
/home/opprop/units-inference/script/run-dljc-typecheck.sh "<build command>"
c. Run inference mode on .java files:
/home/opprop/units-inference/script/run-units-infer.sh false <java files>
d. Run inference mode on a Java project: Go to the project folder and remember to clean the project:
/home/opprop/units-inference/script/run-dljc-inference.sh false "<build command>"
e. Run annotation mode on .java files:
/home/opprop/units-inference/script/run-units-infer.sh true <java files>
f. Run annotation mode on a Java project: Go to the project folder and remember to clean the project:
/home/opprop/units-inference/script/run-dljc-inference.sh true "<build command>"
Please see the benchmark section above for details on how and why each claim is, or is not, supported.
In the paper, jReactPhysics3D is evaluated to UNSAT in inference mode. The artifact evaluates it to SAT because of changes to PUnits after the paper submission. The final paper will be consistent with the final artifact.