Published August 23, 2021 | Version v1
Software Open

PFPSanitizer - A Parallel Shadow Execution Tool for Debugging Numerical Errors

  • 1. Rutgers University

Description

We present the artifact for the paper "Parallel Shadow Execution to Accelerate the Debugging of Numerical Errors", accepted at FSE 2021. The artifact provides a link to the source code, step-by-step instructions to reproduce the performance graphs and the case study from the paper, and a test harness to evaluate the correctness of our tool. It also includes the scripts required to run each part of the experiment automatically.


The experiments are divided into three parts: the first part covers correctness evaluation, the second part covers
the case study, and the third part covers performance evaluation. The complete performance evaluation can take more than
6 hours to run the benchmarks.

* HARDWARE & SOFTWARE RECOMMENDATION:

    We recommend evaluating our artifact on a machine with:
    Disk storage: at least 10 GB available
    RAM: 16 GB
    Processor: 4 GHz
    OS: Ubuntu 18.04
    We ran all of our experiments on a machine with a similar
    specification, 126 GB of RAM, and 64 cores.
    We disabled turbo-boost and hyper-threading before conducting the experiments.
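
    The recommendations above can be checked before starting with a short script. This is a minimal sketch that is not part of the artifact: the function name machine_summary is hypothetical, the RAM query assumes a Linux host, and the thresholds are simply the 10 GB and 16 GB figures listed above.

```python
import os
import shutil


def machine_summary(path="."):
    """Collect free disk space, physical RAM, and logical core count."""
    free_disk_gb = shutil.disk_usage(path).free / 1e9
    try:
        # Works on Linux; other platforms may not expose these sysconf names.
        ram_gb = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1e9
    except (AttributeError, ValueError, OSError):
        ram_gb = None
    return {
        "free_disk_gb": free_disk_gb,
        "ram_gb": ram_gb,
        "logical_cores": os.cpu_count(),
    }


summary = machine_summary()
print(summary)
# Thresholds taken from the recommendation list (assumed to be lower bounds).
print("disk ok:", summary["free_disk_gb"] >= 10)
if summary["ram_gb"] is not None:
    print("ram ok:", summary["ram_gb"] >= 16)
```

    The script only reports values; it does not check whether turbo-boost or hyper-threading is disabled.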

__________________________________________________________________

* DOWNLOADING PREBUILT DOCKER IMAGE

        We have prebuilt a Docker image and hosted it on Docker Hub.

(1) Install Docker, if it is not already installed, by following the
     installation documentation at:
     https://docs.docker.com/install/

     We recommend installing Docker and evaluating our artifact on a machine running Ubuntu. Docker can also be used on
     Windows or macOS, but there it may run on top of a Linux VM.

(2) Download the prebuilt docker image by using the command:
     
     $ sudo docker pull sc1696/pfsan_artifact:latest

     The Docker image is roughly 1.21 GB.

(3) Run the docker image:

     $ sudo docker run -it --cap-add=SYS_PTRACE --security-opt seccomp=unconfined sc1696/pfsan_artifact:latest

* STEP BY STEP INSTRUCTIONS TO EVALUATE THE ARTIFACT

(1) Testing the correctness of PFPSanitizer (Section 5: Ability to detect FP errors):

  This process should take less than 4 minutes. The script runs 43 microbenchmarks with PFPSanitizer and reports
  the numerical errors detected in them, i.e., catastrophic cancellation, NaN, Inf, branch flips, and integer
  conversion errors. The script also reports whether each error was correctly found ("expected", in green
  letters) or incorrectly found ("unexpected", in red letters). The script should report only "expected" and
  never "unexpected".

  When the script terminates, it reports the total number of microbenchmarks, the number of microbenchmarks
  that report each type of error, and whether these numbers are correct.

  You should therefore see a total of 43 benchmarks: 16 benchmarks with catastrophic cancellation, 0 programs
  with NaN computation, 2 programs with Inf, 5 programs with branch flips, and 0 programs with integer
  conversion errors.

  $ cd PFPSanitizer/correctness_suite
  $ python3 correctness_test.py
  $ cd ..
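
  To illustrate the kinds of errors these microbenchmarks exercise, the sketch below mimics the idea of shadow execution in plain Python: the same expression is evaluated once in double precision and once with exact rational arithmetic, and the two results are compared. It is an illustration only, not PFPSanitizer's implementation; the function names and input values are hypothetical.

```python
from fractions import Fraction


def fp_version(a, b):
    """Evaluate (a + b) - a in ordinary double-precision arithmetic."""
    return (a + b) - a


def shadow_version(a, b):
    """Evaluate the same expression exactly, using rational arithmetic."""
    a, b = Fraction(a), Fraction(b)
    return (a + b) - a


a, b = 1e16, 1.0
computed = fp_version(a, b)   # a + b rounds back to a, so the subtraction cancels everything
real = shadow_version(a, b)   # exact result

print("computed:", computed)
print("real:    ", float(real))

# A branch that depends on the result takes different directions (a branch flip).
branch_fp = computed > 0.5
branch_real = real > Fraction(1, 2)
print("branch flip:", branch_fp != branch_real)
```

  Here the subtraction cancels every significant digit and exposes the rounding error made by the earlier addition, which is the pattern behind catastrophic cancellation.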


(2) Debugging (Section 5: Debugging a previously unknown error in Cholesky from Polybench):

  To run Cholesky under gdb, compile the runtime with -O0:

  $ export SET_DEBUG=DEBUG   # switch to debug mode
  $ cd runtime
  $ make clean
  $ make
  $ cd ../case_study/cholesky
  $ make clean
  $ make
  $ gdb ./cholesky_fp
  (gdb) b handleReal.cpp:1335
    Make breakpoint pending on future shared library load? (y or [n]) y
  (gdb) r
  (gdb) call fpsan_trace(buf_id, res)

  This shows the error trace matching Figure 7(a) in the paper.
  Each trace entry has the form "Inst ID: Opcode: Op1_Inst_ID: Op2_Inst_ID: (real value: computed value: error)".
  As the trace shows, the Inf exception occurs in instruction 189.
  Tracing back, you will notice that the error was propagated from instruction 450.
  However, instruction 450 is computed in a different function. To trace the error in instruction 450
  back further, follow the instructions below.


  (gdb) b handleReal.cpp:1287 if res->error >= 28
  (gdb) r
    Start it from the beginning? (y or n) y
  (gdb) call fpsan_trace(buf_id, res)

  You will get the trace matching Figure 7(b).
  This error trace shows that the addition of 1 and 2.70400000000000e7 results in an error of 28 bits.
  The first instruction in the trace shows the real computation as 2.70400010000000e7 and the floating-point computation as 27040000; hence a rounding error has occurred.
  (gdb) quit
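
  The rounding in this addition can be reproduced outside the tool. The sketch below rests on two assumptions that are not stated above: that the benchmark performs the addition in single precision, and that the error in bits is the base-2 logarithm of the distance, in double-precision ULPs, between the real and the computed value. The helper names are hypothetical.

```python
import struct


def to_float32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def double_bits(x):
    """Reinterpret a double as a signed 64-bit integer."""
    return struct.unpack("<q", struct.pack("<d", x))[0]


def ulp_distance(x, y):
    """Number of doubles between x and y (valid for finite values of the same sign)."""
    return abs(double_bits(x) - double_bits(y))


real = 27040000.0 + 1.0                  # exactly representable in double precision
computed = to_float32(27040000.0 + 1.0)  # the same sum rounded to single precision

distance = ulp_distance(real, computed)
error_bits = distance.bit_length() - 1   # floor(log2(distance)) for distance > 0

print("real:      ", real)
print("computed:  ", computed)
print("error bits:", error_bits)
```

  Under these assumptions the single-precision sum stays at 27040000 and the distance to the real value is 2^28 double-precision ULPs, which is consistent with the 28 bits of error in the trace.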

(3) Performance testing (Section 5: Performance speedup with PFPSanitizer compared to FPSanitizer):

  We ran our performance experiments with the licensed SPEC benchmarks, and we do not provide the source code of the SPEC benchmarks. However, please check our public git repository for patch updates: https://github.com/rutgers-apl/PFPSanitizer

  We have provided benchmarks that do not require a license.

  $ cd PFPSanitizer/performance
  $ ./run_perf.sh
  This script runs two performance benchmarks, AMG and MILCmk, and produces the graphs speedup.pdf (Figure 8)
  and slowdown.pdf (Figure 9) for these two benchmarks only.

  Please note that we disabled turbo-boost and hyper-threading, and that our data was
  generated on a 64-core machine.
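
  For readers interpreting the two graphs, the ratios can be sketched as follows. This assumes that speedup is the FPSanitizer run time divided by the PFPSanitizer run time, and that slowdown is the instrumented run time divided by the run time of the uninstrumented program. How run_perf.sh computes its numbers is not described here, and the timings below are placeholders, not measurements.

```python
def speedup(fpsan_seconds, pfpsan_seconds):
    """How many times faster PFPSanitizer is than FPSanitizer (assumed definition)."""
    return fpsan_seconds / pfpsan_seconds


def slowdown(instrumented_seconds, baseline_seconds):
    """How many times slower the instrumented run is than the baseline (assumed definition)."""
    return instrumented_seconds / baseline_seconds


# Placeholder timings in seconds, chosen only to show the arithmetic.
baseline, fpsan, pfpsan = 2.0, 120.0, 10.0

print("speedup over FPSanitizer:", speedup(fpsan, pfpsan))
print("slowdown over baseline:  ", slowdown(pfpsan, baseline))
```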

Files

Total size: 26.2 MB

md5:675f2ecdba411040f5cf2809b745f38f   24.8 MB
md5:6dfafcfb391c44665858bbcb72bbe787   1.4 MB