10.5281/zenodo.3897692
https://zenodo.org/records/3897692
oai:zenodo.org:3897692
Utture, Akshay
Akshay
Utture
0000-0002-9623-3049
UCLA
Kalhauge, Christian Gram
Christian Gram
Kalhauge
UCLA
Liu, Shuyang
Shuyang
Liu
UCLA
Palsberg, Jens
Jens
Palsberg
UCLA
NJR-1 Dataset
Zenodo
2020
Static Analysis, Java
2020-06-16
eng
10.1145/3236454.3236501
10.5281/zenodo.3897691
1.0.0
Creative Commons Attribution 4.0 International
NJR is a Normalized Java Resource.
The NJR-1 dataset consists of 293 Java bytecode programs, each of which executes at least 100 unique application methods at runtime. Five static analysis tools (SpotBugs, Wala, Doop, Soot, Petablox) run successfully on these programs.
The programs are repositories selected from the set of Java 8 projects on GitHub that compile and run successfully.
Each of these programs comes with an executable jar file, the compiled bytecode file, and the Java source code.
There are 3 files available for download: njr-1_dataset.zip, scripts.zip, benchmark_stats.csv.
njr-1_dataset.zip has the actual dataset programs. scripts.zip contains Python3 scripts to run analysis tools (SpotBugs, Wala, Doop, Soot, Petablox) on the entire dataset. The benchmark_stats.csv file lists, for each benchmark, the number of nodes and edges in its dynamic application call-graph, as well as the number of edges in its static application call-graph (as computed by Wala).
Summary statistics for these call-graph sizes across the benchmarks:
Statistic    Dynamic-Nodes    Dynamic-Edges    Static-Edges
Mean         205              469              1404
Std. Dev.    199              464              2523
Median       149              327              610
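Summary figures of this kind can be recomputed from benchmark_stats.csv with a short Python3 script. The following is a minimal sketch, not the authors' script: the column names (dynamic_nodes, dynamic_edges, static_edges) are assumptions, since the actual CSV header is not given here, and the sample rows are made up for illustration. It also assumes the sample standard deviation; the record does not say which variant was used.

```python
import csv
import io
import statistics


def summarize(csv_file, columns):
    """Return {column: {"mean", "stdev", "median"}} for the given numeric columns."""
    rows = list(csv.DictReader(csv_file))
    summary = {}
    for col in columns:
        values = [float(row[col]) for row in rows]
        summary[col] = {
            "mean": statistics.mean(values),
            "stdev": statistics.stdev(values),  # sample standard deviation (assumption)
            "median": statistics.median(values),
        }
    return summary


# Made-up sample rows, NOT real NJR-1 data; the header names are hypothetical.
SAMPLE = """benchmark,dynamic_nodes,dynamic_edges,static_edges
demo_a,100,200,400
demo_b,200,400,800
demo_c,300,600,1500
"""

if __name__ == "__main__":
    cols = ["dynamic_nodes", "dynamic_edges", "static_edges"]
    for name, stats in summarize(io.StringIO(SAMPLE), cols).items():
        print(name, stats)
```

To run it on the real file, pass an open file handle, for example open("benchmark_stats.csv", newline=""), in place of the in-memory sample, and replace the column names with those in the file's header row.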
To cite the dataset, please cite the following paper:
Jens Palsberg and Cristina V. Lopes, NJR: a Normalized Java Resource.
In Proceedings of ACM SIGPLAN International Workshop on State Of the Art in Program Analysis (SOAP), 2018.
Funded by NSF award 1823360 (https://www.nsf.gov/awardsearch/showAward?AWD_ID=1823360&HistoricalAwards=false).