There is a newer version of this record available.

Dataset Open Access

NJR-1 Dataset

Utture, Akshay; Kalhauge, Christian Gram; Liu, Shuyang; Palsberg, Jens


JSON Export

{
  "files": [
    {
      "links": {
        "self": "https://zenodo.org/api/files/ec39320b-0f2b-453d-b132-0c6379c25f71/benchmark_stats.csv"
      }, 
      "checksum": "md5:f9b0d46bf68b0609722e980c1e317ea8", 
      "bucket": "ec39320b-0f2b-453d-b132-0c6379c25f71", 
      "key": "benchmark_stats.csv", 
      "type": "csv", 
      "size": 25542
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/ec39320b-0f2b-453d-b132-0c6379c25f71/njr-1_dataset.zip"
      }, 
      "checksum": "md5:c0a57feaf93f4b9374b32885b01997fc", 
      "bucket": "ec39320b-0f2b-453d-b132-0c6379c25f71", 
      "key": "njr-1_dataset.zip", 
      "type": "zip", 
      "size": 2603240127
    }, 
    {
      "links": {
        "self": "https://zenodo.org/api/files/ec39320b-0f2b-453d-b132-0c6379c25f71/scripts.zip"
      }, 
      "checksum": "md5:8f9cf0a0df99f8c1d3cfcd9a45cea8ac", 
      "bucket": "ec39320b-0f2b-453d-b132-0c6379c25f71", 
      "key": "scripts.zip", 
      "type": "zip", 
      "size": 377608
    }
  ], 
  "owners": [
    107927
  ], 
  "doi": "10.5281/zenodo.4632231", 
  "stats": {
    "version_unique_downloads": 67.0, 
    "unique_views": 65.0, 
    "views": 73.0, 
    "version_views": 144.0, 
    "unique_downloads": 36.0, 
    "version_unique_views": 117.0, 
    "volume": 5207802916.0, 
    "version_downloads": 88.0, 
    "downloads": 40.0, 
    "version_volume": 13027291829.0
  }, 
  "links": {
    "doi": "https://doi.org/10.5281/zenodo.4632231", 
    "conceptdoi": "https://doi.org/10.5281/zenodo.3897691", 
    "bucket": "https://zenodo.org/api/files/ec39320b-0f2b-453d-b132-0c6379c25f71", 
    "conceptbadge": "https://zenodo.org/badge/doi/10.5281/zenodo.3897691.svg", 
    "html": "https://zenodo.org/record/4632231", 
    "latest_html": "https://zenodo.org/record/4839913", 
    "badge": "https://zenodo.org/badge/doi/10.5281/zenodo.4632231.svg", 
    "latest": "https://zenodo.org/api/records/4839913"
  }, 
  "conceptdoi": "10.5281/zenodo.3897691", 
  "created": "2021-03-23T20:03:20.311311+00:00", 
  "updated": "2021-05-30T18:19:51.629313+00:00", 
  "conceptrecid": "3897691", 
  "revision": 4, 
  "id": 4632231, 
  "metadata": {
    "access_right_category": "success", 
    "doi": "10.5281/zenodo.4632231", 
    "description": "<p>NJR is a Normalized Java Resource.</p>\n\n<p>The <em>NJR-1</em> dataset consists of 293 Java bytecode programs, each of which runs successfully with the following 12 Java static analysis tools:</p>\n\n<p>1. &nbsp;SpotBugs (https://spotbugs.github.io)<br>\n2. &nbsp;Wala (https://wala.github.io)<br>\n3. &nbsp;Doop (https://bitbucket.org/yanniss/doop)<br>\n4. &nbsp;Soot (https://github.com/soot-oss/soot)<br>\n5. &nbsp;Petablox (https://github.com/petablox/petablox)<br>\n6. &nbsp;Infer (https://fbinfer.com)<br>\n7. &nbsp;Error-Prone (http://errorprone.info)<br>\n8. &nbsp;Checker-Framework (https://checkerframework.org)<br>\n9. &nbsp;Opium (Opal-framework) (https://www.opal-project.de)<br>\n10. Spoon (https://spoon.gforge.inria.fr)<br>\n11. PMD (https://pmd.github.io)<br>\n12. CheckStyle (https://checkstyle.org)</p>\n\n<p>Additionally, each program&nbsp;executes at least 100 unique application methods at runtime.&nbsp;These programs are repositories picked from the set of Java-8 projects on Github that compile and run successfully.&nbsp;Each of these programs come with a jar file, the compiled bytecode files, compiled library files&nbsp;and the Java source code. It also comes with a list of source files, declared methods, application-classes list, and main-class names.&nbsp;The availability of the files in both jar-file form, as well as source code form (with the compiled library classes) is a major reason the dataset works&nbsp;with&nbsp;so many tools, without requiring any extra effort.</p>\n\n<p>There are 3 files available for download: <em>njr-1_dataset.zip, scripts.zip, benchmark_stats.csv.</em></p>\n\n<p><em>njr-1_dataset.zip</em> has the actual dataset programs. <em>scripts.zip</em> contains&nbsp;Python3 scripts&nbsp;for each tool, to run it&nbsp;on the entire dataset.&nbsp;The <em>benchmark_stats.csv</em> file lists, for each benchmark, the number of nodes and edges in its dynamic application call-graph, as well as the number of edges in its static application call-graph (as computed by Wala) when using the main function listed in the&nbsp;<em>info/mainclassname</em> file.&nbsp;<br>\nA summary of the same is listed here:</p>\n\n<p><strong><em>Statistics &nbsp;Dynamic-Nodes &nbsp;Dynamic-Edges &nbsp;Static-Edges</em></strong><br>\nMean &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 205&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;469&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 1404<br>\nSt.Dev &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; 199&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;464&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;2523<br>\nMedian &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;149&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;327&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;610</p>\n\n<p>To cite the dataset, please cite the following paper:<br>\nJens Palsberg and Cristina V. Lopes,&nbsp;NJR: a&nbsp;Normalized Java Resource.&nbsp;<br>\nIn Proceedings of ACM SIGPLAN International Workshop&nbsp;on State Of the Art in Program Analysis (SOAP), 2018.</p>", 
    "language": "eng", 
    "title": "NJR-1 Dataset", 
    "license": {
      "id": "CC-BY-4.0"
    }, 
    "notes": "Funded by the following NSF grant (https://www.nsf.gov/awardsearch/showAward?AWD_ID=1823360&amp;HistoricalAwards=false)", 
    "relations": {
      "version": [
        {
          "count": 3, 
          "index": 1, 
          "parent": {
            "pid_type": "recid", 
            "pid_value": "3897691"
          }, 
          "is_last": false, 
          "last_child": {
            "pid_type": "recid", 
            "pid_value": "4839913"
          }
        }
      ]
    }, 
    "version": "1.0.1", 
    "keywords": [
      "Static Analysis, Java"
    ], 
    "publication_date": "2020-06-16", 
    "creators": [
      {
        "orcid": "0000-0002-9623-3049", 
        "affiliation": "UCLA", 
        "name": "Utture, Akshay"
      }, 
      {
        "affiliation": "UCLA", 
        "name": "Kalhauge, Christian Gram"
      }, 
      {
        "affiliation": "UCLA", 
        "name": "Liu, Shuyang"
      }, 
      {
        "affiliation": "UCLA", 
        "name": "Palsberg, Jens"
      }
    ], 
    "access_right": "open", 
    "resource_type": {
      "type": "dataset", 
      "title": "Dataset"
    }, 
    "related_identifiers": [
      {
        "scheme": "doi", 
        "identifier": "10.1145/3236454.3236501", 
        "relation": "isSupplementTo", 
        "resource_type": "publication-conferencepaper"
      }, 
      {
        "scheme": "doi", 
        "identifier": "10.5281/zenodo.3897691", 
        "relation": "isVersionOf"
      }
    ]
  }
}
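
Each entry in the "files" array above carries a "key", a "size" in bytes, and a "checksum" of the form "md5:<hex digest>". A downloaded file can be checked against those fields before use. The following is a minimal sketch, assuming the file has already been downloaded into a local directory under its "key" name; the function names md5_of_file and verify_file_entry are illustrative and are not part of the dataset's scripts.zip.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file_entry(entry, directory="."):
    """Check a local file against one entry of the record's "files" array.

    `entry` is a dict with the "key", "size", and "checksum" fields shown
    in the JSON export. Returns a list of problems; an empty list means
    the local file matches the record.
    """
    path = Path(directory) / entry["key"]
    if not path.is_file():
        return [f"{path} is missing"]

    problems = []
    actual_size = path.stat().st_size
    if actual_size != entry["size"]:
        problems.append(f"size {actual_size} != expected {entry['size']}")

    # The record stores checksums as "<algorithm>:<hex digest>".
    algorithm, _, expected = entry["checksum"].partition(":")
    if algorithm != "md5":
        problems.append(f"unsupported checksum algorithm {algorithm!r}")
    elif md5_of_file(path) != expected:
        problems.append("MD5 digest does not match")
    return problems


if __name__ == "__main__":
    # Entry copied from the "files" array of this record.
    scripts_entry = {
        "key": "scripts.zip",
        "size": 377608,
        "checksum": "md5:8f9cf0a0df99f8c1d3cfcd9a45cea8ac",
    }
    # Assumes scripts.zip sits in the current working directory.
    print(verify_file_entry(scripts_entry) or "scripts.zip matches the record")
```

Reading in chunks keeps memory use flat, which matters for njr-1_dataset.zip at roughly 2.6 GB. The size check is cheap and catches truncated downloads early, though only the digest comparison confirms the content.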