Dataset Open Access

DIRE: A Neural Approach to Decompiled Identifier Naming

Lacomis, Jeremy; Yin, Pengcheng; Schwartz, Edward J.; Allamanis, Miltiadis; Le Goues, Claire; Neubig, Graham; Vasilescu, Bogdan


Citation Style Language JSON Export

{
  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.3403078", 
  "title": "DIRE: A Neural Approach to Decompiled Identifier Naming", 
  "issued": {
    "date-parts": [
      [
        2019, 
        9, 
        9
      ]
    ]
  }, 
  "abstract": "This dataset is released as a companion to the paper \"DIRE: A Neural Approach to Decompiled Identifier Naming\", appearing in the proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE 2019).\n\nIt contains information generated by decompiling 3,195,962 functions found in 164,632 unique binaries generated from C code scraped from GitHub. For practicality, the dataset is partitioned into 16 archives by the first hexadecimal digit of the SHA-256 hash of the binary used to generate it. Each of the 16 archives contains approximately 10,000 JSONL files, each named according to a binary's hash. Each JSONL file contains one JSON object per line, and each object corresponds to a single function in the decompiled binary.\n\nArchives are provided in both GZIP and BZIP2 formats.\n\nSee the README file for more information.", 
  "author": [
    {
      "family": "Lacomis", 
      "given": "Jeremy"
    }, 
    {
      "family": "Yin", 
      "given": "Pengcheng"
    }, 
    {
      "family": "Schwartz", 
      "given": "Edward J."
    }, 
    {
      "family": "Allamanis", 
      "given": "Miltiadis"
    }, 
    {
      "family": "Le Goues", 
      "given": "Claire"
    }, 
    {
      "family": "Neubig", 
      "given": "Graham"
    }, 
    {
      "family": "Vasilescu", 
      "given": "Bogdan"
    }
  ], 
  "type": "dataset", 
  "id": "3403078"
}
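
The layout described in the abstract (archives keyed by the first hexadecimal digit of a binary's SHA-256 hash, one JSONL file per binary, one JSON object per function) could be consumed roughly as follows. This is a minimal sketch, not the authors' tooling: the function names `archive_digit` and `iter_functions` are hypothetical, and nothing is assumed about the fields inside each JSON object, which are documented in the README.

```python
import hashlib
import json
from pathlib import Path
from typing import Iterator


def archive_digit(binary_bytes: bytes) -> str:
    """Return the first hex digit of the SHA-256 hash of a binary's contents.

    Per the abstract, this digit determines which of the 16 archives
    holds the JSONL file generated from the binary.
    """
    return hashlib.sha256(binary_bytes).hexdigest()[0]


def iter_functions(jsonl_path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file.

    Each object corresponds to a single decompiled function; its schema
    is not assumed here.
    """
    with jsonl_path.open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)
```

Reading line by line keeps memory use proportional to a single function record rather than a whole file, which matters when iterating over roughly 10,000 files per archive.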
Statistics          All versions    This version
Views               546             546
Downloads           5,059           5,059
Data volume         846.1 GB        846.1 GB
Unique views        482             482
Unique downloads    347             347
