Dataset Open Access
Lacomis, Jeremy;
Yin, Pengcheng;
Schwartz, Edward J.;
Allamanis, Miltiadis;
Le Goues, Claire;
Neubig, Graham;
Vasilescu, Bogdan
{ "publisher": "Zenodo", "DOI": "10.5281/zenodo.3403078", "title": "DIRE: A Neural Approach to Decompiled Identifier Naming", "issued": { "date-parts": [ [ 2019, 9, 9 ] ] }, "abstract": "<p>This dataset is released as a companion to the paper \"DIRE: A Neural Approach to Decompiled Identifier Naming\", appearing in the proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE 2019).</p>\n\n<p>It contains information generated by decompiling 3,195,962 functions found in 164,632 unique binaries generated from C code scraped from GitHub. For practicality, the dataset is partitioned into 16 archives by the first hexadecimal digit of the SHA-256 hash of the binary used to generate it. Each of the 16 archives contains approximately 10,000 JSONL files, named according to a binary's hash. Each JSONL file contains one JSON object per line, and each object corresponds to a single function in the decompiled binary.</p>\n\n<p>Archives are provided in both GZIP and BZIP2 format.</p>\n\n<p>See the README file for more information.</p>", "author": [ { "family": "Lacomis, Jeremy" }, { "family": "Yin, Pengcheng" }, { "family": "Schwartz, Edward J." }, { "family": "Allamanis, Miltiadis" }, { "family": "Le Goues, Claire" }, { "family": "Neubig, Graham" }, { "family": "Vasilescu, Bogdan" } ], "id": "3403078", "type": "dataset", "event": "International Conference on Automated Software Engineering (ASE)" }
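The layout described in the abstract (16 archives keyed by the first hexadecimal digit of a binary's SHA-256 hash, and JSONL files with one function per line) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' tooling: the helper names `archive_bucket` and `iter_functions` are hypothetical, the JSONL files are assumed to be UTF-8 text already extracted from an archive, and no assumption is made about which fields each JSON object contains (the README documents those).

```python
import hashlib
import json
from pathlib import Path
from typing import Iterator


def archive_bucket(binary_bytes: bytes) -> str:
    """Return the first hex digit of the SHA-256 digest of a binary.

    Per the abstract, this digit decides which of the 16 archives
    the binary's JSONL file belongs to (hypothetical helper).
    """
    return hashlib.sha256(binary_bytes).hexdigest()[0]


def iter_functions(jsonl_path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file.

    Each object corresponds to one decompiled function; its fields
    are not assumed here.
    """
    with jsonl_path.open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                yield json.loads(line)
```

For example, `sum(1 for _ in iter_functions(path))` would count the functions recorded for one binary, where `path` points to an extracted JSONL file.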
Metric | All versions | This version |
---|---|---|
Views | 650 | 650 |
Downloads | 7,582 | 7,582 |
Data volume | 1.3 TB | 1.3 TB |
Unique views | 580 | 580 |
Unique downloads | 423 | 423 |