Dataset Open Access

DIRE: A Neural Approach to Decompiled Identifier Naming

Lacomis, Jeremy; Yin, Pengcheng; Schwartz, Edward J.; Allamanis, Miltiadis; Le Goues, Claire; Neubig, Graham; Vasilescu, Bogdan


Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator>Lacomis, Jeremy</dc:creator>
  <dc:creator>Yin, Pengcheng</dc:creator>
  <dc:creator>Schwartz, Edward J.</dc:creator>
  <dc:creator>Allamanis, Miltiadis</dc:creator>
  <dc:creator>Le Goues, Claire</dc:creator>
  <dc:creator>Neubig, Graham</dc:creator>
  <dc:creator>Vasilescu, Bogdan</dc:creator>
  <dc:date>2019-09-09</dc:date>
  <dc:description>This dataset is released as a companion to the paper "DIRE: A Neural Approach to Decompiled Identifier Naming", appearing in the proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE 2019).

It contains information generated by decompiling 3,195,962 functions found in 164,632 unique binaries generated from C code scraped from GitHub. For practicality, the dataset is partitioned into 16 archives by the first hexadecimal digit of the SHA-256 hash of the binary from which each file was generated. Each of the 16 archives contains approximately 10,000 JSONL files, each named according to its binary's hash. Each JSONL file contains one JSON object per line, and each object corresponds to a single function in the decompiled binary.

Archives are provided in both GZIP and BZIP2 format.

See the README file for more information.</dc:description>
  <dc:identifier>https://zenodo.org/record/3403078</dc:identifier>
  <dc:identifier>10.5281/zenodo.3403078</dc:identifier>
  <dc:identifier>oai:zenodo.org:3403078</dc:identifier>
  <dc:relation>doi:10.5281/zenodo.3403077</dc:relation>
  <dc:rights>info:eu-repo/semantics/openAccess</dc:rights>
  <dc:rights>https://opensource.org/licenses/MIT</dc:rights>
  <dc:title>DIRE: A Neural Approach to Decompiled Identifier Naming</dc:title>
  <dc:type>info:eu-repo/semantics/other</dc:type>
  <dc:type>dataset</dc:type>
</oai_dc:dc>
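
The layout described in the record (16 archives keyed by the first hexadecimal digit of a binary's SHA-256 hash, with one JSON object per line in each JSONL file) can be sketched in Python as follows. This is a minimal illustration, not the authors' tooling: the helper names `archive_bucket` and `iter_functions` are hypothetical, and the sketch makes no assumption about the fields inside each JSON object, which are documented in the dataset's README.

```python
import hashlib
import json
from pathlib import Path
from typing import Iterator


def archive_bucket(binary_bytes: bytes) -> str:
    """Return the first hex digit (0-9, a-f) of the SHA-256 of a binary.

    Hypothetical helper: the record states that archives are partitioned
    by this digit, so it identifies one of the 16 archives.
    """
    return hashlib.sha256(binary_bytes).hexdigest()[0]


def iter_functions(jsonl_path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file.

    Per the record, each object corresponds to a single decompiled
    function; the object's keys are not assumed here.
    """
    with jsonl_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)
```

Streaming line by line keeps memory use flat regardless of how many functions a binary contains, which matters given the dataset's total size.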
