Published September 9, 2019 | Version v1
Dataset Open

DIRE: A Neural Approach to Decompiled Identifier Naming

  • 1. Carnegie Mellon University
  • 2. Carnegie Mellon University Software Engineering Institute
  • 3. Microsoft Research

Description

This dataset is released as a companion to the paper "DIRE: A Neural Approach to Decompiled Identifier Naming", appearing in the proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE 2019).

It contains information generated by decompiling 3,195,962 functions from 164,632 unique binaries, which were built from C code scraped from GitHub. For practicality, the dataset is partitioned into 16 archives according to the first hexadecimal digit of the SHA-256 hash of each binary. Each archive contains approximately 10,000 JSONL files, each named after the hash of the binary it was generated from. Each JSONL file holds one JSON object per line, and each object corresponds to a single function in the decompiled binary.
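The layout described above can be sketched in Python as follows. This is a minimal illustration, not the authors' tooling: the helper names `shard_for_binary` and `iter_functions` are hypothetical, and no assumption is made about the fields inside each JSON object (see the README for those).

```python
import hashlib
import json
from pathlib import Path
from typing import Iterator


def shard_for_binary(binary_bytes: bytes) -> str:
    """Return the first hex digit (0-9, a-f) of the binary's SHA-256 hash.

    Per the description, this digit selects one of the 16 archives.
    """
    return hashlib.sha256(binary_bytes).hexdigest()[0]


def iter_functions(jsonl_path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file.

    Each object corresponds to one decompiled function; its schema is
    not assumed here.
    """
    with jsonl_path.open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)
```

Reading line by line keeps memory use flat even for binaries with many functions, since only one function record is held at a time.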

Each archive is provided in both GZIP and BZIP2 formats.
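Either compression format can be read without unpacking to disk. The sketch below assumes the archives are tar files and that member names end in `.jsonl`; neither detail is stated here, so check the README before relying on it. The function name `iter_archive_functions` is hypothetical.

```python
import json
import tarfile
from typing import Iterator, Tuple


def iter_archive_functions(archive_path: str) -> Iterator[Tuple[str, dict]]:
    """Yield (member name, function record) pairs from one archive.

    Assumption: the archive is a tarball of JSONL files. Mode "r:*"
    lets tarfile detect gzip or bzip2 compression automatically.
    """
    with tarfile.open(archive_path, mode="r:*") as archive:
        for member in archive:
            # Skip directories and anything that is not a JSONL file.
            if not member.isfile() or not member.name.endswith(".jsonl"):
                continue
            extracted = archive.extractfile(member)
            if extracted is None:
                continue
            for raw_line in extracted:
                raw_line = raw_line.strip()
                if raw_line:
                    yield member.name, json.loads(raw_line)
```

Because both formats hold the same data, downloading only one of the two variants per hash prefix should be sufficient.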

See the README file for more information.

Files (6.5 GB)

Checksum                              Size
md5:f6f50e591e0b87286adf4037db5c5326  151.2 MB
md5:b2b9e5b6d9804ee0dd5e512e2fb1b886  250.8 MB
md5:22725b6a13a581fd11939d9f0d9755d2  151.8 MB
md5:2b8ae5d693fad500f0fd0b59b50d8d81  251.8 MB
md5:90d57fe8d1e0e7c5c965e29cce752388  157.0 MB
md5:c80d48c918a1c7e8b60789691cda21cd  259.7 MB
md5:d86ae421c9628b5fc6bdfe0b60bc8997  166.8 MB
md5:03326d0d4300579efbd6e233be3b99bd  276.9 MB
md5:e1448edff59ad43e81ca572ed5b9321c  163.9 MB
md5:98cb8057b336d6cb8beea88de84fbac1  271.0 MB
md5:236626301caa357b21806acc319b02e6  153.0 MB
md5:9b4f94607335772b044c62d647aba6cf  253.7 MB
md5:504a7bae05a79a1e9b417cdb24644ea7  264.2 MB
md5:dbaed7f3096ed4a5ef90c5facf48be75  150.7 MB
md5:b7347771c5ffd40d33857e66114d68ff  249.9 MB
md5:bc480a2bee0bd3212937d512dd992c34  156.2 MB
md5:c7672cc3575db9760da635f58d6e441d  259.0 MB
md5:7e8c1d75838252154d76eec139e2e4d7  157.7 MB
md5:ff8f29e381078c77bd3164138302ad0a  261.0 MB
md5:8140925c954a99f925049633fb7a5a64  154.2 MB
md5:d3b65c7aea71a146f0631c52f7424f05  255.3 MB
md5:e6c3e05f761d996c2032a76169a018b6  155.4 MB
md5:9d74988a8e928a9122166deb4a21e90c  257.6 MB
md5:307a08acef47c35e5a489d5afab0ddcd  163.5 MB
md5:05dc395dbf51303e1ef38e71f5249ca9  271.9 MB
md5:f7ad3caabad2b0a2ec8e204ebde2e609  162.5 MB
md5:1c666f3b20ba0403d5d0ec261ebc03b7  269.7 MB
md5:89ab0e0fe48f283ec642e69ddddc8e35  156.0 MB
md5:7b57706ec9a2948cc2ae3b45353090aa  257.9 MB
md5:681666c06d9e43093fa4c50dbbfecb4a  150.1 MB
md5:ae0ecd96ef96cccc0e2340f757d8bdcf  249.2 MB
md5:ce1c12d7b5072c540e1ffea48bd78e0b  935 Bytes
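
A downloaded file can be checked against its listed MD5 checksum before use. The following is a minimal sketch; the helper names `md5_of_file` and `matches_listing` are hypothetical, and the local file path is whatever name the download was saved under.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path: Path, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:f6f5...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Chunked reading avoids loading a multi-hundred-megabyte archive into memory at once.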