Dataset Open Access
Lacomis, Jeremy;
Yin, Pengcheng;
Schwartz, Edward J.;
Allamanis, Miltiadis;
Le Goues, Claire;
Neubig, Graham;
Vasilescu, Bogdan
This dataset is released as a companion to the paper "DIRE: A Neural Approach to Decompiled Identifier Naming", appearing in the proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE 2019).
It contains information generated by decompiling 3,195,962 functions found in 164,632 unique binaries generated from C code scraped from GitHub. For practicality, the dataset is partitioned into 16 archives according to the first hexadecimal digit of each binary's SHA-256 hash. Each of the 16 archives contains approximately 10,000 JSONL files, each named after the hash of the binary it describes. Each JSONL file contains one JSON object per line, and each object corresponds to a single function in the decompiled binary.
Archives are provided in both GZIP and BZIP2 formats.
See the README file for more information.
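The layout described above (one archive per leading hash digit, one JSONL file per binary, one JSON object per function) can be read without unpacking an archive to disk. The following is a minimal sketch, not the authors' tooling: the helper names `archive_name_for` and `iter_functions` are hypothetical, and the code makes no assumption about which fields each JSON object carries (see the README for that).

```python
import io
import json
import tarfile


def archive_name_for(sha256_hex, compression="gz"):
    """Return the archive that should hold a binary with this SHA-256 hash.

    Hypothetical helper: it only applies the naming pattern visible in the
    file listing (<first hex digit>-trees.tar.<gz|bz2>).
    """
    if compression not in ("gz", "bz2"):
        raise ValueError("compression must be 'gz' or 'bz2'")
    return f"{sha256_hex[0].lower()}-trees.tar.{compression}"


def iter_functions(archive_path):
    """Yield (member_name, record) for each JSON line in each archive member."""
    # Mode "r:*" lets tarfile detect gzip or bzip2 compression by itself.
    with tarfile.open(archive_path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories and other non-file entries
            raw = tar.extractfile(member)
            if raw is None:
                continue
            with io.TextIOWrapper(raw, encoding="utf-8") as lines:
                for line in lines:
                    line = line.strip()
                    if line:  # ignore blank lines
                        yield member.name, json.loads(line)
```

For example, `for name, record in iter_functions("0-trees.tar.gz"): ...` would visit every decompiled function in the first archive, with `name` identifying the JSONL file (and therefore the binary) it came from.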
Name | Checksum | Size
---|---|---
0-trees.tar.bz2 | md5:f6f50e591e0b87286adf4037db5c5326 | 151.2 MB
0-trees.tar.gz | md5:b2b9e5b6d9804ee0dd5e512e2fb1b886 | 250.8 MB
1-trees.tar.bz2 | md5:22725b6a13a581fd11939d9f0d9755d2 | 151.8 MB
1-trees.tar.gz | md5:2b8ae5d693fad500f0fd0b59b50d8d81 | 251.8 MB
2-trees.tar.bz2 | md5:90d57fe8d1e0e7c5c965e29cce752388 | 157.0 MB
2-trees.tar.gz | md5:c80d48c918a1c7e8b60789691cda21cd | 259.7 MB
3-trees.tar.bz2 | md5:d86ae421c9628b5fc6bdfe0b60bc8997 | 166.8 MB
3-trees.tar.gz | md5:03326d0d4300579efbd6e233be3b99bd | 276.9 MB
4-trees.tar.bz2 | md5:e1448edff59ad43e81ca572ed5b9321c | 163.9 MB
4-trees.tar.gz | md5:98cb8057b336d6cb8beea88de84fbac1 | 271.0 MB
5-trees.tar.bz2 | md5:236626301caa357b21806acc319b02e6 | 153.0 MB
5-trees.tar.gz | md5:9b4f94607335772b044c62d647aba6cf | 253.7 MB
6-trees.tar.gz | md5:504a7bae05a79a1e9b417cdb24644ea7 | 264.2 MB
7-trees.tar.bz2 | md5:dbaed7f3096ed4a5ef90c5facf48be75 | 150.7 MB
7-trees.tar.gz | md5:b7347771c5ffd40d33857e66114d68ff | 249.9 MB
8-trees.tar.bz2 | md5:bc480a2bee0bd3212937d512dd992c34 | 156.2 MB
8-trees.tar.gz | md5:c7672cc3575db9760da635f58d6e441d | 259.0 MB
9-trees.tar.bz2 | md5:7e8c1d75838252154d76eec139e2e4d7 | 157.7 MB
9-trees.tar.gz | md5:ff8f29e381078c77bd3164138302ad0a | 261.0 MB
a-trees.tar.bz2 | md5:8140925c954a99f925049633fb7a5a64 | 154.2 MB
a-trees.tar.gz | md5:d3b65c7aea71a146f0631c52f7424f05 | 255.3 MB
b-trees.tar.bz2 | md5:e6c3e05f761d996c2032a76169a018b6 | 155.4 MB
b-trees.tar.gz | md5:9d74988a8e928a9122166deb4a21e90c | 257.6 MB
c-trees.tar.bz2 | md5:307a08acef47c35e5a489d5afab0ddcd | 163.5 MB
c-trees.tar.gz | md5:05dc395dbf51303e1ef38e71f5249ca9 | 271.9 MB
d-trees.tar.bz2 | md5:f7ad3caabad2b0a2ec8e204ebde2e609 | 162.5 MB
d-trees.tar.gz | md5:1c666f3b20ba0403d5d0ec261ebc03b7 | 269.7 MB
e-trees.tar.bz2 | md5:89ab0e0fe48f283ec642e69ddddc8e35 | 156.0 MB
e-trees.tar.gz | md5:7b57706ec9a2948cc2ae3b45353090aa | 257.9 MB
f-trees.tar.bz2 | md5:681666c06d9e43093fa4c50dbbfecb4a | 150.1 MB
f-trees.tar.gz | md5:ae0ecd96ef96cccc0e2340f757d8bdcf | 249.2 MB
README | md5:ce1c12d7b5072c540e1ffea48bd78e0b | 935 Bytes
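Because each archive is a few hundred megabytes, it may be worth checking a download against the MD5 checksum listed for it. The sketch below uses only Python's standard `hashlib`; the function names `md5_of_file` and `matches_listing` are hypothetical, and the code assumes the file has already been saved locally under the name shown in the listing.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed_checksum):
    """Compare a local file against a listed checksum.

    Accepts the value with or without the "md5:" prefix used in the listing.
    """
    expected = listed_checksum.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For instance, `matches_listing("README", "md5:ce1c12d7b5072c540e1ffea48bd78e0b")` should return `True` for an intact copy of the README. MD5 is adequate here for detecting a corrupted transfer, though it is not a guard against deliberate tampering.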