Published September 2, 2022 | Version camera-ready
Conference paper Open

Cornucopia: A Framework for Feedback Guided Generation of Binaries

Description

Binary analysis, or the ability to analyze binary code, is an important
capability required for many security and software engineering
applications. Consequently, there are many binary analysis
techniques and tools with varied capabilities. However, testing these
tools requires a large, varied binary dataset with corresponding
source-level information. In this paper, we present Cornucopia,
an architecture-agnostic, automated framework that can generate
a large number of binaries from program source code. We exploit
compiler optimizations and use feedback-guided learning to maximize
the generation of unique binaries that correspond to the
same program. Our evaluation shows that Cornucopia was able
to generate 309K binaries across four architectures (x86, x64, ARM,
MIPS) with an average of 403 binaries for each program and outperforms
BinTuner, a similar technique. Our experiments also
revealed a large number (∼300) of issues with the LLVM optimization
scheduler, resulting in compiler crashes. Our evaluation of four
popular binary analysis tools (angr, Ghidra, ida, and radare) using
Cornucopia-generated binaries revealed various issues with
these tools. Specifically, we found 263 crashes in angr and one
memory corruption issue in ida. Our differential testing on the
analysis results revealed various semantic bugs in these tools. We
also tested machine learning tools, Asm2Vec, SAFE, and Debin, that
claim to capture binary semantics and show that they perform very
poorly (e.g., Debin's F1 score dropped to 12.9% from the reported 63.1%)
on Cornucopia-generated binaries. In summary, our exhaustive
evaluation shows that Cornucopia is an effective mechanism to
generate binaries that can be used to test binary analysis techniques
effectively.
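
The idea of feedback-guided generation described above can be illustrated with a minimal sketch. This is not Cornucopia's actual algorithm, which the description does not spell out. It is a toy loop under stated assumptions: `compile_fn` is a hypothetical callable that maps a set of optimization flags to binary bytes, the flag names are placeholders, and "feedback" here simply means keeping any flag set whose output has a hash not seen before.

```python
import hashlib
import random
from typing import Callable, FrozenSet, Sequence, Set


def generate_unique_binaries(
    compile_fn: Callable[[FrozenSet[str]], bytes],
    flags: Sequence[str],
    rounds: int = 200,
    seed: int = 0,
) -> Set[str]:
    """Toy feedback loop: keep flag sets that yield a binary not seen before."""
    rng = random.Random(seed)
    corpus = [frozenset()]   # flag sets that produced a new binary
    seen: Set[str] = set()   # SHA-256 digests of the distinct binaries
    seen.add(hashlib.sha256(compile_fn(corpus[0])).hexdigest())
    for _ in range(rounds):
        parent = rng.choice(corpus)
        # Mutate the parent by toggling one randomly chosen flag on or off.
        candidate = parent ^ frozenset([rng.choice(flags)])
        digest = hashlib.sha256(compile_fn(candidate)).hexdigest()
        if digest not in seen:
            # Feedback: this flag set gave a new binary, so mutate it later.
            seen.add(digest)
            corpus.append(candidate)
    return seen


def fake_compile(flag_set: FrozenSet[str]) -> bytes:
    """Hypothetical stand-in for a compiler; output depends only on the flags."""
    return ",".join(sorted(flag_set)).encode()
```

In a real setting `compile_fn` would invoke a compiler on one source program, and the uniqueness check could use a similarity measure other than an exact hash; both choices are assumptions of this sketch.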

Files

Cornucopia-main.zip

Files (10.2 GB)

MD5 checksum                            Size
md5:45fe4042851546297db978a8855efb9e    1.8 GB
md5:5ec623188581ee27163cbbbf1bdbc6ff    252.8 MB
md5:3b2b9f0e6a22fa4e9b454b6d53f66724    2.1 GB
md5:1c27e026de4be5ac351752f448e2e61c    1.9 GB
md5:91bb7a56a9472e66e0f8025760e98abe    2.7 GB
md5:b243e3648589db2e4b627b7dbcbf0218    1.4 GB
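
Since each file is listed with an MD5 checksum, a download can be checked against it. The following is a minimal sketch using Python's standard `hashlib`; the helper names are illustrative, and the local file path is whatever name the archive was saved under, since the listing does not pair checksums with file names.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: Path, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:45fe4042...'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use flat, which matters for archives in the gigabyte range like the ones listed here.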