Published April 14, 2025 | Version v2
Model Open

Pre-Training Representations of Binary Code Using Contrastive Learning

  • 1. Vanderbilt University

Description

ContraBin is a framework for pre-training binary code representations using contrastive learning. It bridges the semantic gap between binary code, source code, and comments by integrating all three into a unified representation. Using simplex interpolation and intermediate representation learning, ContraBin achieves state-of-the-art performance on tasks such as function name recovery, code summarization, and reverse engineering. The repository includes modular implementations, dataset preprocessing tools, and visualization utilities to support reproducibility and experimentation.
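To make the two ideas named above concrete, here is a minimal, illustrative sketch in plain Python. It is not taken from the ContraBin repository. It assumes that simplex interpolation can be approximated by a convex combination of a source-code embedding and a comment embedding, and that the contrastive objective is a standard InfoNCE loss against the binary-code embedding. The function names (`simplex_interpolate`, `info_nce`), the toy vectors, the fixed weight `lam`, and the temperature are all hypothetical choices for illustration.

```python
import math


def simplex_interpolate(source_vec, comment_vec, lam):
    """Convex combination lam * source + (1 - lam) * comment.

    Assumption: `lam` is a fixed scalar in [0, 1]; a real model might learn it.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * s + (1.0 - lam) * c for s, c in zip(source_vec, comment_vec)]


def l2_normalize(vec):
    """Scale a vector to unit length (all-zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    anchors[i] is paired with positives[i]; every other positive in the
    batch serves as a negative for anchor i.
    """
    anchors = [l2_normalize(a) for a in anchors]
    positives = [l2_normalize(p) for p in positives]
    total = 0.0
    for i, anchor in enumerate(anchors):
        # Cosine similarity to every candidate, scaled by the temperature.
        logits = [
            sum(x * y for x, y in zip(anchor, p)) / temperature
            for p in positives
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


# Toy 3-dimensional embeddings for two functions (made-up numbers).
binary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
source = [[0.9, 0.1, 0.0], [0.1, 0.9, 0.0]]
comment = [[0.8, 0.0, 0.2], [0.0, 0.8, 0.2]]

# Blend source and comment views, then contrast the blend with the binary view.
mixed = [simplex_interpolate(s, c, 0.5) for s, c in zip(source, comment)]
loss_aligned = info_nce(binary, mixed)
loss_shuffled = info_nce(binary, mixed[::-1])
print(f"aligned pairs:  {loss_aligned:.4f}")
print(f"shuffled pairs: {loss_shuffled:.4f}")
```

In this toy setup the correctly paired batch yields a lower loss than the shuffled one, which is the behavior a contrastive objective is meant to reward. The released code should be consulted for the actual encoders, interpolation scheme, and training loop.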

Files

README.md

Files (15.8 kB)

Name Size
md5:1eceb94eb5fd65cc0cd44059502bf6a9 12.7 kB
md5:d407971edad5f42bed252b4654eecc69 3.0 kB