Leveraging Artificial Intelligence for Automated Reverse Engineering of Legacy Software Systems
Creators
Contributors
Project manager:
Description
We present an AI-assisted reverse engineering framework that orchestrates specialized agents for evidence curation, struct recovery, and code drafting, achieving speedups on the order of hundreds of times over traditional manual methods. Using this approach, we recreated a bootable prototype of Apple System 7.1 from binary analysis in 3 days, a task that would normally require months or years. The framework enforces strict provenance tracking: each change is tied either to disassembly bytes or to runtime verification under QEMU. Rather than reporting abstract accuracy percentages, we emphasize artifact-based validation, using screenshots, serial logs, and resource extractions that demonstrate Chicago font rendering, menu bar behavior, desktop patterns, and icon display. This work shows how carefully scoped AI assistance, coupled with human review and a structured verification loop, can turn reverse engineering from a slow artisanal process into a systematic, reproducible workflow for preserving computing history and modernizing legacy systems.
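The provenance rule described above (every change backed by disassembly bytes or by a runtime check) can be illustrated with a minimal C sketch. This is not the framework's actual implementation: the type and function names (`EvidenceKind`, `ChangeRecord`, `change_has_provenance`, `count_unproven`) and the record fields are hypothetical, chosen only to show one way such a gate could be expressed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical evidence kinds; the names are illustrative assumptions. */
typedef enum {
    EVIDENCE_NONE = 0,
    EVIDENCE_DISASSEMBLY,  /* change traced to bytes at an offset in the binary */
    EVIDENCE_RUNTIME       /* change confirmed by a logged emulator run */
} EvidenceKind;

/* One record per proposed change (assumed layout, not from the repository). */
typedef struct {
    const char    *source_file;  /* file the change touches */
    EvidenceKind   kind;
    uint32_t       offset;       /* binary offset, for EVIDENCE_DISASSEMBLY */
    const uint8_t *bytes;        /* cited bytes, for EVIDENCE_DISASSEMBLY */
    size_t         byte_count;
    const char    *run_log;      /* path to a run log, for EVIDENCE_RUNTIME */
} ChangeRecord;

/* A change passes only if it carries the evidence its kind requires. */
bool change_has_provenance(const ChangeRecord *c)
{
    if (c == NULL || c->source_file == NULL)
        return false;
    switch (c->kind) {
    case EVIDENCE_DISASSEMBLY:
        return c->bytes != NULL && c->byte_count > 0;
    case EVIDENCE_RUNTIME:
        return c->run_log != NULL && c->run_log[0] != '\0';
    default:
        return false;
    }
}

/* Count the changes that would be rejected for lacking provenance. */
size_t count_unproven(const ChangeRecord *changes, size_t n)
{
    size_t missing = 0;
    for (size_t i = 0; i < n; i++) {
        if (!change_has_provenance(&changes[i]))
            missing++;
    }
    return missing;
}
```

A review step could call `count_unproven` over a batch of drafted changes and refuse to merge the batch unless it returns zero. The sketch checks only that evidence is present, not that the cited bytes or logs actually support the change, which would still fall to human review.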
Files
Name | Size |
---|---|
AI Tools for Reverse Engineering Legacy Software - final.zip (md5:63f33b23cf4ccb3b7d3133b84e233942) | 199.5 kB |
Additional details
Dates
- Created: 2025-09-24
Software
- Repository URL: https://github.com/Kelsidavis/System7
- Programming language: C
- Development Status: Active
References
- Allamanis, M., Barr, E. T., Devanbu, P., & Sutton, C. (2018). A Survey of Machine Learning for Big Code and Naturalness. ACM Computing Surveys, 51(4), Article 81. https://doi.org/10.1145/3212695
- Armengol-Estapé, J., Woodruff, J., Cummins, C., & O'Boyle, M. (2023). SLaDe: A Portable Small Language Model Decompiler. arXiv preprint arXiv:2305.12520. https://doi.org/10.48550/arXiv.2305.12520
- Benali, A. (2022). An Initial Investigation of Neural Decompilation for WebAssembly → C [Master's thesis, Uppsala University]. DiVA Portal.
- Canfora, G., & Di Penta, M. (2007). New Frontiers of Reverse Engineering. In Future of Software Engineering (FOSE '07) (pp. 326-341). IEEE Computer Society. https://doi.org/10.1109/FOSE.2007.15
- Cao, Y., Liang, R., Chen, K., & Hu, P. (2023). Boosting Neural Networks to Decompile Optimized Binaries. arXiv preprint arXiv:2301.00969. https://doi.org/10.48550/arXiv.2301.00969
- Chikofsky, E., & Cross, J. (1990). Reverse Engineering and Design Recovery: A Taxonomy. IEEE Software, 7(1), 13-17. https://doi.org/10.1109/52.43044
- Comella-Dorda, S., Wallnau, K., Seacord, R. C., & Robert, J. (2000). A Survey of Legacy System Modernization Approaches (CMU/SEI-2000-TN-003). Software Engineering Institute, Carnegie Mellon University.
- Dramko, L., Le Goues, C., & Schwartz, E. J. (2025). Idioms: Neural Decompilation with Joint Code and Type Prediction. arXiv preprint arXiv:2502.04536. https://doi.org/10.48550/arXiv.2502.04536
- Hosseini, I., & Dolan-Gavitt, B. (2022). Beyond the C: Retargetable Decompilation using Neural Machine Translation. arXiv preprint arXiv:2212.08950. https://doi.org/10.48550/arXiv.2212.08950
- Katz, D., Ruchti, J., & Schulte, E. (2019). Towards Neural Decompilation. arXiv preprint arXiv:1905.08325. https://doi.org/10.48550/arXiv.1905.08325
- Nelson, M., Cowan, A., Alencar, P., & Cowan, D. (2005). A Survey of Reverse Engineering and Program Comprehension. arXiv preprint cs/0503068. https://doi.org/10.48550/arXiv.cs/0503068