Speculative Automated Refactoring of Imperative Deep Learning Programs to Graph Execution

Khatchadourian, Raffi; Castro Velez, Tatiana; Bagherzadeh, Mehdi; Jia, Nan; Raja, Anita

doi:10.5281/zenodo.17216491

Published September 27, 2025 | Version v9

Dataset Open

1. Hunter College
2. The Graduate Center, CUNY
3. Oakland University

Efficiency is essential to support ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code, which supports symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, imperative DL frameworks encouraging eager execution have emerged, but at the expense of run-time performance. Though hybrid approaches aim for the "best of both worlds," using them effectively requires subtle considerations. Our key insight is that, while DL programs typically execute sequentially, hybridizing imperative DL code resembles parallelizing sequential code in traditional systems. Inspired by this, we present an automated refactoring approach that assists developers in determining which otherwise eagerly executed imperative DL functions could be effectively and efficiently executed as graphs. The approach features novel static imperative tensor and side-effect analyses for Python. Due to Python's inherent dynamism, analyzing it may be unsound; however, the conservative approach leverages a speculative (keyword-based) analysis for resolving difficult cases and informs developers of any assumptions made. The approach is: (i) implemented as a plug-in to the PyDev Eclipse IDE that integrates the WALA Ariadne analysis framework and (ii) evaluated on nineteen DL projects consisting of 132 KLOC. The results show that 326 of 766 candidate functions (42.56%) were refactorable, and an average relative speedup of 2.16x on performance tests was observed with negligible differences in model accuracy. The results indicate that the approach is useful in optimizing imperative DL code to its full potential.

Files

Files (1.9 GB)

Name	MD5	Size
candidate_functions.csv	d796b4d64f94897248af69e53aa04746	64.0 kB
decorators.csv	21fc32f517ff050c0d066029b5595d29	218.1 kB
failed_preconditions.csv	94e78804c1ef35704f6ea89d2a8fc2f9	191.1 kB
functions.csv	9f14727001525e480324934133595ff9	3.0 MB
nonoptimizable.csv	508965b7313d469ce04881c1d082d27b	36.1 kB
optimizable.csv	ed3d0dee5cc4813f367fc39c96655a1e	28.0 kB
performance.zip	9ca44698ad6787678aa435a862dfe32b	37.1 MB
README.md	ff3cd2f2da6d04ea7811daa8c6009e84	5.7 kB
src.zip	03a36bb192ddd8bd7b64acd078c98d37	1.9 GB
statuses.csv	7a622d2c2368e267e6ded74172e57100	3.2 MB
subjects.csv	6c57ff77204abcc2ff970856e1429704	2.3 kB
transformations.csv	31676dd4f9ab4bdfe7c0ee4f1bfc508b	33.8 kB
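
Because each file in the listing above comes with an MD5 checksum, a download can be checked for corruption before use. The following is a minimal sketch, not part of the dataset itself: the helper names `md5_of` and `verify_downloads` are hypothetical, only a subset of the listed checksums is included, and the files are assumed to have been saved under their listed names in a single directory.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing above (subset shown for brevity).
EXPECTED_MD5 = {
    "candidate_functions.csv": "d796b4d64f94897248af69e53aa04746",
    "README.md": "ff3cd2f2da6d04ea7811daa8c6009e84",
    "src.zip": "03a36bb192ddd8bd7b64acd078c98d37",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, want in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == want:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # Assumes the files were downloaded into the current directory.
    for name, status in verify_downloads(".").items():
        print(f"{name}: {status}")
```

Reading in chunks matters mainly for `src.zip`, which at 1.9 GB may not fit comfortably in memory on every machine.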

Additional details

Is compiled by: Software: 10.5281/zenodo.15045769 (DOI)
Is supplement to: Conference paper: arXiv:2504.05424 (arXiv)

U.S. National Science Foundation
SHF: Small: Practical Analyses and Safe Transformations for Imperative Deep Learning Programs 2200343
U.S. National Science Foundation
Collaborative Research: CCRI: New: A Software Refactoring Community Infrastructure 2213763
U.S. National Science Foundation
SHF: Small: Knowledge, Methodologies, and Tool-support for Combating Technical Debt in Machine Learning Systems 2343750

Repository URL: https://github.com/ponder-lab/Hybridization-Evaluation
Programming language: CSV, Markdown, Python

	All versions	This version
Views	361	120
Downloads	1,648	406
Data volume	124.4 GB	93.2 GB
