Challenges in Migrating Imperative Deep Learning Programs to Graph Execution: An Empirical Study

Tatiana Castro Vélez; Raffi Khatchadourian; Mehdi Bagherzadeh; Anita Raja

doi:10.5281/zenodo.6403785

Published March 31, 2022 | Version v5

Dataset | Open

1. City University of New York (CUNY) Graduate Center
2. City University of New York (CUNY) Hunter College
3. Oakland University

Efficiency is essential to support responsiveness w.r.t. ever-growing datasets, especially for Deep Learning (DL) systems. DL frameworks have traditionally embraced deferred execution-style DL code that supports symbolic, graph-based Deep Neural Network (DNN) computation. While scalable, such development tends to produce DL code that is error-prone, non-intuitive, and difficult to debug. Consequently, more natural, less error-prone imperative DL frameworks encouraging eager execution have emerged but at the expense of run-time performance. While hybrid approaches aim for the "best of both worlds," the challenges in applying them in the real world are largely unknown. We conduct a data-driven analysis of challenges—and resultant bugs—involved in writing reliable yet performant imperative DL code by studying 250 open-source projects, consisting of 19.7 MLOC, along with 470 and 446 manually examined code patches and bug reports, respectively. The results indicate that hybridization: (i) is prone to API misuse, (ii) can result in performance degradation—the opposite of its intention, and (iii) has limited application due to execution mode incompatibility. We put forth several recommendations, best practices, and anti-patterns for effectively hybridizing imperative DL code, potentially benefiting DL practitioners, API designers, tool developers, and educators.

Notes

Support for this project was provided by PSC-CUNY Award #638010051, jointly funded by The Professional Staff Congress and The City University of New York.

Files (461.7 kB)

Name	MD5	Size
commit_categorizations.csv	6f46580dfcbb60fa81b3845123238b45	72.7 kB
commits.csv	6e34c363f2ee647afa13a41c1fdefc85	57.8 kB
datasets.csv	86452006f55e1caf918c32a81cf5e559	503 Bytes
issue_categorizations.csv	e5e9f30dd708657a4871976e34d49549	56.0 kB
issues.csv	7a1f08f5ef05bc3ba304761d56a4ab18	221.6 kB
pipeline_stages.csv	4eaa2a604e956d64623d72bc294dd5c2	424 Bytes
problem_categories.csv	45c3f2cf9b5fa1f1ce86ac2bcff34ffa	7.4 kB
problem_causes.csv	d53d487ea1bdc81cda463e6b779c5f98	2.2 kB
problem_fixes.csv	f5c3f90fa7200c6e454044a13b2364b1	7.8 kB
problem_symptoms.csv	3d675fdd2821e52e7b8c62cbbc0f4071	1.2 kB
README.md	bdcd6caa5f4e335bf42b1cf9eaebdd43	2.2 kB
studied_subjects_commits.csv	3100b5360486d6941a7699ab53dde9fa	13.0 kB
studied_subjects_issues.csv	fdb1721d715a9143c70f6a4d70c895f4	18.6 kB
studies.csv	417cf62ee181ec9b7baffd832a03ce61	184 Bytes
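
The MD5 checksums listed with each file can be used to confirm that a download is intact. The following is a minimal sketch of such a check, not part of the dataset's own tooling: the helper names md5_of and verify, the status strings, and the directory name zenodo_download are hypothetical, and only a subset of the listed checksums is included for brevity.

```python
import hashlib
from pathlib import Path

# Checksums copied from the file listing (subset only; extend as needed).
EXPECTED_MD5 = {
    "commits.csv": "6e34c363f2ee647afa13a41c1fdefc85",
    "issues.csv": "7a1f08f5ef05bc3ba304761d56a4ab18",
    "README.md": "bdcd6caa5f4e335bf42b1cf9eaebdd43",
}


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected=EXPECTED_MD5):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, want in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of(path) == want:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    # "zenodo_download" is a hypothetical local directory holding the files.
    for name, status in verify("zenodo_download").items():
        print(f"{name}: {status}")
```

A "mismatch" status suggests a corrupted or partial download, or a file taken from a different version of the record; checksums are specific to Version v5.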

Additional details

Is compiled by: Software: https://github.com/ponder-lab/Imperative-DL-Study-Web-App
Is derived from: Other: https://github.com/ponder-lab/Imperative-DL-Study-Data
Is supplement to: Preprint: arXiv:2201.09953

Statistic	All versions	This version
Views	1,041	98
Downloads	462	32
Data volume	23.7 MB	1.7 MB
