Modern systems are built using development frameworks. The infrastructure provided by these frameworks has a major impact on how the developed system executes, how its configurations are managed, how it is tested, and how and where it is deployed. Machine learning (ML) systems, which have revolutionized multiple industries, come with their own kinds of frameworks. Naturally, the issues that manifest in such systems may differ as well, as may the behavior of the developers who correct those issues. We are interested in characterizing the types of system-related issues (issues impacting performance, memory and resource usage, and other quality attributes) that emerge in machine learning frameworks, and how they differ from those in traditional frameworks. To this end, we conducted a large-scale exploratory study analyzing real-world system-related issues from 10 popular machine learning frameworks.
Our study yields a number of observations with implications for the development of machine learning systems, including differences in the frequency with which certain issue types occur, observations on how debate and time affect issue correction, and differences in developer specialization. We hope that this exploratory study will help developers calibrate their expectations, plan for risk, and allocate resources accordingly when using the tools provided by these frameworks to develop ML-based systems.