Architectural Design Decisions for the Machine Learning Workflow: Dataset and Code

Warnett, Stephen John; Zdun, Uwe

doi:10.5281/zenodo.5730291

Published November 29, 2021 | Version 1

Dataset | Open

1. University of Vienna

Title: Architectural Design Decisions for the Machine Learning Workflow: Dataset and Code

Authors: Stephen John Warnett; Uwe Zdun

About: This is the dataset and code artifact for the article entitled "Architectural Design Decisions for the Machine Learning Workflow".

Contents: The "_generated" directory contains the generated results, including LaTeX files with tables for use in publications and the Architectural Design Decision model in textual and graphical form. "Generators" contains the Python applications that produce those results. "Metamodels" contains a Python file with type definitions. "Sources_coding" contains our source codings and audit trail. "Add_models" contains the Python implementation of our model and source codings. Finally, "appendix" contains a detailed description of our research method.
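To get oriented in the archive before running anything, a reader could list its top-level entries and compare them with the directories described above. The following is a minimal sketch using only the Python standard library; the function name `top_level_entries` and the path passed on the command line are illustrative assumptions and not part of the artifact, and the archive may wrap these directories in an enclosing folder.

```python
import sys
import zipfile
from pathlib import PurePosixPath


def top_level_entries(archive_path):
    """Return the sorted top-level names (files or directories) in a zip archive."""
    with zipfile.ZipFile(archive_path) as zf:
        # Zip member names always use "/" as the separator, so PurePosixPath
        # splits them consistently on every platform.
        return sorted(
            {PurePosixPath(name).parts[0] for name in zf.namelist() if name}
        )


if __name__ == "__main__":
    # Hypothetical usage: python list_archive.py ml_workflow_adds_v1.zip
    if len(sys.argv) > 1:
        for entry in top_level_entries(sys.argv[1]):
            print(entry)
```

If the output shows a single enclosing folder rather than the directories named above, the same approach can be applied one level deeper.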

Article Abstract: Bringing machine learning models to production is challenging as it is often fraught with uncertainty and confusion, partially due to the disparity between software engineering and machine learning practices, but also due to knowledge gaps on the level of the individual practitioner. We conducted a qualitative investigation into the architectural decisions faced by practitioners as documented in gray literature based on Straussian Grounded Theory and modeled current practices in machine learning. Our novel Architectural Design Decision model is based on current practitioner understanding of the topic and helps bridge the gap between science and practice, foster scientific understanding of the subject, and support practitioners via the integration and consolidation of the myriad decisions they face. We describe a subset of the Architectural Design Decisions that were modeled, discuss uses for the model, and outline areas in which further research may be pursued.

Objective: This article aims to study current practitioner understanding of architectural concepts associated with data processing, model building, and Automated Machine Learning (AutoML) within the context of the machine learning workflow.

Method: Applying Straussian Grounded Theory to gray literature sources containing practitioner views on machine learning practices, we studied methods and techniques currently applied by practitioners in the context of machine learning solution development and gained valuable insights into the software engineering and architectural state of the art as applied to ML.

Results: Our study resulted in a model of Architectural Design Decisions, practitioner practices, and decision drivers in the field of software engineering and software architecture for machine learning.

Conclusions: The resulting Architectural Design Decisions model can help researchers better understand practitioners' needs and the challenges they face, and guide their decisions based on existing practices. The study also opens new avenues for further research in the field, and the design guidance provided by our model can also help reduce design effort and risk. In future work, we plan on using our findings to provide automated design advice to machine learning engineers.

Notes

This work was supported by: FFG (Austrian Research Promotion Agency) project AMMONIS, no. 879705.

Files

Files (3.0 MB)

Name	Size	Checksum
ml_workflow_adds_v1.zip	3.0 MB	md5:8cd0e83b6a3d8617a736518dbe25e645
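
The MD5 digest listed for the archive can be used to check that a download completed intact. The following is a minimal sketch, assuming the file has been saved locally; the helper names `md5_of_file` and `matches_expected` are hypothetical, and only the digest value is taken from this record. MD5 is suitable here for detecting transfer corruption, not for defending against deliberate tampering.

```python
import hashlib
import sys

# MD5 digest as listed on this record for ml_workflow_adds_v1.zip.
EXPECTED_MD5 = "8cd0e83b6a3d8617a736518dbe25e645"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals the expected hex digest."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Hypothetical usage: python verify_md5.py ml_workflow_adds_v1.zip
    if len(sys.argv) > 1:
        ok = matches_expected(sys.argv[1])
        print("checksum OK" if ok else "checksum MISMATCH")
```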

	All versions	This version
Views	477	477
Downloads	45	45
Data volume	146.0 MB	146.0 MB
