Published November 1, 2021 | Version v0.2.0
Journal article Open

Workflow Analysis of Data Science Code in Public GitHub Repositories

Description

This record contains the supplementary files for a scientific article. It includes a dataset of data science code snippets taken from Jupyter notebook cells, each annotated with the data science step it performs, and a set of analyses for understanding the data science implementation life cycle.

*Version v0.2.0 adds further analyses, such as the anti-pattern analysis.

Files

data-science-code-analysis.zip (8.7 MB)
md5:dcfb2f3ad5b2c6a8a2e0649f0a9a286f

Additional details

Funding

Data-driven Contemporary Code Review (grant PP00P2_170529), Swiss National Science Foundation
CrowdAlytics: Large-Scale Human-Machine Systems for Data Science (grant 200020_184994), Swiss National Science Foundation