Published April 26, 2025 | Version v4
Data paper Open

Replication package for the paper "Performance Smells in ML and Non-ML Python Projects: A Comparative Study"

Authors/Creators

Description

Project Directory Structure

Analysis 

  • Contains the results of the smell distribution per file and per KLOC.
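As an illustration of the per-KLOC normalization mentioned above, the following is a minimal sketch, not the package's actual script. The function name and the sample numbers are hypothetical; it assumes that density is defined as the smell count divided by thousands of lines of code.

```python
def smells_per_kloc(smell_count: int, lines_of_code: int) -> float:
    """Return the number of smells per 1,000 lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    # Multiply before dividing to keep the arithmetic as exact as possible.
    return smell_count * 1000 / lines_of_code


# Hypothetical values for illustration only, not taken from the dataset.
density = smells_per_kloc(smell_count=12, lines_of_code=4800)
print(density)
```

Normalizing by KLOC lets projects of very different sizes be compared on the same scale, which a raw per-file count cannot do.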

classification

  • Holds the classification of smelly files based on their content.

criteria

  • Python package defining the criteria used to filter ML and non-ML projects, including toy projects.

csv

  • Holds all the files used to analyze the type of operations carried out in each specific stage.
  • Holds the statistics for ML and non-ML projects.
  • Holds the report of the dataset collection.
  • Holds the CSV file used to evaluate the accuracy, precision, and recall of our classifier model.
  • Holds the CSV files used to assess the performance of the RIdiom tool.
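For readers unfamiliar with the evaluation metrics named above, here is a minimal sketch of how accuracy, precision, and recall can be computed from paired ground-truth and predicted labels. The function name, the label strings, and the sample data are assumptions made for illustration; the actual CSV layout and labels in the package may differ.

```python
def binary_metrics(y_true, y_pred, positive):
    """Compute accuracy, precision, and recall for one positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    accuracy = correct / len(pairs) if pairs else 0.0
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical labels for illustration only.
truth = ["ML", "ML", "Non-ML", "Non-ML", "ML"]
preds = ["ML", "Non-ML", "Non-ML", "ML", "ML"]
accuracy, precision, recall = binary_metrics(truth, preds, positive="ML")
print(accuracy, precision, recall)
```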

dataset

  • Contains the list of projects collected per domain for analysis.

images

  • Contains heatmaps, boxplots, and histograms generated from the analysis.

repo_mining

  • Code for querying projects by topic from GitHub.
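A topic-based query of this kind could be built against the GitHub REST search endpoint roughly as follows. This is a sketch under assumptions, not the code shipped in repo_mining: the helper name, the chosen topic, and the sorting options are hypothetical, and no network request is made here.

```python
from urllib.parse import urlencode

# Public GitHub REST API endpoint for repository search.
SEARCH_URL = "https://api.github.com/search/repositories"


def build_topic_query(topic: str, language: str = "Python", per_page: int = 100) -> str:
    """Build a repository search URL filtered by topic and language."""
    params = {
        "q": f"topic:{topic} language:{language}",
        "sort": "stars",
        "order": "desc",
        "per_page": per_page,
    }
    return f"{SEARCH_URL}?{urlencode(params)}"


# Hypothetical topic chosen for illustration.
url = build_topic_query("machine-learning")
print(url)
```

The resulting URL can be passed to any HTTP client; authenticated requests get a higher rate limit than anonymous ones.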

source

  • All scripts used for the analysis.

utils

  • Contains various helper functions to reduce code duplication.

Zero_shot_classification

  • Code containing our zero-shot classification model.

Files

Artifacts_mlvsno-ml.zip

Files (3.7 MB)

md5:401259c6e3a6212271ff5d210ea071a9
3.7 MB
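After downloading the archive, its integrity can be checked against the MD5 value listed above. The following is a minimal sketch using Python's standard library. The helper name is hypothetical, and the local path assumes the archive was saved in the current directory under its original name.

```python
import hashlib
from pathlib import Path

# Checksum as published on this record.
EXPECTED_MD5 = "401259c6e3a6212271ff5d210ea071a9"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Assumed download location; adjust to wherever the archive was saved.
archive = Path("Artifacts_mlvsno-ml.zip")
if archive.exists():
    status = "OK" if md5_of_file(archive) == EXPECTED_MD5 else "MISMATCH"
    print(f"{archive.name}: {status}")
```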

Additional details

Software

Programming language
Python
Development Status
Active