Published September 21, 2025 | Version 1.0.0
Dataset Open

Replication package: The Prevalence of Code Review Guidelines for GUI-Based Testing in Open-Source

  • 1. Blekinge Institute of Technology
  • 2. Fortiss

Description

Replication package for the study "The Prevalence of Code Review Guidelines for GUI-Based Testing in Open-Source"

  • results.xls contains the final results of mapping code review guidelines to observed code review comments in pull requests.
  • repositories_all.zip includes a CSV file listing all identified GitHub repositories that meet our search criteria.
  • repositories-top100.xlsx provides an overview of the 100 repositories with the highest star counts, which are the ones considered in the study.
  • GitHub-Crawler-code-only includes the scripts used to gather all relevant data from GitHub.
  • GitHub-Crawler-with-data includes the same scripts together with intermediate results from the crawling process, such as all pull requests and metadata for each repository (~3.6 GB uncompressed).
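
As a starting point for inspecting the repository list, the following is a minimal sketch of reading a CSV file directly out of a downloaded archive such as repositories_all.zip. The name of the CSV inside the archive, its column names, and its encoding are not stated in this record, so the sketch picks the first .csv entry it finds, assumes UTF-8, and simply reports whatever header the file has. The function name read_csv_rows_from_zip is hypothetical and not part of the package.

```python
import csv
import io
import zipfile


def read_csv_rows_from_zip(zip_path):
    """Return (column names, rows) for the first CSV file found in a ZIP archive.

    Assumptions (not confirmed by the record): the archive contains at least
    one .csv entry, and that entry is UTF-8 encoded with a header row.
    """
    with zipfile.ZipFile(zip_path) as archive:
        csv_names = [name for name in archive.namelist()
                     if name.lower().endswith(".csv")]
        if not csv_names:
            raise FileNotFoundError("no CSV file found in archive")
        with archive.open(csv_names[0]) as raw:
            # Wrap the binary stream so the csv module receives text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            reader = csv.DictReader(text)
            rows = list(reader)
            return reader.fieldnames, rows
```

Calling read_csv_rows_from_zip("repositories_all.zip") after downloading the archive would return the header and one dictionary per listed repository, which is enough to check which columns are available before doing any further analysis.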

Files

GitHub-Crawler-code-only.zip

Files (685.8 MB)

Checksum Size
md5:be1a4607a3bc024b45ac9c0b561524bf 26.6 kB
md5:3615ff023a1cda72feb9c9ad1db07180 683.3 MB
md5:6ea3f28fdfec781878a612d0d1a9f92f 1.3 MB
md5:d7c8153aa258d76dea0796def3783992 18.6 kB
md5:b878dd5e25c49405e1f160e576774bf8 1.2 MB

Additional details

Software

Programming language
Python