HackRep: A Large-Scale Dataset of GitHub Hackathon Projects
Authors/Creators
Description
Hackathons are time-bound collaborative events that often target software creation. Although hackathons have been studied before, prior work has been limited to in-depth studies of a few events, which restricts our understanding of hackathons as a software engineering activity.
To complement the existing body of knowledge, we introduce HackRep, a dataset of 100,356 hackathon GitHub repositories. We illustrate how HackRep can benefit software engineering researchers through preliminary investigations of hackathon continuation, the composition of hackathon teams, and the ability to estimate the geographical location of hackathons. These investigations demonstrate opportunities the dataset opens up, such as estimating hackathon durations from commit timestamps.
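As an illustration of the duration idea mentioned above, the following is a minimal sketch of one possible heuristic, not the authors' method: treat the densest run of commits, where consecutive commits are at most `max_gap` apart, as the hackathon itself. The function name `estimate_event_window`, the 12-hour `max_gap` default, and the sample timestamps are all assumptions made for this example; they are not part of HackRep.

```python
from datetime import datetime, timedelta


def estimate_event_window(timestamps, max_gap=timedelta(hours=12)):
    """Return (start, end) of the burst containing the most commits.

    A burst is a run of commits in which each commit follows the previous
    one by at most `max_gap`. This is an illustrative heuristic only.
    """
    if not timestamps:
        return None
    ts = sorted(timestamps)
    best_start, best_end, best_count = ts[0], ts[0], 1
    start, prev, count = ts[0], ts[0], 1
    for t in ts[1:]:
        if t - prev <= max_gap:
            count += 1
        else:
            # The gap is too large: close the current burst and start a new one.
            if count > best_count:
                best_start, best_end, best_count = start, prev, count
            start, count = t, 1
        prev = t
    # Account for the final burst, which the loop never closes.
    if count > best_count:
        best_start, best_end, best_count = start, prev, count
    return best_start, best_end


# Hypothetical commit history: one early setup commit, a weekend burst,
# and one follow-up commit weeks later.
commits = [
    datetime(2020, 1, 10, 18, 0),
    datetime(2020, 2, 1, 9, 0),
    datetime(2020, 2, 1, 13, 0),
    datetime(2020, 2, 1, 20, 0),
    datetime(2020, 2, 2, 6, 0),
    datetime(2020, 2, 2, 15, 0),
    datetime(2020, 3, 15, 11, 0),
]
start, end = estimate_event_window(commits)
print(end - start)  # 1 day, 6:00:00
```

The gap threshold trades off two failure modes: too small and an overnight pause splits one event into two bursts, too large and pre-event setup or post-event continuation commits are absorbed into the estimate.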
Files
Total size: 2.5 GB

| Name | Size | MD5 |
|---|---|---|
| | 2.5 GB | 9d9acd77e05b82c57d6a1053e75254fd |
| Scripts.zip | 24.5 kB | 470cb811137078a0da4c4deeb6c11493 |