Published March 15, 2019 | Version v0.1.0
Dataset Open

A Panel Data Set of Cryptocurrency Development Activity on GitHub

  • 1. Carnegie Mellon University
  • 2. University of Evansville

Description

Contents:

  • all-sorted-recovered-normalized-2018-01-21-to-2019-02-04.csv: CSV format of all data, sorted by date. This file contains some imputed values for missing data, and missing fields across all repositories are normalized to "null". This is the most convenient form to use.
  • all-sorted-2018-01-21-to-2019-02-04.csv: CSV format of all data, sorted by date. This is the data as converted from the raw format, before imputation and normalization.
  • raw-data-2018-01-21-to-2019-02-04.tar.gz: The raw format of data collected (S-expressions). Contains additional contributor data and CoinMarketCap data not currently in the CSV datasets.
  • recovered.patch: The modifications made to all-sorted-2018-01-21-to-2019-02-04.csv when recovering (imputing) data, showing what was recovered.
  • recovered-normalized.patch: The modifications made to all-sorted-2018-01-21-to-2019-02-04.csv after normalizing the recovered data set. Thus, patching all-sorted-2018-01-21-to-2019-02-04.csv with recovered.patch and then with recovered-normalized.patch gives all-sorted-recovered-normalized-2018-01-21-to-2019-02-04.csv.
  • missing-dates.txt: Days on which GitHub data collection was missed (partially or completely).
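
As a minimal sketch of working with the normalized CSV described above, the following Python reads the file row by row and converts the literal "null" marker into None. It assumes a comma-delimited, UTF-8 encoded file and makes no assumption about column names or about whether a header row is present; the helper names read_rows and count_missing are hypothetical and not part of the data set or the related code.

```python
import csv


def read_rows(path, null_token="null"):
    """Yield each CSV row as a list, replacing the null marker with None.

    Assumption: the file is comma-delimited and UTF-8 encoded.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            yield [None if cell == null_token else cell for cell in row]


def count_missing(path):
    """Return (total_cells, missing_cells) as a quick completeness check."""
    total = 0
    missing = 0
    for row in read_rows(path):
        total += len(row)
        missing += sum(cell is None for cell in row)
    return total, missing
```

Streaming the rows rather than loading the whole file keeps memory use low, which may matter given the size of the CSV files.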

Related publications:

@inproceedings{van-tonder-crypto-oss-2019, 
  title = {{A Panel Data Set of Cryptocurrency Development Activity on GitHub}},
  booktitle = "International Conference on Mining Software Repositories",
  author = "{van~Tonder}, Rijnard and Trockman, Asher and {Le~Goues}, Claire",
  series = {MSR '19},
  year = 2019
} 

@inproceedings{trockman-striking-gold-2019, 
  title = {{Striking Gold in Software Repositories? An Econometric Study of Cryptocurrencies on GitHub}},
  booktitle = "International Conference on Mining Software Repositories",
  author = "Trockman, Asher and {van~Tonder}, Rijnard and Vasilescu, Bogdan",
  series = {MSR '19},
  year = 2019
}

Related code: https://github.com/rvantonder/CryptOSS

Files

Files (6.3 GB)

Checksum                               Size
md5:546004c45e2bad619f003c8954178e9c   204.2 MB
md5:f47f9c115e42261a9bd44cbc489fba3a   378.9 MB
md5:98b8f5edb0d0c6fe976b7da8b3e905a5   484 Bytes
md5:42172152b23447682251d03ea5e9b8e9   5.4 GB
md5:5e2cfb9e48f96cffedb538dfd4b98816   310.1 MB
md5:4e6c58da1bf16a62b856f5395e81fb31   9.5 MB
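
The MD5 checksums listed above can be used to check a download for corruption. The sketch below computes a file's MD5 digest in chunks and compares it with a listed value. The function names md5_of_file and matches_listing are hypothetical, and because the listing does not show which file each checksum belongs to, the pairing of file and checksum is left to the caller.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so multi-gigabyte files never sit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a file against a listed value such as 'md5:<hex digest>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, matches_listing("missing-dates.txt", "md5:...") returns True only if the local file's digest equals the digest given after the "md5:" prefix.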