Published October 11, 2019 | Version 2.0
Dataset Open

Mining the Technical Roles of GitHub Users

  • 1. Montandon
  • 2. Silva
  • 3. Valente

Description

This dataset contains the scripts and data used in the study reported in the paper "Mining the Technical Roles of GitHub Users". The files are described in more detail below:

  • processed_ground_truth.csv: A CSV file with information on the developers considered in the study. For privacy reasons, we preprocessed the dataset to remove identifying information. Please contact the authors if you need the original version.
  • processed_ground_truth_fullstack.csv: The same CSV file, but with fullstack developers.
  • script.ipynb, utils.py: Source code of the script used in our study.
  • Dockerfile, docker-compose.yml, requirements.txt: Files to replicate the code environment used in this study.
  • BoW-tuning.csv: List of classification results for different bag-of-words parameters.
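
Before opening the larger CSV files in full, it can help to look at their header and first few rows. The sketch below is an illustration rather than part of the study's scripts: the helper name peek_csv is hypothetical, and it assumes the files are comma-separated and UTF-8 encoded, which this description does not state.

```python
import csv
from itertools import islice


def peek_csv(path, n_rows=5):
    """Return (header, first n_rows data rows) without reading the whole file.

    Assumes a comma-separated, UTF-8 encoded file whose first row is a header;
    neither assumption is confirmed by the dataset description.
    """
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])            # empty list if the file is empty
        rows = list(islice(reader, n_rows))  # at most n_rows data rows
    return header, rows


# Example call (the file must already be downloaded to the working directory):
# header, rows = peek_csv("processed_ground_truth.csv")
# print(header)
```

If a file has very long text fields, Python's csv module may raise a field-size error; csv.field_size_limit can be used to raise the limit in that case.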

Files

Files (32.8 MB)

MD5 checksum and size of each file:

  • md5:129d88996d8db01ec8fae56f9c5e2771, 7.4 kB
  • md5:6e99b1c4dd52adc0197d1d6006db2890, 341 Bytes
  • md5:37a8b04ae13f8a985416a0cb155c345b, 349 Bytes
  • md5:18334c98e1ec6aac84068717371889cb, 13.6 MB
  • md5:a81c3873fc5a778a9f493da94446b5db, 19.1 MB
  • md5:9321b0ae73d2ef267541b8f720696c50, 1.5 kB
  • md5:43fd542071544ccc1e647f8126a56311, 23.2 kB
  • md5:ec7035aef2364ff9cf9d8f476082d49a, 16.1 kB