Published August 25, 2020 | Version v1
Dataset Open

A ground-truth dataset to identify bots in GitHub

Software engineering lab, University of Mons

Description

This is the ground-truth dataset we used to identify bots in GitHub. Each account in the dataset was rated by at least three raters, with high interrater agreement.

===

This dataset is outdated (it was created in 2020) and is therefore no longer recommended for use. Many of the GitHub accounts classified as bots are no longer active or even available today, and some may have changed status from bot to human (or vice versa) since then. If you need a ground-truth dataset of bot accounts for academic (or other) purposes, we recommend using a more recent and more complete dataset of GitHub bot accounts. Such a dataset can be found here:

https://doi.org/10.5281/zenodo.7740520

===

Files

Files (81.1 kB)

md5:b2f970611a844bbd9e915836d87d693c