Published December 12, 2024 | Version v1.0.2
Dataset Open

A Dataset of bots and human activities in NumFOCUS

  • 1. University of Mons

Description

This repository provides a dataset of public GitHub events performed by 24,400 contributors, together with the higher-level activities performed by the 853 most active contributors (34 bot accounts, 13 GitHub Apps, 4 built-in GitHub services and 802 human accounts). These contributors were active in GitHub repositories belonging to NumFOCUS organisations during July, August and September 2024. The dataset supports the empirical study in the paper "Observing bots in the wild: A quantitative analysis of a large open source ecosystem", published at the 6th International Workshop on Bots in Software Engineering (BotSE) 2025. DOI: https://www.doi.org/10.1109/BotSE67031.2025.00008. The paper is co-authored by Natarajan Chidambaram and Tom Mens (Software Engineering Lab, University of Mons, Belgium). This work is supported by Service Public de Wallonie Recherche under grant number 2010235 - ARIAC by DigitalWallonia4.AI, and by the Fonds de la Recherche Scientifique – FNRS under grant numbers J.0147.24 and T.0149.22.

 

File descriptions:

RawEvents.zip: Contains the raw events performed by contributors in GitHub repositories belonging to NumFOCUS organisations.

activities.csv: A CSV file containing the activities of all the contributors.

basic_features.csv: A CSV file containing the basic features (contributor type, number of activities performed, number of repositories contributed to, number of organisations involved with) for all contributors.

activities_per_activity_type.csv: A CSV file containing the number of activities of each activity type that contributors performed in GitHub repositories belonging to NumFOCUS organisations.
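
As a starting point for exploring the CSV files described above, the following minimal sketch tallies rows by the value of one column, using only the Python standard library. The column names are not documented on this page, so the helper takes the column as a parameter and reports the available headers if the name is wrong. The function name count_by_column is hypothetical and not part of the dataset.

```python
import csv
from collections import Counter


def count_by_column(csv_path, column):
    """Count how many rows share each value of `column` in a CSV file.

    `column` must match a header in the file; the actual header names of
    the dataset's CSV files are an assumption to verify before use.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(
                f"{column!r} not found; available columns: {reader.fieldnames}"
            )
        # The Counter is built while the file is still open.
        return Counter(row[column] for row in reader)
```

For example, if basic_features.csv stores the contributor type under a header named "type" (an assumption, so check the file's first line), count_by_column("basic_features.csv", "type") would return the number of contributors of each type.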

Files

Files (40.2 MB)

Checksum                              Size
md5:ecd6ccce381452bfc197ef5f110b1d3b  23.3 MB
md5:00156aea5795a0c00d717a3c05f77082  50.3 kB
md5:002e444e93490a8a9dd4bef5c8630ec7  19.4 kB
md5:49f9b758168222fa90bcd0b3cb42a156  16.8 MB
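
The MD5 checksums listed above can be used to check that a download is intact. The sketch below computes the MD5 digest of a local file in chunks and tests whether it matches any of the four listed values. The listing here does not show which checksum belongs to which file, so the check is against the whole set. The helper names md5_of_file and matches_listed_checksum are hypothetical.

```python
import hashlib

# MD5 checksums as listed for this record. The file name that goes with
# each checksum is not shown in the listing, so they are kept as a set.
LISTED_MD5 = {
    "ecd6ccce381452bfc197ef5f110b1d3b",
    "00156aea5795a0c00d717a3c05f77082",
    "002e444e93490a8a9dd4bef5c8630ec7",
    "49f9b758168222fa90bcd0b3cb42a156",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path):
    """Return True if the file's MD5 digest equals one of the listed values."""
    return md5_of_file(path) in LISTED_MD5
```

Calling matches_listed_checksum on each downloaded file should return True when the download is complete and unmodified. A False result suggests a truncated or altered file.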