Dataset Open Access
A curated dataset of 470,925 pull requests for 3,349 popular NPM packages, accompanied by a description of the variables, a code snippet for creating a Random Forest model that predicts pull request acceptance, and a pre-trained Random Forest model (in R). The dataset accompanies the ESEM 2020 paper "Effect of Technical and Social Factors on Pull Request Quality for the NPM Ecosystem" (https://arxiv.org/abs/2007.04816).
Citation:
@inproceedings{dey2020effect,
  title={Effect of technical and social factors on pull request quality for the npm ecosystem},
  author={Dey, Tapajit and Mockus, Audris},
  booktitle={Proceedings of the 14th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)},
  pages={1--11},
  year={2020}
}
Name | MD5 | Size |
---|---|---|
Curated_Pull_Request_Data.csv | 9ba00c622679e3ae6d2ff12bde44e3e7 | 35.8 MB |
description.pdf | 4c5c559bb644f3e2c991d71dc19932b5 | 40.4 kB |
PRMODEL.Rdata | ba1eb93c488e1090ab051901aeb370f2 | 258.4 MB |
snippet.R | 5cf376f3114ce94edf9e16fddaa50185 | 841 Bytes |
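
The MD5 checksums listed above can be used to confirm that the files downloaded intact, which is worth doing for the 258.4 MB model file in particular. The following is a minimal sketch in Python, not part of the published record: the function names `md5_of_file` and `verify_downloads` are hypothetical, and it assumes the four files sit in one local directory under their original names.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing of this record.
EXPECTED_MD5 = {
    "Curated_Pull_Request_Data.csv": "9ba00c622679e3ae6d2ff12bde44e3e7",
    "description.pdf": "4c5c559bb644f3e2c991d71dc19932b5",
    "PRMODEL.Rdata": "ba1eb93c488e1090ab051901aeb370f2",
    "snippet.R": "5cf376f3114ce94edf9e16fddaa50185",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to 'ok', 'mismatch', or 'missing'.

    `directory` is an assumed local download folder (hypothetical default).
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == expected:
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


if __name__ == "__main__":
    for name, status in verify_downloads().items():
        print(f"{name}: {status}")
```

A file reported as `mismatch` was most likely truncated or corrupted in transit and should be downloaded again before it is loaded in R.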
Statistic | All versions | This version |
---|---|---|
Views | 329 | 312 |
Downloads | 227 | 223 |
Data volume | 9.3 GB | 9.3 GB |
Unique views | 306 | 293 |
Unique downloads | 142 | 140 |