Published May 26, 2020 | Version v2
Dataset
Open
A Dataset of Pull Requests and a Trained Random Forest Model for Predicting Pull Request Acceptance
Description
This record contains a curated dataset of 470,925 pull requests for 3,349 popular NPM packages, a description of the variables, a code snippet for building a Random Forest model that predicts pull request acceptance, and a pre-trained Random Forest model (in R). The dataset accompanies the ESEM 2020 paper "Effect of Technical and Social Factors on Pull Request Quality for the NPM Ecosystem" (https://arxiv.org/abs/2007.04816).
Citation:
@inproceedings{dey2020effect,
  title={Effect of technical and social factors on pull request quality for the npm ecosystem},
  author={Dey, Tapajit and Mockus, Audris},
  booktitle={Proceedings of the 14th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)},
  pages={1--11},
  year={2020}
}