Dataset Open Access
Benjamin S. Meyers; Nuthan Munaiah; Emily Prud'hommeaux; Andrew Meneely; Cecilia O. Alm; Josephine Wolff; Pradeep K. Murukannaiah
This dataset was released as part of the following publication.
This is the full dataset containing over 1.5 million comments posted by developers reviewing proposed code changes. The dataset also includes the values we calculated for all nine linguistic features (described in Section 4 of the paper cited above).
This dataset is a subset of the chromium_conversations.csv dataset. It contains the data used in the classification experiment outlined in Section 5 of the paper cited above: 2,994 comments automatically identified as acted upon and 800 comments manually identified as not known to be acted upon.
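Because the full file holds over 1.5 million comments, it may be convenient to inspect it without loading everything into memory. The following is a minimal sketch, not part of the original release: the function name summarize_csv is hypothetical, and it assumes the file is UTF-8 encoded with a single header row. The column names are not listed here, so the sketch only reports whatever header it finds.

```python
import csv
from typing import List, Tuple


def summarize_csv(path: str) -> Tuple[List[str], int]:
    """Return (column names, number of data rows), reading the file row by row.

    Assumptions (not confirmed by the dataset description): UTF-8 encoding
    and a single header row.
    """
    # Review comments can be long, so raise the csv module's default
    # per-field size limit before reading.
    csv.field_size_limit(2**31 - 1)
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        row_count = sum(1 for _ in reader)  # streams; never holds all rows
    return header, row_count
```

Calling summarize_csv("chromium_conversations.csv") on a downloaded copy would return the header and the number of data rows, which can then be compared against the counts stated above.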