Itemlet Dataset: A Multi-project, Multi-domain, and Feature-engineered Dataset for Empirical Software Engineering
Description
The Itemlet dataset is a large-scale, multi-project collection of Jira issues (727,282 items in total). It was built from 204 open-source projects spanning 19 domains, including healthcare, finance, developer tools, and e-commerce. Every row contains 108 variables. Sixty of these were extracted from the Jira REST API and cover the full life cycle of an issue, including sprint metadata, the users involved with the issue, and the effort applied to resolve it. The remaining 48 are pre-computed engineered features that encode collaboration dynamics, effort risk, temporal patterns, and business value signals. The dataset supports three formally defined prediction tasks: effort estimation, issue prioritization, and complexity classification. For each task, the ground truth is derived directly from one or more variables in the dataset. Four parallel effort measures are provided: story points, cycle time, total time logged, and completion time. Sprint metadata is included for 267,203 issues, and domain labels are provided across 19 categories. All 108 variables are documented in the accompanying data dictionary. The dataset achieves a dimension-weighted average score of 94.75% across 15 FAIR criteria. It is publicly available under the CC BY 4.0 license, together with supplementary materials, a project summary file, a domain classification file, and a fixed requirements file.
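As a quick sanity check after downloading, the row and column counts stated in the description can be compared against the file itself. The following is a minimal sketch, not part of the dataset's own tooling. It assumes the CSV is UTF-8 encoded, has a single header row, and that the stated 727,282 items correspond to data rows; the function names `summarize_csv` and `matches_description` are hypothetical.

```python
import csv

EXPECTED_ROWS = 727_282   # issue count stated in the description
EXPECTED_COLUMNS = 108    # variable count stated in the description


def summarize_csv(path):
    """Return (data_row_count, column_count) for a CSV file with a header row."""
    # Raise the per-field size limit in case some text fields are very long.
    csv.field_size_limit(2**31 - 1)
    # newline="" lets the csv module handle line breaks inside quoted fields.
    with open(path, newline="", encoding="utf-8") as handle:  # encoding is an assumption
        reader = csv.reader(handle)
        header = next(reader)
        row_count = sum(1 for _ in reader)
    return row_count, len(header)


def matches_description(path):
    """Check a file against the counts given in the dataset description."""
    rows, columns = summarize_csv(path)
    return rows == EXPECTED_ROWS and columns == EXPECTED_COLUMNS
```

Calling `matches_description("itemlet_dataset.csv")` streams the file once without loading it into memory, which matters for a file of this size.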
Files (546.4 MB in total, including itemlet_dataset.csv)
| MD5 checksum | Size |
|---|---|
| md5:8f7905125bb240eee3f4d5d8aeb505b1 | 2.4 kB |
| md5:e62d3884e7deca03a106b86469ea9d48 | 16.0 kB |
| md5:8c6502de0ab9aa41c5410f75a1085709 | 1.7 kB |
| md5:6caa3c6f10eb9cf629f3f8e7014df772 | 49.0 kB |
| md5:ea0426869723cd4792b37e928019bb81 | 438.5 MB |
| md5:e80d4f6b521d5ace92c039fb49325480 | 107.9 MB |
| md5:3a49d0835731aab84f18ceca108ad0ae | 16.4 kB |
| md5:f8f4cf3ffd64c8cd663db3c0d8a07f05 | 18.8 kB |
| md5:3788c7bd3b0b430fdb30627e863db640 | 322 Bytes |
| md5:891de9fe9c9ddc0a1890ba4a4dcb8888 | 9.3 kB |
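The md5 values in the file listing can be used to confirm that a download completed without corruption. The sketch below uses only Python's standard `hashlib` module. The helper names `md5_of_file` and `verify_download` are hypothetical, and because the listing does not show which file name belongs to which checksum, the caller has to supply the matching pair.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, listed_checksum):
    """Compare a file's MD5 with a listed value such as 'md5:8f79...'."""
    expected = listed_checksum.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use flat even for the largest files in the record. MD5 is adequate here for detecting transfer errors, though it is not a defense against deliberate tampering.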