Apache POI pre-processed data for the first DocGen challenge at DySDoc 3
Creators
- 1. McGill University
- 2. The University of Texas at Dallas
- 3. University of Adelaide
- 4. Università della Svizzera italiana
- 5. University of Delaware
- 6. University of Victoria
- 7. Northern Arizona University
- 8. Nara Institute of Science and Technology
- 9. Tokyo Institute of Technology
- 10. Colorado State University
- 11. University of Alberta
- 12. ABB Corporate Research
Description
The pre-processed data for the First Software Documentation Generation Challenge (DocGen), hosted at the Third International Workshop on Dynamic Software Documentation (DySDoc 3), includes the following datasets for Apache POI 3.17:
Call graph between methods and classes
File: call-graph-poi-3.17-all.zip
CSV file with the call graph between methods and between classes. Class A calls class B if there is a call from a method of class A to a method of class B. The call graph was produced by the tool java-callgraph.
The CSV file contains the following columns:
- call_type: call between (C)lasses or (M)ethods
- caller: the Fully Qualified Name (FQN) of the caller
- method_call_type: the type of method call:
  - M for invokevirtual calls
  - I for invokeinterface calls
  - O for invokespecial calls
  - S for invokestatic calls
  - D for invokedynamic calls
- callee: the FQN of the callee
For more details about the format and each type of method call, check the tool README.
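As a rough illustration of how the columns above might be consumed, the following Python sketch splits the rows into class-level and method-level edges and counts method calls per invocation kind. It assumes the CSV has a header row with the column names listed above; the sample rows and their FQN notation are hypothetical and are not taken from the dataset.

```python
import csv
import io
from collections import Counter, defaultdict

# Codes documented for the method_call_type column.
INVOKE_KINDS = {
    "M": "invokevirtual",
    "I": "invokeinterface",
    "O": "invokespecial",
    "S": "invokestatic",
    "D": "invokedynamic",
}


def load_call_graph(stream):
    """Return (class_calls, method_calls, kind_counts) from a CSV stream.

    Assumes a header row with the documented column names.
    """
    class_calls = defaultdict(set)   # caller class  -> callee classes
    method_calls = defaultdict(set)  # caller method -> callee methods
    kind_counts = Counter()          # invocation kind -> number of rows
    for row in csv.DictReader(stream):
        if row["call_type"] == "C":
            class_calls[row["caller"]].add(row["callee"])
        else:
            method_calls[row["caller"]].add(row["callee"])
            code = row["method_call_type"]
            kind_counts[INVOKE_KINDS.get(code, code)] += 1
    return class_calls, method_calls, kind_counts


# Hypothetical rows for illustration only.
SAMPLE = """call_type,caller,method_call_type,callee
C,org.example.A,,org.example.B
M,org.example.A:foo(),M,org.example.B:bar()
M,org.example.A:foo(),S,org.example.B:baz()
"""

class_calls, method_calls, kind_counts = load_call_graph(io.StringIO(SAMPLE))
print(sorted(class_calls["org.example.A"]))
print(dict(kind_counts))
```

To run this against the real data, the `io.StringIO` object would be replaced by an open handle on the CSV extracted from call-graph-poi-3.17-all.zip.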
Inheritance hierarchy
File: poi-3.17-inheritance.zip
A CSV file with the inheritance hierarchy of POI, extracted using BCEL 6.2.
The CSV file contains the following columns:
- record_id: sequential number
- parent_class: the parent class
- child_class: the child class
- relationship_type: the type of relationship between classes, i.e., the child class 'extends' or 'implements' the parent class
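One possible use of these columns is to collect every direct and indirect subtype of a class. The sketch below is a minimal Python example that assumes a header row with the column names listed above; the class names in the sample are hypothetical.

```python
import csv
import io
from collections import defaultdict


def load_children(stream):
    """Map each parent class to the set of its direct child classes."""
    children = defaultdict(set)
    for row in csv.DictReader(stream):
        children[row["parent_class"]].add(row["child_class"])
    return children


def descendants(children, root):
    """Return all classes that directly or indirectly extend/implement root."""
    seen = set()
    stack = [root]
    while stack:
        for child in children.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# Hypothetical rows for illustration only.
SAMPLE = """record_id,parent_class,child_class,relationship_type
1,org.example.Shape,org.example.Polygon,implements
2,org.example.Polygon,org.example.Square,extends
"""

children = load_children(io.StringIO(SAMPLE))
print(sorted(descendants(children, "org.example.Shape")))
```

The traversal ignores relationship_type; filtering rows on 'extends' or 'implements' before building the map would restrict it to one kind of relationship.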
Issues
File: bugzilla-poi-dump.zip
CSV file with the list of Apache POI issues (timestamp: Tue Feb 27, 2018, 18:41:40 UTC).
The CSV file contains the following columns:
- record_id: sequential number
- issue_id: the ID that identifies the issue in the issue tracker
- issue_url: the URL of the issue in the issue tracker
- issue_title: the title of the issue
- xml_path: the path to the XML of the issue, which contains all the issue information provided by the issue tracker
All the issues in XML format can be found in the "poi" folder in the ZIP file.
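To connect the CSV index to the per-issue XML files, a lookup from issue_id to XML location could be built as in the Python sketch below. It assumes a header row with the column names listed above and that xml_path is relative to the folder where the ZIP was extracted; the extraction folder name and the sample row are hypothetical.

```python
import csv
import io
from pathlib import PurePosixPath


def index_issues(stream, root="."):
    """Map issue_id to the path of its XML file under the extraction root.

    Assumes xml_path is relative to the extracted ZIP contents.
    """
    return {
        row["issue_id"]: PurePosixPath(root) / row["xml_path"]
        for row in csv.DictReader(stream)
    }


# Hypothetical row for illustration only.
SAMPLE = """record_id,issue_id,issue_url,issue_title,xml_path
1,12345,https://example.org/show_bug.cgi?id=12345,"Hypothetical title, with a comma",poi/12345.xml
"""

paths = index_issues(io.StringIO(SAMPLE), root="extracted")
print(paths["12345"])
```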
Commits
File: poi-commits.zip
A JSON file with commit information for POI 3.17 (until revision 219dff00e6, on Sept. 8, 2017). The information was extracted using the tools Historage and Kataribe.
For each commit, we provide:
- Commit hash
- Parent commit hash (if exists)
- Commit message
- Commit time
- Committer name
- Method-level changes (addition/deletion/modification/renaming and method FQN)
  - The FQN contains information about the class (CN) and method (MT) or constructor (CS)
Stack Overflow posts
File: apache-poi-SO.zip
JSON file with all 6,299 Stack Overflow threads with the apache-poi tag.
Files (184.3 MB)
MD5 | Size |
---|---|
d1323fce43b7bb2dd28c4371a871c795 | 12.4 MB |
ed33d2df3151cc1c18db4cbf9f4d11f2 | 166.8 MB |
94167dd47a30b2b995c9cbd4d866cf36 | 2.3 MB |
d2da7a882e80164fa08d115c0629cb71 | 85.3 kB |
39606442e9240c99fb3f570bb4c8b6b6 | 2.7 MB |
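The MD5 values listed above can be used to check a download for corruption. The Python sketch below hashes a file in chunks and tests whether the digest is one of the listed values. Because the listing as reproduced here does not say which checksum belongs to which archive, the check is only a membership test; the function names are illustrative.

```python
import hashlib

# MD5 digests listed for the five archives of this record.
LISTED_MD5 = {
    "d1323fce43b7bb2dd28c4371a871c795",
    "ed33d2df3151cc1c18db4cbf9f4d11f2",
    "94167dd47a30b2b995c9cbd4d866cf36",
    "d2da7a882e80164fa08d115c0629cb71",
    "39606442e9240c99fb3f570bb4c8b6b6",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """True if the file's MD5 matches one of the listed digests."""
    return md5_of(path) in LISTED_MD5
```

For example, `is_listed("poi-commits.zip")` would be expected to return `True` for an intact download of that archive.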