There is a newer version of this record available.

Dataset Open Access

JTeC: A Large Collection of Java Test Classes for Test Code Analysis and Processing

Corò, Federico; Verdecchia, Roberto; Cruciani, Emilio; Miranda, Breno; Bertolino, Antonia

The recent push towards test automation and test-driven development continues to scale up the volume of test code that must be maintained, analysed, and processed side by side with production code. As a consequence, on the one hand, regression testing techniques (e.g., for test suite prioritization or test case selection) capable of handling such large-scale test suites become indispensable; on the other hand, because test code exhibits its own characteristics, specific techniques for its analysis and refactoring are actively sought. We present JTeC, a large-scale dataset of test cases that researchers can use to benchmark the above techniques or any other type of tool expressly targeting test code. JTeC collects more than 750K test classes belonging to 40K+ GitHub projects, totalling more than 130 million LOC of ready-to-use real-world test code.

Companion page for the JTeC dataset at https://github.com/JTeCDataset/JTeC
Files (1.5 GB)

Name       Size      MD5
JTeC.csv   4.5 MB    3dd3fdd364dffc81e738d8e1dd747364
JTeC.zip   1.5 GB    c0ec31063e775f5bccee232b84879eb6
LICENSE    19.0 kB   daf5dc54f3672cda8f71a642d2030688
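
The MD5 digests listed for the files above can be used to check that a download completed without corruption. The following is a minimal sketch, assuming the three files have been saved under their listed names in a single local directory. The helper names `md5_of_file` and `verify_downloads` are hypothetical and not part of the dataset or its companion repository.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing of this record.
EXPECTED_MD5 = {
    "JTeC.csv": "3dd3fdd364dffc81e738d8e1dd747364",
    "JTeC.zip": "c0ec31063e775f5bccee232b84879eb6",
    "LICENSE": "daf5dc54f3672cda8f71a642d2030688",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Chunked reading keeps memory use low for the 1.5 GB archive.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory="."):
    """Map each expected file name to True if it exists and its digest matches.

    Assumption: files keep the names shown in the listing and sit in `directory`.
    """
    results = {}
    for name, expected in EXPECTED_MD5.items():
        path = Path(directory) / name
        results[name] = path.is_file() and md5_of_file(path) == expected
    return results


if __name__ == "__main__":
    for name, ok in verify_downloads().items():
        print(f"{name}: {'OK' if ok else 'missing or checksum mismatch'}")
```

Run from the download directory, the script prints one line per file. A mismatch usually indicates an incomplete download, or a file obtained from a different version of the record.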
                   All versions   This version
Views              929            646
Downloads          1,607          1,368
Data volume        155.9 GB       24.3 GB
Unique views       800            573
Unique downloads   1,271          1,120
