Dataset Open Access

JTeC: A Large Collection of Java Test Classes for Test Code Analysis and Processing

Corò, Federico; Verdecchia, Roberto; Cruciani, Emilio; Miranda, Breno; Bertolino, Antonia

Dublin Core Export

<?xml version='1.0' encoding='utf-8'?>
<oai_dc:dc xmlns:dc="" xmlns:oai_dc="" xmlns:xsi="" xsi:schemaLocation="">
  <dc:creator>Corò, Federico</dc:creator>
  <dc:creator>Verdecchia, Roberto</dc:creator>
  <dc:creator>Cruciani, Emilio</dc:creator>
  <dc:creator>Miranda, Breno</dc:creator>
  <dc:creator>Bertolino, Antonia</dc:creator>
  <dc:description>The recent push towards test automation and test-driven development continues to scale up the amount of test code that needs to be maintained, analysed, and processed side by side with production code. As a consequence, on the one hand, regression testing techniques (e.g., for test suite prioritization or test case selection) capable of handling such large-scale test suites become indispensable; on the other hand, as test code exhibits its own characteristics, specific techniques for its analysis and refactoring are actively sought. We present JTeC, a large-scale dataset of test cases that researchers can use for benchmarking the above techniques or any other type of tool expressly targeting test code. JTeC collects more than 2.5M test classes belonging to 31K+ GitHub projects, amounting to more than 430 million LOC of ready-to-use real-world test code.</dc:description>
  <dc:description>Companion page for the JTeC dataset at</dc:description>
  <dc:subject>Software Testing, GitHub, Test Suite, Large Scale</dc:subject>
  <dc:title>JTeC: A Large Collection of Java Test Classes for Test Code Analysis and Processing</dc:title>
</oai_dc:dc>

                    All versions    This version
Views               929             148
Downloads           1,607           211
Data volume         155.9 GB        76.0 GB
Unique views        800             134
Unique downloads    1,271           138

