Dataset Open Access

JTeC: A Large Collection of Java Test Classes for Test Code Analysis and Processing

Corò, Federico; Verdecchia, Roberto; Cruciani, Emilio; Miranda, Breno; Bertolino, Antonia

Citation Style Language JSON Export

  "publisher": "Zenodo", 
  "DOI": "10.5281/zenodo.3711509", 
  "language": "eng", 
  "title": "JTeC: A Large Collection of Java Test Classes for Test Code Analysis and Processing", 
  "issued": {
    "date-parts": [
  "abstract": "<p>The recent push towards test automation and test-driven development continues to scale up the amount of test code that needs to be maintained, analysed, and processed side-by-side with production code. As a consequence, on the one hand, regression testing techniques (e.g., for test suite prioritization or test case selection) capable of handling such large-scale test suites become indispensable; on the other hand, as test code exhibits its own characteristics, specific techniques for its analysis and refactoring are actively sought. We present JTeC, a large-scale dataset of test cases that researchers can use for benchmarking the above techniques or any other type of tool expressly targeting test code. JTeC collects 2.5M+ test classes belonging to 31K+ GitHub projects, summing up to more than 430 million LOC of ready-to-use real-world test code.</p>", 
  "author": [
    {
      "family": "Cor\u00f2, Federico"
    }, 
    {
      "family": "Verdecchia, Roberto"
    }, 
    {
      "family": "Cruciani, Emilio"
    }, 
    {
      "family": "Miranda, Breno"
    }, 
    {
      "family": "Bertolino, Antonia"
    }
  ], 
  "note": "Companion page for the JTeC dataset at", 
  "version": "2.0", 
  "type": "dataset", 
  "id": "3711509"
                   All versions   This version
Views              928            148
Downloads          1,606          211
Data volume        155.9 GB       76.0 GB
Unique views       799            134
Unique downloads   1,270          138


Cite as