Published May 19, 2019 | Version 2.0
Dataset Open

JTeC: A Large Collection of Java Test Classes for Test Code Analysis and Processing

  • 1. Gran Sasso Science Institute
  • 2. Gran Sasso Science Institute & Vrije Universiteit Amsterdam
  • 3. Federal University of Pernambuco
  • 4. Consiglio Nazionale delle Ricerche

Description

The recent push towards test automation and test-driven development continues to scale up the amount of test code that must be maintained, analysed, and processed side by side with production code. As a consequence, regression testing techniques capable of handling such large-scale test suites, e.g., for test suite prioritization or test case selection, are becoming indispensable. At the same time, because test code exhibits its own characteristics, specific techniques for its analysis and refactoring are actively sought. We present JTeC, a large-scale dataset of test cases that researchers can use to benchmark the above techniques or any other type of tool expressly targeting test code. JTeC collects more than 2.5M test classes belonging to more than 31K GitHub projects, amounting to more than 430 million lines of ready-to-use, real-world test code.

Notes

Companion page for the JTeC dataset at https://github.com/JTeCDataset/JTeC

Files

JTeC.csv

Files (2.0 GB)

Checksum Size
md5:c8ad1c386212621bb6975f964134623b 2.0 GB
md5:47ddd906e6ea88f7b51b2510b0f57c92 3.4 MB
md5:5b4473596678d62d9d83096273422c8c 35.1 kB
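
The md5 checksums listed for each file can be used to confirm that a download completed without corruption. The following is a minimal sketch in Python using only the standard library. The function names `md5_of_file` and `matches_listed_md5` and the local path in the usage comment are hypothetical, introduced here for illustration; they are not part of the dataset or its tooling.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks.

    Reading in chunks keeps memory use flat, which matters for the
    multi-gigabyte archive listed above.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected


# Usage (the path is an assumption: substitute wherever the file was saved):
# ok = matches_listed_md5("downloads/JTeC.csv", "md5:<checksum listed for that file>")
```

A result of `False` suggests an incomplete or altered download, in which case fetching the file again is the simplest remedy.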