There is a newer version of the record available.

Published March 11, 2016 | Version v1
Dataset Open

A corpus of Java projects representing the 2012 Ohloh universe

Creators

  • 1. Centrum Wiskunde & Informatica (CWI)

Description

A corpus of Java projects built using the Software Projects Sampling (SPS) tool by Nagappan et al. [1]. SPS measures how representative a smaller corpus is of a universe (here, Ohloh 2012) along a set of diversity dimensions, and constructs a maximally representative corpus by iteratively adding the project that increases representativeness the most.
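The iterative selection described above can be sketched as a greedy coverage loop. This is a minimal illustration, not the SPS implementation: the `Project` representation, the `similar` rule (every dimension within a fixed tolerance), and the function names are all assumptions made for the example. Here a universe project counts as "covered" when at least one selected project is similar to it, and representativeness is the fraction of the universe that is covered.

```python
from typing import Dict, List, Set

# Hypothetical representation: one numeric value per diversity dimension.
Project = Dict[str, float]


def similar(a: Project, b: Project, tolerance: float = 0.5) -> bool:
    """Assumed similarity rule: every dimension differs by at most `tolerance`."""
    return all(abs(a[dim] - b[dim]) <= tolerance for dim in a)


def covered_by(candidate: Project, universe: List[Project]) -> Set[int]:
    """Indices of the universe projects that `candidate` is similar to."""
    return {i for i, p in enumerate(universe) if similar(candidate, p)}


def greedy_sample(universe: List[Project], size: int) -> List[int]:
    """Pick up to `size` projects, each time adding the one that covers
    the most universe projects not yet covered by earlier picks."""
    neighbourhoods = [covered_by(p, universe) for p in universe]
    covered: Set[int] = set()
    chosen: List[int] = []
    for _ in range(size):
        remaining = [i for i in range(len(universe)) if i not in chosen]
        if not remaining:
            break
        best = max(remaining, key=lambda i: len(neighbourhoods[i] - covered))
        if not neighbourhoods[best] - covered:
            break  # no remaining project improves coverage
        chosen.append(best)
        covered |= neighbourhoods[best]
    return chosen


def representativeness(universe: List[Project], chosen: List[int]) -> float:
    """Fraction of the universe covered by the chosen projects."""
    covered: Set[int] = set()
    for i in chosen:
        covered |= covered_by(universe[i], universe)
    return len(covered) / len(universe) if universe else 0.0
```

With a toy universe of four projects whose single dimension takes the values 1.0, 1.2, 1.4 and 5.0, `greedy_sample(universe, 2)` first selects a project from the cluster of three and then the outlier, at which point every project is covered. Greedy selection of this kind does not guarantee an optimal sample in general.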

The corpus contains the source code of the selected projects as of 1 June 2012, because the Ohloh data on which the diversity was calculated dates from that period.

[1] M. Nagappan, T. Zimmermann, and C. Bird, “Diversity in software engineering research,” in ESEC/FSE. ACM, 2013, pp. 466–476.

Files

Total size: 1.6 GB
md5:bb6c5ec83475f026456f2758e6eed212
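
The MD5 checksum listed for the file can be used to check that a download completed intact. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of` and `verify` are illustrative, and the local file name is left to the caller because the record does not state one.

```python
import hashlib

# Checksum as listed on this record.
EXPECTED_MD5 = "bb6c5ec83475f026456f2758e6eed212"


def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks so that
    a 1.6 GB archive does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True when the file's MD5 digest matches the expected value."""
    return md5_of(path) == expected.lower()
```

Calling `verify(path_to_download)` returns `True` only when the local copy matches the published checksum. MD5 is adequate for detecting accidental corruption but is not a safeguard against deliberate tampering.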

Additional details

Related works

Is identical to
urn:NBN:nl:ui:18-24399 (URN)