Published February 13, 2020 | Version 1.1
Dataset Open

MLCQ: Industry-relevant code smell data set

  • 1. Faculty of Computer Science and Management, Wroclaw University of Science and Technology

Description

The MLCQ data set with nearly 15000 code samples was created by software developers with professional experience who reviewed industry-relevant, contemporary Java open source projects. 

We expect this data set to stay relevant longer than data sets based on code released years ago. It should also enable researchers to investigate the relationship between developers' background and their perception of code smells.

If you use this data set, please cite the following paper:

Lech Madeyski and Tomasz Lewowski. MLCQ: Industry-relevant code smell data set. In Evaluation and Assessment in Software Engineering (EASE 2020), April 15–17, 2020, Trondheim, Norway. ACM, New York, NY, USA, 6 pages. DOI: 10.1145/3383219.3383264 URL: https://doi.org/10.1145/3383219.3383264

Note: A pre-print should be available soon from http://madeyski.e-informatyka.pl

Notes

This work has been supported by the National Centre for Research and Development (NCBR), project POIR.01.01.01-00-0792/16.

Files

MadeyskiLewowskiMLCQAppendix.pdf

Files (9.7 MB)

Checksum, Size
md5:c7e658f9ffa8b89e10869c75255a0aae, 212.6 kB
md5:27dd24cf8fd8aa4118be1fa6fa06b02f, 46.2 kB
md5:5a60239229b3fedac953d3f0f6e4f02a, 23.8 kB
md5:9019beae88089838d2d13dc12e69241b, 7.5 MB
md5:ab313c25a4bd57c79e64eb263a96e7de, 1.9 MB
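
The listing above gives an MD5 checksum for each file, so a download can be checked for corruption before use. The following is a minimal sketch in Python, assuming the file has already been saved locally. The function names `md5_of_file` and `matches_listing` are illustrative, not part of the data set, and because the listing does not pair checksums with file names, the caller has to supply both the local path and the checksum string.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that larger files (for example the
    7.5 MB one in the listing) do not need to fit in memory at once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed):
    """Compare a local file against a listed checksum.

    `listed` may be written as in the listing ("md5:c7e6...") or as a
    bare hex string; the optional "md5:" prefix is stripped first.
    """
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

A call such as `matches_listing("downloaded_file", "md5:c7e658f9ffa8b89e10869c75255a0aae")` returns `True` only if the local file (the path here is a placeholder) has exactly that digest. MD5 is adequate for detecting accidental corruption but should not be treated as protection against deliberate tampering.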