
Published October 12, 2020 | Version 1.0
Video/Audio Open

Understanding and Detecting Harmful Code

Description

Code smells typically indicate poor design and implementation choices that may degrade software quality. Hence, they need to be carefully detected to avoid such poor design. In this context, some studies try to understand the impact of code smells on software quality, while others propose rule-based or machine-learning-based techniques to detect code smells. However, none of those studies or techniques focuses on analyzing code snippets that are really harmful to software quality. This paper presents a study to understand and detect code harmfulness. We analyze harmfulness in terms of Clean, Smelly, Buggy, and Harmful code. We define Harmful Code as a Smelly code element with one or more reported bugs. These bugs may or may not have been fixed. Thus, the incidence of Harmful Code may represent an increased risk of introducing new defects and/or design problems during its fixing. We perform our study with 22 smell types, 803 versions of 13 open-source projects, 40,340 bugs, and 132,219 code smells. The results show that even though we have a high number of code smells, only 0.07% of those smells are harmful. Abstract Function Call From Constructor is the smell type most related to Harmful Code. To cross-validate our results, we also perform a survey with 60 developers. Most of them (98%) consider code smells harmful to the software, and 85% of those developers believe that code smell detection tools are important. However, those developers are not concerned about selecting tools that are able to detect Harmful Code. We also evaluate machine learning techniques to classify code harmfulness: they reach an effectiveness of at least 97% in classifying Harmful Code. While Random Forest is effective in classifying both Smelly and Harmful Code, Gaussian Naive Bayes is the least effective technique. Our results also suggest that both software and developers’ metrics are important to classify Harmful Code.
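
The four categories named in the description can be illustrated with a minimal sketch. Only the Harmful Code definition (a smelly element with at least one reported bug, fixed or not) comes from the description. Treating Clean, Smelly, and Buggy as mutually exclusive is an assumption made here for illustration, and the `CodeElement` record, its field names, and the `categorize` function are hypothetical, not taken from the study.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    CLEAN = "Clean"
    SMELLY = "Smelly"
    BUGGY = "Buggy"
    HARMFUL = "Harmful"


@dataclass(frozen=True)
class CodeElement:
    # Hypothetical record; the field names are illustrative, not from the study.
    name: str
    smell_count: int
    reported_bug_count: int  # counts reported bugs whether fixed or still open


def categorize(element: CodeElement) -> Category:
    """Assign one category per element.

    Harmful follows the description: smelly with one or more reported bugs.
    The other three branches are an assumed, mutually exclusive reading.
    """
    has_smell = element.smell_count > 0
    has_bug = element.reported_bug_count > 0
    if has_smell and has_bug:
        return Category.HARMFUL
    if has_smell:
        return Category.SMELLY
    if has_bug:
        return Category.BUGGY
    return Category.CLEAN
```

Under these assumptions, `categorize(CodeElement("Parser", smell_count=2, reported_bug_count=1))` yields `Category.HARMFUL`, while the same element with no reported bugs would only be `Category.SMELLY`.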

Files

SBES2020_Harmful_code.mp4 (44.1 MB)
md5:0d38f097cfe4726bb7748a1154f50a60