Software Quality Metrics Calculations for Java Programming Learning Assistant System

A Web-based Java Programming Learning Assistant System (JPLAS) has been proposed to assist Java programming educations in universities. In the code writing problem, the correctness of an answer code from a student is verified by running the test code on JUnit. Besides, their quality should be measured using the metrics to assess them. The currently using plugin could only be measured on eclipse for offline answering in JPLAS. To calculate the metrics and implement in web-based JPLAS, there are several equations that have been reported. In this paper, we find the proper equations to calculate the metrics that provide the same results as from Eclipse plugin. The application results for 45 source codes showed that the adopted metrics equations provide the same results as the plugin.


INTRODUCTION
Java is a secure, portable, and platform independent programming language. Java engineers have been demand. Thus, Java is taught in many universities and colleges.
To assist Java educations, Java programming learning assistant system (JPLAS) has been developed. In JPLAS, the student can solve the exercise problems both on online and offline using Eclipse. JPLAS provides different types of Java programming exercises to cover the different learning levels. The code writing problem asks a student to write a whole source code that passes the given test code on JUnit. Besides, software metrics should be measured to assess the quality of the answer code [1].
Software metrics can evaluate the software product under developments, which gives a vision on the quality to make a whole process more successful. These metrics are called software quality metrics or product metrics. In the code writing problem, seven quality metrics can be measured for assessment of the code. They include NOCnumber of classes, NOMnumber of methods, CCcylomatic complexity, NBDnested block depth, LCOMthe lack of cohesion, MLOC -method lines of code and TLOCtotal lines of code [2]. They can be used for the size estimation, the complexity evaluation, and the maintenance of the code.
The currently using plugin can be used on Eclipsed to measure the metrics. Eclipse is used when the students solve the problems on offline. To measure the software metrics of a code, the equations to calculate them must be implementation in the JPLAS server. Unfortunately, there are several different equations to calculate some quality metrics. We need to find proper equation to calculate the metrics that provide the same results from Eclipse plugin.
In this paper, first, equations to calculate each metric are surveyed. Then, the proper one is selected by comparing the results with Eclipse plugin for 45 answer codes. The results showed that the calculated metrics by the selected equations are equal to those of Eclipse plugin.

II. REVIEW OF JPLAS
In this section, we review JPLAS.

A. JPLAS
In JPLAS, Ubuntu is used as the operating system that is running on VMware. For web server, Tomcat is used to run HTML, JSP and servlets. HTML-hypertext Markup Language is the language to create the webpages. JSP is a script combined with Java code and it is embedded to HTML. Servlet is a small Java program that runs on web server. For database, MySQL is used for managing the data.
JPLAS implements supporting functions for teachers and students. In the teacher function, the teacher can show the problems to students by uploading them to the server. In the student function, the student can answer the problems from a Web browser and submit them to the server on online.
Besides, they can answer the problems using Eclipse on offline after receiving the problems by e-mail or downloading the problems from JPLAS, and submit the answer via email.

B. Exercies Problems in JPLAS
In JPLAS, four types of problems are provided. The three problems are the types of fill-in-blank problems to understand the grammar and reading studies. The last Identify applicable funding agency here. If none, delete this text box. problem is for the code writing study. In this paper, the code writing problem is focused for the study of software metrics.

C. Fill-in-blank Problem
In this problem, the students need to fill the correct answers in blanks for a given Java code. The answer is marked by comparing with their original elements in the code. Thus, the original element must be unique correct answer for each blank. The blanks elements are identifiers, variable, reserved words and control symbols.

D. Value Trace Problem
This problem is another type of element fill-in-blank problem that keeps the process of filling and marking the answer. It questions a student about actual values of important variables in the code. The students need to fill the correct values of variables in blanks. In this problem, the output data of variables from the code execution is blanked. The output data may contain one or more values. It blanks the output data line by line.

E. Statement Fill-in-blank Problem
In this problem, the students need to fill the whole statement in a blank for a give Java code. The answer is marked by using the test code on JUnit as the code writing problem.

F. Code Writing Problem
In this problem, the students need to write the whole code as the information given in test code. Then, the answer code is tested through the test code on JUnit as test driven development method (TDD). Besides, their quality is accessed by measuring the seven quality metrics on Eclipse plugin.

G. JUnit
It is an open-source Java framework for unit testing on Java programming language and adopted in JPLAS. It is important in development of TDD -test driven development.
Test code is implemented on JUnit. Although, test code is a test code programming language, it is rather simple for the Java programmers. This reason is that JUnit has been designed for Java programming language. In JUnit, one test can be performed by using one method in the JUnit library. In the code writing problem, the method whose name starts with 'assert' is used to check the execution results by comparing with the expected method [3].

H. TDD Method
In the TDD method, the following process can be done. 1) The source code must be prepared first to write test code from them. Thus, the test code includes the information on model source code, it will be tested in later.

2)
Then, the answer code is written and tested it on JUnit through test code.
3) The source code can be re-factored until passing through the test code. Thus, the re-factoring process of a source code becomes easy, because the modified code can be tested instantly.

I. Metric Plugin
Metric plugin for Eclipse is commonly used open source software plugin for metrics calculation. This plugin can measure the various metrics on source code and their results are shown by number in metric view. This plugin is used to measure software metrics to assess the quality of the code for the code writing problem [4] [5].

III. SOFTWARE METRICS
Software metrics are the measurement and prediction of software products, which are essential resources for a project and products relevant for software evolutions. Measurements can be used throughout the software project for quality control by comparing the current measurements with past measurements for similar projects.

A. Overview of Metrics
There are many metrics that can be categorized into process and product metrics. Besides, due to the great interest in the use of object oriented languages, many object oriented design metrics has been proposed. Product metrics measure size, complexity, quality, and reliability of software product. Process metrics measure the various characteristics of the software development process. Object-oriented metrics measure the different aspects of object-oriented design, including complexity, cohesion, and coupling. Among them, product metrics and object-oriented metrics measure the quality of the code.

B. Product Metrics
Product metrics are known as quality metrics. They help improving the quality of the different system components, and comparisons between existing systems. Various kinds of product metrics have been proposed. They include reliability metrics, functionality metrics, performance metrics, usability metrics, cost metrics, size metrics, complexity metrics and style metrics. They are used to measure the properties of the software. Among them, some quality criteria can be used to predict a certain quality of the software [6] and they are as follows:  NOC -number of classes and DIT -depth of inheritance are measured to assess maintainability and reusability of the program.  LOC-lines of code is measured to assess the size of the code.  CC -cyclomatic complexity is measured to assess reliability of the program.

C. Object-Oriented Metrics
Object-oriented designs are more beneficial in software development environment. Object-oriented metrics are used to measure properties of object oriented designs. The objectoriented metrics measure on class and its design viz; localization, encapsulation, inheritance, polymorphism, and object abstraction techniques, which make the class unique.

A. Number of Classes (NOC)
NOC measures the number of classes within the application package. It is a measure of how many subclasses are going to inherit the methods in the parent class. If a class has many subclasses, it is regarded as the bad design. The lower value of NOC helps maintainability and complexity of codes.

B. Number of Methods (NOM)
NOM measures the number of methods within classes. The number of methods that are local to the class and only those methods can be measured.

C. Cyclomatic Complexity (CC)
CC measures the structural complexity of a procedure by counting the number of independent paths in a method. The paths represent the number of decision points in the code, which include if, while, do-while, for, switch-casedefaults, try-catch finally. The goal of CC is to evaluate the testability and maintainability of a software module [8].
The original complexity is calculated as follows: Where: CC = cyclomatic complexity E = the number of edges of the graph N = the number of nodes of the graph Then, the improve complexity is defined as the followings [9]: 1) If the source codes contain no decision points, their complexity would be 1 since there is only a single path through the code.

2)
If the code has a single IF statement containing a single condition, there would be two paths through the code, one path for TRUE and one path for FALSE. In above conditions, CC is calculated as follows: where: P = the number of connected components

3)
An alternate function is used when the cyclomatic complexity is applied to several subprograms at the same time. CC= E -N + P The following example code contains a single IF statement. Thus, it contains the two paths to evaluate the path as TRUE of FALSE. Firstly, the statements are transformed into a graph, where every piece of a statement is represented as a node and their flows (sequence of execution of statements) are represented as the edges. For the single program, P is always equal to 1 since it has a single exit point. The cyclomatic complexity may be applied to several subprograms at the same time, where P will be equal to the number of programs. Figure 2 shows the flow chart of the source code containing single IF statements.

D. Lack of Cohesion in Methods (LCOM)
LCOM measures the cohesiveness of a class. It represents the difference between two methods whose similarity is zero or not. LCOM can judge the cohesiveness among the class methods. There are several LCOM metrics. The LCOM takes its values in the range 0 to 1.
 If the two methods share at least one field, Q is increased by one. Otherwise, Q is increased by one. It is noted that P and Q are initialized by 0. LCOM is calculated on each pair of metrics as follows Where : ( ) = number of accessing times of attributes among the methods.
The following example code shows the accessing of attributes (member variables) among the methods. It contains two member variables: radius and colour, and three methods: getRadius, getColor and getData. In this example, radius is accessed three times by getRadius and getData. color is accessed one time by getColor.
By equation (4), LCOM is calculated on the number of accessing times of attributes among the methods. The results is 0 because the value of Q is greater than P where P=0 and Q=4. Then, by equation (5), LCOM is calculated by using the number of methods, attributes and the number of accessing times of attributes. The result is 0.4 where m=3, a=2 and mA=4.  Fig.4 shows the diagram for the methods that accessing the attributes. By equation (6), LCOM is calculated by using the number of methods, attributes and the number of accessing times of attributes. LCOM is 0.5 where m=3, a=2 and ( )=4 respectively by equation (6).
According to the equation (4), LCOM is only 0 or 1. According to equation and (5) and (6), LCOM decreases and close to 0, when the accessing times of attributes are more. On the other hand, LCOM increases and close to 1 when the accessing times of attributes are less. The declared variables should be accessed among the methods In this time, equation (6) is selected by the calculation result that is same with plugin in Section V. M3 radius E. Nested Block Depth (NBD) NBD represents the maximum nest depth in a method. The nest depth is given by the number of statements of blocks that are nested due to the use of control structure (branches and loops).

F. Total lines of Code (TLOC)
TLOC measures the total number of lines in the source code. It is calculated by counting on the executable lines, comment and empty lines.

G. Methodl lines of Code (MLOC)
MLOC represents the total number of method lines in method. It is calculated by counting on comments and empty lines.

V. COMPARISON OF METRIC RESULTS
In this section, we evaluate the seven quality metrics for the 45 source codes for five assignments from nine students in [11]. The first assignments ask the concepts of encapsulation, inheritance and polymorphism. The last two assignments ask the algorithms using those concepts for implementation it.

A. Calculated Metrics
Among the seven metrics, CC and LCOM have several equations to calculate them. For CC, the three equations (1), (2) and (3) are used. For LCOM, equation (6) is used.

B. Comparison of Metrics Values
The metrics values of CC and LCOM for each code are compared between the results by Eclipse plugin and those by the adopted equations to find the proper equations that give the same results. Other values represent the number of classes, methods, branches, and lines of code, which are unique. There is no equation for other metrics values. It can easily be calculated by counting the number of classes, methods, branches and lines of code as described in section IV-A, B, E, F, G. In this paper, the plugin result is represented as T1 and the equation result is represented as T2.
Tables 1-5 show the calculated metrics values of seven metrics for each code. NOC, NOM, NBD, TLOC and MLOC are calculated by counting the number of classes, methods, branches and lines in a code manually. CC is calculated by using Equation (1) (2) and (3). LCOM is calculated by Equation (6) for each code. Then, the values are compared between the plugin and calculated results.