Decision Support System on Computer Maintenance Management System Using Association Rule and Fisher Exact Test One Side P-Value

Decision-making methods have grown rapidly, and many are now available. The challenge is to apply a method that is not only fast but also correct. One application of decision making is the decision support system in a Computer Maintenance Management System (CMMS). This research tests data on recurring problems in a CMMS using Association Rule and the Fisher Exact Test One Side P-Value. Its objective is to obtain the association pattern between symptom and root cause and to prove the relation between these variables. Previous research proved a relation between the two variables using Association Rule and Pearson Chi-Square, but many rules had to be eliminated because they were ineligible for Pearson Chi-Square. The purpose of this research is to confirm the tested relation between symptom and root cause in CMMS, and the results are expected to strengthen the hypothesis of the previous research.


Introduction
Decision support systems in computer engineering have been applied by manufacturing companies as tools to reduce the risk and cost related to production. [Nor MZM] One part of this model is knowledge management: industrial companies use this technique to reduce risk and cost during production by utilizing a software system to manage technicians' activities [15]. A Computer Maintenance Management System (CMMS) is a system used in machine maintenance to keep machines running well without waiting for a machine to break down first. To reduce risk and cost when maintaining systems in an organization, the system needs to integrate several management processes, such as maintenance, repair, inventory, scheduling, monitoring technology, and database, and all of these are found in CMMS. This is in line with [5], which states that critical assets must be monitored closely and continuously using a reliable and effective technique to assess operating conditions in preventive maintenance, so that the job is done well with the best recommended solution. One part of CMMS is the Work Order (WO). In the manufacturing industry, a work order is used to notify technicians of a problem reported by a user so that it is handled soon, and it records descriptions of the problem from the time the job starts until it finishes. The first description, the problem description, is filled in by the requestor; after the job is done, the technician fills in the WO description, such as symptom, root cause, solution, and the device that was checked. A problem in one WO can often be found in earlier WOs that have the same problem, symptom, root cause, and solution. This collection is called warehouse data. The question is how to utilize warehouse data to support the best decision making for technicians when the same problem appears with a different symptom, root cause, or solution. The purpose of this research is therefore to search for the best solution to problems in WOs, helping technicians make decisions quickly, precisely, and accurately through a recommended solution when checking a machine or computer.
WO as warehousing data can give technicians knowledge about problems for which the best solution has not yet been found. [15] Examples are problems that recur because of downtime, or problems whose correct root cause or solution is still uncertain; the solution found by the technician can then be used by management to modify and maintain the system. The problem addressed in this research is how to find or group WOs whose solutions range from low to high accuracy, to help technicians take action on the same problem in the manufacturing industry, whether it is associated with machines, computers, or management.
As mentioned in the paragraph above, decision making always involves the presence of symptoms, root causes, and solutions. However, many technicians make decisions without analysing previous problems in the historical data, so their decisions have low accuracy when applied to another problem, because the same problem can arise with a different symptom and root cause than before. Thus, technicians need a decision-making tool that associates symptom and root cause, so that the recommended solution has high accuracy and improves technicians' performance in handling problems in daily activity. [15] The best system for an organization is one that can reduce the failure rate and increase the rate of repair. [7], [8] One way to obtain information quickly, precisely, and accurately is to use a method that performs well both statistically and dynamically and that draws on warehouse data to predict future information. For that reason, this research combines statistical methods and warehouse data to arrive at the best solution within the CMMS module. The research involves information technology infrastructure data from Enterprise Resource Planning (ERP) concerning hardware, software, and network problems in a manufacturing plant.
[Abdullah] Methods such as Fuzzy have adapted learning models using rules, and a rule-based system can be used to identify risk with different input variables. But what if it is not yet certain that a rule has high enough confidence to be applied in a rule-based system for Fuzzy? The first step is therefore to find rules with high confidence, and association rule provides criteria for forming rules with the best confidence. Research [4], [12] shows that association rule is appropriate for cause-effect data, and this is reinforced by research [10], which used cause-effect data to relate symptom and root cause. That research showed a correlation between symptom and root cause using association rule, Pearson chi-square, and the phi correlation coefficient. [13] stated that if, in a 2x2 contingency table, one of the cells has an expected count of 0 or 5, the two variables are maximally dependent, that is, strongly correlated, and [13], [14] state that Pearson chi-square cannot be used if one of the cells has an expected count of 0 or less than 5. Research [10] found that some of its data were maximally dependent and therefore needed to be tested with another method; moreover, those data were ineligible for Pearson chi-square for the reason just given. The cause of the problem in research [10] is that the frequency of data with the same symptom and root cause is smaller than the frequency of data with different symptom and root cause. Based on this analysis, the correlation method needs to be changed so that 2x2 tables with an expected count of 0 or less than 5 can still be tested for correlation. This research keeps association rule as the first method, to generate the best rules before applying the correlation method. The correlation method is the Fisher exact test one side p-value, because its requirements fit the problem data: it only requires a data sample of less than 40 and a 2x2 table in which a cell has an expected count of 0 or less than 5, which is the case this research focuses on.
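The eligibility condition on expected counts can be illustrated with a minimal sketch. It assumes the usual definition of an expected count under independence (row total times column total divided by the grand total); the function names and the two example tables are hypothetical and are not taken from the research data.

```python
def expected_counts(a, b, c, d):
    """Expected cell counts of the 2x2 table [[a, b], [c, d]] under independence."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    return [[r * k / n for k in col_totals] for r in row_totals]


def pearson_chi_square_eligible(a, b, c, d, min_expected=5):
    """True when every expected count reaches min_expected (assumed threshold of 5)."""
    return all(e >= min_expected for row in expected_counts(a, b, c, d) for e in row)


small = (3, 0, 0, 4)      # illustrative table with few observations
large = (20, 10, 10, 20)  # illustrative table with many observations
print(expected_counts(*small))
print(pearson_chi_square_eligible(*small))  # False: every expected count is below 5
print(pearson_chi_square_eligible(*large))  # True: every expected count is 15
```

A table that fails this check is the kind of case for which the text turns to the Fisher exact test instead of Pearson chi-square.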
The research question is whether symptom and root cause in CMMS are correlated, examined using association rule and the Fisher exact test one side p-value.

Method
This research uses three methods to correlate the symptom and root cause variables from work orders in CMMS: association rule, Fisher exact test one side p-value, and the phi correlation coefficient.

Association Rule
This method is a data mining method used to establish rules that filter data with a high degree of confidence, in order to strengthen the correlation test. Researcher [2] states that data mining is a set of processes applied to data in market basket analysis, and that such data can be processed with this method to obtain support and confidence values. Accordingly, the first step in using this method is to define the itemset frequency, the minimum support, and the minimum confidence. Support and confidence are formulated in (1) and (2):

$$ \mathrm{support}(A \Rightarrow B) = \frac{\text{number of transactions containing } A \text{ and } B}{\text{total number of transactions}} \tag{1} $$

$$ \mathrm{confidence}(A \Rightarrow B) = \frac{\text{number of transactions containing } A \text{ and } B}{\text{number of transactions containing } A} \tag{2} $$

Basically, support reflects the statistical significance of the data, while confidence only measures the strength of the rule that is formed. The steps needed in this research to establish rules using association rule are:
a. Determine the number of k-itemsets for looping.
b. Determine the minimum itemset frequency in transaction A, the minimum support, and the minimum confidence.
c. Generate rules with the transaction pattern "if (ss-s) then s" in the k-itemset. This pattern first applies in the second loop, when 2-itemsets are used. In the pattern, s is the first element and ss-s is the second element used to establish the rule pattern (Figure 1 and Figure 2).
d. Calculate the support and confidence of each established rule, and eliminate rules whose support or confidence is less than the minimum support or minimum confidence.
e. Repeat steps b to d with the next k-itemset, up to the maximum k-itemset predetermined in step a.

Research [11] holds that a low support value indicates that a rule is not important, but that the rule can still be important if it has a high confidence value. Eliminating a rule does not necessarily mean that the rule is unimportant: the rule may be important but occur very infrequently. Therefore, the researcher confirmed that a low support value need not cause a rule to be eliminated, and that such a rule remains tenable if a correlation test proves the association between its variables. [9] confirmed that a problem appears in association rule when the minimum support is set too high: many rare items are not found, so a low minimum support is recommended to capture them. This is the basis for using low support values to establish rules in this research. The method also suits this research because it processes cause-effect data. Research [9], [15] states that this method is appropriate for relating cause and effect to obtain solutions to problems.
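The filtering in steps b to d can be sketched for the simplest case of one symptom and one root cause per work order. This is a minimal sketch under stated assumptions, not the implementation used in the research: the function `mine_rules`, its parameter names, and the sample work orders are hypothetical, and the k-itemset loop of steps a and e is omitted.

```python
from collections import Counter


def mine_rules(work_orders, min_support, min_confidence, min_count=1):
    """Return rules 'if symptom then root cause' that pass all three thresholds."""
    n = len(work_orders)
    pair_counts = Counter(work_orders)                    # work orders with symptom and cause
    symptom_counts = Counter(s for s, _ in work_orders)   # work orders with the symptom
    rules = []
    for (symptom, cause), count in pair_counts.items():
        support = count / n                               # share of all work orders
        confidence = count / symptom_counts[symptom]      # share of this symptom's work orders
        if count >= min_count and support >= min_support and confidence >= min_confidence:
            rules.append((symptom, cause, support, confidence))
    # Strongest rules first: by confidence, then by support
    return sorted(rules, key=lambda r: (-r[3], -r[2]))


# Hypothetical work orders, invented for illustration only
work_orders = [
    ("no access to printer", "printer driver not installed"),
    ("no access to printer", "printer driver not installed"),
    ("no access to printer", "printer driver not installed"),
    ("no access to printer", "printer offline"),
    ("green lamp blinks", "printer needs reset"),
    ("green lamp blinks", "printer needs reset"),
    ("green lamp blinks", "printer needs reset"),
    ("paper jam", "worn roller"),
]

rules = mine_rules(work_orders, min_support=0.2, min_confidence=0.5, min_count=3)
for symptom, cause, support, confidence in rules:
    print(f"if {symptom!r} then {cause!r}: support={support:.3f}, confidence={confidence:.2f}")
```

Lowering `min_support` in this sketch lets infrequent pairs through, which mirrors the argument for keeping rare but confident rules.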

Fisher Exact Test One Side P-Value
Fisher exact test is an alternative to the chi-square test that is used in specific cases. One requirement of this method is that it is only used for a 2x2 table (Table 1) in which one of the cells has an expected count of 0 or less than 5 and the data sample is not more than 40. Pearson chi-square, by contrast, requires that no cell of the 2x2 table has an expected count of 0 or less than 5 and that the data sample is more than 40. Therefore, this research uses the Fisher exact test as the correlation method. The method has two ways of calculating the result: the one side p-value and the doubled one side p-value. They differ only in the p-value applied; the second gives a higher p-value than the first. Note also that formula (3) is applied in two ways: with the calculation of extreme deviations (when no cell has a value of 0) and without extreme deviations (when any cell has a value of 0).

$$ p = \frac{(A+B)!\,(C+D)!\,(A+C)!\,(B+D)!}{N!\,A!\,B!\,C!\,D!} \tag{3} $$

Here p is the Fisher exact test one side p-value, A, B, C, and D are the frequencies of each cell in the 2x2 table, and N = A + B + C + D is the total data sample. The significance level often used with this method is 0.05. The null hypothesis is accepted when the Fisher exact test one side p-value is more than 0.05 and rejected when it is less than 0.05. The null hypothesis in this research states that the two variables, symptom and root cause, have no association (are independent).
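The one side p-value can be sketched in a few lines using only the standard library. This is a sketch under assumptions: it treats the one side p-value as the smaller of the two tail probabilities of the hypergeometric distribution of the top-left cell, summing over the observed table and every more extreme table with the same margins; the function name `fisher_one_sided` is hypothetical.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    total = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo = max(0, col1 - row2)   # smallest feasible top-left count
    hi = min(row1, col1)       # largest feasible top-left count
    upper = sum(prob(x) for x in range(a, hi + 1))  # tables with top-left >= a
    lower = sum(prob(x) for x in range(lo, a + 1))  # tables with top-left <= a
    return min(upper, lower)   # assumption: report the smaller tail


p = fisher_one_sided(3, 0, 0, 4)
print(round(p, 3))                                        # 0.029
print("reject H0" if p < 0.05 else "do not reject H0")    # reject H0
```

When a cell is 0, as in the table (3 0 0 4), no more extreme table exists in that direction, so the tail sum reduces to the single term given by the factorial formula.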

Correlation Coefficient Phi
The phi coefficient is used to compare two attributes with dichotomous data, that is, data measured on a scale with only two points. Dichotomous data are discrete categorical or nominal data that result from counting and contain no decimal numbers. Each dichotomous value is grouped by category and given a number as a label, not as a level. Such data are of two kinds: true dichotomy and artificial dichotomy. True dichotomies include gender, skin colour, language, country, and so on; examples of attribute data in a true dichotomy are living or dead, black or white, accepted or rejected, succeeded or failed. An artificial dichotomy is usually given numeric labels such as pass (1) and fail (2); the numbers carry no meaning in themselves, so labels such as pass (0) and fail (4) could equally be used.
The phi coefficient is also known as Yule's phi (φ); in statistical testing it is used to test the strength of the correlation between two variables. In this research it is applied to the results of the Fisher exact test one side p-value, which is used to find the correlation between symptom and root cause. Formula (4) gives the phi correlation coefficient:

$$ \phi = \sqrt{\frac{X^2}{N}} \tag{4} $$

where φ is the value of the phi correlation coefficient, X² is the Pearson chi-square value, and N is the total data sample of the 2x2 table. The value of this method lies between -1 and 1. Formula (4) yields the magnitude only; formula (5), for a 2x2 table as illustrated in Table 1, also gives the sign:

$$ \phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}} \tag{5} $$

where φ is the value of the phi correlation coefficient and A, B, C, and D are the frequencies of each cell in the 2x2 table. A result close to 0 indicates a weaker correlation between the two variables, while a value close to -1 or 1 indicates a stronger correlation. A positive sign (+) shows that when variable x increases, y also goes up. A negative sign (-) shows that when x increases, y goes down, and when x decreases, y goes up.
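The cell-count form of the phi coefficient is short enough to sketch directly. The function name `phi_coefficient` is hypothetical, and the guard for a zero margin is an added assumption, since the coefficient is undefined when a whole row or column is empty.

```python
from math import sqrt


def phi_coefficient(a, b, c, d):
    """Phi coefficient of the 2x2 table [[a, b], [c, d]] from its cell counts."""
    denom = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / denom


print(phi_coefficient(3, 0, 0, 4))              # 1.0, perfect positive association
print(phi_coefficient(0, 3, 4, 0))              # -1.0, perfect negative association
print(round(phi_coefficient(12, 12, 0, 3), 3))  # 0.316, weak association
```

The sign comes from the numerator AD - BC, which is why this form can report a negative correlation while the square-root form cannot.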

Implementation
The methods mentioned above are used to prove the relation between the symptom variable and the root cause variable in order to get the best solution to a problem. Figure 3 is a flowchart showing the knowledge base in this research. All problem data in this research come from work orders available in Microsoft Dynamics AX, the ERP system of a manufacturing company. A work order in this system consists of problem, symptom, root cause, solution, and so on, but this research focuses on only two variables: symptom and root cause. Table 2 presents sample work order data with the two variables under discussion, from which it will be determined whether the variables are related. Table 2 shows that a problem can occur again at another time, so the same issue is raised repeatedly. The issue may not yet have found its best solution, or the problem may have occurred on another machine. For this reason, a method is needed that filters symptom and root cause correctly to reach the best solution. One such method is association rule, which filters rules with the best support and confidence values. Figure 4 shows the calculation of support and confidence for a rule from the problem "can't open dynamic AX", which has the symptom "lost path from network and last path remove from network" and the root cause "new path create for AX-AD network".

[Figure: Calculation of support and confidence]
This research sets limit values of support and confidence for use in association rule. Table 3 shows the number of rules established with association rule under the minimum support and minimum confidence. The results in Table 3 support the argument of research [11] about minimum support and minimum confidence in association rule, namely that low support does not indicate that a rule is unimportant. After association rule has generated rules from all the work order data, the next step is the Fisher exact test one side p-value, which aims to obtain the correlation between the two variables, symptom and root cause. When this correlation method is used to get the best solution in decision making, the null hypothesis (H0) is that symptom and root cause have no association (are independent), and H1 is that symptom and root cause are associated.
After the rules are obtained from association rule, the next step is to obtain the correlation between symptom and root cause. Figure 5 gives an example of the Fisher exact test one side p-value for the problem "Low access internet", which has the symptom "Any host was using internet download manager" and the root cause "Seen from the condition of the download rate is stable at a certain IP is continuously in mikrotik". Table 4 shows the sample 2x2 table for that problem, with cells (3 0 0 4), and indicates that the value of the test is 0.029. The Fisher exact test p-value is thus less than 0.05, so the null hypothesis is rejected. This means that the symptom "Any host was using internet download manager" and the root cause "Seen from the condition of the download rate is stable at a certain IP is continuously in mikrotik" are correlated with one another in the problem "Low access internet".
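The reported value of 0.029 can be checked by hand. The working below assumes the cells (3 0 0 4) are read as A = 3, B = 0, C = 0, D = 4 and uses the factorial form of the Fisher exact probability; because the table contains zero cells, no more extreme table has to be added.

```latex
p = \frac{(A+B)!\,(C+D)!\,(A+C)!\,(B+D)!}{N!\,A!\,B!\,C!\,D!}
  = \frac{3!\,4!\,3!\,4!}{7!\,3!\,0!\,0!\,4!}
  = \frac{3!\,4!}{7!}
  = \frac{144}{5040}
  = \frac{1}{35} \approx 0.029
```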
The test above has proven that the two variables are related to each other. The next step is to test the strength of the correlation between the variables using the phi correlation coefficient. The calculation of the correlation strength is shown in Figure 6. From that calculation, the relation between the variables is a positive and strong correlation, because the phi correlation coefficient is close to +1. Figure 7 shows the calculations of all methods using the data from Table 4.
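As a check on the strength of the correlation, the cell-count form of the phi coefficient can be worked for the same table. This assumes the cells (3 0 0 4) of Table 4 are read as A = 3, B = 0, C = 0, D = 4.

```latex
\phi = \frac{AD - BC}{\sqrt{(A+B)(C+D)(A+C)(B+D)}}
     = \frac{3 \cdot 4 - 0 \cdot 0}{\sqrt{3 \cdot 4 \cdot 3 \cdot 4}}
     = \frac{12}{\sqrt{144}}
     = 1
```

Under that reading the coefficient equals +1, which agrees with the strong positive correlation described in the text.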

Result and Discussion
This section explains the results and the evaluation of the methods in this research. The study applies the methods using several tools: XAMPP, SPSS, and QI Macros.

Dataset
This research used the work order database obtained from CMMS in Microsoft Dynamics AX, involving about 712 records covering all problems in the system. Each problem in a WO has several variables, but this research used only two, symptom and root cause, to generate rules and test their relation.

Testing
Testing in this research involves the three methods explained before. The first step is association rule, generating rules with minimum confidence between 0.1 and 0.99 and minimum support between 0.001 and 0.027. The threshold for the minimum itemset frequency is 3. Figure 7 shows how support and confidence are calculated with association rule and how the Fisher exact test one side p-value is calculated for different samples in a 2x2 table. The Fisher exact calculation depends on the sample count in each cell, and the total sample over all cells is not more than 40. Using a random sample of rules generated by association rule with minimum support between 0.001 and 0.006 and minimum confidence between 0.1 and 0.3, the resulting correlation between symptom and root cause is illustrated in Figure 8. Figure 4 shows that the region where H0 is not rejected contains the sample tables (12 12 0 3), (3 4 8 11), (12 12 0 3) and (3 5 0 4); these four samples have a Fisher exact test value greater than 0.05. This is reinforced by their phi correlation coefficients, which are less than 0.5, that is, closer to 0, meaning a weak correlation between the two variables. The tests show that 89% of the rules have a relationship between symptom and root cause, as seen from Fisher exact test values less than 0.05; almost all of these samples have a phi correlation coefficient close to or equal to 1. Table 8 shows the recommended solutions for symptom and root cause in the problem "Can't Print", ranked by Fisher exact test one side p-value. Based on the phi correlation, all rules generated by association rule and then tested with the Fisher exact test one side p-value show a strong correlation under different minimum support and minimum confidence values.
From the viewpoint of the Fisher exact test one side p-value, however, the ranking of recommended solutions shows a distinctive pattern under different minimum support and minimum confidence values. In rule number 1 the total number of rules is 3, but when the minimum confidence is increased the total decreases, as shown in rule number 2. At the same time, increasing the minimum confidence increases the Fisher exact test value significantly. The same applies when the minimum support is increased. Compare rule number 3 with rule number 2: with minimum confidence held constant and minimum support increased, some rules in number 3 are eliminated, and a rule that appears in both has a higher Fisher exact test value in number 2 than in number 3. For example, the value for the symptom "No have access to printer" and root cause "No install printer driver" changes from 0.698951 × 10⁻⁷ to 205.6767 × 10⁻⁷. The distinctive pattern also occurs between rule number 3 and rule number 4, where the position of a rule changes: the rule with symptom "Green lamp blink" and root cause "Printer need reset" moves from first position in number 3, with the smallest Fisher exact test one side p-value, to the last row in number 4, with a higher Fisher exact test value. This shows that the minimum support and minimum confidence used in association rule affect the recommended solution for the problems in this research.
The rule patterns and rule rankings in this research are close to those of research [10], although this research focuses on 2x2 tables with total data less than 40, whereas in research [10] the total data for a 2x2 table is more than 40. Regarding correlation, and in particular the recommended solution for the symptom and root cause variables, the result is the same: the two variables are correlated. Table 6 gives an example result for the problem "Can't print" in research [10]. Observe rule number 1 and rule number 2 in Table 5 when the minimum support and minimum confidence are increased, and the effect on the position of the recommended solution, especially "symptom" and "root cause", for this problem. The first row of rule number 1 is eliminated when the minimum support and minimum confidence are increased, and as a result the second row of rule number 1 becomes the first row in rule number 2. Comparing this research with research [5], it can be observed that using minimum support and minimum confidence in association rule