Journal article Open Access
Isah Aliyu Kargi; Norazlina Bint Ismail; Ismail Bin Mohamad
Cancer classification and gene selection are among the most important applications of DNA microarray data. Because microarray data are high-dimensional, classification and gene selection techniques are frequently employed to support expert systems in diagnosing cancer with higher classification accuracy. The least absolute shrinkage and selection operator (LASSO) is one of the most popular methods for cancer classification and gene selection in high-dimensional data. However, LASSO yields biased estimates and cannot select more variables than the sample size (n) when used for gene selection and classification of high-dimensional microarray data. To address these problems, LASSO-C1F was proposed. It uses a scale-invariant measure of the maximal information complexity of the covariance matrix, with weight modifications, as a data-adaptive alternative to the fairly arbitrary choice of the regularization term in LASSO. The results indicate that the proposed LASSO-C1F is more effective than the classical LASSO: on the evaluation criteria, it performs better in terms of AUC and the number of genes selected.
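For orientation, the classical LASSO that the abstract compares against can be written as a penalized likelihood problem. The sketch below assumes a binary outcome (tumor vs. normal) with a logistic loss, which the abstract does not state explicitly. The symbols $y_i$, $x_i$, $\beta_0$, $\beta$, $\lambda$ and the generic weights $w_j$ are illustrative notation, not taken from the article.

```latex
% Classical LASSO for binary classification (assumed logistic loss):
% n samples, p genes, y_i in {0,1}, x_i the expression profile of sample i.
\hat{\beta}^{\mathrm{LASSO}}
  = \arg\min_{\beta_0,\,\beta}
    \left\{
      -\sum_{i=1}^{n}
        \left[
          y_i\,(\beta_0 + x_i^{\top}\beta)
          - \log\!\left(1 + e^{\beta_0 + x_i^{\top}\beta}\right)
        \right]
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    \right\}

% Generic weighted variant: the single tuning constant is supplemented by
% data-adaptive weights w_j >= 0 (hypothetical placeholder; the abstract
% does not give the exact form of the C1F-based weights).
\hat{\beta}^{\mathrm{w}}
  = \arg\min_{\beta_0,\,\beta}
    \left\{
      -\sum_{i=1}^{n}
        \left[
          y_i\,(\beta_0 + x_i^{\top}\beta)
          - \log\!\left(1 + e^{\beta_0 + x_i^{\top}\beta}\right)
        \right]
      + \lambda \sum_{j=1}^{p} w_j \lvert \beta_j \rvert
    \right\}
```

In the first form, every coefficient is shrunk by the same amount $\lambda$, which is the source of the bias mentioned above. A weighted penalty of the second kind is one common way to make the shrinkage data-adaptive; how LASSO-C1F derives its weights from the information complexity of the covariance matrix is described in the full article, not here.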