Feature selection of unbalanced breast cancer data using particle swarm optimization
Authors/Creators
- 1. Suez Canal University
Description
Breast cancer is one of the leading causes of death among women worldwide. High accuracy in cancer prediction models is therefore vital to improving patients’ treatment quality and survival rate. In this work, we present a new method, the improved balancing particle swarm optimization (IBPSO) algorithm, to predict the stage of breast cancer using unbalanced surveillance epidemiology and end result (USEER) data. The work contributes in two directions. First, we design and implement an improved particle swarm optimization (IPSO) algorithm that avoids local minima while reducing the dimensionality of the USEER data. The improvement comes primarily from employing the crossover ability of the genetic algorithm as a fitness function, while a correlation-based function guides the selection task toward a minimal feature subset of USEER that is sufficient to describe the universe. Second, we develop an improved synthetic minority over-sampling technique (ISMOTE) that avoids the overfitting problem while efficiently balancing USEER. ISMOTE generates each new object as the average of the two objects with the smallest and the largest distance from the centroid object of the minority class. The experiments and analysis show that the proposed IBPSO is feasible and effective and outperforms other state-of-the-art methods in minimizing the features, with an accuracy of 98.45%.
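The ISMOTE generation rule described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `ismote_like_sample` and `oversample` are hypothetical, the centroid is assumed to be the arithmetic mean of the minority class, Euclidean distance is assumed, and the loop that appends each synthetic object before recomputing the centroid is only one possible way to produce more than one object, since the description does not specify it.

```python
import numpy as np


def ismote_like_sample(minority):
    """Return one synthetic object: the average of the minority objects
    nearest to and farthest from the class centroid.

    Assumptions (not from the source): centroid = arithmetic mean,
    distance = Euclidean.
    """
    minority = np.asarray(minority, dtype=float)
    centroid = minority.mean(axis=0)
    dists = np.linalg.norm(minority - centroid, axis=1)
    nearest = minority[np.argmin(dists)]
    farthest = minority[np.argmax(dists)]
    return (nearest + farthest) / 2.0


def oversample(minority, n_new):
    """Hypothetical loop: append each synthetic object to the pool and
    recompute the centroid before generating the next one."""
    pool = np.asarray(minority, dtype=float)
    for _ in range(n_new):
        new_point = ismote_like_sample(pool)
        pool = np.vstack([pool, new_point])
    return pool


if __name__ == "__main__":
    # Toy minority class with three 2-D objects.
    toy = [[0.0, 0.0], [2.0, 0.0], [10.0, 0.0]]
    print(ismote_like_sample(toy))
    print(oversample(toy, 2).shape)
```

Because each synthetic object lies on the segment between two existing minority objects, it stays inside the region already covered by the minority class, which is one plausible reading of how the method limits overfitting.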
Files
43 1570681382 27112 EM 21feb22 24oct20 F.pdf
441.0 kB, md5:a03797045339b3d84f4eec4be393af95