Published March 1, 2023 | Version v1
Journal article Open

Comparison of machine learning models for breast cancer diagnosis

  • 1. Mustansiriyah University

Description

Breast cancer is the most common cause of cancer death among women worldwide. Early detection can reduce the death rate. Machine learning (ML) techniques are an active area of research and have proved influential in cancer prediction and early diagnosis. This study's objective is to predict and diagnose breast cancer using ML models and to identify the most effective model based on six criteria: specificity, sensitivity, precision, accuracy, F1-score, and the receiver operating characteristic (ROC) curve. All work was done in the Anaconda environment, using Python's NumPy and SciPy numerical and scientific libraries together with pandas and matplotlib. The study used the Wisconsin diagnostic breast cancer dataset to test ten ML algorithms: decision tree, linear discriminant analysis, forests of randomized trees, gradient boosting, passive aggressive, logistic regression, naïve Bayes, nearest centroid, support vector machine, and perceptron. After collecting the results, we evaluated the performance of these classification techniques and compared them. The gradient boosting model outperformed all other algorithms, scoring 96.77% on the F1-score.
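As an illustration of the evaluation criteria named above, the following minimal sketch computes specificity, sensitivity, precision, accuracy, and F1-score from a binary confusion matrix. It is not the authors' code: the function names, the choice of label 1 as the positive (malignant) class, and the toy labels are all assumptions made for the example. The ROC curve is left out because it requires continuous prediction scores rather than hard labels.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def evaluate(y_true, y_pred, positive=1):
    """Compute five threshold-based criteria from predicted labels."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sensitivity = safe_div(tp, tp + fn)   # recall on the positive class
    specificity = safe_div(tn, tn + fp)   # recall on the negative class
    precision = safe_div(tp, tp + fp)
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    f1 = safe_div(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Toy labels invented for illustration; they are NOT results from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

scores = evaluate(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Reporting specificity alongside sensitivity matters for a diagnostic task because accuracy alone can hide a model that misses malignant cases when the classes are imbalanced.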

Files

42 21654.pdf (391.2 kB)
md5:2d1f5e58da5257dd16713b71c852fb04