Published July 1, 2022 | Version v1
Journal article Open

Classification of specialities in textual medical reports based on natural language processing and feature selection

  • 1. Department of Electrical Engineering, College of Engineering, University of Karbala, Karbala, Iraq

Description

Nowadays, a great deal of detailed information about patients, including disease status, medication history, and side effects, is collected in an electronic format called an electronic medical record (EMR), and these data serve as a valuable resource for further analysis, diagnosis, and treatment. However, the sheer quantity of detailed patient information in these medical texts makes processing the data efficiently a major challenge. Machine learning (ML) algorithms, artificial intelligence techniques, and natural language processing (NLP) tools have the potential to simplify unstructured data, which could benefit medical report analysis. NLP has recently made large advances on a variety of tasks. In this paper, an automatic system was therefore developed to classify specialist consultant interactions by speciality, based on patients’ medical reports. NLP was used as a pre-processing step on a dataset of unstructured medical reports. Feature extraction and selection methods were used to convert the textual reports into sets of features and to retain the most effective features, with the aim of increasing classification accuracy and reducing execution time. Several classification methods were then applied: ML perceptron, logistic regression, random forest (RF), and linear support vector classifier (LSVC). The highest accuracy (99.39%) was achieved by the ML perceptron classifier.
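The pipeline described above (text pre-processing, feature extraction, feature selection, then several classifiers) could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the extraction or selection methods, so TF-IDF and chi-squared selection are assumptions, as are scikit-learn itself, the value of `k`, all hyperparameters, and the reading of "ML perceptron" as `MLPClassifier`. The reports and labels are invented toy data, not the paper's dataset.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy reports (assumption: NOT taken from the paper's dataset).
reports = [
    "Patient reports chest pain and palpitations; ECG shows atrial fibrillation.",
    "Echocardiogram reveals reduced ejection fraction and heart murmur.",
    "Patient has recurrent seizures and headache; MRI of the brain ordered.",
    "Tremor and memory loss noted; neurological exam suggests Parkinson disease.",
    "Fracture of the left femur after fall; knee joint swelling present.",
    "Chronic lower back pain with reduced joint mobility in the hip.",
]
labels = [
    "cardiology", "cardiology",
    "neurology", "neurology",
    "orthopedics", "orthopedics",
]


def build_pipeline(classifier, k=10):
    """Chain pre-processing, feature extraction, selection, and a classifier.

    TF-IDF and chi-squared selection are assumed stand-ins for the
    unspecified methods in the abstract; k is a hypothetical setting.
    """
    return make_pipeline(
        # Lowercasing, tokenization, and English stop-word removal.
        TfidfVectorizer(lowercase=True, stop_words="english"),
        # Keep only the k features most associated with the labels.
        SelectKBest(chi2, k=k),
        classifier,
    )


# The four classifier families named in the abstract (hyperparameters assumed).
classifiers = {
    "perceptron (MLP)": MLPClassifier(
        hidden_layer_sizes=(16,), max_iter=2000, random_state=0
    ),
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "linear SVC": LinearSVC(),
}

# Fit one pipeline per classifier on the toy data.
fitted = {
    name: build_pipeline(clf).fit(reports, labels)
    for name, clf in classifiers.items()
}

# Classify an unseen (also invented) report with each fitted pipeline.
new_report = ["Patient complains of chest pain; ECG ordered."]
predictions = {name: pipe.predict(new_report)[0] for name, pipe in fitted.items()}
```

Putting selection inside the pipeline means the selected features are learned only from the training data passed to `fit`, which matters once the data are split for evaluation. On a real dataset, accuracy would be estimated on held-out reports (for example with `sklearn.model_selection.cross_val_score`) rather than on the training set.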

Files

20 27023 v27i1 Jul22.pdf (822.9 kB)
md5:ed1319ee013af3ba7e9c91f1a7a98f12