Pneumonia Detection Using Deep Learning Based on Convolutional Neural Network

Abstract-Artificial intelligence has found use in various fields over the course of its development, especially in recent years with the enormous increase in available data. Its main task is to assist in making better, faster and more reliable decisions. Artificial intelligence and machine learning are increasingly being applied in medicine. This is especially true for medical fields that utilize various types of biomedical images and where diagnostic procedures rely on collecting and processing a large number of digital images. The application of machine learning to the processing of medical images helps with consistency and boosts accuracy in reporting. This paper describes the use of machine learning algorithms to process chest X-ray images in order to support the decision-making process in determining the correct diagnosis. Specifically, the research is focused on using a deep learning algorithm based on a convolutional neural network to build a processing model. The task of this model is to help with a classification problem: detecting whether or not a chest X-ray shows changes consistent with pneumonia, and classifying the X-ray images into two groups depending on the detection results.
Index Terms-artificial intelligence; convolutional neural network; deep learning; image processing; machine learning; pneumonia detection.

I. Introduction
Artificial intelligence (AI) was recognized as an academic discipline as early as the 1950s, but for a long time it was not widely explored by the scientific community because of its limited practical feasibility [1]. In the last couple of decades, with the increasing availability of processing power and the emergence of big data, AI has become a focal point of research and business discussions. AI went through various phases in its development, often referred to as the "Seasons of AI". Until recently, the best-known season of AI was the period from the 1970s to the 2000s, most of which is referred to as the "AI Winter" [2]. One of the main reasons why AI could not develop at a fast pace during this period is that computers did not have sufficient processing power to handle the work required [3]. The comeback of AI started when IBM developed its chess-playing program, Deep Blue, which beat world champion Garry Kasparov in 1997 [1]. In the following years AI started to develop rapidly, which resulted in the development of new fields such as Machine Learning (ML) and Deep Learning (DL). In contrast to ML, which uses a set number of traits and requires human input, DL can be trained to classify data on its own [2].
Although they were first described in the 1940s, Artificial Neural Networks (ANN) made a big comeback in 2016 when Google's program AlphaGo, based on DL, managed to beat the world champion in the board game Go [2]. This is considered one of the biggest milestones and successes of AI. For comparison, there are 20 possible opening moves in chess, but 361 in Go. This success further pushed AI research, and ANN and DL algorithms are now used for image recognition, speech recognition, sensor data processing, etc. It is no surprise that, with these recent developments, the application of AI, and especially DL, in medicine is growing rapidly, because of the ability of convolutional neural networks (CNNs) to successfully classify images [4,5]. Today AI is used in various medical fields, such as gastroenterology [6], radiology [7], cardiology [8] and endoscopy [2]. AI and ML are usually used in fields where a large database of medical data can be built, typically in the form of digital images, which can later be used for training models. The use of AI in support tools for processing medical images has been suggested in order to improve accuracy, consistency and time efficiency in reporting [2].
In this paper, we focus on pneumonia and the use of a CNN-based algorithm to process chest X-ray images. Pneumonia is an infection of the lower respiratory tract in which the airways and lung parenchyma are affected and the alveolar spaces are consolidated. The annual incidence of pneumonia in children under the age of 5 in Europe is 34-40 per 1000. In developing countries, pneumonia is the leading cause of child mortality. X-rays of the lungs are very useful in diagnosing respiratory disorders and diseases in children [9]. Pneumonia is also detected in hospitalized patients infected with the 2019-nCoV coronavirus [10], which further motivated us to research this problem at the time of a worldwide pandemic.

A. Images of Chest X-Rays, The Dataset
The dataset used for this research is provided by Guangzhou Women and Children's Medical Center, Guangzhou, and is openly available on Kaggle [11]. Before the analysis, all low-quality X-rays were removed by the experts at the Medical Center. The remaining images were classified by three experts in the field of radiology [11].
The dataset contains 5856 chest X-ray images (JPEG). It is divided into three folders, named train, val and test, which are used as training, validation and testing data. The original dataset has only 16 images in the validation folder. For the experiments in this research, an 80/10/10 split was performed instead: 80% of the images are used as training data, 10% as validation data and 10% as test data. As a result, the train folder for this experiment contains 4684 images, the val folder 586 images and the test folder 586 images. Each of these three folders contains two subfolders with images diagnosed as pneumonia or normal. The subfolder names serve as the data labels.
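The 80/10/10 split described above can be illustrated with a minimal sketch. The paper does not state how the images were assigned to the three subsets, so the random shuffle, the fixed seed, the function name `split_80_10_10` and the placeholder file names below are all assumptions made for illustration only.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle items and split them into train/val/test lists (about 80/10/10)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)      # assumed: a simple random split
    n_train = int(0.8 * len(shuffled))         # 80% for training
    n_val = (len(shuffled) - n_train) // 2     # half of the remainder for validation
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]          # everything left is the test set
    return train, val, test


# Hypothetical file names standing in for the 5856 JPEG images.
files = [f"img_{i:04d}.jpeg" for i in range(5856)]
train, val, test = split_80_10_10(files)
print(len(train), len(val), len(test))  # 4684 586 586
```

With 5856 images, this arithmetic reproduces the subset sizes reported above. A split that preserves the pneumonia/normal ratio in each subset would need to be applied per class.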
The images are high-quality and come in various sizes, so they were resized before training the model. Because the training dataset contained more X-ray images labeled "Pneumonia" than "Normal", data augmentation (techniques such as rotation, zoom, and width and height shift) was used to increase the number of "Normal" training examples. A couple of examples of X-ray images with their corresponding labels are shown in Fig. 1.

B. Pre-processing of Images, Dataset Preparation
In most cases, the first step before building a model is preprocessing the imported data. The original images are RGB, but for the purpose of this experiment they were imported as grayscale and resized to 200x200 pixels. After that, the pixel intensity values were normalized by dividing them by 255. This way, the pixels in the image are represented by floating-point numbers between 0 and 1 rather than integers in the range 0 to 255. This should positively affect the performance of the CNN [12].
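The preprocessing steps above (grayscale conversion, resizing to 200x200 and scaling to the range 0 to 1) can be sketched with NumPy alone. This is not the code used in the experiments: the channel averaging, the nearest-neighbour resizing and the helper names are assumptions chosen to keep the example self-contained, and a random array stands in for a real X-ray.

```python
import numpy as np


def to_grayscale(rgb):
    """Average the three colour channels of an (H, W, 3) array (assumed method)."""
    return rgb.mean(axis=2)


def resize_nearest(image, size=(200, 200)):
    """Resize a 2-D array by nearest-neighbour sampling (assumed method)."""
    rows = np.arange(size[0]) * image.shape[0] // size[0]
    cols = np.arange(size[1]) * image.shape[1] // size[1]
    return image[rows][:, cols]


def preprocess(rgb):
    """Grayscale, resize to 200x200, then scale pixel values into [0, 1]."""
    gray = to_grayscale(rgb.astype(np.float32))
    small = resize_nearest(gray)
    return small / 255.0


# A random 8-bit RGB array standing in for one chest X-ray image.
rng = np.random.default_rng(0)
fake_xray = rng.integers(0, 256, size=(480, 640, 3), dtype=np.uint8)
x = preprocess(fake_xray)
print(x.shape, float(x.min()) >= 0.0, float(x.max()) <= 1.0)
```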
As stated before, data augmentation was performed because of the imbalance between the number of training images showing pneumonia and those that were normal. The degree of model overfitting is determined by both the model's power and the amount of training it receives, so providing the CNN with more training examples of images diagnosed as normal helps to reduce overfitting. When no more data of a certain type is available, it can be created artificially by zooming, asymmetrically cropping or rotating input images [12]. All of this can be performed with Keras's preprocessing tools [13].
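To make the idea of augmentation concrete, the sketch below implements only the width and height shift in plain NumPy. It is a simplified stand-in for the Keras tooling mentioned above, not a reproduction of it. The function name `random_shift`, the 10% shift limit and the zero fill value are assumptions. Rotation and zoom are omitted because they require interpolation.

```python
import numpy as np


def random_shift(image, max_frac, rng):
    """Shift a 2-D image by a random offset, filling uncovered pixels with 0."""
    h, w = image.shape
    dy = int(rng.integers(-int(max_frac * h), int(max_frac * h) + 1))
    dx = int(rng.integers(-int(max_frac * w), int(max_frac * w) + 1))
    shifted = np.zeros_like(image)
    # Source and destination windows have the same size by construction.
    src_y = slice(max(0, -dy), min(h, h - dy))
    src_x = slice(max(0, -dx), min(w, w - dx))
    dst_y = slice(max(0, dy), min(h, h + dy))
    dst_x = slice(max(0, dx), min(w, w + dx))
    shifted[dst_y, dst_x] = image[src_y, src_x]
    return shifted


# Create three extra variants of one (synthetic) "Normal" image.
rng = np.random.default_rng(0)
normal = rng.random((200, 200)).astype(np.float32)
augmented = [random_shift(normal, 0.1, rng) for _ in range(3)]
print(len(augmented), augmented[0].shape)
```

Each augmented copy keeps the label of the image it was derived from, which is how the number of "Normal" training examples can be increased.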

C. Convolutional Neural Network and Tools Selection
The main tools used in this work were NumPy, pandas, Keras, Jupyter Notebook, Matplotlib and seaborn [14]. The model was trained and tested locally, on a PC with the following hardware: AMD Ryzen 5 3600 CPU, Nvidia GeForce RTX 2060 SUPER GPU and 16 GB of RAM at 3200 MHz. Training the model took up to 90 minutes.
Image classification was performed with a CNN-based machine learning algorithm. The CNN is a class of deep learning neural networks. CNNs represent a huge breakthrough in image recognition and classification, where they are most commonly used. They consist of an input layer, an output layer and, between them, hidden layers. The hidden layers do most of the computational work. They include convolutional layers [13], and can also include pooling layers and fully connected layers.
The most important building block of a CNN is the convolutional layer: each neuron in a convolutional layer is connected only to a small number of neurons (its receptive field) in the previous layer [13]. Such a design allows the network to concentrate on small low-level features in the first hidden layer, and then assemble them into higher-level features in the next hidden layer, and so on [13]. This design is one of the main reasons why CNNs are successfully used in image recognition.
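The receptive-field idea can be illustrated with a minimal NumPy sketch of a single convolution filter (stride 1, no padding). The 3x3 vertical-edge kernel and the tiny 5x5 input are assumptions chosen for illustration. In a trained CNN the kernel weights are learned rather than fixed by hand.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide a kernel over a 2-D image (stride 1, no padding)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w), dtype=np.float32)
    for i in range(out_h):
        for j in range(out_w):
            patch = image[i:i + kh, j:j + kw]   # receptive field of output (i, j)
            out[i, j] = np.sum(patch * kernel)  # weighted sum over that field only
    return out


# 5x5 image: dark left part (0), bright right part (1).
image = np.zeros((5, 5), dtype=np.float32)
image[:, 2:] = 1.0
# Hand-made kernel that responds to dark-to-bright vertical edges.
kernel = np.array([[-1, 0, 1]] * 3, dtype=np.float32)
edges = conv2d_valid(image, kernel)
print(edges)  # each output row is [3, 3, 0]
```

Each output value depends only on a 3x3 patch of the input, which is the local connectivity described above. The filter responds where its window covers the edge and gives 0 in the flat region.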
The main purpose of pooling layers is to reduce the size of their input while retaining the most important information. This reduces the computational cost and memory usage. It also reduces the number of parameters, which lowers the risk of overfitting. Figure 2 shows a max pooling layer, which is the most common type of pooling layer [12,13]. Max pooling was used in this project as well.
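A max pooling layer with a 2 x 2 kernel, stride 2 and no padding can be sketched in a few lines of NumPy. The function name and the sample feature map are assumptions for illustration.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 and no padding on a 2-D array."""
    h2, w2 = feature_map.shape[0] // 2, feature_map.shape[1] // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]   # drop an odd last row/column, if any
    blocks = trimmed.reshape(h2, 2, w2, 2)    # group pixels into 2x2 blocks
    return blocks.max(axis=(1, 3))            # keep the largest value per block


fm = np.array([[1, 3, 2, 0],
               [5, 4, 1, 1],
               [0, 2, 9, 6],
               [1, 1, 7, 8]])
pooled = max_pool_2x2(fm)
print(pooled)  # [[5 2]
               #  [2 9]]
```

The 4x4 input becomes a 2x2 output, so later layers process a quarter as many values while the strongest activation in each region is kept.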
The activation function used in our experiments was ReLU. ReLU performs very well in deep neural networks, mostly because it does not saturate for positive values and because it is fast to compute. There are several variations of the ReLU activation function, such as leaky ReLU, parametric leaky ReLU and SELU [13]. In addition, dropout was used in the model to boost performance. Dropout is a technique in which some neurons are randomly shut down during training and are not used for that iteration. Simply adding dropout can give a network a 1-2% accuracy boost.
Fig. 2. Illustration of the max pooling layer (2 x 2 pooling kernel, stride 2, no padding) [13]
The architecture of the CNN used for this research is depicted in Figures 3 and 4. Figure 3 shows the printout of the model summary of the CNN model used for the experiments discussed in this paper. The details of the architecture of the CNN implementation are illustrated in Fig. 4.
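The ReLU activation and the dropout technique described above can be sketched as follows. The dropout rate of 0.5 is a hypothetical value, since the rate used in the experiments is not stated here. The sketch uses the common "inverted dropout" formulation, which rescales the surviving activations during training. This is an assumption about the implementation, not a detail taken from the paper.

```python
import numpy as np


def relu(x):
    """ReLU: pass positive values through, replace negative values with 0."""
    return np.maximum(0.0, x)


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero a random fraction of units during training only."""
    if not training or rate == 0.0:
        return activations                       # no units are dropped at inference
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)     # rescale so the expected sum is unchanged


rng = np.random.default_rng(0)
z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])
a = relu(z)                                      # negatives become 0; 1.5 and 3.0 pass through
dropped = dropout(a, rate=0.5, rng=rng)          # hypothetical rate of 0.5
print(a, dropped)
```

At inference time (`training=False`) the activations are returned unchanged, so all neurons contribute to the prediction.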
Similar approaches were used in [15] and [16]. In [16], the same dataset was used, but the authors kept the original split of the data into training, validation and test subsets.
The progress of the training and validation accuracy and loss over time is shown in Figures 5 and 6. They show that both the training and validation accuracy improved over time, especially after the 12th epoch. Figure 6 shows that the validation loss has spikes in the last 30 epochs, especially at epoch 198, but by the end it settles at a good level near 0. The loss is a way to analyze how well the model is doing on both the training and validation sets. It measures the error the model makes on each example in these sets and sums these errors.
One additional way of representing the results of the model is to build a confusion matrix [13]. The Y-axis of the confusion matrix holds the predicted values, while the X-axis holds the true values. The confusion matrix for our latest experiment is illustrated in Fig. 7. With the trained model, 334 out of 381 X-ray images with pneumonia were correctly predicted as such, while 187 out of 205 X-ray images without pneumonia were correctly predicted as normal. This gives a model accuracy of approximately 88.9% (521 of 586 test images), which is comparable to the results in [15,16].
This paper describes the use of deep learning to classify digital images of chest X-rays according to the presence or absence of changes consistent with pneumonia. The implementation was based on a CNN model built with Python and its scientific tools. Initial experiments show promising results, but more research is needed. Even though the model accuracy is relatively high, nearly 90%, there is a possibility of overfitting due to the size of the dataset. An accuracy of nearly 90% means that the prediction model could potentially be used as a decision support tool, but there is still much work to be done.
The proper diagnosis of any kind of disease still requires the involvement and presence of medical specialists. In order to build a good and reliable disease classification model, it is very important to gather as much data as possible.
Further research steps will include experimenting with various preprocessing and CNN configurations, data augmentation techniques, as well as using additional X-ray datasets with additional data labels showing other pathologies.
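As a check on the results read from the confusion matrix (Fig. 7), the reported counts can be turned into accuracy and related metrics with a few lines of Python. Only the four counts come from the paper. Sensitivity, specificity and precision are not reported there and are derived here from the same counts as an illustration.

```python
# Counts taken from the confusion matrix described in the text.
tp = 334          # pneumonia images predicted as pneumonia
fn = 381 - tp     # pneumonia images predicted as normal
tn = 187          # normal images predicted as normal
fp = 205 - tn     # normal images predicted as pneumonia

total = tp + fn + tn + fp
accuracy = (tp + tn) / total       # share of all test images classified correctly
sensitivity = tp / (tp + fn)       # share of pneumonia cases that were detected
specificity = tn / (tn + fp)       # share of normal cases recognized as normal
precision = tp / (tp + fp)         # share of pneumonia predictions that were correct

print(f"total={total} accuracy={accuracy:.4f}")
print(f"sensitivity={sensitivity:.4f} specificity={specificity:.4f} precision={precision:.4f}")
```

The accuracy comes out at about 0.889 (521 of 586), which matches the figure reported above. For a decision support tool, sensitivity is worth watching alongside accuracy, because the 47 missed pneumonia cases are likely to be the more costly kind of error.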