Automatic colon polyp detection using Convolutional encoder-decoder model

Colorectal cancer is a leading cause of cancer death, with an estimated 696 thousand deaths worldwide. Recent years have seen growing use of deep learning techniques to detect colon polyps. In this work we address colon polyp detection using Convolutional Neural Networks (CNNs) combined with autoencoders. We train the model on three publicly available databases: CVC-ColonDB, CVC-ClinicDB and ETIS-LaribPolypDB. The accuracies obtained on these databases are 0.937, 0.951 and 0.967, respectively. Because colon polyps vary widely in shape and other characteristics, there is still room for improvement.


I. Introduction
In 2012 the International Agency for Research on Cancer ranked colorectal cancer (CRC) fourth among cancers by estimated number of deaths worldwide, with around 696 thousand deaths [1]. The major concerns, however, are the high incidence of this cancer (3rd place, with around 1.3 million cases) and the fact that it can be prevented through effective screening tests [2][3][4]. Among the screening tests used to diagnose CRC, colonoscopy is the preferred technique for both screening and prevention. Colorectal cancer usually begins as a growth on the inner surface of the colon, known as a polyp, which can later develop into cancer. During a colonoscopy, medical personnel can identify and remove colon polyps/adenomas before they progress into colon cancer. Colonoscopy has been a successful preventative procedure and has contributed to a 30% decline in the incidence of colorectal cancer [2]. A more recent study on the impact of screening on colorectal cancer mortality and incidence suggests that screening colonoscopy reduces CRC mortality by more than 50% [3].
Although colonoscopy has reduced the mortality and incidence of CRC, the miss rate of colon polyps remains high. A retrospective observational study published in 2017 [4] shows that among 659 patients, the miss rate of colorectal polyps was 17.24% (372 out of 2158 polyps), and 38.69% of patients (255 out of 659 patients) had at least 1 missed polyp. A missed polyp can lead to a late diagnosis of colon cancer; for metastatic colon cancer, the survival rate is below 10%. Computer-aided polyp detection provides a tool to help colonoscopists reduce polyp miss rates.
In recent years researchers have been applying deep learning techniques to detect colon polyps [5][6][7][8][9]. Examples of these contributions include Tajbakhsh et al. using a pre-trained deep convolutional neural network (CNN) to detect colon polyps [5]; Ribeiro et al. proposing a method that uses small patches (subimages) both to increase the size of the database and to classify different regions in the same image [6]; and an exploration of deep learning for the automated classification of colonic polyps, using different configurations for training CNNs from scratch and distinct architectures of pretrained CNNs, tested on 8 HD-endoscopic image databases [7]. The authors of [8] evaluate and analyze the use of CNNs as general feature descriptors, using transfer learning to generate CNN features for colon polyp classification, while the authors of [9] developed a convolutional neural network to detect and classify two types of colon polyps, hyperplastic and adenomatous, by transferring different layers of low-level CNN features learned from two online nonmedical databases.
Our approach is based on a convolutional neural network combined with autoencoders. We tested the model on three colon databases, CVC-ColonDB [10], CVC-ClinicDB [11] and ETIS-LaribPolypDB [12], which, owing to their widespread use, are now standard in the development of polyp detection techniques. These databases are open to the public for research purposes.
Figure 1. Some of them have the same texture as the colon and grow horizontally, leading to polyp misdetections.

II. Methods
We used the TensorFlow library [13] to train the convolutional encoder-decoder model, a TensorFlow implementation of the SegNet architecture [14], which was first implemented in another deep learning framework, Caffe. All training was performed using TensorFlow 1.3.0 with CUDA 8 on an NVIDIA Titan X GPU. The architecture of the CNN-Autoencoder algorithm is depicted in Figure 2. The first part represents the encoder and the second part the decoder. In such models, the encoder often has a structure similar to that of image classification neural networks. The decoder layers are the inverses of the layers used in the encoder (e.g. for each convolution in the encoder there is a deconvolution layer in the decoder; for each max_pool, some form of "demax_pool").
In the encoder, we use three similar "modules", each consisting of a convolution layer with stride 2, followed by a convolution layer with stride 1 and a non-overlapping max_pool with kernel 2. In the decoder, each encoder layer has its "counterpart", chosen so that the network output dimension equals the input dimension:
• for a non-shrinking convolution layer, use the same layer
• for a shrinking convolution layer, use a transposed convolution (deconvolution) with the same arguments
• for a max_pool layer, use nearest-neighbor upsampling (tf.image.resize_nearest_neighbor)
Figure 2. Convolutional encoder-decoder architecture
1 These images are taken from the colon polyp database that our group, eVIDA, is working on. This database will be made publicly available to researchers soon.
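The layer pairing described above can be checked with a small shape calculation. The following is a minimal sketch, not the authors' TensorFlow code: it assumes SAME padding for the convolutions and a stride of 2 for the max_pool (neither is stated in the text), and all function names are hypothetical. It traces one spatial dimension through the three encoder modules and their decoder counterparts, and mimics nearest-neighbor upsampling by an integer factor with NumPy.

```python
import math

import numpy as np


def conv_same(size, stride):
    """Output size of a convolution along one axis (assumption: SAME padding)."""
    return math.ceil(size / stride)


def encoder_size(size, modules=3):
    """Trace one spatial dimension through the encoder modules."""
    for _ in range(modules):
        size = conv_same(size, 2)  # shrinking convolution, stride 2
        size = conv_same(size, 1)  # non-shrinking convolution, stride 1
        size = size // 2           # max_pool, kernel 2 (assumption: stride 2)
    return size


def decoder_size(size, modules=3):
    """Trace one spatial dimension through the mirrored decoder modules."""
    for _ in range(modules):
        size = size * 2            # nearest-neighbor upsampling undoes max_pool
        size = conv_same(size, 1)  # non-shrinking convolution, same layer
        size = size * 2            # transposed convolution, stride 2
    return size


def upsample_nearest(x, factor=2):
    """Nearest-neighbor upsampling of a 2-D array by an integer factor."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)


if __name__ == "__main__":
    for side in (256, 384, 100):
        bottleneck = encoder_size(side)
        print(side, "->", bottleneck, "->", decoder_size(bottleneck))
    print(upsample_nearest(np.array([[1, 2], [3, 4]])))
```

Under these assumptions each module shrinks the input by a factor of 4, so the three modules shrink it by 64 in total. An input side that is a multiple of 64 (e.g. 256) is therefore restored exactly, while other sizes (e.g. 100) are not and would need resizing or padding first.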
Medical image databases typically contain few images, so many researchers use image augmentation when training a model on them. In our model we use the imgaug library [15], an open source image augmentation library in Python. Figure 3 shows the result of applying some augmenters to an image. Some of the image augmentations that we used are:
• Crop - crops away pixels,
• Fliplr - flips the image from left to right,
• Flipud - flips the image from up to down,
• GaussianBlur - blurs images using a Gaussian kernel with size s,
• Dropout - sets pixels to zero with probability P,
• AdditiveGaussianNoise - adds white/Gaussian noise pixelwise to an image,
• Affine - applies affine transformations to images, such as scaling, translating, rotating and shearing.
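To illustrate what some of these augmenters do, the following NumPy-only sketch re-implements four of them for a grayscale image and its ground-truth mask. It is not the imgaug API and not the authors' pipeline: the function names, probabilities and noise scale are assumptions chosen for illustration. It also assumes, as is usual for segmentation, that geometric augmenters (the flips) are applied identically to the image and its mask, while pixel-value augmenters (dropout, noise) change only the image.

```python
import numpy as np


def fliplr(image, mask):
    """Flip image and mask from left to right."""
    return image[:, ::-1], mask[:, ::-1]


def flipud(image, mask):
    """Flip image and mask from up to down."""
    return image[::-1, :], mask[::-1, :]


def dropout(image, p, rng):
    """Set each pixel to zero with probability p."""
    keep = rng.random(image.shape) >= p
    return image * keep


def additive_gaussian_noise(image, scale, rng):
    """Add zero-mean Gaussian noise pixelwise, then clip back to uint8."""
    noisy = image.astype(np.float64) + rng.normal(0.0, scale, image.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)


def augment(image, mask, rng):
    """Apply a random combination of the augmenters above (hypothetical settings)."""
    if rng.random() < 0.5:
        image, mask = fliplr(image, mask)
    if rng.random() < 0.5:
        image, mask = flipud(image, mask)
    image = dropout(image, p=0.05, rng=rng)
    image = additive_gaussian_noise(image, scale=10.0, rng=rng)
    return image, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)
    mask = np.zeros((8, 8), dtype=np.uint8)
    mask[2:5, 2:5] = 1  # toy "polyp" region
    aug_image, aug_mask = augment(image, mask, rng)
    print(aug_image.shape, int(aug_mask.sum()))
```

Each call to augment yields a new training pair from the same source image, which is how augmentation enlarges a small database without collecting new data.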

III. Results and Discussion
We trained our model from scratch using the selected databases. As each database has a different number of images, the time to train the model varied. Table 1 lists, for each database, the best accuracy, the batch at which it was achieved, and the total training time. The best accuracy of 0.967 was achieved on ETIS-LaribPolypDB. Polyps vary in shape and characteristics, from evident protuberances on the inner surface of the colon to barely distinguishable circular shapes; this wide variation leads to recognition errors. In Figure 4 we present an example of the results of the proposed approach. We have selected three polyps, one from each of the databases (ETIS, ClinicDB and ColonDB, corresponding to the left, middle and right columns of Figure 4, respectively). The first row of Figure 4 shows the grayscale images of the polyps; note that the polyp in the left column is barely visible, while the polyp in the right column is an evident, easily recognizable protuberance. The second row shows the database ground truth. The last row shows the segments obtained with the CNN-Autoencoder algorithm. These results are good and correspond to the high accuracy we obtained. However, after visually inspecting all the output images, we found that some of the masks are not as good as expected. This is due to the shape and texture of the polyps and the lighting conditions under which the image was taken. There are cases where the algorithm identifies healthy tissue as polyps, as shown in Figure 5.
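The text does not define how accuracy is computed. One common choice for segmentation masks is pixel accuracy, the fraction of pixels on which the predicted mask and the ground truth agree. The sketch below assumes that definition; the function name and the toy masks are hypothetical and do not come from the databases above.

```python
import numpy as np


def pixel_accuracy(pred_mask, true_mask):
    """Fraction of pixels where prediction and ground truth agree."""
    pred = np.asarray(pred_mask).astype(bool)
    true = np.asarray(true_mask).astype(bool)
    if pred.shape != true.shape:
        raise ValueError("masks must have the same shape")
    return float(np.mean(pred == true))


if __name__ == "__main__":
    # Toy ground truth: a 3x3 "polyp" in a 10x10 image.
    true_mask = np.zeros((10, 10), dtype=np.uint8)
    true_mask[2:5, 2:5] = 1

    # A prediction that covers only part of the polyp.
    pred_mask = np.zeros((10, 10), dtype=np.uint8)
    pred_mask[2:5, 2:4] = 1

    print(pixel_accuracy(pred_mask, true_mask))
```

Under this assumed definition, correctly labelled background pixels count toward the score, so the figure is best read together with a visual check of the masks, as done above.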

IV. Conclusion
The model shows promise, although the results are not as good as expected. The same model was used on two other medical image datasets, pressure ulcers and iris, and the results were much better than with colon polyp images. One of the reasons is the difficult nature of colon polyps. Some polyps have round shapes that are very distinct from the rest of the colon, while others have the same texture as the colon, making them very difficult to detect.
In this paper we only tested the algorithm, without making any changes to it and without any image preprocessing. We are currently making some changes to the model and adding four image augmentations that are not yet implemented in the imgaug library. We will train the new model with a new colon polyp database that the eVIDA group has been developing in collaboration with hospitals in the Basque Country.

V. Acknowledgement
Ornela Bardhi received funding from the European Union's Horizon 2020 CATCH ITN project under the Marie Sklodowska-Curie grant agreement no. 722012.