Published December 10, 2020 | Version v1
Dataset Open

Malaria Blood Smear Image Dataset Creation

Description

Dataset Creation

The dataset was collected in Tanzania. We obtained ethical clearance to collect samples from patients who tested positive for malaria as well as from those who tested negative. The blood samples were stained, and images were taken using an iPhone 6s mounted on top of an Olympus microscope. Afterward, the images were labeled by three lab technologists, who drew bounding boxes around the malaria parasites and white blood cells.

 

Ethical Statement

The nationally recognized ethics committee at The University of Dodoma and Benjamin Mkapa Hospital Research Center approved this research. It granted permission to isolate the samples of positive cases in order to capture images for the purpose of this research. The data were collected from patients with suspected malaria who voluntarily went to the hospital for diagnosis and treatment. We took images of what the lab technician was examining under the microscope. To protect patient privacy, no details about patient identity were recorded other than the images of their stained blood samples, age, gender, and location.

 

Sample Collected

The samples were collected from patients whom a doctor had referred for a malaria test. A total of 40 cases were included in the present study, of whom 20 had a confirmed positive malaria result. Their mean age was 23.7 years (SD: 17.9 years); 44.3% of cases were male and 53.7% were female. The mean ages of confirmed positive and confirmed negative cases were 23.30 ± 17.7 and 25.89 ± 18.7 years, respectively. All cases were residents of the Morogoro Region in Tanzania, which has a higher rate of malaria.

 

Reagent Preparation

Before a blood sample could be observed and photographed under the microscope, it had to be stained with a reagent. For this purpose, a buffer solution was prepared from 1 liter of distilled water and 1 buffer tablet, with the aim of obtaining a pH 7.2 solution. A Giemsa working solution was then prepared by adding 2.5 ml of Giemsa stain stock to 25 ml of water, making a 10% concentration. The working solution was filtered through circular filter paper. After filtration, the samples were placed horizontally and stained for 10 minutes. The stained samples were washed with tap water and placed vertically on a staining rack so that the water could run off. At this stage, the dried stained blood samples were ready for observation under the 100x magnification of the Olympus microscope.
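The dilution arithmetic can be checked with a short sketch. It assumes simple volume/volume mixing, and the function name is illustrative rather than taken from the source. Adding 2.5 ml of stock to 25 ml of water gives roughly 9.1% v/v, whereas 2.5 ml of stock made up to a final volume of 25 ml gives exactly 10%; the description does not say which of the two was intended.

```python
def giemsa_percent(stock_ml: float, diluent_ml: float) -> float:
    """Return the stock concentration as % v/v of the final mixture.

    Assumption: volumes are additive (simple v/v mixing).
    """
    total_ml = stock_ml + diluent_ml
    if total_ml <= 0:
        raise ValueError("total volume must be positive")
    return 100.0 * stock_ml / total_ml


# Reading 1: 2.5 ml of stock added to 25 ml of water (27.5 ml in total).
added_to_25 = giemsa_percent(2.5, 25.0)

# Reading 2: 2.5 ml of stock made up to 25 ml in total (22.5 ml of water).
made_up_to_25 = giemsa_percent(2.5, 22.5)

print(f"added to 25 ml of water: {added_to_25:.2f}% v/v")
print(f"made up to 25 ml total:  {made_up_to_25:.2f}% v/v")
```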

 

Image Collection

This phase involved using a smartphone (iPhone 6s+) to capture images of stained blood smears observed under a microscope. A small amount of immersion oil was applied to the stained thick blood smear to enhance visibility. The slide was then placed under an Olympus CX 21 microscope for observation. The lens used had 100x magnification, as recommended by the WHO (D Payne, 1988). A lab technician continuously adjusted the microscope to keep the field in focus. The iPhone 6s+ was mounted on the microscope using the Labcam Microscope Adapter, as shown in figure 1, and pictures were taken.

Standard malaria diagnosis involves a lab technologist examining no fewer than 100 fields for each slide under observation (D Payne, 1988). Therefore, approximately 100 images were captured for every blood smear slide placed under observation. For the 100 patients, a total of 10,000 images were captured, with 5000 images from positive patients. All 5000 images from the 50 positive (infected) patients required annotation (labeling of the parasites and white blood cells), whereas the 5000 images from uninfected patients did not require any annotation. The images were captured in JPG format, with a resolution of 4302 x 3204 pixels and a size of approximately 1 MB each. The images were stored in folders labeled with the date the slide was taken followed by a sample number, which identifies the images.
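Since each slide is expected to yield roughly 100 images stored in its own folder, a quick count per folder can flag incomplete slides after download. The sketch below is only an illustration: the exact folder naming pattern is not given here, so folder names are treated as opaque strings, and the function names and the 100-image threshold are assumptions.

```python
from collections import Counter
from pathlib import Path


def count_images_per_slide(root):
    """Map each folder name under `root` to the number of JPG files it holds."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        # Accept both .jpg and .jpeg, in any letter case.
        if path.is_file() and path.suffix.lower() in {".jpg", ".jpeg"}:
            counts[path.parent.name] += 1
    return dict(counts)


def slides_below(counts, minimum=100):
    """Return the folder names that hold fewer than `minimum` images."""
    return sorted(name for name, n in counts.items() if n < minimum)
```

For example, `slides_below(count_images_per_slide("malaria_dataset"))` would list the slide folders with fewer than 100 images, where `malaria_dataset` is a hypothetical extraction directory.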

 

Image Annotation

A team of three experts from the College of Health Science of the University of Dodoma and Benjamin Mkapa Hospital annotated the 2000 images in total. The images were annotated using the LabelImg annotation tool. The annotation involved creating bounding boxes for the plasmodium and white blood cell classes. Annotators were instructed to label a target class by drawing the smallest possible box that contains all the visible parts of the plasmodium or white blood cell. The output of the annotation was a Pascal VOC XML file recording where the image is stored, the size of the image, the filename, and the bounding box coordinates of all objects present in the image (plasmodium and white blood cells). Annotating a single image with few parasites (fewer than 20) took approximately 2 minutes, while an image with many parasites (more than about 100) took around 15 to 20 minutes. In general, annotating the images of one patient's stained blood sample took about 8 hours. The table below shows a summary of the dataset that was created in this stage.
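The Pascal VOC XML files written by LabelImg can be read with the Python standard library. The following is a minimal sketch, not the authors' own tooling: the sample XML is invented for illustration, and the class label strings (`plasmodium`, `wbc`), file names, and coordinates in it are assumptions that may differ from the labels actually used in the dataset.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical annotation in the layout LabelImg writes for Pascal VOC.
SAMPLE_XML = """<annotation>
  <folder>slide_001</folder>
  <filename>image_001.jpg</filename>
  <path>/data/slide_001/image_001.jpg</path>
  <size><width>4302</width><height>3204</height><depth>3</depth></size>
  <object>
    <name>plasmodium</name>
    <bndbox><xmin>120</xmin><ymin>340</ymin><xmax>160</xmax><ymax>385</ymax></bndbox>
  </object>
  <object>
    <name>wbc</name>
    <bndbox><xmin>900</xmin><ymin>1000</ymin><xmax>1040</xmax><ymax>1150</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Return the filename, image size, and labeled boxes of one annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(
            int(bndbox.findtext(key)) for key in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append((obj.findtext("name"), coords))
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "boxes": boxes,
    }


annotation = parse_voc(SAMPLE_XML)
class_counts = Counter(label for label, _ in annotation["boxes"])
print(annotation["filename"], dict(class_counts))
```

To read a file from disk instead of a string, `ET.parse(path).getroot()` can replace `ET.fromstring(xml_text)`.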

 

Files

malaria_dataset.zip

Files (148.9 MB)

md5:785f978f766047a7717ebb8dcaeedcdf
148.9 MB
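
The listed MD5 checksum can be used to confirm that the archive downloaded intact. A minimal sketch using the Python standard library follows; the local file path is an assumption, and the helper name is illustrative.

```python
import hashlib

# Checksum as published on this page for malaria_dataset.zip.
EXPECTED_MD5 = "785f978f766047a7717ebb8dcaeedcdf"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

After downloading, `md5_of_file("malaria_dataset.zip") == EXPECTED_MD5` should evaluate to `True` if the file is complete, assuming the archive was saved under that name in the working directory.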