Published 2026 | Version v1
Dataset Open

Decision-support system for live detection of Leishmania parasites from microscopic images with Deep Learning

  • 1. Technical University of Applied Sciences Wildau
  • 2. University of Tripoli
  • 3. Al-Quds University
  • 4. Kharkiv National University of Radio Electronics
  • 5. Robert Koch Institute
  • 6. University of Tripoli Faculty of Medicine

Description

1. Overview  

This dataset consists of microscopic images of Giemsa-stained skin smears obtained from patients diagnosed with cutaneous leishmaniasis (CL).  
It is organized into two main parts:  

  • Dataset 1  → Collected with a Keyence BZ9000E digital microscope (lab-based).  
  • Dataset 2  → Extended dataset including all images from Dataset 1, plus an additional set collected with a Bresser Erudit DLX microscope (portable, low-cost).  

Both datasets contain paired Images (.png) and Labels (.txt), split into train, val, and test subsets.
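
A quick way to confirm the image/label pairing after unpacking the archive is sketched below. It assumes the Images/<split>/ and Labels/<split>/ layout described in section 3. The function name find_unpaired and the "split/stem" report format are illustrative choices, not part of the dataset or its repository.

```python
from pathlib import Path

SPLITS = ("train", "val", "test")


def find_unpaired(dataset_root):
    """Return (images_without_label, labels_without_image) as lists of 'split/stem'."""
    root = Path(dataset_root)
    missing_labels, missing_images = [], []
    for split in SPLITS:
        # Compare file names without extension inside the same split.
        image_stems = {p.stem for p in (root / "Images" / split).glob("*.png")}
        label_stems = {p.stem for p in (root / "Labels" / split).glob("*.txt")}
        missing_labels += [f"{split}/{s}" for s in sorted(image_stems - label_stems)]
        missing_images += [f"{split}/{s}" for s in sorted(label_stems - image_stems)]
    return missing_labels, missing_images
```

Calling find_unpaired("Dataset_1") on the unpacked archive should return two empty lists if every image has its label file and vice versa.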

2. Data Acquisition

Dataset 1  

  • Patients: 244 Libyan CL patients (confirmed by PCR at Tripoli University Hospital)
  • Samples: Skin lesion smears (slit-skin or touch smears)
  • Preparation: Air-dried, methanol-fixed, Giemsa-stained slides
  • Imaging Setup:  
    • Microscope: Keyence BZ9000E (lab-grade)  
    • Magnification: 100× oil immersion objective  
    • Numerical Aperture (NA): 1.3  
    • Resolution: 0.21 μm  
  • Image Count:  
    • 350 positive images (parasite densities: 1–100 amastigotes per image)  
    • 220 negative images (no parasites, controls)  
    • Total: 570 images 

Dataset 2  

  • Patients: Additional cohort (6 patients)
  • Samples: Same smear preparation method
  • Imaging Setup:  
    • Microscope: Bresser Erudit DLX (portable, battery-powered)  
    • Camera: BRESSER MikroCam SP 5.0  
    • Magnification: 100× oil immersion objective  
    • Numerical Aperture (NA): 1.25  
    • Resolution: 0.22 μm  
  • Image Count:  
    • 106 positive images  
    • 58 negative images  
    • Total: 164 images   
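
The resolution values listed for the two microscopes are consistent with the Abbe diffraction limit d = λ / (2 · NA). The description does not say which criterion or wavelength was used, so the sketch below is only a plausibility check: the 0.55 μm (green light) wavelength is an assumption, and the function name is illustrative.

```python
def abbe_resolution_um(wavelength_um, numerical_aperture):
    """Abbe lateral resolution limit d = wavelength / (2 * NA), in micrometres."""
    return wavelength_um / (2.0 * numerical_aperture)


# Assumption: green light around 550 nm; not stated in the dataset description.
ASSUMED_WAVELENGTH_UM = 0.55

for name, na in [("Keyence BZ9000E", 1.3), ("Bresser Erudit DLX", 1.25)]:
    d = abbe_resolution_um(ASSUMED_WAVELENGTH_UM, na)
    print(f"{name}: NA {na} -> {d:.2f} um")
```

Under this assumption the two objectives come out at 0.21 μm and 0.22 μm, matching the listed values.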

👉 The Dataset_2 folder contains all Dataset 1 images (Keyence) plus the additional Dataset 2 images (Bresser), i.e. the extended dataset.

3. Directory Structure

Dataset_1/
│
├── Images/
│   ├── train/   # dataset_1_image_1.png ... dataset_1_image_398.png
│   ├── val/     # continues numbering from train
│   └── test/
│
└── Labels/
    ├── train/   # dataset_1_image_1.txt ...
    ├── val/
    └── test/

Dataset_2/
│
├── Images/
│   ├── train/   # contains both dataset_1 and dataset_2 images
│   ├── val/
│   └── test/
│
└── Labels/
    ├── train/
    ├── val/
    └── test/

  • Naming Convention:
    • dataset_1_image_X.png for Dataset 1 images
    • dataset_2_image_X.png for the additional Dataset 2 images
    • Labels use the same name and numbering with a .txt extension
  • Splits:
    • Image numbering runs sequentially through train, then val, then test
    • Example: Train = images 1–398, Val = 399–…, Test = continues onward
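
Because the splits follow the image numbering, it can help to sort files by their numeric index rather than alphabetically (otherwise image_10 sorts before image_2). The following is a minimal sketch based only on the naming convention above; the regular expression and the function name parse_name are illustrative.

```python
import re

# Matches dataset_<N>_image_<X>.png and the corresponding .txt label name.
FILENAME_RE = re.compile(r"^dataset_(\d+)_image_(\d+)\.(png|txt)$")


def parse_name(filename):
    """Return (dataset_id, image_index) for a dataset_N_image_X file name."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return int(match.group(1)), int(match.group(2))


names = ["dataset_1_image_10.png", "dataset_2_image_1.png", "dataset_1_image_2.png"]
# Sort by dataset id first, then by numeric image index.
print(sorted(names, key=parse_name))
```

The same key works for label files, so an image list and a label list sorted this way line up one to one.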

4. Labels & Schema  

  • Image format: .png
  • Label format: .txt (YOLO-style bounding boxes)
  • Each line = one object (parasite body)

Format: class_id x1 y1 x2 y2 x3 y3 x4 y4

  • class_id:  
    • 0 = parasite  
  • Coordinates are normalized by image width and height.  
  • After class_id, the values are given in pairs: (x1 y1 x2 y2 … x4 y4)
  • Every .png has a corresponding .txt file in the same split (train/, val/, test/).
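
The label schema above can be read with a few lines of Python. This is a sketch under stated assumptions: values are separated by whitespace, each line carries exactly four (x, y) pairs after class_id, and blank lines are skipped. The function names are illustrative, and the image width and height must be supplied by the caller (for example from the opened .png).

```python
def parse_label_line(line, image_width, image_height):
    """Parse 'class_id x1 y1 ... x4 y4' into (class_id, [(x_px, y_px), ...])."""
    fields = line.split()
    class_id = int(fields[0])
    values = [float(v) for v in fields[1:]]
    if len(values) != 8:
        raise ValueError(f"expected 8 coordinates, got {len(values)}")
    # Coordinates are normalized: x by image width, y by image height.
    points = [
        (values[i] * image_width, values[i + 1] * image_height)
        for i in range(0, 8, 2)
    ]
    return class_id, points


def read_labels(label_path, image_width, image_height):
    """Read one label file and return a list of (class_id, points) tuples."""
    with open(label_path) as handle:
        return [
            parse_label_line(line, image_width, image_height)
            for line in handle
            if line.strip()
        ]
```

For example, parse_label_line("0 0.25 0.25 0.75 0.25 0.75 0.5 0.25 0.5", 1000, 800) yields class 0 with the pixel points (250, 200), (750, 200), (750, 400) and (250, 400).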

5. Dataset Relations  

  • Dataset 1 = base dataset (lab microscope, high-quality).  
  • Dataset 2 = superset (Dataset_1 + portable microscope data).  
  • Train/Val/Test subsets are disjoint: with sequential indexing, each image belongs to exactly one split, which prevents data leakage between splits.
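
The disjointness of the splits can be checked directly on the unpacked folders. The sketch below assumes the Images/<split>/ layout from section 3 and compares file names only. It does not detect duplicated image content or several images from the same patient in different splits. The function names are illustrative.

```python
from pathlib import Path


def split_stems(dataset_root, splits=("train", "val", "test")):
    """Map each split name to the set of image file names (without extension)."""
    root = Path(dataset_root)
    return {s: {p.stem for p in (root / "Images" / s).glob("*.png")} for s in splits}


def overlapping_stems(stems_by_split):
    """Return {(split_a, split_b): shared_names} for each pair sharing a name."""
    names = list(stems_by_split)
    overlaps = {}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            shared = stems_by_split[first] & stems_by_split[second]
            if shared:
                overlaps[(first, second)] = shared
    return overlaps
```

An empty result from overlapping_stems(split_stems("Dataset_2")) means that no file name appears in more than one split.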

Files

Leishmania_Dataset.zip (2.7 GB)
md5:791a2387b6c596636bd17fb15f19007c

Additional details

Software

  • Repository URL: https://github.com/ZKI-PH-ImageAnalysis/leishmania
  • Programming language: Python
  • Development status: Concept