Comprehensive High-Resolution Eggplant Leaf Image Dataset for Plant Disease Detection
Description
This is a comprehensive version of the Eggplant Leaf Image Dataset, designed to support machine learning and deep learning research in agriculture, plant pathology, and computer vision. It addresses class imbalance and model generalization by expanding the image collection through controlled data augmentation.
The dataset contains 2,180 high-resolution images (6000×4000 pixels), of which 255 are original and 1,925 are augmented, categorized into six disease or health classes of Solanum melongena (eggplant) leaves:
Class | Original Images | Augmented Images | Total Images |
---|---|---|---|
Healthy | 80 | 320 | 400 |
Insect-Pest | 40 | 320 | 360 |
Leaf-Spot | 50 | 300 | 350 |
Mosaic-Virus | 15 | 345 | 360 |
Small-Leaf | 20 | 340 | 360 |
Wilt | 50 | 300 | 350 |
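The counts in the table can be checked with a few lines of Python. This is a minimal sketch: the numbers are copied from the table above, and the ratio is computed as augmented images per original image, which is an assumption about how metadata.csv defines its augmentation ratio.

```python
# Per-class counts copied from the table above: (original, augmented).
counts = {
    "Healthy": (80, 320),
    "Insect-Pest": (40, 320),
    "Leaf-Spot": (50, 300),
    "Mosaic-Virus": (15, 345),
    "Small-Leaf": (20, 340),
    "Wilt": (50, 300),
}


def summarize(counts):
    """Return per-class totals and augmented-to-original ratios."""
    summary = {}
    for name, (original, augmented) in counts.items():
        summary[name] = {
            "total": original + augmented,
            # Assumed definition: augmented images per original image.
            "ratio": augmented / original,
        }
    return summary


summary = summarize(counts)
grand_total = sum(row["total"] for row in summary.values())

for name, row in summary.items():
    print(f"{name:<13} total={row['total']:>4}  augmented/original={row['ratio']:.1f}")
print("All classes:", grand_total)
```

The ratios differ widely between classes because the classes with the fewest original images received proportionally more augmentation to reach a similar total.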
All original images were captured using a Canon EOS 1300D DSLR camera under consistent natural lighting conditions. Files are saved in JPG format, and image resolution is preserved within ±5% of the original dimensions to maintain visual fidelity.
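The ±5% resolution bound can be expressed as a small check. The sketch below assumes the bound applies to width and height independently and that the nominal size is 6000×4000 in landscape orientation; `within_tolerance` is a hypothetical helper and not part of the dataset.

```python
NOMINAL_SIZE = (6000, 4000)  # (width, height) stated in the description
TOLERANCE = 0.05             # the ±5% figure from the description


def within_tolerance(size, nominal=NOMINAL_SIZE, tolerance=TOLERANCE):
    """Return True if width and height each lie within ±tolerance of nominal."""
    return all(
        abs(actual - expected) <= tolerance * expected
        for actual, expected in zip(size, nominal)
    )


print(within_tolerance((6000, 4000)))
print(within_tolerance((5800, 3900)))
print(within_tolerance((6000, 3000)))
```

With Pillow installed, the `size` attribute of an opened image is a `(width, height)` tuple and can be passed to this function directly.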
To improve dataset usability for robust model training and generalization, controlled data augmentation was applied using the Albumentations library. The transformations include random rotation, horizontal flipping, brightness/contrast adjustments, slight color shifts, and padding to maintain aspect ratio. All augmentation procedures were consistently applied and seeded for reproducibility. Augmentation parameters are documented in detail in the metadata.
The metadata.csv file provides a class-wise summary: original image count, augmented image count, augmentation ratio, and the exact augmentation pipeline used.
Note: Original and augmented images are stored separately in the "Original" and "Augmented" directories. Each directory is organized into six class-specific subfolders: Healthy, Insect-Pest, Leaf-Spot, Mosaic-Virus, Small-Leaf, and Wilt. Augmented images can be identified by the substring "_aug_" in their filenames. This separation supports reproducibility, keeps data provenance transparent, and lets researchers train models on original images only, augmented images only, or both.
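The folder layout described in this note makes it simple to select a subset by provenance. The following sketch uses only the Python standard library; the extraction root and the `list_images` helper are assumptions for illustration, while the directory names, class names, and the "_aug_" marker come from the description.

```python
from pathlib import Path

CLASSES = ["Healthy", "Insect-Pest", "Leaf-Spot", "Mosaic-Virus", "Small-Leaf", "Wilt"]


def list_images(root, include_original=True, include_augmented=True):
    """Return (path, class_name, is_augmented) tuples for the selected subsets."""
    root = Path(root)
    records = []
    subsets = (("Original", include_original), ("Augmented", include_augmented))
    for subset, wanted in subsets:
        if not wanted:
            continue
        for class_name in CLASSES:
            class_dir = root / subset / class_name
            if not class_dir.is_dir():
                continue  # tolerate partially extracted archives
            for path in sorted(class_dir.iterdir()):
                if path.suffix.lower() not in {".jpg", ".jpeg"}:
                    continue  # skip anything that is not a JPG image
                # The filename marker gives a second, folder-independent check.
                records.append((path, class_name, "_aug_" in path.name))
    return records
```

For example, `list_images(root, include_augmented=False)` returns only the field-captured images, which is a natural choice for a held-out test set so that augmented variants of a training image do not leak into evaluation.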
Files:
- EggplantLeaf-ImageDataset.zip: Contains all files and folders, including Original, Augmented, metadata.csv, and Readme.md.
- Original: Contains only raw field-captured images, grouped by class.
- Augmented: Contains the synthetically expanded images, also organized by class. Augmented filenames include the marker "_aug_" for easy identification.
- metadata.csv: Class-level summary and augmentation details.
- Readme.md: Technical documentation and usage notes.
Additional details
Dates
- Collected: 2025-04-12