Image classification in Galaxy with fruit 360 dataset
Description
Credit: 'Fruit recognition from images using deep learning' by H. Muresan and M. Oltean (https://arxiv.org/abs/1712.00580)
Fruit 360 is a dataset with 90380 images of 131 fruits and vegetables (https://www.kaggle.com/moltean/fruits). Each image is 100 by 100 pixels in RGB color (3 values per pixel). The dataset provided here is a subset of Fruit 360 containing only 10 fruits/vegetables (Strawberry, Apple_Red_Delicious, Pepper_Green, Corn, Banana, Tomato_1, Potato_White, Pineapple, Orange, and Peach). We selected a subset so that the dataset is smaller and the neural network can be trained faster.
The utilities used to create the dataset, along with step by step instructions, can be found here: https://github.com/kxk302/fruit_dataset_utilities
First, we created a feature vector for each image. Each image is 100 by 100 pixels in RGB color (3 values per pixel), so each image can be represented by 30,000 values (100 × 100 × 3). Second, we selected the images of a subset of 10 fruits/vegetables, which reduces the training and test dataset sizes from 7 GB and 2.5 GB for 131 fruits/vegetables to 500 MB and 177 MB, respectively. Third, we created separate files for feature vectors and labels. Finally, we mapped the labels of the 10 selected fruits/vegetables to the integers 0 to 9.
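The flattening and label-mapping steps can be illustrated with a minimal sketch. It is not the code from the utilities repository: the function and variable names are hypothetical, the image is a synthetic placeholder, and the order in which classes are assigned to the integers 0 to 9 is an assumption (the actual mapping may differ).

```python
# Illustrative sketch only; names are hypothetical, not taken from the dataset utilities.
IMAGE_SIZE = 100  # pixels per side, as stated in the description
CHANNELS = 3      # RGB: 3 values per pixel

# The 10 selected classes. The order, and therefore the integer given to each
# class, is an assumption made for this example.
CLASSES = [
    "Strawberry", "Apple_Red_Delicious", "Pepper_Green", "Corn", "Banana",
    "Tomato_1", "Potato_White", "Pineapple", "Orange", "Peach",
]
LABEL_TO_INDEX = {name: index for index, name in enumerate(CLASSES)}


def flatten_image(image):
    """Flatten a height x width x channel nested sequence into one flat list."""
    vector = [value for row in image for pixel in row for value in pixel]
    expected = IMAGE_SIZE * IMAGE_SIZE * CHANNELS
    if len(vector) != expected:
        raise ValueError(f"expected {expected} values, got {len(vector)}")
    return vector


# A synthetic all-black image stands in for a real Fruit 360 image.
dummy_image = [[[0, 0, 0] for _ in range(IMAGE_SIZE)] for _ in range(IMAGE_SIZE)]
features = flatten_image(dummy_image)
label = LABEL_TO_INDEX["Banana"]
print(len(features), label)
```

Keeping the feature vector and the integer label as separate values mirrors the separate feature and label files described above.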
Files
Total size: 707.4 MB
MD5 checksum | Size |
---|---|
md5:d81787987d5f343d9439b09b33a5adfa | 177.3 MB |
md5:ab03cc8b06ca93940156cc7bc4bbdc5c | 66.8 kB |
md5:d4c7c63f40686d1eb35cf6f6715b6861 | 529.8 MB |
md5:abe38c158980e8b2e92769a9c7d95997 | 220.7 kB |
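A downloaded file can be checked against the MD5 checksums listed above. The following is a minimal sketch using Python's standard `hashlib` module; the helper names and the example path in the comment are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with a listed value such as 'md5:d817...'."""
    expected_hex = expected.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected_hex


# Example call (the path is a placeholder for whichever file was downloaded):
# matches_checksum("path/to/downloaded_file", "md5:d81787987d5f343d9439b09b33a5adfa")
```

Reading in chunks avoids loading the larger files, which are several hundred megabytes, into memory at once.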