IowaNet dataset for deep learning: 1 million samples with 10 land cover classes
- 1. Iowa State University
- 2. Embrapa
- 3. University of Wisconsin-Madison
Description:
The IowaNet dataset was developed for convolutional neural network (CNN) applications. This benchmark dataset contains 1 million samples in 10 classes at six patch sizes: 8 x 8, 16 x 16, 32 x 32, 64 x 64, 128 x 128, and 256 x 256 pixels. The patch images are provided in the zip files. For each CNN dataset, there are two types of image samples: natural color (red, green, blue) and false color (NIR, red, green). Note: unzipping all files (1 million) from a large zip can take minutes to hours.
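Because a full extraction can be slow, one option is to read individual patches straight from an archive. The following is a minimal sketch using Python's standard `zipfile` module. It assumes the archive is a regular zip file and that patches are stored as `.jpg` files; the helper names `list_patches` and `read_patch_bytes` are illustrative and not part of the dataset.

```python
import zipfile


def list_patches(zip_path, suffix=".jpg", limit=10):
    """Return up to `limit` member names in the archive that end with `suffix`."""
    names = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            # Case-insensitive match on the file extension (assumed to be .jpg).
            if info.filename.lower().endswith(suffix):
                names.append(info.filename)
                if len(names) >= limit:
                    break
    return names


def read_patch_bytes(zip_path, member_name):
    """Read the raw bytes of one member without extracting the whole archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return archive.read(member_name)
```

The returned bytes can be decoded with `PIL.Image.open(io.BytesIO(data))`. Opening the archive still reads its full index, so with 1 million entries the first call may take a noticeable amount of time.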
Application:
Once the user chooses the spectral dataset (natural or false color), the reference labels are found in the corresponding auxiliary file, which gives the image number and label for each patch image. For instance, the false-color dataset with 8 x 8 patches (CNN8x8.rar) has the auxiliary file "IowaNet_CNN8_falsecolor.npz".
Sample code:
Below, I provide simple Python commands to load the npz file and access each image name and its corresponding land cover label. Note that labels are numbers that reference the land cover classes: (0) Barren, (1) Cropland, (2) Fallow, (3) Forest, (4) Grassland, (5) Lake, (6) River, (7) Road or pavement, (8) Shadow (tree or building shadow), and (9) Structures (buildings, residential).
Contact: vitors@iastate.edu
Python code:
# Load and access the .npz files with reference labels
import numpy as np

list_class = ["Barren", "Cropland", "Fallow", "Forest", "Grassland", "Lake", "River", "Road", "Shadow", "Structure"]
ref_file = r'C:\IowaNet\CNN64x64\w64\falsecolor\IowaNet_CNN64_falsecolor_labels.npz'
loaded = np.load(ref_file)
X_train = loaded['x_train']
Y_train = loaded['y_train']
# Visualize the patch image and respective label. This helps understand the quality of samples.
from matplotlib import pyplot as plt
import pandas as pd
import numpy as np
from PIL import Image
import os

path_train = r'C:\IowaNet\CNN64x64\w64\falsecolor'
ref_file = r'C:\IowaNet\CNN64x64\w64\IowaNet_CNN64_falsecolor_labels.npz'
list_class = ["Barren", "Cropland", "Fallow", "Forest", "Grassland", "Lake", "River", "Road", "Shadow", "Structure"]

loaded = np.load(ref_file)
X_train = loaded['x_train']  # image names
Y_train = loaded['y_train']  # class numbers

# Create a dictionary for labels (image name -> class number)
labels = {}
labels['train'] = {}
for i, x in enumerate(X_train):
    labels['train'][x] = Y_train[i]

# Lookup from class number to class name
classes_n = [x for x in range(len(list_class))]
label_train = pd.DataFrame(list_class, index=classes_n)
label_train_v = np.array(label_train.to_dict()[0])

def visualize_samples(X_train, labels, label_train_v, path_train, sample_id):
    # Show one patch image with its class name as the title
    img = X_train[sample_id]
    label = labels['train'][img]
    img_path = os.path.join(path_train, img + '.jpg')
    imgi = np.array(Image.open(img_path), dtype="float32") / 255.0
    plt.imshow(imgi)
    labeltrans = label_train_v.tolist()
    print(labeltrans[label])
    plt.title(labeltrans[label])
    plt.pause(5)

i = 123  # get the sample 123
# visualize the image and label.
visualize_samples(X_train, labels, label_train_v, path_train, i)
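Before training, it can also help to check how the samples are distributed across the 10 classes. The following is a minimal sketch, assuming `y_train` holds one integer label (0 to 9) per sample as in the code above; the helper name `class_counts` is illustrative and not part of the dataset.

```python
import numpy as np

CLASS_NAMES = ["Barren", "Cropland", "Fallow", "Forest", "Grassland",
               "Lake", "River", "Road", "Shadow", "Structure"]


def class_counts(y, class_names=CLASS_NAMES):
    """Count how many samples carry each integer label (0 .. len(class_names) - 1)."""
    # Assumption: labels are non-negative integers, one per sample.
    y = np.asarray(y).astype(np.int64).ravel()
    counts = np.bincount(y, minlength=len(class_names))
    return {name: int(n) for name, n in zip(class_names, counts)}
```

With the reference file loaded as above, `class_counts(loaded['y_train'])` returns a dictionary mapping each class name to its number of samples.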
Files
(35.2 GB)
Size | Checksum |
---|---|
6.5 GB | md5:70bae72fd802c4a9bad680e0f972429c |
1.2 GB | md5:647337df02b590ef0d44eff87bccf8f5 |
22.4 GB | md5:81139c1a2fea2d9925a54dd8b7a5829a |
1.5 GB | md5:c5b476b1c51331139c4c4bd050e1050b |
2.5 GB | md5:00842a0369c9803eddd2a71b39bad3d9 |
1.1 GB | md5:83046daab175f66f0b344d1e9d508889 |
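Given the size of these archives, it may be worth verifying a download against the md5 values listed above before unzipping. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_checksum` are illustrative, and the local file name in the usage note is hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte archives do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's md5 with an expected value, with or without the 'md5:' prefix."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_checksum("downloaded_archive.zip", "md5:70bae72fd802c4a9bad680e0f972429c")` returns `True` only if the local file is the 6.5 GB archive and was downloaded intact.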