Published September 4, 2019 | Version 1.0
Journal article Open

IowaNet dataset for deep learning: 1 million samples with 10 land cover classes

  • 1. Iowa State University
  • 2. Embrapa
  • 3. University of Wisconsin-Madison

Description

The IowaNet dataset was developed for convolutional neural network (CNN) applications. This benchmark dataset contains 1 million samples in 10 classes at six different patch sizes: 8 x 8, 16 x 16, 32 x 32, 64 x 64, 128 x 128, and 256 x 256 pixels. The patch images are packaged in the zip files. For each CNN dataset, there are two types of image samples: natural color (red, green, blue) and false color (NIR, red, green). Note: unzipping all files (1 million) from a large zip can take minutes or even hours.
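Since full extraction can be slow, it may be enough to read only the patches that are needed. The following is a minimal sketch using Python's standard zipfile module. It assumes the archive is a regular .zip file and that patches are stored as .jpg members; the helper name read_members is hypothetical and not part of the dataset.

```python
import posixpath
import zipfile


def read_members(zip_path, wanted):
    """Return {member_path: raw_bytes} for .jpg members whose base name is in `wanted`.

    Hypothetical helper: names are compared without folder or extension,
    so "falsecolor/123.jpg" matches the wanted name "123".
    """
    wanted = {str(name) for name in wanted}
    found = {}
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Member paths inside a zip always use forward slashes.
            stem, ext = posixpath.splitext(posixpath.basename(info.filename))
            if ext.lower() == ".jpg" and stem in wanted:
                found[info.filename] = archive.read(info)
    return found
```

The returned bytes can be decoded without touching the disk, for example with Image.open(io.BytesIO(data)) from PIL.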

Application:

After choosing the spectral dataset (natural or false color), the user finds the reference labels in the auxiliary .npz file, which lists the number (image name) and the label of each patch image. For instance, the false-color dataset with 8 x 8 patches (CNN8x8.rar) has the following auxiliary file: "IowaNet_CNN8_falsecolor.npz".
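Before using an auxiliary file, it can be useful to see which arrays it contains. This is a minimal sketch with NumPy; the helper name summarize_npz is hypothetical, and the sketch assumes the arrays are stored as plain numeric or string arrays (no pickled objects).

```python
import numpy as np


def summarize_npz(npz_path):
    """Return {array_name: (shape, dtype_string)} for every array in the file."""
    # NpzFile works as a context manager, so the file is closed afterwards.
    with np.load(npz_path) as data:
        return {key: (data[key].shape, str(data[key].dtype)) for key in data.files}
```

For example, summarize_npz("IowaNet_CNN8_falsecolor.npz") would list each stored array with its shape and dtype, which shows how many patches the file describes.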

Sample code:

Below, I provide simple Python commands to load the npz file and access the image names and corresponding land cover labels. Note that the labels are integers that reference the land cover classes: (0) Barren, (1) Cropland, (2) Fallow, (3) Forest, (4) Grassland, (5) Lake, (6) River, (7) Road or pavement, (8) Shadow (tree or building shadow), and (9) Structures (buildings, residential).

Contact: vitors@iastate.edu

Python code:

# Load and access the .npz files with reference labels

import numpy as np
list_class = ["Barren", "Cropland", "Fallow", "Forest", "Grassland", "Lake", "River", "Road", "Shadow", "Structure"]
ref_file = r'C:\IowaNet\CNN64x64\w64\falsecolor\IowaNet_CNN64_falsecolor_labels.npz'

loaded = np.load(ref_file)
X_train = loaded['x_train']
Y_train = loaded['y_train']
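With the labels loaded, a quick check of the class balance is possible. The sketch below counts samples per class using only the standard library. It assumes the labels are integers from 0 to 9 in the class order given above; the helper name class_counts is hypothetical.

```python
from collections import Counter

CLASS_NAMES = ["Barren", "Cropland", "Fallow", "Forest", "Grassland",
               "Lake", "River", "Road", "Shadow", "Structure"]


def class_counts(labels, class_names=CLASS_NAMES):
    """Count samples per class name; classes without samples get 0."""
    counts = Counter(int(label) for label in labels)
    # Reject labels that do not map to one of the known classes.
    unknown = sorted(set(counts) - set(range(len(class_names))))
    if unknown:
        raise ValueError(f"labels outside 0..{len(class_names) - 1}: {unknown}")
    return {name: counts.get(i, 0) for i, name in enumerate(class_names)}
```

Calling class_counts(Y_train) returns a dictionary keyed by class name, which makes under-represented classes easy to spot.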

 

# Visualize a patch image with its label. This helps in judging the quality of the samples.

from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
import os

path_train = r'C:\IowaNet\CNN64x64\w64\falsecolor'
ref_file = r'C:\IowaNet\CNN64x64\w64\IowaNet_CNN64_falsecolor_labels.npz'
list_class = ["Barren", "Cropland", "Fallow", "Forest", "Grassland", "Lake", "River", "Road", "Shadow", "Structure"]

loaded = np.load(ref_file)
X_train = loaded['x_train']  # image names, without the .jpg extension
Y_train = loaded['y_train']  # integer class labels (0-9)

# Create a dictionary that maps each image name to its label
labels = {'train': {}}
for i, x in enumerate(X_train):
    labels['train'][x] = Y_train[i]


def visualize_samples(X_train, labels, list_class, path_train, idx):
    # Look up the image name and label of sample idx, then plot the patch
    img = X_train[idx]
    label = labels['train'][img]
    img_path = os.path.join(path_train, img + '.jpg')
    imgi = np.array(Image.open(img_path), dtype="float32") / 255.0
    class_name = list_class[int(label)]
    plt.imshow(imgi)
    print(class_name)
    plt.title(class_name)
    plt.pause(5)


i = 123  # get the sample 123
# visualize the image and label.
visualize_samples(X_train, labels, list_class, path_train, i)
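Because the image path is built from the image name plus '.jpg', a missing or partially extracted folder only shows up as an error at plot time. The sketch below checks up front which names have no matching file. The helper name find_missing_images is hypothetical, and the '.jpg' extension is taken from the code above.

```python
import os


def find_missing_images(names, folder, ext=".jpg"):
    """Return the names (in input order) that have no `<name><ext>` file in `folder`."""
    # One directory listing is much faster than one os.path.exists call per name.
    present = set(os.listdir(folder))
    return [str(name) for name in names if f"{name}{ext}" not in present]
```

For example, find_missing_images(X_train, path_train) returns an empty list when every labeled patch is present in the folder.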

Files (35.2 GB)

Size      MD5
6.5 GB    70bae72fd802c4a9bad680e0f972429c
1.2 GB    647337df02b590ef0d44eff87bccf8f5
22.4 GB   81139c1a2fea2d9925a54dd8b7a5829a
1.5 GB    c5b476b1c51331139c4c4bd050e1050b
2.5 GB    00842a0369c9803eddd2a71b39bad3d9
1.1 GB    83046daab175f66f0b344d1e9d508889
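
The MD5 values listed above can be used to confirm that a multi-gigabyte download completed without corruption. This is a minimal sketch using Python's standard hashlib module; the helper names md5_of_file and matches_checksum are hypothetical, and the file is read in chunks so that large archives do not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """True if the file's MD5 equals the listed value (an optional 'md5:' prefix is ignored)."""
    expected = expected_md5.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[4:]
    return md5_of_file(path) == expected
```

A call such as matches_checksum(downloaded_path, listed_md5), where downloaded_path is a placeholder for a local archive and listed_md5 is the value listed for that file, returns False if the file should be downloaded again.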