IowaNet dataset for deep learning: 1 million samples with 10 land cover classes

doi:10.5281/zenodo.3385318

Published September 4, 2019 | Version 1.0

Journal article Open

1. Iowa State University
2. Embrapa
3. University of Wisconsin-Madison

Description:

The IowaNet dataset was developed for convolutional neural network (CNN) applications. This benchmark dataset contains 1 million samples in 10 classes at six patch sizes: 8 x 8, 16 x 16, 32 x 32, 64 x 64, 128 x 128, and 256 x 256 pixels. The patch images for each size are packaged in their own archive. Each CNN dataset provides two types of image samples: natural color (red, green, blue) and false color (NIR, red, green). Note: extracting all 1 million files from a large archive can take minutes to hours.
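To help choose a patch size, the sketch below estimates how much memory the decoded patches would need if they were all held in one array. It assumes 1 million three-band patches per patch size, stored as 8-bit values; the description does not state how the samples are divided among sizes, so treat the figures as rough upper bounds. The names `PATCH_SIZES`, `N_SAMPLES`, and `decoded_gigabytes` are illustrative and not part of the dataset.

```python
# Rough estimate of the decoded (uncompressed) size of one IowaNet patch set.
# Assumption: 1,000,000 patches per patch size, 3 bands, 1 byte per value.
PATCH_SIZES = [8, 16, 32, 64, 128, 256]
N_SAMPLES = 1_000_000
CHANNELS = 3


def decoded_gigabytes(size, n_samples=N_SAMPLES, channels=CHANNELS, bytes_per_value=1):
    """Return the size in GB (10**9 bytes) of n_samples decoded size x size patches."""
    return size * size * channels * bytes_per_value * n_samples / 1e9


for size in PATCH_SIZES:
    as_uint8 = decoded_gigabytes(size)
    as_float32 = decoded_gigabytes(size, bytes_per_value=4)
    print(f"{size} x {size}: {as_uint8:.2f} GB as uint8, {as_float32:.2f} GB as float32")
```

Under these assumptions the larger patch sizes do not fit in typical memory when decoded, which is a reason to load patches in batches rather than all at once.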

Application:

Once the user chooses the spectral dataset (natural or false color), the reference labels are read from the auxiliary file, which records the number and label of each patch image. For instance, the false-color dataset with 8 x 8 patches (CNN8x8.rar) has the auxiliary file "IowaNet_CNN8_falsecolor.npz".

Sample code:

Below, I provide simple Python commands to load the .npz file and access each image name and its corresponding land cover label. Note that labels are numbers that reference the land cover classes: (0) Barren, (1) Cropland, (2) Fallow, (3) Forest, (4) Grassland, (5) Lake, (6) River, (7) Road or pavement, (8) Shadow (tree or building shadow), and (9) Structures (buildings, residential).

Contact: vitors@iastate.edu

Python code:

# Load and access the .npz files with reference labels

import numpy as np
list_class = ["Barren", "Cropland", "Fallow", "Forest", "Grassland", "Lake", "River", "Road", "Shadow", "Structure"]
ref_file = r'C:\IowaNet\CNN64x64\w64\falsecolor\IowaNet_CNN64_falsecolor_labels.npz'

loaded = np.load(ref_file)
X_train = loaded['x_train']
Y_train = loaded['y_train']
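After loading a label file, it can be useful to check which arrays it contains and how many samples fall in each class. The sketch below does this on a small synthetic .npz file that mimics the `x_train` / `y_train` layout used above; the file name `demo_labels.npz`, the patch names, and the helper `class_counts` are made up for illustration, so point `np.load` at the real label file instead.

```python
import os
import tempfile

import numpy as np

list_class = ["Barren", "Cropland", "Fallow", "Forest", "Grassland",
              "Lake", "River", "Road", "Shadow", "Structure"]


def class_counts(y, class_names):
    """Return {class name: number of samples} for integer labels y."""
    counts = np.bincount(np.asarray(y, dtype=np.int64), minlength=len(class_names))
    return {name: int(n) for name, n in zip(class_names, counts)}


# Synthetic stand-in for a label file (hypothetical names and labels).
demo_file = os.path.join(tempfile.mkdtemp(), "demo_labels.npz")
np.savez(demo_file,
         x_train=np.array(["patch_0", "patch_1", "patch_2", "patch_3"]),
         y_train=np.array([1, 3, 1, 9]))

with np.load(demo_file) as loaded:
    print(loaded.files)          # names of the arrays stored in the file
    X_train = loaded["x_train"]
    Y_train = loaded["y_train"]

print(X_train.shape, Y_train.shape)
counts = class_counts(Y_train, list_class)
print(counts)
```

A strongly uneven count across classes would suggest weighting or resampling during training.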

# Visualize the patch image and respective label. This helps understand the quality of samples.

from matplotlib import pyplot as plt
import pandas as pd
import numpy as np
from PIL import Image
import os

path_train = r'C:\IowaNet\CNN64x64\w64\falsecolor'
ref_file = r'C:\IowaNet\CNN64x64\w64\IowaNet_CNN64_falsecolor_labels.npz'
list_class = ["Barren", "Cropland", "Fallow", "Forest", "Grassland", "Lake", "River", "Road", "Shadow", "Structure"]

loaded = np.load(ref_file)
X_train = loaded['x_train']
Y_train = loaded['y_train']

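# Create a dictionary that maps each patch image name to its numeric label
labels = {}
labels['train']={}
for i, x in enumerate(X_train):
    labels['train'][x] = Y_train[i]

# Build a lookup from class number (0-9) to class name.
# to_dict()[0] gives {class number: class name}; np.array() wraps that dict in a
# 0-d object array, and .tolist() in visualize_samples() unwraps it again.
classes_n = [x for x in range(len(list_class))]
label_train = pd.DataFrame(list_class, index=classes_n)
label_train_v = np.array(label_train.to_dict()[0])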


def visualize_samples(X_train, labels, label_train_v, path_train, sample_id):
    """Show the patch at index sample_id with its class name as the title."""
    img = X_train[sample_id]
    label = labels['train'][img]
    img_path = os.path.join(path_train, img + '.jpg')
    imgi = np.array(Image.open(img_path), dtype="float32") / 255.0
    plt.imshow(imgi)
    labeltrans = label_train_v.tolist()
    print(labeltrans[label])
    plt.title(labeltrans[label])
    plt.pause(5)  # keep the figure on screen for 5 seconds

i = 123  # index of the sample to display
# visualize the image and label.
visualize_samples(X_train, labels, label_train_v, path_train, i)
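For CNN training, patches are usually fed to the network as a stacked array rather than one image at a time. The following is a minimal sketch of such a loader, assuming the patches are .jpg files named after the entries of `x_train`, as in the visualization code. The function `load_batch` and the synthetic demo images are illustrative and not part of the dataset.

```python
import os
import tempfile

import numpy as np
from PIL import Image


def load_batch(names, image_dir, extension=".jpg"):
    """Stack the named patches into a float32 array of shape (N, H, W, 3) in [0, 1]."""
    patches = []
    for name in names:
        img_path = os.path.join(image_dir, str(name) + extension)
        with Image.open(img_path) as img:
            patches.append(np.asarray(img.convert("RGB"), dtype="float32") / 255.0)
    return np.stack(patches)


# Synthetic demo: write three random 64 x 64 patches to a temporary folder.
# With the real data, pass a slice of X_train and the folder that holds the patches.
demo_dir = tempfile.mkdtemp()
demo_names = ["demo_0", "demo_1", "demo_2"]
rng = np.random.default_rng(0)
for name in demo_names:
    pixels = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
    Image.fromarray(pixels).save(os.path.join(demo_dir, name + ".jpg"))

batch = load_batch(demo_names, demo_dir)
print(batch.shape, batch.dtype)
```

Loading a slice of names at a time keeps memory use bounded, which matters for the larger patch sizes.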

Files (35.2 GB)

Name	MD5	Size
CNN128x128.rar	70bae72fd802c4a9bad680e0f972429c	6.5 GB
CNN16x16.rar	647337df02b590ef0d44eff87bccf8f5	1.2 GB
CNN256x256.rar	81139c1a2fea2d9925a54dd8b7a5829a	22.4 GB
CNN32x32.rar	c5b476b1c51331139c4c4bd050e1050b	1.5 GB
CNN64x64.rar	00842a0369c9803eddd2a71b39bad3d9	2.5 GB
CNN8x8.rar	83046daab175f66f0b344d1e9d508889	1.1 GB
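Because the archives are large, it is worth checking a download against its listed MD5 checksum before extracting it. The sketch below uses Python's standard `hashlib` module and reads the file in chunks so that it never has to fit in memory. The helper names `md5_of_file` and `is_intact` and the example download path are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# MD5 checksums copied from the file listing of this record.
EXPECTED_MD5 = {
    "CNN128x128.rar": "70bae72fd802c4a9bad680e0f972429c",
    "CNN16x16.rar": "647337df02b590ef0d44eff87bccf8f5",
    "CNN256x256.rar": "81139c1a2fea2d9925a54dd8b7a5829a",
    "CNN32x32.rar": "c5b476b1c51331139c4c4bd050e1050b",
    "CNN64x64.rar": "00842a0369c9803eddd2a71b39bad3d9",
    "CNN8x8.rar": "83046daab175f66f0b344d1e9d508889",
}


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_intact(path):
    """Return True if the file name is listed above and its MD5 matches."""
    expected = EXPECTED_MD5.get(Path(path).name)
    return expected is not None and md5_of_file(path) == expected


# Usage (the download location is a hypothetical example):
# print(is_intact(r"C:\IowaNet\CNN8x8.rar"))
```

A result of `False` means either that the file name is not in the listing or that the download is incomplete or corrupted and should be fetched again.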
