Published January 28, 2022 | Version 1.0
Dataset Open

Zurich Summer Dataset

  • 1. University of Edinburgh

Description

The "Zurich Summer v1.0" dataset is a collection of 20 chips (crops), taken from a QuickBird acquisition of the city of Zurich (Switzerland) in August 2002. QuickBird images are composed by 4 channels (NIR-R-G-B) and were pansharpened to the PAN resolution of about 0.62 cm GSD. We manually annotated 8 different urban and periurban classes : Roads, Buildings, Trees, Grass, Bare Soil, Water, Railways and Swimming pools. The cumulative number of class samples is highly unbalanced, to reflect real world situations. Note that annotations are not perfect, are not ultradense (not every pixel is annotated) and there might be some errors as well. We performed annotations by jointly selecting superpixels (SLIC) and drawing (freehand) over regions which we could confidently assign an object class.

The dataset is composed by 20 image - ground truth pairs, in geotiff format. Images are distributed in raw DN values. We provide a rough and dirty MATLAB script (preprocess.m) to:

i) extract basic statistics from images (min, max, mean and average std) which should be used to globally normalize the data (note that class distribution of the chips is highly uneven, so single-frame normalization would shift distribution of classes).

ii) Visualize raw DN images (with unsaturated values) and a corresponding stretched version (good for illustration purposes). It also saves a raw and adjusted image version in MATLAB format (.mat) in a local subfolder.

iii) Convert RGB annotations to index mask (CLASS \in {1,...,C}) (via rgb2label.m provided).

iv) Convert index mask to georeferenced RGB annotations (via rgb2label.m provided). Useful if you want to see the final maps of the tiles in some GIS software (coordinate system copied from original geotiffs).

Some requests from you

We encourage researchers to report the ID of images used for training / validation / test (e.g. train: zh1 to zh7, validation zh8 to zh12 and test zh13 to zh20). The purpose of distributing datasets is to encourage reproducibility of experiments.

Acknowledgements

We release this data after a kind agreement obtained with DigitalGlobe, co. This data can be redistributed freely, provided that this document and corresponding license are part of the distribution. Ideally, since the dataset could be updated over the time, I suggest to distribute the dataset by the official link from which this archive has been downloaded.

We would like to thank (a lot) Nathan Longbotham @ DigitalGlobe and the whole DG team for his / their help for granting the distribution of the dataset.

We release this dataset hoping that will help researchers working in semantic classification / segmentation of remote sensing data in comparing to other state-of-the-art methods using this dataset as well in testing models on a larger and more complete set of images (with respect to most benchmarks available in our community). As you can imagine, it has been a tedious work in preparing everything. Just for you.
 

If you are using the data please cite the following work

Files

Zurich_dataset_v1.0.zip

Files (138.4 MB)

Name Size Download all
md5:c79a08ec33cb867d155931f5201e33a0
138.4 MB Preview Download

Additional details