Published April 9, 2025 | Version v1
Dataset Open

Building footprints from 1970s Hexagon spy satellite images for four global urban growth hotspots

  • 1. University of Wisconsin–Madison
  • 2. University of Wisconsin–Madison
  • 3. Transylvania University of Brașov

Description

This dataset features building footprints derived from 1970s very-high-resolution KH-9 Hexagon spy satellite imagery using a Mask R-CNN deep learning object detection approach for four sites: San Diego County (USA), Madison (USA), Harare (Zimbabwe), and Hyderabad (India). It also contains contemporary building footprint data from Microsoft’s building footprint layer (https://github.com/microsoft/GlobalMLBuildingFootprints) as a reference.

Corresponding publication

Franz Schug*, Neda K. Kasraee, Akash Anand, MacKenzy T. Growth-Price, Mihai D. Nita, Afag Rizayeva, Volker C. Radeloff. Quantifying multi-decadal urban growth using Hexagon spy satellite imagery and deep learning building detection across four global cities. Landscape and Urban Planning (in review).

Temporal extent

The dataset contains data representative of ca. 1975 and ca. 2020.

Data, data format, and units

KH-9 Hexagon data (“hex_images”) are provided as GeoTIFF files with a spatial resolution of ~0.6–1 m. The coordinate reference systems (CRS) are local UTM projections (San Diego County: Zone 11N, EPSG:32611; Madison: Zone 16N, EPSG:32616; Harare: Zone 36S, EPSG:32736; Hyderabad: Zone 44N, EPSG:32644).
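The EPSG codes listed above follow the standard numbering for WGS 84 / UTM zones: 32600 plus the zone number in the northern hemisphere and 32700 plus the zone number in the southern hemisphere. The following is a minimal sketch of that rule; the names `SITE_ZONES`, `SITE_EPSG`, and `utm_epsg` are hypothetical and not part of the dataset.

```python
# Hypothetical lookup of the UTM zones named in the description above.
SITE_ZONES = {
    "San Diego County": (11, "N"),
    "Madison": (16, "N"),
    "Harare": (36, "S"),
    "Hyderabad": (44, "N"),
}


def utm_epsg(zone: int, hemisphere: str) -> int:
    """Return the EPSG code of the WGS 84 / UTM CRS for a zone and hemisphere."""
    if not 1 <= zone <= 60:
        raise ValueError(f"UTM zone must be in 1..60, got {zone}")
    hemisphere = hemisphere.upper()
    if hemisphere not in ("N", "S"):
        raise ValueError(f"hemisphere must be 'N' or 'S', got {hemisphere!r}")
    # Northern zones are EPSG:326xx, southern zones are EPSG:327xx.
    base = 32600 if hemisphere == "N" else 32700
    return base + zone


# EPSG code per study site, derived from the zone and hemisphere.
SITE_EPSG = {site: utm_epsg(*zone) for site, zone in SITE_ZONES.items()}

if __name__ == "__main__":
    for site, code in SITE_EPSG.items():
        print(f"{site}: EPSG:{code}")
```

A lookup like this can be useful for checking that a file opened in a GIS library reports the CRS expected for its study site.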

Microsoft building footprint data (“ms_buildings”) are provided as vector shapefiles. The data were clipped from Microsoft’s global building footprint layer (https://github.com/microsoft/GlobalMLBuildingFootprints) and are provided as a reference for ca. 2020. The CRS of each file matches that of the Hexagon data for the same site. The data are also provided in rasterized versions with 2-m and 300-m spatial resolution (nearest neighbor resampling).

Study site extents (“study_sites”) are provided as vector shapefiles.

Training chips (“training_chips”) contain image chips and labels used for training the Mask R-CNN models. The metadata format is “Mask RCNN Masks”. The data include no-feature tiles.

The models (“models”) are trained Mask R-CNN models ready to be used in ESRI ArcGIS Pro version 3.2.1. Please refer to the publication for details about model parameterization.

The detected buildings (“detected_buildings”) are provided as vector shapefiles and represent building footprints detected in ca. 1975 Hexagon imagery using the provided Mask R-CNN models. The data represent the final results, that is, after merging the outputs of models with different chip sizes and after post-processing (see manuscript). The data are also provided in rasterized versions with 2-m and 300-m spatial resolution (nearest neighbor resampling).
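The rasterized Microsoft and detected building layers described above are resampled with nearest neighbor. As a generic illustration of what that resampling does (not the authors' processing chain, which the description does not specify), the sketch below resamples a toy binary grid from 2 m to 300 m in pure Python. The function `resample_nearest` and the toy grid are assumptions made for this example.

```python
def resample_nearest(grid, src_res, dst_res):
    """Resample a 2-D grid (list of rows) from src_res to dst_res.

    Nearest neighbor: each output cell takes the value of the single source
    cell that contains the output cell's center. No averaging is done.
    """
    if src_res <= 0 or dst_res <= 0:
        raise ValueError("resolutions must be positive")
    n_rows, n_cols = len(grid), len(grid[0])
    scale = dst_res / src_res  # source cells per output cell along one axis
    out_rows = max(1, round(n_rows / scale))
    out_cols = max(1, round(n_cols / scale))
    out = []
    for r in range(out_rows):
        # Source row under the center of output row r, clamped to the grid.
        src_r = min(n_rows - 1, int((r + 0.5) * scale))
        row = []
        for c in range(out_cols):
            src_c = min(n_cols - 1, int((c + 0.5) * scale))
            row.append(grid[src_r][src_c])
        out.append(row)
    return out


if __name__ == "__main__":
    # Toy 600 m x 600 m area at 2 m: buildings (1) fill the top-left quadrant.
    mask = [[1 if r < 150 and c < 150 else 0 for c in range(300)]
            for r in range(300)]
    print(resample_nearest(mask, 2, 300))
```

Because only one source cell is sampled per output cell, the 300-m values stay binary, but a small building that does not cover the sampled cell does not appear in the coarse grid. This is worth keeping in mind when comparing the 2-m and 300-m versions.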

Processing environment

This research was conducted using Python for ESRI ArcGIS Pro version 3.2.1 and the TensorFlow package. We conducted our analysis on a server with an NVIDIA A100 Tensor Core GPU (40 GB, PCIe), dual AMD EPYC 7513 CPUs at 2.6 GHz with 128 threads in total, and 1 TB RAM (RDIMM, 3200 MT/s, dual rank).

Further information

For further information, please see the publication or contact Franz Schug (fschug@wisc.edu). Visit the website of SILVIS lab, University of Wisconsin-Madison (https://silvis.forest.wisc.edu/) to learn more.

Please check the corresponding GitHub repository for additional data and code: https://github.com/franzschug/hexagon_bld_footprints

Acknowledgments

This study was supported by the NASA Land Cover and Land Use Change Program under agreement 80NSSC21K0310, the NASA IDS program under agreement 80NSSC24K0303, and the USDA McIntire Stennis Program. 

Files

Schug_Hexagon_Building_Footprints.zip (13.6 GB)
md5:9b0e2ec7db6aa032c05dcf201fe4d8c0
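Given the size of the archive, it can be worth checking the download against the MD5 checksum listed for it. The following is a minimal sketch using Python's standard `hashlib` module; the helper names `file_md5` and `verify_download` are hypothetical, and the local file path is an assumption about where the archive was saved.

```python
import hashlib

# MD5 checksum listed for Schug_Hexagon_Building_Footprints.zip.
EXPECTED_MD5 = "9b0e2ec7db6aa032c05dcf201fe4d8c0"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file at `path` matches the expected MD5 digest."""
    return file_md5(path) == expected.lower()
```

For example, `verify_download("Schug_Hexagon_Building_Footprints.zip")` returns `True` only if the local copy matches the listed checksum; a `False` result suggests an incomplete or corrupted download.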

Additional details

Funding

National Aeronautics and Space Administration
80NSSC21K0310
National Aeronautics and Space Administration
80NSSC24K0303
United States Department of Agriculture