Published 2024 | Version 1.0.0
Dataset Open

OpenSendaiBench: A Benchmark Dataset of Building Exposure and Vulnerability Dynamics for EO-based Auditing of Global Disaster Risk

  • 1. University of Cambridge
  • 2. UKRI Centre for Doctoral Training (CDT) in the Application of Artificial Intelligence to the study of Environmental Risks (AI4ER)
  • 3. Cambridge University Centre for Risk in the Built Environment (CURBE)
  • 4. Deutsches Zentrum für Luft- und Raumfahrt e. V. (DLR)
  • 5. University of Bonn

Description

This Zenodo repository is the official global dataset for the research poster "Global Mapping of Exposure and Physical Vulnerability Dynamics in Least Developed Countries using Remote Sensing and Machine Learning", presented at the 2nd Machine Learning for Remote Sensing Workshop, 12th International Conference on Learning Representations (ICLR), in Vienna, Austria, on 11 May 2024. The Python code is available in the GitHub repository: github.com/riskaudit/OpenSendaiBench. The technical information below is taken from the four-page paper that accompanies this research poster. For inquiries or access to related materials, please visit my website (joshuadimasaka.com) or our project website (riskaudit.github.io), follow our project's GitHub repository (github.com/riskaudit), or send an email to jtd33@cam.ac.uk.

Technical info (English)

1. National Census-Derived Exposure Data

We rasterized every country-wide point dataset of building counts from the METEOR project, each with a defined physical vulnerability type, at a spatial resolution of 15 arcseconds, or approximately 500 meters at the equator (Huyck et al., 2019). We then used a probability-based approach to extract 100 square tiles for each country. In sampling these 100 tiles per country, we considered the number of physical vulnerability types present in every pixel so that every label, including unlabeled pixels, is represented.
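As a quick check on the stated grid size, the ground distance spanned by 15 arcseconds can be computed directly. This is a minimal sketch, assuming a spherical Earth with the WGS84 equatorial radius; the function name `arcseconds_to_meters` is illustrative and not part of the project code.

```python
import math

# Assumption: spherical Earth using the WGS84 equatorial radius.
EQUATORIAL_RADIUS_M = 6_378_137.0


def arcseconds_to_meters(arcsec: float, latitude_deg: float = 0.0) -> float:
    """East-west ground distance covered by `arcsec` of longitude at a latitude."""
    angle_rad = math.radians(arcsec / 3600.0)
    # A circle of latitude shrinks with the cosine of the latitude.
    return EQUATORIAL_RADIUS_M * math.cos(math.radians(latitude_deg)) * angle_rad


pixel_equator_m = arcseconds_to_meters(15)
pixel_lat30_m = arcseconds_to_meters(15, latitude_deg=30)
print(f"15 arcseconds at the equator: {pixel_equator_m:.1f} m")
print(f"15 arcseconds at 30 degrees latitude: {pixel_lat30_m:.1f} m")
```

Under this assumption the cell is roughly 464 m wide at the equator, consistent with the "approximately 500 meters" figure, and it narrows in the east-west direction away from the equator.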

2. Time-Series Satellite Imagery

With the previously extracted geographical extents, we obtained the following pre-processed time-series satellite imagery via Google Earth Engine (Gorelick et al., 2017).

2.1. Sentinel-1 SAR GRD

At 10-m spatial resolution, we used the annual mean of the Ground Range Detected (GRD) scenes acquired by the dual-polarization C-band Synthetic Aperture Radar (SAR) instrument, operating at 5.405 GHz, of the Sentinel-1 satellite (Copernicus Sentinel data, 2024a). Covering the years 2015 to 2023, we extracted nine annual means for each of the two bands:

  • VV (vertical transmit, vertical receive) and
  • VH (vertical transmit, horizontal receive) signals.

To avoid data incompleteness across large areal extents, we did not filter by orbit number or satellite pass direction. We also note that some countries, such as Angola, Comoros, Ethiopia, Kiribati, and Tuvalu, have incomplete VV and VH signals, either because the orbit of the Sentinel-1 satellite did not cover these areas for some time or because only the single VV signal is available.

2.2. Sentinel-2 Harmonized MSI

At the same 10-m spatial resolution, we also extracted the annual median of the atmospherically corrected surface reflectance signals in the red, green, and blue (RGB) bands acquired by the MultiSpectral Instrument (MSI) of the Sentinel-2 satellite (Copernicus Sentinel data, 2024b). Aggregating by year also allows us to filter out cloudy or shadowy signals using the corresponding Sentinel-2 cloud probability dataset (Copernicus Sentinel data, 2024c). Unlike Sentinel-1 SAR GRD, the resulting six annual median maps from 2018 to 2023 are all available for the 47 countries (Note: Bhutan and Vanuatu have already graduated from LDC status).
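The idea of a cloud-masked annual median can be illustrated for a single pixel in plain Python. This is only a sketch of the general technique: the dataset itself was produced with Google Earth Engine, and the threshold `CLOUD_PROB_MAX` and the function `masked_annual_median` are hypothetical, since the cutoff actually used is not stated here.

```python
from statistics import median
from typing import Optional, Sequence

# Hypothetical cutoff (percent); the value used for the dataset is not stated.
CLOUD_PROB_MAX = 50.0


def masked_annual_median(
    reflectance: Sequence[float],
    cloud_prob: Sequence[float],
    max_prob: float = CLOUD_PROB_MAX,
) -> Optional[float]:
    """Median of one pixel's yearly observations, skipping likely-cloudy ones.

    Returns None when every observation in the year is masked.
    """
    clear = [r for r, p in zip(reflectance, cloud_prob) if p <= max_prob]
    return median(clear) if clear else None


# Toy values: five acquisitions of the red band for one pixel in one year.
red = [0.12, 0.80, 0.11, 0.13, 0.75]
cloud = [5, 95, 10, 20, 90]
composite = masked_annual_median(red, cloud)
print(composite)
```

The two bright, high-probability observations are dropped before the median is taken, so the composite reflects the clear-sky surface rather than cloud tops.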

3. File and Folder Structure

Each <countryCode>.zip file has the following file and folder structure.

├───extent
│   └───<countryCode>_<nth>_of_<totalTiles>_<index>.geojson
├───groundtruth
│   └───<countryCode>_nbldg_<vulnerabilityCode>_<nth>_of_<totalTiles>_<index>.tif
└───obsvariables
    ├───SENTINEL1-DUAL_POL_GRD_HIGH_RES
    │   └───<countryCode>_<nth>_of_<totalTiles>_<index>
    │       ├───<year>_VV.tif
    │       └───<year>_VH.tif
    └───SENTINEL-2-MSI_LVL2A
        └───<countryCode>_<nth>_of_<totalTiles>_<index>
            └───<year>_RGB.tif
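
The layout above can be turned into a list of expected imagery paths for one tile. This is a minimal sketch, assuming the archive has been extracted into a folder given by `root`; the helper `expected_imagery` and the example tile name `AFG_1_of_100_0` are hypothetical and only illustrate the `<countryCode>_<nth>_of_<totalTiles>_<index>` pattern. The year ranges follow Sections 2.1 and 2.2.

```python
from pathlib import PurePosixPath

S1_DIR = "SENTINEL1-DUAL_POL_GRD_HIGH_RES"
S2_DIR = "SENTINEL-2-MSI_LVL2A"
S1_YEARS = range(2015, 2024)  # 2015 to 2023: nine annual means
S2_YEARS = range(2018, 2024)  # 2018 to 2023: six annual medians


def expected_imagery(root: str, tile: str) -> list[PurePosixPath]:
    """List the imagery files a fully populated tile folder would contain."""
    obs = PurePosixPath(root) / "obsvariables"
    paths = [
        obs / S1_DIR / tile / f"{year}_{band}.tif"
        for year in S1_YEARS
        for band in ("VV", "VH")
    ]
    paths += [obs / S2_DIR / tile / f"{year}_RGB.tif" for year in S2_YEARS]
    return paths


# Hypothetical tile name, for illustration only.
paths = expected_imagery("AFG", "AFG_1_of_100_0")
print(len(paths))
print(paths[0])
```

Not every country has all of these files, so the list is best treated as an upper bound to compare against what is actually on disk.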

4. Custom Download of Individual Countries

The public may download the entire data repository or individual countries under the "Files" tab. We also provide the following customized download hyperlinks and an expanded description for every country.

  1. AFG: Afghanistan
  2. AGO: Angola
  3. BDI: Burundi
  4. BEN: Benin
  5. BFA: Burkina Faso
  6. BGD: Bangladesh
  7. BTN: Bhutan (graduated from LDC status in December 2023)
  8. CAF: The Central African Republic
  9. COD: The Democratic Republic of the Congo
  10. COM: The Comoros
  11. DJI: Djibouti
  12. ERI: Eritrea
  13. ETH: Ethiopia
  14. GIN: Guinea
  15. GMB: The Gambia
  16. GNB: Guinea-Bissau
  17. HTI: Haiti
  18. KHM: Cambodia
  19. KIR: Kiribati
  20. LAO: The Lao People's Democratic Republic
  21. LBR: Liberia
  22. LSO: Lesotho
  23. MDG: Madagascar
  24. MLI: Mali
  25. MMR: Myanmar
  26. MOZ: Mozambique
  27. MRT: Mauritania
  28. MWI: Malawi
  29. NER: The Niger
  30. NPL: Nepal
  31. RWA: Rwanda
  32. SDN: The Sudan
  33. SEN: Senegal
  34. SLB: Solomon Islands
  35. SLE: Sierra Leone
  36. SOM: Somalia
  37. SSD: South Sudan
  38. STP: Sao Tome and Principe
  39. TCD: Chad
  40. TGO: Togo
  41. TLS: Timor-Leste
  42. TUV: Tuvalu
  43. TZA: United Republic of Tanzania
  44. UGA: Uganda
  45. VUT: Vanuatu (graduated from LDC status in December 2020)
  46. YEM: Yemen
  47. ZMB: Zambia

5. References

  • Copernicus Sentinel data. Sentinel-1 SAR GRD: C-band Synthetic Aperture Radar Ground Range Detected, log scaling. https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S1_GRD, 2024a. Accessed: 2024-02-01.
  • Copernicus Sentinel data. Harmonized Sentinel-2 MSI: MultiSpectral Instrument, Level-2A. https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_SR_HARMONIZED, 2024b. Accessed: 2024-02-01.
  • Copernicus Sentinel data. Sentinel-2: Cloud Probability. https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2_CLOUD_PROBABILITY, 2024c. Accessed: 2024-02-01.
  • Joshua Dimasaka, Christian Geiß, and Emily So. Global mapping of exposure and physical vulnerability dynamics in least developed countries using remote sensing and machine learning [Poster], 2nd ML for Remote Sensing Workshop, 12th ICLR. Vienna, Austria. 11 May 2024.
  • Noel Gorelick, Matt Hancher, Mike Dixon, Simon Ilyushchenko, David Thau, and Rebecca Moore. Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment, 2017. https://doi.org/10.1016/j.rse.2017.06.031.
  • C. Huyck, Z. Hu, P. Amyx, G. Esquivias, M. Huyck, and M. Eguchi. METEOR: Exposure data classification, metadata population and confidence assessment. Report M3.2/P, 2019.

Notes (English)

Not all countries have the same set of physical vulnerability types and satellite imagery signals. Instead of downloading and manually checking each file, we recommend reviewing this spreadsheet first to understand the available features and groundtruth labels of each country. 

  • Some countries have only the single VV signal rather than both VV and VH, because some areas are not covered by the satellite orbits.
  • We do not recommend using the 2018 median maps of the Sentinel-2 Harmonized MSI because most of them do not fully cover the defined extent or mask. This may be because of the limited spatiotemporal coverage of the satellite.
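
The two notes above can be applied programmatically when preparing a tile. This is a minimal sketch, assuming the `<year>_VV.tif`, `<year>_VH.tif`, and `<year>_RGB.tif` file names from the folder structure; the function names and the returned keys are illustrative and not part of the project code.

```python
import re
from collections import defaultdict

FILE_RE = re.compile(r"^(\d{4})_(VV|VH|RGB)\.tif$")


def bands_by_year(filenames):
    """Group the band names found in a tile's imagery folders by year."""
    found = defaultdict(set)
    for name in filenames:
        match = FILE_RE.match(name)
        if match:
            found[int(match.group(1))].add(match.group(2))
    return found


def summarize(filenames, skip_s2_years=(2018,)):
    """Split years into dual-pol, VV-only, and usable Sentinel-2 RGB years."""
    found = bands_by_year(filenames)
    return {
        "s1_dual": sorted(y for y, b in found.items() if {"VV", "VH"} <= b),
        "s1_vv_only": sorted(
            y for y, b in found.items() if "VV" in b and "VH" not in b
        ),
        # 2018 is skipped by default, following the recommendation above.
        "s2_rgb": sorted(
            y for y, b in found.items() if "RGB" in b and y not in skip_s2_years
        ),
    }


# Toy listing gathered from one tile's Sentinel-1 and Sentinel-2 folders.
names = ["2015_VV.tif", "2016_VV.tif", "2016_VH.tif", "2018_RGB.tif", "2019_RGB.tif"]
summary = summarize(names)
print(summary)
```

Passing `skip_s2_years=()` keeps the 2018 maps for cases where their partial coverage is acceptable.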

Files

_reference.pdf

Files (60.0 GB)

Checksum                                Size
md5:14122999da3b0cba9fb5caa477717465    21.9 kB
md5:83d9e1fdc92ed01130ad72d9727f61b0    158.6 kB
md5:6564e9021e52b704be98e50722fc5d4f    1.4 GB
md5:3af2490abeabdce15eb153a7f933b405    1.2 GB
md5:3996eb4a9e7784e9fa7208c4959249c3    1.3 GB
md5:a19fca492d56f70e3de3d8ccb13a05cc    1.3 GB
md5:c3b569679ef9b8b84a2c988455b3f0ab    1.4 GB
md5:412b1a63489a63d95e3c62e3bc43640c    1.3 GB
md5:2646318daf306c9c2b2bb4c003324433    1.4 GB
md5:aad2ff437c45359444161f7a4eb3fa95    1.3 GB
md5:b6158fd665fe58be9671529fe2845215    1.3 GB
md5:df89673e8447022988947ff1a44a1c3f    1.1 GB
md5:e60168d50e50a9f9ab3bff249ea9a3cc    1.4 GB
md5:3d5d7f3e5de7fa68a109f601d95995f0    1.3 GB
md5:f0800ad64af1963843e1f53406204653    1.3 GB
md5:f3c788de822c7d5ea1e0494c91511e89    1.4 GB
md5:e99560023859800993003a26c52208fb    1.4 GB
md5:772f23b1bd258089aaebc720b69a30fc    1.4 GB
md5:a26f889caa06c6d854418d2564f24d27    1.4 GB
md5:3dd19b549d84eebe531ad1d09fd7f30f    1.4 GB
md5:74c604910d645cf12ba3aee87ba84991    404.9 MB
md5:867855e86553a492048dc111bcb5c034    1.4 GB
md5:daff8185a1b6959bfc036b13701b4b72    1.4 GB
md5:0392b1eeccac380beccc473fa0848564    1.4 GB
md5:ae1df2fbc1eaf36a7fc7009f91ef146d    1.4 GB
md5:479ecf14bddac4d5ac71c2e7080b1a04    1.4 GB
md5:e3b90ee44aa171e6bde9a5968a4536f6    1.4 GB
md5:13753c17d73757fc87ebfaf1fd07eba8    1.3 GB
md5:3d72b8ca46847d64d11ddcacb394194f    1.4 GB
md5:2757aa17949d0fc230131b1a25feb51e    1.3 GB
md5:02558b9a6b9826baaa050305c1ee26c2    1.4 GB
md5:8d196cd8d555a3d7f3aa09516e4c2006    1.4 GB
md5:d34701227663fb28c6e45c82bb4330e3    1.3 GB
md5:7e7d54309dd3da15eeceb244b8aa6df0    1.3 GB
md5:288833cebfee7f3139860cbb3728cd9b    1.4 GB
md5:95a3656f74c5401b433fbb7785a334e8    1.1 GB
md5:3bce4d610a6efb0fb2b7036551523296    1.4 GB
md5:b7fe7dcab332170a1e6a62b03fa32e52    1.3 GB
md5:e07dd276d0e89187d06cfadffb9055b8    1.3 GB
md5:3003125d2173700e99ebc24c07f33da3    539.3 MB
md5:1f41e28979f06738b8e8f03c490d4d13    1.4 GB
md5:58376de51f7ebb61d03958ddbda33b0b    1.4 GB
md5:08c2afb0ccffb70dc66b84f3986e6c56    1.3 GB
md5:b7e9067fba1004fc0ec00a96ad8a4b05    32.2 MB
md5:ada028e1384b5a4ad5b3365116a95dce    1.3 GB
md5:54ab91cb6d08aa752ef089c727652850    1.3 GB
md5:8a0fa8f2e62e6da550b3f824c7d7baa0    1.3 GB
md5:1253fd9662114b03d98a484ae58970d1    1.4 GB
md5:b54c9e2b3bffa3217f9b84f7b7507ad5    1.3 GB
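
Because the archives are large, it may be worth checking a download against the md5 checksums listed above. This is a minimal sketch using Python's standard `hashlib`; the helper names `md5_of_file` and `matches_listing` are illustrative, and which checksum belongs to which file has to be read from the "Files" tab, since the file names are not reproduced in this listing.

```python
import hashlib


def md5_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, listed: str) -> bool:
    """Compare a file against a listed value such as 'md5:0123...'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

For example, `matches_listing("AFG.zip", "md5:<checksum from the Files tab>")` would return `True` only if the downloaded archive is intact. Reading in chunks keeps memory use flat even for the 1.4 GB archives.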

Additional details

Related works

Is cited by
Conference paper: arXiv:2404.01748 (arXiv)
Poster: 10.5281/zenodo.10907137 (DOI)

Funding

UK Research and Innovation
UKRI Centre for Doctoral Training in Application of Artificial Intelligence to the study of Environmental Risks (AI4ER) EP/S022961/1

Software

Repository URL
https://github.com/riskaudit/OpenSendaiBench
Programming language
Python
Development Status
WIP