Published March 26, 2024 | Version v1
Conference paper Open

Sen2Fire: A Challenging Benchmark Dataset for Wildfire Detection using Sentinel Data

  • 1. Linköping University
  • 2. Maxar Technologies (Sweden)

Description

Abstract

Satellite imagery has substantial potential for practical wildfire detection. To advance the development of machine learning algorithms in this domain, our study introduces the Sen2Fire dataset, a challenging satellite remote sensing dataset tailored for wildfire detection. The dataset is curated from Sentinel-2 multispectral data and the Sentinel-5P aerosol product, and comprises a total of 2466 image patches. Each patch has a size of 512×512 pixels with 13 bands. Because different wavebands respond differently to wildfire, our research focuses on optimizing wildfire detection by evaluating different wavebands and combining them with spectral indices such as the normalized burn ratio (NBR) and the normalized difference vegetation index (NDVI). The results suggest that selecting specific band combinations yields better performance than using all bands. Additionally, our study underscores the positive impact of integrating Sentinel-5P aerosol data for wildfire detection.
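Both indices named in the abstract are normalized differences with standard definitions: NDVI = (NIR - Red) / (NIR + Red) and NBR = (NIR - SWIR) / (NIR + SWIR). The following is a minimal NumPy sketch of those formulas, not the authors' implementation. The helper name `normalized_difference` and the `eps` guard are illustrative choices, and which channels of a Sen2Fire patch hold the near-infrared, red, and shortwave-infrared bands is left to the caller rather than assumed here.

```python
import numpy as np

def normalized_difference(a, b, eps=1e-6):
    """Return (a - b) / (a + b), writing 0 where the denominator is near zero."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    denom = a + b
    out = np.zeros_like(denom)
    # Divide only where the denominator is safely non-zero; other cells stay 0.
    np.divide(a - b, denom, out=out, where=np.abs(denom) > eps)
    return out

def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return normalized_difference(nir, red)

def nbr(nir, swir):
    """Normalized burn ratio."""
    return normalized_difference(nir, swir)

# Toy reflectance values, for illustration only.
nir = np.array([0.5, 0.4, 0.0])
red = np.array([0.1, 0.4, 0.0])
print(ndvi(nir, red))
```

The resulting index maps have the same spatial shape as the input bands, so they can be stacked with selected bands as extra input channels.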

Data Description

The training set is composed of data from study areas 1 and 2, the validation set is derived from study area 3, and study area 4 is designated as the test set. The study areas do not overlap, which prevents label leakage between the splits during model training. Each study area is then partitioned into 512×512 patches, with an overlap of 128 pixels between adjacent patches. After this tiling process, the training, validation, and test sets consist of 1458, 504, and 504 patches, respectively. Each patch is a composite of 13 bands that include:

  • Multispectral data from Sentinel-2: B1, B2, B3, B4, B5, B6, B7, B8, B9, B10, B11, B12.

  • Aerosol index from Sentinel-5P: B13.
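The tiling described above implies a stride of 512 - 128 = 384 pixels between the origins of adjacent patches. A minimal sketch of that arithmetic follows. `patch_origins` is a hypothetical helper, it assumes that partial patches at the border are dropped, and the 10496-pixel length in the usage line is only an example chosen to give 27 positions, not a stated scene size.

```python
def patch_origins(length, patch=512, overlap=128):
    """Start offsets of all full patches along one axis of `length` pixels."""
    stride = patch - overlap  # 384 with the values given in the description
    if length < patch:
        return []
    return list(range(0, length - patch + 1, stride))

origins = patch_origins(10496)
print(len(origins), origins[:3])  # 27 [0, 384, 768]
```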

The ground-truth labels are collected based on

All collected data is resampled to achieve a consistent 10-meter spatial resolution. The Sen2Fire dataset is structured as follows:

├── scene1/
│   ├── scene_1_patch_1_1.npz
│   ├── scene_1_patch_1_2.npz
│   ├── ...
│   ├── scene_1_patch_32_27.npz
├── scene2/
│   ├── scene_2_patch_1_1.npz
│   ├── scene_2_patch_1_2.npz
│   ├── ...
│   ├── scene_2_patch_22_27.npz
├── scene3/
│   ├── scene_3_patch_1_1.npz
│   ├── scene_3_patch_1_2.npz
│   ├── ...
│   ├── scene_3_patch_14_36.npz
├── scene4/
│   ├── scene_4_patch_1_1.npz
│   ├── scene_4_patch_1_2.npz
│   ├── ...
│   ├── scene_4_patch_21_24.npz
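File names in the tree encode the scene number and two trailing indices. The sketch below parses them and assigns the splits described earlier (scenes 1 and 2 for training, scene 3 for validation, scene 4 for testing). `parse_patch_name` and `SPLIT_BY_SCENE` are hypothetical names, and reading the two trailing numbers as grid indices is an assumption. The final lines check that the largest indices shown in the tree reproduce the stated split sizes, which holds if the indices form a complete grid.

```python
import re

PATCH_RE = re.compile(r"scene_(\d+)_patch_(\d+)_(\d+)\.npz$")
SPLIT_BY_SCENE = {1: "train", 2: "train", 3: "val", 4: "test"}

def parse_patch_name(name):
    """Return (scene, first_index, second_index, split) for a patch file name."""
    match = PATCH_RE.search(name)
    if match is None:
        raise ValueError(f"not a Sen2Fire patch file name: {name!r}")
    scene, i, j = (int(g) for g in match.groups())
    return scene, i, j, SPLIT_BY_SCENE[scene]

# Largest index pair listed for each scene in the tree above.
LAST_INDEX = {1: (32, 27), 2: (22, 27), 3: (14, 36), 4: (21, 24)}

counts = {}
for scene, (n_i, n_j) in LAST_INDEX.items():
    split = SPLIT_BY_SCENE[scene]
    counts[split] = counts.get(split, 0) + n_i * n_j

print(parse_patch_name("scene3/scene_3_patch_14_36.npz"))  # (3, 14, 36, 'val')
print(counts)  # {'train': 1458, 'val': 504, 'test': 504}
```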

Class mapping used in the labels:

Class Number    Class Name    Class Code in the Label
1               Non-fire      0
2               Fire          1
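With the label codes above, the share of fire pixels in a patch is the mean of the mask `label == 1`. The sketch below also lists the arrays stored in an `.npz` archive instead of assuming their names, since this page does not state them. `describe_npz`, `fire_fraction`, and the `label` key of the synthetic archive are illustrative assumptions, not the dataset's documented layout.

```python
import io
import numpy as np

CLASS_NAMES = {0: "Non-fire", 1: "Fire"}

def describe_npz(file):
    """Map each array name in an .npz archive to its (shape, dtype)."""
    with np.load(file) as archive:
        return {key: (archive[key].shape, str(archive[key].dtype))
                for key in archive.files}

def fire_fraction(label):
    """Fraction of pixels carrying class code 1 (Fire)."""
    label = np.asarray(label)
    codes = np.unique(label)
    if not np.isin(codes, list(CLASS_NAMES)).all():
        raise ValueError(f"unexpected class codes: {codes.tolist()}")
    return float((label == 1).mean())

# Synthetic stand-in for a real patch: a 512x512 label with one fire block.
demo_label = np.zeros((512, 512), dtype=np.uint8)
demo_label[:128, :256] = 1

buf = io.BytesIO()
np.savez(buf, label=demo_label)  # "label" is a made-up key for this demo
buf.seek(0)

info = describe_npz(buf)
print(info)
print(fire_fraction(demo_label))  # 0.125
```

Running `describe_npz` on a real patch file is a quick way to discover the actual array names and shapes before writing a data loader.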

Baseline Code

The baseline code is also provided in this record as Baseline_code.zip.

Citation

Please cite the following paper if you use the data or the code:

@article{xu2024sen2fire,
  title={Sen2Fire: A Challenging Benchmark Dataset for Wildfire Detection using Sentinel Data},
  author={Yonghao Xu and Amanda Berg and Leif Haglund},
  journal={arXiv preprint arXiv:2403.17884},
  year={2024}
}

Files

Baseline_code.zip

Files (6.3 GB)

md5:ca9f5ef752e19b0dfcf76eab8413d8db (16.4 kB)
md5:135be2af2a8577c6deb12cbd7cc76c1a (6.3 GB)
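The md5 values listed above can be used to check a download for corruption. A minimal sketch with Python's standard `hashlib` follows. `md5_of_file` and `matches_listed_md5` are hypothetical helpers, and the file is read in chunks so that a multi-gigabyte archive does not need to fit in memory.

```python
import hashlib

def md5_of_file(path, chunk_size=1 << 20):
    """Hex md5 digest of the file at `path`, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_listed_md5(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected

# Usage, with the path replaced by wherever the archive was saved:
# matches_listed_md5("Baseline_code.zip", "md5:ca9f5ef752e19b0dfcf76eab8413d8db")
```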