Fire-D: Analysis and ML-Ready NASA-Centric Remote Sensing of Wildfire and Smoke
Creators
Description
Earth science remote sensing imagery is rich in structural and spectral information, making such data an ideal benchmarking platform for a broad range of machine learning (ML) tasks, from pattern retrieval to physics-informed classification to anomaly detection to transfer learning. Nevertheless, the utility of Earth science remote sensing data remains largely unexplored by the broader ML community. Our goal is to bridge this gap and bring a rich variety of multi-source, multi-resolution Earth image data to a wider range of ML researchers who are non-experts in remote sensing, thereby increasing the utility and societal impact of such data products. In particular, motivated by the emerging wildfire crisis, we present radiometrically and geometrically calibrated radiance data from airborne and orbital instruments from the National Aeronautics and Space Administration (NASA), the National Oceanic and Atmospheric Administration (NOAA), and the Korea Meteorological Administration (KMA).
Given the rarity of wildfires and the complex spatio-temporal dependencies in radiance data, these datasets are especially well suited for benchmarking unsupervised and self-supervised learning tasks on both images and non-Euclidean objects. Our experiments on these datasets indicate that contrastive learning and transfer learning algorithms can capture the structures of views and scenes, map the pixel space of multi-sensor imagery to a high-level embedding space for further downstream tasks, and facilitate more cohesive integration of state-of-the-art ML approaches into wildfire risk analytics.
All NASA-based observations are freely usable under the Creative Commons Zero License. There are also no restrictions on the use of GOES data. GK2A data are likewise open data, without any restrictions on their use.
Use:
The input data are geometrically and radiometrically calibrated radiance data pulled from various NASA, NOAA, and KMA archives. For instruments whose spectral bands have different spatial resolutions (GOES and GK2A), all bands have been resampled to the lowest spatial resolution found among that instrument's bands.
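As a rough illustration of what resampling every band to the lowest resolution involves, the sketch below picks the coarsest pixel size among a set of bands and block-averages a finer band down to it. The band names, the pixel sizes, and the choice of block averaging are assumptions made for this example only; the record does not state which resampling method was used.

```python
def coarsest_resolution(band_res_m):
    """Return the largest pixel size (i.e. the lowest resolution) among the bands."""
    return max(band_res_m.values())


def block_mean(grid, factor):
    """Downsample a 2-D list of radiances by averaging factor x factor blocks.

    Block averaging is an assumption for illustration, not the documented method.
    """
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid dimensions must be multiples of the factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j] for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


# Hypothetical band names and pixel sizes in metres (illustrative values only).
band_res_m = {"band_a": 500, "band_b": 1000, "band_c": 2000}
target = coarsest_resolution(band_res_m)

# Integer downsampling factor needed for each band to reach the target grid.
factors = {name: target // res for name, res in band_res_m.items()}

# A tiny 2 x 2 stand-in for a fine-resolution band, reduced to a single pixel.
fine_band = [[1.0, 2.0], [3.0, 4.0]]
coarse_band = block_mean(fine_band, 2)
```

Because the files in this dataset are already resampled, a step like this is only relevant when reproducing the curation from the original archives.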
Geometric and radiometric calibration has been done by the science data processing pipelines of the various missions and does not need to be repeated by anyone looking to curate the same data. Further information for each instrument can be found in the publicly available Level-1 algorithm theoretical basis documents (ATBDs).
All input and label data are in GeoTIFF format. Each band is in a separate raster band, and each scene is in a separate GeoTIFF file. Label files and input files are in correspondingly named subdirectories, and the file names match between inputs and labels, with the exception of an additional .fire or .smoke in the respective label filenames and subfolders.
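The naming convention above can be used to pair each input scene with its fire and smoke labels. The following is a minimal sketch that assumes the extra `.fire` or `.smoke` token sits directly before the file extension (for example `scene_001.tif` and `scene_001.fire.tif`); that position, the example file names, and the helper names `label_filename` and `pair_scenes` are hypothetical, so check them against the extracted archive before relying on them.

```python
from pathlib import PurePath

LABEL_KINDS = ("fire", "smoke")


def label_filename(input_name, kind):
    """Build the label file name for an input file.

    Assumes the label token is inserted just before the extension; this
    placement is a guess, not something the dataset description specifies.
    """
    p = PurePath(input_name)
    return f"{p.stem}.{kind}{p.suffix}"


def pair_scenes(input_names, label_names):
    """Map each input file name to its fire/smoke label names (None if missing)."""
    available = set(label_names)
    pairs = {}
    for name in input_names:
        pairs[name] = {}
        for kind in LABEL_KINDS:
            candidate = label_filename(name, kind)
            pairs[name][kind] = candidate if candidate in available else None
    return pairs


# Hypothetical file names, for illustration only.
inputs = ["scene_001.tif", "scene_002.tif"]
labels = ["scene_001.fire.tif", "scene_001.smoke.tif", "scene_002.fire.tif"]
pairs = pair_scenes(inputs, labels)
```

Returning `None` for a missing label, rather than raising, makes it easy to list scenes that have only one of the two label types.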
Files
(12.0 GB)
Name | Size |
---|---|
md5:ecc2c308fc04f834ed526f4d85f8af26 | 12.0 GB |