Published February 14, 2024 | Version v2
Dataset Open

DustNet - structured data and Python code to reproduce the model, statistical analysis and figures

Description

Data and Python code used to predict aerosol optical depth (AOD) with DustNet, a machine learning (AI) based forecasting model.

Model input data and code

Processed MODIS AOD data (from Aqua and Terra) and selected ERA5 variables*, ready for reproducing the DustNet model results or for similar machine learning forecasting. These long-term daily time series (2003-2022) are provided as n-dimensional NumPy arrays. The Python code to handle the data and run the DustNet model** is included as the Jupyter Notebook ‘DustNet_model_code.ipynb’. A subfolder contains the data normalised and split into training/validation/testing sets, together with Python code for two additional ML-based models** used for comparison (U-NET and Conv2D). Pre-trained models are also archived here as TensorFlow files.
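The normalisation and training/validation/testing split mentioned above can be illustrated with a minimal sketch. It is not the code from ‘DustNet_model_code.ipynb’: the array layout (days, latitude, longitude, variables), the split sizes, the min-max scaling and the function names `chronological_split` and `minmax_normalise` are all assumptions made for illustration, and the data are synthetic.

```python
import numpy as np

def chronological_split(data, n_val, n_test):
    """Split along axis 0 (time) without shuffling: train | val | test."""
    n_train = data.shape[0] - n_val - n_test
    if n_train <= 0:
        raise ValueError("validation and test sets leave no training data")
    return (data[:n_train],
            data[n_train:n_train + n_val],
            data[n_train + n_val:])

def minmax_normalise(train, *others):
    """Scale each variable (last axis) to [0, 1] using training statistics only."""
    axes = tuple(range(train.ndim - 1))          # reduce over all axes except the last
    lo = train.min(axis=axes, keepdims=True)
    hi = train.max(axis=axes, keepdims=True)
    scale = np.where(hi > lo, hi - lo, 1.0)      # avoid division by zero for constant variables
    return tuple((arr - lo) / scale for arr in (train, *others))

# Synthetic stand-in for a daily time series; the shape is an assumption.
rng = np.random.default_rng(0)
demo = rng.random((100, 8, 16, 3)).astype("float32")

train, val, test = chronological_split(demo, n_val=10, n_test=20)
train_n, val_n, test_n = minmax_normalise(train, val, test)
print(train_n.shape, val_n.shape, test_n.shape)
```

A real array from the archive would be read with `numpy.load` in place of the synthetic one. Taking the scaling statistics from the training period only keeps information from the validation and testing periods out of the model inputs.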

Model output data and code

This dataset was constructed by running ‘DustNet_model_code.ipynb’ (see above). It consists of 1095 days of forecast AOD data (2020-2022) from CAMS, the DustNet model, naïve prediction (persistence) and gridded climatology. The ground-truth raw AOD data from MODIS are provided for comparison and statistical analysis of the predictions. The dataset is intended for quick reproduction of the figures and statistical analysis presented in the paper introducing DustNet.
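For readers unfamiliar with the two baselines, the sketch below shows one common way to build a persistence forecast and a gridded climatology and to score them against ground truth. It is only an illustration under stated assumptions: a one-day lead time, a climatology defined as the per-grid-cell mean over a reference period, RMSE as the score, and synthetic arrays of shape (days, latitude, longitude). The archived notebook may define these differently.

```python
import numpy as np

def persistence_forecast(obs, lead=1):
    """Forecast for day t is the observation from day t - lead."""
    return obs[:-lead]

def climatology_forecast(reference, n_days):
    """Repeat the per-grid-cell mean of a reference period for n_days."""
    mean_map = np.nanmean(reference, axis=0)
    return np.broadcast_to(mean_map, (n_days,) + mean_map.shape)

def rmse(forecast, truth):
    """Root-mean-square error, ignoring missing (NaN) pixels."""
    return float(np.sqrt(np.nanmean((forecast - truth) ** 2)))

# Synthetic stand-ins; real AOD fields would be loaded from the archive.
rng = np.random.default_rng(1)
history = rng.random((200, 4, 6))   # assumed reference period for the climatology
truth = rng.random((30, 4, 6))      # assumed evaluation period

lead = 1
target = truth[lead:]               # days that have a persistence forecast
pers = persistence_forecast(truth, lead)
clim = climatology_forecast(history, target.shape[0])

print("persistence RMSE:", rmse(pers, target))
print("climatology RMSE:", rmse(clim, target))
```

The NaN-aware means are used on the assumption that satellite retrievals can contain missing pixels; with gap-free arrays they behave like ordinary means.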

 

*datasets are NumPy arrays (v1.23) created in Python v3.8.18.

**all ML models were created with Keras in Python v3.10.10.

Files

READ_ME.pdf

Files (6.4 GB)

Checksum Size
md5:bb17afdd5010586950ceb0b7e4af252c 6.4 GB
md5:475c82e0145d39dba4e945b1c95c04ad 35.9 MB
md5:15c512985046e11628e761ef5554a9f0 982.9 kB
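
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The following is a minimal sketch using only the Python standard library; the helper names `md5sum` and `verify` are illustrative, and the demonstration runs on a small temporary file rather than on a file from this record.

```python
import hashlib
import tempfile
from pathlib import Path

def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify(path, expected):
    """Compare a file's digest with a checksum given as 'md5:<hex>' or '<hex>'."""
    return md5sum(path) == expected.split(":")[-1].strip().lower()

# Demonstration on a temporary file; for a real download, pass its path
# and the corresponding checksum from the listing above.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.bin"
    demo.write_bytes(b"hello")
    demo_ok = verify(demo, "md5:5d41402abc4b2a76b9719d911017c592")
print(demo_ok)
```

Reading in chunks matters for the largest file here (6.4 GB), which should not be loaded into memory in one piece.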

Additional details

Related works

Is derived from
Dataset: https://doi.org/10.5281/zenodo.10593152 (URL)

Funding

UKRI Centre for Doctoral Training in Environmental Intelligence: Data Science & AI for Sustainable Futures EP/S022074/1
Engineering and Physical Sciences Research Council

Dates

Collected
2023-01-07/2023-03-31
from NASA (LAADS DAAC) and Copernicus (CDS)

Software

Repository URL
https://github.com/Trish-hub/saharan-dust
Programming language
Python, Jupyter Notebook
Development Status
Work in progress