Pre-processed daily ERA5 and MODIS AOD data (2003 - 2022) ready for use in AI/ML forecasting
Creators
Description
Long-term, pre-processed atmospheric datasets for use in Machine Learning/AI-based forecasting. They were initially intended for predicting aerosol optical depth (AOD), but can be adapted to predict other atmospheric particles.
Pre-processed data and code
Machine-Learning-ready NumPy* dataset constructed by pre-processing selected atmospheric variables at 5 pressure levels from ERA5 reanalysis (resulting in 35 features) and AOD data from MODIS on board the Aqua and Terra satellites. This long-term daily dataset spans 20 years, from 1 Jan 2003 to 31 Dec 2022, and is homogeneously structured into 1° x 1° grid cells. Missing days and AOD values from MODIS were imputed using the Lattice Kriging method (the Python code used for imputation is included as the Jupyter Notebook 'Combine_impute_AOD.ipynb'), but raw (unimputed) MODIS data are also available. All datasets were created for the purpose of training a Convolutional Neural Network model designed to forecast Saharan dust (DustNet). These datasets can also be used to train other ML models, or to forecast other variables.
This dataset was used to train the DustNet model and to predict AOD 24 hours ahead. Please see doi: 10.5281/zenodo.10722953 for further details on predicting AOD and for the DustNet model code.
*datasets are NumPy arrays (v1.23) created in Python v3.8.18.
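As an illustration of how daily arrays like these might be prepared for 24-hour-ahead forecasting, here is a minimal sketch. It rebuilds the daily time axis for the stated period and pairs the features for day t with the AOD for day t+1. The array layout (day, latitude, longitude, feature), the helper name make_next_day_pairs and the small random arrays are assumptions made for illustration only. They do not come from the dataset files, whose names and shapes are described in READ_ME.pdf.

```python
import numpy as np

# Daily time axis for the period stated in the description
# (1 Jan 2003 to 31 Dec 2022 inclusive; the stop date is exclusive).
dates = np.arange("2003-01-01", "2023-01-01", dtype="datetime64[D]")
n_days = dates.size


def make_next_day_pairs(features, aod):
    """Pair features on day t with AOD on day t+1 (hypothetical helper).

    Assumes the first axis of both arrays is the daily time axis.
    """
    if features.shape[0] != aod.shape[0]:
        raise ValueError("features and aod must cover the same number of days")
    # Drop the last day of inputs and the first day of targets so that
    # inputs[i] and targets[i] are exactly one day apart.
    return features[:-1], aod[1:]


# Small random stand-ins for the real arrays (assumed layout:
# day x latitude x longitude x feature, with 35 features as in the description).
rng = np.random.default_rng(0)
features = rng.random((10, 4, 6, 35))
aod = rng.random((10, 4, 6))

inputs, targets = make_next_day_pairs(features, aod)
print(n_days, inputs.shape, targets.shape)
```

With the real data, the stand-in arrays would be replaced by arrays loaded with np.load, and the first axis would be checked against n_days before pairing.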
Files
READ_ME.pdf
Additional details
Related works
- Is continued by
- Dataset: 10.5281/zenodo.10722953 (DOI)
- Computational notebook: 10.5281/zenodo.10722953 (DOI)
Funding
Dates
- Collected: 2023-01-07/2023-03-31