Supplementary Data for the Predictive Model for Atmospheric Substances and Trace Pollutants in the Environment Using Machine Learning (PASTEL)
Description
Supporting datasets and ensemble members/submembers for the PASTEL model.
Brief description of individual entries:
- v0_1_5_Awakens.csv: A merged dataset combining multiple airborne campaigns with supplementary 24-hour backward trajectory information. Represents the input samples used to train PASTEL.
- Koppen_npy_files.zip: NumPy arrays containing merged land (Beck et al. 2023) and ocean (Walterscheid 2011) Köppen climate classifications at 0.5° x 0.5° global resolution. Includes a Matplotlib colormap (Python .pkl), following Beck et al. (2023), along with alternative Köppen representations.
- worldcities.zip: Simplemaps basic dataset (see attribution and license within).
- df_preprocessed.csv: A preprocessed version of v0_1_5_Awakens.csv containing additional derived features and statistics. Can be used to bypass preprocessing steps in the main PASTEL notebook.
- AllTrajectories.zip: All 24-hour backward HYSPLIT trajectories generated for each sample in v0_1_5_Awakens.csv and df_preprocessed.csv, with varying meteorological inputs (see associated publication for details).
- ne_10m_land.zip: Natural Earth shapefile containing land boundaries at 1:10m (1:10,000,000) scale.
- ERA5_32yr_monthly_avg.nc: NetCDF file containing 32-year monthly averages of ERA5 data (ozone, specific humidity, relative humidity, temperature) over the study period.
- ensemble.zip: Ensemble members and submembers contributing to PASTEL predictions, along with derived statistics and plots (≈27 GB uncompressed).
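Because ensemble.zip expands to roughly 27 GB, it may help to compare an archive's uncompressed size against free disk space before extracting. The following is a minimal sketch using only the Python standard library; the helper names and the local path `ensemble.zip` are assumptions for illustration and are not part of the PASTEL repository.

```python
import shutil
import zipfile
from pathlib import Path


def uncompressed_size_bytes(zip_path):
    """Sum the uncompressed sizes recorded in the archive's file listing."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(info.file_size for info in archive.infolist())


def has_room_for(zip_path, dest_dir="."):
    """Return (bytes needed, bytes free, fits?) for extracting into dest_dir."""
    needed = uncompressed_size_bytes(zip_path)
    free = shutil.disk_usage(dest_dir).free
    return needed, free, needed <= free


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was downloaded.
    archive_path = Path("ensemble.zip")
    if archive_path.exists():
        needed, free, fits = has_room_for(archive_path)
        print(f"needs {needed / 1e9:.1f} GB, free {free / 1e9:.1f} GB, fits: {fits}")
    else:
        print(f"{archive_path} not found; download it from the record first")
```

The check reads only the archive's metadata, so it runs quickly even on the larger archives.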
License
- Code (not included here, see linked repository): GNU General Public License v3.0 (GPLv3).
- Data: Creative Commons Attribution–ShareAlike 4.0 International (CC-BY-SA 4.0).
- Third-party data (Simplemaps, Natural Earth) is redistributed under their respective licenses (see included attributions).
Citation
If you use this dataset, please cite:
Geiser, Victor (2025). Supplementary data for the Predictive model for Atmospheric Substances and Trace pollutants in the Environment using machine Learning (PASTEL). Zenodo. https://doi.org/10.5281/zenodo.17204569
How to Use
These datasets are intended for use with the PASTEL model, but may also be of independent value for climate classification, atmospheric transport analysis, or ensemble modeling.
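For independent use of the CSV files, a quick look at the column names can be useful before committing to a full load. The sketch below uses only the Python standard library; the helper `peek_csv`, the UTF-8 encoding, and the local path `df_preprocessed.csv` are assumptions, and the actual column names are not listed in this record.

```python
import csv
from itertools import islice
from pathlib import Path


def peek_csv(path, n_rows=5):
    """Return (header, first n_rows data rows) without reading the whole file."""
    # UTF-8 is an assumption; change the encoding if the file uses another one.
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        rows = list(islice(reader, n_rows))
    return header, rows


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the file was downloaded.
    target = Path("df_preprocessed.csv")
    if target.exists():
        header, rows = peek_csv(target)
        print(f"{len(header)} columns, first few: {header[:10]}")
        print(f"previewed {len(rows)} rows")
    else:
        print(f"{target} not found; download it from the record first")
```

All values come back as strings here; a dataframe library would be the more natural choice for actual analysis.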
Size Warning
ensemble.zip is roughly 27 GB uncompressed because it includes statistics and plotting information for all members and submembers.
Contact
For questions regarding this dataset or publication, please contact victor.w.geiser[at]gmail.com
Files
Total size: 12.8 GB
| MD5 checksum | Size |
|---|---|
| fcccf160c5f2e2254bf39b341494ab9f | 64.5 MB |
| 5038ba88cf985bdc47266a8936661116 | 463.7 MB |
| 0a62228e04730472b9d32a53362dc053 | 10.3 GB |
| 802c892ebfb9ac54ce163c19dcd5d2e6 | 1.7 GB |
| 42bd9e86200d1e75a667a0439490f453 | 1.6 MB |
| ae4392ba06f12ec492b64a3ed3ff33c4 | 3.3 MB |
| 47db61a70a0fa73235c508f642aa5c0d | 226.8 MB |
| 0e85c9875e21b731db502bc13fb15c2a | 1.8 MB |
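The md5 values listed under Files can be used to confirm that a download completed without corruption. The sketch below is one way to do this with Python's standard `hashlib` module; the helper names are illustrative, and since this listing does not pair checksums with file names, the expected value has to be taken from the record page entry for the file in question.

```python
import hashlib
import sys


def md5_of_file(path, chunk_size=1024 * 1024):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:abc123...'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Usage (hypothetical script name): python verify_md5.py <file> <md5>
    if len(sys.argv) == 3:
        ok = matches_checksum(sys.argv[1], sys.argv[2])
        print("checksum OK" if ok else "checksum MISMATCH")
```

Chunked reading keeps memory use low, which matters for the multi-gigabyte archives in this record.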
Additional details
Related works
- Is published in
- Publication: 10.1175/AIES-D-24-0051.1 (DOI)
Dates
- Available
- 2025-09-29 (date published on Zenodo)
Software
- Repository URL
- https://github.com/vwgeiser/PASTEL
- Programming language
- Python
- Development Status
- Concept