Published December 23, 2024 | Version v2
Dataset Open

High-Quality Daily PM2.5 Datasets for India at 10 km Resolution (Version 2)

  • 1. Emmett Interdisciplinary Program in Environment and Resources, Stanford University, Stanford, CA, USA
  • 2. Department of Earth System Science, Stanford University, Stanford, CA, USA
  • 3. School of Marine and Atmospheric Sciences, Stony Brook University, Stony Brook, NY, USA
  • 4. Program in Public Health, Stony Brook University, Stony Brook, NY, USA
  • 5. Centre for Research on Energy and Clean Air (CREA)
  • 6. Department of Energy Science and Engineering, Stanford University, Stanford, CA, USA
  • 7. Doerr School of Sustainability, Stanford University, Stanford, CA, USA
  • 8. Center on Food Security and the Environment, Stanford University, Stanford, CA, USA
  • 9. National Bureau of Economic Research, Cambridge, MA, USA

Description

 

If you use this dataset in your research/work, please cite the following paper:

Kawano, Ayako, et al. "Improved daily PM2.5 estimates in India reveal inequalities in recent enhancement of air quality." Science Advances 11.4 (2025): eadq1071. DOI: 10.1126/sciadv.adq1071

Thank you for acknowledging our work!

----------------------------------------------
 
This record provides open-source daily fine particulate matter (PM2.5) datasets for India at a 10 km resolution from 2005 to 2023. The estimates come from a region-specific two-stage machine learning model validated on held-out monitor data that were not used in training. The model demonstrates robust out-of-sample performance and substantially outperforms existing publicly available monthly PM2.5 datasets.
 
To take advantage of both the longer available time series of Aerosol Optical Depth (AOD) data and information from newer sensors such as the TROPOspheric Monitoring Instrument (TROPOMI), we developed two separate machine learning models: the "Full model" and the "AOD model".
 
Full model:
  • Predictive performance (spatial cross-validation): R2 value of 0.67, RMSE of 27.79 μg/m3
  • Input features: Moderate Resolution Imaging Spectroradiometer (MODIS) AOD and TROPOMI satellite inputs along with other remote sensing data
  • Daily PM2.5 predictions for: July 10, 2018 - September 30, 2023
AOD model: 
  • Predictive performance (spatial cross-validation): R2 value of 0.64, RMSE of 32.08 μg/m3
  • Input features: all inputs except TROPOMI used for the Full model
  • Daily PM2.5 predictions for: January 1, 2005 - September 30, 2023
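For readers comparing their own predictions against monitor data, the two metrics reported above can be computed as in the minimal sketch below. It assumes R2 is the coefficient of determination (1 - SS_res / SS_tot); the record does not state which definition was used, so treat that as an assumption. The function names and the sample values are illustrative only and are not taken from the dataset.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the units of the inputs (e.g. ug/m3)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up daily PM2.5 values for illustration only.
observed = [40.0, 60.0, 80.0, 100.0]
predicted = [50.0, 60.0, 70.0, 100.0]

example_rmse = rmse(observed, predicted)
example_r2 = r_squared(observed, predicted)
```

Because RMSE carries the units of the data, it is only comparable between models evaluated on the same set of monitor observations.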
 
Please note that we used spatial cross-validation (CV) rather than the more conventional random CV so that the evaluation reflects the intended use: predicting daily PM2.5 concentrations at locations across India that have no air quality monitors. When the Full model above was evaluated using 10-fold random CV, it showed notably higher performance (R2 of 0.85 and RMSE of 18.48 μg/m3). This highlights the potential of random CV to overstate model performance in critical real-world applications.
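The difference between the two CV schemes can be illustrated with a small sketch. It assumes that "spatial" means every record from a given monitor is held out together; the authors' actual fold construction (for example, grouping by larger spatial blocks) may differ, and the deposited code is the authoritative source. All names and the toy data are hypothetical.

```python
import random
from collections import defaultdict


def random_folds(records, k, seed=0):
    """Random CV: individual daily records are shuffled into k folds."""
    rng = random.Random(seed)
    indices = list(range(len(records)))
    rng.shuffle(indices)
    folds = [[] for _ in range(k)]
    for position, idx in enumerate(indices):
        folds[position % k].append(idx)
    return folds


def spatial_folds(records, k, seed=0):
    """Grouped CV: all records from one monitor land in the same fold."""
    by_monitor = defaultdict(list)
    for idx, (monitor_id, _day) in enumerate(records):
        by_monitor[monitor_id].append(idx)
    monitors = sorted(by_monitor)
    rng = random.Random(seed)
    rng.shuffle(monitors)
    folds = [[] for _ in range(k)]
    for position, monitor_id in enumerate(monitors):
        folds[position % k].extend(by_monitor[monitor_id])
    return folds


def monitors_shared_with_training(records, folds):
    """Per fold, count held-out monitors that also appear in the training data."""
    counts = []
    for test_fold in folds:
        test_indices = set(test_fold)
        test_monitors = {records[i][0] for i in test_fold}
        train_monitors = {
            records[i][0] for i in range(len(records)) if i not in test_indices
        }
        counts.append(len(test_monitors & train_monitors))
    return counts


# Hypothetical toy data: 6 monitors with 30 daily records each.
records = [(f"monitor_{m}", day) for m in range(6) for day in range(30)]
spatial = spatial_folds(records, k=3)
randomized = random_folds(records, k=3)

spatial_overlap = monitors_shared_with_training(records, spatial)
random_overlap = monitors_shared_with_training(records, randomized)
```

With grouped folds, no held-out monitor contributes records to training, so the score measures prediction at unseen locations. With random folds, the model has usually seen other days from the same monitor, which is one plausible reason for the higher random-CV scores.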
 
Code and source data needed to replicate the results have also been deposited.

Files

aod_model_2005.zip

Files (3.8 GB)

Checksum  Size
md5:382fa8c643efc53c921cb9029097c31a  82.5 MB
md5:3796aabc49cf48ab5b224a1c59527b4d  82.5 MB
md5:6e7c0dfdf0e844b5dcf9c778e6e469bf  82.4 MB
md5:fba13f421ca02dd8a8622232f90f2b1f  82.5 MB
md5:870584e08f994e8ae7e9df7724f09463  82.1 MB
md5:6fc082846909375c9db7bfc38c44f93b  82.5 MB
md5:1e2a9091b39a4cee441d97c4a8d51c05  82.6 MB
md5:fde0c4dcc0a3475628c782670f7610c2  83.0 MB
md5:5f4a88d391b398fcc26027fee515634a  83.0 MB
md5:8951e3b49d646752332ad1f47a6da8e6  82.8 MB
md5:a9ffbeee66f4017b795bf7b47492d6df  82.1 MB
md5:909f4b4438f0e6dff90c0da4736cb10d  82.4 MB
md5:b76e9eb159d8f1b2a092829990eac114  82.2 MB
md5:5f088af9ece430373edf468c97e21612  82.0 MB
md5:b87aeddc0aff334476d7af52b7be0090  82.5 MB
md5:3f3396b3d7616451962b25f1733149de  83.1 MB
md5:5480b113387b8991649f698a7c6b9d48  82.5 MB
md5:5461f732be856a2cbeace5b5521c77b7  82.6 MB
md5:d22346fe126d170e815c04d4b9a61d7f  62.3 MB
md5:58d6241b4e9778ce423d66d2f45f5056  39.7 MB
md5:fc86a70d2fac801ce597ea39b9936860  82.8 MB
md5:0b22038f1c09b5bcdd6910210faf2635  83.4 MB
md5:ff775d5045c20d1487a54f22ff167171  82.8 MB
md5:04feb3dbe13d6dd8d22dad57dfdffb8e  82.9 MB
md5:5a564bb38b765c4a03d33b0d0f74b718  62.5 MB
md5:576c67c28225921c6f4730713807d495  64.1 kB
md5:f8c36d5a6c9ebfa34860170da9bf6cb4  1.8 GB
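
The md5 values listed above can be used to check that a download completed without corruption. The sketch below uses Python's standard hashlib module; the helper names are illustrative. Since this listing does not show which checksum belongs to which file, the pairing has to be taken from the record page itself.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_md5(path, listed):
    """Compare a file against a listed value such as 'md5:<32 hex characters>'."""
    expected = listed.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

A call such as matches_listed_md5(local_path, listed_value) returns True only when the local file's digest equals the listed one; a False result suggests re-downloading the file.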

Additional details

Software

Programming language
Python, R