Small Area Estimation using a Reproducible Geospatial Data Framework at 5 km Resolution: harmonized covariates, censoring-adjusted child survival outcomes, under-5 population surfaces, and inhabited-area masks
Authors/Creators
- 1. Statistician and Spatial Analyst, Research Center for Inclusive Development in Africa
- 2. Monitoring and Evaluation, Wollo University
Description
This repository provides reproducible resources for small-area estimation of the under-five mortality rate (U5MR) in Ethiopia at 5 km spatial resolution. The deposit includes covariate preprocessing workflows, DHS survival-data preparation pipelines, informative censoring adjustment methods, processed raster outputs, synthetic DHS data, metadata files, validation summaries, and example outputs from Bayesian spatial modelling.
The repository contains two fully shared R Markdown workflows:
· 01_covariate_adjustment.Rmd
Processes and harmonizes 25 raster covariates derived from multiple geospatial sources, including ERA5, GBD, MODIS, WorldPop, Malaria Atlas Project, and related products. The workflow applies consistent preprocessing steps: reprojection, resampling to a common 5 km grid, masking to Ethiopia, log1p transformation (where appropriate), z-score standardization, and capping of extreme values. Pairwise correlations are then computed across all covariates, and variables involved in highly correlated pairs (|r| > 0.7) are removed, leaving a reduced set of weakly correlated predictors for downstream modelling. The repository includes fully processed covariate rasters, under-five population rasters, inhabited-area masks, and correlation outputs; rasters are exported as GeoTIFFs and accompanied by JSON metadata files. These adjusted raster covariates can be used directly as spatial predictors for analyses in Ethiopia. The shared R Markdown workflow can also be adapted by other researchers to generate comparable harmonized covariate rasters for any country, provided the required raw geospatial datasets and sufficient computational resources are available.
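The transformation and correlation-filtering steps described above can be illustrated on a small table of simulated cell values. This is only a sketch under stated assumptions, not the code in 01_covariate_adjustment.Rmd: it works on a plain data frame rather than rasters, the covariate names are hypothetical, the choice of which variables receive log1p is assumed, the cap of ±3 standard deviations is assumed, and the greedy "keep the first variable of a correlated pair" rule is one of several reasonable ways to apply the |r| > 0.7 threshold.

```r
# Simulated covariate values at grid cells (rows = 5 km cells).
# All column names are hypothetical stand-ins for real covariates.
set.seed(42)
n <- 500
rain <- rgamma(n, shape = 2, scale = 50)             # skewed, non-negative
covs <- data.frame(
  rainfall     = rain,
  rainfall_b   = rain * exp(rnorm(n, sd = 0.05)),    # near-duplicate of rainfall
  elevation    = rnorm(n, mean = 1800, sd = 600),
  night_lights = rexp(n, rate = 0.5)
)

# log1p for skewed, non-negative variables (which ones is an assumption here)
skewed <- c("rainfall", "rainfall_b", "night_lights")
covs[skewed] <- lapply(covs[skewed], log1p)

# z-score standardization followed by capping at +/- `cap` SDs (cap value assumed)
standardize_cap <- function(x, cap = 3) {
  z <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
  pmin(pmax(z, -cap), cap)
}
covs_z <- as.data.frame(lapply(covs, standardize_cap))

# Greedy filter: keep a variable only if its absolute correlation with every
# variable already kept is at or below the threshold
drop_correlated <- function(df, threshold = 0.7) {
  r <- abs(cor(df, use = "pairwise.complete.obs"))
  keep <- character(0)
  for (v in colnames(r)) {
    if (all(r[v, keep] <= threshold)) keep <- c(keep, v)
  }
  keep
}

selected <- drop_correlated(covs_z)
print(selected)
```

Because the filter is order-dependent, the variable listed first in a correlated pair is the one retained; a workflow that ranks covariates by interpretability or data quality before filtering would give more control over which variable survives.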
· 02_censoring_adjustment.Rmd
Uses the processed and selected raster covariates together with Ethiopia DHS child-level data to construct survival datasets and perform informative censoring adjustment. The workflow links DHS cluster locations with the selected spatial covariates and integrates demographic predictors such as household wealth, birth plurality, maternal age at birth, and sex of child. Three survival modelling approaches are evaluated for censoring adjustment: Cox proportional hazards (CoxPH), Random Survival Forest (RSF), and Gradient Boosting Machine (GBM). Model performance is assessed using concordance index (C-index), area under the ROC curve (AUC), ROC-based evaluation metrics, and external mortality benchmark comparisons. Across evaluated years, the RSF model consistently achieved the best predictive performance and was therefore selected as the final censoring-adjustment model. The repository also includes:
- Processed covariate rasters at 5 km resolution (2000–2016, extended to 2023 where available)
- Population rasters and inhabited-area masks
- Synthetic DHS dataset (synthetic_dhs.rds) for reproducible testing without restricted DHS access
- Covariate summary statistics and multicollinearity assessment outputs
- Example outputs from the final Bayesian INLA + SPDE model
- Validation summaries, metadata files, and session information
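As a rough illustration of the survival-data setup and C-index evaluation described for 02_censoring_adjustment.Rmd, the sketch below builds right-censored follow-up times for simulated children and fits a Cox proportional hazards model with the `survival` package. It makes several assumptions: the data are simulated rather than read from synthetic_dhs.rds, the column names are hypothetical, and follow-up is truncated at 60 months. It does not implement the informative censoring adjustment itself or the RSF and GBM models, whose details are not given in this description.

```r
library(survival)

# Simulated child-level records; column names are hypothetical
set.seed(1)
n <- 2000
children <- data.frame(
  multiple_birth     = rbinom(n, 1, 0.03),
  wealth_index       = sample(1:5, n, replace = TRUE),
  months_since_birth = sample(1:120, n, replace = TRUE)  # birth to interview
)

# Latent age at death in months; the hazard is assumed higher for
# multiple births and for poorer households
rate <- 0.002 * exp(1.0 * children$multiple_birth -
                    0.2 * (children$wealth_index - 3))
latent_death <- rexp(n, rate)

# Follow-up ends at death, at interview, or at 60 months, whichever is first
follow_up_end  <- pmin(children$months_since_birth, 60)
children$time  <- pmin(latent_death, follow_up_end)
children$event <- as.integer(latent_death <= follow_up_end)

# Cox proportional hazards fit and its in-sample concordance index
fit <- coxph(Surv(time, event) ~ multiple_birth + wealth_index,
             data = children)
c_index <- unname(summary(fit)$concordance[1])
print(c_index)
```

Children interviewed before their fifth birthday contribute censored observations (`event = 0`), which is the feature of the data that the censoring adjustment is meant to address. The C-index shown here is computed in-sample; comparing models fairly would need held-out data or cross-validation.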
The final Bayesian spatial-temporal INLA + SPDE modelling code used to generate annual under-five mortality maps is not publicly released. However, methodological descriptions, validation summaries, processed raster products, and example outputs are provided to support transparency and reproducibility.
This repository does not include restricted DHS microdata, household identifiers, or cluster-level geographic coordinates. All DHS-derived analyses were conducted under DHS Program data-use agreements.
GitHub repository: https://github.com/Bayuh23/Small-Area-Estimation-
License: MIT License
Files
| Name | Size | Checksum |
|---|---|---|
| Small_area_estimation_u5m_V1.0.0.zip | 123.0 MB | md5:e9ba24619f4dbd597973213645e8ec31 |
Additional details
Dates
- Submitted
- 2026-04-19
Software
- Repository URL
- https://github.com/Bayuh23/Small-area-estimation-u5m
- Programming language
- R
- Development Status
- Active