Published January 16, 2026 | Version v1
Dataset Open

Dataset for "Interpolation-Driven Machine Learning Approaches for Plume Shine Dose Estimation: A Comparison of XGBoost, Random Forest, and TabNet"

Authors/Creators

  • 1. Bhabha Atomic Research Centre

Description

This Zenodo record contains the datasets, trained machine learning models, and supporting preprocessing artifacts used in the study on surrogate modeling of plume shine dose for radiological consequence assessment.

The repository includes preprocessed discrete datasets, interpolated continuous datasets, and trained models developed to predict plume shine dose as a function of downwind distance, release height, radionuclide identity, and atmospheric stability category. Interpolation was performed along the downwind distance dimension using shape-preserving methods to enhance spatial resolution while maintaining physical monotonicity of dose attenuation.
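The distance-wise interpolation described above can be illustrated with a small, dependency-free sketch. The record only states that "shape-preserving methods" were used, so the specific scheme here (a monotone cubic Hermite interpolant with Fritsch-Butland style slopes, similar in spirit to SciPy's `PchipInterpolator`) is an assumption, as are the function names, the 50-unit grid spacing, and the made-up rows, which are not taken from the dataset.

```python
from bisect import bisect_right
from collections import defaultdict


def monotone_tangents(x, y):
    """Node slopes that keep the cubic monotone (zero at local extrema)."""
    h = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    d = [(y[i + 1] - y[i]) / h[i] for i in range(len(x) - 1)]  # secant slopes
    m = [0.0] * len(x)
    m[0], m[-1] = d[0], d[-1]  # simple one-sided end slopes
    for i in range(1, len(x) - 1):
        if d[i - 1] * d[i] > 0:  # same sign: weighted harmonic mean
            w1, w2 = 2 * h[i] + h[i - 1], h[i] + 2 * h[i - 1]
            m[i] = (w1 + w2) / (w1 / d[i - 1] + w2 / d[i])
    return m


def interpolate(x, y, x_new):
    """Evaluate the monotone cubic Hermite interpolant at each point of x_new."""
    m = monotone_tangents(x, y)
    out = []
    for xq in x_new:
        # Locate the interval [x[i], x[i+1]] that contains xq.
        i = min(max(bisect_right(x, xq) - 1, 0), len(x) - 2)
        h = x[i + 1] - x[i]
        t = (xq - x[i]) / h
        h00 = 2 * t**3 - 3 * t**2 + 1
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        out.append(h00 * y[i] + h10 * h * m[i] + h01 * y[i + 1] + h11 * h * m[i + 1])
    return out


# Made-up rows for illustration only (arbitrary units):
# (radionuclide, release height, stability category, distance, dose)
rows = [
    ("Cs-137", 50, "D", 100, 8.0), ("Cs-137", 50, "D", 200, 5.0),
    ("Cs-137", 50, "D", 500, 2.0), ("Cs-137", 50, "D", 1000, 0.8),
    ("Cs-137", 50, "D", 2000, 0.2),
    ("I-131", 100, "F", 100, 3.0), ("I-131", 100, "F", 1000, 1.0),
    ("I-131", 100, "F", 2000, 0.4),
]

# Interpolate each combination of the other inputs separately.
groups = defaultdict(list)
for nuclide, height, stability, distance, dose in rows:
    groups[(nuclide, height, stability)].append((distance, dose))

fine = {}
for key, points in groups.items():
    points.sort()
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    grid = list(range(xs[0], xs[-1] + 1, 50))  # hypothetical finer grid
    fine[key] = list(zip(grid, interpolate(xs, ys, grid)))
```

Because the slopes are limited at every node, the interpolated dose never rises between two decreasing samples, which is the property a plain cubic spline does not guarantee.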

Contents

Trained machine learning models

  • xgboost_model_final_ep100_dep30.json: Trained XGBoost regression model

  • random_forest_final_ep100_dep15.pkl: Trained Random Forest regression model

  • tabnet_final.pkl: Trained TabNet deep learning model

Preprocessing and encoding artifacts

  • scaler.pkl: Feature scaling object used during model training

  • label_encoders.pkl: Label encoders for categorical variables (radionuclide and stability category)
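A minimal sketch of how the artifacts listed above might be loaded from a local copy of the record. The helper names and the `record_dir` argument are hypothetical, and loading the XGBoost file into a `Booster` is an assumption about how the model was saved. Unpickling the `.pkl` files also requires the libraries that created them (presumably scikit-learn and a TabNet implementation) to be installed in compatible versions, and pickle files should only be loaded from a source you trust.

```python
import pickle
from pathlib import Path


def load_pickle(path):
    """Unpickle one artifact. Only use this on files you trust."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


def load_preprocessing(record_dir):
    """Load the scaler and label encoders from a local copy of the record."""
    record_dir = Path(record_dir)
    return {
        "scaler": load_pickle(record_dir / "scaler.pkl"),
        "label_encoders": load_pickle(record_dir / "label_encoders.pkl"),
    }


def load_xgboost_model(record_dir):
    """Load the JSON model into a Booster (requires the xgboost package)."""
    import xgboost as xgb  # imported lazily so the helpers above work without it

    booster = xgb.Booster()
    booster.load_model(str(Path(record_dir) / "xgboost_model_final_ep100_dep30.json"))
    return booster
```

The feature order and the exact columns the scaler expects are not stated in this record, so they should be checked against the code in the linked repository before making predictions.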

Datasets

  • filtered_distance_2000_height_200_train_99.csv: Original discrete training dataset

  • filtered_distance_2000_height_200_test_1.csv: Independent test dataset (real, non-interpolated)

  • finer_interpolated_data_filtered_distance_2000_height_200_train_99.csv: Distance-wise interpolated training dataset

  • finer_interpolated_data_filtered_distance_2000_height_200_train_9975.csv: Interpolated dataset with additional augmentation

  • finer_interpolated_data_filtered_distance_2000_height_200_test_0025.csv: Held-out validation dataset

Notes

  • Interpolated datasets were generated using physically consistent one-dimensional interpolation along the downwind distance axis, performed separately for each radionuclide–release height–stability category combination.

  • The independent test dataset contains only original (non-interpolated) samples and was not used during model training.

  • The trained models and preprocessing objects are provided to ensure full reproducibility of the reported results.

Files

Files (765.2 MB)

Checksum                                Size
md5:5aebab38bf5a6bfc45721b97525e1d67    35.5 kB
md5:60bcea911ac9577b88154e5d190006cd    3.4 MB
md5:2a943f195f62a27f24a5ad7708aa5e82    56.9 kB
md5:ec61aea7ce24cbb791139f4c55d6c0aa    219.7 MB
md5:2e3a4af4f433a6dcf67cbf5c3db04cd4    227.2 MB
md5:e8c3eb442f490f4de81723c17d48f1be    557 Bytes
md5:1aebe6c630013e29f1c3bb4f75dad9e3    156.6 MB
md5:ab3e5df50fa42e2e139784bfb241730a    524 Bytes
md5:09be205cf244854b462716b7f0c3a585    1.5 MB
md5:b85185e3132d1631a8d1179d90962af2    156.7 MB
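
The MD5 checksums listed above can be used to confirm that a download is intact. The sketch below uses only the Python standard library; the function names are hypothetical, and because the listing does not pair checksums with file names, the expected value has to be supplied by the caller from the record page.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected):
    """Compare a file's MD5 with an expected value such as 'md5:5aeb...'."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[len("md5:"):]
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use flat for the larger files in the record, some of which exceed 200 MB.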

Additional details

Software

Repository URL
https://github.com/BiswajitSadhu/PlumeDoseNet
Programming language
Python
Development Status
Active