Dataset for "Interpolation-Driven Machine Learning Approaches for Plume Shine Dose Estimation: A Comparison of XGBoost, Random Forest, and TabNet"
Description
This Zenodo record contains the datasets, trained machine learning models, and supporting preprocessing artifacts used in the study on surrogate modeling of plume shine dose for radiological consequence assessment.
The repository includes preprocessed discrete datasets, interpolated continuous datasets, and trained models developed to predict plume shine dose as a function of downwind distance, release height, radionuclide identity, and atmospheric stability category. Interpolation was performed along the downwind distance dimension using shape-preserving methods to enhance spatial resolution while maintaining physical monotonicity of dose attenuation.
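The record does not name the shape-preserving method used. As a minimal sketch of one such scheme, the code below builds a monotone cubic Hermite interpolant with harmonic-mean tangents and applies it separately to each radionuclide, release height, and stability combination, as the Notes describe. The function names, the row layout `(nuclide, height, stability, distance, dose)`, and the grid step are illustrative assumptions, not taken from the authors' code.

```python
from bisect import bisect_right
from collections import defaultdict


def monotone_tangents(x, y):
    """Knot slopes from a weighted harmonic mean of neighbouring secants."""
    n = len(x)
    h = [x[i + 1] - x[i] for i in range(n - 1)]
    d = [(y[i + 1] - y[i]) / h[i] for i in range(n - 1)]
    m = [0.0] * n
    m[0], m[-1] = d[0], d[-1]  # one-sided slopes at the two ends
    for i in range(1, n - 1):
        # Slope stays zero at local extrema and flat spots, which prevents overshoot.
        if d[i - 1] * d[i] > 0:
            w1 = 2 * h[i] + h[i - 1]
            w2 = h[i] + 2 * h[i - 1]
            m[i] = (w1 + w2) / (w1 / d[i - 1] + w2 / d[i])
    return m


def interpolate_monotone(x, y, queries):
    """Evaluate the cubic Hermite interpolant at query points in [x[0], x[-1]]."""
    m = monotone_tangents(x, y)
    values = []
    for q in queries:
        # Locate the interval [x[i], x[i+1]] that contains q.
        i = min(max(bisect_right(x, q) - 1, 0), len(x) - 2)
        h = x[i + 1] - x[i]
        t = (q - x[i]) / h
        h00 = 2 * t**3 - 3 * t**2 + 1
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        values.append(
            h00 * y[i] + h10 * h * m[i] + h01 * y[i + 1] + h11 * h * m[i + 1]
        )
    return values


def densify(rows, step):
    """Interpolate dose along distance, one group at a time.

    rows: iterable of (nuclide, height, stability, distance, dose) tuples
    (a hypothetical layout). Each group needs at least two distinct distances.
    """
    groups = defaultdict(list)
    for nuclide, height, stability, distance, dose in rows:
        groups[(nuclide, height, stability)].append((distance, dose))
    dense = []
    for key, points in groups.items():
        points.sort()
        x = [p[0] for p in points]
        y = [p[1] for p in points]
        n_steps = int((x[-1] - x[0]) // step)
        grid = [x[0] + k * step for k in range(n_steps + 1)]
        for q, value in zip(grid, interpolate_monotone(x, y, grid)):
            dense.append(key + (q, value))
    return dense


# Made-up illustrative values, not taken from the dataset.
distances = [100, 200, 500, 1000, 2000]
doses = [8.0, 5.0, 2.0, 0.9, 0.3]
rows = [("Cs-137", 10, "D", xd, yd) for xd, yd in zip(distances, doses)]
dense_rows = densify(rows, step=50)
```

Because the tangents are limited by the neighbouring secant slopes, the interpolated curve passes through every original point and does not introduce new maxima or minima between them. SciPy's `PchipInterpolator` is a library implementation from the same family of methods.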
Contents
Trained machine learning models
- xgboost_model_final_ep100_dep30.json: Trained XGBoost regression model
- random_forest_final_ep100_dep15.pkl: Trained Random Forest regression model
- tabnet_final.pkl: Trained TabNet deep learning model
Preprocessing and encoding artifacts
- scaler.pkl: Feature scaling object used during model training
- label_encoders.pkl: Label encoders for categorical variables (radionuclide and stability category)
Datasets
- filtered_distance_2000_height_200_train_99.csv: Original discrete training dataset
- filtered_distance_2000_height_200_test_1.csv: Independent test dataset (real, non-interpolated)
- finer_interpolated_data_filtered_distance_2000_height_200_train_99.csv: Distance-wise interpolated training dataset
- finer_interpolated_data_filtered_distance_2000_height_200_train_9975.csv: Interpolated dataset with additional augmentation
- finer_interpolated_data_filtered_distance_2000_height_200_test_0025.csv: Held-out validation dataset
Notes
- Interpolated datasets were generated using physically consistent one-dimensional interpolation along the downwind distance axis, performed separately for each radionuclide–release height–stability category combination.
- The independent test dataset contains only original (non-interpolated) samples and was not used during model training.
- The trained models and preprocessing objects are provided to ensure full reproducibility of the reported results.
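One way to sanity-check the separation between training and test data after download is to look for rows that appear in both files. The sketch below compares whole rows as text, so it assumes nothing about the column names, which this record does not list. The helper names are hypothetical, and an empty result only shows that no row is duplicated verbatim.

```python
import csv


def read_rows(path):
    """Return the header and the set of data rows (as string tuples) of a CSV file."""
    with open(path, newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader)
        return header, {tuple(row) for row in reader}


def shared_rows(train_path, test_path):
    """Rows present verbatim in both files; an empty set means no exact overlap."""
    train_header, train = read_rows(train_path)
    test_header, test = read_rows(test_path)
    if train_header != test_header:
        raise ValueError("column layouts differ; compare a common subset instead")
    return train & test
```

For example, `shared_rows("filtered_distance_2000_height_200_train_99.csv", "filtered_distance_2000_height_200_test_1.csv")` would be expected to return an empty set if the split is disjoint.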
Files
Total size: 765.2 MB

| MD5 checksum | Size |
|---|---|
| 5aebab38bf5a6bfc45721b97525e1d67 | 35.5 kB |
| 60bcea911ac9577b88154e5d190006cd | 3.4 MB |
| 2a943f195f62a27f24a5ad7708aa5e82 | 56.9 kB |
| ec61aea7ce24cbb791139f4c55d6c0aa | 219.7 MB |
| 2e3a4af4f433a6dcf67cbf5c3db04cd4 | 227.2 MB |
| e8c3eb442f490f4de81723c17d48f1be | 557 Bytes |
| 1aebe6c630013e29f1c3bb4f75dad9e3 | 156.6 MB |
| ab3e5df50fa42e2e139784bfb241730a | 524 Bytes |
| 09be205cf244854b462716b7f0c3a585 | 1.5 MB |
| b85185e3132d1631a8d1179d90962af2 | 156.7 MB |
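The MD5 checksums published with the record can be used to confirm that a download is complete and uncorrupted. The following is a minimal sketch using only the Python standard library; the function names and the chunk size are illustrative choices, and the checksum to compare against must be read from the record for the specific file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected):
    """Compare a file against a checksum given as 'md5:<hex>' or plain hex."""
    expected = expected.strip().lower()
    if expected.startswith("md5:"):
        expected = expected[4:]
    return md5_of_file(path) == expected
```

Reading in chunks matters here because several files in the record exceed 150 MB.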
Additional details
Software
- Repository URL: https://github.com/BiswajitSadhu/PlumeDoseNet
- Programming language: Python
- Development Status: Active