Dataset for "Machine learning predictions on an extensive geotechnical dataset of laboratory tests in Austria"
Contributors
Data collector (3):
Description
This dataset comprises over 20 years of geotechnical laboratory testing data collected primarily from Vienna, Lower Austria, and Burgenland. It includes 24 features documenting critical soil properties derived from particle size distributions, Atterberg limits, Proctor tests, permeability tests, and direct shear tests. Locations for a subset of samples are provided, enabling spatial analysis.
The dataset is a valuable resource for geotechnical research and education, allowing users to explore correlations among soil parameters and develop predictive models. Examples of such correlations include liquidity index with undrained shear strength, particle size distribution with friction angle, and liquid limit and plasticity index with residual friction angle.
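A correlation of the kind mentioned above can be explored with a few lines of Python. The following is a minimal sketch using only the standard library: it computes a Pearson coefficient between two columns, skipping rows where either value is missing. The column names (`liquid_limit`, `plasticity_index`, `residual_friction_angle`) and all numbers are invented for illustration and are not taken from the dataset; the actual CSV header should be checked before adapting it.

```python
import csv
import io
import math

# Hypothetical excerpt: the column names and values below are invented for
# illustration and are NOT taken from Zenodo_DATA_Soranzo.csv.
SAMPLE_CSV = """liquid_limit,plasticity_index,residual_friction_angle
35,15,28
48,25,22
62,38,
75,50,12
41,,25
90,62,9
"""


def read_column_pairs(text, x_name, y_name):
    """Return (x, y) value pairs, skipping rows where either cell is empty."""
    pairs = []
    for row in csv.DictReader(io.StringIO(text)):
        x_raw, y_raw = row[x_name].strip(), row[y_name].strip()
        if x_raw and y_raw:
            pairs.append((float(x_raw), float(y_raw)))
    return pairs


def pearson_r(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


pairs = read_column_pairs(SAMPLE_CSV, "liquid_limit", "residual_friction_angle")
r = pearson_r(pairs)
print(f"n = {len(pairs)}, r = {r:.3f}")
```

Dropping incomplete rows pair by pair keeps as many samples as possible for each correlation, at the cost of each coefficient being based on a different subset of the data.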
Python-based exploratory data analysis and machine learning applications have demonstrated the dataset's potential for predictive modeling, achieving moderate accuracy for parameters such as cohesion and friction angle. Its temporal and spatial breadth, combined with repeated testing, enhances its reliability and applicability for benchmarking and validating analytical and computational geotechnical methods.
This dataset is intended for researchers, educators, and practitioners in geotechnical engineering. Potential use cases include refining empirical correlations, training machine learning models, and advancing soil mechanics understanding. Users should note that preprocessing steps, such as imputation for missing values and outlier detection, may be necessary for specific applications.
Key Features:
- Temporal Coverage: Over 20 years of data.
- Geographical Coverage: Vienna, Lower Austria, and Burgenland.
- Tests Included:
  - Particle Size Distribution
  - Atterberg Limits
  - Proctor Tests
  - Permeability Tests
  - Direct Shear Tests
- Number of Variables: 24
- Potential Applications: Correlation analysis, predictive modeling, and geotechnical design.
Technical Details:
- In previous studies, missing values were imputed with K-Nearest Neighbors (KNN) imputation and anomalies were identified with the Local Outlier Factor (LOF) method.
- Normalization or standardization is recommended for analyses that are sensitive to feature scale.
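The preprocessing steps named above could be chained roughly as follows. This is a sketch under stated assumptions, not the procedure used in the previous studies: it assumes scikit-learn and NumPy are available, runs on synthetic random data rather than the dataset, and the neighbor counts and the order of the steps (impute, then flag outliers, then standardize) are illustrative choices.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for three numeric soil parameters; the values are random
# and are NOT drawn from the dataset.
rng = np.random.default_rng(0)
X = rng.normal(loc=[40.0, 20.0, 25.0], scale=[5.0, 3.0, 2.0], size=(60, 3))
X[5, 1] = np.nan               # simulate a missing laboratory result
X[17, 2] = np.nan
X[-1] = [150.0, 90.0, 80.0]    # simulate one grossly anomalous sample

# 1) Fill each gap with the mean of the k nearest rows (k = 5 is an assumption).
X_imputed = KNNImputer(n_neighbors=5).fit_transform(X)

# 2) Flag anomalies: fit_predict returns -1 for outliers and 1 for inliers.
labels = LocalOutlierFactor(n_neighbors=20).fit_predict(X_imputed)
X_clean = X_imputed[labels == 1]

# 3) Standardize each column to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X_clean)

print(f"rows kept: {X_clean.shape[0]} of {X.shape[0]}")
```

For predictive modeling, the imputer and scaler would normally be fitted on the training split only and then applied to the test split, so that no information leaks from the held-out samples.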
Acknowledgments:
The dataset was compiled with support from the European Union's MSCA Staff Exchanges project 101182689 Geotechnical Resilience through Intelligent Design (GRID).
Files
| Name | Size | Checksum |
|---|---|---|
| Zenodo_DATA_Soranzo.csv | 94.3 kB | md5:46a970fcabd95dd543609f73fba1192f |
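The MD5 checksum listed for the file can be used to confirm that a download is intact. A minimal sketch using only the Python standard library follows; the helper names `md5_of_file` and `matches_published_checksum` are hypothetical, and the local file path is an assumption about where the CSV was saved.

```python
import hashlib

# Checksum as listed on this record for Zenodo_DATA_Soranzo.csv.
PUBLISHED_MD5 = "46a970fcabd95dd543609f73fba1192f"


def md5_of_file(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected=PUBLISHED_MD5):
    """True if the file at `path` has the expected MD5 digest."""
    return md5_of_file(path) == expected.lower()
```

Assuming the file was saved to the working directory, `matches_published_checksum("Zenodo_DATA_Soranzo.csv")` should return `True` for an unmodified download.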
Additional details
Dates
- Available: 2024-11-30