A predictive machine learning framework for detecting important variables
Description
This code implements a predictive analysis framework designed to avoid common pitfalls in predictive ML and to produce robust, generalizable, and comparable results. The initial version was introduced in the following PhD thesis:
Jauhiainen, Susanne. "Potential of predictive modeling methods for individual response: applications and guidelines for sports sciences." JYU Dissertations (2023).
It has also been used in the following publications:
Jauhiainen, Susanne, et al. "New machine learning approach for detection of injury risk factors in young team sport athletes." International Journal of Sports Medicine 42.02 (2021): 175-182.
Jauhiainen, Susanne, et al. "Predictors of extreme behaviors of physical activity measured with accelerometer - a predictive machine learning approach." Submitted/under review (2026).
The framework is model-agnostic and the balanced random forest can be replaced with other preferred methods. As an example, it is initialized to run with the Breast Cancer Wisconsin (Diagnostic) data from the UCI repository.
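As a rough illustration of what "model-agnostic" can mean in practice, the sketch below passes the model in as a factory function, so swapping the classifier is a one-line change. It is not the code from this record: the function names `run_framework` and `make_balanced_forest` are hypothetical, the cross-validation and permutation-importance steps are assumptions about how such a framework might rank variables, and scikit-learn's `RandomForestClassifier` with `class_weight="balanced_subsample"` is used as a stand-in for a balanced random forest. The data are scikit-learn's bundled copy of the Breast Cancer Wisconsin (Diagnostic) dataset.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold


def make_balanced_forest():
    # Stand-in (assumption): class-weighted random forest from scikit-learn,
    # not necessarily the balanced random forest used in the record's code.
    return RandomForestClassifier(
        n_estimators=100, class_weight="balanced_subsample", random_state=0
    )


def run_framework(make_model, n_splits=5, seed=0):
    """Hypothetical helper: cross-validated AUC plus a variable ranking."""
    data = load_breast_cancer()
    X, y = data.data, data.target
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)

    aucs, importances = [], []
    for train_idx, test_idx in cv.split(X, y):
        # A fresh model per fold, fitted on the training part only.
        model = make_model()
        model.fit(X[train_idx], y[train_idx])

        # Performance is measured on held-out data.
        proba = model.predict_proba(X[test_idx])[:, 1]
        aucs.append(roc_auc_score(y[test_idx], proba))

        # Importance is also measured on held-out data.
        result = permutation_importance(
            model, X[test_idx], y[test_idx],
            scoring="roc_auc", n_repeats=5, random_state=seed,
        )
        importances.append(result.importances_mean)

    # Average the per-fold importances and sort variables by that average.
    mean_importance = np.mean(importances, axis=0)
    ranking = sorted(
        zip(data.feature_names, mean_importance),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return float(np.mean(aucs)), ranking


if __name__ == "__main__":
    mean_auc, ranking = run_framework(make_balanced_forest)
    print(f"Mean cross-validated AUC: {mean_auc:.3f}")
    for name, score in ranking[:5]:
        print(f"{name}: {score:.4f}")
```

To try another method under the same assumptions, pass a different factory, for example `run_framework(lambda: LogisticRegression(max_iter=5000))` after importing `LogisticRegression` from `sklearn.linear_model`.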
Files
Total size: 7.6 kB
| Checksum (MD5) | Size |
|---|---|
| a3b4a4308e8bf9d21383f22a287add40 | 3.3 kB |
| 8cb85ee3f578950012069b8a620be8ea | 2.7 kB |
| 5c461bfe92682cb5abbfca03398a6c09 | 1.7 kB |
Additional details
Software
- Programming language: Python
References
- Jauhiainen, Susanne. "Potential of predictive modeling methods for individual response: applications and guidelines for sports sciences." JYU Dissertations (2023).
- Jauhiainen, Susanne, et al. "New machine learning approach for detection of injury risk factors in young team sport athletes." International Journal of Sports Medicine 42.02 (2021): 175-182.
- Jauhiainen, Susanne, et al. "Predictors of extreme behaviors of physical activity measured with accelerometer - a predictive machine learning approach." Submitted/under review (2026).