Published July 3, 2023 | Version v1
Journal article | Open access

MDI+: A Flexible Random Forest-Based Feature Importance Framework

  • 1. University of California, Berkeley
  • 2. National University of Singapore

Description

Datasets used in "MDI+: A Flexible Random Forest-Based Feature Importance Framework".

  • Juvenile dataset
    • Covariates (processed): X_juvenile_cleaned.csv
    • Response: y_juvenile.csv
  • Splicing dataset
    • Covariates (processed): X_splicing_cleaned.csv
    • Response: y_splicing.csv
  • Enhancer dataset
    • Covariates (processed for real-data-inspired simulations): X_enhancer_cleaned.csv
    • Covariates (processed for prediction experiments): X_enhancer_all.csv
    • Response: y_enhancer.csv
  • Cancer Cell Line Encyclopedia (CCLE) gene expression dataset
    • Covariates (processed for real-data-inspired simulations): X_ccle_rnaseq_cleaned.csv
    • Covariates (processed for case study): X_ccle_rnaseq_cleaned_filtered5000.csv
    • Response: y_ccle_rnaseq.csv
  • The Cancer Genome Atlas (TCGA) breast cancer dataset
    • Covariates (processed for case study): X_tcga_cleaned.csv
    • Response: Y_tcga.csv
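
For scripted access, the listing above can be mirrored as a small lookup table. The following is only a sketch: the file names are copied from the list, but the short dataset keys, the `DATASET_FILES` name, the `dataset_paths` helper, and the assumption that all files sit in one local directory are hypothetical and not part of the record.

```python
from pathlib import Path

# File names are copied from the listing above.
# The short keys ("juvenile", "ccle", ...) are hypothetical labels.
DATASET_FILES = {
    "juvenile": {"X": ["X_juvenile_cleaned.csv"], "y": "y_juvenile.csv"},
    "splicing": {"X": ["X_splicing_cleaned.csv"], "y": "y_splicing.csv"},
    "enhancer": {
        "X": ["X_enhancer_cleaned.csv", "X_enhancer_all.csv"],
        "y": "y_enhancer.csv",
    },
    "ccle": {
        "X": [
            "X_ccle_rnaseq_cleaned.csv",
            "X_ccle_rnaseq_cleaned_filtered5000.csv",
        ],
        "y": "y_ccle_rnaseq.csv",
    },
    "tcga": {"X": ["X_tcga_cleaned.csv"], "y": "Y_tcga.csv"},
}


def dataset_paths(name, data_dir="."):
    """Return (list of covariate paths, response path) for one dataset.

    Assumes every file was downloaded into the single directory `data_dir`.
    """
    entry = DATASET_FILES[name]
    root = Path(data_dir)
    return [root / fname for fname in entry["X"]], root / entry["y"]
```

For example, `dataset_paths("enhancer", "data")` returns two covariate paths and one response path. Which covariate file to use depends on the experiment, as noted in the listing.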

Files

Files (427.3 MB)

MD5 checksum | Size
md5:f814fd3e8f1425a4da4802e53832fa18 | 264.5 MB
md5:4bdac97c554ed0c30acffdd1b0b3695b | 40.3 MB
md5:9d585b3f0ba7d9f9d511a91fbf28f997 | 7.5 MB
md5:8c8142a970fb44847c5d6e7bf1bffcca | 3.4 MB
md5:528e27bdca81be5549ec6b22b771b608 | 2.0 MB
md5:f0d576336994ef473158084729fe0326 | 18.6 MB
md5:163eaae2d0860eed53a0e497b705d2f9 | 90.8 MB
md5:bce97e78fff06365e3b2c17abc52a6ee | 75.8 kB
md5:db1d1e20a2eecf2ba61e81e3361db616 | 15.6 kB
md5:56a0943e12a0204b72928455e0ed2072 | 36.4 kB
md5:dd0aa540adada61e56a8f6c15d4c7726 | 95.3 kB
md5:ac8ee2443e7be20ac4b0a968160480b8 | 7.9 kB