EVENFLOW Personalised Medicine use case: Clear Cell Renal Cell Carcinoma (ccRCC, aka KIRC) datasets collection
Description
These datasets contain synthetic data generated with a variational autoencoder (VAE) from the TCGA-KIRC dataset. The file Static_KIRC.csv contains a pre-processed version of the bulk RNA-Seq dataset from TCGA-KIRC. The file nodes_metadata.csv contains the clinical information of the patients (columns: sample name, gender, race, and stage). A compressed synthetic dataset, trajectories_forward_test.csv.tar.gz, simulates the progression of RNA-Seq data between patients at early and late stages across 50 inferred timepoints. Both the static and trajectory datasets were analyzed for biological interpretation with a differential expression (DESeq) analysis followed by Gene Set Enrichment Analysis (GSEA) on all the pathways available in the Reactome database. The results of the static analysis are included in static_gsea_reports_kirc.csv, and the results for the trajectories are in trajectories_gsea_reports_kirc.csv.tar.gz.
The trained models used for the synthetic data generation are available on Hugging Face.
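Because the two trajectory files are distributed as large .csv.tar.gz archives, it can help to read them without unpacking them to disk first. The following is a minimal sketch using only the Python standard library. It assumes that each archive contains a single UTF-8 encoded CSV member; that layout is not stated in this record and should be checked against the actual files. The function name `iter_csv_rows_from_tar` is hypothetical and is not part of the renalprog repository.

```python
import csv
import io
import tarfile
from typing import Iterator, List


def iter_csv_rows_from_tar(archive_path: str) -> Iterator[List[str]]:
    """Yield the rows of the first .csv member of a .tar.gz archive.

    Rows are streamed one at a time, so the whole table never has to
    fit in memory. Assumption: the archive holds one UTF-8 CSV file.
    """
    with tarfile.open(archive_path, mode="r:gz") as archive:
        for member in archive:
            if member.isfile() and member.name.endswith(".csv"):
                raw = archive.extractfile(member)
                # Decode the binary stream as text for the csv module.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                yield from csv.reader(text)
                return
    raise FileNotFoundError(f"no .csv member found in {archive_path}")
```

For example, `next(iter_csv_rows_from_tar("trajectories_forward_test.csv.tar.gz"))` would return the first row of the archive, which is presumably the header, without reading the rest of the file.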
Files
(1.4 GB)
| MD5 checksum | Size |
|---|---|
| 645039f1c12d67832743f742f74976f2 | 36.8 kB |
| 8e229e18ba1383906a87a968eda2fd15 | 104.5 MB |
| 66b3d6d757f7f0f7b14654940630a2e0 | 22.8 MB |
| 22d473df4804f44478665e0c375bd770 | 490.0 MB |
| 5c23d97ad37598fa2f5ae348c5952c9b | 736.3 MB |
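The MD5 checksums listed for the files in this record can be used to confirm that a download completed without corruption. The sketch below computes the MD5 digest of a local file in chunks and checks whether it matches any of the listed values. Since the listing here does not pair checksums with file names, the check is against the whole set rather than a specific file; the names `LISTED_MD5`, `md5_of_file`, and `matches_listed_checksum` are hypothetical.

```python
import hashlib

# MD5 digests copied from the file listing of this record.
LISTED_MD5 = frozenset({
    "645039f1c12d67832743f742f74976f2",
    "8e229e18ba1383906a87a968eda2fd15",
    "66b3d6d757f7f0f7b14654940630a2e0",
    "22d473df4804f44478665e0c375bd770",
    "5c23d97ad37598fa2f5ae348c5952c9b",
})


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex MD5 digest of a file, read in 1 MiB chunks by default."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listed_checksum(path: str) -> bool:
    """Return True if the file's MD5 digest is one of the listed digests."""
    return md5_of_file(path) in LISTED_MD5
```

Reading in chunks keeps memory use constant, which matters for the two largest files (490.0 MB and 736.3 MB).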
Additional details
Funding
Dates
- Created: 2025-12-19
Software
- Repository URL: https://github.com/gprolcastelo/renalprog
- Programming language: Python, R
- Development Status: Active