Dataset Open Access

Metascape Results for Prostate Cancer Multiomics Data

Zhandos Sembay

ABSTRACT 

The large p, small n problem is a challenging problem in big data analytics, and no de facto standard method is available for it. In this study, we propose a tensor decomposition (TD) based unsupervised feature extraction (FE) formalism applied to multiomics datasets in which the number of features exceeds 100,000 while the number of instances is only about 100. The proposed TD based unsupervised FE outperformed conventional supervised feature selection methods, such as random forest, categorical regression (also known as analysis of variance, ANOVA), and penalized linear discriminant analysis, when applied to both multiomics datasets and synthetic datasets. Genes selected by TD based unsupervised FE were biologically reliable. TD based unsupervised FE turned out to be not only a superior feature selection method but also one that can select biologically reliable genes.

Instructions: 

This is a supplementary file for a paper submitted to bigdata2020.

 

Inspiration:

This dataset was uploaded to U-BRITE for the "AI against CANCER DATA SCIENCE HACKATHON".

https://cancer.ubrite.org/hackathon-2021/

Acknowledgements

Y-h. Taguchi, July 17, 2020, "Metascape results for Prostate cancer multiomics data", IEEE Dataport, doi: https://dx.doi.org/10.21227/rdmb-jm40.

https://ieee-dataport.org/documents/metascape-results-prostate-cancer-multiomics-data

U-BRITE last update date: 07/21/2021

U-BRITE location: /data/project/ubrite/cancer-hackathon/org/ieee-dataport/metascape-results-prostate-cancer-multiomics-data

Files (8.5 MB)

Name: all.tix6y75jj.zip
Size: 8.5 MB
md5: 37584c901833bfec77c470ebe2bb1e65
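
The listed md5 checksum can be used to confirm that a downloaded copy of the archive is intact. The following is a minimal sketch, not part of the original record: the function names are hypothetical, and it assumes the archive was saved under its listed name in the current working directory.

```python
import hashlib
from pathlib import Path

# Checksum as listed for all.tix6y75jj.zip on this record.
EXPECTED_MD5 = "37584c901833bfec77c470ebe2bb1e65"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex md5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True if the file's md5 digest matches the expected checksum."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed local path; adjust to wherever the archive was saved.
    archive = Path("all.tix6y75jj.zip")
    if archive.exists():
        print("checksum OK" if verify_download(archive) else "checksum MISMATCH")
    else:
        print(f"{archive} not found; download it first")
```

A mismatch usually indicates an incomplete or corrupted download, in which case fetching the archive again is the simplest remedy.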

                   All versions   This version
Views              180            180
Downloads          11             11
Data volume        93.0 MB        93.0 MB
Unique views       160            160
Unique downloads   11             11
