Published July 21, 2021 | Version v1
Dataset Open

Metascape Results for Prostate Cancer Multiomics Data




The large p, small n problem is a challenging problem in big data analytics, and no de facto standard method exists for it. In this study, we propose a tensor decomposition (TD) based unsupervised feature extraction (FE) formalism and apply it to multiomics datasets in which the number of features exceeds 100,000 while the number of instances is only about 100. The proposed TD based unsupervised FE outperformed conventional supervised feature selection methods, such as random forest, categorical regression (also known as analysis of variance, ANOVA), and penalized linear discriminant analysis, on both multiomics datasets and synthetic datasets. The genes selected by TD based unsupervised FE were also biologically reliable. TD based unsupervised FE thus proved to be both a superior feature selection method and one that selects biologically reliable genes.
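To illustrate the large p, small n setting described above, here is a minimal Python sketch of one way a TD based unsupervised feature selection could be run on a features x samples x omics tensor. It is not the code used in the study. The function name `td_unsupervised_fe`, the synthetic data, the choice of the first singular vector, the Gaussian null assumption, and the threshold `alpha` are all assumptions made for illustration.

```python
import math
import numpy as np


def td_unsupervised_fe(tensor, component=0, alpha=0.01):
    """Select feature indices (axis 0) from a features x samples x omics tensor.

    Hypothetical helper: the name, defaults, and scoring rule are assumptions.
    """
    p = tensor.shape[0]
    # Mode-0 unfolding: one row per feature, one column per (sample, omics) pair.
    unfolded = tensor.reshape(p, -1)
    # The left singular vectors of this unfolding are the feature-mode
    # factors of a higher-order SVD of the tensor.
    u, _, _ = np.linalg.svd(unfolded, full_matrices=False)
    v = u[:, component]
    # Assumption: uninformative features are Gaussian along this vector,
    # so the squared standardized value is chi-squared with 1 d.o.f.
    z2 = ((v - v.mean()) / v.std()) ** 2
    # Upper-tail P-value of chi-squared (1 d.o.f.): erfc(sqrt(z2 / 2)).
    pvals = np.array([math.erfc(math.sqrt(x / 2.0)) for x in z2])
    # Benjamini-Hochberg adjustment of the P-values.
    order = np.argsort(pvals)
    ranked = pvals[order] * p / np.arange(1, p + 1)
    adjusted = np.minimum(np.minimum.accumulate(ranked[::-1])[::-1], 1.0)
    return np.sort(order[adjusted < alpha])


# Synthetic example: 2000 features, 20 samples, 2 omics layers.
rng = np.random.default_rng(0)
tensor = rng.normal(size=(2000, 20, 2))
# The first 20 features differ between samples 0-9 and samples 10-19.
tensor[:20, 10:, :] += 5.0

selected = td_unsupervised_fe(tensor)
print(selected)
```

In this sketch no class labels are passed to the selection step, which is what makes it unsupervised. With real data, the component to use would have to be chosen by inspecting which sample-mode singular vector separates the conditions of interest; fixing `component=0` only works here because the synthetic noise has zero mean.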


This is a supplementary file for a paper submitted to bigdata2020.



This dataset was uploaded to U-BRITE for the "AI against CANCER DATA SCIENCE HACKATHON".


Y-h. Taguchi, July 17, 2020, "Metascape results for Prostate cancer multiomics data", IEEE Dataport, doi:

U-BRITE last update date: 07/21/2021


U-BRITE location: /data/project/ubrite/cancer-hackathon/org/ieee-dataport/metascape-results-prostate-cancer-multiomics-data


Files (8.5 MB)
