Published April 21, 2025 | Version v1
Dataset Open

Bimodal genomic approach predicting Semaphorin 7A (SEMA7A) as prognostic biomarker in adrenocortical carcinoma

  • 1. Developmental Therapeutics Branch & Laboratory of Molecular Pharmacology, Center for Cancer Research, National Cancer Institute, NIH, Bethesda, MD 20892
  • 2. Computational Biology Branch, National Library of Medicine, NIH, Bethesda, MD 20892
  • 3. HiThru Analytics, LLC

Description

Bimodal Gene Detection using Gaussian Mixture Modeling in Tumor Expression Data.

Description:

This project provides an R-based analytical pipeline designed to identify genes with bimodal expression patterns across tumor samples. Bimodal expression may indicate tumor heterogeneity, subtype-specific gene regulation, or clinically relevant expression shifts.

The pipeline uses:

  • Gaussian Mixture Modeling (GMM) to fit a two-component Gaussian mixture to the expression values of each gene

  • Hartigans’ dip test to test the expression distribution of each gene for departure from unimodality
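
As an illustration of how the two steps above can be combined for a single gene, here is a minimal sketch. It is not taken from code.R: the helper name score_gene, the simulated expression vectors, and the choice of summary columns are assumptions made for the example. norMixEM() (nor1mix) and dip.test() (diptest) are functions from the packages listed below.

```r
library(diptest)   # dip.test(): dip test of unimodality
library(nor1mix)   # norMixEM(): EM fit of a univariate normal mixture

# Hypothetical helper (not from code.R): summarise one gene's expression vector.
score_gene <- function(x) {
  x <- x[is.finite(x)]                    # drop NA / Inf values
  fit <- norMixEM(x, m = 2, trace = 0)    # two-component Gaussian mixture
  par <- unclass(fit)                     # matrix with columns mu, sigma, w
  data.frame(
    mu_low     = min(par[, "mu"]),        # mean of the lower component
    mu_high    = max(par[, "mu"]),        # mean of the upper component
    min_weight = min(par[, "w"]),         # mixing weight of the smaller component
    dip_p      = dip.test(x)$p.value      # small p-value: evidence against unimodality
  )
}

# Simulated data (assumption): one clearly bimodal and one unimodal gene.
set.seed(1)
bimodal  <- c(rnorm(60, mean = 2, sd = 0.5), rnorm(40, mean = 8, sd = 0.5))
unimodal <- rnorm(100, mean = 5, sd = 1)

res_bi  <- score_gene(bimodal)
res_uni <- score_gene(unimodal)
print(rbind(bimodal = res_bi, unimodal = res_uni))
```

A small dip-test p-value together with well-separated component means and a non-negligible minimum weight points to a candidate bimodal gene. How the actual script combines these criteria is not stated in this description.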

Required R Packages:

  • readxl
  • tidyverse
  • diptest
  • nor1mix

You can install the required packages in R using:

install.packages(c("readxl", "tidyverse", "diptest", "nor1mix"))
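
To avoid reinstalling packages that are already present, one option is to check first which ones are missing. The sketch below uses only base R; the helper name missing_pkgs is hypothetical and not part of the project.

```r
# Hypothetical helper: return the subset of `pkgs` that is not installed.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

required <- c("readxl", "tidyverse", "diptest", "nor1mix")
to_install <- missing_pkgs(required)

if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install)   # uncomment to install only what is missing
} else {
  message("All required packages are installed.")
}
```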

How to Run:

Ensure the expression matrix file, Gene_Expression.csv, is placed in the same directory as the script (code.R). Then run:

Rscript code.R

This generates a ranked list of genes and the accompanying plots for downstream analysis or interpretation.
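
For orientation, the sketch below shows one way a ranked list and a plot could be produced from an expression matrix. It is an illustration under stated assumptions, not the logic of code.R: the matrix is simulated, genes are assumed to be in rows and samples in columns, the ranking uses only the dip-test p-value, and the gene names and output path are made up for the example.

```r
library(diptest)

set.seed(42)
n_samples <- 80

# Simulated expression matrix (assumed layout: genes in rows, samples in columns).
expr <- rbind(
  GENE_A = rnorm(n_samples, mean = 5, sd = 1),
  GENE_B = c(rnorm(40, mean = 2, sd = 0.4), rnorm(40, mean = 7, sd = 0.4)),
  GENE_C = rnorm(n_samples, mean = 3, sd = 0.8)
)
colnames(expr) <- paste0("S", seq_len(n_samples))

# Dip-test p-value per gene, then rank from most to least evidence of multimodality.
dip_p <- apply(expr, 1, function(x) dip.test(x)$p.value)
ranked <- data.frame(gene = names(dip_p), dip_p = unname(dip_p))
ranked <- ranked[order(ranked$dip_p), ]
rownames(ranked) <- NULL
print(ranked)

# Histogram of the top-ranked gene, written to a temporary PDF.
top <- ranked$gene[1]
out_file <- file.path(tempdir(), paste0(top, "_hist.pdf"))
pdf(out_file)
hist(expr[top, ], breaks = 30, main = top, xlab = "Expression")
dev.off()
```

With the real data, the simulated matrix would be replaced by the contents of Gene_Expression.csv; its exact column layout is not described here and should be checked before adapting the sketch.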

Files

Gene_Expression.csv

Files (80.8 MB)

  • md5:01a7f975b8785ef914415567c09c25f6 (2.8 kB)
  • md5:cf215592eb6bb43526f2635bee30d981 (80.8 MB)

Additional details

Software

Programming language
R