THE ACTIVITY OF TELOMERE LENGTH MAINTENANCE MECHANISMS AND THE TELOMERE LENGTH DYNAMICS
Contributors
Supervisor:
Description
The dataset contains the R scripts and processed data for the TCGA pan-cancer and glioma telomere maintenance analysis.
The dataset is organized as follows:
ALT_TEL_ATRX_TERT_mutations: folder provides mutation datasets for the alternative lengthening of telomeres (ALT) and telomerase-dependent (TEL) pathways, with a focus on ATRX and TERT gene alterations. It includes data preprocessing workflows and scripts for comparative analysis of ALT and TEL pathway activities.
Clinical_branch_info_ALT_TEL: folder provides clinical stage annotations associated with ALT and TEL pathway branch activity for all cancer types. It includes scripts for data preprocessing and downstream analysis.
Clinical_data_branch_info: folder contains clinical stage annotations linked to ALT and TEL pathway branch activity across multiple cancer types, analyzed individually. It includes scripts for data preprocessing and downstream analysis.
Gene_expression_ORA: folder contains differential gene expression and over-representation analysis (ORA) results for all cancer types, analyzed separately. It also includes R scripts used for these analyses.
Gene_expression_ORA_top: folder provides the top 50 differential gene expression and over-representation analysis (ORA) results for each phenotype across all cancer types, analyzed individually. It includes accompanying R scripts for data processing and analysis.
LGG
This folder contains subfolders with datasets and analyses related to lower-grade glioma (LGG):
· CGGA: Contains data from the Chinese Glioma Genome Atlas (CGGA) used for supporting analyses, along with R scripts for analysis.
· GSE124180: Includes the GEO dataset GSE124180 (chronic obstructive pulmonary disease, COPD), used to validate the telomere maintenance mechanism (TMM) method, along with R scripts for analysis.
· Telomere_length_IDH_status: Contains IDH status data from two independent studies (IDH_status_Willsche and IDH_status_Ceccarelli), along with telomere length (TL) ratio data and R scripts for analysis.
· Telomere_length_IDH_subtype: Includes ATRX gene status and IDH subtype data, as well as R scripts for preprocessing.
· Telomere_length_PSF_branch: Contains telomere length data and a comparative analysis of pathway signal flow (PSF) activity in the ALT and TEL pathway branches, along with subtype data and R scripts.
· Telomere_length_survival: Includes telomere length and phenotype comparisons in survival analyses, with corresponding R scripts.
· Telomere_length_PSF: Provides data on ALT and TEL pathway activity and its correlation with telomere length, along with R scripts for analysis.
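The correlation between pathway activity and telomere length mentioned for Telomere_length_PSF can be sketched in R as follows. This is a minimal illustration, not the script shipped in the folder: the values are made up, the column names (tl_ratio, alt_psf) are hypothetical, and the use of Spearman rank correlation is an assumption.

```r
# Made-up example values; not taken from the dataset.
lgg <- data.frame(
  tl_ratio = c(0.8, 1.1, 1.5, 1.9, 2.4),      # hypothetical telomere length ratio
  alt_psf  = c(0.20, 0.35, 0.30, 0.60, 0.90)  # hypothetical ALT pathway activity
)

# Spearman rank correlation (assumed here; the dataset's scripts may use another test).
res <- cor.test(lgg$tl_ratio, lgg$alt_psf, method = "spearman")

# Extract the correlation coefficient and its p-value.
rho <- unname(res$estimate)
print(rho)
print(res$p.value)
```

A rank-based measure is used in this sketch because it does not assume a linear relationship between activity scores and telomere length; a Pearson correlation could be substituted by changing the method argument.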
MSS_MSI: folder provides microsatellite stability (MSS) and microsatellite instability (MSI) status annotations for relevant cancer types, together with clinical data and ALT/TEL pathway activity measurements. It includes accompanying R scripts for data processing and analysis.
Protein_Exp: folder provides protein expression data and comparative analyses with gene expression for available targets. It includes datasets and R scripts used for integrative analyses.
Protein_expression_significance: folder presents statistically significant results from phenotype-based comparisons of protein expression, along with associated datasets and R scripts for analysis.
SC_GBM_phenotyping: folder provides single-cell glioblastoma (GBM) data used for telomere maintenance mechanism (TMM) analysis, including cell type annotations and pseudobulk profiles. It includes accompanying R scripts for data preprocessing and downstream analyses.
Survival: folder provides pan-cancer survival analysis results stratified by TMM phenotypes, along with R scripts for analysis.
Survival_separately: folder provides survival analyses performed separately for each cancer type, stratified by TMM phenotypes, and accompanying R scripts.
TCGA_clinical_data: folder provides curated clinical datasets from The Cancer Genome Atlas (TCGA) for all analyzed cancer types.
TCGA_phenotyping_all: folder contains TCGA phenotyping data for pan-cancer analyses, with samples stratified by ALT and TEL pathway activity thresholds. It includes R scripts for processing.
TCGA_TMM_calculation: folder contains RNA-seq data for each cancer type, along with calculations of TMM activity and associated p-values for significance. It includes R scripts for data normalization, preprocessing, batch correction, and calculation of PSF and TMM scores.
TCGA_TMM_phenotyping_separately: folder provides TCGA phenotyping datasets for each cancer type separately, with samples stratified according to ALT and TEL pathway activity thresholds. Accompanying R scripts for data processing and analysis are included.
Tumor_purity: folder provides datasets on tumor purity and phenotype-based tumor stratification, along with accompanying R scripts for data processing and analysis.
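Several of the folders above stratify samples by ALT and TEL pathway activity thresholds. The idea can be sketched in R as below. This is only an illustration under stated assumptions: the scores, the cut-off values, the phenotype labels, and the helper classify_tmm are all hypothetical and are not taken from the dataset's scripts.

```r
# Hypothetical pathway activity scores for five samples (not real TCGA values).
scores <- data.frame(
  sample = c("S1", "S2", "S3", "S4", "S5"),
  alt    = c(0.9, 0.2, 0.8, 0.1, 0.5),
  tel    = c(0.1, 0.7, 0.9, 0.2, 0.5),
  stringsAsFactors = FALSE
)

# Assumed cut-offs; the thresholds used in the dataset are not stated in this description.
alt_threshold <- 0.5
tel_threshold <- 0.5

# Hypothetical helper: assign one of four illustrative phenotype labels
# depending on whether each score lies strictly above its cut-off.
classify_tmm <- function(alt, tel, alt_cut, tel_cut) {
  alt_high <- alt > alt_cut
  tel_high <- tel > tel_cut
  ifelse(alt_high & tel_high, "ALT_high_TEL_high",
    ifelse(alt_high, "ALT_high_TEL_low",
      ifelse(tel_high, "ALT_low_TEL_high", "ALT_low_TEL_low")))
}

scores$phenotype <- classify_tmm(scores$alt, scores$tel,
                                 alt_threshold, tel_threshold)

# Count how many samples fall into each phenotype group.
print(table(scores$phenotype))
```

Scores exactly equal to a cut-off are treated as low in this sketch; whether the actual analysis uses a strict or inclusive comparison is not specified here.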
Files
Data.zip
Additional details
Software
- Programming language
- R