Analysis pipeline and code for "Pathogenic germline variations and cancer risk in pediatric patients"
Description
project/
├── data/             # Data files directory
├── src/              # Source code directory
│   └── functions.R   # Custom functions
├── result/           # Results output directory
└── analysis_NBS.R    # Main analysis script
Overview
This project analyzes the association between genetic variants (SNVs and CNVs) and clinical phenotypes in tumor samples, focusing on how pathogenic variants affect the risk of tumor development. The analysis uses survival analysis, competing risk models, and descriptive statistics to assess how variant classification influences tumor outcomes.
Key Features
- Comprehensive variant analysis: Includes both SNV and CNV mutations classified as PLP (Pathogenic/Likely Pathogenic), VUS-LP (Variant of Uncertain Significance/Likely Pathogenic), and other variants
- Clinical correlation: Integrates clinical follow-up data to assess tumor development risk
- Incidence rate calculation: Computes tumor incidence rates per 1000 person-years
Analysis Pipeline
- Data preprocessing: Loading and integrating sample information, mutation data, and clinical records
- Descriptive statistics: Generating demographic and clinical characteristic tables
- Distribution analysis: Examining tumor type distribution across cohorts and variant groups
- Survival analysis:
  - Kaplan-Meier curves for tumor-free survival
  - Competing risk models distinguishing between benign and malignant tumors
- Incidence calculation: Computing and comparing tumor incidence rates
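The survival step can be sketched as follows. This is a minimal illustration, not the project's actual script: the `toy` data frame, its column names, and the event coding are hypothetical. It also uses only the `survival` package, whose `survfit()` returns Aalen-Johansen cumulative incidence estimates when the status is a factor. The project itself lists `cmprsk` and `tidycmprsk` for competing risk models.

```r
library(survival)

# Hypothetical toy cohort (not project data): follow-up time in years,
# event type, and variant group.
# event: 0 = censored (tumor-free at last follow-up), 1 = benign, 2 = malignant
toy <- data.frame(
  time  = c(1.2, 3.5, 2.0, 5.0, 4.1, 0.8, 6.0, 2.7, 3.3, 5.5),
  event = c(2, 0, 1, 0, 2, 2, 0, 1, 0, 0),
  group = rep(c("PLP", "other"), each = 5)
)

# Kaplan-Meier tumor-free survival: any tumor counts as the event
km_fit <- survfit(Surv(time, event > 0) ~ group, data = toy)
print(km_fit)

# Competing risks: a factor status whose first level is the censoring code
# makes survfit() estimate cumulative incidence separately for each event type
toy$event_f <- factor(toy$event, levels = 0:2,
                      labels = c("censored", "benign", "malignant"))
cr_fit <- survfit(Surv(time, event_f) ~ group, data = toy)
print(cr_fit$states)
```

Treating benign and malignant tumors as separate event types avoids censoring one when the other occurs, which is the usual motivation for a competing risk model over a plain Kaplan-Meier estimate.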
Output
- Tables: Descriptive statistics of sample characteristics
- Figures: Survival curves and cumulative incidence plots
- Incidence rates: Tumor occurrence rates per 1000 person-years by variant group
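As a rough illustration of the incidence output, a rate per 1000 person-years is the number of tumor events divided by the total follow-up time, multiplied by 1000. The sketch below uses base R only; the `follow_up` table and its column names are hypothetical and the numbers are made up for the example.

```r
# Hypothetical follow-up table (not project data): one row per sample
follow_up <- data.frame(
  group        = c("PLP", "PLP", "VUS-LP", "VUS-LP", "other", "other"),
  person_years = c(2.5, 4.0, 3.0, 5.0, 6.0, 4.5),
  tumor        = c(1, 0, 1, 0, 0, 0)  # 1 = tumor occurred during follow-up
)

# Sum events and follow-up time within each variant group
totals <- aggregate(cbind(tumor, person_years) ~ group,
                    data = follow_up, FUN = sum)

# Incidence rate per 1000 person-years
totals$rate_per_1000 <- 1000 * totals$tumor / totals$person_years
print(totals)
```

In this toy data the VUS-LP group has 1 event over 8 person-years, giving 125 per 1000 person-years.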
Requirements
- R (≥4.0.0)
- R packages: survival, survminer, dplyr, table1, openxlsx, gtsummary, cmprsk, tidycmprsk, ggsurvfit, RColorBrewer, ggpubr
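One way to check these requirements before running the pipeline is sketched below. It is not part of the published code; the variable names are illustrative, and it only reports what is missing rather than installing anything.

```r
# Packages listed in the requirements above
required <- c("survival", "survminer", "dplyr", "table1", "openxlsx",
              "gtsummary", "cmprsk", "tidycmprsk", "ggsurvfit",
              "RColorBrewer", "ggpubr")

# Check the R version against the stated minimum
r_ok <- getRversion() >= "4.0.0"

# Packages from the list that are not installed in the current library paths
missing_pkgs <- setdiff(required, rownames(installed.packages()))

if (!r_ok) message("R >= 4.0.0 is required; found ", getRversion())
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
}
```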
Usage
- Place input data files in the `data/` directory
- Run `main_analysis.R` (named `analysis_NBS.R` in the project layout above) to execute the complete analysis pipeline
- Find outputs in the `result/` directory
Notes
- Variant classification follows priority: PLP > VUS-LP > other
- Incident cohort samples with follow-up information are used for survival analysis
- Competing risk models account for both benign and malignant tumor events
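The priority rule (PLP > VUS-LP > other) can be read as: a sample carrying several variants is assigned to the highest-priority class among them. The sketch below shows one way to apply that rule in base R. The `variants` table, its column names, and the sample IDs are hypothetical, and samples with no variant calls are not handled here.

```r
# Hypothetical long-format variant table (not project data):
# one row per variant call
variants <- data.frame(
  sample_id = c("S1", "S1", "S2", "S3", "S3", "S4"),
  class     = c("VUS-LP", "PLP", "other", "VUS-LP", "other", "other")
)

# Classes ordered from highest to lowest priority
priority <- c("PLP", "VUS-LP", "other")

# Rank each call (1 = highest priority), then keep the best rank per sample
variants$rank <- match(variants$class, priority)
best_rank <- vapply(split(variants$rank, variants$sample_id), min, integer(1))

sample_group <- data.frame(
  sample_id = names(best_rank),
  group     = priority[best_rank]
)
print(sample_group)
```

Here S1 carries both a VUS-LP and a PLP call, so it is assigned to PLP.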
Files
paper_code.zip (5.5 kB, md5:963760ec3608d2391306a0e577bb5ec2)