Generate patient profiles from BRCA METABRIC data for subsequent logical modelling.
The cohort comprises more than 1800 patients with several kinds of data: exome sequencing, Copy Number Alterations (CNA), RNA expression and clinical annotations.
Most patients have all omics data:
We investigate the relations between RNA expression data and breast cancer (BC) subtypes. Subtypes have been defined with the PAM50 method. First, here is the distribution of BC subtypes across the cohort:
RNA expression data are then projected onto the PC1/PC2 plane of a Principal Component Analysis computed on PAM50 genes only (47 of the 50 genes in the PAM50 list are present in the expression data).
We need to assign a Boolean effect to each mutation: either 0 (inactivating) or 1 (activating). A mutation stays unassigned when there is no evidence for either effect.
Assignment methods and their respective influence:
## [1] "Sankey plots of mutation assignments depending on methods used"
## [1] "Sankey plots of mutation assignments depending on methods used (restricted to model-related nodes)"
Now we can summarize patient mutation profiles after processing. In the following plots we focus on model-related genes only.
For CNA, we have decided to focus on stringent amplifications/deletions corresponding to +2/-2 GISTIC results. We produce the same kind of plots.
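The stringent CNA rule described above can be sketched as follows. This is a minimal illustration, not the code used to produce this report: the function name `binarize_gistic` and the example scores are hypothetical, and it assumes GISTIC results are encoded as integers from -2 to +2, with only +2 and -2 retained.

```python
from typing import Dict, Optional


def binarize_gistic(score: int) -> Optional[int]:
    """Map a discrete GISTIC score to a Boolean effect.

    +2 (stringent amplification) -> 1
    -2 (stringent deletion)      -> 0
    anything else                -> None (left unassigned)
    """
    if score == 2:
        return 1
    if score == -2:
        return 0
    return None


# Hypothetical GISTIC scores for one patient (illustrative values only).
cna_scores: Dict[str, int] = {"ERBB2": 2, "RB1": -2, "MYC": 1, "TP53": 0, "CDKN2A": -1}

cna_profile = {gene: binarize_gistic(score) for gene, score in cna_scores.items()}
print(cna_profile)
```

Low-level gains and losses (+1/-1) are deliberately left unassigned here, which is what makes the rule stringent.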
RNA data are intrinsically continuous and therefore require preliminary processing. It is important to note that METABRIC expression data come from microarrays.
Now, what about the distribution of gene categories (Bimodal, Unimodal…) across the cohort?
## [1] "META assignments:"
Bimodal | Unimodal |
---|---|
29 | 24339 |
## [1] "META assignments for model-related nodes:"
110
Here are some distribution plots randomly picked from each category in the META cohort.
Depending on its distribution category, each gene can then be binarized:
## [1] "Bimodal example:"
## [1] "Unimodal example:"
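To make the bimodal case concrete, here is a minimal sketch of one possible way to binarize a bimodal gene. The report does not state which algorithm was actually used, so the one-dimensional two-means split below is an assumption chosen for simplicity, and the function name and expression values are hypothetical.

```python
from statistics import mean
from typing import List, Sequence


def two_means_threshold(values: Sequence[float], n_iter: int = 100) -> float:
    """Return a cut point separating the two modes of a bimodal sample.

    Assumption: a plain 1D two-means split, not necessarily the method
    used in the original analysis.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        raise ValueError("cannot split a constant sample")
    for _ in range(n_iter):
        cut = (lo + hi) / 2
        low = [v for v in values if v <= cut]
        high = [v for v in values if v > cut]
        new_lo, new_hi = mean(low), mean(high)
        if (new_lo, new_hi) == (lo, hi):
            break  # cluster centres are stable
        lo, hi = new_lo, new_hi
    return (lo + hi) / 2


# Hypothetical expression values for one bimodal gene across eight patients.
expression: List[float] = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]

threshold = two_means_threshold(expression)
binarized = [1 if value > threshold else 0 for value in expression]
print(threshold, binarized)
```

Unimodal genes have no natural gap between two modes, so a split of this kind would not be meaningful for them and they need a different rule.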
Before merging the independent profiles into multi-omics profiles, let’s have a look at the relations between data types.
In particular, is there any mutation/CNA binary inconsistency?
Patient | Gene | Mut | CNA |
---|---|---|---|
MTS-T1304 | NF1 | 0 | 1 |
MB-0607 | ERBB2 | 0 | 1 |
MB-3614 | RB1 | 0 | 1 |
MB-4332 | CDKN1B | 0 | 1 |
In case of ambiguity, priority is given to mutations over CNA.
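A table like the one above can be obtained by comparing two binarized layers patient by patient. The sketch below is only an illustration: the function name and the nested-dictionary layout are hypothetical, the two conflicting rows are taken from the table, and the `TP53` entries are invented to show that unassigned values are never reported as conflicts.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical layout: {patient: {gene: 0, 1 or None}}
Layer = Dict[str, Dict[str, Optional[int]]]


def list_inconsistencies(layer_a: Layer, layer_b: Layer) -> List[Tuple[str, str, int, int]]:
    """List (patient, gene, value_a, value_b) where both layers are assigned but disagree."""
    rows = []
    for patient in sorted(layer_a.keys() & layer_b.keys()):
        genes_a, genes_b = layer_a[patient], layer_b[patient]
        for gene in sorted(genes_a.keys() & genes_b.keys()):
            a, b = genes_a[gene], genes_b[gene]
            if a is not None and b is not None and a != b:
                rows.append((patient, gene, a, b))
    return rows


# ERBB2 and RB1 values come from the table; TP53 values are illustrative only.
mut: Layer = {"MB-0607": {"ERBB2": 0, "TP53": None}, "MB-3614": {"RB1": 0}}
cna: Layer = {"MB-0607": {"ERBB2": 1, "TP53": 0}, "MB-3614": {"RB1": 1}}

rows = list_inconsistencies(mut, cna)
for row in rows:
    print(" | ".join(str(field) for field in row))
```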
Patient | Gene | CNA | RNA |
---|---|---|---|
MB-0569 | BAD | 1 | 0 |
MB-3502 | CASP8 | 1 | 0 |
MB-6184 | COX4I2 | 1 | 0 |
MB-0149 | CCND2 | 1 | 0 |
MB-4660 | CCND2 | 1 | 0 |
MB-5135 | CCND2 | 1 | 0 |
MB-4426 | E2F5 | 1 | 0 |
MB-5174 | E2F5 | 1 | 0 |
MB-5190 | E2F5 | 1 | 0 |
MB-3165 | E2F6 | 1 | 0 |
MB-0569 | GLI1 | 1 | 0 |
MB-2923 | MYC | 1 | 0 |
MB-5174 | MYC | 1 | 0 |
MB-0582 | CDKN2B | 1 | 0 |
MB-5358 | CDKN1A | 1 | 0 |
MB-5107 | PRKCA | 1 | 0 |
MB-0079 | ARAF | 1 | 0 |
MB-4224 | ARAF | 0 | 1 |
MB-7198 | ARAF | 1 | 0 |
MB-0291 | RAG1 | 1 | 0 |
MB-0291 | RAG2 | 1 | 0 |
MB-2753 | NOX1 | 1 | 0 |
MB-4281 | ERBB4 | 0 | 1 |
MB-0346 | FLT1 | 1 | 0 |
MB-5259 | FLT4 | 1 | 0 |
MB-0131 | NTRK1 | 1 | 0 |
MB-0501 | NTRK1 | 1 | 0 |
MB-2742 | NTRK1 | 1 | 0 |
MB-2954 | NTRK1 | 1 | 0 |
MB-3211 | NTRK1 | 1 | 0 |
MB-4622 | NTRK1 | 1 | 0 |
MB-5018 | NTRK1 | 1 | 0 |
MB-7050 | NTRK1 | 1 | 0 |
MB-7253 | NTRK3 | 1 | 0 |
MB-4171 | SNAI2 | 1 | 0 |
MB-4236 | SNAI2 | 1 | 0 |
MB-4360 | SNAI2 | 1 | 0 |
MB-5236 | SNAI2 | 1 | 0 |
MB-7031 | SNAI2 | 1 | 0 |
MB-4697 | SNAI1 | 1 | 0 |
MB-5525 | SNAI1 | 1 | 0 |
MB-7299 | SNAI1 | 1 | 0 |
MB-0060 | TCF4 | 1 | 0 |
MB-2753 | TCF4 | 1 | 0 |
MB-0340 | TERT | 1 | 0 |
MB-5208 | TERT | 1 | 0 |
MB-3297 | TNFRSF1A | 1 | 0 |
MB-0053 | TSC2 | 1 | 0 |
MB-4224 | VEGFD | 0 | 1 |
In case of ambiguity, priority is given to RNA over CNA.
Patient | Gene | Mut | RNA |
---|---|---|---|
MB-0066 | CASP8 | 0 | 1 |
MB-6273 | NF1 | 0 | 1 |
MB-7215 | NF1 | 0 | 1 |
MB-3452 | CDKN2A | 0 | 1 |
MB-0100 | TP53 | 0 | 1 |
MB-0107 | TP53 | 0 | 1 |
MB-0149 | TP53 | 0 | 1 |
MB-0164 | TP53 | 0 | 1 |
MB-0191 | TP53 | 0 | 1 |
MB-0214 | TP53 | 0 | 1 |
MB-0259 | TP53 | 0 | 1 |
MB-0278 | TP53 | 0 | 1 |
MB-0340 | TP53 | 0 | 1 |
MB-0396 | TP53 | 0 | 1 |
MB-0400 | TP53 | 0 | 1 |
MB-0516 | TP53 | 0 | 1 |
MB-0582 | TP53 | 0 | 1 |
MB-0874 | TP53 | 0 | 1 |
MB-0895 | TP53 | 0 | 1 |
MB-4715 | TP53 | 0 | 1 |
MB-4732 | TP53 | 0 | 1 |
MB-4770 | TP53 | 0 | 1 |
MB-4792 | TP53 | 0 | 1 |
MB-4865 | TP53 | 0 | 1 |
MB-4911 | TP53 | 0 | 1 |
MB-5205 | TP53 | 0 | 1 |
MB-5298 | TP53 | 0 | 1 |
MB-5378 | TP53 | 0 | 1 |
MB-5421 | TP53 | 0 | 1 |
MB-5440 | TP53 | 0 | 1 |
MB-5526 | TP53 | 0 | 1 |
MB-5548 | TP53 | 0 | 1 |
MB-5559 | TP53 | 0 | 1 |
MB-5560 | TP53 | 0 | 1 |
MB-6055 | TP53 | 0 | 1 |
MB-6143 | TP53 | 0 | 1 |
MB-7031 | TP53 | 0 | 1 |
MB-7036 | TP53 | 0 | 1 |
MB-7089 | TP53 | 0 | 1 |
MB-7165 | TP53 | 0 | 1 |
MB-7262 | TP53 | 0 | 1 |
MB-0133 | PIK3CA | 1 | 0 |
MB-0308 | PIK3CA | 1 | 0 |
MB-0317 | PIK3CA | 1 | 0 |
MB-0422 | PIK3CA | 1 | 0 |
MB-0451 | PIK3CA | 1 | 0 |
MB-0486 | PIK3CA | 1 | 0 |
MB-4012 | PIK3CA | 1 | 0 |
MB-4426 | PIK3CA | 1 | 0 |
MB-5163 | PIK3CA | 1 | 0 |
MB-5204 | PIK3CA | 1 | 0 |
MB-0470 | ERBB2 | 1 | 0 |
MB-0365 | RB1 | 0 | 1 |
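The two priority rules stated in this section (mutations over CNA, RNA over CNA) can be combined into a single merging step, sketched below for one patient. The function name and data layout are hypothetical. No rule is given here for a mutation/RNA disagreement, so this sketch leaves such genes unassigned and reports them; that choice is an assumption, not the documented behaviour.

```python
from typing import Dict, List, Optional, Tuple

Profile = Dict[str, Optional[int]]


def merge_profiles(mut: Profile, cna: Profile, rna: Profile) -> Tuple[Profile, List[str]]:
    """Merge three binarized layers for one patient.

    Mutations and RNA both override CNA. Genes where mutation and RNA
    disagree are set to None and returned separately (assumption).
    """
    merged: Profile = {}
    unresolved: List[str] = []
    for gene in sorted(set(mut) | set(cna) | set(rna)):
        m, c, r = mut.get(gene), cna.get(gene), rna.get(gene)
        if m is not None and r is not None and m != r:
            merged[gene] = None
            unresolved.append(gene)
        elif m is not None:
            merged[gene] = m
        elif r is not None:
            merged[gene] = r
        else:
            merged[gene] = c
    return merged, unresolved


# Illustrative single-patient layers mixing the conflict patterns seen in the tables.
mut = {"ERBB2": 0, "TP53": 0}
cna = {"ERBB2": 1, "MYC": 1, "RB1": 0}
rna = {"MYC": 0, "TP53": 1}

merged, unresolved = merge_profiles(mut, cna, rna)
print(merged, unresolved)
```

Returning the unresolved genes alongside the merged profile keeps the mutation/RNA conflicts visible instead of silently picking a side.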