Objectives

Generate patient profiles from BRCA METABRIC data for further logical modelling.

Independent omics profiles

METABRIC dataset recap

More than 1800 patients with several kinds of omics data: exome-sequencing, Copy Number Alterations (CNA), RNA and clinical annotations.

Most patients have all omics data:


We investigate relations between RNA expression data and breast cancer (BC) subtypes. Subtypes were defined with the PAM50 method. First, here is the distribution of BC subtypes across the cohort:

RNA expression data are projected onto the PC1/PC2 plane of a Principal Component Analysis computed on PAM50 genes only (47 of the 50 genes in the PAM50 list are present in the expression data).
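
As an illustration of this projection, here is a minimal pure-Python sketch: it keeps only the listed genes that are present in the expression matrix, then projects patients onto the first two principal axes. The function names, the toy gene names and the use of power iteration are assumptions made for this example, not the code used to produce the report. No per-gene scaling is applied here.

```python
import math


def select_genes(expression, gene_names, wanted):
    """Keep only the columns whose gene is both wanted and present in the data."""
    wanted = set(wanted)
    kept = [i for i, gene in enumerate(gene_names) if gene in wanted]
    rows = [[row[i] for i in kept] for row in expression]
    return rows, [gene_names[i] for i in kept]


def leading_axis(matrix, iterations=500):
    """Power iteration: dominant eigenvalue and eigenvector of a symmetric matrix."""
    size = len(matrix)
    # Arbitrary deterministic start; it must not be orthogonal to the answer.
    vector = [1.0 / (i + 1) for i in range(size)]
    norm = math.sqrt(sum(x * x for x in vector))
    vector = [x / norm for x in vector]
    for _ in range(iterations):
        product = [sum(matrix[i][j] * vector[j] for j in range(size))
                   for i in range(size)]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:  # no variance left along any axis
            break
        vector = [x / norm for x in product]
    product = [sum(matrix[i][j] * vector[j] for j in range(size))
               for i in range(size)]
    eigenvalue = sum(v * p for v, p in zip(vector, product))
    return eigenvalue, vector


def project_on_pc1_pc2(rows):
    """Return one (PC1, PC2) pair per patient (rows = patients, columns = genes)."""
    n, p = len(rows), len(rows[0])
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    value1, axis1 = leading_axis(cov)
    # Deflation: remove the variance explained by PC1 before looking for PC2.
    deflated = [[cov[i][j] - value1 * axis1[i] * axis1[j] for j in range(p)]
                for i in range(p)]
    _, axis2 = leading_axis(deflated)
    return [(sum(x * a for x, a in zip(row, axis1)),
             sum(x * a for x, a in zip(row, axis2))) for row in centered]


# Toy data: 4 patients, 3 genes (hypothetical names and values).
expression = [[2, 0, 9], [-2, 0, 7], [0, 1, 3], [0, -1, 1]]
gene_names = ["GENE_A", "GENE_B", "GENE_C"]
signature = ["GENE_A", "GENE_B", "GENE_MISSING"]  # one gene absent from the data

rows, used = select_genes(expression, gene_names, signature)
scores = project_on_pc1_pc2(rows)
```

In practice a linear algebra library would replace the hand-written power iteration; the point is only that the gene filter comes first and the projection second.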

Processing pipeline

Mutations profiles

We need to assign Boolean effects to mutations: either 0 (inactivating) or 1 (activating). A mutation can stay unassigned in the absence of any evidence.

Assignment methods and their respective influence:

## [1] "Sankey plots of mutation assignments depending on methods used"
## [1] "Sankey plots of mutation assignments depending on methods used (restricted to model-related nodes)"
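
The idea of combining several assignment methods can be sketched as follows: each method returns 0, 1 or nothing, and the first method with an answer decides. The two methods below (a small curated table and a rule on truncating variants) are hypothetical placeholders chosen for the example; this section does not list the methods actually used.

```python
# Hypothetical curated table: (gene, protein change) -> Boolean effect.
CURATED_EFFECTS = {("PIK3CA", "H1047R"): 1}

# Assumption for this sketch: truncating variants are treated as inactivating.
TRUNCATING_TYPES = {"nonsense", "frameshift"}


def from_curated_table(mutation):
    """Return the curated effect if the mutation is listed, else None."""
    return CURATED_EFFECTS.get((mutation["gene"], mutation["change"]))


def from_truncation(mutation):
    """Return 0 for truncating variants, None otherwise."""
    return 0 if mutation["type"] in TRUNCATING_TYPES else None


METHODS = [("curated", from_curated_table), ("truncation", from_truncation)]


def assign_mutation_effect(mutation, methods=METHODS):
    """Return (effect, method name); (None, None) when no method has evidence."""
    for name, method in methods:
        effect = method(mutation)
        if effect is not None:
            return effect, name
    return None, None


# Illustrative mutations (not taken from the cohort).
activating = {"gene": "PIK3CA", "change": "H1047R", "type": "missense"}
truncating = {"gene": "RB1", "change": "hypothetical_stop", "type": "nonsense"}
unknown = {"gene": "NF1", "change": "hypothetical_missense", "type": "missense"}
```

Recording which method made each assignment is what makes a per-method comparison of assignments possible.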

Now we can summarize patient mutation profiles after processing. In the following plots we focus on model-related genes only.

CNA profiles

For CNA, we have decided to focus on stringent amplifications/deletions corresponding to +2/-2 GISTIC results. We produce the same kind of plots.
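
A minimal sketch of this stringent rule, assuming (by analogy with the mutation convention) that an amplification at +2 is encoded as 1 and a deletion at -2 as 0, with every other GISTIC value left unassigned. The function names and the toy values are illustrative only.

```python
def cna_to_boolean(gistic):
    """Map a thresholded GISTIC value to 1, 0 or None (unassigned)."""
    if gistic == 2:
        return 1   # stringent amplification
    if gistic == -2:
        return 0   # stringent deletion
    return None    # -1, 0 and +1 are ignored in this sketch


def cna_profile(gistic_by_gene):
    """Apply the rule to every gene of one patient."""
    return {gene: cna_to_boolean(value) for gene, value in gistic_by_gene.items()}


# Toy patient with hypothetical GISTIC values.
example = cna_profile({"ERBB2": 2, "RB1": -2, "MYC": 1, "TP53": 0})
```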

RNA profiles

RNA data are intrinsically continuous and therefore require preliminary processing. Note that METABRIC expression data come from microarrays.

Binarization with classification tree


Now, what about the distribution of gene categories (Bimodal, Unimodal…) across the cohort?

## [1] "META assignments:"
Bimodal Unimodal
29 24339
## [1] "META assignments for model-related nodes:"
Unimodal
110
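
To make the notion of distribution category concrete, here is one possible way to separate bimodal from unimodal genes, using the bimodality coefficient (squared skewness plus one, divided by kurtosis) with the usual 5/9 cut-off. This criterion is an assumption chosen for the sketch; the report does not state which criteria produced the counts above, and other categories are not handled here.

```python
def bimodality_coefficient(values):
    """Population bimodality coefficient: (skewness^2 + 1) / kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n
    if m2 == 0.0:
        return 0.0  # constant gene: treated as unimodal in this sketch
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # non-excess kurtosis (3 for a Gaussian)
    return (skewness ** 2 + 1) / kurtosis


def categorize(values, cutoff=5 / 9):
    """Label a gene 'Bimodal' when the coefficient exceeds the cut-off."""
    return "Bimodal" if bimodality_coefficient(values) > cutoff else "Unimodal"


# Toy expression vectors (hypothetical values).
two_groups = [0.0] * 50 + [10.0] * 50
bell_shaped = [0.0] + [1.0] * 4 + [2.0] * 6 + [3.0] * 4 + [4.0]
```

A known limitation of this coefficient is that strongly skewed unimodal distributions can also exceed the cut-off.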

Here are some distribution plots randomly picked from each category in the META cohort.

Depending on the distribution category, we can then perform binarization.

## [1] "Bimodal example:"

## [1] "Unimodal example:"
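
For a bimodal gene, binarization amounts to finding a cut between the two modes. The sketch below picks the cut that minimizes the within-group sum of squares (a single two-group split) and then labels values above it as 1. This is one simple option, assumed for illustration; the rule applied to unimodal genes is not covered.

```python
def within_group_cost(group):
    """Sum of squared deviations from the group mean."""
    mean = sum(group) / len(group)
    return sum((x - mean) ** 2 for x in group)


def bimodal_threshold(values):
    """Cut point that best separates the values into a low and a high group."""
    ordered = sorted(values)
    best_cost, best_cut = None, None
    for i in range(1, len(ordered)):
        if ordered[i] == ordered[i - 1]:
            continue  # no cut possible between identical values
        cost = within_group_cost(ordered[:i]) + within_group_cost(ordered[i:])
        if best_cost is None or cost < best_cost:
            best_cost = cost
            best_cut = (ordered[i - 1] + ordered[i]) / 2
    return best_cut  # None when all values are identical


def binarize(values, threshold):
    """1 for values above the threshold, 0 otherwise."""
    return [1 if x > threshold else 0 for x in values]


# Toy bimodal gene across six patients (hypothetical values).
expression = [1.0, 1.2, 0.9, 5.0, 5.3, 4.8]
threshold = bimodal_threshold(expression)
binary = binarize(expression, threshold)
```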

Normalization

Merged profiles

Data types relations

Before merging independent profiles into multi-omics profiles, let’s have a look at relations between data types.

Mutations and CNA

In particular, is there any mutation/CNA binary inconsistency?

Patient Gene Mut CNA
MTS-T1304 NF1 0 1
MB-0607 ERBB2 0 1
MB-3614 RB1 0 1
MB-4332 CDKN1B 0 1

In case of ambiguity, priority is given to mutations over CNA.
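
The inconsistency check behind the table above can be sketched as a comparison of two binary profiles, keeping the (patient, gene) pairs where both data types are assigned and disagree. The dictionary layout and function name are assumptions for this example. The ERBB2 row is copied from the table; the other two entries are hypothetical.

```python
def find_conflicts(profile_a, profile_b):
    """List (patient, gene, value_a, value_b) where both are assigned and differ.

    Each profile maps (patient, gene) to 0, 1 or None (unassigned).
    """
    conflicts = []
    for key in sorted(profile_a.keys() & profile_b.keys()):
        a, b = profile_a[key], profile_b[key]
        if a is not None and b is not None and a != b:
            conflicts.append((key[0], key[1], a, b))
    return conflicts


mutations = {
    ("MB-0607", "ERBB2"): 0,        # from the table above
    ("PATIENT_X", "TP53"): 0,       # hypothetical, agrees with CNA
    ("PATIENT_Y", "PIK3CA"): 1,     # hypothetical, CNA unassigned
}
cna = {
    ("MB-0607", "ERBB2"): 1,
    ("PATIENT_X", "TP53"): 0,
    ("PATIENT_Y", "PIK3CA"): None,
}
conflicts = find_conflicts(mutations, cna)
```

Unassigned values never count as a conflict, so only genuine 0/1 disagreements are reported.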

CNA and RNA

Patient Gene CNA RNA
MB-0569 BAD 1 0
MB-3502 CASP8 1 0
MB-6184 COX4I2 1 0
MB-0149 CCND2 1 0
MB-4660 CCND2 1 0
MB-5135 CCND2 1 0
MB-4426 E2F5 1 0
MB-5174 E2F5 1 0
MB-5190 E2F5 1 0
MB-3165 E2F6 1 0
MB-0569 GLI1 1 0
MB-2923 MYC 1 0
MB-5174 MYC 1 0
MB-0582 CDKN2B 1 0
MB-5358 CDKN1A 1 0
MB-5107 PRKCA 1 0
MB-0079 ARAF 1 0
MB-4224 ARAF 0 1
MB-7198 ARAF 1 0
MB-0291 RAG1 1 0
MB-0291 RAG2 1 0
MB-2753 NOX1 1 0
MB-4281 ERBB4 0 1
MB-0346 FLT1 1 0
MB-5259 FLT4 1 0
MB-0131 NTRK1 1 0
MB-0501 NTRK1 1 0
MB-2742 NTRK1 1 0
MB-2954 NTRK1 1 0
MB-3211 NTRK1 1 0
MB-4622 NTRK1 1 0
MB-5018 NTRK1 1 0
MB-7050 NTRK1 1 0
MB-7253 NTRK3 1 0
MB-4171 SNAI2 1 0
MB-4236 SNAI2 1 0
MB-4360 SNAI2 1 0
MB-5236 SNAI2 1 0
MB-7031 SNAI2 1 0
MB-4697 SNAI1 1 0
MB-5525 SNAI1 1 0
MB-7299 SNAI1 1 0
MB-0060 TCF4 1 0
MB-2753 TCF4 1 0
MB-0340 TERT 1 0
MB-5208 TERT 1 0
MB-3297 TNFRSF1A 1 0
MB-0053 TSC2 1 0
MB-4224 VEGFD 0 1

In case of ambiguity, priority is given to RNA over CNA.
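
Priority rules of this kind can be written as a merge in which the higher-priority data type overwrites the lower one whenever both are assigned. In the sketch below the priority order is a parameter, and only the two orders stated in this section (mutations over CNA, RNA over CNA) are exercised. The function name and data layout are assumptions; the example rows are copied from the tables above.

```python
def merge_profiles(layers, priority):
    """Merge binary profiles; `priority` lists layer names, highest first.

    Each layer maps (patient, gene) to 0, 1 or None (unassigned).
    """
    merged = {}
    # Start from the lowest priority so that higher layers overwrite it.
    for name in reversed(priority):
        for key, value in layers[name].items():
            if value is not None:
                merged[key] = value
    return merged


layers = {
    "Mut": {("MB-0607", "ERBB2"): 0},
    "CNA": {("MB-0607", "ERBB2"): 1, ("MB-0569", "BAD"): 1},
    "RNA": {("MB-0569", "BAD"): 0},
}
mut_over_cna = merge_profiles(layers, ["Mut", "CNA"])
rna_over_cna = merge_profiles(layers, ["RNA", "CNA"])
```

A value from a lower-priority layer is still kept when the higher-priority layer has nothing assigned for that patient and gene.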

Mutations and RNA

Patient Gene Mut RNA
MB-0066 CASP8 0 1
MB-6273 NF1 0 1
MB-7215 NF1 0 1
MB-3452 CDKN2A 0 1
MB-0100 TP53 0 1
MB-0107 TP53 0 1
MB-0149 TP53 0 1
MB-0164 TP53 0 1
MB-0191 TP53 0 1
MB-0214 TP53 0 1
MB-0259 TP53 0 1
MB-0278 TP53 0 1
MB-0340 TP53 0 1
MB-0396 TP53 0 1
MB-0400 TP53 0 1
MB-0516 TP53 0 1
MB-0582 TP53 0 1
MB-0874 TP53 0 1
MB-0895 TP53 0 1
MB-4715 TP53 0 1
MB-4732 TP53 0 1
MB-4770 TP53 0 1
MB-4792 TP53 0 1
MB-4865 TP53 0 1
MB-4911 TP53 0 1
MB-5205 TP53 0 1
MB-5298 TP53 0 1
MB-5378 TP53 0 1
MB-5421 TP53 0 1
MB-5440 TP53 0 1
MB-5526 TP53 0 1
MB-5548 TP53 0 1
MB-5559 TP53 0 1
MB-5560 TP53 0 1
MB-6055 TP53 0 1
MB-6143 TP53 0 1
MB-7031 TP53 0 1
MB-7036 TP53 0 1
MB-7089 TP53 0 1
MB-7165 TP53 0 1
MB-7262 TP53 0 1
MB-0133 PIK3CA 1 0
MB-0308 PIK3CA 1 0
MB-0317 PIK3CA 1 0
MB-0422 PIK3CA 1 0
MB-0451 PIK3CA 1 0
MB-0486 PIK3CA 1 0
MB-4012 PIK3CA 1 0
MB-4426 PIK3CA 1 0
MB-5163 PIK3CA 1 0
MB-5204 PIK3CA 1 0
MB-0470 ERBB2 1 0
MB-0365 RB1 0 1

Write profiles