R/make-bin-mat.R
binmat.Rd

binmat: Creates a binary matrix from a MAF file for a predefined list of patients (rows are patients and columns are genes).
binmat(
  patients = NULL,
  maf = NULL,
  mut.type = "SOMATIC",
  SNP.only = FALSE,
  include.silent = FALSE,
  fusion = NULL,
  cna = NULL,
  cna.binary = TRUE,
  cna.relax = FALSE,
  specify.plat = TRUE,
  set.plat = NULL,
  rm.empty = TRUE,
  pathway = FALSE,
  col.names = c(Tumor_Sample_Barcode = NULL, Hugo_Symbol = NULL,
                Variant_Classification = NULL, Mutation_Status = NULL,
                Variant_Type = NULL),
  oncokb = FALSE,
  keep_onco = c("Oncogenic", "Likely Oncogenic", "Predicted Oncogenic"),
  token = "",
  ...
)
| Argument | Description |
|---|---|
| patients | A character vector that lets the user specify the patients to be used to create the matrix. Default is NULL, in which case all patients in the MAF file will be used. |
| maf | A MAF file. |
| mut.type | The mutation type to be used. Options are "SOMATIC", "GERMLINE" or "ALL". Note "ALL" will keep all mutations regardless of status (not recommended). Default is SOMATIC. |
| SNP.only | Boolean specifying whether only SNPs should be kept (insertions and deletions will be removed). Default is FALSE. |
| include.silent | Boolean to keep or remove all silent mutations. TRUE keeps, FALSE removes. Default is FALSE. |
| fusion | An optional MAF file for fusions. If provided, the outcome will be added to the matrix with columns ending in ".fus". Default is NULL. |
| cna | An optional CNA file. If provided, the outcome will be added to the matrix with columns ending in ".del" and ".amp". Default is NULL. |
| cna.binary | Boolean specifying if the CNA events should be enforced as binary, in which case separate columns for amplifications and deletions will be created. Default is TRUE. |
| cna.relax | Boolean, for CNA data only. If TRUE, gains are counted as amplifications and shallow deletions are counted as deletions. Default is FALSE. |
| specify.plat | Boolean specifying if specific IMPACT platforms should be considered. When TRUE, NAs will fill the cells for genes of patients that were not sequenced on that platform. Default is TRUE. |
| set.plat | character argument specifying which IMPACT platform the data should be reduced to if specify.plat is set to TRUE. Options are "341" and "410". Default is NULL. |
| rm.empty | Boolean specifying if columns with no events found should be removed. Default is TRUE. |
| pathway | boolean specifying if pathway annotation should be applied. If TRUE, the function will return a supplementary binary dataframe with columns being each pathway and each row being a sample. Default is FALSE. |
| col.names | character vector of the necessary columns to be used. By default: col.names = c(Tumor_Sample_Barcode = NULL, Hugo_Symbol = NULL, Variant_Classification = NULL, Mutation_Status = NULL, Variant_Type = NULL) |
| oncokb | Boolean specifying if the MAF file should be oncoKB annotated. Default is FALSE. |
| keep_onco | A character vector specifying which oncoKB annotated variants to keep. Options are 'Oncogenic', 'Likely Oncogenic', 'Predicted Oncogenic', 'Likely Neutral' and 'Inconclusive'. By default 'Oncogenic', 'Likely Oncogenic' and 'Predicted Oncogenic' variants will be kept (recommended). |
| token | The token associated with your oncoKB account. Default is "". |
| ... | Further arguments passed to the oncokb() function, such as token. |
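
To make the filtering arguments concrete, here is a minimal base R sketch of how `mut.type`, `SNP.only` and `include.silent` could narrow a MAF-like table before it is turned into a patient-by-gene binary matrix. It is an illustration only, not gnomeR's implementation: `maf_toy` and `toy_binmat()` are hypothetical names, the data are invented, and the assumption that silent variants are labelled `"Silent"` in `Variant_Classification` comes from the usual MAF convention rather than from this page.

```r
# Toy MAF-like data frame (invented data). Column names follow the
# col.names entries listed in the arguments table.
maf_toy <- data.frame(
  Tumor_Sample_Barcode   = c("P1", "P1", "P2", "P3", "P3"),
  Hugo_Symbol            = c("TP53", "KRAS", "TP53", "EGFR", "KRAS"),
  Variant_Classification = c("Missense_Mutation", "Silent", "Nonsense_Mutation",
                             "Missense_Mutation", "Frame_Shift_Del"),
  Mutation_Status        = c("SOMATIC", "SOMATIC", "GERMLINE", "SOMATIC", "SOMATIC"),
  Variant_Type           = c("SNP", "SNP", "SNP", "SNP", "DEL"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, NOT part of gnomeR: filter events, then mark
# every remaining (patient, gene) pair with a 1.
toy_binmat <- function(maf, patients = NULL, mut.type = "SOMATIC",
                       SNP.only = FALSE, include.silent = FALSE) {
  # NULL means "use every patient found in the MAF table"
  if (is.null(patients)) patients <- unique(maf$Tumor_Sample_Barcode)

  keep <- maf$Tumor_Sample_Barcode %in% patients
  if (mut.type != "ALL") keep <- keep & maf$Mutation_Status == mut.type
  if (SNP.only) keep <- keep & maf$Variant_Type == "SNP"
  # Assumption: silent variants are labelled "Silent"
  if (!include.silent) keep <- keep & maf$Variant_Classification != "Silent"
  maf <- maf[keep, , drop = FALSE]

  # Only genes with at least one retained event get a column
  genes <- unique(maf$Hugo_Symbol)
  out <- matrix(0L, nrow = length(patients), ncol = length(genes),
                dimnames = list(patients, genes))
  # Index by (row name, column name) pairs
  out[cbind(maf$Tumor_Sample_Barcode, maf$Hugo_Symbol)] <- 1L
  out
}

toy_mat <- toy_binmat(maf_toy)
toy_snp <- toy_binmat(maf_toy, SNP.only = TRUE)
```

In this sketch a patient with no retained events (here `P2`, whose only variant is germline) still gets a row of zeros, while a gene with no retained events gets no column at all.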
mut : a binary matrix of mutation data
library(gnomeR)
# mut.only <- binmat(maf = mut)
patients <- as.character(unique(mut$Tumor_Sample_Barcode))[1:200]
bin.mut <- binmat(patients = patients, maf = mut, mut.type = "SOMATIC",
                  SNP.only = FALSE, include.silent = FALSE, specify.plat = FALSE)
bin.mut <- binmat(patients = patients, maf = mut, mut.type = "SOMATIC",
                  SNP.only = FALSE, include.silent = FALSE, cna.relax = TRUE,
                  specify.plat = FALSE, set.plat = "410", rm.empty = FALSE)
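
The examples above do not pass a `cna` argument, so the following base R sketch illustrates what the `cna.binary`, `cna.relax` and `rm.empty` descriptions imply for copy number data. It is not gnomeR's code: `cna_toy` and `toy_cna_binary()` are hypothetical, and the numeric coding (-2 deep deletion, -1 shallow deletion, 0 neutral, 1 gain, 2 amplification) is an assumption based on the common discrete copy number convention, not something this page states.

```r
# Toy CNA matrix (invented data): rows are patients, columns are genes.
# Assumed coding: -2 deep deletion, -1 shallow deletion, 0 neutral,
# 1 gain, 2 amplification.
cna_toy <- matrix(
  c(2, -1,
    0, -2,
    1,  0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("P1", "P2", "P3"), c("MYC", "CDKN2A"))
)

# Hypothetical helper, NOT part of gnomeR: split each gene into
# ".amp" and ".del" indicator columns.
toy_cna_binary <- function(cna, cna.relax = FALSE, rm.empty = TRUE) {
  # Relaxed mode also counts gains (1) and shallow deletions (-1)
  amp <- if (cna.relax) cna >= 1 else cna == 2
  del <- if (cna.relax) cna <= -1 else cna == -2
  colnames(amp) <- paste0(colnames(cna), ".amp")
  colnames(del) <- paste0(colnames(cna), ".del")

  # Convert the logical indicators to 0/1 integers
  out <- cbind(amp, del) * 1L
  # Optionally drop columns in which no patient has an event
  if (rm.empty) out <- out[, colSums(out) > 0, drop = FALSE]
  out
}

strict  <- toy_cna_binary(cna_toy)
relaxed <- toy_cna_binary(cna_toy, cna.relax = TRUE)
```

With the toy data, `P3` has a gain in `MYC`, so `MYC.amp` is 0 for that patient in `strict` and 1 in `relaxed`.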