R/make-bin-mat.R
binmat.Rd
binmat \ Enables creation of a binary matrix from a maf file with a predefined list of patients (rows are patients and columns are genes)
    binmat(
      patients = NULL,
      maf = NULL,
      mut.type = "SOMATIC",
      SNP.only = FALSE,
      include.silent = FALSE,
      fusion = NULL,
      cna = NULL,
      cna.binary = TRUE,
      cna.relax = FALSE,
      specify.plat = TRUE,
      set.plat = NULL,
      rm.empty = TRUE,
      pathway = FALSE,
      recode.aliases = TRUE,
      col.names = c(Tumor_Sample_Barcode = NULL, Hugo_Symbol = NULL,
        Variant_Classification = NULL, Mutation_Status = NULL, Variant_Type = NULL),
      oncokb = FALSE,
      keep_onco = c("Oncogenic", "Likely Oncogenic", "Predicted Oncogenic"),
      token = "",
      ...
    )
patients | A character vector that lets the user specify the patients to be used to create the matrix. Default is NULL, in which case all patients in the MAF file will be used. |
---|---|
maf | A MAF file. |
mut.type | The mutation type to be used. Options are "SOMATIC", "GERMLINE" or "ALL". Note "ALL" will keep all mutations regardless of status (not recommended). Default is SOMATIC. |
SNP.only | Boolean specifying whether only SNPs should be kept as genetic events (insertions and deletions will be removed). Default is FALSE. |
include.silent | Boolean to keep or remove all silent mutations. TRUE keeps, FALSE removes. Default is FALSE. |
fusion | An optional MAF file for fusions. If provided, the fusions will be added to the matrix with columns ending in ".fus". Default is NULL. |
cna | An optional CNA file. If provided, the copy number events will be added to the matrix with columns ending in ".del" and ".amp". Default is NULL. |
cna.binary | A boolean specifying whether the CNA events should be enforced as binary, in which case separate columns for amplifications and deletions will be created. Default is TRUE. |
cna.relax | A boolean specifying which CNA calls count as events. When FALSE (the default), only deep deletions (-2) and amplifications (2) are annotated as events. When TRUE, all deletions (-1 shallow and -2 deep) and all gains (1 gain and 2 amplification) are counted as events. |
specify.plat | A boolean specifying whether specific IMPACT platforms should be considered. When TRUE, NAs will fill the cells for genes that were not sequenced on a patient's platform. Default is TRUE. |
set.plat | character argument specifying which IMPACT platform the data should be reduced to if specify.plat is set to TRUE. Options are "341" and "410". Default is NULL. |
rm.empty | A boolean specifying whether columns with no events found should be removed. Default is TRUE. |
pathway | boolean specifying if pathway annotation should be applied. If TRUE, the function will return a supplementary binary dataframe with columns being each pathway and each row being a sample. Default is FALSE. |
recode.aliases | A boolean specifying whether automated gene name alias matching should be done. Default is TRUE. When TRUE, the function will check for genes that may have more than one name in your data, using the aliases in the alias column of gnomeR::impact_gene_info. |
col.names | character vector of the necessary columns to be used. By default: col.names = c(Tumor_Sample_Barcode = NULL, Hugo_Symbol = NULL, Variant_Classification = NULL, Mutation_Status = NULL, Variant_Type = NULL) |
oncokb | A boolean specifying whether the MAF file should be oncoKB annotated. Default is FALSE. |
keep_onco | A character vector specifying which oncoKB annotated variants to keep. Options are 'Oncogenic', 'Likely Oncogenic', 'Predicted Oncogenic', 'Likely Neutral' and 'Inconclusive'. By default 'Oncogenic', 'Likely Oncogenic' and 'Predicted Oncogenic' variants will be kept (recommended). |
token | the token affiliated to your oncoKB account. |
... | Further arguments passed to the oncokb() function, such as token. |
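
The interaction between cna.binary and cna.relax can be easier to see on a toy example. The following base R sketch is not gnomeR's implementation; `binarize_cna()` and the sample data are hypothetical, and the sketch only assumes the coding described in the table (-2 deep deletion, -1 shallow deletion, 1 gain, 2 amplification) and the ".del"/".amp" column suffixes.

```r
# Toy CNA calls: rows are samples, columns are genes (hypothetical data).
cna <- matrix(
  c( 2, -1,
    -2,  1,
     0,  0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("P1", "P2", "P3"), c("TP53", "MYC"))
)

# Hypothetical helper, NOT part of gnomeR: recode CNA calls into
# separate binary deletion (".del") and amplification (".amp") columns.
binarize_cna <- function(cna, relax = FALSE) {
  del_cut <- if (relax) -1 else -2  # relax = TRUE: shallow deletions also count
  amp_cut <- if (relax) 1 else 2    # relax = TRUE: gains also count
  del <- (cna <= del_cut) * 1L
  amp <- (cna >= amp_cut) * 1L
  colnames(del) <- paste0(colnames(cna), ".del")
  colnames(amp) <- paste0(colnames(cna), ".amp")
  cbind(del, amp)
}

strict  <- binarize_cna(cna, relax = FALSE)  # only -2 and 2 are events
relaxed <- binarize_cna(cna, relax = TRUE)   # -1 and 1 are events as well
```

In this sketch the shallow deletion of MYC in P1 and the gain of MYC in P2 are events only in `relaxed`, which mirrors the documented behaviour of cna.relax = TRUE.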
mut : a binary matrix of mutation data
    library(gnomeR)
    # mut.only <- binmat(maf = mut)
    patients <- as.character(unique(mut$Tumor_Sample_Barcode))[1:200]
    bin.mut <- binmat(patients = patients, maf = mut, mut.type = "SOMATIC",
                      SNP.only = FALSE, include.silent = FALSE, specify.plat = FALSE)
    #> Warning: MUTATION DATA: To ensure gene with multiple names/aliases are correctly grouped together, the
    #> following genes in your maf dataframe have been recoded. You can supress this with recode.aliases = FALSE
    #>
    #> AMER1 recoded to FAM123B
    bin.mut <- binmat(patients = patients, maf = mut, mut.type = "SOMATIC",
                      SNP.only = FALSE, include.silent = FALSE, cna.relax = TRUE,
                      specify.plat = FALSE, set.plat = "410", rm.empty = FALSE)
    #> Warning: MUTATION DATA: To ensure gene with multiple names/aliases are correctly grouped together, the
    #> following genes in your maf dataframe have been recoded. You can supress this with recode.aliases = FALSE
    #>
    #> AMER1 recoded to FAM123B
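
For readers who want to see the shape of the returned object without the package data, here is a minimal base R sketch of how a patients-by-genes binary matrix might be built from a MAF-like data frame. It is an illustration under stated assumptions rather than gnomeR's actual code: `toy_binmat()` and the toy data are hypothetical, and only the patients, include.silent and rm.empty behaviours described above are imitated.

```r
# Toy MAF-like data frame (hypothetical values).
maf <- data.frame(
  Tumor_Sample_Barcode = c("P1", "P1", "P2", "P3"),
  Hugo_Symbol = c("TP53", "KRAS", "TP53", "EGFR"),
  Variant_Classification = c("Missense_Mutation", "Silent",
                             "Nonsense_Mutation", "Missense_Mutation"),
  stringsAsFactors = FALSE
)
patients <- c("P1", "P2")

# Hypothetical helper, NOT part of gnomeR: rows are patients, columns are genes.
toy_binmat <- function(maf, patients, include.silent = FALSE, rm.empty = TRUE) {
  genes <- sort(unique(maf$Hugo_Symbol))  # all genes seen in the input
  keep <- maf$Tumor_Sample_Barcode %in% patients
  if (!include.silent) {
    keep <- keep & maf$Variant_Classification != "Silent"
  }
  maf <- maf[keep, ]
  mat <- matrix(0L, nrow = length(patients), ncol = length(genes),
                dimnames = list(patients, genes))
  # Mark each remaining (patient, gene) pair as an event.
  mat[cbind(maf$Tumor_Sample_Barcode, maf$Hugo_Symbol)] <- 1L
  # Optionally drop genes with no event in the selected patients.
  if (rm.empty) mat <- mat[, colSums(mat) > 0, drop = FALSE]
  mat
}

bin <- toy_binmat(maf, patients)
bin_all <- toy_binmat(maf, patients, include.silent = TRUE, rm.empty = FALSE)
```

With the defaults only the TP53 column survives, since the KRAS call is silent and EGFR is mutated only in a patient outside the list; `bin_all` keeps all three gene columns.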