binmat: Creates a binary matrix from a MAF file for a predefined list of patients (rows are patients and columns are genes).

binmat(
  patients = NULL,
  maf = NULL,
  mut.type = "SOMATIC",
  SNP.only = FALSE,
  include.silent = FALSE,
  fusion = NULL,
  cna = NULL,
  cna.binary = TRUE,
  cna.relax = FALSE,
  specify.plat = TRUE,
  set.plat = NULL,
  rm.empty = TRUE,
  pathway = FALSE,
  col.names = c(Tumor_Sample_Barcode = NULL, Hugo_Symbol = NULL, Variant_Classification
    = NULL, Mutation_Status = NULL, Variant_Type = NULL),
  oncokb = FALSE,
  keep_onco = c("Oncogenic", "Likely Oncogenic", "Predicted Oncogenic"),
  token = "",
  ...
)

Arguments

patients

A character vector that lets the user specify the patients to be used to create the matrix. Default is NULL, in which case all patients in the MAF file will be used.

maf

A MAF file.

mut.type

The mutation type to be used. Options are "SOMATIC", "GERMLINE" or "ALL". Note that "ALL" keeps all mutations regardless of status (not recommended). Default is "SOMATIC".

SNP.only

Boolean specifying whether only SNPs should be kept (insertions and deletions will be removed). Default is FALSE.

include.silent

Boolean to keep or remove all silent mutations. TRUE keeps, FALSE removes. Default is FALSE.

fusion

An optional MAF file for fusions. If provided, the fusion events will be added to the matrix in columns ending in ".fus". Default is NULL.

cna

An optional CNA file. If provided, the copy number events will be added to the matrix in columns ending in ".del" and ".amp". Default is NULL.

cna.binary

Boolean specifying whether CNA events should be enforced as binary, in which case separate columns for amplifications and deletions are created. Default is TRUE.

cna.relax

For CNA data only. Boolean specifying whether gains and shallow deletions should be counted as amplifications and deletions, respectively. Default is FALSE.

specify.plat

Boolean specifying whether specific IMPACT platforms should be considered. When TRUE, cells are filled with NA for genes that were not covered by the platform a patient was sequenced on. Default is TRUE.

set.plat

character argument specifying which IMPACT platform the data should be reduced to if specify.plat is set to TRUE. Options are "341" and "410". Default is NULL.

rm.empty

Boolean specifying whether columns in which no events were found should be removed. Default is TRUE.

pathway

boolean specifying if pathway annotation should be applied. If TRUE, the function will return a supplementary binary dataframe with columns being each pathway and each row being a sample. Default is FALSE.

col.names

character vector of the necessary columns to be used. By default: col.names = c(Tumor_Sample_Barcode = NULL, Hugo_Symbol = NULL, Variant_Classification = NULL, Mutation_Status = NULL, Variant_Type = NULL)

oncokb

Boolean specifying whether the MAF file should be annotated with OncoKB. Default is FALSE.

keep_onco

A character vector specifying which oncoKB annotated variants to keep. Options are 'Oncogenic', 'Likely Oncogenic', 'Predicted Oncogenic', 'Likely Neutral' and 'Inconclusive'. By default 'Oncogenic', 'Likely Oncogenic' and 'Predicted Oncogenic' variants will be kept (recommended).

token

The token associated with your OncoKB account. Default is "".

...

Further arguments passed to the oncokb() function, such as token.
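
The interaction between cna.binary and cna.relax can be easier to see on a toy example. The sketch below is not gnomeR code: `binarize_cna` is a hypothetical helper, and the copy number coding (-2 deep deletion, -1 shallow deletion, 0 neutral, 1 gain, 2 amplification) is an assumption based on the common discrete CNA convention. It only illustrates how a relaxed threshold would change which events land in the ".del" and ".amp" columns.

```r
# Toy CNA matrix: rows are patients, columns are genes.
# Assumed coding: -2 deep deletion, -1 shallow deletion,
#                  0 neutral, 1 gain, 2 amplification.
cna_toy <- matrix(
  c(-2, 1,
    -1, 2,
     0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("P1", "P2", "P3"), c("CDKN2A", "MYC"))
)

# Hypothetical helper (not part of gnomeR): split each gene into
# a ".del" and an ".amp" indicator column.
binarize_cna <- function(cna, relax = FALSE) {
  amp_cut <- if (relax) 1 else 2    # relaxed: gains count as amplifications
  del_cut <- if (relax) -1 else -2  # relaxed: shallow deletions count as deletions
  amp <- (cna >= amp_cut) * 1L
  del <- (cna <= del_cut) * 1L
  colnames(amp) <- paste0(colnames(cna), ".amp")
  colnames(del) <- paste0(colnames(cna), ".del")
  cbind(del, amp)
}

strict  <- binarize_cna(cna_toy)                # analogous to cna.relax = FALSE
relaxed <- binarize_cna(cna_toy, relax = TRUE)  # analogous to cna.relax = TRUE
```

In this toy data, patient P2's shallow deletion of CDKN2A is counted only in `relaxed`, which mirrors the description of cna.relax above.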

Value

mut : a binary matrix of mutation data
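
To make the shape of the returned object concrete, here is a minimal base R sketch of how a patients-by-genes binary matrix can be derived from MAF-like data. It is an illustration under stated assumptions, not the implementation used by binmat(): the data frame `maf_toy` and its values are made up, and the real function applies additional filtering (mutation status, variant type, platform) that is omitted here.

```r
# Toy MAF-like data frame (hypothetical values, for illustration only).
maf_toy <- data.frame(
  Tumor_Sample_Barcode = c("P1", "P1", "P2", "P3"),
  Hugo_Symbol          = c("TP53", "KRAS", "TP53", "EGFR"),
  stringsAsFactors     = FALSE
)

# Predefined list of patients; P4 has no recorded events.
patients <- c("P1", "P2", "P3", "P4")

# Cross-tabulate patients by genes. Using a factor with explicit levels
# keeps patients without any event as all-zero rows.
counts <- table(
  factor(maf_toy$Tumor_Sample_Barcode, levels = patients),
  maf_toy$Hugo_Symbol
)

# Reduce event counts to 0/1 indicators.
bin_toy <- matrix(
  as.integer(counts > 0),
  nrow = nrow(counts),
  dimnames = list(rownames(counts), colnames(counts))
)
```

Each row of `bin_toy` is a patient and each column a gene, with 1 marking at least one event in that gene.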

Examples

library(gnomeR)
# mut.only <- binmat(maf = mut)
patients <- as.character(unique(mut$Tumor_Sample_Barcode))[1:200]
bin.mut <- binmat(patients = patients, maf = mut, mut.type = "SOMATIC",
                  SNP.only = FALSE, include.silent = FALSE,
                  specify.plat = FALSE)
bin.mut <- binmat(patients = patients, maf = mut, mut.type = "SOMATIC",
                  SNP.only = FALSE, include.silent = FALSE,
                  cna.relax = TRUE, specify.plat = FALSE,
                  set.plat = "410", rm.empty = FALSE)