RunExonModelWorkflow(ExonModelStrain) | R Documentation |
Run Strain Specific Exon Modeling
RunExonModelWorkflow(ExpSet, idlist = NULL, analysisType = "transcript", dBPackage = "mouseexonensembl.db")
RunExonModelWorkflow(ExpSet, idlist = NULL)
RunExonModelWorkflow(ExpSet, analysisType = "gene")
ExpSet |
ExpressionSet containing probeset-level Exon Expression Data. Currently, only Mouse Exon 1.0 data is supported. The ExpressionSet phenoData should contain a column called 'Strain' where the two strains are coded 1 and 2. |
idlist |
Character vector containing valid Ensembl Transcript or Gene IDs.
Note that the full set of Transcript or Gene IDs can be queried from the database
package by using getAllTranscripts() or getAllGenes().
If idlist is NULL, the function will use getAllTranscripts() or getAllGenes(), depending on analysisType, to obtain a valid ID list. |
analysisType |
Character value, must be either "transcript" (for transcript-level analysis) or "gene" (gene-level analysis). |
dBPackage |
Name of database package containing Ensembl-Exon Array mapping (currently only mouseexonensembl.db exists). |
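The 'Strain' coding required of ExpSet can be checked before running the workflow. The following is a minimal sketch in base R, assuming the phenotype table has already been extracted as a data frame (for a real ExpressionSet, Biobase's pData(ExpSet) returns such a table). The helper check_strain and the sample names are hypothetical and are not part of the package.

```r
# Hypothetical phenotype table standing in for pData(ExpSet).
pheno <- data.frame(Strain = c(1, 1, 1, 2, 2, 2),
                    row.names = paste0("sample", 1:6))

# Hypothetical helper (not part of the package): stops with a message if the
# 'Strain' column is missing, uses other codes, or contains only one strain.
check_strain <- function(pheno) {
  if (!"Strain" %in% colnames(pheno)) {
    stop("phenoData needs a column called 'Strain'")
  }
  if (!all(pheno$Strain %in% c(1, 2))) {
    stop("'Strain' must be coded 1 and 2")
  }
  if (length(unique(pheno$Strain)) != 2) {
    stop("both strains must be present")
  }
  invisible(TRUE)
}

check_strain(pheno)
```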
Given an ExpressionSet of core expression values for an Affymetrix Exon array, RunExonModelWorkflow will attempt to model the expression data using one of two models.
For a multiple-exon transcript/gene, the following model is used:
Expression ~ Strain + Exon + Subject in Strain + Exon:Strain
For a single-exon transcript/gene, the model reduces to:
Expression ~ Strain + Subject in Strain
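As an illustration of the multiple-exon model, the sketch below fits it with base R's lm() on simulated data. This is one plausible way to express the formula and is not necessarily how the package fits it internally. The data, the object names, and the choice to write "Subject in Strain" as the nested term Strain:Subject are all assumptions.

```r
set.seed(1)

# Simulated long-format data for one hypothetical transcript:
# 2 strains, 3 subjects per strain, 4 exons, one value per subject and exon.
d <- expand.grid(Exon = paste0("E", 1:4), Subject = paste0("S", 1:6),
                 stringsAsFactors = TRUE)
d$Strain <- factor(ifelse(d$Subject %in% c("S1", "S2", "S3"), 1, 2))

# Add a strain effect that is confined to exon E3.
d$Expression <- rnorm(nrow(d), mean = 8, sd = 0.3) +
  ifelse(d$Strain == "2" & d$Exon == "E3", 2, 0)

# Assumption: "Subject in Strain" is written as the nested term Strain:Subject.
multi_fit <- lm(Expression ~ Strain + Exon + Strain:Subject + Strain:Exon,
                data = d)

# Sequential ANOVA table; the Strain:Exon row tests for an exon-specific
# strain difference.
multi_tab <- anova(multi_fit)
multi_tab
```

The single-exon model corresponds to dropping the Exon and Strain:Exon terms from the formula. It needs more than one observation per subject (for example, several probesets within the exon) to leave residual degrees of freedom.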
For a given Ensembl transcript or gene ID, the function will attempt to gather all probesets mapped to the associated exons and subset the ExpressionSet accordingly, producing a data frame appropriate for analysis. The appropriate exon model is then run, and the location and identity of the exon with maximum strain difference is returned, along with the raw p-values for that model as well as other flags and metrics (see section below).
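The idea of locating the exon with maximum strain difference can be sketched in base R as follows. How the package defines that difference is not stated here, so this example assumes the absolute difference between the two strain means per exon. The data frame and its values are hypothetical.

```r
# Hypothetical long-format data for one transcript:
# 2 strains x 3 exons x 2 subjects per strain.
d <- data.frame(
  Strain = factor(rep(c(1, 2), each = 6)),
  Exon = rep(rep(c("E1", "E2", "E3"), each = 2), times = 2),
  Expression = c(8.0, 8.2, 7.9, 8.1, 8.0, 8.2,    # strain 1
                 8.1, 8.3, 9.9, 10.1, 8.0, 8.4)   # strain 2
)

# Mean expression per exon (rows) and strain (columns "1" and "2").
means <- tapply(d$Expression, list(d$Exon, d$Strain), mean)

# Assumed definition: absolute difference between the strain means.
diffs <- abs(means[, "2"] - means[, "1"])

# Name of the exon with the largest strain difference.
max_exon <- names(which.max(diffs))
max_exon
```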
We strongly suggest the use of an FDR-based method such as qvalue to adjust the raw p-values for multiple comparisons before further analysis.
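As a minimal sketch of such an adjustment, the example below uses base R's p.adjust() with the Benjamini-Hochberg method in place of the qvalue package, which is a different FDR procedure. The vector raw_p and its transcript labels are hypothetical stand-ins for p-values taken from the workflow output.

```r
# Hypothetical raw p-values, one per transcript.
raw_p <- c(tx1 = 0.01, tx2 = 0.02, tx3 = 0.03, tx4 = 0.5)

# Benjamini-Hochberg adjustment controls the false discovery rate.
adj_p <- p.adjust(raw_p, method = "BH")

# Transcripts that pass a 5% FDR threshold.
significant <- names(adj_p)[adj_p < 0.05]
significant
```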
RunExonModelWorkflow returns a list with the following objects:
multi |
data frame that contains the following columns:
|
singles |
data frame that contains the following columns for single-exon transcripts:
|
notrun |
Character vector containing the IDs that were not run. This is usually because the corresponding probeset values for that transcript do not exist in the ExpressionSet (due to masking or other reasons). |
#mouseexonensembl.db package required
library(mouseexonensembl.db)

#load in sample dataset
data(exontestdata)

#show list of Transcript IDs
testTrans

results <- RunExonModelWorkflow(TestSetTrans, testTrans, dBPackage="mouseexonensembl.db")

#9 out of the 20 transcripts are multiple-exon transcripts
results$multi

#5 single-exon results in test set
results$singles

#some transcripts are not run
results$notrun