Query verification
- A total of n = 134 target identifiers were provided (type: symbol, option ignore_id_err = TRUE)
- All query identifiers have been mapped to identifiers for known human genes (including non-ambiguous aliases), and valid/invalid entries in the query set are indicated as follows:
- Invalid identifier : n = 0
- Valid identifier (mapped as alias) : n = 4
- Valid identifier : n = 130
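The verification step above can be illustrated with a minimal sketch. It assumes two lookup tables, one of primary gene symbols and one of unambiguous aliases; the function name `verify_query` and the small tables are hypothetical and are not taken from the report's software.

```python
def verify_query(symbols, primary_symbols, alias_to_symbol):
    """Sort query identifiers into valid, valid (mapped as alias), and invalid."""
    report = {"valid": [], "valid_alias": [], "invalid": []}
    for raw in symbols:
        sym = raw.strip()
        if sym in primary_symbols:
            report["valid"].append(sym)
        elif sym in alias_to_symbol:
            # Keep both the alias and the primary symbol it maps to
            report["valid_alias"].append((sym, alias_to_symbol[sym]))
        else:
            report["invalid"].append(sym)
    return report


# Illustrative lookup tables (assumption: aliases are already unambiguous)
primary = {"TP53", "ERBB2", "KRAS"}
aliases = {"HER2": "ERBB2"}
result = verify_query(["TP53", "HER2", "NOTAGENE"], primary, aliases)
```

With an option such as ignore_id_err = TRUE, invalid entries would presumably be reported and skipped rather than stopping the analysis.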
Disease associations
- Each protein in the query set is annotated with:
Query set - cancer association rank
- Query set genes are ranked according to their overall strength of association to cancer phenotype ontology terms, visualized in varying shades of blue. Specifically, ranking is based on the sum of mean association scores per tumor type/tissue, scaled as percent rank within the query set (column targetset_cancer_prank)
Query set - association strength per tumor type
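As a rough illustration of the ranking described above, the sketch below sums the mean association scores per tumor type for each gene and converts the totals to a percent rank. The percent rank formula used here, (number of genes with a lower total) / (n - 1), is an assumption; the report does not state the exact definition, and the gene names and scores are made up.

```python
def cancer_percent_rank(scores_by_gene):
    """scores_by_gene maps gene -> {tumor_type: mean association score}."""
    totals = {gene: sum(s.values()) for gene, s in scores_by_gene.items()}
    n = len(totals)
    pranks = {}
    for gene, total in totals.items():
        # Count genes with a strictly lower summed score
        below = sum(1 for t in totals.values() if t < total)
        pranks[gene] = below / (n - 1) if n > 1 else 1.0
    return pranks


# Made-up scores for three placeholder genes
example_scores = {
    "GENE_A": {"Breast": 0.9, "Lung": 0.6},
    "GENE_B": {"Breast": 0.3, "Lung": 0.1},
    "GENE_C": {"Breast": 0.0, "Lung": 0.0},
}
pranks = cancer_percent_rank(example_scores)
```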
- Top cancer-associated genes (maximum 100) in the query set are shown with their specific tumor-type association strengths (percent rank)
Cancer hallmark evidence
- Each gene in the query set is annotated with cancer hallmarks evidence (Hanahan & Weinberg, Cell, 2011), indicating genes associated with essential alterations in cell physiology that can dictate malignant growth.
- Data has been collected from the Open Targets Platform, and we list evidence for each hallmark per gene, indicated as either being promoted or suppressed
Poorly characterized genes
- The aim of this section is to highlight poorly characterized genes or genes with unknown function in the query set
- A set of uncharacterized/poorly characterized human protein-coding genes (n = 1128) has been established based on the following criteria:
- Genes specifically designated as uncharacterized or as open reading frames
- Missing gene function summary in NCBI Gene AND function summary in UniProt Knowledgebase
- Missing or limited (<= 2) gene ontology (GO) annotations with respect to molecular function (MF) and biological process (BP)
- Query genes found within the set of poorly characterized genes are listed below, colored in varying shades of red according to the level of missing characterization (from unknown function to poorly defined function)
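The three criteria listed above can be sketched as a function that reports which of them a gene record meets. The record fields (`name`, `ncbi_summary`, `uniprot_summary`, `n_go_mf`, `n_go_bp`) are hypothetical. Summing the MF and BP annotation counts is also an assumption, and the report does not say how the criteria are combined into the final set.

```python
def missing_characterization(gene):
    """Return the list of criteria (as labels) that a gene record meets."""
    flags = []
    name = gene["name"].lower()
    if name.startswith("uncharacterized") or "open reading frame" in name:
        flags.append("uncharacterized_or_orf")
    # Both summaries must be missing
    if not gene["ncbi_summary"] and not gene["uniprot_summary"]:
        flags.append("no_function_summary")
    # Assumption: MF and BP annotations are counted together
    if gene["n_go_mf"] + gene["n_go_bp"] <= 2:
        flags.append("limited_go_annotation")
    return flags


# Placeholder record, not a real gene
example_gene = {
    "name": "example open reading frame",
    "ncbi_summary": "",
    "uniprot_summary": "",
    "n_go_mf": 1,
    "n_go_bp": 0,
}
flags = missing_characterization(example_gene)
```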
Drug associations
- Each gene/protein in the query set is annotated with:
- Targeted cancer drugs (inhibitors/antagonists), as found through the Open Targets Platform
- We distinguish between drugs in early clinical development/phase (ep), and drugs already in late clinical development/phase (lp)
Target tractabilities
- Each gene/protein in the query set is annotated with target tractability information (aka druggability) towards small molecules/compounds and antibodies
- Query genes are colored in varying shades of purple (from unknown tractability to clinical precedence)
Small molecules/compounds
Protein complexes
Here we show which members of the query set are involved in known protein complexes, using two different collections of protein complex annotations:
- OmniPath - a meta-database of molecular biology prior knowledge, containing protein complex annotations predominantly from CORUM, ComplexPortal, Compleat, and PDB.
- We limit complex annotations to those that are supported by references to the scientific literature (i.e. manually curated)
- Human Protein Complex Map - hu.MAP v2.0 - created through an integration of > 15,000 proteomics experiments (biochemical fractionation data, proximity labeling data, and RNA hairpin pulldown data)
- Each complex comes with a confidence score from clustering (1=Extremely High, 2=Very High, 3=High, 4=Medium High, 5=Medium)
The protein complexes that overlap with members of the query set are ranked according to the total number of participating members in the query set
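A minimal sketch of this ranking, assuming complexes are available as a mapping from complex name to member genes. The complex and gene names are placeholders, and ties are broken alphabetically as an arbitrary choice.

```python
def rank_complexes(complexes, query_genes):
    """Rank complexes by how many of their members occur in the query set."""
    query = set(query_genes)
    hits = []
    for name, members in complexes.items():
        overlap = sorted(query & set(members))
        if overlap:
            hits.append((name, len(overlap), overlap))
    # Most query members first; ties broken by complex name
    return sorted(hits, key=lambda h: (-h[1], h[0]))


example_complexes = {
    "complex_1": {"G1", "G2", "G3"},
    "complex_2": {"G2", "G9"},
    "complex_3": {"G7", "G8"},
}
ranked = rank_complexes(example_complexes, ["G1", "G2", "G3", "G4"])
```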
Function and pathway enrichment
- The query set is analyzed with clusterProfiler for functional enrichment/overrepresentation with respect to:
- Enrichment/overrepresentation test settings (clusterProfiler)
- P-value cutoff: 0.05
- Q-value cutoff: 0.2
- Correction for multiple testing: BH
- Minimal size of genes annotated by term for testing: 10
- Maximal size of genes annotated by term for testing: 500
- Background gene set description: All protein-coding genes
- Background gene set size: 19680
- Remove redundancy of enriched GO terms: TRUE
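To make the settings above concrete, here is a small standard-library Python sketch of an overrepresentation test: a one-sided hypergeometric p-value per term, followed by Benjamini-Hochberg (BH) adjustment. It is not the clusterProfiler implementation (which is an R package). The function names are hypothetical, the Q-value cutoff is not modelled, and applying the cutoff to the adjusted p-value is an assumption.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): n query genes drawn from N background genes, K annotated."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from largest to smallest p-value
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enrich(query, terms, background, min_size=10, max_size=500, p_cutoff=0.05):
    """Test each term whose size within the background is in [min_size, max_size]."""
    background = set(background)
    query = set(query) & background
    rows = []
    for term, genes in terms.items():
        genes = set(genes) & background
        if not (min_size <= len(genes) <= max_size):
            continue
        k = len(query & genes)
        p = hypergeom_pvalue(k, len(query), len(genes), len(background))
        rows.append((term, k, p))
    adjusted = bh_adjust([p for _, _, p in rows])
    return [(t, k, p, q) for (t, k, p), q in zip(rows, adjusted) if q < p_cutoff]
```

In this report's setting, `background` would hold the 19680 protein-coding genes and `query` the 134 target genes.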
Enrichment tables
Molecular Signatures Database (MSigDB)
WikiPathways
- No pathway signatures from WikiPathways were enriched in the query set.
Regulatory interactions
Using data from the OmniPath/DoRothEA gene set resource, we are here interrogating previously established transcription factor (TF) - target interactions for members of the query set. TF-target interactions in DoRothEA have been established according to different lines of evidence, i.e.
- literature-curated resources
- ChIP-seq peaks
- TF binding site motifs
- gene expression-inferred interactions.
In DoRothEA, each interaction is assigned a confidence level based on the amount of supporting evidence, ranging from A (highest confidence) to D (lowest confidence):
- A - Interactions supported by all four lines of evidence, manually curated by experts in specific reviews, or supported by at least two curated resources; these are considered highly reliable
- B-D - Curated and/or ChIP-seq interactions with different levels of additional evidence
- E - Used for interactions that are uniquely supported by computational predictions (not included in oncoEnrichR)
Here, we show regulatory interactions related to the queryset along three different axes:
- interactions for which both regulatory gene and regulatory target are found in the queryset
- interactions for which only the regulatory gene is found in the queryset
- interactions for which only the regulatory target is found in the queryset
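The three axes above amount to a simple partition of TF-target pairs by query set membership. The sketch below assumes interactions are given as (regulator, target, confidence) tuples and keeps only confidence levels A-D; the tuple layout and all names are illustrative.

```python
def split_interactions(interactions, query_genes, levels=("A", "B", "C", "D")):
    """Partition TF-target interactions by which partner is in the query set."""
    query = set(query_genes)
    axes = {"regulator_and_target": [], "regulator_only": [], "target_only": []}
    for regulator, target, confidence in interactions:
        if confidence not in levels:
            continue  # e.g. level E (computational predictions only)
        reg_in, tgt_in = regulator in query, target in query
        if reg_in and tgt_in:
            axes["regulator_and_target"].append((regulator, target, confidence))
        elif reg_in:
            axes["regulator_only"].append((regulator, target, confidence))
        elif tgt_in:
            axes["target_only"].append((regulator, target, confidence))
    return axes


example_interactions = [
    ("TF1", "G1", "A"),
    ("TF1", "G9", "B"),
    ("TF7", "G2", "C"),
    ("TF1", "G2", "E"),
    ("TF7", "G9", "A"),
]
axes = split_interactions(example_interactions, ["TF1", "G1", "G2"])
```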
We interrogate interactions in the query set for two separate collections of regulatory interactions in DoRothEA:
- regulatory interactions inferred with gene expression from GTEx (global set)
- regulatory interactions inferred with gene expression from TCGA (cancer-focused set)
DoRothEA - global set
Regulatory gene and target
DoRothEA - cancer-focused set
Regulatory gene and target
Key regulatory network interactions
Subcellular structures/compartments
Subcellular anatogram
Heatmap - query set
- In the heatmap below, value refers to the fraction of target genes that are annotated with a particular compartment/subcellular structure
Legend - subcellular structures

Subcellular data browser
By Compartment
- Genes listed per compartment are calculated using only compartment annotations with a minimum confidence level of: 1 (number of sources)
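A possible way to derive both the per-compartment gene lists and the heatmap fractions is sketched below. It assumes annotations come as (gene, compartment, number of sources) tuples and that fractions are taken over all query genes; applying the minimum-confidence filter to the heatmap values as well is an assumption, and all names are placeholders.

```python
def genes_per_compartment(annotations, query_genes, min_sources=1):
    """Group query genes by compartment, keeping annotations with enough sources."""
    query = set(query_genes)
    grouped = {}
    for gene, compartment, n_sources in annotations:
        if gene in query and n_sources >= min_sources:
            grouped.setdefault(compartment, set()).add(gene)
    return grouped


def compartment_fractions(grouped, query_genes):
    """Fraction of query genes annotated with each compartment."""
    n = len(set(query_genes))
    return {compartment: len(genes) / n for compartment, genes in grouped.items()}


example_annotations = [
    ("G1", "Nucleus", 3),
    ("G2", "Nucleus", 1),
    ("G2", "Cytosol", 2),
    ("G9", "Cytosol", 5),  # not in the query set, ignored
]
query = ["G1", "G2", "G3", "G4"]
grouped = genes_per_compartment(example_annotations, query, min_sources=1)
fractions = compartment_fractions(grouped, query)
```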
Tissue and cell type enrichment
- Using data from the Human Protein Atlas (HPA) - Cell/Tissue Atlas, we are here interrogating classification of all protein coding genes in the query set with respect to elevated expression in normal/healthy tissues and cell types
Target set - tissue specificity
- Genes have been classified, based on mean expression (across samples) per tissue in GTEx, into distinct specificity categories (algorithm developed within HPA):
- Not detected: Genes with a mean expression level less than 1 (TPM < 1) across all the tissues.
- Tissue enriched: Genes with a mean expression level greater than or equal to 1 (TPM >= 1) that also have at least four-fold higher expression levels in a particular tissue compared to all other tissues.
- Group enriched: Genes with a mean expression level greater than or equal to 1 (TPM >= 1) that also have at least four-fold higher expression levels in a group of 2-5 tissues compared to all other tissues, and that are not considered Tissue enriched.
- Tissue enhanced: Genes with a mean expression level greater than or equal to 1 (TPM >= 1) that also have at least four-fold higher expression levels in a particular tissue compared to the average levels in all other tissues, and that are not considered Tissue enriched or Group enriched.
- Low tissue specificity: Genes with an expression level greater than or equal to 1 (TPM >= 1) across all of the tissues that are not in any of the above 4 groups.
- Mixed: Genes that are not assigned to any of the above 5 groups.
- Enrichment of specific tissues in the query set (with respect to tissue-specific gene expression) is performed with TissueEnrich
- Only tissues that are enriched with an adjusted (Benjamini-Hochberg) p-value < 0.05 are listed
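The specificity categories above can be read as an ordered series of checks. The sketch below is one literal reading of the listed rules, not the HPA implementation; the function name and the example expression values are made up. The `max_group` parameter is 5 for tissues; a cell-type variant would use a larger group size.

```python
def specificity_category(expr, fold=4.0, detect=1.0, max_group=5):
    """expr maps tissue -> mean expression (TPM). Returns (category, tissues)."""
    ranked = sorted(expr.items(), key=lambda kv: kv[1], reverse=True)
    values = [v for _, v in ranked]
    n = len(values)
    if values[0] < detect:
        return "Not detected", []
    # Top tissue at least `fold` times higher than every other tissue
    if n > 1 and values[0] >= fold * values[1]:
        return "Tissue enriched", [ranked[0][0]]
    # Group of 2..max_group tissues, each at least `fold` times the rest
    for k in range(2, min(max_group, n - 1) + 1):
        if values[k - 1] >= detect and values[k - 1] >= fold * values[k]:
            return "Group enriched", [t for t, _ in ranked[:k]]
    # A tissue at least `fold` times the average of all other tissues
    if n > 1:
        total = sum(values)
        enhanced = [t for t, v in ranked
                    if v >= detect and v >= fold * (total - v) / (n - 1)]
        if enhanced:
            return "Tissue enhanced", enhanced
    if values[-1] >= detect:
        return "Low tissue specificity", []
    return "Mixed", []


enriched = specificity_category({"liver": 100, "lung": 5, "brain": 3})
group = specificity_category({"liver": 40, "lung": 30, "brain": 2, "skin": 1})
```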
Tissue specificities per target gene
Tissue enrichment - query set
- Considering the tissue specificities of members of the query set, NO TISSUES are enriched (adjusted p-value < 0.05) compared to the background set.
Target set - cell type specificity
- Genes have been classified, based on mean expression (across samples) per cell type, into distinct specificity categories (algorithm developed within HPA):
- Not detected: Genes with a mean expression level less than 1 (NX < 1) across all the cell types.
- Cell type enriched: Genes with a mean expression level greater than or equal to 1 (NX >= 1) that also have at least four-fold higher expression levels in a particular cell type compared to all other cell types.
- Group enriched: Genes with a mean expression level greater than or equal to 1 (NX >= 1) that also have at least four-fold higher expression levels in a group of 2-10 cell types compared to all other cell types, and that are not considered Cell type enriched.
- Cell type enhanced: Genes with a mean expression level greater than or equal to 1 (NX >= 1) that also have at least four-fold higher expression levels in a particular cell type compared to the average levels in all other cell types, and that are not considered Cell type enriched or Group enriched.
- Low cell type specificity: Genes with an expression level greater than or equal to 1 (NX >= 1) across all of the cell types that are not in any of the above 4 groups.
- Mixed: Genes that are not assigned to any of the above 5 groups.
- Enrichment of specific cell types in the query set (with respect to cell type-specific gene expression) is performed with TissueEnrich
- Only cell types that are enriched with an adjusted (Benjamini-Hochberg) p-value < 0.05 are listed
Cell type specificities per target gene
Cell type enrichment - query set
- Considering the cell-type specificities of members of the query set, NO CELL TYPES are enriched (adjusted p-value < 0.05) compared to the background set.
Protein-protein interaction network
- Using known protein-protein interactions (PPI), as evident from the STRING API (v11.5), we here create a dedicated PPI network for members of the query set
- Note that interactions in STRING are assembled from multiple sources, including co-expression, co-occurrence in the literature, experimental data, curated databases etc
- In addition to potential interactions within the query set, the network is expanded with n = 50 proteins that interact with proteins in the query set
- Network is here restricted to interactions with STRING association score >= 900 (range 0-1000)
- Drugs added to the network: TRUE
- Three different views are shown
- Complete protein-protein interaction network, also showing proteins with no known interactions
- Network community structures, as detected by the fast greedy modularity optimization algorithm by Clauset et al.
- Network centrality/hub scores per node, as measured by Kleinberg’s score
- Network legend:
- Target set proteins are shaped as circles, other interacting proteins are shaped as rectangles (note that sizes of nodes do not carry any value), drugs are shaped as diamonds
- Tumor suppressor genes (annotated from CancerMine) are HIGHLIGHTED IN RED
- Proto-oncogenes (annotated from CancerMine) are HIGHLIGHTED IN GREEN
- Genes predicted to have a dual role as proto-oncogenes/tumor suppressors (annotated from CancerMine) are HIGHLIGHTED IN BLACK
- Targeted cancer drugs (from Open Targets Platform):
- Compounds in late (3-4) clinical phases are HIGHLIGHTED IN ORANGE
- Compounds in early (1-2) clinical phases are HIGHLIGHTED IN PURPLE
- Use the mouse to zoom in/out, alter the position of nodes, mouse-over edges and nodes to view gene names/drug mechanism of actions (with indications)/interaction scores
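Two of the steps described for this network, the score threshold and the hub scores, can be sketched with the standard library alone. The code below filters edges by association score and then approximates Kleinberg's hub score on the resulting undirected graph by power iteration, scaling the highest score to 1. It is an illustrative sketch, not the report's implementation; the protein names and scores are placeholders, and community detection is not covered.

```python
def filter_edges(scored_edges, min_score=900):
    """Keep (a, b) pairs whose association score meets the threshold."""
    return [(a, b) for a, b, score in scored_edges if score >= min_score]


def hub_scores(edges, iterations=100):
    """Kleinberg hub scores on an undirected graph via power iteration."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if not neighbors:
        return {}
    scores = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        # Authority update, then hub update (identical roles when undirected)
        authority = {n: sum(scores[m] for m in neighbors[n]) for n in neighbors}
        hubs = {n: sum(authority[m] for m in neighbors[n]) for n in neighbors}
        top = max(hubs.values())
        scores = {n: v / top for n, v in hubs.items()}
    return scores


example_edges = [
    ("P1", "P2", 950),
    ("P1", "P3", 990),
    ("P2", "P3", 920),
    ("P1", "P4", 905),
    ("P4", "P5", 400),  # below the threshold, dropped
]
hubs = hub_scores(filter_edges(example_edges))
```

In this toy network P1 has the most high-confidence partners and receives the top score, while P5 drops out entirely because its only edge falls below the threshold.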
Ligand-receptor interactions
Secreted Signaling
- NO pair of genes in the query set is involved in ligand-receptor interactions (CellChatDB - Secreted Signaling)
ECM-Receptor
- NO pair of genes in the query set is involved in ligand-receptor interactions (CellChatDB - ECM-Receptor)
Tumor aberration frequencies
SNVs/InDels - oncoplots
- Frequencies of somatic SNVs/InDels in the query set genes (top mutated) are illustrated with oncoplots
- Query set mutation frequencies are sorted by type of diagnosis (i.e. cancer subtypes)
- The frequency of transitions/transversions is also shown per sample
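For reference, a transition is a purine-to-purine (A<->G) or pyrimidine-to-pyrimidine (C<->T) substitution; every other single-base change is a transversion. A minimal sketch of how the per-sample fractions could be computed from (reference, alternate) allele pairs follows; the function names and input layout are illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snv(ref, alt):
    """Return 'transition' or 'transversion' for a single-base substitution."""
    ref, alt = ref.upper(), alt.upper()
    bases = PURINES | PYRIMIDINES
    if ref not in bases or alt not in bases or ref == alt:
        raise ValueError(f"not a single-nucleotide substitution: {ref}>{alt}")
    same_class = ({ref, alt} <= PURINES) or ({ref, alt} <= PYRIMIDINES)
    return "transition" if same_class else "transversion"


def ti_tv_fractions(snvs):
    """Fractions of transitions and transversions among (ref, alt) pairs."""
    counts = {"transition": 0, "transversion": 0}
    for ref, alt in snvs:
        counts[classify_snv(ref, alt)] += 1
    total = sum(counts.values())
    return {kind: count / total for kind, count in counts.items()}


sample_fractions = ti_tv_fractions([("A", "G"), ("C", "T"), ("G", "T"), ("C", "A")])
```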
Breast

Colon/Rectum

Lung

Skin

Esophagus/Stomach

Cervix

Prostate

Ovary/Fallopian Tube

Uterus

Pancreas

Soft Tissue

Myeloid

CNS/Brain

Liver

Kidney
