Status with respect to roles as tumor suppressors/oncogenes, fetched from the CancerMine literature mining resource and the Network of Cancer Genes
Target genes are colored in varying shades of blue according to their level of association to cancer (specifically the maximum association score to a cancer phenotype at any level of resolution)
Poorly or uncharacterized targets
The aim of this section is to highlight poorly characterized genes or genes with unknown function in the target set
A set of uncharacterized/poorly characterized human protein-coding genes (n = 2819) has been established based on:
Genes specifically designated as uncharacterized or as open reading frames
Missing or limited (<= 2) gene ontology (GO) annotations with respect to molecular function (MF) and biological process (BP)
Ontology annotations attributed with an electronic annotation evidence code (IEA) are not considered in this calculation (less reliable due to lack of manual review)
Target genes found within the set of poorly characterized genes are listed below, colored in varying shades of red according to the level of missing characterization (from unknown function to poorly defined function)
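The GO-based criterion above can be sketched as follows. This is a minimal illustration, not the report's actual implementation: the input format (gene, GO aspect, evidence code triples), the gene names, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical input: (gene, GO aspect, evidence code) triples.
annotations = [
    ("GENE_A", "MF", "IDA"),
    ("GENE_A", "BP", "IEA"),  # electronic annotation, ignored
    ("GENE_B", "MF", "IDA"),
    ("GENE_B", "BP", "IMP"),
    ("GENE_B", "BP", "TAS"),
    ("GENE_C", "CC", "IDA"),  # cellular component, not counted
]


def count_curated_go(annotations, aspects=("MF", "BP")):
    """Count non-IEA annotations per gene for the given GO aspects."""
    counts = defaultdict(int)
    for gene, aspect, evidence in annotations:
        if aspect in aspects and evidence != "IEA":
            counts[gene] += 1
    return counts


def poorly_characterized(genes, annotations, max_terms=2):
    """Return genes with missing or limited (<= max_terms) MF/BP annotations."""
    counts = count_curated_go(annotations)
    return sorted(g for g in genes if counts[g] <= max_terms)


genes = ["GENE_A", "GENE_B", "GENE_C"]
flagged = poorly_characterized(genes, annotations)
print(flagged)
```

Here GENE_A retains one curated annotation, GENE_C none, and GENE_B three, so only the first two are flagged.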
Drug-target associations
Each protein in the target set is annotated with:
Targeted cancer drugs (inhibitors/antagonists), as found through the Open Targets Platform
We distinguish between drugs in early clinical development/phase (ep), and drugs already in late clinical development/phase (lp)
The complexes are ranked according to the total number of participating members in the target set
Subcellular structures/compartments
The target set is annotated with data from ComPPI, a database of subcellular localization data for human proteins, and results are here presented in two different views:
A subcellular anatogram - acting as a “heatmap” of subcellular structures associated with proteins in the target set
Compartments are here limited to the key compartments (n = 24) defined within the gganatogram package
An accompanying legend is also provided - depicting the locations of the various subcellular structures
A subcellular data browser
All subcellular compartment annotations per protein in the target set (“By Gene”)
All unique subcellular compartment annotations (unfiltered) and their target members (“By Compartment”)
Subcellular compartment annotations per gene are provided with a confidence level - indicating the number of different sources that support the compartment annotation
Minimum confidence level set by user: 1
Subcellular anatogram
Heatmap - target set
In the image below, value refers to the fraction of target genes that are annotated with a particular compartment/subcellular structure
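As a rough sketch of how such a value could be derived: for each compartment, count the target genes annotated with it and divide by the size of the target set. The gene names and the function below are hypothetical, and using all target genes (including those without any annotation) as the denominator is an assumption.

```python
def compartment_fractions(gene_compartments):
    """Fraction of target genes annotated with each compartment.

    gene_compartments maps gene -> set of compartment names.
    Assumption: the denominator is the full target set.
    """
    n_targets = len(gene_compartments)
    counts = {}
    for compartments in gene_compartments.values():
        for compartment in set(compartments):
            counts[compartment] = counts.get(compartment, 0) + 1
    return {c: n / n_targets for c, n in counts.items()}


# Made-up annotations for a four-gene target set.
target_annotations = {
    "GENE_A": {"nucleus", "cytosol"},
    "GENE_B": {"nucleus"},
    "GENE_C": {"mitochondrion"},
    "GENE_D": set(),
}
fractions = compartment_fractions(target_annotations)
print(fractions)
```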
Legend - subcellular structures
Subcellular data browser
By Gene
By Compartment
Genes listed per compartment are calculated using only compartment annotations with a minimum confidence level of: 1 (number of sources)
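The confidence filter can be illustrated with a small sketch. The record layout (gene, compartment, number of supporting sources) and all names are hypothetical; only the rule "keep annotations supported by at least the minimum number of sources" comes from the text above.

```python
from collections import defaultdict


def genes_by_compartment(records, min_confidence=1):
    """Group genes per compartment, keeping annotations with enough sources."""
    grouped = defaultdict(set)
    for gene, compartment, n_sources in records:
        if n_sources >= min_confidence:
            grouped[compartment].add(gene)
    return {c: sorted(g) for c, g in grouped.items()}


# Made-up records: (gene, compartment, number of supporting sources).
records = [
    ("GENE_A", "nucleus", 3),
    ("GENE_A", "cytosol", 1),
    ("GENE_B", "nucleus", 1),
]
by_compartment = genes_by_compartment(records, min_confidence=1)
strict = genes_by_compartment(records, min_confidence=2)
print(by_compartment)
print(strict)
```

Raising the minimum confidence level shrinks the listing to annotations backed by more sources.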
Tissue and cell type enrichment
Using data from the Human Protein Atlas (HPA) - Cell/Tissue Atlas, we examine how all protein-coding genes in the target set are classified with respect to elevated expression in normal/healthy tissues and cell types
Genes have been classified, based on mean expression (across samples) per tissue in GTEx, into distinct specificity categories (algorithm developed within HPA):
Not detected: Genes with a mean expression level less than 1 (TPM < 1) across all the tissues.
Tissue enriched: Genes with a mean expression level greater than or equal to 1 (TPM >= 1) that also have at least four-fold higher expression levels in a particular tissue compared to all other tissues.
Group enriched: Genes with a mean expression level greater than or equal to 1 (TPM >= 1) that also have at least four-fold higher expression levels in a group of 2-5 tissues compared to all other tissues, and that are not considered Tissue enriched.
Tissue enhanced: Genes with a mean expression level greater than or equal to 1 (TPM >= 1) that also have at least four-fold higher expression levels in a particular tissue compared to the average levels in all other tissues, and that are not considered Tissue enriched or Group enriched.
Low tissue specificity: Genes with an expression level greater than or equal to 1 (TPM >= 1) across all of the tissues that are not in any of the above 4 groups.
Mixed: Genes that are not assigned to any of the above 5 groups.
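The category rules above can be read as a decision cascade. The sketch below is one possible reading, not the HPA algorithm itself: in particular, treating "group enriched" as "the lowest value within the top 2-5 tissues is at least four-fold above the highest value outside the group" is an assumption, and the function name and example values are made up.

```python
def classify_specificity(tpm, fold=4.0, max_group=5):
    """Assign a specificity category from a dict of tissue -> mean TPM."""
    values = sorted(tpm.values(), reverse=True)
    if values[0] < 1:
        return "Not detected"
    # Top tissue at least four-fold above every other tissue.
    if len(values) > 1 and values[0] >= fold * values[1]:
        return "Tissue enriched"
    # Assumed reading: every member of a top-k group (k = 2..5) is at
    # least four-fold above the highest tissue outside the group.
    for k in range(2, max_group + 1):
        if k < len(values) and values[k - 1] >= fold * values[k]:
            return "Group enriched"
    # Top tissue at least four-fold above the average of the others.
    if len(values) > 1:
        mean_rest = sum(values[1:]) / (len(values) - 1)
        if values[0] >= fold * mean_rest:
            return "Tissue enhanced"
    if values[-1] >= 1:
        return "Low tissue specificity"
    return "Mixed"


# Made-up expression profiles, one per category.
examples = {
    "not_detected": {"liver": 0.2, "brain": 0.5, "lung": 0.1},
    "enriched": {"liver": 100, "brain": 5, "lung": 2},
    "group": {"liver": 40, "brain": 30, "lung": 2, "skin": 1},
    "enhanced": {"liver": 20, "brain": 8, "lung": 3, "skin": 1},
    "low": {"liver": 5, "brain": 4, "lung": 3, "skin": 3},
    "mixed": {"liver": 3, "brain": 2, "lung": 1.5, "skin": 0.5},
}
labels = {name: classify_specificity(p) for name, p in examples.items()}
print(labels)
```

The order of the checks matters: each category explicitly excludes genes already captured by an earlier one.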
Enrichment of specific tissues in the target set (with respect to tissue-specific gene expression) is performed with TissueEnrich
Only tissues that are enriched with an adjusted (Benjamini-Hochberg) p-value < 0.05 are listed
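For readers unfamiliar with the Benjamini-Hochberg adjustment, the following sketch shows the standard step-up procedure and the 0.05 filter applied afterwards. It illustrates the adjustment only, not the enrichment test that produces the raw p-values; the tissue names and p-values are made up.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values per tissue.
tissues = ["liver", "brain", "lung", "skin"]
raw = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
significant = [t for t, p in zip(tissues, adjusted) if p < 0.05]
print(adjusted)
print(significant)
```

Note that a raw p-value below 0.05 (here brain and lung) does not necessarily survive the adjustment.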
Tissue specificities per target gene
Tissue enrichment - target set
Considering the tissue specificities of members of the target set, NO TISSUES are enriched (adjusted p-value < 0.05) compared to the background set.
Target set - cell type specificity
Genes have been classified, based on mean expression (across samples) per cell type, into distinct specificity categories (algorithm developed within HPA):
Not detected: Genes with a mean expression level less than 1 (NX < 1) across all the cell types.
Cell type enriched: Genes with a mean expression level greater than or equal to 1 (NX >= 1) that also have at least four-fold higher expression levels in a particular cell type compared to all other cell types.
Group enriched: Genes with a mean expression level greater than or equal to 1 (NX >= 1) that also have at least four-fold higher expression levels in a group of 2-10 cell types compared to all other cell types, and that are not considered Cell type enriched.
Cell type enhanced: Genes with a mean expression level greater than or equal to 1 (NX >= 1) that also have at least four-fold higher expression levels in a particular cell type compared to the average levels in all other cell types, and that are not considered Cell type enriched or Group enriched.
Low cell type specificity: Genes with an expression level greater than or equal to 1 (NX >= 1) across all of the cell types that are not in any of the above 4 groups.
Mixed: Genes that are not assigned to any of the above 5 groups.
Enrichment of specific cell types in the target set (with respect to cell type-specific gene expression) is performed with TissueEnrich
Only cell types that are enriched with an adjusted (Benjamini-Hochberg) p-value < 0.05 are listed
Cell type specificities per target gene
Cell type enrichment - target set
Considering the cell-type specificities of members of the target set, NO CELL TYPES are enriched (adjusted p-value < 0.05) compared to the background set.
Protein-protein interaction network
The target set is queried against known protein-protein interactions, as retrieved from the STRING API (v11)
Note that interactions in STRING are assembled from multiple sources, including co-expression, co-occurrence in the literature, experimental data, curated databases, etc.
In addition to potential interactions within the target set, the network is expanded with n = 50 proteins that interact with proteins in the target set
The network is here restricted to interactions with STRING association score >= 900 (range 0-1000)
Drugs added to the network: TRUE
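The score restriction amounts to a simple threshold filter on the interaction list. The sketch below assumes interactions are available as (protein, protein, score) tuples on the 0-1000 scale; the protein names and scores are made up, and no call to STRING is made.

```python
def filter_interactions(edges, min_score=900):
    """Keep interactions whose association score meets the threshold."""
    return [(a, b, score) for a, b, score in edges if score >= min_score]


# Made-up interactions: (protein, protein, association score 0-1000).
edges = [
    ("P1", "P2", 999),
    ("P1", "P3", 900),
    ("P2", "P4", 700),
]
kept = filter_interactions(edges, min_score=900)
print(kept)
```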
Three different views are shown
Complete protein-protein interaction network, also showing proteins with no known interactions
Network community structures, as detected by the fast greedy modularity optimization algorithm by Clauset et al.
Network centrality/hub scores per node, as measured by Kleinberg’s score
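To give an intuition for Kleinberg's hub score, the sketch below runs the usual hub/authority power iteration on a small undirected network, where hub and authority scores coincide. This is a plain-Python illustration under stated assumptions, not the routine used to build the report: the node names are made up, the iteration count is fixed rather than convergence-tested, and scaling scores so the largest equals 1 is a choice made here.

```python
def hub_scores(edges, iterations=100):
    """Kleinberg hub scores for an undirected graph via power iteration."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    hubs = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        # Authority of a node: sum of hub scores of its neighbors.
        auth = {n: sum(hubs[m] for m in neighbors[n]) for n in neighbors}
        # Hub of a node: sum of authority scores of its neighbors.
        hubs = {n: sum(auth[m] for m in neighbors[n]) for n in neighbors}
        # Rescale so the largest score is 1 (avoids overflow).
        top = max(hubs.values())
        hubs = {n: v / top for n, v in hubs.items()}
    return hubs


# Made-up network: a triangle A-B-C with one extra node D attached to A.
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")]
scores = hub_scores(edges)
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```

Node A, which connects to every other node, receives the highest score, while the peripheral node D receives the lowest.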
Network legend:
Target set proteins are shaped as circles, other interacting proteins are shaped as rectangles (note that sizes of nodes do not carry any value), drugs are shaped as diamonds
Tumor suppressor genes (annotated from CancerMine) are HIGHLIGHTED IN RED
Proto-oncogenes (annotated from CancerMine) are HIGHLIGHTED IN GREEN
Genes predicted to have a dual role as proto-oncogenes/tumor suppressors (annotated from CancerMine) are HIGHLIGHTED IN BLACK