Project background

  • Project owner/contact: Raught et al.

  • Project description


Query verification

  • A total of n = 134 target identifiers were provided (type: symbol, option ignore_id_err = TRUE)
  • All query identifiers have been mapped to identifiers for known human genes (including non-ambiguous aliases), and valid/invalid entries in the query set are indicated as follows:
    •   Invalid identifier    :   n = 0
    •   Valid identifier (mapped as alias)    :   n = 4
    •   Valid identifier    :   n = 130
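
The three categories above can be illustrated with a minimal sketch. It assumes a set of primary gene symbols and an alias table mapping each alias to the set of primary symbols it may refer to; the function name `verify_query` and both lookup structures are hypothetical and are not taken from oncoEnrichR.

```python
def verify_query(symbols, primary_symbols, alias_to_symbols):
    """Classify query identifiers as valid, alias-mapped, or invalid.

    primary_symbols: set of official gene symbols (assumed input).
    alias_to_symbols: dict mapping an alias to the set of primary
    symbols it can refer to (assumed input).
    """
    result = {"valid": [], "alias": [], "invalid": []}
    for raw in symbols:
        symbol = raw.strip()
        if symbol in primary_symbols:
            result["valid"].append(symbol)
            continue
        targets = alias_to_symbols.get(symbol, set())
        if len(targets) == 1:
            # Non-ambiguous alias: exactly one primary symbol matches.
            result["alias"].append((symbol, next(iter(targets))))
        else:
            # Unknown identifier, or an alias shared by several genes.
            result["invalid"].append(symbol)
    return result


# Toy lookup tables (illustrative only).
primary = {"TP53", "ERBB2", "GENE_A", "GENE_B"}
aliases = {"HER2": {"ERBB2"}, "AMBIG1": {"GENE_A", "GENE_B"}}
report = verify_query(["TP53", "HER2", "AMBIG1", "NOT_A_GENE"], primary, aliases)
```

With an option such as ignore_id_err = TRUE, the invalid entries would presumably be dropped rather than stopping the analysis; that behaviour is not modelled here.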





Disease associations



Query set - cancer association rank


  • Query set genes are ranked according to their overall strength of association to cancer phenotype ontology terms, visualized in varying shades of blue. Specifically, ranking is based on the sum of mean association scores per tumor type/tissue, scaled as a percent rank within the query set (column targetset_cancer_prank)
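
As a rough illustration of the ranking described above, the sketch below sums the mean association scores per tumor type for each gene and converts the sums to a percent rank within the query set. The percent-rank definition used here (number of genes with a strictly smaller total, divided by n - 1) is an assumption, as are the function name and the input layout.

```python
def cancer_percent_rank(mean_scores_by_gene):
    """Percent rank (0-1) of each gene by its summed association score.

    mean_scores_by_gene: dict gene -> dict tumor_type -> mean score
    (assumed layout; genes without associations map to an empty dict).
    """
    totals = {
        gene: sum(per_type.values())
        for gene, per_type in mean_scores_by_gene.items()
    }
    n = len(totals)
    ranks = {}
    for gene, total in totals.items():
        # Count genes with a strictly smaller total (ties share a rank).
        smaller = sum(1 for other in totals.values() if other < total)
        ranks[gene] = smaller / (n - 1) if n > 1 else 0.0
    return ranks


# Toy scores (illustrative only).
prank = cancer_percent_rank({
    "GENE_A": {"Breast": 0.8, "Lung": 0.6},
    "GENE_B": {"Breast": 0.1},
    "GENE_C": {},
})
```

Multiplying the result by 100 gives the rank as a percentage.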






Query set - association strength per tumor type

  • Top cancer-associated genes (maximum 100) in the query set are shown with their specific tumor-type association strengths (percent rank)





Cancer hallmark evidence

  • Each gene in the query set is annotated with cancer hallmark evidence (Hanahan & Weinberg, Cell, 2011), indicating genes associated with essential alterations in cell physiology that can dictate malignant growth.
  • Data have been collected from the Open Targets Platform, and we list evidence for each hallmark per gene, indicated as either promoted or suppressed






Poorly characterized genes

  • The aim of this section is to highlight poorly characterized genes or genes with unknown function in the query set
  • A set of uncharacterized/poorly characterized human protein-coding genes (n = 1128) has been established based on the following criteria:
    1. Genes specifically designated as uncharacterized or as open reading frames
    2. Missing gene function summary in NCBI Gene AND function summary in UniProt Knowledgebase
    3. Missing or limited (<= 2) gene ontology (GO) annotations with respect to molecular function (MF) and biological process (BP)
  • Query genes found within the set of poorly characterized genes are listed below, colored in varying shades of red according to the level of missing characterization (from unknown function to poorly defined function)






Drug associations

  • Each gene/protein in the query set is annotated with:
    • Targeted cancer drugs (inhibitors/antagonists), as found through the Open Targets Platform
    • We distinguish between drugs in early clinical development/phase (ep), and drugs already in late clinical development/phase (lp)






Target tractabilities

  • Each gene/protein in the query set is annotated with target tractability information (aka druggability) towards small molecules/compounds and antibodies
  • Query genes are colored in varying shades of purple (from unknown tractability to clinical precedence)



Small molecules/compounds


Antibodies





Protein complexes

  • Here we show which members of the query set are involved in known protein complexes, using two different collections of protein complex annotations:

    1. OmniPath - a meta-database of molecular biology prior knowledge, containing protein complex annotations predominantly from CORUM, ComplexPortal, Compleat, and PDB.
      • We limit complex annotations to those that are supported by references to the scientific literature (i.e. manually curated)
    2. Human Protein Complex Map - hu.MAP v2.0 - created through an integration of > 15,000 proteomics experiments (biochemical fractionation data, proximity labeling data, and RNA hairpin pulldown data)
      • Each complex comes with a confidence score from clustering (1=Extremely High, 2=Very High, 3=High, 4=Medium High, 5=Medium)
  • The protein complexes that overlap with members of the query set are ranked according to the total number of participating members in the query set
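
The ranking in the last bullet can be sketched as follows, assuming complex annotations are available as a mapping from complex name to member genes. The function name and the toy complexes are hypothetical; ties are broken alphabetically here, which is an arbitrary choice.

```python
def rank_complexes(complexes, query_set):
    """Rank complexes by the number of members found in the query set.

    complexes: dict complex_name -> iterable of member genes (assumed).
    Returns [(complex_name, sorted overlapping genes)], largest overlap first.
    """
    query = set(query_set)
    hits = []
    for name, members in complexes.items():
        overlap = sorted(query & set(members))
        if overlap:
            hits.append((name, overlap))
    # Most participating query members first; name as tie-breaker.
    hits.sort(key=lambda item: (-len(item[1]), item[0]))
    return hits


# Toy annotations (illustrative only).
ranked = rank_complexes(
    {
        "complex_1": ["G1", "G2", "G9"],
        "complex_2": ["G1", "G2", "G3"],
        "complex_3": ["G7", "G8"],
    },
    query_set=["G1", "G2", "G3"],
)
```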



OmniPath



hu.MAP v2.0



Function and pathway enrichment


  • Enrichment/overrepresentation test settings (clusterProfiler)
    • P-value cutoff: 0.05
    • Q-value cutoff: 0.2
    • Correction for multiple testing: BH
    • Minimal size of genes annotated by term for testing: 10
    • Maximal size of genes annotated by term for testing: 500
    • Background gene set description: All protein-coding genes
    • Background gene set size: 19680
    • Remove redundancy of enriched GO terms: TRUE
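
To make the settings above concrete, here is a minimal sketch of an overrepresentation test: a one-sided hypergeometric test per term, a term-size filter, and Benjamini-Hochberg (BH) adjustment. It is not the clusterProfiler implementation; the function names are hypothetical, the q-value cutoff and GO redundancy removal are not modelled, and all genes are assumed to belong to the background set.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): n query genes drawn from N background genes, K annotated."""
    total = comb(N, n)
    upper = min(n, K)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enrich(query, term_to_genes, background_size,
           p_cutoff=0.05, min_size=10, max_size=500):
    """Return [(term, p, adjusted_p)] for terms passing the cutoff."""
    query = set(query)
    tested = []
    for term, genes in term_to_genes.items():
        genes = set(genes)
        if not (min_size <= len(genes) <= max_size):
            continue  # term too small or too large to be tested
        k = len(query & genes)
        p = hypergeom_pvalue(k, len(query), len(genes), background_size)
        tested.append((term, p))
    adjusted = bh_adjust([p for _, p in tested])
    return [(t, p, q) for (t, p), q in zip(tested, adjusted) if q <= p_cutoff]


# Toy example: one testable term, one term below the minimum size.
terms = {"T1": [f"g{i}" for i in range(10)], "T2": ["g0", "g1", "g2"]}
hits = enrich([f"g{i}" for i in range(5)], terms, background_size=1000)
```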



Enrichment tables


Gene Ontology






Molecular Signatures Database (MSigDB)






KEGG pathways






WikiPathways





  • No pathway signatures from WikiPathways were enriched in the query set.





NetPath






GO enrichment plots


All subontologies



Molecular function



Biological Process



Cellular Component



Regulatory interactions


  • Using data from the OmniPath/DoRothEA gene set resource, we interrogate previously established transcription factor (TF) - target interactions for members of the query set. TF-target interactions in DoRothEA have been established according to different lines of evidence:

    1. literature-curated resources
    2. ChIP-seq peaks
    3. TF binding site motifs
    4. gene expression-inferred interactions.
  • In DoRothEA, each interaction is assigned a confidence level based on the amount of supporting evidence, ranging from A (highest confidence) to E (lowest confidence); only levels A-D are used here:

    • A - Interactions supported by all four lines of evidence, manually curated by experts in specific reviews, or supported by at least two curated resources; these are considered highly reliable
    • B-D - Curated and/or ChIP-seq interactions with different levels of additional evidence
    • E - Used for interactions that are uniquely supported by computational predictions (not included in oncoEnrichR)
  • Here, we show regulatory interactions related to the query set along three different axes:

    1. interactions for which both regulatory gene and regulatory target are found in the query set
    2. interactions for which only the regulatory gene is found in the query set
    3. interactions for which only the regulatory target is found in the query set
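
The three axes can be sketched as a simple partition of TF-target records, keeping only confidence levels A-D. The function name, the record layout (regulator, target, confidence) and the toy gene names are assumptions made for illustration.

```python
def split_interactions(interactions, query_set):
    """Partition TF-target interactions by query set membership.

    interactions: iterable of (regulator, target, confidence) tuples
    (assumed layout); confidence level E is skipped.
    """
    query = set(query_set)
    views = {"regulator_and_target": [], "regulator_only": [], "target_only": []}
    for regulator, target, confidence in interactions:
        if confidence not in {"A", "B", "C", "D"}:
            continue
        record = (regulator, target, confidence)
        if regulator in query and target in query:
            views["regulator_and_target"].append(record)
        elif regulator in query:
            views["regulator_only"].append(record)
        elif target in query:
            views["target_only"].append(record)
    return views


# Toy interactions (illustrative only).
views = split_interactions(
    [
        ("TF1", "GENE1", "A"),
        ("TF1", "GENE9", "B"),
        ("TF9", "GENE1", "C"),
        ("TF1", "GENE1", "E"),
        ("TF9", "GENE9", "A"),
    ],
    query_set=["TF1", "GENE1"],
)
```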

We interrogate interactions in the query set for two separate collections of regulatory interactions in DoRothEA:

  1. regulatory interactions inferred with gene expression from GTEx (global set),
  2. regulatory interactions inferred with gene expression from TCGA (cancer-focused set)



DoRothEA - global set

Regulatory gene and target





Regulatory gene only





Target only





DoRothEA - cancer-focused set

Regulatory gene and target





Regulatory gene only





Target only




Key regulatory network interactions

  • Visualization of known regulatory interactions (DoRothEA - cancer-focused set) where both regulator and target are found in the query set

    • Edge length between nodes reflects confidence level of regulatory interaction (shorter lengths - higher confidence)
    • Edge color between nodes indicates mode of regulation (Stimulation vs. Repression)



Subcellular structures/compartments


  • The query set is annotated with data from ComPPI, a database of subcellular localization data for human proteins, and results are here presented in two different views:

    1. A subcellular anatogram - acting as a “heatmap” of subcellular structures associated with proteins in the query set
      • Compartments are here limited to the key compartments (n = 24) defined within the gganatogram package
      • An accompanying legend is also provided - depicting the locations of the various subcellular structures
    2. A subcellular data browser
      • All subcellular compartment annotations per protein in the query set (“By Gene”)
      • All unique subcellular compartment annotations (unfiltered) and their target members (“By Compartment”)
      • Subcellular compartment annotations per gene are provided with a confidence level - indicating the number of different sources that support the compartment annotation
        • Minimum confidence level set by user: 1
        • Ignore cytosol as a subcellular location: TRUE



Subcellular anatogram


Heatmap - query set


  • In the heatmap below, value refers to the fraction of target genes that are annotated with a particular compartment/subcellular structure
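
A minimal sketch of how such a fraction could be computed is shown below, applying a minimum confidence level (number of supporting sources) and optionally ignoring cytosol. The function name, the record layout, and the choice of denominator (all genes in the query set) are assumptions.

```python
def compartment_fractions(annotations, query_set,
                          min_confidence=1, ignore_cytosol=True):
    """Fraction of query genes annotated with each compartment.

    annotations: iterable of (gene, compartment, n_sources) tuples
    (assumed layout).
    """
    query = set(query_set)
    genes_by_compartment = {}
    for gene, compartment, n_sources in annotations:
        if gene not in query or n_sources < min_confidence:
            continue
        if ignore_cytosol and compartment.lower() == "cytosol":
            continue
        genes_by_compartment.setdefault(compartment, set()).add(gene)
    return {
        compartment: len(genes) / len(query)
        for compartment, genes in genes_by_compartment.items()
    }


# Toy annotations (illustrative only).
fractions = compartment_fractions(
    [
        ("G1", "nucleus", 3),
        ("G2", "nucleus", 1),
        ("G2", "cytosol", 2),
        ("G3", "mitochondrion", 1),
        ("G9", "nucleus", 5),
    ],
    query_set=["G1", "G2", "G3", "G4"],
)
```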



Legend - subcellular structures


Subcellular data browser


By Gene





By Compartment
  • Genes listed per compartment are calculated using only compartment annotations with a minimum confidence level of: 1 (number of sources)






Tissue and cell type enrichment



Target set - tissue specificity

  • Genes have been classified, based on mean expression (across samples) per tissue in GTEx, into distinct specificity categories (algorithm developed within HPA):
    • Not detected: Genes with a mean expression level less than 1 (TPM < 1) across all the tissues.
    • Tissue enriched: Genes with a mean expression level greater than or equal to 1 (TPM >= 1) that also have at least four-fold higher expression levels in a particular tissue compared to all other tissues.
    • Group enriched: Genes with a mean expression level greater than or equal to 1 (TPM >= 1) that also have at least four-fold higher expression levels in a group of 2-5 tissues compared to all other tissues, and that are not considered Tissue enriched.
    • Tissue enhanced: Genes with a mean expression level greater than or equal to 1 (TPM >= 1) that also have at least four-fold higher expression levels in a particular tissue compared to the average levels in all other tissues, and that are not considered Tissue enriched or Group enriched.
    • Low tissue specificity: Genes with an expression level greater than or equal to 1 (TPM >= 1) across all of the tissues that are not in any of the above 4 groups.
    • Mixed: Genes that are not assigned to any of the above 5 groups.
  • Enrichment of specific tissues in the query set (with respect to tissue-specific gene expression) is performed with TissueEnrich
    • Only tissues that are enriched with an adjusted (Benjamini-Hochberg) p-value < 0.05 are listed
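
The specificity categories above can be read as a decision cascade, sketched below for a single gene. This is an interpretation of the listed rules, not the HPA implementation: "four-fold higher than all other tissues" is taken to mean four-fold higher than the highest remaining tissue, group members are required to reach the detection threshold, and the function name and input layout are hypothetical.

```python
def classify_specificity(tpm_by_tissue, fold=4.0, detect=1.0, max_group=5):
    """Assign a specificity category from mean TPM per tissue."""
    values = sorted(tpm_by_tissue.values(), reverse=True)
    n = len(values)
    if n < 2:
        raise ValueError("need expression values for at least two tissues")
    if values[0] < detect:
        return "Not detected"
    # One tissue at least four-fold above every other tissue.
    if values[0] >= fold * values[1]:
        return "Tissue enriched"
    # A group of 2..max_group tissues, each four-fold above the rest.
    for k in range(2, min(max_group, n - 1) + 1):
        if values[k - 1] >= detect and values[k - 1] >= fold * values[k]:
            return "Group enriched"
    # Top tissue four-fold above the average of all other tissues.
    rest_mean = sum(values[1:]) / (n - 1)
    if values[0] >= fold * rest_mean:
        return "Tissue enhanced"
    if values[-1] >= detect:
        return "Low tissue specificity"
    return "Mixed"


# Toy expression profiles (illustrative only).
example = classify_specificity({"liver": 40, "lung": 5, "brain": 2})
```

The same cascade would apply to the cell type categories by replacing TPM with NX and allowing groups of up to 10 (max_group=10).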





Tissue specificities per target gene




Tissue enrichment - query set

  • Considering the tissue specificities of members of the query set, NO TISSUES are enriched (adjusted p-value < 0.05) compared to the background set.





Target set - cell type specificity

  • Genes have been classified, based on mean expression (across samples) per cell type, into distinct specificity categories (algorithm developed within HPA):
    • Not detected: Genes with a mean expression level less than 1 (NX < 1) across all the cell types.
    • Cell type enriched: Genes with a mean expression level greater than or equal to 1 (NX >= 1) that also have at least four-fold higher expression levels in a particular cell type compared to all other cell types.
    • Group enriched: Genes with a mean expression level greater than or equal to 1 (NX >= 1) that also have at least four-fold higher expression levels in a group of 2-10 cell types compared to all other cell types, and that are not considered Cell type enriched.
    • Cell type enhanced: Genes with a mean expression level greater than or equal to 1 (NX >= 1) that also have at least four-fold higher expression levels in a particular cell type compared to the average levels in all other cell types, and that are not considered Cell type enriched or Group enriched.
    • Low cell type specificity: Genes with an expression level greater than or equal to 1 (NX >= 1) across all of the cell types that are not in any of the above 4 groups.
    • Mixed: Genes that are not assigned to any of the above 5 groups.
  • Enrichment of specific cell types in the query set (with respect to cell type-specific gene expression) is performed with TissueEnrich
    • Only cell types that are enriched with an adjusted (Benjamini-Hochberg) p-value < 0.05 are listed





Cell type specificities per target gene




Cell type enrichment - query set



  • Considering the cell-type specificities of members of the query set, NO CELL TYPES are enriched (adjusted p-value < 0.05) compared to the background set.





Protein-protein interaction network

  • Using known protein-protein interactions (PPI), as evident from the STRING API (v11.5), we here create a dedicated PPI network for members of the query set
    • Note that interactions in STRING are assembled from multiple sources, including co-expression, co-occurrence in the literature, experimental data, curated databases etc
    • In addition to potential interactions within the query set, the network is expanded with n = 50 proteins that interact with proteins in the query set
    • Network is here restricted to interactions with STRING association score >= 900 (range 0-1000)
    • Drugs added to the network: TRUE
    • Three different views are shown
      • Complete protein-protein interaction network, also showing proteins with no known interactions
      • Network community structures, as detected by the fast greedy modularity optimization algorithm by Clauset et al.
      • Network centrality/hub scores per node, as measured by Kleinberg’s score
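
The score filter and the hub view can be sketched as follows: edges below the association score threshold are dropped, and Kleinberg hub scores are approximated by power iteration on the remaining undirected network. This is a simplified stand-in for the network library actually used by the report; the function name is hypothetical, and rescaling so that the largest score equals 1 is a choice made for this sketch.

```python
def hub_scores(edges, min_score=900, iterations=200):
    """Kleinberg hub scores for an undirected, score-filtered network.

    edges: iterable of (protein_a, protein_b, score) tuples (assumed layout).
    """
    neighbours = {}
    for a, b, score in edges:
        if score >= min_score:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    if not neighbours:
        return {}
    hubs = {node: 1.0 for node in neighbours}
    for _ in range(iterations):
        # Authority update, then hub update (HITS power iteration).
        authority = {n: sum(hubs[m] for m in neighbours[n]) for n in neighbours}
        hubs = {n: sum(authority[m] for m in neighbours[n]) for n in neighbours}
        top = max(hubs.values())
        hubs = {n: value / top for n, value in hubs.items()}
    return hubs


# Toy network: a triangle (A, B, C), a pendant node D, one weak edge to E.
scores = hub_scores([
    ("A", "B", 950), ("B", "C", 950), ("A", "C", 950),
    ("A", "D", 990), ("D", "E", 400),
])
```

In this toy network the best connected node (A) receives the highest hub score, and the weak edge to E is removed by the score filter.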


  • Network legend:
    • Target set proteins are shaped as circles, other interacting proteins are shaped as rectangles (note that sizes of nodes do not carry any value), drugs are shaped as diamonds
    • Tumor suppressor genes (annotated from CancerMine) are HIGHLIGHTED IN RED
    • Proto-oncogenes (annotated from CancerMine) are HIGHLIGHTED IN GREEN
    • Genes predicted to have a dual role as proto-oncogenes/tumor suppressors (annotated from CancerMine) are HIGHLIGHTED IN BLACK
    • Targeted cancer drugs (from Open Targets Platform):
      • Compounds in late (3-4) clinical phases are HIGHLIGHTED IN ORANGE
      • Compounds in early (1-2) clinical phases are HIGHLIGHTED IN PURPLE


  • Use the mouse to zoom in/out and alter the position of nodes; mouse over edges and nodes to view gene names, drug mechanisms of action (with indications), and interaction scores



Complete network



Network communities



Network hubs




Ligand-receptor interactions


  • Using data from the CellChatDB resource, we interrogate ligand-receptor interactions for members of the query set. Putative interactions are displayed along three different axes with respect to cell-cell communication:

    1. Secreted Signaling (Paracrine/autocrine signaling)
    2. ECM-Receptor (extracellular matrix-receptor interactions)
    3. Cell-Cell Contact



Secreted Signaling



  • No pair of genes in the query set is involved in ligand-receptor interactions (CellChatDB - Secreted Signaling)




ECM-Receptor



  • No pair of genes in the query set is involved in ligand-receptor interactions (CellChatDB - ECM-Receptor)




Cell-Cell Contact



  • No pair of genes in the query set is involved in ligand-receptor interactions (CellChatDB - Cell-Cell Contact)




Tumor aberration frequencies



SNVs/InDels - oncoplots

  • Frequencies of somatic SNVs/InDels in the query set genes (top mutated) are illustrated with oncoplots
  • Query set mutation frequencies are sorted by type of diagnosis (i.e. cancer subtypes)
  • The frequency of transitions/transversions is also shown per sample



Breast

Colon/Rectum

Lung

Skin

Esophagus/Stomach

Cervix

Prostate

Ovary/Fallopian Tube

Uterus

Pancreas

Soft Tissue

Myeloid

CNS/Brain

Liver

Kidney

Lymphoid

Head and Neck

Biliary Tract

Bladder/Urinary Tract

Pleura

Thyroid

SNVs/InDels - recurrent variants


  • Browse recurrent, protein-coding somatic SNVs/InDels from TCGA in the query set
    • Variants are listed as one record per tissue/site, so the same variant may occur in multiple rows
    • Only variants with a site-specific frequency >= 2 are shown
    • Variants can be filtered based on various properties, e.g. Site/tissue variant recurrence, as well as overall recurrence across all tumor sites (column Pancancer variant recurrence)
    • Notably, each variant has been annotated/classified with a
      1. loss-of-function status, based on the LOFTEE plugin in VEP
      2. status as a somatic mutation hotspot in cancer, according to cancerhotspots.org. Format: <GENE>|<CODON>|<Q-VALUE>
  • Top 2,500 recurrent variants are listed here (all variants are listed in the Excel output of oncoEnrichR)






Copy number alterations

  • Genes targeted by somatic copy number alterations (sCNAs) in tumor samples have been retrieved from TCGA, where copy number state has been estimated with GISTIC
  • Gene aberration frequencies are plotted in heatmaps across two categories of mutation types
    1. sCNA - amplifications
    2. sCNA - homozygous deletions
  • The values in the heatmaps reflect the percent of all tumor samples per primary site with the gene amplified/lost (percent_mutated)
    • Genes in the heatmap are ranked according to alteration frequency across all sites (i.e. pancancer), limited to the top 75 genes in the query set
  • Frequencies across all subtypes per primary site are listed in an interactive table
    • Only including genes that are aberrant in >= 1 percent of samples for a given tumor type/subtype
    • Limited here to top 2,500 gene-subtype frequencies (all aberration frequencies are listed in the Excel output of oncoEnrichR)










Tumor co-expression

  • Using RNA-seq data from ~9,500 primary tumor samples in TCGA, a co-expression correlation matrix (Pearson rank correlation coefficient) was calculated, indicating pairs of genes that have their expression patterns correlated in tumors
  • Here, we are showing, across the main primary tumor sites in TCGA:
    • Tumor suppressor genes, proto-oncogenes or cancer driver genes with a strong/very strong (r >= 0.7 or r <= -0.7 ) correlation to genes in the query set
    • Here, a maximum of 2500 associations are shown per correlation direction (the complete set is listed in the Excel output of oncoEnrichR)



Positive correlation




Negative correlation






Prognostic associations

Gene expression associations - Human Protein Atlas

  • Based on data from the Human Protein Atlas - Pathology Atlas, we are here listing significant results from correlation analyses of mRNA expression levels of human genes in tumor tissue and the clinical outcome (survival) for ~8,000 cancer patients (TCGA)
  • All correlation analyses have been performed in a gene-centric manner, and associations are only shown for genes in the query set. We distinguish between
    • Favorable associations: high expression of a given gene is associated with better survival
    • Unfavorable associations: high expression of a given gene is associated with worse survival
  • Strength of association is given by p-values (only associations with a p-value <= 0.001 are shown). In addition, we provide a percentile rank for associations, considering
    1. all significant (p-value <= 0.001) associations across all tumor sites (column percentile_rank_all), and
    2. only significant (p-value <= 0.001) associations found in the same tumor site (column percentile_rank_site)



Favorable associations




Unfavorable associations







Genetic determinants of survival in cancer

  • Based on data recently calculated by Smith et al., bioRxiv, 2021, we here show the relative prognostic implications of genes in the query set, for different genetic features (here limited to expression, mutation, methylation, and CNA).
    • The data provided by Smith et al. was harvested through analysis of TCGA datasets, in which Cox proportional hazards models were generated linking the expression, copy number, methylation, or mutation status of every gene in the genome with patient outcome in different cancer types profiled by the TCGA.
    • Kaplan-Meier curves were generated by dividing patients into two groups and comparing the survival times between each group.
      • For RNA-Seq, the division is made based on the mean expression of the feature.
      • For CNAs, the division is made based on the mean copy number of the feature.
      • For methylation, the division is made based on the mean methylation level of the feature.
      • For mutations, the division is made between patients who have a mutation (protein-coding only) in a gene and patients who lack mutations in a gene.
    • Each gene has been attributed, for each genetic feature, with a Z-score (Wald statistic), indicating the relative strength of association of a particular feature to either
      • Death (positive scores), or
      • Survival (negative scores)
    • Shown here are four heatmaps (CNA, expression, mutation, methylation) with prognostic Z-scores for members of the query set (limited to the top 100). Tumor type cohorts are designated with TCGA study type abbreviations (e.g. COAD = Colon Adenocarcinoma, BRCA = Breast Invasive Carcinoma etc.).
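
The group division and Kaplan-Meier step described above can be sketched as follows. This is a generic illustration, not the code used by Smith et al.: the function names are hypothetical, patients with a value exactly equal to the mean are placed in the lower group (an assumption), and the Cox model that produces the Z-scores is not reproduced.

```python
def split_by_mean(values):
    """Return (high, low) patient indices relative to the mean value."""
    mean = sum(values) / len(values)
    high = [i for i, v in enumerate(values) if v > mean]
    low = [i for i, v in enumerate(values) if v <= mean]
    return high, low


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as [(time, survival)] at each event time.

    times: follow-up time per patient; events: 1 = death, 0 = censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Patients who died or were censored at t leave the risk set.
        at_risk -= leaving
    return curve


# Toy cohort (illustrative only): expression, follow-up time, death flag.
expression = [1.0, 2.0, 3.0, 10.0]
times = [5, 8, 12, 3]
events = [1, 0, 1, 1]
high, low = split_by_mean(expression)
low_curve = kaplan_meier([times[i] for i in low], [events[i] for i in low])
```

Comparing the curve of the high group with that of the low group corresponds to the two-group comparison described in the bullets above.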


Mutation





Expression





CNA





Methylation





Synthetic lethality


  • Using recently published predictions on synthetic lethal interactions in human cancer cell lines (De Kegel et al., Cell Syst., 2021), we here show whether members of the query set are found among these interactions.
  • Note that predictions in the study by De Kegel et al. are provided for human gene paralogs only
  • In the tables below, predictions below the 50th percentile are ignored (the complete set can be found in the Excel output of oncoEnrichR). Additional properties of each predicted pair of interactors are included for filtering, including:
    • Prediction score and percentile
    • Percent sequence identity between the genes
    • Size of gene paralog family
  • The higher the prediction score, the more confident the prediction is with respect to a synthetic lethality interaction.



Predicted synthetic lethality interactions


Both pair members in query set






Single pair member in query set






Gene fitness scores

  • In Project Score, systematic genome-scale CRISPR/Cas9 drop-out screens are performed in a large number of highly-annotated cancer models to identify genes required for cell fitness in defined molecular contexts
  • Here, we are showing, across the main human tissue types:
    • Genes in the query set that are annotated with a statistically significant effect on cell fitness in any of the screened cancer cell lines (fitness score is here considered a quantitative measure of the reduction of cell viability elicited by a gene inactivation, via CRISPR/Cas9 targeting). The fitness score is computed based on the BAGEL and CRISPRCleanR algorithms.
  • Settings
    • Maximum loss-of-fitness score per gene (BAGEL - scaled Bayes Factor): -2



Loss-of-fitness distribution




Loss-of-fitness table

  • A maximum of n = 2000 tissue-specific-cell-lines-per-gene are shown in the table below (the complete set of genes, including the loss-of-fitness scores per cell line, can be found in the Excel output of oncoEnrichR)




Target priority scores

  • Promising candidate therapeutic targets are indicated through target priority scores. Target priority scores are based on integration of CRISPR knockout gene fitness effects with genomic biomarker and patient data (Behan et al., Nature, 2019). All genes are assigned a target priority score between 0 and 100, from lowest to highest priority. In the heatmap shown below, genes in the query set are ranked according to their respective priority scores across all cancers (i.e. Pancancer), limited to the top 100 candidates.





Documentation

Annotation resources

The analysis performed in the oncoEnrichR report is based on the following main tools and knowledge resources:

  • Software
    • oncoEnrichR - R package for functional interrogation of genesets in the context of cancer (v1.0.9)
    • clusterProfiler - R package for comparing biological themes among gene clusters (v4.2.2)
    • tissueEnrich - R package used to calculate enrichment of tissue-specific genes in a set of input genes (v1.14.0)
    • oncoPhenoMap - Crossmapped phenotype ontologies for the oncology domain (v0.3.1)
    • visNetwork - R package for network visualization using vis.js library (v2.1.0)
    • rmarkdown - R package for conversion of Markdown documents into a variety of formats. (v2.13)

  • Databases/datasets
    • Omnipath - Database of molecular biology prior knowledge: gene regulatory interactions, enzyme-PTM relationships, protein complexes, protein annotations etc. (v3.2.8/OmnipathR)
    • hu.MAP - Human Protein Complex Map (v2.0)
    • dorothea - Gene set resource containing signed transcription factor (TF) - target interactions (v1.4.2)
    • STRING - Protein-protein interaction database (v11.5)
    • GENCODE - High quality reference gene annotation and experimental validation (v39)
    • TCGA - The Cancer Genome Atlas - Tumor gene expression and somatic DNA aberrations (v31.0 (October 29th 2021))
    • UniProtKB - Comprehensive resource of protein sequence and functional information (v2021_04)
    • NetPath - Manually curated resource of signal transduction pathways in humans (v1 (2010))
    • EFO - Experimental Factor Ontology (v3.40.0)
    • DiseaseOntology - Human Disease Ontology (2022-03-02)
    • COMPPI - Compartmentalized protein-protein interaction database (v2.1.1 (Oct 2018))
    • WikiPathways - A database of biological pathways maintained by and for the scientific community (20220310)
    • MSigDB - Molecular Signatures Database - collection of annotated gene sets (v7.5.1 (Jan 2022))
    • REACTOME - Manually curated and peer-reviewed pathway database (v78 (MSigDB v7.5.1))
    • CellChatDB - Multimeric ligand-receptor complexes (v1 (2021))
    • GeneOntology - Knowledgebase that contains the largest structural source of information on the functions of genes (Jan 2022 (MSigDB v7.5.1))
    • KEGG - Collection of manually drawn pathway maps representing our knowledge on the molecular interaction, reaction and relation networks (20220324)
    • CancerMine - Literature-mined database of tumor suppressor genes/proto-oncogenes (v43 - 20220221)
    • NCG - Network of cancer genes - a web resource to analyze duplicability, orthology and network properties of cancer genes (v7.0)
    • Human Protein Atlas - Knowledge resource on human proteins in relation to tissue/cell type specificity and cancer prognosis (v21.0 - 20211118)
    • Genotype-Tissue Expression (GTEx) project - Ongoing effort to build a comprehensive public resource to study tissue-specific gene expression and regulation (v8)
    • Project Score - Database with systematic genome-scale CRISPR/Cas9 drop-out screens in a large number of highly-annotated cancer models (v2 - July 2021)
    • Open Targets Platform - Comprehensive and robust data integration for target-disease associations (2022.02)



References

Ashburner, M, C A Ball, J A Blake, D Botstein, H Butler, J M Cherry, A P Davis, et al. 2000. “Gene Ontology: Tool for the Unification of Biology. The Gene Ontology Consortium.” Nat. Genet. 25 (1): 25–29. http://dx.doi.org/10.1038/75556.
Behan, Fiona M, Francesco Iorio, Gabriele Picco, Emanuel Gonçalves, Charlotte M Beaver, Giorgia Migliardi, Rita Santos, et al. 2019. “Prioritization of Cancer Therapeutic Targets Using CRISPR-Cas9 Screens.” Nature 568 (7753): 511–16. http://dx.doi.org/10.1038/s41586-019-1103-9.
Clauset, Aaron, M E J Newman, and Cristopher Moore. 2004. “Finding Community Structure in Very Large Networks.” Phys. Rev. E Stat. Nonlin. Soft Matter Phys. 70 (6 Pt 2): 066111. http://dx.doi.org/10.1103/PhysRevE.70.066111.
De Kegel, Barbara, Niall Quinn, Nicola A Thompson, David J Adams, and Colm J Ryan. 2021. “Comprehensive Prediction of Robust Synthetic Lethality Between Paralog Pairs in Cancer Cell Lines.” Cell Syst 12 (12): 1144–1159.e6. http://dx.doi.org/10.1016/j.cels.2021.08.006.
Drew, Kevin, John B Wallingford, and Edward M Marcotte. 2021. “hu.MAP 2.0: Integration of over 15,000 Proteomic Experiments Builds a Global Compendium of Human Multiprotein Assemblies.” Mol. Syst. Biol. 17 (5): e10016. http://dx.doi.org/10.15252/msb.202010016.
Garcia-Alonso, Luz, Christian H Holland, Mahmoud M Ibrahim, Denes Turei, and Julio Saez-Rodriguez. 2019. “Benchmark and Integration of Resources for the Estimation of Human Transcription Factor Activities.” Genome Res. 29 (8): 1363–75. http://dx.doi.org/10.1101/gr.240663.118.
Giurgiu, Madalina, Julian Reinhard, Barbara Brauner, Irmtraud Dunger-Kaltenbach, Gisela Fobo, Goar Frishman, Corinna Montrone, and Andreas Ruepp. 2019. “CORUM: The Comprehensive Resource of Mammalian Protein Complexes—2019.” Nucleic Acids Res. 47 (D1): D559–63. https://academic.oup.com/nar/article-abstract/47/D1/D559/5144160.
Hanahan, Douglas, and Robert A Weinberg. 2011. “Hallmarks of Cancer: The Next Generation.” Cell 144 (5): 646–74. http://dx.doi.org/10.1016/j.cell.2011.02.013.
Hart, Traver, and Jason Moffat. 2016. “BAGEL: A Computational Framework for Identifying Essential Genes from Pooled Library Screens.” BMC Bioinformatics 17 (April): 164. http://dx.doi.org/10.1186/s12859-016-1015-8.
Iorio, Francesco, Fiona M Behan, Emanuel Gonçalves, Shriram G Bhosle, Elisabeth Chen, Rebecca Shepherd, Charlotte Beaver, et al. 2018. “Unsupervised Correction of Gene-Independent Cell Responses to CRISPR-Cas9 Targeting.” BMC Genomics 19 (1): 604. http://dx.doi.org/10.1186/s12864-018-4989-y.
Jain, Ashish, and Geetu Tuteja. 2019. “TissueEnrich: Tissue-Specific Gene Enrichment Analysis.” Bioinformatics 35 (11): 1966–67. http://dx.doi.org/10.1093/bioinformatics/bty890.
Jin, Suoqin, Christian F Guerrero-Juarez, Lihua Zhang, Ivan Chang, Raul Ramos, Chen-Hsiang Kuan, Peggy Myung, Maksim V Plikus, and Qing Nie. 2021. “Inference and Analysis of Cell-Cell Communication Using CellChat.” Nat. Commun. 12 (1): 1088. http://dx.doi.org/10.1038/s41467-021-21246-9.
Joshi-Tope, G, M Gillespie, I Vastrik, P D’Eustachio, E Schmidt, B de Bono, B Jassal, et al. 2005. “Reactome: A Knowledgebase of Biological Pathways.” Nucleic Acids Res. 33 (Database issue): D428–32. http://dx.doi.org/10.1093/nar/gki072.
Kandasamy, Kumaran, S Sujatha Mohan, Rajesh Raju, Shivakumar Keerthikumar, Ghantasala S Sameer Kumar, Abhilash K Venugopal, Deepthi Telikicherla, et al. 2010. “NetPath: A Public Resource of Curated Signal Transduction Pathways.” Genome Biol. 11 (1): R3. http://dx.doi.org/10.1186/gb-2010-11-1-r3.
Kanehisa, M, and S Goto. 2000. “KEGG: Kyoto Encyclopedia of Genes and Genomes.” Nucleic Acids Res. 28 (1): 27–30. http://dx.doi.org/10.1093/nar/28.1.27.
Kelder, Thomas, Martijn P van Iersel, Kristina Hanspers, Martina Kutmon, Bruce R Conklin, Chris T Evelo, and Alexander R Pico. 2012. “WikiPathways: Building Research Communities on Biological Pathways.” Nucleic Acids Res. 40 (Database issue): D1301–7. http://dx.doi.org/10.1093/nar/gkr1074.
Kleinberg, Jon M. 1999. “Authoritative Sources in a Hyperlinked Environment.” J. ACM 46 (5): 604–32. http://doi.acm.org/10.1145/324133.324140.
Koscielny, Gautier, Peter An, Denise Carvalho-Silva, Jennifer A Cham, Luca Fumis, Rippa Gasparyan, Samiul Hasan, et al. 2017. “Open Targets: A Platform for Therapeutic Target Identification and Validation.” Nucleic Acids Res. 45 (D1): D985–94. http://dx.doi.org/10.1093/nar/gkw1055.
Mermel, Craig H, Steven E Schumacher, Barbara Hill, Matthew L Meyerson, Rameen Beroukhim, and Gad Getz. 2011. “Gistic2.0 Facilitates Sensitive and Confident Localization of the Targets of Focal Somatic Copy-Number Alteration in Human Cancers.” Genome Biol. 12 (4): R41. http://dx.doi.org/10.1186/gb-2011-12-4-r41.
Petryszak, Robert, Maria Keays, Y Amy Tang, Nuno A Fonseca, Elisabet Barrera, Tony Burdett, Anja Füllgrabe, et al. 2016. “Expression Atlas Update—an Integrated Database of Gene and Protein Expression in Humans, Animals and Plants.” Nucleic Acids Res. 44 (D1): D746–52. https://academic.oup.com/nar/article-abstract/44/D1/D746/2502589.
Smith, Joan C, and Jason M Sheltzer. 2021. “Genome-Wide Identification and Analysis of Prognostic Features in Human Cancers.” bioRxiv. https://www.biorxiv.org/content/10.1101/2021.06.01.446243v1.
Subramanian, Aravind, Pablo Tamayo, Vamsi K Mootha, Sayan Mukherjee, Benjamin L Ebert, Michael A Gillette, Amanda Paulovich, et al. 2005. “Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles.” Proc. Natl. Acad. Sci. U. S. A. 102 (43): 15545–50. http://dx.doi.org/10.1073/pnas.0506580102.
Türei, Dénes, Tamás Korcsmáros, and Julio Saez-Rodriguez. 2016. “OmniPath: Guidelines and Gateway for Literature-Curated Signaling Pathway Resources.” Nat. Methods 13 (12): 966–67. http://dx.doi.org/10.1038/nmeth.4077.
Uhlen, Mathias, Cheng Zhang, Sunjae Lee, Evelina Sjöstedt, Linn Fagerberg, Gholamreza Bidkhori, Rui Benfeitas, et al. 2017. “A Pathology Atlas of the Human Cancer Transcriptome.” Science 357 (6352). http://dx.doi.org/10.1126/science.aan2507.
Uhlén, Mathias, Linn Fagerberg, Björn M Hallström, Cecilia Lindskog, Per Oksvold, Adil Mardinoglu, Åsa Sivertsson, et al. 2015. “Proteomics. Tissue-Based Map of the Human Proteome.” Science 347 (6220): 1260419. http://dx.doi.org/10.1126/science.1260419.
Von Mering, Christian, Lars J Jensen, Berend Snel, Sean D Hooper, Markus Krupp, Mathilde Foglierini, Nelly Jouffre, Martijn A Huynen, and Peer Bork. 2005. “STRING: Known and Predicted Protein–Protein Associations, Integrated and Transferred Across Organisms.” Nucleic Acids Res. 33 (suppl_1): D433–37. https://academic.oup.com/nar/article-abstract/33/suppl_1/D433/2505197.
Yu, Guangchuang, Li-Gen Wang, Yanyan Han, and Qing-Yu He. 2012. “clusterProfiler: An R Package for Comparing Biological Themes Among Gene Clusters.” OMICS 16 (5): 284–87. http://dx.doi.org/10.1089/omi.2011.0118.



DISCLAIMER: The information contained in this report is more of an exploratory procedure than a statistical analysis. The final interpretation, i.e. putting the results in the context of the study/screen, should be made by biologists/analysts rather than by any tool.