Published August 28, 2023 | Version v3
Journal article Open

Emergence of power law distributions in protein-protein interaction networks through study bias

  • 1. Biomedical Network Science Lab, Department Artificial Intelligence in Biomedical Engineering, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
  • 2. Department of Experimental Oncology, IEO European Institute of Oncology IRCCS, Milan, Italy
  • 3. Department of Computer Science, TU Braunschweig, Braunschweig, Germany
  • 4. Department of Computer Science, TU Braunschweig, Braunschweig, Germany - Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
  • 5. Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany

Description

Protein-protein interaction (PPI) networks have been found to be power-law-distributed, i.e., in observed PPI networks, the fraction of nodes with degree k often follows a power law (PL) distribution \(k^{-\alpha}\). It has been hypothesized that, during evolution, this property has emerged due to proteins encoded by paralogs copying the interaction partners of the proteins encoded by the duplicated genes. However, the experimental procedures used to detect PPIs are known to be heavily affected by technical and study bias. For instance, proteins known to be involved in cancer are often heavily overstudied, and proteins used as baits in large-scale experiments such as yeast two-hybrid (Y2H) or affinity purification-mass spectrometry (AP-MS) tend to have many false-positive interaction partners. This raises the question of whether PL distributions in observed PPI networks could be explained by these biases alone. Here, we address this question using statistical analyses of the degree distributions of 1,427 observed PPI networks of controlled provenance as well as simulation studies. Our results indicate that study bias and technical bias can indeed largely explain the fact that observed PPI networks tend to be PL-distributed. This implies that it is problematic to derive hypotheses about the degree distribution and emergence of the true biological interactome from the PL distributions in observed PPI networks.
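To make the notion of a PL-distributed network concrete, the following minimal Python sketch computes the fraction of nodes with each degree k from an undirected edge list and estimates the exponent \(\alpha\) with a widely used approximation to the discrete maximum-likelihood estimator. It is an illustration only and not the statistical pipeline used in the article. The function names (`node_degrees`, `degree_fractions`, `estimate_alpha`), the lower cutoff `k_min`, and the toy edge list are assumptions introduced here.

```python
import math
from collections import Counter


def node_degrees(edges):
    """Count the degree of every node in an undirected edge list."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


def degree_fractions(edges):
    """Return {k: fraction of nodes with degree k}, sorted by k."""
    degree = node_degrees(edges)
    n_nodes = len(degree)
    counts = Counter(degree.values())
    return {k: c / n_nodes for k, c in sorted(counts.items())}


def estimate_alpha(degrees, k_min=1):
    """Approximate maximum-likelihood estimate of alpha for P(k) ~ k^(-alpha).

    Uses alpha = 1 + n / sum(ln(k_i / (k_min - 0.5))) over degrees >= k_min.
    This is an approximation for discrete data; it becomes more reliable
    for larger k_min, so the default of 1 is for illustration only.
    """
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


# Hypothetical toy network: one hub protein "A" with three partners.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D")]
fractions = degree_fractions(toy_edges)  # {1: 0.75, 3: 0.25}
alpha_hat = estimate_alpha(node_degrees(toy_edges).values(), k_min=1)
print(fractions, round(alpha_hat, 3))
```

An exponent estimate alone does not show that a network is PL-distributed; a goodness-of-fit test or a comparison with alternative distributions would be needed for that, which this sketch deliberately leaves out.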

Files

databases.zip

Files (769.5 MB)

md5:5520ed8122187cbf0a592ca54d534233
769.5 MB