Published February 16, 2024 | Version v2
Dataset Open

iGEMME Missense Mutational Effect Predictions for Entire Human Proteome

  • 1. Sorbonne Université
  • 2. Centre International de Recherche en Infectiologie

Description

This dataset contains iGEMME predictions of single point mutation effects for about 19,000 human proteins. iGEMME predictions use only evolutionary information derived from multiple sequence alignment files.

Description of the data and file structure

This dataset contains iGEMME predictions for all human proteins. 

The data for each human protein is in a folder named after its UniProt ID. Inside that folder, a subfolder called results contains all input and output files. For example, the results folder for UniProt ID A0A0B4J245 contains the following files:

  1. Raw iGEMME predictions (output file): A0A0B4J245_normPred_evolCombi_igemme.txt

  2. Rank-sorted iGEMME predictions (values between 0 and 1) in CSV format (output file): A0A0B4J245_normPred_evolCombiTransposedRanksorted_igemme.csv

  3. ColabFold MSA file (input file): aliA0A0B4J245.fasta

  4. JET2 file containing JET scores for each amino acid (output file): A0A0B4J245_jet_igemme.res

  5. Configuration file containing default parameters (output file): default.conf

  6. Log file (output file): igemme.log
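
The folder layout described above can be sketched in a few lines of Python. This is a minimal illustration, not part of the dataset: the function names, the dictionary keys, and the `root` argument (the directory where the archive was extracted) are assumptions. Only the file-name patterns come from the list above.

```python
from pathlib import Path

# File-name patterns taken from the list above; "{uid}" stands for the UniProt ID.
# The dictionary keys are illustrative labels, not names used by the dataset.
RESULT_FILES = {
    "raw_predictions": "{uid}_normPred_evolCombi_igemme.txt",
    "ranksorted_predictions": "{uid}_normPred_evolCombiTransposedRanksorted_igemme.csv",
    "msa": "ali{uid}.fasta",
    "jet_scores": "{uid}_jet_igemme.res",
    "config": "default.conf",
    "log": "igemme.log",
}


def expected_result_paths(root, uniprot_id):
    """Return the expected path of every file in <root>/<uniprot_id>/results."""
    results_dir = Path(root) / uniprot_id / "results"
    return {
        label: results_dir / pattern.format(uid=uniprot_id)
        for label, pattern in RESULT_FILES.items()
    }


def missing_files(root, uniprot_id):
    """Return the labels of expected files that are not present on disk."""
    paths = expected_result_paths(root, uniprot_id)
    return sorted(label for label, path in paths.items() if not path.is_file())
```

Calling `expected_result_paths("extracted_dataset", "A0A0B4J245")["msa"]` would give the path to `aliA0A0B4J245.fasta` under a hypothetical extraction directory named `extracted_dataset`, and `missing_files` can be used to flag proteins whose results folder is incomplete.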

Files

iGEMME-protein-list-v2.txt

Files (15.8 GB)

Name Size
md5:2925bdd887c6c134593a374b613ab333
138.4 kB
md5:bd5e2d17f245c82bcd5915f2643f15f9
15.8 GB
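
The md5 values listed above can be used to check a downloaded copy for corruption. The following is a minimal sketch using only the Python standard library. The function names are illustrative, and the local file path passed in is whatever name the download was saved under, which this listing does not show for the 15.8 GB file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks
    so that a multi-gigabyte archive does not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, listed):
    """Compare a file against a checksum written as 'md5:<hex>' or '<hex>'."""
    expected = listed.split(":", 1)[-1].strip().lower()
    return md5_of_file(path) == expected
```

For example, `matches_checksum("iGEMME-protein-list-v2.txt", "md5:2925bdd887c6c134593a374b613ab333")` should return `True` for an intact download, assuming the 138.4 kB entry in the listing corresponds to that file.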

Additional details

Related works

Is published in
Publication: 10.1186/s13059-025-03581-y (DOI)

Dates

Available
2024-02-16