There is a newer version of the record available.

Published May 21, 2023 | Version 1.0.0
Dataset Open

CPT-1 whole-proteome variant effect prediction

Description

Cross-protein transfer learning for variant effect prediction

This repository contains the variant effect predictions of CPT-1 for 18,602 human proteins, initially released with the manuscript "Cross-protein transfer learning substantially improves zero-shot prediction of disease variant effects". The proteins are split across three files.

CPT1_score_EVE_set.zip: Proteins in the EVE set (Frazer et al., 2021)

CPT1_score_no_EVE_set_1.zip & CPT1_score_no_EVE_set_2.zip: Proteins not in the EVE set. Predictions for these proteins use imputed values for features depending on the EVE MSA.
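
Before extracting one of these archives, it can help to look at what it contains. The following is a minimal sketch using Python's standard zipfile module; the helper name list_archive and the limit parameter are illustrative assumptions, and nothing is assumed here about how the files inside each archive are named or laid out.

```python
import zipfile


def list_archive(path, limit=5):
    """Return (member name, uncompressed size in bytes) for the first `limit` members.

    Hypothetical helper: it only reads the archive's index, so nothing is extracted.
    """
    with zipfile.ZipFile(path) as archive:
        return [(info.filename, info.file_size) for info in archive.infolist()[:limit]]


if __name__ == "__main__":
    import sys

    # Usage: python list_archive.py CPT1_score_EVE_set.zip
    for archive_path in sys.argv[1:]:
        for name, size in list_archive(archive_path):
            print(f"{size:>12}  {name}")
```

Reading only the index keeps the check fast even for the larger archives.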

Citation

Jagota, M.*, Ye, C.*, Rastogi, R., Albors, C., Koehl, A., Ioannidis, N., and Song, Y.S.†
"Cross-protein transfer learning substantially improves zero-shot prediction of disease variant effects", bioRxiv (2022)

*These authors contributed equally to this work.
†To whom correspondence should be addressed: yss@berkeley.edu

DOI: https://doi.org/10.1101/2022.11.15.516532

Files

CPT1_score_EVE_set.zip

Files (2.4 GB)

MD5 checksum                            Size
md5:3966f7b8c8f87a55e10953228e04d74f    482.9 MB
md5:d8b1b0d4606a96e5aa6f7fc9f1d1932e    1.2 GB
md5:6de0a3c13f6fcd6bcf7864dd4d21ead9    689.2 MB
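
The MD5 checksums listed for the files can be used to confirm that a download completed without corruption. The following is a minimal sketch using Python's standard hashlib module. Because the listing here does not pair each checksum with a file name, the sketch only checks whether a file's digest matches any of the three listed values; the names md5_of_file and matches_listing are illustrative assumptions.

```python
import hashlib

# MD5 digests copied from the file listing of this record.
LISTED_MD5 = {
    "3966f7b8c8f87a55e10953228e04d74f",
    "d8b1b0d4606a96e5aa6f7fc9f1d1932e",
    "6de0a3c13f6fcd6bcf7864dd4d21ead9",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path):
    """Return True if the file's MD5 digest is one of the listed checksums."""
    return md5_of_file(path) in LISTED_MD5


if __name__ == "__main__":
    import sys

    # Usage: python check_md5.py CPT1_score_EVE_set.zip ...
    for name in sys.argv[1:]:
        status = "OK" if matches_listing(name) else "MISMATCH"
        print(f"{status}  {name}")
```

Reading in chunks keeps memory use flat, which matters for archives of several hundred megabytes or more.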