SC-AIR-BERT: A pre-trained single-cell model for predicting the antigen binding specificity of the adaptive immune receptor
Description
Accurately predicting the antigen binding specificity of adaptive immune receptors (AIRs), such as T-cell receptors (TCRs) and B-cell receptors (BCRs), is essential for discovering new immune therapies. However, the diversity of AIR chain sequences limits the accuracy of current prediction methods. In this study, we introduce SC-AIR-BERT, a pre-trained model that learns comprehensive sequence representations of paired AIR chains to improve binding specificity prediction. SC-AIR-BERT first learns the "language" of AIR sequences through self-supervised pre-training on a large cohort of paired AIR chains from multiple single-cell resources. The model is then fine-tuned with a multilayer perceptron (MLP) head for binding specificity prediction, employing the K-mer strategy to enhance sequence representation learning. Extensive experiments demonstrate the superior AUC performance of SC-AIR-BERT compared to current methods for TCR and BCR binding specificity prediction.
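The K-mer strategy mentioned above can be illustrated with a minimal sketch. It assumes overlapping k-mers with stride 1, `k = 3`, and BERT-style `[CLS]`/`[SEP]` markers around the two chains. None of these choices is confirmed by the description; the function names and example sequences are hypothetical.

```python
from typing import List


def kmer_tokenize(seq: str, k: int = 3) -> List[str]:
    """Split an amino-acid sequence into overlapping k-mers (stride 1).

    Assumption: overlapping windows; the actual tokenization used by
    SC-AIR-BERT may differ.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if len(seq) < k:
        # Sequences shorter than k are kept as a single token (or none if empty).
        return [seq] if seq else []
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def encode_pair(chain_a: str, chain_b: str, k: int = 3) -> List[str]:
    """Join the k-mer tokens of two paired chains into one token list.

    Assumption: BERT-style [CLS]/[SEP] special tokens separate the chains.
    """
    return (
        ["[CLS]"]
        + kmer_tokenize(chain_a, k)
        + ["[SEP]"]
        + kmer_tokenize(chain_b, k)
        + ["[SEP]"]
    )
```

For example, `encode_pair("CAV", "CASS")` yields one token list covering both chains, which a tokenizer vocabulary would then map to integer IDs before the sequence is passed to the model.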
Files
Name | Size |
---|---|
SC-AIR-BERT-Zenodo.zip (md5:763e905a498741e6ff2e600a877bf5fc) | 2.5 GB |
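The md5 checksum listed for the archive can be used to confirm that a download completed intact. The following is a minimal sketch using Python's standard `hashlib`. The helper names and the local file path are assumptions; only the checksum value comes from the listing.

```python
import hashlib

# Checksum as listed for SC-AIR-BERT-Zenodo.zip.
EXPECTED_MD5 = "763e905a498741e6ff2e600a877bf5fc"


def md5_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex md5 digest of a file, read in 1 MiB chunks
    so that a 2.5 GB archive does not have to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path: str, expected: str = EXPECTED_MD5) -> bool:
    """Return True if the file's md5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Calling `verify_archive("SC-AIR-BERT-Zenodo.zip")` (assuming the archive was saved under that name in the working directory) returns `True` only when the digest matches the listed checksum.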