Graph-pMHC: Graph Neural Network Approach to MHC Class II Peptide Presentation and Antibody Immunogenicity
Description
Antigen presentation on MHC Class II (pMHCII presentation) plays an essential role in the adaptive immune response to extracellular pathogens and cancerous cells, but it can also reduce the efficacy of large-molecule drugs by triggering an anti-drug response. Significant progress has been made in pMHCII presentation modeling due to the collection of large-scale pMHC mass spectrometry datasets (ligandomes) and advances in machine learning. Here, we develop graph-pMHC, a graph neural network approach to predict pMHCII presentation. We derive adjacency matrices for pMHCII using Alphafold2-multimer, and address the peptide-MHC binding groove alignment problem with a simple graph enumeration strategy. We demonstrate that graph-pMHC dramatically outperforms methods with suboptimal inductive biases, such as the multilayer-perceptron-based NetMHCIIpan-4.0 (+20.17% absolute average precision). Finally, we create an antibody drug immunogenicity dataset from clinical trial data, and develop a method for measuring anti-antibody immunogenicity risk using pMHCII presentation models. For predicting antibody drug immunogenicity, our model increases ROC AUC by 2.57% compared to filtering peptides by hits in OASis alone.
NOTE!!
It's been brought to my attention that I accidentally shuffled the graph-pmhc and netmhciipan predictions in the antibody immunogenicity dataset (AB_df_w_preds), and Zenodo is not allowing me to add a new version. Apart from these prediction columns the data is good, so the ADA (anti-drug antibody) labels for the antibodies are fine. The graph-pmhc and netmhciipan predictions can be re-derived from AB_df_all_preds_w_preds with code like this:
df.groupby('Antibody').apply(lambda x: sum((x['Peptide Num OAS Subjects'] < 23) & (x[column] > 0))).values
Here df is AB_df_all_preds_w_preds loaded into pandas, and column is the name of the prediction column of interest (graph-pmhc or netmhciipan). Sorry about the error!!
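For anyone who wants to check the one-liner before running it on the real file, here is a minimal sketch on a toy table. It assumes the 'Antibody' and 'Peptide Num OAS Subjects' columns named above. The helper name count_flagged_peptides, the toy antibody names, and the exact prediction column label 'graph-pmhc' are made up for illustration, so substitute whatever the column is actually called in AB_df_all_preds_w_preds.

```python
import pandas as pd


def count_flagged_peptides(df, column, max_oas_subjects=23):
    """Per antibody, count peptides with prediction > 0 that are seen in
    fewer than `max_oas_subjects` OAS subjects (same filter as the one-liner).

    `count_flagged_peptides` is a hypothetical helper name, not part of the dataset.
    """
    flagged = (df["Peptide Num OAS Subjects"] < max_oas_subjects) & (df[column] > 0)
    # Summing a boolean mask within each antibody counts the flagged peptides.
    return flagged.groupby(df["Antibody"]).sum()


# Toy stand-in for AB_df_all_preds_w_preds (values are invented for illustration).
toy = pd.DataFrame(
    {
        "Antibody": ["mAb_A", "mAb_A", "mAb_A", "mAb_B", "mAb_B"],
        "Peptide Num OAS Subjects": [0, 50, 10, 5, 100],
        "graph-pmhc": [1, 1, 1, 1, 1],  # assumed column label
    }
)

counts = count_flagged_peptides(toy, "graph-pmhc")
print(counts)
```

The result is a Series indexed by antibody name, sorted the same way as the groupby in the one-liner, so counts.values should line up with the array that one-liner returns.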