An ultrasensitive method for analysis of viral spike N-glycoforms

Viruses can evade the host immune system by displaying numerous glycans on their surface “spike-proteins” that cover immune epitopes. We have developed an ultrasensitive “single pot” method to assess glycan occupancy and the extent of glycan processing from high-mannose to complex forms at each N-glycosylation site. Though aimed at characterizing glycosylation of viral spike proteins as potential vaccines, this method is applicable for analysis of site-specific glycosylation of any glycoprotein.

complex N-glycans when PNGase F deglycosylation is carried out in the presence of H₂¹⁸O (differentiating these sites from any deamidated Ns). An unoccupied NGS results in no (+0 Da) residual mass on N. Using this new method (Fig. 1a), we achieved >99% amino acid sequence coverage and identified all 27 theoretically possible NGSs from a single LC-MS/MS run of 0.5 µg of peptides generated from a starting material of 5 µg purified protein (Fig. 1b and Supplementary Figures 1a,b). We used semi-quantitative label-free analysis based on precursor peak areas to calculate the proportion of N-glycan occupancy (unoccupied : complex : high-mannose/hybrid N-glycans) for each NGS. We reanalyzed the N-glycan microheterogeneity pattern on the BG505 SOSIP.664¹² HIV Env trimer from data obtained using our previous approach³ and compared it with results using the method we describe here (Supplementary Figure 2). The results with the improved method were highly comparable to those using our original approach, despite being processed differently and the samples being prepared at different times in different laboratories.
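The occupancy calculation described above can be illustrated with a minimal Python sketch. It is a simplification, not the authors' pipeline: the function names and input layout are hypothetical, the +2.988 Da shift for ¹⁸O-labeled deamidation and the exact +203.079 Da GlcNAc mass are assumed values (the text itself only names +0 Da and N+203), and peak areas are simply summed per site rather than grouped as in the statistical treatment of the Methods.

```python
from collections import defaultdict

# Residual mass on N (Da) -> glycosylation state.
# Only "+0 Da" and "N+203" are named in the text; the exact values
# 2.988 and 203.079 are assumptions made for this illustration.
STATE_BY_SHIFT = {
    0.0: "unoccupied",
    2.988: "complex",                 # assumed: deamidation in H2(18)O
    203.079: "high-mannose/hybrid",   # assumed: one residual GlcNAc
}


def classify_shift(shift_da, tol=0.01):
    """Return the glycosylation state for a mass shift, or None if unknown."""
    for ref, state in STATE_BY_SHIFT.items():
        if abs(shift_da - ref) <= tol:
            return state
    return None


def occupancy_proportions(peptides):
    """Sum precursor peak areas per site and state, then normalize per site.

    peptides: iterable of (site, shift_da, peak_area) tuples (hypothetical layout).
    """
    totals = defaultdict(lambda: defaultdict(float))
    for site, shift_da, area in peptides:
        state = classify_shift(shift_da)
        if state is not None:          # ignore peptides with unrelated shifts
            totals[site][state] += area

    result = {}
    for site, by_state in totals.items():
        total = sum(by_state.values())
        if total > 0:                  # avoid dividing by zero
            result[site] = {s: a / total for s, a in by_state.items()}
    return result


if __name__ == "__main__":
    example = [
        ("N88", 203.079, 300.0),
        ("N88", 2.988, 100.0),
        ("N88", 0.0, 100.0),
    ]
    print(occupancy_proportions(example))
```

With the made-up areas in the example, site N88 comes out as 60% high-mannose/hybrid, 20% complex and 20% unoccupied.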
Initial results demonstrated that the new approach is at least 18 times more sensitive than our previous approach³ even though it uses a simpler and shorter workflow. To evaluate the limit of sensitivity of our method, we processed progressively decreasing amounts of starting material, ranging from 1 µg to 5 ng. We observed that a single LC-MS/MS run with 1 µg of starting material was enough to cover >95% of the amino acid sequence and all NGS (Fig. 2a), which is 90 times more sensitive than our previous approach³. Major differences in microheterogeneity at each NGS were generally observed when we started with <100 ng material (Supplementary Figure 1c). This is likely due to low sampling, as evidenced by a decrease in amino acid sequence and NGS coverage (Fig. 2a), as well as in the absolute number of identified peptides representing each NGS (Supplementary Figure 1d).
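The fold-sensitivity figures quoted here are ratios of starting material. The short Python sketch below reproduces that arithmetic under one assumption: the input amount of the previous approach is not stated in this section, so it is inferred as about 90 µg from the statement that 1 µg corresponds to a 90-fold gain. The names are illustrative only.

```python
def fold_sensitivity(reference_ug, new_ug):
    """Ratio of starting material needed by the reference vs. the new method."""
    return reference_ug / new_ug


# Inferred, not stated in the text: 1 µg being "90 times more sensitive"
# implies roughly 90 µg of starting material for the previous approach.
PREVIOUS_INPUT_UG = 90.0

if __name__ == "__main__":
    for amount_ug in (5.0, 1.0):
        fold = fold_sensitivity(PREVIOUS_INPUT_UG, amount_ug)
        print(f"{amount_ug} µg starting material: {fold:.0f}-fold")
```

Under that inferred reference amount, 5 µg gives the 18-fold figure and 1 µg the 90-fold figure.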

The improved method is agnostic to mass spectrometry platform (Fig. 2b). A timsTOF Pro mass spectrometer coupled to an Evosep One HPLC (timsTOF/Evosep)¹³ was used to achieve >99% sequence coverage and identification of all NGS using a single LC-MS/MS run with 0.5 µg of starting material and an 88-minute LC gradient (Fig. 1c and Supplementary Figures 3a,b). Thus, the sensitivity of our method on this platform was 180 times higher than our previous

N+203 peptides when peptide sampling per NGS decreases due to less starting material) (Fig. 2b and Supplementary Figures 1c,d and 3c,d). When enough sampling per NGS is achieved, these variations are diminished (Figs. 1b,c).

We attribute improvements in our method to efficient sample handling strategies. We observed reduced sequence coverage if the sample was from the digestion of a small amount of starting material rather than an equal aliquot from a larger sample (Supplementary Figure 3e). We infer that the sensitivity differences are not occurring during LC-MS/MS, but that sample is being lost to the reaction-tube surface (during reaction and lyophilization) and the proportion of loss is more pronounced when we start with less material. The kinetics of the enzyme-substrate reaction may also account for sensitivity differences, since a more "crowded" reactant environment (low reaction volumes) is expected to result in better reaction kinetics¹⁵.

The simplicity and high reproducibility of this procedure will allow for high-throughput analyses of viral spike-proteins and of any glycoprotein, whether produced recombinantly or purified from natural sources. Results were highly comparable in the two LC-MS/MS platforms we used.

The Proteinase K/deglycosylation method described above was followed, except that PK was replaced with trypsin and reactions were incubated overnight at 37 °C. Trypsin generated a lower total number of peptides than PK, but we obtained >95% sequence coverage, including 26 of 27 NGS (Supplementary Figure 5a). Variations in N-glycan microheterogeneity at certain NGS may be explained by low sampling at these sites (N386, N392) or differences in cleavage specificity between PK and trypsin (N88, N611) (Supplementary Figures 5b,c).

Samples were analyzed on a Q Exactive HF-X mass spectrometer (Thermo). Samples were injected directly onto a 25 cm, 100 μm ID column packed with BEH 1.7 μm C18 resin (Waters). Samples were separated at a flow rate of 300 nL/min on an EASY-nLC 1200 (Thermo). Buffers A and B were 0.1% formic acid in 5% and 80% acetonitrile, respectively. The following gradient was used: 1-25% B over 160 min, an increase to 40% B over 40 min, an increase to

17.026549 Q) were considered differential modifications. Data were searched with 50 ppm precursor ion tolerance and 50 ppm fragment ion tolerance. Identified proteins were filtered using DTASelect2²¹, utilizing a target-decoy database search strategy to limit the false discovery rate to 1% at the spectrum level²². A minimum of 1 peptide per protein and no tryptic end (or 1 tryptic end when treated with trypsin) per peptide were required, and the precursor delta mass cut-off was fixed at 10 ppm for data acquired with Q Exactive HF-X or 20 ppm for data acquired with timsTOF Pro. Statistical models for peptide mass modification (modstat) were applied (trypstat was additionally applied for trypsin-treated samples). Census2²³ label-free analysis was performed based on the precursor peak area, with a 10 ppm precursor mass tolerance and 0.1 min retention time tolerance. "Match between runs" was used to find missing peptides between runs for Q Exactive HF-X data (for timsTOF Pro data, reconstructed-MS1 based chromatograms combining isotope peaks for all triggered precursor ions were pre-generated, and then chromatograms were assigned to identified peptides for quantitative analysis, without retrieving missing peptides).
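To make the mass and retention-time tolerances above concrete, the following Python sketch shows how a ppm error is computed and how the stated cut-offs could be applied. It does not reproduce DTASelect2 or Census2: the function names are hypothetical, and the matching rule is only an assumed reading of the 10 ppm and 0.1 min tolerances.

```python
# Precursor delta-mass cut-offs (ppm) per instrument, as stated in the text.
PPM_CUTOFF = {"Q Exactive HF-X": 10.0, "timsTOF Pro": 20.0}


def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_delta_mass(observed_mz, theoretical_mz, instrument):
    """True if the precursor mass error is within the instrument's cut-off."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= PPM_CUTOFF[instrument]


def same_feature(mz_a, rt_a, mz_b, rt_b, ppm_tol=10.0, rt_tol_min=0.1):
    """Assumed matching rule: within 10 ppm in m/z and 0.1 min in retention time."""
    return (abs(ppm_error(mz_a, mz_b)) <= ppm_tol
            and abs(rt_a - rt_b) <= rt_tol_min)


if __name__ == "__main__":
    # A 15 ppm error is rejected at 10 ppm but accepted at 20 ppm.
    print(passes_delta_mass(500.0075, 500.0, "Q Exactive HF-X"))
    print(passes_delta_mass(500.0075, 500.0, "timsTOF Pro"))
```

The example uses an invented precursor at m/z 500 with a 15 ppm error to show why the same identification can be filtered differently on the two platforms.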
where nngs is the number of groups (pepz) covering a particular NGS. The standard error of mean of the proportion of each N-glycosylation state g ∈ G for a particular