Prospective ICH Q2(R2)-aligned total-error validation of label-free untargeted proteomics for host cell protein quantification in biotherapeutics
Description
This repository contains the complete downstream statistical analysis pipeline, preprocessing scripts, and two peptide-level quantitative datasets (entrapment search results) supporting the prospective ICH Q2(R2)-aligned total-error validation of label-free ddaPASEF proteomics for host cell protein (HCP) quantification in biotherapeutic matrices.
Somar Khalil, Jean-François Dierick, Pascal Bourguignon, and Michel Plisnier. Proteomes.
The pipeline implements:
- Dual entrapment database construction (shuffled and trimmed foreign-proteome) with empirical false discovery proportion estimation via bootstrap percentile bands and Wilson score confidence intervals.
- Deterministic greedy parsimony protein inference with unique-peptide constraints.
- Peptide-level quality control filtering (modified Z-score outlier removal, intraprotein intensity deviation screening, replicate CV gating).
- Hi3 label-free protein quantification with MassPREP response-factor calibration.
- Weighted least-squares calibration with HC3-robust inference.
- One-way random-effects ANOVA variance decomposition with Welch–Satterthwaite degrees of freedom adjustment.
- Construction of 95% beta-expectation tolerance intervals and 95/95 (content/confidence) tolerance intervals for aggregate total-error accuracy profiling.
- Abundance-stratified total-error analysis with nonparametric bootstrap-based stratum estimation for derivation of abundance-aware LLOQ and ULOQ.
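As an illustration of the Wilson score confidence interval named in the first step above, the following is a minimal sketch for a generic binomial proportion. How entrapment hits are converted into a false discovery proportion is not specified here, so the counts `k` (entrapment-derived false discoveries) and `n` (accepted identifications) and the function name `wilson_interval` are assumptions for illustration, not the pipeline's own code.

```python
import math


def wilson_interval(k, n, z=1.959963984540054):
    """Wilson score interval for a binomial proportion k/n.

    k and n are hypothetical inputs: k = number of discoveries counted as
    false, n = total number of discoveries. z defaults to the two-sided
    95% normal quantile.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    p_hat = k / n
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    # Clip to [0, 1] to absorb floating-point round-off at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


if __name__ == "__main__":
    lower, upper = wilson_interval(k=12, n=1500)
    print(f"FDP estimate {12 / 1500:.4f}, 95% Wilson interval [{lower:.4f}, {upper:.4f}]")
```

Unlike the simple normal-approximation interval, the Wilson interval stays inside [0, 1] and remains informative when the observed count is zero, which is a common situation for low false discovery proportions.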
All stochastic procedures use fixed random seeds. Validation experiments employed a SIL-HCP spike-in series in a NISTmAb matrix under a hierarchical replication design (7 concentration levels × 3 preparations × 3 injections).
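The hierarchical design gives, at each concentration level, 3 preparations with 3 injections each (63 injections over the 7 levels). A minimal sketch of a balanced one-way random-effects ANOVA variance decomposition for one such level is shown below. The function name, the returned keys, and the truncation of a negative between-preparation estimate to zero are assumptions for illustration; the repository's own implementation may differ.

```python
from statistics import mean


def variance_components(groups):
    """Balanced one-way random-effects ANOVA for a single concentration level.

    groups: list of equal-length lists, one list of injection results per
    preparation (hypothetical input layout).
    """
    a = len(groups)       # number of preparations
    n = len(groups[0])    # injections per preparation
    if a < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("balanced design with >= 2 groups and >= 2 replicates required")

    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)  # equals the overall mean for balanced data

    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (a - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (a * (n - 1))

    # Method-of-moments estimate; negative values are truncated to zero (assumption).
    var_between = max(0.0, (ms_between - ms_within) / n)
    var_ip = var_between + ms_within  # intermediate-precision variance

    # Satterthwaite approximation for the linear combination
    # (1/n) * MS_between + ((n - 1)/n) * MS_within, before any truncation.
    c1 = ms_between / n
    c2 = (n - 1) * ms_within / n
    denom = c1 ** 2 / (a - 1) + c2 ** 2 / (a * (n - 1))
    df_ip = (c1 + c2) ** 2 / denom if denom > 0 else float("nan")

    return {
        "repeatability_var": ms_within,
        "between_prep_var": var_between,
        "intermediate_precision_var": var_ip,
        "satterthwaite_df": df_ip,
    }


if __name__ == "__main__":
    # Made-up numbers for 3 preparations x 3 injections at one level.
    print(variance_components([[10, 11, 12], [13, 14, 15], [16, 17, 18]]))
```

The Satterthwaite degrees of freedom are what a tolerance-interval calculation would typically use in place of a nominal value, since the intermediate-precision variance is a combination of two mean squares with different degrees of freedom.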
Reproducing the numerical results, tables, and figures reported in the manuscript requires the processed peptide- and protein-level intensity matrices generated by the LC–MS/MS workflow described therein. Raw mass spectrometry data were acquired on a timsTOF Pro instrument and processed with SpectroMine v5.2 under fixed database-search parameters.
All analytical parameters, model specifications, and acceptance criteria were locked prior to execution of the validation campaign.
Execution instructions, dependency versions, and configuration details are provided in the README.
Files
README.md
Total size: 90.9 MB

| MD5 checksum | Size |
|---|---|
| 94f98477244c1b7cc4efc64a79dff167 | 6.1 MB |
| c2ccfa5ba1272636ea32eaf8b0f35e9e | 48.7 MB |
| 848c996baa48369c195f66fb4c59c708 | 78.9 kB |
| f4c683d5581ba3e3259a6347b0dd48a7 | 966.4 kB |
| df1d345ed181fbb736d18fc71f48ef8c | 34.8 MB |
| 6c5578deb5af7b9aa280e6bf06ade971 | 171.5 kB |
| 23b252d4eba30d2c37090bb5878f6c5c | 2.9 kB |
| 375cc2e4fac812b550d92752da2c9d9d | 3.5 kB |
| 51c5fbc1eae45a68a2a7bb2dc741472c | 687 Bytes |
| 527c93a26c851e0efb6efb9a0f40769b | 21.0 kB |
| 6b07d7aaaf7797b687809954419e3414 | 284 Bytes |
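Each file in this record is listed with an MD5 checksum, which can be used to confirm that a download is intact. The sketch below shows one way to do this with Python's standard `hashlib` module. The helper names `md5_of_file` and `verify` and the file path in the usage example are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_md5):
    """Compare a file's digest with a listed checksum.

    Accepts the checksum with or without a leading 'md5:' prefix.
    """
    expected = expected_md5.strip().lower().split(":")[-1]
    return md5_of_file(path) == expected


if __name__ == "__main__":
    # Hypothetical local file name; substitute the path of a downloaded file.
    path = "downloaded_file.bin"
    try:
        print(verify(path, "md5:6b07d7aaaf7797b687809954419e3414"))
    except FileNotFoundError:
        print(f"{path} not found")
```

MD5 is adequate for detecting accidental corruption during transfer, which is the purpose of the listed checksums; it is not a safeguard against deliberate tampering.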
Additional details
Related works
- Is supplemented by: Publication, 10.64898/2026.03.06.710150 (DOI)
Software
- Programming language: Python