Published June 3, 2026 | Version v2
Dataset Open

Metabolomics data for context-aware screening and decoding of synthetic lethality with single-cell foundation and large language models

  • 1. ShanghaiTech University

Description

Metabolomic profiling was performed using an ACQUITY UPLC I-Class Plus system (Waters Corporation, Milford, MA, USA) coupled with a Q Exactive HF mass spectrometer equipped with a heated electrospray ionization source (Thermo Fisher Scientific, Waltham, MA, USA). Chromatographic separation was achieved on an ACQUITY UPLC HSS T3 column.

Mass spectrometry data were acquired in both positive and negative ion modes. Raw LC-MS data were processed using Progenesis QI (version 2.3, Nonlinear Dynamics, Newcastle, UK) for baseline correction, peak detection, alignment, and normalization. Metabolite identification was performed based on accurate mass, MS/MS fragmentation patterns, and isotope distribution by searching against multiple metabolite databases, including the Human Metabolome Database (HMDB), METLIN, LipidMaps, and in-house spectral libraries.

The processed data matrix was imported into R for multivariate statistical analysis. Principal component analysis (PCA) was performed to assess the overall distribution and analytical stability. Orthogonal partial least-squares discriminant analysis (OPLS-DA) and partial least-squares discriminant analysis (PLS-DA) were used to identify differential metabolites between groups. Model robustness was evaluated using seven-fold cross-validation and 200 permutation tests. Metabolites with a variable importance in projection (VIP) score > 1.0 and a two-tailed Student's t-test P < 0.05 were considered significantly altered. Identified differential metabolites were further subjected to pathway enrichment analysis using the KEGG database.
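The selection rule above (VIP score above 1.0 combined with a two-tailed Student's t-test P below 0.05) can be illustrated with a minimal sketch. It is written in Python using only the standard library, whereas the analysis described here was run in R, so it is not the authors' code. The function names, the dictionary layout, and the assumption that VIP scores have already been exported from the OPLS-DA/PLS-DA model are all hypothetical. The P value is obtained by numerically integrating the t density, which is a convenience choice for this sketch.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_tailed_p(t, df, steps=2000):
    """Two-tailed P value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t)
    if upper == 0.0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)


def student_t_test(a, b):
    """Equal-variance (pooled) two-sample t-test; returns (t, two-tailed P).

    Assumes each group has at least two values and nonzero pooled variance.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, two_tailed_p(t, df)


def select_differential(vip, group_a, group_b, vip_cut=1.0, p_cut=0.05):
    """Keep metabolites with VIP > vip_cut and t-test P < p_cut.

    vip:      {metabolite: precomputed VIP score} (hypothetical layout)
    group_a/b: {metabolite: list of intensities in that group}
    """
    selected = []
    for name, score in vip.items():
        _, p = student_t_test(group_a[name], group_b[name])
        if score > vip_cut and p < p_cut:
            selected.append(name)
    return selected
```

Both thresholds must hold at once, so a metabolite with a high VIP score but no significant group difference is dropped, and so is the reverse case. The cutoffs are exposed as parameters only for illustration; the values stated in the description are the defaults.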

Files

Files (7.1 GB)

Name Size
md5:763211a62d409a29836ef992cb420143
7.1 GB
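
Because the archive is large, it may be worth checking a download against the MD5 checksum listed above before processing it. The following is a minimal sketch using Python's standard `hashlib` module. The function names are hypothetical and the file path is a placeholder, since the file name is not shown in this listing.

```python
import hashlib

# Checksum copied from the Files listing of this record.
EXPECTED_MD5 = "763211a62d409a29836ef992cb420143"


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Stream the file so a multi-gigabyte archive never sits in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, expected=EXPECTED_MD5):
    """True if the file's MD5 digest equals the expected checksum."""
    return file_md5(path) == expected.lower()
```

Calling `matches_listing("path/to/downloaded_file")` with the real local path returns `True` only when the download is byte-for-byte identical to the deposited file.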

Additional details

Dates

Available
2027-03-09